Why is my regression model always dominated by one feature?

I am working on a financial prediction problem, i.e. a time-series prediction problem. I have three features with high pairwise correlation (each pair's correlation is about 0.6), and I fit a linear regression. I assumed the coefficients would be similar across the three features, but I get a coefficient vector like [0.01, 0.15, 0.01], which means the second feature has the biggest coefficient (the features are normalized) and dominates the prediction result. I don't …
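A minimal sketch of the situation on synthetic data (not the asker's data): the 0.77/0.63 mixing weights are chosen so each pair of features has correlation near 0.6, and a ridge penalty is shown as one common remedy, since the L2 term shrinks correlated coefficients toward each other instead of letting one absorb the shared signal.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# three features built from a shared component so pairwise correlation is ~0.6
base = rng.normal(size=n)
X = np.column_stack([0.77 * base + 0.63 * rng.normal(size=n) for _ in range(3)])
# the true signal weights all three features equally
y = X.sum(axis=1) + 0.1 * rng.normal(size=n)

# ordinary least squares
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# ridge regression: X'X + lam*I is better conditioned than X'X alone,
# which stabilizes coefficients when features are correlated
lam = 10.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
```

With correlated features the data cannot distinguish which feature "owns" the shared variance, so tiny noise differences can push most of the weight onto one of them; the ridge solution always has a smaller coefficient norm and spreads weight more evenly.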
Category: Data Science

PCA and Orange software

I am analysing whether 15 books can be grouped according to 6 variables (of the 15 books, 2 are written by one author, 6 by another, and 7 by a third). I counted the number of occurrences of the variables and calculated the percentages. Then I used the Orange software to run PCA. I uploaded the file and selected the columns and rows, and when it comes to PCA the program asks me whether I want to normalize …
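A sketch of what that normalization choice does, using plain NumPy in place of Orange and a made-up 15x6 matrix (the real book counts are not in the question): without per-column scaling, whichever variables happen to have the largest numeric range dominate the principal components.

```python
import numpy as np

rng = np.random.default_rng(1)
# toy stand-in for the 15-books x 6-variables table (values are made up);
# the last three columns are on a much larger scale than the first three
X = rng.random((15, 6)) * np.array([1.0, 1.0, 1.0, 100.0, 100.0, 100.0])

def pca(M, k=2):
    Mc = M - M.mean(axis=0)                    # PCA always centers first
    U, S, Vt = np.linalg.svd(Mc, full_matrices=False)
    return Mc @ Vt[:k].T, S**2 / np.sum(S**2)  # scores, explained-variance ratios

scores_raw, var_raw = pca(X)                   # without normalization
Xn = (X - X.mean(axis=0)) / X.std(axis=0)      # divide each column by its std
scores_norm, var_norm = pca(Xn)
```

If the 6 variables are already comparable percentages, normalizing is optional; if their spreads differ a lot, answering "yes" to Orange's prompt makes each variable contribute equally.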
Category: Data Science

What parameters to use when normalising training, validation, and testing data?

I know a similar post was made here, but I wanted to ask some follow-up questions. I am running a cross-validation search to find values for a set of hyper-parameters and need to normalise the data. Suppose we split the data as follows: split the full data into 'training' (call this set 'A' for now) and testing data, then split the 'training' set into training (call this set 'B' for now) and validation sets. What parameters should be used when normalising these datasets? Am I …
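The usual convention, sketched below with a hand-rolled min-max scaler (equivalent in spirit to sklearn's `MinMaxScaler`, but kept in plain NumPy): during the search, fit the scaler on B only and apply it to both B and the validation fold; once hyper-parameters are fixed, refit on all of A and apply that to the test set.

```python
import numpy as np

def fit_minmax(train):
    # normalization parameters must come from the fitted portion only
    return train.min(axis=0), train.max(axis=0)

def apply_minmax(X, lo, hi):
    return (X - lo) / (hi - lo)

rng = np.random.default_rng(2)
A = rng.normal(size=(100, 3))        # the initial "training" split (set A)
test = rng.normal(size=(20, 3))
B, val = A[:80], A[80:]              # inner split for the hyper-parameter search

# during cross-validation: fit on B, transform both B and the validation fold
lo, hi = fit_minmax(B)
B_s, val_s = apply_minmax(B, lo, hi), apply_minmax(val, lo, hi)
# val_s may stray slightly outside [0, 1]; that is expected and harmless

# after hyper-parameters are chosen: refit on all of A, transform the test set
lo, hi = fit_minmax(A)
A_s, test_s = apply_minmax(A, lo, hi), apply_minmax(test, lo, hi)
```

The rule of thumb: never let statistics from data a model is evaluated on leak into the transform it is evaluated with.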
Category: Data Science

Generalize min-max scaling to vectors

I am combining several vectors, where each vector is a certain kind of embedding of some object. Since the embeddings are very different (some have all components in $[0, 1]$, some have components around 60 or 70, etc.), I want to rescale the vectors before combining them. I thought about using something like min-max rescaling, but I'm not sure how to generalize it to vectors. I could do something of the sort $\frac{v-|v_{min}|}{|v_{max}|-|v_{min}|}$, but I …
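Two natural generalizations, sketched as an assumption about what "min-max for vectors" could mean (neither is from the question itself): apply min-max component-wise across the stack of vectors, or divide every vector by the largest norm in the set, which preserves directions while component-wise scaling does not.

```python
import numpy as np

def minmax_vectors(V):
    """Component-wise min-max over a stack of vectors (one vector per row)."""
    lo, hi = V.min(axis=0), V.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant components
    return (V - lo) / span

def norm_rescale(V):
    """Alternative: shrink the whole set by its largest vector norm, so every
    vector lands in the unit ball and directions are preserved."""
    return V / np.linalg.norm(V, axis=1).max()

# toy embeddings: one component lives in [0, 1], the other around 60-75
V = np.array([[0.1, 60.0], [0.9, 75.0], [0.4, 68.0]])
M = minmax_vectors(V)
W = norm_rescale(V)
```

Which one is right depends on whether the components of each embedding are individually meaningful (component-wise) or only the vector as a whole is (norm-based).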
Category: Data Science

Data normalization in nonstationary data classification with Learn++.NSE based on MLP

I need to predict the technical condition of an aggregate using vibration-monitoring data. We consider this data nonstationary, i.e. the distribution parameters and descriptive statistics are not constant. I found that one of the best algorithms for such tasks is Learn++.NSE, and we use it with an MLP as the base classifier. As far as I know, it is necessary to normalize data for ANN operations. We decided to normalize using the mean, the standard deviation, and a sigmoidal function. We train the networks of the ensemble with sets with …
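A sketch of what "normalize using mean, stdev and a sigmoidal function" plausibly means (this is an assumption about the described pipeline, not code from it): a z-score squashed through a logistic sigmoid, sometimes called softmax scaling.

```python
import numpy as np

def sigmoidal_normalize(x, mean, std):
    # z-score squashed through a sigmoid: maps any real value into (0, 1)
    # while preserving the ordering of the inputs
    return 1.0 / (1.0 + np.exp(-(x - mean) / std))

# in a nonstationary stream, mean/std would be taken from the current
# training chunk rather than from the whole (drifting) history
chunk = np.array([4.1, 5.0, 5.9, 7.3, 2.8])
out = sigmoidal_normalize(chunk, chunk.mean(), chunk.std())
```

For drifting data the open question is always which window the mean and std are estimated on; Learn++.NSE's chunk-wise training suggests per-chunk statistics.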
Category: Data Science

Do I need to encode numerical variables like "year"?

I have a simple time-series dataset with a date-time feature column:

user,amount,date,job
chris, 9500, 05/19/2022, clean
chris, 14600, 05/12/2021, clean
chris, 67900, 03/27/2021, cooking
chris, 495900, 04/25/2021, fixing

Using Pandas, I split this column into multiple features such as year, month, and day:

## Convert date column into datetime type
data["date"] = pd.to_datetime(data["date"], errors="coerce")
## Order by user and date
data = data.sort_values(by=["user", "date"])
## Split date into year, month, day
data["year"] = data["date"].dt.year
data["month"] = data["date"].dt.month
data["day"] = data["date"].dt.day
…
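One common answer, sketched on a two-row stand-in frame: year is ordinal, so it can stay numeric (optionally shifted so it starts at 0) rather than one-hot encoded, while cyclical components such as month are often given a sin/cos encoding so that December sits next to January.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"date": pd.to_datetime(["05/19/2022", "03/27/2021"])})
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month

# year is ordinal: keep it numeric, optionally anchored at the earliest year
df["year_c"] = df["year"] - df["year"].min()

# month is cyclical: sin/cos place month 12 adjacent to month 1
df["month_sin"] = np.sin(2 * np.pi * df["month"] / 12)
df["month_cos"] = np.cos(2 * np.pi * df["month"] / 12)
```

Tree-based models usually handle the raw integer year/month fine; the cyclical encoding mainly helps linear models and neural networks.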
Category: Data Science

Proper iteration over time series data for LSTM neural network

I’m using supervised learning with an LSTM network to predict forex prices. To achieve this I’m using the deeplearning4j library, but I have doubts about several points of my implementation. I turned off the mini-batch feature, then created many trading indicators from the forex data. The idea is to provide random chunks of data to the neural network on every epoch and to ensure that after every epoch the network state is cleared. To achieve this I created a dataset iterator …
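The iteration pattern described can be sketched language-agnostically; this Python version (the actual implementation would be Java/DL4J, where clearing state is `rnnClearPreviousState()` on the network) shows the key invariant: each chunk is a contiguous window, only the window order is shuffled.

```python
import numpy as np

def iterate_chunks(series, window, rng):
    """Yield contiguous windows of `series` in random order; the caller is
    responsible for clearing the LSTM state after each chunk."""
    starts = np.arange(len(series) - window)
    rng.shuffle(starts)                       # randomize chunk order per epoch
    for s in starts:
        # inputs: a window of values; target: the next value after the window
        yield series[s : s + window], series[s + window]

rng = np.random.default_rng(3)
series = rng.normal(size=50)                  # stand-in for indicator values
pairs = list(iterate_chunks(series, window=10, rng=rng))
```

Shuffling chunk order is safe precisely because the state is reset between chunks; shuffling individual time steps inside a chunk would destroy the sequence the LSTM is meant to learn.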
Category: Data Science

Standardization in combination with scaling

Would it be OK to standardize all the features that exhibit a normal distribution (with StandardScaler) and then re-scale all the features to the range 0-1 (with MinMaxScaler)? So far I've only seen people do one OR the other, but not both in combination. Why is that? Also, is the Shapiro-Wilk test a good way to decide whether standardization is advisable? Should all features exhibit a normal distribution, or are you allowed to transform only the ones that do?
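One reason the combination is rarely seen, demonstrated with the scalers re-implemented in plain NumPy: both are affine maps, and min-max pins the output endpoints to 0 and 1 regardless of the input's location and scale, so standardizing first and then min-max scaling yields exactly what min-max scaling alone would.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 1))

# StandardScaler step: zero mean, unit variance
Z = (X - X.mean(axis=0)) / X.std(axis=0)
# MinMaxScaler step applied afterwards: affine map into [0, 1]
M = (Z - Z.min(axis=0)) / (Z.max(axis=0) - Z.min(axis=0))

# the composition of two affine maps is one affine map, so this equals
# applying MinMaxScaler directly to the raw column
M_direct = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```

So chaining them per column is harmless but redundant; the choice between the two is really a choice about which properties (fixed range vs. unit variance) the downstream model needs.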
Category: Data Science

Would I be able to combine features on a different unit scale after normalizing?

I'd like to explore some interactions between my variables, but they're on different measurement scales. Would, for example, the absolute value of their difference after scaling make sense? From what I understand, scaling them to a 0-1 range depends heavily on their max and min values, so it seems to me that interactions between them would not make sense, since each value's position on its own scale depends heavily on the observations.
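The concern can be made concrete with a toy demonstration (synthetic numbers, not the asker's data): adding a single extreme observation to one variable changes every min-max-scaled value of that variable, and with it every interaction term built from the scaled columns.

```python
import numpy as np

def minmax(x):
    return (x - x.min()) / (x.max() - x.min())

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 40.0, 20.0, 30.0])
# candidate interaction: absolute difference of the scaled variables
d1 = np.abs(minmax(a) - minmax(b))

# one outlier appended to b compresses all of b's scaled values ...
a2 = np.append(a, 4.0)
b2 = np.append(b, 400.0)
# ... so the interaction values for the SAME four observations change
d2 = np.abs(minmax(a2) - minmax(b2))[:4]
```

This sensitivity is why robust alternatives (rank/percentile transforms, or standardization with median and IQR) are often preferred before forming interactions.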
Category: Data Science

Normalizing data from same variable but different individuals

I'm new to machine learning. I have the following scenario: I have five individuals, each carrying an accelerometer. That sensor measures movement/acceleration on a scale from 0 to 255 (0 being no movement, 255 being max movement) at 5-minute intervals. Some individuals carry sensors that are more sensitive, and some less sensitive. As such, some individuals' sensors will report higher values, and some lower values, for the same movements. Using a …
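One standard approach, sketched with pandas on made-up readings: rescale within each individual, so that each person's own range (or mean/std) becomes the reference and differences in sensor sensitivity cancel out.

```python
import pandas as pd

df = pd.DataFrame({
    "individual": ["a", "a", "a", "b", "b", "b"],
    # individual b's sensor is less sensitive, so its raw values run lower
    "accel":      [10, 120, 250, 5, 60, 130],
})

# min-max scale within each individual: 0 = that person's quietest reading,
# 1 = that person's most active reading
g = df.groupby("individual")["accel"]
df["accel_scaled"] = (df["accel"] - g.transform("min")) / (
    g.transform("max") - g.transform("min")
)
```

Per-individual z-scoring (`(x - group mean) / group std`) is the usual alternative when the extremes themselves are unreliable.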
Category: Data Science

Correcting for one of multiple strong batch effects in a dataset

I am wondering which statistical tools to use when analysing data that have multiple strong batch effects (distributions vary from one batch to another). I would like to correct the batch effect originating from one variable without removing the potential batch effects of other variables. If this is unclear, a short example is probably the best way to explain my problem: imagine that we have 10 people taking part in an experiment. The experiment is …
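The simplest version of "correct one batch variable, leave the others alone" is centering within that variable only; the sketch below (made-up persons/sessions, standing in for the question's experiment) removes a per-person offset while a per-session effect survives intact in the residuals.

```python
import pandas as pd

df = pd.DataFrame({
    "person":  ["p1", "p1", "p2", "p2"],
    "session": ["s1", "s2", "s1", "s2"],
    "value":   [10.0, 12.0, 30.0, 34.0],
})

# center within the batch variable we want to remove (person); any session
# effect is untouched because we never group by session
df["value_corr"] = df["value"] - df.groupby("person")["value"].transform("mean")
```

For anything beyond shifts in the mean (different variances, confounded batches), regression-based tools such as linear mixed models, or ComBat from the bioinformatics literature, are the usual next step.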
Category: Data Science

Is it better to use a MinMax or a Log Return normalization to predict stock price movements?

I am trying to use an LSTM model to predict the d+2 and d+3 closing prices. I am not sure whether I should normalize the data with a MinMax scaler (-1, +1) or use the log return $(P(n)-P(0))/P(0)$ for each sample. I have tried quite a lot of source code from GitHub and they don't seem to converge on any one technique.
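The two options are not mutually exclusive; a common pattern (sketched on a toy price series, not a recommendation of either as "correct") is to convert prices to log returns first, because returns are closer to stationary than raw prices, and then MinMax-scale the returns into (-1, +1) for the network.

```python
import numpy as np

prices = np.array([100.0, 101.0, 99.5, 102.0, 103.5])

# log returns: roughly stationary, unlike price levels, so the model is not
# asked to generalize to price ranges it never saw in training
logret = np.diff(np.log(prices))

# then MinMax into [-1, 1]; in practice lo/hi must come from training data only
lo, hi = logret.min(), logret.max()
scaled = 2 * (logret - lo) / (hi - lo) - 1
```

Predictions come out as scaled returns and must be inverted (unscale, then compound onto the last known price) to recover d+2 / d+3 closing prices.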
Category: Data Science

Normalize data from different groups

I have data that has been grouped into 27 groups by different criteria. The reason for these groupings is to show that each group has different behavior. However, I would like to normalize everything to the same scale. For example, I would like to normalize to a 0-1 or 0-100 scale; that way I could say something like "$43^{rd}$ percentile" and it would have the same meaning across groups. If I were to just, say, standardize each group individually by subtracting …
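Since the stated goal is a percentile interpretation, the within-group percentile rank is the direct tool; sketched here with pandas on two made-up groups whose raw scales differ by orders of magnitude.

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["g1", "g1", "g1", "g2", "g2"],
    "score": [10, 50, 90, 1000, 3000],
})

# percentile rank computed within each group: 0.5 means "at the median of
# my own group", whatever that group's raw scale is; multiply by 100 for 0-100
df["pct"] = df.groupby("group")["score"].rank(pct=True)
```

Unlike per-group standardization, ranks are insensitive to each group's spread and outliers, at the cost of discarding distances between values.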
Category: Data Science

How to deal with data having 0 values in many columns?

I am trying to implement logistic regression, but my dataset has many columns with skewed data, and most of their values are 0. The skewness for many of these columns goes above 190, and it is the same for the testing data as for the training data. I tried using the log method to remove skewness, but because most of the values are 0 it messed up my data. I don't know how to …
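The immediate fix for "log breaks on zeros" is `log1p`, i.e. log(1 + x): it maps 0 to 0 exactly while still compressing the long right tail that drives the skewness. A minimal sketch on a zero-heavy toy column:

```python
import numpy as np

x = np.array([0.0, 0.0, 1.0, 3.0, 500.0, 0.0])

# np.log(x) would give -inf at the zeros; log1p(x) = log(1 + x) is defined
# everywhere on x >= 0 and keeps zeros at zero
y = np.log1p(x)
```

If most values are exactly 0, the deeper issue may be a zero-inflated feature; a common complement is to add a binary "is nonzero" indicator column alongside the `log1p`-transformed values.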
Category: Data Science

sklearn MinMaxScaler: Inverse does not equal original

I am using MinMaxScaler on a large dataset (2201887, 3) to normalize features, but the inverted values do not match the originals. I tested with the target column: first (a), I applied the scaler to 10 values, then did the inverse transformation, and I was able to recover the original values. Then (b), I inverted 10 normalized values after applying MinMaxScaler to the whole column, and the results were completely different. Result of (a): … Result of (b): … How can I have the …
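A likely culprit (an assumption, since the question's code is not shown) is a float32 cast somewhere in the pipeline: on large-magnitude values, float32 carries only ~7 significant digits, so the transform/inverse round trip cannot recover the originals. The sketch reproduces this with a hand-rolled MinMax transform:

```python
import numpy as np

x64 = np.linspace(1e6, 2e6, 7)            # large-magnitude feature values
lo, hi = x64.min(), x64.max()

def roundtrip(x):
    scaled = (x - lo) / (hi - lo)          # MinMax transform
    return scaled * (hi - lo) + lo         # its inverse

err64 = np.abs(roundtrip(x64) - x64).max()
# casting to float32 first (a common memory saving on ~2M-row data) already
# loses precision before the scaler ever runs
err32 = np.abs(roundtrip(x64.astype(np.float32)) - x64).max()
```

Keeping the column in float64 through `fit`/`transform`/`inverse_transform`, and comparing with a tolerance (`np.allclose`) rather than exact equality, usually resolves the discrepancy.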
Category: Data Science

How to normalize test data according to the training data if the normalization on the training data is performed row wise?

I have read in several places about the normalization of features in machine learning, but I normalize my training data row-wise, as shown in the following code (only two training samples are shown). My question is: when normalizing the test data, should I use the minimum and maximum values of each test sample to normalize that sample, or should I use the minimum and maximum values from the training data? As an explanation …
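Row-wise scaling changes the usual answer: each sample is normalized by its own statistics, so there are no training-set parameters to carry over, and the test row is scaled from its own min and max. A minimal sketch (toy arrays, not the asker's data):

```python
import numpy as np

def rowwise_minmax(X):
    # each ROW is scaled by its own min/max, unlike the usual column-wise
    # scaling where parameters are fitted on training data and reused
    lo = X.min(axis=1, keepdims=True)
    hi = X.max(axis=1, keepdims=True)
    return (X - lo) / (hi - lo)

train = np.array([[1.0, 5.0, 9.0], [10.0, 20.0, 40.0]])
test = np.array([[2.0, 3.0, 8.0]])

train_s = rowwise_minmax(train)
test_s = rowwise_minmax(test)   # self-contained: no leakage is possible
```

The caveat is that row-wise scaling discards each sample's absolute level; if that level carries signal, feature-wise (column-wise) scaling with training-set parameters is the better choice.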
Category: Data Science

How to save pixels after normalization

I want to normalize my images and use them in training, but I couldn't find a way to save the images after making the changes below. How can I save them?

files = ["/content/drive/MyDrive/Colab Notebooks/images/evre1/xyz.png",
         "/content/drive/MyDrive/Colab Notebooks/images/evre1/xty.png"]

def normalize(files):
    for i in files:
        image = Image.open(i)
        new_image = image.resize((224, 224))
        pixels = asarray(new_image)
        # convert from integers to floats
        pixels = pixels.astype('float32')
        # calculate global mean and standard deviation
        mean, std = pixels.mean(), pixels.std()
        # print('Mean: %.3f, Standard Deviation: %.3f' % (mean, std)) # …
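The catch is that standardized pixels are floats (often negative), which PNG cannot store; one sketch of a fix, using a random array in place of the question's Drive files: rescale back into 0-255 uint8 just for saving, or keep the exact float values with `np.save` and load them at training time.

```python
import numpy as np
from PIL import Image

def to_uint8(pixels):
    # standardized pixels are floats (possibly negative); PNG needs uint8,
    # so rescale into 0-255 purely for the purpose of saving
    lo, hi = pixels.min(), pixels.max()
    return ((pixels - lo) / (hi - lo) * 255).astype("uint8")

# stand-in for a resized, standardized image (the real files are on Drive)
pixels = np.random.rand(8, 8, 3).astype("float32")
pixels = (pixels - pixels.mean()) / pixels.std()     # the normalization step

Image.fromarray(to_uint8(pixels)).save("normalized.png")
# lossless alternative that keeps the float values: np.save("img.npy", pixels)
```

Note that the uint8 round trip discards the standardization, so if training needs the zero-mean floats, save `.npy` arrays and do the PNG export only for visual inspection.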
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.