How to normalize test data according to the training data if the normalization on the training data is performed row-wise?
I have read in several places about feature normalization in machine learning. However, I normalize my training data row-wise, as shown in the code below (only two training samples are shown). My question is: when normalizing the test data, should I use the minimum and maximum of each test sample to normalize that sample, or should I use the minimum and maximum values from the training data?
As an explanation: in the first row, -3 is the first feature, -2 the second, 0 the third, 2 the fourth, and 3 the fifth. The second row is the second sample, comprising five features from -4 to 2. As in other machine learning setups, each row corresponds to one sample consisting of five features.
import numpy as np

data = np.array([[-3, -2, 0, 2, 3], [-4, -1, 0, 3, 2]])
print(data)
print(data.shape)
for i in range(len(data)):
    print("i:", i)
    # Rescale this row to the interval [-1, 1] using its own min and max
    old_range = np.amax(data[i]) - np.amin(data[i])
    new_range = 2
    new_min = -1
    data_norm = ((data[i] - np.amin(data[i])) / old_range) * new_range + new_min
    print(data_norm)
Result
[-1. -0.66666667 0. 0.66666667 1. ]
[-1. -0.14285714 0.14285714 1. 0.71428571]
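For reference, the loop above can be written as a single vectorized operation. This is only a sketch of the same row-wise min-max scheme described in the question (the function name `normalize_rows` and its parameters are my own, not from the original code); because each row is scaled by its own min and max, the same function can be applied to any sample, train or test, without storing statistics:

```python
import numpy as np

def normalize_rows(x, new_min=-1.0, new_range=2.0):
    """Rescale each row of x to [new_min, new_min + new_range]
    using that row's own minimum and maximum."""
    x = np.asarray(x, dtype=float)
    row_min = x.min(axis=1, keepdims=True)          # per-row minimum, shape (n, 1)
    row_range = x.max(axis=1, keepdims=True) - row_min  # per-row range, shape (n, 1)
    return (x - row_min) / row_range * new_range + new_min

data = np.array([[-3, -2, 0, 2, 3], [-4, -1, 0, 3, 2]])
print(normalize_rows(data))
```

The `keepdims=True` argument keeps the per-row statistics as column vectors so NumPy broadcasting divides each row by its own range.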
Topic normalization machine-learning
Category Data Science