MLP classifier GridSearchCV parameters to tune?

I'm looking to tune the parameters for sklearn's MLP classifier, but I don't know which to tune or how many options to give them. Take the learning rate as an example: should I give it [.0001, .001, .01, .1, .2, .3]? Is that too many, too few? I have no basis for knowing what a good range is for any of the parameters. Processing power is limited, so I can't just test the full range. If anyone has a general guide to which are the most important to tune and …
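
As a starting point, here is a minimal sketch of a deliberately small grid with scikit-learn's GridSearchCV. The parameter names are real MLPClassifier parameters, but the ranges shown are illustrative assumptions, not recommendations, and the dataset is a toy stand-in:

    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import GridSearchCV
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=500, random_state=0)  # toy data

    # Keep the grid small: a few order-of-magnitude steps for rates and
    # penalties are usually more informative than many nearby values.
    param_grid = {
        "hidden_layer_sizes": [(50,), (100,), (50, 50)],
        "alpha": [1e-4, 1e-3, 1e-2],                # L2 penalty
        "learning_rate_init": [1e-3, 1e-2, 1e-1],   # initial step size
    }
    search = GridSearchCV(MLPClassifier(max_iter=500, random_state=0),
                          param_grid, cv=3, n_jobs=-1)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)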
Category: Data Science

Null Inputs/Inhibiting Inputs & Outputs with Scikit-Learn MLPRegressor

I'm trying to build a general predictive model of a machine. I've got a variable number of sensor inputs, and I'd like to create an MLPRegressor that can estimate outputs from the input values. I know I can create individual AIs to model each individual output (i.e. if I have 5 inputs, I can make 5 different AIs with 4 inputs each). But given that I have a large number of inputs, I was hoping for a …
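
For what it's worth, scikit-learn's MLPRegressor accepts a 2-D target, so a single network can predict all outputs jointly; a minimal sketch with made-up shapes and random stand-in data:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))       # 5 sensor inputs (toy data)
    Y = rng.normal(size=(200, 3))       # 3 outputs, predicted jointly

    # One model for all outputs instead of one model per output.
    model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=1000,
                         random_state=0).fit(X, Y)
    print(model.predict(X[:2]).shape)   # (2, 3)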
Category: Data Science

Training loss stuck in the early epochs but then starts decreasing. What could be the reason?

I am training a model and ran into a curious problem: for the first 4 epochs, the loss did not change at all, but after that it started decreasing. Could it be caused by a high learning rate, a local minimum, or something else, such as a regularisation parameter set too high?
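
One cheap diagnostic is to rerun just the first few epochs at several learning rates and compare the loss curves. A hedged Keras sketch of that harness; the data, layer sizes, and learning rates are placeholders:

    import numpy as np
    from tensorflow import keras

    X = np.random.rand(1000, 20)
    y = np.random.rand(1000, 1)

    for lr in [1e-1, 1e-2, 1e-3, 1e-4]:
        model = keras.Sequential([
            keras.Input(shape=(20,)),
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(1),
        ])
        model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr),
                      loss="mse")
        hist = model.fit(X, y, epochs=5, verbose=0)
        print(lr, [round(v, 4) for v in hist.history["loss"]])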
Category: Data Science

What is the difference between Keras-tuned hyperparameters and a manually defined Sequential model with the same hyperparameters?

I have a dataset that I divided into 10 splits of training, validation, and test sets for a regression problem. I used the first split with RandomSearch in keras-tuner to arrive at the best hyperparameters for an MLP model with two hidden layers. The hyperparameters I tuned are the number of neurons in the first hidden layer, the number of neurons in the second hidden layer, and the learning rate. I loaded the 'best model' and applied this …
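
For reference, a minimal keras-tuner setup matching that search space (units of both hidden layers plus the learning rate); the ranges and variable names are illustrative assumptions:

    import keras_tuner as kt
    from tensorflow import keras

    def build_model(hp):
        model = keras.Sequential([
            keras.layers.Dense(hp.Int("units_1", 32, 256, step=32),
                               activation="relu"),
            keras.layers.Dense(hp.Int("units_2", 32, 256, step=32),
                               activation="relu"),
            keras.layers.Dense(1),
        ])
        lr = hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])
        model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr),
                      loss="mse")
        return model

    tuner = kt.RandomSearch(build_model, objective="val_loss", max_trials=20)
    # tuner.search(X_train, y_train, validation_data=(X_val, y_val))
    # best = tuner.get_best_models(num_models=1)[0]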
Category: Data Science

Can a multilayer perceptron classify binary values?

I have a dataset in which the response variable is sick (1) or not sick (2). As for the predictor variables, a few are numeric (2 of 14); all the others are categorical variables with levels (for example: 1 = abdominal pain, 2 = throat pain, ...). I have two questions: 1. Can a multilayer perceptron classify a binary variable, or can it only return numerical values? 2. Can binary or level-coded variables be passed as training input to a multilayer perceptron? Thank you very much.
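
To both points: yes, an MLP classifier can output class labels directly, and level-coded variables are usually one-hot encoded first rather than fed in as raw codes. A minimal scikit-learn sketch; the column names and values are hypothetical:

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder, StandardScaler
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import Pipeline

    df = pd.DataFrame({"age": [34, 50, 27, 41],        # numeric
                       "symptom": [1, 2, 1, 3],        # coded levels
                       "sick": [1, 2, 2, 1]})          # 1 = sick, 2 = not sick

    pre = ColumnTransformer([
        ("num", StandardScaler(), ["age"]),
        ("cat", OneHotEncoder(), ["symptom"]),  # levels -> indicator columns
    ])
    clf = Pipeline([("prep", pre),
                    ("mlp", MLPClassifier(max_iter=1000, random_state=0))])
    clf.fit(df[["age", "symptom"]], df["sick"])
    print(clf.predict(df[["age", "symptom"]]))  # returns the class labels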
Category: Data Science

Multilayer perceptron does not converge

I have been coding my own multilayer perceptron in MATLAB and it runs without error. My training data features, x, take values from 1 to 360, and the training data output, y, has the value of $\sin(x)$. The problem is that my MLP only decreases the cost for the first few iterations and then gets stuck at 0.5. I have tried including momentum, but it does not help, and increasing the number of layers or the number of neurons does not help at …
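
The original code is MATLAB, but this symptom often comes from feeding raw 1-360 (degree) inputs into saturating activations. A Python analogue illustrating the effect of rescaling the input first; the layer size and iteration count are arbitrary choices:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    x_deg = np.arange(1, 361, dtype=float).reshape(-1, 1)
    y = np.sin(np.deg2rad(x_deg)).ravel()

    # Raw degree values saturate tanh units; map the input to roughly [-1, 1].
    x_scaled = (x_deg - 180.0) / 180.0

    for name, X in [("raw", x_deg), ("scaled", x_scaled)]:
        mlp = MLPRegressor(hidden_layer_sizes=(32,), activation="tanh",
                           max_iter=5000, random_state=0).fit(X, y)
        print(name, round(mlp.score(X, y), 3))  # R^2; scaled should be near 1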
Category: Data Science

Neural Network regression negative performance

I have a problem with the performance of a multilayer perceptron regressor (neural network) and I cannot figure out why. Task: I am trying to improve a time series prediction. I have predictions of a physical parameter for the last 4 years, along with the quasi-true values. As features, I train the NN with the predictions from 7 days before to 1 day after the day I am interested in, in order to obtain a better prediction for that …
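
For concreteness, one way to build that -7 to +1 day feature window from a prediction series; a sketch with a toy array, where the names and noise level are made up:

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    preds = np.arange(20, dtype=float)      # stand-in for the daily predictions
    truth = preds + np.random.normal(0, 0.1, size=preds.shape)

    window = 9                              # days t-7 ... t+1 around day t
    X = sliding_window_view(preds, window)  # shape (len(preds) - 8, 9)
    y = truth[7:7 + len(X)]                 # align targets with day t (offset 7)
    print(X.shape, y.shape)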
Category: Data Science

Feature scaling for MLP neural network sklearn

I am working with a dataset where the features have multiple scales. Before running scikit-learn's MLP neural network I was reading around and found a variety of different opinions on feature scaling: some say you need to normalize, some say only standardize, others say that in theory nothing is needed for an MLP, some say to scale only the training data and not the test data, and the scikit-learn documentation says MLP is sensitive to feature scaling. This has left me very confused about which …
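
Those positions are more compatible than they sound: standardize, fit the scaler on the training data only, and apply the same fitted transform to the test data. A scikit-learn Pipeline does this bookkeeping automatically; a minimal sketch on toy data:

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=400, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # The scaler is fit on X_tr only; X_te is transformed with the
    # training statistics, never refit.
    pipe = make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=1000, random_state=0))
    pipe.fit(X_tr, y_tr)
    print(pipe.score(X_te, y_te))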
Category: Data Science

What does "expansion layer" mean?

Recently, I found the term "expansion layer" in the following paper: Liu, Ze, et al. "Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows." arXiv preprint arXiv:2103.14030 (2021). The term is mentioned in the context of the multilayer perceptron (MLP). I have tried to figure out its meaning on my own, but I was not able to find anything specific. I also found the term "expansion ratio" (again in the MLP context) in this paper: Wu, Haiping, et al. "CvT: Introducing Convolutions to …
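
In those papers the MLP block's first linear layer widens ("expands") the embedding dimension, typically by a ratio of 4, before the second layer projects it back down. A hedged PyTorch sketch of that reading; the class name and default ratio are mine:

    import torch.nn as nn

    class TransformerMLP(nn.Module):
        """MLP block as used in vision transformers: expand, then project back."""
        def __init__(self, dim, expansion_ratio=4):
            super().__init__()
            hidden = dim * expansion_ratio          # the "expansion layer" width
            self.net = nn.Sequential(
                nn.Linear(dim, hidden),             # expansion layer
                nn.GELU(),
                nn.Linear(hidden, dim),             # projection back to dim
            )

        def forward(self, x):
            return self.net(x)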
Category: Data Science

Linear regression with PyTorch not converging

I am trying to perform a simple linear regression using PyTorch Lightning (a network with only one neuron). The network is supposed to learn a simple function: y = -4x. My dataset has 1000 points drawn from the line y = -4x with a small amount of Gaussian noise added. I am facing a strange problem where the model only converges when the batch size is small enough and when I don't shuffle random data in …
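
For comparison, a plain-PyTorch single-neuron regression on the same kind of data that converges with ordinary settings; a sketch without the Lightning wrapper, with arbitrary learning rate and epoch count:

    import torch

    torch.manual_seed(0)
    x = torch.rand(1000, 1) * 10
    y = -4 * x + 0.1 * torch.randn(1000, 1)   # y = -4x plus Gaussian noise

    model = torch.nn.Linear(1, 1)              # one neuron: weight and bias
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for epoch in range(1000):                  # full-batch gradient descent
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()

    print(model.weight.item(), model.bias.item())  # weight should approach -4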
Category: Data Science

What are the key differences between an MLP with lagged features and an RNN?

I've been working with MLPs for a while. Whenever I assumed that the past values of a feature might be useful for predicting the future values of Y, I would just create a new column in my data frame with Feature(t-1). This process would be repeated for further lags t-2, t-3, ..., t-n. Besides the obvious problem of the curse of dimensionality, I am worried that the MLP doesn't know how to weight those time-lagged features that are now in a new …
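
For context, the lag-column construction being described, as a minimal pandas sketch; the column names and lag count are placeholders:

    import pandas as pd

    df = pd.DataFrame({"feature": range(10), "y": range(1, 11)})

    # Manually created lag columns: the MLP sees them as unrelated inputs,
    # whereas an RNN would process the same values as an ordered sequence.
    for lag in range(1, 4):
        df[f"feature_t-{lag}"] = df["feature"].shift(lag)
    df = df.dropna()          # the first rows have no full lag history
    print(df.head())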
Category: Data Science

In between CNN and MLP: neural network architecture for a "close to convolutional" problem?

I am looking to approximate an (expensive to calculate precisely) forward problem using a NN. Input and output are vectors of identical length. Although not linear, the output somewhat resembles a convolution with a kernel, but the kernel is not constant: it varies smoothly with the offset in the vector. I can only provide a limited training set, so I'm looking for a way to exploit this smoothness. Correct me if I'm wrong (I'm completely new to ML/NN), but in …
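
One option between the two extremes is a locally connected layer with position-dependent kernels. A hedged PyTorch sketch of the idea; the class name and sizes are made up, and nothing here enforces the smoothness across positions yet:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LocalVaryingConv(nn.Module):
        """Convolution-like layer whose kernel varies with position in the vector."""
        def __init__(self, length, kernel_size):
            super().__init__()
            assert kernel_size % 2 == 1
            self.k = kernel_size
            # One small kernel per output position instead of one shared kernel.
            self.weight = nn.Parameter(0.01 * torch.randn(length, kernel_size))
            self.bias = nn.Parameter(torch.zeros(length))

        def forward(self, x):                       # x: (batch, length)
            pad = self.k // 2
            xp = F.pad(x, (pad, pad))
            windows = xp.unfold(1, self.k, 1)       # (batch, length, k)
            return (windows * self.weight).sum(-1) + self.bias

    layer = LocalVaryingConv(length=64, kernel_size=5)
    print(layer(torch.randn(8, 64)).shape)          # (8, 64)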
Category: Data Science

How to improve the learning of an MLP for regression when tanh is used as the activation function with the Adam solver?

I'm trying to use an MLP to approximate a smooth function f: R^3 -> R that takes a point in space as an argument and returns a scalar value. The MLP architecture has a 3-dimensional input layer (for the 3 point coordinates), N hidden layers, and a single linear scalar output layer, since the output should be the function value. (ASCII diagram of the layer structure omitted.) …
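
For reference, a baseline of that setup in scikit-learn (standardized inputs, tanh hidden units, Adam solver); the target function here is a made-up smooth stand-in, and the layer sizes are arbitrary:

    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(2000, 3))          # points in R^3
    y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2]         # smooth stand-in for f

    model = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(64, 64), activation="tanh",
                     solver="adam", learning_rate_init=1e-3,
                     max_iter=3000, random_state=0))
    model.fit(X, y)
    print(model.score(X, y))                        # R^2 on the training set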
Category: Data Science

Why is there a marked difference in metric scores when using linear regression or an MLP as the readout for an echo state network?

I am using a reservoir computing architecture comprising an echo state network, as per the paper Reservoir Computing Approaches for Representation and Classification of Multivariate Time Series. Briefly, the architecture has four parts: a reservoir module (echo state network), a dimensionality-reduction module, a representation module, and a readout module (linear regression, SVM, or MLP). For a multivariate time series classification task that I am doing, keeping all parameters in parts 1-3 the same, when I use linear regression as the readout, I …
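
As a sanity check of the comparison itself, the two readouts can be cross-validated on identical representation vectors. A schematic sketch with stand-in features, using logistic regression as the linear readout here (the paper's actual readout may differ):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    R = rng.normal(size=(300, 50))     # stand-in for reservoir representations
    y = rng.integers(0, 3, size=300)   # stand-in class labels

    for readout in [LogisticRegression(max_iter=1000),
                    MLPClassifier(max_iter=1000, random_state=0)]:
        print(type(readout).__name__,
              cross_val_score(readout, R, y, cv=5).mean())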
Category: Data Science

Improve model accuracy in a multi-class classification problem

I use an MLP to classify three different classes A, B, C. The loss function I use is categorical cross-entropy and the optimiser is Adam. To estimate my model's performance I use 10-fold cross-validation. On average I get a 60% accuracy score, but I need it to be higher. The confusion matrix I get for classes A, B, C is the following:

               Class A   Class B   Class C
    Class A      14440      8118     11229
    Class B       6045     21863      5879
    Class C       6207      4264     23315

The amount of …
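
From that matrix the per-class precision and recall can be read off directly, which shows where the 60% is lost; a quick numpy check (assuming rows are true classes and columns are predictions):

    import numpy as np

    cm = np.array([[14440,  8118, 11229],
                   [ 6045, 21863,  5879],
                   [ 6207,  4264, 23315]])

    recall = cm.diagonal() / cm.sum(axis=1)     # fraction of each true class found
    precision = cm.diagonal() / cm.sum(axis=0)  # reliability of each prediction
    accuracy = cm.diagonal().sum() / cm.sum()
    print(recall.round(3), precision.round(3), round(accuracy, 3))  # ~0.59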
Category: Data Science

Structuring experiment/training data with months in mind

We're using a whole year's data to predict a certain target variable. The pipeline is: data -> one-hot encoding of the categorical variables -> MinMaxScaler -> PCA (to choose a subset of 2000 components out of the 15k) -> MLPRegressor. When we do a ShuffleSplit cross-validation, everything is hunky-dory (R^2 scores above 0.9 and low error rates); however, in real life they're not going to use the data in the same format (e.g. a whole year's data), but rather a …
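
If the deployment input is a shorter, later slice of time, the cross-validation should respect time order rather than shuffle across months. A sketch replacing ShuffleSplit with TimeSeriesSplit; the data is a random stand-in and the pipeline is abbreviated:

    import numpy as np
    from sklearn.model_selection import TimeSeriesSplit, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(600, 30))    # stand-in for the encoded features
    y = rng.normal(size=600)          # rows assumed ordered by date

    pipe = make_pipeline(MinMaxScaler(), PCA(n_components=10),
                         MLPRegressor(max_iter=1000, random_state=0))
    # Each fold trains on earlier months and validates on later ones.
    scores = cross_val_score(pipe, X, y, cv=TimeSeriesSplit(n_splits=5))
    print(scores)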
Category: Data Science

Testing a Binary Classifier

I have been training a binary multilayer perceptron on a database made up of roughly 3600 examples of class 0 and only 4 examples of class 1. Afterwards, I test the MLP on a test set made up of 7 examples of class 0 and 7 examples of class 1. The tiny number of 1s in my database is due to the fact that collecting data for this class is rather hard. My MLP is yielding good results; however, my question is whether I can interpret these results …
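
With only 4 positive training examples, overall accuracy can mislead; per-class precision and recall on the 7 + 7 test set are more informative. A sketch of that evaluation, where the label and prediction arrays are placeholders:

    from sklearn.metrics import classification_report, confusion_matrix

    # Placeholder test labels and predictions for a 7 + 7 test set.
    y_true = [0] * 7 + [1] * 7
    y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1]

    print(confusion_matrix(y_true, y_pred))
    print(classification_report(y_true, y_pred, digits=3))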
Category: Data Science

Is it possible that an MLP has better accuracy than a CNN?

I am working on an epilepsy classification system which consumes EEG signals and reports whether a given period contains a seizure or not. I take advantage of the Keras API for network training. I am trying a few different neural network configurations, and now I wonder: is it possible that an MLP is better than a CNN for 1D classification in some cases? My question is not only related to EEG or …
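
It certainly can happen, especially with short windows or little data; for a fair comparison the two models should see the same input windows. Schematic Keras definitions, where the window shape and layer sizes are made up:

    from tensorflow import keras

    timesteps, channels = 256, 1            # made-up EEG window shape

    mlp = keras.Sequential([
        keras.Input(shape=(timesteps, channels)),
        keras.layers.Flatten(),             # MLP ignores the temporal layout
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])

    cnn = keras.Sequential([
        keras.Input(shape=(timesteps, channels)),
        keras.layers.Conv1D(32, kernel_size=7, activation="relu"),
        keras.layers.GlobalAveragePooling1D(),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    print(mlp.count_params(), cnn.count_params())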
Category: Data Science
