I'm looking to tune the parameters for sklearn's MLP classifier, but I don't know which ones to tune or how many options to give them. An example is the learning rate: should I give it [.0001, .001, .01, .1, .2, .3]? Or is that too many, too few, etc.? I have no basis for knowing what a good range is for any of the parameters, and processing power is limited, so I can't just test the full range. If anyone has a general guide of which are the most important to tune and …
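For concreteness, here's a minimal sketch of the kind of search I have in mind (the parameter names are sklearn's MLPClassifier arguments; the candidate values are just guesses on my part):

from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import RandomizedSearchCV

# Candidate values are placeholders; I don't know if these ranges are sensible.
param_distributions = {
    "learning_rate_init": [0.0001, 0.001, 0.01, 0.1, 0.2, 0.3],
    "hidden_layer_sizes": [(50,), (100,), (50, 50)],
    "alpha": [0.0001, 0.001, 0.01],
}

# RandomizedSearchCV samples n_iter combinations instead of trying the full
# grid, which keeps the compute cost bounded.
search = RandomizedSearchCV(
    MLPClassifier(max_iter=500),
    param_distributions,
    n_iter=10,
    cv=3,
)
# search.fit(X_train, y_train)  # X_train / y_train: my data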
I'm trying to build a general predictive model of a machine. I've got a variable number of sensor inputs, and I'd like to create an MLPRegressor that can estimate outputs from the input values. I know I can create individual AIs to model each individual output (i.e., if I have 5 inputs, I can make 5 different AIs with 4 inputs each). But given that I have a large number of inputs, I was hoping for a …
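What I'm hoping for is a single multi-output model rather than N separate ones; a minimal sketch of that idea (sklearn's MLPRegressor accepts a 2-D y, so one network can predict all outputs at once; the shapes below are made up):

import numpy as np
from sklearn.neural_network import MLPRegressor

# Made-up shapes: 1000 samples, 5 sensor inputs, 5 outputs to estimate.
X = np.random.rand(1000, 5)
Y = np.random.rand(1000, 5)

# A single MLPRegressor handles a 2-D target, so one model covers all outputs.
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000)
model.fit(X, Y)
predictions = model.predict(X)  # shape (1000, 5)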
I am training a model where I ran into an odd problem: for the first 4 epochs, my loss did not change, but after that it started changing. Could this be because of a high learning rate, a local minimum, or something else, like a regularisation parameter being set too high?
I have a dataset that I divided into 10 splits of training, validation and test sets for a regression problem. I used the first split and RandomSearch in keras-tuner to arrive at the best hyperparameters for an MLP model with two hidden layers. The hyperparameters I tuned are the number of neurons in the first hidden layer, the number of neurons in the second hidden layer, and the learning rate. I loaded the 'best model' and applied this …
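A stripped-down sketch of the tuning setup (keras-tuner; the value ranges and loss are placeholders, not my exact ones):

import keras_tuner as kt
from tensorflow import keras

def build_model(hp):
    # Two hidden layers; the neuron counts and learning rate are the
    # hyperparameters being searched.
    model = keras.Sequential([
        keras.layers.Dense(hp.Int("units_1", 16, 256, step=16), activation="relu"),
        keras.layers.Dense(hp.Int("units_2", 16, 256, step=16), activation="relu"),
        keras.layers.Dense(1),
    ])
    lr = hp.Choice("learning_rate", [1e-4, 1e-3, 1e-2])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr), loss="mse")
    return model

tuner = kt.RandomSearch(build_model, objective="val_loss", max_trials=20)
# tuner.search(X_train, y_train, validation_data=(X_val, y_val), epochs=50)
# best_model = tuner.get_best_models(num_models=1)[0]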
I have a dataset in which the response variable is sick (1) or not sick (2). As for the predictors, a few are numeric (2 of 14); all the others are categorical variables with levels (for example: 1 = abdominal pain, 2 = throat pain, ...). I have two questions: 1) Can a multilayer perceptron classify a binary variable, or can it only return numerical values? 2) Can binary or leveled variables be passed as training input to the multilayer perceptron? Thank you very much.
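To show what I mean, a minimal sketch of how I imagine feeding such data to the MLP (column names are hypothetical; one-hot encoding is my guess at handling the level variables):

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

# Hypothetical column names: 2 numeric columns, the rest categorical levels.
numeric_cols = ["age", "temperature"]
categorical_cols = ["symptom_1", "symptom_2"]

preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    # One-hot encoding turns level codes (1 = abdominal pain, 2 = throat pain, ...)
    # into indicator columns, so the MLP never treats the codes as magnitudes.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

clf = Pipeline([("prep", preprocess), ("mlp", MLPClassifier(max_iter=500))])
# clf.fit(X_train, y_train)  # y can be the 1/2 sick/not-sick labels directly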
I have been coding my own multilayer perceptron in MATLAB, and it compiles without error. My training data features, x, have values from 1 to 360, and the training data output, y, has the value of $\sin(x)$. The problem is that my MLP only decreases the cost for the first few iterations and then gets stuck at 0.5. I have tried including momentum, but it does not help, and increasing the layers or increasing the neurons does not help at …
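My implementation is MATLAB, but here is a minimal Python sketch of the data setup, in case the unscaled input range is the culprit (this assumes x is in degrees):

import numpy as np

x = np.arange(1, 361, dtype=float)  # degrees, as in my training data
y = np.sin(np.deg2rad(x))           # target values

# Scaling the inputs to roughly [-1, 1] before training; without this,
# sigmoid/tanh units saturate for inputs like 300 and the gradients vanish.
x_scaled = (x - x.mean()) / x.std()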
I have a problem with the performance of a multilayer perceptron regressor (neural network), and I cannot figure out why. Task: I am trying to improve a time series prediction. I have predictions of a physical parameter from the last 4 years along with the quasi-true values. I train the NN with the predictions from -7 days until +1 day around the day I am interested in as features, in order to obtain a better prediction for that …
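To make the setup concrete, here is roughly how the feature window is built (pandas sketch; the column names and data are placeholders):

import pandas as pd

# df has one row per day: the existing prediction and the quasi-true value.
df = pd.DataFrame({"prediction": predictions, "truth": truths})  # placeholders

# Features: predictions from 7 days before up to 1 day after the target day.
for lag in range(-1, 8):  # -1 = one day ahead, 7 = seven days back
    df[f"pred_t{-lag:+d}"] = df["prediction"].shift(lag)

df = df.dropna()
X = df.filter(like="pred_t")  # the -7 ... +1 day window
y = df["truth"]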
I am working with a dataset where the features have multiple scales. Before running scikit-learn's MLP neural network, I read around and found a variety of opinions on feature scaling: some say you need to normalize, some say to only standardize, others say that in theory nothing is needed for an MLP, some say to scale only the training data and not the test data, and the scikit-learn documentation says the MLP is sensitive to feature scaling. This has left me very confused about which …
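Here's the pattern I've ended up with after reading around, which I'd like to confirm is the correct one (standardize, fit the scaler on the training split only, then apply the same transform to the test split):

from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

# X_train / X_test / y_train / y_test: my existing splits.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit statistics on train only
X_test_scaled = scaler.transform(X_test)        # reuse train statistics on test

mlp = MLPClassifier(max_iter=500)
mlp.fit(X_train_scaled, y_train)
score = mlp.score(X_test_scaled, y_test)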
Recently, I came across the term "expansion layer" in the following paper: Liu, Ze, et al. "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows." arXiv preprint arXiv:2103.14030 (2021). The term is used in the context of the multilayer perceptron (MLP). I have tried to figure out its meaning on my own but could not find anything specific. I also found the term "expansion ratio" (again in the MLP context) in this paper: Wu, Haiping, et al. "Cvt: Introducing convolutions to …
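My tentative reading, which I'd like to have confirmed: the "expansion layer" is the first linear layer of the transformer MLP block, which widens the hidden dimension by the "expansion ratio" (4 in Swin) before projecting back down. A PyTorch sketch of that reading:

import torch.nn as nn

class MLPBlock(nn.Module):
    """Transformer-style MLP: expand by `ratio`, apply nonlinearity, project back."""
    def __init__(self, dim: int, ratio: int = 4):
        super().__init__()
        self.expand = nn.Linear(dim, dim * ratio)   # the "expansion layer"?
        self.act = nn.GELU()
        self.project = nn.Linear(dim * ratio, dim)  # back to the original width

    def forward(self, x):
        return self.project(self.act(self.expand(x)))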
I am trying to perform a simple linear regression using PyTorch Lightning (a network with only one neuron). The network is supposed to learn the simple function y = -4x. My dataset has 1000 points sampled from the line y = -4x with a small amount of Gaussian noise. I am facing a strange problem where the model only converges when the batch size is small enough and when I don't shuffle random data in …
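A stripped-down sketch of my setup (names and hyperparameters are simplified placeholders, but the structure is the same):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class LinearRegressor(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(1, 1)  # a single neuron

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)

# 1000 noisy points from the line y = -4x.
x = torch.rand(1000, 1)
y = -4 * x + 0.1 * torch.randn(1000, 1)
loader = DataLoader(TensorDataset(x, y), batch_size=4, shuffle=False)
# pl.Trainer(max_epochs=20).fit(LinearRegressor(), loader)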
I've been working with MLPs for a while. Whenever I assumed that the past values of a feature might be useful for predicting the future values of Y, I would just create a new column in my data frame with Feature(t-1). This process would be repeated for further lags t-2, t-3, ..., t-n. Besides the obvious problem of the curse of dimensionality, I am worried that the MLP doesn't know how to weight those time-lagged features that are now in a new …
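Concretely, the lag construction looks like this (pandas sketch; "feature" is a placeholder column name):

import pandas as pd

def add_lags(df: pd.DataFrame, column: str, n_lags: int) -> pd.DataFrame:
    """Append column(t-1) ... column(t-n) as new feature columns."""
    out = df.copy()
    for lag in range(1, n_lags + 1):
        out[f"{column}_t-{lag}"] = out[column].shift(lag)
    return out.dropna()  # the first n_lags rows have no full history

# df_lagged = add_lags(df, "feature", n_lags=3)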
I am looking to approximate a forward problem (expensive to calculate precisely) using a NN. Input and output are vectors of identical length. Although the problem is not linear, the output somewhat resembles a convolution with a kernel, except that the kernel is not constant but varies smoothly along the offset in the vector. I can only provide a limited training set, so I'm looking for a way to exploit this smoothness. Correct me if I'm wrong (I'm completely new to ML/NN), but in …
I am building MLP models for forecasting time-series data. I am new to the field of machine learning, and I have read about detrending and normalisation. Which method (normalisation or detrending) is suitable for a time-series dataset intended for building MLP models?
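For reference, a minimal sketch of the two operations as I understand them (pandas/sklearn; the series name and split point are placeholders), since part of my question is whether to apply one, the other, or both:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

series = df["target"]  # placeholder: my raw time series

# Detrending by first-order differencing: model the change, not the level.
detrended = series.diff().dropna()

# Normalisation: rescale to [0, 1]; fit the scaler on the training part only.
scaler = MinMaxScaler()
train = detrended.iloc[:800]  # placeholder split
normalised = scaler.fit_transform(train.to_frame())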
I'm trying to use an MLP to approximate a smooth function f : R^3 -> R that takes a point in space as an argument and returns a scalar value. The MLP architecture has a 3-dimensional input layer (for the 3 point coordinates), N hidden layers, and a single linear scalar output layer, since the output should be the function value: [diagram: 3 input nodes, N fully connected hidden layers, a single output node]
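In code, the architecture is essentially this (a PyTorch sketch; the width, depth and activation are placeholders):

import torch.nn as nn

hidden = 64  # placeholder width
model = nn.Sequential(
    nn.Linear(3, hidden),    # 3 inputs: the point coordinates
    nn.Tanh(),
    nn.Linear(hidden, hidden),
    nn.Tanh(),               # ... N hidden layers in total
    nn.Linear(hidden, 1),    # single linear scalar output: f(x, y, z)
)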
I have a simple NN that I made with scikit-learn. I have extracted the weights sent to each node and the bias assigned to each activation function, but I can't see a way to get the output of the activation function in scikit-learn. Does anyone have any ideas? Thank you!
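What I'm after is something like the following manual forward pass, using the extracted weights and biases (coefs_ and intercepts_ are the sklearn attributes; this assumes relu hidden activations, the MLP default):

import numpy as np

def hidden_activations(mlp, X):
    """Recompute each hidden layer's post-activation output for input X."""
    activations = []
    a = np.asarray(X)
    # coefs_[i] / intercepts_[i] are the weights/biases feeding layer i+1;
    # the last pair belongs to the output layer, so it is skipped here.
    for W, b in zip(mlp.coefs_[:-1], mlp.intercepts_[:-1]):
        a = np.maximum(a @ W + b, 0)  # assumes activation='relu'
        activations.append(a)
    return activations

# acts = hidden_activations(trained_mlp, X_test)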
I am using a reservoir computing architecture comprising an echo state network, as per the paper "Reservoir Computing Approaches for Representation and Classification of Multivariate Time Series". Briefly, the architecture has four parts: a reservoir module (echo state network), a dimensionality reduction module, a representation module, and a readout module (linear regression, SVM or MLP). For a multivariate time series classification task that I am doing, keeping all parameters the same in parts 1-3 above, when I use linear regression as the readout, I …
I use an MLP to classify three different classes A, B, C. The loss function I use is categorical cross-entropy, and the optimiser is Adam. To estimate my model's performance I use 10-fold cross-validation. On average I get a 60% accuracy score, but I need it to be higher. The confusion matrix I get for the classes A, B, C is the following:

           Class A   Class B   Class C
Class A     14440      8118     11229
Class B      6045     21863      5879
Class C      6207      4264     23315

The amount of …
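For reference, reading the matrix back in numpy (assuming rows are the actual classes) reproduces the ~60% figure and gives the per-class recalls:

import numpy as np

cm = np.array([
    [14440,  8118, 11229],   # actual A
    [ 6045, 21863,  5879],   # actual B
    [ 6207,  4264, 23315],   # actual C
])

accuracy = cm.trace() / cm.sum()                  # ~0.59, matching the CV score
per_class_recall = cm.diagonal() / cm.sum(axis=1) # A is by far the weakest
print(accuracy, per_class_recall)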
We're using a whole year's data to predict a certain target variable. The model works like this: data -> one-hot encoding of the categorical variables -> MinMaxScaler -> PCA (to choose a subset of 2000 components out of the 15k) -> MLPRegressor. When we do a ShuffleSplit cross-validation, everything is hunky-dory (r^2 scores above 0.9 and low error rates). However, in real life they're not going to use the data in the same format (e.g. a whole year's data), but rather a …
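The pipeline, roughly, as an sklearn sketch (the column lists are placeholders for our actual ones):

from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

# categorical_cols / numeric_cols: placeholders for our actual column lists.
preprocess = ColumnTransformer([
    # Dense output so PCA can consume it (sparse=False on older sklearn).
    ("cat", OneHotEncoder(handle_unknown="ignore", sparse_output=False), categorical_cols),
    ("num", MinMaxScaler(), numeric_cols),
])

model = Pipeline([
    ("prep", preprocess),
    ("pca", PCA(n_components=2000)),  # 2000 of the ~15k one-hot features
    ("mlp", MLPRegressor()),
])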
I have been training a binary multilayer perceptron on a dataset made up of roughly 3600 samples with value 0 and 4 samples with value 1. Afterwards, I test the MLP on a test set made up of 7 samples of 0 and 7 samples of 1. The small number of 1s in my dataset is due to the fact that collecting data for this class is rather hard. My MLP is yielding good results; however, my question is whether I can interpret these results …
I am working on an epilepsy classification system which consumes EEG signals and, as its result, says whether a certain period contains a seizure or not. I am taking advantage of the Keras API for network training. I am trying a few different neural network configurations, and now I wonder: is it possible that an MLP is better than a CNN for 1D classification in some cases? My question is not only related to EEG or …