Is this XGBoost model tending to overfit?

Here is the list of hyperparameters that I used:

    params = {
        'scale_pos_weight': [1.0],
        'eta': [0.05, 0.1, 0.15, 0.9, 1.0],
        'max_depth': [1, 2, 6, 10, 15, 20],
        'gamma': [0.0, 0.4, 0.5, 0.7]
    }

The dataset is imbalanced, so I used the scale_pos_weight parameter. After 5-fold cross-validation, the F1 score that I got is 0.530726530426833.
Category: Data Science
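
One way to check whether the model above is overfitting is to compare training and validation F1 for each parameter combination rather than looking only at the cross-validated score. A minimal sketch, assuming an XGBClassifier and a synthetic imbalanced dataset as stand-ins for the real setup (the parameter values below are placeholders):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from xgboost import XGBClassifier

    # Placeholder imbalanced data; substitute your own X, y.
    X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=0)

    params = {
        "scale_pos_weight": [1.0, 9.0],       # ~negatives/positives ratio is a common starting point
        "learning_rate": [0.05, 0.1, 0.15],   # sklearn alias for eta
        "max_depth": [2, 6, 10],
        "gamma": [0.0, 0.4],
    }

    search = GridSearchCV(
        XGBClassifier(n_estimators=200),
        params,
        scoring="f1",
        cv=5,
        return_train_score=True,   # needed to compare train vs. validation F1
    )
    search.fit(X, y)

    res = search.cv_results_
    for train, test, p in zip(res["mean_train_score"], res["mean_test_score"], res["params"]):
        print(f"train F1={train:.3f}  val F1={test:.3f}  gap={train - test:.3f}  {p}")

A large gap between the two (training F1 near 1.0, typically at max_depth=15-20 with eta=0.9-1.0) is the usual sign of overfitting; a similar ~0.53 on both sides would point more toward underfitting or class imbalance than toward overfitting.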

Hyper-parameter tuning of NaiveBayes Classifier

I'm fairly new to machine learning and I'm aware of the concept of hyper-parameter tuning of classifiers, and I've come across a couple of examples of this technique. However, I'm trying to use the NaiveBayes Classifier of sklearn for a task, but I'm not sure which parameter values I should try. What I want is something like this, but for the GaussianNB() classifier rather than SVM:

    from sklearn.model_selection import GridSearchCV
    C = [0.05, 0.1, 0.2, 0.3, 0.25, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
    gamma = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    kernel = ['rbf', 'linear']
    hyper = {'kernel': kernel, 'C': C, 'gamma': gamma}
    gd = GridSearchCV(estimator=svm.SVC(), param_grid=hyper, verbose=True)
    gd.fit(X, Y)
    print(gd.best_score_)
    print(gd.best_estimator_)

…
Category: Data Science
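
For the question above: GaussianNB has essentially one tunable parameter, var_smoothing, so the equivalent grid search is much smaller than the SVM one. A minimal sketch, using make_classification as placeholder data:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.naive_bayes import GaussianNB

    X, Y = make_classification(n_samples=1000, n_features=20, random_state=0)  # placeholder data

    # var_smoothing is a fraction of the largest feature variance added to all
    # variances for numerical stability; it is usually searched on a log scale.
    hyper = {"var_smoothing": np.logspace(-11, 0, 12)}

    gd = GridSearchCV(estimator=GaussianNB(), param_grid=hyper, cv=5, verbose=True)
    gd.fit(X, Y)
    print(gd.best_score_)
    print(gd.best_estimator_)

For the other Naive Bayes variants (MultinomialNB, BernoulliNB, ComplementNB) the analogous knob is the alpha smoothing parameter.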

CNN for subsets of a dataset - how to tune hyperparameters

I have a dataset and would like to train CNNs on subsets of the dataset of different sizes. I already have a CNN which classifies very well when I use the entire dataset. The question now is whether I should additionally optimize the hyperparameters of the CNN for each subset, regardless of whether I use data augmentation or not. Does it really make sense to change the CNN model for the subsets by using …
Category: Data Science
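
One way to answer the question above empirically is to train the same, already-tuned CNN on nested subsets of increasing size and watch how validation accuracy changes; only if small subsets degrade sharply is per-subset re-tuning (or heavier augmentation) likely to pay off. A hedged sketch using MNIST and a deliberately small CNN as stand-ins for the real dataset and model:

    import tensorflow as tf

    # MNIST as a stand-in dataset; its test split serves as a fixed validation set here.
    (x_train, y_train), (x_val, y_val) = tf.keras.datasets.mnist.load_data()
    x_train, x_val = x_train[..., None] / 255.0, x_val[..., None] / 255.0

    def build_cnn():
        # Placeholder architecture; substitute the CNN you already tuned.
        return tf.keras.Sequential([
            tf.keras.Input(shape=(28, 28, 1)),
            tf.keras.layers.Conv2D(16, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(10, activation="softmax"),
        ])

    for n in [1000, 5000, 20000, len(x_train)]:   # subset sizes to compare
        model = build_cnn()
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
        model.fit(x_train[:n], y_train[:n], epochs=3, batch_size=64, verbose=0)
        _, acc = model.evaluate(x_val, y_val, verbose=0)
        print(f"subset size {n}: val accuracy {acc:.3f}")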

Why does hyperparameter tuning occur on validation dataset and not at the very beginning?

Despite doing/using it a few times, I'm still slightly confused by the use of a validation set for hyperparameter tuning. As far as I can tell, I choose a model, train it on the training data, assess performance on the training data, then do hyperparameter tuning by assessing model performance on the validation data, then choose the best model and test it on the test data. In order to do this, I basically need to pick a model at random for the training data. …
Category: Data Science
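
The workflow in the question above becomes clearer with a concrete train/validation/test split: every candidate is fit on the training set only, candidates are compared on the validation set, and the single winner is scored once on the test set. A minimal sketch with logistic regression on placeholder data:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, random_state=0)   # placeholder data

    # 60% train, 20% validation, 20% test.
    X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

    best_C, best_val = None, -1.0
    for C in [0.01, 0.1, 1.0, 10.0]:                     # hyperparameter candidates
        model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
        val_score = model.score(X_val, y_val)            # tuning looks only at validation data
        if val_score > best_val:
            best_C, best_val = C, val_score

    final = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
    print("chosen C:", best_C, "test accuracy:", final.score(X_test, y_test))  # test set used once

The "model picked at random" at the start is just the first candidate; the validation set exists so that this choice can be revised without ever touching the test set.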

MLP classifier Gridsearch CV parameters to tune?

I'm looking to tune the parameters for sklearn's MLP classifier, but I don't know which ones to tune or how many options to give them. An example is the learning rate: should I give it [0.0001, 0.001, 0.01, 0.1, 0.2, 0.3], or is that too many, too few, etc.? I have no basis for knowing what a good range is for any of the parameters. Processing power is limited, so I can't just test the full range. If anyone has a general guide of which are the most important to tune and …
Category: Data Science
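
As a rough starting point for the question above, the MLPClassifier parameters that usually matter most are hidden_layer_sizes, alpha (the L2 penalty) and learning_rate_init, each searched over a handful of log-spaced values. The grid below is only an illustrative sketch on placeholder data, not a recommendation of "correct" ranges:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)  # placeholder data

    pipe = make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0))

    # Coarse, log-spaced grid: three values per parameter keeps the search affordable.
    param_grid = {
        "mlpclassifier__hidden_layer_sizes": [(32,), (64,), (64, 32)],
        "mlpclassifier__alpha": [1e-4, 1e-3, 1e-2],
        "mlpclassifier__learning_rate_init": [1e-3, 1e-2, 1e-1],
    }

    gd = GridSearchCV(pipe, param_grid, cv=3, n_jobs=-1)
    gd.fit(X, y)
    print(gd.best_params_, gd.best_score_)

If the best values land on the edge of a range (e.g. alpha=1e-2), extend the grid in that direction rather than making the initial grid finer.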

How are parameters selected in cross-validation?

Suppose I'm training a linear regression model using k-fold cross-validation. I train K times, each time with a different training and test data set. So each time I train, I get different parameters (feature coefficients in the linear regression case), which means I will have K sets of parameters at the end of cross-validation. How do I arrive at the final parameters for my model? And if I'm using it to tune hyperparameters as well, do I have to do another cross-validation after fixing …
Category: Data Science
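
Concretely, the K coefficient sets from cross-validation are only there to estimate generalization performance; the final model is refit once on all of the training data, and a second (nested) cross-validation is only needed when hyperparameters are also tuned and an unbiased performance estimate is wanted. A minimal sketch with ridge regression on placeholder data:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)  # placeholder data

    # K-fold CV fits K temporary models purely to estimate performance;
    # their coefficients are discarded afterwards.
    scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5, scoring="r2")
    print("estimated R^2: %.3f +/- %.3f" % (scores.mean(), scores.std()))

    # The final parameters come from a single fit on all of the training data.
    final_model = Ridge(alpha=1.0).fit(X, y)
    print("final coefficients:", final_model.coef_)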

How to suppress "Estimator fit failed. The score on this train-test" warning message?

I am working on hyper-tuning a random forest classifier with the following parameters in a random search CV:

    # defining model
    Model = RandomForestClassifier(random_state=1)

    # Parameter grid to pass in RandomizedSearchCV
    param_grid = {
        "n_estimators": [200, 250, 300],
        "min_samples_leaf": np.arange(1, 4),
        "max_features": [np.arange(0.3, 0.6, 0.1), 'sqrt'],
        "max_samples": np.arange(0.4, 0.7, 0.1)
    }

    # Calling RandomizedSearchCV
    randomized_cv = RandomizedSearchCV(estimator=Model,
                                       param_distributions=param_grid,
                                       n_iter=10,
                                       n_jobs=-1,
                                       scoring=metrics.make_scorer(metrics.recall_score))

    # Fitting parameters in RandomizedSearchCV
    randomized_cv.fit(X_train, y_train)
    print("Best parameters are {} with CV score={}:".format(randomized_cv.best_params_, randomized_cv.best_score_))

…
Category: Data Science
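
Two things seem to be going on in the search above. The warning itself can be silenced (sklearn raises FitFailedWarning), but the likely root cause, reading the grid as written, is that "max_features" mixes a whole NumPy array with the string 'sqrt', so some sampled candidates are arrays, which RandomForestClassifier rejects. A hedged sketch of both the fix and the filter, on placeholder data:

    import warnings
    import numpy as np
    from sklearn import metrics
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.exceptions import FitFailedWarning
    from sklearn.model_selection import RandomizedSearchCV

    X_train, y_train = make_classification(n_samples=500, random_state=1)  # placeholder data

    model = RandomForestClassifier(random_state=1)

    # Fix: flatten the arange into individual float candidates instead of passing the array itself.
    param_grid = {
        "n_estimators": [200, 250, 300],
        "min_samples_leaf": np.arange(1, 4),
        "max_features": list(np.arange(0.3, 0.6, 0.1)) + ["sqrt"],
        "max_samples": np.arange(0.4, 0.7, 0.1),
    }

    # If some fits may still fail, this hides just that warning. Note that with
    # n_jobs=-1 the filter may not reach worker processes, so fixing the grid is
    # the more reliable route; error_score=np.nan keeps the search running anyway.
    warnings.filterwarnings("ignore", category=FitFailedWarning)

    randomized_cv = RandomizedSearchCV(
        estimator=model,
        param_distributions=param_grid,
        n_iter=10,
        n_jobs=-1,
        scoring=metrics.make_scorer(metrics.recall_score),
        error_score=np.nan,
        random_state=1,
    )
    randomized_cv.fit(X_train, y_train)
    print("Best parameters are {} with CV score={}".format(randomized_cv.best_params_,
                                                           randomized_cv.best_score_))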

binary classification pipeline to select threshold

There are quite a few questions regarding the optimisation of binary threshold in a classification problem. However, I haven't found a single end-to-end solution to this problem. In an existing project, I have come up with the following pipeline to train a binary classifier:

- Outer-CV due to small to moderate data size.
- Inner-CV to tune hyperparameters.
- Train model with tuned hyperparameters on the outer-CV train set.
- Predict on the outer-CV test set.
- Find optimal threshold using prediction probabilities.
- Get score converting prediction …
Category: Data Science
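
A compact sketch of one outer fold of such a pipeline, on placeholder data: hyperparameters are tuned with an inner CV, the decision threshold is chosen from out-of-fold predicted probabilities on the outer-train portion (so the outer-test data is never touched until the final score), and only then is the thresholded prediction evaluated. In a full run this would be wrapped in a loop over outer folds.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score
    from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                         cross_val_predict, train_test_split)

    X, y = make_classification(n_samples=1000, weights=[0.8, 0.2], random_state=0)  # placeholder data

    # One "outer" split; a real nested CV loops over several of these.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)

    # Inner CV: tune hyperparameters on the outer-train portion only.
    inner = GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]},
                         cv=StratifiedKFold(5), scoring="f1")
    inner.fit(X_tr, y_tr)
    best = inner.best_estimator_

    # Out-of-fold probabilities on the outer-train portion, used only to pick a threshold.
    oof_proba = cross_val_predict(best, X_tr, y_tr, cv=StratifiedKFold(5),
                                  method="predict_proba")[:, 1]
    thresholds = np.linspace(0.05, 0.95, 19)
    best_t = max(thresholds, key=lambda t: f1_score(y_tr, oof_proba >= t))

    # Final evaluation on the untouched outer-test portion.
    test_proba = best.predict_proba(X_te)[:, 1]
    print("chosen threshold:", best_t, "outer-test F1:", f1_score(y_te, test_proba >= best_t))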

How to improve regression neural network?

I am new to deep learning and data science and am trying to increase my knowledge by working on some hackathons. Currently, the hackathon project I am working on has the task of predicting the closing price of a cryptocurrency based on 48 parameters, with ~1200 records. So far I have been able to achieve reasonable accuracy from the model, but my score is still very low. I have tried many things from my knowledge, but nothing seems to be affecting the …
Category: Data Science
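
With only ~1200 rows and 48 features, the usual levers are feature scaling, a fairly small network, regularization (dropout or L2) and early stopping, rather than a larger model. A hedged baseline sketch in Keras, with synthetic data standing in for the hackathon dataset:

    import tensorflow as tf
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    # Stand-in for the ~1200 x 48 tabular dataset.
    X, y = make_regression(n_samples=1200, n_features=48, noise=10.0, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

    scaler = StandardScaler().fit(X_tr)          # scaling matters a lot for small tabular networks
    X_tr, X_val = scaler.transform(X_tr), scaler.transform(X_val)

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(48,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mae")

    early = tf.keras.callbacks.EarlyStopping(patience=20, restore_best_weights=True)
    hist = model.fit(X_tr, y_tr, validation_data=(X_val, y_val), epochs=500,
                     batch_size=32, callbacks=[early], verbose=0)
    print("best validation MAE:", min(hist.history["val_loss"]))

On tabular data of this size it is also worth comparing against a gradient-boosting baseline before investing more effort in the network.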

How to choose max layers and units to search over in hyper parameter tuning

When performing any hyperparameter tuning, let's say random search for simplicity, where I want to search from a minimum to a maximum number of units/nodes in a layer, and from a minimum to a maximum number of layers, are there rules to guide what a "large enough" number for my search is? Currently all I know is "that should be good enough/large enough, let's search in there". I could be not searching a large enough space, or searching a space that's far too large …
Category: Data Science
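
A common rule of thumb is to search depth linearly over a small range (roughly 1-4 layers) and width on a log scale (powers of two), then widen the space only if the best configurations keep landing on its boundary. A minimal random-search sketch over such a space, using sklearn's MLPClassifier on placeholder data:

    import random
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)  # placeholder data
    rng = random.Random(0)

    layer_choices = [1, 2, 3, 4]                  # linear range for depth
    unit_choices = [16, 32, 64, 128, 256]         # log-spaced (powers of two) for width

    results = []
    for _ in range(10):                           # random-search budget
        sizes = tuple([rng.choice(unit_choices)] * rng.choice(layer_choices))
        model = MLPClassifier(hidden_layer_sizes=sizes, max_iter=500, random_state=0)
        score = cross_val_score(model, X, y, cv=3).mean()
        results.append((score, sizes))
        print(sizes, round(score, 3))

    best_score, best_sizes = max(results)
    print("best:", best_sizes, best_score)
    # If best_sizes keeps hitting the largest depth/width searched, enlarge the space and repeat.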

RandomizedSearchCV doesn't stop running

I'm trying to optimize the hyperparameters of my model using RandomizedSearchCV. However, it doesn't stop running even if I define only a few iterations. Could someone help me? The code I'm using is presented below:

    def build_classifier(optimizer, units, alpha, l1):
        model = tf.keras.Sequential()
        model.add(tf.keras.layers.LSTM(units, kernel_regularizer=regularizers.l1(l1=l1),
                                       input_shape=(None, n_features), return_sequences=True))
        model.add(tf.keras.layers.LSTM(units, kernel_regularizer=regularizers.l1(l1=l1),
                                       return_sequences=True))
        model.add(tf.keras.layers.LSTM(units, kernel_regularizer=regularizers.l1(l1=l1),
                                       return_sequences=False))
        model.add(tf.keras.layers.Dense(5))
        model.compile(optimizer=optimizer, loss='mae')
        return model

…
Category: Data Science
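
With a Keras model inside RandomizedSearchCV, the search is usually not hung, just slow: it trains n_iter * cv separate networks, each for the full number of epochs. A small budget-check sketch before launching the real search (the numbers are placeholders; with TensorFlow it is also safer to keep n_jobs=1 and set verbose on the search so that every fit is logged):

    # Rough cost estimate before calling RandomizedSearchCV(...).fit(...).
    n_iter = 10              # parameter settings sampled by RandomizedSearchCV
    cv = 3                   # folds per setting
    epochs = 50              # epochs per individual Keras fit
    seconds_per_epoch = 20   # measure this with one manual model.fit() on your data

    total_fits = n_iter * cv
    est_hours = total_fits * epochs * seconds_per_epoch / 3600
    print(f"{total_fits} LSTM trainings, roughly {est_hours:.1f} hours")

    # To make the search tractable and visibly progressing:
    #   - pass verbose=2 to RandomizedSearchCV so every candidate/fold is printed,
    #   - keep n_jobs=1 (TensorFlow and process-based parallelism often conflict),
    #   - search with fewer epochs plus an EarlyStopping callback, and retrain the
    #     best configuration with the full epoch budget afterwards.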

Benefits of using Deep Learning-specific hyperparameter optimization tools vs. sklearn?

There are quite a few libraries for hyperparameter optimization that are specific to Keras or other deep learning libraries, like Hyperas or Talos. My question is: what's the main benefit of using these libraries compared to, for example, sklearn.model_selection.GridSearchCV() or sklearn.model_selection.RandomizedSearchCV()?
Category: Data Science
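
The practical differences are early termination of bad trials (pruning), samplers smarter than grid/random, and conditional search spaces (e.g. "units in layer i" only exists if layer i exists), which GridSearchCV's flat parameter grids cannot express. A minimal sketch of such a conditional space with Optuna, using an sklearn MLP so it stays small (the ranges are placeholders):

    import optuna
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=800, n_features=20, random_state=0)  # placeholder data

    def objective(trial):
        # Conditional space: how many width parameters exist depends on n_layers.
        n_layers = trial.suggest_int("n_layers", 1, 3)
        sizes = tuple(trial.suggest_int(f"units_l{i}", 16, 128, log=True) for i in range(n_layers))
        alpha = trial.suggest_float("alpha", 1e-5, 1e-1, log=True)
        model = MLPClassifier(hidden_layer_sizes=sizes, alpha=alpha, max_iter=500, random_state=0)
        return cross_val_score(model, X, y, cv=3).mean()

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=20)
    print(study.best_params, study.best_value)

For plain, unconditional grids over sklearn estimators, GridSearchCV and RandomizedSearchCV remain perfectly adequate.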

When using optuna I should return accuracy or loss as objective value?

I am using Optuna for hyperparameter tuning of my segmentation model. In the objective function I am returning accuracy as the objective value, since I realised that Optuna tries to find the best result based on that value. I tried the same with returning (1 - loss), but I am not sure whether to go with loss or accuracy when tuning. Also, for loss, is there another way than (1 - loss) to optimize, or to tune based on the loss curve?
Category: Data Science
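
Either quantity works as the objective, as long as the study direction matches: return the validation loss with direction="minimize", or the accuracy (or Dice/IoU for segmentation) with direction="maximize"; the (1 - loss) trick is unnecessary. A minimal hedged sketch with a plain sklearn model standing in for the segmentation network:

    import optuna
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import log_loss
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)          # placeholder data
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

    def objective(trial):
        C = trial.suggest_float("C", 1e-3, 1e2, log=True)
        model = LogisticRegression(C=C, max_iter=1000).fit(X_tr, y_tr)
        return log_loss(y_val, model.predict_proba(X_val))   # return the loss itself ...

    # ... and tell Optuna to minimize it.
    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=25)
    print(study.best_params, study.best_value)

    # Equivalently, return validation accuracy (or Dice/IoU) and use direction="maximize".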

My own model trained on the full data is better than the best_estimator I get from GridSearchCV with refit=True?

I am using an XGBoost model to classify some data. I have CV splits (train, val) and a separate test set that I never use until the end. I used GridSearchCV to determine the best parameters, fed my CV splits (5 folds) into it, and set refit=True so that once it figures out the best hyperparameters it trains on the full data (all folds, as opposed to just 4/5 folds) and returns the best_estimator. I then …
Category: Data Science
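
When the seed, parameters and training data really are identical, the refit best_estimator_ and a manually trained model should score the same on the test set; differences usually come from an unset random_state, training on different data (e.g. train+val+test instead of the CV folds only), or early-stopping settings. A small sketch of the comparison, using RandomForestClassifier as a stand-in for the XGBoost model:

    from sklearn.base import clone
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV, train_test_split

    X, y = make_classification(n_samples=1500, random_state=0)                  # placeholder data
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)   # held-out test set

    search = GridSearchCV(
        RandomForestClassifier(random_state=0),      # fixed seed so the refit is reproducible
        {"max_depth": [3, 6, None], "n_estimators": [100, 200]},
        cv=5,
        refit=True,                                  # refits the best params on all of X_train
    )
    search.fit(X_train, y_train)

    # "My own model": same params, same seed, same training data.
    own = clone(search.estimator).set_params(**search.best_params_).fit(X_train, y_train)

    print("best_estimator_ test score:", search.best_estimator_.score(X_test, y_test))
    print("own model test score:      ", own.score(X_test, y_test))   # identical in this setup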

Is Loss value (e.g., MSE loss) used in the calculation for parameter update when doing gradient descent?

My question is really simple. I know the theory behind gradient descent and parameter updates; what I haven't found clarity on is whether the loss value (e.g., the MSE value) is used, i.e., multiplied in at the start when we do backpropagation for gradient descent (e.g., multiplying the MSE loss value by 1 and then doing backprop, since at the start of backprop we start with the value 1, i.e., the derivative of x w.r.t. x is 1). If the loss value isn't used …
Category: Data Science
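
Short answer: no, the scalar loss value does not multiply the gradient. The update uses the derivative of the loss with respect to the parameters, and backpropagation seeds the chain rule with dL/dL = 1, not with the loss value. A tiny NumPy sketch of gradient descent on MSE makes this visible: the loss is computed only for monitoring, while the update is built entirely from derivatives.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))                  # toy data
    true_w = np.array([1.0, -2.0, 0.5])
    y = X @ true_w + rng.normal(scale=0.1, size=100)

    w = np.zeros(3)
    n = len(y)

    for step in range(200):
        residual = X @ w - y
        loss = np.mean(residual ** 2)              # MSE: a scalar, used only for monitoring
        grad = (2.0 / n) * X.T @ residual          # dLoss/dw: the loss value itself never appears
        w -= 0.1 * grad                            # update uses the gradient, not loss * gradient

    print("final loss:", loss, "weights:", w)

    # Backprop view: the chain starts from dLoss/dLoss = 1 (not from the loss value),
    # and every subsequent factor is a derivative, e.g. dLoss/dresidual = 2 * residual / n.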

Efficient Searching for a basis of information as a hyperparameter in a large possible hyperparameter space

I have a set of inputs, let's call them 'I', that can be fed through a complicated group of functions to produce/calculate a wide variety of outputs (let's call them 'O'). I want to find a subset of outputs (let's call them 'O-prime') within 'O' that contain sufficient information to form a basis in order to find/reconstruct a point in the 'I'-space accurately. In other words I want to pick 'O-prime' such that I am able to uniquely identify any …
Category: Data Science
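
One practical (if greedy) way to attack this is forward selection: repeatedly add the output that most improves reconstruction of the inputs by some regression model, and stop once the gain flattens. A hedged sketch on synthetic input/output pairs; the function group, the Ridge regressor and the stopping rule are all placeholders for the real problem:

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    I = rng.normal(size=(500, 4))                        # inputs: the space we want to recover

    # Placeholder "complicated group of functions" producing candidate outputs O.
    O = np.column_stack([
        I[:, 0] + I[:, 1], I[:, 0] * I[:, 2], np.sin(I[:, 3]),
        I[:, 1] - I[:, 2], I[:, 0] ** 2, rng.normal(size=500),   # last column is pure noise
    ])

    def recon_score(cols):
        # Mean CV R^2 of reconstructing every input dimension from the chosen outputs.
        if not cols:
            return -np.inf
        return np.mean([cross_val_score(Ridge(), O[:, cols], I[:, j], cv=5).mean()
                        for j in range(I.shape[1])])

    selected, remaining = [], list(range(O.shape[1]))
    while remaining:
        best_col = max(remaining, key=lambda c: recon_score(selected + [c]))
        if selected and recon_score(selected + [best_col]) - recon_score(selected) < 0.01:
            break                                        # an extra output adds almost nothing
        selected.append(best_col)
        remaining.remove(best_col)
        print("selected outputs:", selected, "score:", round(recon_score(selected), 3))

With a nonlinear reconstruction model (e.g. a random-forest or neural-network regressor) in place of Ridge, the same loop can also capture outputs that determine the inputs only nonlinearly.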
