CNN for subsets of a dataset - how to tune hyperparameters

I have a dataset and would like to train CNNs on subsets of it of different sizes. I already have a CNN that classifies very well when I use the entire dataset. The question now arises whether I should additionally optimize the CNN's hyperparameters for the subsets, regardless of whether I use data augmentation or not. Does it really make sense to try to change the CNN model for the subsets by using …
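
A minimal sketch of the baseline experiment, assuming hypothetical arrays x_train/y_train/x_val/y_val and a hypothetical build_model() helper that returns the already-tuned CNN: train the same architecture on growing subsets first, and only consider re-tuning if the resulting learning curve looks unreasonable.

    import numpy as np

    rng = np.random.default_rng(0)
    for frac in [0.1, 0.25, 0.5, 1.0]:
        # sample a subset of the training data without replacement
        idx = rng.choice(len(x_train), size=int(frac * len(x_train)), replace=False)
        model = build_model()  # hypothetical: returns the CNN tuned on the full dataset
        model.fit(x_train[idx], y_train[idx],
                  validation_data=(x_val, y_val), epochs=20, verbose=0)
        print(frac, model.evaluate(x_val, y_val, verbose=0))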
Category: Data Science

macro average and weighted average meaning in classification_report

I use the "classification_report" from from sklearn.metrics import classification_report in order to evaluate the imbalanced binary classification Classification Report : precision recall f1-score support 0 1.00 1.00 1.00 28432 1 0.02 0.02 0.02 49 accuracy 1.00 28481 macro avg 0.51 0.51 0.51 28481 weighted avg 1.00 1.00 1.00 28481 I do not understand clearly what is the meaning of macro avg and weighted average? and how we can clarify the best solution based on how close their amount to one! …
Category: Data Science

How do I test one-shot model performance against flawed categories?

I'm in the process of reworking the ASAM database. Excerpted, it looks like this:

    4155 PIRATES BULK CARRIER GULF OF ADEN: Bulk carrier fired upon 3 Aug 09 at 1500 UTC while underway in position 13-46.5N 050-42.3E. Ten heavily armed pirates in two boats fired upon the vessel underway. The pirates failed to board the vessel due to evasive action taken by the master. All crew and ship properties are safe (IMB).
    4156 PIRATES CARGO SHIP NIGERIA: Vessel (SATURNAS) boarded, …
Category: Data Science

Why is my training accuracy decreasing with higher degrees of polynomial features?

I am new to Machine Learning and started solving the Titanic survivor problem on Kaggle. While solving it with logistic regression I used various models with polynomial features of degree $2, 3, 4, 5, 6$. Theoretically the accuracy on the training set should increase with the degree; however, it started decreasing past degree $2$. The graph is shown below. [plot omitted]
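
A minimal sketch of the experiment on synthetic stand-in data (the Titanic features themselves aren't reproduced here); note that with scikit-learn's default L2 regularization and a finite iteration budget, training accuracy is not guaranteed to grow monotonically with the degree.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures, StandardScaler

    # synthetic stand-in for the Titanic feature matrix
    X, y = make_classification(n_samples=800, n_features=6, random_state=0)

    for degree in range(1, 7):
        clf = make_pipeline(PolynomialFeatures(degree),
                            StandardScaler(),
                            LogisticRegression(max_iter=5000))
        clf.fit(X, y)
        print(degree, clf.score(X, y))   # accuracy on the training set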
Category: Data Science

How do I perform Leave-One-Out Cross-Validation for top-n recommendation systems?

I am new to building recommendation systems. I am using the surpriselib library to evaluate my recommendations. All the accuracy metrics are well supported in this library, but I also want to compute the hit rate of my top-n recommender system. I know the formula for hit rate is: (number of users whose held-out item appears in their top-n) / (number of users). But this does not make sense to me, because to train and test the user vs item ratings I have only …
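
A minimal sketch of LOOCV hit rate with surpriselib, assuming the built-in MovieLens data and an SVD model as stand-ins: each user has exactly one held-out rating, and a "hit" means that item shows up in the user's top-10 recommendations.

    from collections import defaultdict
    from surprise import Dataset, SVD
    from surprise.model_selection import LeaveOneOut

    data = Dataset.load_builtin('ml-100k')      # stand-in dataset
    loocv = LeaveOneOut(n_splits=1, random_state=1)

    for trainset, testset in loocv.split(data):
        algo = SVD()
        algo.fit(trainset)
        # score every user-item pair absent from the training set
        # (can be memory-hungry for large catalogues)
        predictions = algo.test(trainset.build_anti_testset())

        top_n = defaultdict(list)
        for uid, iid, true_r, est, _ in predictions:
            top_n[uid].append((iid, est))
        for uid, ratings in top_n.items():
            ratings.sort(key=lambda x: x[1], reverse=True)
            top_n[uid] = [iid for iid, _ in ratings[:10]]

        # one held-out item per user; a hit means it made the top-10
        hits = sum(1 for uid, iid, _ in testset if iid in top_n[uid])
        print("Hit rate:", hits / len(testset))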
Category: Data Science

Orange v3.32: Accuracy and precision not showing up

As explained in the Orange help files, the Test and Score widget should provide an accuracy column such as "CA". I only get MSE, RMSE, MAE and R2 besides the times. Furthermore, the Predictions widget has a checkbox in the lower left labelled "show performance scores", but nothing happens whether it is ticked or not. Last but not least, the evaluation results won't be displayed in the Confusion Matrix widget either. What am I doing wrong? I test and score with cross-validation or random sampling. …
Category: Data Science

How to compute f1_score for multiclass multilabel classification

I have used a one-hot encoder ([1,0,0], [0,1,0], [0,0,1]) for my functional classification model. The predicted probabilities for the test data, yprob = model.predict(testX), give me:

    yprob = array([[0.18120882, 0.5803128 , 0.22847839],
                   [0.0101245 , 0.12861261, 0.9612609 ],
                   [0.16332535, 0.4925239 , 0.35415074],
                   ...,
                   [0.9931931 , 0.09328955, 0.01351734],
                   [0.48841736, 0.25034943, 0.16123319],
                   [0.3807928 , 0.42698202, 0.27493873]], dtype=float32)

I would like to compute the accuracy, F1 score and the confusion matrix from this. The Sequential API offers a predict_classes function to do it: yclasses = model.predict_classes(testX) and …
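
A minimal sketch, assuming testY holds the one-hot encoded true labels: taking the argmax over each probability row recovers class indices, which is also the usual replacement for predict_classes (removed in recent Keras versions).

    import numpy as np
    from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

    y_pred = np.argmax(yprob, axis=1)   # most probable class per row
    y_true = np.argmax(testY, axis=1)   # undo the one-hot encoding

    print("Accuracy:", accuracy_score(y_true, y_pred))
    print("Macro F1:", f1_score(y_true, y_pred, average='macro'))
    print(confusion_matrix(y_true, y_pred))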
Category: Data Science

How to verify if the behavior of a CNN model is correct?

I am exploring using CNNs for multi-class classification. My model details and the training/testing accuracy/loss curves are shown below. [model summary and plots omitted] As you can see from the plot, the accuracy jumped from 0.08 to 0.39 to 0.77 to 0.96 in a few epochs. I have tried changing the details of the model (number of filters, kernel size), but I still see the same behavior, and I am not experienced in deep learning. Is this behavior acceptable? Am I doing something wrong? To give some context: My …
Category: Data Science

How to add class labels to the confusion matrix of a multi-class classification

How do I add class labels to the confusion matrix? The plot displays each label's index rather than its actual value, e.g. labels = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z']. Here is the code I used to generate it:

    x_train, y_train, x_test, y_test = train_images, train_labels, test_images, test_labels

    model = KNeighborsClassifier(n_neighbors=7, metric='euclidean')
    model.fit(x_train, y_train)

    # predict labels for test data
    predictions = model.predict(x_test)

    # Print overall accuracy
    print("KNN Accuracy = ", metrics.accuracy_score(y_test, predictions))

    # Print confusion matrix
    cm = confusion_matrix(y_test, predictions)
    plt.subplots(figsize=(30, …
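
A minimal sketch of one way to attach the labels, using scikit-learn's ConfusionMatrixDisplay with the cm and labels from the snippet above (this assumes the 26 letters match the order of the classes in cm):

    import matplotlib.pyplot as plt
    from sklearn.metrics import ConfusionMatrixDisplay

    fig, ax = plt.subplots(figsize=(30, 30))
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
    disp.plot(ax=ax)    # axis ticks now show 'A'..'Z' instead of 0..25
    plt.show()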
Category: Data Science

Validation loss and validation accuracy stay the same in NN model

I am trying to train a Keras NN regression model for music emotion prediction from audio features. (I am a beginner in NNs and I am doing this as a study project.) I have 193 features for training/prediction, and the model should predict valence and arousal values. I have prepared an NN model with 5 layers:

    model = Sequential()
    model.add(Dense(100, activation='elu', input_dim=193))
    model.add(Dense(200, activation='elu'))
    model.add(Dense(200, activation='elu'))
    model.add(Dense(100, activation='elu'))
    model.add(Dense(  2, activation='elu'))

And these are my loss and optimizer metrics: model.compile( loss = …
Category: Data Science

How to measure multi-label multi-class accuracy

I have a model that has multi-label multi-class targets. Example:

    Age  Height  Weight  Mark  Distance  Red  Yellow  Green  Blue  Black  White
    14   160     62      78    103       0    1       1      1     1      0
    56   177     90      99    363       1    1       0      0     0      0
    32   179     79      83    737       0    0       0      0     1      0
    17   180     94      75    360       1    0       1      1     1      1
    43   186     102     51    525       0    0       0      0     0      0
    55   168     74      48    …
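
A minimal sketch of the usual multi-label metrics, assuming Y_true and Y_pred are 0/1 arrays with one column per label (Red … White); the tiny arrays here are made up for illustration:

    import numpy as np
    from sklearn.metrics import accuracy_score, f1_score, hamming_loss

    Y_true = np.array([[0, 1, 1, 1, 1, 0],
                       [1, 1, 0, 0, 0, 0]])
    Y_pred = np.array([[0, 1, 1, 0, 1, 0],    # one label wrong
                       [1, 1, 0, 0, 0, 0]])   # exact match

    print("Subset accuracy:", accuracy_score(Y_true, Y_pred))  # 0.5: whole rows must match
    print("Hamming loss:", hamming_loss(Y_true, Y_pred))       # 1/12: fraction of wrong labels
    print("Micro F1:", f1_score(Y_true, Y_pred, average='micro'))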
Category: Data Science

model.fit vs model.evaluate gives different results?

The following is a small snippet of the code, but I'm trying to understand the results of model.fit with the train and test datasets vs. the model.evaluate results. I'm not sure if they do not match up, or if I'm not understanding how to read them.

    batch_size = 16
    img_height = 127
    img_width = 127
    channel = 3  # RGB

    train_dataset = image_dataset_from_directory(Train_data_dir,
                                                 shuffle=True,
                                                 batch_size=batch_size,
                                                 image_size=(img_height, img_width),
                                                 class_names=class_names)

    ## Transfer learning code from mobilenetV2/imagenet here to create model
    initial_epochs = …
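
One thing worth checking, sketched below under the assumption that model and the datasets above exist and the model was compiled with metrics=['accuracy']: the metrics fit() logs are running averages accumulated while the weights are still changing (and with layers like dropout/batch-norm in training mode), whereas evaluate() makes one clean inference pass, so the two numbers rarely match exactly.

    history = model.fit(train_dataset, epochs=initial_epochs)
    # average training accuracy over the last epoch, as logged by fit()
    print("fit accuracy:     ", history.history['accuracy'][-1])
    # a clean pass over the same data with the final weights
    print("evaluate accuracy:", model.evaluate(train_dataset, verbose=0)[1])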
Category: Data Science

What is mean accuracy and why is it a harsh metric for multi-label validation?

The score method docs for scikit-learn's SGDClassifier have the following description: "Return the mean accuracy on the given test data and labels. In multi-label classification, this is the subset accuracy, which is a harsh metric since you require for each sample that each label set be correctly predicted." What is meant by the terms mean accuracy and subset accuracy? Could you please elaborate on why this is a harsh metric, perhaps with an example?
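
A small sketch of why subset accuracy is harsh: in the multi-label case, accuracy_score only counts a sample as correct when every one of its labels is predicted correctly.

    import numpy as np
    from sklearn.metrics import accuracy_score

    y_true = np.array([[1, 0, 1],
                       [0, 1, 0]])
    y_pred = np.array([[1, 0, 0],    # 2 of 3 labels right, still counts as wrong
                       [0, 1, 0]])   # exact match

    print(accuracy_score(y_true, y_pred))  # subset accuracy: 1/2 = 0.5
    # label-wise accuracy would be 5/6, a much more forgiving number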
Topic: accuracy
Category: Data Science

Is it possible for the (Cross Entropy) test loss to increase for a few epochs while the test accuracy also increases?

I came across the question stated in the title: when training a model with the cross-entropy loss function, is it possible for the test loss to increase for a few epochs while the test accuracy also increases? I think it should be possible, as the cross-entropy loss is a measure of the "distance" between a one-hot encoded vector and my model's predicted probabilities, and not a direct measure of my model's accuracy. But I was unable to find …
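
A small made-up numeric sketch of how this can happen: between two epochs the accuracy rises because one more sample is classified correctly, while the average loss also rises because the remaining mistake becomes very confident.

    import numpy as np

    # probability the model assigns to the TRUE class for three test samples
    epoch_1 = np.array([0.45, 0.45, 0.90])   # 1/3 correct
    epoch_2 = np.array([0.55, 0.55, 0.05])   # 2/3 correct, one confident mistake

    for name, p in [("epoch 1", epoch_1), ("epoch 2", epoch_2)]:
        acc = np.mean(p > 0.5)        # correct iff the true class is favoured
        loss = np.mean(-np.log(p))    # cross-entropy on the true class
        print(f"{name}: accuracy={acc:.2f}, loss={loss:.3f}")
    # epoch 1: accuracy=0.33, loss~0.568
    # epoch 2: accuracy=0.67, loss~1.397 -> accuracy and loss both increased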
Category: Data Science

Accuracy after self-training didn't change

I used a Decision Tree Classifier which I trained with 50,000 samples. I also have a set of unlabeled samples, so I decided to use a self-training algorithm. The unlabeled set has 10,000 samples. I would like to ask whether it is normal that, after retraining the model with these 10,000 unlabeled samples, the accuracy didn't change and the confusion matrix has the same values? I expected some change (better or worse predictions). Thank you in advance.
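
For comparison, a minimal sketch of scikit-learn's SelfTrainingClassifier, assuming hypothetical arrays X_labeled/y_labeled (the 50,000 samples) and X_unlabeled (the 10,000 samples); unlabeled targets are marked with -1. Note that a fully grown decision tree outputs 0/1 probabilities, so the confidence threshold may accept nearly all pseudo-labels or none at all, which can leave the final model effectively unchanged.

    import numpy as np
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.tree import DecisionTreeClassifier

    # stack labeled and unlabeled data; -1 marks "no label"
    X = np.vstack([X_labeled, X_unlabeled])
    y = np.concatenate([y_labeled, -np.ones(len(X_unlabeled), dtype=int)])

    self_training = SelfTrainingClassifier(DecisionTreeClassifier(max_depth=8),
                                           threshold=0.75)
    self_training.fit(X, y)
    print(self_training.termination_condition_)   # why self-training stopped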
Category: Data Science

Same validation accuracy, different train accuracy for two neural network models

I'm performing emotion classification over the FER2013 dataset. I'm trying to measure different models' performance, and when I checked ImageDataGenerator with a model I had already used, I came up with the following situation:

    Model without data augmentation: train_accuracy = 0.76, val_accuracy = 0.70
    Model with data augmentation:    train_accuracy = 0.86, val_accuracy = 0.70

As you can see, validation accuracy is the same for both models, but train accuracy is significantly different. In this case: Should I go with …
Category: Data Science

Standard datasets for Classical Machine Learning tasks

I'm aware of and have worked with many datasets in classical ML as well as DL. I am also aware of some of the standard datasets in DL (for example ImageNet for image classification). However, I was wondering whether there are any standard datasets (or accuracy benchmarks) for classical methods such as regression, GBM, SVM, etc. More specifically, are there any standard datasets that can be used to measure the accuracy of a new method? Given that …
Category: Data Science

Metrics for presenting RNN/LSTM results

I am working on two different architectures based on the LSTM model to predict the user's next action based on the previous actions. I am wondering what the best way to present the results is. Is it okay to present only the prediction accuracy, or should I use other metrics? I found a paper using top-k accuracy, whereas in a different paper I found AUC/ROC. Overall, I would like to know what the state of the art is for …
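
For the top-k option, a minimal sketch with scikit-learn's top_k_accuracy_score, using made-up scores where each row holds one probability per candidate next action; a prediction counts as correct if the true action is among the k highest-scored ones.

    import numpy as np
    from sklearn.metrics import top_k_accuracy_score

    y_true = np.array([0, 1, 2, 2])            # the action the user actually took
    y_score = np.array([[0.50, 0.30, 0.20],    # model scores per candidate action
                        [0.30, 0.45, 0.25],
                        [0.20, 0.50, 0.30],
                        [0.60, 0.10, 0.30]])

    print(top_k_accuracy_score(y_true, y_score, k=1))  # 0.5, plain accuracy
    print(top_k_accuracy_score(y_true, y_score, k=2))  # 1.0, true action in top two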
Category: Data Science

Deep Learning accuracy vs Confusion Matrix accuracy

I am working on deep learning with the fer2013 dataset. After training the model I got val_precision: 0.9168 (precision: 0.8492):

    Epoch 67/100
    238/238 [==============================] - 31s 130ms/step - loss: 1.5087 - tp: 2622.4142 - fp: 474.9121 - tn: 45584.3013 - fn: 5054.1213 - accuracy: 0.8972 - precision: 0.8492 - recall: 0.3410 - auc: 0.9042 - prc: 0.6758 - val_loss: 0.9754 - val_tp: 1389.0000 - val_fp: 126.0000 - val_tn: 22698.0000 - val_fn: 2415.0000 - val_accuracy: 0.9046 - val_precision: 0.9168 - val_recall: 0.3651 …
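
As a sanity check, the logged validation metrics can be reproduced directly from the val_* counts in that line, since Keras computes precision as tp / (tp + fp) and accuracy as (tp + tn) / total:

    val_tp, val_fp, val_tn, val_fn = 1389.0, 126.0, 22698.0, 2415.0

    print(val_tp / (val_tp + val_fp))                               # 0.9168 -> val_precision
    print((val_tp + val_tn) / (val_tp + val_fp + val_tn + val_fn))  # 0.9046 -> val_accuracy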
Category: Data Science
