I have a dataset and would like to train CNNs on subsets of the dataset of different sizes. I already have a CNN that classifies very well when I use the entire dataset. The question now is whether I should additionally optimize the parameters of the CNN for the subsets, regardless of whether I do data augmentation or not. Does it really make sense to try to change the CNN model for the subsets by using …
I use the "classification_report" from from sklearn.metrics import classification_report in order to evaluate the imbalanced binary classification Classification Report : precision recall f1-score support 0 1.00 1.00 1.00 28432 1 0.02 0.02 0.02 49 accuracy 1.00 28481 macro avg 0.51 0.51 0.51 28481 weighted avg 1.00 1.00 1.00 28481 I do not understand clearly what is the meaning of macro avg and weighted average? and how we can clarify the best solution based on how close their amount to one! …
I'm in the process of reworking the ASAM database. Excerpted, it looks like this:
4155 PIRATES BULK CARRIER GULF OF ADEN: Bulk carrier fired upon 3 Aug 09 at 1500 UTC while underway in position 13-46.5N 050-42.3E. Ten heavily armed pirates in two boats fired upon the vessel underway. The pirates failed to board the vessel due to evasive action taken by the master. All crew and ship properties are safe (IMB).
4156 PIRATES CARGO SHIP NIGERIA: Vessel (SATURNAS) boarded, …
I am new to Machine Learning and started solving the Titanic Survivor problem on Kaggle. While solving the problem with Logistic Regression I used several models with polynomial features of degree $2,3,4,5,6$. Theoretically, the accuracy on the training set should increase with the degree; however, it started decreasing after degree $2$. The graph is shown below.
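For reference, a minimal sketch of the kind of experiment described, assuming scikit-learn's PolynomialFeatures and LogisticRegression (X and y stand in for the preprocessed Titanic features and labels):

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LogisticRegression

# X, y: placeholders for the preprocessed Titanic feature matrix and survival labels
for degree in (2, 3, 4, 5, 6):
    model = make_pipeline(PolynomialFeatures(degree=degree),
                          StandardScaler(),
                          LogisticRegression(max_iter=5000))
    model.fit(X, y)
    print(degree, model.score(X, y))  # training accuracy for each degree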
In a data classification problem (with supervised learning), what should be the ideal difference in the training set accuracy and testing set accuracy? What should be the ideal range? Is a difference of 5% between the accuracy of training and testing set okay? Or does it signify overfitting?
I am new to building recommendation systems. I am using the surpriselib library to evaluate my recommendations. All the accuracy metrics are well supported in this library. But I also want to compute the hit rate of my top-n recommender system. I know the formula for hit rate is: (number of items users have already purchased) / (number of users). But this does not make sense to me, because to train and test the user vs item ratings I have only …
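For reference, hit rate is often computed with leave-one-out evaluation: hold out one rated item per user, generate top-N recommendations from the remaining data, and count a hit whenever the held-out item shows up in that user's list. A minimal sketch, with illustrative top_n and left_out structures rather than surprise-specific objects:

def hit_rate(top_n, left_out):
    # top_n: {user_id: [item_id, ...]} top-N recommendations per user
    # left_out: [(user_id, item_id), ...] the single held-out item for each user
    hits = sum(1 for user, item in left_out if item in top_n.get(user, []))
    return hits / len(left_out)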
As explained in the Orange help files, the Test and Score widget should provide an accuracy column like "CA". I only have MSE, RMSE, MAE and R2 besides the times. Furthermore, the Predictions widget has a tickbox in the lower left labelled "show performance scores"; nothing happens whether it is ticked or not. Last but not least, the evaluation results are not displayed in the Confusion Matrix either. What am I doing wrong? I test and score with cross-validation or random sampling. …
I have used a one-hot encoder ([1,0,0], [0,1,0], [0,0,1]) for my functional-API classification model. The predicted probabilities for the test data, yprob = model.predict(testX), give me:
yprob = array([[0.18120882, 0.5803128 , 0.22847839],
       [0.0101245 , 0.12861261, 0.9612609 ],
       [0.16332535, 0.4925239 , 0.35415074],
       ...,
       [0.9931931 , 0.09328955, 0.01351734],
       [0.48841736, 0.25034943, 0.16123319],
       [0.3807928 , 0.42698202, 0.27493873]], dtype=float32)
I would like to compute the accuracy, F1 score and the confusion matrix from this. The Sequential API offers a predict_classes function to do it: yclasses = model.predict_classes(testX) and …
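A minimal sketch of one way to get there without predict_classes, assuming testY holds the one-hot encoded ground truth (predict_classes is only defined on Sequential models):

import numpy as np
from sklearn.metrics import accuracy_score, f1_score, confusion_matrix

yprob = model.predict(testX)        # class probabilities, shape (n_samples, 3)
ypred = np.argmax(yprob, axis=1)    # predicted class index per sample
ytrue = np.argmax(testY, axis=1)    # class index recovered from the one-hot labels

print(accuracy_score(ytrue, ypred))
print(f1_score(ytrue, ypred, average='macro'))
print(confusion_matrix(ytrue, ypred))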
I am exploring using CNNs for multi-class classification. My model details and the training/testing accuracy/loss are shown in the attached images. As you can see from the image, the accuracy jumped from 0.08 to 0.39 to 0.77 to 0.96 in a few epochs. I have tried changing the details of the model (number of filters, kernel size) but I still see the same behavior, and I am not experienced in deep learning. Is this behavior acceptable? Am I doing something wrong? To give some context. My …
How do I add class labels to the confusion matrix? The plot displays the index of each label rather than its actual value, e.g. labels = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z']. Here is the code I used to generate it:
x_train, y_train, x_test, y_test = train_images, train_labels, test_images, test_labels
model = KNeighborsClassifier(n_neighbors=7, metric='euclidean')
model.fit(x_train, y_train)
# predict labels for test data
predictions = model.predict(x_test)
# Print overall accuracy
print("KNN Accuracy = ", metrics.accuracy_score(y_test, predictions))
# Print confusion matrix
cm = confusion_matrix(y_test, predictions)
plt.subplots(figsize=(30, …
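A minimal sketch of one way to attach the letter labels, assuming scikit-learn's ConfusionMatrixDisplay is available (cm and labels are the variables from the snippet above):

import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Map each row/column of the confusion matrix to its class name
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
fig, ax = plt.subplots(figsize=(30, 30))
disp.plot(ax=ax)
plt.show()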
I am trying to train a Keras NN regression model for music emotion prediction from audio features. (I am a beginner with NNs and I am doing this as a study project.) I have 193 features for training/prediction and the model should predict valence and arousal values. I have prepared an NN model with 5 layers:
model = Sequential()
model.add(Dense(100, activation='elu', input_dim=193))
model.add(Dense(200, activation='elu'))
model.add(Dense(200, activation='elu'))
model.add(Dense(100, activation='elu'))
model.add(Dense(  2, activation='elu'))
And this is my loss and optimizer metrics: model.compile( loss = …
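For reference, a common compile/fit configuration for a two-output regression like this would look roughly as follows; this is an assumption for illustration, not the contents of the truncated compile call above:

# Assumed configuration: mean squared error loss with the Adam optimizer
# X_train: (n_samples, 193) audio features, y_train: (n_samples, 2) valence/arousal targets
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
model.fit(X_train, y_train, epochs=100, batch_size=32, validation_split=0.2)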
The following is a small snippet of the code, but I'm trying to understand the results of model.fit with the train and test datasets vs the model.evaluate results. I'm not sure whether they genuinely don't match up or whether I'm misreading the results.
batch_size = 16
img_height = 127
img_width = 127
channel = 3  # RGB
train_dataset = image_dataset_from_directory(Train_data_dir,
                                              shuffle=True,
                                              batch_size=batch_size,
                                              image_size=(img_height, img_width),
                                              class_names=class_names)
##Transfer learning code from mobilenetV2/imagenet here to create model
initial_epochs = …
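For comparison, a minimal sketch of the two readouts being contrasted, assuming a validation_dataset built the same way as train_dataset; one common source of small discrepancies is that fit reports metrics averaged over the batches of an epoch, while evaluate runs once with the final weights:

history = model.fit(train_dataset, validation_data=validation_dataset, epochs=initial_epochs)
print(history.history['accuracy'][-1], history.history['val_accuracy'][-1])  # last-epoch metrics from fit

loss, acc = model.evaluate(validation_dataset)  # metrics recomputed with the final weights
print(loss, acc)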
The score method docs for scikit-learn's SGDClassifier have the following description: Return the mean accuracy on the given test data and labels. In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted. What is meant by the terms mean accuracy and subset accuracy? Could you please elaborate on why this is a harsh metric, perhaps with an example?
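A small illustration of subset accuracy in the multi-label case, using accuracy_score on indicator matrices (the arrays are made up for illustration):

import numpy as np
from sklearn.metrics import accuracy_score

# Each row is a sample, each column a label (multi-label indicator format)
y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])
y_pred = np.array([[1, 0, 0],   # one of the three labels is wrong -> the whole sample counts as wrong
                   [0, 1, 0]])  # exact match -> counts as correct

print(accuracy_score(y_true, y_pred))  # 0.5: only exact matches score, which is why it is "harsh"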
I came across the question stated in the title: when training a model with the cross-entropy loss function, is it possible for the test loss to increase for a few epochs while the test accuracy also increases? I think it should be possible, as the cross-entropy loss measures the "distance" between a one-hot encoded target vector and my model's predicted probabilities, and is not a direct measure of my model's accuracy. But I was unable to find …
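A tiny made-up illustration of how this can happen: accuracy only checks whether the correct class gets the highest probability, while cross-entropy also penalizes confidently wrong predictions, so a few predictions becoming very overconfident and wrong can raise the average loss even while more samples cross the decision boundary:

import numpy as np

def cross_entropy(p_correct):
    # mean negative log-probability assigned to the correct class
    return -np.mean(np.log(p_correct))

# Probability the model assigns to the correct class for 4 samples (binary case, threshold 0.5)
epoch_1 = np.array([0.55, 0.45, 0.60, 0.40])   # 2/4 correct, loss ~0.71
epoch_2 = np.array([0.55, 0.55, 0.60, 0.05])   # 3/4 correct, one confidently wrong, loss ~1.18

print((epoch_1 > 0.5).mean(), cross_entropy(epoch_1))
print((epoch_2 > 0.5).mean(), cross_entropy(epoch_2))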
I used a Decision Tree Classifier which I trained with 50 000 samples. I also have a set of unlabeled samples, so I decided to use a self-training algorithm. The unlabeled set has 10 000 samples. I would like to ask whether it is normal that, after retraining the model with these 10 000 unlabeled samples, the accuracy didn't change and the confusion matrix has the same values. I expected some change (better or worse predictions). Thank you in advance.
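For reference, a minimal sketch of one standard way to run self-training in scikit-learn, assuming the unlabeled samples are marked with -1 in the label vector (the variable names are illustrative):

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.semi_supervised import SelfTrainingClassifier

# X_labeled/y_labeled: the 50 000 labeled samples; X_unlabeled: the 10 000 unlabeled ones
X = np.vstack([X_labeled, X_unlabeled])
y = np.concatenate([y_labeled, -np.ones(len(X_unlabeled), dtype=int)])  # -1 marks "unlabeled"

self_training = SelfTrainingClassifier(DecisionTreeClassifier(), threshold=0.9)
self_training.fit(X, y)
print(self_training.score(X_test, y_test))  # compare against the purely supervised baseline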
I'm performing emotion classification over the FER2013 dataset. I'm trying to measure different models' performance, and when I checked ImageDataGenerator with a model I had already used I came up with the following situation:
Model without data augmentation: train_accuracy = 0.76, val_accuracy = 0.70
Model with data augmentation: train_accuracy = 0.86, val_accuracy = 0.70
As you can see, validation accuracy is the same for both models, but train accuracy is significantly different. In this case: Should I go with …
I'm aware of and have worked with many datasets in Classical ML as well as DL. I am also aware of some of the standard datasets in DL (for example ImageNet for Image Classification, etc.) However, I was wondering if there are any standard datasets (or benchmarks for accuracy) for the Classical methods such as Regression, GBM, SVM, etc. More specifically, are there any standard datasets that can be used to measure the accuracy of a new method? Given that …
I am working on two different architectures based on the LSTM model to predict the user's next action from their previous actions. I am wondering what the best way to present the results is. Is it okay to present only the prediction accuracy, or should I use other metrics? I found one paper using top_K_accuracy, whereas a different paper used AUC/ROC. Overall, I would like to know what is the state of the art of …
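For reference, a minimal sketch of computing top-k accuracy on next-action predictions, assuming scikit-learn's top_k_accuracy_score and model-produced class scores (the arrays are made up for illustration):

import numpy as np
from sklearn.metrics import top_k_accuracy_score

# y_true: integer id of the action the user actually took next
# y_score: model scores over all candidate actions, shape (n_samples, n_actions)
y_true = np.array([2, 0, 1])
y_score = np.array([[0.1, 0.3, 0.6],
                    [0.5, 0.2, 0.3],
                    [0.3, 0.2, 0.5]])

print(top_k_accuracy_score(y_true, y_score, k=2))  # 2 of the 3 true actions appear in the top 2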