Chi-square as an evaluation metric for nonlinear machine learning regression models

I am using machine learning models to predict an ordinal variable (values: 1, 2, 3, 4, and 5) from 7 different features. I posed this as a regression problem, so the final outputs of a model are continuous; an evaluation box plot (not shown here) summarises the errors. I experiment with both linear models (linear regression, linear SVMs) and nonlinear models (SVMs with an RBF kernel, random forests, gradient boosting machines). The models are trained using cross-validation (~1600 samples), and 25% of the dataset is used …
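
A minimal sketch of one way a chi-square test could be applied here: round the continuous predictions back onto the 1..5 scale and compare the predicted class counts against the observed ones with scipy. Note that this only checks the marginal distribution of the levels, not per-sample accuracy; y_true and y_pred_continuous are placeholder names.

# Sketch: chi-square goodness-of-fit between rounded predictions and true levels.
import numpy as np
from scipy.stats import chisquare

levels = np.array([1, 2, 3, 4, 5])

def chi_square_eval(y_true, y_pred_continuous):
    # Map continuous regression outputs back onto the ordinal scale 1..5.
    y_pred = np.clip(np.rint(y_pred_continuous), 1, 5).astype(int)

    # Count how often each level occurs in the predictions and in the truth.
    pred_counts = np.array([(y_pred == lv).sum() for lv in levels])
    true_counts = np.array([(y_true == lv).sum() for lv in levels])

    # Expected counts (the true ones) must all be non-zero for the test.
    stat, p_value = chisquare(f_obs=pred_counts, f_exp=true_counts)
    return stat, p_value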
Category: Data Science

How to assign costs to the confusion matrix

I am trying to assign costs to the confusion matrix. That is, in my problem a false positive does not have the same cost as a false negative, so I want to assign a cost "x" to these cases so that the algorithm learns based on those costs. I will explain my case a little more with an example: when we want to detect credit card fraud, it does not have the same cost to predict that it is not fraud when …
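
A minimal sketch of two common options: weight the rarer, costlier class during training (sklearn's class_weight), and score a model by the total misclassification cost derived from its confusion matrix. The cost numbers and the logistic regression are illustrative placeholders, not a recommendation for fraud detection specifically.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# cost_matrix[i, j] = cost of predicting class j when the true class is i
# rows/cols: 0 = legitimate, 1 = fraud (illustrative figures only)
cost_matrix = np.array([[0.0, 10.0],    # false positive: blocking a legitimate payment
                        [500.0, 0.0]])  # false negative: letting a fraud through

def total_cost(y_true, y_pred):
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
    return float((cm * cost_matrix).sum())

# Cost-aware training: penalise errors on the fraud class more heavily.
clf = LogisticRegression(class_weight={0: 1.0, 1: 50.0})
# clf.fit(X_train, y_train)
# print(total_cost(y_test, clf.predict(X_test)))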
Category: Data Science

IterativeImputer Evaluation

I am having a hard time evaluating my imputation model. I used an iterative imputer to fill in the missing values in all four columns, with a random forest as the estimator inside the imputer. Here is my code for imputing: imp_mean = IterativeImputer(estimator=RandomForestRegressor(), random_state=0) imp_mean.fit(my_data) my_data_filled = pd.DataFrame(imp_mean.transform(my_data)) my_data_filled.head() My problem is how to evaluate this model: how can I know whether the filled-in values are right? I used a describe function before …
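
One common way to evaluate imputation is to artificially hide some known cells, impute them, and compare against the hidden truth. A sketch of that idea, assuming my_data is an all-numeric DataFrame:

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
values = my_data.to_numpy(dtype=float)

# Pick 10% of the observed (non-missing) cells and hide them.
observed = np.argwhere(~np.isnan(values))
hidden = observed[rng.choice(len(observed), size=len(observed) // 10, replace=False)]
masked = values.copy()
masked[hidden[:, 0], hidden[:, 1]] = np.nan

imp = IterativeImputer(estimator=RandomForestRegressor(), random_state=0)
filled = imp.fit_transform(masked)

# Error only on the cells we hid on purpose.
truth = values[hidden[:, 0], hidden[:, 1]]
imputed = filled[hidden[:, 0], hidden[:, 1]]
print("RMSE on artificially hidden cells:", np.sqrt(np.mean((truth - imputed) ** 2)))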
Category: Data Science

Orange v3.32: Accuracy and precision not showing up

As explained in the Orange help files, the Test and Score widget should provide an accuracy column such as "CA". I only see MSE, RMSE, MAE and R2 besides the timings. Furthermore, the Predictions widget has a checkbox in the lower left labelled "show performance scores", but nothing happens whether it is ticked or not. Last but not least, the evaluation results are not displayed in the Confusion Matrix widget either. What am I doing wrong? I test and score with cross-validation or random sampling. …
Category: Data Science

Need term or method name for evaluation of a CNN without ground truth using e.g. a regression model

I have the following problem: I have trained a CNN and I can evaluate the network in-sample. I want to use the trained model to predict classes for images for which I have no ground truth. However, there are other features associated with these images that I can include in a regression model, along with the predicted labels, to predict Y. The only way to somehow evaluate the CNN is to infer whether the predicted labels have an effect on …
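
A sketch of the idea described (sometimes framed as a downstream or proxy-task evaluation): fit the regression with and without the CNN's predicted labels and compare the cross-validated fit. The names cnn_labels, X_other and y are assumed to already exist, and linear regression stands in for whatever regression model is actually used.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X_base = X_other                                # features without the CNN output
X_aug = np.column_stack([X_other, cnn_labels])  # features plus predicted labels

r2_base = cross_val_score(LinearRegression(), X_base, y, cv=5, scoring="r2").mean()
r2_aug = cross_val_score(LinearRegression(), X_aug, y, cv=5, scoring="r2").mean()
print(f"R^2 without CNN labels: {r2_base:.3f}, with CNN labels: {r2_aug:.3f}")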
Category: Data Science

Reinforcement Learning: why does acting greedily with respect to the optimal value function give you the optimal policy?

David Silver's course on Reinforcement Learning explains how you get the optimal policy from the optimal value function. It seems very simple: you just have to act greedily, maximizing the value function at each step. In the case of a small grid world, once you have applied the Policy Evaluation algorithm, you get, for example, the following matrix for the value function: you start from the top-left corner and the only actions are the …
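
A minimal sketch of greedy policy extraction on a deterministic grid world. The value matrix below is illustrative, not the one from the course; with deterministic moves and an equal per-step reward, acting greedily reduces to stepping to the neighbouring state with the highest value.

import numpy as np

V = np.array([[ 0., -1., -2., -3.],
              [-1., -2., -3., -2.],
              [-2., -3., -2., -1.],
              [-3., -2., -1.,  0.]])

actions = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
n_rows, n_cols = V.shape

def greedy_action(r, c):
    # Pick the action whose successor state has the highest value.
    best, best_value = None, -np.inf
    for name, (dr, dc) in actions.items():
        nr, nc = r + dr, c + dc
        if 0 <= nr < n_rows and 0 <= nc < n_cols and V[nr, nc] > best_value:
            best, best_value = name, V[nr, nc]
    return best

policy = [[greedy_action(r, c) for c in range(n_cols)] for r in range(n_rows)]
for row in policy:
    print(row)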
Category: Data Science

model.fit vs model.evaluate gives different results?

The following is a small snippet of the code, but I'm trying to understand the results of model.fit with the train and test datasets versus the model.evaluate results. I'm not sure whether they really don't match up or whether I'm misreading the results. batch_size = 16 img_height = 127 img_width = 127 channel = 3 #RGB train_dataset = image_dataset_from_directory(Train_data_dir, shuffle=True, batch_size=batch_size, image_size=(img_height, img_width), class_names = class_names) ##Transfer learning code from mobilenetV2/imagenet here to create model initial_epochs = …
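
One usual explanation is that the numbers fit prints per epoch are averaged over batches while the weights are still changing (and layers such as dropout and batch-norm run in training mode), whereas evaluate makes a single pass with the final weights in inference mode. A sketch of a like-for-like comparison, assuming model, train_dataset, validation_dataset and initial_epochs exist and the model was compiled with a single accuracy metric:

history = model.fit(train_dataset,
                    validation_data=validation_dataset,
                    epochs=initial_epochs)

# Re-evaluating the *training* data after fitting usually lines up with the
# evaluate-style numbers much better than the last line of fit's log does.
train_loss, train_acc = model.evaluate(train_dataset)
val_loss, val_acc = model.evaluate(validation_dataset)

print("fit (last epoch, averaged over batches):",
      history.history["loss"][-1], history.history.get("accuracy", [None])[-1])
print("evaluate on train (final weights):", train_loss, train_acc)
print("evaluate on validation:", val_loss, val_acc)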
Category: Data Science

Evaluation method for multi-class classification problem modeled as binary classification problem

I should mention that even though I have some basic knowledge of ML, this is the first big ML project I am working on, and for my research project proposal I need to suggest an evaluation metric. The problem is a multi-class (16 classes) classification problem where one data point can be classified into multiple classes (not ranking-based, though). I plan to model it as a binary classification problem for each class, but for the related evaluation metrics …
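
A minimal sketch of one way to report results for this setup: stack the per-class binary predictions into an (n_samples, 16) indicator matrix and look at per-class scores plus macro and micro averages. Y_true and Y_pred are assumed to be 0/1 matrices of that shape.

from sklearn.metrics import classification_report, f1_score

# One row of precision/recall/F1 per class.
print(classification_report(Y_true, Y_pred, zero_division=0))

# Averages over the 16 binary problems:
print("macro F1:", f1_score(Y_true, Y_pred, average="macro"))  # classes weighted equally
print("micro F1:", f1_score(Y_true, Y_pred, average="micro"))  # instances weighted equally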
Category: Data Science

Match between objective function and evaluation metric

Do the objective function for model fitting and the evaluation metric for model validation need to be identical throughout the hyperparameter search process? For example, can an XGBoost model be fitted with the mean squared error (MSE) as the objective function (setting the 'objective' argument to reg:squarederror, regression with squared loss), while the cross-validation process is evaluated with a significantly different metric such as the gamma deviance (residual deviance for gamma regression)? Or should the evaluation metric match the …
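
Mechanically, XGBoost does allow the two to differ. A sketch of exactly the combination described, assuming strictly positive targets (which gamma deviance requires) and placeholder X, y:

import xgboost as xgb

dtrain = xgb.DMatrix(X, label=y)  # X, y assumed to exist; y strictly positive

params = {
    "objective": "reg:squarederror",   # loss minimised during fitting
    "eval_metric": "gamma-deviance",   # metric reported on the CV folds
    "max_depth": 4,
    "eta": 0.1,
}

cv_results = xgb.cv(params, dtrain, num_boost_round=200, nfold=5,
                    early_stopping_rounds=20, seed=0)
print(cv_results.tail())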
Category: Data Science

Inference speed of ReLU networks

I'm fairly new to the topic, and I was wondering whether some of you can point to existing work in which the inference speed of deep neural networks with ReLU activation functions is measured on GPUs as a function of the number of hyperparameters. I would just like a rough idea of how fast such networks can return an answer for, e.g., approximation/regression purposes.
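
As a rough empirical complement to any published benchmarks, a sketch that times the forward pass of ReLU MLPs of varying width and depth on a GPU with PyTorch (the sizes and batch shape are arbitrary placeholders):

import time
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

def make_mlp(in_dim, width, depth):
    layers = [nn.Linear(in_dim, width), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.ReLU()]
    layers.append(nn.Linear(width, 1))
    return nn.Sequential(*layers).to(device)

x = torch.randn(10_000, 32, device=device)
for width in (64, 256, 1024):
    for depth in (2, 4, 8):
        model = make_mlp(32, width, depth).eval()
        with torch.no_grad():
            model(x)  # warm-up pass
            if device == "cuda":
                torch.cuda.synchronize()
            t0 = time.perf_counter()
            for _ in range(100):
                model(x)
            if device == "cuda":
                torch.cuda.synchronize()
        ms = (time.perf_counter() - t0) / 100 * 1e3
        print(f"width={width} depth={depth}: {ms:.2f} ms per forward pass")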
Category: Data Science

Best metric to evaluate model probabilities

I'm trying to create an ML model for a binary classification problem with a balanced dataset, and I care mostly about the predicted probabilities. Searching the web, I only found advice to use AUC or log-loss scores; there was no advice to use the Brier score as an evaluation metric. Can I use the Brier score as an evaluation metric, or are there some pitfalls with it? As far as I understand, if I use log-loss as the evaluation metric, the "winning" model will …
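
Both the Brier score and log-loss are proper scoring rules for probabilities; the practical difference is that log-loss punishes confident mistakes much more harshly. A minimal sketch computing both on placeholder labels and probabilities:

import numpy as np
from sklearn.metrics import brier_score_loss, log_loss

y_true = np.array([0, 0, 1, 1, 1])               # 0/1 labels
p_pred = np.array([0.1, 0.4, 0.35, 0.8, 0.95])   # predicted probability of class 1

print("Brier score:", brier_score_loss(y_true, p_pred))  # mean squared error of probabilities
print("log-loss:   ", log_loss(y_true, p_pred))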
Category: Data Science

Comparing datasets - should I use the same test dataset?

I am training a CNN model and I want to compare different image datasets. The datasets all have different characteristics (translated or not, rotated or not, etc.), and I do not modify the model between trainings on the different datasets. Should I use the same test dataset to compare them? This test set would not be changed throughout the testing, would contain data that cannot be found elsewhere, and would therefore not be better suited to any particular training dataset. Or …
Category: Data Science

What metrics work well on unbalanced datasets?

I wanted to know whether there are metrics that work well on an unbalanced dataset. I know that accuracy is a very poor metric for evaluating a classifier when the data is unbalanced, but what about, for example, the Kappa index? Best regards and thanks.
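
A minimal sketch of a few metrics that remain informative under class imbalance (including Cohen's kappa), computed on small placeholder label vectors:

from sklearn.metrics import (balanced_accuracy_score, cohen_kappa_score,
                             f1_score, matthews_corrcoef)

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # 80/20 imbalance
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

print("balanced accuracy:", balanced_accuracy_score(y_true, y_pred))
print("Cohen's kappa:    ", cohen_kappa_score(y_true, y_pred))
print("F1 (positive):    ", f1_score(y_true, y_pred))
print("Matthews corrcoef:", matthews_corrcoef(y_true, y_pred))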
Category: Data Science

How to draw each ROC curve of an SVM model with cross-validation

I would like to make a graph like the following in Python, that is, one ROC curve for each fold. I have the following code, where I use an SVM model to classify some data: kf = KFold(n_splits=10) a, fma, fmi = [], [], [] for train, eval in kf.split(x_train): x_train_i, x_eval_i, y_train_i, y_eval_i = x_train[train], x_train[eval], y_train[train], y_train[eval] c = svm.SVC(kernel='rbf', gamma='scale', C=40).fit( x_train_i, y_train_i ) p = c.predict(x_eval_i) acc = c.score(x_eval_i, y_eval_i) f1ma = f1_score(y_eval_i, p, average='macro') f1mi = …
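
A sketch of one way to get one ROC curve per fold on shared axes, reusing the variable names from the snippet above. It assumes binary labels (ROC curves are defined for two classes) and uses the SVM's decision_function scores via RocCurveDisplay:

import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.metrics import RocCurveDisplay
from sklearn.model_selection import KFold

kf = KFold(n_splits=10)
fig, ax = plt.subplots()
for i, (train, test) in enumerate(kf.split(x_train)):
    clf = svm.SVC(kernel="rbf", gamma="scale", C=40)
    clf.fit(x_train[train], y_train[train])
    # Plot this fold's ROC curve on the shared axes.
    RocCurveDisplay.from_estimator(clf, x_train[test], y_train[test],
                                   name=f"fold {i}", ax=ax)
plt.show()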
Category: Data Science

How to estimate missing values when calculating NDCG

I would like to compare recommendation methods on the MovieLens dataset using the NDCG metric. In a ranking problem, the goal is to rank items based on their relevance for the user. Ranking models can be learned from a ratings matrix in which each user rates only a small subset of all items; ratings for the other items are unknown. Collaborative filtering methods can be used to build a model that generalizes from the training data and predicts ratings for unrated items. Let's consider the following example on a dataset consisting of 5 …
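
One common workaround is to compute NDCG only over the items the user actually rated, so no missing relevance values have to be guessed. A toy sketch of that restriction with sklearn's ndcg_score (the ratings and scores below are made up):

import numpy as np
from sklearn.metrics import ndcg_score

true_ratings = {"A": 5, "B": 3, "C": 1}             # rated items only
scores = {"A": 0.9, "B": 0.2, "C": 0.4, "D": 0.8}   # model scores for all items

items = list(true_ratings)                          # restrict to rated items
y_true = np.array([[true_ratings[i] for i in items]])
y_score = np.array([[scores[i] for i in items]])

print("NDCG over rated items:", ndcg_score(y_true, y_score))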
Category: Data Science

How to determine the "total number of relevant documents" in the calculation of Recall (in Precision and Recall) if it's not known? Can it be estimated?

On Wikipedia there is a practical example of calculating Precision and Recall: when a search engine returns 30 pages, only 20 of which are relevant, while failing to return 40 additional relevant pages, its precision is 20/30 = 2/3, which tells us how valid the results are, while its recall is 20/60 = 1/3, which tells us how complete the results are. I absolutely don't understand how one can use Precision and Recall in a real-life scenario where the total number …
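
For reference, the arithmetic behind the Wikipedia numbers: the "total number of relevant documents" is the relevant results that were returned plus the relevant documents that were missed.

relevant_returned = 20
returned = 30
relevant_missed = 40

precision = relevant_returned / returned                             # 20/30 = 2/3
recall = relevant_returned / (relevant_returned + relevant_missed)   # 20/60 = 1/3
print(precision, recall)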
Category: Data Science

What am I supposed to see on the TensorBoard Images tab?

I'm training an object detection model with TensorFlow and monitoring the training with TensorBoard. I was expecting the images displayed in TensorBoard's Images tab to show a bounding box (at a specific point of training). What I see, though, is only images with an orange line drawn above the picture (the same orange that I expect for the bounding box). Am I missing something? Am I right that a bounding box should appear, or …
Category: Data Science
