I have created a dataset automatically and want to clarify my interpretation of the amount of noise in it using a confidence interval. I selected a random sample, manually annotated it, and found that 98% of the labels were correct. Based on these values I then calculated the 99% confidence interval, which gave a lower bound of 0.9614 and an upper bound of 0.9949. Does this mean that the noise in the overall dataset is between the lower and upper …
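For illustration, a minimal sketch of how such an interval is typically computed for a proportion; the sample size of 500 is hypothetical, since the question does not state it:

```python
# Sketch: binomial confidence interval for the fraction of correct labels.
# The sample size (n_checked = 500) is an assumption, not taken from the question.
from statsmodels.stats.proportion import proportion_confint

n_checked = 500                       # manually annotated sample size (assumed)
n_correct = int(0.98 * n_checked)     # 98% of the checked labels were correct

low, high = proportion_confint(n_correct, n_checked, alpha=0.01, method="wilson")
print(f"99% CI for label accuracy: ({low:.4f}, {high:.4f})")
# The interval bounds the accuracy of the labelling process estimated from the
# checked sample; 1 - accuracy gives the corresponding bound on the noise rate.
```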
I have 2 sets of training data in csv files. The training data have class labels: 1 for memorable and 0 for not memorable. In addition, there is a confidence label for each sample. The class labels were assigned based on decisions from 3 people viewing the photos. When they all agreed, the class label could be considered certain, and a confidence of 1 was written down. If they didn't all agree, then the classification decided on by the …
I am working on an algorithm whose results I am comparing with another model's using a 90% confidence interval. Can this be called a statistical test? I read an article that talked about statistical tests with some confidence level. Is a confidence level the same thing as a confidence interval in statistical tests?
What is the impact of data size on the confidence (p-value) of model coefficients? Does increasing the size of the data always improve the confidence in the model coefficients? Suppose I have 100 data points. I create another dataset from the same data by duplicating the original data 100 times, i.e. I now have 10,000 data points. If I run the model on the two data sets, what would happen to the model coefficients and why? I appreciate any help you can provide.
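A small experiment on synthetic data (not the asker's) makes the expected effect visible: duplication leaves the coefficient estimates essentially unchanged but shrinks the reported standard errors, so the p-values become misleadingly small:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 0.3 * x + rng.normal(size=100)        # weak true effect on 100 points

X = sm.add_constant(x)
original = sm.OLS(y, X).fit()

# Duplicate every row 100 times: same information, 100x the nominal sample size.
X_dup = np.tile(X, (100, 1))
y_dup = np.tile(y, 100)
duplicated = sm.OLS(y_dup, X_dup).fit()

print(original.params, original.bse, original.pvalues)
print(duplicated.params, duplicated.bse, duplicated.pvalues)
# Coefficients match; standard errors drop by roughly sqrt(100) and p-values
# shrink, even though no new information was added by duplicating rows.
```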
For the following dataframe, I am trying to plot the means of a sample of 5 random rows, and also plot their respective confidence intervals using error bars. I am unable to figure out how to plot the confidence intervals using error bars.

   col0  col1  col2  col3  col4  col5  col6  col7
0     0     1     2     3     4     5     6     7
1     8     9    10    11    12    13    14    15
2    16    17    18    19    20    21    22    23
3    24    25    26  …
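A minimal sketch of one way to do this with matplotlib's errorbar; the stand-in dataframe below only mimics the (truncated) one in the question, and a t-based 95% interval over each row's columns is assumed:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

# Stand-in for the question's dataframe (more rows so that 5 can be sampled).
df = pd.DataFrame(np.arange(80).reshape(10, 8),
                  columns=[f"col{i}" for i in range(8)])

sample = df.sample(5, random_state=0)            # 5 random rows
means = sample.mean(axis=1)                      # mean of each sampled row
sems = sample.sem(axis=1)                        # standard error of each row mean
n_cols = sample.shape[1]
ci = sems * stats.t.ppf(0.975, df=n_cols - 1)    # half-width of the 95% interval

plt.errorbar(range(len(means)), means, yerr=ci, fmt="o", capsize=4)
plt.xticks(range(len(means)), means.index)
plt.xlabel("sampled row")
plt.ylabel("row mean ± 95% CI")
plt.show()
```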
I have N=21 objects, each with about 80 non-NaN descriptors. I carried out a hierarchical clustering of the objects and obtained this dendrogram. I want some kind of 'confidence' index for the dendrogram, or for each node. I have seen many dendrograms with bootstrap values (as far as I understand this is the same as Monte Carlo cross-validation, but I might be wrong), and I think that in my case they could be used as well. …
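As a rough sketch of what bootstrap support values on a dendrogram mean: resample the ~80 descriptors with replacement, recluster, and count how often each original node's set of leaves reappears. The data, linkage method, and number of replicates below are assumptions, not from the question:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

def leaf_sets(Z, n):
    """Return, for every internal node of a linkage matrix, the set of leaves below it."""
    members = {i: frozenset([i]) for i in range(n)}
    nodes = []
    for k, (a, b, _, _) in enumerate(Z):
        s = members[int(a)] | members[int(b)]
        members[n + k] = s
        nodes.append(s)
    return nodes

rng = np.random.default_rng(0)
X = rng.normal(size=(21, 80))              # stand-in for the 21 x ~80 descriptor matrix

Z = linkage(X, method="ward")              # the "original" dendrogram
original_nodes = leaf_sets(Z, X.shape[0])

n_boot = 200
support = np.zeros(len(original_nodes))
for _ in range(n_boot):
    cols = rng.integers(0, X.shape[1], size=X.shape[1])     # resample descriptors
    boot_nodes = set(leaf_sets(linkage(X[:, cols], method="ward"), X.shape[0]))
    support += [s in boot_nodes for s in original_nodes]

support /= n_boot    # fraction of bootstrap replicates that reproduce each node
```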
I am doing some work in R and, after obtaining the confusion matrix for a logistic regression, I got the following metrics:

Accuracy : 0.7763
95% CI : (0.6662, 0.864)
No Information Rate : 0.5395
P-Value [Acc > NIR] : 1.629e-05

It is not clear to me how the CI should be interpreted. Would it be that the accuracy can take values between 0.666 and 0.864? What does it mean that the CI is so wide? If …
I've read from a source, which I forget where, that "in cross validation, the model with the best scores at the 95% confidence interval is picked". But according to my stats knowledge, for a CI (confidence interval) to work you need a normality assumption about the sampling statistics of the experiment. Yet that unknown source seems to simply use the results from each fold to construct the sample mean and the confidence interval. It seems to me that neither …
Suppose there are 100 items, numbered 1 to 100, and also 100 baskets, also numbered 1 to 100. Item i is in basket b if and only if i divides b with no remainder. Thus, item 1 is in all the baskets, item 2 is in all fifty of the even-numbered baskets, etc. For example Basket 12 consists of items {1, 2, 3, 4, 6, 12}. (a) Describe all the association rules that have 100% confidence. Give an example. I'm …
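A brute-force check of the divisibility structure can make the pattern concrete; this sketch only enumerates single-item rules {i} → {j}, so it is an illustration rather than a full answer to the exercise:

```python
# Items in basket b are exactly the divisors of b (item i is in basket b iff i | b).
baskets = {b: {i for i in range(1, 101) if b % i == 0} for b in range(1, 101)}

rules = []
for i in range(1, 101):
    containing_i = [items for items in baskets.values() if i in items]
    for j in range(1, 101):
        # 100% confidence: every basket containing i also contains j.
        if j != i and all(j in items for items in containing_i):
            rules.append((i, j))

print(rules[:10])
print(all(i % j == 0 for i, j in rules))   # True: {i} -> {j} has 100% confidence iff j divides i
# e.g. {12} -> {6}, {12} -> {4}, {12} -> {3} all hold with 100% confidence.
```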
Say there is a neural network for classification in which the second-to-last layer has 3 nodes and the final layer is a softmax layer. During training the softmax layer is needed, but for inference it is not; the arg max can simply be taken over the 3 nodes. What about getting some sort of approximation of confidence from the neural network? Using the softmax for normalization makes less sense, since it gives a ton of weight to …
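A tiny numeric sketch of the two quantities being compared (the activations are made up): the softmax turns the three scores into a distribution whose maximum can be read as a "confidence", while the raw node values are unnormalized scores; whether either is a calibrated confidence is exactly the open question here.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.9, -3.0])   # made-up activations of the 3 pre-softmax nodes

probs = softmax(logits)
print(logits.argmax(), probs.argmax())   # same predicted class either way
print(probs, probs.max())                # ~[0.52, 0.47, 0.004]: the top two classes are nearly tied,
                                         # which the softmax max (~0.52) shows but the raw winning
                                         # value (2.0) by itself does not.
```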
I'm wondering what the "best practice" approach is for finding confidence intervals when evaluating the performance of a classifier on the test set. As far as I can see, there are two different ways of putting a confidence interval on a metric like, say, accuracy: Evaluate it using the formula interval = z * sqrt( (error * (1 - error)) / n), where n is the sample size, error is the classification error (i.e. 1 - accuracy) and z is a number representing multiples …
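As a worked sketch of that first approach, the normal-approximation interval for a proportion, with hypothetical values for the test-set size and error rate (z comes from the desired confidence level):

```python
import numpy as np
from scipy.stats import norm

n = 1000                 # test-set size (hypothetical)
error = 0.12             # observed classification error, i.e. 1 - accuracy (hypothetical)

confidence = 0.95
z = norm.ppf(1 - (1 - confidence) / 2)        # ~1.96 for a 95% interval

half_width = z * np.sqrt(error * (1 - error) / n)
print(f"error = {error:.3f} ± {half_width:.3f} "
      f"-> ({error - half_width:.3f}, {error + half_width:.3f})")
```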
Data analysis of a large dataset of project management data together with working hours led me to a surprisingly simple linear model over the key milestones of all projects. Now I am a bit at a loss on how to proceed. The stakeholder wants a prediction of working hours spent per milestone and of the total working hours needed for one project. 1.) Do I calculate an average linear regression plus confidence interval and use that for predicting other project outcomes? 2.) Do …
Let's say I have a model for a binary classification task (two classes, 0 and 1), so it outputs a number between 0 and 1; if the output is greater than 0.5 we consider it class 1, and class 0 otherwise. Now let's say we remove any results in the test set whose output falls between the two thresholds of 0.4 and 0.6, to make the model more confident. To be more clear, if …
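A minimal sketch of the rejection scheme described, with made-up scores and labels; it reports both the coverage (how many cases are still answered) and the accuracy on the retained cases, which is the trade-off this question is about:

```python
import numpy as np

scores = np.array([0.05, 0.45, 0.55, 0.71, 0.97, 0.58])   # model outputs (made up)
y_true = np.array([0,    0,    1,    1,    1,    0   ])

keep = (scores <= 0.4) | (scores >= 0.6)        # drop the "unsure" band (0.4, 0.6)
y_pred = (scores[keep] > 0.5).astype(int)

coverage = keep.mean()                          # fraction of test cases still answered
accuracy = (y_pred == y_true[keep]).mean()      # accuracy on the retained cases only
print(f"coverage = {coverage:.2f}, accuracy on retained = {accuracy:.2f}")
```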
Consider that I have a dataset automatically acquired by a machine, which returns the following measurements: [111, 121, 114, 154, 149, 150]. I then manually check how the values recorded by the machine compare to the true values, and get the following measurements when checking manually: [112, 121, 114, 154, 149, 149]. As you can see, the datasets differ in two places (I measured 112 where the machine saw 111, and I measured 149 where the machine …
I have 5 classes. I built an XGBoost classification model and used model.predict(test) to predict the classes of the test dataset. Out of all the values predicted by my model, I would like to keep only those for which the model is more than 95% sure the predicted value is correct; that is, only the predictions the model is very confident about. How do I find those predictions?
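A minimal sketch using the scikit-learn style XGBClassifier API and predict_proba; the synthetic 5-class data below stands in for the question's dataset, and note that a 0.95 predicted probability is only as trustworthy as the model's calibration:

```python
import numpy as np
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Stand-in for the question's 5-class data and fitted model.
X, y = make_classification(n_samples=1000, n_classes=5, n_informative=8, random_state=0)
X_train, X_test, y_train = X[:800], X[800:], y[:800]
model = XGBClassifier(n_estimators=50).fit(X_train, y_train)

proba = model.predict_proba(X_test)     # shape (n_test, 5): one probability per class
max_proba = proba.max(axis=1)           # the probability assigned to the predicted class
confident = max_proba >= 0.95           # keep only predictions the model is >= 95% sure of

confident_idx = np.where(confident)[0]
confident_pred = model.classes_[proba[confident].argmax(axis=1)]
print(len(confident_idx), confident_pred[:10])
```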
I am trying to solve a problem where the input consists of an unordered set of observations (of variable size), and the desired output is a single value describing some property of the observations as a whole. The straightforward approach would be to train a network on an individual observation and fit it to the desired output; when predicting, feed in all observations from the input and average the results. This works to an extent; however, the big issue is …
Hello Data Science Stack Exchange community, this question will appear open-ended, but any answers or thoughts will be much appreciated. I am trying to work through a pre-trained Random Model Classifier with minimal documentation, e.g. no record of the confusion matrix or ROC-AUC curve from when the model was developed. I only have the pickle file and the dataset on which it needs to run. When I ran the model I observed that for most of the cases …
I am new to data science, so please accept my apologies in advance. I am trying to forecast values using ARIMA. I have weekly values for the current year so far, and based on the available weekly values I predict the remaining weekly values of the year with an ARIMA model. After forecasting, I plotted the predicted values and the confidence interval as shown in the figure: the red dotted line is the prediction and the grey area shows the …
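For reference, a minimal statsmodels sketch of how such a forecast and its interval are typically produced and plotted; the series, ARIMA order, and forecast horizon below are placeholders, not taken from the question:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA

# Stand-in weekly series; in the question this would be the observed part of the year.
y = pd.Series(np.cumsum(np.random.default_rng(0).normal(size=30)))

fit = ARIMA(y, order=(1, 1, 1)).fit()
forecast = fit.get_forecast(steps=22)          # remaining weeks of the year (assumed)
mean = forecast.predicted_mean
ci = forecast.conf_int(alpha=0.05)             # 95% interval: lower and upper columns

plt.plot(y, label="observed")
plt.plot(mean, "r--", label="forecast")
plt.fill_between(ci.index, ci.iloc[:, 0], ci.iloc[:, 1],
                 color="grey", alpha=0.3, label="95% interval")
plt.legend()
plt.show()
# The grey band means: under the fitted model, each future weekly value is expected
# to fall inside the band with 95% probability; it widens as uncertainty accumulates.
```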
For an RL task that I am trying to solve, where I train once per day, I store the reward for each day so that I can track progress on a daily basis. At the beginning of the learning process, the reward for a given state fluctuates quite a lot. After about 10 days or so, the rewards start to normalize and the fluctuations become very small. In these cases, when I notice a very small change in …
Test statistics and sample statistics are used to check statistical significance. How do I understand the underlying procedure for reaching a conclusion about an effect-size estimate?