Hyper-parameter tuning of a Naive Bayes classifier

I'm fairly new to machine learning and I'm aware of the concept of hyper-parameter tuning of classifiers, and I've come across a couple of examples of this technique. However, I'm trying to use the Naive Bayes classifier from sklearn for a task, but I'm not sure which parameter values I should try. What I want is something like this, but for the GaussianNB() classifier rather than an SVM:

from sklearn.model_selection import GridSearchCV
C = [0.05, 0.1, 0.2, 0.3, 0.25, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
gamma = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
kernel = ['rbf', 'linear']
hyper = {'kernel': kernel, 'C': C, 'gamma': gamma}
gd = GridSearchCV(estimator=svm.SVC(), param_grid=hyper, verbose=True)
gd.fit(X, Y)
print(gd.best_score_)
print(gd.best_estimator_)
…
Category: Data Science
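GaussianNB exposes far fewer hyper-parameters than an SVM; in current scikit-learn the main tunable one is var_smoothing. A minimal sketch of the same GridSearchCV pattern, using synthetic stand-in data for the asker's X and Y:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# Toy data standing in for the asker's X, Y
X, Y = make_classification(n_samples=200, n_features=5, random_state=0)

# GaussianNB has no C/gamma/kernel; var_smoothing is its main knob
param_grid = {'var_smoothing': np.logspace(-9, -1, 9)}
gd = GridSearchCV(estimator=GaussianNB(), param_grid=param_grid, cv=5, verbose=True)
gd.fit(X, Y)
print(gd.best_score_)
print(gd.best_estimator_)
```

If the features were counts or categories instead, MultinomialNB and CategoricalNB would offer an additional alpha smoothing parameter worth searching over.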

Naive Bayes Text Classifier Confidence Score

I am experimenting with building a text classifier using Naive Bayes, which has been pretty successful on my test data. One thing I am looking to incorporate is handling text that does not fit into any of the predefined categories that I trained the model on. Does anyone have some thoughts on how to do this? I was thinking of trying to calculate a confidence score for each document, and if it is below 80% confidence, for example, it should label the data …
Category: Data Science
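One common way to implement this rejection idea is to threshold predict_proba and fall back to an "unknown" label. A minimal sketch with a hypothetical toy corpus; note that Naive Bayes probabilities tend to be poorly calibrated, so the 80% cutoff should be validated, or the classifier wrapped in sklearn's CalibratedClassifierCV:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny stand-in corpus; the documents and labels are made up for illustration
docs = ["cheap loans apply now", "meeting agenda attached",
        "win a free prize", "quarterly report draft"]
labels = ["spam", "work", "spam", "work"]

vec = TfidfVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(docs), labels)

def classify_with_reject(text, threshold=0.8):
    """Return the predicted label, or 'unknown' if the top probability is below threshold."""
    proba = clf.predict_proba(vec.transform([text]))[0]
    best = np.argmax(proba)
    return clf.classes_[best] if proba[best] >= threshold else "unknown"

# None of these words are in the training vocabulary, so the model falls
# back to its class priors (0.5 each) and the text is rejected.
print(classify_with_reject("completely unrelated gibberish"))  # prints "unknown"
```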

Very low probability in naive Bayes classifier 1

I have some training data (TRAIN) and some test data (TEST). Each row of each table contains an observed class (X) and some columns of binary values (Y). I'm using a Python script that is intended to predict the probability (Pr) of X given Y in the test data, based on the training data. It uses a Bernoulli naive Bayes classifier. Here is my script: https://stackoverflow.com/questions/55187516/look-up-bernoullinb-probability-in-dataframe It works on the dummy data that is included with the script. On the real …
Category: Data Science
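For reference, the lookup the linked script performs boils down to BernoulliNB.predict_proba, whose columns follow clf.classes_. A tiny self-contained sketch with dummy data, following the question's convention that the class column is X and the binary columns are Y (very small probabilities usually come from multiplying many per-feature likelihoods, which is expected behaviour):

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Dummy TRAIN table: X is the observed class, Y holds the binary feature columns
train_Y = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 1, 0]])
train_X = np.array(['a', 'a', 'b', 'b'])
test_Y = np.array([[1, 0, 1], [0, 0, 0]])

clf = BernoulliNB().fit(train_Y, train_X)
# Pr(X | Y) for each test row; columns are ordered as clf.classes_
print(clf.classes_)
print(clf.predict_proba(test_Y))
```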

Bad Input Shape -- How to interpret and Diagnose; Also side ML question

I apologize, I am an ML novice, but I am trying to learn. I am making a classifier based on this dataset to predict mental health disorders from features. I wanted to run a very simple NB classifier model, but I keep getting a bad input shape error (I want to feed in features such as age, ethnicity and gender to yield potential diagnoses). Unfortunately, I am having trouble diagnosing where my error is coming from and troubleshooting. Any …
Category: Data Science
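A "bad input shape" error from scikit-learn usually means X is 1-D (a single column) or y is 2-D (a one-column DataFrame instead of a Series). A hedged sketch with a made-up stand-in for the survey data, encoding the categorical features with OrdinalEncoder and fitting CategoricalNB (the column names and values here are assumptions, not the actual dataset):

```python
import pandas as pd
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical stand-in for the survey dataset
df = pd.DataFrame({'age': [25, 40, 31, 25],
                   'ethnicity': ['A', 'B', 'A', 'C'],
                   'gender': ['M', 'F', 'F', 'M'],
                   'diagnosis': ['anxiety', 'none', 'depression', 'anxiety']})

# X must be a 2-D (n_samples, n_features) array; OrdinalEncoder maps each
# categorical column to integer codes, which is what CategoricalNB expects
X = OrdinalEncoder().fit_transform(df[['age', 'ethnicity', 'gender']])
y = df['diagnosis']          # a 1-D Series, not df[['diagnosis']]

clf = CategoricalNB().fit(X, y)
print(clf.predict(X[:1]))
```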

How to calculate true positives, true negatives, false positives and false negatives with a Bayes classifier from scratch

I am working on implementing a Naive Bayes classification algorithm. I have a method def prob_continous_value which is supposed to return the probability density function for an attribute given a class attribute. The problem requires classifying the following datasets:

Venue,color,Model,Category,Location,weight,Veriety,Material,Volume
1,6,4,4,4,1,1,1,6
2,5,4,4,4,2,6,1,1
1,6,2,1,4,1,4,2,4
1,6,2,1,4,1,2,1,2
2,6,5,5,5,2,2,1,2
1,5,4,4,4,1,6,2,2
1,3,3,3,3,1,6,2,2
1,5,2,1,1,1,2,1,2
1,4,4,4,1,1,5,3,6
1,4,4,4,4,1,6,4,6
2,5,4,4,4,2,4,4,1
2,4,3,3,3,2,1,1,1

Venue,color,Model,Category,Location,weight,Veriety,Material,Volume
2,6,4,4,4,2,2,1,1
1,2,4,4,4,1,6,2,6
1,5,4,4,4,1,2,1,6
2,4,4,4,4,2,6,1,4
1,4,4,4,4,1,2,2,2
2,4,3,3,3,2,1,1,1
1,5,2,1,4,1,6,2,6
1,2,3,3,3,1,2,1,6
2,6,4,4,4,2,3,1,1
1,4,4,4,4,1,2,1,6
1,5,4,4,4,1,2,1,4
1,4,5,5,5,1,6,2,4
2,5,4,4,4,2,3,1,1

The code for this is written like so:

from numpy.core.defchararray import count, index
import …
Category: Data Science
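Two of the building blocks the question needs can be written in a few lines of plain Python: the Gaussian density used by a prob_continous_value-style method, and TP/TN/FP/FN counts computed per class in one-vs-rest fashion. The function names below are illustrative, not the asker's actual code:

```python
import math

def prob_continuous_value(x, mean, std):
    """Gaussian probability density of x under a class-conditional N(mean, std^2)."""
    var = std ** 2
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def confusion_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN treating one class as 'positive' (one-vs-rest)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

print(confusion_counts([1, 1, 2, 2], [1, 2, 2, 2], positive=1))  # (1, 2, 0, 1)
```

For a multi-class problem such as the Volume column above, calling confusion_counts once per class value gives the full one-vs-rest breakdown.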

How can I create a "trained" dataset for categorizing news articles?

I am trying to automatically categorize news articles according to their primary topics, i.e. politics, entertainment, sports, business, technology, health, etc. There are some labeled datasets out there, but ideally I would like to create my own (for potential commercial usage later on). I am using python, but an answer clear enough with relation to any language would be sufficient. So, what would be the best way to go about this task? My current thoughts are: Determine the most popular …
Category: Data Science

How to deal with missing data for Bernoulli Naive Bayes?

I am dealing with a dataset of categorical data that looks like this:

   content_1  content_2  content_4  content_5  content_6
0        NaN        0.0        0.0        0.0        NaN
1        NaN        0.0        0.0        0.0        NaN
2        NaN        NaN        NaN        NaN        NaN
3        0.0        NaN        0.0        NaN        0.0

These represent user downloads from an intranet, where a user is shown the opportunity to download a particular piece of content. 1 indicates a user seeing content and downloading it, 0 indicates a user seeing content and not …
Category: Data Science
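Since NaN here means "never shown" rather than 0, imputing zeros would bias the estimates downward. sklearn's BernoulliNB has no native missing-value support, but the per-class Bernoulli parameters can be estimated over the observed cells only, as in this hand-rolled sketch (the Laplace smoothing and all variable names are my assumptions):

```python
import numpy as np

# Stand-in for the downloads table: 1 = saw and downloaded, 0 = saw and skipped,
# NaN = never shown, i.e. genuinely missing rather than a zero.
X = np.array([[np.nan, 0, 1], [1, 0, np.nan], [np.nan, 1, 1], [0, np.nan, 0]])
y = np.array([0, 0, 1, 1])

def bernoulli_params_ignoring_nan(X, y, alpha=1.0):
    """Estimate P(feature=1 | class) per class, using only observed (non-NaN) cells."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        seen = ~np.isnan(Xc)
        # Laplace-smoothed mean over the observed entries of each column
        params[c] = (np.nansum(Xc, axis=0) + alpha) / (seen.sum(axis=0) + 2 * alpha)
    return params

print(bernoulli_params_ignoring_nan(X, y))
```

Prediction would then multiply, per class, only the likelihood terms for features that are observed in the row being scored, again skipping the NaNs.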

Naive Bayes Predict type = 'raw' returning NA

I have built a naive Bayes model for text classification. It is predicting correctly, but it returns 'NA' in the prediction results if I set type = 'raw'. I have seen some suggestions on Stack Overflow to add some noise; when I do that, I get all of category A as 0's and all of category B as 1's. How can I get correct probabilities from naive Bayes?

library('tm')
library('e1071')
library('SparseM')
Sample_data <- read.csv("products.csv")
traindata <- as.data.frame(Sample_data[1:60, c(1, 2)])
testdata <- as.data.frame(Sample_data[61:80, c(1, 2)])
trainvector <- as.vector(traindata$Description)
testvector <- as.vector(testdata$Description)
trainsource …
Category: Data Science

Approach to text mining and preparing tokens, irrelevant words, low accuracy

For the purposes of a quite big project I am doing text mining on some documents. My steps are quite common:

- All to lower case
- Tokenization
- Stop list and stop words
- Lemmatization
- Stemming
- Some other steps, like removing symbols

Then I prepare a bag of words, make a DTM and classify into 3 classes with SVM and Naive Bayes. But the accuracy I get is not too high (50-60%). I think that may be because in the array of words after all the steps …
Category: Data Science
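The listed preprocessing steps can be sketched in a few lines: lowercasing, tokenization, symbol removal and stop-word filtering fit in plain Python, while lemmatization and stemming would typically come from a library such as NLTK or spaCy. The stop-word set below is a tiny illustrative subset, not a real stop list:

```python
import re

STOPWORDS = {'the', 'a', 'an', 'is', 'and', 'of', 'to', 'in'}  # illustrative subset

def preprocess(text):
    """Lowercase, tokenize, drop symbols and stop words.

    Lemmatization/stemming would be added here via NLTK or spaCy.
    """
    tokens = re.findall(r'[a-z]+', text.lower())   # tokenization + symbol removal
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("The QUICK, brown fox is running to the barn!"))
# ['quick', 'brown', 'fox', 'running', 'barn']
```

That said, 50-60% accuracy on 3 classes is more often a symptom of weak features or class imbalance than of the preprocessing itself; comparing against a majority-class baseline is a useful first check.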

Language Detection using pycld2

I am trying to use the pycld2 package to detect multiple languages in text. This package provides Python bindings for Compact Language Detect 2 (CLD2). This is the example I am testing out:

import pycld2 as cld2

text = '''The universal connection with an additional advantage: Push-in connection. Terminate solid and stranded (Class B 7 strands or less), as well as ferruled conductors, by simply pushing them in – no tools required. La connessione universale con un ulteriore vantaggio: …
Category: Data Science

Naive Bayes TfidfVectorizer predicts everything to one class

I'm trying to run a Multinomial Naive Bayes classifier on various balanced data sets, comparing 2 different vectorizers: TfidfVectorizer and CountVectorizer. I have 3 classes: NEG, NEU and POS, and 10000 documents. The NEG class has 2474, NEU 5894 and POS 1632. Out of that I have made 3 differently balanced data sets like this:

text counts:          NEU    NEG    POS    Total number
NEU balance dataset   5894   2474   1632   10000
NEG balance dataset   2474   2474   1632   6580
POS balance dataset   1632   1632   …
Category: Data Science

How to make use of POS tags as useful features for a NaiveBayesClassifier for sentiment analysis?

I'm doing sentiment analysis on a Twitter dataset (problem link). I have extracted the POS tags from the tweets, created tf-idf vectors from the POS tags, and used them as a feature (got an accuracy of 65%). But I think we can achieve a lot more with POS tags, since they help to distinguish how a word is being used within the scope of a phrase. The model I'm training is MultinomialNB(). The problem I'm trying to solve is to …
Category: Data Science
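Rather than replacing word features with POS features, the two spaces can be stacked side by side with scipy.sparse.hstack, so MultinomialNB sees both. A sketch with hypothetical tweets whose POS tag sequences are assumed to be already extracted, as in the question:

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical tweets with their already-extracted POS tag sequences
tweets = ["love this phone", "hate the battery", "great camera", "terrible screen"]
pos_tags = ["VB DT NN", "VB DT NN", "JJ NN", "JJ NN"]
labels = [1, 0, 1, 0]

word_vec = TfidfVectorizer()
pos_vec = TfidfVectorizer(ngram_range=(1, 2))   # POS bigrams capture local syntax

# Stack the two feature spaces side by side instead of using one or the other
X = hstack([word_vec.fit_transform(tweets),
            pos_vec.fit_transform(pos_tags)]).tocsr()
clf = MultinomialNB().fit(X, labels)
print(clf.predict(X[:1]))
```

POS n-grams of length 2-3 are usually more informative than unigram tags, since a lone tag like NN carries little sentiment signal by itself.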

How are the weights defined in a (linear-chain) Conditional Random Field?

Edit: I saw that I mixed up i (in the graph) and t (in the formula); in the following, i is equivalent to t. I am trying to understand the theory behind linear-chain Conditional Random Fields. I have now read "An Introduction to Conditional Random Fields" by Sutton and McCallum; I think McCallum is one of the "inventors" of CRFs. In this work you can find the following representation of the CRF as a graph (I added some annotations to …
Category: Data Science

Really confused with characteristics of Naive Bayes classifiers?

Naive Bayes classifiers have the following characteristics: They are robust to isolated noise points, because such points are averaged out when estimating conditional probabilities from data. Naive Bayes classifiers can also handle missing values, by ignoring the example during model building and classification. They are robust to irrelevant attributes: if X_i is an irrelevant attribute, then P(X_i | Y) becomes almost uniformly distributed, so the class-conditional probability for X_i has no impact on the overall computation of the posterior probability. I barely understand anything …
Category: Data Science

Confused on Naive Bayes classifier

In the last part of Andrew Ng's lectures on Gaussian Discriminant Analysis and the Naive Bayes classifier, I am confused as to how Andrew Ng derived $(2^n) - 1$ features for the Naive Bayes classifier. First off, what does he mean by features in the context he was describing? I initially thought that the features were characteristics of our random vector, $x$. I know that the total number of possibilities for $x$ is $2^n$, but I do not understand how he was …
Category: Data Science

What type of machine learning should I implement for this case?

I'm still a newbie in machine learning and I need an algorithm that can study linear functions. It doesn't have to be a function as such, as I have x and y coordinates and I can feed it those. What it should do is look at certain points and determine whether the line leading to them is straight or not, calculate how many points there are, etc., and return a value from 0 to 1 describing the probability this is event a if its …
Category: Data Science

Naive Bayes classifier's working principle raises a question

The Naive Bayes classifier works on the principle of conditional independence. Take an example: a bank manager wants to know how risky it is to approve a loan for a customer, depending upon the customer's background, like income, job, account balance, previous loan history, the customer's property, etc. Now we often see that, to find out whether a loan approval is risky or not, the Naive Bayes classifier works well for prediction. However, the attributes that I defined above are definitely depended …
Category: Data Science

Naive Bayes and Support Vector Machine (NBSVM) Classification

I am relatively new to data science and have a question about NBSVM. I have a two-class problem and text data (headlines from the newspaper). I want to use NBSVM to predict whether a headline has the label 0 or 1. As I understood it, this is how I have to proceed: convert the headlines to a document-term matrix, then calculate the log-count ratio. As I understood it, these are the probabilities of the individual documents for a class (i.e. the …
Category: Data Science

Create dataset from 3d batches of feature maps for classification model

I am trying to create a dataset of batches of several volumetric features and labels. These are NIfTI volumes with features extracted from the brain. I am trying to create pairs of features X and labels y for a further classification model. So the output feature array should be:

X = np.array([[feature1[batch1, batch2, …], feature2[batch1, batch2, …], feature3[batch1, batch2, …], [batch1, batch2, …]])
Y = np.array([[[1], [0]...])

I am stuck on an error when concatenating it into one dataset: ValueError: all the input arrays …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.