I have a classification problem with data that comes in pairs. A pair consists of two datapoints, (A, B) or (B, A), each datapoint containing 20 features. After receiving about 30 pairs, my goal is to separate the A and B classes with a GMM based on feature similarity. For each datapoint it is not known beforehand which class it belongs to, but it is known that it is of the opposite class from the other datapoint in its pair. Is there any …
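One possible direction the question could take (a minimal sketch, not a fully constrained GMM): fit an unconstrained two-component GMM, then enforce the opposite-class constraint at assignment time by choosing, for each pair, the label combination with the higher joint responsibility. The synthetic data and sizes below are placeholders matching the question's description.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n_pairs, n_features = 30, 20
A = rng.normal(0.0, 1.0, size=(n_pairs, n_features))  # stand-in class A points
B = rng.normal(1.5, 1.0, size=(n_pairs, n_features))  # stand-in class B points
X = np.vstack([A, B])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
resp = gmm.predict_proba(X)  # responsibilities, shape (2 * n_pairs, 2)

labels = np.empty(len(X), dtype=int)
for i in range(n_pairs):
    a, b = i, n_pairs + i  # indices of the two points in pair i
    # Pick the joint assignment (a->0, b->1) vs (a->1, b->0) with the
    # higher product of responsibilities, so each pair gets opposite labels.
    if resp[a, 0] * resp[b, 1] >= resp[a, 1] * resp[b, 0]:
        labels[a], labels[b] = 0, 1
    else:
        labels[a], labels[b] = 1, 0
```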
I want to specify a probabilistic clustering model (such as a mixture model or LDA) over words, but instead of the traditional approach of representing words as indicator vectors, I want to use the corresponding word embeddings extracted from word2vec, GloVe, etc. as input. While treating the word embeddings from my word2vec model as input to my GMM, I observed that my word embeddings had a normal distribution in each feature, i.e. features 1..100 were normally distributed …
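To make the setup concrete, here is a minimal sketch of clustering word embeddings with a GMM; the random matrix is a stand-in for real word2vec/GloVe vectors, and the shapes and `covariance_type` choice are just assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n_words, dim, n_clusters = 1000, 100, 10
embeddings = rng.normal(size=(n_words, dim))  # placeholder for real embeddings

gmm = GaussianMixture(n_components=n_clusters, covariance_type="diag",
                      random_state=0).fit(embeddings)
soft_assignments = gmm.predict_proba(embeddings)  # soft word-to-cluster weights
hard_assignments = gmm.predict(embeddings)        # hard cluster labels
```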
By using this code, can I compute the AUC?

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.mixture import GaussianMixture

clf = GaussianMixture(n_components=3).fit(X_train)
scores = clf.score_samples(X_test)  # per-sample log-likelihood under the GMM
scores = np.exp(scores)             # scores are now the PDF values
fpr, tpr, _ = roc_curve(Y_true, scores)
roc_auc = auc(fpr, tpr)
```

For the `roc_curve` function, is it correct to pass the scores as probability densities, or should they be probabilities?
Why does the Gaussian mixture model use expectation maximization instead of gradient descent? What other models use expectation maximization, rather than gradient descent, to find optimal parameters?
When using the EM algorithm for Gaussian mixture models (GMMs), in the E-step we take each x in the training dataset to calculate responsibilities, and then update the "weight" and parameters of each cluster's Gaussian distribution in the M-step. I have read that we do this until it converges. I am a little confused here. Does that mean it loops through the whole training dataset X in "one step" of the EM algorithm? Or does "one step" correspond to …
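To make the question concrete, here is a minimal sketch of what I understand "one step" to be: a single E-step over the entire dataset, followed by a single M-step (the ridge term on the covariances is only for numerical stability, and `multivariate_normal` is used for the component densities).

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(X, weights, means, covs):
    n, k = X.shape[0], len(weights)
    # E-step: responsibilities r[i, j] = P(component j | x_i), computed
    # for EVERY point in X before any parameter is updated.
    r = np.zeros((n, k))
    for j in range(k):
        r[:, j] = weights[j] * multivariate_normal.pdf(X, means[j], covs[j])
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and covariances from the
    # responsibilities of the whole dataset.
    nk = r.sum(axis=0)
    weights = nk / n
    means = (r.T @ X) / nk[:, None]
    covs = []
    for j in range(k):
        d = X - means[j]
        covs.append((r[:, j, None] * d).T @ d / nk[j] + 1e-6 * np.eye(X.shape[1]))
    return weights, means, np.array(covs)
```

Under this reading, "until it converges" would mean repeating `em_step` over the full dataset until the log-likelihood stops improving by more than some tolerance.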
I have a feature vector with different data types. Considering all the data in that feature vector, I have to classify it as Good or Bad. Which algorithm should be used to output Good or Bad given the different data types in the feature vector? The feature vector is as follows: [Application_Name (string), Uptime (integer), Criticality factor (float in 0–1), and a few more integer-typed fields].
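One common approach, sketched under assumptions (column names and toy rows below are hypothetical stand-ins for the real data): one-hot encode the string field, pass the numeric fields through unchanged, and fit a tree-based classifier with scikit-learn.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data matching the described feature vector.
df = pd.DataFrame({
    "Application_Name": ["app_a", "app_b", "app_a", "app_c"],
    "Uptime": [120, 5, 300, 42],
    "Criticality": [0.9, 0.2, 0.7, 0.5],
    "Label": ["Good", "Bad", "Good", "Bad"],
})

# One-hot encode the string column; numeric columns pass through as-is.
pre = ColumnTransformer(
    [("name", OneHotEncoder(handle_unknown="ignore"), ["Application_Name"])],
    remainder="passthrough",
)

model = Pipeline([("pre", pre), ("clf", RandomForestClassifier(random_state=0))])
features = df[["Application_Name", "Uptime", "Criticality"]]
model.fit(features, df["Label"])
print(model.predict(features))  # -> array of "Good"/"Bad" labels
```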