Is there a Gaussian Mixture Model for data with opposing pairs?

I have a classification problem with data that comes in pairs. A pair consists of two datapoints, ordered either (A,B) or (B,A), each datapoint containing 20 features. After receiving about 30 pairs, my goal is to separate the A and B classes with a GMM based on feature similarity. For each datapoint, it is not known beforehand which class it belongs to, but it is known that it is of the opposite class to the other datapoint in its pair. Is there any …
Category: Data Science
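
One way to read the pairing constraint above is that the hidden variable belongs to the pair rather than to each datapoint: either the pair is ordered (A,B) or it is ordered (B,A). The following is a minimal sketch of an EM loop built on that idea, not an established library routine. It assumes exactly two classes, diagonal covariances, and equal prior probability for both orderings; all function names (`log_diag_gauss`, `pair_responsibility`, `weighted_stats`, `fit_paired_gmm`) are hypothetical.

```python
import math


def log_diag_gauss(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def pair_responsibility(x1, x2, params_a, params_b):
    """Posterior probability that the pair is ordered (A, B), not (B, A).

    Assumes both orderings are equally likely a priori.
    """
    log_ab = log_diag_gauss(x1, *params_a) + log_diag_gauss(x2, *params_b)
    log_ba = log_diag_gauss(x1, *params_b) + log_diag_gauss(x2, *params_a)
    d = log_ab - log_ba
    # Numerically stable logistic function of the log-odds d.
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


def weighted_stats(points, weights, min_var=1e-6):
    """Weighted per-feature mean and variance (variance floored at min_var)."""
    total = sum(weights)
    dim = len(points[0])
    mean = [sum(w * p[d] for p, w in zip(points, weights)) / total
            for d in range(dim)]
    var = [max(sum(w * (p[d] - mean[d]) ** 2
                   for p, w in zip(points, weights)) / total, min_var)
           for d in range(dim)]
    return mean, var


def fit_paired_gmm(pairs, n_iter=50):
    """EM for two Gaussians where each pair holds one point of each class."""
    all_points = [p for pair in pairs for p in pair]
    _, var0 = weighted_stats(all_points, [1.0] * len(all_points))
    # Break the symmetry by seeding the two means with the first pair.
    params_a = (list(pairs[0][0]), list(var0))
    params_b = (list(pairs[0][1]), list(var0))
    for _ in range(n_iter):
        # E-step: one responsibility per pair, not per point.
        resp = [pair_responsibility(x1, x2, params_a, params_b)
                for x1, x2 in pairs]
        # M-step: the first point counts for A with weight r, the second
        # with weight 1 - r; class B gets the complementary weights.
        weights_a = [w for r in resp for w in (r, 1.0 - r)]
        weights_b = [1.0 - w for w in weights_a]
        params_a = weighted_stats(all_points, weights_a)
        params_b = weighted_stats(all_points, weights_b)
    resp = [pair_responsibility(x1, x2, params_a, params_b)
            for x1, x2 in pairs]
    return params_a, params_b, resp
```

Calling `fit_paired_gmm(pairs)` with a list of `(x1, x2)` tuples returns the two parameter sets and, for each pair, the probability that its first datapoint belongs to the cluster seeded by the first pair. Which cluster corresponds to the real class A is arbitrary, as with any unsupervised mixture model.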

Treating Word Embeddings as Multivariate Gaussian Random Variables

I want to specify a probabilistic clustering model (such as a mixture model or LDA) over words, and instead of the traditional representation of words as indicator vectors, I want to use the corresponding word embeddings extracted from word2vec, GloVe, etc. as input. While using the word2vec embeddings as input to my GMM, I observed that each embedding feature was normally distributed, i.e. features 1..100 were normally distributed …
Category: Data Science

Gaussian Mixture Models Clustering

When using the EM algorithm for Gaussian Mixture Models (GMM), in the E-step we use each point x in the training dataset to calculate the "weight" of each Gaussian, and then update the parameters of each cluster's Gaussian distribution (M-step). I have read that we repeat this until it converges. I am a little confused here. Does that mean it loops through the whole training dataset X every time in "one step" of the EM algorithm? Or is "one step" corresponding to …
Category: Data Science
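
To make the question above concrete: in the standard batch form of EM, one iteration is a full E-step over every point in X followed by one M-step, and the iterations repeat until the log-likelihood stops improving. Below is a minimal one-dimensional sketch of that loop. The function names (`normal_pdf`, `em_iteration`, `log_likelihood`, `fit`) and the stopping tolerance are illustrative assumptions, not taken from any particular library.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_iteration(data, weights, means, variances):
    """One EM iteration: an E-step over ALL points, then one M-step."""
    k = len(weights)
    # E-step: responsibility of each component for each data point.
    resp = []
    for x in data:
        joint = [weights[j] * normal_pdf(x, means[j], variances[j])
                 for j in range(k)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    # M-step: re-estimate every component from the weighted full dataset.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # floor to avoid a collapsed component
    return new_w, new_m, new_v


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data)


def fit(data, weights, means, variances, tol=1e-8, max_iter=200):
    """Repeat full-dataset EM iterations until the log-likelihood plateaus."""
    prev = log_likelihood(data, weights, means, variances)
    for _ in range(max_iter):
        weights, means, variances = em_iteration(data, weights, means, variances)
        cur = log_likelihood(data, weights, means, variances)
        if cur - prev < tol:
            break
        prev = cur
    return weights, means, variances
```

Each call to `em_iteration` touches every element of `data` once in the E-step and once more in the M-step, so "one step" here means one pass over the whole training set rather than one data point.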

Which algorithm or tool to use to classify as good or bad?

I have a feature vector with mixed data types. Considering all the data in that feature vector, I have to classify it as Good or Bad. Which algorithm should be used to get just the output Good or Bad from the different data types in a feature vector? The feature vector is as follows: [Application_Name (string), Uptime (integer), Criticality factor (float between 0 and 1), and a few integer-type fields]
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.