I am a beginner in machine learning. My project is to build an AI-based search engine that shows related articles when a user searches on a website. For this I decided to train my own embeddings. I found two methods: one is to train a network to predict the next word (i.e. inputs = [the quick, the quick brown, the quick brown fox] and outputs = [brown, fox, lazy]); the other is to train with nearest words (i.e. [brown, fox], [brown, quick], [brown, quick]). Which method should I use, and after training how should I …
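For reference, the second method is essentially word2vec's skip-gram objective, while the first is closer to a next-word language model. If word2vec-style embeddings are the goal, a library such as Gensim exposes both classic variants behind one flag; a minimal sketch, with a toy corpus and hyperparameters of my own choosing:

```python
# sg=1 trains skip-gram (predict nearby words from the centre word);
# sg=0 trains CBOW (predict a word from its averaged context).
from gensim.models import Word2Vec

corpus = [["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]]

model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

vector = model.wv["fox"]                 # learned embedding for one word
similar = model.wv.most_similar("fox")   # nearest words by cosine similarity
```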
Even if a time series is made up of numbers only, finding an abstract fixed-dimensional vector representation would be interesting for classification/clustering purposes. Just as we can learn abstract representations/embeddings of text and images, can we do something similar for time series? Finding such methods would enable better clustering and related tasks than traditional approaches based on statistical measures like Pearson correlation. All thoughts are welcome.
I want to use a transformer model for classification of fixed-length time series. I was following along with this Keras tutorial, which uses time2vec as a positional embedding. According to the original time2vec paper, the representation is calculated as $$ \boldsymbol{t2v}(\tau)[i] = \begin{cases} \omega_i \tau + \phi_i,& i = 0\\ F(\omega_i \tau + \phi_i), & 1 \leq i \leq k \end{cases} $$ The mentioned tutorial simply concatenates this embedding with the input. Now, I understand the intention of the …
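For reference, a minimal sketch of how that definition might look as a custom Keras layer, taking $F = \sin$ as in the paper's experiments; the class name, shapes, and initializers are my own choices, not necessarily what the tutorial does:

```python
import tensorflow as tf

class Time2Vec(tf.keras.layers.Layer):
    """t2v(tau): one linear component (i = 0) plus k periodic components."""
    def __init__(self, k, **kwargs):
        super().__init__(**kwargs)
        self.k = k

    def build(self, input_shape):
        self.w0 = self.add_weight(name="w0", shape=(1,), initializer="uniform")
        self.p0 = self.add_weight(name="p0", shape=(1,), initializer="uniform")
        self.W = self.add_weight(name="W", shape=(self.k,), initializer="uniform")
        self.P = self.add_weight(name="P", shape=(self.k,), initializer="uniform")

    def call(self, tau):
        # tau: (batch, seq_len, 1) time indices
        linear = self.w0 * tau + self.p0               # the i = 0 term
        periodic = tf.sin(self.W * tau + self.P)       # the 1 <= i <= k terms
        return tf.concat([linear, periodic], axis=-1)  # (batch, seq_len, k + 1)
```

The tutorial's concatenation step then just appends this (k + 1)-dimensional output to the raw input features along the channel axis.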
I'm using this Universal Sentence Encoder (USE) model to get embeddings for a set of texts, each text corresponding to a newspaper article. In order to build a recommender system, I generate user embeddings by averaging the embeddings of the items a user has read, and then I look for other texts that are cosine-similar to this user (basically, the method returns a set of items that are similar to this user embedding). Now, the problem is that the mentioned model …
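In code, the described pipeline might look roughly like this; the TF-Hub URL is the standard USE v4 module, and everything else (variable names, the toy read-history) is illustrative:

```python
import numpy as np
import tensorflow_hub as hub

use = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

article_texts = ["article one ...", "article two ...", "article three ..."]
article_emb = np.asarray(use(article_texts))      # (n_articles, 512)

read_idx = [0, 2]                                 # items this user has read
user_emb = article_emb[read_idx].mean(axis=0)     # average -> user profile

# cosine similarity between the user vector and every article
sims = article_emb @ user_emb / (
    np.linalg.norm(article_emb, axis=1) * np.linalg.norm(user_emb))
recommended = np.argsort(-sims)                   # most similar first
```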
I have a set of data that contains sequences of different lengths; on average the sequence length is 600. The dataset looks like this:

    S1 = ['Walk','Eat','Going school','Eat','Watching movie','Walk',......,'Sleep']
    S2 = ['Eat','Eat','Going school','Walk','Walk','Watching movie',.......,'Eat']
    .........................................
    .........................................
    S50 = ['Walk','Going school','Eat','Eat','Watching movie','Sleep',.......,'Walk']

The number of unique actions in the dataset is fixed, which means some sequences may not contain all of the actions. Using Doc2Vec (the Gensim implementation in particular), I was able to extract an embedding for each of the sequences …
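For reference, a minimal Gensim Doc2Vec sketch of the described setup; the hyperparameters are illustrative, not tuned:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

sequences = [
    ["Walk", "Eat", "Going school", "Eat", "Watching movie", "Walk", "Sleep"],
    ["Eat", "Eat", "Going school", "Walk", "Walk", "Watching movie", "Eat"],
]

# tag each sequence so its vector can be looked up afterwards
docs = [TaggedDocument(words=seq, tags=[f"S{i + 1}"]) for i, seq in enumerate(sequences)]
model = Doc2Vec(docs, vector_size=32, window=3, min_count=1, epochs=50)

embedding_s1 = model.dv["S1"]   # fixed-size vector for the first sequence
```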
OK, let's say we have well-labeled images with non-discrete labels such as brightness or size, and we want to generate images conditioned on them. With a discrete label it could be done like this:

    def forward(self, inputs, label):
        self.batch = inputs.size(0)
        h = self.res1(inputs)
        h = self.attn(h)
        ...
        h = self.res5(h)
        # global sum pooling over the 4x4 spatial positions
        h = torch.sum(F.leaky_relu(h, 0.2).view(self.batch, -1, 4 * 4), dim=2)
        outputs = self.fc(h)
        if label is not None:
            # projection-style conditioning: inner product of label embedding and features
            embed = self.embedding(label)
            outputs += torch.sum(embed * h, dim=1, keepdim=True)
        return outputs

The embedding can be made to …
I'm trying to solve the following problem: I need to train an autoencoder to extract useful features from text, and I will then use the trained autoencoder in another model as a feature extractor. The goal is to teach the autoencoder to compress the information and then reconstruct the exact same string; I treat this as a classification problem over each character. My dataset:

    X_train_autoencoder_raw:
    15298    some text...
    1127     some text...
    22270    more text...
    ...
    Name: data, Length: 28235, dtype: object

…
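A minimal sketch of what such a model could look like, assuming one-hot encoded, fixed-length strings; all sizes and names here are my own assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

max_len, n_chars, latent_dim = 100, 64, 32

inputs = tf.keras.Input(shape=(max_len, n_chars))
encoded = layers.LSTM(latent_dim)(inputs)                   # compress to a fixed vector
decoded = layers.RepeatVector(max_len)(encoded)             # expand back to sequence length
decoded = layers.LSTM(latent_dim, return_sequences=True)(decoded)
outputs = layers.TimeDistributed(
    layers.Dense(n_chars, activation="softmax"))(decoded)   # per-character classification

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="categorical_crossentropy")

encoder = tf.keras.Model(inputs, encoded)  # reusable later as the feature extractor
```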
I'm trying to build an encoder-decoder network in Keras to generate a sentence of a particular style. As my problem is unsupervised, i.e. I don't have ground truths for the generated sentences, I use a classifier to help during training: I pass the decoder's output into the classifier, which tells me what style the decoded sentence is. The decoder outputs a softmax distribution, which I was intending to feed straight into the classifier, but I realised that it has …
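One common way to keep that decoder-to-classifier path differentiable (an assumption about where the question is heading, not its stated solution) is to avoid a hard argmax + embedding lookup and instead take the expected embedding, i.e. the softmax distribution multiplied by an embedding matrix; a bias-free Dense layer implements exactly that product. A sketch with made-up names and sizes:

```python
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embed_dim, seq_len, n_styles = 5000, 128, 20, 2

decoder_probs = tf.keras.Input(shape=(seq_len, vocab_size))  # decoder softmax output

# (batch, seq_len, vocab) @ (vocab, embed_dim) -> (batch, seq_len, embed_dim)
soft_embed = layers.Dense(embed_dim, use_bias=False)(decoder_probs)

pooled = layers.GlobalAveragePooling1D()(soft_embed)
style_logits = layers.Dense(n_styles)(pooled)

style_classifier = tf.keras.Model(decoder_probs, style_logits)
```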
I have trained a triplet-loss model using FaceNet's architecture on the 11k Hands dataset. Now I want to see how well my model performs, so I feed it two images of the same class and get back their embeddings. I want to compare the distance between these embeddings, and if that distance is not larger than some threshold, I can say that the model correctly classifies these two images as belonging to the same class. How do I select the …
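One standard way to pick such a threshold (my suggestion, not necessarily what the question settles on) is to sweep over candidate values on a held-out set of genuine and impostor pairs and keep the one with the best balanced accuracy:

```python
import numpy as np

def pair_distances(emb_a, emb_b):
    # Euclidean distance between paired embeddings, as in FaceNet
    return np.linalg.norm(emb_a - emb_b, axis=1)

def best_threshold(dist_same, dist_diff):
    """dist_same: distances of same-class pairs; dist_diff: different-class pairs."""
    candidates = np.sort(np.concatenate([dist_same, dist_diff]))
    balanced_acc = [
        ((dist_same <= t).mean() + (dist_diff > t).mean()) / 2 for t in candidates
    ]
    return candidates[int(np.argmax(balanced_acc))]
```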
I am combining several vectors, where each vector is a certain kind of embedding of some object. Since the embeddings are very different (some have all components in $[0, 1]$, some have components around 60 or 70, etc.), I want to rescale the vectors before combining them. I thought about using something like min-max rescaling, but I'm not sure how to generalize it to vectors. I could do something of the sort $\frac{v-|v_{min}|}{|v_{max}|-|v_{min}|}$, but I …
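For what it's worth, one way to generalize min-max rescaling to whole vectors (an interpretation of the formula above, assuming $|\cdot|$ denotes the vector norm) is to rescale by the norms observed within each embedding family; a sketch:

```python
import numpy as np

def minmax_by_norm(vectors):
    """vectors: (n, d) array of embeddings of one kind; assumes norms are not all equal."""
    norms = np.linalg.norm(vectors, axis=1)
    lo, hi = norms.min(), norms.max()
    scale = (norms - lo) / (hi - lo)       # squash norms into [0, 1]
    unit = vectors / norms[:, None]        # keep each vector's direction
    return unit * scale[:, None]

# simpler alternative: L2-normalise, so every embedding kind has unit norm
def l2_normalise(vectors):
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
```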
There are several popular word embeddings available (e.g., fastText and GloVe); in short, these embeddings are a tool for encoding words along with a sensible notion of their semantics (i.e. words with similar semantics map to nearly parallel vectors). Question: is there a similar notion of character embedding? By 'character embedding' I mean an algorithm that allows us to encode characters so as to capture some syntactic similarity (i.e. similarity of character shapes or contexts).
From page 3 of the paper Knowledge Graph Embeddings and Explainable AI, the authors note: "Note that knowledge graph embeddings are different from Graph Neural Networks (GNNs). KG embedding models are in general shallow and linear models and should be distinguished from GNNs [78], which are neural networks that take relational structures as inputs." However, this is still vague to me. It seems that we can get embeddings from both of them. What is the difference? How should we choose …
For unsupervised text clustering, the key thing is the initial embedding of the text. If we want to use DeepCluster for text, the problem is how to get that initial embedding from a deep model; BERT does not give a good initial embedding. If we do not use a deep model, is there a better way to get embeddings than GloVe word vectors?
I am struggling to understand how word embedding works, especially how the embedding matrix $W$ and context matrix $W'$ are created/updated. I understand that the input may be a one-hot encoding of a given word $x_i$, and that the output may be the word most likely to appear near it. Would you have a very simple mathematical example?
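For what it's worth, here is a tiny made-up skip-gram example (vocabulary of 3 words, embedding dimension 2); all numbers are arbitrary:

$$
W = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
W' = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}
$$

For the one-hot input $x = (0, 1, 0)^\top$ (word 2), the hidden layer simply selects row 2 of $W$: $h = W^\top x = (0, 1)^\top$. The scores are $u = W'^\top h = (1, 0, 1)^\top$, and the softmax turns them into probabilities $\hat{y} \approx (0.42, 0.16, 0.42)$ over the 3 words. Training compares $\hat{y}$ with the one-hot vector of the word actually observed nearby and backpropagates the error, which updates all of $W'$ but only the selected row of $W$.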
I wrote an algorithm for generating node embeddings based on a graph's topology. Most of the explanation is in the readme file and the examples. The question is: am I reinventing the wheel? Does this approach have any practical advantage over existing solutions for embedding generation? Yes, I'm aware there are many algorithms for this based on random walks, but this one is purely deterministic linear algebra, and it is quite simple from my perspective. In short, the algorithm …
I welcome any suggestions for the following hard problem: I have a dataset of float feature vectors of size 512, where each feature vector is extracted from a face image. I want to generate a key from a given feature vector (the key can be a number/binary code/etc.) that is consistent for each person, without comparisons between feature vectors. The only input I have is the given feature vector. For example, if I see a photo of me, I want …
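One direction that fits the "no pairwise comparison" constraint (a suggestion of mine, not something from the question) is locality-sensitive hashing with random hyperplanes: a fixed random projection turns a 512-d vector into a short binary key, so similar faces tend to collide on the same key. Bit flips near hyperplane boundaries are unavoidable, so this only approximates per-person consistency. A sketch:

```python
import numpy as np

rng = np.random.default_rng(42)           # fixed seed -> the same hash function every run
planes = rng.standard_normal((64, 512))   # 64 random hyperplanes in feature space

def face_key(feature_vec):
    """Map a 512-d face embedding to a 64-bit integer key."""
    bits = (planes @ feature_vec) > 0                       # which side of each hyperplane
    return int.from_bytes(np.packbits(bits).tobytes(), "big")
```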
I'm puzzled about why averaging word embeddings works for obtaining a sentence embedding, in particular considering the exercise in this post: How to obtain vector representation of phrases using the embedding layer and do PCA with it. My question is really about understanding the theory behind that more practical post. The answer to the linked question uses a sentence-embedding method that averages the word embeddings (in the most naive and simplest …
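The method itself is just this (a two-line reference implementation; `emb` stands for any word-vector lookup such as word2vec, GloVe, or a trained embedding layer):

```python
import numpy as np

def sentence_embedding(tokens, emb):
    # the centroid of the word vectors is the sentence embedding
    return np.mean([emb[t] for t in tokens], axis=0)

# e.g. sentence_embedding(["the", "quick", "fox"], glove_vectors)
```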
I am a little bit confused about encoding categorical variables. There are other posts/blog posts on this issue, but none of them address the problem I am facing. I have a dataset with mixed variables (i.e., numerical as well as categorical). Some of the categorical variables have a lot of categories (close to 100), so instead of using one-hot encoders, I am looking into using embeddings. My goal is to: use the embeddings of the categorical variables and extract them and …
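For reference, a minimal sketch of the entity-embedding pattern in Keras for one high-cardinality column, and of reading the learned vectors back out afterwards; all sizes and names are illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers

n_categories, embed_dim = 100, 8

cat_in = tf.keras.Input(shape=(1,), dtype="int32")   # integer-coded category
num_in = tf.keras.Input(shape=(5,))                  # 5 numeric features

cat_emb = layers.Embedding(n_categories, embed_dim, name="cat_embedding")(cat_in)
x = layers.Concatenate()([layers.Flatten()(cat_emb), num_in])
out = layers.Dense(1)(layers.Dense(32, activation="relu")(x))

model = tf.keras.Model([cat_in, num_in], out)
model.compile(optimizer="adam", loss="mse")

# after model.fit(...): one learned vector per category, (n_categories, embed_dim)
category_vectors = model.get_layer("cat_embedding").get_weights()[0]
```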
I'm trying to use PyTorch-BigGraph pre-trained embeddings of Wikidata items for disambiguation. The problem is that the results I am getting with dot-product (or cosine) similarity are not great. For example, the similarity between the Python programming language and the snake of the same name is greater than that between Python and Django. Does anybody know of a Wikidata embedding that yields better similarities? The only alternative I've found is the Wembedder embeddings, but they are incomplete. …
I'm using an autoencoder for feature extraction. I'm stuck on how to choose a good number of dimensions for the encoder (latent) layer. After training on the dataset, the latent (embedding) layer contains some zero values in the resulting vectors. For example, with a 4-dimensional embedding layer, one sample's embedding is [0.67, 0.0, 2.13, 0.43], whereas I would expect all 4 values to be non-zero. I think my problem is that I chose too many …