GloVe Embedding Matrix "could not broadcast input array from shape (0) into shape (300)"
I'm working on the Quora Question Pairs dataset.
I'm trying to build an embedding matrix for GloVe with the following code:
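import numpy as np

EMBEDDING_DIM = 300
# Row 0 is left as zeros; the word_index indices start at 1.
embedding_matrix = np.zeros((len(word_index) + 1, EMBEDDING_DIM))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        # Words missing from the GloVe index keep an all-zero row.
        embedding_matrix[i] = embedding_vector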
However, I get the following error: ValueError: could not broadcast input array from shape (0) into shape (300)
I searched the internet but couldn't find any tips. This is how I load GloVe:
embeddings_index = {}
GLOVE = './glove.840B.300d.txt'
with open(GLOVE) as f:
    for line in f:
        word, coefs = line.split(maxsplit=1)
        coefs = np.fromstring(coefs, 'f', sep=' ')
        embeddings_index[word] = coefs
print('Found %s word vectors.' % len(embeddings_index))
I tried to keep it simple. If it helps, I can add the rest of the code. Any ideas on how to solve this error?
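One thing that may be worth checking, as an assumption rather than something confirmed here: the error says some stored vector has shape (0), which could happen if a token in the GloVe file itself contains spaces, so that splitting on the first space leaves non-numeric text at the start of coefs. The sketch below is not the loading code above. It uses two hypothetical helpers, parse_glove_line and find_bad_vectors, and a toy dimension of 3 in place of 300, to show how to list entries with the wrong shape and how taking the last dim fields of each line as the numbers would avoid the problem.

```python
import numpy as np


def parse_glove_line(line, dim):
    """Split one text line into (token, vector).

    The LAST `dim` space-separated fields are treated as the numbers,
    so a token that itself contains spaces stays in one piece.
    Returns None when the line has too few fields.
    """
    parts = line.rstrip().rsplit(' ', dim)
    if len(parts) != dim + 1:
        return None
    word = parts[0]
    vector = np.array([float(x) for x in parts[1:]], dtype='float32')
    return word, vector


def find_bad_vectors(embeddings_index, dim):
    """Return the tokens whose stored vector is not of shape (dim,)."""
    return [w for w, v in embeddings_index.items() if v.shape != (dim,)]


# Toy data with dim=3; the second line has a token made of three dots
# separated by spaces, which is the case assumed to cause the error.
sample_lines = [
    "hello 0.1 0.2 0.3\n",
    ". . . 0.4 0.5 0.6\n",
]

toy_index = {}
for line in sample_lines:
    parsed = parse_glove_line(line, 3)
    if parsed is not None:
        token, vec = parsed
        toy_index[token] = vec

# An index built this way should contain no wrongly shaped vectors.
bad_tokens = find_bad_vectors(toy_index, 3)
print(len(toy_index), "vectors loaded,", len(bad_tokens), "with wrong shape")
```

Running find_bad_vectors(embeddings_index, EMBEDDING_DIM) on the real index would show whether any entries are empty and which tokens they belong to. If the list is not empty, loading with parse_glove_line(line, EMBEDDING_DIM) in place of line.split(maxsplit=1) is one possible fix.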
Topic embeddings word-embeddings kaggle
Category Data Science