Storing a Big Matrix in a Database
I have a MySQL database with a table in the following format:
id    string
1     foo1...
2     foo2...
..    ...
There are >100k entries in this db.
What I want to do is compare each string to every other string and store some metric of the comparison. Doing this will essentially yield a 2D matrix of size N x N, where N is the number of rows in the db.
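Roughly, the computation I have in mind looks like the following sketch (Python; `similarity_metric` is just a stand-in for whatever comparison I actually run):

```python
from itertools import combinations

def similarity_metric(a: str, b: str) -> float:
    """Placeholder for the actual comparison; assumed to return a float."""
    return float(a == b)  # stand-in logic only

# id -> string, loaded from the first table
strings = {1: "foo1...", 2: "foo2..."}

# Every unordered pair is compared once; for N = 100,000 that is
# N * (N - 1) / 2, i.e. roughly 5 * 10^9 comparisons and values to store.
scores = {}
for (id_a, s_a), (id_b, s_b) in combinations(strings.items(), 2):
    scores[(id_a, id_b)] = similarity_metric(s_a, s_b)
```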
My initial thought was to create another db where each row corresponds to the string with the same id in the first db, and each column holds the value from comparing the two strings. For example, id 1, column 2 in the second db would be the value output by comparing id 1 and id 2 in the first db.
The format of the second db:
id    col1   col2   col3   ....
1     1      0.4    0.5    .....
...   ...    ...    ...
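To make that layout concrete, this is roughly the DDL such a wide table would need (the table and column names `comparison_matrix`, `col1..colN` are just placeholders I made up for illustration):

```python
# Sketch of the wide-table layout described above, expressed as generated SQL.
N = 100_000  # number of rows in the first table

columns = ",\n  ".join(f"col{i} FLOAT" for i in range(1, N + 1))
ddl = f"CREATE TABLE comparison_matrix (\n  id INT PRIMARY KEY,\n  {columns}\n);"

# Each row would then need an INSERT carrying N values, e.g. for id 1:
# INSERT INTO comparison_matrix VALUES (1, 1.0, 0.4, 0.5, ...);
print(ddl[:200], "...")
```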
Creating the second db this way would result in 100k rows x 100k columns, which is the issue at hand. What is the best way to handle large data sets like this? Would storing the data in text files be more efficient (say, each text file corresponds to one row in the second db)?
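For the text-file idea, this is roughly what I picture (the file naming and comma-separated format are just assumptions for illustration):

```python
import os

def write_row_to_file(row_id: int, scores, out_dir: str = "matrix_rows") -> None:
    """Write one row of the matrix to its own text file, one value per column."""
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, f"row_{row_id}.txt"), "w") as f:
        f.write(",".join(f"{s:.6f}" for s in scores))

# Example: the row for id 1 from the table sketched above.
write_row_to_file(1, [1.0, 0.4, 0.5])
```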
Topic: relational-dbms, bigdata
Category: Data Science