Should I use MongoDB instead of storing data in CSV files in Python?

I am currently storing data crawled from multiple websites. The sites have similar but not identical structures, so each crawler saves its data in a separate CSV file. I am planning to store the data in MongoDB instead of CSV.

  1. Will this save space?
  2. Overall, will this be advantageous, or are there drawbacks apart from having to change the code?

Topic data csv mongodb python data-mining

Category Data Science


If you're a beginner, it might be useful to experiment with a database. In realistic settings, data usually lives in databases, so knowing how to interact with one is also important for a data scientist.

CSV data is tabular, while MongoDB stores documents made of key-value pairs, so it is also an opportunity to explore a new data representation scheme.
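To make the difference between the two representations concrete, here is a minimal sketch using only the Python standard library. The two records and their field names are made up for illustration; they stand in for the output of two crawlers whose sites differ slightly.

```python
import csv
import io

# Hypothetical records from two crawlers; the field names are assumptions.
records = [
    {"url": "https://example.com/a", "title": "Item A", "price": "9.99"},
    {"url": "https://example.org/b", "title": "Item B", "rating": "4.5"},
]

# Tabular form: one shared header, so every row needs a cell for every column.
fieldnames = sorted({key for record in records for key in record})
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()
print(csv_text)

# Document form: each record keeps only the fields it actually has.
for record in records:
    print(record)
```

In the CSV output the missing values show up as empty cells, whereas the dictionaries simply omit them. Which of the two is more convenient depends on how uniform your crawled data really is.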


CSV is a file format and MongoDB is a database.

That being said, data is passed to MongoDB as JSON-like documents (JSON stands for JavaScript Object Notation), which it stores internally in a binary form called BSON.
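As a rough illustration of what moving from CSV to JSON-style documents involves, the sketch below reads a CSV string and turns each row into a typed dictionary. The CSV content and the `row_to_document` helper are hypothetical, not part of any library.

```python
import csv
import io
import json

# Hypothetical CSV output of one crawler.
csv_text = "url,title,price\nhttps://example.com/a,Item A,9.99\n"


def row_to_document(row):
    """Turn one CSV row (all values are strings) into a dict with typed values."""
    document = dict(row)
    document["price"] = float(document["price"])  # CSV itself has no number type
    return document


documents = [row_to_document(row) for row in csv.DictReader(io.StringIO(csv_text))]

# Serialize the first document to JSON text to see the document shape.
json_text = json.dumps(documents[0])
print(json_text)
```

A list of dictionaries like `documents` is the kind of input that PyMongo's `insert_many` accepts. That step is left out here because it needs a running MongoDB server.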

It may save space, since MongoDB compresses data on disk by default, but every document also stores its own field names, so a saving is not guaranteed. However, you say the sites have "same but still different structure so every crawler is saving data in separate csv", so you would have to bring the data into a common form before saving it in a database, which adds complexity. Overall it depends on your usage, but the quick and dirty solution is a CSV file.
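On the space question, a quick way to get a feel for the trade-off is to encode the same rows once as CSV and once as one JSON object per line, then compare the byte counts. This is only a sketch with made-up rows: it measures plain text encodings, not MongoDB's actual on-disk size, which depends on BSON and on the storage engine's compression.

```python
import csv
import io
import json

# Hypothetical crawl results: 1000 rows with the same three fields.
rows = [
    {"url": f"https://example.com/{i}", "title": f"Item {i}", "price": i}
    for i in range(1000)
]

# CSV: the field names appear once, in the header.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["url", "title", "price"])
writer.writeheader()
writer.writerows(rows)
csv_bytes = len(buffer.getvalue().encode("utf-8"))

# JSON lines: the field names are repeated in every document (+1 for the newline).
json_bytes = sum(len(json.dumps(row).encode("utf-8")) + 1 for row in rows)

print(csv_bytes, json_bytes)
```

For uniform rows like these the uncompressed document form comes out larger, because the keys are repeated per record. To know what happens with your own data, measure it on a real sample.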
