Data cleaning in pandas when the CSV file has all the data of each row in one field

I have really messy data that looks like this:

As you can see, all the data in each row is contained in one column, separated by semicolons.

How do I arrange this data so that it is spread out over multiple columns? For example, category_id, category_id_lvl_0, etc. should each be a separate column, and the semicolon-separated values in the rows underneath should fall under the matching column (category_id, category_id_lvl_0, ...).


That doesn't look like messy data to me; it is just a CSV file that uses ; as the delimiter. Depending on the regional settings, Excel can use different delimiters when saving data as a .csv file, ; being one of them. By default pandas assumes , as the delimiter, which does not apply in this case. Try reading the file with the correct delimiter, specified through the sep argument (delimiter is an alias for it), as follows:

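import pandas as pd

filename = "data.csv"  # placeholder path; replace with your own file

# sep=";" tells pandas that fields are separated by semicolons
df = pd.read_csv(filename, sep=";")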
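
To see the difference without needing the original file, here is a minimal sketch on a small in-memory sample. The sample rows and the name column are made up for illustration; only category_id and category_id_lvl_0 come from the question. It also shows one possible way to split the column afterwards, in case the data has already been loaded as a single column.

```python
import io

import pandas as pd

# Made-up sample data; only the first two column names come from the question.
raw = (
    "category_id;category_id_lvl_0;name\n"
    "1;10;shoes\n"
    "2;20;hats\n"
)

# With the default separator (","), every row lands in a single column.
one_col = pd.read_csv(io.StringIO(raw))

# With sep=";", the fields are split into separate columns.
df = pd.read_csv(io.StringIO(raw), sep=";")

# If the data is already loaded as one column, it can be split afterwards:
# the single header holds all column names, each cell holds one full row.
header = one_col.columns[0]
fixed = one_col[header].str.split(";", expand=True)
fixed.columns = header.split(";")

print(one_col.shape)
print(df.columns.tolist())
print(fixed.columns.tolist())
```

Note that the values in fixed are all strings after the split, whereas read_csv with sep=";" infers numeric types, so re-reading the file with the right separator is usually the simpler route.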
