Date time conversion in a CSV column

I am new to data science. I am attempting to write a program using regression techniques, and all of my values are numerical except for the date and time (UTC), which are written in this format: HH:MM:SS MM/DD/YY. The date and time are part of a CSV file and I do not know how to alter the column. I have looked around for how to convert this to a numerical value, but all the results put the date before the time. Other than that, I am having a hard time finding examples that convert more than a single date. If anyone could guide me on how to make the time and date readable by LinearRegression().fit() from the sklearn.linear_model library, I would greatly appreciate it.

P.S. Do I even have to convert it to a number? Can I keep it as the date and time or do I need to convert it?

EDIT:

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

algaeData = pd.read_csv(r'my_file').drop(columns=['Type', 'Device Type', 'Device S/N', 'Mooring', 'MRPT  NOTES'])
algaeData['Date (UTC)'] = pd.to_datetime(algaeData['Date (UTC)'], format='%H:%M:%S %m/%d/%y')

x = algaeData.drop(columns=['BGA (ug/L) (ug/L)'])
y = algaeData['BGA (ug/L) (ug/L)']
x, y = np.array(x), np.array(y)

model = LinearRegression().fit(x, y)

Topic dataframe data-formats python

Category Data Science


If you're using pandas you can convert your column pretty easily using

df['col'] = pd.to_datetime(df['col'], format='%H:%M:%S %m/%d/%y')

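That will parse your dates into a datetime64[ns] column. As far as I know, though, LinearRegression expects numeric features, so you will most likely still need to convert the datetimes to a number (for example, seconds since the Unix epoch) before using the column as a predictor.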
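
A minimal sketch of the whole round trip, assuming a small made-up frame in place of your CSV (the values, the shortened 'BGA' column name, and the 'timestamp_s' helper column are all hypothetical): parse the column, turn it into seconds since the Unix epoch, then fit.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy data standing in for the CSV; names and values are invented for illustration.
df = pd.DataFrame({
    "Date (UTC)": [
        "00:00:00 01/01/21",
        "06:00:00 01/01/21",
        "12:00:00 01/01/21",
        "18:00:00 01/01/21",
    ],
    "BGA": [1.0, 2.0, 3.0, 4.0],
})

# %y matches a two-digit year; the whole column is parsed in one call.
df["Date (UTC)"] = pd.to_datetime(df["Date (UTC)"], format="%H:%M:%S %m/%d/%y")

# Seconds elapsed since 1970-01-01, stored as a plain float column.
epoch = pd.Timestamp("1970-01-01")
df["timestamp_s"] = (df["Date (UTC)"] - epoch) / pd.Timedelta(seconds=1)

# Use the numeric column (not the datetime one) as the predictor.
X = df[["timestamp_s"]].to_numpy()
y = df["BGA"].to_numpy()

model = LinearRegression().fit(X, y)
```

Epoch seconds are only one possible encoding. Depending on the data, elapsed hours since the first reading, or separate hour-of-day and day-of-year columns, might suit the regression better.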

Though I fail to understand what you're trying to do when you say

Other than that, I am having a hard time finding people that changed more than a single date.
