I am trying to convert the time watched of a Netflix show to a float so I can total it up, but I cannot figure out how to do the conversion. I have tried many approaches, including:

temp['Minutes'] = temp['Duration'].apply(lambda x: float(x))

which raises:

ValueError: could not convert string to float: '00:54:45'

A sample row of the data looks like:

2022-05-18 05:21:42 | 00:54:45 | NaN | Ozark: Season 4: Mud (Episode 13) | NaN | Amazon FTVET31DOVI2020 Smart TV | 00:54:50 | 00:54:50 | US (United States) | Wednesday | 2022-05-18

I have pulled the day of week and Day …
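A minimal sketch of one way to do this conversion, assuming a hypothetical `temp` frame with a `Duration` column of `HH:MM:SS` strings: `pd.to_timedelta` parses the strings (which `float()` cannot), and dividing by a one-minute `Timedelta` yields float minutes that can be summed.

```python
import pandas as pd

# Hypothetical frame standing in for the Netflix viewing-history data.
temp = pd.DataFrame({"Duration": ["00:54:45", "00:54:50", "01:02:10"]})

# pd.to_timedelta parses 'HH:MM:SS' strings directly; dividing by a
# one-minute Timedelta converts the result to a float number of minutes.
temp["Minutes"] = pd.to_timedelta(temp["Duration"]) / pd.Timedelta(minutes=1)

total_minutes = temp["Minutes"].sum()
```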
So I have two datasets (dataset1, dataset2) that are related in this way (think of a checkout counter in a supermarket): A customer comes in with an item. The cashier registers the item in dataset1 directly, in particular recording the current time. The cashier does the usual cashier things (scanning the item, waiting for the customer to pay; the point is that there is some time gap). After the customer has paid, the cashier needs to register the sale in dataset2, including …
I feed time series into different neural-network architectures in order to make predictions, and I wonder whether there is a way to choose the networks' parameters intelligently. From the characteristics of the signal, namely trend, seasonality, etc., can we choose parameters that will make learning better?
I am trying to figure out the delay between changing the speed of a pump that feeds a modifier into a process and the resulting change in the amps drawn by an extruder at the end of the process. The amps drawn change constantly and are affected by other variables, but they are held within a range by adjusting the speed of the modifier pump. Because the amps drawn are constantly changing, you can't just look at the trend line …
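One common way to estimate such a delay is cross-correlation: the offset at which the two mean-centred series line up best. A minimal sketch on synthetic data, where the lag, series names, and noise level are all made-up assumptions:

```python
import numpy as np

# Synthetic sketch: "amps" responds to "pump_speed" with a fixed lag plus
# noise. true_lag, n, and the noise scale are hypothetical.
rng = np.random.default_rng(0)
n, true_lag = 500, 12
pump_speed = rng.normal(size=n)
amps = np.roll(pump_speed, true_lag) + 0.1 * rng.normal(size=n)
amps[:true_lag] = 0.0  # drop the samples that wrapped around

# Cross-correlate the mean-centred series; the lag at the peak of the
# cross-correlation estimates the delay between the two signals.
a = pump_speed - pump_speed.mean()
b = amps - amps.mean()
xcorr = np.correlate(b, a, mode="full")
lags = np.arange(-n + 1, n)
estimated_lag = lags[np.argmax(xcorr)]
```

With real, noisy process data you would typically detrend or band-pass filter both series first, since slow drifts dominate the correlation otherwise.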
I have a question related to parallel processing in Python. How can I use 1, 2, 3, ... processors with the k-nearest-neighbour algorithm for k = 1, 2, 3, ... to measure the change in run time, the speedup, and the efficiency? What would appropriate code for that look like?
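One possible sketch, assuming scikit-learn is an acceptable implementation: `KNeighborsClassifier` takes an `n_jobs` parameter, so you can time prediction for each worker count and compute speedup (t1/tn) and efficiency (speedup/n). The dataset size and worker counts below are made up:

```python
import time
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical benchmark data; scikit-learn parallelises the neighbour
# search via the n_jobs parameter.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

def time_knn(k, n_jobs):
    model = KNeighborsClassifier(n_neighbors=k, n_jobs=n_jobs).fit(X, y)
    start = time.perf_counter()
    model.predict(X)                      # the parallelised step
    return time.perf_counter() - start

results = {}
for k in (1, 2, 3):
    t1 = time_knn(k, n_jobs=1)            # single-process baseline
    for n_jobs in (1, 2, 3):
        t = time_knn(k, n_jobs)
        speedup = t1 / t
        efficiency = speedup / n_jobs
        results[(k, n_jobs)] = (t, speedup, efficiency)
```

Note that on small datasets the thread-dispatch overhead can swamp the gains, so measure on data of realistic size before drawing conclusions.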
I'm working on a classification task (the dataset is 400,000 rows and 30 columns) and one of my features was a date-time. I've extracted the month, day of the week, and hour from it (the year is a single value, and I don't think minutes will have much influence). Since they're now categorical variables, how do I deal with them? Should I leave them as single columns, use one-hot encoding, or go for target encoding?
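For illustration, a small sketch of two of the options on a hypothetical frame: `pd.get_dummies` for one-hot encoding, plus a sin/cos ("cyclical") encoding, which keeps hour 23 close to hour 0 in feature space:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the extracted calendar features.
df = pd.DataFrame({"month": [1, 6, 12],
                   "dayofweek": [0, 3, 6],
                   "hour": [0, 12, 23]})

# Option 1: one-hot encode each categorical column.
onehot = pd.get_dummies(df, columns=["month", "dayofweek", "hour"])

# Option 2: cyclical encoding, which preserves the wrap-around ordering
# that one-hot and target encoding both discard.
df["hour_sin"] = np.sin(2 * np.pi * df["hour"] / 24)
df["hour_cos"] = np.cos(2 * np.pi * df["hour"] / 24)
```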
I'll just dive right in. I have a decent-size (100K observations) dataset of time-varying continuous and categorical predictors. The categorical predictors usually do not change, but the continuous ones change every day. Another level of complexity - the one I am struggling with - is the fact that the data are clustered at several levels (measurements coming from the same individual over time, with multiple individuals in the data set). So, I have something like: id | day | cont_predictor …
I am working with time-series data and am interested in adding timestamp data (as a feature) to a DNN model. From what I have read online so far, my only option is to come up with my own set of features (like weekday, day, month, time of day, etc.) and feed them to the model. So my first question is: is this the only way to do it? Any leads to a different path? I am worried about …
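For reference, the feature-extraction route mentioned above is short in pandas; a sketch on made-up timestamps using the `.dt` accessor:

```python
import pandas as pd

# Hypothetical timestamps; the .dt accessor extracts the usual calendar
# features, which can then be fed to the model alongside the series.
ts = pd.DataFrame({"timestamp": pd.to_datetime(
    ["2021-03-01 08:30:00", "2021-03-06 22:15:00"])})

ts["weekday"] = ts["timestamp"].dt.dayofweek          # Monday = 0
ts["day"] = ts["timestamp"].dt.day
ts["month"] = ts["timestamp"].dt.month
ts["hour"] = ts["timestamp"].dt.hour
ts["is_weekend"] = (ts["weekday"] >= 5).astype(int)
```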
I have a dataset with a time column and a separate date column in .xlsx format. The time column has values in the format below:

12:32:21.499145197
12:32:21.499145197

The date column has values in the format below:

Apr 10, 2018
Apr 10, 2018

When I read the Excel file in Python, both columns get the object datatype, so I first correct the datatypes. For the date I use the code below:

df["dateConv"] = pd.to_datetime(df["date"])

I am unable to correct the datatype for the time column. I tried …
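One approach that handles the nanosecond precision is to parse the time strings as `Timedelta`s; a sketch on made-up data mirroring the two columns:

```python
import pandas as pd

# Made-up frame mirroring the two Excel columns described above.
df = pd.DataFrame({"date": ["Apr 10, 2018", "Apr 10, 2018"],
                   "time": ["12:32:21.499145197", "12:32:21.499145197"]})

df["dateConv"] = pd.to_datetime(df["date"])

# A time-of-day string with nanosecond precision parses as a Timedelta,
# which pandas stores at nanosecond resolution and supports arithmetic on.
df["timeConv"] = pd.to_timedelta(df["time"])

# Combining both columns gives a full nanosecond-precision timestamp.
df["full"] = df["dateConv"] + df["timeConv"]
```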
I have a problem statement: predict the probability of solving a task depending on multiple features, e.g. when the task was created, the time needed to work on the task, etc. Please find a dummy snippet attached:

task_id | date_time_open | time_needed | day_created | time_created | status
aa      | 12/09/2019     | 20 hrs      | Tuesday     | 3 pm         | done
cc      | 17/10/2019     | 4 hrs       | Friday      | 10 pm        | not_done

I know I can run a classification model to identify the class. However, things get complicated when I add a time …
I have a dataset containing the walking times it takes to walk from one postal bin to another postal bin in a mail distribution center. The features I have, include the total mail workers on the floor (indicates how busy it is), the seniority of the mail workers, the rack number where the worker is now, the rack number of where the worker needs to go to pick up his/her container with letters. The rack numbers reflect the physical order …
When I add the Time Series add-on (version 0.3.12) in Orange 3.30, the latest version, the Yahoo Finance widget is not included. How do I add the Yahoo Finance widget?
Is there any implementation of time-series clustering that allows me to segment using two or more series of the same phenomenon as joint input to the algorithm? Suppose I have $A_{i,t}=X_{i,t}$ and $B_{i,t}=X_{i,t}-X_{i,t-1}$ for a set $i=1,...,n$ of individuals and $T$ times. I would like both $A_{i,t}$ and $B_{i,t}$ to be part of the data for individual $i$ in the clustering. The only implementations I am aware of handle only $A_{i,t}$ or $B_{i,t}$, not …
I have the following data in a table, where I want to calculate the average time between the 1st and 2nd call. I know how to get the average, but I am having a tough time figuring out how to subtract the 2nd attempt from the 1st, since both are in the same column and I am more familiar with subtracting things between columns.
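A pandas sketch of one way around the same-column problem, on an entirely made-up call log: pivot the attempts into separate columns so the subtraction becomes column-against-column, then average the gaps.

```python
import pandas as pd

# Made-up long-format call log: one row per attempt.
calls = pd.DataFrame({
    "customer": ["a", "a", "b", "b"],
    "attempt": [1, 2, 1, 2],
    "call_time": pd.to_datetime(["2023-01-01 09:00", "2023-01-01 09:45",
                                 "2023-01-02 10:00", "2023-01-02 10:15"]),
})

# Pivot the attempts into columns so the subtraction is between columns,
# then average the per-customer gaps.
wide = calls.pivot(index="customer", columns="attempt", values="call_time")
gap = wide[2] - wide[1]
avg_gap = gap.mean()
```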
I am curious about how to begin approaching this problem. I am working with a multi-indexed time-series data frame (consisting of precomputed log returns) of various stocks. In this dataframe, the ticks are 1 second each and I have 300 ticks per stock (5 minutes). One of the things I would like the neural network to accomplish is predicting the movement of the next 5 minutes of log returns. In doing so, …
I was able to convert 9.2e18 AD to a date, but I am confused about the exact date. Which date is 9.2e18 AD, and which is 9.2e18 BC? Time span (absolute): [9.2e18 BC, 9.2e18 AD], i.e. +/- 9.2e18 years. From the NumPy documentation, section "Datetime Units" under "Datetimes and Timedeltas":

Code | Meaning | Time span (relative) | Time span (absolute)
Y    | year    | +/- 9.2e18 years     | [9.2e18 BC, 9.2e18 AD]
M    | month   | +/- 7.6e17 years     | [7.6e17 BC, 7.6e17 AD]
W    | week    | +/- 1.7e17 years     | …
I am working on a problem to estimate task completion time in kanban (project management tool). While doing EDA, I looked at tasks that are either done or cancelled. In this case, I defined the completion time as the time taken from task creation to done/cancelled. I noticed I am running into an issue with that definition. I am disregarding tasks that have not been done yet. If we think of "task = done" as "event = 1", this is …
Say for example I am building a model to predict a customer churn event from Spotify, with my target being whether a customer churns in the next 90 days. One feature I might expect could be predictive of this event is customers checking their billing statements online - so I might engineer features for each customer on each training date to encode the information of how many times they have checked their billing statements. For example, I might create a …
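A sketch of how such a count feature might be engineered, on an entirely made-up event log (`customer`, `checked_at`, and the 30-day window are illustrative choices, not anything from the original question):

```python
import pandas as pd

# Made-up event log of billing-statement checks; the goal is a per-customer
# feature like "checks in the 30 days before the training date".
events = pd.DataFrame({
    "customer": ["a", "a", "a", "b"],
    "checked_at": pd.to_datetime(
        ["2023-01-05", "2023-02-20", "2023-02-25", "2023-02-28"]),
})
train_date = pd.Timestamp("2023-03-01")  # hypothetical training date

# Keep only events inside the trailing window, then count per customer.
window = events[(events["checked_at"] < train_date) &
                (events["checked_at"] >= train_date - pd.Timedelta(days=30))]
checks_30d = window.groupby("customer").size()
```

Computing the feature strictly from data before each training date, as here, is what keeps it free of target leakage.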
For processes of discrete events occurring in continuous time with time-independent rate, we can use count models like Poisson or Negative Binomial. For discrete events that can occur once per sample in continuous time, with a time-dependent rate, we have survival models like Cox Proportional Hazards. What can we use for discrete event data in continuous time where there is an explicit time-dependence that we want to learn? I understand that sometimes people use sequential models where each node is …
I have an Excel file with a column that contains both date and time in a single cell, as follows:

2020-07-10T13:32:01+00:00

I'm wondering how to extract this cell, split the date and time (which are separated by a "T"), write them into two new columns, "Date" and "Time", and be able to use them afterwards to, for example, do time math operations. I have a start with pandas:

df = pd.read_excel('file.xlsx')
def convert_excel_time(excel_time):
    return pd.to_datetime()

but I actually don't know if …
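For what it's worth, a sketch of one way this could work, on a hypothetical frame (the column name `raw` is made up): `pd.to_datetime` understands the ISO 8601 string as-is, so no manual split on "T" is needed.

```python
import pandas as pd

# Hypothetical frame standing in for the Excel column described above.
df = pd.DataFrame({"raw": ["2020-07-10T13:32:01+00:00",
                           "2020-07-11T08:05:30+00:00"]})

# pd.to_datetime parses the ISO 8601 string, including the 'T' separator
# and the UTC offset, in one call.
parsed = pd.to_datetime(df["raw"])

df["Date"] = parsed.dt.date   # datetime.date objects
df["Time"] = parsed.dt.time   # datetime.time objects

# Time math is easiest on the full parsed timestamps:
elapsed = parsed.iloc[1] - parsed.iloc[0]
```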