Perform DTW simultaneously on multiple trajectories

Good day, I have ~50 sample trajectories (time series) showing reactor temperature over time for a given process. In addition, I have a reference signal of the ideal trajectory for this process. I would like to synchronize all the sample trajectories with the reference trajectory. Performing DTW with one sample signal and the reference produces new signals along a common axis (as it should). My question is: how can I perform this synchronization of all sample trajectories with the reference simultaneously? Such …
Category: Data Science
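One way to read the question above is as aligning every sample to the same reference, so that all warped samples share the reference's time axis. The following is a minimal pure-Python sketch under that assumption, not the asker's method. The helper names `dtw_path` and `warp_to_reference` and the temperature values are hypothetical, and averaging the sample points matched to each reference index is just one possible resampling choice.

```python
import math

def dtw_path(sample, reference):
    """Return the optimal warping path as 0-based (sample_idx, reference_idx) pairs."""
    n, m = len(sample), len(reference)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(sample[i - 1] - reference[j - 1])
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    # Backtrack from the last cell to the first cell.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    return path[::-1]

def warp_to_reference(sample, reference):
    """Resample `sample` onto the reference axis by averaging matched points."""
    buckets = [[] for _ in reference]
    for si, ri in dtw_path(sample, reference):
        buckets[ri].append(sample[si])
    return [sum(b) / len(b) for b in buckets]

# Hypothetical reactor temperatures; the samples have different lengths.
reference = [20.0, 40.0, 60.0, 80.0, 60.0]
samples = [
    [20.0, 20.0, 40.0, 60.0, 80.0, 80.0, 60.0],
    [20.0, 40.0, 80.0, 60.0],
    [25.0, 45.0, 45.0, 65.0, 85.0, 65.0],
]
aligned = [warp_to_reference(s, reference) for s in samples]
```

Each entry of `aligned` has the same length as `reference`, so the warped samples can be stacked or compared point by point. Every sample is still warped independently here; the shared reference is what makes the resulting axis common.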

Clustering of time series data

I have a time series data set and want to use dynamic time warping as the distance measure. For the algorithm, I am considering either k-means with DTW Barycenter Averaging (DBA) or k-medoids. The data has outliers, and my goal is to identify demand patterns. I am not sure which one to use. Which one would be better in terms of accuracy and evaluation? What are the advantages and disadvantages of each algorithm, and what type of validation should I use? …
Category: Data Science

Does it make sense to compare two distances computed with Dynamic Time Warping?

Assume we are measuring the temperature $f_i(A,T_k)$ of the engine $i \in \{1,2\}$ of a given boat $A$, at $f_s = 1\,\text{Hz}$, for timesteps $T_k$ and some trips $k \in \{1,2,...,n\}$. Denoting by $d_{1,2}^k(A)$ the dynamic time warping distance between $f_1(A, T_k)$ and $f_2(A, T_k)$, does it make sense to compare $d_{1,2}^a(A)$ and $d_{1,2}^b(A)$ for $a\neq b$? The dynamic time warping is based on an optimal path which depends on the two time series we want to compute the distance on. …
Category: Data Science

Clustering Multi-Variate Time Series Data

My end goal is to plot a PCA-like plot (2D scatter plot) for my 3-variable time series data to see if there are natural clusters in the data. I don't have any sort of classification sorted out yet, just raw data entries. I like the idea of using DTW for each signal, resulting in a cost or distance result for each signal (3 values). From there I would then run PCA on the 3 cost values. The hangup …
Category: Data Science

The starting point of Dynamic Time Warping

Consider two real-time time series that did not start at the same time. Say time series $s_1$ runs from $t_1$ to $t_n$ and time series $s_2$ runs from $t_2$ to $t_m$, where $t_1$ is not equal to $t_2$. I didn't see any assumptions stated for DTW, so I guess it is still applicable to this situation with some necessary modification. So, all in all, how can we calculate the distance using DTW between 2 time series if their starting points are not the …
Category: Data Science

In DTW, is the distance the sum of the shortest path's elements or the farthest element?

The title says it. In dynamic time warping, I keep hearing that the distance between two sequences is the sum of the shortest path's elements. But I also see the distance given as the element in the farthest corner. Could somebody please clear this up? Example: x = [0,1,1,2], z = [0,2,2] gives this cost matrix: [[ 0. inf inf inf] [inf 0. 2. 4.] [inf 1. 1. 2.] [inf 2. 2. 2.] [inf 4. 2. 2.]] meaning that the distance …
Category: Data Science
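The matrix in the question above can be reproduced with a short pure-Python sketch, assuming an absolute-difference local cost and the standard three-way step (diagonal, up, left). The function names are illustrative only. The matrix holds accumulated costs, so its far-corner entry already equals the sum of the local costs along the optimal path; the two descriptions refer to the same number.

```python
import math

def dtw_accumulated_cost(x, z):
    """Accumulated-cost matrix of shape (len(x)+1, len(z)+1) with an inf border."""
    n, m = len(x), len(z)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - z[j - 1])
            D[i][j] = local + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D

def optimal_path(D):
    """Backtrack from the far corner to cell (1, 1); returns 0-based index pairs."""
    i, j = len(D) - 1, len(D[0]) - 1
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    return path[::-1]

x = [0, 1, 1, 2]
z = [0, 2, 2]
D = dtw_accumulated_cost(x, z)
corner = D[-1][-1]                      # "element in the farthest corner"
path = optimal_path(D)
path_sum = sum(abs(x[i] - z[j]) for i, j in path)  # "sum of the path's elements"
```

Here `corner` and `path_sum` are both 2. Summing the accumulated-cost entries along the path would give a different and not meaningful number; only the local costs are summed.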

Is it possible to reduce the time of computing DTW with dtw-python package by disabling computation of?

I am trying to classify some time series using the dtw-python package, which is a Python version of the R package implementing Dynamic Time Warping described in this nice paper. By default, a call to the dtw function returns a DTW object containing the distance as well as the warping path found (docs). Classification with K-Nearest-Neighbours requires computing DTW(X,Y) for each possible pair $X \in T, Y \in E$, where $T$ is a test set and $E$ is an eval set. Results can be stored …
Category: Data Science
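When only the distance is needed, as in nearest-neighbour classification, the warping path does not have to be recovered, and the full cost matrix does not have to be kept. The sketch below is a pure-Python illustration of that idea with two rolling rows; it is not the dtw-python implementation, and the function names and toy data are hypothetical. The dtw-python documentation also describes a `distance_only` argument of `dtw` for this purpose, which is worth checking against the installed version.

```python
import math

def dtw_distance(x, y):
    """DTW distance using two rolling rows; no path is stored or backtracked."""
    prev = [math.inf] * (len(y) + 1)
    prev[0] = 0.0
    for xi in x:
        curr = [math.inf] * (len(y) + 1)
        for j, yj in enumerate(y, start=1):
            curr[j] = abs(xi - yj) + min(prev[j - 1], prev[j], curr[j - 1])
        prev = curr
    return prev[len(y)]

def nearest_label(query, labelled):
    """1-nearest-neighbour label of `query` among (series, label) pairs."""
    return min(labelled, key=lambda item: dtw_distance(query, item[0]))[1]

# Hypothetical eval set E and a single test series from T.
eval_set = [([0, 1, 2, 3], "rising"), ([3, 2, 1, 0], "falling")]
query = [0, 0, 1, 2, 2, 3]
label = nearest_label(query, eval_set)
```

Memory use drops from O(len(x) * len(y)) to O(len(y)), while the number of cell updates stays the same, so any speed-up comes from skipping the backtracking and the matrix allocation rather than from fewer comparisons.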

How to perform clustering using DTW and a clustering method like k-means

I have a time series (the temperature of a sensor) and I want to apply unsupervised clustering to it. I've already done that using the sklearn library and KMeans, but the problem is that I don't know how to add DTW as a metric. I need to compare time series using DTW and I have no idea how to do that. Also, as I saw in this link, KMeans cannot perform very well on some kinds of data, and some methods like OPTICS and DBSCAN may …
Category: Data Science
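One common way to combine DTW with clustering is to precompute the pairwise DTW distance matrix and pass it to a method that only needs distances. The sketch below uses a simple alternating k-medoids loop in pure Python as a stand-in; it is an illustration, not the sklearn KMeans API, and the function names, initial medoids, and toy series are assumptions. The same matrix could also be given to scikit-learn estimators that accept `metric="precomputed"`, such as DBSCAN.

```python
import math

def dtw_distance(x, y):
    """DTW distance with absolute-difference local cost (two rolling rows)."""
    prev = [math.inf] * (len(y) + 1)
    prev[0] = 0.0
    for xi in x:
        curr = [math.inf] * (len(y) + 1)
        for j, yj in enumerate(y, start=1):
            curr[j] = abs(xi - yj) + min(prev[j - 1], prev[j], curr[j - 1])
        prev = curr
    return prev[len(y)]

def pairwise_dtw(series):
    """Symmetric matrix of DTW distances between all pairs of series."""
    n = len(series)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist

def k_medoids(dist, medoids, max_iter=100):
    """Alternate assignment and medoid update on a precomputed distance matrix."""
    medoids = list(medoids)
    labels = []
    for _ in range(max_iter):
        # Assign every series to its closest medoid.
        labels = [
            min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))
        ]
        # Pick, per cluster, the member with the smallest total distance.
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if members:
                best = min(members, key=lambda i: sum(dist[i][j] for j in members))
            else:
                best = old  # keep the old medoid if a cluster ends up empty
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids

# Hypothetical sensor readings: three "bump" shapes and three "dip" shapes.
series = [
    [0, 0, 1, 2, 1, 0], [0, 1, 2, 1, 0, 0], [0, 1, 2, 2, 1, 0, 0],
    [2, 2, 1, 0, 1, 2], [2, 1, 0, 1, 2, 2], [2, 2, 1, 0, 0, 1, 2],
]
dist = pairwise_dtw(series)
labels, medoids = k_medoids(dist, medoids=[0, 3])
```

Medoids are actual series from the data set, so no averaging of warped series is needed, which is what plain k-means with Euclidean centroids cannot provide for DTW. The result depends on the initial medoids, so several restarts are usually sensible.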

Dynamic Time Warping with fast lookup?

My understanding of Dynamic Time Warping is that the algorithm always requires calculation with each comparison/training series and that there is no way to extract the "essence" from a given training series in the form of coefficients, which could then be compared to coefficients for the test series similar to what might be possible with different wavelet transforms, Discrete Cosine Transform, etc. In experimenting with Python's fastdtw package, DTW does appear to be somewhat slow (despite the name of the …
Category: Data Science

Signal correlation - matching specific points

Question: What are some recommended techniques for matching specific patterns in data sets? Background: I have several thousand sites for which I have collected time series data. In the example image below we have increasing time in the y-direction with data streams from 2 different instruments shown from two sites. I have added hand-picked, color-coded correlation markers for four significant events. Normally we would pick 10-15 of these markers at each measurement location. Typically, these markers are correlated by hand, however, …
Category: Data Science

How to do phoneme segmentation using dynamic time warping?

Background Information: Dynamic Time Warping (DTW): In time series analysis, dynamic time warping (DTW) is one of the algorithms for measuring similarity between two temporal sequences, which may vary in speed. (Source: Wikipedia) Phoneme Segmentation: Phoneme segmentation is the ability to break words down into individual sounds. For example, a child may break the word “sand” into its component sounds – /sss/, /aaa/, /nnn/, and /d/. (Source) The Question(s): (a) How can we do phoneme segmentation using DTW? (b) Which …
Category: Data Science

Calculate the similarity between pairs of time series data

I have 5 time series: the weekly sales of 5 different stores (A, B, C, D, E). There are no missing values. A quick visual inspection shows that these 5 time series have similar trend and seasonality. I would like to quantify how similar each of Stores B, C, D, and E is to Store A. I know how to calculate the simple cosine distance and Euclidean distance, and I have experience dealing with time series data (e.g. ARIMA, Prophet), but …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.