How to align sliding windows to extract features from multimodal time series data?

I have two datasets that were collected at the same time but at different sampling rates: one at 128 Hz and the other at 512 Hz. I am trying to extract features using the moving-window technique, but I have run into some problems. The sampling rates differ, and the timestamps are in Unix format with nanosecond resolution, so the samples won't line up at the start or end of each second or minute. One of the datasets …
Category: Data Science
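
One way to approach the alignment described above is to define windows by time bounds rather than by sample counts, so each stream contributes whatever samples fall inside the same interval. The following is only a minimal sketch under stated assumptions: timestamps are sorted integer nanoseconds, and the function names, offsets, window length and step are all hypothetical and do not come from the question.

```python
from bisect import bisect_left

def window_slices(timestamps_ns, start_ns, end_ns, window_ns, step_ns):
    """Yield (t, lo, hi) so that timestamps_ns[lo:hi] lies in [t, t + window_ns)."""
    t = start_ns
    while t + window_ns <= end_ns:
        lo = bisect_left(timestamps_ns, t)
        hi = bisect_left(timestamps_ns, t + window_ns)
        yield t, lo, hi
        t += step_ns

def aligned_windows(ts_a, ts_b, window_ns, step_ns):
    """Pair index ranges of two sorted timestamp lists over their shared time span."""
    start = max(ts_a[0], ts_b[0])   # first instant covered by both streams
    end = min(ts_a[-1], ts_b[-1])   # last instant covered by both streams
    slices_a = window_slices(ts_a, start, end, window_ns, step_ns)
    slices_b = window_slices(ts_b, start, end, window_ns, step_ns)
    return [(t, (a_lo, a_hi), (b_lo, b_hi))
            for (t, a_lo, a_hi), (_, b_lo, b_hi) in zip(slices_a, slices_b)]

# Synthetic example: 3 seconds at 128 Hz and 512 Hz with made-up start offsets.
SECOND_NS = 1_000_000_000
ts_128 = [1_000_000_123 + i * (SECOND_NS // 128) for i in range(128 * 3)]
ts_512 = [1_000_400_456 + i * (SECOND_NS // 512) for i in range(512 * 3)]

# 1-second windows with 50% overlap.
windows = aligned_windows(ts_128, ts_512, SECOND_NS, SECOND_NS // 2)
```

Each entry in `windows` gives a window start time plus the slice bounds into each stream, so features can be computed per stream on samples that cover the same interval even though the sample counts differ.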

How can we predict a value after several rows of data?

I have a regression problem in which each week has several rows (the number varies, i.e. one week might have 1800 rows and another might have 5000 rows). My target is to predict a value at the end of each week's data. Here's an example of what I need, where x is a feature and y is the target: Week 1 ; x1, x2, x3.. x90 Week 1 ; v1, v2, v3... v90 .... 100 more rows Week 1 ; …
Category: Data Science

Where can I find free multi-instance single-label datasets for object detection?

I'm trying to find free multi-instance single-label datasets for object detection online. By "multi-instance and single-label" I mean that each image contains only objects belonging to one class, but can contain more than one object of that class. I found a lot of multi-label datasets, but none for single-label. Any ideas are highly appreciated. Thanks in advance.
Category: Data Science

CNN with multi-channel input or CNN with multi-instance learning?

I have 500 DICOM images of patients' medical scans. These are 3-dimensional scans, shape = [300 x 300 x 3]. From these I have extracted front and side views, so for each patient I have 2 images of shape [300 x 300]. To build a classifier, should I stack these 2 views and train a CNN on {[300 x 300 x 2] x 500} (multi-channel input), or should I pass each view as a new …
Category: Data Science

Text classification with multiple documents per labeled datapoint

I have a dataset with a label TRUE or FALSE for each person, but each person has multiple documents associated with them (emails and documents). Right now I use a Random Forest Classifier on a bag of words consisting of all words in all documents put together per person (so that I have one row with all words and a label). It performs reasonably well, but I was wondering if you guys have some suggestions about how I can use …
Category: Data Science
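
For reference, the pooled bag-of-words baseline described above (all of a person's documents merged into a single row of word counts) might look roughly like the sketch below. The whitespace tokenization, the `pooled_bag_of_words` helper and the sample data are illustrative assumptions, not the asker's actual pipeline, and the classifier step is left out.

```python
from collections import Counter

def pooled_bag_of_words(docs_by_person):
    """Merge each person's documents into one word-count row over a shared vocabulary."""
    bags = {
        person: Counter(word for doc in docs for word in doc.lower().split())
        for person, docs in docs_by_person.items()
    }
    # Shared, sorted vocabulary so every row has the same column order.
    vocab = sorted(set().union(*bags.values()))
    rows = {person: [bag[word] for word in vocab] for person, bag in bags.items()}
    return vocab, rows

# Made-up example: two people, three documents in total.
docs_by_person = {
    "alice": ["Budget report attached", "Report due Friday"],
    "bob": ["Lunch Friday"],
}
vocab, rows = pooled_bag_of_words(docs_by_person)
```

Each value in `rows` is one feature vector per person, which is the "one row with all words and a label" layout that a classifier such as a random forest would then be trained on.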

Reducing high generalization error on industrial fault data

I have an industrial dataset containing labeled machine data for fault classification (3 classes: 1 OK, 2 for faults). The problem is that I have only a few (~16) different machines, so I am currently having instance shift problems: accuracy on the training set is perfect, but validation on held-out instances fails. As background information, the machine data is time series, from which I extracted statistical (domain-specific) features (14 in total). These features are my dataset for classification. I tried different model …
Category: Data Science

SMOTE for multi-instance learning, i.e. num_rows(x_train) > num_rows(y_train)

I have an imbalanced dataset and I wish to predict classes (0 or 1).

Sample x_train:

id   date        c1  c2  ...  c20
101  13-02-2015  2   7   ...  14
101  14-02-2015  24  7   ...  8
...
105  13-02-2015  12  5   ...  4
...

Sample y_train:

id   class
101  1
105  1
107  0
...

Now I …
Category: Data Science
