Feature extraction; similarity and classification of accelerometer data
I have several experts performing the same specific action (for example, a squat or a leap forward) multiple times. Say 5 people do 100 squats each. Each has an accelerometer attached to the same body part. I record the accelerometer readings and get 100 * 5 = 500 data samples. They repeat this for several different actions (squat, push-up, leap forward, etc.). Each action is recorded as follows:
- Start recording (push button)
- Do the action
- Stop recording (push button)
Now I need to check that another person is doing the actions in the correct order, for example: squat, leap forward, stand up, drop down, push-up. I take his accelerometer data and continuously feed it to a classifier that needs to tell me whether he has just done exactly a squat, not a leap or a push-up. Once the first action (the squat) has been identified, I check for the leap forward, and so on.
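The order check described above can be sketched roughly as follows. This is only an illustration in plain Python, not my actual code: the function name `completed_steps`, the `(label, probability)` input format, and the `min_prob` threshold of 0.6 are all made up for the example. As a simplification, confident detections of any action other than the one currently expected are just ignored.

```python
# Expected order of actions, taken from the example above.
EXPECTED = ["squat", "leap forward", "stand up", "drop down", "push up"]


def completed_steps(predictions, expected=EXPECTED, min_prob=0.6):
    """Count how many expected actions were detected, in order.

    `predictions` is an iterable of (label, probability) pairs, one per
    chunk of data fed to the classifier. `min_prob` is an arbitrary
    placeholder threshold, not a tuned value.
    """
    step = 0
    for label, prob in predictions:
        if step == len(expected):
            break  # the whole sequence has been seen
        if prob < min_prob:
            continue  # not confident enough to count as a detection
        if label == expected[step]:
            step += 1  # this action is done, look for the next one
        # Any other confident label is ignored in this simplified sketch.
    return step


stream = [
    ("squat", 0.9),
    ("squat", 0.8),
    ("leap forward", 0.7),
    ("push up", 0.3),
    ("stand up", 0.9),
]
done = completed_steps(stream)
```

The sequence is complete when `done == len(EXPECTED)`.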
There are several problems with this:
- The data samples have different lengths, since some people squat a bit slower and others a bit faster. Some samples have 250 XYZ readings, others have 220 or 270, etc. (within a range of ±50). For now I apply a strict rule: I discard every sample with more than 250 readings, and for samples with fewer than 250 I copy readings from the beginning of the sample to its end until it has 250 in total. This works fine, since every action starts with a windup where the person stands still for a brief moment before performing the action. It is not optimal, though, because the experts need to redo the action if they were too slow (the windup was too long), and because I am appending fake data. What would be a better way to handle this?
- For now I am using Random Forest and AdaBoost classifiers on low/high-pass filtered accelerometer data that I map to 750 columns (250 X, 250 Y, 250 Z) plus 1 class column. The prediction tells me something like 70% leap, 25% squat, 5% push-up. The classification is sometimes wrong or not precise enough, so I was thinking of extracting features from my signal series and feeding those to the algorithm instead. My problem is that I do not know which features to extract.
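For reference, the preprocessing described in the two points above can be sketched roughly as follows. This is a minimal sketch in plain Python, not my exact code: the function names `fix_length` and `to_row` are made up for illustration, and it assumes each sample is a list of (x, y, z) tuples that has already been filtered.

```python
TARGET_LEN = 250  # readings per sample, as described above


def fix_length(sample, target_len=TARGET_LEN):
    """Return a sample with exactly `target_len` readings, or None.

    `sample` is a list of (x, y, z) tuples. Samples that are too long
    are rejected; shorter ones are padded with their own first readings.
    """
    n = len(sample)
    if n > target_len:
        return None  # too slow: the expert has to redo the action
    missing = target_len - n
    # Append the first `missing` readings (the standing-still windup)
    # to the end. This assumes missing <= n, which holds for the
    # +-50 range mentioned above.
    return sample + sample[:missing]


def to_row(sample):
    """Flatten a sample into one row: all X, then all Y, then all Z."""
    xs = [r[0] for r in sample]
    ys = [r[1] for r in sample]
    zs = [r[2] for r in sample]
    return xs + ys + zs


# Example with a made-up 220-reading sample.
short = [(float(i), float(-i), 0.5) for i in range(220)]
padded = fix_length(short)
row = to_row(padded)
```

Each `row` then has 750 values and gets the class label appended as the last column before training.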
Most of the papers I found focus on human activity recognition that differentiates between walking, running, and ascending or descending stairs. They were not very helpful, because they work with a continuous data flow of a person walking or running for hours and use much more sample data. In my task, by contrast, the data set instances are separate from each other.
I am not asking anyone to solve the whole task for me, just to point me in some direction with a good explanation of why it might be useful.
Topic sensors classification feature-extraction machine-learning
Category Data Science