"Up or down but not sideways" bimodal time series prediction - what is the best way to model it?

Say I have a time series (e.g. bitcoin price). I want to predict tomorrow's price, specifically tomorrow's % change in price from today. Let's say this is gaussian distributed, with the mean at 0%.

If the market is trending up, the price prediction should be higher (e.g. +3.1%).

If the market is trending down, the price prediction should be lower (e.g. -5.4%).

If the market is trending sideways, the price prediction should be neutral (e.g. 0%).

However, there are times when the market is ready to move up or down but not sideways. (This may happen because of some controversy in the news, and everyone is waiting to see what everyone else will do, and whoever moves first starts the trend, like a butterfly effect.) This is the part I'm interested in modeling.

If the model is trained to simply predict a target float value (e.g. -5.4), this can tell us about trends up, down, or sideways, but it can't tell us when the market is in a state of controversy (up or down but not sideways).

If the model is trained instead to predict a mean and variance (i.e. a unimodal gaussian distribution), this tells us about model confidence, but it still isn't enough.

  • Confident uptrend = mean positive, variance small
  • Confident downtrend = mean negative, variance small
  • Confident sideways trend = mean zero, variance small
  • Up or down but not sideways = mean zero, variance high
  • Random / model has no fucking clue = mean zero, variance high

This is bad, because up or down but not sideways looks the same as a random guess. It's even worse because zero should be the least likely outcome, yet the model has predicted zero to be most likely (because the mean of the distribution is centered at zero).

A mixture of two gaussians, however, is able to model bimodal distributions like this, with one mode above zero and one below.
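To make the contrast concrete, here is a minimal sketch comparing a single gaussian against a two-component mixture with the same mean and variance. The numbers (modes at -4% and +4%, unit spread) are made up for illustration and are not fitted to any market data.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """Density of a gaussian mixture: weighted sum of the component densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

# Illustrative parameters only (daily % change), not fitted to data.
weights, means, stds = [0.5, 0.5], [-4.0, 4.0], [1.0, 1.0]

# Moment-match a single gaussian to the mixture (same mean, same variance).
mix_mean = sum(w * m for w, m in zip(weights, means))
mix_var = sum(w * (s * s + m * m) for w, m, s in zip(weights, means, stds)) - mix_mean ** 2
single_std = math.sqrt(mix_var)

for x in (-4.0, 0.0, 4.0):
    single = normal_pdf(x, mix_mean, single_std)
    mixture = mixture_pdf(x, weights, means, stds)
    print(f"x={x:+.1f}%  single gaussian={single:.4f}  mixture={mixture:.4f}")
```

The single gaussian peaks at 0%, while the mixture puts almost no density there and peaks near the two modes, which is the "up or down but not sideways" shape.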

Is this the best way to model this problem? A complication I see here is: gaussians can't be skewed. There must be better, more flexible, distributions.

Plan B--

The other method I thought of was using a softmax over N classes, discretizing the distribution. Then the model gives a probability for each price% range, and even more flexibility on distribution shapes.

If I went the softmax route, what should I be careful of? Does this run into issues with class imbalance? Is there some superior logic to choosing the number of classes, and the range of each class? Should classes be balanced (i.e. each class is equally likely)? Or should classes be linearly binned by z-score (i.e. 0 to 0.5 std, 0.5 to 1 std, 1 to 1.5 std, 1.5 to 2 std, etc.)? A complication I see with both is the artefacts that emerge from coincidences in how the bin edges align with the data points.
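The two binning schemes can be sketched as follows, using only the standard library and synthetic gaussian returns as a stand-in for real data (real returns are usually heavier-tailed). The helper names `equal_frequency_edges`, `z_score_edges`, and `to_class` are hypothetical, chosen for this example.

```python
import bisect
import random
import statistics

random.seed(0)
# Synthetic stand-in for daily % changes; an assumption, not real market data.
returns = [random.gauss(0.0, 3.0) for _ in range(10_000)]

def equal_frequency_edges(values, n_bins):
    """Interior edges placed at quantiles, so each class gets about the same count."""
    return statistics.quantiles(values, n=n_bins)  # n_bins - 1 cut points

def z_score_edges(values, step=0.5, max_z=2.0):
    """Interior edges at multiples of `step` standard deviations around the mean."""
    mean, std = statistics.fmean(values), statistics.stdev(values)
    n = int(round(max_z / step))
    return [mean + k * step * std for k in range(-n, n + 1)]

def to_class(value, edges):
    """Map a value to a class index in the range 0..len(edges)."""
    return bisect.bisect_right(edges, value)

def class_counts(values, edges):
    """Number of samples that fall into each class."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[to_class(v, edges)] += 1
    return counts

freq_counts = class_counts(returns, equal_frequency_edges(returns, n_bins=10))
z_counts = class_counts(returns, z_score_edges(returns))
print("equal-frequency class counts:", freq_counts)
print("z-score class counts:        ", z_counts)
```

Equal-frequency bins avoid class imbalance by construction but have unequal widths, so the tail classes cover wide price ranges. Z-score bins have equal widths (apart from the two open-ended tail classes) but leave the tail classes sparsely populated. In PyTorch, `torch.bucketize` does the same value-to-class mapping on tensors.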

What is the best way to model this problem? Are there better, more flexible, continuous distributions?

(Context: neural net in pytorch)



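Seems like mixture density networks provide a great solution: the network outputs the weights, means, and standard deviations of a gaussian mixture instead of a single value.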
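A minimal PyTorch sketch of the idea, assuming a 1-D target and a plain linear output layer. The class name `MixtureDensityHead`, the function `mdn_nll`, and the synthetic data are all assumptions made for this example, not a reference implementation.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureDensityHead(nn.Module):
    """Maps a feature vector to the parameters of a K-component 1-D gaussian mixture."""

    def __init__(self, in_features: int, n_components: int = 2):
        super().__init__()
        self.proj = nn.Linear(in_features, 3 * n_components)

    def forward(self, h):
        logits, means, raw_scales = self.proj(h).chunk(3, dim=-1)
        log_weights = F.log_softmax(logits, dim=-1)  # mixture weights sum to 1
        scales = F.softplus(raw_scales) + 1e-3       # keep std strictly positive
        return log_weights, means, scales

def mdn_nll(log_weights, means, scales, target):
    """Mean negative log likelihood of `target` (shape [B]) under the mixture."""
    target = target.unsqueeze(-1)  # [B, 1], broadcasts against [B, K]
    log_prob = (-0.5 * ((target - means) / scales) ** 2
                - torch.log(scales) - 0.5 * math.log(2.0 * math.pi))
    return -torch.logsumexp(log_weights + log_prob, dim=-1).mean()

torch.manual_seed(0)
# Synthetic "up or down but not sideways" targets: about +4% or -4%, unit noise.
signs = torch.randint(0, 2, (512,)).float() * 2 - 1
y = 4.0 * signs + torch.randn(512)
x = torch.zeros(512, 8)  # uninformative features, so only the output bias is learned

head = MixtureDensityHead(in_features=8)
opt = torch.optim.Adam(head.parameters(), lr=0.05)
initial_loss = mdn_nll(*head(x), y).item()
for _ in range(300):
    opt.zero_grad()
    loss = mdn_nll(*head(x), y)
    loss.backward()
    opt.step()
final_loss = mdn_nll(*head(x), y).item()
print(f"NLL before training: {initial_loss:.3f}, after: {final_loss:.3f}")
```

In a real model the head would sit on top of whatever encoder produces the features. On the skew concern: a single gaussian is symmetric, but a mixture with unequal weights or scales can represent skewed shapes, and adding components adds flexibility.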


Gaussian mixtures can be numerically compared to both the linear regression MSE loss approach and the softmax cross-entropy loss approach via negative log likelihood.
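As a rough sketch of how that comparison can be made fair: MSE corresponds to a gaussian likelihood with a fixed variance, and a softmax output gives probability mass per bin, so it has to be divided by the bin width before it is comparable to a density. The parameters and bin probabilities below are illustrative assumptions, not model outputs.

```python
import math

def gaussian_nll(y, mean, std):
    """Negative log density of y under N(mean, std^2)."""
    return 0.5 * ((y - mean) / std) ** 2 + math.log(std) + 0.5 * math.log(2.0 * math.pi)

def mixture_nll(y, weights, means, stds):
    """Negative log density under a gaussian mixture, using log-sum-exp for stability."""
    terms = [math.log(w) - gaussian_nll(y, m, s) for w, m, s in zip(weights, means, stds)]
    top = max(terms)
    return -(top + math.log(sum(math.exp(t - top) for t in terms)))

def binned_nll(y, edges, probs):
    """Negative log density of softmax output read as a piecewise-constant density.

    `edges` has len(probs) + 1 entries; mass / width converts probability to density.
    """
    for lo, hi, p in zip(edges[:-1], edges[1:], probs):
        if lo <= y < hi:
            return -math.log(p / (hi - lo))
    raise ValueError("y falls outside the binned range")

y = 4.0  # hypothetical observed % change

# Regression view: point prediction 0 with a fixed std (e.g. the residual RMSE).
nll_single = gaussian_nll(y, mean=0.0, std=math.sqrt(17.0))
# Mixture view: two modes at -4% and +4%.
nll_mixture = mixture_nll(y, [0.5, 0.5], [-4.0, 4.0], [1.0, 1.0])
# Softmax view: nine bins of width 2 with hand-picked bimodal probabilities.
edges = [-9.0, -7.0, -5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0]
probs = [0.02, 0.08, 0.30, 0.08, 0.04, 0.08, 0.30, 0.08, 0.02]
nll_binned = binned_nll(y, edges, probs)

print(f"single gaussian: {nll_single:.3f}")
print(f"mixture:         {nll_mixture:.3f}")
print(f"binned softmax:  {nll_binned:.3f}")
```

Averaging each of these over a held-out set gives one number per approach on the same scale, lower being better.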
