Proper data shape and model architecture for recognizing highs and lows in a chart
I am using a Keras LSTM model to try to pinpoint the highs and lows (the relative high and low points) in a chart; I need the actual coordinates of those highs and lows, not just an image. Training runs without errors, but the prediction output bears no resemblance to the training output.
What I've done so far: I created the output data by feeding the input data to SciPy's argrelextrema function. For personal reasons I want to be able to replicate that behavior using machine learning.
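For context, label generation with argrelextrema looks roughly like this (a minimal sketch; the random series is a placeholder and the exact comparator/order arguments I used may differ):

import numpy as np
from scipy.signal import argrelextrema

series = np.random.rand(500)                       # stand-in for one normalized input chart

high_idx = argrelextrema(series, np.greater)[0]    # indices of relative highs
low_idx = argrelextrema(series, np.less)[0]        # indices of relative lows
high_vals = series[high_idx]                       # chart values at the highs
low_vals = series[low_idx]                         # chart values at the lows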
Each individual input is an array of 500 values. If I print an individual input, I'll get:
[0.99943984 0.99958635 0.99954326 0.99965529 0.99984488 1.
0.99978456 0.99968976 0.99948293 0.99947431 0.99952602 0.99935367
0.99950017 0.99942261 0.99930196 0.99931919 0.99929334 0.99933643
0.99926749 0.99935367 0.99938814 0.99939675 0.99940537 0.99943123
0.99935367 0.99935367 0.99930196 0.99924163 0.99918993 0.9991296
0.99943123 0.99944846 0.99950879 0.99942261 0.99950879 0.99957773
0.99954326 0.99942261 0.99941399 0.99948293 0.99937952 0.99938814
0.99937952 0.99925887 0.99920716 0.99919855 0.99916407 0.99908651
0.99885383 0.9987418 0.99869871 0.99846603 0.99861254 0.99843156
0.99819026 0.9976732 0.99754393 0.99736296 0.99744913 0.99758702
0.99729401 0.99702686 0.9970441 0.99658735 0.99690621 0.99675971
0.99653565 0.99684589 0.99707857 0.99718198 0.99703548 0.99685451
0.99667353 0.9968028 0.99644947 0.9962685 0.99639776 0.99650118
0.9965615 0.99638053 0.99637191 0.99660459 0.99620817 0.99610476
0.9960789 0.99619955 0.99584622 0.99603582 0.99570834 0.99576005
0.99591517 0.99588931 0.99576005 0.99552737 0.99560493 0.99534639
0.99545842 0.9956394 0.99562216 0.99571696 0.99581175 0.99563078
0.99494136 0.99533777 0.99531192 0.99541533 0.99559631 0.99560493
0.99559631 0.99571696 0.99571696 0.99567387 0.99559631 0.99588931
0.99569972 0.99577728 0.99587208 0.99577728 0.99577728 0.99559631
0.99577728 0.99583761 0.99589793 0.99576005 0.99567387 0.9957859
0.99564801 0.99638915 0.99671662 0.99683727 0.99678556 0.99662183
0.99702686 0.99688898 0.99686312 0.99690621 0.9965615 0.99649256
0.99729401 0.99678556 0.99687174 0.99637191 0.99591517 0.99613061
0.99649256 0.99666491 0.99673386 0.99718198 0.99730263 0.99688036
0.99644085 0.996415 0.99608752 0.99619955 0.99622541 0.9959324
0.99574281 0.99574281 0.99634606 0.99659597 0.99685451 0.99653565
0.99599273 0.99577728 0.9956911 0.99553598 0.99535501 0.9957859
0.99594964 0.99570834 0.99616508 0.99600996 0.9962685 0.99631158
0.99613061 0.99668215 0.99689759 0.99701824 0.9968028 0.99671662
0.99663906 0.9965615 0.99668215 0.99816441 0.99865562 0.99866424
0.99827644 0.99815579 0.99788864 0.99776799 0.99790588 0.99822473
0.99819888 0.99794897 0.99840571 0.99831953 0.99818165 0.99750946
0.99770767 0.99774214 0.99783693 0.99775937 0.99742328 0.99747499
0.99783693 0.99801791 0.99802653 0.9974836 0.99697515 0.99744052
0.99682865 0.99674247 0.99681142 0.99631158 0.9966563 0.99674247
0.99659597 0.99652703 0.99702686 0.99744913 0.99845742 0.99785417
0.99775076 0.99754393 0.99676833 0.99659597 0.99600134 0.99673386
0.99668215 0.99702686 0.99650979 0.99650979 0.99711304 0.99753531
0.99726816 0.9971906 0.99707857 0.99701824 0.99756116 0.99740604
0.99759564 0.99716475 0.99715613 0.99711304 0.99691483 0.99688036
0.99690621 0.99625988 0.99600996 0.99574281 0.99569972 0.99569972
0.99579452 0.99596687 0.99606167 0.99606167 0.9959324 0.99580314
0.99456217 0.9939934 0.99403649 0.99442429 0.9942864 0.99457079
0.99469144 0.99450185 0.99465697 0.99468282 0.99451908 0.99453632
0.99488103 0.99494136 0.9949155 0.99479485 0.99453632 0.99441567
0.99427779 0.99431226 0.9945277 0.99434673 0.99388137 0.99419161
0.99376934 0.9942347 0.9943812 0.99429502 0.9941399 0.99413128
0.99412267 0.99395893 0.9939934 0.99451047 0.99473453 0.99454494
0.99459664 0.99468282 0.99469144 0.99457079 0.99458803 0.99468282
0.99481209 0.99475176 0.99480347 0.99479485 0.99488965 0.99429502
0.99427779 0.99432087 0.99471729 0.99483794 0.99469144 0.99476038
0.99478624 0.99492412 0.99489827 0.99488965 0.99482932 0.994769
0.99461388 0.9945277 0.99422608 0.99426055 0.99421746 0.99415714
0.99438982 0.99451908 0.99451908 0.99456217 0.99456217 0.99450185
0.99443291 0.99458803 0.99457941 0.99461388 0.9946742 0.994769
0.99479485 0.99488103 0.99484656 0.99469144 0.99460526 0.99465697
0.99455356 0.99449323 0.99461388 0.99457079 0.99466559 0.99489827
0.99484656 0.99473453 0.99465697 0.99460526 0.99499306 0.9951568
0.99477762 0.99488103 0.9949155 0.99493274 0.99516542 0.99509648
0.99524298 0.99512233 0.99480347 0.99489827 0.99500168 0.99492412
0.9948638 0.99498444 0.99522574 0.99519127 0.99529469 0.99528607
0.9953033 0.99513095 0.99506201 0.99527745 0.99560493 0.99589793
0.99551875 0.99533777 0.99573419 0.99569972 0.99534639 0.99546704
0.9956394 0.99566525 0.99584622 0.99582037 0.99583761 0.9957859
0.99582037 0.99569972 0.99571696 0.99567387 0.99563078 0.99555322
0.99558769 0.99585484 0.99607029 0.99599273 0.9959324 0.99646671
0.99669077 0.99663906 0.99675971 0.99690621 0.99666491 0.99676833
0.99686312 0.99775076 0.99736296 0.99737157 0.99729401 0.99714751
0.99713028 0.99683727 0.9968028 0.99715613 0.99723369 0.99726816
0.99735434 0.99737157 0.99731987 0.99736296 0.9969493 0.9971906
0.99724231 0.99700963 0.99699239 0.99683727 0.99695792 0.99655288
0.99644947 0.99655288 0.99655288 0.9965615 0.99638053 0.9959324
0.99632882 0.99590655 0.99625126 0.99633744 0.99649256 0.99682003
0.99679418 0.99668215 0.99682865 0.99655288 0.99632882 0.99648394
0.9962685 0.99639776 0.99651841 0.99644085 0.99652703 0.99619094
0.99637191 0.99631158 0.99667353 0.9969493 0.99696654 0.99687174
0.99697515 0.99736296 0.99778523 0.99763872 0.99769043 0.99749222
0.99721645 0.99725954 0.99771629 0.99793173 0.99790588 0.99800929
0.9981127 0.9979662 0.99815579 0.998061 0.99807823 0.99805238
0.99840571 0.99871595 0.99925025 0.99902619 0.99914684 0.99906066
0.99877627 0.99845742 0.9985953 0.99843156 0.99885383 0.99856945
0.99875904 0.99900034 0.99868148 0.99850912 0.99875904 0.99905204
0.99912099 0.99972423]
And plotting the same input, I'll get:
Each individual output is also an array of 500 values (the same length as the input). There is a complication with the output, because each chart (each input) has a different number of highs and lows; if I used that count as the output length, the output lengths would not match across the dataset. So I implemented something like one-hot encoding: I put 0 (actually a very small number, to avoid vanishing gradients) at every index that holds no high or low, and the value of the high or low at every index that does. A sketch of that construction is shown below.
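In code, building one such target from one input looks roughly like this (a sketch under the assumptions above; the function name, the 1e-8 padding constant, and the variable names are illustrative):

import numpy as np
from scipy.signal import argrelextrema

def make_target(series, pad=1e-8):
    """Return a (500, 1) target: `pad` everywhere except at relative extrema,
    where the actual chart value is kept."""
    target = np.full_like(series, pad, dtype=float)
    extrema = np.concatenate([
        argrelextrema(series, np.greater)[0],   # relative highs
        argrelextrema(series, np.less)[0],      # relative lows
    ])
    target[extrema] = series[extrema]
    return target.reshape(-1, 1)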
So here is an individual output:
[[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e+00]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.99129603e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.99577728e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.94941356e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.97302631e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.95355010e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.98664242e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.96311585e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.98457415e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.96001344e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.97595636e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.93993399e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.94941356e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.93769336e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.94924120e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.94157137e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.95897931e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.95553219e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.97750756e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.95906549e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.96828652e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.96190936e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[9.99250252e-01]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]
[1.00000000e-08]]
and if I plot that output in its raw form, I get:
But if I extract the highs and lows from the output using the following code:
y_index = []
y_vals = []
for z, i in enumerate(y[0]):
    if i > 0.00000001:  # a high or low is greater than the very small value used to pad the output data
        y_index.append(z)
        y_vals.append(i)
And plot it:
plt.scatter(y_index, y_vals)
plt.show()
I'll get:
And if I plot the input and the output on the same chart using the code:
plt.plot(x[0])
y_index = []
y_vals = []
for z, i in enumerate(y[0]):
    if i > 0.00000001:
        y_index.append(z)
        y_vals.append(i)
plt.scatter(y_index, y_vals)
plt.show()
The picture becomes clear:
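(As an aside, the same extraction can also be written with a boolean mask instead of the loop; a minimal sketch assuming y[0] has the (500, 1) shape shown above:)

import numpy as np

vals = y[0].ravel()            # flatten (500, 1) -> (500,)
mask = vals > 1e-8             # True wherever a high or low was stored
y_index = np.where(mask)[0]    # x-coordinates of the extrema
y_vals = vals[mask]            # y-coordinates of the extrema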
Here's the model that I am using:
# Imports assumed by the snippet below
import tensorflow as tf
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, CSVLogger

x = x.reshape(len(x), 500, 1)
y = y.reshape(len(y), 500, 1)
# print(x[0].shape)
# print(x[0])
# print(y[0].shape)
# print(y[0])  # debugging: inspect one target
# return       # debugging leftover; would stop execution before the model is built

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=x.shape[1:]))
model.add(tf.keras.layers.Dense(16, activation=tf.keras.layers.LeakyReLU()))
# model.add(BatchNormalization())
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(16, activation=tf.keras.layers.LeakyReLU()))
model.add(BatchNormalization())
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(8, activation=tf.keras.layers.LeakyReLU()))
# model.add(BatchNormalization())
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

mcp_save = ModelCheckpoint('model_EPOCH-{epoch:02d}-loss-{loss:.12f}.h5', save_best_only=True, monitor='val_loss', mode='min')
earlyStopping = EarlyStopping(monitor='val_loss', patience=40, verbose=0, mode='auto')
# reduce_lr_loss = ReduceLROnPlateau(monitor='val_accuracy', factor=0.1, patience=15, verbose=1, epsilon=1e-4, mode='min')
csv_logger = CSVLogger('model_log.csv', append=True, separator=',')
callbacks = [mcp_save, csv_logger, earlyStopping]

model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005, decay=1e-6), metrics=['mean_squared_error'])  # Be careful about your chosen metric.
model.summary()
model.fit(x, y, epochs=200, batch_size=128, callbacks=callbacks, validation_split=0.20)
model.save('model.h5')
But after training completes, the model just predicts the very input that it's given as the output. That leads me to believe I am definitely doing something wrong. And I believe this problem shouldn't be too difficult for a neural network to solve, since the training output itself was produced by SciPy's argrelextrema.
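A minimal sketch of how the prediction can be overlaid on the input to see this (using the model and x defined above):

import matplotlib.pyplot as plt

pred = model.predict(x[:1])               # shape (1, 500, 1)
plt.plot(x[0].ravel(), label='input')
plt.plot(pred[0].ravel(), label='prediction')
plt.legend()
plt.show()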