Data set of vectors of SVG paths for digits
I have used the MNIST data set many times to train digit-recognition models based on optical character recognition (OCR).
I am now trying to do the same, but with a data set of SVG paths. I am looking for an MNIST equivalent that is based on digital paths / SVG.
Here is a sample:
The SVG:
<path d="m233.5,119.4375c-1,-1 -3.025818,-1.320366 -5,-1c-3.121445,0.506538
-8.191559,0.090805 -15,2c-14.665848,4.112541 -23.266006,8.139008 -31,11c-6.291519,
2.327393 -11.679474,6.571106 -14,11c-1.467636,2.801086 -2,7 -2,10c0,4
-0.610916,8.03746 0,13c0.503769,4.092209 2.877655,8.06601 4,10c1.809723,3.118484
4.718994,6.310211 8,9c5.576645,4.571762 11.887314,5.376694 18,7c9.8564,2.617508
19,2 34,2c11,0 19,0 24,0c5,0 9.222717,-1.723984 13,-5c2.136749,-1.853195 4.346191,
-4.70546 6,-7c1.307465,-1.813995 1.693512,-6.048325 1,-11c-1.009766,-7.209747
-3.793945,-12.087433 -6,-17c-1.832031,-4.079666 -2.714111,-7.21167 -5,
-10c-1.793182,-2.187347 -5.714111,-5.21167 -8,-8c-2.689789,-3.281006 -4,-5 -6,
-7c-1,-1 -4,-3 -5,-4c-1,-1 -2.042908,-1.710213 -3,-2c-3.450851,-1.04483
-3.852737,-2.173096 -5,-3c-1.813995,-1.307449 -3,-2 -6,-2c-2,0 -3.878555,
-1.493462 -7,-2c-0.987091,-0.160179 -2,0 -4,0l-1,0" id="svg_1" stroke-width="1.5"
stroke="#000" fill="none"/>
I intend to train the model to understand the d= parameter (the path data). MyScript uses this approach rather than OCR: https://developer.myscript.com/. They call their approach iink, but that is a brand name rather than a term I can use to search for material in this space. MyScript works with two-dimensional data, so it uses not just the paths but also the relative positions of objects. I am interested in applying this approach to mathematics, but for now anything will do.
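To make concrete what "understanding the d= param" could mean as model input, here is a minimal sketch of turning path data into a numeric sequence. It is only an assumption about how the preprocessing might look, not MyScript's method. The names parse_path and segment_endpoints are hypothetical, and only the m, l, c and z commands are handled, which covers the commands in the sample above.

```python
import re

# Matches either one command letter or one number (SVG numbers may omit the
# leading zero or carry an exponent). Only m/l/c/z are supported in this sketch.
TOKEN_RE = re.compile(r"([MmLlCcZz])|([-+]?(?:\d*\.\d+|\d+\.?)(?:[eE][-+]?\d+)?)")

# Number of numeric arguments each supported command consumes per segment.
ARG_COUNT = {"m": 2, "l": 2, "c": 6, "z": 0}


def parse_path(d):
    """Split path data into (command, args) segments, expanding implicit repeats."""
    segments = []
    command = None
    args = []
    for letter, number in TOKEN_RE.findall(d):
        if letter:
            if args:
                raise ValueError(f"incomplete arguments before {letter!r}")
            command = letter
            if letter in "Zz":
                segments.append((letter, []))
            continue
        if command is None or command in "Zz":
            raise ValueError("number found without a drawing command")
        args.append(float(number))
        if len(args) == ARG_COUNT[command.lower()]:
            segments.append((command, args))
            args = []
            # Extra coordinate pairs after a moveto are implicit lineto commands.
            if command == "m":
                command = "l"
            elif command == "M":
                command = "L"
    if args:
        raise ValueError("path data ends with incomplete arguments")
    return segments


def segment_endpoints(segments):
    """Return the absolute (x, y) end point of every segment, in drawing order."""
    x = y = 0.0
    start = (0.0, 0.0)
    points = []
    for command, args in segments:
        if command in "Zz":
            x, y = start  # closepath returns to the start of the subpath
        else:
            end_x, end_y = args[-2], args[-1]
            if command.islower():
                x, y = x + end_x, y + end_y  # relative coordinates
            else:
                x, y = end_x, end_y  # absolute coordinates
            if command in "Mm":
                start = (x, y)
        points.append((x, y))
    return points


if __name__ == "__main__":
    demo = "m10,20c1,2 3,4 5,6l-1,0"
    print(segment_endpoints(parse_path(demo)))
    # prints [(10.0, 20.0), (15.0, 26.0), (14.0, 26.0)]
```

The end points drop the Bezier control points, so this is a lossy representation. Keeping the full args of each segment would preserve the curve shape at the cost of longer sequences.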
Topic mnist ocr image-recognition dataset
Category Data Science