How to generate syntactically correct text for a CRNN-CTC text model?

Setting aside the image creation and labeling details, is there a way to generate syntactically correct text examples? As I currently understand it, the CTC model takes into account how likely a given letter is to precede or follow another in a sequence. For example:

Colorless green ideas sleep furiously

The sentence doesn't make sense; however, it has proper syntax: each word has a few vowels, the verbs are where they should be, and so on. I want the word generator to take into account what is more or less valid and generate examples accordingly. I think that generating completely random words and phrases introduces bias into the model. Here's another form that is still okay:

clabe lonkey sining slace 225

This is less valid than the previous example, but the words still have a proper syntax. Here's what I think is not good for model generalization:

jhsgdvj c3DDsdc csdce5445dchjv3 cdsIBcsc

This is usually the result of the random generation that I'm trying to avoid. A common practice in some of the text generators I found is to keep word files and draw examples from them, but this limits the examples to the predetermined words and introduces character imbalance for the less frequent characters, e.g. z, x, ...
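To make the goal concrete, here is a minimal sketch of one possible middle ground between word files and fully random strings: a character-level Markov chain that learns which characters tend to follow each other from a word list and then samples new pseudo-words. The names `train_char_model` and `sample_word`, the `temperature` parameter, and the tiny corpus are all illustrative assumptions, not part of any existing generator. Raising `temperature` above 1 flattens the learned distribution, which might help rare characters such as z and x appear more often.

```python
import random
from collections import defaultdict

def train_char_model(words, order=2):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        # "^" pads the start of a word, "$" marks its end.
        padded = "^" * order + word.lower() + "$"
        for i in range(order, len(padded)):
            counts[padded[i - order:i]][padded[i]] += 1
    return counts

def sample_word(counts, order=2, temperature=1.0, max_len=12, rng=random):
    """Sample one pseudo-word; `temperature` must be > 0.

    temperature > 1 flattens the distribution (rare characters get
    picked more often), temperature < 1 sharpens it.
    """
    context, out = "^" * order, []
    while len(out) < max_len:
        followers = counts[context]
        chars = list(followers)
        weights = [followers[c] ** (1.0 / temperature) for c in chars]
        ch = rng.choices(chars, weights=weights)[0]
        if ch == "$":  # end-of-word marker reached
            break
        out.append(ch)
        context = (context + ch)[-order:]
    return "".join(out)

# Hypothetical toy corpus; a real word list would be much larger.
corpus = ["colorless", "green", "ideas", "sleep", "furiously",
          "monkey", "table", "singing", "place", "zebra", "extra"]
model = train_char_model(corpus)
rng = random.Random(0)
print(" ".join(sample_word(model, temperature=1.5, rng=rng) for _ in range(4)))
```

Because every sampled transition was observed in the word list, the output should look more like "clabe lonkey" than like "jhsgdvj", while not being restricted to the words in the file. Whether this actually improves generalization of the CRNN-CTC model is something I would still need to verify.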
