How do I generate text from ids in torchtext's sentencepiece_numericalizer?
torchtext's sentencepiece_numericalizer() returns a generator that yields, for each input sentence, the SentencePiece model's ids for the tokens in that sentence. From the generator, I can get the ids.
My question is: how do I get the text back from those ids after training?
For example:
sp_id_generator = sentencepiece_numericalizer(sp_model)
list_a = ["sentencepiece encode as pieces", "examples to try!"]
list(sp_id_generator(list_a))
[[9858, 9249, 1629, 1305, 1809, 53, 842],
[2347, 13, 9, 150, 37]]
How do I convert these ids back to the original text (i.e. "sentencepiece encode as pieces", "examples to try!")?
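To make the goal concrete, here is a minimal sketch of the round trip I am after. It assumes the model object exposes a DecodeIds() method, as sentencepiece.SentencePieceProcessor does; whether the object returned by torchtext's load_sp_model exposes it too may depend on the torchtext version. The helper name ids_to_text is hypothetical.

```python
from typing import Iterable, List


def ids_to_text(sp_model, batch_of_ids: Iterable[Iterable[int]]) -> List[str]:
    """Decode each sequence of SentencePiece ids back into a string.

    Assumption: `sp_model` has a `DecodeIds(list_of_ints)` method, as
    `sentencepiece.SentencePieceProcessor` does.
    """
    # Cast to plain ints so tensor or numpy scalars are also accepted.
    return [sp_model.DecodeIds([int(i) for i in ids]) for ids in batch_of_ids]
```

With the example above, this would be called as ids_to_text(sp_model, list(sp_id_generator(list_a))). If the torchtext object does not provide DecodeIds(), one option is to load the same model file with sentencepiece.SentencePieceProcessor(model_file=...) and pass that instead. The decoded text may differ slightly from the input because SentencePiece normalizes text before encoding.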