Bidirectional Encoder Representations from Transformers in R

Can anybody suggest where I can find example code in R for using the BERT neural network on text mining tasks? All I can find are Python examples, and I need R.

Topics: text, programming, nlp, r

Category: Data Science


Disclaimer: the package I describe below only performs text classification (for now).

For general-purpose tasks, I recommend RBERT.

I've developed a package on CRAN called transforEmotion.

There is a vignette for getting set up with Python (it includes an example of how to use the package in R at the end).

After setup, the rest is taken care of by the transformer_scores() function.

The default model on CRAN is Facebook's BART Large.

I've done much more work on the GitHub version.

The default model on GitHub is Cross-Encoder's DistilRoBERTa (much faster than Facebook's BART Large, with a minimal trade-off in accuracy).

The transformer_scores() function lets you use any Hugging Face model that has a zero-shot classification pipeline: https://huggingface.co/models?pipeline_tag=zero-shot-classification
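As a rough idea of what usage looks like, here is a minimal sketch; the argument names follow the CRAN documentation for transformer_scores(), so verify them against your installed version of transforEmotion:

    # Zero-shot classification of short texts against candidate labels
    library(transforEmotion)

    text <- c(
      "I am absolutely thrilled with these results!",
      "This has been a frustrating week."
    )

    # Candidate labels; the model scores each text against each label
    classes <- c("joy", "anger", "sadness", "surprise")

    scores <- transformer_scores(text = text, classes = classes)
    scores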


It seems the Python dependency is always there in some fashion. A nice way of bridging that gap is the reticulate package, which lets you call the Python libraries directly from R.
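A minimal sketch of that approach, assuming a Python environment with the Hugging Face transformers library (and a backend such as torch) installed, e.g. via reticulate::py_install("transformers"):

    library(reticulate)

    # Import the Python transformers library into R
    transformers <- import("transformers")

    # Build a BERT fill-mask pipeline directly from R
    fill_mask <- transformers$pipeline("fill-mask", model = "bert-base-uncased")

    # BERT predicts the masked token; results come back as an R list
    preds <- fill_mask("R is a [MASK] language for data analysis.")
    preds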


You might be interested in the open-source R package RBERT: https://github.com/jonathanbratt/RBERT

It's a work in progress, but the goal is to be able to use BERT directly in R.
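To give a flavor of the intended workflow, here is a sketch based on the examples in the RBERT README (the function names come from that repository and may change as the package evolves):

    library(RBERT)

    # Download a pretrained BERT checkpoint (cached locally)
    BERT_PRETRAINED_DIR <- download_BERT_checkpoint(model = "bert_base_uncased")

    text_to_process <- c(
      "BERT embeddings can be extracted directly in R.",
      "Each sentence becomes a sequence of token vectors."
    )

    # Extract contextual token features from all 12 layers
    BERT_feats <- extract_features(
      examples = text_to_process,
      ckpt_dir = BERT_PRETRAINED_DIR,
      layer_indexes = 1:12
    )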
