Fastest way to parse regex in R

Question

Luisda

June 11, 2020, 18:33

I need to run around 1.6k regular expressions, such as the pair shown below.

I also have around 7k documents (about half a page each on average) that need to be searched with these regular expressions.

Right now I am using:

library(rebus)
library(stringr)

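# or1() takes a single character vector of alternatives
regex_exp <- rebus::or1(c("(?i-mx:\\b(?:actroid\\b))", "(?i-mx:\\b(?:robot\\w*\\b))"))

# wrap the whole alternation in word boundaries
regex_exp <- BOUNDARY %R% regex_exp %R% BOUNDARY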

stringr::str_extract_all("This is my text talking about technology, but also about the actroid", regex_exp)

to find matches, but it takes approx. 3.5 minutes per file, which does not scale.
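
For reference, here is a minimal sketch of the same idea applied to a batch of documents in one call, using only stringr. The names `patterns` and `docs` are hypothetical stand-ins for the real 1.6k expressions and 7k documents, and the two patterns are simplified versions of the pair above. This is only meant to show the shape of the workflow and how it could be timed, not a tuned solution.

```r
library(stringr)

# Hypothetical stand-ins for the real pattern list and document set.
patterns <- c("\\bactroid\\b", "\\brobot\\w*\\b")
docs <- c(
  "This is my text talking about technology, but also about the actroid",
  "Robots and robotics are discussed here",
  "Nothing relevant in this one"
)

# Join all patterns into a single alternation so each document is
# scanned once, and make the match case-insensitive.
combined <- regex(
  paste0("(?:", paste(patterns, collapse = "|"), ")"),
  ignore_case = TRUE
)

# str_extract_all() is vectorised over `docs`; the result is a list with
# one character vector of matches per document.
matches <- str_extract_all(docs, combined)

# Time the whole batch so it can be compared with the per-file figure.
elapsed <- system.time(str_extract_all(docs, combined))[["elapsed"]]
```

With the real data, `elapsed` divided by the number of documents gives a per-file time that can be compared directly with the 3.5 minutes mentioned above.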

Is there a more efficient library or method for matching regular expressions in R? I also do not know whether using reticulate to do the matching in Python and return the results to R would be faster.

Topic regex parsing r

Category Data Science
