Theoretical Question: Data.table vs Data.frame with Big Data
I know that I can read a very large CSV file much faster with `fread` from the `data.table` package than with `read.csv`, which reads the file in as a `data.frame`. However, `dplyr` can only perform operations on a `data.frame`.
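For example, this is roughly the comparison I have in mind; the file path is just a placeholder:

```r
library(data.table)

# Read the same large CSV two ways; "big_file.csv" is a placeholder path
system.time(df <- read.csv("big_file.csv"))  # base R, returns a data.frame
system.time(dt <- fread("big_file.csv"))     # data.table's fread, usually much faster
```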
My questions are:
- Why was `dplyr` built to work with the slower of the two data structures?
- When working with big data, is it good practice to read the file in as a `data.table` and then convert it to a `data.frame` to perform `dplyr` operations (as in the sketch after this list)?
- Is there another strategy I am missing?
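For reference, here is a minimal sketch of the workflow I am asking about in the second question; the file name and the column names `group_col` and `value_col` are placeholders:

```r
library(data.table)
library(dplyr)

# Fast read: fread() returns a data.table ("big_file.csv" is a placeholder)
dt <- fread("big_file.csv")

# Convert to a plain data.frame before applying dplyr verbs
df <- as.data.frame(dt)

# Typical dplyr pipeline on the converted data
result <- df %>%
  group_by(group_col) %>%
  summarise(mean_value = mean(value_col, na.rm = TRUE))
```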
Topic: dplyr, data-table, dataframe, r
Category: Data Science