Loading files into and out of HDFS via a system call / command line vs. using libhdfs
I am trying to implement a simple C/C++ program (such as word count) for the HDFS file system. It takes a file from an input path, puts it into HDFS (where it gets split), has it processed by my map-reduce function, and produces an output file that I then place back into the local file system.
My question is: which is the better design choice for loading the files into HDFS, calling bin/hdfs dfs -put ../inputFile /someDirectory from the C program, or making use of libhdfs?
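
To make the comparison concrete, here is a rough sketch of what I mean by each option. The function names are just placeholders I made up, the file and directory paths are the same placeholders as above, and I am assuming the hdfs binary is on the PATH for the first variant and that libhdfs (hdfs.h, -lhdfs, and the Hadoop CLASSPATH) is set up for the second:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include "hdfs.h"   /* libhdfs header */

    /* Option 1: shell out to the HDFS CLI (starts a JVM for every call). */
    static int put_via_cli(void)
    {
        /* File and directory names are placeholders. */
        return system("hdfs dfs -put ../inputFile /someDirectory");
    }

    /* Option 2: use libhdfs (in-process via JNI, one connection can be reused). */
    static int put_via_libhdfs(void)
    {
        hdfsFS fs = hdfsConnect("default", 0);   /* namenode taken from the config */
        if (!fs) return -1;

        hdfsFile out = hdfsOpenFile(fs, "/someDirectory/inputFile",
                                    O_WRONLY | O_CREAT, 0, 0, 0);
        FILE *in = fopen("../inputFile", "rb");
        if (!out || !in) {
            if (out) hdfsCloseFile(fs, out);
            if (in) fclose(in);
            hdfsDisconnect(fs);
            return -1;
        }

        /* Copy the local file into HDFS in chunks. */
        char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0)
            if (hdfsWrite(fs, out, buf, (tSize)n) < 0) break;

        fclose(in);
        hdfsFlush(fs, out);
        hdfsCloseFile(fs, out);
        hdfsDisconnect(fs);
        return 0;
    }

The second variant would need to be compiled and linked against libhdfs with a working JVM/CLASSPATH, whereas the first only needs the Hadoop client installed.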
Topic c map-reduce apache-hadoop bigdata
Category Data Science