What are common problems around Hadoop storage?

I've been asked to lead a program to understand why our Hadoop storage is constantly near capacity. What questions should I ask?

  1. How old is the data (data age)?
  2. How large is each dataset (data size)? A sketch for measuring both age and size per directory follows this list.
  3. What is the housekeeping/retention schedule?
  4. How do we identify the different types of compression used by different applications?
  5. How can we identify where the duplicate data sources are?
  6. Are jobs designated for edge nodes actually running only on edge nodes?
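
To put numbers behind the first two questions, something like the following Python sketch can summarize how much data sits under each top-level directory and how much of it has not been modified in months. It shells out to the standard `hdfs dfs -ls -R` command; the root path `/data` and the 90-day cutoff are assumptions to adjust for your cluster.

    # Sketch only: summarize data size and age per top-level HDFS directory.
    # Assumes the `hdfs` CLI is on PATH; the root path /data and the 90-day
    # cutoff are assumptions to adjust for your cluster.
    import subprocess
    from collections import defaultdict
    from datetime import datetime, timedelta

    ROOT = "/data"  # hypothetical warehouse root

    def list_files(root):
        """Yield (size_bytes, modified, path) for every file under root."""
        out = subprocess.run(
            ["hdfs", "dfs", "-ls", "-R", root],
            capture_output=True, text=True, check=True,
        ).stdout
        for line in out.splitlines():
            parts = line.split(None, 7)
            if len(parts) < 8 or parts[0].startswith("d"):
                continue  # skip the "Found N items" header and directories
            size = int(parts[4])
            modified = datetime.strptime(f"{parts[5]} {parts[6]}", "%Y-%m-%d %H:%M")
            yield size, modified, parts[7]

    def summarize(root, stale_days=90):
        """Print total bytes and bytes untouched for stale_days, per top-level dir."""
        cutoff = datetime.now() - timedelta(days=stale_days)
        totals, stale = defaultdict(int), defaultdict(int)
        for size, modified, path in list_files(root):
            top = "/".join(path.split("/")[:3])  # e.g. /data/app1
            totals[top] += size
            if modified < cutoff:
                stale[top] += size
        for top in sorted(totals, key=totals.get, reverse=True):
            print(f"{top:40s} {totals[top] / 1e9:10.1f} GB total,"
                  f" {stale[top] / 1e9:10.1f} GB older than {stale_days} days")

    if __name__ == "__main__":
        summarize(ROOT)

Directories that are large but mostly stale are the first candidates for archiving or retention policies.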

Topic: apache-hadoop

Category: Data Science


You can ask the following questions:

  1. Hadoop cluster size (number of nodes/machines allocated for data storage)
  2. Hardware configuration (storage capacity) of each node/machine
  3. Block replication factor used for fault tolerance (a sketch for checking capacity and replication follows this list)
  4. Compression technique used for file storage
  5. Data archiving process
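
Several of these can be checked directly from the command line rather than by interviewing teams. The sketch below assumes the `hdfs` CLI is available and the account is allowed to run `hdfs dfsadmin -report`; the sample file path is hypothetical. It reports how much of the configured capacity is in use and the replication factor of a given file, which together show how much raw disk a logical dataset really consumes.

    # Sketch only: report cluster capacity usage and the replication factor of a
    # file. Assumes the `hdfs` CLI is on PATH and the account is allowed to run
    # `hdfs dfsadmin -report`; the sample file path below is hypothetical.
    import re
    import subprocess

    def cluster_capacity():
        """Return the cluster-wide counters from `hdfs dfsadmin -report` in bytes."""
        out = subprocess.run(
            ["hdfs", "dfsadmin", "-report"],
            capture_output=True, text=True, check=True,
        ).stdout
        summary = {}
        for line in out.splitlines():
            m = re.match(r"(Configured Capacity|DFS Used|DFS Remaining):\s+(\d+)", line)
            if m:
                # The cluster-wide summary comes before the per-datanode sections,
                # so keep only the first occurrence of each counter.
                summary.setdefault(m.group(1), int(m.group(2)))
        return summary

    def replication_factor(path):
        """Replication factor of a single file (directories report 0)."""
        out = subprocess.run(
            ["hdfs", "dfs", "-stat", "%r", path],
            capture_output=True, text=True, check=True,
        ).stdout
        return int(out.strip())

    if __name__ == "__main__":
        cap = cluster_capacity()
        used_pct = 100.0 * cap["DFS Used"] / cap["Configured Capacity"]
        print(f"DFS used: {used_pct:.1f}% of configured capacity")
        # Hypothetical file; replace with one of your own large datasets.
        print("Replication of /data/app1/part-00000:",
              replication_factor("/data/app1/part-00000"))

Because every block is stored "replication factor" times, a 10 TB dataset at the default factor of 3 consumes roughly 30 TB of raw disk before compression is even considered.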
