Data Anonymization for all domains?
I am using a dataset from the marketing and sales department. The dataset contains customer name (company name), company address, pincode, number of orders placed, revenue generated from that customer, etc.
My question is whether I should hide/mask/anonymize the customer name, address, etc.
Of course, the insights that we generate will be used by the business users from the sales and marketing teams.
So, should we use a surrogate identifier (with a mapping sheet) in place of the customer names, addresses, etc.?
For example, Company A is indicated as 101, Company B is indicated as 321, etc. These would be random identifiers, and the mapping file would be maintained by the business users (from the sales and marketing department). Or is it not necessary to anonymize the data?
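To make the mapping-sheet idea concrete, here is a minimal sketch of what I have in mind. The field names (`customer_name`, `company_address`, `customer_id`), the ID range, and the sample rows are all made up for illustration and are not from the real dataset. Strictly speaking this is pseudonymization rather than full anonymization, since anyone holding the mapping can reverse it, and fields such as pincode are left as-is and could still help identify a customer.

```python
import secrets


def build_mapping(names, low=100, high=999):
    """Assign each distinct customer name a unique random numeric ID."""
    mapping = {}
    used = set()
    for name in dict.fromkeys(names):  # distinct names, first-seen order
        if len(used) >= high - low + 1:
            raise ValueError("ID range too small for the number of customers")
        while True:
            candidate = low + secrets.randbelow(high - low + 1)
            if candidate not in used:
                break
        used.add(candidate)
        mapping[name] = candidate
    return mapping


def pseudonymize(rows, mapping, drop=("customer_name", "company_address")):
    """Replace identifying fields with the surrogate ID from the mapping."""
    masked_rows = []
    for row in rows:
        masked = {k: v for k, v in row.items() if k not in drop}
        masked["customer_id"] = mapping[row["customer_name"]]
        masked_rows.append(masked)
    return masked_rows


# Illustrative rows only (hypothetical values).
rows = [
    {"customer_name": "Company A", "company_address": "1 Example Road",
     "pincode": "560001", "orders": 12, "revenue": 50000},
    {"customer_name": "Company B", "company_address": "2 Sample Street",
     "pincode": "400001", "orders": 7, "revenue": 21000},
    {"customer_name": "Company A", "company_address": "1 Example Road",
     "pincode": "560001", "orders": 3, "revenue": 9000},
]

# The mapping is what the business users would keep; the analysis
# would only ever see masked_rows.
mapping = build_mapping(row["customer_name"] for row in rows)
masked_rows = pseudonymize(rows, mapping)
```

The mapping would be stored separately (for example as a CSV held by the sales and marketing team), so the same company always gets the same ID across records and the results can be translated back to names when needed.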
Can you share your suggestions on when to anonymize the data and when not to?
I know that in healthcare we have individual-level data (patient-centric data) that is very sensitive, so we mask it using identifiers. But does the same apply in the sales and marketing domain as well? Should the company name, address, revenue generated, etc. be treated as confidential?
Topic data anonymization deep-learning dataset machine-learning
Category Data Science