How can I identify whether a field holds personally identifiable information (PII) from the field name alone, using an ML model in Python?
Is it possible to automatically detect fields holding personal information (name, phone, address, SSN, passport, government ID, ...) from their names, using Python? The goal is to encrypt or anonymize the PII fields before uploading datasets to the cloud.
I am open to training my own model on a dataset of thousands of field names, each labeled as personal or not, but I can't find any such dataset.
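To make the idea concrete, here is a minimal sketch of a non-ML baseline: it splits a field name into tokens and flags it if any token is in a keyword list. This is only an illustration under assumptions. The keyword list and the function names `tokenize` and `looks_like_pii` are hypothetical, and the flags it produces could at best serve as rough starting labels for training a real classifier.

```python
import re

# Hypothetical keyword list (an assumption, not a standard); extend it for your own schemas.
PII_KEYWORDS = {
    "name", "phone", "mobile", "address", "street", "zip", "postcode",
    "ssn", "passport", "email", "birth", "dob",
}

def tokenize(field_name):
    """Split names like 'customerPhoneNo' or 'home_address' into lowercase tokens."""
    # Insert a space at each lowercase/digit-to-uppercase boundary (camelCase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", field_name)
    # Split on any run of characters that are not ASCII letters or digits.
    return [tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", spaced) if tok]

def looks_like_pii(field_name):
    """Return True if any whole token of the field name is a known PII keyword."""
    return any(tok in PII_KEYWORDS for tok in tokenize(field_name))

if __name__ == "__main__":
    for field in ["customerPhoneNo", "home_address", "order_total", "SSN"]:
        print(field, looks_like_pii(field))
```

Matching whole tokens rather than substrings avoids flagging a field like `filename`, but it still misfires on names such as `product_name` and misses abbreviations that are not in the list, which is why I would prefer a trained model.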
Topic anonymization python machine-learning
Category Data Science