Data Security Manager Help v7.8
Data classification
There are several ways to classify your data using Data Security:
- Pre-defined scripts, dictionaries, file-types, regular expression patterns, and key phrases - Data Security includes classifiers out of the box so you can start classifying your data right away. Regular expression (RegEx) patterns are used to identify alphanumeric strings of a certain format, such as 123-45-6789. File property classifiers let you classify data by file name, type, or size.
- Custom scripts, dictionaries, file-types, regular expression patterns, and key phrases for specific (described) data - You can create your own custom classifiers for data that you describe.
- Fingerprinting (registered) data - The power of PreciseID technology is its ability to detect sensitive information despite manipulation, reformatting, or other modification. Fingerprints enable the protection of whole or partial documents, antecedents, and derivative versions of the protected information, as well as snippets of the protected information, whether cut and pasted or retyped. PreciseID technology can fingerprint two types of data: structured (databases) and unstructured (files and folders).
- Machine learning - For machine learning classifiers, you provide examples of the type of data that you want to protect and examples of data that you don't want to protect, so the system can learn to identify sensitive data in traffic. These examples are called positive and negative training sets because they educate the system. Unlike fingerprinting, the data being analyzed does not need to contain parts of the example files; it only needs to look similar or be on a similar topic. The system learns and recognizes complex patterns and relationships and makes decisions based on them, without the exact include/exclude criteria that are specified in fingerprinting classifiers. In this way, machine learning can even protect new, zero-day documents.
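To illustrate how a regular expression pattern can identify strings of a certain format, such as 123-45-6789, here is a minimal Python sketch. It is not the pattern or logic that Data Security uses internally; the names SSN_LIKE, find_matches, and classify, and the threshold parameter, are assumptions made for this example only.

```python
import re
from typing import List

# Hypothetical pattern for strings shaped like 123-45-6789 (3-2-4 digits).
# The lookarounds reject matches embedded in a longer run of digits.
# This is an illustration, not a built-in Data Security pattern.
SSN_LIKE = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def find_matches(text: str) -> List[str]:
    """Return every substring of text that has the 3-2-4 digit format."""
    return SSN_LIKE.findall(text)


def classify(text: str, threshold: int = 1) -> bool:
    """Flag text when it contains at least `threshold` matching strings."""
    return len(find_matches(text)) >= threshold
```

Calling classify on a string with one such value returns True at the default threshold; raising the threshold is one simple way to reduce false positives on content that mentions a single matching string by coincidence.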