Accuracy of machine learning
Machine Learning | TRITON AP-DATA | v8.3.x | 15-Dec-2016
The ability of the system to classify data accurately depends largely on the examples that you provide. If the system fails to find enough common elements, the results from machine learning may not be accurate. Should this happen, the system performs another stage of validation to assess the level of false positives (unintended matches) and false negatives (undetected matches) on new data that was not used during the training phase, sometimes referred to as "zero-day documents."
If the "recall" level of the classifier (that is, the number of true positives divided by the sum of true positives and false negatives in the new data) is below 70 percent, the system returns a FAIL message that includes the likely reason that the classification attempt failed. Examples of these error messages follow:
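The recall check can be illustrated with a short Python sketch. It uses the standard definition of recall (true positives divided by the sum of true positives and false negatives) and the 70 percent threshold stated above. The function names, the constant name, and the PASS label are hypothetical and chosen for illustration only; they are not part of the product, and the product's internal calculation may differ.

```python
# Illustrative sketch only: names and the "PASS" label are assumptions,
# not the product's actual implementation.

RECALL_THRESHOLD = 0.70  # the 70 percent level described in the text


def recall(true_positives: int, false_negatives: int) -> float:
    """Standard recall: TP / (TP + FN)."""
    relevant = true_positives + false_negatives
    if relevant == 0:
        raise ValueError("recall is undefined when there are no positive examples")
    return true_positives / relevant


def validation_result(true_positives: int, false_negatives: int) -> str:
    """Return "FAIL" when recall on the held-out data is below the threshold."""
    if recall(true_positives, false_negatives) < RECALL_THRESHOLD:
        return "FAIL"
    return "PASS"


if __name__ == "__main__":
    # 60 of 100 positive documents detected gives a recall of 0.60.
    print(recall(60, 40), validation_result(60, 40))
```

Note that false positives do not enter the recall calculation at all; they affect precision, which is a separate measure.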
 
By adjusting the sensitivity level of the classifier, you can reduce the number of false negatives (undetected matches) while accepting a higher level of false positives (unintended matches), accept some false negatives to reduce the rate of false positives, or find an acceptable balance in between. Factors influencing your choice include the level of commonality in your positive set of examples (a low level tends to decrease accuracy), the business implications of false positives, and the resources that you have available to deal with false positives.
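The trade-off between false positives and false negatives can be sketched in Python as follows. The sketch assumes that the classifier assigns each document a score between 0 and 1 and that sensitivity corresponds to a score threshold; this is an assumption made for illustration, since the product's internal sensitivity mechanism is not described here. The scores and labels are made-up sample data, and `count_errors` is a hypothetical helper.

```python
# Illustrative sketch only: the score-threshold model of "sensitivity" and
# the sample data below are assumptions, not the product's behavior.

def count_errors(scores, labels, threshold):
    """Return (false_positives, false_negatives) for a given score threshold.

    A document is treated as a match when its score is >= threshold.
    labels[i] is True when document i really belongs to the positive set.
    """
    false_positives = sum(
        1 for s, y in zip(scores, labels) if s >= threshold and not y
    )
    false_negatives = sum(
        1 for s, y in zip(scores, labels) if s < threshold and y
    )
    return false_positives, false_negatives


# Made-up scores for four positive and four negative documents.
scores = [0.95, 0.80, 0.65, 0.40, 0.70, 0.30, 0.55, 0.10]
labels = [True, True, True, True, False, False, False, False]

if __name__ == "__main__":
    # A lower threshold means higher sensitivity.
    for threshold in (0.75, 0.50, 0.25):
        fp, fn = count_errors(scores, labels, threshold)
        print(f"threshold={threshold:.2f} false_positives={fp} false_negatives={fn}")
```

In this sample, lowering the threshold (raising sensitivity) removes false negatives but introduces false positives, which is the balance described above.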

Copyright 2016 Forcepoint LLC. All rights reserved.