SIMCA classification

SIMCA (Soft Independent Modelling of Class Analogy) is a simple but efficient one-class classification method based mainly on PCA. The general idea is to create a PCA model using data for samples/objects belonging to a class and to classify new objects based on how well the model fits them. The decision is made using two distances, \(Q\) and \(T^2\), and their corresponding critical limits. The critical limits, computed for both distances (or for their combination), are used to cut off the strangers (extreme objects) and to accept class members with a predefined expected ratio of false negatives (\(\alpha\)). All details about the methods for computing critical limits implemented in this package can be found here. Note that version 0.9.0 brings a lot of improvements and new features related to the critical limits calculation and SIMCA classification, so it can be a good idea to read this chapter first.
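To make the decision rule concrete, here is a minimal, language-neutral sketch in plain Python. It is not the package's API. It assumes the usual definitions: \(Q\) is the squared residual (orthogonal) distance of a centered object to the PCA subspace, and \(T^2\) is the score distance, with each squared score scaled by the corresponding eigenvalue. The function names, the toy one-component model, and the critical limits `q_lim` and `t2_lim` are hypothetical; in practice the limits come from one of the methods mentioned above. The rule shown checks each distance against its own limit, which is only one of the possible options (a combined limit is another).

```python
def distances(x, center, loadings, eigenvalues):
    """Return (Q, T2) for a single object x.

    center      -- mean of the calibration data (one value per variable)
    loadings    -- list of orthonormal loading vectors, one per component
    eigenvalues -- variance of the calibration scores for each component
    """
    # center the object the same way as the calibration set
    xc = [xi - ci for xi, ci in zip(x, center)]
    # scores: projection of the centered object onto each loading vector
    scores = [sum(v * p for v, p in zip(xc, load)) for load in loadings]
    # part of the object reproduced by the retained components
    fitted = [
        sum(t * load[j] for t, load in zip(scores, loadings))
        for j in range(len(xc))
    ]
    # Q: squared distance from the object to the PCA subspace
    q = sum((v - f) ** 2 for v, f in zip(xc, fitted))
    # T2: squared scores scaled by the component variances
    t2 = sum(t ** 2 / ev for t, ev in zip(scores, eigenvalues))
    return q, t2


def is_member(q, t2, q_lim, t2_lim):
    """Accept the object only if both distances are within their limits."""
    return q <= q_lim and t2 <= t2_lim


# Toy one-component model for two variables (illustrative values only)
center = [0.0, 0.0]
loadings = [[1.0, 0.0]]
eigenvalues = [4.0]

q, t2 = distances([2.0, 1.0], center, loadings, eigenvalues)
accepted = is_member(q, t2, q_lim=2.0, t2_lim=3.0)  # hypothetical limits
```

An object far from the subspace (large \(Q\)) or far from the center within the subspace (large \(T^2\)) is rejected as a stranger, which is how the limits control the expected ratio of false negatives.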

The classification performance can be assessed using the numbers of true/false positives and negatives and the statistics derived from them, which show the ability of a classification model to recognize class members (sensitivity, or true positive rate) and to identify strangers (specificity, or true negative rate). In addition, the model calculates the percentage of misclassified objects. All statistics are calculated for calibration and validation (if any) results. However, specificity cannot be computed without objects that do not belong to the class; therefore, calibration and cross-validation results in SIMCA do not have specificity values.
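The statistics described above can be sketched as follows. This is a plain Python illustration rather than the package's implementation; the function name `class_stats` and the example vectors are hypothetical. It also shows why specificity is undefined when the data contain only class members, as is the case for a calibration set.

```python
def class_stats(is_class_member, accepted):
    """Compute sensitivity, specificity and misclassification rate.

    is_class_member -- reference labels (True if the object belongs to the class)
    accepted        -- model decisions (True if the object was accepted)
    """
    pairs = list(zip(is_class_member, accepted))
    tp = sum(1 for ref, acc in pairs if ref and acc)          # members accepted
    fn = sum(1 for ref, acc in pairs if ref and not acc)      # members rejected
    tn = sum(1 for ref, acc in pairs if not ref and not acc)  # strangers rejected
    fp = sum(1 for ref, acc in pairs if not ref and acc)      # strangers accepted

    return {
        # undefined (None) if there are no class members in the data
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # undefined (None) if there are no strangers in the data
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        # fraction of wrong decisions; multiply by 100 for a percentage
        "misclassified": (fp + fn) / len(pairs),
    }


# Test set with four class members and two strangers
test_stats = class_stats(
    [True, True, True, True, False, False],
    [True, True, True, False, False, True],
)

# Calibration set: class members only, so specificity is undefined
cal_stats = class_stats([True, True, True], [True, True, False])
```

With only class members available there are no true or false positives among strangers to count, so the specificity ratio has a zero denominator and is reported here as `None`.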

Note also that any SIMCA model or result is also a PCA object, so all plots, methods, and statistics available for PCA can be used for SIMCA objects as well.