Predictions for new data

Again, this is very similar to PLS: use the method predict() and provide at least a matrix or data frame with predictors, which should contain the same number of variables (columns) as the calibration set. For test set validation you can also provide reference class values, in the same form as the ones used for calibration of PLS-DA models.

For a multiple class model, the reference values should be provided as a factor or as a vector with class names as text values. Here is an example.

res = predict(m.all, Xv, cv.all)
summary(res)
## 
## PLS-DA results (class plsdares) summary:
## Number of selected components: 3
## 
## Class #1 (setosa)
##        X expvar X cumexpvar Y expvar Y cumexpvar TP FP TN FN Spec Sens
## Comp 1    92.92       92.92    47.01       47.01 25  1 49  0 0.98    1
## Comp 2     4.56       97.48    10.37       57.39 25  0 50  0 1.00    1
## Comp 3     1.79       99.27     1.59       58.97 25  0 50  0 1.00    1
## 
## 
## Class #2 (versicolor)
##        X expvar X cumexpvar Y expvar Y cumexpvar TP FP TN FN Spec Sens
## Comp 1    92.92       92.92    47.01       47.01  0  0 50 25 1.00  0.0
## Comp 2     4.56       97.48    10.37       57.39 10  4 46 15 0.92  0.4
## Comp 3     1.79       99.27     1.59       58.97 10  6 44 15 0.88  0.4
## 
## 
## Class #3 (virginica)
##        X expvar X cumexpvar Y expvar Y cumexpvar TP FP TN FN Spec Sens
## Comp 1    92.92       92.92    47.01       47.01 25  4 46  0 0.92 1.00
## Comp 2     4.56       97.48    10.37       57.39 25  4 46  0 0.92 1.00
## Comp 3     1.79       99.27     1.59       58.97 24  4 46  1 0.92 0.96

And the corresponding plot with predictions.

par(mfrow = c(1, 1))
plotPredictions(res, legend.position = 'topleft')
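The Sens and Spec columns in the summary above can be recomputed from the TP, FP, TN and FN counts. The sketch below uses base R only and assumes the standard definitions of sensitivity and specificity, which reproduce the values printed for the versicolor class. The counts are copied by hand from the summary and are not extracted from the result object.

```r
# Counts for class "versicolor" copied from the summary above,
# one element per number of components (1, 2, 3)
TP = c(0, 10, 10)
FP = c(0, 4, 6)
TN = c(50, 46, 44)
FN = c(25, 15, 15)

# sensitivity: share of class members that were recognised as members
sens = TP / (TP + FN)

# specificity: share of non-members that were correctly rejected
spec = TN / (TN + FP)

round(cbind(Spec = spec, Sens = sens), 2)
```

With two components, for example, 10 of the 25 versicolor objects are recognised, which gives the sensitivity of 0.4 shown in the table.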

If the vector with reference class values contains names of classes the model knows nothing about, the corresponding objects will simply be considered as members of none of the known classes (“None”).
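The effect on the reference values can be pictured with a small base R sketch. This is only a conceptual illustration of the rule described above, not the code the package uses internally. The label "hybrid" and the variable names are made up for the example.

```r
# class names known to a (hypothetical) multiple class model
known = c("setosa", "versicolor", "virginica")

# reference values for new objects; "hybrid" is a made-up unknown label
ref = c("setosa", "hybrid", "virginica", "hybrid")

# labels outside the known set are treated as "None"
ref.eff = ifelse(ref %in% known, ref, "None")
table(ref.eff)
```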

For a one-class model, the reference values can be either a factor or vector with class names, or logical values like the ones used for calibration of the model. Here is an example for each case.

res21 = predict(m.vir, Xv, cv.all)
summary(res21)
## 
## PLS-DA results (class plsdares) summary:
## Number of selected components: 3
## 
## Class #1 (virginica)
##        X expvar X cumexpvar Y expvar Y cumexpvar TP FP TN FN Spec Sens
## Comp 1    92.95       92.95    53.96       53.96 25  4 46  0 0.92 1.00
## Comp 2     1.62       94.57     6.10       60.06 24  4 46  1 0.92 0.96
## Comp 3     2.70       97.27    -0.15       59.91 22  4 46  3 0.92 0.88
res22 = predict(m.vir, Xv, cv.vir)
summary(res22)
## 
## PLS-DA results (class plsdares) summary:
## Number of selected components: 3
## 
## Class #1 (virginica)
##        X expvar X cumexpvar Y expvar Y cumexpvar TP FP TN FN Spec Sens
## Comp 1    92.95       92.95    53.96       53.96 25  4 46  0 0.92 1.00
## Comp 2     1.62       94.57     6.10       60.06 24  4 46  1 0.92 0.96
## Comp 3     2.70       97.27    -0.15       59.91 22  4 46  3 0.92 0.88

As you can see, the performance statistics are identical. However, the predictions plot looks a bit different for the two cases, as shown below.

par(mfrow = c(2, 1))
plotPredictions(res21, legend.position = 'topleft')
plotPredictions(res22, legend.position = 'topleft')
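The two forms of reference values compared above, class names and logical values, can be built in base R as sketched below. The split into a validation set is an assumption made for illustration only (the second half of each species in the iris data). The objects Xv, cv.all and cv.vir used in this section are defined elsewhere and may have been created differently.

```r
data(iris)

# hypothetical split: second half of each species as the validation set
val.ind = c(26:50, 76:100, 126:150)

# factor with class names
cv.all = iris[val.ind, "Species"]

# logical vector, TRUE for members of the single class of interest
cv.vir = cv.all == "virginica"

table(cv.all)
table(cv.vir)
```

Both vectors describe the same 75 objects. The logical one only says whether an object belongs to virginica, while the factor also keeps the names of the other classes.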

Because predict() returns an object with results, you can also use most of the plots available for PLS regression results. The last example below shows plots for X-residuals and Y-variance.

par(mfrow = c(1, 2))
plotXResiduals(res21)
plotYVariance(res22)