| Title: | Calculates Robust Performance Metrics for Imbalanced Classification Problems |
|---|---|
| Description: | Calculates robust Matthews Correlation Coefficient (MCC) and robust F-Beta Scores, as introduced by Holzmann and Klar (2024) <doi:10.48550/arXiv.2404.07661>. These performance metrics are designed for imbalanced classification problems. Plots the receiver operating characteristic curve (ROC curve) together with the recall / 1-precision curve. |
| Authors: | Bernhard Klar [aut, cre], Hajo Holzmann [aut] |
| Maintainer: | Bernhard Klar <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.1 |
| Built: | 2026-05-31 08:59:27 UTC |
| Source: | https://github.com/bernhardklar/robustmetrics |
Compute the F-Beta Score.
FScore(actual = NULL, predicted = NULL, TP = NULL, FN = NULL, FP = NULL, TN = NULL, beta = 1)

| Argument | Description |
|---|---|
| actual | A vector of actual values (1/0 or TRUE/FALSE) |
| predicted | A vector of predicted values (1/0 or TRUE/FALSE) |
| TP | Count of true positives (correctly predicted 1/TRUE) |
| FN | Count of false negatives (predicted 0/FALSE, but actually 1/TRUE) |
| FP | Count of false positives (predicted 1/TRUE, but actually 0/FALSE) |
| TN | Count of true negatives (correctly predicted 0/FALSE) |
| beta | Beta squared is the weight of recall in the harmonic mean |

Calculate the F-Beta Score. Provide either:
actual and predicted or
TP, FN, FP and TN.
F-Beta Score.
Holzmann, H., Klar, B. (2024). Robust performance metrics for imbalanced classification problems. arXiv:2404.07661.
actual <- c(1,1,1,1,1,1,0,0,0,0)
predicted <- c(1,1,1,1,0,0,1,0,0,0)
FScore(actual, predicted)
FScore(TP=4, FN=2, FP=1, TN=3)
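
To make the two input styles concrete, the following base R sketch derives the confusion counts from the example vectors and evaluates the textbook F-Beta formula on them. The helpers `confusion_counts` and `fbeta_from_counts` are hypothetical names introduced here for illustration; they are not part of the package, and the package's own `FScore` may handle edge cases differently.

```r
# Hypothetical helper (not from the package): confusion counts from
# 0/1 or TRUE/FALSE vectors of equal length.
confusion_counts <- function(actual, predicted) {
  actual <- as.logical(actual)
  predicted <- as.logical(predicted)
  stopifnot(length(actual) == length(predicted))
  c(TP = sum(actual & predicted),
    FN = sum(actual & !predicted),
    FP = sum(!actual & predicted),
    TN = sum(!actual & !predicted))
}

# Textbook F-Beta: (1 + beta^2) TP / ((1 + beta^2) TP + beta^2 FN + FP).
# TN does not enter the formula.
fbeta_from_counts <- function(TP, FN, FP, beta = 1) {
  b2 <- beta^2
  (1 + b2) * TP / ((1 + b2) * TP + b2 * FN + FP)
}

actual    <- c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 1, 0, 0, 1, 0, 0, 0)

counts <- confusion_counts(actual, predicted)  # TP = 4, FN = 2, FP = 1, TN = 3
f1 <- fbeta_from_counts(counts[["TP"]], counts[["FN"]], counts[["FP"]])
f1  # 8/11, about 0.727
```

The counts recovered from the vectors match the `TP=4, FN=2, FP=1, TN=3` call in the example, which is why both calls there describe the same confusion matrix.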
Compute Matthews correlation coefficient.
MCC(actual = NULL, predicted = NULL, TP = NULL, FN = NULL, FP = NULL, TN = NULL)

| Argument | Description |
|---|---|
| actual | A vector of actual values (1/0 or TRUE/FALSE) |
| predicted | A vector of predicted values (1/0 or TRUE/FALSE) |
| TP | Count of true positives (correctly predicted 1/TRUE) |
| FN | Count of false negatives (predicted 0/FALSE, but actually 1/TRUE) |
| FP | Count of false positives (predicted 1/TRUE, but actually 0/FALSE) |
| TN | Count of true negatives (correctly predicted 0/FALSE) |

Calculate Matthews correlation coefficient. Provide either:
actual and predicted or
TP, FN, FP and TN.
Matthews correlation coefficient.
Holzmann, H., Klar, B. (2024). Robust performance metrics for imbalanced classification problems. arXiv:2404.07661.
actual <- c(1,1,1,1,1,1,0,0,0,0)
predicted <- c(1,1,1,1,0,0,1,0,0,0)
MCC(actual, predicted)
MCC(TP=4, FN=2, FP=1, TN=3)
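
As a cross-check on the count-based call, here is a minimal base R sketch of the standard Matthews correlation coefficient formula. The function name `mcc_from_counts` and the choice to return `NA` when a margin of the confusion matrix is zero are assumptions made for this illustration; the package's `MCC` may treat that case differently.

```r
# Hypothetical helper (not from the package): standard MCC from counts,
# (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
mcc_from_counts <- function(TP, FN, FP, TN) {
  # Work in doubles so products of large counts do not overflow integers.
  TP <- as.numeric(TP); FN <- as.numeric(FN)
  FP <- as.numeric(FP); TN <- as.numeric(TN)
  denom <- sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
  # Assumed convention: undefined (NA) when any margin is zero.
  if (denom == 0) return(NA_real_)
  (TP * TN - FP * FN) / denom
}

mcc_example <- mcc_from_counts(TP = 4, FN = 2, FP = 1, TN = 3)
mcc_example  # 10 / sqrt(600), about 0.408
```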
This dataset contains example data from a Random Forest model.
rf.data
A data frame with 2 columns:
Actual values
Predicted probabilities
Full test data set using random forest classifier, see Section 6 in Reference.
Holzmann, H., Klar, B. (2024). Robust performance metrics for imbalanced classification problems. arXiv:2404.07661.
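
Because the dataset stores predicted probabilities while the score functions expect 0/1 or TRUE/FALSE predictions, the probabilities have to be thresholded first. The sketch below shows one way this might be done in base R. The data frame `toy`, its column names, and the cutoff of 0.5 are all assumptions chosen for illustration; they only mimic the two-column layout described above and are not values taken from `rf.data`.

```r
# Toy stand-in for a two-column data frame of actual values and
# predicted probabilities (values invented for illustration).
toy <- data.frame(
  actual = c(1, 0, 1, 0, 0, 1),
  prob   = c(0.80, 0.35, 0.45, 0.10, 0.60, 0.90)
)

cutoff <- 0.5  # assumed threshold; in practice this is a tuning choice

# Convert probabilities to 0/1 class predictions.
predicted_label <- as.integer(toy$prob >= cutoff)

# Cross-tabulate to inspect the resulting confusion matrix.
table(actual = toy$actual, predicted = predicted_label)
```

The resulting `predicted_label` vector has the form accepted by the `predicted` argument of the score functions in this package.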
Compute a robust version of the F-Beta Score.
robFScore(actual = NULL, predicted = NULL, TP = NULL, FN = NULL, FP = NULL, TN = NULL, beta = 1, d0 = 0.1)

| Argument | Description |
|---|---|
| actual | A vector of actual values (1/0 or TRUE/FALSE) |
| predicted | A vector of predicted values (1/0 or TRUE/FALSE) |
| TP | Count of true positives (correctly predicted 1/TRUE) |
| FN | Count of false negatives (predicted 0/FALSE, but actually 1/TRUE) |
| FP | Count of false positives (predicted 1/TRUE, but actually 0/FALSE) |
| TN | Count of true negatives (correctly predicted 0/FALSE) |
| beta | Beta squared is the weight of recall in the harmonic mean |
| d0 | Weight of the estimated true positive probability in the harmonic mean |

Calculate the robust F-Beta Score with two parameters.
Provide either:
actual and predicted or
TP, FN, FP and TN.
If , the robust F-Beta Score coincides with the F-Beta Score.
robust F-Beta Score.
Holzmann, H., Klar, B. (2024). Robust performance metrics for imbalanced classification problems. arXiv:2404.07661.
actual <- c(1,1,1,1,1,1,0,0,0,0)
predicted <- c(1,1,1,1,0,0,1,0,0,0)
robFScore(actual, predicted, beta=1, d0=0.1)
robFScore(TP=4, FN=2, FP=1, TN=3, beta=1, d0=1)
Compute a robust version of the F-Beta Score with two additional parameters.
robFScore2(actual = NULL, predicted = NULL, TP = NULL, FN = NULL, FP = NULL, TN = NULL, d1 = 1, d0 = 0.1, c = 1)

| Argument | Description |
|---|---|
| actual | A vector of actual values (1/0 or TRUE/FALSE) |
| predicted | A vector of predicted values (1/0 or TRUE/FALSE) |
| TP | Count of true positives (correctly predicted 1/TRUE) |
| FN | Count of false negatives (predicted 0/FALSE, but actually 1/TRUE) |
| FP | Count of false positives (predicted 1/TRUE, but actually 0/FALSE) |
| TN | Count of true negatives (correctly predicted 0/FALSE) |
| d1 | Weight of recall in the harmonic mean (corresponds to beta squared) |
| d0 | Weight of the estimated true positive probability in the harmonic mean |
| c | Additional parameter in the numerator |

Calculate the robust F-Beta Score with two additional parameters.
Provide either:
actual and predicted or
TP, FN, FP and TN.
If , the robust F-Beta Score coincides with the F-Beta Score.
robust F-Beta Score with two additional parameters.
Holzmann, H., Klar, B. (2024). Robust performance metrics for imbalanced classification problems. arXiv:2404.07661.
actual <- c(1,1,1,1,1,1,0,0,0,0)
predicted <- c(1,1,1,1,0,0,1,0,0,0)
robFScore2(actual, predicted, d0 = 0.1, c = 0.1)
robFScore2(TP=4, FN=2, FP=1, TN=3, d0 = 0.1, c = 1)
Compute a robust version of Matthews correlation coefficient (MCC).
robMCC(actual = NULL, predicted = NULL, TP = NULL, FN = NULL, FP = NULL, TN = NULL, d = 0.1)

| Argument | Description |
|---|---|
| actual | A vector of actual values (1/0 or TRUE/FALSE) |
| predicted | A vector of predicted values (1/0 or TRUE/FALSE) |
| TP | Count of true positives (correctly predicted 1/TRUE) |
| FN | Count of false negatives (predicted 0/FALSE, but actually 1/TRUE) |
| FP | Count of false positives (predicted 1/TRUE, but actually 0/FALSE) |
| TN | Count of true negatives (correctly predicted 0/FALSE) |
| d | Parameter of the robust MCC |

Calculate the robust MCC. Provide either:
actual and predicted or
TP, FN, FP and TN.
If , the robust MCC coincides with the MCC.
robust MCC.
Holzmann, H., Klar, B. (2024). Robust performance metrics for imbalanced classification problems. arXiv:2404.07661.
actual <- c(1,1,1,1,1,1,0,0,0,0)
predicted <- c(1,1,1,1,0,0,1,0,0,0)
robMCC(actual, predicted, d=0.05)
robMCC(TP=4, FN=2, FP=1, TN=3, d=0.05)
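
To see how the parameter `d` affects the result, one option is to evaluate `robMCC` over a small grid of values on a fixed confusion matrix. The sketch below relies only on the signature documented above. It assumes the package is installed under the name `robustmetrics` (inferred from the repository URL, not confirmed here) and that `robMCC` returns a single number; if the package is unavailable, the result is filled with `NA`. The grid itself is an arbitrary illustrative choice.

```r
# Illustrative grid of d values (an arbitrary choice for this sketch).
d_grid <- c(0.01, 0.05, 0.1, 0.5)

# Assumption: the package name is "robustmetrics".
if (requireNamespace("robustmetrics", quietly = TRUE)) {
  rob_scores <- sapply(d_grid, function(d) {
    robustmetrics::robMCC(TP = 4, FN = 2, FP = 1, TN = 3, d = d)
  })
} else {
  # Package not available: keep the shape of the result, but no values.
  rob_scores <- rep(NA_real_, length(d_grid))
}

names(rob_scores) <- paste0("d=", d_grid)
rob_scores
```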
Plot ROC curve together with recall / 1-precision curve.
ROC_curve(actual, predicted, d = c(0.01, 0.05, 0.1, 0.5))
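
| Argument | Description |
|---|---|
| actual | A vector of actual values (1/0 or TRUE/FALSE) |
| predicted | A vector of predicted probabilities (numeric values in [0, 1]) |
| d | A vector of length 4 giving the values of d for which points optimal with respect to the robust MCC are shown |
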
Instead of a precision-recall curve, a recall / 1-precision curve is plotted in the same coordinate system as the ROC curve.
Grey circles show the corresponding MCC optimal points; black symbols show points optimal with respect to the robust MCC for different values of d.
ROC curve.
Holzmann, H., Klar, B. (2024). Robust performance metrics for imbalanced classification problems. arXiv:2404.07661.
actual <- rf.data[, 1]
predicted <- rf.data[, 2]
ROC_curve(actual, predicted, d=c(0.01,0.02,0.1,0.5))
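
The idea of drawing both curves in one coordinate system can be sketched in base R: for each threshold, recall is plotted once against the false positive rate (the ROC curve) and once against 1-precision. This is an illustration under stated assumptions, not the package's implementation of `ROC_curve`; the helper `curve_points` and the toy data are hypothetical, and the MCC-optimal markers described above are not reproduced.

```r
# Hypothetical helper: one row per threshold, with the coordinates
# needed for the ROC curve and the recall / 1-precision curve.
curve_points <- function(actual, prob) {
  actual <- as.logical(actual)
  # Use observed probabilities as thresholds, so every threshold
  # yields at least one positive prediction (TP + FP >= 1).
  thresholds <- sort(unique(prob), decreasing = TRUE)
  t(sapply(thresholds, function(th) {
    pred <- prob >= th
    TP <- sum(actual & pred)
    FP <- sum(!actual & pred)
    c(threshold = th,
      recall = TP / sum(actual),              # y-axis of both curves
      fpr = FP / sum(!actual),                # x-axis of the ROC curve
      one_minus_precision = FP / (TP + FP))   # x-axis of the second curve
  }))
}

# Toy data invented for illustration.
actual <- c(1, 0, 1, 0, 0, 1)
prob   <- c(0.80, 0.35, 0.45, 0.10, 0.60, 0.90)
pts <- curve_points(actual, prob)

# Both curves share the recall axis and the unit square.
plot(pts[, "fpr"], pts[, "recall"], type = "b", xlim = c(0, 1), ylim = c(0, 1),
     xlab = "FPR / 1-precision", ylab = "Recall")
lines(pts[, "one_minus_precision"], pts[, "recall"], type = "b", lty = 2)
```

Since 1-precision and the false positive rate are both zero when there are no false positives, the two curves share their starting segment; they diverge as false positives accumulate.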