OrdinalForest Classification

Definition

OrdinalForest (HOF, Hierarchical OrdinalForest) is a random-forest-based machine learning algorithm that explicitly incorporates the ordinal structure of the class labels into its optimization criterion. Unlike a standard Random Forest, which treats a multi-class outcome as nominal, it does not implicitly assume that all class boundaries are equivalent.

Used in el-balkhi-2025 for multi-class staging of chronic liver disease from HSA isoform spectral profiles.

Why ordinal classification for CLD staging?

Liver fibrosis stages (F0/F1, F2/F3, F4_A, F4_B, F4_C) are ordered — a misclassification of F4_B as F4_A is less costly than misclassifying F4_B as F0/F1. Standard accuracy metrics penalize all errors equally and are therefore misleading for staged disease.

Quadratic Weighted Kappa (QWK) is the appropriate primary metric: it penalizes each prediction in proportion to the square of its ordinal distance from the true class.
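To make the metric concrete, here is a minimal Python sketch of QWK with the usual quadratic weights w_ij = (i - j)^2 / (K - 1)^2. It is an illustration only, not the study's code (the study used R, see Software). The integer coding of the five stages and the toy label vectors are assumptions made for the example.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """QWK for ordinal labels coded as integers 0 .. n_classes - 1."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")

    # observed[i][j] = number of cases with true class i predicted as class j
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    row_tot = [sum(row) for row in observed]
    col_tot = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    observed_cost = 0.0
    expected_cost = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            observed_cost += weight * observed[i][j]
            # Count expected under chance agreement (independent marginals)
            expected_cost += weight * row_tot[i] * col_tot[j] / n

    if expected_cost == 0:
        # Every label and every prediction fall in one class: kappa is undefined
        return float("nan")
    return 1.0 - observed_cost / expected_cost


# Hypothetical integer coding of the five stages listed above.
STAGES = ["F0/F1", "F2/F3", "F4_A", "F4_B", "F4_C"]
y_true = [0, 1, 2, 3, 4, 3]
near_miss = [0, 1, 2, 3, 4, 2]  # last case: F4_B predicted as F4_A
far_miss = [0, 1, 2, 3, 4, 0]   # last case: F4_B predicted as F0/F1

print(quadratic_weighted_kappa(y_true, near_miss, len(STAGES)))
print(quadratic_weighted_kappa(y_true, far_miss, len(STAGES)))
```

Both toy predictions contain exactly one error, so plain accuracy is identical, but the near miss scores a higher QWK than the far miss.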

Three architectures evaluated in ALBOM

Model	Description	Key property
RF	Standard Random Forest, ordinal outcome treated as factor	Ignores ordinal structure
HRF	Hierarchical Random Forest — sequential binary decomposition of 6 classes	Captures hierarchy, not ordinality
HOF	Hierarchical OrdinalForest — incorporates ordinal loss into optimization	Best: explicit ordinal structure + hierarchy

HOF was selected as the final model.

Implementation in ALBOM study

Input features

Spectral features: 75 selected from the full albumin spectral region (m/z 66,000–68,000 Da) by permutation-importance feature selection
Clinical features: 4 routine variables (total protein, serum albumin, INR, total bilirubin)
Combined model: “LC-TOF + Clinical” — 75 spectral + 4 clinical

Feature selection pipeline

Full normalized feature matrix → initial Random Forest fit
Permutation importance scoring
5-fold cross-validated QWK over grid of k values (20, 30, 40, 50, 75, 100)
Optimal: k = 75 (highest cross-validated QWK without overfitting)
Applied exclusively within training partition to prevent data leakage
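The control flow of this pipeline can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's R code: `rank_features` and `fit_and_score` are hypothetical callables standing in for the permutation-importance ranking and the cross-validated QWK evaluation, the folds are not stratified, and the rule "highest mean score wins" is a simplification of the selection criterion described above.

```python
import random
from statistics import mean


def make_folds(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def choose_k(n_samples, k_grid, rank_features, fit_and_score, n_folds=5, seed=0):
    """Pick the feature count k with the best mean cross-validated score.

    rank_features(train_idx) -> feature ids, most important first (hypothetical)
    fit_and_score(train_idx, valid_idx, features) -> score, e.g. QWK (hypothetical)
    """
    fold_scores = {k: [] for k in k_grid}
    for held_out in make_folds(n_samples, n_folds, seed):
        held = set(held_out)
        train_idx = [i for i in range(n_samples) if i not in held]
        # Ranking uses training rows only, so held-out rows never inform selection
        ranked = rank_features(train_idx)
        for k in k_grid:
            fold_scores[k].append(fit_and_score(train_idx, held_out, ranked[:k]))
    cv_score = {k: mean(scores) for k, scores in fold_scores.items()}
    best_k = max(cv_score, key=cv_score.get)
    return best_k, cv_score


# Toy stand-ins so the sketch runs; a real pipeline would fit forests here.
def toy_rank(train_idx):
    return list(range(120))


def toy_score(train_idx, valid_idx, features):
    return 1.0 - abs(len(features) - 75) / 100.0


best_k, cv_score = choose_k(200, [20, 30, 40, 50, 75, 100], toy_rank, toy_score)
print(best_k, cv_score)
```

The point of the structure is that feature ranking happens inside each fold, which is what keeps the selection free of leakage from the held-out rows.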

Preprocessing

Total Ion Current (TIC) normalization: corrects for inter-instrument intensity differences
Probabilistic Quotient Normalization (PQN): corrects for dilution effects
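The two normalizations can be sketched in plain Python as below. This is a textbook-style illustration, not the study's implementation. Using the feature-wise median spectrum as the PQN reference and applying TIC before PQN are both assumptions. In a train/test setting the reference would normally be computed from training spectra only and passed in through `reference`.

```python
from statistics import median


def tic_normalize(spectra):
    """Divide each spectrum by its total intensity so that every row sums to 1."""
    out = []
    for row in spectra:
        total = sum(row)
        if total <= 0:
            raise ValueError("spectrum has non-positive total ion current")
        out.append([v / total for v in row])
    return out


def pqn_normalize(spectra, reference=None):
    """Probabilistic quotient normalization.

    Each spectrum is divided by the median of its feature-wise quotients
    against a reference spectrum (default: feature-wise median of `spectra`).
    """
    n_features = len(spectra[0])
    if reference is None:
        reference = [median(row[j] for row in spectra) for j in range(n_features)]
    out = []
    for row in spectra:
        # Features with a zero reference intensity carry no quotient information
        quotients = [v / r for v, r in zip(row, reference) if r > 0]
        factor = median(quotients)
        if factor <= 0:
            raise ValueError("non-positive dilution factor")
        out.append([v / factor for v in row])
    return out


# Toy intensities: the second spectrum is a 2x scaled copy of the first.
raw = [[10.0, 20.0, 30.0, 40.0],
       [20.0, 40.0, 60.0, 80.0],
       [5.0, 10.0, 15.0, 25.0]]
normalized = pqn_normalize(tic_normalize(raw))
print(normalized)
```

In the toy data the first two spectra differ only by an overall scale factor, so they become identical after normalization.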

Train/test split

80% stratified training / 20% held-out test (stratified by fibrosis class)
Same pipeline applied independently to Platform 1 and Platform 2 data
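A stratified 80/20 split of this kind can be sketched in Python as follows. It is a minimal illustration with hypothetical class counts, not the study's code. Note the assumption that every class contributes at least one test sample, so a class with a single member would end up entirely in the test set.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with about test_fraction of each class held out."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)

    train_idx, test_idx = [], []
    for lab in sorted(by_class):
        idx = by_class[lab]
        rng.shuffle(idx)
        # At least one test sample per class (an assumption of this sketch)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical class counts, for illustration only.
labels = ["Control"] * 50 + ["F2/F3"] * 30 + ["F4_C"] * 20
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps the class proportions of the test set close to those of the full cohort, which matters when some stages have few cases.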

Software

R v4.4.2, packages: ordinalForest, ranger, yardstick; RStudio 2025.05.1+513

Performance (ALBOM study)

Platform	n_test	QWK	95% CI (bootstrap, 1000 iter.)
Bruker timsTOF Pro2 (P1)	46	0.862	0.735–0.923
Sciex TripleTOF 5600+ (P2)	49	0.916	0.822–0.964
FIB-4 (comparator)	—	0.188–0.229	—

Secondary metric: balanced accuracy (reported to account for class imbalance).
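As an illustration of how such figures are typically obtained, the Python sketch below computes balanced accuracy (the mean of per-class recall) and a bootstrap confidence interval for any metric passed in as a function. The percentile method and the toy labels are assumptions; the source states only that 1000 bootstrap iterations were used. A QWK function could be supplied as `metric` in the same way.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct, total = {}, {}
    for t, p in zip(y_true, y_pred):
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    return sum(correct[c] / total[c] for c in total) / len(total)


def bootstrap_ci(y_true, y_pred, metric, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[min(n_iter - 1, int((1 - alpha / 2) * n_iter))]
    return lower, upper


# Toy labels, for illustration only (not study data).
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 1, 2]
print(balanced_accuracy(y_true, y_pred))
print(bootstrap_ci(y_true, y_pred, balanced_accuracy))
```

Because each class contributes equally to the average, a rare class that is poorly classified lowers balanced accuracy more than it lowers plain accuracy.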

Confusion matrix highlights (Platform 1)

Control class: 12/15 correctly classified
F2/F3 class: 8/10
F4_C class: 4/4 (perfect)
F4_A class: 2/7 correctly classified; the most frequently misclassified class (cases assigned to F2/F3 or F4_B), reflecting the biological overlap of compensated cirrhosis

Feature importance findings (from Fig S3, Supplemental data)

Top 4 features: clinical variables — total protein, routine serum albumin, INR, total bilirubin.

Spectral albumin peaks cluster in two sub-regions:

~66,230–66,600 Da — native HSA and cysteinylated isoforms (HSA+CYS, HSA+CYS+GLYC range); early-to-mid disease signal
~67,024–67,457 Da — poly-glycated albumin adducts (HSA+2GLYC, HSA+CYS+2GLYC range); advanced/end-stage disease signal

This bimodal spectral importance pattern is consistent with the 3-pattern biological model: the cysteinylated region captures the biphasic early-to-middle disease signal, while the poly-glycated region captures the monotonically increasing end-stage signal.

Cross-platform equivalence

Same pipeline applied independently to both platforms; cross-platform agreement assessed by:

McNemar’s test on paired predictions: p = 0.149 → no significant difference in classification decisions
Jaccard Similarity Index of error matrices: 0.696 (above the 0.5 threshold, which suggests that errors are biologically driven rather than instrument-specific)
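Both agreement measures can be sketched in Python as below. Several choices here are assumptions, since the source does not give the exact procedure: the exact (binomial) form of McNemar's test, pairing by the same samples scored on both platforms, and defining the Jaccard index over the sets of (true class, predicted class) error cells that occur on each platform.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value from paired correct/incorrect flags."""
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    n = only_a + only_b
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    # Binomial(n, 0.5) tail probability at the smaller discordant count
    tail = sum(comb(n, i) for i in range(min(only_a, only_b) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def error_cells(y_true, y_pred):
    """Set of (true, predicted) pairs that occur as misclassifications."""
    return {(t, p) for t, p in zip(y_true, y_pred) if t != p}


def jaccard(set_a, set_b):
    """Size of the intersection divided by size of the union."""
    union = set_a | set_b
    if not union:
        return 1.0  # both sets empty: treated as identical
    return len(set_a & set_b) / len(union)


# Toy paired predictions for two platforms (not study data).
y_true = [0, 1, 2, 2, 3, 3, 4]
pred_p1 = [0, 1, 1, 3, 3, 3, 4]
pred_p2 = [0, 1, 2, 3, 4, 3, 4]

p_value = mcnemar_exact([t == p for t, p in zip(y_true, pred_p1)],
                        [t == p for t, p in zip(y_true, pred_p2)])
similarity = jaccard(error_cells(y_true, pred_p1), error_cells(y_true, pred_p2))
print(p_value, similarity)
```

McNemar's test looks only at the discordant pairs, the cases that one platform classifies correctly and the other does not. The Jaccard index asks whether the two platforms make the same kinds of errors.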

Generalizability notes

⚠️ Training and test sets come from the same single-center cohort, so the reported performance reflects internal validation only; external validation (MALAHBAR, NCT06318949) is underway
Model currently requires LC-HR-MS input — not directly deployable to simpler assay formats
The 75-feature spectral approach could in principle be translated to targeted LC-MRM-MS if isoform ratios are validated as the key predictors

Key references

el-balkhi-2025 — primary application in CLD staging
R package: ordinalForest (Hornung R; CRAN)
