Unveiling the transferability of PLSR models for leaf trait estimation: lessons from a comprehensive analysis with a novel global dataset

New Phytol. 2024 May 6. doi: 10.1111/nph.19807. Online ahead of print.

Abstract

Leaf traits are essential for understanding many physiological and ecological processes. Partial least squares regression (PLSR) models with leaf spectroscopy are widely applied for trait estimation, but their transferability across space, time, and plant functional types (PFTs) remains unclear. We compiled a novel dataset of paired leaf traits and spectra, with 47 393 records for > 700 species and eight PFTs at 101 globally distributed locations across multiple seasons. Using this dataset, we conducted an unprecedented comprehensive analysis to assess the transferability of PLSR models in estimating leaf traits. While PLSR models perform well in predicting chlorophyll content, carotenoid, leaf water, and leaf mass per area within their training data space, their efficacy diminishes when extrapolating to new contexts. Specifically, extrapolating to locations, seasons, and PFTs beyond the training data leads to reduced R² (0.12-0.49, 0.15-0.42, and 0.25-0.56, respectively) and increased NRMSE (3.58-18.24%, 6.27-11.55%, and 7.0-33.12%, respectively) compared with nonspatial random cross-validation. The results underscore the importance of incorporating greater spectral diversity in model training to boost model transferability. These findings highlight potential errors in estimating leaf traits across large spatial domains, diverse PFTs, and time due to biased validation schemes, and provide guidance for future field sampling strategies and remote sensing applications.

Keywords: cross‐validation; leaf spectroscopy; leaf traits; partial least squares regression; transferability.