Unveiling the transferability of PLSR models for leaf trait estimation: lessons from a comprehensive analysis with a novel global dataset

Fujiang Ji; Fa Li; Dalei Hao; Alexey N Shiklomanov; Xi Yang; Philip A Townsend; Hamid Dashti; Tatsuro Nakaji; Kyle R Kovach; Haoran Liu; Meng Luo; Min Chen

doi:10.1111/nph.19807

New Phytol. 2024 May 6. doi: 10.1111/nph.19807. Online ahead of print.

Authors

Fujiang Ji^#¹, Fa Li^#¹, Dalei Hao², Alexey N Shiklomanov³, Xi Yang⁴, Philip A Townsend¹, Hamid Dashti¹, Tatsuro Nakaji⁵, Kyle R Kovach¹, Haoran Liu¹, Meng Luo¹, Min Chen¹,⁶

Affiliations

¹ Department of Forest and Wildlife Ecology, University of Wisconsin-Madison, 1630 Linden Dr., Madison, WI, 53706, USA.
² Atmospheric, Climate, & Earth Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99354, USA.
³ NASA Goddard Space Flight Center, 8800 Greenbelt Road, Mail code: 610.1, Greenbelt, MD, 20771, USA.
⁴ Department of Environmental Sciences, University of Virginia, 291 McCormick Road, Charlottesville, VA, 22904, USA.
⁵ Uryu Experimental Forest, Hokkaido University, Moshiri, Horokanai, Hokkaido, 074-0741, Japan.
⁶ Data Science Institute, University of Wisconsin-Madison, 447 Lorch Ct, Madison, 53706, WI, USA.

^# Contributed equally.

PMID: 38708434
DOI: 10.1111/nph.19807

Abstract

Leaf traits are essential for understanding many physiological and ecological processes. Partial least squares regression (PLSR) models with leaf spectroscopy are widely applied for trait estimation, but their transferability across space, time, and plant functional types (PFTs) remains unclear. We compiled a novel dataset of paired leaf traits and spectra, with 47 393 records for > 700 species and eight PFTs at 101 globally distributed locations across multiple seasons. Using this dataset, we conducted an unprecedented comprehensive analysis to assess the transferability of PLSR models in estimating leaf traits. While PLSR models perform well in predicting chlorophyll content, carotenoid, leaf water, and leaf mass per area within their training data space, their efficacy diminishes when extrapolating to new contexts. Specifically, extrapolating to locations, seasons, and PFTs beyond the training data leads to reduced R² (0.12-0.49, 0.15-0.42, and 0.25-0.56, respectively) and increased NRMSE (3.58-18.24%, 6.27-11.55%, and 7.0-33.12%, respectively) compared with nonspatial random cross-validation. The results underscore the importance of incorporating greater spectral diversity in model training to boost model transferability. These findings highlight potential errors in estimating leaf traits across large spatial domains, diverse PFTs, and time due to biased validation schemes, and provide guidance for future field sampling strategies and remote sensing applications.

Keywords: cross‐validation; leaf spectroscopy; leaf traits; partial least squares regression; transferability.
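The contrast the abstract draws between random cross-validation and extrapolation to unseen locations, seasons, or PFTs can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the grouping labels stand in for any of the three transfer dimensions, and NRMSE is assumed here to be RMSE divided by the range of the observed values, since the abstract does not state which normalization was used. The PLSR model itself is left out; any regressor would be fitted on the training indices and scored on the held-out ones.

```python
from collections import defaultdict
from math import sqrt


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx) for each distinct label.

    `groups` holds one label per sample (e.g. a site, season, or PFT name),
    so every test fold contains only samples from a group unseen in training.
    """
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in by_group.items():
        train_idx = [i for i in range(len(groups)) if groups[i] != g]
        yield g, train_idx, test_idx


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def nrmse_percent(observed, predicted):
    """RMSE as a percentage of the observed range (assumed normalization)."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return 100 * sqrt(mse) / (max(observed) - min(observed))


# Hypothetical labels: five samples from three sites.
sites = ["A", "A", "B", "C", "B"]
for site, train_idx, test_idx in leave_one_group_out(sites):
    print(site, train_idx, test_idx)
```

Scoring the same model once on random folds and once on these grouped folds, and comparing R² and NRMSE between the two, gives the kind of gap the abstract reports.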
