A New Model Evaluation Framework for the International Land Model Benchmarking (ILAMB) Project

Forrest M. Hoffman and James T. Randerson

Anthropogenic perturbation of the global carbon cycle is expected to induce feedbacks on the climate system and future atmospheric CO2 concentrations. The need to reduce the range of uncertainty in climate predictions has motivated a growing number of site, regional, and global model-data intercomparison projects that employ terrestrial biosphere models, including the Carbon-Land Model Intercomparison Project (C-LAMP), the North American Carbon Program (NACP) interim synthesis projects, the Large Biosphere-Atmosphere Data-Model Intercomparison Project (LBA-DMIP), and the Multi-scale Synthesis and Intercomparison Project (MsTMIP). Such activities are difficult to carry out and time-intensive. In particular, developing the infrastructure needed for meaningful model-data comparisons is costly, even when the data are freely and easily available. Moreover, the development of sophisticated model diagnostics packages that can exploit the richness of large Earth System data sets, from satellite to site-scale measurements, is typically outside the scope of any single modeling center. To leverage the efforts of current model-data intercomparison projects and support continued model improvements from a rapidly increasing body of observational data, a unified model evaluation system that implements a set of internationally agreed-upon model benchmarks is required. The International Land Model Benchmarking (ILAMB) Project was established to develop such benchmarks for land model performance, promote the use of these benchmarks for model-data intercomparison, strengthen linkages among the experimental, remote sensing, and climate modeling communities, and support the design and development of a new, open-source benchmarking software system for use by the international community.
We will present a prototype architecture for such a land-biosphere model benchmarking framework, based on freely available software that members of the wider modeling community can share and modify. Inspired by the C-LAMP diagnostics (Randerson et al., 2009), this new system will support model-data metrics and model skill scoring profiles using the best available observational data sets.
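
To make the idea of model-data metrics and skill scoring profiles concrete, the following is a minimal Python sketch. It is not the ILAMB or C-LAMP implementation. The function names (`bias`, `rmse`, `skill_score`, `score_profile`), the exponential mapping from error to score, the choice of the observational standard deviation as the normalizing scale, and the toy input values are all assumptions made for illustration only.

```python
import math


def _check_paired(model, obs):
    """Raise if the two series cannot be compared point by point."""
    if len(obs) == 0 or len(model) != len(obs):
        raise ValueError("model and obs must be non-empty and of equal length")


def bias(model, obs):
    """Mean difference between paired model and observed values."""
    _check_paired(model, obs)
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def rmse(model, obs):
    """Root-mean-square error between paired model and observed values."""
    _check_paired(model, obs)
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def skill_score(error, scale):
    """Map an error to a score in (0, 1], where 1 is a perfect match.

    The exponential form is a hypothetical choice for this sketch.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return math.exp(-abs(error) / scale)


def score_profile(models, obs):
    """Return {model name: {metric name: score}} for one observational data set."""
    if len(obs) == 0:
        raise ValueError("obs must be non-empty")
    mean_obs = sum(obs) / len(obs)
    # Assumption: errors are normalized by the observational standard deviation.
    obs_std = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    profile = {}
    for name, values in models.items():
        profile[name] = {
            "bias_score": skill_score(bias(values, obs), obs_std),
            "rmse_score": skill_score(rmse(values, obs), obs_std),
        }
    return profile


# Toy values for illustration only; these are not real observations or model output.
obs = [1.0, 2.0, 3.0, 4.0]
models = {
    "model_a": [1.0, 2.0, 3.0, 4.0],  # matches the observations exactly
    "model_b": [2.0, 3.0, 4.0, 5.0],  # uniformly one unit too high
}
profile = score_profile(models, obs)
```

In this sketch each model receives one score per metric, so a set of models evaluated against several data sets would yield a table of scores, which is one plausible reading of a "skill scoring profile". A real benchmarking system would also need to handle spatial and temporal regridding, missing data, and observational uncertainty, none of which is attempted here.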