Forrest Hoffman*, William Hargrove*, and Anthony Del Genio**
*Oak Ridge National Laboratory
**NASA Goddard Institute for Space Studies
Clustering is a statistical procedure which provides an objective method for classifying or grouping multivariate conditions or observations into a set of states or regimes within a given level of tolerance or statistical variance. The groups or clusters--statistically defined across space and through time--possess centroids which represent the mean or synoptic conditions of observations contained in each state no matter when or where they occurred. Although not used by the clustering procedure, the spatial and temporal coordinates are retained so that classified observations may be mapped back to the appropriate location and time for display and analysis.
Within this context, the behavior of the atmosphere can be thought of as a time series of recurring cloud and radiation "states" contained within a highly multivariate state space. In this N-dimensional state space, cloud and radiation parameters serve as axes, and measurements of these parameters are used as coordinates to locate each observation. The time series of successive observations for a single geographic location creates a trajectory in state space. Clustering simplifies the analysis by classifying all observations into a set of states with approximately equal variance in state space. Clustering model results and observations together generates a set of states shared by both, providing a common basis for intercomparison.
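The idea of classifying pooled model and observation records into shared states can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain k-means procedure with a deterministic farthest-first seeding, per-variable standardization, and a tiny hypothetical data set (`records`, with made-up site names and two variables). The source label, location, and time of each record are carried alongside the values but are not used by the clustering itself.

```python
import math

def standardize(rows):
    """Scale each variable to zero mean and unit variance so no axis dominates."""
    n = len(rows)
    dims = range(len(rows[0]))
    means = [sum(r[d] for r in rows) / n for d in dims]
    stds = [math.sqrt(sum((r[d] - means[d]) ** 2 for r in rows) / n) or 1.0
            for d in dims]
    return [[(r[d] - means[d]) / stds[d] for d in dims] for r in rows]

def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: math.dist(point, centroids[k]))

def kmeans(points, k, max_iter=100):
    # Assumption: farthest-first seeding, chosen here only to keep the
    # example deterministic; it is not stated in the text above.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical records: (source, location, time step, (variable 1, variable 2)).
records = [
    ("model", "siteA", 0, (0.0, 0.1)),
    ("model", "siteA", 1, (0.1, 0.0)),
    ("obs",   "siteA", 0, (0.05, 0.05)),
    ("model", "siteB", 0, (5.0, 5.1)),
    ("obs",   "siteB", 0, (5.1, 5.0)),
    ("obs",   "siteB", 1, (4.9, 5.0)),
]

# Cluster on the values only; space and time are not inputs.
points = standardize([values for _, _, _, values in records])
centroids, labels = kmeans(points, k=2)

# Map each state label back to its source, location, and time.
classified = [(src, site, t, lab)
              for (src, site, t, _), lab in zip(records, labels)]
```

Because model and observation records are clustered in one pass, both end up labeled with the same set of states, so their state sequences at a given location can be compared directly.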
As a preliminary proof of concept, the clustering technique was applied to five business-as-usual (BAU) scenarios (or ensemble members) from the Parallel Climate Model (PCM). Three fields of significance to the global water cycle (surface temperature, precipitation, and soil moisture) were clustered from 1871 through 2098. Our analysis shows an increase in the spatial area occupied by the cluster, or climate regime, that typifies desert regions (i.e., an increase in desertification) and a decrease in the spatial area occupied by the climate regime typifying wintertime high-latitude permafrost regions. The patterns of cluster changes can be used to predict variability in the water cycle on global and continental scales.
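Tracking the spatial area occupied by each climate regime through time can be sketched as follows. This is an illustration under stated assumptions, not the analysis used for the PCM results: it assumes a regular latitude-longitude grid, so each cell is weighted by the cosine of its latitude, and the function name `area_fraction_by_year` and the toy `cells` input are hypothetical.

```python
import math
from collections import defaultdict

def area_fraction_by_year(cells):
    """Fraction of total area occupied by each cluster in each year.

    cells: iterable of (year, latitude in degrees, cluster id).
    Assumption: cells lie on a regular lat-lon grid, so relative cell
    area is proportional to cos(latitude).
    """
    totals = defaultdict(float)
    by_cluster = defaultdict(lambda: defaultdict(float))
    for year, lat, cluster in cells:
        weight = math.cos(math.radians(lat))
        totals[year] += weight
        by_cluster[year][cluster] += weight
    return {
        year: {c: w / totals[year] for c, w in clusters.items()}
        for year, clusters in by_cluster.items()
    }

# Toy input: two grid cells per year, with arbitrary cluster ids.
cells = [
    (1, 0.0, 0), (1, 60.0, 1),
    (2, 0.0, 0), (2, 60.0, 0),
]
fractions = area_fraction_by_year(cells)
```

Comparing the fraction for one cluster id across years shows whether that regime is expanding or contracting.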
In a forthcoming ARM research project, we will use this clustering approach to compare model results with ARM's long time-series observations in an effort to better understand cloud processes and feedbacks. Additionally, by defining a global atmospheric state space using reanalysis or model data and analyzing the portion of that space occupied by ARM observations, we will gain insight into the representativeness of ARM observations. This analysis will identify locations and times where new measurements are needed for model validation.
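One simple way to quantify how much of a global state space a set of site observations occupies is sketched below. It assumes that both the global data and the site observations have already been labeled with a common set of cluster ids. The function name `representativeness`, the coverage measure, and the toy labels are hypothetical choices for illustration, not the method of the planned project.

```python
from collections import Counter

def representativeness(global_labels, site_labels):
    """Return (coverage, unsampled states).

    coverage: fraction of global records whose state is also visited
    by at least one site observation.
    unsampled: sorted global states never visited by the site data.
    """
    global_counts = Counter(global_labels)
    sampled = set(site_labels) & set(global_counts)
    total = sum(global_counts.values())
    covered = sum(global_counts[state] for state in sampled)
    unsampled = sorted(set(global_counts) - sampled)
    return covered / total, unsampled

# Toy labels: ten global records in four states, three site observations.
global_labels = [0, 0, 0, 1, 1, 2, 2, 2, 2, 3]
site_labels = [0, 0, 2]
coverage, unsampled = representativeness(global_labels, site_labels)
```

States listed in `unsampled` correspond to conditions that occur globally but are absent from the site record, which is where additional measurements would be most informative.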