On the systematic reduction of data complexity in multi-model atmospheric dispersion ensemble modeling

Riccio, Angelo; Ciaramella, Angelo; Giunta, Giulio; Galmarini, S.; Solazzo, E.; Potemski, S.

doi:10.1029/2011JD016503

The aim of this work is to explore the effectiveness of information-theoretic approaches for reducing data complexity in multimodel ensemble systems. We first exploit a weak form of independence, i.e., uncorrelatedness, as a mechanism for detecting linear relationships. Then, stronger and more general measures of dependence, such as mutual information, are used to investigate dependence structures for model selection. A distance matrix, measuring the interdependence between data, is derived for each of the investigated measures, with the aim of clustering correlated or dependent models together. Redundant information is discarded by selecting a few representative models from each cluster. We apply the clustering analysis in the context of atmospheric dispersion modeling, using the ETEX-1 data set. We show that selecting a small subset of models, according to uncorrelation or mutual information distance criteria, usually suffices to achieve a statistical performance comparable to, or even better than, that achieved by the whole ensemble data set, thus providing a simpler description of ensemble results without sacrificing accuracy.
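
The selection procedure described in the abstract (distance matrix, clustering, one representative per cluster) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the distance d_ij = 1 - |r_ij| built from the Pearson correlation, the average-linkage clustering with a fixed threshold, the choice of the cluster medoid as representative, and the synthetic series standing in for ensemble output. A mutual-information-based distance could be substituted for `correlation_distance` without changing the rest.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(models):
    """Pairwise distance d_ij = 1 - |r_ij| (an assumed form, not the paper's)."""
    m = len(models)
    return [[0.0 if i == j else 1.0 - abs(pearson(models[i], models[j]))
             for j in range(m)] for i in range(m)]


def cluster(dist, threshold):
    """Average-linkage agglomerative clustering, stopped once the closest
    pair of clusters is farther apart than `threshold`."""
    clusters = [[i] for i in range(len(dist))]

    def avg(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, p, q = min((avg(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        if d > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


def representatives(dist, clusters):
    """Pick each cluster's medoid: the member closest, in total, to its peers."""
    return [min(c, key=lambda i: sum(dist[i][j] for j in c)) for c in clusters]


# Synthetic stand-in for ensemble output (hypothetical data):
# two unrelated signals, each reproduced by three noisy "models".
rng = random.Random(0)
n = 200
base_a = [math.sin(0.1 * t) for t in range(n)]
base_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
models = [[v + rng.gauss(0.0, 0.1) for v in base]
          for base in (base_a, base_a, base_a, base_b, base_b, base_b)]

dist = correlation_distance(models)
clusters = cluster(dist, threshold=0.5)
selected = representatives(dist, clusters)
print("clusters:", clusters)
print("selected models:", selected)
```

In this toy setting the six series collapse into two clusters, and keeping one member of each retains the two independent signals while discarding the redundant copies. The threshold of 0.5 is arbitrary; in practice it would have to be tuned against the data.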