Information-theoretic approaches for models selection in multi-model ensemble atmospheric dispersion predictions
RICCIO, Angelo; CIARAMELLA, Angelo
2013-01-01
Abstract
In this work we explore the effectiveness of information-theoretic approaches, based on uncorrelation and mutual information, for reducing data complexity in multi-model ensemble systems. A distance matrix measuring the inter-dependence between data is derived, with the aim of clustering correlated/dependent models together and selecting a few representative models from each cluster. We test how well these distance measures detect dependence structures and select “representative models”. We apply this analysis to atmospheric dispersion modeling, using the ETEX-1 dataset and the data made available by the more recent AQMEII (Air Quality Model Evaluation International Initiative) efforts. We show that selecting a small subset of model data, according to uncorrelation or mutual information distance criteria, usually suffices to achieve a statistical performance comparable to, or even better than, that achieved with the whole ensemble dataset, thus providing a simpler description of ensemble results without sacrificing accuracy.
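The workflow outlined in the abstract (distance matrix, clustering, one representative per cluster) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the distance is taken as 1 - |r| (with r the Pearson correlation), clustering is plain average-linkage agglomeration, the representative is the cluster medoid, and the data are synthetic. The function names are hypothetical. A mutual-information distance could be substituted for `correlation_distance` without changing the rest.

```python
import numpy as np


def correlation_distance(data):
    """Return the matrix d_ij = 1 - |r_ij| for models stored as rows (assumed form)."""
    corr = np.corrcoef(data)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    return dist


def average_linkage_clusters(dist, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(dist.shape[0])]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def pick_representatives(dist, clusters):
    """Pick, per cluster, the member with the smallest mean distance to its cluster."""
    reps = []
    for members in clusters:
        sub = dist[np.ix_(members, members)]
        reps.append(members[int(np.argmin(sub.mean(axis=1)))])
    return sorted(reps)


# Synthetic stand-in for an ensemble: 6 hypothetical models built from
# 2 independent signals plus small noise (not ETEX-1 or AQMEII data).
rng = np.random.default_rng(0)
signals = rng.normal(size=(2, 500))
noise = 0.1 * rng.normal(size=(6, 500))
models = np.vstack([signals[0]] * 3 + [signals[1]] * 3) + noise

dist = correlation_distance(models)
clusters = average_linkage_clusters(dist, n_clusters=2)
representatives = pick_representatives(dist, clusters)
print(clusters, representatives)
```

In this toy setting the three models derived from each signal end up in the same cluster, and one model per cluster is retained as its representative. The number of clusters is fixed by hand here; how it is chosen in practice is left open.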