The aim of this work is to explore the effectiveness of information-theoretic approaches for reducing data complexity in multimodel ensemble systems. We first exploit a weak form of independence, namely uncorrelatedness, as a mechanism for detecting linear relationships. Then stronger and more general measures of independence, such as mutual information, are used to investigate dependence structures for model selection. A distance matrix measuring the interdependence between data is derived for each of the investigated measures, with the aim of clustering correlated or dependent models together. Redundant information is discarded by selecting a few representative models from each cluster. We apply the clustering analysis to atmospheric dispersion modeling, using the ETEX-1 data set. We show that selecting a small subset of models, according to the uncorrelation or mutual information distance criteria, usually suffices to achieve a statistical performance comparable to, or even better than, that achieved with the whole ensemble data set, thus providing a simpler description of ensemble results without sacrificing accuracy.
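The pipeline described in the abstract (distance matrix, clustering, representative selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the distance d = 1 - |r| built from the Pearson correlation, the single-linkage clustering rule, the 0.2 threshold, the medoid-based choice of representative, and the toy data are all assumptions made for illustration. A mutual-information-based distance could in principle be substituted for `pearson` in `distance_matrix` without changing the rest.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance_matrix(models):
    """Assumed distance d = 1 - |r|: near 0 for linearly related models."""
    n = len(models)
    return [[0.0 if i == j else 1.0 - abs(pearson(models[i], models[j]))
             for j in range(n)] for i in range(n)]


def cluster(dist, threshold):
    """Single-linkage agglomerative clustering, stopped at `threshold`."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        # Closest pair of clusters under single linkage (minimum pairwise distance).
        d, ia, ib = min(
            (min(dist[i][j] for i in a for j in b), ia, ib)
            for ia, a in enumerate(clusters)
            for ib, b in enumerate(clusters) if ia < ib
        )
        if d > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return clusters


def representatives(dist, clusters):
    """Pick each cluster's medoid: the member closest to all other members."""
    return [min(c, key=lambda i: sum(dist[i][j] for j in c)) for c in clusters]


# Hypothetical toy ensemble: four "models" evaluated at five points.
models = [
    [1.0, 2.0, 3.0, 4.0, 5.0],  # model 0
    [2.1, 4.0, 6.2, 8.1, 9.9],  # model 1: nearly a rescaling of model 0
    [5.0, 1.0, 4.0, 2.0, 3.0],  # model 2
    [5.2, 0.9, 4.1, 2.2, 2.8],  # model 3: close to model 2
]

dist = distance_matrix(models)
clusters = cluster(dist, threshold=0.2)   # -> [[0, 1], [2, 3]]
reps = representatives(dist, clusters)    # one model index per cluster
```

With this toy data the four models collapse into two clusters, and one model per cluster is retained as the reduced ensemble. A correlation-based distance only captures linear redundancy, which is why the abstract also considers mutual information.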
Title: | On the systematic reduction of data complexity in multi-model atmospheric dispersion ensemble modeling | |
Authors: | ||
Publication date: | 2012 | |
Journal: | ||
Handle: | http://hdl.handle.net/11367/30544 | |
Appears in types: | 1.1 Journal article |