Abstract: Processing data that originates from different sources (such as environmental and medical data) can prove to be a difficult task, due to the heterogeneity of variables, storage systems, and file formats that can be used. Moreover, once the amount of data reaches a certain threshold, conventional mining methods (based on spreadsheets or statistical software) become cumbersome or even impossible to apply. Data Extract, Transform, and Load (ETL) solutions provide a framework to normalize and integrate heterogeneous data into a local data store. Additionally, the application of Online Analytical Processing (OLAP), a set of Business Intelligence (BI) methodologies and practices for multidimensional data analysis, can be an invaluable tool for its examination and mining. In this article, we describe a solution based on an ETL + OLAP tandem used for the on-the-fly analysis of tens of millions of individual medical, meteorological, and air quality observations from 16 provinces in Spain provided by 20 different national and regional entities in a diverse array of file types and formats, with the intention of evaluating the effect of several environmental variables on human health in future studies. Our work shows how a sizable amount of data, spread across a wide range of file formats and structures, and originating from a number of different sources belonging to various business domains, can be integrated in a single system that researchers can use for global data analysis and mining.
Source: International Journal of Biometeorology, June 2018, Volume 62, Issue 6, pp 1085-1095
Publication date: 1 June 2018
Publication type: Journal article
Publication URL: https://link.springer.com/article/10.1007%2Fs00484-018-1511-9
MARIA TERESA ZARRABEITIA CIMIANO
PABLO FERNANDEZ DE ARROYABE HERNAEZ
ANA SANTURTUN ZARRABEITIA