Objectives:
- To identify, mine, collect and review existing datasets, including environmental and food contamination variables for the different environmental health stressors considered in Stream 5, as well as ancillary data such as spatially resolved land use/cover data.
- To analyse the collated data in terms of collection method, data quality, availability and applicability domains.
- To store the collated datasets within an environmental data management system so that the data are readily available to WP9 and WP11 and, ultimately, for health impact assessment in the population studies executed in Stream 5.
Description of work and role of partners:
WP8 will develop an environmental data management system to permit the integration of data on emissions of stressors, concentrations of toxic substances in environmental media (outdoor and indoor air, soil, water), in food and in drinking water and external exposures to environmental hazards.
The following specific tasks have been identified:
Task 8.1 Data collection (UOWM, USTUTT, VTT, CSIC, IMDEC-FEUP, TNO, CERETOX, CNR)
The environmental data sources needed to perform Environment-wide application studies in the areas covered by the Stream 5 population studies will be identified through a detailed review. The review will cover past and ongoing research and survey projects at both national and European level, EU-wide monitoring systems such as those managed by EEA and ESA, and national and regional monitoring networks in the areas of interest to the Stream 5 population studies. Contributions on data provision from every HEALS partner will be assessed to verify which data are directly available within the consortium.
The main objective of this Task is to gather, collect and mine the environmental data from the information sources identified through the above review process, so that they can subsequently be stored in the environmental management system developed in Task 8.3. The data collected, relevant to the groups of substances identified in Stream 5, will comprise, but will not be limited to, the following variables:
- Emission data and emission factors
- Satellite data for estimation of air pollution levels (data from and in collaboration with the GMES initiative)
- Pollution levels in different media (outdoor air, indoor air, soil, surface water, ground water)
- Pollution levels in food and drinking water
- Meteorological data to be used as input to air quality modelling
- Land use/land cover data for estimation of emission inventories
Since the data collected at this stage should serve WP11 and WP12, the dataset will be completed and provided in a standard format, compliant with the INSPIRE Directive.
Task 8.2 Quality Assessment / Quality Control (QA/QC) (UOWM, NCSRD, UPMC, URV, IOM)
The data collected in the previous task will be evaluated for quality and applicability through the activities foreseen in Task 8.2. Careful checking of all the data used, whether obtained from outside the project or derived within it, is mandatory. A wide range of methods will be used for this purpose, building on techniques already extensively applied in partner institutions. These will include:
- consultation with data suppliers and past data users to identify any known gaps or uncertainties;
- scrutiny of relevant metadata, describing the source and genealogy of the data (e.g. sampling methods, measurement procedures, analytical procedures, reporting);
- establishment of clear data standards and criteria prior to their acquisition and use, so that data can be rejected if they are not of a sufficient quality (e.g. in terms of coverage, sampling density, measurement accuracy, timeliness);
- statistical screening of the data – e.g. to check for outliers, impossible values, anticipated correlations/patterns;
- manual checking of subsamples of the data (e.g. to check for formatting errors);
- intercomparison and triangulation against independent data sets and sources.
An important corollary of the above QA/QC activities will be the identification of data gaps, which will need to be covered. Moreover, this will help the project team to optimally design the Pilot European Exposure and Health Examination Survey (EXHES) carried out in WP17.
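The statistical screening step listed above (checking for impossible values and outliers) can be illustrated with a minimal Python sketch. The function name `screen_values`, the physical bounds and the modified z-score threshold of 3.5 are assumptions made for illustration only; they are not criteria defined by the project, and the sample concentrations are invented.

```python
import statistics


def screen_values(values, lower=0.0, upper=None, z_threshold=3.5):
    """Flag impossible values and statistical outliers in a list of measurements.

    Impossible values fall outside the physical bounds [lower, upper].
    Outliers are flagged with the modified z-score, which uses the median and
    the median absolute deviation (MAD) and is therefore robust to extremes.
    All thresholds here are illustrative assumptions.
    """
    impossible = [
        i for i, v in enumerate(values)
        if v < lower or (upper is not None and v > upper)
    ]
    excluded = set(impossible)
    valid = [v for i, v in enumerate(values) if i not in excluded]

    outliers = []
    if valid:
        med = statistics.median(valid)
        mad = statistics.median([abs(v - med) for v in valid])
        if mad > 0:  # with MAD == 0 the score is undefined, so flag nothing
            for i, v in enumerate(values):
                if i in excluded:
                    continue
                if 0.6745 * abs(v - med) / mad > z_threshold:
                    outliers.append(i)
    return {"impossible": impossible, "outliers": outliers}


# Invented daily concentrations (e.g. in ug/m3); -3.0 is physically impossible.
sample = [12.0, 15.0, 14.0, -3.0, 13.0, 250.0, 16.0]
result = screen_values(sample)
print(result)
```

Flagged records would still need the manual checks and consultation with data suppliers described in the list above before being rejected; the sketch only marks candidates for review.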
Task 8.3 Designing and building the Environmental management system (UOWM, VTT, TNO, CNR)
After the data collection (Task 8.1) and its quality control (Task 8.2), the data will be stored in a coherent environmental data management system (EDMS) for further use within the project. The EDMS is planned to store all the data collected in Task 8.1. The work in this task will start with the design of the database structure which, besides accommodating HEALS' own datasets, should be able to retrieve data from the existing databases identified in Task 8.1 through suitable query scripts. The database will be implemented in a standard database package such as MySQL, in order to ensure interoperability in data storage, management and exchange with the Geo-database platform developed in WP12.
All data structures will be relational, i.e. the data tables will be linked to each other by means of unique record identifiers (IDs). In this way, each record can be easily accessed and shared by different tables. This will ensure seamless integration with the Geo-database platform developed in WP12. In addition, all data will be geo-referenced by specifying the geographic coordinates of each single observation, both for point-form and for polygonal spatial information, and as such they will be ready for analysis with the GIS technology developed in WP12. All data will also be uniquely coupled to a time reference (instant, hour, day, month, etc.), so that they can be investigated using time-series statistics and easily aggregated over different time scales (e.g. weekly or monthly averages).
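As a rough illustration of the relational, geo-referenced and time-referenced structure described above, the following sketch uses Python's built-in `sqlite3` module as a stand-in for a package such as MySQL. The table names (`site`, `observation`), the column names, the coordinates and the sample values are all hypothetical and are not part of the EDMS design; only point locations are shown, and polygonal information would need an additional geometry column.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real database package.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE site (
    site_id   INTEGER PRIMARY KEY,  -- unique identifier shared across tables
    name      TEXT NOT NULL,
    latitude  REAL NOT NULL,        -- geographic coordinates of the point
    longitude REAL NOT NULL
);
CREATE TABLE observation (
    obs_id      INTEGER PRIMARY KEY,
    site_id     INTEGER NOT NULL REFERENCES site(site_id),
    stressor    TEXT NOT NULL,
    medium      TEXT NOT NULL,
    observed_at TEXT NOT NULL,      -- ISO 8601 time reference
    value       REAL NOT NULL,
    unit        TEXT NOT NULL
);
""")

# Invented sample records, for illustration only.
conn.execute("INSERT INTO site VALUES (1, 'Site A', 40.30, 21.79)")
conn.executemany(
    "INSERT INTO observation "
    "(site_id, stressor, medium, observed_at, value, unit) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "PM10", "outdoor air", "2014-01-05T10:00:00", 20.0, "ug/m3"),
        (1, "PM10", "outdoor air", "2014-01-20T10:00:00", 30.0, "ug/m3"),
        (1, "PM10", "outdoor air", "2014-02-03T10:00:00", 40.0, "ug/m3"),
    ],
)

# Join on the shared ID and aggregate the time series to monthly averages.
monthly = conn.execute("""
    SELECT s.name, o.stressor,
           strftime('%Y-%m', o.observed_at) AS month,
           AVG(o.value) AS mean_value
    FROM observation AS o
    JOIN site AS s ON s.site_id = o.site_id
    GROUP BY s.name, o.stressor, month
    ORDER BY month
""").fetchall()

for row in monthly:
    print(row)
```

Storing the time reference as an ISO 8601 string is one possible choice that keeps aggregation to other time scales a matter of changing the `strftime` format; a native date-time column type would serve the same purpose in other database packages.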
Several query interfaces will be developed in order to integrate the database with tools for: a) automatic updating (update query), b) importing from other data formats or software (import query), c) exporting into other data formats (export query), d) selecting specific subsets of data (selection query), e) grouping records by means of aggregation functions (group by query).
The structure of the EDMS will be compatible with the IPCHeM database of the JRC and the ToxHub platform of the HEROIC project, so that the collected data can feed into these databases during project execution and in the future, thus contributing to environmental data integration across Europe.
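The five kinds of query interface listed above could look roughly like the following Python sketch, again using the built-in `sqlite3` and `csv` modules as stand-ins. The `measurement` table, its columns, the helper function names and the sample CSV content are hypothetical and chosen purely for illustration.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE measurement (
        station_id TEXT NOT NULL,
        day        TEXT NOT NULL,
        pollutant  TEXT NOT NULL,
        value      REAL NOT NULL,
        PRIMARY KEY (station_id, day, pollutant)
    )
""")


def import_csv(conn, text):
    """Import query: load rows from CSV text, replacing existing keys."""
    rows = [
        (r["station_id"], r["day"], r["pollutant"], float(r["value"]))
        for r in csv.DictReader(io.StringIO(text))
    ]
    conn.executemany("INSERT OR REPLACE INTO measurement VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def update_value(conn, station_id, day, pollutant, value):
    """Update query: correct a single stored value; returns rows changed."""
    cur = conn.execute(
        "UPDATE measurement SET value = ? "
        "WHERE station_id = ? AND day = ? AND pollutant = ?",
        (value, station_id, day, pollutant),
    )
    return cur.rowcount


def select_subset(conn, pollutant, start, end):
    """Selection query: one pollutant within an inclusive date range."""
    return conn.execute(
        "SELECT station_id, day, value FROM measurement "
        "WHERE pollutant = ? AND day BETWEEN ? AND ? "
        "ORDER BY station_id, day",
        (pollutant, start, end),
    ).fetchall()


def mean_by_station(conn, pollutant):
    """Group-by query: average value per station."""
    return conn.execute(
        "SELECT station_id, AVG(value) FROM measurement "
        "WHERE pollutant = ? GROUP BY station_id ORDER BY station_id",
        (pollutant,),
    ).fetchall()


def export_csv(conn):
    """Export query: dump the table back to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["station_id", "day", "pollutant", "value"])
    writer.writerows(conn.execute(
        "SELECT station_id, day, pollutant, value FROM measurement "
        "ORDER BY station_id, day, pollutant"
    ))
    return buf.getvalue()


# Invented sample data, for illustration only.
SAMPLE_CSV = """station_id,day,pollutant,value
S1,2014-01-01,NO2,30
S1,2014-01-02,NO2,50
S2,2014-01-01,NO2,20
"""

imported = import_csv(conn, SAMPLE_CSV)
updated = update_value(conn, "S1", "2014-01-02", "NO2", 40.0)
subset = select_subset(conn, "NO2", "2014-01-01", "2014-01-01")
means = mean_by_station(conn, "NO2")
exported = export_csv(conn)
print(means)
```

Using parameterised queries (the `?` placeholders) rather than string concatenation keeps the interfaces safe to expose to other tools; the same pattern is available in the Python drivers for other database packages, although the placeholder syntax may differ.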