WG 1 – Data Harmonization


Davorka Hackenberger
Department of Biology, University of Osijek, Croatia

Jiří Schlaghamerský
Terrestrial Invertebrate Research Group, Department of Botany and Zoology, Faculty of Science, Masaryk University, Czech Republic


Standardised vocabularies and data curation are essential for data comparability and common re-use as well as database credibility. They are thus a prerequisite to all other Action activities. The major goals of WG1 will be the development of peer-review protocols for standardised data vocabularies as well as scientific and data-management quality-assessment procedures. 

  • Standardised and well-defined vocabularies and formats for data fields to be covered in the data warehouse will be defined, building from existing data fields and implementing the Essential Biodiversity Variables. 
  • A minimum data set will be specified, including taxon, site of occurrence, observation date, observer/source, methodology (what, where, when, who, how), promoting future data identification and re-use. 
  • Essential environmental metadata will be defined. Such metadata are rarely collected in soil biodiversity surveys, yet they substantially increase the value of biodiversity data. 
  • Pre-existing hierarchical “authority lists” for spatial-data detail and thematic resolution of qualitative data fields (e.g., soil or habitat types) will be amended, expanded and agreed upon, allowing high data comparability. 
  • Harmonisation with European standards will be achieved by alignment with the INSPIRE Directive as well as with available global standards relevant to soil biodiversity, e.g., FAO guidelines, WRB and ISO standards. 
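
As an illustration of the minimum data set named in the list above (what, where, when, who, how), the following minimal sketch checks a record for completeness. The class name, field names and the function `missing_fields` are hypothetical placeholders, not the vocabulary agreed by WG1.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


# Hypothetical field names; the actual minimum data set is defined by WG1.
@dataclass
class OccurrenceRecord:
    taxon: Optional[str] = None             # what
    site: Optional[str] = None              # where
    observation_date: Optional[str] = None  # when
    observer: Optional[str] = None          # who (observer or source)
    methodology: Optional[str] = None       # how


def missing_fields(record: OccurrenceRecord) -> List[str]:
    """Return the names of minimum-data-set fields that are absent or blank."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or not value.strip():
            missing.append(f.name)
    return missing
```

A record with an empty `observer` would be reported as `["observer"]`, so a data provider can be asked for the missing item before import.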

WG1 works closely with WG2 as well as WG5 to achieve a consensus on all terminologies, proposing thesauri for translation from national vocabularies.
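
A thesaurus of this kind can be pictured as a lookup from a national term to the agreed standard term. The sketch below is only an illustration: the entries, the standard terms and the function `translate` are assumptions made for this example, not content of the Action's thesauri.

```python
from typing import Dict, Optional, Tuple

# Illustrative entries only: (language code, national term) -> standard term.
# Neither the keys nor the standard terms come from an agreed vocabulary.
THESAURUS: Dict[Tuple[str, str], str] = {
    ("de", "laubwald"): "broadleaved forest",
    ("cs", "listnatý les"): "broadleaved forest",
    ("de", "grünland"): "grassland",
}


def translate(term: str, language: str) -> Optional[str]:
    """Map a national term to the standard term, or None if it is unknown."""
    key = (language.strip().lower(), term.strip().lower())
    return THESAURUS.get(key)
```

Returning `None` for unknown terms, rather than passing them through, keeps unmapped vocabulary visible so that it can be referred to the working groups for agreement.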

The second main task of WG1 will be to develop procedures and protocols for data quality control, including data harmonisation based on the standardised vocabularies and formats above. Building from existing procedures, a resource-efficient quality-control workflow will be defined and tested, aiming for high error recognition while minimising personnel requirements. Both technical and content-oriented quality control will be considered. 

  • For necessary final manual quality control, data review boards analogous to journals’ editorial boards will be established.
  • Taxonomic review boards will also be established, drawing from Action participants as well as external experts. Building from national and international databases and checklists, they will achieve a consensus on a basic taxonomic backbone and definition of valid species’ nomenclatures. 
  • In addition to procedures and review boards geared to quality control prior to data import, procedures and software for data control, curation and correction after import into the data warehouse (WG2) will be reviewed and improved. 
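
The distinction between technical and content-oriented quality control can be sketched as two automated passes that flag records for the manual review boards. This is a minimal sketch under stated assumptions: the record keys, the tiny backbone in `ACCEPTED` and `SYNONYMS`, and all function names are hypothetical and do not describe the workflow actually adopted.

```python
from datetime import date
from typing import Dict, List

# Hypothetical taxonomic backbone: accepted names plus a synonym table.
ACCEPTED = {"Aporrectodea caliginosa", "Lumbricus terrestris"}
SYNONYMS = {"Allolobophora caliginosa": "Aporrectodea caliginosa"}


def technical_issues(record: Dict[str, str]) -> List[str]:
    """Technical checks: is the date parseable, are coordinates plausible?"""
    issues = []
    try:
        date.fromisoformat(record.get("observation_date", ""))
    except ValueError:
        issues.append("observation_date is not an ISO date")
    try:
        lat = float(record.get("latitude", ""))
        lon = float(record.get("longitude", ""))
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            issues.append("coordinates out of range")
    except ValueError:
        issues.append("coordinates are not numeric")
    return issues


def content_issues(record: Dict[str, str]) -> List[str]:
    """Content check: is the taxon name accepted in the backbone?"""
    name = record.get("taxon", "").strip()
    if name in ACCEPTED:
        return []
    if name in SYNONYMS:
        return [f"synonym; accepted name is {SYNONYMS[name]}"]
    return ["taxon not in backbone"]


def needs_manual_review(record: Dict[str, str]) -> bool:
    """Only records with at least one flagged issue go to a review board."""
    return bool(technical_issues(record) or content_issues(record))
```

Routing only flagged records to reviewers is one way to keep error recognition high while limiting the manual workload.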

Developing principles for the harmonisation, use and mapping of existing information on soil biodiversity in Europe will raise the quality of the data and, with it, their credibility among all partners, stakeholders and end-users. 

Intended outcomes

(1) WG1 will provide standardised vocabularies, protocols and checklists for soil-biodiversity data. These will be made available to all data providers, users and stakeholders at European and national levels. (2) WG1 will define data-curation and quality-control processes, establishing viable paths for data correction and harmonising the metadata of individual datasets so that they comply with the overall criteria.