Agri-food Data Standards: a Gap Exploration Report

2018 
This report describes the methodology and results of our gap analysis of the availability and usability of data standards for food and agriculture. It is the first of three analysis reports that form part of the GODAN Action project, which aims to enable data users, producers and intermediaries to engage effectively with open data in the agriculture and nutrition sectors. This initial report attempts to provide a baseline gap analysis against which subsequent analyses will be conducted. It is based on the content of the GODAN Action Map of agri-food data standards as of 20 November 2016. Because this is the first gap analysis report and it draws on an early and presumably incomplete version of the database, it focuses primarily on the overall methodology and criteria, and gives only an initial, tentative overview of the partial gap analysis results.

The methodology covers the design criteria behind the online database in which all the metadata about the data standards were collected (the online map of data standards, described in Pesce, Kayumbi, Tennison, Mey and Zervas (2016)), as well as the assessment process and the elaboration of results. The metadata model follows existing standards for describing vocabularies and adds assessment metadata drawn from two existing assessment practices: the assessment process used by the UK government’s Open Standards Board and the Open Data Certificates. These assessment criteria were organised in four categories: fitness for purpose, adoption, usability and openness. Although the whole set of assessment criteria must be considered for a full gap analysis, the fitness and adoption criteria are very difficult to assess. Such an assessment requires the participation of domain data experts who can evaluate the scientific soundness, the completeness, and the level of adoption and authoritativeness of a standard.
These domain-specific analyses will be conducted in a second phase, when the GODAN Action project has selected its thematic topics and domain experts can be brought on board. In this first version of the gap analysis, we limit ourselves to the usability and openness criteria, which can be evaluated more objectively by open data experts.

The first part of the report illustrates the metadata used in the global map to describe data standards, in particular the specific assessment metadata (Is it machine-readable? Is it served by application programming interfaces (APIs)? Is it clearly licensed?). The second part provides an initial analysis of the results of the assessment.

The key findings of the assessment exercise are as follows. In terms of openness and usability, 16 per cent of the standards are barely usable because they are not even available on the web. Only 55 per cent of the standards are presented in machine-readable formats. Most standards fail to present a clear license (only 21 per cent do), though where a license is stated it is generally open (13 per cent). There is a gap between the information presented on the web and the documentation of the standard: only 31 per cent of the standards have documentation, only 5 per cent have tests and only 40 per cent are supported.

An analysis by domain suggests that certain domains are better covered than others (plant sciences above all, followed by natural resources), but the type and quality of standards used in domains that appear similarly covered vary greatly. Most of the standards used in plant sciences are ontologies and are highly open and usable, while the level of openness and usability of standards in the sub-domains of natural resources is lower and more variable. The soil domain is covered by good standards (widely adopted models, thesauri), though in most cases these are not yet formalised as open standards.
In the land sector, classifications are still fragmented and on paper (with a consequently low level of openness and usability), and only one open standard has been developed.

More generally, we noted that in certain domains (e.g. plants) research institutions play a key role in developing ontologies, while government and standardisation bodies provide basic normative descriptors or syntactic standards (like the Multicrop Passport descriptors) with little use of common semantics. For other types of data (like soil, plant products and animal products), international bodies provide models and normative classifications (like INSPIRE for soils, the FAO/UN commodities and product classifications, or the ISO Animal Identification standard), while the development of ontologies and common semantics still seems to be slow. Finally, in the area of value chain data, standards are increasingly used (messaging standards, product classifications) and semantics are slowly starting to be adopted (e.g. food ontologies). Government and legislation data, by contrast, is an area where standardisation seems to be low, apart from the use of basic statistical standards, with little development or re-use of semantics.
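
As a rough illustration of how the usability and openness figures quoted above could be derived from assessment metadata, the following Python sketch tallies a few yes/no criteria over a small set of records. It is not the project's actual metadata model or tooling: the field names, the `StandardRecord` class, the helper functions and the four sample records are all hypothetical and were made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class StandardRecord:
    """Hypothetical assessment metadata for one data standard."""
    name: str
    on_web: bool              # is the standard available on the web?
    machine_readable: bool    # is it published in a machine-readable format?
    has_api: bool             # is it served by an API?
    license: Optional[str]    # stated license, or None if no clear license
    open_license: bool        # True only if the stated license is open


def share(records: List[StandardRecord],
          predicate: Callable[[StandardRecord], bool]) -> float:
    """Percentage of records for which the predicate holds."""
    if not records:
        return 0.0
    return 100.0 * sum(1 for r in records if predicate(r)) / len(records)


def summarise(records: List[StandardRecord]) -> Dict[str, float]:
    """Compute percentages for a few usability and openness criteria."""
    return {
        "not_on_web": share(records, lambda r: not r.on_web),
        "machine_readable": share(records, lambda r: r.machine_readable),
        "served_by_api": share(records, lambda r: r.has_api),
        "clear_license": share(records, lambda r: r.license is not None),
        "open_license": share(
            records, lambda r: r.license is not None and r.open_license
        ),
    }


# Invented sample records, for illustration only (not data from the report).
SAMPLE = [
    StandardRecord("example-ontology", True, True, True, "CC-BY-4.0", True),
    StandardRecord("example-thesaurus", True, True, False, None, False),
    StandardRecord("example-classification", True, False, False,
                   "proprietary", False),
    StandardRecord("example-paper-standard", False, False, False, None, False),
]

if __name__ == "__main__":
    for criterion, pct in summarise(SAMPLE).items():
        print(f"{criterion}: {pct:.0f} per cent")
```

In this sketch every percentage is computed against the full set of records, so the share of standards with an open license can never exceed the share with a clear license. Whether the report's own percentages use the same denominator is an assumption here.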