Identifying content and levels of representation in scientific data

2012 
Heterogeneous digital data that has been produced by different communities with varying practices and assumptions, and that is organized according to different representation schemes, encodings, and file formats, presents substantial obstacles to efficient integration, analysis, and preservation. This is a particular impediment to data reuse and interdisciplinary science. An underlying problem is that we have no shared formal conceptual model of information representation that is both accurate and sufficiently detailed to accommodate the management and analysis of real world digital data in varying formats. Developing such a model involves confronting extremely challenging foundational problems in information science. We present two complementary conceptual models for data representation, the Basic Representation Model and the Systematic Assertion Model. We show how these models work together to provide an analytical account of digitally encoded scientific data. These models will provide a better foundation for understanding and supporting a wide range of data curation activities, including format migration, data integration, data reuse, digital preservation strategies, and assessment of identity and scientific equivalence.