IV. SCIENTIFIC DATA AND BIODIVERSITY COLLECTIONS DATA CONCEPTS AND THEIR RELEVANCE FOR DATA CAPTURE IN LARGE SCALE DIGITISATION OF BIOLOGICAL COLLECTIONS

2012 
Logistically, the data associated with biological collections can be divided into three main categories for digitisation: i) Label Data: the data appearing on the specimen on a label or annotation; ii) Curatorial Data: the data appearing on containers, boxes, cabinets and folders which hold the collections; iii) Supplementary Data: the data held separately from the collections in indices, archives and literature. Each of these categories of data has fundamentally different properties within the digitisation framework, which have implications for the data capture process. These properties were assessed in relation to alternative data entry workflows and methodologies to create a more efficient and accurate system of data capture. We see a clear benefit in prioritising curatorial data in the data capture process. These data are often only available at the cabinets, they are in a format suitable for rapid data entry, and they result in an accurate cataloguing of the collections. Finally, the capture of a high-resolution digital image enables additional data entry to be separated into multiple sweeps, and optical character recognition (OCR) software can be used to help sort images for fuller data entry, giving potential for more automated data entry in the future.
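The three data categories and the multi-sweep workflow described above can be illustrated with a minimal Python sketch. This is not the authors' system: the names `SpecimenRecord`, `first_sweep` and `sort_for_later_sweeps`, the field names, and the keyword-matching rule are all hypothetical, and the OCR text is assumed to have been produced by separate software.

```python
from dataclasses import dataclass, field


@dataclass
class SpecimenRecord:
    """One imaged specimen, with one slot per data category (hypothetical layout)."""
    image_file: str                                     # path to the high-resolution image
    curatorial: dict = field(default_factory=dict)      # from containers, boxes, cabinets, folders
    label: dict = field(default_factory=dict)           # from labels or annotations on the specimen
    supplementary: dict = field(default_factory=dict)   # from indices, archives, literature
    ocr_text: str = ""                                  # raw OCR output, assumed produced elsewhere


def first_sweep(image_file, folder_name, cabinet):
    """Sweep 1: record only curatorial data at the cabinet, together with the image."""
    return SpecimenRecord(
        image_file=image_file,
        curatorial={"folder_name": folder_name, "cabinet": cabinet},
    )


def sort_for_later_sweeps(records, keywords):
    """Group records by the first keyword found in their OCR text.

    Records whose OCR text matches no keyword go into an "unsorted" batch,
    so that fuller data entry can proceed batch by batch in later sweeps.
    """
    batches = {}
    for record in records:
        text = record.ocr_text.lower()
        match = next((k for k in keywords if k.lower() in text), "unsorted")
        batches.setdefault(match, []).append(record)
    return batches
```

In this sketch, label and supplementary data stay empty after the first sweep and would be filled in later from the image, which reflects the separation of data entry into multiple sweeps.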