TAIGA: A Novel Dataset for Multitask Learning of Continuous and Categorical Forest Variables From Hyperspectral Imagery

2022 
The spectral and spatial resolutions of modern optical Earth observation data are continuously increasing. To fully utilize the data, integrate them with other information sources, and create applications relevant to real-world problems, extensive training data are required. We present TAIGA, an open dataset including continuous and categorical forestry data, accompanied by airborne hyperspectral imagery with a pixel size of 0.7 m. The dataset contains over 70 million labeled pixels belonging to more than 600 forest stands. To establish a baseline on TAIGA dataset for multitask learning, we trained and validated a convolutional neural network to simultaneously retrieve 13 forest variables. Due to the size of the imagery, the training and testing sets were independent, with strictly no overlap for patches up to $45\times 45$ pixels. Our retrieval results show that including both spectral and textural information improves the accuracy of mapping key boreal forest structural characteristics, compared with an earlier study including only spectral information from the same image. TAIGA responds to the increased availability of hyperspectral and very high resolution imagery, and includes the forestry variables relevant for forestry and environmental applications. We propose the dataset as a new benchmark for spatial–spectral methods that overcomes the limitations of widely used small-scale hyperspectral datasets.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    59
    References
    0
    Citations
    NaN
    KQI
    []