Cancer Digital Slide Archive: an informatics resource to support integrated in silico analysis of TCGA pathology data

2013 
Background: The integration and visualization of multimodal datasets is a common challenge in biomedical informatics. Several recent studies of The Cancer Genome Atlas (TCGA) data have illustrated important relationships between morphology observed in whole-slide images, outcome, and genetic events. The pairing of genomics and rich clinical descriptions with whole-slide imaging provided by TCGA presents a unique opportunity to perform these correlative studies. However, better tools are needed to integrate the vast and disparate data types.

Objective: To build an integrated web-based platform supporting whole-slide pathology image visualization and data integration.

Materials and methods: All images and genomic data were directly obtained from the TCGA and National Cancer Institute (NCI) websites.

Results: The Cancer Digital Slide Archive (CDSA) produced is accessible to the public and currently hosts more than 20 000 whole-slide images from 22 cancer types.

Discussion: The capabilities of CDSA are demonstrated using TCGA datasets to integrate pathology imaging with associated clinical, genomic and MRI measurements in glioblastomas and can be extended to other tumor types. CDSA also allows URL-based sharing of whole-slide images, and has preliminary support for directly sharing regions of interest and other annotations. Images can also be selected on the basis of other metadata, such as mutational profile, patient age, and other relevant characteristics.

Conclusions: With the increasing availability of whole-slide scanners, analysis of digitized pathology images will become increasingly important in linking morphologic observations with genomic and clinical endpoints.