All data introduced in the publication "SpatialSort: A Bayesian Model for Clustering and Cell Population Annotation of Spatial Proteomics Data" are deposited here. Forward simulated datasets, spatial Gaussian mixture datasets, semi-real datasets, and MIBI datasets are included. Code to run SpatialSort can be found at https://github.com/Roth-Lab/SpatialSort.
3430 Background: Small cell lung cancer (SCLC) accounts for 20% of the yearly cases of lung cancer in the United States. Survival rates for this disease have improved little over the last 20 years, with median survival times of only 7-13 months. This may in part be due to the high frequency of relapse with resistant micrometastatic disease after initial chemotherapy. The identification of molecular targets for prognosis and therapy of the disease may be crucial in improving survival rates of SCLC. Array comparative genomic hybridization (aCGH) is a new method for detecting genomic alterations in human cancers, which may allow the rapid discovery of novel genes involved in tumorigenesis. We have designed a CGH array that represents a >100-fold increase in resolution over conventional metaphase CGH, allowing for genome-wide identification of genetic alterations at a resolution of 100 kbp. Objectives: To identify novel genetic alterations in SCLC cell lines by high-resolution copy number profiling of the entire human genome. Design: Using a human whole-genome set of 32,433 overlapping BAC clones, we have created a CGH array spanning the entire genome at an average density of 10 clones per megabase. This array was then used to profile 15 SCLC cell lines for copy number changes. Sample DNA was labeled with cyanine 5 through a random priming reaction and co-hybridized to the arrays with a cyanine 3 labeled reference DNA. After hybridization, the arrays were scanned using a CCD-based imaging system from Applied Precision. Signal ratios were then determined using the Softworx array analysis program. Results: We have delineated breakpoints in SCLC lines to within one BAC clone in single experiments. This high-resolution profiling has allowed us to identify both the expected alterations at loci such as hTERT, ATM and MYC and novel micro-amplifications and deletions as small as 200 kb that have not been detected by conventional methodologies. This has allowed us to rapidly identify novel candidate genes associated with SCLC tumorigenesis. Conclusions: High-resolution genome-wide array CGH has allowed the rapid identification of novel candidate genes associated with lung cancer tumorigenesis. This work was supported by grants from Genome Canada/B.C., NCIC Terry Fox New Frontiers, and Lung SPORE P50 CA70907.
Abstract: Emerging spatial proteomics technologies have created new opportunities to move beyond quantifying the composition of cell types in tissue and begin probing spatial structure. However, current methods for analysing such data are designed for non-spatial data and ignore spatial information. We present SpatialSort, a spatially aware Bayesian clustering approach that allows for the incorporation of prior biological knowledge. SpatialSort clusters cells by accounting for affinities of cells of different types to neighbours in space. Additionally, by incorporating prior information about cell types, SpatialSort outperforms current methods and can perform automated annotation of clusters.
Single cell segmentation is critical in the processing of spatial omics data to accurately perform cell type identification and analyze spatial expression patterns. Segmentation methods often rely on semi-supervised annotation or labeled training data, which are highly dependent on user expertise. To ensure the quality of segmentation, current evaluation strategies quantify accuracy by assessing cellular masks or through iterative inspection by pathologists. While these strategies each address either the statistical or biological aspects of segmentation, a unified approach to evaluating segmentation accuracy is lacking.
Abstract Motivation: Single cell segmentation is critical in the processing of spatial omics data to accurately perform cell type identification and analyze spatial expression patterns. Segmentation methods often rely on semi-supervised annotation or labeled training data, which are highly dependent on user expertise. To ensure the quality of segmentation, current evaluation strategies quantify accuracy by assessing cellular masks or through iterative inspection by pathologists. While these strategies each address either the statistical or biological aspects of segmentation, a unified approach to evaluating segmentation accuracy is lacking. Results: In this paper, we present ESQmodel, a Bayesian probabilistic method to evaluate single cell segmentation using expression data. Taking the cellular data extracted from segmentation and a prior belief about cellular composition as input, ESQmodel computes per-cell entropy to assess segmentation quality by how consistently cellular expression profiles match cell type expectations. Availability and implementation: Source code is available on GitHub at https://github.com/Roth-Lab/ESQmodel under the MIT license.
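To make the per-cell entropy idea in the abstract above concrete, here is a minimal Python sketch. It assumes that each segmented cell already has a probability vector over cell types; how ESQmodel actually derives those probabilities is not described here, so the function name `per_cell_entropy` and the example values are hypothetical illustrations and not the tool's API.

```python
import math


def per_cell_entropy(posteriors):
    """Shannon entropy (in nats) of each cell's cell-type probability vector.

    `posteriors` is a list of per-cell probability vectors. This is an
    assumed input format for illustration, not ESQmodel's interface.
    """
    entropies = []
    for probs in posteriors:
        total = sum(probs)
        normalised = [p / total for p in probs]
        # Terms with zero probability contribute nothing to the entropy.
        entropies.append(-sum(p * math.log(p) for p in normalised if p > 0))
    return entropies


# Hypothetical probabilities over three cell types for three segmented cells.
cell_type_posteriors = [
    [0.98, 0.01, 0.01],  # profile matches one cell type clearly
    [0.50, 0.50, 0.00],  # profile is ambiguous between two cell types
    [1 / 3, 1 / 3, 1 / 3],  # profile matches no cell type better than another
]
entropies = per_cell_entropy(cell_type_posteriors)
```

Under this reading, low entropy suggests a cell whose expression profile fits one expected cell type, while high entropy flags a cell whose profile is mixed, as might happen when a mask merges neighbouring cells.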
Abstract Motivation: Recent advances in spatial proteomics technologies have enabled the profiling of dozens of proteins in thousands of single cells in situ. This has created the opportunity to move beyond quantifying the composition of cell types in tissue, and instead probe the spatial relationships between cells. However, most current methods for clustering data from these assays only consider the expression values of cells and ignore the spatial context. Furthermore, existing approaches do not account for prior information about the expected cell populations in a sample. Results: To address these shortcomings, we developed SpatialSort, a spatially aware Bayesian clustering approach that allows for the incorporation of prior biological knowledge. Our method is able to account for the affinities of cells of different types to neighbour one another in space, and by incorporating prior information about expected cell populations, it is able to simultaneously improve clustering accuracy and perform automated annotation of clusters. Using synthetic and real data, we show that by using spatial and prior information, SpatialSort improves clustering accuracy. We also demonstrate how SpatialSort can perform label transfer between spatial and nonspatial modalities through the analysis of a real-world diffuse large B-cell lymphoma dataset. Availability and implementation: Source code is available on GitHub at https://github.com/Roth-Lab/SpatialSort.
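The idea of letting neighbour affinities inform clustering can be illustrated with a toy Python sketch. This is not SpatialSort's model or code: the function `neighbour_aware_posterior`, the additive log-affinity term, and all numbers are assumptions chosen only to show how spatial context can break a tie that expression alone cannot.

```python
import math


def neighbour_aware_posterior(log_likelihoods, neighbour_labels, affinity):
    """Cluster probabilities for one cell from expression plus neighbour labels.

    log_likelihoods[k]: expression log-likelihood of the cell under cluster k.
    neighbour_labels: current cluster labels of the cell's spatial neighbours.
    affinity[k][j]: hypothetical log-affinity between clusters k and j.
    """
    scores = []
    for k, loglik in enumerate(log_likelihoods):
        spatial = sum(affinity[k][j] for j in neighbour_labels)
        scores.append(loglik + spatial)
    # Normalise with the max subtracted for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Expression alone cannot separate the two clusters for this cell.
log_likelihoods = [-1.0, -1.0]
# Assumed affinities: cells of the same cluster tend to neighbour each other.
affinity = [[0.5, -0.5], [-0.5, 0.5]]
# Two neighbours are currently in cluster 0 and one is in cluster 1.
neighbour_labels = [0, 0, 1]
posterior = neighbour_aware_posterior(log_likelihoods, neighbour_labels, affinity)
```

With equal expression evidence, the majority of cluster 0 neighbours shifts the probability toward cluster 0. A non-spatial method would leave this cell at an even split.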
All data introduced in the publication "SpatialSort: Probabilistic Spatially Aware Clustering and Cell Population Annotation for Spatial Omics Data" are deposited here. Forward simulated datasets, spatial Gaussian mixture datasets, semi-real datasets, and MIBI datasets are included. Code to run SpatialSort can be found at https://github.com/Roth-Lab/SpatialSort.
All in-house generated data introduced in the publication "ESQmodel: biologically informed evaluation of 2-D cell segmentation quality in multiplexed tissue images" are deposited here. Simulated and real IMC datasets are included. Code to run ESQmodel can be found at https://github.com/Roth-Lab/ESQmodel.
All in-house generated data introduced in the publication "ESQmodel: biologically informed evaluation of 2-D cell segmentation quality in multiplexed tissue images" are deposited here. Simulated and real datasets are included. Code to run ESQmodel can be found at https://github.com/Roth-Lab/ESQmodel.