1 Available datasets

The TENxVisiumData package provides an R/Bioconductor resource for Visium spatial gene expression datasets by 10X Genomics. The package currently includes 13 datasets from 23 samples across two organisms (human and mouse) and 13 tissues.

A list of currently available datasets can be obtained using the ExperimentHub interface:

library(ExperimentHub)
eh <- ExperimentHub()
(q <- query(eh, "TENxVisium"))
## ExperimentHub with 26 records
## # snapshotDate(): 2023-04-24
## # $dataprovider: 10X Genomics
## # $species: Homo sapiens, Mus musculus
## # $rdataclass: SpatialExperiment
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH6695"]]' 
##            title                            
##   EH6695 | HumanBreastCancerIDC             
##   EH6696 | HumanBreastCancerILC             
##   EH6697 | HumanCerebellum                  
##   EH6698 | HumanColorectalCancer            
##   EH6699 | HumanGlioblastoma                
##   ...      ...                              
##   EH6739 | HumanSpinalCord_v3.13            
##   EH6740 | MouseBrainCoronal_v3.13          
##   EH6741 | MouseBrainSagittalPosterior_v3.13
##   EH6742 | MouseBrainSagittalAnterior_v3.13 
##   EH6743 | MouseKidneyCoronal_v3.13
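
The record listing can be narrowed and inspected further. The following is a minimal sketch, assuming an internet connection and an installed ExperimentHub package; the variable names q_mouse and md are illustrative only. Passing several patterns to query() requires all of them to match, and matching is done across metadata fields, so it is worth checking the species column of the result.

```r
library(ExperimentHub)

eh <- ExperimentHub()

# Several patterns are combined with AND: keep TENxVisium records
# whose metadata also mention "Mus musculus"
q_mouse <- query(eh, c("TENxVisium", "Mus musculus"))

# Per-record metadata as a DataFrame, one row per accession ID
md <- mcols(q_mouse)
md[, c("title", "species", "rdatadateadded")]
```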

2 Loading the data

To retrieve a dataset, we can call its corresponding named function <id>(), where <id> is a valid dataset identifier (see ?TENxVisiumData). E.g.:

spe <- HumanHeart()

Alternatively, data can be loaded directly from Bioconductor’s ExperimentHub as follows. First, we initialize a hub instance and store the complete list of records in a variable eh. Using query(), we then identify the records made available by the TENxVisiumData package, as well as their accession IDs (of the form EH1234). Finally, we load the data into R via eh[[id]], where id is the accession ID of the record we’d like to load. E.g.:

library(ExperimentHub)       # provides ExperimentHub() and query()
eh <- ExperimentHub()        # initialize hub instance
q <- query(eh, "TENxVisium") # retrieve 'TENxVisiumData' records
id <- q$ah_id[1]             # specify dataset ID to load
spe <- eh[[id]]              # load specified dataset
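
Selecting a record by position depends on the order of the query result. As a hedged alternative, the sketch below looks up the accession ID by record title instead. It assumes that exactly one record is titled "HumanHeart" (the listing above suggests that later versions carry a suffix such as _v3.13), and that the SpatialExperiment package is installed so the object can be loaded.

```r
library(ExperimentHub)

eh <- ExperimentHub()
q <- query(eh, "TENxVisium")

# Accession IDs are the names of the hub object; titles live in mcols()
titles <- mcols(q)$title
id <- names(q)[titles == "HumanHeart"]

# Assumption: the title is unique; stop early if it is not
stopifnot(length(id) == 1)

spe <- eh[[id]]
```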

3 Data representation

Each dataset is provided as a SpatialExperiment (SPE), which extends the SingleCellExperiment (SCE) class with features specific to spatially resolved data:

spe
## class: SpatialExperiment 
## dim: 36601 7785 
## metadata(0):
## assays(1): counts
## rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
##   ENSG00000277196
## rowData names(1): symbol
## colData names(1): sample_id
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor

For details on the SPE class, we refer to the SpatialExperiment package’s vignette. Briefly, the SPE harbors the following data in addition to that stored in an SCE:

spatialCoords; a numeric matrix of spatial coordinates, stored inside the object’s int_colData:

head(spatialCoords(spe))
##                    pxl_col_in_fullres pxl_row_in_fullres
## AAACAAGTATCTCCCA-1              15937              17428
## AAACACCAATAACTGC-1              18054               6092
## AAACAGAGCGACTCCT-1               7383              16351
## AAACAGGGTCTATATT-1              15202               5278
## AAACAGTGTTCCTGGG-1              21386               9363
## AAACATTTCCCGGATT-1              18549              16740

spatialData; a DFrame of spatially-related sample metadata, stored as part of the object’s colData. This colData subset is in turn determined by the int_metadata field spatialDataNames:

head(spatialData(spe))
## DataFrame with 6 rows and 0 columns

imgData; a DFrame containing image-related data, stored inside the int_metadata:

imgData(spe)
## DataFrame with 2 rows and 4 columns
##               sample_id    image_id   data scaleFactor
##             <character> <character> <list>   <numeric>
## 1 HumanBreastCancerIDC1      lowres   ####   0.0247525
## 2 HumanBreastCancerIDC2      lowres   ####   0.0247525
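
The storage locations described above can be checked directly. This is a minimal sketch rather than a recommended workflow: it assumes the HumanBreastCancerIDC dataset shown in the printed output, and it reaches into the internal slots via int_colData() and int_metadata() only to illustrate where the accessors read from. In regular use, the spatialCoords() and imgData() accessors are preferable.

```r
library(SpatialExperiment)
library(TENxVisiumData)

spe <- HumanBreastCancerIDC()

# Spatial coordinates: one row per spot, one column per axis
xy <- spatialCoords(spe)
dim(xy)

# The accessor reads from the internal column data
identical(xy, int_colData(spe)$spatialCoords)

# Image data: one row per (sample_id, image_id) pair
img <- imgData(spe)

# The accessor reads from the internal metadata
identical(img, int_metadata(spe)$imgData)

# Scale factors relate full-resolution pixel coordinates to each image
img$scaleFactor
```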

Datasets with multiple sections are consolidated into a single SPE with colData field sample_id indicating each spot’s sample of origin. E.g.:

spe <- MouseBrainSagittalAnterior()
table(spe$sample_id)
## MouseBrainSagittalAnterior1 MouseBrainSagittalAnterior2 
##                        2695                        2825
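
Because all sections share one object, a single section can be recovered by subsetting columns on sample_id. The sketch below assumes the sample name MouseBrainSagittalAnterior1 from the output above; the variable name sub is illustrative.

```r
library(SpatialExperiment)
library(TENxVisiumData)

spe <- MouseBrainSagittalAnterior()

# Keep only the spots that belong to the first section
sub <- spe[, spe$sample_id == "MouseBrainSagittalAnterior1"]

# Number of spots retained for that section
ncol(sub)

# Only one sample should remain after subsetting
unique(sub$sample_id)
```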

Datasets of targeted analyses are provided as a nested SPE, with whole transcriptome measurements as primary data, and those obtained from targeted panels as altExps. E.g.:

spe <- HumanOvarianCancer()
altExpNames(spe)
## [1] "TargetedImmunology" "TargetedPanCancer"
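
An individual targeted panel can then be extracted with altExp(). The following sketch assumes the panel name TargetedImmunology from the output above; the variable name tgt is illustrative. The alternative experiment covers the same spots as the primary data but a different set of features.

```r
library(SpatialExperiment)
library(TENxVisiumData)

spe <- HumanOvarianCancer()

# Extract one targeted panel by name
tgt <- altExp(spe, "TargetedImmunology")

# Features differ from the whole transcriptome data; spots are shared
dim(spe)
dim(tgt)
ncol(tgt) == ncol(spe)

# Assays available for the targeted panel
assayNames(tgt)
```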

