TCIApathfinder and downstream analysis

Pamela Russell

2019-09-21

TCIApathfinder wraps the Cancer Imaging Archive (TCIA) REST API. See the TCIApathfinder vignettes for an introduction to package usage. This vignette shows how images downloaded with TCIApathfinder can be processed and analyzed with other R packages.

Use TCIApathfinder to download and extract an image series

library(TCIApathfinder)

# Pick a patient of interest
patient <- "TCGA-AR-A1AQ"

# Get information on all image series for this patient
series <- get_series_info(patient_id = patient)

# Pick an image series to download
series_instance_uid <- as.character(series$series[1, "series_instance_uid"])

# Download and unzip the image series
ser <- save_extracted_image_series(series_instance_uid = series_instance_uid)
dicom_dir <- ser$dirs
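The code above takes the first series returned for the patient. As a rough sketch of a more deliberate choice, the example below filters a small mock data frame that stands in for `series$series`. The column names `modality` and `image_count` and all values are assumptions made for illustration; check `names(series$series)` to see which columns are actually returned.

```r
# Mock stand-in for series$series; column names and values are illustrative only
mock_series <- data.frame(
  series_instance_uid = c("1.2.3.1", "1.2.3.2", "1.2.3.3"),
  modality = c("MR", "MG", "MR"),
  image_count = c(60L, 4L, 116L),
  stringsAsFactors = FALSE
)

# Keep MR series only, then pick the one with the most images
mr_series <- mock_series[mock_series$modality == "MR", ]
chosen_uid <- mr_series$series_instance_uid[which.max(mr_series$image_count)]
chosen_uid
```

The same subsetting pattern should carry over to the real data frame once the relevant column names have been confirmed.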

Use the “oro.dicom” package to load the image series

The oro.dicom package provides functions to process image files in DICOM format, which is the format used by TCIA. See the oro.dicom package documentation for further details.

suppressPackageStartupMessages(library(oro.dicom))

# Read in the DICOM images and create a 3D array of intensities
dicom_list <- readDICOM(dicom_dir)
img_array_3d <- create3D(dicom_list)

# Check the dimensions of the 3D array
dim(img_array_3d)

The dimensions show that this series consists of 116 DICOM images, each 256 x 256 pixels.
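Once the series is in a 3D array, ordinary base R indexing applies. The sketch below uses a synthetic array of random values with the dimensions reported above as a stand-in for `img_array_3d`, so it runs without downloading anything. The names `mock_array`, `mid_slice`, and `slice_means` are hypothetical.

```r
# Synthetic stand-in for img_array_3d with the same dimensions (random values)
set.seed(1)
mock_array <- array(runif(256 * 256 * 116), dim = c(256, 256, 116))

# Extract the middle slice along the third dimension as a 2D matrix
mid_index <- ceiling(dim(mock_array)[3] / 2)
mid_slice <- mock_array[, , mid_index]
dim(mid_slice)

# Mean intensity of each slice, giving one value per image in the series
slice_means <- apply(mock_array, 3, mean)
length(slice_means)

# Display the slice in grayscale with base graphics
image(t(mid_slice), col = gray(0:64 / 64), axes = FALSE)
```

With real data, replacing `mock_array` with `img_array_3d` should give per-slice summaries and a quick visual check of a single image.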

Download genomic data for this patient from The Cancer Genome Atlas

This patient is included in The Cancer Genome Atlas (TCGA). A variety of germline and somatic genomic data can be downloaded with the Bioconductor package TCGAbiolinks. See the TCGAbiolinks package vignettes for further details. A sample workflow for analyzing TCGA data is provided in TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages.
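TCGA sample barcodes extend the patient-level identifier with additional fields, so genomic records can be matched to an imaging patient by comparing barcode prefixes. The following is a minimal base R sketch of that matching step. The sample barcodes are invented for illustration and are not taken from any real query result.

```r
patient <- "TCGA-AR-A1AQ"

# Hypothetical sample-level barcodes, invented for illustration
sample_barcodes <- c(
  "TCGA-AR-A1AQ-01A-11R-A12P-07",
  "TCGA-AR-A1AQ-10A-01D-A12Q-09",
  "TCGA-A2-A04P-01A-31R-A034-07"
)

# The first 12 characters of a TCGA barcode identify the participant
participant_ids <- substr(sample_barcodes, 1, 12)
patient_samples <- sample_barcodes[participant_ids == patient]
patient_samples
```

In practice the barcodes would come from the genomic data downloaded for the relevant TCGA project rather than from a hand-written vector.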