Spatial Organ Atlas - Kidney
This report includes stats and diagnostic plots of the Nanostring GeoMx Digital Spatial (DSP) data.
Nanostring offers a set of R packages that enables comprehensive QC/analysis of the GeoMx DSP data and, visit the vignettes for more information. Nanostring generously provided Spatial Organ Atlas (SOA) and I downloaded the Kidney dataset to generate the following results. Lastly, the following results can also be generated by an App from Docker Hub that was built to compile a set of packages to improve reproducibility and replicability. To execute the Docker app, visit a repo on GitHub for the instructions.
WORKING_DIRECTORY
├── Annot
│ └── Annotation.xlsx
├── DCC
│ ├── DSP-*-?-???.dcc
└── PKC
└── *.pkc
# Do not run
baseDir <- "WORKING_DIRECTORY" # where the Annot/DCC/PKC/Settings folders are located
setwd(baseDir)
suppressPackageStartupMessages(library(NanoStringNCTools))
suppressPackageStartupMessages(library(GeoMxWorkflows))
suppressPackageStartupMessages(library(GeomxTools))
suppressPackageStartupMessages(library(stringr))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(scales))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(ggpubr))
suppressPackageStartupMessages(library(ggsankey))
suppressPackageStartupMessages(library(plotly))
suppressPackageStartupMessages(library(reshape2))
suppressPackageStartupMessages(library(UpSetR))
suppressPackageStartupMessages(library(grid))
suppressPackageStartupMessages(library(EnvStats))
suppressPackageStartupMessages(library(DescTools))
suppressPackageStartupMessages(library(RColorBrewer))
suppressPackageStartupMessages(library(ComplexHeatmap))
suppressPackageStartupMessages(library(circlize))
dccFiles <- dir(file.path(baseDir, "DCC"), pattern = ".dcc$", full.names = TRUE, recursive = TRUE)
pkcFile <- dir(file.path(baseDir, "PKC"), pattern = ".pkc$", full.names = TRUE, recursive = TRUE)
annotFile <- dir(file.path(baseDir, "Annot"), pattern = ".xlsx$", full.names = TRUE, recursive = TRUE)
initSet <- readNanoStringGeoMxSet(
dccFiles = dccFiles,
pkcFiles = pkcFile,
phenoDataFile = annotFile,
phenoDataSheet = "Annotation",
phenoDataDccColName = "FileName",
protocolDataColNames = c("ROI", "AOI"),
experimentDataColNames = c("Panel")
)
## Warning in readNanoStringGeoMxSet(dccFiles = dccFiles, pkcFiles = pkcFile, :
## Annotations missing for the following: DSP-1003320000078-H-A01.dcc,
## DSP-1003320000078-H-A02.dcc, DSP-1003320000078-H-A03.dcc,
## DSP-1003320000078-H-A04.dcc, DSP-1003320000078-H-A05.dcc,
## DSP-1003320000078-H-A06.dcc, DSP-1003320000078-H-A07.dcc,
## DSP-1003320000078-H-A08.dcc, DSP-1003320000078-H-A09.dcc,
## DSP-1003320000078-H-A10.dcc, DSP-1003320000078-H-A11.dcc,
## DSP-1003320000078-H-A12.dcc, DSP-1003320000078-H-B01.dcc,
## DSP-1012300040419-G-A01.dcc, DSP-1012300040610-F-A01.dcc,
## DSP-1012300040610-F-E02.dcc, DSP-1012300040610-F-E03.dcc,
## DSP-1012300040610-F-E04.dcc, DSP-1012300040610-F-E05.dcc,
## DSP-1012300040610-F-E06.dcc, DSP-1012300040610-F-E07.dcc,
## DSP-1012300040610-F-E08.dcc, DSP-1012300040610-F-E09.dcc,
## DSP-1012300040610-F-E10.dcc, DSP-1012300040610-F-E11.dcc,
## DSP-1012300040610-F-E12.dcc, DSP-1012300040610-F-F01.dcc,
## DSP-1012300040610-F-F02.dcc, DSP-1012300040610-F-F03.dcc,
## DSP-1012300040610-F-F04.dcc, DSP-1012300040610-F-F05.dcc,
## DSP-1012300040610-F-F06.dcc, DSP-1012300040610-F-F07.dcc,
## DSP-1012300040610-F-F08.dcc, DSP-1012300040610-F-F09.dcc,
## DSP-1012300040610-F-F10.dcc, DSP-1012300040610-F-F11.dcc,
## DSP-1012300040610-F-F12.dcc, DSP-1012300040610-F-G01.dcc,
## DSP-1012300040610-F-G02.dcc, DSP-1012300040610-F-G03.dcc,
## DSP-1012300040610-F-G04.dcc, DSP-1012300040610-F-G05.dcc,
## DSP-1012300040610-F-G06.dcc, DSP-1012300040610-F-G07.dcc,
## DSP-1012300040610-F-G08.dcc, DSP-1012300040610-F-G09.dcc,
## DSP-1012300040610-F-G10.dcc, DSP-1012300040610-F-G11.dcc,
## DSP-1012300040610-F-G12.dcc, DSP-1012300040610-F-H01.dcc,
## DSP-1012305110210-G-D01.dcc, DSP-1012305110210-G-D02.dcc,
## DSP-1012305110210-G-D03.dcc, DSP-1012305110210-G-D04.dcc,
## DSP-1012305110210-G-D05.dcc, DSP-1012305110210-G-D06.dcc,
## DSP-1012305110210-G-D07.dcc, DSP-1012305110210-G-D08.dcc,
## DSP-1012305110210-G-D09.dcc, DSP-1012305110210-G-D10.dcc,
## DSP-1012305110210-G-D11.dcc, DSP-1012305110210-G-D12.dcc,
## DSP-1012305110210-G-E01.dcc, DSP-1012305110210-G-E02.dcc,
## DSP-1012305110210-G-E03.dcc, DSP-1012305110210-G-E04.dcc,
## DSP-1012305110210-G-E05.dcc, DSP-1012305110210-G-E06.dcc,
## DSP-1012305110210-G-E07.dcc, DSP-1012305110210-G-E08.dcc,
## DSP-1012305110210-G-E09.dcc, DSP-1012305110210-G-E10.dcc,
## DSP-1012305110210-G-E11.dcc, DSP-1012305110210-G-E12.dcc,
## DSP-1012305110210-G-F01.dcc, DSP-1012305110210-G-F02.dcc,
## DSP-1012305110210-G-F03.dcc, DSP-1012305110210-G-F04.dcc,
## DSP-1012305110210-G-F05.dcc, DSP-1012305110210-G-F06.dcc,
## DSP-1012305110210-G-F07.dcc, DSP-1012305110210-G-F08.dcc,
## DSP-1012305110210-G-F09.dcc, DSP-1012305110210-G-F10.dcc,
## DSP-1012305110210-G-F11.dcc, DSP-1012305110210-G-F12.dcc,
## DSP-1012305110210-G-G01.dcc, DSP-1012305110210-G-G02.dcc,
## DSP-1012305110210-G-G03.dcc, DSP-1012305110210-G-G04.dcc,
## DSP-1012305110210-G-G05.dcc, DSP-1012305110210-G-G06.dcc,
## DSP-1012305110210-G-G07.dcc, DSP-1012305110210-G-G08.dcc,
## DSP-1012305110210-G-G09.dcc, DSP-1012305110210-G-G10.dcc,
## DSP-1012305110210-G-G11.dcc, DSP-1012305110210-G-G12.dcc,
## DSP-1012305110210-G-H01.dcc, DSP-1012305110210-G-H02.dcc,
## DSP-1012305110210-G-H03.dcc, DSP-1012310040419-H-A01.dcc, These will be
## excluded from the GeoMxSet object.
## Warning in readNanoStringGeoMxSet(dccFiles = dccFiles, pkcFiles = pkcFile, : Not all probes are found within PKC probe metadata. The following probes are ignored from analysis and were most likely removed from metadata while resolving multiple module PKC version conflicts.
## RTS0021749RTS0021916RTS0021976RTS0021979RTS0022732RTS0027943RTS0031708RTS0036612RTS0036631RTS0050279RTS0050433RTS0050547RTS0050580RTS0051214RTS0051219RTS0051232RTS0051449RTS0052009RTS0029660
dim(initSet)
## Features Samples
## 18815 234
head(assayData(initSet)$exprs[, c(1:2)])
## DSP-1003320000078-H-B02.dcc DSP-1003320000078-H-B03.dcc
## RTS0020877 81 177
## RTS0020878 0 6
## RTS0020879 3 8
## RTS0020880 4 16
## RTS0020881 5 13
## RTS0020882 7 10
gSet <- shiftCountsOne(initSet, useDALogic = TRUE)
head(assayData(gSet)$exprs[, c(1:2)])
## DSP-1003320000078-H-B02.dcc DSP-1003320000078-H-B03.dcc
## RTS0020877 81 177
## RTS0020878 1 6
## RTS0020879 3 8
## RTS0020880 4 16
## RTS0020881 5 13
## RTS0020882 7 10
module <- gsub(".pkc", "", annotation(gSet))
gSet <- setSegmentQCFlags(gSet, qcCutoffs = segmentQcParams)
qcMat <- sData(gSet)
qcMat$Segment <- factor(qcMat$Segment, levels = segmentOrder)
suppressWarnings(histQC(qcMat, "Trimmed (%)", segmentQC_colBy, segmentQC_rowBy, segmentQC_trimmedThre, cols = segmentCols))
suppressWarnings(histQC(qcMat, "Stitched (%)", segmentQC_colBy, segmentQC_rowBy, segmentQC_stitchedThre, cols = segmentCols))
suppressWarnings(histQC(qcMat, "Aligned (%)", segmentQC_colBy, segmentQC_rowBy, segmentQC_alignedThre, cols = segmentCols))
suppressWarnings(histQC(qcMat, "Saturated (%)", segmentQC_colBy, segmentQC_rowBy, segmentQC_saturatedThre, cols = segmentCols))
suppressWarnings(histQC(qcMat, "area", segmentQC_colBy, segmentQC_rowBy, segmentQC_areaThre, "log10", "AOI Area (log10)", cols = segmentCols))
suppressWarnings(histQC(qcMat, "nuclei", segmentQC_colBy, segmentQC_rowBy, segmentQC_nucleiThre, "log10", "AOI nuclei count", cols = segmentCols))
segmentQcResults <- protocolData(gSet)[["QCFlags"]]
flagColumns <- colnames(segmentQcResults)
qcSummary <- data.frame(Pass = colSums(!segmentQcResults[, flagColumns]), Warning = colSums(segmentQcResults[, flagColumns]))
segmentQcResults$QCStatus <- apply(segmentQcResults, 1L, function(x) {
ifelse(sum(x) == 0L, "PASS", "WARNING")
})
qcSummary["TOTAL FLAGS", ] <- c(sum(segmentQcResults[, "QCStatus"] == "PASS"), sum(segmentQcResults[, "QCStatus"] == "WARNING"))
qcSummary[, "TOTAL"] <- apply(qcSummary, 1, sum)
qcSummary
## Pass Warning TOTAL
## LowReads 234 0 234
## LowTrimmed 233 1 234
## LowStitched 233 1 234
## LowAligned 233 1 234
## LowSaturation 234 0 234
## LowNegatives 234 0 234
## LowNuclei 234 0 234
## LowArea 234 0 234
## TOTAL FLAGS 233 1 234
gSet <- gSet[, segmentQcResults$QCStatus == "PASS"]
dim(gSet)
## Features Samples
## 18815 233
negCol <- "NegGeoMean"
negativeGeoMeans <- esBy(
negativeControlSubset(gSet),
GROUP = "Module",
FUN = function(x) {
assayDataApply(x, MARGIN = 2, FUN = ngeoMean, elt = "exprs")
}
)
protocolData(gSet)[[negCol]] <- negativeGeoMeans
pData(gSet)[, negCol] <- sData(gSet)[[negCol]]
backgrounMat <- sData(gSet)
backgrounMat <- backgrounMat[, c("NegGeoMean", segmentQC_colBy, segmentQC_rowBy)]
backgrounMat$Segment <- factor(backgrounMat$Segment, levels = segmentOrder)
suppressWarnings(histQC(backgrounMat, "NegGeoMean", segmentQC_colBy, segmentQC_rowBy, 2, "log10", "GeoMean(negative probes)", cols = segmentCols))
gSet <- setBioProbeQCFlags(gSet, qcCutoffs = probeQcParams, removeLocalOutliers = TRUE)
probeQcResults <- fData(gSet)[["QCFlags"]]
qcDf <- data.frame(
Passed = sum(rowSums(probeQcResults[, -1]) == 0),
Global = sum(probeQcResults$GlobalGrubbsOutlier),
Local = sum(rowSums(probeQcResults[, -2:-1]) > 0 & !probeQcResults$GlobalGrubbsOutlier),
TOTAL = nrow(probeQcResults)
)
qcDf
## Passed Global Local TOTAL
## 1 18790 1 24 18815
probeQcPassed <- subset(
gSet,
fData(gSet)[["QCFlags"]][, c("LowProbeRatio")] == FALSE &
fData(gSet)[["QCFlags"]][, c("GlobalGrubbsOutlier")] == FALSE
)
gSet <- probeQcPassed
dim(gSet)
## Features Samples
## 18814 233
newSet <- aggregateCounts(gSet)
newSet <- subset(newSet, fData(newSet)$TargetName != "NegProbe-WTX")
dim(newSet)
## Features Samples
## 18676 233
lowLvqcDf <- data.frame(
Slide = pData(newSet)$Slide,
Sample = pData(newSet)$Library,
Segment = pData(newSet)$Segment,
Area = pData(newSet)$area,
Nuclei = pData(newSet)$nuclei
)
lowLvqcDf$Slide <- factor(lowLvqcDf$Slide, levels=slideOrder)
lowLvqcDf$Segment <- factor(lowLvqcDf$Segment, levels=segmentOrder)
dodge <- position_dodge(width = 0.5)
suppressWarnings({
ggplot(data = lowLvqcDf, aes(x = Segment, y = Area, fill = Segment)) +
geom_violin(position = dodge, size = 0) +
geom_boxplot(width = 0.1, position = dodge, fill="white") +
scale_fill_manual(values = segmentCols) +
facet_grid( ~ Slide) +
labs(
title = "",
x = "",
y = "Area"
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
legend.position = "none",
text = element_text(size = 12)
) +
scale_y_continuous(trans = "log10")
})
suppressWarnings({
ggplot(data = lowLvqcDf, aes(x = Segment, y = Nuclei, fill = Segment)) +
geom_violin(position = dodge, size = 0) +
geom_boxplot(width = 0.1, position = dodge, fill="white") +
scale_fill_manual(values = segmentCols) +
facet_grid( ~ Slide) +
labs(
title = "",
x = "",
y = "#Nuclei"
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
legend.position = "none",
text = element_text(size = 12)
) +
scale_y_continuous(trans = "log10")
})
suppressWarnings({
scatterPlot <- ggplot(data = lowLvqcDf, aes(x = Area, y = Nuclei, color = Segment, label = Sample, label2 = Slide)) +
geom_point() +
scale_color_manual(values = segmentCols) +
labs(
title = "",
x = "Area",
y = "#Nuclei"
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
text = element_text(size = 12)
) +
scale_x_continuous(trans = "log10") +
scale_y_continuous(trans = "log10")
})
ggplotly(scatterPlot)
loqDf <- data.frame(row.names = colnames(newSet))
varNames <- paste0(c("NegGeoMean_", "NegGeoSD_"), module)
if (all(varNames[1:2] %in% colnames(pData(newSet)))) {
loqDf[, module] <- pData(newSet)[, varNames[1]] * (pData(newSet)[, varNames[2]]^loqCutoff)
}
statDf <- data.frame(
Slide = pData(newSet)$Slide,
Sample = pData(newSet)$Sample,
Library = protocolData(newSet)$AOI,
Segment = pData(newSet)$Segment,
LOQ = loqDf[, 1]
)
statShort <- dcast(statDf, Sample ~ Segment, fun.aggregate = mean, value.var = "LOQ")
perSample <- statShort[, c(2:ncol(statShort))]
rownames(perSample) <- as.character(statShort[, 1])
statShort <- data.frame(
Sample = rownames(perSample),
LOQmean = as.matrix(apply(perSample, 1, mean, na.rm = TRUE))
)
statShort <- statShort[order(statShort$LOQmean), ]
statDf$Sample <- factor(statDf$Sample, levels = statShort$Sample)
statDf$Segment <- factor(statDf$Segment, levels = segmentOrder)
dodge <- position_dodge(width = 0.5)
suppressWarnings({
ggplot(data = statDf, aes(x = Segment, y = log10(LOQ), fill = Segment)) +
geom_violin(position = dodge, size = 0) +
geom_boxplot(width = 0.1, position = dodge, fill = "white") +
scale_fill_manual(values = segmentCols) +
labs(
title = "",
x = "Segment",
y = "LOQ, log10"
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
legend.position = "none",
text = element_text(size = 12)
) +
geom_hline(aes(yintercept = log10(loqMin)), lty = 2, col = "grey50")
})
suppressWarnings({
ggplot(data = statDf, aes(x = Segment, y = log10(LOQ), fill = Segment)) +
geom_violin(position = dodge, size = 0) +
geom_boxplot(width = 0.1, position = dodge, fill = "white") +
scale_fill_manual(values = segmentCols) +
facet_grid(~Slide) +
labs(
title = "",
x = "Segment",
y = "LOQ, log10"
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 90, vjust = 0, hjust = 0),
legend.position = "none",
text = element_text(size = 12)
) +
geom_hline(aes(yintercept = log10(loqMin)), lty = 2, col = "grey50")
})
suppressWarnings({
ggplot(data = statDf, aes(x = Sample, y = log10(LOQ), group = Segment)) +
geom_line(aes(color = Segment), lwd = 0.5) +
geom_point(aes(color = Segment)) +
scale_color_manual(values = segmentCols) +
labs(
title = "",
x = "",
y = "LOQ, log10"
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 90, vjust = 0, hjust = 0),
# legend.position = "none",
text = element_text(size = 12)
) +
scale_x_discrete(guide = guide_axis(check.overlap = TRUE)) +
geom_hline(aes(yintercept = log10(loqMin)), lty = 2, col = "grey50")
})
loqDf[loqDf < loqMin] <- loqMin
pData(newSet)$LOQ <- loqDf
loqMat <- t(esApply(newSet, MARGIN = 1, FUN = function(x) {
x > LOQ[, module]
}))
loqMat <- loqMat[fData(newSet)$TargetName, ] # Ordering
pData(newSet)$GenesDetected <- colSums(loqMat, na.rm = TRUE)
pData(newSet)$GeneDetectionRate <- pData(newSet)$GenesDetected / nrow(newSet)
pData(newSet)$DetectionThreshold <- cut(pData(newSet)$GeneDetectionRate, breaks = geneDetectionRateBins, labels = geneDetectionRateBinLabels)
rateMat <- pData(newSet)
rateMat$Segment <- factor(rateMat$Segment, levels = segmentOrder)
suppressWarnings({
ggplot(rateMat, aes(x = DetectionThreshold)) +
geom_bar(aes(fill = Segment)) +
geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.5) +
scale_fill_manual(values = segmentCols) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.background = element_blank(),
axis.text.x = element_text(angle = 90, vjust = 0, hjust = 0),
text = element_text(size = 12)
) +
scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
labs(
x = "Gene Detection Rate",
y = "Number of Segments/Samples",
fill = "Segment"
) +
facet_grid(as.formula(paste("~", segmentQC_rowBy)))
})
statDf <- data.frame(
Sample = protocolData(newSet)$AOI,
Segment = pData(newSet)$Segment,
GenesDetected = colSums(loqMat, na.rm = TRUE),
Nuclei = pData(newSet)$nuclei
)
statDf$Segment <- factor(statDf$Segment, segmentOrder)
for (segment in segmentOrder) {
tmpDf <- statDf[which(statDf$Segment == segment), ]
tmpDf <- tmpDf[order(tmpDf$GenesDetected, decreasing = F), ]
tmpDf$Sample <- factor(tmpDf$Sample, levels = tmpDf$Sample)
barplot <- ggplot(tmpDf, aes(x = Sample, y = GenesDetected, fill = Segment)) +
geom_bar(stat = "identity") +
scale_fill_manual(values = segmentCols[segment]) +
theme_minimal() +
labs(
title = "",
x = "Sample"
) +
coord_flip() +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0),
legend.position = "none"
) +
scale_x_discrete(guide = guide_axis(check.overlap = TRUE))
print(barplot)
}
newSet <- newSet[, pData(newSet)$GeneDetectionRate >= geneDetectionRateThre]
dim(newSet)
## Features Samples
## 18676 233
loqMat <- loqMat[, colnames(newSet)]
fData(newSet)$DetectedSegments <- rowSums(loqMat, na.rm = TRUE)
fData(newSet)$DetectionRate <- fData(newSet)$DetectedSegments / nrow(pData(newSet))
geneDetectionRateDf <- data.frame(
Gene = rownames(newSet),
Number = fData(newSet)$DetectedSegments,
DetectionRate = percent(fData(newSet)$DetectionRate)
)
geneDetectionRateDf <- geneDetectionRateDf[order(geneDetectionRateDf$Number, geneDetectionRateDf$DetectionRate, geneDetectionRateDf$Gene), ]
finalSet <- newSet[fData(newSet)$DetectionRate >= geneDetectionRateThre, ]
dim(finalSet)
## Features Samples
## 11225 233
finalSet <- normalize(finalSet, norm_method = "quant", desiredQuantile = .75, toElt = "q_norm")
annot <- pData(finalSet)
segmentIdx <- c()
for (segment in segmentOrder) segmentIdx <- c(segmentIdx, which(annot$Segment == segment))
rawMat <- assayData(finalSet)$exprs
colnames(rawMat) <- annot$Library
rawDf <- melt(rawMat)
colnames(rawDf) <- c("Gene", "Sample", "Expression")
rawDf$Segment <- rep(annot$Segment, each=nrow(rawMat))
rawDf$Segment <- factor(rawDf$Segment, levels = segmentOrder)
normMat <- assayData(finalSet)$q_norm
colnames(normMat) <- annot$Library
normDf <- melt(normMat)
colnames(normDf) <- c("Gene", "Sample", "Expression")
normDf$Segment <- rep(annot$Segment, each=nrow(rawMat))
normDf$Segment <- factor(normDf$Segment, levels = segmentOrder)
boxplot(log10(Expression) ~ Segment, data = rawDf, col = segmentCols, pch = 20, ylab = "Raw count, log10", xlab = "Segment")
rawMat <- rawMat[, segmentIdx]
boxplot(log10(rawMat), col = segmentCols[annot$Segment[segmentIdx]], pch = 20, names = NA, xlab = "Sample", ylab = "Raw count, log10")
boxplot(log10(Expression) ~ Segment, data = normDf, col = segmentCols, pch = 20, ylab = "Upper-quartile norm, log10", xlab = "Segment")
normMat <- normMat[, segmentIdx]
boxplot(log10(normMat), col = segmentCols[annot$Segment[segmentIdx]], pch = 20, names = NA, xlab = "Sample", ylab = "Upper-quartile norm, log10")
normMat <- assayData(finalSet)$q_norm
pcaRes <- prcomp(t(normMat), scale = TRUE)
pcaDf <- data.frame(
Sample = annot$Library,
X = pcaRes$x[, 1],
Y = pcaRes$x[, 2],
Slide = annot$Slide,
Segment = annot$Segment,
Region = annot$Region
)
pcaDf$Slide <- factor(pcaDf$Slide, levels = slideOrder)
pcaDf$Segment <- factor(pcaDf$Segment, levels = segmentOrder)
pcaDf$Region <- factor(pcaDf$Region)
suppressWarnings({
pcaPlot <- ggplot(data = pcaDf, aes(x = X, y = Y, color = Slide, label = Sample)) +
geom_point(size = 2, shape = 20) +
scale_color_manual(values = slideCols) +
labs(
x = paste0("PC1 (", round(summary(pcaRes)$importance[2, c(1)] * 100, 1), "%)"),
y = paste0("PC2 (", round(summary(pcaRes)$importance[2, c(2)] * 100, 1), "%)")
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
text = element_text(size = 12)
)
})
ggplotly(pcaPlot)
suppressWarnings({
pcaPlot <- ggplot(data = pcaDf, aes(x = X, y = Y, color = Segment, label = Sample)) +
geom_point(size = 2, shape = 20) +
scale_color_manual(values = segmentCols) +
labs(
x = paste0("PC1 (", round(summary(pcaRes)$importance[2, c(1)] * 100, 1), "%)"),
y = paste0("PC2 (", round(summary(pcaRes)$importance[2, c(2)] * 100, 1), "%)")
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
text = element_text(size = 12)
)
})
ggplotly(pcaPlot)
suppressWarnings({
pcaPlot <- ggplot(data = pcaDf, aes(x = X, y = Y, color = Region, label = Sample)) +
geom_point(size = 2, shape = 20) +
scale_color_manual(values = regionCols) +
labs(
x = paste0("PC1 (", round(summary(pcaRes)$importance[2, c(1)] * 100, 1), "%)"),
y = paste0("PC2 (", round(summary(pcaRes)$importance[2, c(2)] * 100, 1), "%)")
) +
theme_bw() +
theme(
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
text = element_text(size = 12)
)
})
ggplotly(pcaPlot)
Dataset | #Samples | #Features |
---|---|---|
Initial | 234 | 18815 |
After QC (analysis-ready) | 233 | 11225 |
countDf <- pData(finalSet) %>% make_long(Slide, Segment, Region)
ggplot(countDf, aes(x = x, next_x = next_x, node = node, next_node = next_node, fill = factor(node), label = node)) +
geom_sankey(flow.alpha = .6, node.color = "gray30") +
geom_sankey_label(size = 3, color = "black", fill = "white") +
scale_fill_viridis_d(option = "A", alpha = 0.95) +
theme_sankey(base_size = 18) +
labs(
x = NULL,
y = NULL
) +
theme_bw() +
theme(
axis.line = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
axis.ticks.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
legend.position = "none",
text = element_text(size = 12)
)
Stats in the reference datasets, i.e., Nanostring Spatial Organ Atals (SOA)
Step | Filter name | Threshold | Description |
---|---|---|---|
SegmentQC | minSegmentReads | 1000 | Minimum number of reads |
percentTrimmed | 80 | Minimum % of reads trimmed | |
percentStitched | 80 | Minimum % of reads stitched | |
percentAligned | 75 | Minimum % of reads aligned | |
percentSaturation | 50 | Minimum sequencing saturation (%) | |
minNuclei | 20 | Minimum number of nuclei estimated | |
minArea | 1000 | Minimum segment area | |
ProbeQC | minProbeRatio | 0.1 | Geometric mean of a given probe / geometric mean of all probe |
percentFailGrubbs | 20 | An outlier according to the Grubb’s test (%) | |
LOQ | loqCutoff | 2 | LOQ cut off value |
loqMin | 2 | LOQ minimum value | |
GeneQC | geneDetectionRateThre | 0.05 | Minimum gene detection rate |