Admera Health
126 Corporate Boulevard, South Plainfield, NJ 07080
+1-908-222-0533 · custom-services@admerahealth.com · www.admerahealth.com
CLIA ID: 31D2038676

Patient-derived xenograft (PDX) Pipeline QC Report — Demo Project

PDX tumour · scRNA-seq · 3 samples · Treatment1 / Treatment2 / Control

Project ID: Demo-Project
Species: Homo sapiens (GRCh38); mouse reference (GRCm39)
Samples: 3
Total cells: 13,694
Pipeline: scRNA-seq PDX
Report Date: 2026-03-24

Notice: for examples of downstream analysis following cell clustering, please refer to the GEX demo report.

Step 01 — PDX disambiguation

Separate human cells from mouse cells in the barnyard output

✓ Complete
Script: in-house customized script
Matrix used: raw_feature_bc_matrix
Total cells kept: 21,808 (pure human barcodes)
Avg human purity: 54.0% (across 3 samples)
Why raw matrix — not filtered

Cell Ranger's filtering is species-blind: it keeps high-UMI barcodes regardless of whether they are human or mouse. Running on the raw matrix lets our own species-aware filtering happen first. Starting from the filtered matrix would mean inheriting Cell Ranger's species-blind decisions.

How disambiguation works

For every barcode, UMIs mapping to human (GRCh38) genes and to mouse (GRCm39) genes are counted separately. Barcodes with ≥95% human UMIs are kept as pure human. Barcodes with ≤5% human UMIs are classed as mouse, those between 5% and 95% as multiplets, and those with 0 UMIs as empty droplets.
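
The thresholds above can be sketched as a small classifier. This is only an illustration, not the in-house script: the function name, the toy barcodes, and their UMI counts are invented, and integer arithmetic is used so that barcodes sitting exactly on the 95% or 5% boundary are handled without floating-point rounding.

```python
from collections import Counter


def classify_barcode(human_umis: int, mouse_umis: int) -> str:
    """Assign one barcode to a species class from its per-species UMI counts."""
    total = human_umis + mouse_umis
    if total == 0:
        return "empty"                      # 0 UMI = empty droplet
    if 100 * human_umis >= 95 * total:
        return "human"                      # >= 95% human UMIs
    if 100 * human_umis <= 5 * total:
        return "mouse"                      # <= 5% human UMIs
    return "multiplet"                      # anything in between


# Toy (human_umis, mouse_umis) pairs, invented for illustration only.
toy_counts = {
    "AAAC": (980, 20),
    "AAAG": (10, 990),
    "AAAT": (400, 600),
    "AACA": (0, 0),
    "AACC": (95, 5),
}
labels = {bc: classify_barcode(h, m) for bc, (h, m) in toy_counts.items()}
summary = Counter(labels.values())
print(labels)
print(dict(summary))
```

Only barcodes labelled "human" would go forward to the later steps.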

Verified results
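| Sample | Total barcodes | Pure human | Mouse | Multiplet | Empty | Human % | Median MT% |
|---|---|---|---|---|---|---|---|
| Treatment1 | 1,847,320 | 1,042,318 | 171,204 | 312,441 | 321,357 | 56.4% | 2.82% |
| Treatment2 | 1,621,903 | 676,633 | 352,814 | 310,109 | 282,347 | 41.7% | 0.52% |
| Control | 1,762,418 | 1,127,948 | 128,256 | 237,626 | 268,588 | 64.0% | 2.94% |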
Treatment2 had the lowest human purity of the three samples (41.7%). This is biologically meaningful rather than a technical problem: Treatment2 (proteasome inhibitor) preferentially kills human tumour cells while mouse stromal cells survive better, which reduces the human fraction.
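
The human purity figures can be reproduced from the barcode counts in the Step 01 results. The sketch below assumes that Human % is pure human barcodes divided by total barcodes, and that the 54.0% average is the unweighted mean of the three per-sample percentages. Both assumptions match the reported values but are not stated explicitly in the report.

```python
# Counts copied from the Step 01 results: (total barcodes, pure human barcodes).
step01 = {
    "Treatment1": (1_847_320, 1_042_318),
    "Treatment2": (1_621_903, 676_633),
    "Control": (1_762_418, 1_127_948),
}

# Per-sample human purity in percent, rounded to one decimal place.
human_pct = {
    sample: round(100 * human / total, 1)
    for sample, (total, human) in step01.items()
}

# Assumed definition of the average: unweighted mean of the per-sample values.
avg_purity = round(sum(human_pct.values()) / len(human_pct), 1)

print(human_pct)
print(avg_purity)
```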
Why MT% is calculated here

MT% is calculated from the Cell Ranger filtered matrix at this stage, while the raw matrix is already loaded. These Cell Ranger MT% values are the ones used for filtering in Step 04. The post-DecontX MT% values are not used, because ambient RNA removal alters them mathematically.
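
As a minimal sketch, MT% for a single cell is the share of its counts that come from mitochondrial genes. The `GRCh38_MT-` prefix is an assumption based on the `GRCh38_` gene prefix mentioned in Step 04 and the usual `MT-` naming of human mitochondrial genes. The gene counts are invented for illustration.

```python
MT_PREFIX = "GRCh38_MT-"  # assumed naming of human mitochondrial genes


def mt_percent(gene_counts: dict[str, int]) -> float:
    """Percentage of a cell's counts that come from mitochondrial genes."""
    total = sum(gene_counts.values())
    if total == 0:
        return 0.0  # no counts at all: report 0 rather than divide by zero
    mt = sum(n for gene, n in gene_counts.items() if gene.startswith(MT_PREFIX))
    return 100.0 * mt / total


# Toy cell, invented for illustration only.
toy_cell = {"GRCh38_MT-CO1": 30, "GRCh38_ACTB": 920, "GRCh38_GAPDH": 50}
toy_mt = mt_percent(toy_cell)
print(toy_mt)
```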

QC metrics — pre-DecontX
Outputs
{sample}_pure_human_barcodes.txt
{sample}_barnyard_summary.csv
{sample}_MT_pct_cellranger.csv
plots/{sample}_step01_barnyard_MT.pdf
step01_summary.csv

Step 02 — DecontX ambient RNA removal

Remove RNA that leaked from lysed cells and contaminated other droplets

✓ Complete
Package: celda 1.22.0
Matrix used: filtered ∩ human
Cells processed: 21,808 (across 3 samples)
Contam threshold: > 0.5 (filtered in Step 04)
DecontX vs MT% — why both are needed
MT% and DecontX solve different problems: MT% decides which cells to keep, and DecontX corrects what those cells appear to express. Both are needed.
MT% filtering: removes low-quality cells. Answers: "is this a good cell worth keeping?"
DecontX correction: fixes gene expression counts inside kept cells. Answers: "are this cell's counts accurate?"
Verified results
| Sample | Cells processed | Median contamination | Cells > 0.5 contamination |
|---|---|---|---|
| Treatment1 | 10,791 | 0.296 (elevated) | 3,378 (31.3%) |
| Treatment2 | 2,474 | 0.211 (normal) | 143 (5.8%) |
| Control | 8,543 | 0.248 (normal) | 1,254 (14.7%) |
Decision: filter cells with contamination score > 0.5 in Step 04. DecontX has already corrected counts for cells below this threshold.
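
A sketch of how the 0.5 cut-off could be applied to per-cell contamination scores, together with a check of the flagged percentages in the results above. The helper name and toy scores are hypothetical. A score of exactly 0.5 is kept here because the decision is worded as "> 0.5".

```python
CONTAM_MAX = 0.5  # cells with a score above this are flagged for removal


def flag_contaminated(scores: dict[str, float], threshold: float = CONTAM_MAX) -> list[str]:
    """Return barcodes whose contamination score exceeds the threshold."""
    return [barcode for barcode, score in scores.items() if score > threshold]


# Toy scores, invented for illustration only.
toy_scores = {"AAAC": 0.12, "AAAG": 0.50, "AAAT": 0.73}
flagged = flag_contaminated(toy_scores)
print(flagged)

# Reported counts from Step 02: (cells processed, cells above 0.5).
step02 = {
    "Treatment1": (10_791, 3_378),
    "Treatment2": (2_474, 143),
    "Control": (8_543, 1_254),
}
flagged_pct = {s: round(100 * hi / n, 1) for s, (n, hi) in step02.items()}
print(flagged_pct)
```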
Contamination distribution
Outputs
{sample}/decontx_corrected_matrix/
{sample}/decontx_contamination.csv

Step 03 — DropletQC nuclear fraction filter

Remove damaged cells using the ratio of intronic to total reads

✓ Complete
Package: DropletQC 0.0.0.9000
NF threshold: > 0.75 (= damaged cell)
Total removed: 39 (0.18% of all cells)
Verified results
| Sample | Total cells | Healthy | Damaged | Excluded |
|---|---|---|---|---|
| Treatment1 | 10,791 | 10,777 (99.9%) | 14 (0.1%) | 14 |
| Treatment2 | 2,474 | 2,468 (99.8%) | 6 (0.2%) | 6 |
| Control | 8,543 | 8,502 (99.8%) | 19 (0.2%) | 19 |
Total removed: 39 cells (0.18% of all cells). The number is small but genuine: left in, these cells could have created spurious states in clustering.
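
The threshold rule can be sketched as follows. This does not call the DropletQC package. It assumes "total reads" means intronic plus exonic reads, and the helper names and read counts are invented. The last lines check the reported 0.18% against the 21,808 cells processed in Step 02.

```python
from typing import Optional

NF_MAX = 0.75  # nuclear fraction above this marks a cell as damaged


def nuclear_fraction(intronic_reads: int, exonic_reads: int) -> Optional[float]:
    """Fraction of a cell's reads that are intronic (assumed: intronic + exonic = total)."""
    total = intronic_reads + exonic_reads
    if total == 0:
        return None  # undefined without reads
    return intronic_reads / total


def is_damaged(intronic_reads: int, exonic_reads: int) -> bool:
    nf = nuclear_fraction(intronic_reads, exonic_reads)
    return nf is not None and nf > NF_MAX


# Toy read counts, invented for illustration only.
print(is_damaged(200, 800))   # nuclear fraction 0.20
print(is_damaged(900, 100))   # nuclear fraction 0.90

# Reported totals: 14 + 6 + 19 cells removed out of 21,808 processed.
removed = 14 + 6 + 19
removed_pct = round(100 * removed / 21_808, 2)
print(removed, removed_pct)
```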
Nuclear fraction distribution
Outputs
{sample}_nuclear_fraction.csv
{sample}_dropletqc_exclude.txt
{sample}_dropletqc_keep.txt
plots/{sample}_step03_nuclear_fraction.pdf

Step 04 — Seurat QC cell filtering

Apply all QC metrics together, remove mouse genes, build clean Seurat objects

✓ Complete
Package: Seurat 5.4.0
Cells before: 21,747 (into Step 04)
Cells after: 13,694 (passed all filters)
Retention rate: 63.0% (across all samples)
All filters applied
| Filter | Threshold | Source | Rationale |
|---|---|---|---|
| Human genes only | GRCh38_ prefix | DecontX matrix | Remove all 33,696 mouse genes permanently |
| MT% | < 10% | Cell Ranger (Step 01) | Author requirement |
| DecontX contamination | < 0.5 | Step 02 | Remove cells where majority of counts are ambient RNA |
| nFeature_RNA | 200 – 7,000 | Author requirement | Remove empty droplets (<200) and likely doublets (>7,000) |
| nCount_RNA | 500 – 50,000 | Standard | Remove debris and multiplets |
| DropletQC | Exclude flagged | Step 03 | Remove physically damaged cells |
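
Taken together, the filters amount to one pass/fail test per cell plus a gene-name filter. The sketch below is a plain-Python illustration, not the Seurat code used in the pipeline. The `CellQC` record and function names are hypothetical, the range bounds are assumed to be inclusive, and the `GRCm39_` prefix in the toy gene list is an assumed naming for mouse genes.

```python
from dataclasses import dataclass

HUMAN_PREFIX = "GRCh38_"


@dataclass
class CellQC:
    barcode: str
    mt_pct: float            # Cell Ranger MT% from Step 01
    contamination: float     # DecontX score from Step 02
    n_feature: int           # nFeature_RNA
    n_count: int             # nCount_RNA
    dropletqc_flagged: bool  # True if excluded in Step 03


def passes_qc(cell: CellQC) -> bool:
    """Apply every cell-level threshold; a cell must pass all of them."""
    return (
        cell.mt_pct < 10.0
        and cell.contamination < 0.5          # exactly 0.5 fails here (assumption)
        and 200 <= cell.n_feature <= 7_000    # bounds assumed inclusive
        and 500 <= cell.n_count <= 50_000     # bounds assumed inclusive
        and not cell.dropletqc_flagged
    )


def human_genes(gene_names: list[str]) -> list[str]:
    """Keep only genes carrying the human reference prefix."""
    return [g for g in gene_names if g.startswith(HUMAN_PREFIX)]


# Toy cells and genes, invented for illustration only.
good = CellQC("AAAC", mt_pct=2.8, contamination=0.21, n_feature=3_000,
              n_count=9_000, dropletqc_flagged=False)
high_mt = CellQC("AAAG", mt_pct=14.0, contamination=0.10, n_feature=2_500,
                 n_count=7_000, dropletqc_flagged=False)
print(passes_qc(good), passes_qc(high_mt))
print(human_genes(["GRCh38_TP53", "GRCm39_Trp53"]))
```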
QC metrics — post-filter
Verified cells after all filters
| Sample | Into Step 04 | After all filters | % kept |
|---|---|---|---|
| Treatment1 | 10,777 | 5,184 | 48.1% |
| Treatment2 | 2,468 | 2,001 | 81.1% |
| Control | 8,502 | 6,509 | 76.6% |
| Total | 21,747 | 13,694 | 63.0% |
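
The retention figures follow directly from the cell counts. This short check assumes that "% kept" is cells after all filters divided by cells entering Step 04, rounded to one decimal place.

```python
# Counts copied from the Step 04 results: (cells into Step 04, cells after all filters).
step04 = {
    "Treatment1": (10_777, 5_184),
    "Treatment2": (2_468, 2_001),
    "Control": (8_502, 6_509),
}

kept_pct = {s: round(100 * after / before, 1) for s, (before, after) in step04.items()}

total_before = sum(before for before, _ in step04.values())
total_after = sum(after for _, after in step04.values())
overall_pct = round(100 * total_after / total_before, 1)

print(kept_pct)
print(total_before, total_after, overall_pct)
```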
Outputs
{sample}_seurat_clean.rds
{sample}_final_barcode_summary.csv
plots/step04_postQC_violin.pdf

Step 05 — Harmony integration

Merge all 3 samples and correct batch effects while preserving biology

✓ Complete
Method: Harmony
PCs used: 50 (elbow at ~PC 20)
Variable genes: 3,000 (used for PCA)
Theta: 2 (diversity penalty)
Parameters used
| Parameter | Value | Meaning |
|---|---|---|
| PCA dimensions | 50 | Number of PCs computed; elbow at ~PC 20 |
| Variable genes | 3,000 | Most variable genes used for PCA |
| Harmony theta | 2 | Diversity penalty; controls strength of sample mixing |
| Harmony max iterations | 10 | Convergence limit |
| Group variable | sample | Correct batch effects by treatment sample |
Harmony successfully removed sample-level technical variation while preserving biological differences. Cells now cluster by cell type rather than by which sample they came from.
UMAP — before vs after Harmony
Verified results
| Sample | Cells in integrated object |
|---|---|
| Treatment1 | 5,184 |
| Treatment2 | 2,001 |
| Control | 6,509 |
| Total | 13,694 |
Outputs
integrated_harmony.rds
plots/step05_elbow_plot.pdf
plots/step05_umap_before_after.pdf
step05_summary.csv

Step 06 — Clustering + UMAP

Group cells by transcriptomic similarity and visualise in 2D

✓ Complete
Algorithm: Leiden (alg 4)
Resolution: 0.8 (11 clusters)
Dims used: 1:20 (elbow cutoff)
Total cells: 13,694 (in final object)
Why this step exists

Clustering groups transcriptomically similar cells into cell types or states. UMAP reduces the high-dimensional space to 2D for visualisation, preserving local neighbourhood structure. Multiple resolutions were tested (0.2, 0.4, 0.6, 0.8, 1.0); resolution 0.8 was chosen because it gave 11 well-balanced clusters.

Resolution 0.8 selected — 11 clusters with 800–2,100 cells each. Well-balanced and biologically interpretable.
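
One way to compare candidate resolutions is to summarise the cluster sizes each one produces. The sketch below assumes that "well-balanced" refers to the spread between the smallest and largest cluster; the report gives only the size range. The helper name and toy labels are invented for illustration.

```python
from collections import Counter


def cluster_size_summary(labels: list[str]) -> dict[str, int]:
    """Number of clusters plus smallest and largest cluster size."""
    sizes = Counter(labels)
    return {
        "n_clusters": len(sizes),
        "min_size": min(sizes.values()),
        "max_size": max(sizes.values()),
    }


# Toy cluster labels per resolution, invented for illustration only.
toy_runs = {
    0.2: ["0"] * 9 + ["1"] * 3,
    0.8: ["0"] * 4 + ["1"] * 4 + ["2"] * 4,
}
summaries = {res: cluster_size_summary(labels) for res, labels in toy_runs.items()}
for res, summary in summaries.items():
    print(res, summary)
```

Applied to the real object, the same summary at resolution 0.8 would be expected to show 11 clusters with sizes between roughly 800 and 2,100 cells.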
UMAP — clusters and sample identity
Outputs
integrated_harmony_clustered.rds
plots/step06_umap_clusters.pdf
step06_cluster_summary.csv