Guide – Discovery Analysis

Discovery is for gene exploration.

Discovery is CellSeekr’s exploratory mode, designed to help users investigate and discover gene expression across immune cells through multiple analytical views. It includes: top 50 differential gene expression analysis; gene-to-cell-type mapping; proximity mapping of gene expression between two cell types or states; a UMAP viewer; and a GEO explorer. All views are anchored to specific cell types and support pairwise comparisons between patient groups within a selected scRNA-seq study.

Basic Steps to Using CellSeekr’s Discovery Tool

To use CellSeekr’s Discovery tool, open the analyzer page and select “Discovery.”

Start here: Select analysis module.

The Discovery workstream offers five modules: Signature to Cell Tethering, Top 50 DE Genes, UMAP Viewer, Proximity Mapper, and GEO Cluster Explorer.

A user can navigate between the tools by selecting the appropriate button.

Module 1. Signature to Cell Tethering

The Signature to Cell Tethering module identifies the proportion of a specific immune cell type expressing a gene or set of genes, and determines whether that expression differs significantly across patient groups.

Step 1. Select the tumor type and the pairwise comparison from the dropdowns.

Step 2. Type up to 5 genes you wish to analyze into the Genes box. If entered correctly, the gene name will appear and become selectable.

Step 3. Click “Run Analysis” to perform the analysis.

For each analysis, CellSeekr generates a Proportional Expression per Cell Type and Ranking leaderboard, a Top Phenotypes graph, a Significance Leaderboard, and a Gene Expression per Group graph.

The Proportional Expression per Cell Type and Ranking leaderboard is a table displaying the expression of selected genes across immune cell types, ranked by expression proportion. The table can be customized using the Gene Filter and Cell Type Filter dropdowns to display either a composite view of all selected genes or a single gene at a time, and to show expression levels across all immune cell types or within a specific cell type of interest.

The Top Phenotypes and Gene Expression per Group graphs display expression of selected genes as bar graphs across cell types and sample groups, enabling direct comparison of gene expression levels between cell types within each group. The graphs can be customized to display only certain cell types or genes from the dropdown menus above the graphs.

The Significance Leaderboard compares expression levels of all selected genes between cell types and groups, and ranks cell types by the statistical significance of differential expression of the selected gene(s) between the two comparison groups. For each cell type, the table displays the number of significantly differentially expressed genes along with their corresponding p-values and expression levels.

Data from tables can be downloaded as .tsv files.

Top Phenotypes Graph

The Top Phenotypes graph displays a bar chart of the percentage expression of the selected genes in the top 4 immune cell types expressing them (IFNG in the example). The group being viewed can be changed from the dropdown menu above the graph (image below).

Proportional gene expression per cell type

When a gene is selected from the Gene Filter dropdown, the table displays the percentage of each cell type expressing that gene. Results can be further filtered to show expression within a specific cell type of interest using the Cell Type Filter dropdown (figure on the right shows T cells expressing IFNG).

Proportional Expression Ranking Leaderboard

The Proportional Expression Ranking Leaderboard ranks cell types by the proportion of cells expressing the selected gene(s). The table can be filtered by cell type, individual gene, or a combination of both.
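
As a rough illustration of what such a ranking involves, the sketch below computes the percentage of cells expressing one gene per cell type and sorts the result. It is not CellSeekr’s implementation: the function name `rank_cell_types`, the record layout, the toy values, and the "expression > 0" threshold are all assumptions made for this example.

```python
from collections import defaultdict

def rank_cell_types(cells, gene):
    """Rank cell types by the percentage of their cells with expression > 0 for `gene`.

    `cells` is a list of dicts with a "cell_type" label and an "expression"
    mapping of gene name -> value (hypothetical layout for this sketch).
    """
    totals = defaultdict(int)
    positive = defaultdict(int)
    for cell in cells:
        totals[cell["cell_type"]] += 1
        # Assumed threshold: a cell "expresses" the gene if its value is > 0.
        if cell["expression"].get(gene, 0.0) > 0:
            positive[cell["cell_type"]] += 1
    ranking = [(ct, 100.0 * positive[ct] / totals[ct]) for ct in totals]
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking

# Toy data, invented for illustration only.
cells = [
    {"cell_type": "T cell", "expression": {"IFNG": 2.1}},
    {"cell_type": "T cell", "expression": {"IFNG": 0.0}},
    {"cell_type": "NK cell", "expression": {"IFNG": 1.4}},
    {"cell_type": "B cell", "expression": {}},
]
leaderboard = rank_cell_types(cells, "IFNG")
```

Filtering by cell type, as the leaderboard allows, would amount to keeping only the matching rows of `leaderboard`.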

Gene expression per group Graph

The Gene Expression per Group graph displays expression of the selected genes as a bar graph across cell types for each sample group, enabling direct comparison of expression levels between cell types within each group. The cell types and genes shown can be chosen from the dropdown menus above the graph.

Significance Leaderboard

The Significance Leaderboard ranks cell types by the statistical significance of differential expression of the selected gene(s) between the two comparison groups. For each cell type, the table displays the number of genes with significant differential expression, along with their corresponding p-values and expression levels.

Module 2. Top 50 DE Genes

The Top 50 DE Genes module identifies the top 50 differentially expressed genes within a selected cell type between two patient groups.

Step 1. Select the tumor type and the pairwise comparison groups from the dropdowns.

Step 2. In the Phenotype Gate box, select the cell type you wish to analyze.

Step 3. Click “Find Top 50 DE Genes” to run your analysis and generate your results.

CellSeekr produces a volcano plot and a list of Top 50 up- and downregulated genes in the chosen cell type and comparison groups (selected in Step 1).

Top 50 DE Genes Graph: Volcano

The volcano plot displays the top 50 upregulated and top 50 downregulated genes for the selected comparison group and cell type. The example shown highlights the top 50 up- and downregulated genes in immune cells of newly diagnosed GBM compared to PD-1 non-responders.
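
For readers unfamiliar with volcano plots, the sketch below shows one common way such a plot and gene list can be derived from per-gene statistics: each gene is placed at (log2 fold change, -log10 p-value), and the most strongly up- and downregulated significant genes are kept. The guide does not state how CellSeekr ranks genes, so the ranking criterion, the `alpha` cutoff, the function names, and the toy values are assumptions.

```python
import math

def top_de_genes(stats, n=50, alpha=0.05):
    """Split significant genes into the n most up- and n most downregulated.

    `stats` maps gene -> (log2_fold_change, p_value). Ranking by fold change
    among genes with p < alpha is an assumption made for this sketch.
    """
    significant = [(g, lfc, p) for g, (lfc, p) in stats.items() if p < alpha]
    up = sorted((r for r in significant if r[1] > 0), key=lambda r: r[1], reverse=True)[:n]
    down = sorted((r for r in significant if r[1] < 0), key=lambda r: r[1])[:n]
    return up, down

def volcano_point(log2_fold_change, p_value):
    """Return the (x, y) position of a gene on a volcano plot (p_value must be > 0)."""
    return log2_fold_change, -math.log10(p_value)

# Toy statistics, invented for illustration only.
stats = {"IFNG": (2.0, 0.001), "PDCD1": (-1.5, 0.01), "IL2RA": (0.3, 0.5)}
up, down = top_de_genes(stats)
x, y = volcano_point(*stats["IFNG"])
```

Genes far from the vertical axis and high on the plot are those with both a large change and strong statistical support.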

Module 3. UMAP Viewer

The UMAP Viewer allows users to visualize gated cell populations and highlight which cells express a gene or set of genes of interest.

Step 1. Select the tumor type and the pairwise comparison from the dropdowns.

Step 2. In the Phenotype Gate box, select the cell type you wish to visualize.

Step 3. Type the gene(s) you wish to visualize in the Genes to Highlight box. If entered correctly, the gene name will appear and become selectable.

Step 4. Optional: Customize the highlight color and point size for the selected genes.

Step 5. Click “Render UMAPs” to generate your visualization.

A cell is highlighted if at least one of the selected genes has expression greater than 0.
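
The highlighting rule can be expressed as a one-line check. This is a minimal sketch of the rule as stated, not code taken from CellSeekr; the function name `is_highlighted` and the dictionary layout are hypothetical.

```python
def is_highlighted(expression, genes):
    """Return True if any of the selected genes has expression > 0 in this cell.

    `expression` maps gene name -> value for a single cell (hypothetical layout);
    genes missing from the mapping are treated as not expressed.
    """
    return any(expression.get(gene, 0.0) > 0 for gene in genes)

# Toy cells, invented for illustration only.
cell_a = {"PDCD1": 0.0, "IFNG": 1.2}
cell_b = {"PDCD1": 0.0, "IFNG": 0.0}
selected = ["PDCD1", "IFNG"]
flags = [is_highlighted(cell_a, selected), is_highlighted(cell_b, selected)]
```

Because the rule uses "any", adding more genes to the Genes to Highlight box can only increase the number of highlighted cells.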

UMAP Viewer Graph

The UMAPs visualize cell populations, highlighting the cells that express a gene of interest selected in the Genes to Highlight field. The example on the left shows UMAPs highlighting all immune cells expressing the PDCD1 gene in the newly diagnosed and non-responder groups.

Module 4. Proximity Mapper

The Proximity Mapper module identifies the reference cell population most closely associated with a query cell population, based on the lowest average distance to nearest neighbors, in order to infer spatial relationships between cell types or states.

Step 1. Select the tumor type and the pairwise comparison from the dropdowns.

Step 2. Under Query Gene (+/−), type in the gene you wish to analyze. If entered correctly, the gene name will appear and become selectable.

Step 3. Under Query Cell Type, select the cell type in which to evaluate expression of the query gene.

Step 4. In the Reference A Gene (+/-) box, type in the reference, or second, gene you wish to analyze. If entered correctly, the gene name will appear and become selectable.

Step 5. Select the cell type for Reference A Gene in the Reference A Cell Type dropdown box.

Step 6. Repeat steps 4 and 5 to select a second (“B”) reference gene and cell type. Both reference cell types can be the same, but the genes should be different.

Step 7. Click “Run Proximity Analysis” to generate your visualization.

CellSeekr will generate a Proximity Spectrum per group, and two UMAPs, one for each comparison group selected in step 1. More on the graphs below.

Proximity Spectrum Graphs

The Proximity Spectrum Graphs display the association between a gene of interest and one of two reference populations, each defined by a selected gene, based on lowest average distance. These graphs are used to infer spatial relationships between cell types and gene expression patterns. The examples on the left show the association between IFNG+ immune cells and either PDCD1- or IL2RA-expressing immune cells.
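
The "lowest average distance to nearest neighbors" idea can be sketched as follows. This is an illustration under stated assumptions, not CellSeekr’s algorithm: it assumes distances are Euclidean in 2D embedding coordinates, uses a hypothetical parameter `k` for the number of neighbors, and the function names and coordinates are invented.

```python
import math

def mean_nn_distance(query, reference, k=1):
    """Average, over query points, of the mean distance to the k nearest reference points."""
    total = 0.0
    for q in query:
        distances = sorted(math.dist(q, r) for r in reference)
        nearest = distances[:k]
        total += sum(nearest) / len(nearest)
    return total / len(query)

def closest_reference(query, references, k=1):
    """Return the name of the reference population with the lowest score, plus all scores."""
    scores = {name: mean_nn_distance(query, points, k) for name, points in references.items()}
    return min(scores, key=scores.get), scores

# Toy 2D coordinates, invented for illustration only.
query_cells = [(0.0, 0.0), (1.0, 0.0)]          # e.g. query population
references = {
    "Reference A": [(0.0, 1.0), (1.0, 1.0)],    # each query cell has a neighbor 1 unit away
    "Reference B": [(5.0, 0.0)],                # nearest neighbor is 5 and 4 units away
}
best, scores = closest_reference(query_cells, references)
```

In this toy case the query population sits closer on average to Reference A, so A would be reported as the more closely associated population.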

UMAP Gated Population Per Group

The Gated Population per Group UMAP highlights the association of a query cell population with one or both reference cell-gene populations, illustrating spatial relationships in gene expression across groups. The UMAP on the left shows the association between IFNG+ immune cells and either PDCD1- or IL2RA-expressing immune cells.

Module 5. GEO Cluster Explorer

The GEO Cluster Explorer geometrically partitions UMAP space into clusters within predefined cell types and performs group-level differential expression analysis within each cluster.

Step 1. Select the tumor type and the pairwise comparison from the dropdowns.

Step 2. In the Phenotype Gate box, select the cell type you wish to analyze.

Sampling: By default, 500 cells are sampled from each group. This value can be changed in Step 4.

Step 3. Under Display, you can change the size of the background cells and the seed cells.

Step 4. In the Cells per Group to Sample box, specify the number of cells to sample per group. The default is 500; if unsure, leave this value unchanged.

Step 5. Under GEO-Cluster DE Filters, set the minimum detection percentage (in at least one cluster) and the minimum percentage difference between clusters. These filters apply to the pairwise gene tables displayed below the UMAP. A gene must pass both thresholds to appear in the results. The more stringent the selection, the fewer genes will be shown.

Step 6. Click “Run GEO Clustering” to generate your visualization.

CellSeekr generates a Geometric Cluster Map and Patient-level GEO cluster composition graphs, and lets you customize the cluster comparison and sub-filter by cell subtype.

Geometric Cluster Map

The Geometric Cluster Map displays a merged UMAP in which clusters are identified within predefined cell types, with differential expression analysis performed within each cluster and visualized as pairwise comparisons.

Patient-level GEO Cluster Comparison

The Patient-Level GEO Cluster Comparison displays bar graphs that compare patient groups within each cluster. Three graphs are generated, one per cluster, with each patient’s gated cells assigned to a GEO cluster by majority vote.
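
One plausible reading of the majority vote is that each patient is assigned to the cluster containing most of that patient’s gated cells. The sketch below illustrates that reading only; the function name `assign_patients`, the record layout, and the tie-breaking behavior are assumptions, not documented CellSeekr behavior.

```python
from collections import Counter

def assign_patients(cell_records):
    """Assign each patient to the cluster that holds most of that patient's cells.

    `cell_records` is an iterable of (patient_id, cluster_label) pairs, one per
    gated cell (hypothetical layout). Ties go to the cluster seen first.
    """
    per_patient = {}
    for patient, cluster in cell_records:
        per_patient.setdefault(patient, Counter())[cluster] += 1
    return {patient: counts.most_common(1)[0][0] for patient, counts in per_patient.items()}

# Toy records, invented for illustration only.
records = [
    ("P1", "Cluster 1"), ("P1", "Cluster 1"), ("P1", "Cluster 2"),
    ("P2", "Cluster 3"), ("P2", "Cluster 3"), ("P2", "Cluster 1"),
]
assignment = assign_patients(records)
```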

Filtering by Cell Subtype

The GEO Cluster Explorer enables customized comparison of gene expression within and between clusters, with additional filtering by specific cell subtypes (see image to the left). Once these filters are applied, CellSeekr generates a table ranking genes by percentage expression difference between the selected cell types and/or clusters.
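
The two GEO-Cluster DE Filters and the resulting ranking can be sketched as below: a gene is kept only if it is detected in at least a minimum percentage of cells in one of the two clusters and the detection percentages differ by at least a minimum amount, and the surviving genes are sorted by that difference. The function name, the default thresholds, the data layout, and the toy values (including the placeholder gene `GENE_X`) are hypothetical.

```python
def filter_and_rank(pct_by_gene, cluster_a, cluster_b, min_pct=10.0, min_diff=10.0):
    """Keep genes passing both thresholds, ranked by absolute percentage difference.

    `pct_by_gene` maps gene -> {cluster_label: percent of cells expressing the gene}.
    """
    rows = []
    for gene, pct in pct_by_gene.items():
        a, b = pct[cluster_a], pct[cluster_b]
        diff = a - b
        detected = max(a, b) >= min_pct       # detected in at least one cluster
        different = abs(diff) >= min_diff     # large enough difference between clusters
        if detected and different:
            rows.append((gene, a, b, diff))
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows

# Toy percentages, invented for illustration only.
pct_by_gene = {
    "IFNG": {"C1": 40.0, "C2": 10.0},    # passes both filters (difference 30)
    "GENE_X": {"C1": 55.0, "C2": 50.0},  # detected, but difference too small
    "IL2RA": {"C1": 3.0, "C2": 1.0},     # below the detection threshold
    "PDCD1": {"C1": 12.0, "C2": 30.0},   # passes both filters (difference -18)
}
table = filter_and_rank(pct_by_gene, "C1", "C2")
```

Raising either threshold shortens the table, which matches the note that more stringent selections show fewer genes.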


How to request studies to be added to CellSeekr

To request that custom scRNA-seq studies be added to the analyzer, submit a study request.