Hypergraph Representations of Single-Cell RNA Sequencing Data for Improved Cell Clustering

Wan He, Daniel I. Bolnick, Samuel V. Scarpino, Tina Eliassi-Rad
Bioinformatics
btag148
March 27, 2026

Single-cell RNA sequencing (scRNA-seq) data analysis is often performed using network projections that produce co-expression networks. These network-based approaches are attractive because regulatory interactions are fundamentally network-based and many tools are available for downstream analysis. However, most of them have two major limitations. First, they are typically unipartite and therefore fail to capture higher-order information. Second, scRNA-seq data are often sparse, so most algorithms for constructing unipartite network projections are inefficient and may either overestimate co-expression relationships or under-utilize the sparsity when clustering (e.g., with cosine distance). To address these limitations, we propose representing scRNA-seq expression data as hypergraphs, which are generalized graphs in which a hyperedge can connect more than two nodes. In this representation, nodes represent cells and hyperedges represent genes: each hyperedge connects all cells in which its corresponding gene is actively expressed. The resulting hypergraph can capture higher-order information and appropriately handle varying levels of data sparsity, enabling clustering algorithms to leverage higher-order relationships for improved cell-type differentiation.
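
The cell-as-node, gene-as-hyperedge construction described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a dense cells-by-genes count matrix and treats a gene as "actively expressed" in a cell when its count exceeds a cutoff. The function name `expression_to_hypergraph`, the `threshold` parameter, and the toy matrix are all hypothetical choices made for illustration; the paper's own expression criterion may differ.

```python
import numpy as np

def expression_to_hypergraph(X, threshold=0.0):
    """Build a hypergraph from a cells-by-genes expression matrix.

    Assumption (not from the paper): a gene counts as "actively expressed"
    in a cell when its value is strictly greater than `threshold`.

    Returns
    -------
    H : boolean incidence matrix, shape (n_cells, n_genes);
        H[c, g] is True when cell c belongs to the hyperedge of gene g.
    hyperedges : dict mapping gene index -> list of member cell indices.
        Genes expressed in no cell yield no hyperedge and are omitted.
    """
    X = np.asarray(X)
    H = X > threshold
    hyperedges = {
        g: np.flatnonzero(H[:, g]).tolist()
        for g in range(H.shape[1])
        if H[:, g].any()
    }
    return H, hyperedges

# Toy data: 4 cells (rows) by 4 genes (columns); gene 3 is never expressed.
X = np.array([
    [5, 0, 2, 0],
    [3, 0, 0, 0],
    [0, 4, 0, 0],
    [0, 1, 7, 0],
])

H, hyperedges = expression_to_hypergraph(X)

# Node degree: number of hyperedges (genes) each cell belongs to.
node_degree = H.sum(axis=1)
# Hyperedge size: number of cells each gene connects.
edge_size = H.sum(axis=0)

print(hyperedges)
print(node_degree.tolist(), edge_size.tolist())
```

The incidence matrix keeps the bipartite cell-gene structure intact instead of collapsing it into pairwise cell-cell weights, and for sparse data it could be stored in a sparse format so that memory scales with the number of nonzero entries.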
