A non-parametric method to identify tissue-specific molecular features with unequal sample group sizes

To understand the biology of tissues and cell types and the differences among them, one typically searches for molecular features that display characteristic abundance patterns. Several specificity metrics have been introduced to identify tissue-specific molecular features, but these either require an equal number of replicates per tissue or cannot handle replicates at all. We describe a non-parametric specificity score (SPECS) that is compatible with unequal sample group sizes. To demonstrate its usefulness, we calculated the specificity score on all GTEx samples, detecting both known and novel tissue-specific genes. A web tool was developed to browse these results for genes or tissues of interest. An example Python implementation of SPECS is available at https://github.ugent.be/ceeverae/SPECs.
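
To make the idea of a non-parametric score that tolerates unequal group sizes more concrete, here is a minimal sketch. It is an illustration under stated assumptions and not necessarily the exact SPECS definition: for one gene, it estimates the probability that a sample from the tissue of interest has a higher abundance than a sample from each other tissue, then averages those probabilities with equal weight per tissue. The function names `pairwise_superiority` and `specificity_scores` and the example values are hypothetical.

```python
from itertools import product


def pairwise_superiority(a, b):
    """Fraction of (x, y) pairs, x from a and y from b, with x > y.

    Ties count as half a win. Dividing by len(a) * len(b) makes the
    estimate independent of how many replicates each group has.
    """
    wins = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x, y in product(a, b)
    )
    return wins / (len(a) * len(b))


def specificity_scores(groups):
    """Score each tissue for one gene.

    groups: dict mapping tissue name -> list of abundance values.
    Assumption: every other tissue gets equal weight in the average,
    regardless of its sample count.
    """
    scores = {}
    for tissue, values in groups.items():
        others = [v for name, v in groups.items() if name != tissue]
        scores[tissue] = sum(
            pairwise_superiority(values, other) for other in others
        ) / len(others)
    return scores


# Made-up abundances for a single gene, with unequal group sizes.
example = {
    "liver": [9.1, 8.7, 9.4, 8.9],
    "brain": [1.2, 0.8],
    "muscle": [2.0, 1.5, 1.1],
}
scores = specificity_scores(example)
for tissue, score in scores.items():
    print(f"{tissue}: {score:.3f}")
```

In this sketch a score near 1 means the gene is consistently more abundant in that tissue than in every other tissue, and a score near 0 means it is consistently less abundant. Because only the ordering of values is used, no distributional assumption is needed.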



Preprint available at bioRxiv.

Made with the support of:

FWO
CRIG
UGent
Vocatio
STK