Table 4 A summary of the analogies between documents, topics, and words and their biological counterparts in the relevant studies (see the "'Document-word-topic' in biological data" section)

From: An overview of topic modeling and its current applications in bioinformatics

| Reference | Words | Topics | Documents | Biological dataset |
| --- | --- | --- | --- | --- |
| Rogers et al. (2005), Masada et al. (2009), Perina et al. (2010), Bicego et al. (2010a, b, 2012), Lee et al. (2014) | Genes | Functional groups | Samples | Expression microarray data |
| Masseroli et al. (2012), Pinoli et al. (2013, 2014), Youngs et al. (2014) | Ontological terms | Latent relationship | Proteins | Protein annotations |
| Chen et al. (2010, 2012a, b), La Rosa et al. (2015), Zhang et al. (2015) | K-mers of DNA sequences | Taxonomic category/components of the whole genome | DNA sequences | Genomic sequences |
| Caldas et al. (2009) | Gene sets | Biological process | Experiments | Gene expression dataset |
| Coelho et al. (2010) | Object classes | Fundamental patterns | Images | Fluorescence images |
| Konietzny et al. (2011) | A fixed-sized vocabulary of words based on the gene annotations | Functional modules of protein families | Genome annotations | A set of genome annotations |
| Bisgin et al. (2013) | Endpoint measurements | Diagnostic topics | Drugs | Expression of the HCS endpoints |
| Chen et al. (2011), Randhave and Sonkamble (2014) | Functional elements (NCBI taxonomic level indicators, indicator of gene orthologous groups and KEGG pathway indicators) | Functional groups | Samples | Genome set |
| Pan et al. (2010) | Local sequential features | Latent topic features | Protein sequences | Protein–protein interaction dataset |
| Castellani et al. (2010) | Shape descriptors | Brain surface geometric patterns | Images | Magnetic resonance images |
| Pratanwanich and Lio (2014) | Genes | Pathways | Gene expression profiles | Gene expression data |
| Dawson and Kendziorski (2012) | Clinical events, treatment protocols, and genomic information from multiple sources | The category of patients | Patients | Patient’s text constructed from clinical and multidimensional genomic analyses |
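
To make one of these analogies concrete, the sketch below treats each DNA sequence as a "document" and its overlapping k-mers as "words", producing the document-word count matrix that a topic model would take as input. It is only an illustration of the mapping listed in Table 4, not the procedure of any cited study; the function names `kmer_counts` and `document_word_matrix`, the default `k=3`, and the example sequences are assumptions introduced here.

```python
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count overlapping k-mers ("words") in one DNA sequence ("document")."""
    seq = sequence.upper()
    # Slide a window of width k across the sequence, one base at a time.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def document_word_matrix(sequences, k=3):
    """Build a document-word count matrix from a list of DNA sequences.

    Returns (vocabulary, matrix): vocabulary is the sorted list of all
    k-mers observed, and matrix[d][w] is the count of vocabulary[w] in
    sequences[d].
    """
    counts = [kmer_counts(s, k) for s in sequences]
    # The vocabulary is the union of k-mers seen in any sequence.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for k-mers absent from a given sequence.
    matrix = [[c[w] for w in vocabulary] for c in counts]
    return vocabulary, matrix


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    vocab, mat = document_word_matrix(["ACGAC", "ACG"], k=3)
    print(vocab)
    print(mat)
```

The same construction applies to the other rows by swapping the tokenizer: for example, genes as words and samples as documents, with expression levels discretized into counts. How that discretization is done is a modeling choice the table does not specify.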