In the past decade, multi-omics data for complex diseases have been generated rapidly and used extensively to improve understanding of the molecular mechanisms underlying phenotypes. Presently, accurate identification of disease-associated alterations from multi-dimensional omics data remains a key challenge in repurposing these molecular data to guide clinical decision-making and to realize precision medicine applications.
To date, contemporary platforms such as next-generation sequencing have generated large volumes of genetic, epigenetic, transcriptomic and proteomic data for many complex diseases. Genome-wide multi-omics profiling of these diseases provides valuable resources and opportunities to discover associations between diseases and various gene-level measures (generally referred to as signatures). However, among the thousands to millions of features measured by these platforms, only a few are truly disease-associated. Identifying high-confidence, disease-associated signatures is subject to considerable uncertainty arising from genomic heterogeneity, low effect sizes, uncertain functional roles and a high prevalence of passenger events, among other factors (Kim et al., 2016).
CNet can manage heterogeneous genomic signature profiles simultaneously and select the best signature to represent a specific gene. To handle various forms of clinical and phenotypic measurements, we introduced four models covering continuous, categorical and censored data. We tested CNet using drug-response data, multidimensional cancer genomics data and genome-wide association study data for multiple traits. Our results demonstrated that, across these scenarios, CNet could effectively identify signatures associated with the outcomes. In addition, we applied CNet to identify likely disease-causing chains involving somatic mutations, pathway activities and patient outcomes. With appropriate settings, CNet can be applied to many biological conditions.
CNet builds on a generalized sequential feedforward method, augmented by a down-sampling bootstrap strategy that reduces the selection of random hitchhiking signatures. It further applies a dynamic trimming procedure that removes relatively less informative signatures at every step.
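The general idea of sequential forward selection combined with down-sampling and per-step trimming can be illustrated with a minimal sketch. This is not the CNet implementation: the function names, the ordinary least-squares R² score, the thresholds `min_gain` and `trim_ratio`, and the specific trimming rule are all assumptions made for illustration, and only a continuous outcome is handled.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    total = y - y.mean()
    return 1.0 - (resid @ resid) / (total @ total)


def forward_select_with_trimming(X, y, n_resamples=50, frac=0.7,
                                 min_gain=0.02, trim_ratio=0.05,
                                 max_features=10, seed=0):
    """Hypothetical sketch: forward selection scored on down-sampled subsets.

    At each step every candidate is scored by its mean R^2 gain over
    `n_resamples` random subsets (drawn without replacement). The best
    candidate is added; candidates whose mean gain falls below
    `trim_ratio` times the best gain are dropped from the pool.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = int(frac * n)
    selected, pool = [], list(range(p))

    while pool and len(selected) < max_features:
        gains = np.zeros((n_resamples, len(pool)))
        for b in range(n_resamples):
            idx = rng.choice(n, size=m, replace=False)  # down-sampling
            Xb, yb = X[idx], y[idx]
            base = r_squared(Xb[:, selected], yb) if selected else 0.0
            for k, j in enumerate(pool):
                gains[b, k] = r_squared(Xb[:, selected + [j]], yb) - base

        mean_gain = gains.mean(axis=0)
        best = int(np.argmax(mean_gain))
        if mean_gain[best] < min_gain:
            break  # no remaining candidate adds enough signal

        selected.append(pool[best])
        # Trimming (assumed rule): keep only candidates that are still
        # reasonably informative relative to the one just selected.
        cutoff = trim_ratio * mean_gain[best]
        pool = [j for k, j in enumerate(pool)
                if k != best and mean_gain[k] >= cutoff]

    return selected
```

Averaging the gain over many down-sampled subsets is meant to make it less likely that a feature is chosen because of a chance correlation in one particular sample, and trimming shrinks the candidate pool so later steps evaluate fewer features. On synthetic data in which only two of twenty columns drive the outcome, a sketch like this would be expected to return those two columns and then stop.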