Wiley Interdiscip Rev Syst Biol Med. Author manuscript; available in PMC 2016 July 01. Wang et al.

amounts of expression Quantitative Trait Loci (eQTL), which expose potential mechanisms that may explain observed associations. Such high-throughput genetic data further enable the investigation of complex relationships between genotypes and phenotypes (Table 1). In doing so, the accuracy of the phenotype information is critical, because a system or model works only if its input data are sound. Several software tools help ensure that correct phenotype information is used. For example, Genome-Phenome Analyzer, launched by SimulConsult (www.simulconsult.com/), links curated phenome databases, clinical findings, and associated variants generated by whole-exome sequencing to compute a differential diagnosis for patients. PhenoDB (http://phenodb.net) is a Web-based portal for the integration and analysis of phenotypic features, whole exome/genome sequence data, knowledge of pedigree structure, and previous clinical testing. It can also be used to format phenotypic data for submission to dbGaP.

Biological systems are highly dynamic and hierarchical. Each technology generates data along only one dimension of a complex biological system, so no single type of high-throughput data can fully explain the variety of system functions. Integrating heterogeneous, large-scale 'omics data and mining them for knowledge that explains phenotypes is therefore critical for the success of systems biology (Figure 1).
Thus, at the very least, systems biology must also borrow quantitative modeling approaches from multiple disciplines, which we discuss in the next section.

COMPUTATIONAL METHODS IN SYSTEMS BIOLOGY

High-throughput technologies highlight the challenge of how to mine biological knowledge and generate testable hypotheses from the massive amount of available data. Tackling this challenge requires sophisticated quantitative modeling methods and multidisciplinary expertise from fields such as mathematics, physics, and computer science. One of the hallmarks of systems biology is the use of computational approaches from the quantitative sciences to develop a wide spectrum of models and tools for analyzing large-scale data (Figure 2).

Computational methods used in systems biology can be classified into data-driven top-down methods and model-driven bottom-up methods [9]. In general, high-throughput multi-parametric 'omics data characterize the abundance of biological elements across different system states. Data-driven top-down approaches integrate and analyze experimental data to reveal biomarkers and biologically meaningful patterns. These approaches can be applied to unbiased genome-scale data with thousands of components to obtain coarse-grained knowledge about biological systems. For example, various statistical analyses have been used to identify differentially expressed genes, proteins, or metabolites [38]. One can further examine whether the resulting component lists are enriched for known gene signatures or signaling pathways [39].
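To make the enrichment step concrete, the following is a minimal sketch of one common way to test whether a list of differentially expressed genes is enriched for a pathway: a one-sided hypergeometric test. It is an illustration only and not necessarily the method used in the cited work. The function name `hypergeom_enrichment_p` and the gene counts in the usage example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric p-value for pathway enrichment.

    Returns P(X >= n_overlap), where X is the number of pathway genes
    found when n_selected genes are drawn without replacement from a
    universe of n_universe genes, n_pathway of which are in the pathway.
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20 measured genes, 5 in the pathway,
# 4 called differentially expressed, 3 of those in the pathway.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"enrichment p-value: {p_value:.4f}")
```

A small p-value suggests that the overlap is larger than expected by chance. In practice the test is repeated over many pathways, so the resulting p-values would normally be corrected for multiple testing.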
Statistical methods, such as Principal Component Analysis (PCA), Partial Least Squares Regression (PLSR), and Canonical Correlation Analysis (CCA), are then used to identify functional relationships by examining correlations in expression.
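As an illustration of how PCA exposes correlated expression, the sketch below computes the first principal component of a small expression matrix using power iteration on the sample covariance matrix. It assumes rows are samples and columns are genes. The helper names and the toy data are hypothetical, and a real analysis would normally rely on an established numerical library rather than this hand-written routine.

```python
import math


def center_columns(rows):
    """Subtract each column's mean so every gene has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix (genes x genes) of column-centered data."""
    n = len(centered)
    p = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_principal_component(rows, iterations=500):
    """Return (unit direction, variance along it) via power iteration.

    Assumption: the start vector is not orthogonal to the leading
    eigenvector; otherwise the iteration would miss that direction.
    """
    cov = covariance_matrix(center_columns(rows))
    p = len(cov)
    v = [float(i + 1) for i in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance at all in the data
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance along direction v.
    variance = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return v, variance


# Toy data: 4 samples x 3 genes. Gene 2 is exactly twice gene 1
# and gene 3 is constant, so a single component captures all variance.
expression = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 5.0],
    [3.0, 6.0, 5.0],
    [4.0, 8.0, 5.0],
]
direction, variance = first_principal_component(expression)
print("loadings:", [round(x, 3) for x in direction])
print("variance explained:", round(variance, 3))
```

Genes with large loadings of the same sign on a component vary together across samples, which is the kind of expression correlation that these methods use to suggest functional relationships.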