
Systems Biology Solution for Drug Discovery and Personalized Medicine

by Dr. Kenneth Drake

The hope of the rapid translation of 'genes to drugs' has foundered on the reality that disease biology is complex, and that drug development must be driven by insights into biological responses. Historically, the scientific approach to drug discovery has been reductionist, focusing on one component of a system at a time. The information gained from studying a single gene or protein is then applied to an entire biological system, be it a tissue, organ or organism. This approach is often too simplistic, as biological systems are more than the sum of their individual parts. Ignoring the interactions of the various components and pathways at the systems level increases the likelihood of missing other important biological responses. Further, because of genetic diversity, not all biological systems can be expected to respond in the same way to the same activation (perturbation). Development of new therapeutic products requires a systems-level understanding of the human genome and of the underlying elements and stages of diseases specific to genetically different patient populations. Hence, systems biology is key to the development of personalized medicine, and for this reason the approach should be adopted at all phases of drug discovery and development, from petri dish to clinical trials. Additionally, the biological knowledge gained through this approach will ultimately lead to improved disease diagnostics and stage-specific therapeutic management.

Systems biology studies biological systems by systematically perturbing them (biologically, genetically, or chemically); monitoring the gene, protein, and informational pathway responses; integrating these data; and ultimately, formulating mathematical models that describe the structure of the system and its responses to individual perturbations. The controlled perturbations generate unique biosignatures (time-course response profiles) of the system leading to improved understanding and more comprehensive hypotheses of the underlying response mechanisms. By incorporating these unique biosignatures into existing genome, proteome, and metabolome databases, it is possible to model, mine, and experimentally test these hypotheses.

Biosignature analysis and modeling is an important new approach for drug discovery and development because of its ability to integrate a vast amount of relevant biological data into the process. It can identify where a network should be perturbed to achieve a desired effect, provide insight into a drug’s function, evaluate a drug’s likely efficacy and toxicity, and act as a screen for combinatorial perturbations. Such models can also assess responses in different patient groups and support diagnostics for the development of personalized medicines.

Systems biology analysis requires sophisticated bioinformatics software to find and analyze patterns in diverse forms of data, producing an integrated view of specific diseases.

A New Generation Analysis

Seralogix’s BioSignatureDS™ uses a new generation of analysis to reveal the biosignatures associated with disease states, allowing scientists to identify therapeutic targets, determine whether a drug is targeting its intended pathway, and identify patterns for diagnostics and personalized medicine. The core computational tools are based on the probabilistic power of dynamic Bayesian networks, an advanced form of machine learning and pattern recognition. These integrated tools are used to model the complex temporal pattern of modulated genes and proteins in response to disease and varying treatment conditions. BioSignatureDS™ analysis incorporates diverse types of data, including time-course gene expression data, protein data, and metabolite data. Additionally, data are analyzed in the context of existing biological knowledge, including genetic relations of regulatory pathways, gene-gene/protein-protein relations, and relations of other biological processes. This powerful combination identifies perturbed pathways and processes, produces complex comparative analyses, and generates dynamic models for what-if analysis and pattern recognition. Unlike other tools, BioSignatureDS™ analysis automatically processes, deciphers, learns, refines, and recognizes unique biosignature patterns-of-change of interacting genes and proteins. Physiological responses, as well as genotypic and phenotypic characteristics, can be included in the analysis. BioSignatureDS™ analysis can provide deeper contextual information about the root causes of effects and the living system’s state of health.

Bayesian Networks Analyze Perturbed Pathways, Processes, and Genes in Context

Bayesian networks are graphical models that represent conditional dependencies and independencies among the variables corresponding to biological measurements. These variables are illustrated using nodes that are connected together by lines which represent the relationships between variables. Figure 1 is an example of a simple Bayesian network describing a gene/protein regulatory network. BioSignatureDS™ analysis uses dynamic Bayesian network learning and modeling methods which can include temporal processes such as time series and feedback loops, essential features of most biological systems.


Figure 1. A simple example of a dynamic Bayesian network representation of the Toll-like Receptor Pathway showing the initial causal genetic relationships and its rollout to create the dynamic Bayesian Network.
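The rollout described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes the static regulatory structure is given as a list of (parent, child) edges, that each static edge becomes an edge from the parent at one time slice to the child at the next, and that every gene also depends on its own previous value. The function name `unroll`, the three gene symbols, and the feedback edge are illustrative choices only; they are not taken from Figure 1 or from BioSignatureDS™.

```python
def unroll(edges, genes, n_slices):
    """Unroll a static regulatory graph into a dynamic Bayesian network.

    Each static edge parent -> child becomes (parent, t-1) -> (child, t),
    and every gene also gets a persistence edge (gene, t-1) -> (gene, t).
    """
    dbn_edges = []
    for t in range(1, n_slices):
        for gene in genes:
            dbn_edges.append(((gene, t - 1), (gene, t)))      # persistence edge
        for parent, child in edges:
            dbn_edges.append(((parent, t - 1), (child, t)))   # regulatory edge
    return dbn_edges


# Illustrative (assumed) structure with a feedback loop NFKB1 -> TLR4.
genes = ["TLR4", "MYD88", "NFKB1"]
static_edges = [("TLR4", "MYD88"), ("MYD88", "NFKB1"), ("NFKB1", "TLR4")]

dbn = unroll(static_edges, genes, n_slices=3)
for source, target in dbn:
    print(source, "->", target)
```

Because every edge points from one time slice to the next, the unrolled graph stays acyclic even though the static structure contains a feedback loop, which is why the dynamic form can represent feedback at all.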

BioSignatureDS™ analysis employs a top-down technique for identifying groups and individual genes/proteins that represent the perturbation of a pathway, sub-networks within a pathway, biological processes, and individual genes over time. This technique is termed Dynamic Bayesian Gene Group Activation™ (DBGGA). To determine differences between experimental conditions, DBGGA scores and ranks groups of genes/proteins across all time points, rather than individual genes at a single time point. In the simplest terms, the technique creates a Bayesian network model for the group of genes of interest. For example, the Toll-like Receptor pathway has a known network structure that defines the causal relationships for the Bayesian model, while temporal relations are captured in the dynamic Bayesian network (Figure 1). The DBGGA method trains the model on control expression data and then tests how well data from the perturbed condition fit that control model. By measuring the likelihood of the perturbed data under the control model, the method can compare the magnitude of pathway perturbation relative to several hundred other pathways and biological processes available in Seralogix’s database. The method scores, ranks, and then selects significantly activated pathways, biological process groups, and individual genes/proteins resulting from the disease or perturbation conditions. The selection of perturbed pathways and groups provides a method of data reduction that focuses attention on those gene/protein groups that are uniquely affected by the given experimental conditions.
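The train-on-control, test-on-perturbed idea can be sketched as follows. This is not the DBGGA algorithm, whose details are not given here. It is a simplified stand-in that assumes a linear-Gaussian model in which each gene at time t depends on the mean of its parents at time t-1, and it scores a group as the drop in average log-likelihood from control to perturbed data. All function names, the two-gene structure, and the expression values are hypothetical.

```python
import math


def transitions(samples, gene, parents):
    """Collect (mean parent value at t-1, gene value at t) pairs."""
    pairs = []
    for s in samples:
        for t in range(1, len(s[gene])):
            x = sum(s[p][t - 1] for p in parents) / len(parents)
            pairs.append((x, s[gene][t]))
    return pairs


def fit(pairs):
    """Least-squares line y = a + b*x plus residual variance."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    b = sum((x - mx) * (y - my) for x, y in pairs) / sxx if sxx > 0 else 0.0
    a = my - b * mx
    var = sum((y - a - b * x) ** 2 for x, y in pairs) / n
    return a, b, max(var, 1e-6)  # variance floor keeps the density finite


def mean_loglik(pairs, model):
    """Average Gaussian log-likelihood of the pairs under a fitted model."""
    a, b, var = model
    total = sum(-0.5 * math.log(2 * math.pi * var)
                - (y - a - b * x) ** 2 / (2 * var) for x, y in pairs)
    return total / len(pairs)


def group_score(control, perturbed, structure):
    """Sum over genes of the log-likelihood drop under the control model."""
    score = 0.0
    for gene, parents in structure.items():
        ctrl_pairs = transitions(control, gene, parents)
        model = fit(ctrl_pairs)
        score += (mean_loglik(ctrl_pairs, model)
                  - mean_loglik(transitions(perturbed, gene, parents), model))
    return score


# Hypothetical two-gene group: A regulates itself and B.
structure = {"A": ["A"], "B": ["A"]}
control = [
    {"A": [1.0, 1.1, 1.2, 1.1], "B": [0.5, 0.6, 0.7, 0.8]},
    {"A": [1.0, 1.2, 1.1, 1.0], "B": [0.5, 0.7, 0.8, 0.7]},
]
perturbed = [{"A": [1.0, 2.0, 3.0, 3.5], "B": [0.5, 0.4, 0.2, 0.1]}]

print("group perturbation score:", group_score(control, perturbed, structure))
```

Computing the same score for every pathway or process group and sorting the results would give a ranking in the spirit of the one described above. A real analysis would also need a significance threshold, which this sketch leaves out.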

Individual gene and protein expression profiles determine the magnitude of a gene group’s perturbation. The genes that contribute to this perturbation are referred to as mechanistic genes. BioSignatureDS™ analysis identifies these mechanistic genes by Bayesian modeling of the genes in the context of their parent and child relationships (Figure 2). This novel technique is able to detect more subtle changes in expression and identifies important genetic relationships that are potential drug targets and disease biomarkers. BioSignatureDS™ automatically compares the data to several hundred pathways and several thousand biological processes to determine which groups and individual genes/proteins are responding to the perturbation conditions, thereby providing a comprehensive systems biology analysis.

Figure 2. A Graphical Example of a Toll-like Receptor Pathway Analysis. The blue concentric rings indicate candidate mechanistic genes discovered in context with their neighboring relationships through the Bayesian modeling techniques of BioSignatureDS™ analysis.
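A small sketch can show why scoring a gene in the context of its parent may reveal changes that a gene-by-gene comparison misses. It assumes a single parent, a straight-line relationship fitted on control data, and made-up expression values. The functions `context_z` and `marginal_z` are hypothetical illustrations, not part of BioSignatureDS™.

```python
def fit_line(pairs):
    """Least-squares intercept and slope for (parent, child) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sxx
    return my - slope * mx, slope


def context_z(control, perturbed):
    """Mean absolute residual of perturbed child values given the parent,
    in units of the control residual standard deviation."""
    a, b = fit_line(control)
    resid = [y - (a + b * x) for x, y in control]
    sd = (sum(r * r for r in resid) / len(resid)) ** 0.5
    return sum(abs(y - (a + b * x)) for x, y in perturbed) / (len(perturbed) * sd)


def marginal_z(control, perturbed):
    """Shift of the child's mean expression, ignoring the parent."""
    ys = [y for _, y in control]
    mean = sum(ys) / len(ys)
    sd = (sum((y - mean) ** 2 for y in ys) / len(ys)) ** 0.5
    pert_mean = sum(y for _, y in perturbed) / len(perturbed)
    return abs(pert_mean - mean) / sd


# Made-up data: in control the child tracks its parent. Under perturbation
# the parent is high but the child stays near its control average.
control = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.1), (4.0, 3.9)]
perturbed = [(4.0, 2.4), (4.0, 2.6), (3.8, 2.5)]

print("marginal z:", marginal_z(control, perturbed))
print("context z: ", context_z(control, perturbed))
```

In this toy case the child's average expression barely moves, so a marginal test sees nothing, while the parent-conditioned residual is large because the regulatory relationship itself has broken.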

Disease Models

These highly perturbed gene/protein groups become the building blocks for constructing a Dynamic Bayesian Network model which represents a hypothesized network model of the disease/conditions being studied (Figure 3). The resulting model is also a powerful tool for pattern recognition which, for example, can be used for diagnostics or early efficacy and toxicity detection. Furthermore, the model can be trained with multiple conditions to discern the unique differences or commonalities between different population groups defined by genotypes, race, age, gender or other conditions.

Figure 3. A simplified example dynamic Bayesian network disease model constructed from mechanistic genetic/proteomic relationships identified from the results of BioSignatureDS™ analysis. The model can be used for “what-if” analysis as well as pattern recognition for diagnostics.
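A “what-if” analysis of the kind mentioned for Figure 3 can be sketched as a forward simulation in which one node is clamped to a chosen value. The sketch assumes a deterministic linear update that propagates expected values only; a full dynamic Bayesian network would carry probability distributions instead. The node names, weights, and the `simulate` function are hypothetical.

```python
def simulate(weights, state, steps, clamp=None):
    """Propagate expected values forward through a linear dynamic network.

    weights[child][parent] is the assumed influence of the parent at t-1
    on the child at t. Nodes listed in clamp are held at a fixed value,
    which mimics an intervention such as a knock-down.
    """
    clamp = clamp or {}
    trajectory = [{**state, **clamp}]
    for _ in range(steps):
        prev = trajectory[-1]
        nxt = {child: sum(w * prev[parent] for parent, w in parents.items())
               for child, parents in weights.items()}
        nxt.update(clamp)
        trajectory.append(nxt)
    return trajectory


# Hypothetical three-node cascade: receptor -> adaptor -> cytokine.
weights = {
    "receptor": {"receptor": 1.0},
    "adaptor": {"receptor": 0.8},
    "cytokine": {"adaptor": 0.5, "cytokine": 0.5},
}
start = {"receptor": 1.0, "adaptor": 0.0, "cytokine": 0.0}

baseline = simulate(weights, start, steps=3)
knockdown = simulate(weights, start, steps=3, clamp={"adaptor": 0.0})

print("cytokine, baseline: ", [round(s["cytokine"], 3) for s in baseline])
print("cytokine, knockdown:", [round(s["cytokine"], 3) for s in knockdown])
```

Comparing the two trajectories shows the downstream effect of the intervention, which is the kind of question a trained disease model could be asked before running an experiment.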

Using BioSignatureDS™ in Vaccine and Immunotherapeutic Development

A recent multi-conditional study used BioSignatureDS™ analysis to identify unique and common perturbations of groups of genes belonging to host defensive or pathogen invasive functions. BioSignatureDS™ analysis revealed early temporal changes in host response to three different pathogens, wild type S. typhimurium (STMWT), B. melitensis (BMEL), and Mycobacterium avium subsp. avium (MAA). The analysis used a multi-comparative approach to provide a comprehensive systems perspective for both the host and pathogen with the goal of discovering novel mechanistic genes important for vaccine and immunotherapeutic development.

BioSignatureDS™ analysis scored and ranked 206 known metabolic and signaling pathways for each condition and 98 pathogen-specific pathways at seven different time points post infection. The pathway scoring identified a significant number of metabolic and immune response signal transduction pathways that were perturbed either uniquely in one condition or in common across conditions. The Venn diagrams of Figure 4 provide a high-level summary of the comparisons for pathway and mechanistic gene analysis for the host.

Figure 4. Multi-conditional comparative analysis identified the unique differences and commonalities of a host response to three different pathogens. The left side Venn diagram represents summary results for pathway analysis and the right side Venn diagram is for mechanistic genes.
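The unique and common regions of a Venn diagram like those in Figure 4 can be computed with plain set operations. The sketch below assumes each condition has already been reduced to a set of significantly perturbed pathway names. The names `pathway_A` to `pathway_E` are placeholders, not results from the study.

```python
def venn_regions(groups):
    """Map each combination of conditions to the items found in exactly
    that combination and in no other condition."""
    regions = {}
    for item in set().union(*groups.values()):
        members = frozenset(name for name, items in groups.items()
                            if item in items)
        regions.setdefault(members, set()).add(item)
    return regions


# Placeholder pathway sets per condition (not study results).
perturbed = {
    "STMWT": {"pathway_A", "pathway_B", "pathway_C"},
    "BMEL": {"pathway_B", "pathway_C", "pathway_D"},
    "MAA": {"pathway_C", "pathway_E"},
}

regions = venn_regions(perturbed)
for conditions, pathways in sorted(regions.items(),
                                   key=lambda kv: sorted(kv[0])):
    print(" & ".join(sorted(conditions)), "->", sorted(pathways))
```

The same function works unchanged for mechanistic genes, since it only needs one set of identifiers per condition.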

To further illustrate this multi-comparative analysis, the bar plots in Figure 5 indicate the levels of perturbation for the top 8 significantly perturbed pathways for the STMWT condition in comparison to the pathway activation scores of BMEL and MAA. The analysis clearly indicates that Salmonella (STMWT) has a distinct biosignature in comparison to the host responses to the other two pathogens. STMWT had an early but moderate Toll-like Receptor pathway response with a much larger perturbation at 240 minutes post infection, while BMEL and MAA showed only a modest response across the time course. Cytokine activation grew in magnitude over time for STMWT while MAA showed a delayed response. Apoptosis activation occurred early for STMWT while for MAA it was delayed. Such analysis is available for each condition and for each pathway. Along with each pathway analysis, the candidate mechanistic genes were identified; however, confidentiality requirements of this project prevent their disclosure.

The analysis results gave new insights into the possible genetic relationships that are altered by the invading pathogens and identified new targets for intervention that would previously have gone undetected. Currently, new immunotherapeutic drug candidates are being evaluated.

Figure 5. BioSignatureDS™ provides powerful comparative analysis capability which can identify significantly perturbed pathways in comparison to other experimental conditions. Shown here are the results of STMWT pathway scoring versus time in comparison to BMEL and MAA.

Application to Pre-symptomatic Diagnosis and Personalized Medicine

The cause of most human disease lies in the functional derangement of gene-gene and protein-protein interactions. Understanding the role of gene and protein networks in disease will create enormous pharmaceutical and clinical opportunities, since these pathways represent the drug targets of the next decade. In the future, entire cellular networks, not just single deregulated proteins, will be the targets of therapeutics. It will soon be possible to analyze the state of protein signaling pathways in disease-altered cells and circulating blood, before, during, and after therapy. This will herald the advent of true patient-tailored therapy, i.e. personalized medicine.

Seralogix’s proprietary BioSignatureDS™ discovery software combined with expert services offers customized solutions for systems biology data analysis. BioSignatureDS™ analysis offers a new generation solution enabling biological discoveries that are otherwise not possible. Highly flexible and easily customized, the BioSignatureDS™ approach integrates varied experimental data sets, including time-course data, both with each other and with existing biological knowledge. The BioSignatureDS™ algorithms use powerful dynamic Bayesian methods of machine learning and pattern recognition to decode the complexities of systems biology. To find out if you qualify for a free trial analysis, email Seralogix with a description of your project and study goals.

About Dr. Drake

Dr. Kenneth Drake is a veteran entrepreneur with an extensive background in the medical device, software, and services businesses, in both early-stage and public companies. He is the key visionary, founder, and CTO of Seralogix. He has over 25 years of experience in the medical and defense industries, where he has held numerous scientific and leadership positions. Prior to founding Seralogix, Dr. Drake’s research and development focused on genomics- and proteomics-based diagnostics and on bioinformatics as a solution for point-of-care diagnostics for infectious diseases.

References

  1. Butcher, E.C., Berg, E.L., and Kunkel, E.J. Nature Biotechnology 22, 1253-1259 (Oct 2004).
  2. Ideker, T., Galitski, T., and Hood, L. A New Approach to Decoding Life: Systems Biology. Annu. Rev. Genomics Hum. Genet. 2, 343-372 (2001).
  3. Buntine, W. Theory Refinement for Bayesian Networks. Proc. Seventh Conference on Uncertainty in Artificial Intelligence, 52-60, Morgan Kaufmann (1991).
  4. Friedman, N., and Goldszmidt, M. Learning Bayesian Networks with Local Structure. Proc. Twelfth Conference on Uncertainty in Artificial Intelligence, 211-219, Morgan Kaufmann (1996).
  5. Barash, Y., and Friedman, N. Context-Specific Bayesian Clustering for Gene Expression Data. RECOMB, 12-21 (2001).
  6. Tominaga, Y., Okamoto, M., and Eguchi, S. Development of a System for the Inference of Large Scale Genetic Networks. Pac. Symp. Biocomput., 446-458 (2001).