
The methods, models, and data produced by the Center's research activities are made available through geWorkbench, an interoperable, grid-enabled, state-of-the-art bioinformatics platform that allows them to be (a) integrated with a variety of existing bioinformatics modules for the analysis, visualization, and management of multiple data modalities and (b) assembled into complex bioinformatics workflows and biomedical applications using a simple yet powerful visual front-end and a scripting language.

geWorkbench (Genomic Workbench) is based on caWorkbench, an integrated genomics environment previously developed by the Center's investigators with funding from NCI. A key feature of geWorkbench is its integration with GenePattern, a leading bioinformatics application that is also funded by caBIG, which gives geWorkbench users access to the advanced analysis modules available in GenePattern. We have also developed a Biomedical Informatics Structured Ontology (BISON), which is used to create interoperable interfaces for the components of both platforms, and a scripting language that allows those components to be assembled into complex workflows. Many data-intensive or computationally intensive components have already been wrapped as grid services using the caGrid framework.

geWorkbench

A team of experts familiar with large-scale commercial and academic software development is leading and coordinating the geWorkbench-related activities. Drawing on significant experience in working with the biomedical community, gained through participation in the caBIG project (among others), the development effort is both driven and tested by the broader biomedical community to ensure the usefulness of the geWorkbench tools and graphical user interfaces. The Center runs workshops and web-based seminars (webinars) for developers and end-users, and online documentation, including videos and multimedia materials, is made available to the community for training purposes. An important element of the software development effort is the use of established software engineering principles, including:

  1. A rational Software Development Lifecycle based on UML tools and processes, including the creation of functional requirements, use cases, and entity relationship diagrams.
  2. The use of proven (community-based) software development methodologies, including
    (a) source code version control,
    (b) bug tracking and resolution,
    (c) mailing-list-based communities, and
    (d) extensive unit, system, and integration testing.
  3. An appropriate modification of the UML approach that allows rapid integration of software prototypes from individual Center projects, supporting the creation and management of an innovation pipeline.
  4. A community-centric approach to the extension of the formal ontology (BISON) that supports the component interoperability framework. This is in part accomplished by relying on the existing NCI caDSR repository to deposit and revise BISON concepts and to obtain community feedback.

An important element of the software platform is the development of specific software components that integrate algorithms and databases resulting from the Center's biomedical computation research activities. For instance, the interaction of two transcription factors may be identified from the literature (using NLP methods), from protein-protein structural interactions (using domain recognition motifs), from expression data (using information-theoretic or regression methods), or from databases (using ChIP-on-chip experimental assays). Each such clue is quantified through a likelihood function, and integrating the clues improves the overall confidence in the interaction. Besides providing evidence about a specific interaction, this approach allows clues obtained by one method to trigger conditional analysis by other methods. For instance, the identification of a specific transcription factor interaction by regression-based reverse engineering may trigger the analysis of that protein-protein interaction at the structural level (assuming that both protein structures are known).
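As a rough illustration of the evidence integration described above, the following Python sketch combines per-method likelihood ratios into a single posterior probability and reports which follow-up analyses strong evidence would trigger. It is not the Center's implementation: the naive-Bayes assumption that the evidence sources are independent, the function names (`combine_evidence`, `follow_up_analyses`), the source labels, the prior, the threshold, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def combine_evidence(prior, likelihood_ratios):
    """Combine evidence sources into a posterior probability.

    ASSUMPTION: sources are conditionally independent (naive Bayes), so
    their likelihood ratios multiply. Each ratio is
    P(evidence | interaction) / P(evidence | no interaction).
    """
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios.values():
        log_odds += math.log(ratio)
    # Convert the accumulated log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


def follow_up_analyses(likelihood_ratios, triggers, threshold):
    """List methods that strong evidence should trigger and that have not run yet.

    `triggers` maps an evidence source to the method it can trigger
    (a hypothetical configuration, not part of geWorkbench).
    """
    pending = []
    for source, ratio in likelihood_ratios.items():
        target = triggers.get(source)
        already_run = target in likelihood_ratios
        if ratio >= threshold and target is not None and not already_run:
            if target not in pending:
                pending.append(target)
    return pending


# Hypothetical evidence for one transcription factor pair.
evidence = {"literature": 3.0, "expression_regression": 8.0}
# Hypothetical rule: a strong regression hit triggers structural analysis.
triggers = {"expression_regression": "structural"}

posterior = combine_evidence(prior=0.01, likelihood_ratios=evidence)
pending = follow_up_analyses(evidence, triggers, threshold=5.0)
print(f"posterior = {posterior:.3f}, follow-up = {pending}")
```

Working in log-odds keeps the combination additive, so a new evidence source (for example, a structural analysis that was triggered and has since completed) can be folded in by adding one more entry to `evidence` and recomputing the posterior.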

MAGNet