Understanding Biochemical Reactions
SABIO-RK: Database for Reaction Kinetics
Understanding cellular behavior and complex biochemical processes requires detailed qualitative and quantitative information about single biochemical reactions. Hence, reliable kinetic data for the individual reaction steps are essential for the simulation and modeling of biochemical reactions and complex networks. Kinetic laws describing the dynamics of the reactions, together with their respective parameters determined under certain experimental conditions, are mainly found in the literature and described in many different formats and vocabularies. SABIO-RK [1, 2] is a curated, web-accessible database which integrates, standardizes and annotates these data in a structured format. A general overview of the data flow in the SABIO-RK project is shown in figure 1.
SABIO-RK integrates data from different sources in order to establish a broad information basis and to facilitate access to reaction kinetics data and corresponding information. Reactions, their associations with biochemical pathways, and their enzymes are automatically extracted from the KEGG database [3]. Since most of the kinetic data are found exclusively in the literature, SABIO-RK offers data manually extracted from the literature together with related information obtained from other publicly available biological databases. The kinetic data are related to reactions, organisms, tissues and cellular locations. The type of the kinetic mechanism and the corresponding rate equations are presented together with their parameters and experimental conditions. For some of the reactions, SABIO-RK additionally includes literature-based data about the elementary steps. This comprises not only the single elementary steps with their corresponding kinetic data but also a graphical representation of the reaction mechanism.
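As a minimal illustration of what an entry of this kind carries, the following Python sketch pairs a rate equation with its parameters and experimental conditions. The Michaelis-Menten rate law serves only as a familiar example; the class name, the field names and all numerical values are hypothetical and reflect neither the actual SABIO-RK schema nor any curated entry.

```python
from dataclasses import dataclass, field


@dataclass
class KineticEntry:
    """Hypothetical container for one kinetic record (not the SABIO-RK schema)."""
    reaction: str
    organism: str
    tissue: str
    parameters: dict                                 # e.g. {"Vmax": ..., "Km": ...}
    conditions: dict = field(default_factory=dict)   # e.g. pH, temperature

    def rate(self, substrate_conc: float) -> float:
        """Michaelis-Menten rate law: v = Vmax * [S] / (Km + [S])."""
        vmax = self.parameters["Vmax"]
        km = self.parameters["Km"]
        return vmax * substrate_conc / (km + substrate_conc)


# Illustrative values only, not taken from the database.
entry = KineticEntry(
    reaction="S -> P",
    organism="Homo sapiens",
    tissue="liver",
    parameters={"Vmax": 2.0, "Km": 0.5},             # arbitrary units
    conditions={"pH": 7.4, "temperature_C": 37.0},
)

# At [S] = Km the rate equals half of Vmax.
half_max_rate = entry.rate(0.5)
```

Keeping the parameters and the experimental conditions next to the rate law mirrors the point made above: a parameter value is only meaningful together with the equation it belongs to and the conditions under which it was measured.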
Detailed information about the protein catalyzing a reaction is also entered on the basis of the literature. This includes information about specific isozymes or mutations used in the experiments, UniProt [4] accession numbers, and the subunit composition of protein complexes.
In addition to extraction from the literature, kinetic data can be inserted into SABIO-RK with an XML-based integration tool that imports larger amounts of kinetic data automatically.
Data in SABIO-RK are extracted manually from literature and the selection of papers is not restricted to any biological source (e.g. organisms or organism classifications). All the data are curated and annotated by biological experts using a web-based input interface. To support the curation process and data integration we have implemented different constraints in the input interface and offer several controlled vocabularies as lists of values, as well as additional semi-automatic consistency checks to avoid errors and inconsistencies in the database.
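The idea of constraining input to lists of allowed values and running consistency checks can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the SABIO-RK input interface: the vocabulary contents, the field names and the function `check_entry` are all hypothetical.

```python
# Hypothetical controlled vocabularies; the real lists are maintained by the curators.
CONTROLLED_VOCABULARY = {
    "parameter_type": {"Km", "Vmax", "kcat", "Ki"},
    "unit": {"mM", "M", "1/s"},
}


def check_entry(entry: dict) -> list:
    """Return a list of human-readable problems found in one curated entry."""
    problems = []
    # Every constrained field must hold a value from its list of allowed values.
    for field_name, allowed in CONTROLLED_VOCABULARY.items():
        value = entry.get(field_name)
        if value not in allowed:
            problems.append(
                f"{field_name}: '{value}' is not in the controlled vocabulary"
            )
    # Simple consistency constraint: these kinetic constants cannot be negative.
    if entry.get("value", 0) < 0:
        problems.append("value: must not be negative")
    return problems


valid_problems = check_entry(
    {"parameter_type": "Km", "unit": "mM", "value": 0.3}
)
invalid_problems = check_entry(
    {"parameter_type": "KM", "unit": "mM", "value": -1.0}
)
```

In this sketch the misspelled parameter type `KM` and the negative value are both reported, whereas the first entry passes without complaints.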
Scientific communication needs standards and a shared vocabulary to avoid misinterpretations. Such standards are especially important when gathering data from different sources. To unambiguously identify entities or terms and to facilitate search, interpretation and comparison of the data, these are standardized to a uniform format and structure. Controlled vocabularies and annotations with terms or identifiers from external resources and ontologies are used to identify and relate the data to their biological context. All these efforts to unify and integrate the data augment the content and the semantics of the SABIO-RK database entries to enable a comprehensive understanding and comparison of the data for the user.
Biological ontologies used in SABIO-RK are ChEBI [5], a dictionary and ontological classification of small chemical compounds, the Systems Biology Ontology (SBO) [6], a controlled vocabulary for systems biology, and the NCBI taxonomy [7], a complex vocabulary and classification of organisms.
SABIO-RK can be accessed via a web-based user interface or via web services (http://sabio.villa-bosch.de/SABIORK). The web-based user interface enables the search for reactions and their kinetics by specifying characteristics of the reactions. Complex queries can be defined by selecting reaction participants (substrates, products, catalysts etc.), organisms, tissues or cellular locations, kinetic parameters, environmental conditions or literature sources. The search results (fig. 2) are color-coded: green entries indicate that kinetic information for the reactions and associated enzymes is available under the defined search criteria, whereas yellow entries indicate results that do not exactly match the search criteria, for example entries for another organism.
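The distinction between exactly and partially matching results can be sketched in a few lines of Python. This is a loose illustration only: the in-memory entries, the field names and the rule "all criteria match means green, some match means yellow" are assumptions made for the example and are not taken from the SABIO-RK search implementation.

```python
def classify(entry: dict, criteria: dict) -> str:
    """Label an entry 'green' if every criterion matches, 'yellow' if only some do."""
    matches = [entry.get(key) == wanted for key, wanted in criteria.items()]
    if all(matches):
        return "green"
    if any(matches):
        return "yellow"
    return "no match"


# Hypothetical entries, invented for this example.
entries = [
    {"id": 1, "substrate": "glucose", "organism": "Homo sapiens", "tissue": "liver"},
    {"id": 2, "substrate": "glucose", "organism": "Rattus norvegicus", "tissue": "liver"},
    {"id": 3, "substrate": "pyruvate", "organism": "Mus musculus", "tissue": "brain"},
]

# A query combining a reaction participant with an organism.
criteria = {"substrate": "glucose", "organism": "Homo sapiens"}

results = {entry["id"]: classify(entry, criteria) for entry in entries}
```

Here entry 1 satisfies both criteria, entry 2 matches the substrate but comes from another organism, and entry 3 matches neither criterion.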
Links to other databases, based on the annotations of the data, enable the user to gather further information, for example on compounds, reactions or proteins. Selected data about reactions and their kinetics, together with their annotations to external databases and ontologies, can be exported in SBML (Systems Biology Markup Language) [8], a widely used standard exchange format in systems biology.
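Because SBML is XML-based, an exported file can be read with standard XML tooling. The sketch below uses Python's standard library to list the reactions of a small SBML fragment with their reactants and products. The fragment was written by hand for this example and is heavily simplified; it is not an actual SABIO-RK export. The namespace shown is that of SBML Level 2 Version 1, which is an assumption here, since an exported file may declare a different level or version. The function name `list_reactions` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified SBML Level 2 fragment (no kinetic laws or annotations).
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="example">
    <listOfCompartments>
      <compartment id="cell"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S" name="substrate" compartment="cell"/>
      <species id="P" name="product" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" name="S to P">
        <listOfReactants><speciesReference species="S"/></listOfReactants>
        <listOfProducts><speciesReference species="P"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

# Namespace of SBML Level 2 Version 1; adjust it to the file being read.
NS = {"sbml": "http://www.sbml.org/sbml/level2"}


def list_reactions(sbml_text: str) -> dict:
    """Map each reaction id to a (reactants, products) pair of species ids."""
    root = ET.fromstring(sbml_text)
    reactions = {}
    for reaction in root.findall(".//sbml:reaction", NS):
        reactants = [
            ref.get("species")
            for ref in reaction.findall("sbml:listOfReactants/sbml:speciesReference", NS)
        ]
        products = [
            ref.get("species")
            for ref in reaction.findall("sbml:listOfProducts/sbml:speciesReference", NS)
        ]
        reactions[reaction.get("id")] = (reactants, products)
    return reactions


reactions = list_reactions(SBML_TEXT)
```

For real model files a dedicated SBML library is usually the better choice, since it also validates the document and handles the differences between SBML levels and versions.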
SABIO-RK is a curated data resource for modelers of biochemical networks to assemble information about reactions and their kinetics. It also enables experimentalists to obtain information about biochemical reactions and their kinetics, within the context of cellular locations, tissues and organisms. The kinetic information is manually extracted from literature and put into relation to corresponding data automatically extracted from databases. The database uses controlled vocabularies and links to other ontologies or external databases to allow for the comparison of data and to extract additional information from other sources.
The project is financed by the Klaus Tschira Foundation (www.klaus-tschira-stiftung.de/) and the Federal Ministry of Education and Research (www.bmbf.de/) through the HepatoSys Competence Network (www.hepatosys.de/) and the European transnational research initiative SysMO (www.sysmo-db.org/).
[1] Wittig U. et al.: Lecture Notes in Computer Science 4075, 94-103 (2006)
[2] Rojas I. et al.: In Silico Biol. 7 (Suppl 2) (17822389), 37-44 (2007)
[3] Kanehisa M. et al.: Nucleic Acids Res. 36, D480-D484 (2008)
[4] The UniProt Consortium: Nucleic Acids Res. 36, D190-D195 (2008)
[5] Degtyarenko K. et al.: Nucleic Acids Res. 36, D344-D350 (2008)
[6] Le Novère N.: BMC Neuroscience 7 (Suppl 1), S11 (2006)
[7] Wheeler D.L. et al.: Nucleic Acids Res. 28(1), 10-14 (2000)
[8] Hucka M. et al.: Bioinformatics 19, 524-531 (2003)
Ulrike Wittig, Renate Kania, Martin Golebiewski, Andreas Weidemann, Olga Krebs, Saqib Mir, Henriette Engelken, Heidrun Sauer-Danzwith, Wolfgang Mueller, Isabel Rojas
EML Research gGmbH, Heidelberg, Germany