Links A-L

  • AbsIDconvert (article) THESAURI – TOOL | GENOMICS – IDENTIFIER
  • AbsIDconvert is an absolute approach for converting genetic identifiers at different granularities and is based on the unique idea that genomic identifiers can be converted to genomic intervals, and therefore conversion between identifiers requires simply finding overlapping intervals.
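    The interval idea described above can be illustrated with a minimal sketch. This is not AbsIDconvert's own code: the identifier tables, coordinates, and function names below are hypothetical, and a real tool would use an indexed interval structure rather than a linear scan.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


# Hypothetical identifier-to-interval tables; all coordinates are made up.
GENES: Dict[str, Interval] = {
    "geneA": Interval("chr1", 100, 500),
    "geneB": Interval("chr1", 800, 1200),
}
PROBES: Dict[str, Interval] = {
    "probe1": Interval("chr1", 450, 475),
    "probe2": Interval("chr1", 600, 650),
    "probe3": Interval("chr2", 100, 500),
}


def overlaps(a: Interval, b: Interval) -> bool:
    """Two intervals overlap if they share a chromosome and at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def convert(identifier: str,
            source: Dict[str, Interval],
            target: Dict[str, Interval]) -> List[str]:
    """Map an identifier to every target identifier whose interval overlaps it."""
    query = source[identifier]
    return sorted(name for name, iv in target.items() if overlaps(query, iv))


print(convert("geneA", GENES, PROBES))   # ['probe1']
print(convert("probe2", PROBES, GENES))  # []
```

    The same function works in both directions (gene to probe, probe to gene), because once every identifier is an interval the conversion is symmetric.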

  • Accelerating Medicines Partnership (article) PROJECT | BIOMARKERS – SYSTEMS BIOLOGY
  • The NIH has joined with over 15 companies and not-for-profit organizations to create the Accelerating Medicines Partnership, a public-private partnership (PPP) focused on discovering new targets and biomarkers for 4 diseases. The precompetitive PPP will make results publicly available, aiming to spur innovation from all sectors of the industry.

  • ACToR PROJECT – DATABASE
  • The Aggregated Computational Toxicology Resource (ACToR) is EPA’s online warehouse of all publicly available chemical toxicity data and can be used to find all publicly available data about potential chemical risks to human health and the environment. ACToR aggregates data from over 1000 public sources on over 500,000 environmental chemicals searchable by chemical name, other identifiers and by chemical structure. The data warehouse: 1) allows users to search and query data from other EPA chemical toxicity databases including: ToxRefDB (30 years and $2 billion worth of animal toxicity studies), ToxCastDB (data from screening 1,000 chemicals in over 500 high-throughput assays), ExpoCastDB (consolidates and links human exposure and exposure factor data for chemical prioritization), and DSSTox (provides high quality chemical structures and annotations), 2) includes chemical structure, physico-chemical values, in vitro assay data and in vivo toxicology data, and 3) includes, but is not limited to, high and medium production volume industrial chemicals, pesticides (active and inert ingredients), and potential ground and drinking water contaminants.

  • Acute Toxicity Database DATABASE
  • The Acute Toxicity Database summarizes the results from aquatic acute toxicity tests conducted by the USGS CERC located in Columbia, Missouri. The acute toxicity test provides a relative starting point for hazard assessment of contaminants and is required for federal chemical registration programs such as the Federal Insecticide Fungicide Rodenticide Act (PL 80-104) as amended by the Federal Environmental Pesticide Control Act of 1972 (7 U.S.C. 136-136y) and the Toxic Substances Control Act of 1976 (PL 94-469).

  • ACuteTox PROJECT – DATABASE
  • The ACuteTox project has set the very ambitious overall objective of developing an in-vitro test strategy sufficiently robust and powerful to completely replace in-vivo testing of acute toxicity of chemicals. Anticipated corollary results of this overall objective are decreased testing costs and improved scientific validity of the results. In order to realise such an in-vitro test strategy, a number of building blocks are required.

  • ADME SARfari (article) TOOL | ADME – PHARMACOKINETICS – PHARMACOLOGY – SYSTEMS BIOLOGY
  • ADME SARfari enables users to: 1) Predict likely ADME targets for an input molecule, 2) Find ADME targets similar to an input FASTA sequence, 3) Find ADME targets related to text terms, 4) Find pharmacokinetic data relating to an input target, sequence or text term, 5) Find activity/pharmacokinetic data for an input molecule or related compounds (via a similarity/substructure search), and 6) Match expression levels in human tissues for found targets.

  • admetSAR (article) TOOL | ADME – METABOLISM – QSAR MODELING
  • admetSAR provides the latest and most comprehensive manually curated data for diverse chemicals associated with known Absorption, Distribution, Metabolism, Excretion and Toxicity profiles. admetSAR offers a user-friendly interface to search for ADMET property profiles by name, CASRN and similarity search. In addition, admetSAR can predict about 50 ADMET endpoints with our recently developed chemoinformatics-based toolbox, entitled ADMET-Simulator, which integrates high quality and predictive QSAR models. admetSAR will be helpful for in silico screening of ADMET profiles of drug candidates and environmental chemicals.

  • ADME/Tox Web TOOL | ADME – CYP450 – METABOLISM – P-gP – PHARMACOKINETICS – PHARMACOLOGY – QSAR MODELING
  • ACD/ADME Suite is a collection of software modules that provide predictions relating to the pharmacokinetic profiling of compounds, specifically their Absorption, Distribution, Metabolism, and Excretion properties. Predict P-gp specificity, oral bioavailability, passive absorption, blood brain barrier permeation, distribution, P450 inhibitors, substrates and inhibitors, maximum recommended daily dose, Abraham-type (Absolv) solvation parameters, and more. Predictions are based on a combination of expert knowledge, scientific intuition, and QSAR modeling.

  • Adverse Outcome Pathway Knowledge Database DATABASE | PATHWAY – RISK ASSESSMENT
  • The Organisation for Economic Co-operation and Development (OECD) has launched the Adverse Outcome Pathway Knowledge Base (AOP KB). This joint initiative between the OECD, the U.S. Environmental Protection Agency and the European Commission Joint Research Centre is a web-based platform which aims to bring together all the knowledge on how chemicals can induce adverse effects, providing a focal point for AOP development and dissemination.
    The first AOP KB module is the AOP Wiki – an interactive and virtual encyclopaedia for AOP development. All stakeholders from academia, governmental agencies and the chemical industry are invited to use the Wiki either as a source of information, or as active contributors posting comments and content. This expert contribution from third-parties is strongly encouraged since it is through such “crowd sourcing” that the AOP KB will ultimately evolve.

  • AERS DATABASE | DRUG SAFETY – ADVERSE EVENT
  • The Adverse Event Reporting System (AERS) is a computerized information database designed to support the FDA’s post-marketing safety surveillance program for all approved drug and therapeutic biologic products. The FDA uses AERS to monitor for new adverse events and medication errors that might occur with these marketed products.
    AERS is a useful tool for FDA, which uses it for activities such as looking for new safety concerns that might be related to a marketed product, evaluating a manufacturer’s compliance to reporting regulations and responding to outside requests for information. The reports in AERS are evaluated by clinical reviewers in the Center for Drug Evaluation and Research (CDER) and the Center for Biologics Evaluation and Research (CBER) to monitor the safety of products after they are approved by FDA.

  • AMBIT DATABASE – TOOL | DATA MINING – MOLECULAR DESCRIPTORS – QSAR MODELING
  • The AMBIT system consists of a database and functional modules allowing a variety of flexible searches and mining of the data stored in the database. The AMBIT database stores chemical structures; their identifiers, such as CAS, EINECS and InChI numbers; and attributes such as molecular descriptors and experimental data, together with test descriptions and literature references. The database can also store QSAR models. In addition, the software can generate a suite of 2D and 3D molecular descriptors.

  • AnatomyTagger (article) THESAURI – TOOL | ANNOTATION – TEXT MINING
  • AnatomyTagger is a machine learning-based system for anatomical entity mention recognition. The system incorporates a broad array of approaches proposed to benefit tagging, including the use of Unified Medical Language System (UMLS)- and Open Biomedical Ontologies (OBO)-based lexical resources, word representations induced from unlabeled text, statistical truecasing and non-local features. We train and evaluate the system on a newly introduced corpus that substantially extends previously available resources, and apply the resulting tagger to automatically annotate the entire open access scientific domain literature. The resulting analyses have been applied to extend services provided by the Europe PubMed Central literature database.

  • Annotare (article) THESAURI – TOOL | ANNOTATION – ONTOLOGY
  • Annotare is a tool for annotating biomedical investigations and resulting data. It is a stand-alone desktop application that features 1) a set of intuitive editor forms to create and modify annotations, 2) support for easy incorporation of terms from biomedical ontologies, 3) standard templates for common experiment types, 4) a design wizard to help create a new document, and 5) a validator that checks for syntactic and semantic violations.

  • ANTARES PROJECT | QSAR MODELING – REGULATORY GUIDELINES – TOXENDPOINT
  • REACH legislation states that Non-Testing Methods (NTM), which include Quantitative Structure-Activity Relationship (QSAR) models and read-across, can be used within REACH. Before conducting an animal experiment, industry should verify whether alternative methods exist. However, so far there is a deep gap in knowledge about which methods are available and can be used in practice. ANTARES aims to reduce this gap by assessing NTM as an alternative approach for the REACH legislation.

  • APREDICA OTHER | ADME – PHARMACOKINETICS
  • APREDICA provides preclinical contract testing services for the evaluation and optimization of the ADME, Toxicity, and Pharmacokinetic properties of drug candidates early in the drug-discovery process. We also provide in vitro testing services for cosmetics, agrochemicals, and other products for REACH compliance.

  • ApPredict (article) TOOL | PHARMACOLOGY – TOXICITY ENDPOINT
  • ApPredict is a free open source program for prediction of action potential changes under drug-block of ion channels. This web portal provides a user-interface, results database and results presentation for the Oxford ‘Action Potential Prediction’ (ApPredict) open source cardiac electrophysiology simulator, for changes to the action potential under drug block of multiple ion channels. It allows you to enter information on multiple cardiac ion channel block by a compound, and to predict the effect on whole cardiac myocyte electrophysiology, across a range of concentrations. The ion channel block is modelled as conductance block, and it can be used with a number of action potential models (specified using CellML), different pacing rates, and blockade of the following ion channels can be included: IKr (hERG), IKs (KCNQ1), ICaL (CaV1.2), INa (NaV1.5), Ito (Kv4.3), IK1 (KCNN4).
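    As a rough illustration of what "conductance block" means, the sketch below scales a channel's maximal conductance by a Hill-type dose-response curve. The description above does not give the formula, so the Hill form, the function names, and the example numbers are assumptions made for illustration and are not taken from the ApPredict code.

```python
def fraction_unblocked(concentration: float, ic50: float, hill: float = 1.0) -> float:
    """Fraction of channel conductance remaining at a given drug concentration.

    Assumes a Hill-type curve: 1 / (1 + (C / IC50) ** hill).
    Concentration and IC50 must be in the same units.
    """
    if concentration < 0 or ic50 <= 0 or hill <= 0:
        raise ValueError("concentration must be >= 0; ic50 and hill must be > 0")
    return 1.0 / (1.0 + (concentration / ic50) ** hill)


def blocked_conductance(g_max: float, concentration: float,
                        ic50: float, hill: float = 1.0) -> float:
    """Scale the maximal conductance of one ion channel by the unblocked fraction."""
    return g_max * fraction_unblocked(concentration, ic50, hill)


# Hypothetical compound: IC50 of 1.0 (arbitrary units) against one channel.
for conc in (0.0, 0.1, 1.0, 10.0):
    print(conc, round(fraction_unblocked(conc, ic50=1.0), 3))
```

    In a multi-channel simulation, one such scaling factor per channel would be applied to the corresponding conductance in the action potential model before it is paced at the chosen rate.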

  • Argo (article) THESAURI – TOOL | ANNOTATION – ONTOLOGY – TEXT MINING – WORKFLOW
  • Argo is an interoperable, integrative, interactive and collaborative system for text analysis with a convenient graphical user interface to ease the development of processing workflows and boost productivity in labour-intensive manual curation. Robust, scalable text analytics follow a modular approach, adopting component modules for distinct levels of text analysis. The user interface is available entirely through a web browser, which saves the user from going through often complicated and platform-dependent installation procedures. Argo comes with a predefined set of processing components commonly used in text analysis, while giving users the ability to deposit their own components. The system accommodates various areas and levels of user expertise, from text mining (TM) and computational linguistics to ontology-based curation. One of the key functionalities of Argo is its ability to seamlessly incorporate user-interactive components, such as manual annotation editors, into otherwise completely automatic pipelines.

  • ArrayExpress DATABASE | GENOMICS
  • The ArrayExpress Archive is a database of functional genomics experiments including gene expression where you can query and download data collected to MIAME and MINSEQE standards. Gene Expression Atlas contains a subset of curated and re-annotated Archive data which can be queried for individual gene expression under different biological conditions across experiments.

  • atBioNet (article) TOOL | GENOMICS – NETWORK
  • atBioNet is a free, user friendly, web-based network analysis tool for analyzing, visualizing, and interpreting genomics or proteomics data. The user supplies atBioNet with a list of proteins or genes, and atBioNet then creates an interactive graphical network model that can identify key functional modules. Pathway information from the Kyoto Encyclopedia of Genes and Genomes (KEGG) is directly integrated within atBioNet for enrichment analysis and assessment of the biological meaning of modules.

  • ATC/DDD index (2010) THESAURI – TOOL
  • The Anatomical Therapeutic Chemical (ATC) classification system and the Defined Daily Dose (DDD) as a measuring unit have become the gold standard for international drug utilization research. The ATC/DDD system is a tool for exchanging and comparing data on drug use at international, national or local levels.
    In the Anatomical Therapeutic Chemical (ATC) classification system, the active substances are divided into different groups according to the organ or system on which they act and their therapeutic, pharmacological and chemical properties. Drugs are classified in groups at five different levels. The drugs are divided into fourteen main groups (1st level), with pharmacological/therapeutic subgroups (2nd level). The 3rd and 4th levels are chemical/pharmacological/therapeutic subgroups and the 5th level is the chemical substance. The 2nd, 3rd and 4th levels are often used to identify pharmacological subgroups when that is considered more appropriate than therapeutic or chemical subgroups.
    The DDD is the assumed average maintenance dose per day for a drug used for its main indication in adults. A DDD will only be assigned for drugs that already have an ATC code.
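    The five-level structure shows up directly in the code itself: a full ATC code has seven characters, and each level is a prefix of the next. The short sketch below splits a code into its levels. The function name is illustrative, and the check covers only the format of a code, not whether the code is actually assigned.

```python
import re
from typing import Dict

# Format of a full 5th-level ATC code: letter, 2 digits, 2 letters, 2 digits.
ATC_PATTERN = re.compile(r"[A-Z]\d{2}[A-Z]{2}\d{2}")


def atc_levels(code: str) -> Dict[int, str]:
    """Split a full ATC code into its five hierarchical levels."""
    code = code.strip().upper()
    if not ATC_PATTERN.fullmatch(code):
        raise ValueError(f"not a full 7-character ATC code: {code!r}")
    return {
        1: code[:1],  # anatomical main group
        2: code[:3],  # pharmacological/therapeutic subgroup
        3: code[:4],  # chemical/pharmacological/therapeutic subgroup
        4: code[:5],  # chemical/pharmacological/therapeutic subgroup
        5: code,      # chemical substance
    }


# A10BA02 is the ATC code for metformin.
print(atc_levels("A10BA02"))
# {1: 'A', 2: 'A10', 3: 'A10B', 4: 'A10BA', 5: 'A10BA02'}
```

    Because every level is a prefix, drug-use data coded at the 5th level can be aggregated to any higher level by truncating the code.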

  • AtlasCBS server TOOL | PHARMACOLOGY
  • AtlasCBS is a tool that allows you to explore chemico-biological space using Ligand Efficiency Indices (LEIs) as variables. This unique set of variables permits a graphical representation of the content of databases that contain affinity data (SAR databases such as PDBBind, BindingDB) in an atlas-like fashion. The server allows you to see the content of the database(s) of choice as pages in a map-like environment with different variables and scales. The content of the databases is displayed in two-dimensional pages where the angular coordinate is related to the chemical composition of the ligand(s) and the radial coordinate is related to the affinity of the ligand(s) towards the specific target(s).

  • AutoBind (article) DATABASE – TOOL | ANNOTATION – PHARMACOLOGY – TEXT MINING
  • AutoBind is a protein-ligand binding information database that uses automated information extraction techniques to collect the majority of binding affinity data from the primary references of Protein Data Bank entries that report binding measurements. AutoBind collects and updates data every month, giving users access to richer and more up-to-date data than other databases. AutoBind not only provides simple affinity values but also displays the full sentences that describe the particular protein binding relation.

  • BAO (article) THESAURI | ONTOLOGY
  • The BioAssay Ontology (BAO) describes biological screening assays and their results including high-throughput screening (HTS) data for the purpose of categorizing assays and data analysis. BAO is an extensible, knowledge-based, highly expressive (currently SHOIQ(D)) description of biological assays making use of descriptive logic based features of the Web Ontology Language (OWL). BAO currently has over 700 classes and also makes use of several other ontologies. It describes several concepts related to biological screening, including Perturbagen, Format, Meta Target, Design, Detection Technology, and Endpoint. Perturbagens are perturbing agents that are screened in an assay; they are mostly small molecules. Assay Meta Target describes what is known about the biological system and / or its components interrogated in the assay (and influenced by the Perturbagen). Meta target can be directly described as a molecular entity (e.g. a purified protein or a protein complex), or indirectly by a biological process or event (e.g. phosphorylation). Format describes the biological or chemical features common to each test condition in the assay and includes biochemical, cell-based, organism-based, and variations thereof. The assay Design describes the assay methodology and implementation of how the perturbation of the biological system is translated into a detectable signal. Detection Technology relates to the physical method and technical details to detect and record a signal. Endpoints are the final HTS results as they are usually published (such as IC50, percent inhibition, etc.). BAO has been designed to accommodate multiplexed assays. All main BAO components include multiple levels of sub-categories and specification classes, which are linked via object property relationships forming an expressive knowledge-based representation.

  • BeCAS (article) THESAURI – TOOL | ANNOTATION – TEXT MINING
  • BeCAS, the Biomedical Concept Annotation System, is an API for biomedical concept identification and a web-based tool. MEDLINE abstracts or free text can be annotated directly in the web interface, where identified concepts are enriched with links to reference databases. Using its customizable widget, it can also be used to augment external web pages with concept highlighting features. Furthermore, all text-processing and annotation features are made available through an HTTP REST API, allowing integration in any text-processing pipeline.

  • Benchmark Data Set for In Silico Prediction of Ames Mutagenicity DATABASE | STRUCTURE-BASED PREDICTION – TOXENDPOINT
  • The Benchmark Data Set is a public benchmark data set of 6512 chemical compounds together with their Ames mutagenicity test results (a 2-class classification problem). The data set was designed for the evaluation of in silico prediction methods.

  • BiGG (article) DATABASE | METABOLISM – SYSTEMS BIOLOGY
  • The Biochemical Genetic and Genomic (BiGG) knowledgebase is a metabolic reconstruction of human metabolism designed for systems biology simulation and metabolic flux balance modelling.

  • BiNChE (article) TOOL | ANNOTATION – ONTOLOGY – PHARMACOLOGY – VOCABULARY
  • BiNChE is an enrichment analysis tool for small molecules based on the ChEBI Ontology. BiNChE displays an interactive graph that can be exported as a high-resolution image or in network formats. The tool provides plain, weighted and fragment analysis based on either the ChEBI Role Ontology or the ChEBI Structural Ontology.

  • Binding DB DATABASE | PHARMACOLOGY – QSAR MODELING
  • BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems. The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target (“TARGET_TYPE=’PROTEIN’”) is provided. Data extracted by BindingDB typically includes more details regarding experimental conditions, etc. BindingDB currently contains about 620,000 binding data for 5,500 proteins and over 270,000 drug-like molecules.

  • BioLabeler THESAURI – TOOL | ANNOTATION
  • BioLabeler extracts UMLS concepts from biomedical texts such as scientific paper abstracts, experiment descriptions or medical notes, and can be used to automatically curate and annotate biomedical literature, or to index large document databases to improve searches or discover relationships between documents. The concepts from the UMLS database are presented in order of their relative probability of representing the knowledge in the text, taking into account different semantic properties of the input terms. BioLabeler can filter the results, extracting concepts only from specific dictionaries (GO, MeSH, etc.) and/or specific semantic types (Diseases, Genes, etc.). Recognizing specific biomedical concepts in free text is an increasingly important process, and BioLabeler focuses on this task to help human and computer annotators be more precise, improving the quality of the huge biomedical text databases that bioinformaticians and biologists have to deal with nowadays.

  • BioMet Toolbox TOOL | METABOLISM – NETWORK
  • The BioMet ToolBox is a web-based resource for analysis of high-throughput data, together with methods for flux analysis (fluxomics) and integration of transcriptome data exploiting the capabilities of metabolic networks described in genome scale models. The BioMet ToolBox also includes genome scale metabolic models of various cell factories used both in industrial biotechnology and in fundamental research.

  • Biomine (article) THESAURI – TOOL | ANNOTATION – ONTOLOGY
  • Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness.

  • bioNerDS (article) THESAURI – TOOL | ANNOTATION
  • bioNerDS is a named entity recogniser for the recovery of bioinformatics databases and software from primary literature.

  • BioPAX (article) TOOL | PATHWAY
  • BioPAX is a standard language that aims to enable integration, exchange, visualization and analysis of biological pathway data. Specifically, BioPAX supports data exchange between pathway data groups and thus reduces the complexity of interchange between data formats by providing an accepted standard format for pathway data. By offering a standard with well-defined semantics for pathway representation, BioPAX allows pathway databases and software to interact more efficiently. In addition, BioPAX enables the development of pathway visualization from databases and facilitates analysis of experimentally generated data through combination with prior knowledge. The BioPAX effort is coordinated closely with that of other pathway-related standards initiatives, namely PSI-MI, SBML, CellML, and SBGN, in order to deliver a compatible standard in the areas where they overlap.

  • BioPlat (article) TOOL | BIOMARKERS
  • Human cancer transcriptomes have been extensively profiled over the last decade, allowing the identification of different cancer molecular subtypes and the development of prognostic and predictive gene expression signatures. The identification of novel gene expression signatures is of high relevance not only for their potential value as prognostic / predictive biomarkers but also because they may provide insight into mechanisms and pathways of relevance in human cancer progression. The BioPlat (Biomarkers Platform) is a user-friendly open-source bioinformatic resource, which provides a set of analytic tools for the discovery and in silico evaluation of novel prognostic and predictive cancer biomarkers based on integration and re-use of gene expression signatures in the context of follow-up data.

  • BioPlex (article) DATABASE | NETWORK – SYSTEMS BIOLOGY
  • The BioPlex (biophysical interactions of ORFeome-based complexes) network is the result of creating thousands of cell lines with each expressing a tagged version of a protein from the ORFeome collection. Immunopurification of the tagged protein and detection of associated proteins by mass spectrometry are the building blocks of the network. The overarching project goal is to determine protein interactions for every member of the collection.

  • BioSharing DATABASE
  • BioSharing is a catalogue of databases, described according to the BioDBcore guidelines, along with the standards used within them; partly compiled with the support of Oxford University Press (NAR Database Issue and DATABASE Journal) and Re3Data.org.

  • BioTextQuest+ DATABASE | SYSTEMS BIOLOGY – TEXT MINING
  • BioTextQuest+ is a web-based interactive knowledge exploration platform with significant advances over its predecessor (BioTextQuest). It aims to bridge processes such as bioentity recognition, functional annotation, document clustering and data integration towards literature mining and concept discovery. BioTextQuest+ enables PubMed and OMIM querying, retrieval of abstracts related to a targeted request and optimal detection of genes, proteins, molecular functions, pathways and biological processes within the retrieved documents.

  • BKM-react online (article) DATABASE | METABOLISM – MOLECULAR DESCRIPTORS – PHARMACOLOGY
  • BKM-react online, an abbreviation for BRENDA-KEGG-MetaCyc-reactions online, is an integrated and non-redundant biochemical reaction database containing known enzyme-catalyzed and spontaneous reactions. Biochemical reactions collected from BRENDA (BRaunschweig ENzyme DAtabase), KEGG, and MetaCyc were matched and integrated by aligning substrates and products. BKM-react reaction comparisons were done by an in silico approach that combined two steps: first, a comparison of reactant structures using InChIs (linearized chemical structure descriptors) and, second, a compound name comparison (incl. synonyms). After submitting an EC number or another attribute such as substrate(s), product(s), or the reaction ID of one of the databases, BKM-react online will retrieve all results that match your query and display the aligned reactions for all databases in comparison.
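    The two-step comparison can be sketched as follows. This is an illustration only, not the BKM-react implementation: the record layout is hypothetical, and the way the two steps are combined here (compare InChIs when both records have one, otherwise fall back to names and synonyms) is an assumption, since the description only says the steps were combined.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Compound:
    names: FrozenSet[str]        # lower-cased name plus synonyms
    inchi: Optional[str] = None  # structure descriptor, if the source has one


Reaction = Tuple[List[Compound], List[Compound]]  # (substrates, products)


def same_compound(a: Compound, b: Compound) -> bool:
    # Step 1 (assumed): compare structures when both records carry an InChI.
    if a.inchi and b.inchi:
        return a.inchi == b.inchi
    # Step 2 (assumed): otherwise compare names, including synonyms.
    return bool(a.names & b.names)


def side_matches(side_a: List[Compound], side_b: List[Compound]) -> bool:
    """Greedily pair each compound on one side with an unused counterpart."""
    if len(side_a) != len(side_b):
        return False
    remaining = list(side_b)
    for compound in side_a:
        for i, candidate in enumerate(remaining):
            if same_compound(compound, candidate):
                del remaining[i]
                break
        else:
            return False
    return True


def same_reaction(r1: Reaction, r2: Reaction) -> bool:
    """Two reactions align if their substrates and their products both match."""
    return side_matches(r1[0], r2[0]) and side_matches(r1[1], r2[1])


ETHANOL_INCHI = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
# Hypothetical records for the same reaction taken from two databases.
db_a: Reaction = (
    [Compound(frozenset({"ethanol"}), ETHANOL_INCHI)],
    [Compound(frozenset({"acetaldehyde", "ethanal"}))],  # no InChI stored
)
db_b: Reaction = (
    [Compound(frozenset({"ethyl alcohol"}), ETHANOL_INCHI)],
    [Compound(frozenset({"ethanal"}), "InChI=1S/C2H4O/c1-2-3/h2H,1H3")],
)
print(same_reaction(db_a, db_b))  # True
```

    Here the substrates align through their shared InChI even though the names differ, while the products align through the shared synonym because one record lacks a structure.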

  • BRENDA (article) DATABASE | PHARMACOLOGY
  • BRENDA is the main collection of enzyme functional data available to the scientific community. It is available free of charge via the internet (www.brenda-enzymes.org) and as an in-house database for commercial users (requests to its distributor Biobase).
    New release online since January 2013

  • CADASTER REGULATORY GUIDELINES – TOOL | QSAR MODELING – RISK ASSESSMENT
  • CADASTER aims at providing the practical guidance to integrated risk assessment by carrying out a full hazard and risk assessment for chemicals belonging to four compound classes. A Decision Support System (DSS) will be developed that will be updated on a regular basis in order to accommodate and integrate the alternative methods mentioned above. Operational procedures will be developed, tested, and disseminated that guide a transparent evaluation of four classes of emerging chemicals, explicitly taking account of variability and uncertainty in data and in models. QSAR models will be developed and validated, also externally, according to the OECD principles for the validation of QSAR. The prediction of data for chemicals of the four selected classes, belonging to the applicability domain of the developed models, will be used for hazard and risk assessment, when experimental data are lacking. The CADASTER main goal is to exemplify the integration of information, models and strategies for carrying out safety-, hazard- and risk assessments for large numbers of substances. Real risk estimates will be delivered according to the basic philosophy of REACH of minimizing animal testing, costs, and time. CADASTER will show how to increase the use of non-testing information for regulatory decision whilst meeting the main challenge of quantifying and reducing uncertainty.

  • CAESAR PROJECT | QSAR MODELING – TOXENDPOINT
  • CAESAR is an EC funded project (Project no. 022674 – SSPI), which is specifically dedicated to developing QSAR models for the REACH legislation. Five endpoints are addressed with CAESAR: bioconcentration factor, skin sensitization, mutagenicity, carcinogenicity, and developmental toxicity. CAESAR’s models have been assessed according to the OECD principles for the validation of QSAR. To check model validity we used a wide series of statistical checks. We also used external tests to verify that the models perform correctly on new compounds.

  • CancerEST (article) TOOL | BIOMARKERS – SYSTEMS BIOLOGY
  • CancerEST is a user-friendly and intuitive web-based tool for the automated identification of candidate cancer markers/targets, for examining tissue specificity, and for integrated expression profiling.

  • CancerResource (article) DATABASE | PHARMACOLOGY – TEXT-MINING – TOXENDPOINT
  • CancerResource is a comprehensive knowledgebase for drug-target relationships related to cancer as well as for supporting information or experimental data. Drug-target relationships are determined by manually curated text mining of publicly available literature. Several resources that provide similar data with slightly different background and intention are mined a) for comparison with the CancerResource text mining and b) for integration into this knowledgebase. Thus, CancerResource reflects the current knowledge about this matter in an integrative way. To strengthen the literature mining, which results in a compilation of direct knowledge of drug-target relationships, interactions that are known from the PDB are added to that part.

  • CAS THESAURI | IDENTIFIER
  • CAS, a division of the American Chemical Society, is the most authoritative and comprehensive source for chemical information. CAS databases, including CAS REGISTRY℠, the gold standard for substance information, are curated and quality-controlled by CAS scientists. Combining these databases with advanced search and analysis technologies, CAS delivers the most complete, cross-linked, and effective digital information environment for scientific research and discovery through such products as SciFinder, STN, STN Express, and STN AnaVist™, as well as services such as Science IP.

  • CasesDatabase DATABASE | DRUG SAFETY
  • Cases Database is a freely accessible and continuously updated search interface, developed by BioMed Central, which allows clinicians, researchers, teachers and patients to explore thousands of peer-reviewed medical case reports including content integrated from PubMed Central and publishers such as Springer and BMJ Group. By bringing case reports together, Cases Database adds value to individual reports, allowing comparison of similar cases, helping to highlight trends and patterns which may help researchers to develop hypotheses which can then be tested by systematic research.

  • CCLE DATABASE | PHARMACOLOGY – TOXENDPOINT
  • The Cancer Cell Line Encyclopedia (CCLE) project is a collaboration between the Broad Institute and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacologic characterization of a large panel of human cancer models, to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns and to translate cell line integrative genomics into cancer patient stratification. The CCLE provides public access to analysis and visualization of DNA copy number, mRNA expression and mutation data for about 1000 cell lines.

  • CDD DATABASE – TOOL
  • Collaborative Drug Discovery’s web-based software organizes preclinical research data to help scientists advance new drug candidates more effectively. CDD offers an industrial-strength database at a price affordable to academic laboratories, research foundations, and companies of any size. Analyze and mine your data intuitively through a web browser. Collaborate securely with other researchers in your own lab … or across the globe. CDD’s software accelerates international R&D projects combating neglected diseases, as well as traditional commercial drug discovery programs.

  • CDISC CDISC 2011 Annual report THESAURI – REGULATORY GUIDELINES
  • CDISC is a global, open, multidisciplinary, non-profit organization that has established standards to support the acquisition, exchange, submission and archive of clinical research data and metadata. The CDISC mission is to develop and support global, platform-independent data standards that enable information system interoperability to improve medical research and related areas of healthcare. CDISC standards are vendor-neutral, platform-independent and freely available via the CDISC website.

  • CEBS DATABASE | GENOMICS – PHARMACOLOGY – TOXENDPOINT
  • The Chemical Effects in Biological Systems (CEBS) database houses data of interest to environmental health scientists. CEBS is a public resource, and has received depositions of data from academic, industrial and governmental laboratories. CEBS is designed to display data in the context of biology and study design, and to permit data integration across studies for novel meta-analysis. CEBS integrates genomic and biological data including dose–response studies in toxicology and pathology.

  • CELDA (article) THESAURI | ONTOLOGY
  • The CELDA – Ontology (Cell: Expression, Localization, Development, Anatomy) is a structured vocabulary to organize cell-associated data and to place these data in clearly defined semantic relations to other biological facts. The CellFinder Ontology describes cell types, their properties and origin and links this information to other existing ontologies like the Cell Ontology (CL), Foundational Model of Anatomy (FMA), Gene Ontology (GO), Mouse Anatomy and others using the top-level ontology BioTop.
    Furthermore, CELDA is able to describe the development of organs on a cellular level. We started to describe this development for kidney, liver and skin. The CELDA – Ontology is currently used as a data structure to support modeling, analysis and comparison of cells within and across species in the development of the wider research and data repository platform CellFinder.

  • Cellular phenotype database (article) DATABASE – TOOL | SYSTEMS BIOLOGY
  • The Cellular Phenotype database stores data derived from high-throughput phenotypic studies and it is being developed as part of the Systems Microscopy Network of Excellence project. The aim of the Cellular Phenotype database is to provide easy access to phenotypic data and facilitate the integration of independent phenotypic studies. Through its interface, users can search for a gene of interest, or a collection of genes, and retrieve the loss-of-function phenotypes observed, in human cells, by suppressing the expression of the selected gene(s), through RNA interference (RNAi), across independent phenotypic studies. Similarly, users can search for a phenotype of interest and retrieve the RNAi reagents that have caused such phenotype and the associated target genes. Information about specific RNAi reagents can also be obtained when searching for a reagent ID. Alternatively, users can explore all datasets loaded in the database by browsing the phenotypes as well as searching studies by keyword.

  • CharaParser (article) THESAURI – TOOL | ANNOTATION
  • CharaParser is a software application for semantic annotation of morphological descriptions. CharaParser annotates semistructured morphological descriptions in such a detailed manner that all stated morphological characters of an organ are marked up in Extensible Markup Language (XML) format. Using an unsupervised machine learning algorithm and a general purpose syntactic parser as its key annotation tools, CharaParser requires minimal additional knowledge engineering work and seems to perform well across different description collections and/or taxon groups.

  • ChEBI THESAURI | PHARMACOLOGY
  • Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds. The term ‘molecular entity’ refers to any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity. The molecular entities in question are either products of nature or synthetic products used to intervene in the processes of living organisms.
    ChEBI incorporates an ontological classification, whereby the relationships between molecular entities or classes of entities and their parents and/or children are specified.

  • ChemAxon products TOOL
  • ChemAxon is a leader in providing chemical software development platforms and desktop applications for the biotechnology and pharmaceutical industries. By focusing upon active interaction with users and core portability, ChemAxon creates leading edge cross platform solutions to power modern cheminformatics and chemical communication.

  • Chembench (article) DATABASE – TOOL | PHARMACOLOGY
  • The Carolina Cheminformatics Workbench (Chembench) is an integrated toolkit developed by the Carolina Exploratory Center for Cheminformatics Research (CECCR) with the support of the National Institutes of Health. It provides cheminformatics research support to molecular modelers, experimental chemists in the Chemical Synthesis Centers and quantitative biologists in the Molecular Libraries Probe Production Centers Network (MLPCN) by integrating robust model builders, property and activity predictors, virtual library of available chemicals with predicted biological and drug-like properties, and special tools for chemical library design.
    The Workbench is intended in part as a data analytical extension of PubChem. Chembench enables researchers to mine available chemical and biological data to rationally design or select new compounds or compound libraries with significantly enhanced hit rates in screening experiments.

  • ChemBioServer (article) DATABASE – TOOL
  • ChemBioServer is a publicly available web-application for effectively mining and filtering chemical compounds used in drug discovery. It provides researchers with the ability to (i) browse and visualize compounds along with their properties, (ii) filter chemical compounds for a variety of properties such as steric clashes and toxicity, (iii) apply perfect match substructure search, (iv) cluster compounds according to their physicochemical properties providing representative compounds for each cluster, (v) build custom compound mining pipelines and (vi) quantify through property graphs the top ranking compounds in drug discovery procedures. ChemBioServer allows for pre-processing of compounds prior to an in silico screen, as well as for post-processing of top-ranked molecules resulting from a docking exercise with the aim to increase the efficiency and the quality of compound selection that will pass to the experimental test phase.

  • ChEMBL (article) (Last release: ChEMBL_19) DATABASE | ADME – PHARMACOLOGY
  • ChEMBL is a database of bioactive drug-like small molecules. It contains 2-D structures, calculated properties (e.g. log P, Molecular Weight, Lipinski Parameters, etc.) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data). The curators attempt to normalise the bioactivities into a uniform set of end-points and units where possible, and also to tag the links between a molecular target and a published assay with a set of varying confidence levels. The data are abstracted and curated from the primary scientific literature, and cover a significant fraction of the SAR and discovery of modern drugs.
    recent article: The ChEMBL database: a taster for medicinal chemists

  • ChEMBLSpace (article) TOOL | PHARMACOLOGY
  • The ChEMBLSpace graphical explorer enables the identification of compounds from the ChEMBL database, which exhibit a desirable polypharmacology profile. This profile can be predefined or created iteratively, and the tool can be extended to other data sources. The source code of all developed programs is hosted as a Sourceforge project and is available under a Berkeley Software Distribution (BSD)-type license.

  • Chemical Identifier Resolver THESAURI – TOOL | IDENTIFIER
  • This service works as a resolver for different chemical structure identifiers and allows one to convert a given structure identifier into another representation or structure identifier.

  • Chemical Translation Service THESAURI – TOOL | IDENTIFIER
  • The Chemical Translation Service holds publicly available chemical information including structures, chemical names, chemical synonyms, database identifiers, molecular masses, XlogP and proton-donor/acceptor data downloaded from different databases and combined into a single internal repository for compound-specific, structure-based cross references. Molfile (SDF) to InChI code converters (vs. 1.0.2) and InChI code to InChI key converters were integrated into the tool set to allow any type of query access, e.g. by chemical names, structures, database identifiers. For data storage and retrieval the service uses a PostgreSQL database.

  • chemicalize.org THESAURI – TOOL | IDENTIFIER – STRUCTURE-BASED PREDICTION
  • chemicalize.org is a public web resource developed by ChemAxon which uses ChemAxon’s Name to Structure parsing to identify chemical structures on webpages and other text. Structure-based predictions are available for each structure, and a search interface is provided to discover substructures or similar structures.

  • ChemIDplus Advanced DATABASE – THESAURI | TOXENDPOINT
  • The ChemIDplus Advanced database provides chemical structure, property, and toxicity searching, mainly in acute toxicity.

  • Cheminformatics.org – Data Sets DATABASE | METABOLISM – PHARMACOKINETICS – QSAR MODELING
  • This website contains links to cheminformatics programs and QSAR datasets (with structures!). All programs should be free to use, at least for academics. Currently: 44 datasets in 9 categories (Binary (active/inactive), QSAR, QSPR, Toxicity, Metabolism, Permeability, Docking, Mechanistic and Mixed/Other datasets).

  • ChemMapper (article) TOOL | PHARMACOLOGY
  • ChemMapper is a free web server for computational drug discovery based on the concept that compounds sharing high 3D similarities may have relatively similar target association profiles. ChemMapper integrates nearly 300 000 chemical structures from various sources with pharmacology annotations and over 3 000 000 compounds from commercial and public chemical catalogues. The in-house SHAFTS method, which combines the strengths of molecular shape superposition and chemical feature matching, is used in ChemMapper to perform the 3D similarity searching, ranking, and superposition. Taking the user-provided chemical structure as the query, SHAFTS aligns each target compound in the database onto the query and calculates the 3D similarity scores, and the most similar structures are returned. Based on those top-ranked structures for which pharmacology annotation is available, a chemical-protein network is constructed and a random walk algorithm is used to compute the probabilities of interaction between the query structure and the proteins associated with the hit compounds. These potential protein targets are ranked by the standard score of the probabilities. ChemMapper can be useful in a variety of polypharmacology, drug repurposing, chemical-target association, virtual screening, and scaffold hopping studies.
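    The network step described above can be illustrated with a minimal sketch. It assumes a random walk with restart on an undirected weighted graph; the description only says "a random walk algorithm", so the restart variant, the edge weights, the node names and the function name below are all hypothetical, and the final standard-score ranking is left out.

```python
from collections import defaultdict


def random_walk_with_restart(edges, start, restart=0.3, iterations=100):
    """Return visiting probabilities for every node of an undirected weighted graph.

    Assumption: this is a generic random walk with restart, not ChemMapper's
    actual implementation.
    """
    # Build a symmetric weighted adjacency map.
    neighbours = defaultdict(dict)
    for a, b, weight in edges:
        neighbours[a][b] = weight
        neighbours[b][a] = weight

    prob = {node: 0.0 for node in neighbours}
    prob[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in neighbours}
        for node, p in prob.items():
            total = sum(neighbours[node].values())
            # Spread the non-restarting share along edges, proportional to weight.
            for other, weight in neighbours[node].items():
                nxt[other] += (1.0 - restart) * p * weight / total
        # The restarting share always jumps back to the query node.
        nxt[start] += restart
        prob = nxt
    return prob


# Hypothetical toy network: query-hit edges carry a 3D similarity score,
# hit-protein edges carry a pharmacology annotation (weight 1.0).
edges = [
    ("query", "hitA", 0.9),
    ("query", "hitB", 0.6),
    ("hitA", "P1", 1.0),
    ("hitA", "P2", 1.0),
    ("hitB", "P2", 1.0),
]
prob = random_walk_with_restart(edges, start="query")
proteins = ["P1", "P2"]
ranking = sorted(proteins, key=lambda name: prob[name], reverse=True)
```

    In this toy network P2 ranks first because it is annotated for both hit compounds, while P1 is reachable through only one of them.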

  • ChemProt (article 2011) ChemProt2.0 (article 2013) DATABASE | PHARMACOLOGY
  • The ChemProt 2.0 server is a resource of annotated and predicted chemical-protein interactions. The server is a compilation of over 1 100 000 unique chemicals with biological activity for more than 15 000 proteins. ChemProt can assist in the in silico evaluation of small molecules (drugs, environmental chemicals and natural products) with the integration of molecular, cellular and disease-associated protein complexes.

  • ChemScreen (article) PROJECT – DATABASE | PATHWAYS – TOXENDPOINT
  • The Chemical substance in vitro/in silico screening system to predict human- and ecotoxicological effects (ChemScreen) is a collaborative project funded by the European Union. The current system of risk assessment of chemicals is complex, very resource-intensive and extremely time-consuming. Alternatives are particularly needed for reproductive toxicity testing of chemicals. Reproductive toxicity is important for assessing both human and environmental toxicity, and its testing uses more animals than any other kind of toxicity testing. Unfortunately, there are very few alternative methods. ChemScreen aims to fill this gap and place the tests in a more general, innovative, animal-free testing strategy.

  • ChemSpider DATABASE | PHARMACOLOGY
  • ChemSpider is a free access service providing a structure centric community for chemists. Providing access to millions of chemical structures and integration to a multitude of other online services, ChemSpider is the richest single source of structure-based chemistry information.

  • ChemSpot (article) THESAURI – TOOL | PHARMACOLOGY
  • ChemSpot is a named entity recognition tool for identifying mentions of chemicals in natural language texts, including trivial names, drugs, abbreviations, molecular formulas and IUPAC entities. Since the different classes of relevant entities have rather different naming characteristics, ChemSpot uses a hybrid approach combining a Conditional Random Field with a dictionary. It achieves an F1 measure of 68.1% on the SCAI corpus, outperforming the only other freely available chemical named entity recognition tool, OSCAR4, by 10.8 percentage points.

  • CheNER (article) THESAURI – TOOL | TEXT MINING
  • CheNER is a named entity recognition tool that uses Conditional Random Fields for identifying mentions of chemicals in text, focusing on IUPAC entities.

  • CIL (article) THESAURI – TOOL | PHARMACOLOGY
  • Compounds In Literature (CIL) is a web site which allows for finding all PubChem compounds and UniProt proteins in all available abstracts of PubMed. One of the problems researchers may face is that new compounds may be described only very rarely, or not at all, in the literature (or may not even have a name accepted by the research community), so that literature on them and their possible relations to proteins cannot be found in, e.g., PubMed. CIL starts searches with a query compound given by name or drawn structure. Similar compounds are then searched for using a given similarity threshold. Subsequently, the synonyms of these compounds are looked up in the literature. In this way, hints and knowledge on compounds that are not well known can also be acquired. In the abstracts found by those searches, all protein synonyms (besides the compound synonyms) are highlighted. A comprehensive overview is given via a “heat table” as a first result.

  • Cloe® Predict The Cyprotex Discovery Bus TOOL | DRUG DISCOVERY – WORKFLOW
  • Cyprotex’s Discovery Bus is an integrated IT solution for the drug discovery industry which automates decision making and information processing. It enables the steps carried out by a human expert to be modelled as a workflow and the decomposition of a more complex task into series of linked steps, each step of which can be implemented by specialist programs known as agents. Intelligent workflow techniques identify when an agent should be called and which tasks are appropriate for an agent to solve. A collection of specialist agents work together in a highly organized and efficient manner with other agents providing management and control.
    The applications of the Discovery Bus are numerous and extend to a number of different industries. However some examples of the main applications which are relevant to the Drug Discovery industry include auto-QSAR and laboratory workflow processes which streamline the flow of compounds into assays and, subsequently, the capture and interpretation of instrument data.

  • CMLD PROJECT | PHARMACOLOGY
  • The Center for Chemical Methodology and Library Development at Boston University (CMLD-BU) is a new center funded by the National Institute of General Medical Sciences (NIGMS) focused on the discovery of new methodologies to produce novel chemical libraries of unprecedented complexity for biological screening. The goal of the CMLD-BU is to explore and expand the diversity of small-molecule libraries by creating general, useful protocols for stereocontrolled synthesis. This process will involve the creation of novel chemical libraries that uniquely probe three-dimensional space by employing stereochemical and positional variation within the molecular framework as diversity elements for library design. A major objective of the CMLD-BU is also to provide information and chemistry protocols to the public on parallel and chemical library synthesis. The CMLD-BU has also organized the Chemical Library Consortium (CLC) to provide members of the biology community with access to its chemical libraries. The CLC will ultimately enable the demonstration of the quality of any given library through the identification of molecules that can be utilized as tools to investigate cellular processes.

  • Connectivity Map (article) DATABASE | GENOMICS – PHARMACOLOGY
  • The Connectivity Map (also known as cmap) is a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules and simple pattern-matching algorithms that together enable the discovery of functional connections between drugs, genes and diseases through the transitory feature of common gene-expression changes.

  • COPICAT TOOL | PHARMACOLOGY – STRUCTURE-BASED PREDICTION
  • COPICAT is a system for predicting interactions between chemical compounds and proteins using a Support Vector Machine (SVM), one of the most widely used statistical learning methods. COPICAT realizes comprehensive prediction of protein-chemical interactions by utilizing very general, or the most easily available, data, i.e. amino acid sequences and chemical structures.
    COPICAT provides two functions: (1) prediction and (2) training. The ‘prediction’ of COPICAT is made using SVM models. COPICAT default prediction models are based on FDA approved drugs and their target proteins, including enzymes and GPCRs etc., according to the DrugBank database. In the ‘training’, users can upload a set of pairs of a protein and a chemical compound with their sequence or structure data to construct specific prediction models.

  • CORPORA THESAURI | PHARMACOLOGY
  • Corpora for Named Entity Recognition of Chemical Compounds (CORPORA), see this document for further information.

  • COSMOS DATABASE | TOXENDPOINT
  • COSMOS DB is a new database bringing together access to chemical structures, an inventory of cosmetic materials and toxicological information. The first public release of COSMOS DB (December 2013) contained 12,538 toxicological studies for 1,660 compounds. In addition it provided access to 44,773 unique chemical structures including the European Union Inventory of Cosmetic Ingredients (CosIng), and the USA Personal Care Products Council (PCPC) inventories. Two separate datasets within COSMOS are available: US FDA PAFA and oRepeatToxDB. More data(sets) will be added in future updates. The PAFA dataset contains 12,198 studies across 27 endpoints including both repeat dose and genetic toxicity data. Data harvesting efforts within the COSMOS Project have resulted in the assembly of the oRepeatToxDB dataset. This has resulted in the collection of 340 in vivo repeat dose toxicity studies for 228 chemicals. The oRepeatToxDB study records contain the full plethora of observed toxicological effects together with the corresponding sites at which the effect occurred.

  • CPDB DATABASE | TOXENDPOINT
  • The Carcinogenic Potency Database (CPDB) provides a broad perspective on possible cancer hazards from human exposures to chemicals that cause cancer in high dose rodent cancer tests. Exposures are given in graphic and table formats, including high historical exposures to workers, pharmaceuticals, natural chemicals in the average diet, air pollutants, food additives, and pesticide residues.

  • CSS (article) PROJECT | DRUG SAFETY
  • The U.S. Environmental Protection Agency (EPA) has instituted the Chemical Safety for Sustainability (CSS) research program for assessing the health and environmental impact of manufactured chemicals. This is a broad program wherein one of the tasks is to develop high throughput screening (HTS) methods and follow-up confirmation for toxicity at realistic environmental exposure levels. The main tools under this task are in vitro toxicity testing, in silico molecular modeling, and in vivo (systemic) measurements documentation. The in vivo research component is intended to support and corroborate in vitro chemical toxicity prioritization with observations of systemic perturbations and statistical parameters derived from intact (living) organisms. Based on EPA’s Biomonitoring Framework for human health research, such observations are intended to link environmental exposures to a cascade of biomarker chemicals to help identify and clarify adverse outcome pathways within the context of systems biology. This commentary discusses the issues regarding interpretation of in vitro changes from HTS as an adverse result, an adaptive (non-adverse) response, or a random/irrelevant occurrence. A second goal is to inform in vitro strategies as to relevant dosing (potency) levels at the cellular level that reflect realistic systemic exposures. Although we recognize the high value of in vivo animal toxicity testing, herein we focus on observational (minimally invasive) human biomonitoring methods and propose complementary in vivo testing that could help guide the design of high-throughput analyses and the ultimate interpretation of their outcomes.

  • CTCAE THESAURI | ADVERSE EVENT – PHARMACOLOGY
  • The NCI Common Terminology Criteria for Adverse Events is a descriptive terminology which can be utilized for Adverse Event (AE) reporting. A grading (severity) scale is provided for each AE term. Common Terminology Criteria for Adverse Events (CTCAE) version v4.0 is MedDRA v12.0 (Medical Dictionary for Regulatory Activities Terminology) compatible at the AE (Adverse Event) term level where each CTCAE term is a MedDRA LLT (Lowest Level Term). CTCAE v4.0 includes 764 AE terms and 26 ‘Other, specify’ options for reporting text terms not listed in CTCAE. Each AE term is associated with a 5-point severity scale.

  • CTD (article, update 2011) DATABASE | NETWORK – PHARMACOLOGY
  • The Comparative Toxicogenomics Database (CTD) includes manually curated data describing cross-species chemical–gene/protein interactions and chemical– and gene–disease relationships to illuminate molecular mechanisms underlying variable susceptibility and environmentally influenced diseases. These data will also provide insights into complex chemical–gene and protein interaction networks.

  • CTSA DATABASE | DRUG SAFETY – PHARMACOLOGY
  • The CTSA Pharmaceutical Assets Portal invites you to join in the effort to find new uses for discontinued drugs. The ultimate goal is to leverage existing compounds to advance mechanistic understanding of human disease, resulting in novel treatments for patients. Integration of academic investigators into collaborative repositioning efforts with Pfizer would substantially increase the knowledge base and the pool of methodologies available for proof of concept studies. These matches will undoubtedly result in an increased number of approved drugs for new indications and considerable public benefit.

  • CWM Global Search (commercial) THESAURI – TOOL | IDENTIFIER
  • CWM Global Search allows searching the Internet by structure, synonym and CAS Registry Number. In a Quick Search, we return structure, names and CAS Registry Numbers within seconds. The Global Search allows a comprehensive search and the resulting links are organized by topics. A search by name can automatically invoke another search by structure and/or CAS Registry, or any combination of these. CWM Global Search presently searches more than 50 free chemical and pharma relevant databases — containing more than 100 million pages which associate chemical structures with data.

  • CypRules (article) TOOL | CYP450 – PHARMACOLOGY – STRUCTURE-BASED PREDICTION
  • CypRules is a rule-based online server for CYP inhibition prediction, created from predictive models generated by the rule-based C5.0 algorithm. CypRules can predict and provide structural rule sets for CYP inhibition for each compound uploaded to the server. Capable of fast execution performance, it can be used for virtual high-throughput screening (VHTS) of a large set of testing compounds.

  • DAIN Metadatabase of Internet Resources for Environmental Chemicals DATABASE | TOXENDPOINT
  • This resource is designed to help you find relevant databases for environmental chemicals worldwide. You can search our database by name or by subject. The subject search provides several structured search formats. See the link ‘Show the list of databases (complete)’.

  • DAAB (article) DATABASE – TOOL | BIOMARKERS – SYSTEMS BIOLOGY – TEXT MINING
  • The Database of Allergy and Asthma Biomarkers (DAAB) is a web-based repository of molecular biomarkers of atopic asthma and other allergic diseases. This database stores information on genes and proteins which are modulated significantly in allergy and asthma pathogenesis. The DAAB has been constructed from published genomic, proteomic and epigenetic studies using a ‘text mining’ approach followed by ‘manual curation’. Datasets have been classified broadly into four groups (Genomics, Proteomics, Epigenetics and Others).

  • Danish (Q)SAR Database DATABASE | PHARMACOLOGY – QSAR MODELING
  • The Danish (Q)SAR Database is the Danish EPA repository of estimates from over 70 QSAR models and health effects for 166,072 chemicals.

  • DataWarrior DATABASE – TOOL | QSAR MODELING
  • DataWarrior combines dynamic graphical views and interactive row filtering with chemical intelligence. Scatter plots, box plots, bar charts and pie charts not only visualize numerical or category data, but also show trends of multiple scaffolds or compound substitution patterns. Chemical descriptors encode various aspects of chemical structures, e.g. the chemical graph, chemical functionality from a synthetic chemist’s point of view or 3-dimensional pharmacophore features. These allow for fundamentally different types of molecular similarity measures, which can be applied for many purposes including row filtering and the customization of graphical views. DataWarrior supports the enumeration of combinatorial libraries as well as the creation of evolutionary libraries. Compounds can be clustered and diverse subsets can be picked. Calculated compound similarities can be used for multidimensional scaling methods, e.g. Kohonen nets. Physicochemical properties can be calculated, structure activity relationship tables can be created and activity cliffs can be visualized.

  • DCDB 2.0 (article) DATABASE | PHARMACOLOGY
  • The Drug Combination Database (DCDB), launched in 2010, was the first available database that collects and organizes information on drug combinations, with the aim of facilitating systems-oriented new drug discovery. The second major release of DCDB (Version 2.0) includes 866 new drug combinations (1363 in total), consisting of 904 distinctive components. These drug combinations are curated from ∼140,000 clinical studies and the Food and Drug Administration (FDA) electronic Orange Book. In this update, DCDB collects 237 unsuccessful drug combinations, which may provide a contrast for systematic discovery of the patterns in successful drug combinations.

  • DDMoRe PROJECT | WORKFLOW
  • The Drug Disease Model Resources (DDMoRe) consortium’s strategy will have standards as its core: a newly developed common definition language for data, models and workflows, along with an ontology based standard for storage and transfer of models and associated metadata. A drug and disease model library will be developed as a public resource. Its flexibility and power will be showcased by the addition of “proof of concept” drug and disease models from key therapeutic areas such as diabetes and oncology.
    An open-source interoperability framework will be the backbone for the integration of M&S applications into seamless, standardized but flexible workflows. Initially, currently-used tools (e.g. NONMEM, WinBUGS, MATLAB, R) will be integrated into the framework. From the outset resources will also be dedicated to new application development which will be steered by identified gaps in the M&S software ecosystem. The DDMoRe project’s standards and tools – intended as the gold standard for future collaborative drug and disease M&S – will be supported by comprehensive training and will be made publicly accessible.

  • DEER (article) DATABASE | CYP450 – DRUG SAFETY – PHARMACOLOGY – TRANSPORTERS
  • Drug response is determined by the complex interactions between genetic factors and environmental factors (ENFs). DEER is a database that aims to explore the environmental effects on drug response. Information on environmental regulation, such as transcription factors and cytochromes P450, is given. The direct interactions (such as the shared targets and transporters) between drugs and ENFs are provided, as are the chemical similarities. Users can: search the database; filter the result; save filtered data as CSV; get the regulatory graph between an ENF and a drug; and download all data in the database.

  • DiGEP-Pred (article) TOOL | SYSTEMS BIOLOGY
  • DIGEP-Pred is a web-service for in silico prediction of drug-induced gene expression profiles based on structural formula.

  • DINTO (article) DATABASE | ONTOLOGY – PHARMACOLOGY – SYSTEMS BIOLOGY
  • DINTO is an OWL ontology that systematically organizes all drug-drug interaction (DDI) related information. DDIs form a significant risk group for adverse effects associated with pharmaceutical treatment. These interactions are often reported in the literature; however, they are sparsely represented in machine-readable resources, such as online databases, thesauri or ontologies. DINTO describes and categorizes DDIs and all the possible mechanisms that can lead to them (including both pharmacodynamic and pharmacokinetic DDI mechanisms). DINTO can be combined with specifically created Semantic Web Rule Language (SWRL) rules to infer DDIs and their different mechanisms.

  • DemPRED (article) TOOL | ADME – STRUCTURE-BASED PREDICTION
  • DemPred is a collection of different prediction tools for bioinformatic analysis, offered in two forms: (1) DemXXX Online, a collection of prediction tools which are directly accessible via an online service; and (2) DemXXX, standalone Java applications which provide a command line interface. The standalone applications can be downloaded and executed locally and thus are applicable for large scale analysis. The individual tools are: DemPHOS Online, a tool to predict kinase-specific phosphorylation sites from the primary amino acid sequence; DemQSAR Online, a web service for the prediction of various ADME-Tox properties from two-dimensional compound structures; DemQSAR, a classification and regression program for QSAR analysis; and DemSEQ, a simple classification program that can be used to classify small peptides.

  • DEREK (commercial) TOOL | STRUCTURE-BASED PREDICTION – TOXENDPOINT
  • The Deductive Estimation of Risk from Existing Knowledge (DEREK) is an expert knowledge base system that predicts whether a chemical is toxic in humans, other mammals and bacteria. The application is a high throughput screen for these endpoints: Carcinogenicity, Mutagenicity, Genotoxicity, Skin Sensitisation, Teratogenicity, Irritancy, Respiratory Sensitisation, Hepatotoxicity, Chromosome Damage, Ocular Toxicity, HERG Channel Inhibition. DEREK also provides structural reasoning for general reactivity features and associated interaction mechanisms.

  • DevTox DATABASE | TOXENDPOINT
  • DevTox, the Developmental Toxicity database, is a study data and historical control database for various strains of common laboratory animals, developed by German industry and government.

  • DiDB DATABASE – TOOL | PHARMACOKINETICS – PHARMACOLOGY
  • The Drug Interaction database (DiDB) is a research and analysis tool developed at the University of Washington, in the Department of Pharmaceutics. It contains in vitro and in vivo information on drug interactions in humans from the following sources: 8334 peer-reviewed journal articles referenced in PubMed, 77 New Drug Applications (NDAs), 376 excerpts of FDA Prescribing Information, and in-depth analyses of drug-drug interactions in the context of 40 diseases / co-morbidities. In addition, the database provides PK profiles of drugs, QT prolongation data, including results of TQT studies from recent NDAs, as well as regulatory guidances and editorial summaries/syntheses relevant to advances in the field of drug interactions.

  • Disease Ontology (article) DATABASE – TOOL | SYSTEMS BIOLOGY – TEXT MINING – VOCABULARY
  • The Disease Ontology has been developed as a standardized ontology for human disease with the purpose of providing the biomedical community with consistent, reusable and sustainable descriptions of human disease terms, phenotype characteristics and related medical vocabulary disease concepts through collaborative efforts of researchers at Northwestern University, Center for Genetic Medicine and the University of Maryland School of Medicine, Institute for Genome Sciences. The Disease Ontology semantically integrates disease and medical vocabularies through extensive cross mapping of DO terms to MeSH, ICD, NCI’s thesaurus, SNOMED and OMIM.

  • DisGeNET DATABASE – TOOL | NETWORK
  • DisGeNET is a plugin for Cytoscape to query and analyze a network representation of human gene-disease databases. For this purpose, it has been developed as a new gene-disease database integrating data from several public sources. DisGeNET allows user-friendly access to this database, including queries restricted to (i) the original data source, (ii) the association type, (iii) the disorder class of interest and (iv) specific diseases or genes. It represents gene-disease associations in terms of bipartite graphs and additionally provides gene-centric and disease-centric views of the data. It assists the user in the interpretation and exploration of human complex diseases with respect to their genetic origin through a variety of built-in functions. Moreover, DisGeNET permits multicoloring of nodes (genes/diseases) according to their disease classes for expedient visualization.

  • DITOP DATABASE | PHARMACOLOGY – TOXENDPOINT
  • The Drug-Induced Toxicity Related Protein database (DITOP) is a comprehensive database that provides information on drug-induced toxicity related proteins (DITRPs), which are proteins that mediate toxicities through their interaction with drugs or reactive metabolites. Collecting these proteins will to a large extent facilitate better understanding of the molecular mechanisms of drug-induced toxicity and rational drug discovery. Currently, DITOP contains 618 distinct literature-reported DITRPs, 529 drugs/ligands, and 418 distinct toxicity terms. The related toxicities include overdose toxicity, idiosyncratic toxicity, drug-drug interactions and genetic toxicity.

  • DNorm (article) TOOL | TEXT MINING
  • DNorm is an automated method for determining which diseases are mentioned in biomedical text, the task of disease normalization. Diseases have a central role in many lines of biomedical research, making this task important for many lines of inquiry, including etiology (e.g. gene-disease relationships) and clinical aspects (e.g. diagnosis, prevention, and treatment). DNorm is a high-performing and mathematically principled framework for learning similarities between mentions and concept names directly from training data. DNorm is the first technique to use machine learning to normalize disease names and also the first method employing pairwise learning to rank in a normalization task.

  • DoGSiteScorer TOOL | PHARMACOLOGY – STRUCTURE-BASED PREDICTION
  • The Active Site Prediction and Analysis Server DoGSiteScorer is an automated pocket detection and analysis tool which can be used for protein assessment. Predictions with DoGSiteScorer are based on calculated size, shape and chemical features of automatically predicted pockets, incorporated into a support vector machine for druggability estimation.

  • dRiskKB (article) DATABASE | SYSTEMS BIOLOGY
  • dRiskKB consists of a total of 34,448 unique D1 → D2 pairs, representing the risk-specific semantic relationships among 12,981 diseases, with each disease linked to its associated genes and drugs, based on the 21,354,075 MEDLINE records that comprise the text corpus.

  • drugable (article) TOOL | PHARMACOLOGY
  • Drugable is a search engine that maintains a comprehensive index of druggable small molecule chemistry and protein targets.

  • Drug Databases DATABASE | ADVERSE EVENT
  • Drug databases provide comprehensive information for healthcare professionals or consumers on prescription and over-the-counter medications. The information includes the most important facts about the drug, detailed instructions for proper use, side effects, warnings, precautions, food and drug interactions, dosage, overdosage, etc. The information can be obtained by searching for the product name or by browsing the alphabetic index by the first letter of the medication.

  • Drugbank, Drugbank 4.0 DATABASE | PATHWAY – PHARMACOLOGY
  • The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. The database contains nearly 4800 drug entries including >1,350 FDA-approved small molecule drugs, 123 FDA-approved biotech (protein/peptide) drugs, 71 nutraceuticals and >3,243 experimental drugs. Additionally, more than 2,500 non-redundant protein (i.e. drug target) sequences are linked to these FDA approved drug entries. Each DrugCard entry contains more than 100 data fields with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data.
    Release 4.0 – The database contains 7668 drug entries including 1552 FDA-approved small molecule drugs, 151 FDA-approved biotech (protein/peptide) drugs, 87 nutraceuticals and over 6000 experimental drugs. Additionally, 4273 non-redundant protein (i.e. drug target/enzyme/transporter/carrier) sequences are linked to these drug entries. Each DrugCard entry contains more than 200 data fields with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data.

  • DrugMatrix DATABASE | GENOMICS
  • The DrugMatrix Database is the largest database of its kind, comprised of Entelos-generated data from over 18,000 vehicle and drug-treated animal tissue samples and cultures. The data include gene expression profiles linked to clinical chemistry and pathology, public literature information, and physical properties of drugs and compounds.

  • DrugNet (article) DATABASE | SYSTEMS BIOLOGY – DRUG SAFETY
  • DrugNet is a new methodology for drug-disease and disease-drug prioritization. The approach applied is based on a network-based prioritization method called ProphNet which is able to integrate data from complex networks involving a wide range of types of elements and interactions.

  • Drug2Gene (article) DATABASE | PHARMACOLOGY – SYSTEMS BIOLOGY
  • Drug2Gene is a knowledge base that combines the compound/drug-gene/protein information from 19 publicly available databases. A key feature is its rigorous unification and standardization process, which makes the data truly comparable on a large scale, allowing for the first time effective data mining in such a large knowledge corpus. As of version 3.2, Drug2Gene contains 4,372,290 unified relations between compounds and their targets, most of which include reported bioactivity data. This set is extended with putative (i.e. homology-inferred) relations where sufficient sequence homology between proteins suggests they may bind to similar compounds. Drug2Gene provides powerful search functionalities, very flexible export procedures, and a user-friendly web interface.

  • Drugs Vocabulary THESAURI | IDENTIFIER – PHARMACOLOGY
  • The Lèxic de fàrmacs (“Drugs Vocabulary”) gathers 2,868 Catalan denominations of the most common drugs with their grammatical category, their equivalents in Spanish, French and English, the registered CAS number and their most common therapeutic actions or mechanisms of action.

  • Drugs@FDA DATABASE | DRUG SAFETY – PHARMACOLOGY
  • Drugs@FDA allows you to search for official information about FDA approved brand name and generic drugs and therapeutic biological products. The main uses of Drugs@FDA are: finding labels for approved drug products, finding generic drug products for a brand name drug product, finding therapeutically equivalent drug products for a brand name or generic drug product, finding consumer information for drugs approved from 1998 on, finding all drugs with a specific active ingredient, and viewing the approval history of a drug.

  • DSSTox DATABASE | TOXENDPOINT
  • The Distributed Structure-Searchable Toxicity (DSSTox) Database Network is a project of EPA’s National Center for Computational Toxicology, helping to build a public data foundation for improved structure-activity and predictive toxicology capabilities. The DSSTox website provides a public forum for publishing downloadable, structure-searchable, standardized chemical structure files associated with toxicity data.

  • Early Diagnosis Consortium PROJECT | BIOMARKERS
  • The EDC is a global consortium between Abcodia, Cancer Research Technology and Cancer Research UK, the world’s leading charity dedicated to cancer research. It aims to bring together the most cutting edge biomarker discovery platform providers and the world’s leading clinicians and academics to create an unrivalled global collaboration. The EDC will not build its own biomarker discovery platforms, but look to external providers for biomarker discovery, verification and validation services.

  • eChemPortal DATABASE | PHARMACOLOGY – TOXENDPOINT
  • eChemPortal offers free public access to information on: Physical chemical properties, Environmental Fate and Behaviour, Ecotoxicity, Toxicity, and GHS Classifications. eChemPortal allows for simultaneous search of multiple databases and provides clearly described sources and quality of data. eChemPortal gives access to data submitted to government chemical review programmes at national, regional, and international levels.

  • ECOSAR TOOL | STRUCTURE-BASED PREDICTION – TOXENDPOINT
  • The Ecological Structure Activity Relationships (ECOSAR) Class Program is a computerized predictive system that estimates the aquatic toxicity of industrial chemicals. The program estimates a chemical’s acute (short-term) toxicity and chronic (long-term or delayed) toxicity to aquatic organisms such as fish, aquatic invertebrates, and aquatic plants by using Structure Activity Relationships (SARs).

  • EDAM (article) THESAURI – TOOL
  • EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations.

  • Effectopedia DATABASE – TOOL | ADVERSE OUTCOME PATHWAY – QSAR MODELING – VOCABULARY
  • Effectopedia is an open-knowledge aggregation and collaboration tool designed to facilitate the interdisciplinary efforts for delineating adverse outcome pathways (AOPs) in an encyclopedic manner with greater predictive power. As a response to the growing awareness that a paradigm shift in chemical risk assessment is needed, Effectopedia provides a capability to move beyond the last half-century’s phenomenological approach with animal testing to a more mechanistic and hypothesis-driven approach. The 21st-Century shift to more prospective hypothesis generation requires more strategic use of systems biology, QSAR and archived toxicological information in the form of AOPs. Effectopedia is designed as a new technology both to reduce multidisciplinary barriers in the development of AOPs and to integrate AOPs with historical case studies.

  • eGIFT TOOL | GENOMICS
  • Extracting Gene Information From Text (eGIFT) is a web application designed for use by life scientists who are interested in rapidly finding information about a gene. eGIFT uses natural language processing techniques to retrieve iTerms (informative terms) relevant to a specific gene. This engine looks at PubMed references, gathers those abstracts which focus on the given gene, and automatically identifies terms which are statistically more likely to be relevant to that gene than to genes in general.

  • EMA Non-Clinical Guidelines REGULATORY GUIDELINES
  • The EMA’s Committee for Medicinal Products for Human Use (CHMP) prepares scientific guidelines, in consultation with the competent authorities of the EU Member States, to help applicants prepare marketing-authorisation applications for medicinal products for human use.
    Guidelines are intended to provide a basis for practical harmonisation of the manner in which the EU Member States and the EMA interpret and apply the detailed requirements for the demonstration of quality, safety and efficacy contained in the Community directives. They also help to ensure that applications for marketing authorisation are prepared in a manner that will be recognised as valid by the EMA.

  • EMBRACE DATABASE – TOOL | WORKFLOW
  • The EMBRACE Service Registry is a collection of life-science web services with built-in service testing. This site is a prelude to the internationally supported BioCatalogue system that will collect, store, validate, and make available web-services in the biosciences. This registry is mainly meant for the EU projects EMBRACE, BioSapiens and ENFIN, but other users are welcome too. As a potential web service user, you can search or browse the registry for services that match your needs. Furthermore, each entry includes live test data, showing the current and historical status of the service. Each entry can also include example client software to help you include them in your own programs or workflows. As a service provider, the registry helps you build high quality services that conform to industry standards, and gives you a means of advertising your tools to the user community as well as a platform for testing your service. Your service remains your property and published by you – this registry merely advertises its existence and its status.

  • eMolecules DATABASE | PHARMACOLOGY
  • This website offers a free online database of almost 8 million unique chemical structures. The database is assembled from data supplied by over 150 suppliers and provides a path to identifying a vendor for a particular chemical compound. Their database was recently enhanced by providing access to NMR, MS and IR spectra from Wiley-VCH for over 500 000 compounds via ChemGate, a fee-based service. eMolecules also provides links to many sources of data for spectra, physical properties and biological data.

  • EnrichNet (article) TOOL | SYSTEMS BIOLOGY
  • EnrichNet is an integrative analysis approach and web application. It combines a novel graph-based statistic with an interactive sub-network visualization to accomplish two complementary goals: improving the prioritization of putative functional gene/protein set associations by exploiting information from molecular interaction networks and tissue-specific gene expression data, and enabling a direct biological interpretation of the results.

  • Entrez gene DATABASE | GENOMICS
  • Entrez Gene is a searchable database of genes from RefSeq genomes that are defined by sequence and/or located in the NCBI Map Viewer.

  • Enzyme Detector (article) TOOL | ENZYMES
  • The EnzymeDetector is a tool that automatically compares and evaluates the assigned enzyme functions from the main annotation databases and supplements them with its own function prediction. This is based on a sequence similarity analysis, on manually created organism-specific enzyme information from BRENDA (Braunschweig Enzyme Database), and on sequence pattern searches.

  • EOB THESAURI | DRUG SAFETY – PHARMACOLOGY
  • The Electronic Orange Book (EOB) of Approved Drug Products with Therapeutic Equivalence Evaluations includes: New Drug Application (NDA) approvals in the month they were approved (NDA application numbers are preceded with “N”); Abbreviated New Drug Application (ANDA, or generic) approvals as of the date of the daily update (generic application numbers are preceded with “A”); all product changes received and processed as of the monthly update date; patent information, also updated daily, as of the date of the daily update; and exclusivity information, updated monthly and current to the date of the monthly EOB update.

  • EPARs for authorised medicinal products for human use DATABASE | DRUG SAFETY – TOXENDPOINT
  • The EPAR provides a summary of the grounds for the CHMP opinion in favour of granting a marketing authorisation for a specific medicinal product. It results from the Committee’s review of the documentation submitted by the applicant, and from subsequent discussions held during Committee for Medicinal Products for Human Use (CHMP) meetings. The EPAR is updated throughout the authorisation period as changes to the original terms and conditions of the authorisation (i.e. variations, pharmacovigilance issues, specific obligations) are made. EPARs also contain a summary written in a manner that is understandable to the public.

  • EPI Suite ™ (commercial) TOOL | DRUG SAFETY – TOXENDPOINT
  • The Office of Pollution Prevention and Toxics (OPPT) has developed several exposure assessment methods, databases, and predictive models to help in evaluating what happens to chemicals when they are used and released to the environment, and how workers, the general public, consumers and aquatic ecosystems may be exposed to chemicals. An example of these Exposure Assessment Tools is the EPI (Estimation Programs Interface) Suite™, a Windows®-based suite of physical/chemical property and environmental fate estimation programs developed by the EPA’s Office of Pollution Prevention and Toxics and Syracuse Research Corporation (SRC). EPI Suite™ uses a single input to run the following estimation programs: KOWWIN™, AOPWIN™, HENRYWIN™, MPBPWIN™, BIOWIN™, BioHCwin, KOCWIN™, WSKOWWIN™, WATERNT™, BCFBAF™, HYDROWIN™, KOAWIN and AEROWIN™, and the fate models WVOLWIN™, STPWIN™ and LEV3EPI™. ECOSAR™, which estimates ecotoxicity, is also included in EPI Suite™.

  • ESIS DATABASE – THESAURI | DRUG SAFETY
  • The European chemical Substances Information System (ESIS) is an IT System which provides you with information on chemicals, related to:
    * EINECS (European Inventory of Existing Commercial chemical Substances) O.J. C 146A, 15.6.1990,
    * ELINCS (European List of Notified Chemical Substances) in support of Directive 92/32/EEC, the 7th amendment to Directive 67/548/EEC,
    * NLP (No-Longer Polymers),
    * BPD (Biocidal Products Directive) active substances listed in Annex I or IA of Directive 98/8/EC or listed in the so-called list of non-inclusions,
    * PBT (Persistent, Bioaccumulative, and Toxic) or vPvB (very Persistent and very Bioaccumulative),
    * C&L (Classification and Labelling), substances or preparations in accordance with Directive 67/548/EEC (substances) and 1999/45/EC (preparations),
    * Export and Import of Dangerous Chemicals listed in Annex I of Regulation (EC) No 689/2008,
    * HPVCs (High Production Volume Chemicals) and LPVCs (Low Production Volume Chemicals), including EU Producers/Importers lists,
    * IUCLID Chemical Data Sheets, IUCLID Export Files, OECD-IUCLID Export Files, EUSES Export Files,
    * Priority Lists, Risk Assessment process and tracking system in relation to Council Regulation (EEC) 793/93 also known as Existing Substances Regulation (ESR).

  • eTRIKS PROJECT
  • The main objective of ‘Delivering eTRIKS’ is to address this gap by building a sustainable IMI translational research informatics/KM platform, eTRIKS, and to provide sustainable IMI KM services. In order to realize an ‘open’ platform, the development will begin with transMART, an open source KM platform. eTRIKS, however, will not end with transMART. The intent is to build a combined KM/analytics platform that can serve as a base for continued development. A benchmark of success will be the establishment of a sustainable platform and service layer as well as a robust user and developer community.

  • ExPub DATABASE | TOXENDPOINT
  • ExPub is a provider of up-to-date toxicology and chemical hazard information. Consisting of more than 130 databases, ExPub provides users with access to millions of documents containing comprehensive human and/or environmental hazard data needed to manage the impact of chemicals in the workplace or on the environment. ExPub databases include essential information for emergencies such as responding to fires, spills or explosions involving hazardous chemicals, as well as more day-to-day needs like creating and maintaining healthy and compliant workplaces.

  • fconv (article) TOOL | PHARMACOLOGY – WORKFLOW
  • fconv is a program intended for parsing and manipulating multiple aspects and properties of molecular data. It is a very robust and comprehensive tool involved in a broad range of computational workflows that are currently applied in the drug design environment. Typical tasks are as follows: conversion and error correction of formats such as PDB(QT), MOL2, SDF, DLG and CIF; extracting ligands from PDB as MOL2; automatic or ligand-based cavity detection; RMSD calculation and clustering; substructure searches; alignment and structural superposition; building of crystal packings; adding hydrogens; and calculation of various properties like the number of rotatable bonds, molecular weights or vdW volumes. The atom type classification is based on a consistent assignment of internal atom types, which are far more differentiated than, for example, Sybyl atom types. Apart from the predefined mapping of these types onto Sybyl types, the user can assign custom mappings by providing modified template files, thus allowing for tailor-made atom type sets.

  • FDA Drug-Drug Interaction Draft Guidance (17-Feb-2012) REGULATORY GUIDELINES | PHARMACOLOGY
  • Guidance for Industry ‘Drug Interaction Studies — Study Design, Data Analysis, Implications for Dosing, and Labeling Recommendations’ DRAFT GUIDANCE

  • FDA Qualification Process for Drug Development tools REGULATORY GUIDELINES | BIOMARKERS – DRUG DISCOVERY
  • Guidance for Industry and FDA Staff ‘Qualification Process for Drug Development tools’

  • FDA Toxicoinformatics DATABASE – TOOL | GENOMICS – METABOLISM – PHARMACOLOGY – TOXENDPOINT
  • The Center for Toxicoinformatics conducts research in bioinformatics and chemoinformatics, and develops and coordinates informatics capabilities within NCTR, across FDA Centers, and in the larger toxicology community. The goal of the toxicoinformatics group is to develop methods for the analysis and integration of omics (genomic, transcriptomic, proteomic, and metabolomic) databases with the objective of knowledge discovery and the elucidation of mechanisms of toxicity.

  • foodTOX lectures OTHER | RISK ASSESSMENT – TOXENDPOINT
  • This course provides a general understanding of toxicology related to food and the human food chain. Fundamental concepts will be covered including dose-response relationships, absorption of toxicants, distribution and storage of toxicants, biotransformation and elimination of toxicants, target organ toxicity, teratogenesis, mutagenesis, carcinogenesis, food allergy, and risk assessment. The course will examine chemicals of food interest such as food additives, mycotoxins, and pesticides, and how they are tested and regulated.

  • FragmentStore (article) DATABASE | PHARMACOLOGY – STRUCTURE-BASED PREDICTION
  • The FragmentStore is a database that has been primarily designed for pharmacists, biochemists and medical scientists as well as researchers working in cognate disciplines like fragment-based drug design. It provides information about fragments of compounds and their properties (e.g. charge, hydrophobicity, binding site preferences). It allows the user to perform statistical analysis of the fragments’ properties and binding site preferences. Moreover, the database supports the building of a fragment for fragment-based drug design. This database contains over 35000 different fragments resulting from fragmentation of more than 13000 metabolic compounds, 2200 toxic compounds and 16000 drugs and pharmacologically characterized compounds using these two strategies of fragmentation: a) recursive fragmentation of compounds according to the recap-rules and b) cutting out of chains between rings structures.
    Fraggle is a key word search facility that finds fragments which are related to a protein or compound. For example, if the user searches for a ligand’s three letter code the results are a list of all fragments of this compound.

  • FTC (article) DATABASE | IDENTIFIER – PHARMACOLOGY
  • The Functional Therapeutic Chemical Classification (FTC) is a public resource, which should assist drug repurposing initiatives or enhance computational studies that judge drugs according to their ‘mode of action’ (MoA). The resource attributes biomolecular functions and processes to drugs, in the same way as GO types have been assigned to gene products. The construction of FTC relies on axiomatic representations of MoA as the core means to attribute and derive the MoA for approved drugs. The validity of the approach is shown by comparing the content of the FTC to a well-established gold standard, the ATC.

  • GAG database (article) DATABASE | GENOMICS – IDENTIFIER
  • The Genomic Annotation Gathering tool is a database which aims to provide an enriched cross-reference database to the scientific community. Cross-references are obtained by sequence similarity comparison between transcripts originating from GenBank and Ensembl, a custom filtering process, and subsequent comparison with official cross-reference tables provided by GenBank. In this way, new (predicted) cross-references that were unavailable before can be proposed to complete known (official) data.
    Available annotation data includes all transcripts and their identifiers, functional description of genes, chromosomal localization, gene symbols, gene homologs for model species (human, chicken, mouse), and several identifiers to link those genes to external databases (UniProt, HGNC). Queries should be made by chromosomal localization (to return all known genes and their cross-references between two loci), by gene symbol (to return both GenBank and Ensembl identifiers that link to this gene), or by unique gene identifier.

  • GARField TOOL | PHARMACOLOGY
  • GARField, one of the Example Applications developed within the project, has been launched. GARField is an abbreviation for Graph-Activity-Relationship visualization Field and is a realization of the Open PHACTS “Polypharmacology Browser”. It is meant as a tool for browsing data made available through the Open PHACTS platform. Initially, the main focus is to enable the user to explore interactions between targets and compounds, predict interactions, and visualize the interactions in a heatmap.

  • GDB DATABASE | PHARMACOLOGY (new release GDB-17, with 166.4 billion molecules)
  • GDB-13 enumerates small organic molecules up to 13 atoms of C, N, O, S and Cl following simple chemical stability and synthetic feasibility rules. With 977 468 314 structures, GDB-13 is the largest publicly available small organic molecule database to date.

  • GeneTIER (article) DATABASE | SYSTEMS BIOLOGY
  • Gene TIssue Expression Ranker (GeneTIER) is a new web-based application for candidate gene prioritization. GeneTIER replaces the knowledge-based inference traditionally used in candidate disease gene prioritization applications with experimental data from tissue-specific gene expression datasets and thus largely overcomes the bias toward the better characterized genes/diseases that commonly afflicts other methods.

  • Gene Expression Omnibus DATABASE | GENOMICS
  • Gene Expression Omnibus is a public functional genomics data repository supporting MIAME-compliant data submissions. Array- and sequence-based data are accepted. Tools are provided to help users query and download experiments and curated gene expression profiles.

  • Gene-Tox DATABASE | GENOMICS – TOXENDPOINT
  • Gene-Tox provides peer-reviewed genetic toxicology test data for over 3000 chemicals.

  • GeneSetDB (article) DATABASE | GENOMICS
  • GeneSetDB is a comprehensive meta-database with associated gene set enrichment analysis statistical tools, which integrates a large number of other biological databases for gene set analysis.

  • Gimli (article) THESAURI – TOOL | TEXT MINING
  • Gimli is an open-source, state-of-the-art tool for automatic recognition of biomedical names. Gimli includes an extended set of implemented and user-selectable features, such as orthographic, morphological, linguistic-based, conjunctions and dictionary-based. A simple and fast method to combine different trained models is also provided. Gimli is an off-the-shelf, ready-to-use tool for named-entity recognition (NER), providing trained and optimized models for recognition of biomedical entities from scientific text. It can be used as a command line tool, offering full functionality, including training of new models and customization of the feature set and model parameters through a configuration file. Advanced users can integrate Gimli in their text mining workflows through the provided library, and extend or adapt its functionalities. Based on the underlying system characteristics and functionality, both for final users and developers, and on the reported performance results, its authors consider Gimli a state-of-the-art solution for biomedical NER, contributing to faster and better research in the field.

  • GOLD DATABASE | GENOMICS
  • Genomes Online Database (GOLD) is a World Wide Web resource for comprehensive access to information regarding complete and ongoing genome projects, as well as metagenomes and metadata, around the world.

  • goRENI THESAURI
  • goRENI is the standard reference for nomenclature and diagnostic criteria in toxicologic pathology and at the same time the Internet discussion platform for the global initiative “INHAND” – the International Harmonization of Nomenclature and Diagnostic criteria.

  • GPCR-specific PDF reader THESAURI – TOOL | PHARMACOLOGY
  • The GPCR-specific PDF reader allows you to enrich your scientific literature with information and knowledge from the GPCRDB. Relevant information for genes, proteins, residues and mutations is automatically retrieved from the GPCRDB and made available to you. This information is integrated with the article in a non-obtrusive way; it is only there where and when you want it. This software helps you put your literature in the context of the total body of knowledge related to GPCRs, providing you with instant access to current, integrated, validated, internally consistent data and information.

  • GPCRnetwork DATABASE | PHARMACOLOGY
  • The GPCR Network works closely with the GPCR community to determine the high-resolution structure and function of GPCRs distributed broadly across the phylogenetic family tree. It offers easily accessible, open data to create a dynamic and informative GPCR Network of value to the entire scientific community. A tracking system reports the status of each target.

  • gsGator TOOL | SYSTEMS BIOLOGY
  • gsGator is a fully integrated, web-based tool for gene set analysis (GSA), which allows highly flexible and interactive analyses. A series of new gene sets can be created as a combination of any existing gene sets, which is highly desirable in most exploratory and discovery-oriented studies. According to our survey, cross-species GSA expands the coverage of phenotypic annotation by ~20% and of the PPI network by ~12% for human genes. Few existing tools offer these functionalities in a single unified database; having them in one place reduces the burden of consulting multiple web sites and bioinformatics tools. All the gene lists and analytic results can be exported for further processing and integration with other analytic results. As demonstrated in the three case examples, interactive and cross-species GSA greatly extends the scope and utility of GSA, leading to novel insights via conserved functional gene modules across different species.
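    The idea of building new gene sets as combinations of existing ones can be sketched with plain set operations. This is a minimal illustration only, not gsGator's own interface: the `combine` helper, the set names and the gene memberships below are hypothetical.

```python
from typing import Dict, Set

# Hypothetical gene sets, for illustration only.
gene_sets: Dict[str, Set[str]] = {
    "apoptosis": {"TP53", "BAX", "CASP3"},
    "dna_repair": {"TP53", "BRCA1", "ATM"},
}

# Supported ways of combining two gene sets.
OPERATIONS = {
    "union": set.union,
    "intersection": set.intersection,
    "difference": set.difference,
}


def combine(sets: Dict[str, Set[str]], left: str, right: str, how: str) -> Set[str]:
    """Return a new gene set built from two existing ones; inputs are not modified."""
    return OPERATIONS[how](sets[left], sets[right])


shared = combine(gene_sets, "apoptosis", "dna_repair", "intersection")
either = combine(gene_sets, "apoptosis", "dna_repair", "union")
only_apoptosis = combine(gene_sets, "apoptosis", "dna_repair", "difference")
print(sorted(shared), sorted(either), sorted(only_apoptosis))
```

    A derived set can be stored back under a new name (for example `gene_sets["shared"] = shared`) and combined again, which is what makes this style of analysis exploratory.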

  • GUILDify (article) TOOL | DRUG DISCOVERY – PHARMACOLOGY – SYSTEMS BIOLOGY
  • GUILDify is a free and easy-to-use web server for prioritization of genes using PPI networks. For a given phenotype, GUILDify uses descriptive fields in several proteomics and genomics databases in combination with network-based prioritization methods and provides an interactome-wide ranking. The ranking represents the relevance to the phenotype of interest and can be used to short-list the set of candidate genes that need to be further validated or to repurpose drugs (e.g. through common high-ranking targets).

  • Guide to Receptors and Channels (GRAC) DATABASE – THESAURI | ENZYMES – NUCLEAR RECEPTORS – PHARMACOLOGY – TRANSPORTERS
  • The Guide to Receptors and Channels (GRAC) is an authoritative but user-friendly publication that allows a rapid overview of the key properties of a wide range of established or potential pharmacological targets. The great proliferation of drug targets in recent years has driven the need to organise and condense the information in a logical way. The information is provided succinctly, so that a newcomer to a particular target group can identify the main elements “at a glance”. The GRAC is divided into seven sections, which comprise pharmacological targets of similar structure/function. These are: 7TM receptors, LGIC, ion channels, nuclear receptors, catalytic receptors, transporters and enzymes.

  • HapMap DATABASE | SYSTEMS BIOLOGY
  • The International HapMap Project is a partnership of scientists and funding agencies from Canada, China, Japan, Nigeria, the United Kingdom and the United States to develop a public resource that will help researchers find genes associated with human disease and response to pharmaceuticals. See “About the International HapMap Project” for more information.

  • HeCaToS PROJECT
  • HeCaToS aims at developing integrative in silico tools for predicting human liver and heart toxicity. The objective is to develop an integrated modeling framework, by combining advances in computational chemistry and systems toxicology, for modelling toxic perturbations in liver and heart across multiple scales. This framework will include vertical integrations of representations from drug(metabolite)-target interactions, through macromolecules/proteins, to (sub-)cellular functionalities and organ physiologies, and even the human whole-body level. In view of the importance of mitochondrial deregulations and of immunological dysfunctions associated with hepatic and cardiac drug-induced injuries, focus will be on these particular Adverse Outcome Pathways. Models will be populated with data from innovative in vitro 3D liver and heart assays challenged with prototypical hepato- or cardiotoxicants; data will be generated by advanced molecular and functional analytical techniques retrieving information on key (sub-)cellular toxic events.

  • hERGCentral DATABASE | hERG – PHARMACOLOGY
  • hERGCentral is a resource center for researchers who work on hERG potassium channels or develop therapeutic compounds without cardiac side effects. By acquiring electrophysiological recordings of large compound libraries, hERGCentral provides unique resources to facilitate research and drug discovery. Exploring the whole library of over 300,000 compounds requires signing in, but 2,300 compounds with bioactivities (known drugs) can be previewed without logging in. The data can be explored in a number of ways.

  • hERGdatabase DATABASE | hERG – PHARMACOLOGY
  • The hERGdatabase contains two types of electrophysiological assay data: 1) the hERG channel inhibitory potential of chemical compounds and 2) the action potential duration (APD) prolongation activity of chemical compounds. The database allows users to obtain the IC50 value for hERG channel current inhibition by a compound, measured by a patch-clamp technique under various experimental conditions, and the percentage change in APD before and after application of a compound to cells from cardiovascular tissues of various species.

  • HExpoChem TOOL | ADVERSE EVENT – DRUG SAFETY
  • The HExpoChem server contains information on diverse sources of chemicals, with the aim of exploring human health risk from chemical exposure. Five sources of information are considered, i.e. drugs, food, cosmetics, industrial chemicals and human metabolites, corresponding to over 10,183 unique chemicals with bioactivities for around 19,483 human proteins.
    HExpoChem offers several tools to assess simultaneous exposure to multiple chemicals (a “cocktail”) and can help identify potential proteins and protein complexes associated with lifestyle diseases.

  • HIPC PROJECT | TOXENDPOINT
  • The Human Immunology Project Consortium (HIPC) program was established in 2010 by the NIAID Division of Allergy, Immunology, and Transplantation as part of the overall NIAID focus on human immunology. The purpose of HIPC is to capitalize on recent advances in immune profiling methods in order to create a novel public resource that characterizes diverse states of the human immune system following infection; prior to and following vaccination against an infectious disease; or prior to and following treatment with an immune adjuvant that targets a known innate immune receptor(s).

  • HPIminer (article) TOOL | SYSTEMS BIOLOGY – TEXT MINING
  • HPIminer is a text mining system for visualizing human protein interactions and pathways from biomedical literature. HPIminer extracts human PPI information and PPI pairs from biomedical literature, and visualizes their associated interactions, networks and pathways using two curated databases, HPRD and KEGG.

  • Human Cytochrome P450 (CYP) Allele Nomenclature THESAURI | CYP450 – PHARMACOLOGY
  • The home page of the Human Cytochrome P450 (CYP) Allele Nomenclature Committee.

  • Human Liver Adverse Effects Database (file) DATABASE – THESAURI | ADVERSE EVENT – DRUG SAFETY – PHARMACOLOGY
  • FDA’s Center for Drug Evaluation and Research, Office of Pharmaceutical Science, Informatics and Computational Safety Analysis Staff’s Adverse Effects Database contains data for 631 unique pharmaceuticals listed in FDA/CDER’s Spontaneous Reporting System Database. The ICSAS Adverse Effects Database includes adverse drug reaction (ADR) reports described using a standardized vocabulary of 1191 COSTAR terms linked to 22 organ systems which are based upon the anatomical COSTAR term nomenclature. For each of the 631 drugs in our database, we extracted the counts of ADR reports for all populated COSTAR terms for the time periods described below.

  • Human Metabolic Atlas (article) DATABASE – TOOL | METABOLISM – SYSTEMS BIOLOGY
  • The Human Metabolic Atlas (HMA) is a unique tool for studying human metabolism, ranging in scope from an individual cell, to a specific organ, to the overall human body.

  • Human Metabolome Database DATABASE | METABOLISM – PHARMACOLOGY
  • The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed information about small molecule metabolites found in the human body. It is intended to be used for applications in metabolomics, clinical chemistry, biomarker discovery and general education. The database is designed to contain or link three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data. The database (version 2.5) contains over 7900 metabolite entries including both water-soluble and lipid soluble metabolites as well as metabolites that would be regarded as either abundant (> 1 uM) or relatively rare (< 1 nM). Additionally, approximately 7200 protein (and DNA) sequences are linked to these metabolite entries. Each MetaboCard entry contains more than 110 data fields with 2/3 of the information being devoted to chemical/clinical data and the other 1/3 devoted to enzymatic or biochemical data. Many data fields are hyperlinked to other databases (KEGG, PubChem, MetaCyc, ChEBI, PDB, Swiss-Prot, and GenBank) and a variety of structure and pathway viewing applets. The HMDB database supports extensive text, sequence, chemical structure and relational query searches. Four additional databases, DrugBank, T3DB, SMPDB and FooDB are also part of the HMDB suite of databases. DrugBank contains equivalent information on ~1500 drugs, T3DB contains information on 2900 common toxins and environmental pollutants, SMPDB contains pathway diagrams for 350 human metabolic and disease pathways, while FooDB contains equivalent information on ~2000 food components and food additives.

  • Human oral bioavailability database DATABASE | ADME
  • The oral bioavailability database includes 805 molecules collected from about 200 literature sources. Two databases are available for download: the first includes 805 molecules and the second includes 773 molecules. The first database is the updated version of the second one. In the new version, 32 molecules were added and some errors in the old version were corrected, so please download the updated version for your research.

  • Human Proteome Initiative DATABASE – THESAURI | ANNOTATION
  • The Human Proteome Initiative (HPI) aims to annotate all known human protein sequences and their mammalian orthologs, according to the quality standards of UniProtKB/Swiss-Prot. This goal has been partially reached as of UniProt release 14.1 of 2-Sep-2008, since a manually annotated representation of all the currently known human protein-coding genes has been made publicly available on our website. In addition to accurate sequences, we offer, for each characterized protein, a wealth of information that includes the description of its function, domain structure, subcellular location, similarities to other proteins, etc.

  • IARC Monographs on the Evaluation of Carcinogenic Risks to Humans DATABASE | DRUG SAFETY – TOXENDPOINT
  • The International Agency for Research on Cancer (IARC) Monographs identify environmental factors that can increase the risk of human cancer. These include chemicals, complex mixtures, occupational exposures, physical agents, biological agents, and lifestyle factors. National health agencies can use this information as scientific support for their actions to prevent exposure to potential carcinogens.

  • ICSAS DATABASE – TOOL | RISK ASSESSMENT – TOXENDPOINT
  • The Informatics and Computational Safety Analysis Staff (ICSAS) is part of CDER’s Office of Pharmaceutical Science. ICSAS is an applied regulatory research unit that: Develops databases of toxicological and clinical endpoints; Transforms data, developing rules for quantifying toxicological and clinical effects; Evaluates structure activity relationship (SAR) and data mining software using ICSAS databases; Works with software developers to develop toxicology and clinical effects prediction programs through research leveraging partnerships; Reduces the use of animals in testing by eliminating non-critical laboratory studies; Facilitates the review process by making better use of accumulated scientific knowledge; Supplies tools to the pharmaceutical industry to develop better means to identify and eliminate compounds with potentially significant adverse properties early in the drug discovery and development process.

  • IMEx (article) PROJECT | NETWORK – PATHWAY – SYSTEMS BIOLOGY
  • The IMEx consortium is an international collaboration between a group of major public interaction data providers who have agreed to share curation effort and develop and work to a single set of curation rules when capturing data from either directly deposited interaction data or publications in peer-reviewed journals; capture full details of an interaction in a “deep” curation model; perform a complete curation of all protein-protein interactions experimentally demonstrated within a publication; make these interactions available in a single search interface on a common website; provide the data in standards-compliant download formats; and make all IMEx records freely accessible under the Creative Commons Attribution License.

  • IMID (article) DATABASE – THESAURI | NETWORK – ONTOLOGY – PATHWAY
  • The Integrated Molecular Interaction Database (IMID) is a database of molecular interaction information integrated with various other bio-entity information, including pathways, diseases, gene ontology (GO) terms, species and molecular types. The information is obtained from several manually curated databases and by automatic extraction from literature.
    Currently, protein-protein interaction, gene/protein regulation and protein-small molecule interaction information is stored in the database. The interaction information is linked with relevant GO terms, pathway, disease and species names. Interactions are also linked to the PubMed IDs of the abstracts from which they were obtained.
    Manually curated molecular interaction information was obtained from the BioGRID, IntAct, NCBI Gene, and STITCH databases. Pathway-related information was obtained from the KEGG database, the Pathway Interaction Database and Reactome. Disease information was obtained from PharmGKB and the KEGG database. Gene ontology terms and related information were obtained from the Gene Ontology database and the GOA database.

  • INCHEM DATABASE | TOXENDPOINT
  • Chemical Safety Information from Intergovernmental Organizations. Rapid access to internationally peer reviewed information on chemicals commonly used throughout the world, which may also occur as contaminants in the environment and food. It consolidates information from a number of intergovernmental organizations whose goal it is to assist in the sound management of chemicals.

  • InChI THESAURI | IDENTIFIER
  • InChI was developed jointly by IUPAC and NIST and is the newest way of describing chemical structures in text. It is steadily gaining popularity in the chemical informatics community, as it has several very interesting features.

  • InChIKey THESAURI | IDENTIFIER
  • InChIKey is a fixed-length format directly derived from InChI. It is based on a strong hash (SHA-256 algorithm) of an InChI string (there is no guarantee that two distinct molecules will have different InChIKeys).
    The nature of InChIKey makes it ideal for database storage, especially for indexing purposes (it cannot be used as the only format for chemical structure storage because it is not convertible to the original structure).
    InChIKey is also a very good format for online publishing in the form of metadata. Its short, fixed length and compact form mean that search engines will read and index it properly, which might not be true for long InChIs.
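    The indexing idea above can be sketched as follows. This is a minimal illustration assuming only Python's standard `hashlib`: the `fixed_length_key` helper is hypothetical and does NOT reproduce the official InChIKey encoding; it only shows why a fixed-length hash of an InChI string works as an index key but cannot be converted back to the structure.

```python
import hashlib
from typing import Dict


def fixed_length_key(inchi: str, length: int = 27) -> str:
    """Hypothetical stand-in for an InChIKey: a fixed-length prefix of a SHA-256 hex digest.

    Not the real InChIKey algorithm; for illustration of the indexing idea only.
    """
    digest = hashlib.sha256(inchi.encode("utf-8")).hexdigest()
    return digest[:length]


# Example InChI strings (ethanol and methane).
ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
methane = "InChI=1S/CH4/h1H4"

# The key is used for lookup; the full InChI must be stored alongside it,
# because the hash cannot be reversed to recover the structure.
index: Dict[str, str] = {fixed_length_key(s): s for s in (ethanol, methane)}

print(len(fixed_length_key(ethanol)), len(fixed_length_key(methane)))
print(index[fixed_length_key(ethanol)] == ethanol)
```

    Every key has the same length regardless of how long the InChI is. Because a hash maps many possible inputs to a fixed number of outputs, collisions between distinct molecules are unlikely but cannot be ruled out.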

  • INHAND (article) THESAURI | ONTOLOGY
  • The European Society of Toxicologic Pathology, in conjunction with RITA, endorsed the proposal in late 2005. They suggested that, since RITA had recently completed a large amount of work on proliferative lesions of both rats and mice, the project focus on non-proliferative lesions. They also offered to provide an open version of RENI (goRENI: global open Registry Nomenclature Information System, www.goreni.org) to serve as a platform. Access to goRENI is available to all members of the participating STPs (see below for further information on accessing goRENI).
    The result of these discussions was the INHAND Proposal (International Harmonization of Nomenclature and Diagnostic Criteria for Lesions in Rats and Mice). In 2006, the BSTP and the JSTP joined the initiative, so that the project was truly global.

  • InnoMed DATABASE – PROJECT | DRUG SAFETY – RISK ASSESSMENT
  • InnoMed PredTox is a joint Industry and European Commission collaboration to improve drug safety. The consortium is composed of 14 pharmaceutical companies, three academic institutions and two technology providers. The goal of InnoMed PredTox is to assess the value of combining results from omics technologies together with the results from more conventional toxicology methods in more informed decision making in preclinical safety evaluation.

  • insilicofirst TOOL
  • Insilicofirst is a new collaboration of the world’s leading organisations in toxicity prediction systems, uniting: Lhasa Limited, Leadscope Inc., Molecular Networks GmbH and Multicase Inc. in a partnership combining highly regarded scientific knowledge with expertise in in silico predictive software development.

  • IntSide (article) TOOL – DATABASE | RISK ASSESSMENT – SYSTEMS BIOLOGY
  • IntSide is a web server to elucidate the molecular processes involved in drug side effects through the integration of chemistry and biology. Navigate through enriched traits categorized in eight levels of complexity. The network integration and visualization offered by IntSide allows for the identification of complex mechanisms.

  • In Silico Toxicology book OTHER | ADME – QSAR MODELING – STRUCTURE-BASED PREDICTION – TOXENDPOINT
  • “In Silico Toxicology: Principles and Applications” is a new book edited by Mark Cronin and Judith Madden of Liverpool John Moores University, England. The book guides the user through the development of (Q)SARs and categories for toxicity prediction. Whilst the book leans towards toxicology, from human health effects to environmental endpoints, it will be of interest to all QSAR practitioners as it contains a number of useful “QSAR Methods” chapters. The book describes the process of developing QSARs for toxicity, specifically: Obtaining data for modelling and assessing its quality; Calculating physico-chemical properties as well as 2-D, 3-D and MO descriptors; Statistical analysis; Regulatory use and acceptance of QSARs including OECD principles and applicability domain; Relevance of mechanisms of action and (adverse outcome) pathways; Category formation and read-across, including use of freely available tools; Expert systems for toxicity prediction; Prediction of ADME and exposure modelling; and Using in silico predictions in weight of evidence and integrated testing strategies, illustrated with case studies.

  • IntAct (article) DATABASE | NETWORK – PATHWAY – SYSTEMS BIOLOGY
  • IntAct provides a freely available, open source database system and analysis tools for protein interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.

  • International Conference on Harmonisation PROJECT – REGULATORY GUIDELINES
  • The International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) is a unique project that brings together the regulatory authorities of Europe, Japan and the United States and experts from the pharmaceutical industry in the three regions to discuss scientific and technical aspects of product registration. The purpose is to make recommendations on ways to achieve greater harmonisation in the interpretation and application of technical guidelines and requirements for product registration in order to reduce or obviate the need to duplicate the testing carried out during the research and development of new medicines.

  • InterMine (article) DATABASE – TOOL
  • InterMine is an open source data warehouse built specifically for the integration and analysis of complex biological data. Developed by the Micklem lab at the University of Cambridge, InterMine enables the creation of biological databases accessed by sophisticated web query tools. Parsers are provided for integrating data from many common biological data sources and formats, and there is a framework for adding your own data. InterMine includes an attractive, user-friendly web interface that works ‘out of the box’ and can be easily customised for your specific needs, as well as a powerful, scriptable web-service API to allow programmatic access to your data.

  • IntPath (article) DATABASE | PATHWAY – SYSTEMS BIOLOGY
  • IntPath is an integrated pathway gene relationship database for model organisms and important pathogens.

  • iPHACE (article) TOOL | PHARMACOLOGY
  • iPHACE is an integrative web-based tool to navigate in the pharmacological space defined by small molecule drugs present in the IUPHAR-DB database, with additional interactions from the PDSP database. The current release (1.0) contains 4,089 interactions between 739 drugs and 181 targets, covering 147 G protein-coupled receptors (GPCR) and 34 ligand-gated ion channels (IC). Of those, a total of 2,122 interactions between 739 drugs and 140 targets come from IUPHAR-DB, supplemented with 3,009 interactions between 330 of those drugs and 121 targets from PDSP.

  • iProClass DATABASE | DRUG DISCOVERY – PATHWAY – SYSTEMS BIOLOGY
  • The iProClass database provides value-added information reports for UniProtKB and unique NCBI Entrez protein sequences in UniParc, with links to over 90 biological databases, including databases for protein families, functions and pathways, interactions, structures and structural classifications, genes and genomes, ontologies, literature, and taxonomy. iProClass combines both data warehouse and hypertext navigation methods for integrating data, providing a comprehensive picture of protein properties that may lead to novel prediction and functional inference for previously uncharacterized “hypothetical” proteins and protein groups.

  • ISIDA TOOL | ADME – PHARMACOLOGY
  • ISIDA (In SIlico design and Data Analysis) is software for virtual screening of large databases of compounds and reactions and for assessing some ADME/Tox properties.

  • ISSCAN DATABASE | STRUCTURE-BASED PREDICTION – TOXENDPOINT
  • The Chemical carcinogens: structures and experimental data (ISSCAN) database contains information on chemical compounds tested with the long-term carcinogenicity bioassay on rodents (rat, mouse). Besides being a repository of data, it has been specifically designed as an expert decision support tool.
    Historically, this database originates from the experience of researchers of the Environment and Primary Prevention Department in the field of structure-activity relationships, aimed at developing models which theoretically predict the carcinogenicity of chemicals. The use of experimental carcinogenicity data for structure-activity relationship studies amplifies their informative value, and contributes to the reduction and replacement of animal experimentation.
    This database contains neither epidemiological data nor regulatory classifications of the carcinogens, but only the experimental results from the carcinogenicity bioassay.
    The structure of this database is inspired by that of the Distributed Structure-Searchable Toxicity (DSSTox) Network of the US Environmental Protection Agency (EPA). In the spirit of DSSTox, this project aims to contribute to the free diffusion of scientific data in a standardized, easy-to-read format.

  • ITER DATABASE | RISK ASSESSMENT – TOXENDPOINT
  • The International Toxicity Estimates for Risk Assessment (ITER) is a free Internet database of human health risk values and cancer classifications for over 600 chemicals of environmental concern from multiple organizations worldwide. ITER is the only database that presents risk data in a tabular format for easy comparison, along with a synopsis explaining differences in data and a link to each organization for more information.

  • IUPAC Gold Book THESAURI | NUCLEAR RECEPTORS – PHARMACOLOGY
  • The IUPAC Gold Book is the interactive version of the IUPAC Compendium of Chemical Terminology, informally known as the Gold Book. One can browse it through the alphabetical index or one of the many thematic indexes, or use the search entry in the navigation sidebar.

  • IUPHAR DATABASE – THESAURI | PHARMACOLOGY
  • The International Union of Basic and Clinical Pharmacology (IUPHAR) database is an expert-curated database and an established online reference resource for several important classes of human drug targets and related proteins. As well as providing recommended nomenclature, the database integrates information on the chemical, genetic, functional and pathophysiological properties of receptors and ion channels, curated and peer-reviewed from the biomedical literature by a network of experts. The database now includes information on 616 gene products from four superfamilies in human and rodent model organisms: G protein-coupled receptors, voltage- and ligand-gated ion channels and, in a recent update, 49 nuclear hormone receptors (NHRs). New data types for NHRs include details on co-regulators, DNA binding motifs, target genes and 3D structures. Other recent developments include curation of the chemical structures of approximately 2000 ligand molecules, providing electronic descriptors, identifiers, link-outs and calculated molecular properties, all available via enhanced ligand pages. The interface now provides intelligent tools for the visualization and exploration of ligand structure-activity relationships and the structural diversity of compounds active at each target.

  • jCompoundMapper (article) THESAURI – TOOL | PHARMACOLOGY
  • jCompoundMapper is a Java library for the decomposition of chemical graphs. It provides popular fingerprinting algorithms for chemical graphs, such as depth-first search fingerprints, shortest-path fingerprints, extended connectivity fingerprints, autocorrelation fingerprints (e.g. CATS2D), radial fingerprints (e.g. Molprint2D), geometrical Molprint, atom pairs, and pharmacophore fingerprints, as well as exporters for several machine learning tool formats such as LIBSVM, LIBLINEAR, and WEKA. Parameters such as search depth, distance cut-offs, or atom typing can be set, and for hashed fingerprints the size of the hash space is configurable. It can be used as a lightweight jar library or as a stand-alone executable, is based on open source software with a liberal license, and uses the chemical expert system of the Chemistry Development Kit.

  • Jochem (article) THESAURI – TOOL | PHARMACOLOGY
  • Jochem is a dictionary for the identification of small molecules and drugs in text, combining information from UMLS, MeSH, ChEBI, DrugBank, KEGG, HMDB, and ChemIDplus. Rule-based term filtering, manual check of highly frequent terms, and disambiguation rules were applied. We tested the combined dictionary and the dictionaries derived from the individual resources on an annotated corpus, and conclude the following: (1) each of the different processing steps increases precision with a minor loss of recall; (2) the overall performance of the combined dictionary is acceptable (precision 0.67, recall 0.40 (0.80 for trivial names)); (3) the combined dictionary performed better than the dictionary in the chemical recognizer OSCAR3; (4) the performance of a dictionary based on ChemIDplus alone is comparable to the performance of the combined dictionary.
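    The precision and recall figures quoted above can be combined into a single F1 score (their harmonic mean). The entry itself does not report an F1 value, so the number computed below is derived from the two quoted figures only, and the `f1_score` helper is an illustrative name.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall figures quoted for the combined dictionary.
overall_f1 = f1_score(precision=0.67, recall=0.40)
print(f"Overall F1 of the combined dictionary: {overall_f1:.2f}")
```

    With precision 0.67 and recall 0.40 this gives an F1 of about 0.50, which shows how the low recall pulls the combined score down.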

  • Joint European Compound Library (article) DATABASE – PROJECT | PHARMACOLOGY
  • The Joint European Compound Library (JECL) lies at the heart of the EU Lead Factory. It contains a diverse range of very high quality compounds that are ‘drug-like’ and synthetically tractable and that will make excellent starting points for projects. Until now, these have been distributed across many different proprietary collections with highly restricted access.
    Around 300,000 high quality compounds have been contributed by the seven pharmaceutical companies in the EU Lead Factory Consortium (the EFPIA Collection). A key project goal is to add a further 200,000 innovative compounds, carefully selected for novelty, drug-like properties, diversity and synthetic tractability. These will be based on ideas from academic and industry chemists from across Europe and synthesised by chemistry SMEs with expertise in the preparation of chemical libraries (the Public Collection). Together, the EFPIA and Public Collections make up the JECL.

  • JRC QSAR Database DATABASE | QSAR MODELING
  • The JRC QSAR Database contains a compilation of documentation on the science and applications of non-testing methods, including (Quantitative) Structure-Activity Relationships and chemical grouping methods. More information on the applications of non-testing methods, especially for regulatory purposes, is available on their website.

  • KEGG DATABASE | PATHWAY – PHARMACOLOGY
  • KEGG is an integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. Genomic and chemical information represents the molecular building blocks of life in the genomic and chemical spaces, respectively, and systems information represents functional aspects of the biological systems, such as the cell and the organism, that are built from the building blocks. KEGG has been widely used as a reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.

  • KEGGParser (article) TOOL | PATHWAY
  • KEGGParser is a MATLAB-based tool for KEGG pathway parsing, semiautomatic fixing, editing, visualization and analysis in the MATLAB environment. It also works with Scilab.

  • KiPar (article) TOOL | METABOLISM – PATHWAY – TEXT MINING
  • KiPar is a standalone Java application for the retrieval of textual documents likely to contain information relevant for kinetic modelling of a given metabolic pathway. KiPar is a computer application for (1) retrieval of textual documents given pathway information and the required kinetic parameters; and (2) annotation of the retrieved documents with the pathway- and kinetics-related concepts and the potential values of these parameters.

  • Kiwi (article) TOOL | SYSTEMS BIOLOGY
  • Kiwi is a tool that enhances interpretability of high-throughput data. It allows users not only to discover a list of significant entities or processes, as in gene-set analysis, but also to visualize whether these entities or processes are isolated or connected by means of their biological interaction. The Kiwi module combines gene-set analyses with biological networks to visualize the interactions between gene sets that are significant in a given biological system.

  • KNApSAcK TOOL | METABOLISM
  • KNApSAcK is a tool for the analysis of metabolites. Information on natural products has been amassed, with special emphasis on their biological origins. You can retrieve information on metabolites by entering the organism name, the name of a metabolite, molecular weight, or molecular formula. KNApSAcK also provides a tool for analyzing mass spectrum data.

  • KNIME TOOL | WORKFLOW
  • KNIME is a modular data exploration platform that enables the user to visually create data flows (often referred to as pipelines), selectively execute some or all analysis steps, and later investigate the results through interactive views on data and models.

  • KNIME-CDK (article) TOOL | WORKFLOW
  • KNIME-CDK is an open-source plug-in for the Konstanz Information Miner (KNIME), a free workflow platform. KNIME-CDK is built on top of the open-source Chemistry Development Kit (CDK) and allows for efficient cross-vendor structural cheminformatics. Its ease of use and modularity enable researchers to automate routine tasks and data analysis, bringing complementary cheminformatics functionality to the workflow environment.

  • KnowLife (article) TOOL | TEXT MINING – RISK ASSESSMENT – SYSTEMS BIOLOGY
  • KnowLife is a large knowledge base that captures a variety of biomedical knowledge, automatically extracted from different genres of input sources, and supports a number of use cases for different information needs. For instance, a patient may wish to find out the side effects of a specific drug by searching for the drug name and browsing the SideEffect facts and their provenance, and a physician may want to “speed read” publications and online discussions on treatment options for an unfamiliar disease. It also provides a function for on-the-fly annotation of new text from publications or social media, leveraging known patterns to highlight any relations found.

  • KOMICS (article) DATABASE – PROJECT | METABOLISM
  • KOMICS (the Kazusa Metabolomics Portal) is a web portal where the tools and databases developed by its authors are available for free to academic users. KOMICS includes tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal.

  • LAZAR DATABASE | QSAR MODELING – STRUCTURE-BASED PREDICTION
  • The Lazy Structure–Activity Relationships (LAZAR) database provides QSAR predictions for liver toxicity, mutagenicity, and carcinogenicity. In silico toxicology provides customised solutions for the computer-based prediction of toxic activities.

  • Ligand.Info DATABASE | PHARMACOLOGY
  • Ligand.Info is a compilation of various publicly available databases of small molecules such as ChemBank, ChemPDB, KEGG, NCI, AKos GmbH, Asinex Ltd, and TimTec. The total size of the Meta-Database is 1 million entries. The compound records contain calculated three-dimensional coordinates and, in some cases, information about biological activity. Some molecules have information about FDA drug approval status or about anti-HIV activity. The Meta-Database can be downloaded in SDF format and used for virtual high-throughput screening of new potential drugs. The database can also be screened using a Java-based tool.
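
    As a rough illustration of working with an SDF download, the sketch below splits SDF text into records (each record ends with a `$$$$` line) and reads the tagged data items that follow the structure block. It uses only the Python standard library and does not parse atoms or bonds. The field name `STATUS` in the usage note is a hypothetical placeholder, not a documented Ligand.Info field.

```python
def iter_sdf_records(text):
    """Yield each SDF record as a string; records are terminated by '$$$$'."""
    record = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if record:
                yield "\n".join(record)
            record = []
        else:
            record.append(line)
    # Tolerate a final record that lacks the '$$$$' terminator.
    if any(l.strip() for l in record):
        yield "\n".join(record)


def data_fields(record):
    """Return the tagged data items ('> <NAME>' followed by value lines)."""
    fields = {}
    lines = record.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            name = line[line.index("<") + 1 : line.rindex(">")]
            values = []
            i += 1
            # Value lines run until the next blank line.
            while i < len(lines) and lines[i].strip():
                values.append(lines[i])
                i += 1
            fields[name] = "\n".join(values)
        i += 1
    return fields
```

    For example, `[data_fields(r).get("STATUS") for r in iter_sdf_records(text)]` would collect one (hypothetical) status value per compound. For real screening work, a cheminformatics toolkit is a better choice than hand-written parsing.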

  • LINCS DATABASE – PROJECT | NETWORK – SYSTEMS BIOLOGY
  • The Library of Integrated Network-based Cellular Signatures (LINCS) program aims to develop a “library” of molecular signatures, based on gene expression and other cellular changes that describe the response that different types of cells elicit when exposed to various perturbing agents, including siRNAs and small bioactive molecules. High-throughput screening approaches will be used to interrogate the cells and mathematical approaches will be used to describe the molecular changes and patterns of response. The data will be collected in a standardized, integrated, and coordinated manner to promote consistency and comparison across different cell types.

    The underlying premise of the LINCS program is that disrupting any one of the many steps of a given biological process will cause related changes in the molecular and cellular characteristics, behavior, and/or function of the cell – also known as the cellular phenotype. A cellular phenotype is, in turn, intended to reflect signatures derived for comparable assays of clinical states. Observing how and when a cell’s phenotype is altered by specific stressors can provide clues about the underlying mechanisms involved in perturbation and ultimately disease.

    LINCS data will be made openly available as a community resource that can be easily scaled up and augmented to address a broad range of basic research questions and to facilitate the identification of biological targets for new disease therapies.

  • LiverAtlas (article) DATABASE | TOXENDPOINT
  • LiverAtlas is a comprehensive resource of biomedical knowledge related to the liver and various hepatic diseases. It provides a wealth of manually curated records, relevant literature citations and cross-references to other databases. LiverAtlas covers detailed information on the liver-related genome, transcriptome, proteome, metabolome, pathways and liver diseases. Importantly, the database contains proteins and genes that are specifically expressed in the liver. In particular, an expert-confirmed Human Liver Disease Ontology, including relevant information for 227 types of hepatic disease, has been constructed and is used to annotate and arrange the data in LiverAtlas. Reliability scores are also assigned to the entries for liver-expressed genes and proteins, PPIs, PTMs, and molecular/genetic events of hepatic diseases by a semi-quantitative assessment, to help users select data of interest.

  • LiverTox DATABASE | TOXENDPOINT
  • LiverTox is a freely available website that provides up-to-date, comprehensive and unbiased information about drug-induced liver injury caused by prescription and nonprescription drugs, herbals and dietary supplements. LiverTox represents a collaborative effort by medical and scientific specialists to provide a central repository of clinical information in support of clinical and basic research on the prevention and control of drug-induced liver injury. LiverTox also provides guidance to clinicians and healthcare providers on the diagnosis and management of this important cause of liver disease. LiverTox is a joint effort of the Liver Disease Research Branch of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) and the Division of Specialized Information Services of the National Library of Medicine (NLM), National Institutes of Health. The authors of LiverTox welcome any and all comments, particularly corrections or additions. The website is a living textbook that will be regularly updated and improved. The text of LiverTox is not copyright protected and its general use is encouraged.

  • LocFuse TOOL | NETWORKS – PATHWAYS
  • In the framework of human PPI prediction, LocFuse is a method that uses eight different genomic and proteomic features along with four types of different classifiers. The prediction performance of this classifier selection method was found to be considerably better than methods employed hitherto. This confirms the complex nature of the PPI prediction problem and also the necessity of using biological information for classifier fusion.

  • LmmD DATABASE | ADME – CYP450 – PHARMACOLOGY – TOXENDPOINT
  • The Laboratory of Molecular Modeling and Design (LMMD) contributes to the discovery and design of drugs for viral diseases, tumors, nervous system diseases and metabolism-related diseases. It has released six ADME/Tox databases for the public scientific community, including the Blood Brain Barrier (BBB) Penetration Database, the Human Intestinal Absorption (HIA) Database, the Cytochrome P450 Inhibitors Databases, the Tetrahymena pyriformis Toxicity Databases, and the Fathead Minnow and Honey Bee Toxicity Databases. The contents of these databases are carefully collected from the literature and other open-source databases, such as PubChem and the EPA toxicity database. In the near future, more databases will be added to the collection.

  • LTKB Benchmark dataset (article 2011) (article 2013) DATABASE | TOXENDPOINT
  • NCTR scientists have developed the Liver Toxicity Knowledge Base Benchmark Dataset (LTKB-BD), containing 287 drugs whose Drug-Induced Liver Injury (DILI) impact has been established. FDA-approved prescription drug labels, available on the National Institutes of Health’s DailyMed web site, were examined, focusing on drugs that have been available for ten or more years and are available commercially from one of three large chemical supply companies. Reviewing the labels of the 287 prescription drugs yielded: a) 137 most-DILI-concern drugs, which are either withdrawn or discontinued from the market, have a “Boxed Warning” label, or have severe DILI content in the “Warnings and Precautions” section; b) 85 less-DILI-concern drugs, whose DILI events are highlighted in the “Adverse Reactions” section or appear in the “Warnings and Precautions” section with mild DILI content; and c) 65 no-DILI-concern drugs, whose labels did not contain any DILI event.
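
    The three-way labeling scheme described above can be sketched as a small decision rule. This is only an illustration of the stated criteria, not the NCTR implementation; the `LabelSummary` fields and the function name are hypothetical, and real label review involves expert judgment that a few flags cannot capture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabelSummary:
    """Hypothetical summary of what a drug label says about DILI."""
    withdrawn_or_discontinued: bool = False
    boxed_warning: bool = False
    # DILI content in "Warnings and Precautions": "severe", "mild", or None
    warnings_precautions: Optional[str] = None
    adverse_reactions: bool = False  # DILI events in "Adverse Reactions"


def classify_dili_concern(label: LabelSummary) -> str:
    """Apply the criteria in the order most -> less -> no concern."""
    if (label.withdrawn_or_discontinued
            or label.boxed_warning
            or label.warnings_precautions == "severe"):
        return "most-DILI-concern"
    if label.adverse_reactions or label.warnings_precautions == "mild":
        return "less-DILI-concern"
    return "no-DILI-concern"


# Class sizes as reported for the benchmark dataset; they sum to 287 drugs.
REPORTED_COUNTS = {
    "most-DILI-concern": 137,
    "less-DILI-concern": 85,
    "no-DILI-concern": 65,
}
```

    Checking the most severe criteria first means a drug with both a boxed warning and an adverse-reaction mention lands in the most-DILI-concern class.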

  • LTMap (article) TOOL | TOXENDPOINT
  • LTMap is a web server for assessing potential liver toxicity from genome-wide transcriptional expression data. Inspired by the Connectivity Map (cMap), it applies the fold-change ranking approach developed by Lamb to compare a ‘query signature’ with reference gene lists. The server compiles a large, publicly available gene-set database retrieved from the Toxicogenomics Project (TGP) database in Japan to form the basis for generating rank-ordered lists. In the current version, LTMap contains 20123 Affymetrix arrays generated in both in vivo and in vitro experiments.
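
    To make the idea of comparing a query signature against a rank-ordered reference list concrete, here is a deliberately simplified sketch. It ranks reference genes by fold change and scores a query by the difference in mean normalized rank between its up- and down-regulated genes. This is not the Kolmogorov-Smirnov-style statistic used by cMap, nor LTMap's actual code; all names and the gene identifiers in the usage note are assumptions for illustration.

```python
def rank_by_fold_change(fold_changes):
    """Order genes from most up-regulated to most down-regulated."""
    return sorted(fold_changes, key=fold_changes.get, reverse=True)


def connectivity_score(ranked_genes, up_genes, down_genes):
    """Simplified score in [-1, 1].

    Positive: query up-genes sit near the top of the reference ranking and
    down-genes near the bottom (similar response). Negative: the reverse.
    """
    n = len(ranked_genes)
    if n < 2:
        raise ValueError("need at least two ranked genes")
    position = {gene: i for i, gene in enumerate(ranked_genes)}

    def mean_scaled_rank(genes):
        # 1.0 for the top of the list, 0.0 for the bottom.
        hits = [1 - position[g] / (n - 1) for g in genes if g in position]
        # Genes absent from the reference contribute nothing; treat an
        # empty set as neutral (middle of the list).
        return sum(hits) / len(hits) if hits else 0.5

    return mean_scaled_rank(up_genes) - mean_scaled_rank(down_genes)
```

    With a hypothetical reference profile `{"A": 3.0, "B": 1.5, "C": -0.2, "D": -2.0}`, a query with up-gene `A` and down-gene `D` scores 1.0, and the reversed query scores -1.0.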