FACTA: a text search engine for finding associated biomedical concepts

Bioinformatics. 2008 Nov 1;24(21):2559-60. doi: 10.1093/bioinformatics/btn469. Epub 2008 Sep 4.

Abstract

FACTA is a text search engine for MEDLINE abstracts, which is designed particularly to help users browse biomedical concepts (e.g. genes/proteins, diseases, enzymes and chemical compounds) appearing in the documents retrieved by the query. The concepts are presented to the user in a tabular format and ranked based on the co-occurrence statistics. Unlike existing systems that provide similar functionality, FACTA pre-indexes not only the words but also the concepts mentioned in the documents, which enables the user to issue a flexible query (e.g. free keywords or Boolean combinations of keywords/concepts) and receive the results immediately even when the number of the documents that match the query is very large. The user can also view snippets from MEDLINE to get textual evidence of associations between the query terms and the concepts. The concept IDs and their names/synonyms for building the indexes were collected from several biomedical databases and thesauri, such as UniProt, BioThesaurus, UMLS, KEGG and DrugBank.

Availability: The system is available at http://www.nactem.ac.uk/software/facta/

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Abstracting and Indexing / methods*
  • Database Management Systems
  • MEDLINE*
  • Software*