Extrinsic Evaluation of Automated Information Extraction Programs

Report No. ARL-TN-391
Authors: Frank Small and William Tanenbaum
Date/Pages: May 2010; 16 pages
Abstract: Information extraction (IE) plays a vital role in natural language processing and serves as the foundation for computational visualization of information. Automating IE allows information to be converted into a user-defined, ontology-friendly format much faster and more efficiently. Programs such as General Architecture for Text Engineering (GATE) and Automap can extract entities such as names, dates, locations, and organizations from a corpus written in a natural language. These programs can also extract verbs and other parts of speech. The extracted information can then be formatted into a computer-readable language for visualization and for populating a database used by the fusion community to provide actionable intelligence for the Warfighter. This technical note documents the results of comparing the IE tools offered by GATE with those in Automap.
Distribution: Approved for public release

Last Update / Reviewed: May 1, 2010