|
FASTUS is a (slightly permuted) acronym for Finite State
Automata-based Text Understanding System. It is a system for extracting
information from free text. Currently English and Japanese versions of
the system exist. In typical applications, the system either marks text
with annotations that indicate items of interest, such as names of people
or companies, or fills templates with information that can then be
entered into a relational database.
FASTUS was developed in response to the needs of the intelligence community
for scanning and processing huge volumes of written text. Government intelligence
agencies collect information from around the world from both classified
and unclassified sources. Assimilating important facts from this data can
be a daunting task for an analyst. One analyst described the problem
this way: "If I read every bit of information that might be important
to what I am working on, it would be like reading War and Peace
every day." FASTUS gives the analyst a tool that helps him or her
avoid being overwhelmed by the flood of information.
FASTUS is most appropriate for information
extraction tasks, rather than full text understanding. That is, it
is most effective for tasks in which (1) only parts of the text
contain relevant information, and (2) there is a relatively simple,
predefined target representation that the information is mapped
into.
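The two conditions above can be illustrated with a minimal sketch. It is
not FASTUS code (FASTUS is written in Common Lisp) and it does not
reproduce any FASTUS rule. The HireTemplate record, the NAME and
HIRE_PATTERN expressions, and the extract_hires function are all
hypothetical names invented for this example. A regular expression stands
in for a finite-state pattern: it picks out the one relevant sentence and
maps it into a flat, predefined record, ignoring everything else.

```python
import re
from dataclasses import dataclass
from typing import List


# Hypothetical target representation: one flat record per matched event.
@dataclass
class HireTemplate:
    person: str
    company: str


# A regular expression is a finite-state pattern. Both expressions below
# are invented for illustration and are not taken from FASTUS.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
HIRE_PATTERN = re.compile(
    rf"(?P<company>{NAME}) (?:hired|appointed) (?P<person>{NAME})"
)


def extract_hires(text: str) -> List[HireTemplate]:
    """Return one filled template per match; unmatched text is ignored."""
    return [
        HireTemplate(person=m.group("person"), company=m.group("company"))
        for m in HIRE_PATTERN.finditer(text)
    ]


text = (
    "Markets were quiet on Tuesday. Acme Widgets hired Jane Smith "
    "as chief engineer. Rain is expected later this week."
)
templates = extract_hires(text)
print(templates)
```

Only the middle sentence yields a template here; the other two sentences
are skipped without any attempt to interpret them, which is the sense in
which extraction differs from full text understanding.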
FASTUS has been under development since 1992. The system
is implemented in Common Lisp and has been ported to several hardware
platforms, including Macs and PCs.
|