Terrier Future Features and Known Issues
List of features and known issues that are marked for future Terrier versions:
- Ant task for compiling Terrier.
- Alternatively, provisions for compiling Terrier on Windows.
- Release of distributed version of Terrier.
- Terrier's own Exceptions for setup, indexing, and querying.
- Possible integration with log4j.
- B-tree based lexicon file format.
- PostScript parsing (via PDF?).
- More provisions for multiple languages (Unicode, encodings, etc.).
- Refinement of desktop search application with improved
interface and parsers for more types of documents, as well as
better integration with common operating systems.
- Open With dialog for Desktop Terrier on Mac OS X.
- Faster PDF parsing, perhaps calling pdf2text if available.
- Move all binary trees to threaded trees, which would allow
non-recursive traversals and thus prevent stack overflows,
particularly when block indexing documents that have large numbers
of repeated terms (e.g. spreadsheets).
- Refinement of the query language: allow more term characters, and remove the
ambiguity warnings when generating the parser with ANTLR.
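To illustrate the threaded-tree item in the list above, here is a minimal sketch of a right-threaded binary search tree in which both insertion and in-order traversal are iterative, so the call stack does not grow with tree depth. This is not Terrier code: the class ThreadedTermTree, its Node fields, and the "term:frequency" output format are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a right-threaded binary search tree of terms. Not Terrier code. */
final class ThreadedTermTree {

    private static final class Node {
        final String term;
        int frequency = 1;
        Node left;
        Node right;            // real right child, or a thread to the in-order successor
        boolean rightIsThread; // true when 'right' is a thread rather than a child

        Node(String term) { this.term = term; }
    }

    private Node root;

    /** Iterative insert; a repeated term only increments its counter. */
    void add(String term) {
        if (root == null) { root = new Node(term); return; }
        Node current = root;
        while (true) {
            int cmp = term.compareTo(current.term);
            if (cmp == 0) { current.frequency++; return; }
            if (cmp < 0) {
                if (current.left == null) {
                    Node n = new Node(term);
                    n.right = current;        // successor of a new left leaf is its parent
                    n.rightIsThread = true;
                    current.left = n;
                    return;
                }
                current = current.left;
            } else {
                if (current.right == null || current.rightIsThread) {
                    Node n = new Node(term);
                    n.right = current.right;  // inherit the parent's thread (null if none)
                    n.rightIsThread = current.rightIsThread;
                    current.right = n;
                    current.rightIsThread = false;
                    return;
                }
                current = current.right;
            }
        }
    }

    /** In-order traversal that follows threads instead of recursing. */
    List<String> inOrder() {
        List<String> out = new ArrayList<>();
        Node current = leftmost(root);
        while (current != null) {
            out.add(current.term + ":" + current.frequency);
            current = current.rightIsThread ? current.right : leftmost(current.right);
        }
        return out;
    }

    private static Node leftmost(Node n) {
        while (n != null && n.left != null) n = n.left;
        return n;
    }
}
```

Each node without a right child stores a pointer to its in-order successor instead, so the traversal needs neither recursion nor an explicit stack. The cost is one extra flag per node and slightly more bookkeeping on insertion.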
All community contributions to the Terrier framework are welcome.
If you're actively using Terrier, or developing for it, please join our mailing
lists. See the Terrier mailing lists for more
information. In addition, you can find more information about contributing on
the Terrier website.