Package org.terrier.terms

Provides the interface and classes for the term pipeline, a set of objects that process the terms during indexing and processing of queries.


Interface Summary
Stemmer Interface for all stemmers.
TermPipeline Models the concept of a component in a pipeline of term processors.
TermPipelineAccessor This interface allows code to access the TermPipeline without implementing the end of the term pipeline.
 

Class Summary
BaseTermPipelineAccessor A base implementation for TermPipelineAccessor
CropTerm Reduces the size of terms passing through the term pipeline to the maximum allowed size for this indexing run.
DanishSnowballStemmer Danish stemmer implemented by Snowball.
DumpTerm Useful development-phase TermPipeline object that prints every term that passes through it to System.err.
DutchSnowballStemmer Dutch stemmer implemented by Snowball.
EnglishSnowballStemmer English stemmer implemented by Snowball.
FinnishSnowballStemmer Finnish stemmer implemented by Snowball.
FrenchSnowballStemmer French stemmer implemented by Snowball.
GermanSnowballStemmer German stemmer implemented by Snowball.
HungarianSnowballStemmer Hungarian stemmer implemented by Snowball.
ItalianSnowballStemmer Italian stemmer implemented by Snowball.
NoOp A do-nothing term pipeline object.
NorwegianSnowballStemmer Norwegian stemmer implemented by Snowball.
PorterStemmer Stemmer, implementing the Porter Stemming Algorithm.
PortugueseSnowballStemmer Portuguese stemmer implemented by Snowball.
RomanianSnowballStemmer Romanian stemmer implemented by Snowball.
RussianSnowballStemmer Russian stemmer implemented by Snowball.
SkipTermPipeline Class that identifies tokens which should not be passed down the entire term pipeline, and passes them on to a specified stage instead.
SnowballStemmer  
SpanishSnowballStemmer Spanish stemmer implemented by Snowball.
StemmerTermPipeline Abstract base class for Stemmers that are also TermPipeline instances
Stopwords Implements stopword removal, as a TermPipeline object.
SwedishSnowballStemmer Swedish stemmer implemented by Snowball.
TRv2PorterStemmer This is the Porter stemming algorithm, coded in Java by Gianni Amati.
TRv2WeakPorterStemmer An implementation of the Porter stemming algorithm that uses only the first step of the algorithm.
TurkishSnowballStemmer Turkish stemmer implemented by Snowball.
WeakPorterStemmer Weak Porter Stemmer, using Porter's Java implementation as the base.
 

Package org.terrier.terms Description


This package includes implementations of a stop-word remover, as well as a full and a weak version of Porter's stemming algorithm.
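
The chaining idea behind a term pipeline can be illustrated with a minimal, self-contained sketch. It does not use the Terrier classes: the Stage interface, StopwordStage, ToyStemStage and Collector below are hypothetical stand-ins, and the toy stemmer only strips one trailing "s" rather than implementing Porter's algorithm. Each stage receives one term, then either drops it or hands a (possibly modified) term to the next stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TermPipelineSketch {

    /** Hypothetical stand-in for a pipeline stage: receives one term at a time. */
    interface Stage {
        void processTerm(String term);
    }

    /** Drops terms found in a stopword set; passes every other term on unchanged. */
    static class StopwordStage implements Stage {
        private final Set<String> stopwords;
        private final Stage next;

        StopwordStage(Set<String> stopwords, Stage next) {
            this.stopwords = stopwords;
            this.next = next;
        }

        public void processTerm(String term) {
            if (!stopwords.contains(term)) {
                next.processTerm(term);
            }
        }
    }

    /** Toy suffix stripper (NOT the Porter algorithm): removes one trailing "s". */
    static class ToyStemStage implements Stage {
        private final Stage next;

        ToyStemStage(Stage next) {
            this.next = next;
        }

        public void processTerm(String term) {
            boolean strip = term.length() > 3 && term.endsWith("s");
            next.processTerm(strip ? term.substring(0, term.length() - 1) : term);
        }
    }

    /** End of the pipeline: collects the terms that survive every stage. */
    static class Collector implements Stage {
        final List<String> terms = new ArrayList<>();

        public void processTerm(String term) {
            terms.add(term);
        }
    }

    /** Builds stopwords -> toy stemmer -> collector and feeds the tokens through. */
    static List<String> run(String... tokens) {
        Collector end = new Collector();
        Stage stemmer = new ToyStemStage(end);
        Stage pipeline = new StopwordStage(
                new HashSet<>(Arrays.asList("the", "of", "and")), stemmer);
        for (String token : tokens) {
            pipeline.processTerm(token);
        }
        return end.terms;
    }

    public static void main(String[] args) {
        // "the" and "of" are dropped; "terms" loses its trailing "s".
        System.out.println(run("the", "stemming", "of", "terms"));
    }
}
```

Because every stage holds a reference only to its successor, stages can be reordered or swapped (for example, a different stemmer) without changing the code that feeds terms into the first stage.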



Terrier 3.5. Copyright © 2004-2011 University of Glasgow