Package org.terrier.terms
Provides the interfaces and classes for the term pipeline, a chain of objects that process terms during indexing and query processing.
This package includes implementations of a stop-word remover, as well as a full and a weak version of Porter's stemming algorithm.
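As a minimal sketch of the pipeline idea described above, the following self-contained Java example chains a lowercasing stage and a stop-word filter in front of a collecting end stage. The `TermStage` interface, the stage classes, and the tiny stop-word set are all hypothetical names made up for this illustration; they are not the classes in this package, whose actual signatures should be checked in their own documentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class PipelineSketch {

    /** Hypothetical stand-in for a pipeline component (not Terrier's interface). */
    interface TermStage {
        void processTerm(String term);
    }

    /** End of the pipeline: collects whatever terms reach it. */
    static final class Collector implements TermStage {
        final List<String> terms = new ArrayList<>();

        @Override
        public void processTerm(String term) {
            terms.add(term);
        }
    }

    /** Lowercases each term, then hands it to the next stage. */
    static final class Lowercase implements TermStage {
        private final TermStage next;

        Lowercase(TermStage next) {
            this.next = next;
        }

        @Override
        public void processTerm(String term) {
            next.processTerm(term.toLowerCase(Locale.ROOT));
        }
    }

    /** Drops terms found in the stop-word set; passes all others on. */
    static final class StopwordFilter implements TermStage {
        private final Set<String> stopwords;
        private final TermStage next;

        StopwordFilter(Set<String> stopwords, TermStage next) {
            this.stopwords = stopwords;
            this.next = next;
        }

        @Override
        public void processTerm(String term) {
            if (!stopwords.contains(term)) {
                next.processTerm(term);
            }
        }
    }

    /** Builds Lowercase -> StopwordFilter -> Collector and feeds the tokens through. */
    public static List<String> run(List<String> tokens) {
        Collector end = new Collector();
        // Illustrative stop-word set only; a real list would be much larger.
        TermStage head = new Lowercase(new StopwordFilter(Set.of("the", "of"), end));
        for (String token : tokens) {
            head.processTerm(token);
        }
        return end.terms;
    }

    public static void main(String[] args) {
        // prints [quick, fox, glasgow]
        System.out.println(run(List.of("The", "Quick", "fox", "of", "Glasgow")));
    }
}
```

Each stage holds a reference to the next one, so a term that a stage drops never reaches the later stages. The order of the stages matters: here lowercasing runs first so that "The" matches the lowercase stop-word entry.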
Interface Summary
Interface             Description
Stemmer               Interface for all stemmers.
TermPipeline          Models the concept of a component in a pipeline of term processors.
TermPipelineAccessor  This interface allows code to access the TermPipeline without implementing the end of the term pipeline.
Class Summary
Class                      Description
BaseTermPipelineAccessor   A base implementation for TermPipelineAccessor.
CropTerm                   Reduces the size of terms passing through the term pipeline to the maximum allowed size for this indexing run.
DanishSnowballStemmer      Danish stemmer implemented by Snowball.
DumpTerm                   Development-phase TermPipeline object that prints every term that passes through it to System.err.
DutchSnowballStemmer       Dutch stemmer implemented by Snowball.
EnglishSnowballStemmer     English stemmer implemented by Snowball.
FinnishSnowballStemmer     Finnish stemmer implemented by Snowball.
FrenchSnowballStemmer      French stemmer implemented by Snowball.
GermanSnowballStemmer      German stemmer implemented by Snowball.
HungarianSnowballStemmer   Hungarian stemmer implemented by Snowball.
ItalianSnowballStemmer     Italian stemmer implemented by Snowball.
NoOp                       A do-nothing term pipeline object.
NorwegianSnowballStemmer   Norwegian stemmer implemented by Snowball.
PorterStemmer              Stemmer implementing the Porter stemming algorithm.
PortugueseSnowballStemmer  Portuguese stemmer implemented by Snowball.
RemoveDiacritics           Removes diacritics in letters.
RomanianSnowballStemmer    Romanian stemmer implemented by Snowball.
RussianSnowballStemmer     Russian stemmer implemented by Snowball.
SkipTermPipeline           Identifies tokens that should not be passed down the entire term pipeline and passes them on to a specified stage instead.
SnowballStemmer            Classic Snowball stemmer implemented by Snowball.
SpanishSnowballStemmer     Spanish stemmer implemented by Snowball.
StemmerTermPipeline        Abstract base class for Stemmers that are also TermPipeline instances.
Stopwords                  Implements stopword removal, as a TermPipeline object.
SwedishSnowballStemmer     Swedish stemmer implemented by Snowball.
TRv2PorterStemmer          The Porter stemming algorithm, coded in Java by Gianni Amati.
TRv2WeakPorterStemmer      An implementation of the Porter stemming algorithm that uses only the first step of the algorithm.
TurkishSnowballStemmer     Turkish stemmer implemented by Snowball.
WeakPorterStemmer          Weak Porter stemmer, using Porter's Java implementation as the base.
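
The weak Porter variants are described as applying only the first step of the Porter algorithm. To give a rough sense of what that first step covers, the sketch below implements step 1a (plural handling) from Porter's published rules. It is an illustration only: steps 1b and 1c are omitted, lowercase input is assumed, the class and method names are hypothetical, and the code is not taken from this package.

```java
public class Step1aSketch {

    /**
     * Porter step 1a: the first matching suffix rule wins.
     *   SSES -> SS, IES -> I, SS -> SS, S -> (removed)
     * Assumes the word is already lowercase.
     */
    public static String step1a(String word) {
        if (word.endsWith("sses")) {
            return word.substring(0, word.length() - 2); // caresses -> caress
        }
        if (word.endsWith("ies")) {
            return word.substring(0, word.length() - 2); // ponies -> poni
        }
        if (word.endsWith("ss")) {
            return word;                                 // caress -> caress
        }
        if (word.endsWith("s")) {
            return word.substring(0, word.length() - 1); // cats -> cat
        }
        return word;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"caresses", "ponies", "caress", "cats"}) {
            System.out.println(w + " -> " + step1a(w));
        }
    }
}
```

A stemmer that stops after the first step conflates far fewer word forms than the full algorithm, which is why such variants are called weak.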