Terrier IR Platform
1.1.1

uk.ac.gla.terrier.terms
Class PorterStemmer

java.lang.Object
  extended by uk.ac.gla.terrier.terms.PorterStemmer
All Implemented Interfaces:
TermPipeline
Direct Known Subclasses:
WeakPorterStemmer

public class PorterStemmer
extends java.lang.Object
implements TermPipeline

This is the Porter stemming algorithm, implemented in Java by Gianni Amati. The comments are Porter's own, apart from a few that were added because of implementation choices.
Porter says "It may be regarded as canonical, in that it follows the algorithm presented in Porter, 1980, An algorithm for suffix stripping, Program, Vol. 14, no. 3, pp 130-137, only differing from it at the points marked --DEPARTURE-- below. The algorithm as described in the paper could be exactly replicated by adjusting the points of DEPARTURE, but this is barely necessary, because (a) the points of DEPARTURE are definitely improvements, and (b) no encoding of the Porter stemmer I have seen is anything like as exact as this version, even with the points of DEPARTURE!".
This class is not thread safe.
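
Because the buffer b is declared static, it is presumably shared by every instance, so creating one stemmer per thread may not be enough on its own. One possible way to call a stemmer like this from several threads is to serialise access behind a single lock. The following is a minimal sketch under that assumption, not part of Terrier: UnsafeToyStemmer is a hypothetical stand-in that mimics a shared static buffer and applies a toy rule (dropping a trailing "s"), not the Porter algorithm.

```java
public class SynchronizedStemmerSketch {

    /** Hypothetical stand-in for a stemmer whose working state lives in a shared static buffer. */
    static class UnsafeToyStemmer {
        static char[] b = new char[0];

        String stem(String s) {
            b = s.toCharArray();                 // shared mutable state: unsafe without a lock
            int end = b.length;
            if (end > 3 && b[end - 1] == 's') {  // toy rule only, NOT the Porter algorithm
                end--;
            }
            return new String(b, 0, end);
        }
    }

    // One lock for all wrappers, because the underlying buffer is static.
    private static final Object LOCK = new Object();

    private final UnsafeToyStemmer delegate = new UnsafeToyStemmer();

    /** Stems a term while holding the global lock, so only one thread touches the buffer at a time. */
    public String stem(String s) {
        synchronized (LOCK) {
            return delegate.stem(s);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedStemmerSketch safe = new SynchronizedStemmerSketch();
        Runnable job = () -> {
            for (int i = 0; i < 10000; i++) {
                safe.stem("terms");
            }
        };
        Thread first = new Thread(job);
        Thread second = new Thread(job);
        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println(safe.stem("engines"));
    }
}
```

A single global lock is the simplest option but it removes any parallelism in stemming; whether that matters depends on how much of the indexing time is spent in the stemmer.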

Version:
$Revision: 1.10 $
Author:
Gianni Amati, modified into a TermPipeline and (Java) optimised by Craig Macdonald

Field Summary
static char[] b
          A buffer for the word to be stemmed.
 
Constructor Summary
PorterStemmer(TermPipeline next)
          Constructs an instance of the class, given the next component in the pipeline.
 
Method Summary
 void processTerm(java.lang.String t)
          Stems the given term.
 java.lang.String stem(java.lang.String s)
          Returns the stem of a given term.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

b

public static char[] b
A buffer for the word to be stemmed.

Constructor Detail

PorterStemmer

public PorterStemmer(TermPipeline next)
Constructs an instance of the class, given the next component in the pipeline.

Parameters:
next - TermPipeline the next component in the term pipeline.
Method Detail

processTerm

public void processTerm(java.lang.String t)
Stems the given term.

Specified by:
processTerm in interface TermPipeline
Parameters:
t - String the term to stem.
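
The constructor and processTerm together suggest a chain-of-stages design: each stage transforms a term and hands the result to the next component. The sketch below illustrates that pattern in a self-contained way. It assumes that processTerm forwards the stemmed term to the next stage, which this page does not state explicitly. The nested TermPipeline interface, Collector and ToyStemmer are hypothetical stand-ins, and the toy rule (dropping a trailing "s") is not the Porter algorithm.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {

    /** Hypothetical stand-in for the TermPipeline interface. */
    interface TermPipeline {
        void processTerm(String t);
    }

    /** Final stage: collects every term that reaches the end of the pipeline. */
    static class Collector implements TermPipeline {
        final List<String> terms = new ArrayList<>();

        public void processTerm(String t) {
            terms.add(t);
        }
    }

    /** Toy stemming stage: strips a trailing "s" (NOT the Porter algorithm), then forwards. */
    static class ToyStemmer implements TermPipeline {
        private final TermPipeline next;

        ToyStemmer(TermPipeline next) {
            this.next = next;
        }

        String stem(String s) {
            return (s.length() > 3 && s.endsWith("s")) ? s.substring(0, s.length() - 1) : s;
        }

        public void processTerm(String t) {
            if (t == null) {
                return;                     // assumption: null terms are dropped
            }
            next.processTerm(stem(t));      // pass the stemmed term down the pipeline
        }
    }

    /** Pushes the given terms through a stemmer stage and returns what the collector received. */
    static List<String> run(String... terms) {
        Collector sink = new Collector();
        TermPipeline stemmer = new ToyStemmer(sink);
        for (String t : terms) {
            stemmer.processTerm(t);
        }
        return sink.terms;
    }

    public static void main(String[] args) {
        System.out.println(run("engines", "index", "terms")); // prints [engine, index, term]
    }
}
```

With the real class, the equivalent wiring would be to pass the next pipeline component to the PorterStemmer constructor and then call processTerm for each term.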

stem

public java.lang.String stem(java.lang.String s)
Returns the stem of a given term.

Parameters:
s - String the term to be stemmed.
Returns:
String the stem of a given term.

Terrier Information Retrieval Platform 1.1.1. Copyright 2004-2007 University of Glasgow