java.lang.Object org.terrier.terms.Stopwords
public class Stopwords
Implements stopword removal as a TermPipeline object. The stopword list to load can be
passed to the constructor or read from the stopwords.filename property.
Note that this TermPipeline reads the stopword list using the system default encoding.
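The filtering role described above can be illustrated with a small, self-contained sketch. It does not use Terrier itself: `SketchTermPipeline`, `SketchStopwords` and `SketchCollector` are hypothetical stand-ins, and the assumption that a non-stopword is simply forwarded to the next component (and that `reset()` delegates to it) is inferred from this page rather than confirmed by it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for org.terrier.terms.TermPipeline, reduced to the
// two methods this page attributes to that interface.
interface SketchTermPipeline {
    void processTerm(String t);
    boolean reset();
}

// Hypothetical stage: stopwords are dropped, every other term is handed on
// to the next component (assumed behaviour, not taken from Terrier's source).
class SketchStopwords implements SketchTermPipeline {
    private final SketchTermPipeline next;
    private final Set<String> stopWords = new HashSet<>();

    SketchStopwords(SketchTermPipeline next, String... words) {
        this.next = next;
        stopWords.addAll(Arrays.asList(words));
    }

    public boolean isStopword(String t) {
        return stopWords.contains(t);
    }

    public void clear() {
        stopWords.clear();
    }

    @Override
    public void processTerm(String t) {
        if (!isStopword(t)) {
            next.processTerm(t);
        }
    }

    @Override
    public boolean reset() {
        return next.reset(); // assumption: reset is passed down the pipeline
    }
}

// Terminal stage that records whatever terms reach it.
class SketchCollector implements SketchTermPipeline {
    final List<String> terms = new ArrayList<>();

    @Override
    public void processTerm(String t) {
        terms.add(t);
    }

    @Override
    public boolean reset() {
        terms.clear();
        return true;
    }
}

class StopwordPipelineSketch {
    // Runs the input terms through the stopword stage and returns the survivors.
    static List<String> filter(String[] stop, String[] input) {
        SketchCollector sink = new SketchCollector();
        SketchStopwords stage = new SketchStopwords(sink, stop);
        for (String t : input) {
            stage.processTerm(t);
        }
        return sink.terms;
    }

    public static void main(String[] args) {
        // prints [history, terrier]
        System.out.println(filter(new String[] {"the", "of"},
                new String[] {"the", "history", "of", "terrier"}));
    }
}
```

In this sketch the stopword stage never inspects what comes after it; it only decides whether to call `next.processTerm`, which is why such stages can be chained in any order.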
Properties
stopwords.filename - the stopword list to load when none is passed to the constructor.
Field Summary | | |
---|---|---|
protected static boolean | INTERN_STOPWORDS | |
protected TermPipeline | next | The next component in the term pipeline. |
protected gnu.trove.THashSet<java.lang.String> | stopWords | The hashset that contains all the stop words. |
Constructor Summary | |
---|---|
Stopwords(TermPipeline _next) | Makes a new stopword term pipeline object. |
Stopwords(TermPipeline _next, java.lang.String StopwordsFile) | Makes a new stopword term pipeline object. |
Stopwords(TermPipeline _next, java.lang.String[] StopwordsFiles) | Makes a new stopword term pipeline object. |
Method Summary | | |
---|---|---|
void | clear() | Clears all stopwords from this stopword list object. |
boolean | isStopword(java.lang.String t) | Returns true if term t is a stopword. |
void | loadStopwordsList(java.lang.String stopwordsFilename) | Loads the specified stopwords file. |
void | loadStopwordsList(java.lang.String[] StopwordsFiles) | Loads the specified stopwords files. |
void | processTerm(java.lang.String t) | Checks to see if term t is a stopword. |
boolean | reset() | Implements the specific reset option needed to implement a query- or document-oriented policy. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final boolean INTERN_STOPWORDS
protected final TermPipeline next
protected final gnu.trove.THashSet<java.lang.String> stopWords
Constructor Detail |
---|
public Stopwords(TermPipeline _next)
_next - TermPipeline the next component in the term pipeline.

public Stopwords(TermPipeline _next, java.lang.String StopwordsFile)
_next - TermPipeline the next component in the term pipeline.
StopwordsFile - The filename, or comma-separated filenames, of the stopword list(s) to use. The value is split on commas and passed to the (TermPipeline, String[]) constructor.

public Stopwords(TermPipeline _next, java.lang.String[] StopwordsFiles)
_next - TermPipeline the next component in the term pipeline.
StopwordsFiles - Array of filenames of stopword lists.

Method Detail |
---|
public void loadStopwordsList(java.lang.String[] StopwordsFiles)
StopwordsFiles - Array of filenames of stopword lists.

public void loadStopwordsList(java.lang.String stopwordsFilename)
stopwordsFilename - The filename of the file to use as the stopwords list.

public void clear()

public boolean isStopword(java.lang.String t)

public void processTerm(java.lang.String t)
Specified by: processTerm in interface TermPipeline
t - The term to be checked.

public boolean reset()
Specified by: reset in interface TermPipeline
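
The comma-splitting described for the String constructor and the file loading done by loadStopwordsList can be sketched roughly as follows. This is not Terrier's implementation: the class `StopwordLoadingSketch` and its methods are hypothetical, and the file format (one stopword per line, blank lines ignored) and the trimming of whitespace around commas are assumptions that this page does not state. Only the use of the system default encoding comes from the class description.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

class StopwordLoadingSketch {
    // Splits a comma-separated filename string into individual filenames.
    // Trimming whitespace around the commas is an assumption of this sketch.
    static String[] splitFilenames(String stopwordsFile) {
        return stopwordsFile.trim().split("\\s*,\\s*");
    }

    // Reads one stopword per line (assumed format), skipping blank lines.
    static void load(BufferedReader reader, Set<String> target) {
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    target.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads every named file using the system default encoding.
    static Set<String> loadFiles(String[] filenames) {
        Set<String> words = new HashSet<>();
        for (String name : filenames) {
            try (BufferedReader reader = Files.newBufferedReader(
                    Paths.get(name), Charset.defaultCharset())) {
                load(reader, words);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // An in-memory reader stands in for a stopword file here.
        Set<String> words = new HashSet<>();
        load(new BufferedReader(new StringReader("the\n\n of \nand\n")), words);
        System.out.println(words.size());                           // prints 3
        System.out.println(splitFilenames("a.txt, b.txt").length);  // prints 2
    }
}
```

Because the default encoding differs between machines, a list saved in one encoding may load differently elsewhere; passing an explicit Charset would avoid that, but it would no longer mirror the behaviour described on this page.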