[TR-188] Stopwords incorrectly handles reset Created: 19/Jan/12 Updated: 13/Apr/12 Resolved: 13/Apr/12
The implementation of the reset method in class org.terrier.terms.Stopwords doesn't call the reset method of the next class in the pipeline.
This causes all input to be treated as belonging to the same document, which may affect retrieval quality.
With an empty stopword list, precision changed by over 7% when Stopwords was inserted at the beginning of the pipeline (the test was run on a custom collection with specific settings).
The attached patch fixes the problem, simply by calling next.reset().
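The attached patch itself is not reproduced here. As a rough illustration of the idea only, the sketch below uses a hypothetical Stage interface and hypothetical StopwordFilter and DocumentCollector classes. These are stand-ins written for this example, not Terrier's actual classes or the patch's code. It assumes that reset() marks a document boundary and that each stage holds a reference to the next one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical pipeline stage; a stand-in for illustration, not Terrier's interface. */
interface Stage {
    void processTerm(String term);
    boolean reset();
}

/** Hypothetical final stage: groups terms into documents, reset() closes a document. */
class DocumentCollector implements Stage {
    final List<List<String>> documents = new ArrayList<>();
    private List<String> current = new ArrayList<>();

    public void processTerm(String term) {
        current.add(term);
    }

    public boolean reset() {
        // Close the current document and start a fresh one.
        documents.add(current);
        current = new ArrayList<>();
        return true;
    }
}

/** Hypothetical stopword filter that forwards reset() to the next stage. */
class StopwordFilter implements Stage {
    private final Stage next;
    private final Set<String> stopwords;

    StopwordFilter(Stage next, Set<String> stopwords) {
        this.next = next;
        this.stopwords = stopwords;
    }

    public void processTerm(String term) {
        // Drop stopwords, pass everything else downstream.
        if (!stopwords.contains(term)) {
            next.processTerm(term);
        }
    }

    public boolean reset() {
        // The essential step: without this call, downstream stages
        // never see the document boundary.
        return next.reset();
    }
}
```

In this sketch, if StopwordFilter.reset() returned without calling next.reset(), the collector would never close a document and all terms would pile up in a single buffer, which mirrors the behaviour described in this issue.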
|Comment by Craig Macdonald [ 20/Jan/12 ]|
|Comment by Craig Macdonald [ 13/Apr/12 ]|
Committed for 3.6