[TR-188] Stopwords incorrectly handles reset Created: 19/Jan/12  Updated: 13/Apr/12  Resolved: 13/Apr/12

Status: Resolved
Project: Terrier Core
Component/s: None
Affects Version/s: 3.5
Fix Version/s: 3.6

Type: Bug
Priority: Major
Reporter: Steven
Assignee: Craig Macdonald
Resolution: Fixed  
Labels: None

Attachments: File Stopwords.diff    

The reset() method of class org.terrier.terms.Stopwords does not call the reset() method of the next class in the pipeline.
As a result, the later stages of the pipeline are never reset between documents, so all input is treated as belonging to the same document, which may reduce retrieval quality.

With an empty stopword list, precision changed by over 7% when Stopwords was inserted at the beginning of the pipeline (the test was run on a custom collection with specific settings).

The attached patch fixes the problem, simply by calling next.reset().
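
For illustration, the behaviour described above can be reproduced with a minimal, self-contained sketch. This is not the Terrier source or the attached patch: Stage, PositionCounter, StopwordFilter, ResetDemo and the propagateReset flag are hypothetical stand-ins, assumed here only to show why a stage that swallows reset() hides document boundaries from the stages after it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a term pipeline stage; not Terrier's actual interface.
interface Stage {
    void processTerm(String term);
    boolean reset();
}

// Terminal stage holding per-document state: the position of each term
// since the last reset.
class PositionCounter implements Stage {
    private int position = 0;
    final List<String> emitted = new ArrayList<>();

    public void processTerm(String term) {
        emitted.add(term + "@" + position);
        position++;
    }

    public boolean reset() {
        position = 0; // a new document starts here
        return true;
    }
}

// Simplified stopword stage. propagateReset == false mimics the reported bug;
// propagateReset == true mimics the fix of forwarding the call to next.reset().
class StopwordFilter implements Stage {
    private final Set<String> stopwords;
    private final Stage next;
    private final boolean propagateReset;

    StopwordFilter(Set<String> stopwords, Stage next, boolean propagateReset) {
        this.stopwords = stopwords;
        this.next = next;
        this.propagateReset = propagateReset;
    }

    public void processTerm(String term) {
        // Drop stopwords, pass everything else down the pipeline.
        if (!stopwords.contains(term)) {
            next.processTerm(term);
        }
    }

    public boolean reset() {
        // Without forwarding, the downstream stage never sees the boundary.
        return propagateReset ? next.reset() : true;
    }
}

class ResetDemo {
    // Feeds two "documents" (a b | c) through the pipeline with an empty stopword list.
    static List<String> run(boolean propagateReset) {
        PositionCounter sink = new PositionCounter();
        Stage pipeline = new StopwordFilter(new HashSet<>(), sink, propagateReset);
        pipeline.processTerm("a");
        pipeline.processTerm("b");
        pipeline.reset();
        pipeline.processTerm("c");
        return sink.emitted;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [a@0, b@1, c@2]: "c" continues the first document
        System.out.println(run(true));  // [a@0, b@1, c@0]: "c" starts a new document
    }
}
```

Even with an empty stopword list the filter changes what the downstream stage sees, which is consistent with the report that merely inserting Stopwords at the start of the pipeline affected the results.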

Comment by Craig Macdonald [ 20/Jan/12 ]

Good catch!

Comment by Craig Macdonald [ 13/Apr/12 ]

Committed for 3.6

Generated at Mon Sep 28 13:26:34 BST 2020 using JIRA 7.1.1#71004-sha1:d6b2c0d9b7051e9fb5e4eb8ce177ca56d91d7bd8.