[TR-340] TaggedDocument.saveToAbstract is expensive even when no abstracts enabled
Created: 29/Jul/15  Updated: 09/Nov/15  Resolved: 09/Nov/15

Status: Resolved
Project: Terrier Core
Component/s: .indexing
Affects Version/s: 4.0
Fix Version/s: 4.1

Type: Improvement
Priority: Trivial
Reporter: Craig Macdonald
Assignee: Richard McCreadie
Resolution: Fixed  
Labels: None


 Description   
saveToAbstract() is invoked for every token in the collection, so it upper-cases the tag name once per token. Two questions: (a) could the upper-casing be done less often? (b) Can we short-circuit out of this method when there are no abstract tags?

 Comments   
Comment by Richard McCreadie [ 09/Nov/15 ]

Committed fix in 64ac7c07. TestTaggedDocument unit test passes.

The method now checks a boolean to decide whether to consider abstracts at all. The tag name is upper-cased only once, when a new tag is detected. Added a map lookup on the tag name.
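
For illustration only, the shape of such a change might look like the sketch below. This is not the code from the commit: the class name AbstractCollectorSketch, its fields, the getAbstract() helper and the upperCaseCalls counter are all hypothetical, and the real TaggedDocument may differ in every detail. It only shows the three ideas named above: a boolean guard, upper-casing once per tag change, and a map lookup keyed on the upper-cased tag name.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch; NOT Terrier's TaggedDocument implementation. */
class AbstractCollectorSketch {
    // Upper-cased abstract tag name -> text collected for that abstract.
    private final Map<String, StringBuilder> abstracts = new HashMap<>();
    // Computed once, so the per-token path can bail out immediately.
    private final boolean considerAbstracts;
    // Cache of the last tag seen and the buffer it resolved to (null if none).
    private String lastRawTag = null;
    private StringBuilder currentBuffer = null;
    // Instrumentation for this sketch only: counts toUpperCase calls per token path.
    int upperCaseCalls = 0;

    AbstractCollectorSketch(String... abstractTagNames) {
        for (String name : abstractTagNames) {
            abstracts.put(name.toUpperCase(Locale.ROOT), new StringBuilder());
        }
        considerAbstracts = !abstracts.isEmpty();
    }

    /** Called once per token; rawTag is the tag the token occurred in. */
    void saveToAbstract(String token, String rawTag) {
        if (!considerAbstracts) {
            return; // short circuit: no abstract tags configured
        }
        if (!Objects.equals(rawTag, lastRawTag)) {
            // New tag detected: upper-case once and resolve the buffer via the map.
            lastRawTag = rawTag;
            if (rawTag == null) {
                currentBuffer = null;
            } else {
                upperCaseCalls++;
                currentBuffer = abstracts.get(rawTag.toUpperCase(Locale.ROOT));
            }
        }
        if (currentBuffer != null) {
            if (currentBuffer.length() > 0) {
                currentBuffer.append(' ');
            }
            currentBuffer.append(token);
        }
    }

    /** Returns the collected text for a tag, or null if it is not an abstract tag. */
    String getAbstract(String tagName) {
        StringBuilder sb = abstracts.get(tagName.toUpperCase(Locale.ROOT));
        return sb == null ? null : sb.toString();
    }
}
```

In this sketch, an instance built with no abstract tag names returns from saveToAbstract() before touching the tag name at all, and consecutive tokens inside the same tag reuse the cached buffer instead of upper-casing again.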

Generated at Wed Dec 13 08:50:36 GMT 2017 using JIRA 7.1.1#71004-sha1:d6b2c0d9b7051e9fb5e4eb8ce177ca56d91d7bd8.