[TR-340] TaggedDocument.saveToAbstract is expensive even when no abstracts enabled Created: 29/Jul/15  Updated: 09/Nov/15  Resolved: 09/Nov/15

Status: Resolved
Project: Terrier Core
Component/s: .indexing
Affects Version/s: 4.0
Fix Version/s: 4.1

Type: Improvement Priority: Trivial
Reporter: Craig Macdonald Assignee: Richard McCreadie
Resolution: Fixed  
Labels: None

saveToAbstract() is invoked for every token in the collection, so the tag name is upper-cased once per token. (a) Could the upper-casing be done less often? (b) Can we short-circuit out of this method when no abstract tags are configured?

Comment by Richard McCreadie [ 09/Nov/15 ]

Committed fix in 64ac7c07. TestTaggedDocument unit test passes.

saveToAbstract() now checks a boolean to decide whether abstracts need to be considered at all. The tag name is upper-cased only once, when a new tag is detected. Added a map lookup on the tag name.
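
For illustration only, the three changes described in this comment could look roughly like the sketch below. This is not the code from commit 64ac7c07: the class name AbstractSaverSketch, its fields, the method signature, and the upperCaseCalls counter are all hypothetical and exist only to make the idea concrete. It assumes abstract tag names are matched case-insensitively by upper-casing them.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch; not Terrier's actual TaggedDocument implementation. */
class AbstractSaverSketch {
    /** False when no abstract tags are configured, enabling the short circuit. */
    private final boolean considerAbstracts;
    /** Upper-cased abstract tag name -> text collected for that abstract. */
    private final Map<String, StringBuilder> abstracts = new HashMap<>();
    /** Raw (not upper-cased) tag seen on the previous call. */
    private String lastRawTag = null;
    /** Buffer for the current tag, or null if the tag is not an abstract tag. */
    private StringBuilder currentBuffer = null;
    /** Illustration only: counts upper-casing operations done per tag change. */
    int upperCaseCalls = 0;

    AbstractSaverSketch(Collection<String> abstractTags) {
        for (String tag : abstractTags) {
            abstracts.put(tag.toUpperCase(Locale.ROOT), new StringBuilder());
        }
        considerAbstracts = !abstracts.isEmpty();
    }

    /** Called once per token; rawTag is the tag the token appeared in. */
    void saveToAbstract(String token, String rawTag) {
        // (b) Short circuit: nothing to do if no abstracts are enabled.
        if (!considerAbstracts || rawTag == null) {
            return;
        }
        // (a) Upper-case and look up the buffer only when the tag changes.
        if (!rawTag.equals(lastRawTag)) {
            lastRawTag = rawTag;
            upperCaseCalls++;
            currentBuffer = abstracts.get(rawTag.toUpperCase(Locale.ROOT));
        }
        if (currentBuffer == null) {
            return; // current tag is not an abstract tag
        }
        if (currentBuffer.length() > 0) {
            currentBuffer.append(' ');
        }
        currentBuffer.append(token);
    }

    /** Returns the collected abstract text, or null for an unknown tag. */
    String getAbstract(String tag) {
        StringBuilder sb = abstracts.get(tag.toUpperCase(Locale.ROOT));
        return sb == null ? null : sb.toString();
    }
}
```

In this sketch, a run of tokens inside the same tag costs one string comparison per token, and the upper-casing plus map lookup happens only at tag boundaries. With an empty tag list, every call returns immediately.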
