[TR-334] Terrier can not parse topic file when it contains only IDs (not English words) Created: 03/Apr/15 Updated: 01/Dec/15 Resolved: 01/Dec/15
|Reporter:||shadi saleh||Assignee:||Craig Macdonald|
I am indexing medical documents. Instead of indexing the English terms, I annotate the terms in both the documents and the queries with IDs.
So the text consists of sequences of IDs such as "C092736".
Terrier cannot index those terms. Can I override that?
|Comment by Craig Macdonald [ 06/Nov/15 ]|
Tagging for 4.1.
|Comment by Craig Macdonald [ 01/Dec/15 ]|
Hi. This isn't a bug per se; the behaviour is by design. You need to override EnglishTokeniser so that its check() method isn't called on each token.
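To illustrate the suggestion above, here is a minimal sketch of tokenisation logic that keeps every alphanumeric token and applies no check()-style filter, so IDs such as "C092736" survive. It is deliberately standalone and does not use Terrier's classes: the class name IdPreservingTokeniser and its tokenise(String) method are hypothetical, and lowercasing the tokens is an assumption. In a real setup the same logic would presumably live in a subclass of Terrier's tokeniser, and how that subclass is registered with Terrier is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical standalone sketch; not part of Terrier's API.
final class IdPreservingTokeniser {

    // Splits text on any non-alphanumeric character and keeps every token,
    // with no digit-count or repeated-character filtering.
    static List<String> tokenise(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // Token boundary reached: emit the token (lowercased by assumption).
                tokens.add(current.toString().toLowerCase(Locale.ROOT));
                current.setLength(0);
            }
        }
        // Emit the trailing token, if any.
        if (current.length() > 0) {
            tokens.add(current.toString().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [c092736, c0004238, aspirin]
        System.out.println(tokenise("C092736 C0004238 aspirin"));
    }
}
```

The same tokeniser has to be applied to both the documents and the topic file, otherwise the annotated query IDs will not match the indexed terms.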