[TR-324] Evaluation fails to parse .res files
Created: 15/Jan/15  Updated: 01/Dec/15  Resolved: 01/Dec/15

Status: Resolved
Project: Terrier Core
Component/s: .evaluation
Affects Version/s: 4.0
Fix Version/s: 4.1

Type: Bug
Priority: Major
Reporter: Ian Soboroff
Assignee: Richard McCreadie
Resolution: Fixed  
Labels: None

Attachments: PL2c10.99_0.res, PL2c10.99_0.res.settings, Terrier324.patch, TF_IDF_1.res.settings

This is running the quick-start example, except over an index of TREC-8 adhoc:

% ./bin/trec_terrier.sh -r -Dtrec.model=PL2 -c 10.99 -Dtrec.topics=./cd45-cr/topics.401-450
INFO - Finished topics, executed 50 queries in 1.393 seconds, results written to /Users/soboroff/terrier-4.0/var/results/PL2c10.99_0.res

% ./bin/trec_terrier.sh -e -Dtrec.qrels=./cd45-cr/adhoc.qrels
Setting TERRIER_HOME to /Users/soboroff/terrier-4.0
INFO - Evaluating result file: /Users/soboroff/terrier-4.0/var/results/PL2c10.99_0.res
A problem occurred: java.lang.NumberFormatException: For input string: "10.91742797093212"
java.lang.NumberFormatException: For input string: "10.91742797093212"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at org.terrier.evaluation.AdhocEvaluation.evaluate(AdhocEvaluation.java:175)
at org.terrier.applications.TrecTerrier.run(TrecTerrier.java:525)
at org.terrier.applications.TrecTerrier.applyOptions(TrecTerrier.java:588)
at org.terrier.applications.TrecTerrier.main(TrecTerrier.java:245)

Comment by Ian Soboroff [ 15/Jan/15 ]

Oops, attached an extra file for another run I was trying.

Comment by Ian Soboroff [ 15/Jan/15 ]

It is unhappy because the .res line in question is missing a docid.
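For illustration, here is a minimal sketch of why an empty docno could produce this particular exception. It assumes the evaluator splits each .res line on whitespace and reads the rank from the fourth field; that is an inference from the stack trace, not a copy of the AdhocEvaluation code. The class name, the docno FBIS3-10082, and the run tag are made up for the example.

```java
public class ResLineDemo {
    /** Splits a run-file line on whitespace, as a simple TREC-style parser might. */
    public static String[] fields(String line) {
        return line.trim().split("\\s+");
    }

    /** True if the line has the six expected fields and an integer in the rank position. */
    public static boolean isWellFormed(String line) {
        String[] f = fields(line);
        if (f.length != 6) {
            return false;
        }
        try {
            Integer.parseInt(f[3]); // rank is expected at index 3
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical lines: query id, Q0, docno, rank, score, run tag.
        String good = "401 Q0 FBIS3-10082 0 10.91742797093212 PL2c10.99";
        // Same line with an empty docno: every later field shifts left by one.
        String bad = "401 Q0  0 10.91742797093212 PL2c10.99";

        System.out.println("good fields: " + fields(good).length);
        System.out.println("bad fields:  " + fields(bad).length);
        // With the docno missing, index 3 now holds the score rather than the rank.
        System.out.println("field[3] of bad line: " + fields(bad)[3]);
        try {
            Integer.parseInt(fields(bad)[3]);
        } catch (NumberFormatException e) {
            System.out.println("parse failure: " + e.getMessage());
        }
    }
}
```

Under that assumption, the score lands where the rank should be, which matches the input string quoted in the NumberFormatException above.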

Comment by Craig Macdonald [ 15/Jan/15 ]

I don't suppose you know which document had an empty DOCNO tag?

Perhaps you listed a readme file in the collection.spec file that got indexed?

Use bin/trec_terrier.sh --printmeta to see the docnos contained in the index in order.



Comment by Ian Soboroff [ 16/Jan/15 ]

Yes, indexing some non-TREC docs listed in the collection.spec was the initial cause. But maybe, when writing out TREC runfiles, something suitably diagnostic could be output in that field if a docno is absent?
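A rough sketch of the idea, assuming the writer has both the docno string and the internal docid at hand when it emits a line. The class, the method names, and the MISSING_DOCNO_ prefix are all hypothetical; none of them is taken from Terrier or from the attached patch.

```java
public class RunLineWriter {
    /** Hypothetical placeholder prefix, chosen only for this example. */
    static final String MISSING_PREFIX = "MISSING_DOCNO_";

    /** Returns the docno, or a diagnostic placeholder built from the docid if it is absent. */
    public static String safeDocno(String docno, int docid) {
        if (docno == null || docno.trim().isEmpty()) {
            return MISSING_PREFIX + docid;
        }
        return docno.trim();
    }

    /** Formats one TREC-style run line: qid Q0 docno rank score tag. */
    public static String formatLine(String qid, String docno, int docid,
                                    int rank, double score, String tag) {
        return qid + " Q0 " + safeDocno(docno, docid) + " " + rank + " " + score + " " + tag;
    }

    public static void main(String[] args) {
        // An empty docno is replaced, so the line keeps all six fields.
        System.out.println(formatLine("401", "", 1234, 0, 10.5, "demoRun"));
        // A normal docno passes through unchanged.
        System.out.println(formatLine("401", "DOC-1", 1235, 1, 9.25, "demoRun"));
    }
}
```

Writing a placeholder keeps the field count constant, so a downstream evaluator would still parse the line, and the embedded docid points at the document that lacked a DOCNO.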

Comment by Ian Soboroff [ 16/Jan/15 ]

I'll try to cook up a patch, time permitting.

Comment by Richard McCreadie [ 23/Nov/15 ]

Patch to add diagnostic DOCNOs.

Comment by Richard McCreadie [ 01/Dec/15 ]

Fixed in 730a83ec.
