Terrier Users :  Terrier Forum terrier.org
General discussion about using/developing applications using Terrier 
Process AQUAINT for Robust05
Posted by: khui
Date: June 12, 2017 01:41PM

Hi,

I would like to process the AQUAINT dataset for Robust05. Can I use the TRECCollection class directly, or should I implement a separate parser for the documents?

In addition, given the availability of AQUAINT in our group, I am actually using the GIGAWORD5 corpus, which is supposed to be a superset of AQUAINT. Is there anything I should pay special attention to when indexing this corpus with Terrier?

Thanks in advance.

Best,
Kai



Edited 1 time(s). Last edit at 06/12/2017 01:43PM by khui.

Re: Process AQUAINT for Robust05
Posted by: craigm
Date: June 20, 2017 11:04AM

My recollection is that AQUAINT is indeed a TREC-formatted corpus, and can be used as such with the TRECCollection parser.

Is it not similar to Disk4&5? C.f.

[ir.dcs.gla.ac.uk]

We haven't got Gigaword5, so I cannot comment on that.

Craig
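
A quick way to check whether a corpus file looks TREC-formatted before pointing TRECCollection at it is to scan it for <DOC> blocks and their <DOCNO> tags. The following is only a rough Python sketch, not part of Terrier: the function name check_trec_markup and the sample documents are made up for illustration, and it assumes one <DOC>...</DOC> element per document with the identifier inside <DOCNO>...</DOCNO>. A corpus that keeps the identifier as an attribute on <DOC> instead (worth checking for Gigaword, I have not verified this) would show up in the "missing" count and would likely need converting or a different tag configuration.

```python
import re

# Assumption: TREC-style SGML, one <DOC>...</DOC> element per document,
# with the identifier inside <DOCNO>...</DOCNO>.
DOC_RE = re.compile(r"<DOC\b[^>]*>(.*?)</DOC>", re.DOTALL | re.IGNORECASE)
DOCNO_RE = re.compile(r"<DOCNO>\s*(.*?)\s*</DOCNO>", re.DOTALL | re.IGNORECASE)


def check_trec_markup(text):
    """Return (docnos, missing): the DOCNO values found, and the number
    of <DOC> blocks that contain no <DOCNO> tag."""
    docnos, missing = [], 0
    for body in DOC_RE.findall(text):
        match = DOCNO_RE.search(body)
        if match:
            docnos.append(match.group(1))
        else:
            missing += 1
    return docnos, missing


# Made-up sample: the first document has a <DOCNO> tag, the second
# carries its identifier only as an attribute on <DOC>.
sample = """<DOC>
<DOCNO> EXAMPLE-0001 </DOCNO>
<TEXT>First made-up document.</TEXT>
</DOC>
<DOC id="EXAMPLE-0002" type="story">
<TEXT>Second made-up document.</TEXT>
</DOC>
"""

docnos, missing = check_trec_markup(sample)
print(docnos, missing)
```

If every block reports a DOCNO, the file is at least structurally similar to Disk4&5 and the standard TREC setup is a reasonable starting point. To run it on a real file, read the (decompressed) file contents into a string and pass that to check_trec_markup.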
