Re: What information does each index file store?
Date: December 07, 2017 06:08PM
.docid is called the document index in later versions of Terrier - essentially, it holds the document lengths.
.lexhash stores the offset of each starting letter in the lexicon, to speed up term lookups.
docpointers.col wasn't of much use. You can ignore it ;-)
Again, these files are from a very old version of Terrier. The filenames are more meaningful in newer versions.
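
To illustrate the idea behind .lexhash, here is a rough sketch in Python. It is not Terrier's actual code or file format: the function names are made up for this example, and list indices stand in for the offsets that the real file would keep into the lexicon. It only shows why a per-letter offset table speeds up lookups in a sorted lexicon: the search is limited to the terms that share the query's first letter.

```python
from bisect import bisect_left

def build_letter_offsets(lexicon):
    """Map each starting letter to the [start, end) range it occupies
    in a sorted lexicon of non-empty terms (hypothetical helper)."""
    offsets = {}
    for i, term in enumerate(lexicon):
        letter = term[0]
        if letter not in offsets:
            offsets[letter] = [i, i + 1]   # first term with this letter
        else:
            offsets[letter][1] = i + 1     # extend the range
    return offsets

def lookup(lexicon, offsets, term):
    """Return the position of term in the lexicon, or -1 if absent.
    Only the slice for the term's first letter is searched."""
    if not term or term[0] not in offsets:
        return -1
    start, end = offsets[term[0]]
    pos = bisect_left(lexicon, term, start, end)
    return pos if pos < end and lexicon[pos] == term else -1

# Toy lexicon; a real one would be read from the index files.
lexicon = sorted(["apple", "avocado", "banana", "cherry", "cranberry", "date"])
offsets = build_letter_offsets(lexicon)
```

With this toy lexicon, lookup(lexicon, offsets, "cranberry") only has to search the two terms starting with "c" rather than the whole list.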