Uploaded image for project: 'Terrier Core'
  1. Terrier Core
  2. TR-519

getId() flaw in (AND|Field|Block|BlockField)IterablePosting after EOL

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 5.0
    • Fix Version/s: 5.1
    • Component/s: .structures
    • Labels:
      None

      Description

      In the BlockIterablePosting class (package org.terrier.structures.postings.bit),
      the getId() public method is inherited from the BasicPostingImpl class (package org.terrier.structures.postings), where it returns the value of the protected member variable id.

      In the implementation of the next() method in the BlockIterablePosting class, if we advance after the end of the posting list, the id variable value is not updated, just returning EOL.

      This means that, when we next() or next(target) at the end of the posting list, and then we invoke getId(), the expected EOL value is not returned, but the useless value of the last docid we moved into.

      A correct implementation of the next() method in the BlockIterablePosting class could be the following (next(target) untouched since it uses next()):

      @Override
      public int next() throws IOException
      {
      if (numEntries == 0) {
      id = END_OF_LIST;
      } else {
      id += bitFileReader.readGamma();
      tf = bitFileReader.readUnary();
      //TODO: this has a memory allocation for every posting in the posting list. can we reuse an array?
      positions = new int[bitFileReader.readUnary() -1];
      if (positions.length == 0)
      return id;
      positions[0] = bitFileReader.readGamma() -1;
      for(int i=1;i<positions.length;i++)
      positions[i] = positions[i-1] + bitFileReader.readGamma();

      numEntries--;
      }
      return id;
      }

        Attachments

          Activity

          Hide
          craigm Craig Macdonald added a comment -

          Nicola,

          Block, BlockField & Field were affected. I think this patch fixes all.

          Craig

          Show
          craigm Craig Macdonald added a comment - Nicola, Block, BlockField & Field were affected. I think this patch fixes all. Craig
          Hide
          craigm Craig Macdonald added a comment -

          This causes
          org.terrier.tests.BlockMaxBlocksShakespeareEndToEndTest : testBasicClassical
          to fail.

          Craig

          Show
          craigm Craig Macdonald added a comment - This causes org.terrier.tests.BlockMaxBlocksShakespeareEndToEndTest : testBasicClassical to fail. Craig
          Hide
          craigm Craig Macdonald added a comment - - edited

          ANDIterablePosting and ORIterablePosting have similar issues, whereby if no posting matches, subsequent calls to getId() will return -1 instead of EOL.

          Show
          craigm Craig Macdonald added a comment - - edited ANDIterablePosting and ORIterablePosting have similar issues, whereby if no posting matches, subsequent calls to getId() will return -1 instead of EOL.

            People

            • Assignee:
              craigm Craig Macdonald
              Reporter:
              nicola.tonellotto Nicola Tonellotto
            • Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated: