Uploaded image for project: 'Terrier Core'
  1. Terrier Core
  2. TR-519

getId() should return EOL in (AND|Field|Block|BlockField)IterablePosting after EOL

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 5.0
    • Fix Version/s: 5.1
    • Component/s: .structures
    • Labels:
      None

      Description

      In the BlockIterablePosting class (package org.terrier.structures.postings.bit),
      the getId() public method is inherited from the BasicPostingImpl class (package org.terrier.structures.postings), where it returns the value of the protected member variable id.

      In the implementation of the next() method in the BlockIterablePosting class, if we advance after the end of the posting list, the id variable value is not updated, just returning EOL.

      This means that, when we next() or next(target) at the end of the posting list, and then we invoke getId(), the expected EOL value is not returned, but the useless value of the last docid we moved into.

      A correct implementation of the next() method in the BlockIterablePosting class could be the following (next(target) untouched since it uses next()):

      @Override
      public int next() throws IOException
      {
      if (numEntries == 0) {
      id = END_OF_LIST;
      } else {
      id += bitFileReader.readGamma();
      tf = bitFileReader.readUnary();
      //TODO: this has a memory allocation for every posting in the posting list. can we reuse an array?
      positions = new int[bitFileReader.readUnary() -1];
      if (positions.length == 0)
      return id;
      positions[0] = bitFileReader.readGamma() -1;
      for(int i=1;i<positions.length;i++)
      positions[i] = positions[i-1] + bitFileReader.readGamma();

      numEntries--;
      }
      return id;
      }

        Attachments

          Activity

          nicola.tonellotto Nicola Tonellotto created issue -
          craigm Craig Macdonald made changes -
          Field Original Value New Value
          Summary Docid flaw in BlockIterablePosting Docid flaw in (Field|Block|BlockField)IterablePosting
          craigm Craig Macdonald made changes -
          Attachment TREC-519.patch [ 10691 ]
          Hide
          craigm Craig Macdonald added a comment -

          Nicola,

          Block, BlockField & Field were affected. I think this patch fixes all.

          Craig

          Show
          craigm Craig Macdonald added a comment - Nicola, Block, BlockField & Field were affected. I think this patch fixes all. Craig
          craigm Craig Macdonald made changes -
          Fix Version/s 5.1 [ 10193 ]
          Hide
          craigm Craig Macdonald added a comment -

          This causes
          org.terrier.tests.BlockMaxBlocksShakespeareEndToEndTest : testBasicClassical
          to fail.

          Craig

          Show
          craigm Craig Macdonald added a comment - This causes org.terrier.tests.BlockMaxBlocksShakespeareEndToEndTest : testBasicClassical to fail. Craig
          craigm Craig Macdonald made changes -
          Summary Docid flaw in (Field|Block|BlockField)IterablePosting Docid flaw in (AND|Field|Block|BlockField)IterablePosting
          Hide
          craigm Craig Macdonald added a comment - - edited

          ANDIterablePosting and ORIterablePosting have similar issues, whereby if no posting matches, subsequent calls to getId() will return -1 instead of EOL.

          Show
          craigm Craig Macdonald added a comment - - edited ANDIterablePosting and ORIterablePosting have similar issues, whereby if no posting matches, subsequent calls to getId() will return -1 instead of EOL.
          craigm Craig Macdonald made changes -
          Summary Docid flaw in (AND|Field|Block|BlockField)IterablePosting geId() flaw in (AND|Field|Block|BlockField)IterablePosting after EOL
          craigm Craig Macdonald made changes -
          Summary geId() flaw in (AND|Field|Block|BlockField)IterablePosting after EOL getId() flaw in (AND|Field|Block|BlockField)IterablePosting after EOL
          craigm Craig Macdonald made changes -
          Summary getId() flaw in (AND|Field|Block|BlockField)IterablePosting after EOL getId() should return EOL in (AND|Field|Block|BlockField)IterablePosting after EOL
          Hide
          craigm Craig Macdonald added a comment -

          Fixed in github

          Show
          craigm Craig Macdonald added a comment - Fixed in github
          craigm Craig Macdonald made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]

            People

            • Assignee:
              craigm Craig Macdonald
              Reporter:
              nicola.tonellotto Nicola Tonellotto
            • Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: