Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 3.0
    • Fix Version/s: 3.5
    • Component/s: None
    • Labels: None

      Description

      While looking at TR-107, I found a potential bug in DirectIndexInputStream (more precisely, I observed it through a method inherited from the parent class: BitPostingIndexInputStream.print()).

      Here is sample code that triggers the problem:

      {code:java}
      Document[] sourceDocs = new Document[]{
          new FileDocument("doc1", new ByteArrayInputStream("cats dogs horses".getBytes()), new EnglishTokeniser()),
          new FileDocument("doc2", new ByteArrayInputStream("chicken cats chicken chicken".getBytes()), new EnglishTokeniser())
      };

      Collection col = new CollectionDocumentList(sourceDocs, "filename");
      Indexer indexer = new BasicIndexer(ApplicationSetup.TERRIER_INDEX_PATH, ApplicationSetup.TERRIER_INDEX_PREFIX);

      indexer.createDirectIndex(new Collection[]{col});
      indexer.createInvertedIndex();

      Index index = Index.createIndex();
      BitPostingIndexInputStream bpiis = null;

      System.out.println("INVERTED ----------");
      bpiis = (BitPostingIndexInputStream) index.getIndexStructureInputStream("inverted");
      bpiis.print();

      System.out.println("DIRECT ----------");
      bpiis = (BitPostingIndexInputStream) index.getIndexStructureInputStream("direct");
      bpiis.print();
      {code}

      And here is the corresponding output:

      {quote}
      INVERTED ----------
      0 (0,1) (1,1) // cats -> doc1, doc2 -> OK
      1 (1,3) // chicken -> doc2 -> OK
      2 (0,1) // dogs -> doc1 -> OK
      3 (0,1) // horses -> doc1 -> OK
      DIRECT ----------
      0 (0,1) (1,1) (2,1) // doc1 -> cats, chicken, dogs -> NOT OK
      1 (2,1) (3,3) // doc2 -> dogs, horses -> NOT OK
      {quote}
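
The "NOT OK" annotations above rest on an assumption that the issue does not spell out: that term ids match the line order of the inverted listing (cats=0, chicken=1, dogs=2, horses=3), i.e. alphabetical rank. The following is a minimal sketch of the direct postings one would expect under that assumption. It uses plain Java collections and whitespace tokenisation only; `DirectPostingsSketch`, `assignTermIds` and `directPostings` are hypothetical names for illustration, not part of Terrier, and Terrier itself may assign term ids differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Terrier: models the reporter's expectation only.
class DirectPostingsSketch {

    // ASSUMPTION: a term's id is its alphabetical rank among all distinct terms.
    static Map<String, Integer> assignTermIds(List<String> docs) {
        TreeSet<String> vocabulary = new TreeSet<>();
        for (String doc : docs) {
            vocabulary.addAll(Arrays.asList(doc.split("\\s+")));
        }
        Map<String, Integer> ids = new TreeMap<>();
        int next = 0;
        for (String term : vocabulary) {
            ids.put(term, next++);
        }
        return ids;
    }

    // Formats one document's postings as "(termId,tf) ..." sorted by term id.
    static String directPostings(String doc, Map<String, Integer> ids) {
        TreeMap<Integer, Integer> tf = new TreeMap<>();
        for (String term : doc.split("\\s+")) {
            tf.merge(ids.get(term), 1, Integer::sum);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, Integer> e : tf.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('(').append(e.getKey()).append(',').append(e.getValue()).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same two documents as in the sample code of this issue.
        List<String> docs = Arrays.asList("cats dogs horses", "chicken cats chicken chicken");
        Map<String, Integer> ids = assignTermIds(docs);
        for (int docId = 0; docId < docs.size(); docId++) {
            System.out.println(docId + " " + directPostings(docs.get(docId), ids));
        }
    }
}
```

Under the stated assumption this prints `0 (0,1) (2,1) (3,1)` and `1 (0,1) (1,3)`, which is what the reporter's annotations compare the DIRECT output against.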

            Activity

            rodrygo Rodrygo L. T. Santos created issue -
            rodrygo Rodrygo L. T. Santos made changes -
            Field | Original Value | New Value
            Link | | This issue relates to TR-107 [ TR-107 ]
            rodrygo Rodrygo L. T. Santos made changes -
            Field | Original Value | New Value
            Status | Open [ 1 ] | Closed [ 6 ]
            Resolution | | Invalid [ 6 ]

              People

              • Assignee: craigm Craig Macdonald
              • Reporter: rodrygo Rodrygo L. T. Santos
