Terrier Core / TR-116

Lexicon not properly renamed on Windows, multipass indexing

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0
    • Fix Version/s: 3.5
    • Component/s: .indexing, .structures
    • Labels:
      None

      Description

      See http://terrier.org/forum//read.php?3,1493

      The problem is that data_1.lexicon.fsomapfile and data_1.tmplexicon.fsomapfile are not properly renamed, probably because a file is left open somewhere. An inspection of InvertedIndexBuilder suggests the problem is not there.

          Activity

          craigm Craig Macdonald added a comment -

          Lines 713 and 714 of LexiconBuilder should close lis1, not lis2. However, this method merges lexicons in pairs, whereas normally N lexicons are merged at once (an obscure property controls this). I'm not sure yet that this is the cause of the issue.
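
A close-ordering slip like the one described above is the kind of mistake that automatic resource management rules out. The following is a minimal sketch, assuming the slip was closing lis2 where lis1 was intended. TrackedStream, mergeBuggy and mergeFixed are hypothetical stand-ins and not Terrier code, and try-with-resources needs Java 7 or later.

```java
import java.io.Closeable;

final class PairwiseMergeSketch {
    /** Hypothetical stand-in for a lexicon input stream; records whether it was closed. */
    static final class TrackedStream implements Closeable {
        final String name;
        boolean closed = false;

        TrackedStream(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Assumed buggy pattern: lis2 is closed twice and lis1 is never closed. */
    static void mergeBuggy(TrackedStream lis1, TrackedStream lis2) {
        // ... pairwise merge work would go here ...
        lis2.close();
        lis2.close(); // slip: this should have been lis1.close()
    }

    /** Both streams are closed automatically, even if the merge throws. */
    static void mergeFixed(TrackedStream lis1, TrackedStream lis2) {
        try (TrackedStream first = lis1; TrackedStream second = lis2) {
            // ... pairwise merge work would go here ...
        }
    }
}
```

With mergeBuggy the first stream stays open after the call, which on Windows would be enough to block a later rename of its file. With mergeFixed both streams are closed by the time the method returns.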

          richardm Richard McCreadie added a comment -

          Tried fixing the wrong close in the lexicon builder, no joy. The failed move only occurs with inverted indexing (i.e. -i or -i -v). It still fails when there is only a single lexicon to merge.

          richardm Richard McCreadie added a comment -

          data.tmplexicon.fsomapfile is closed correctly. However, two read-write handles remain open on data.lexicon.fsomapfile around line 381 of InvertedIndexBuilder. As a result, FSOMapFileLexicon.deleteMapFileLexicon fails to delete data.lexicon.fsomapfile, and FSOMapFileLexicon.renameMapFileLexicon then fails as well.

          Test case: Windows 7 64-bit, trec_terrier.bat -i -v (building from an existing direct file)
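
The delete-then-rename failure described above can be made easier to diagnose by releasing every handle before the move and by using an API that reports why a move failed. The following is a minimal sketch under stated assumptions: RenameAfterCloseSketch, replaceLexicon and roundTrip are hypothetical names, not Terrier code, and only the two file names are taken from this issue. It uses java.nio.file.Files.move, which throws an exception carrying the reason for a failure, whereas java.io.File.renameTo only returns false.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RenameAfterCloseSketch {
    /** Writes the temporary lexicon, closes it, then moves it over the final name. */
    static Path replaceLexicon(Path dir, byte[] newContents) {
        Path tmp = dir.resolve("data.tmplexicon.fsomapfile");
        Path dst = dir.resolve("data.lexicon.fsomapfile");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                out.write(newContents);
            } // the handle on tmp is released here, before the move
            // Throws an IOException describing the cause if the move fails,
            // for example because another handle still holds dst open on Windows.
            Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
            return dst;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Small driver: replaces the lexicon in a fresh temp directory and reads it back. */
    static String roundTrip(String text) {
        try {
            Path dir = Files.createTempDirectory("lexicon-sketch");
            Path dst = replaceLexicon(dir, text.getBytes(StandardCharsets.UTF_8));
            return new String(Files.readAllBytes(dst), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On POSIX systems the move usually succeeds even while handles are open, which is consistent with the problem showing up only on Windows.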

          craigm Craig Macdonald added a comment -

          Found the problem, on line 188 of InvertedIndexBuilder.java.
          Replace

          int numberOfUniqueTerms = index.getLexicon().numberOfEntries();
          

          with

          int numberOfUniqueTerms = index.getCollectionStatistics().getNumberOfUniqueTerms();
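
One plausible reading of why this replacement helps: asking for the lexicon just to count its entries opens the lexicon file and leaves a handle behind, while the collection statistics already hold the count. The following is a toy model of that reading only. SketchIndex and SketchLexicon are hypothetical classes, and the handle counter is an assumption about the behaviour, not a description of Terrier's internals.

```java
final class HandleLeakSketch {
    /** Hypothetical lexicon; in the real system this would be backed by an open file. */
    static final class SketchLexicon {
        private final int entries;

        SketchLexicon(int entries) {
            this.entries = entries;
        }

        int numberOfEntries() {
            return entries;
        }
    }

    /** Hypothetical index that counts how many lexicon handles it has opened. */
    static final class SketchIndex {
        private final int uniqueTerms;
        private int openLexiconHandles = 0;

        SketchIndex(int uniqueTerms) {
            this.uniqueTerms = uniqueTerms;
        }

        /** Models opening the lexicon file: each call leaves one more handle open. */
        SketchLexicon getLexicon() {
            openLexiconHandles++;
            return new SketchLexicon(uniqueTerms);
        }

        /** Models reading the count from already-loaded statistics: no file is opened. */
        int getNumberOfUniqueTerms() {
            return uniqueTerms;
        }

        int openLexiconHandles() {
            return openLexiconHandles;
        }
    }
}
```

In this model both calls return the same number, but only the statistics route leaves the handle count at zero, so a later delete or rename of the lexicon file would not be blocked.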
          
          craigm Craig Macdonald added a comment -

          Fix committed to trunk.


            People

            • Assignee:
              craigm Craig Macdonald
            • Reporter:
              craigm Craig Macdonald