Terrier Users :  Terrier Forum terrier.org
General discussion about using/developing applications using Terrier 
Current Page: 1 of 3
_1 in document.fsarrayfile filename during indexing
Posted by: kenston ()
Date: June 29, 2011 12:59PM

Hi,

In Terrier 3.5,

The document.fsarrayfile file keeps ending up with an _1 in its name. It seems that during the post-indexing step, the system failed to rename it to the proper name.

Files:
barnes.direct.bf
barnes.lexicon.fsomapid
barnes_1.document.fsarrayfile <--

A manual rename (removal of _1) solves the issue.

This is not always the case, though: some indexes did not need to be renamed.
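
The manual rename described above could be scripted roughly as follows. This is only a workaround sketch, not part of Terrier: the class name StraySuffixRenamer and the method fixStrayFiles are made up for illustration, and it assumes the only problem is the leftover "_1" in the document.fsarrayfile name, as in the barnes example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Terrier class): renames leftover
// "<prefix>_1.document.fsarrayfile" files to "<prefix>.document.fsarrayfile".
public class StraySuffixRenamer {

    private static final String STRAY = "_1.document.fsarrayfile";
    private static final String PROPER = ".document.fsarrayfile";

    /** Renames stray files in indexDir and returns the new paths. */
    public static List<Path> fixStrayFiles(Path indexDir) throws IOException {
        // Collect matches first so the directory is not modified while it is listed.
        List<Path> stray = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, "*" + STRAY)) {
            for (Path p : files) {
                stray.add(p);
            }
        }
        List<Path> renamed = new ArrayList<>();
        for (Path source : stray) {
            String name = source.getFileName().toString();
            String prefix = name.substring(0, name.length() - STRAY.length());
            Path target = source.resolveSibling(prefix + PROPER);
            if (Files.exists(target)) {
                continue; // never overwrite a properly named file
            }
            Files.move(source, target); // throws IOException if the rename fails
            renamed.add(target);
        }
        return renamed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java StraySuffixRenamer <index-directory>");
            return;
        }
        for (Path p : fixStrayFiles(Paths.get(args[0]))) {
            System.out.println("renamed to " + p);
        }
    }
}
```

Run it with the index directory as the only argument. It treats the symptom only and says nothing about why the rename was skipped during indexing.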

Thanks!



Edited 1 time(s). Last edit at 06/29/2011 02:09PM by kenston.

Re: _1 in document.fsarrayfile filename during indexing
Posted by: craigm ()
Date: June 29, 2011 01:57PM

This is peculiar. It suggests that the writing of the DocumentIndex hasn't completed properly.

In particular, my guess is that you are using Windows(?) and that the file hasn't been closed, and hence the rename failed. (Rename operations on open files fail on Windows, unlike on Unix operating systems.)
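
To illustrate the close-then-rename ordering, here is a generic Java sketch. It is not Terrier's DocumentIndex code; the class CloseBeforeRename and the method writeThenRename are hypothetical names. It closes the output stream before renaming and uses Files.move, which throws an IOException when the rename fails, whereas File.renameTo only returns false.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Generic illustration only (hypothetical class, not Terrier source).
public class CloseBeforeRename {

    /** Writes data to temp, closes the handle, then renames temp to finalName. */
    public static Path writeThenRename(Path temp, Path finalName, byte[] data)
            throws IOException {
        // try-with-resources closes the stream before the rename is attempted.
        try (OutputStream out = Files.newOutputStream(temp)) {
            out.write(data);
        }
        // Unlike File.renameTo, a failed move surfaces as an exception
        // instead of a boolean that is easy to ignore.
        return Files.move(temp, finalName);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rename-demo");
        Path result = writeThenRename(
                dir.resolve("demo_1.document.fsarrayfile"),
                dir.resolve("demo.document.fsarrayfile"),
                "example".getBytes(StandardCharsets.UTF_8));
        System.out.println("final file: " + result);
    }
}
```

If the stream were still open at the time of the move, the rename would be the step expected to fail on Windows, and with a boolean-returning rename that failure could go unnoticed.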

My inclination is that this should have been identified by the unit tests on Windows. Can you run the unit tests and report any that fail?

Craig

Re: _1 in document.fsarrayfile filename during indexing
Posted by: kenston ()
Date: June 29, 2011 02:11PM

Hi Craig,

It's not always the case: some indexes did not need to be renamed, meaning the rename succeeded for them. It only happened to some, and only the document.fsarrayfile is affected.

How do I run the tests?

Re: _1 in document.fsarrayfile filename during indexing
Posted by: craigm ()
Date: June 29, 2011 02:15PM

Just to confirm, this is Windows?

I can see that this is a classical indexer. Blocks or non-blocks?

I'm not sure what the "some" refers to - do you mean that some indexing runs are OK, but others have this issue?

Type "ant test" from a command prompt in the terrier-3.5 folder
[terrier.org]

Craig

Re: _1 in document.fsarrayfile filename during indexing
Posted by: kenston ()
Date: June 29, 2011 02:20PM

Hi Craig,

Yes, I am using Windows 7 32-bit.
I'm using Block Indexer.
Some indexing runs are OK, meaning they do not exhibit the _1 issue, but others do.

I am not familiar with Ant, but I will try installing it some time.

Re: _1 in document.fsarrayfile filename during indexing
Posted by: craigm ()
Date: June 29, 2011 08:23PM

Alternatively, run

bin/anyclass.sh org.junit.runner.JUnitCore TerrierDefaultTestSuite

and redirect the output to a file. I'm looking for anything that says failed or an Exception etc.
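
Once the output has been redirected to a file, a small scanner like the one below can pull out the lines of interest. This is a sketch under assumptions: TestLogScanner is a made-up helper, not part of Terrier or JUnit, and it simply looks for the words "failed" and "exception" (case-insensitively), as described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: lists log lines that mention a failure or an exception.
public class TestLogScanner {

    /** Returns "lineNumber: text" for every line containing "failed" or "exception". */
    public static List<String> suspiciousLines(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String lower = lines.get(i).toLowerCase(Locale.ROOT);
            if (lower.contains("failed") || lower.contains("exception")) {
                hits.add((i + 1) + ": " + lines.get(i));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TestLogScanner <log-file>");
            return;
        }
        // ISO-8859-1 decodes any byte sequence, which is enough for keyword matching.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        for (String hit : suspiciousLines(lines)) {
            System.out.println(hit);
        }
    }
}
```

A JUnit summary line such as "Failures: 0" does not match either keyword, so a clean run should produce no output.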

Craig

Re: _1 in document.fsarrayfile filename during indexing
Posted by: kenston ()
Date: June 30, 2011 05:39AM

Hi Craig,

I installed Ant and ran the tests. Below is the result:

D:\PROJ\MS Thesis\Implementation\Terrier\terrier-3.5>ant test
Buildfile: D:\PROJ\MS Thesis\Implementation\Terrier\terrier-3.5\build.xml

init:

compile-core-grammars:

compile-core-classes:
[javac] D:\PROJ\MS Thesis\Implementation\Terrier\terrier-3.5\build.xml:111: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
[javac] Compiling 431 source files to D:\PROJ\MS Thesis\Implementation\Terrier\terrier-3.5\classes
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.

core-jar:
[jar] Building jar: D:\PROJ\MS Thesis\Implementation\Terrier\terrier-3.5\lib\terrier-3.5-core.jar

compile-test-classes:
[javac] D:\PROJ\MS Thesis\Implementation\Terrier\terrier-3.5\build.xml:126: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
[javac] Compiling 70 source files to D:\PROJ\MS Thesis\Implementation\Terrier\terrier-3.5\classes_test
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.

test-jar:
[jar] Building jar: D:\PROJ\MS Thesis\Implementation\Terrier\terrier-3.5\lib\terrier-3.5-test.jar

test:
[junit] Running TerrierDefaultTestSuite
[junit] Tests run: 302, Failures: 0, Errors: 0, Time elapsed: 56.913 sec

BUILD SUCCESSFUL
Total time: 1 minute 30 seconds

Re: _1 in document.fsarrayfile filename during indexing
Posted by: craigm ()
Date: June 30, 2011 10:50AM

Hi kenston,

Thanks for running that. Given that the BlockIndexer is run multiple times within the test cases, I'm disappointed that the tests haven't identified the issue.

Let me think about it.

C

Re: _1 in document.fsarrayfile filename during indexing
Posted by: kenston ()
Date: June 30, 2011 04:34PM

Hi Craig,

I moved from Terrier 3.0 to 3.5. If I recall correctly, I did not encounter the _1 issue with 3.0. I don't think I changed much of the indexing code after the switch to 3.5, apart from the additional docno requirement.

For that, I added a docno entry to indexer.meta.forward.keys with a fixed value of 0 for all documents, since I am indexing a database (I added the primary keys as keys instead).



Edited 1 time(s). Last edit at 06/30/2011 04:35PM by kenston.

Re: _1 in document.fsarrayfile filename during indexing
Posted by: kenston ()
Date: August 12, 2011 08:02AM

Hi Craig,

Is there any solution for this, or any clue on how to prevent it?

Thanks!
