org.terrier.structures.indexing
Class MetaIndexBuilder

java.lang.Object
  extended by org.terrier.structures.indexing.MetaIndexBuilder
All Implemented Interfaces:
Closeable
Direct Known Subclasses:
CompressingMetaIndexBuilder

public abstract class MetaIndexBuilder
extends Object
implements Closeable

Abstract class for writing document metadata. Metadata means textual data associated with a document, such as an external document identifier (e.g. a docno), a URL, or the title or abstract of a document.

Lookups in the resulting MetaIndex are supported in two ways: either by docid or, for specified key types, by value. In the latter case, metadata values are assumed to be unique.
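The two lookup directions can be illustrated with a minimal in-memory sketch. It is not Terrier code: the class InMemoryMetaSketch and its methods add, getItem and getDocument are hypothetical names chosen for this illustration, and rejecting duplicate values on reverse keys is an assumption about how the uniqueness requirement might be enforced.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a metadata store; not part of Terrier.
class InMemoryMetaSketch {
    private final String[] keys;                       // all metadata keys, in storage order
    private final List<String[]> rows = new ArrayList<>(); // row index == docid
    private final Map<String, Map<String, Integer>> reverse = new HashMap<>();

    InMemoryMetaSketch(String[] keys, String[] reverseKeys) {
        this.keys = keys.clone();
        for (String k : reverseKeys) {
            reverse.put(k, new HashMap<>());
        }
    }

    // Stores one document's values and returns the docid assigned to it.
    int add(String[] values) {
        if (values.length != keys.length) {
            throw new IllegalArgumentException("one value per key is required");
        }
        // First pass: values of reverse-lookup keys are assumed to be unique.
        for (int i = 0; i < keys.length; i++) {
            Map<String, Integer> byValue = reverse.get(keys[i]);
            if (byValue != null && byValue.containsKey(values[i])) {
                throw new IllegalStateException("duplicate value for key " + keys[i]);
            }
        }
        // Second pass: record the reverse mappings, then the row itself.
        int docid = rows.size();
        for (int i = 0; i < keys.length; i++) {
            Map<String, Integer> byValue = reverse.get(keys[i]);
            if (byValue != null) {
                byValue.put(values[i], docid);
            }
        }
        rows.add(values.clone());
        return docid;
    }

    // Lookup by docid: returns the value stored for the given key.
    String getItem(String key, int docid) {
        for (int i = 0; i < keys.length; i++) {
            if (keys[i].equals(key)) {
                return rows.get(docid)[i];
            }
        }
        throw new IllegalArgumentException("unknown key " + key);
    }

    // Lookup by value: returns the docid, or -1 if the value is not present.
    int getDocument(String key, String value) {
        Map<String, Integer> byValue = reverse.get(key);
        if (byValue == null) {
            throw new IllegalArgumentException("no reverse lookup for key " + key);
        }
        Integer docid = byValue.get(value);
        return docid == null ? -1 : docid;
    }
}
```

Only keys named in reverseKeys pay the cost of a second map, which mirrors the idea that lookup by value is offered for specified key types rather than for every key.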

Typical usage during indexing:

 MetaIndexBuilder metaBuilder = ...
 while (collection.nextDocument())
 {
     Document d = collection.getDocument();
     metaBuilder.writeDocumentEntry(d.getAllProperties());
 }
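Because the class implements Closeable, a builder can be managed with try-with-resources so that it is closed once the last entry has been written. The following is a sketch only: CountingBuilderSketch and CloseUsageSketch are hypothetical stand-ins, not Terrier classes, and the stand-in counts entries instead of writing any index files.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Map;

// Hypothetical stand-in for a concrete builder; it only counts entries.
class CountingBuilderSketch implements Closeable {
    int written = 0;
    boolean closed = false;

    void writeDocumentEntry(Map<String, String> data) throws IOException {
        if (closed) {
            throw new IOException("builder already closed");
        }
        written++;
    }

    @Override
    public void close() {
        closed = true;
    }
}

class CloseUsageSketch {
    // Writes one entry per property map, then closes the builder,
    // even if one of the writes throws.
    static CountingBuilderSketch indexAll(Iterable<Map<String, String>> docs)
            throws IOException {
        CountingBuilderSketch builder = new CountingBuilderSketch();
        try (CountingBuilderSketch b = builder) {
            for (Map<String, String> props : docs) {
                b.writeDocumentEntry(props);
            }
        }
        return builder;
    }
}
```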
 

Since:
3.0
Author:
Craig Macdonald

Constructor Summary
MetaIndexBuilder()
           
 
Method Summary
abstract  void writeDocumentEntry(Map<String,String> data)
          Write out metadata for the current document, extracted from the specified map. Typically, the MetaIndexBuilder will know which keys in data it is interested in.
abstract  void writeDocumentEntry(String[] data)
          Write out metadata for the current document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.io.Closeable
close
 

Constructor Detail

MetaIndexBuilder

public MetaIndexBuilder()
Method Detail

writeDocumentEntry

public abstract void writeDocumentEntry(Map<String,String> data)
                                 throws IOException
Write out metadata for the current document, extracted from the specified map. Typically, the MetaIndexBuilder will know which keys in data it is interested in.

Throws:
IOException
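One way a concrete builder might relate the two overloads is to select its configured keys from the map and delegate to the array form. The sketch below shows that pattern under stated assumptions: KeySelectingBuilderSketch is a hypothetical class that does not extend MetaIndexBuilder, it stores entries in memory, and substituting an empty string for a missing key is an assumption rather than documented Terrier behaviour.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the two overloads; not a Terrier class.
class KeySelectingBuilderSketch {
    private final String[] keys;                       // keys of interest, in storage order
    final List<String[]> entries = new ArrayList<>();  // one row per document

    KeySelectingBuilderSketch(String... keys) {
        this.keys = keys.clone();
    }

    // Map form: pick out the configured keys and ignore everything else.
    void writeDocumentEntry(Map<String, String> data) throws IOException {
        String[] values = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            String v = data.get(keys[i]);
            values[i] = (v == null) ? "" : v; // assumption: missing keys become ""
        }
        writeDocumentEntry(values);
    }

    // Array form: one value per configured key, in key order.
    void writeDocumentEntry(String[] data) throws IOException {
        if (data.length != keys.length) {
            throw new IOException("expected " + keys.length
                    + " values, got " + data.length);
        }
        entries.add(data.clone());
    }
}
```

With keys "docno" and "url", a map that also carries a "charset" property yields a two-element row; the extra property is dropped.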

writeDocumentEntry

public abstract void writeDocumentEntry(String[] data)
                                 throws IOException
Write out metadata for the current document. Values for all keys are specified.

Throws:
IOException


Terrier 3.6. Copyright © 2004-2011 University of Glasgow