org.terrier.structures.collections
Class FSOrderedMapFile<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

java.lang.Object
  extended by org.terrier.structures.collections.ReadOnlyMap<K,V>
      extended by org.terrier.structures.collections.FSOrderedMapFile<K,V>
Type Parameters:
K - Type of the keys
V - Type of the values
All Implemented Interfaces:
Closeable, Map<K,V>, SortedMap<K,V>, OrderedMap<K,V>

public class FSOrderedMapFile<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
extends ReadOnlyMap<K,V>
implements OrderedMap<K,V>, Closeable, SortedMap<K,V>

An implementation of java.util.Map that is backed by a file on disk. Key and value types are assumed to have a fixed size when serialized, and their factories must be passed to the constructor. The FS in the name FSOrderedMapFile stands for Fixed Size.
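
The following is a minimal sketch of why fixed-size entries matter for an on-disk ordered map; it is not Terrier code. It assumes a hypothetical record layout (a 4-byte int key followed by an 8-byte long value) and uses only java.io.RandomAccessFile. The class name FixedSizeRecordSketch and its methods are illustrative. Because every entry has the same size, entry i starts at byte offset i * ENTRY_SIZE, so the file can be binary searched without loading it into memory.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch, not Terrier code: each record is a 4-byte int key
// followed by an 8-byte long value, so every entry occupies 12 bytes.
class FixedSizeRecordSketch {
    static final int ENTRY_SIZE = Integer.BYTES + Long.BYTES;

    // Writes the pairs in the given order; keys must already be ascending.
    static void write(File f, int[] keys, long[] values) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(f, "rw")) {
            out.setLength(0);
            for (int i = 0; i < keys.length; i++) {
                out.writeInt(keys[i]);
                out.writeLong(values[i]);
            }
        }
    }

    // The entry count follows from the file length because entries are fixed size.
    static int numberOfEntries(File f) {
        return (int) (f.length() / ENTRY_SIZE);
    }

    // Binary search over the file: entry i starts at byte offset i * ENTRY_SIZE.
    // Returns the value stored for the key, or null if the key is absent.
    static Long lookup(File f, int key) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(f, "r")) {
            int lo = 0;
            int hi = numberOfEntries(f) - 1;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                in.seek((long) mid * ENTRY_SIZE);
                int midKey = in.readInt();
                if (midKey < key) {
                    lo = mid + 1;
                } else if (midKey > key) {
                    hi = mid - 1;
                } else {
                    return in.readLong(); // value follows the key directly
                }
            }
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("fsomap-sketch", ".bin");
        f.deleteOnExit();
        write(f, new int[] {2, 5, 9}, new long[] {20L, 50L, 90L});
        System.out.println(numberOfEntries(f)); // 3
        System.out.println(lookup(f, 5));       // 50
        System.out.println(lookup(f, 7));       // null
    }
}
```

In FSOrderedMapFile itself the key and value sizes would come from the FixedSizeWriteableFactory instances passed to the constructor rather than from hard-coded constants as in this sketch.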

Since:
3.0
Author:
Craig Macdonald

Nested Class Summary
static class FSOrderedMapFile.EntryIterator<IK extends org.apache.hadoop.io.Writable,IV extends org.apache.hadoop.io.Writable>
          an iterator for entries.
static interface FSOrderedMapFile.FSOMapFileBSearchShortcut<KEY>
          interface FSOMapFileBSearchShortcut
static class FSOrderedMapFile.MapFileInMemory<IK extends org.apache.hadoop.io.Writable,IV extends org.apache.hadoop.io.Writable>
          MapFileInMemory class
static interface FSOrderedMapFile.MapFileWriter
          Interface for writing an FSOMapFile
static class FSOrderedMapFile.MultiFSOMapWriter
          Writes out an FSOMapFile without requiring the input data to be sorted by key.
 
Nested classes/interfaces inherited from interface java.util.Map
Map.Entry<K,V>
 
Field Summary
protected  RandomDataInput dataFile
          actual underlying data file
protected  String dataFilename
          filename of the underlying file
protected  int entrySize
          total size of one key-value pair
protected  FixedSizeWriteableFactory<K> keyFactory
           
protected static org.apache.log4j.Logger logger
          The logger used for this class
protected  int numberOfEntries
          The number of entries in the file.
protected  FSOrderedMapFile.FSOMapFileBSearchShortcut<K> shortcut
           
static String USUAL_EXTENSION
          USUAL_EXTENSION
protected  FixedSizeWriteableFactory<V> valueFactory
           
 
Constructor Summary
FSOrderedMapFile(Index index, String structureName)
          constructor
FSOrderedMapFile(RandomDataInput file, String filename, FixedSizeWriteableFactory<K> _keyFactory, FixedSizeWriteableFactory<V> _valueFactory)
          constructor
FSOrderedMapFile(String filename, boolean updateable, FixedSizeWriteableFactory<K> _keyFactory, FixedSizeWriteableFactory<V> _valueFactory)
          Construct a new object to access the underlying file data structure
 
Method Summary
protected  void _clear()
           
 void clear()
          Remove all entries from this map
 void close()
          
 Comparator<? super K> comparator()
          Always returns null, because keys for FSOMapFile are always Comparable and their own Comparable implementation is used.
 boolean containsKey(Object o)
          
 boolean containsValue(Object o)
          
 Set<Map.Entry<K,V>> entrySet()
          
 K firstKey()
          
 Map.Entry<K,V> get(int entryNumber)
          Return the entry at the specified index
 V get(Object _key)
          
protected  org.terrier.structures.collections.FSOrderedMapFile.MapFileEntry<K,V> getEntry(K key)
          Performs the actual disk lookup of entries.
 WriteableFactory<K> getKeyFactory()
          Get the key factory
 WriteableFactory<V> getValueFactory()
          Get the value factory
 SortedMap<K,V> headMap(K to)
          
 boolean isEmpty()
          
 Set<K> keySet()
          
 K lastKey()
          
static FSOrderedMapFile.MapFileWriter mapFileWrite(String filename)
          Returns a utility object that can be used to write an FSOrderedMapFile.
static void mapFileWrite(String filename, Iterable<Map.Entry<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>> t)
          Writes an entire FSOrderedMapFile at once to the specified filename, using the data contained in the specified Iterable.
static void mapFileWrite(String filename, Iterator<Map.Entry<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>> ti)
          Writes an entire FSOrderedMapFile at once to the specified filename, using the data contained in the specified Iterator.
static int numberOfEntries(String filename, FixedSizeWriteableFactory<?> _keyFactory, FixedSizeWriteableFactory<?> _valueFactory)
          Return number of entries
 void putAll(Map<? extends K,? extends V> m)
          
 void setBSearchShortcut(FSOrderedMapFile.FSOMapFileBSearchShortcut<K> _shortcut)
          Set the FSOMapFileBSearchShortcut
 int size()
          Returns the number of entries in this map
 SortedMap<K,V> subMap(K from, K to)
          
 SortedMap<K,V> tailMap(K from)
          
 Collection<V> values()
          
protected  RandomDataOutput write()
           
 
Methods inherited from class org.terrier.structures.collections.ReadOnlyMap
put, remove
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.util.Map
equals, hashCode, put, remove
 

Field Detail

USUAL_EXTENSION

public static final String USUAL_EXTENSION
USUAL_EXTENSION

See Also:
Constant Field Values

logger

protected static final org.apache.log4j.Logger logger
The logger used for this class


dataFile

protected RandomDataInput dataFile
actual underlying data file


dataFilename

protected String dataFilename
filename of the underlying file


numberOfEntries

protected int numberOfEntries
The number of entries in the file.


entrySize

protected int entrySize
total size of one key-value pair


shortcut

protected FSOrderedMapFile.FSOMapFileBSearchShortcut<K extends org.apache.hadoop.io.WritableComparable> shortcut

keyFactory

protected FixedSizeWriteableFactory<K extends org.apache.hadoop.io.WritableComparable> keyFactory

valueFactory

protected FixedSizeWriteableFactory<V extends org.apache.hadoop.io.Writable> valueFactory
Constructor Detail

FSOrderedMapFile

public FSOrderedMapFile(Index index,
                        String structureName)
                 throws IOException
constructor

Parameters:
index -
structureName -
Throws:
IOException

FSOrderedMapFile

public FSOrderedMapFile(String filename,
                        boolean updateable,
                        FixedSizeWriteableFactory<K> _keyFactory,
                        FixedSizeWriteableFactory<V> _valueFactory)
                 throws IOException
Construct a new object to access the underlying file data structure

Parameters:
filename - Filename of the file containing the structure
updateable - Whether the file can be updated in this JVM
_keyFactory - factory object for keys
_valueFactory - factory object for values
Throws:
IOException - thrown if an IO problem occurs

FSOrderedMapFile

public FSOrderedMapFile(RandomDataInput file,
                        String filename,
                        FixedSizeWriteableFactory<K> _keyFactory,
                        FixedSizeWriteableFactory<V> _valueFactory)
                 throws IOException
constructor

Parameters:
file -
filename -
_keyFactory -
_valueFactory -
Throws:
IOException
Method Detail

write

protected RandomDataOutput write()

numberOfEntries

public static int numberOfEntries(String filename,
                                  FixedSizeWriteableFactory<?> _keyFactory,
                                  FixedSizeWriteableFactory<?> _valueFactory)
Return number of entries

Parameters:
filename -
_keyFactory -
_valueFactory -
Returns:
number of entries

getKeyFactory

public WriteableFactory<K> getKeyFactory()
Get the key factory


getValueFactory

public WriteableFactory<V> getValueFactory()
Get the value factory


clear

public void clear()
Remove all entries from this map

Specified by:
clear in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
Overrides:
clear in class ReadOnlyMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

putAll

public void putAll(Map<? extends K,? extends V> m)

Specified by:
putAll in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

_clear

protected void _clear()

entrySet

public Set<Map.Entry<K,V>> entrySet()

Specified by:
entrySet in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
Specified by:
entrySet in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

keySet

public Set<K> keySet()

Specified by:
keySet in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
Specified by:
keySet in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

values

public Collection<V> values()

Specified by:
values in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
Specified by:
values in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

size

public int size()
Returns the number of entries in this map

Specified by:
size in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

containsValue

public boolean containsValue(Object o)

Specified by:
containsValue in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

containsKey

public boolean containsKey(Object o)

Specified by:
containsKey in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

isEmpty

public boolean isEmpty()

Specified by:
isEmpty in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

setBSearchShortcut

public void setBSearchShortcut(FSOrderedMapFile.FSOMapFileBSearchShortcut<K> _shortcut)
Set the FSOMapFileBSearchShortcut


getEntry

protected org.terrier.structures.collections.FSOrderedMapFile.MapFileEntry<K,V> getEntry(K key)
Performs the actual disk lookup of entries. If no entry is found for the key, the returned MapFileEntry has its index field set to (-(insertion point) - 1) for the specified key, following the same convention as Arrays.binarySearch().
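
The (-(insertion point) - 1) convention can be illustrated with java.util.Arrays.binarySearch, which the description refers to. This is a small sketch on an in-memory int array, not on a MapFileEntry; the class InsertionPointSketch and its helper insertionPoint are hypothetical names used only for illustration.

```java
import java.util.Arrays;

// Hypothetical helper showing how to decode a binarySearch-style result.
class InsertionPointSketch {
    // A non-negative result is the index of the key. A negative result
    // encodes the insertion point as -(insertion point) - 1.
    static int insertionPoint(int searchResult) {
        return searchResult >= 0 ? searchResult : -searchResult - 1;
    }

    public static void main(String[] args) {
        int[] keys = {10, 20, 30, 40};
        int found = Arrays.binarySearch(keys, 30);   // key present at index 2
        int missing = Arrays.binarySearch(keys, 25); // key absent
        System.out.println(found);                   // 2
        System.out.println(missing);                 // -3
        System.out.println(insertionPoint(missing)); // 2
    }
}
```

Encoding the insertion point as a negative number lets a single int distinguish "found at index i" from "absent, would be inserted at index i", including the case where the insertion point is 0.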


firstKey

public K firstKey()

Specified by:
firstKey in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

lastKey

public K lastKey()

Specified by:
lastKey in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

headMap

public SortedMap<K,V> headMap(K to)

Specified by:
headMap in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

subMap

public SortedMap<K,V> subMap(K from,
                             K to)

Specified by:
subMap in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

tailMap

public SortedMap<K,V> tailMap(K from)

Specified by:
tailMap in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

comparator

public final Comparator<? super K> comparator()
Always returns null, because keys for FSOMapFile are always Comparable and their own Comparable implementation is used.

Specified by:
comparator in interface SortedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
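
A null return from comparator() is the standard java.util.SortedMap signal that keys are ordered by their natural ordering. The sketch below shows how generic code can handle both cases; it uses java.util.TreeMap as a stand-in for any SortedMap, and the class NaturalOrderSketch and method compareKeys are hypothetical names.

```java
import java.util.Comparator;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper: compares two keys the way the given SortedMap orders them.
class NaturalOrderSketch {
    static <K extends Comparable<K>> int compareKeys(SortedMap<K, ?> map, K a, K b) {
        Comparator<? super K> c = map.comparator();
        // null means the map relies on the keys' own compareTo
        return c != null ? c.compare(a, b) : a.compareTo(b);
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> natural = new TreeMap<>();
        System.out.println(natural.comparator() == null);          // true
        System.out.println(compareKeys(natural, "a", "b") < 0);    // true
    }
}
```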

get

public V get(Object _key)

Specified by:
get in interface Map<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

get

public Map.Entry<K,V> get(int entryNumber)
Return the entry at the specified index

Specified by:
get in interface OrderedMap<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

close

public void close()
           throws IOException

Specified by:
close in interface Closeable
Throws:
IOException

mapFileWrite

public static void mapFileWrite(String filename,
                                Iterable<Map.Entry<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>> t)
                         throws IOException
Writes an entire FSOrderedMapFile at once to the specified filename, using the data contained in the specified Iterable.

Throws:
IOException

mapFileWrite

public static void mapFileWrite(String filename,
                                Iterator<Map.Entry<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>> ti)
                         throws IOException
Writes an entire FSOrderedMapFile at once to the specified filename, using the data contained in the specified Iterator.

Throws:
IOException

mapFileWrite

public static FSOrderedMapFile.MapFileWriter mapFileWrite(String filename)
                                                   throws IOException
Returns a utility object that can be used to write an FSOrderedMapFile. Input data MUST be sorted by key.

Throws:
IOException
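
Because the writer requires input sorted by key, a caller may want to verify or establish that order before writing. The sketch below is plain Java and does not call any Terrier API; the class SortedInputCheckSketch and method isSortedByKey are hypothetical names. It treats "sorted" as strictly ascending, which is an assumption on the grounds that map keys are unique.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical pre-write check: are the entries in strictly ascending key order?
class SortedInputCheckSketch {
    static <K extends Comparable<K>, V> boolean isSortedByKey(Iterable<Map.Entry<K, V>> entries) {
        K previous = null;
        for (Map.Entry<K, V> e : entries) {
            // reject equal or descending neighbours
            if (previous != null && previous.compareTo(e.getKey()) >= 0) {
                return false;
            }
            previous = e.getKey();
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> unsorted = new LinkedHashMap<>();
        unsorted.put("b", 1);
        unsorted.put("a", 2);
        System.out.println(isSortedByKey(unsorted.entrySet())); // false

        // Copying into a TreeMap yields entries in ascending key order.
        Map<String, Integer> sorted = new TreeMap<>(unsorted);
        System.out.println(isSortedByKey(sorted.entrySet()));   // true
    }
}
```

When the input cannot be sorted up front, the nested class FSOrderedMapFile.MultiFSOMapWriter listed in the Nested Class Summary is described as accepting input that need not be sorted by key.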


Terrier 3.6. Copyright © 2004-2011 University of Glasgow