org.terrier.utility
Class Files

java.lang.Object
  extended by org.terrier.utility.Files

public class Files
extends java.lang.Object

Utility class for opening readers/writers and input/output streams to files. Handles gzipped and bzipped files on the fly, i.e. if a file ends in ".gz" or ".GZ", then it will be opened using a GZIPInputStream/GZIPOutputStream. ".bz2" files are handled in a similar fashion. All returned Streams, Readers, Writers etc. are buffered. If a charset encoding is not specified, then the system default is used. The RandomDataInput and RandomDataOutput interfaces are used to describe random data access.
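The suffix-based handling described above can be illustrated with a small JDK-only sketch. It is not Terrier's implementation: the class SuffixStreams and its methods are hypothetical names, and only the ".gz"/".GZ" case is shown.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Terrier.
class SuffixStreams {
    /** Picks a decompressing wrapper from the filename suffix; the result is always buffered. */
    static InputStream wrapForRead(String filename, InputStream raw) throws IOException {
        boolean gzipped = filename.endsWith(".gz") || filename.endsWith(".GZ");
        return new BufferedInputStream(gzipped ? new GZIPInputStream(raw) : raw);
    }

    /** Demo: gzips text in memory, then reads it back through wrapForRead. */
    static String roundTrip(String filename, String text) {
        try {
            ByteArrayOutputStream packed = new ByteArrayOutputStream();
            try (OutputStream out = new GZIPOutputStream(packed)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            try (InputStream in = wrapForRead(filename, new ByteArrayInputStream(packed.toByteArray()))) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a ".gz" name the round trip returns the original text; with any other name the compressed bytes come back undecoded, which is why the suffix matters.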

FileSystem plugins

Additional file systems can be plugged into this module by calling the addFileSystemCapability() method. FileSystems have read and/or write capabilities, as specified using the FSCapability constants. Files using these external file systems should be denoted by scheme prefixes, e.g. ftp://, http:// etc. NB: file:// is the default scheme.
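A rough sketch of scheme-based lookup follows. It is an illustration only: SchemeRegistry is a hypothetical class that maps schemes to plain names instead of FileSystem objects, and the value "file" for the default scheme is an assumption based on the note above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical registry, not Terrier's fileSystems map.
class SchemeRegistry {
    // Assumed default, following "file:// is the default scheme".
    static final String DEFAULT_SCHEME = "file";

    private final Map<String, String> fileSystems = new HashMap<>();

    void register(String scheme, String fileSystemName) {
        fileSystems.put(scheme.toLowerCase(Locale.ROOT), fileSystemName);
    }

    /** Returns the scheme prefix of a path, or the default when the path has none. */
    static String schemeOf(String path) {
        int idx = path.indexOf("://");
        return idx > 0 ? path.substring(0, idx).toLowerCase(Locale.ROOT) : DEFAULT_SCHEME;
    }

    /** Returns the name registered for the path's scheme, or null if none is registered. */
    String lookup(String path) {
        return fileSystems.get(schemeOf(path));
    }
}
```

A path without a prefix, such as /tmp/a.txt, resolves through the default scheme, while a path with an unregistered scheme yields null.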



Additional Compression Support

Support for additional stream compression & decompression can be plugged in by calling addFilterInputStreamMapping().



File Caching

Terrier can cache files that will see heavy IO activity. In particular, files mentioned in the files.to.cache property will be cached to the default temporary folder. There are also API methods to populate the cache with files. For all methods, the directory named by the java.io.tmpdir system property is the default temporary directory. An IOException will occur if caching fails for some reason.
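One plausible reading of caching is a copy into the temporary folder. The sketch below assumes exactly that. TempCache is a hypothetical class built on java.nio.file, and marking the copy for deletion on exit is an added assumption rather than documented behaviour.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical helper; uses java.nio.file.Files, not org.terrier.utility.Files.
class TempCache {
    /** Copies source into folder and returns the cached copy. */
    static Path cache(Path source, Path folder) throws IOException {
        Files.createDirectories(folder);
        Path cached = folder.resolve(source.getFileName());
        Files.copy(source, cached, StandardCopyOption.REPLACE_EXISTING);
        cached.toFile().deleteOnExit(); // assumption: cached copies are temporary
        return cached;
    }

    /** Same as above, using the java.io.tmpdir system property as the folder. */
    static Path cache(Path source) throws IOException {
        return cache(source, Paths.get(System.getProperty("java.io.tmpdir")));
    }

    /** Demo: caches a freshly written file and checks that the copy has the same bytes. */
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("cache-demo");
            Path source = Files.write(dir.resolve("index.dat"), new byte[] {1, 2, 3});
            Path cached = cache(source, dir.resolve("cached"));
            return Arrays.equals(Files.readAllBytes(source), Files.readAllBytes(cached));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```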


Nested Class Summary
static interface Files.FSCapability
          constants declaring which capabilities a file system has
protected static class Files.PathTransformation
          a search regex and a replacement for path transformations
 
Field Summary
protected static java.lang.String DEFAULT_SCHEME
          default scheme
protected static java.util.Map<java.lang.String,FileSystem> fileSystems
          map of scheme to FileSystem implementation
protected static java.util.List<Files.PathTransformation> pathTransformations
          transformations to apply to a path
 
Constructor Summary
Files()
           
 
Method Summary
static void addFileSystemCapability(FileSystem fs)
          Add a file system to Terrier.
static void addFilterInputStreamMapping(java.lang.String regex, java.lang.Class<? extends java.io.InputStream> inputStreamClass, java.lang.Class<? extends java.io.OutputStream> outputStreamClass)
          Add a filter mapping to the Files layer.
static void addPathTransormation(java.lang.String find, java.lang.String replace)
          add a static transformation to apply to a path.
static void cacheFile(java.lang.String filename)
          Cache to the temporary directory specified by java.io.tmpdir System property.
static void cacheFile(java.lang.String filename, java.lang.String temporaryFolder)
          Cache file to specified temporary folder
static boolean canRead(java.lang.String filename)
          returns true iff path can be read
static boolean canWrite(java.lang.String filename)
          returns true iff path can be written
static java.lang.Long copyFile(java.io.File srcFile, java.io.File destFile)
          Copy a file from srcFile to destFile.
static java.lang.Long copyFile(java.io.InputStream in, java.io.OutputStream out)
          Copy all bytes from in to out
static java.lang.Long copyFile(java.lang.String srcFilename, java.lang.String destFilename)
          Copy a file from srcFile to destFile.
static java.lang.Long createChecksum(java.io.File file)
          Returns the CRC checksum of denoted file
static boolean delete(java.lang.String filename)
          Delete the named file.
static boolean deleteOnExit(java.lang.String path)
          Mark the named path as to be deleted on exit.
static boolean exists(java.lang.String path)
          returns true iff the path exists
protected static FileSystem getFileSystem(java.lang.String filename)
          derive the file system to use that is associated with the scheme in the specified filename.
static java.lang.String getFileSystemName(java.lang.String path)
          Get the name of the file system that would be used to access a given file or directory.
static java.lang.String getParent(java.lang.String path)
          What is the parent path to the specified path?
protected static void initialise_mappings()
          initialise the default compression mappings
protected static void initialise_static_cache()
          we may have been specified some files to cache immediately
protected static void intialise_transformations()
          initialise the transformations from Application property
static boolean isDirectory(java.lang.String path)
          return true if path is a directory
static long length(java.io.File f)
          returns the length of file f
static long length(java.lang.String filename)
          returns the length of the file, or 0L if it cannot be found etc.
static java.lang.String[] list(java.lang.String path)
          List the contents of a directory
static void main(java.lang.String[] args)
          Check that a specified file exists as per Terrier's file system abstraction layer
static boolean mkdir(java.lang.String path)
          returns true if the specified path can be made as a directory
protected static java.io.InputStream openFile(java.lang.String filename)
          Opens an InputStream to a file called filename, using the file system associated with the scheme of the filename.
static RandomDataInput openFileRandom(java.io.File file)
          Open a file for random access reading
static RandomDataInput openFileRandom(java.lang.String filename)
          Returns a RandomAccessFile implementation accessing the specified file
static java.io.BufferedReader openFileReader(java.io.File file)
          Opens a reader to the file called file.
static java.io.BufferedReader openFileReader(java.io.File file, java.lang.String charset)
          Opens a reader to the file called file.
static java.io.BufferedReader openFileReader(java.lang.String filename)
          Opens a reader to the file called filename.
static java.io.BufferedReader openFileReader(java.lang.String filename, java.lang.String charset)
          Opens a reader to the file called filename.
static java.io.InputStream openFileStream(java.io.File file)
          Opens an InputStream to a file called file.
static java.io.InputStream openFileStream(java.lang.String filename)
          Opens an InputStream to a file called filename.
static boolean rename(java.lang.String sourceFilename, java.lang.String destFilename)
          rename a file or directory.
protected static java.lang.String transform(java.lang.String filename)
          apply any transformations to the specified filename
protected static java.io.OutputStream writeFile(java.lang.String filename)
          Opens an OutputStream to a file called filename, using the filesystem named in the scheme component of the filename.
static RandomDataOutput writeFileRandom(java.io.File file)
          Open a file for random access writing and reading
static RandomDataOutput writeFileRandom(java.lang.String filename)
          Returns a RandomAccessFile implementation accessing the specified file
static java.io.OutputStream writeFileStream(java.io.File file)
          Opens an OutputStream to a file called file.
static java.io.OutputStream writeFileStream(java.lang.String filename)
          Opens an OutputStream to a file called filename.
static java.io.Writer writeFileWriter(java.io.File file)
          Opens a Writer to a file called file.
static java.io.Writer writeFileWriter(java.io.File file, java.lang.String charset)
          Opens a Writer to a file called file.
static java.io.Writer writeFileWriter(java.lang.String filename)
          Opens a Writer to a file called filename.
static java.io.Writer writeFileWriter(java.lang.String filename, java.lang.String charset)
          Opens a Writer to a file called filename.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

fileSystems

protected static final java.util.Map<java.lang.String,FileSystem> fileSystems
map of scheme to FileSystem implementation


pathTransformations

protected static final java.util.List<Files.PathTransformation> pathTransformations
transformations to apply to a path


DEFAULT_SCHEME

protected static final java.lang.String DEFAULT_SCHEME
default scheme

Constructor Detail

Files

public Files()
Method Detail

addFilterInputStreamMapping

public static void addFilterInputStreamMapping(java.lang.String regex,
                                               java.lang.Class<? extends java.io.InputStream> inputStreamClass,
                                               java.lang.Class<? extends java.io.OutputStream> outputStreamClass)
Add a filter mapping to the Files layer. This is the method used to implement stream decompression. For example:
 addFilterInputStreamMapping(".+\\.gz$", GZIPInputStream.class, GZIPOutputStream.class);
 addFilterInputStreamMapping(".+\\.GZ$", GZIPInputStream.class, GZIPOutputStream.class);
 

Parameters:
regex - Regular expression that the filename must match to require the filter stream
inputStreamClass - Class extending InputStream that decompresses the file
outputStreamClass - Class extending OutputStream that compresses the file
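How a regex-to-class mapping might be applied can be sketched with reflection. This is an assumption about the mechanism, not Terrier's code: FilterMappings is a hypothetical class, it covers only the input side, and it assumes each filter class has a public constructor taking a single InputStream.

```java
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a regex -> filter stream class table.
class FilterMappings {
    private final Map<String, Class<? extends InputStream>> inputFilters = new LinkedHashMap<>();

    void add(String regex, Class<? extends InputStream> inputStreamClass) {
        inputFilters.put(regex, inputStreamClass);
    }

    /** Wraps raw in the first filter whose regex matches the whole filename, else returns raw. */
    InputStream wrap(String filename, InputStream raw) {
        for (Map.Entry<String, Class<? extends InputStream>> mapping : inputFilters.entrySet()) {
            if (filename.matches(mapping.getKey())) {
                try {
                    // Assumes a public constructor of the form Filter(InputStream).
                    return mapping.getValue().getConstructor(InputStream.class).newInstance(raw);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Cannot wrap " + filename, e);
                }
            }
        }
        return raw;
    }
}
```

Classes such as java.util.zip.GZIPInputStream fit this constructor shape, which is why a Class object is enough to describe a decompression filter.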

initialise_static_cache

protected static void initialise_static_cache()
we may have been specified some files to cache immediately


intialise_transformations

protected static void intialise_transformations()
initialise the transformations from Application property


initialise_mappings

protected static void initialise_mappings()
initialise the default compression mappings


cacheFile

public static void cacheFile(java.lang.String filename)
                      throws java.io.IOException
Cache to the temporary directory specified by java.io.tmpdir System property.

Throws:
java.io.IOException

cacheFile

public static void cacheFile(java.lang.String filename,
                             java.lang.String temporaryFolder)
                      throws java.io.IOException
Cache file to specified temporary folder

Throws:
java.io.IOException

addPathTransormation

public static void addPathTransormation(java.lang.String find,
                                        java.lang.String replace)
add a static transformation to apply to a path. Find and replace are both regular expressions.
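As an illustration of find/replace transformations, the sketch below applies each rule in order with String.replaceAll. PathRewriter is a hypothetical class, and both the ordering and the use of replaceAll are assumptions about how such rules could be applied.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Terrier's PathTransformation.
class PathRewriter {
    private final List<String[]> rules = new ArrayList<>();

    /** Registers a rule: find is a regex, replace may use group references such as $1. */
    void add(String find, String replace) {
        rules.add(new String[] {find, replace});
    }

    /** Applies every rule in insertion order. */
    String transform(String path) {
        String result = path;
        for (String[] rule : rules) {
            result = result.replaceAll(rule[0], rule[1]);
        }
        return result;
    }
}
```

For example, a rule with find "^(\\w+)://old-host/" and replace "$1://new-host/" rewrites the host while keeping the scheme captured by the first group.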


addFileSystemCapability

public static void addFileSystemCapability(FileSystem fs)
Add a file system to Terrier. File systems are denoted by URI scheme prefixes (e.g. http). The underlying file system is represented by a FileSystem.


transform

protected static java.lang.String transform(java.lang.String filename)
apply any transformations to the specified filename


getFileSystem

protected static FileSystem getFileSystem(java.lang.String filename)
derive the file system to use that is associated with the scheme in the specified filename.

Parameters:
filename -

getFileSystemName

public static java.lang.String getFileSystemName(java.lang.String path)
Get the name of the file system that would be used to access a given file or directory.

Parameters:
path -
Returns:
Name of the file system, or null if no file system is found

openFile

protected static java.io.InputStream openFile(java.lang.String filename)
                                       throws java.io.IOException
Opens an InputStream to a file called filename, using the file system associated with the scheme of the filename.

Parameters:
filename - Filename of file to open
Throws:
java.io.IOException

writeFile

protected static java.io.OutputStream writeFile(java.lang.String filename)
                                         throws java.io.IOException
Opens an OutputStream to a file called filename, using the filesystem named in the scheme component of the filename.

Parameters:
filename - Filename of file to open, optionally including scheme
Throws:
java.io.IOException

openFileRandom

public static RandomDataInput openFileRandom(java.lang.String filename)
                                      throws java.io.IOException
Returns a RandomAccessFile implementation accessing the specified file

Throws:
java.io.IOException

writeFileRandom

public static RandomDataOutput writeFileRandom(java.lang.String filename)
                                        throws java.io.IOException
Returns a RandomAccessFile implementation accessing the specified file

Throws:
java.io.IOException

delete

public static boolean delete(java.lang.String filename)
Delete the named file. Returns false if the scheme of filename cannot be recognised, the file system does not have write capability, or the underlying file system could not delete the file.

Parameters:
filename - path to file to delete

deleteOnExit

public static boolean deleteOnExit(java.lang.String path)
Mark the named path as to be deleted on exit. Returns false if the scheme of the filename cannot be recognised, the filesystem does not have write capability, or the file system does not have deleteOnExit capability
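The guard conditions above can be pictured as a capability check. The sketch assumes capabilities are combined as bit flags. CapabilityCheck and its constant values are hypothetical and are not the real FSCapability constants.

```java
// Hypothetical sketch; flag values are chosen only for illustration.
class CapabilityCheck {
    static final int READ = 1;
    static final int WRITE = 2;
    static final int DEL_ON_EXIT = 4;

    /**
     * Returns true only when a file system was found (capabilities is not null)
     * and it has every required capability bit.
     */
    static boolean allowed(Integer capabilities, int required) {
        return capabilities != null && (capabilities & required) == required;
    }
}
```

Under this model, a null argument stands for an unrecognised scheme, and a file system with only READ and WRITE fails a check that also requires DEL_ON_EXIT.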


exists

public static boolean exists(java.lang.String path)
returns true iff the path exists


canRead

public static boolean canRead(java.lang.String filename)
returns true iff path can be read


canWrite

public static boolean canWrite(java.lang.String filename)
returns true iff path can be written


mkdir

public static boolean mkdir(java.lang.String path)
returns true if the specified path can be made as a directory


length

public static long length(java.lang.String filename)
returns the length of the file, or 0L if it cannot be found etc.


isDirectory

public static boolean isDirectory(java.lang.String path)
return true if path is a directory


rename

public static boolean rename(java.lang.String sourceFilename,
                             java.lang.String destFilename)
rename a file or directory. If the two are on different file systems, it is assumed to be a file


getParent

public static java.lang.String getParent(java.lang.String path)
What is the parent path to the specified path?


list

public static java.lang.String[] list(java.lang.String path)
List the contents of a directory


openFileReader

public static java.io.BufferedReader openFileReader(java.io.File file)
                                             throws java.io.IOException
Opens a reader to the file called file. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().

Parameters:
file - File to open.
Returns:
BufferedReader of the file
Throws:
java.io.IOException

openFileReader

public static java.io.BufferedReader openFileReader(java.io.File file,
                                                    java.lang.String charset)
                                             throws java.io.IOException
Opens a reader to the file called file. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().

Parameters:
file - File to open.
charset - Character set encoding of file. null for system default.
Returns:
BufferedReader of the file
Throws:
java.io.IOException

openFileReader

public static java.io.BufferedReader openFileReader(java.lang.String filename)
                                             throws java.io.IOException
Opens a reader to the file called filename. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().

Parameters:
filename - File to open.
Returns:
BufferedReader of the file
Throws:
java.io.IOException

openFileReader

public static java.io.BufferedReader openFileReader(java.lang.String filename,
                                                    java.lang.String charset)
                                             throws java.io.IOException
Opens a reader to the file called filename. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().

Parameters:
filename - File to open.
charset - Character set encoding of file. null for system default.
Returns:
BufferedReader of the file
Throws:
java.io.IOException
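The "null for system default" convention for charset can be sketched with plain JDK classes. CharsetIo and its methods are hypothetical names used only for this illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper, not part of Terrier.
class CharsetIo {
    /** Resolves a charset name, falling back to the platform default when name is null. */
    static Charset resolve(String name) {
        return name == null ? Charset.defaultCharset() : Charset.forName(name);
    }

    /** Opens a buffered reader that decodes in with the resolved charset. */
    static BufferedReader reader(InputStream in, String charset) {
        return new BufferedReader(new InputStreamReader(in, resolve(charset)));
    }

    /** Demo: decodes the first line of an in-memory byte array. */
    static String firstLine(byte[] data, String charset) {
        try (BufferedReader r = reader(new ByteArrayInputStream(data), charset)) {
            return r.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing an explicit charset keeps results stable across machines, whereas null makes the decoding depend on the platform default.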

openFileStream

public static java.io.InputStream openFileStream(java.io.File file)
                                          throws java.io.IOException
Opens an InputStream to a file called file.

Parameters:
file - File to open.
Returns:
InputStream of the file
Throws:
java.io.IOException

openFileRandom

public static RandomDataInput openFileRandom(java.io.File file)
                                      throws java.io.IOException
Open a file for random access reading

Throws:
java.io.IOException

openFileStream

public static java.io.InputStream openFileStream(java.lang.String filename)
                                          throws java.io.IOException
Opens an InputStream to a file called filename.

Parameters:
filename - File to open.
Returns:
InputStream of the file
Throws:
java.io.IOException

writeFileStream

public static java.io.OutputStream writeFileStream(java.io.File file)
                                            throws java.io.IOException
Opens an OutputStream to a file called file.

Parameters:
file - File to open.
Returns:
OutputStream of the file
Throws:
java.io.IOException

writeFileRandom

public static RandomDataOutput writeFileRandom(java.io.File file)
                                        throws java.io.IOException
Open a file for random access writing and reading

Throws:
java.io.IOException
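To show what random access buys over a sequential stream, here is a sketch using java.io.RandomAccessFile directly. It does not use Terrier's RandomDataInput or RandomDataOutput. RandomAccessSketch is a hypothetical class, and the file layout (three consecutive long values) is made up for the demo.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical demo built on the JDK's RandomAccessFile.
class RandomAccessSketch {
    /** Writes three longs to a temp file, then seeks straight to the one at index. */
    static long readLongAt(int index) {
        try {
            File f = File.createTempFile("random", ".dat");
            f.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                for (long v : new long[] {10L, 20L, 30L}) {
                    raf.writeLong(v);
                }
                // Each long occupies Long.BYTES (8) bytes, so the offset is index * 8.
                raf.seek((long) index * Long.BYTES);
                return raf.readLong();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```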

writeFileStream

public static java.io.OutputStream writeFileStream(java.lang.String filename)
                                            throws java.io.IOException
Opens an OutputStream to a file called filename.

Parameters:
filename - File to open.
Returns:
OutputStream of the file
Throws:
java.io.IOException

writeFileWriter

public static java.io.Writer writeFileWriter(java.io.File file)
                                      throws java.io.IOException
Opens a Writer to a file called file. System default encoding will be used.

Parameters:
file - File to open.
Returns:
Writer of the file
Throws:
java.io.IOException

writeFileWriter

public static java.io.Writer writeFileWriter(java.io.File file,
                                             java.lang.String charset)
                                      throws java.io.IOException
Opens a Writer to a file called file.

Parameters:
file - File to open.
charset - Character set encoding of file. null for system default.
Returns:
Writer of the file
Throws:
java.io.IOException

writeFileWriter

public static java.io.Writer writeFileWriter(java.lang.String filename)
                                      throws java.io.IOException
Opens a Writer to a file called filename. System default encoding will be used.

Parameters:
filename - File to open.
Returns:
Writer of the file
Throws:
java.io.IOException

writeFileWriter

public static java.io.Writer writeFileWriter(java.lang.String filename,
                                             java.lang.String charset)
                                      throws java.io.IOException
Opens a Writer to a file called filename.

Parameters:
filename - File to open.
charset - Character set encoding of file. null for system default.
Returns:
Writer of the file
Throws:
java.io.IOException

copyFile

public static java.lang.Long copyFile(java.lang.String srcFilename,
                                      java.lang.String destFilename)
                               throws java.io.IOException
Copy a file from srcFile to destFile.

Returns:
null if OK
Throws:
java.io.IOException - if there was a problem copying

copyFile

public static java.lang.Long copyFile(java.io.File srcFile,
                                      java.io.File destFile)
                               throws java.io.IOException
Copy a file from srcFile to destFile.

Returns:
null if OK
Throws:
java.io.IOException - if there was a problem copying

copyFile

public static java.lang.Long copyFile(java.io.InputStream in,
                                      java.io.OutputStream out)
                               throws java.io.IOException
Copy all bytes from in to out

Returns:
null if OK
Throws:
java.io.IOException - if there was a problem copying
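A stream-to-stream copy is usually a fixed-buffer loop, sketched below with JDK classes only. StreamCopy is a hypothetical class. Unlike the documented method, which returns null on success, this sketch returns the number of bytes copied purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not Terrier's copyFile.
class StreamCopy {
    /** Copies all bytes from in to out and returns how many were copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    /** Demo: copies an in-memory array through the loop above. */
    static byte[] copyBytes(byte[] data) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            copy(new ByteArrayInputStream(data), out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```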

createChecksum

public static java.lang.Long createChecksum(java.io.File file)
                                     throws java.io.IOException
Returns the CRC checksum of denoted file

Throws:
java.io.IOException
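A CRC checksum over a stream can be computed incrementally, as in the sketch below. It assumes the CRC-32 variant provided by java.util.zip.CRC32. Which variant createChecksum uses is not stated here, and ChecksumSketch is a hypothetical class.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;

// Hypothetical helper; assumes CRC-32 as implemented by java.util.zip.CRC32.
class ChecksumSketch {
    /** Feeds in through a CRC32 in fixed-size chunks and returns the checksum. */
    static long crc32(InputStream in) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            crc.update(buffer, 0, n);
        }
        return crc.getValue();
    }

    /** Convenience overload for in-memory data. */
    static long crc32(byte[] data) {
        try {
            return crc32(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The standard CRC-32 check value for the ASCII string "123456789" is 0xCBF43926, which is a convenient sanity test for any implementation of this variant.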

length

public static long length(java.io.File f)
returns the length of file f


main

public static void main(java.lang.String[] args)
Check that a specified file exists as per Terrier's file system abstraction layer



Terrier 3.5. Copyright © 2004-2011 University of Glasgow