org.terrier.utility
Class Files

java.lang.Object
  extended by org.terrier.utility.Files

public class Files
extends Object

Utility class for opening readers/writers and input/output streams to files. Handles gzipped and bzipped files on the fly, i.e. if a file ends in ".gz" or ".GZ", then it will be opened using a GZIPInputStream/GZIPOutputStream. ".bz2" files are handled in a similar fashion. All returned Streams, Readers, Writers etc. are Buffered. If a charset encoding is not specified, then the system default is used. New interfaces are used to describe random data access.
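The extension-based behaviour described above can be illustrated with the JDK alone. The following is a minimal sketch, not Terrier's implementation: the class ExtensionAwareIo and its methods are hypothetical names, and only the ".gz"/".GZ" case is covered. Note that Files in this sketch refers to java.nio.file.Files, not to this class.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files; // java.nio.file.Files, NOT org.terrier.utility.Files
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not part of Terrier): chooses a gzip wrapper from the file extension. */
final class ExtensionAwareIo {
    private ExtensionAwareIo() {}

    /** True if the file name ends in ".gz" or ".GZ". */
    static boolean isGzip(Path p) {
        String name = p.getFileName().toString();
        return name.endsWith(".gz") || name.endsWith(".GZ");
    }

    /** Opens a buffered reader; a null charset means the system default. */
    static BufferedReader openReader(Path p, Charset cs) throws IOException {
        InputStream in = Files.newInputStream(p);
        if (isGzip(p)) {
            in = new GZIPInputStream(in); // decompress on the fly
        }
        return new BufferedReader(new InputStreamReader(in, cs == null ? Charset.defaultCharset() : cs));
    }

    /** Opens a buffered writer; a null charset means the system default. */
    static BufferedWriter openWriter(Path p, Charset cs) throws IOException {
        OutputStream out = Files.newOutputStream(p);
        if (isGzip(p)) {
            out = new GZIPOutputStream(out); // compress on the fly
        }
        return new BufferedWriter(new OutputStreamWriter(out, cs == null ? Charset.defaultCharset() : cs));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sketch", ".gz");
        try (BufferedWriter w = openWriter(tmp, StandardCharsets.UTF_8)) {
            w.write("hello");
        }
        try (BufferedReader r = openReader(tmp, StandardCharsets.UTF_8)) {
            System.out.println(r.readLine());
        }
        Files.delete(tmp);
    }
}
```

The caller never names a compression format; the file name alone decides which stream wrapper is used.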

FileSystem plugins

Additional file systems can be plugged into this module by calling the addFileSystemCapability() method. FileSystems have read and/or write capabilities, as specified using the FSCapability constants. Files using these external file systems should be denoted by scheme prefixes, e.g. ftp://, http://, etc. NB: file:// is the default scheme.
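To illustrate the idea of scheme prefixes with a default scheme, here is a small JDK-only sketch. It is not Terrier's code: SchemeRegistrySketch, its Handler interface and the READ/WRITE bit flags are hypothetical stand-ins for FileSystem and the FSCapability constants, whose actual definitions are not shown on this page.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch (not Terrier's code): maps URI scheme prefixes to pluggable handlers. */
final class SchemeRegistrySketch {
    static final String DEFAULT_SCHEME = "file";
    // Assumed capability flags; the real FSCapability constants may differ.
    static final int READ = 1;
    static final int WRITE = 2;

    /** Minimal stand-in for a pluggable file system. */
    interface Handler {
        String name();
        int capabilities();
    }

    private final Map<String, Handler> handlers = new HashMap<>();

    /** Registers a handler under a scheme such as "http" or "ftp". */
    void register(String scheme, Handler handler) {
        handlers.put(scheme.toLowerCase(Locale.ROOT), handler);
    }

    /** Extracts the scheme before "://", falling back to the default scheme. */
    static String schemeOf(String path) {
        int i = path.indexOf("://");
        return i > 0 ? path.substring(0, i).toLowerCase(Locale.ROOT) : DEFAULT_SCHEME;
    }

    /** Returns the handler for the path's scheme, or null if none is registered. */
    Handler lookup(String path) {
        return handlers.get(schemeOf(path));
    }

    /** True only if a handler exists and declares the WRITE capability. */
    boolean canWrite(String path) {
        Handler h = lookup(path);
        return h != null && (h.capabilities() & WRITE) != 0;
    }

    public static void main(String[] args) {
        System.out.println(schemeOf("http://example.org/index.html"));
        System.out.println(schemeOf("/tmp/data.txt"));
    }
}
```

A path without a scheme falls through to the default, which mirrors the note above that file:// is the default scheme.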



Additional Compression Support

Support for additional stream compression & decompression can be plugged in by calling addFilterInputStreamMapping().



File Caching

Terrier can cache files which will see heavy IO activity. In particular, files mentioned in the files.to.cache property will be cached to the default temporary folder. There are also API methods to populate the cache with files. For all methods, java.io.tmpdir is the default temporary directory. An IOException will occur if caching fails for some reason.
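The caching idea can be sketched with plain JDK calls: copy the file into a temporary folder and remember where the copy lives. This is an illustration under stated assumptions rather than Terrier's implementation; TempDirCacheSketch and resolve() are hypothetical, and unlike cacheFile() in this class the sketch returns the path of the cached copy.

```java
import java.io.IOException;
import java.nio.file.Files; // java.nio.file.Files, NOT org.terrier.utility.Files
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch (not Terrier's code): caches files by copying them to a temporary folder. */
final class TempDirCacheSketch {
    // original filename -> location of the cached copy
    private final Map<String, Path> cached = new ConcurrentHashMap<>();

    /** Caches to the directory named by the java.io.tmpdir system property. */
    Path cacheFile(String filename) throws IOException {
        return cacheFile(filename, System.getProperty("java.io.tmpdir"));
    }

    /** Caches to the given folder; any copy failure surfaces as an IOException. */
    Path cacheFile(String filename, String temporaryFolder) throws IOException {
        Path source = Paths.get(filename);
        Path copy = Files.createTempFile(Paths.get(temporaryFolder), "cache", "-" + source.getFileName());
        Files.copy(source, copy, StandardCopyOption.REPLACE_EXISTING);
        copy.toFile().deleteOnExit(); // do not leave cached copies behind
        cached.put(filename, copy);
        return copy;
    }

    /** Returns the cached copy if one exists, otherwise the original path. */
    Path resolve(String filename) {
        Path hit = cached.get(filename);
        return hit != null ? hit : Paths.get(filename);
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("source", ".txt");
        Files.write(src, "payload".getBytes());
        TempDirCacheSketch cache = new TempDirCacheSketch();
        System.out.println(cache.cacheFile(src.toString()));
        Files.delete(src);
    }
}
```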


Nested Class Summary
static interface Files.FSCapability
          constants declaring which capabilities a file system has
protected static class Files.PathTransformation
          a search regex and a replacement for path transformations
 
Field Summary
protected static String DEFAULT_SCHEME
          default scheme
protected static Map<String,FileSystem> fileSystems
          map of scheme to FileSystem implementation
protected static List<Files.PathTransformation> pathTransformations
          transformations to apply to a path
 
Constructor Summary
Files()
           
 
Method Summary
static void addFileSystemCapability(FileSystem fs)
          Add a file system to Terrier.
static void addFilterInputStreamMapping(String regex, Class<? extends InputStream> inputStreamClass, Class<? extends OutputStream> outputStreamClass)
          Add a filter mapping to the Files layer.
static void addPathTransormation(String find, String replace)
          add a static transformation to apply to a path.
static void cacheFile(String filename)
          Cache to the temporary directory specified by java.io.tmpdir System property.
static void cacheFile(String filename, String temporaryFolder)
          Cache file to specified temporary folder
static boolean canRead(String filename)
          returns true iff path can be read
static boolean canWrite(String filename)
          returns true iff path can be written
static Long copyFile(File srcFile, File destFile)
          Copy a file from srcFile to destFile.
static Long copyFile(InputStream in, OutputStream out)
          Copy all bytes from in to out
static Long copyFile(String srcFilename, String destFilename)
          Copy a file from srcFile to destFile.
static Long createChecksum(File file)
          Returns the CRC checksum of denoted file
static boolean delete(String filename)
          Delete the named file.
static boolean deleteOnExit(String path)
          Mark the named path as to be deleted on exit.
static boolean exists(String path)
          returns true iff the path exists
protected static FileSystem getFileSystem(String filename)
          derive the file system to use that is associated with the scheme in the specified filename.
static String getFileSystemName(String path)
          Get the name of the file system that would be used to access a given file or directory.
static String getParent(String path)
          Returns the parent path of the specified path.
protected static void initialise_mappings()
          initialise the default compression mappings
protected static void initialise_static_cache()
          caches any files that have been specified for immediate caching
protected static void intialise_transformations()
          initialise the transformations from Application property
static boolean isDirectory(String path)
          return true if path is a directory
static long length(File f)
          returns the length of file f
static long length(String filename)
          returns the length of the file, or 0L if the file cannot be found, etc.
static String[] list(String path)
          List the contents of a directory
static void main(String[] args)
          Check that a specified file exists as per Terrier's file system abstraction layer
static boolean mkdir(String path)
          returns true if the specified path can be made as a directory
protected static InputStream openFile(String filename)
          Opens an InputStream to a file called filename, using the file system associated with the scheme in the filename.
static RandomDataInput openFileRandom(File file)
          Open a file for random access reading
static RandomDataInput openFileRandom(String filename)
          Returns a RandomAccessFile implementation accessing the specified file
static BufferedReader openFileReader(File file)
          Opens a reader to the file called file.
static BufferedReader openFileReader(File file, String charset)
          Opens a reader to the file called file.
static BufferedReader openFileReader(String filename)
          Opens a reader to the file called filename.
static BufferedReader openFileReader(String filename, String charset)
          Opens a reader to the file called filename.
static InputStream openFileStream(File file)
          Opens an InputStream to a file called file.
static InputStream openFileStream(String filename)
          Opens an InputStream to a file called filename.
static boolean rename(String sourceFilename, String destFilename)
          rename a file or directory.
protected static String transform(String filename)
          apply any transformations to the specified filename
protected static OutputStream writeFile(String filename)
          Opens an OutputStream to a file called filename, using the filesystem named in the scheme component of the filename.
static RandomDataOutput writeFileRandom(File file)
          Open a file for random access writing and reading
static RandomDataOutput writeFileRandom(String filename)
          Returns a RandomAccessFile implementation accessing the specified file
static OutputStream writeFileStream(File file)
          Opens an OutputStream to a file called file.
static OutputStream writeFileStream(String filename)
          Opens an OutputStream to a file called filename.
static Writer writeFileWriter(File file)
          Opens a Writer to a file called file.
static Writer writeFileWriter(File file, String charset)
          Opens a Writer to a file called file.
static Writer writeFileWriter(String filename)
          Opens a Writer to a file called filename.
static Writer writeFileWriter(String filename, String charset)
          Opens a Writer to a file called filename.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

fileSystems

protected static final Map<String,FileSystem> fileSystems
map of scheme to FileSystem implementation


pathTransformations

protected static final List<Files.PathTransformation> pathTransformations
transformations to apply to a path


DEFAULT_SCHEME

protected static final String DEFAULT_SCHEME
default scheme

Constructor Detail

Files

public Files()
Method Detail

addFilterInputStreamMapping

public static void addFilterInputStreamMapping(String regex,
                                               Class<? extends InputStream> inputStreamClass,
                                               Class<? extends OutputStream> outputStreamClass)
Add a filter mapping to the Files layer. This is the method used to implement stream decompression. For example:
 addFilterInputStreamMapping(".+\\.gz$", GZIPInputStream.class, GZIPOutputStream.class);
 addFilterInputStreamMapping(".+\\.GZ$", GZIPInputStream.class, GZIPOutputStream.class);
 

Parameters:
regex - Regular expression that the filename must match to require the filter stream
inputStreamClass - Class extending InputStream that decompresses the file
outputStreamClass - Class extending OutputStream that compresses the file
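One way such a regex-to-class mapping could be applied is by reflection: when a filename matches, the registered class is instantiated around the raw stream. The sketch below is an assumption about the general technique, not Terrier's code; FilterMappingSketch, wrapInput() and wrapOutput() are hypothetical names, and it assumes each registered class has a public single-argument constructor taking an InputStream or OutputStream (as GZIPInputStream and GZIPOutputStream do).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch (not Terrier's code): wraps streams according to filename regexes. */
final class FilterMappingSketch {
    private static final class Mapping {
        final Pattern pattern;
        final Class<? extends InputStream> inputClass;
        final Class<? extends OutputStream> outputClass;

        Mapping(Pattern pattern, Class<? extends InputStream> in, Class<? extends OutputStream> out) {
            this.pattern = pattern;
            this.inputClass = in;
            this.outputClass = out;
        }
    }

    private final List<Mapping> mappings = new ArrayList<>();

    void add(String regex, Class<? extends InputStream> in, Class<? extends OutputStream> out) {
        mappings.add(new Mapping(Pattern.compile(regex), in, out));
    }

    /** Wraps raw in the first matching input class, or returns raw unchanged. */
    InputStream wrapInput(String filename, InputStream raw) throws IOException {
        for (Mapping m : mappings) {
            if (m.pattern.matcher(filename).matches()) {
                return construct(m.inputClass, InputStream.class, raw);
            }
        }
        return raw;
    }

    /** Wraps raw in the first matching output class, or returns raw unchanged. */
    OutputStream wrapOutput(String filename, OutputStream raw) throws IOException {
        for (Mapping m : mappings) {
            if (m.pattern.matcher(filename).matches()) {
                return construct(m.outputClass, OutputStream.class, raw);
            }
        }
        return raw;
    }

    // Calls the public one-argument constructor of the given class via reflection.
    private static <T> T construct(Class<? extends T> type, Class<?> argType, Object arg) throws IOException {
        try {
            return type.getConstructor(argType).newInstance(arg);
        } catch (ReflectiveOperationException e) {
            throw new IOException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) throws IOException {
        FilterMappingSketch sketch = new FilterMappingSketch();
        sketch.add(".+\\.gz$", GZIPInputStream.class, GZIPOutputStream.class);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = sketch.wrapOutput("demo.gz", sink)) {
            out.write("compressed".getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = sketch.wrapInput("demo.gz", new ByteArrayInputStream(sink.toByteArray()))) {
            byte[] buf = new byte[64];
            int n = in.read(buf);
            System.out.println(new String(buf, 0, n, StandardCharsets.UTF_8));
        }
    }
}
```

Registering classes rather than instances lets a fresh filter stream be created for every file that is opened.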

initialise_static_cache

protected static void initialise_static_cache()
caches any files that have been specified for immediate caching


intialise_transformations

protected static void intialise_transformations()
initialise the transformations from Application property


initialise_mappings

protected static void initialise_mappings()
initialise the default compression mappings


cacheFile

public static void cacheFile(String filename)
                      throws IOException
Cache to the temporary directory specified by java.io.tmpdir System property.

Throws:
IOException

cacheFile

public static void cacheFile(String filename,
                             String temporaryFolder)
                      throws IOException
Cache file to specified temporary folder

Throws:
IOException

addPathTransormation

public static void addPathTransormation(String find,
                                        String replace)
add a static transformation to apply to a path. Find and replace are both regular expressions
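A find/replace transformation of this kind can be sketched with java.util.regex. The example is illustrative only: PathTransformSketch is a hypothetical class, and applying every rule in registration order with replaceAll is an assumption, not a statement about how Terrier applies its transformations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch (not Terrier's code): applies regex find/replace rules to a path. */
final class PathTransformSketch {
    private static final class Rule {
        final Pattern find;
        final String replace;

        Rule(Pattern find, String replace) {
            this.find = find;
            this.replace = replace;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** find is a regular expression; replace may use group references such as $1. */
    void add(String find, String replace) {
        rules.add(new Rule(Pattern.compile(find), replace));
    }

    /** Applies every rule in registration order (an assumption of this sketch). */
    String transform(String path) {
        String result = path;
        for (Rule r : rules) {
            result = r.find.matcher(result).replaceAll(r.replace);
        }
        return result;
    }

    public static void main(String[] args) {
        PathTransformSketch t = new PathTransformSketch();
        t.add("^/data/", "/mnt/archive/");
        System.out.println(t.transform("/data/collection/doc1.txt"));
    }
}
```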


addFileSystemCapability

public static void addFileSystemCapability(FileSystem fs)
Add a file system to Terrier. File systems are denoted by URI scheme prefixes (e.g. http). The underlying file system is represented by a FileSystem.


transform

protected static String transform(String filename)
apply any transformations to the specified filename


getFileSystem

protected static FileSystem getFileSystem(String filename)
derive the file system to use that is associated with the scheme in the specified filename.

Parameters:
filename -

getFileSystemName

public static String getFileSystemName(String path)
Get the name of the file system that would be used to access a given file or directory.

Parameters:
path -
Returns:
Name of the file system, or null if no file system is found

openFile

protected static InputStream openFile(String filename)
                               throws IOException
Opens an InputStream to a file called filename, using the file system associated with the scheme in the filename.

Parameters:
filename - Filename of file to open
Throws:
IOException

writeFile

protected static OutputStream writeFile(String filename)
                                 throws IOException
Opens an OutputStream to a file called filename, using the filesystem named in the scheme component of the filename.

Parameters:
filename - Filename of file to open, optionally including scheme
Throws:
IOException

openFileRandom

public static RandomDataInput openFileRandom(String filename)
                                      throws IOException
Returns a RandomAccessFile implementation accessing the specified file

Throws:
IOException
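For readers unfamiliar with random access, the following JDK-only sketch shows the underlying idea using java.io.RandomAccessFile directly: seek to a byte offset, then read a value there. It does not use RandomDataInput, whose methods are not listed on this page; RandomAccessSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files; // java.nio.file.Files, NOT org.terrier.utility.Files
import java.nio.file.Path;

/** Hypothetical sketch: fixed-width records read by position with java.io.RandomAccessFile. */
final class RandomAccessSketch {
    private RandomAccessSketch() {}

    /** Writes the values as consecutive 4-byte big-endian ints, replacing any old content. */
    static void writeInts(Path file, int... values) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(0);
            for (int v : values) {
                raf.writeInt(v);
            }
        }
    }

    /** Reads the int at the given record index without scanning the preceding records. */
    static int readIntAt(Path file, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek((long) index * Integer.BYTES);
            return raf.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("ints", ".bin");
        writeInts(tmp, 10, 20, 30);
        System.out.println(readIntAt(tmp, 2));
        Files.delete(tmp);
    }
}
```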

writeFileRandom

public static RandomDataOutput writeFileRandom(String filename)
                                        throws IOException
Returns a RandomAccessFile implementation accessing the specified file

Throws:
IOException

delete

public static boolean delete(String filename)
Delete the named file. Returns false if the scheme of filename cannot be recognised, the filesystem does not have write capability, or the underlying filesystem could not delete the file

Parameters:
filename - path to file to delete

deleteOnExit

public static boolean deleteOnExit(String path)
Mark the named path as to be deleted on exit. Returns false if the scheme of the filename cannot be recognised, the filesystem does not have write capability, or the file system does not have deleteOnExit capability


exists

public static boolean exists(String path)
returns true iff the path exists


canRead

public static boolean canRead(String filename)
returns true iff path can be read


canWrite

public static boolean canWrite(String filename)
returns true iff path can be written


mkdir

public static boolean mkdir(String path)
returns true if the specified path can be made as a directory


length

public static long length(String filename)
returns the length of the file, or 0L if the file cannot be found, etc.


isDirectory

public static boolean isDirectory(String path)
return true if path is a directory


rename

public static boolean rename(String sourceFilename,
                             String destFilename)
rename a file or directory. If the two are on different file systems, it is assumed to be a file


getParent

public static String getParent(String path)
Returns the parent path of the specified path.


list

public static String[] list(String path)
List the contents of a directory


openFileReader

public static BufferedReader openFileReader(File file)
                                     throws IOException
Opens a reader to the file called file. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().

Parameters:
file - File to open.
Returns:
BufferedReader of the file
Throws:
IOException

openFileReader

public static BufferedReader openFileReader(File file,
                                            String charset)
                                     throws IOException
Opens a reader to the file called file. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().

Parameters:
file - File to open.
charset - Character set encoding of file. null for system default.
Returns:
BufferedReader of the file
Throws:
IOException

openFileReader

public static BufferedReader openFileReader(String filename)
                                     throws IOException
Opens a reader to the file called filename. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().

Parameters:
filename - File to open.
Returns:
BufferedReader of the file
Throws:
IOException

openFileReader

public static BufferedReader openFileReader(String filename,
                                            String charset)
                                     throws IOException
Opens a reader to the file called filename. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().

Parameters:
filename - File to open.
charset - Character set encoding of file. null for system default.
Returns:
BufferedReader of the file
Throws:
IOException

openFileStream

public static InputStream openFileStream(File file)
                                  throws IOException
Opens an InputStream to a file called file.

Parameters:
file - File to open.
Returns:
InputStream of the file
Throws:
IOException

openFileRandom

public static RandomDataInput openFileRandom(File file)
                                      throws IOException
Open a file for random access reading

Throws:
IOException

openFileStream

public static InputStream openFileStream(String filename)
                                  throws IOException
Opens an InputStream to a file called filename.

Parameters:
filename - File to open.
Returns:
InputStream of the file
Throws:
IOException

writeFileStream

public static OutputStream writeFileStream(File file)
                                    throws IOException
Opens an OutputStream to a file called file.

Parameters:
file - File to open.
Returns:
OutputStream of the file
Throws:
IOException

writeFileRandom

public static RandomDataOutput writeFileRandom(File file)
                                        throws IOException
Open a file for random access writing and reading

Throws:
IOException

writeFileStream

public static OutputStream writeFileStream(String filename)
                                    throws IOException
Opens an OutputStream to a file called filename.

Parameters:
filename - File to open.
Returns:
OutputStream of the file
Throws:
IOException

writeFileWriter

public static Writer writeFileWriter(File file)
                              throws IOException
Opens a Writer to a file called file. System default encoding will be used.

Parameters:
file - File to open.
Returns:
Writer of the file
Throws:
IOException

writeFileWriter

public static Writer writeFileWriter(File file,
                                     String charset)
                              throws IOException
Opens a Writer to a file called file.

Parameters:
file - File to open.
charset - Character set encoding of file. null for system default.
Returns:
Writer of the file
Throws:
IOException

writeFileWriter

public static Writer writeFileWriter(String filename)
                              throws IOException
Opens a Writer to a file called filename. System default encoding will be used.

Parameters:
filename - File to open.
Returns:
Writer of the file
Throws:
IOException

writeFileWriter

public static Writer writeFileWriter(String filename,
                                     String charset)
                              throws IOException
Opens a Writer to a file called filename.

Parameters:
filename - File to open.
charset - Character set encoding of file. null for system default.
Returns:
Writer of the file
Throws:
IOException

copyFile

public static Long copyFile(String srcFilename,
                            String destFilename)
                     throws IOException
Copy a file from srcFile to destFile.

Returns:
null if OK
Throws:
IOException - if there was a problem copying

copyFile

public static Long copyFile(File srcFile,
                            File destFile)
                     throws IOException
Copy a file from srcFile to destFile.

Returns:
null if OK
Throws:
IOException - if there was a problem copying

copyFile

public static Long copyFile(InputStream in,
                            OutputStream out)
                     throws IOException
Copy all bytes from in to out

Returns:
null if OK
Throws:
IOException - if there was a problem copying
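Copying all bytes from one stream to another is usually done with a fixed-size buffer. The sketch below shows that loop using only the JDK; StreamCopySketch is a hypothetical class, and unlike copyFile() as documented here it returns the number of bytes copied rather than null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch (not Terrier's code): buffered copy of all bytes from in to out. */
final class StreamCopySketch {
    private StreamCopySketch() {}

    /** Copies until end of stream and returns the byte count; the caller closes both streams. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "some bytes to copy".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied == data.length);
    }
}
```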

createChecksum

public static Long createChecksum(File file)
                           throws IOException
Returns the CRC checksum of denoted file

Throws:
IOException
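A CRC checksum of a file can be computed with the JDK's java.util.zip classes, as in the sketch below. This page does not say which CRC variant createChecksum() uses, so CRC-32 is an assumption here, and ChecksumSketch is a hypothetical class.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files; // java.nio.file.Files, NOT org.terrier.utility.Files
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

/** Hypothetical sketch (not Terrier's code): CRC-32 of a file's contents. */
final class ChecksumSketch {
    private ChecksumSketch() {}

    /** Streams the file through a CheckedInputStream and returns the CRC-32 value. */
    static long crc32(Path file) throws IOException {
        try (CheckedInputStream in = new CheckedInputStream(Files.newInputStream(file), new CRC32())) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // reading updates the checksum; the bytes themselves are not needed
            }
            return in.getChecksum().getValue();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("crc", ".txt");
        Files.write(tmp, "123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.println(Long.toHexString(crc32(tmp))); // prints "cbf43926", the standard CRC-32 check value
        Files.delete(tmp);
    }
}
```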

length

public static long length(File f)
returns the length of file f


main

public static void main(String[] args)
Check that a specified file exists as per Terrier's file system abstraction layer



Terrier 3.6. Copyright © 2004-2011 University of Glasgow