Class Files
- java.lang.Object
-
- org.terrier.utility.Files
-
public class Files extends java.lang.Object
Utility class for opening readers/writers and input/output streams to files. Handles gzipped and bzipped files on the fly, i.e. if a file ends in ".gz" or ".GZ", then it will be opened using a GZIPInputStream/GZIPOutputStream. ".bz2" files are handled in a similar fashion. All returned Streams, Readers, Writers etc. are buffered. If a charset encoding is not specified, then the system default is used. Random data access is described by the RandomDataInput and RandomDataOutput interfaces.
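The extension-based handling described above can be illustrated with JDK classes alone. The following is a minimal sketch, not Terrier's implementation: the class ExtensionAwareStreams and its methods are hypothetical names, and bzip2 is left out because the JDK ships no bzip2 stream.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not part of Terrier): chooses a stream wrapper from the file extension. */
final class ExtensionAwareStreams {

    /** True if the name ends in ".gz" or ".GZ", the two suffixes named in the class description. */
    static boolean isGzip(String filename) {
        return filename.endsWith(".gz") || filename.endsWith(".GZ");
    }

    /** Opens a buffered InputStream, decompressing on the fly for gzip names. */
    static InputStream open(String filename) throws IOException {
        InputStream raw = new FileInputStream(filename);
        try {
            // GZIPInputStream reads the gzip header in its constructor and may throw.
            InputStream in = isGzip(filename) ? new GZIPInputStream(raw) : raw;
            return new BufferedInputStream(in);
        } catch (IOException e) {
            raw.close(); // do not leak the file handle if the header is invalid
            throw e;
        }
    }

    /** Opens a buffered OutputStream, compressing on the fly for gzip names. */
    static OutputStream write(String filename) throws IOException {
        OutputStream raw = new FileOutputStream(filename);
        try {
            OutputStream out = isGzip(filename) ? new GZIPOutputStream(raw) : raw;
            return new BufferedOutputStream(out);
        } catch (IOException e) {
            raw.close();
            throw e;
        }
    }
}
```

Closing the returned stream flushes the buffer and finishes the gzip trailer, so callers would normally use try-with-resources.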
FileSystem plugins
Additional file systems can be plugged into this module by calling the addFileSystemCapability() method. FileSystems have read and/or write capabilities, as specified using the FSCapability constants. Files using these external file systems should be denoted by scheme prefixes, e.g. ftp://, http:// etc. NB: file:// is the default scheme.
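To make the scheme-prefix idea concrete, here is a small sketch of a registry that maps a scheme to a set of capabilities and falls back to a default scheme when a path has no prefix. It is an illustration only: SchemeRegistry, the READ and WRITE flag values, and the literal "file" for the default scheme are assumptions, not the values used by Files.FSCapability or DEFAULT_SCHEME.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry (not Terrier's implementation) mapping URI schemes to capabilities. */
final class SchemeRegistry {
    // Capability bit flags; the names and values are assumptions made for this sketch.
    static final int READ = 1;
    static final int WRITE = 2;
    // Assumed value: the documentation only states that file:// is the default scheme.
    static final String DEFAULT_SCHEME = "file";

    private final Map<String, Integer> capabilities = new HashMap<>();

    /** Registers a scheme together with the capabilities it offers. */
    void register(String scheme, int caps) {
        capabilities.put(scheme, caps);
    }

    /** Returns the scheme before "://", or the default scheme if the path has no prefix. */
    static String schemeOf(String path) {
        int idx = path.indexOf("://");
        return idx > 0 ? path.substring(0, idx) : DEFAULT_SCHEME;
    }

    /** True if the scheme of the path is registered and offers the requested capability. */
    boolean supports(String path, int capability) {
        Integer caps = capabilities.get(schemeOf(path));
        return caps != null && (caps & capability) != 0;
    }
}
```

With this shape, an unknown scheme simply reports no capabilities, which mirrors how methods such as delete() are documented to return false when the scheme cannot be recognised.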
Additional Compression Support
Support for additional stream compression & decompression can be plugged in by calling addFilterInputStreamMapping().
File Caching
Terrier can cache files which will see heavy IO activity. In particular, files mentioned in the files.to.cache property will be cached to the default temporary folder. There are also API methods to populate the cache with files. For all methods, the directory named by the java.io.tmpdir system property is the default temporary directory. An IOException will occur if caching fails for some reason.
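The caching behaviour can be pictured as copying a file into a temporary folder and remembering where the copy lives. The sketch below assumes exactly that and nothing more: TempFileCache and resolve() are hypothetical names, and how Terrier actually tracks or names cached copies is not stated in this documentation. Note that Files in this block is java.nio.file.Files, not org.terrier.utility.Files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache (not Terrier's implementation): copies files into a temporary folder. */
final class TempFileCache {
    private final Map<String, Path> cached = new ConcurrentHashMap<>();

    /** Caches into the folder named by the java.io.tmpdir system property. */
    void cacheFile(String filename) throws IOException {
        cacheFile(filename, System.getProperty("java.io.tmpdir"));
    }

    /** Caches into the given folder; throws IOException if the source cannot be read or copied. */
    void cacheFile(String filename, String temporaryFolder) throws IOException {
        Path source = Paths.get(filename);
        if (!Files.isReadable(source)) {
            throw new IOException("Cannot read file to cache: " + filename);
        }
        Path copy = Files.createTempFile(Paths.get(temporaryFolder), "cached", ".tmp");
        Files.copy(source, copy, StandardCopyOption.REPLACE_EXISTING);
        copy.toFile().deleteOnExit(); // the copy is only useful for this JVM run
        cached.put(filename, copy);
    }

    /** Returns the cached copy if one exists, otherwise the original path. */
    Path resolve(String filename) {
        Path copy = cached.get(filename);
        return copy != null ? copy : Paths.get(filename);
    }
}
```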
-
-
Nested Class Summary
Nested Classes
Modifier and Type Class Description
static interface
Files.FSCapability
constants declaring which capabilities a file system has
protected static class
Files.PathTransformation
a search regex and a replacement for path transformations
-
Field Summary
Fields
Modifier and Type Field Description
protected static java.lang.String
DEFAULT_SCHEME
default scheme
protected static java.util.Map<java.lang.String,FileSystem>
fileSystems
map of scheme to FileSystem implementation
protected static java.util.List<Files.PathTransformation>
pathTransformations
transformations to apply to a path
-
Constructor Summary
Constructors
Constructor Description
Files()
-
Method Summary
Modifier and Type Method Description
static void
addFileSystemCapability(FileSystem fs)
Add a file system to Terrier.
static void
addFilterInputStreamMapping(java.lang.String regex, java.lang.Class<? extends java.io.InputStream> inputStreamClass, java.lang.Class<? extends java.io.OutputStream> outputStreamClass)
Add a filter mapping to the Files layer.
static void
addPathTransormation(java.lang.String find, java.lang.String replace)
add a static transformation to apply to a path.
static void
cacheFile(java.lang.String filename)
Cache to the temporary directory specified by the java.io.tmpdir System property.
static void
cacheFile(java.lang.String filename, java.lang.String temporaryFolder)
Cache file to the specified temporary folder
static boolean
canRead(java.lang.String filename)
returns true iff path can be read
static boolean
canWrite(java.lang.String filename)
returns true iff path can be written
static java.lang.Long
copyFile(java.io.File srcFile, java.io.File destFile)
Copy a file from srcFile to destFile.
static java.lang.Long
copyFile(java.io.InputStream in, java.io.OutputStream out)
Copy all bytes from in to out
static java.lang.Long
copyFile(java.lang.String srcFilename, java.lang.String destFilename)
Copy a file from srcFilename to destFilename.
static java.lang.Long
createChecksum(java.io.File file)
Returns the CRC checksum of the denoted file
static boolean
delete(java.lang.String filename)
Delete the named file.
static boolean
deleteOnExit(java.lang.String path)
Mark the named path as to be deleted on exit.
static boolean
exists(java.lang.String path)
returns true iff the path exists
protected static FileSystem
getFileSystem(java.lang.String filename)
derive the file system to use that is associated with the scheme in the specified filename.
static java.lang.String
getFileSystemName(java.lang.String path)
Get the name of the file system that would be used to access a given file or directory.
static java.lang.String
getParent(java.lang.String path)
What is the parent path to the specified path?
protected static void
initialise_mappings()
initialise the default compression mappings
protected static void
initialise_static_cache()
cache any files that were specified for immediate caching
protected static void
intialise_transformations()
initialise the transformations from Application property
static boolean
isDirectory(java.lang.String path)
return true if path is a directory
static long
length(java.io.File f)
returns the length of file f
static long
length(java.lang.String filename)
returns the length of the file, or 0L if it cannot be found etc.
static java.lang.String[]
list(java.lang.String path)
List the contents of a directory
static void
main(java.lang.String[] args)
Check that a specified file exists as per Terrier's file system abstraction layer
static boolean
mkdir(java.lang.String path)
returns true if the specified path can be made as a directory
protected static java.io.InputStream
openFile(java.lang.String filename)
Opens an InputStream to a file called filename.
static RandomDataInput
openFileRandom(java.io.File file)
Open a file for random access reading
static RandomDataInput
openFileRandom(java.lang.String filename)
Returns a RandomAccessFile implementation accessing the specified file
static java.io.BufferedReader
openFileReader(java.io.File file)
Opens a reader to the file called file.
static java.io.BufferedReader
openFileReader(java.io.File file, java.lang.String charset)
Opens a reader to the file called file.
static java.io.BufferedReader
openFileReader(java.lang.String filename)
Opens a reader to the file called filename.
static java.io.BufferedReader
openFileReader(java.lang.String filename, java.lang.String charset)
Opens a reader to the file called filename.
static java.io.InputStream
openFileStream(java.io.File file)
Opens an InputStream to a file called file.
static java.io.InputStream
openFileStream(java.lang.String filename)
Opens an InputStream to a file called filename.
static boolean
rename(java.lang.String sourceFilename, java.lang.String destFilename)
rename a file or directory.
protected static java.lang.String
transform(java.lang.String filename)
apply any transformations to the specified filename
protected static java.io.OutputStream
writeFile(java.lang.String filename)
Opens an OutputStream to a file called filename, using the filesystem named in the scheme component of the filename.
static RandomDataOutput
writeFileRandom(java.io.File file)
Open a file for random access writing and reading
static RandomDataOutput
writeFileRandom(java.lang.String filename)
Returns a RandomAccessFile implementation accessing the specified file
static java.io.OutputStream
writeFileStream(java.io.File file)
Opens an OutputStream to a file called file.
static java.io.OutputStream
writeFileStream(java.lang.String filename)
Opens an OutputStream to a file called filename.
static java.io.Writer
writeFileWriter(java.io.File file)
Opens a Writer to a file called file.
static java.io.Writer
writeFileWriter(java.io.File file, java.lang.String charset)
Opens a Writer to a file called file.
static java.io.Writer
writeFileWriter(java.lang.String filename)
Opens a Writer to a file called filename.
static java.io.Writer
writeFileWriter(java.lang.String filename, java.lang.String charset)
Opens a Writer to a file called filename.
-
-
-
Field Detail
-
fileSystems
protected static final java.util.Map<java.lang.String,FileSystem> fileSystems
map of scheme to FileSystem implementation
-
pathTransformations
protected static final java.util.List<Files.PathTransformation> pathTransformations
transformations to apply to a path
-
DEFAULT_SCHEME
protected static final java.lang.String DEFAULT_SCHEME
default scheme
-
-
Method Detail
-
addFilterInputStreamMapping
public static void addFilterInputStreamMapping(java.lang.String regex, java.lang.Class<? extends java.io.InputStream> inputStreamClass, java.lang.Class<? extends java.io.OutputStream> outputStreamClass)
Add a filter mapping to the Files layer. This is the method used to implement stream compression and decompression. For example:
addFilterInputStreamMapping(".+\\.gz$", GZIPInputStream.class, GZIPOutputStream.class);
addFilterInputStreamMapping(".+\\.GZ$", GZIPInputStream.class, GZIPOutputStream.class);
- Parameters:
regex
- Regular expression that the filename must match to require the filter stream
inputStreamClass
- Class extending InputStream that decompresses the file
outputStreamClass
- Class extending OutputStream that compresses the file
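One plausible way to apply such a mapping is to match the filename against each registered pattern and, on a match, wrap the raw stream by reflectively calling the filter class's single-argument constructor. The sketch below assumes that approach; FilterMappings, addMapping() and wrap() are hypothetical names, only the input side is shown, and the real lookup inside Files may differ.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.InvocationTargetException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch (not Terrier's implementation) of regex-driven stream wrapping. */
final class FilterMappings {
    // Pattern uses identity equality, so every registration is a distinct key;
    // LinkedHashMap keeps registration order for first-match-wins lookup.
    private final Map<Pattern, Class<? extends InputStream>> inputFilters = new LinkedHashMap<>();

    void addMapping(String regex, Class<? extends InputStream> inputStreamClass) {
        inputFilters.put(Pattern.compile(regex), inputStreamClass);
    }

    /** Wraps raw in the first filter whose regex matches the whole filename, else returns raw. */
    InputStream wrap(String filename, InputStream raw) throws IOException {
        for (Map.Entry<Pattern, Class<? extends InputStream>> e : inputFilters.entrySet()) {
            if (!e.getKey().matcher(filename).matches()) {
                continue;
            }
            try {
                // Assumes the filter class has a public constructor taking one InputStream.
                return e.getValue().getConstructor(InputStream.class).newInstance(raw);
            } catch (InvocationTargetException ex) {
                Throwable cause = ex.getCause();
                if (cause instanceof IOException) {
                    throw (IOException) cause; // e.g. an invalid gzip header
                }
                throw new IOException("Filter stream constructor failed for " + filename, cause);
            } catch (ReflectiveOperationException ex) {
                throw new IOException("Cannot instantiate " + e.getValue().getName(), ex);
            }
        }
        return raw;
    }
}
```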
-
initialise_static_cache
protected static void initialise_static_cache()
cache any files that were specified for immediate caching
-
intialise_transformations
protected static void intialise_transformations()
initialise the transformations from Application property
-
initialise_mappings
protected static void initialise_mappings()
initialise the default compression mappings
-
cacheFile
public static void cacheFile(java.lang.String filename) throws java.io.IOException
Cache to the temporary directory specified by the java.io.tmpdir System property.
- Throws:
java.io.IOException
-
cacheFile
public static void cacheFile(java.lang.String filename, java.lang.String temporaryFolder) throws java.io.IOException
Cache file to the specified temporary folder
- Throws:
java.io.IOException
-
addPathTransormation
public static void addPathTransormation(java.lang.String find, java.lang.String replace)
add a static transformation to apply to a path. Find and replace are both regular expressions
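A path transformation of this kind amounts to an ordered list of regex find/replace pairs applied one after another. The following sketch assumes every match is replaced (replaceAll) and that rules run in registration order; neither detail is confirmed by this documentation, and PathTransformer is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch (not Terrier's implementation) of ordered regex path rewriting. */
final class PathTransformer {

    /** One find/replace pair; replace may use group references such as $1. */
    private static final class Rule {
        final Pattern find;
        final String replace;

        Rule(Pattern find, String replace) {
            this.find = find;
            this.replace = replace;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    void add(String find, String replace) {
        rules.add(new Rule(Pattern.compile(find), replace));
    }

    /** Applies every rule in registration order; the output of one rule feeds the next. */
    String transform(String path) {
        String result = path;
        for (Rule rule : rules) {
            result = rule.find.matcher(result).replaceAll(rule.replace);
        }
        return result;
    }
}
```

For example, a rule with find "^/mnt/old/" and replace "/mnt/new/" would redirect every path under the old mount point without touching other paths.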
-
addFileSystemCapability
public static void addFileSystemCapability(FileSystem fs)
Add a file system to Terrier. File systems are denoted by URI scheme prefixes (e.g. http). The underlying file system is represented by a FileSystem.
-
transform
protected static java.lang.String transform(java.lang.String filename)
apply any transformations to the specified filename
-
getFileSystem
protected static FileSystem getFileSystem(java.lang.String filename)
derive the file system to use that is associated with the scheme in the specified filename.
- Parameters:
filename
-
-
getFileSystemName
public static java.lang.String getFileSystemName(java.lang.String path)
Get the name of the file system that would be used to access a given file or directory.
- Parameters:
path
-
- Returns:
- name of the file system, or null if no file system is found
-
openFile
protected static java.io.InputStream openFile(java.lang.String filename) throws java.io.IOException
Opens an InputStream to a file called filename.
- Parameters:
filename
- Filename of file to open
- Throws:
java.io.IOException
-
writeFile
protected static java.io.OutputStream writeFile(java.lang.String filename) throws java.io.IOException
Opens an OutputStream to a file called filename, using the filesystem named in the scheme component of the filename.
- Parameters:
filename
- Filename of file to open, optionally including scheme
- Throws:
java.io.IOException
-
openFileRandom
public static RandomDataInput openFileRandom(java.lang.String filename) throws java.io.IOException
Returns a RandomAccessFile implementation accessing the specified file
- Throws:
java.io.IOException
-
writeFileRandom
public static RandomDataOutput writeFileRandom(java.lang.String filename) throws java.io.IOException
Returns a RandomAccessFile implementation accessing the specified file
- Throws:
java.io.IOException
-
delete
public static boolean delete(java.lang.String filename)
Delete the named file. Returns false if the scheme of filename cannot be recognised, the filesystem does not have write capability, or the underlying filesystem could not delete the file.
- Parameters:
filename
- path to file to delete
-
deleteOnExit
public static boolean deleteOnExit(java.lang.String path)
Mark the named path as to be deleted on exit. Returns false if the scheme of the filename cannot be recognised, the filesystem does not have write capability, or the file system does not have deleteOnExit capability
-
exists
public static boolean exists(java.lang.String path)
returns true iff the path exists
-
canRead
public static boolean canRead(java.lang.String filename)
returns true iff path can be read
-
canWrite
public static boolean canWrite(java.lang.String filename)
returns true iff path can be written
-
mkdir
public static boolean mkdir(java.lang.String path)
returns true if the specified path can be made as a directory
-
length
public static long length(java.lang.String filename)
returns the length of the file, or 0L if it cannot be found etc.
-
isDirectory
public static boolean isDirectory(java.lang.String path)
return true if path is a directory
-
rename
public static boolean rename(java.lang.String sourceFilename, java.lang.String destFilename)
rename a file or directory. If the two are on different file systems, it is assumed to be a file
-
getParent
public static java.lang.String getParent(java.lang.String path)
What is the parent path to the specified path?
-
list
public static java.lang.String[] list(java.lang.String path)
List the contents of a directory
-
openFileReader
public static java.io.BufferedReader openFileReader(java.io.File file) throws java.io.IOException
Opens a reader to the file called file. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().
- Parameters:
file
- File to open.
- Returns:
- BufferedReader of the file
- Throws:
java.io.IOException
-
openFileReader
public static java.io.BufferedReader openFileReader(java.io.File file, java.lang.String charset) throws java.io.IOException
Opens a reader to the file called file. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().
- Parameters:
file
- File to open.
charset
- Character set encoding of file. null for system default.
- Returns:
- BufferedReader of the file
- Throws:
java.io.IOException
-
openFileReader
public static java.io.BufferedReader openFileReader(java.lang.String filename) throws java.io.IOException
Opens a reader to the file called filename. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().
- Parameters:
filename
- File to open.
- Returns:
- BufferedReader of the file
- Throws:
java.io.IOException
-
openFileReader
public static java.io.BufferedReader openFileReader(java.lang.String filename, java.lang.String charset) throws java.io.IOException
Opens a reader to the file called filename. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().
- Parameters:
filename
- File to open.
charset
- Character set encoding of file. null for system default.
- Returns:
- BufferedReader of the file
- Throws:
java.io.IOException
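The "null for system default" rule for the charset parameter can be sketched with plain JDK readers. This is an illustration of the convention only, under the assumption that a null charset simply selects the platform default; ReaderSketch and openReader() are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;

/** Hypothetical sketch (not Terrier's implementation) of charset selection for readers. */
final class ReaderSketch {

    /** Returns a buffered reader over in, decoding with charset or the platform default if null. */
    static BufferedReader openReader(InputStream in, String charset)
            throws UnsupportedEncodingException {
        InputStreamReader reader = (charset == null)
                ? new InputStreamReader(in)            // platform default charset
                : new InputStreamReader(in, charset);  // throws if the name is not supported
        return new BufferedReader(reader);
    }
}
```

Because UnsupportedEncodingException is a subclass of IOException, a caller that already handles IOException needs no extra catch clause.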
-
openFileStream
public static java.io.InputStream openFileStream(java.io.File file) throws java.io.IOException
Opens an InputStream to a file called file.
- Parameters:
file
- File to open.
- Returns:
- InputStream of the file
- Throws:
java.io.IOException
-
openFileRandom
public static RandomDataInput openFileRandom(java.io.File file) throws java.io.IOException
Open a file for random access reading
- Throws:
java.io.IOException
-
openFileStream
public static java.io.InputStream openFileStream(java.lang.String filename) throws java.io.IOException
Opens an InputStream to a file called filename.
- Parameters:
filename
- File to open.
- Returns:
- InputStream of the file
- Throws:
java.io.IOException
-
writeFileStream
public static java.io.OutputStream writeFileStream(java.io.File file) throws java.io.IOException
Opens an OutputStream to a file called file.
- Parameters:
file
- File to open.
- Returns:
- OutputStream of the file
- Throws:
java.io.IOException
-
writeFileRandom
public static RandomDataOutput writeFileRandom(java.io.File file) throws java.io.IOException
Open a file for random access writing and reading
- Throws:
java.io.IOException
-
writeFileStream
public static java.io.OutputStream writeFileStream(java.lang.String filename) throws java.io.IOException
Opens an OutputStream to a file called filename.
- Parameters:
filename
- File to open.
- Returns:
- OutputStream of the file
- Throws:
java.io.IOException
-
writeFileWriter
public static java.io.Writer writeFileWriter(java.io.File file) throws java.io.IOException
Opens a Writer to a file called file. System default encoding will be used.
- Parameters:
file
- File to open.
- Returns:
- Writer of the file
- Throws:
java.io.IOException
-
writeFileWriter
public static java.io.Writer writeFileWriter(java.io.File file, java.lang.String charset) throws java.io.IOException
Opens a Writer to a file called file.
- Parameters:
file
- File to open.
charset
- Character set encoding of file. null for system default.
- Returns:
- Writer of the file
- Throws:
java.io.IOException
-
writeFileWriter
public static java.io.Writer writeFileWriter(java.lang.String filename) throws java.io.IOException
Opens a Writer to a file called filename. System default encoding will be used.
- Parameters:
filename
- File to open.
- Returns:
- Writer of the file
- Throws:
java.io.IOException
-
writeFileWriter
public static java.io.Writer writeFileWriter(java.lang.String filename, java.lang.String charset) throws java.io.IOException
Opens a Writer to a file called filename.
- Parameters:
filename
- File to open.
charset
- Character set encoding of file. null for system default.
- Returns:
- Writer of the file
- Throws:
java.io.IOException
-
copyFile
public static java.lang.Long copyFile(java.lang.String srcFilename, java.lang.String destFilename) throws java.io.IOException
Copy a file from srcFilename to destFilename.
- Returns:
- null if OK
- Throws:
java.io.IOException
- if there was a problem copying
-
copyFile
public static java.lang.Long copyFile(java.io.File srcFile, java.io.File destFile) throws java.io.IOException
Copy a file from srcFile to destFile.
- Returns:
- null if OK
- Throws:
java.io.IOException
- if there was a problem copying
-
copyFile
public static java.lang.Long copyFile(java.io.InputStream in, java.io.OutputStream out) throws java.io.IOException
Copy all bytes from in to out.
- Returns:
- null if OK
- Throws:
java.io.IOException
- if there was a problem copying
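Copying all bytes between two streams is typically a fixed-buffer read/write loop, sketched below with JDK classes only. StreamCopy is a hypothetical name, and the byte count it returns is this sketch's own choice for illustration; it is not the value that Files.copyFile is documented to return.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch (not Terrier's implementation) of a buffered stream-to-stream copy. */
final class StreamCopy {

    /** Copies in to out until end of stream and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read); // write only the bytes actually read
            total += read;
        }
        out.flush();
        return total;
    }
}
```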
-
createChecksum
public static java.lang.Long createChecksum(java.io.File file) throws java.io.IOException
Returns the CRC checksum of the denoted file
- Throws:
java.io.IOException
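A CRC checksum of a stream can be computed by reading it through a CheckedInputStream. The sketch below assumes CRC-32, which is the CRC variant the JDK provides; the documentation above does not say which CRC createChecksum uses, and ChecksumSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

/** Hypothetical sketch (not Terrier's implementation); CRC-32 is an assumption. */
final class ChecksumSketch {

    /** Reads in to the end and returns the CRC-32 of every byte read. */
    static long crc32(InputStream in) throws IOException {
        CheckedInputStream checked = new CheckedInputStream(in, new CRC32());
        byte[] buffer = new byte[8192];
        while (checked.read(buffer) != -1) {
            // Nothing to do: the checksum is updated as a side effect of read().
        }
        return checked.getChecksum().getValue();
    }
}
```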
-
length
public static long length(java.io.File f)
returns the length of file f
-
main
public static void main(java.lang.String[] args)
Check that a specified file exists as per Terrier's file system abstraction layer
-
-