Class Files
- java.lang.Object
-
- org.terrier.utility.Files
-
public class Files extends java.lang.Object
Utility class for opening readers/writers and input/output streams to files. Handles gzipped and bzipped files on the fly, i.e. if a filename ends in ".gz" or ".GZ", the file is opened using a GZIPInputStream/GZIPOutputStream. ".bz2" files are handled in a similar fashion. All returned streams, readers and writers are buffered. If a charset encoding is not specified, the system default is used. Random data access is described by the RandomDataInput and RandomDataOutput interfaces.
FileSystem plugins
Additional file systems can be plugged into this module by calling the addFileSystemCapability() method. File systems have read and/or write capabilities, as specified using the FSCapability constants. Files on these external file systems should be denoted by scheme prefixes, e.g. ftp:// or http://. NB: file:// is the default scheme.
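The scheme-prefix convention can be illustrated with a minimal, JDK-only sketch. It is not Terrier's implementation: the class SchemeLookupSketch, its registry of names, and the "://" parsing rule are assumptions made for illustration. Only the idea of a scheme-to-file-system map with file as the default scheme comes from the text above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class SchemeLookupSketch {
    // Assumption: "file" stands in for the default scheme described above.
    public static final String DEFAULT_SCHEME = "file";

    // Hypothetical registry mapping a scheme to a file system name.
    private static final Map<String, String> FILE_SYSTEMS = new HashMap<>();
    static {
        FILE_SYSTEMS.put("file", "local");
        FILE_SYSTEMS.put("http", "http");
    }

    /** Returns the scheme prefix of a path, or the default scheme if none is present. */
    public static String schemeOf(String path) {
        int idx = path.indexOf("://");
        if (idx <= 0) {
            return DEFAULT_SCHEME;
        }
        return path.substring(0, idx).toLowerCase(Locale.ROOT);
    }

    /** Returns the registered file system name, or null if the scheme is unknown. */
    public static String fileSystemNameFor(String path) {
        return FILE_SYSTEMS.get(schemeOf(path));
    }

    public static void main(String[] args) {
        System.out.println(fileSystemNameFor("/tmp/data.txt"));
        System.out.println(fileSystemNameFor("http://example.org/data.txt"));
        System.out.println(fileSystemNameFor("ftp://example.org/data.txt"));
    }
}
```

In this sketch an unregistered scheme such as ftp yields null, mirroring the documented behaviour of getFileSystemName() when no file system is found.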
Additional Compression Support
Support for additional stream compression and decompression can be plugged in by calling addFilterInputStreamMapping().
File Caching
Terrier can cache files that will see heavy IO activity. In particular, files named in the files.to.cache property are cached to the default temporary folder. There are also API methods to populate the cache with files. For all methods, the java.io.tmpdir system property gives the default temporary directory. An IOException is thrown if caching fails for any reason.
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description):
static interface Files.FSCapability - constants declaring which capabilities a file system has
protected static class Files.PathTransformation - a search regex and a replacement for path transformations
-
Field Summary
Fields (modifier and type, field, description):
protected static java.lang.String DEFAULT_SCHEME - default scheme
protected static java.util.Map<java.lang.String,FileSystem> fileSystems - map of scheme to FileSystem implementation
protected static java.util.List<Files.PathTransformation> pathTransformations - transformations to apply to a path
-
Constructor Summary
Constructors (constructor, description):
Files()
-
Method Summary
All Methods (modifier and type, method, description):
static void addFileSystemCapability(FileSystem fs) - Add a file system to Terrier.
static void addFilterInputStreamMapping(java.lang.String regex, java.lang.Class<? extends java.io.InputStream> inputStreamClass, java.lang.Class<? extends java.io.OutputStream> outputStreamClass) - Add a filter mapping to the Files layer.
static void addPathTransormation(java.lang.String find, java.lang.String replace) - Add a static transformation to apply to a path.
static void cacheFile(java.lang.String filename) - Cache to the temporary directory specified by the java.io.tmpdir system property.
static void cacheFile(java.lang.String filename, java.lang.String temporaryFolder) - Cache file to the specified temporary folder.
static boolean canRead(java.lang.String filename) - Returns true iff the path can be read.
static boolean canWrite(java.lang.String filename) - Returns true iff the path can be written.
static java.lang.Long copyFile(java.io.File srcFile, java.io.File destFile) - Copy a file from srcFile to destFile.
static java.lang.Long copyFile(java.io.InputStream in, java.io.OutputStream out) - Copy all bytes from in to out.
static java.lang.Long copyFile(java.lang.String srcFilename, java.lang.String destFilename) - Copy a file from srcFilename to destFilename.
static java.lang.Long createChecksum(java.io.File file) - Returns the CRC checksum of the denoted file.
static boolean delete(java.lang.String filename) - Delete the named file.
static boolean deleteOnExit(java.lang.String path) - Mark the named path as to be deleted on exit.
static boolean exists(java.lang.String path) - Returns true iff the path exists.
protected static FileSystem getFileSystem(java.lang.String filename) - Derive the file system associated with the scheme in the specified filename.
static java.lang.String getFileSystemName(java.lang.String path) - Get the name of the file system that would be used to access a given file or directory.
static java.lang.String getParent(java.lang.String path) - Returns the parent path of the specified path.
protected static void initialise_mappings() - Initialise the default compression mappings.
protected static void initialise_static_cache() - Cache any files that have been specified for immediate caching.
protected static void intialise_transformations() - Initialise the transformations from the application property.
static boolean isDirectory(java.lang.String path) - Returns true if path is a directory.
static long length(java.io.File f) - Returns the length of file f.
static long length(java.lang.String filename) - Returns the length of the file, or 0L if it cannot be found or otherwise accessed.
static java.lang.String[] list(java.lang.String path) - List the contents of a directory.
static void main(java.lang.String[] args) - Check that a specified file exists as per Terrier's file system abstraction layer.
static boolean mkdir(java.lang.String path) - Returns true if the specified path can be made as a directory.
protected static java.io.InputStream openFile(java.lang.String filename) - Opens an InputStream to a file called filename.
static RandomDataInput openFileRandom(java.io.File file) - Open a file for random access reading.
static RandomDataInput openFileRandom(java.lang.String filename) - Returns a RandomAccessFile implementation accessing the specified file.
static java.io.BufferedReader openFileReader(java.io.File file) - Opens a reader to the file called file.
static java.io.BufferedReader openFileReader(java.io.File file, java.lang.String charset) - Opens a reader to the file called file.
static java.io.BufferedReader openFileReader(java.lang.String filename) - Opens a reader to the file called filename.
static java.io.BufferedReader openFileReader(java.lang.String filename, java.lang.String charset) - Opens a reader to the file called filename.
static java.io.InputStream openFileStream(java.io.File file) - Opens an InputStream to a file called file.
static java.io.InputStream openFileStream(java.lang.String filename) - Opens an InputStream to a file called filename.
static boolean rename(java.lang.String sourceFilename, java.lang.String destFilename) - Rename a file or directory.
protected static java.lang.String transform(java.lang.String filename) - Apply any transformations to the specified filename.
protected static java.io.OutputStream writeFile(java.lang.String filename) - Opens an OutputStream to a file called filename, using the file system named in the scheme component of the filename.
static RandomDataOutput writeFileRandom(java.io.File file) - Open a file for random access writing and reading.
static RandomDataOutput writeFileRandom(java.lang.String filename) - Returns a RandomAccessFile implementation accessing the specified file.
static java.io.OutputStream writeFileStream(java.io.File file) - Opens an OutputStream to a file called file.
static java.io.OutputStream writeFileStream(java.lang.String filename) - Opens an OutputStream to a file called filename.
static java.io.Writer writeFileWriter(java.io.File file) - Opens a Writer to a file called file.
static java.io.Writer writeFileWriter(java.io.File file, java.lang.String charset) - Opens a Writer to a file called file.
static java.io.Writer writeFileWriter(java.lang.String filename) - Opens a Writer to a file called filename.
static java.io.Writer writeFileWriter(java.lang.String filename, java.lang.String charset) - Opens a Writer to a file called filename.
-
-
-
Field Detail
-
fileSystems
protected static final java.util.Map<java.lang.String,FileSystem> fileSystems
map of scheme to FileSystem implementation
-
pathTransformations
protected static final java.util.List<Files.PathTransformation> pathTransformations
transformations to apply to a path
-
DEFAULT_SCHEME
protected static final java.lang.String DEFAULT_SCHEME
default scheme
-
-
Method Detail
-
addFilterInputStreamMapping
public static void addFilterInputStreamMapping(java.lang.String regex, java.lang.Class<? extends java.io.InputStream> inputStreamClass, java.lang.Class<? extends java.io.OutputStream> outputStreamClass)
Add a filter mapping to the Files layer. This is the method used to implement stream compression and decompression. For example:
addFilterInputStreamMapping(".+\\.gz$", GZIPInputStream.class, GZIPOutputStream.class);
addFilterInputStreamMapping(".+\\.GZ$", GZIPInputStream.class, GZIPOutputStream.class);
- Parameters:
regex - Regular expression that the filename must match to require the filter stream
inputStreamClass - Class extending InputStream that decompresses the file
outputStreamClass - Class extending OutputStream that compresses the file
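How a regex-to-stream-class mapping might select a decompressing stream can be sketched with the JDK alone. This is an illustration, not Terrier's code: the class FilterMappingSketch and its methods addMapping, wrap and roundTrip are hypothetical, and the sketch assumes each filter class has a public constructor taking a single InputStream. Only the input side is shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class FilterMappingSketch {
    // Hypothetical registry: filename pattern -> decompressing stream class.
    private static final Map<Pattern, Class<? extends InputStream>> INPUT_FILTERS =
            new LinkedHashMap<>();

    static {
        // Mirrors the ".gz" example from the method description.
        addMapping(".+\\.gz$", GZIPInputStream.class);
    }

    public static void addMapping(String regex, Class<? extends InputStream> inputStreamClass) {
        INPUT_FILTERS.put(Pattern.compile(regex), inputStreamClass);
    }

    /** Wraps raw in the first filter whose pattern matches filename, else returns raw. */
    public static InputStream wrap(String filename, InputStream raw) throws IOException {
        for (Map.Entry<Pattern, Class<? extends InputStream>> e : INPUT_FILTERS.entrySet()) {
            if (e.getKey().matcher(filename).matches()) {
                try {
                    // Assumption: the filter class has a public (InputStream) constructor.
                    return e.getValue().getConstructor(InputStream.class).newInstance(raw);
                } catch (ReflectiveOperationException ex) {
                    throw new IOException("Cannot create filter for " + filename, ex);
                }
            }
        }
        return raw;
    }

    /** Gzips text in memory, then reads it back through the mapping under a ".gz" name. */
    public static String roundTrip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = wrap("example.gz", new ByteArrayInputStream(buf.toByteArray()))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello"));
    }
}
```

A filename that matches no registered pattern gets the raw stream back unchanged, which corresponds to opening an uncompressed file.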
-
initialise_static_cache
protected static void initialise_static_cache()
Caches any files that have been specified for immediate caching.
-
intialise_transformations
protected static void intialise_transformations()
initialise the transformations from Application property
-
initialise_mappings
protected static void initialise_mappings()
initialise the default compression mappings
-
cacheFile
public static void cacheFile(java.lang.String filename) throws java.io.IOException
Cache to the temporary directory specified by the java.io.tmpdir system property.
- Throws:
java.io.IOException
-
cacheFile
public static void cacheFile(java.lang.String filename, java.lang.String temporaryFolder) throws java.io.IOException
Cache file to the specified temporary folder.
- Throws:
java.io.IOException
-
addPathTransormation
public static void addPathTransormation(java.lang.String find, java.lang.String replace)
Add a static transformation to apply to a path. Find and replace are both regular expressions.
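A find/replace transformation list of this kind can be sketched as follows. The class PathTransformSketch and its Rule type are hypothetical, and the sketch assumes java.util.regex replaceAll semantics (so "$1" in the replacement refers to a capture group). The source does not state how Terrier evaluates the replacement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class PathTransformSketch {
    /** Hypothetical pairing of a search regex with its replacement string. */
    private static final class Rule {
        final Pattern find;
        final String replace;

        Rule(String find, String replace) {
            this.find = Pattern.compile(find);
            this.replace = replace;
        }
    }

    private static final List<Rule> RULES = new ArrayList<>();

    /** Registers a rule; rules are applied in registration order. */
    public static void addRule(String find, String replace) {
        RULES.add(new Rule(find, replace));
    }

    /** Applies every registered rule to the path and returns the result. */
    public static String transform(String path) {
        String out = path;
        for (Rule r : RULES) {
            // Assumption: replaceAll semantics, so "$1" refers to capture group 1.
            out = r.find.matcher(out).replaceAll(r.replace);
        }
        return out;
    }

    public static void main(String[] args) {
        addRule("^/old/(.*)$", "/new/$1");
        System.out.println(transform("/old/index/data.gz"));
    }
}
```

With the rule shown in main, a path under /old/ is rewritten to the same relative location under /new/, while paths that do not match are returned unchanged.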
-
addFileSystemCapability
public static void addFileSystemCapability(FileSystem fs)
Add a file system to Terrier. File systems are denoted by URI scheme prefixes (e.g. http). The underlying file system is represented by a FileSystem.
-
transform
protected static java.lang.String transform(java.lang.String filename)
apply any transformations to the specified filename
-
getFileSystem
protected static FileSystem getFileSystem(java.lang.String filename)
Derive the file system associated with the scheme in the specified filename.
- Parameters:
filename - filename, optionally including a scheme
-
getFileSystemName
public static java.lang.String getFileSystemName(java.lang.String path)
Get the name of the file system that would be used to access a given file or directory.
- Parameters:
path - path of the file or directory
- Returns:
- Name of the file system, or null if no file system is found
-
openFile
protected static java.io.InputStream openFile(java.lang.String filename) throws java.io.IOException
Opens an InputStream to a file called filename.
- Parameters:
filename - Filename of file to open
- Throws:
java.io.IOException
-
writeFile
protected static java.io.OutputStream writeFile(java.lang.String filename) throws java.io.IOException
Opens an OutputStream to a file called filename, using the file system named in the scheme component of the filename.
- Parameters:
filename - Filename of file to open, optionally including scheme
- Throws:
java.io.IOException
-
openFileRandom
public static RandomDataInput openFileRandom(java.lang.String filename) throws java.io.IOException
Returns a RandomAccessFile implementation accessing the specified file.
- Throws:
java.io.IOException
-
writeFileRandom
public static RandomDataOutput writeFileRandom(java.lang.String filename) throws java.io.IOException
Returns a RandomAccessFile implementation accessing the specified file.
- Throws:
java.io.IOException
-
delete
public static boolean delete(java.lang.String filename)
Delete the named file. Returns false if the scheme of filename cannot be recognised, the file system does not have write capability, or the underlying file system could not delete the file.
- Parameters:
filename - path to file to delete
-
deleteOnExit
public static boolean deleteOnExit(java.lang.String path)
Mark the named path as to be deleted on exit. Returns false if the scheme of the filename cannot be recognised, the filesystem does not have write capability, or the file system does not have deleteOnExit capability
-
exists
public static boolean exists(java.lang.String path)
Returns true iff the path exists.
-
canRead
public static boolean canRead(java.lang.String filename)
returns true iff path can be read
-
canWrite
public static boolean canWrite(java.lang.String filename)
Returns true iff the path can be written.
-
mkdir
public static boolean mkdir(java.lang.String path)
Returns true if the specified path can be made as a directory.
-
length
public static long length(java.lang.String filename)
Returns the length of the file, or 0L if it cannot be found or otherwise accessed.
-
isDirectory
public static boolean isDirectory(java.lang.String path)
return true if path is a directory
-
rename
public static boolean rename(java.lang.String sourceFilename, java.lang.String destFilename)
Rename a file or directory. If the two are on different file systems, the source is assumed to be a file.
-
getParent
public static java.lang.String getParent(java.lang.String path)
Returns the parent path of the specified path.
-
list
public static java.lang.String[] list(java.lang.String path)
List the contents of a directory
-
openFileReader
public static java.io.BufferedReader openFileReader(java.io.File file) throws java.io.IOException
Opens a reader to the file called file. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().
- Parameters:
file - File to open.
- Returns:
- BufferedReader of the file
- Throws:
java.io.IOException
-
openFileReader
public static java.io.BufferedReader openFileReader(java.io.File file, java.lang.String charset) throws java.io.IOException
Opens a reader to the file called file. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().
- Parameters:
file - File to open.
charset - Character set encoding of file. null for system default.
- Returns:
- BufferedReader of the file
- Throws:
java.io.IOException
-
openFileReader
public static java.io.BufferedReader openFileReader(java.lang.String filename) throws java.io.IOException
Opens a reader to the file called filename. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().
- Parameters:
filename - File to open.
- Returns:
- BufferedReader of the file
- Throws:
java.io.IOException
-
openFileReader
public static java.io.BufferedReader openFileReader(java.lang.String filename, java.lang.String charset) throws java.io.IOException
Opens a reader to the file called filename. Provided for easy overriding for encoding support etc in child classes. Called from openNextFile().
- Parameters:
filename - File to open.
charset - Character set encoding of file. null for system default.
- Returns:
- BufferedReader of the file
- Throws:
java.io.IOException
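The "null for system default" convention for the charset parameter can be sketched with plain JDK classes. The class ReaderSketch and its open method are hypothetical and read from an arbitrary InputStream rather than through Terrier's file system layer. The sketch only illustrates how a null charset might fall back to the platform default.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderSketch {
    /**
     * Wraps the stream in a BufferedReader using the named charset,
     * or the platform default charset when charset is null.
     */
    public static BufferedReader open(InputStream in, String charset) {
        Charset cs = (charset == null) ? Charset.defaultCharset() : Charset.forName(charset);
        return new BufferedReader(new InputStreamReader(in, cs));
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "first line\nsecond line".getBytes(StandardCharsets.UTF_8);
        try (BufferedReader reader = open(new ByteArrayInputStream(bytes), "UTF-8")) {
            System.out.println(reader.readLine());
        }
    }
}
```

Passing an explicit charset is usually the safer choice when files move between machines, since the platform default can differ from one system to another.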
-
openFileStream
public static java.io.InputStream openFileStream(java.io.File file) throws java.io.IOException
Opens an InputStream to a file called file.
- Parameters:
file - File to open.
- Returns:
- InputStream of the file
- Throws:
java.io.IOException
-
openFileRandom
public static RandomDataInput openFileRandom(java.io.File file) throws java.io.IOException
Open a file for random access reading.
- Throws:
java.io.IOException
-
openFileStream
public static java.io.InputStream openFileStream(java.lang.String filename) throws java.io.IOException
Opens an InputStream to a file called filename.
- Parameters:
filename - File to open.
- Returns:
- InputStream of the file
- Throws:
java.io.IOException
-
writeFileStream
public static java.io.OutputStream writeFileStream(java.io.File file) throws java.io.IOException
Opens an OutputStream to a file called file.
- Parameters:
file - File to open.
- Returns:
- OutputStream of the file
- Throws:
java.io.IOException
-
writeFileRandom
public static RandomDataOutput writeFileRandom(java.io.File file) throws java.io.IOException
Open a file for random access writing and reading.
- Throws:
java.io.IOException
-
writeFileStream
public static java.io.OutputStream writeFileStream(java.lang.String filename) throws java.io.IOException
Opens an OutputStream to a file called filename.
- Parameters:
filename - File to open.
- Returns:
- OutputStream of the file
- Throws:
java.io.IOException
-
writeFileWriter
public static java.io.Writer writeFileWriter(java.io.File file) throws java.io.IOException
Opens a Writer to a file called file. System default encoding will be used.
- Parameters:
file - File to open.
- Returns:
- Writer of the file
- Throws:
java.io.IOException
-
writeFileWriter
public static java.io.Writer writeFileWriter(java.io.File file, java.lang.String charset) throws java.io.IOException
Opens a Writer to a file called file.
- Parameters:
file - File to open.
charset - Character set encoding of file. null for system default.
- Returns:
- Writer of the file
- Throws:
java.io.IOException
-
writeFileWriter
public static java.io.Writer writeFileWriter(java.lang.String filename) throws java.io.IOException
Opens a Writer to a file called filename. System default encoding will be used.
- Parameters:
filename - File to open.
- Returns:
- Writer of the file
- Throws:
java.io.IOException
-
writeFileWriter
public static java.io.Writer writeFileWriter(java.lang.String filename, java.lang.String charset) throws java.io.IOException
Opens a Writer to a file called filename.
- Parameters:
filename - File to open.
charset - Character set encoding of file. null for system default.
- Returns:
- Writer of the file
- Throws:
java.io.IOException
-
copyFile
public static java.lang.Long copyFile(java.lang.String srcFilename, java.lang.String destFilename) throws java.io.IOException
Copy a file from srcFilename to destFilename.
- Returns:
- null if OK
- Throws:
java.io.IOException - if there was a problem copying
-
copyFile
public static java.lang.Long copyFile(java.io.File srcFile, java.io.File destFile) throws java.io.IOException
Copy a file from srcFile to destFile.
- Returns:
- null if OK
- Throws:
java.io.IOException - if there was a problem copying
-
copyFile
public static java.lang.Long copyFile(java.io.InputStream in, java.io.OutputStream out) throws java.io.IOException
Copy all bytes from in to out.
- Returns:
- null if OK
- Throws:
java.io.IOException - if there was a problem copying
-
createChecksum
public static java.lang.Long createChecksum(java.io.File file) throws java.io.IOException
Returns the CRC checksum of the denoted file.
- Throws:
java.io.IOException
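Computing a CRC checksum over a file's contents can be sketched with the JDK's java.util.zip classes. The class ChecksumSketch and its crc32Of method are hypothetical, and the use of CRC-32 is an assumption: the description above says only "CRC" without naming a variant.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

public class ChecksumSketch {
    /** Streams the file through a CRC32 accumulator and returns the checksum value. */
    public static long crc32Of(Path file) throws IOException {
        try (CheckedInputStream in =
                     new CheckedInputStream(Files.newInputStream(file), new CRC32())) {
            byte[] buf = new byte[8192];
            // Reading to end of stream updates the checksum as a side effect.
            while (in.read(buf) != -1) {
                // nothing to do with the data itself
            }
            return in.getChecksum().getValue();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("checksum-sketch", ".txt");
        try {
            Files.write(tmp, "123456789".getBytes(StandardCharsets.US_ASCII));
            System.out.println(Long.toHexString(crc32Of(tmp)));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Streaming through a fixed-size buffer keeps memory use constant regardless of file size, which matters for large index files.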
-
length
public static long length(java.io.File f)
returns the length of file f
-
main
public static void main(java.lang.String[] args)
Checks that a specified file exists as per Terrier's file system abstraction layer.
-
-