Class IOUtil


  • public class IOUtil
    extends Object
    Miscellaneous stateless static IO-oriented methods. Also used for utility methods that wrap or aggregate functionality in Java IO.
    • Constructor Detail

      • IOUtil

        public IOUtil()
    • Method Detail

      • setCompressionLevel

        public static void setCompressionLevel​(int compressionLevel)
        Sets the GZip compression level for subsequent GZIPOutputStream object creation.
        Parameters:
        compressionLevel - 0 <= compressionLevel <= 9
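To illustrate what a GZip compression level in the 0 to 9 range does, here is a minimal sketch that uses only the JDK. It is not the IOUtil implementation: `CompressionLevelSketch` and `LeveledGzipOutputStream` are hypothetical names, and the subclass-and-`def.setLevel` idiom is an assumption about one way to apply a level to a `GZIPOutputStream`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

class CompressionLevelSketch {

    /** Hypothetical helper: a GZIPOutputStream whose deflater uses an explicit level. */
    static final class LeveledGzipOutputStream extends GZIPOutputStream {
        LeveledGzipOutputStream(OutputStream out, int level) throws IOException {
            super(out);
            if (level < 0 || level > 9) {
                throw new IllegalArgumentException("level must be in 0..9, got " + level);
            }
            def.setLevel(level); // 'def' is the protected Deflater inherited from DeflaterOutputStream
        }
    }

    /** Compresses data at the given level and returns the size of the gzip output in bytes. */
    static int compressedSize(byte[] data, int level) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream gz = new LeveledGzipOutputStream(sink, level)) {
            gz.write(data);
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000]; // all zeros, so highly compressible
        System.out.println("level 0: " + compressedSize(data, 0) + " bytes");
        System.out.println("level 9: " + compressedSize(data, 9) + " bytes");
    }
}
```

Level 0 stores the data without compressing it, while level 9 trades CPU time for the smallest output.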
      • getCompressionLevel

        public static int getCompressionLevel()
      • toBufferedStream

        public static BufferedInputStream toBufferedStream​(InputStream stream)
        Wrap the given stream in a BufferedInputStream, if it isn't already wrapped.
        Parameters:
        stream - stream to be wrapped
        Returns:
        A BufferedInputStream wrapping stream, or stream itself if stream instanceof BufferedInputStream.
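The wrap-only-if-needed behaviour described above can be sketched with plain JDK classes. This is an illustrative sketch, not the library source; `ToBufferedStreamSketch` and `toBuffered` are hypothetical names.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

class ToBufferedStreamSketch {

    /** Returns the stream itself if it is already buffered, otherwise a new buffering wrapper. */
    static BufferedInputStream toBuffered(InputStream stream) {
        if (stream instanceof BufferedInputStream) {
            return (BufferedInputStream) stream;
        }
        return new BufferedInputStream(stream);
    }

    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});
        BufferedInputStream once = toBuffered(raw);
        BufferedInputStream twice = toBuffered(once);
        System.out.println("second call reused the wrapper: " + (once == twice));
    }
}
```

Checking the type first avoids stacking one buffer on top of another.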
      • transferByStream

        public static void transferByStream​(InputStream in,
                                            OutputStream out,
                                            long bytes)
        Transfers from the input stream to the output stream using stream operations and a buffer.
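A buffered transfer of this kind might look like the sketch below. It assumes that `bytes` is the number of bytes to move and that hitting end-of-stream early is an error; neither detail is stated above, and `TransferSketch` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class TransferSketch {

    /** Copies exactly 'bytes' bytes from in to out through a fixed-size buffer. */
    static void transfer(InputStream in, OutputStream out, long bytes) throws IOException {
        byte[] buffer = new byte[8192];
        long remaining = bytes;
        while (remaining > 0) {
            int toRead = (int) Math.min(buffer.length, remaining);
            int read = in.read(buffer, 0, toRead);
            if (read < 0) {
                // Assumption: running out of input before 'bytes' were copied is an error.
                throw new EOFException("stream ended with " + remaining + " bytes still to copy");
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transfer(new ByteArrayInputStream(new byte[100]), out, 40);
        System.out.println("copied " + out.size() + " bytes");
    }
}
```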
      • maybeBufferOutputStream

        public static OutputStream maybeBufferOutputStream​(OutputStream os)
        Returns:
        If Defaults.BUFFER_SIZE > 0, wrap os in BufferedOutputStream, else return os itself.
      • maybeBufferOutputStream

        public static OutputStream maybeBufferOutputStream​(OutputStream os,
                                                           int bufferSize)
        Returns:
        If bufferSize > 0, wrap os in BufferedOutputStream, else return os itself.
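The conditional wrapping shared by the maybeBuffer* methods can be sketched as follows. This is a minimal illustration using only the JDK; `MaybeBufferSketch` and `maybeBuffer` are hypothetical names.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

class MaybeBufferSketch {

    /** Wraps os in a BufferedOutputStream when bufferSize is positive, otherwise returns os unchanged. */
    static OutputStream maybeBuffer(OutputStream os, int bufferSize) {
        return bufferSize > 0 ? new BufferedOutputStream(os, bufferSize) : os;
    }

    public static void main(String[] args) {
        OutputStream raw = new ByteArrayOutputStream();
        System.out.println("size 0 returns the same object: " + (maybeBuffer(raw, 0) == raw));
        System.out.println("size 4096 wraps: " + (maybeBuffer(raw, 4096) instanceof BufferedOutputStream));
    }
}
```

A buffer size of zero therefore acts as a switch that turns buffering off without changing the calling code.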
      • maybeBufferedSeekableStream

        public static SeekableStream maybeBufferedSeekableStream​(File file)
      • maybeBufferedSeekableStream

        public static SeekableStream maybeBufferedSeekableStream​(URL url)
      • maybeBufferInputStream

        public static InputStream maybeBufferInputStream​(InputStream is)
        Returns:
        If Defaults.BUFFER_SIZE > 0, wrap is in BufferedInputStream, else return is itself.
      • maybeBufferInputStream

        public static InputStream maybeBufferInputStream​(InputStream is,
                                                         int bufferSize)
        Returns:
        If bufferSize > 0, wrap is in BufferedInputStream, else return is itself.
      • maybeBufferReader

        public static Reader maybeBufferReader​(Reader reader,
                                               int bufferSize)
      • maybeBufferReader

        public static Reader maybeBufferReader​(Reader reader)
      • maybeBufferWriter

        public static Writer maybeBufferWriter​(Writer writer,
                                               int bufferSize)
      • maybeBufferWriter

        public static Writer maybeBufferWriter​(Writer writer)
      • deleteFiles

        public static void deleteFiles​(File... files)
        Delete a list of files, and write a warning message if one could not be deleted.
        Parameters:
        files - Files to be deleted.
      • deleteFiles

        public static void deleteFiles​(Iterable<File> files)
      • deletePaths

        public static void deletePaths​(Path... paths)
      • deletePaths

        public static void deletePaths​(Iterable<Path> paths)
      • isRegularPath

        public static boolean isRegularPath​(File file)
        Returns:
        true if the path is not a device (e.g. /dev/null or /dev/stdin), and is not an existing directory, i.e. it is a regular path that may correspond to an existing file, or a path that could be a regular output file.
      • isRegularPath

        public static boolean isRegularPath​(Path path)
        Returns:
        true if the path is not a device (e.g. /dev/null or /dev/stdin), and is not an existing directory, i.e. it is a regular path that may correspond to an existing file, or a path that could be a regular output file.
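One way to express this rule with `java.nio.file.Files` is sketched below. It is an assumption about how the check could be written, not the library source; `RegularPathSketch` and `isRegular` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class RegularPathSketch {

    /**
     * A path counts as regular if it does not exist yet (it could become an output file)
     * or if it exists and is a regular file. Existing directories and devices are rejected.
     */
    static boolean isRegular(Path path) {
        return !Files.exists(path) || Files.isRegularFile(path);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("regular-sketch");
        Path file = Files.createTempFile(dir, "data", ".txt");
        System.out.println("directory: " + isRegular(dir));
        System.out.println("existing file: " + isRegular(file));
        System.out.println("missing file: " + isRegular(dir.resolve("missing.txt")));
        Files.delete(file);
        Files.delete(dir);
    }
}
```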
      • newTempFile

        public static File newTempFile​(String prefix,
                                       String suffix,
                                       File[] tmpDirs,
                                       long minBytesFree)
                                throws IOException
        Creates a new tmp file on one of the available temp filesystems, registers it for deletion on JVM exit and then returns it.
        Throws:
        IOException
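The selection logic described above (pick a temp directory with enough free space, create the file, register it for deletion) can be sketched with `java.io.File`. This is a hedged sketch: `NewTempFileSketch` is a hypothetical name, and throwing when no directory has enough space is an assumption, since the behaviour in that case is not stated here.

```java
import java.io.File;
import java.io.IOException;

class NewTempFileSketch {

    /** Creates a temp file in the first candidate directory that has at least minBytesFree usable bytes. */
    static File newTemp(String prefix, String suffix, File[] tmpDirs, long minBytesFree) throws IOException {
        for (File dir : tmpDirs) {
            if (dir.isDirectory() && dir.getUsableSpace() >= minBytesFree) {
                File f = File.createTempFile(prefix, suffix, dir);
                f.deleteOnExit(); // removed automatically when the JVM exits normally
                return f;
            }
        }
        // Assumption: fail loudly when no candidate qualifies.
        throw new IOException("no temp directory with at least " + minBytesFree + " bytes free");
    }

    public static void main(String[] args) throws IOException {
        File[] candidates = {new File(System.getProperty("java.io.tmpdir"))};
        File tmp = newTemp("sketch", ".tmp", candidates, 1024L);
        System.out.println("created " + tmp.getAbsolutePath());
    }
}
```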
      • newTempFile

        public static File newTempFile​(String prefix,
                                       String suffix,
                                       File[] tmpDirs)
                                throws IOException
        Creates a new tmp file on one of the potential filesystems that has at least 5GB free.
        Throws:
        IOException
      • getDefaultTmpDir

        public static File getDefaultTmpDir()
        Returns a default tmp directory.
      • newTempPath

        public static Path newTempPath​(String prefix,
                                       String suffix,
                                       Path[] tmpDirs,
                                       long minBytesFree)
                                throws IOException
        Creates a new tmp path on one of the available temp filesystems, registers it for deletion on JVM exit and then returns it.
        Throws:
        IOException
      • newTempPath

        public static Path newTempPath​(String prefix,
                                       String suffix,
                                       Path[] tmpDirs)
                                throws IOException
        Creates a new tmp path on one of the potential filesystems that has at least 5GB free.
        Throws:
        IOException
      • getDefaultTmpDirPath

        public static Path getDefaultTmpDirPath()
        Returns a default tmp directory as a Path.
      • basename

        public static String basename​(File f)
        Returns the name of the file minus the extension (i.e. text after the last "." in the filename).
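As an illustration of "name minus the extension", here is a small sketch. `BasenameSketch` is a hypothetical name, and leaving names with no "." or with only a leading "." unchanged is an assumption that the description above does not confirm.

```java
import java.io.File;

class BasenameSketch {

    /** Strips everything from the last '.' onward; names without a usable '.' are returned unchanged. */
    static String basename(File f) {
        String name = f.getName();
        int lastDot = name.lastIndexOf('.');
        // Assumption: a dot at index 0 (e.g. ".bashrc") is treated as part of the name.
        return lastDot > 0 ? name.substring(0, lastDot) : name;
    }

    public static void main(String[] args) {
        System.out.println(basename(new File("/data/reads.sorted.bam")));
        System.out.println(basename(new File("README")));
    }
}
```

Only the final extension is removed, so a name with several dots keeps all but its last segment.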
      • assertInputIsValid

        public static void assertInputIsValid​(String input)
        Checks that an input is non-null, is a URL or a file, exists, and, if it is a file, is not a directory and is readable. If any condition is false then a runtime exception is thrown.
        Parameters:
        input - the input to check for validity
      • isUrl

        public static boolean isUrl​(String input)
        Returns true iff the string is a URL. Helps distinguish URL inputs from file path inputs.
      • assertFileIsReadable

        public static void assertFileIsReadable​(File file)
        Checks that a file is non-null, exists, is not a directory and is readable. If any condition is false then a runtime exception is thrown.
        Parameters:
        file - the file to check for readability
      • assertFileIsReadable

        public static void assertFileIsReadable​(Path path)
        Checks that a file is non-null, exists, is not a directory and is readable. If any condition is false then a runtime exception is thrown.
        Parameters:
        path - the file to check for readability
      • assertFilesAreReadable

        public static void assertFilesAreReadable​(List<File> files)
        Checks that each file is non-null, exists, is not a directory and is readable. If any condition is false then a runtime exception is thrown.
        Parameters:
        files - the list of files to check for readability
      • assertPathsAreReadable

        public static void assertPathsAreReadable​(List<Path> paths)
        Checks that each path is non-null, exists, is not a directory and is readable. If any condition is false then a runtime exception is thrown.
        Parameters:
        paths - the list of paths to check for readability
      • assertInputsAreValid

        public static void assertInputsAreValid​(List<String> inputs)
        Checks that each string is non-null, exists or is a URL, and, if it is a file, is not a directory and is readable. If any condition is false then a runtime exception is thrown.
        Parameters:
        inputs - the list of files to check for readability
      • assertFileIsWritable

        public static void assertFileIsWritable​(File file)
        Checks that a file is non-null, and is either extant and writable, or non-existent but that the parent directory exists and is writable. If any condition is false then a runtime exception is thrown.
        Parameters:
        file - the file to check for writability
      • assertFilesAreWritable

        public static void assertFilesAreWritable​(List<File> files)
        Checks that each file is non-null, and is either extant and writable, or non-existent but that the parent directory exists and is writable. If any condition is false then a runtime exception is thrown.
        Parameters:
        files - the list of files to check for writability
      • assertDirectoryIsWritable

        public static void assertDirectoryIsWritable​(File dir)
        Checks that a directory is non-null, extant, writable and a directory; otherwise a runtime exception is thrown.
        Parameters:
        dir - the dir to check for writability
      • assertDirectoryIsWritable

        public static void assertDirectoryIsWritable​(Path dir)
        Checks that a directory is non-null, extant, writable and a directory; otherwise a runtime exception is thrown.
        Parameters:
        dir - the dir to check for writability
      • assertDirectoryIsReadable

        public static void assertDirectoryIsReadable​(File dir)
        Checks that a directory is non-null, extant, readable and a directory; otherwise a runtime exception is thrown.
        Parameters:
        dir - the dir to check for readability
      • assertFilesEqual

        public static void assertFilesEqual​(File f1,
                                            File f2)
        Checks that the two files are the same length, and have the same content, otherwise throws a runtime exception.
      • assertFileSizeNonZero

        public static void assertFileSizeNonZero​(File file)
        Checks that a file is of non-zero length
      • openFileForReading

        public static InputStream openFileForReading​(File file)
        Opens a file for reading, decompressing it if necessary
        Parameters:
        file - The file to open
        Returns:
        the input stream to read from
      • openFileForReading

        public static InputStream openFileForReading​(Path path)
        Opens a file for reading, decompressing it if necessary
        Parameters:
        path - The file to open
        Returns:
        the input stream to read from
      • openGzipFileForReading

        public static InputStream openGzipFileForReading​(File file)
        Opens a GZIP-encoded file for reading, decompressing it if necessary
        Parameters:
        file - The file to open
        Returns:
        the input stream to read from
      • openGzipFileForReading

        public static InputStream openGzipFileForReading​(Path path)
        Opens a GZIP-encoded file for reading, decompressing it if necessary
        Parameters:
        path - The file to open
        Returns:
        the input stream to read from
      • openFileForWriting

        public static OutputStream openFileForWriting​(File file)
        Opens a file for writing, overwriting the file if it already exists
        Parameters:
        file - the file to write to
        Returns:
        the output stream to write to
      • openFileForWriting

        public static OutputStream openFileForWriting​(File file,
                                                      boolean append)
        Opens a file for writing
        Parameters:
        file - the file to write to
        append - whether to append to the file if it already exists (we overwrite it if false)
        Returns:
        the output stream to write to
      • openFileForBufferedWriting

        public static BufferedWriter openFileForBufferedWriting​(File file,
                                                                boolean append)
        Preferred over PrintStream and PrintWriter because an exception is thrown on I/O error
      • openFileForBufferedWriting

        public static BufferedWriter openFileForBufferedWriting​(File file)
        Preferred over PrintStream and PrintWriter because an exception is thrown on I/O error
      • openFileForBufferedUtf8Writing

        public static BufferedWriter openFileForBufferedUtf8Writing​(File file)
        Preferred over PrintStream and PrintWriter because an exception is thrown on I/O error
      • openFileForBufferedUtf8Reading

        public static BufferedReader openFileForBufferedUtf8Reading​(File file)
        Opens a file for reading, decompressing it if necessary
        Parameters:
        file - The file to open
        Returns:
        the buffered reader to read from
      • openGzipFileForWriting

        public static OutputStream openGzipFileForWriting​(File file,
                                                          boolean append)
        Opens a GZIP encoded file for writing
        Parameters:
        file - the file to write to
        append - whether to append to the file if it already exists (we overwrite it if false)
        Returns:
        the output stream to write to
      • openFileForMd5CalculatingWriting

        public static OutputStream openFileForMd5CalculatingWriting​(File file)
      • copyStream

        public static void copyStream​(InputStream input,
                                      OutputStream output)
        Utility method to copy the contents of input to output. The caller is responsible for opening and closing both streams.
        Parameters:
        input - contents to be copied
        output - destination
      • copyFile

        public static void copyFile​(File input,
                                    File output)
        Copy input to output, overwriting output if it already exists.
      • getFilesMatchingRegexp

        public static File[] getFilesMatchingRegexp​(File directory,
                                                    String regexp)
        Parameters:
        directory -
        regexp -
        Returns:
        list of files matching regexp.
      • getFilesMatchingRegexp

        public static File[] getFilesMatchingRegexp​(File directory,
                                                    Pattern regexp)
      • deleteDirectoryTree

        public static boolean deleteDirectoryTree​(File fileOrDirectory)
        Delete the given file or directory. If a directory, all enclosed files and subdirs are also deleted.
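A recursive delete of this kind can be sketched with `java.io.File`. This is an illustrative sketch rather than the library code; `DeleteTreeSketch` is a hypothetical name, and returning false when any single delete fails is an assumption about what the boolean result means.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class DeleteTreeSketch {

    /** Deletes children first, then the entry itself; returns false if any delete failed. */
    static boolean deleteTree(File fileOrDirectory) {
        boolean ok = true;
        File[] children = fileOrDirectory.listFiles(); // null when this is not a directory
        if (children != null) {
            for (File child : children) {
                ok &= deleteTree(child);
            }
        }
        ok &= fileOrDirectory.delete();
        return ok;
    }

    public static void main(String[] args) throws IOException {
        File root = Files.createTempDirectory("tree-sketch").toFile();
        File sub = new File(root, "sub");
        sub.mkdir();
        new File(sub, "leaf.txt").createNewFile();
        System.out.println("deleted everything: " + deleteTree(root));
    }
}
```

Note that this simple version follows symbolic links to directories, which a production implementation would likely want to guard against.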
      • sizeOfTree

        public static long sizeOfTree​(File fileOrDirectory)
        Returns the size (in bytes) of the file or directory and all its children.
      • copyDirectoryTree

        public static void copyDirectoryTree​(File fileOrDirectory,
                                             File destination)
        Copies a directory tree (all subdirectories and files) recursively to a destination
      • createTempDir

        public static File createTempDir​(String prefix,
                                         String suffix)
        Create a temporary subdirectory in the default temporary-file directory, using the given prefix and suffix to generate the name. Note that this method is not completely safe, because it creates a temporary file, deletes it, and then creates a directory with the same name as the file. Should be good enough.
        Parameters:
        prefix - The prefix string to be used in generating the file's name; must be at least three characters long
        suffix - The suffix string to be used in generating the file's name; may be null, in which case the suffix ".tmp" will be used
        Returns:
        File object for new directory
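The create-file, delete, create-directory sequence described above, and the race it leaves open, can be sketched as follows. `TempDirSketch` and both method names are hypothetical; the second method shows `Files.createTempDirectory`, a JDK call that creates the directory in one step, for comparison only.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TempDirSketch {

    /** Mirrors the described approach: reserve a unique name as a file, then swap it for a directory. */
    static File createTempDirUnsafe(String prefix, String suffix) throws IOException {
        File tmp = File.createTempFile(prefix, suffix);
        if (!tmp.delete()) {
            throw new IOException("could not delete placeholder file " + tmp);
        }
        // Race window: another process could create something with this name right here.
        if (!tmp.mkdir()) {
            throw new IOException("could not create directory " + tmp);
        }
        return tmp;
    }

    /** Alternative that avoids the race by letting the JDK create the directory directly. */
    static Path createTempDirAtomic(String prefix) throws IOException {
        return Files.createTempDirectory(prefix);
    }

    public static void main(String[] args) throws IOException {
        File a = createTempDirUnsafe("sketch", ".dir");
        Path b = createTempDirAtomic("sketch");
        System.out.println(a + " is a directory: " + a.isDirectory());
        System.out.println(b + " is a directory: " + Files.isDirectory(b));
        a.delete();
        Files.delete(b);
    }
}
```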
      • openFileForBufferedReading

        public static BufferedReader openFileForBufferedReading​(File file)
        Checks that a file exists and is readable, and then returns a buffered reader for it.
      • openFileForBufferedReading

        public static BufferedReader openFileForBufferedReading​(Path path)
        Checks that a path exists and is readable, and then returns a buffered reader for it.
      • makeFileNameSafe

        public static String makeFileNameSafe​(String str)
        Takes a string and replaces any characters that are not safe for filenames with an underscore
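The description does not say which characters count as unsafe, so the sketch below assumes an allow-list of ASCII letters, digits, ".", "_" and "-". Both that set and the name `SafeNameSketch` are assumptions made for illustration only.

```java
class SafeNameSketch {

    /** Replaces every character outside the assumed safe set with an underscore. */
    static String makeSafe(String str) {
        // Assumed safe set: A-Z, a-z, 0-9, '.', '_' and '-'.
        return str.replaceAll("[^A-Za-z0-9._-]", "_");
    }

    public static void main(String[] args) {
        System.out.println(makeSafe("my file/name?.txt"));
    }
}
```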
      • fileSuffix

        public static String fileSuffix​(File f)
        Returns the name of the file extension (i.e. text after the last "." in the filename) including the .
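A sketch of extracting the suffix, including its leading ".", is shown below. `FileSuffixSketch` is a hypothetical name, and returning null when the name contains no "." is an assumption, since that case is not covered by the description.

```java
import java.io.File;

class FileSuffixSketch {

    /** Returns the text from the last '.' to the end of the name, or null if there is no '.'. */
    static String fileSuffix(File f) {
        String name = f.getName();
        int lastDot = name.lastIndexOf('.');
        return lastDot == -1 ? null : name.substring(lastDot);
    }

    public static void main(String[] args) {
        System.out.println(fileSuffix(new File("archive.tar.gz")));
        System.out.println(fileSuffix(new File("README")));
    }
}
```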
      • getFullCanonicalPath

        public static String getFullCanonicalPath​(File file)
        Returns the full path to the file with all symbolic links resolved
      • readFully

        public static String readFully​(InputStream in)
        Reads everything from an input stream as characters and returns a single String.
      • readLines

        public static IterableOnceIterator<String> readLines​(File f)
        Returns an iterator over the lines in a text file. The underlying resources are automatically closed when the iterator hits the end of the input, or manually by calling close().
        Parameters:
        f - a file that is to be read in as text
        Returns:
        an iterator over the lines in the text file
      • slurp

        public static String slurp​(InputStream is,
                                   Charset charSet)
        Reads all of the stream into a String, decoding with the provided Charset then closes the stream quietly.
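Read-everything-then-close-quietly can be sketched as below. This assumes Java 9 or later for `InputStream.readAllBytes`, and rethrowing read failures as an unchecked exception is also an assumption; `SlurpSketch` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SlurpSketch {

    /** Decodes the whole stream with the given charset, then closes it, ignoring close failures. */
    static String slurp(InputStream is, Charset charSet) {
        try {
            return new String(is.readAllBytes(), charSet);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // assumption: read errors surface as unchecked exceptions
        } finally {
            try {
                is.close();
            } catch (IOException ignored) {
                // "quietly": a failure to close is deliberately swallowed
            }
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "hello\nworld".getBytes(StandardCharsets.UTF_8);
        System.out.println(slurp(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
    }
}
```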
      • unrollFiles

        public static List<File> unrollFiles​(Collection<File> inputs,
                                             String... extensions)
        Go through the files provided. If a file has one of the provided file extensions, pass it into the output; otherwise assume that the file is a list of filenames and unfold it into the output.
      • unrollPaths

        public static List<Path> unrollPaths​(Collection<Path> inputs,
                                             String... extensions)
        Go through the files provided. If a file has one of the provided file extensions, pass it to the output; otherwise assume that the file is a list of filenames and unfold it into the output (recursively).
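The recursive unrolling can be sketched with `java.nio.file`. This is an illustration under stated assumptions: list files hold one path per line, blank lines are skipped, and there is no protection against a list that refers to itself. `UnrollSketch` is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class UnrollSketch {

    /** Expands list files into the paths they name, keeping paths that already match an extension. */
    static List<Path> unroll(Collection<Path> inputs, String... extensions) throws IOException {
        List<Path> output = new ArrayList<>();
        for (Path input : inputs) {
            unrollInto(input, output, extensions);
        }
        return output;
    }

    private static void unrollInto(Path path, List<Path> output, String[] extensions) throws IOException {
        String name = path.getFileName().toString();
        boolean matches = Arrays.stream(extensions).anyMatch(name::endsWith);
        if (matches) {
            output.add(path); // a real input: pass it through without opening it
            return;
        }
        // Otherwise treat the file as a list of paths, one per line (assumption).
        for (String line : Files.readAllLines(path)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                unrollInto(Paths.get(trimmed), output, extensions);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("unroll-sketch");
        Path list = dir.resolve("inputs.list");
        Files.write(list, Arrays.asList(dir.resolve("a.bam").toString(), dir.resolve("b.bam").toString()));
        System.out.println(unroll(Arrays.asList(list), ".bam"));
    }
}
```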
      • hasScheme

        public static boolean hasScheme​(String uriString)
        Check if the given URI has a scheme.
        Parameters:
        uriString - the URI to check
        Returns:
        true if the given URI has a scheme, false if not, or if the URI is malformed.
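The scheme check, including the "malformed means false" rule, can be sketched with `java.net.URI`. This is a minimal illustration, not the library source; `HasSchemeSketch` is a hypothetical name.

```java
import java.net.URI;
import java.net.URISyntaxException;

class HasSchemeSketch {

    /** True when the string parses as a URI and that URI carries a scheme such as "file" or "https". */
    static boolean hasScheme(String uriString) {
        try {
            return new URI(uriString).getScheme() != null;
        } catch (URISyntaxException e) {
            return false; // malformed input is reported as "no scheme"
        }
    }

    public static void main(String[] args) {
        System.out.println(hasScheme("file:///tmp/reads.bam"));
        System.out.println(hasScheme("/tmp/reads.bam"));
        System.out.println(hasScheme("not a uri"));
    }
}
```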
      • getPath

        public static Path getPath​(String uriString)
                            throws IOException
        Converts the given URI to a Path object. If the filesystem cannot be found in the usual way, then attempt to load the filesystem provider using the thread context classloader. This is needed when the filesystem provider is loaded using a URL classloader (e.g. in spark-submit).
        Parameters:
        uriString - the URI to convert
        Returns:
        the resulting Path
        Throws:
        IOException - an I/O error occurs creating the file system
      • toPath

        public static Path toPath​(File fileOrNull)
      • filesToPaths

        public static List<Path> filesToPaths​(Collection<File> files)
        Takes a list of Files and converts them to a list of Paths. Runs .toPath() on the contents of the input.
        Parameters:
        files - a List of Files to convert to Paths
        Returns:
        a new List containing the results of running toPath on the elements of the input
      • isGZIPInputStream

        public static boolean isGZIPInputStream​(InputStream stream)
        Test whether an input stream looks like a GZIP input. This identifies both gzip and bgzip streams as being GZIP.
        Parameters:
        stream - the input stream.
        Returns:
        true if `stream` starts with a gzip signature.
        Throws:
        IllegalArgumentException - if `stream` cannot mark or reset the stream
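A signature check that leaves the stream untouched can be sketched with mark and reset. The sketch below only compares the two gzip magic bytes (0x1f, 0x8b), which is an assumption about what "looks like GZIP" means, and unlike the method above it declares `IOException`. `GzipSniffSketch` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPOutputStream;

class GzipSniffSketch {

    /** Peeks at the first two bytes and rewinds, so the caller can still read from the start. */
    static boolean looksLikeGzip(InputStream stream) throws IOException {
        if (!stream.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        stream.mark(2);
        try {
            int first = stream.read();
            int second = stream.read();
            return first == 0x1f && second == 0x8b; // gzip magic number
        } finally {
            stream.reset();
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write("payload".getBytes());
        }
        System.out.println(looksLikeGzip(new ByteArrayInputStream(sink.toByteArray())));
        System.out.println(looksLikeGzip(new ByteArrayInputStream("plain text".getBytes())));
    }
}
```

Because bgzip blocks are themselves valid gzip members, they begin with the same two bytes and pass this check as well.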
      • addExtension

        public static Path addExtension​(Path path,
                                        String extension)
        Adds the extension to the given path.
        Parameters:
        path - the path to start from, eg. "/folder/file.jpg"
        extension - the extension to add, eg. ".bak"
        Returns:
        "/folder/file.jpg.bak"
      • isBlockCompressed

        public static boolean isBlockCompressed​(Path path,
                                                boolean checkExtension)
                                         throws IOException
        Checks if the provided path is block-compressed.

        Note that using checkExtension=true would avoid the cost of opening the file, but if hasBlockCompressedExtension(String) returns false this would not detect block-compressed files such as BAM.

        Parameters:
        path - file to check if it is block-compressed.
        checkExtension - if true, checks the extension before opening the file.
        Returns:
        true if the file is block-compressed; false otherwise.
        Throws:
        IOException - if there is an I/O error.
      • isBlockCompressed

        public static boolean isBlockCompressed​(Path path)
                                         throws IOException
        Checks if the provided path is block-compressed (including extension).

        Note that block-compressed file extensions BLOCK_COMPRESSED_EXTENSIONS are not checked by this method.

        Parameters:
        path - file to check if it is block-compressed.
        Returns:
        true if the file is block-compressed; false otherwise.
        Throws:
        IOException - if there is an I/O error.
      • hasBlockCompressedExtension

        public static boolean hasBlockCompressedExtension​(String fileName)
        Checks if a file ends in one of the BLOCK_COMPRESSED_EXTENSIONS.
        Parameters:
        fileName - string name for the file. May be an HTTP/S url.
        Returns:
        true if the file has a block-compressed extension; false otherwise.
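An extension check of this kind might look like the sketch below. The contents of BLOCK_COMPRESSED_EXTENSIONS are not listed on this page, so the set used here (".gz", ".gzip", ".bgz", ".bgzf") is an assumption, as is dropping the query string from HTTP/S URLs before comparing. `BlockExtensionSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

class BlockExtensionSketch {

    // Assumed stand-in for BLOCK_COMPRESSED_EXTENSIONS; the real values are not shown here.
    static final List<String> ASSUMED_EXTENSIONS = Arrays.asList(".gz", ".gzip", ".bgz", ".bgzf");

    /** True when the name, minus any HTTP/S query string, ends with one of the assumed extensions. */
    static boolean hasBlockCompressedExtension(String fileName) {
        String name = fileName;
        boolean isHttp = name.startsWith("http://") || name.startsWith("https://");
        int query = name.indexOf('?');
        if (isHttp && query >= 0) {
            name = name.substring(0, query); // assumption: ignore "?token=..." style suffixes on URLs
        }
        for (String extension : ASSUMED_EXTENSIONS) {
            if (name.endsWith(extension)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasBlockCompressedExtension("calls.vcf.gz"));
        System.out.println(hasBlockCompressedExtension("reads.bam"));
        System.out.println(hasBlockCompressedExtension("https://example.org/calls.vcf.bgz?token=1"));
    }
}
```

A name-only check like this cannot recognise block-compressed files whose extension is not in the set, such as BAM.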
      • hasBlockCompressedExtension

        public static boolean hasBlockCompressedExtension​(Path path)
        Checks if a path ends in one of the BLOCK_COMPRESSED_EXTENSIONS.
        Parameters:
        path - object to extract the name from.
        Returns:
        true if the path has a block-compressed extension; false otherwise.
      • hasBlockCompressedExtension

        public static boolean hasBlockCompressedExtension​(File file)
        Checks if a file ends in one of the BLOCK_COMPRESSED_EXTENSIONS.
        Parameters:
        file - object to extract the name from.
        Returns:
        true if the file has a block-compressed extension; false otherwise.
      • hasBlockCompressedExtension

        public static boolean hasBlockCompressedExtension​(URI uri)
        Checks if a file ends in one of the BLOCK_COMPRESSED_EXTENSIONS.
        Parameters:
        uri - file as a URI.
        Returns:
        true if the file has a block-compressed extension; false otherwise.