Python File IO: A Whirlwind Tour - Super Fast Python

Last Updated on August 21, 2023

Manipulating files is perhaps one of the core activities in most Python programs.

Before we can explore how to manipulate files concurrently with threads or processes, we need to review the basics of how to perform file IO in Python.

In this tutorial, you will discover how to manipulate files in Python.

After completing this tutorial, you will know:

  • How to list, open, read, write, rename, delete, move and copy files in Python.
  • How to manipulate paths in Python, such as getting the basename, splitting off the file extension, and constructing paths.
  • How to create zip files, and add files to an archive and unzip files from an archive.

Let’s dive in.

File IO refers to manipulating files on a hard disk.

This typically includes reading and writing data to files, but also includes a host of related operations such as renaming files, copying files, and deleting files. It may also refer to operations such as zipping files into common archive formats.

File IO might be one of the most common operations performed in Python programs, or programs generally. Files are where we store data, and programs need data to be useful.

Python provides a number of modules for manipulating files.

The most common Python file IO modules include the following:

  • built-in: with functions such as the open() function for opening a file.
  • os: with functions such as makedirs(), remove(), rename(), and many more.
  • os.path: with functions such as basename(), join(), and many more.
  • shutil: with functions such as copy(), move(), and many more.

We cannot cover all of the functions or all of the operations, but we can take a whirlwind tour of some of the most common file IO operations.

How to List Files

The contents of a directory can be listed using the os.listdir() function.

The function takes a directory path as an argument and returns a list of the names of the files and directories it contains.

For example:

...

# list the contents of a directory

names = listdir('/')

The “.” and “..” entries that represent the current and parent directories are omitted from the results.

The names returned may be files or directories. They will not be sorted.

The example below lists the contents of the root directory on your machine.

# SuperFastPython.com

# list the contents of a directory

from os import listdir

# directory to list

directory = '/'

# report the contents of the directory

for name in listdir(directory):

    print(name)

Running the example will report the contents of the ‘/’ root directory on your machine.

On my machine, the output is as follows (your results will differ).

home

usr

bin

sbin

.file

etc

var

Library

System

.VolumeIcon.icns

private

.vol

Users

Applications

opt

dev

Volumes

tmp

cores

Alternatives to the os.listdir() function include os.scandir().
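As a rough sketch of the os.scandir() alternative, the snippet below lists a directory, labels each entry as a file or a directory, and sorts the result, since the order is otherwise arbitrary. The helper name list_directory() and the temporary directory contents are assumptions made for illustration only.

```python
import os
import tempfile

def list_directory(directory):
    """Return sorted (name, kind) pairs for the entries in a directory."""
    entries = []
    # scandir() yields DirEntry objects that know whether they are a directory
    with os.scandir(directory) as it:
        for entry in it:
            kind = 'dir' if entry.is_dir() else 'file'
            entries.append((entry.name, kind))
    # entries arrive in arbitrary order, so sort them by name
    return sorted(entries)

# use a throwaway directory so the listing is predictable
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'subdir'))
    with open(os.path.join(tmp, 'data.txt'), 'w', encoding='utf-8') as handle:
        handle.write('hello')
    listing = list_directory(tmp)
    print(listing)
```

The same loop with os.listdir() would need a separate os.path.isdir() call per name to tell files and directories apart.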

How to Open a File

Files are opened in Python using the open() built-in function.

The function takes a number of arguments, most notably the path of the file to open, the mode in which to open it, and the encoding used when reading from or writing to the file.

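Common mode string values include:

  • ‘r’: open in read mode (default).
  • ‘w’: open in write mode, truncating the file if it already exists.
  • ‘a’: open in append mode, writing to the end of the file.
  • ‘b’: open in binary mode.
  • ‘t’: open in text mode (default).
  • ‘x’: open in exclusive creation mode, failing if the file already exists.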

Modes can be combined. For example, ‘rb’ opens a file for reading (r) binary (b) data.

The encoding names the codec used to convert between text and bytes, such as ‘utf-8’, which is a common choice and is backward compatible with ASCII.

The open() function returns a file object or handle on which operations can be performed like reading and writing.

For example:

...

# open a file

handle = open('/path/to/file.ext', 'w', encoding='utf-8')

# ...

handle.close()

Operations on the file include methods such as read(), write(), and close(). It is important to close a file once you are finished using it, especially when writing to the file, as closing ensures any buffered output is written.

It is common to use the context manager when opening a file.

This involves using the “with” keyword, calling the function and specifying a name for the file handle.

For example:

...

# open a file

with open('/path/to/file.ext', 'w', encoding='utf-8') as handle:

    # ...

    pass

This creates a block in which file operations can be performed. Once the block is exited, normally or by a raised exception, then the file is closed automatically.

Given that the file is closed automatically, opening files with the context manager is the preferred way to open files.

The example below creates a new file but does not read or write any data.

# SuperFastPython.com

# create a new file

# create a new file in the current working directory

with open('new_file.txt', 'x', encoding='utf-8') as handle:

    pass

Running the example creates a new file with the name ‘new_file.txt’ in the current working directory (the directory from which the script is run).

If you try to create the file again after it already exists, the program will raise an error.

FileExistsError: [Errno 17] File exists: 'new_file.txt'
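If a script may be run more than once, one option is to catch the FileExistsError rather than let it stop the program. The sketch below assumes a hypothetical helper named create_if_missing() and works in a temporary directory so it leaves nothing behind.

```python
import os
import tempfile

def create_if_missing(path):
    """Create an empty file; return True if created, False if it existed."""
    try:
        # 'x' mode fails with FileExistsError if the path already exists
        with open(path, 'x', encoding='utf-8'):
            pass
        return True
    except FileExistsError:
        return False

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'new_file.txt')
    first = create_if_missing(target)   # file is created
    second = create_if_missing(target)  # file already exists
    print(first, second)
```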


How to Create a Directory

A directory can be created using the os.makedirs() function.

The function takes the path of the directory to create. If one or more directories in the provided path do not exist, they will be created.

The function also takes an argument exist_ok, which defaults to False, meaning that an error is raised if the final directory in the path already exists. The exist_ok argument can be set to True to ignore the case where the directory already exists, so that a script that creates directories can be run repeatedly without failing.

For example:

...

# create directories

makedirs('create/all/of/these/dirs', exist_ok=True)

The following example will create a new subdirectory in the current working directory.

# SuperFastPython.com

# create directories

from os import makedirs

# directory to create

path = 'tmp'

# create all directories in the path

makedirs(path, exist_ok=True)

Running the example creates a tmp/ subdirectory in the current working directory.

Alternatives to the os.makedirs() function include os.mkdir().
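The difference between the two functions can be seen in a small sketch: os.mkdir() creates only the final directory and fails when a parent is missing, whereas os.makedirs() creates the whole chain. The directory names 'a' and 'b' are placeholders, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, 'a', 'b')
    # mkdir() fails because the parent directory 'a' does not exist yet
    try:
        os.mkdir(nested)
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True
    # makedirs() creates the missing parent 'a' and then 'b'
    os.makedirs(nested, exist_ok=True)
    # a second call is harmless because exist_ok=True
    os.makedirs(nested, exist_ok=True)
    created = os.path.isdir(nested)
    print(mkdir_failed, created)
```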

How to Write To File

Files can be written to after being opened by calling the write() function.

A file must first be opened by calling the open() built-in function with a mode that permits writing, such as ‘w’. This function returns a file handle on which the write() function can be called.

For example:

...

# open a file for writing text

with open('path/to/file.ext', 'w', encoding='utf-8') as handle:

    # write a string to file

    handle.write('Hello world!')

If the file was opened in binary mode, then write() must take bytes, whereas if the file was opened in text mode then write() must take strings.

The example below writes a string to a file. If the file already exists, its content is replaced because we open the file in write mode ‘w’ instead of append mode ‘a’.

# SuperFastPython.com

# write a string to a file

# open a file for writing text

with open('new_file.txt', 'w', encoding='utf-8') as handle:

    # write string data to the file

    handle.write('We are writing to file')

Running the example creates a new file in the current working directory with the name “new_file.txt”.

Checking the file in a text editor, we can see the contents match the string that we wrote.

Alternatives to the write() function include writelines().
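The following sketch pulls together writelines(), append mode, and binary mode. The file names are placeholders and everything is written into a temporary directory. Note that writelines() does not add newlines, so each string carries its own.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, 'lines.txt')
    # writelines() writes each string as-is, without adding newlines
    with open(text_path, 'w', encoding='utf-8') as handle:
        handle.writelines(['first\n', 'second\n'])
    # append mode adds to the end of the file instead of replacing it
    with open(text_path, 'a', encoding='utf-8') as handle:
        handle.write('third\n')
    with open(text_path, 'r', encoding='utf-8') as handle:
        text = handle.read()

    binary_path = os.path.join(tmp, 'data.bin')
    # binary mode takes bytes and no encoding argument
    with open(binary_path, 'wb') as handle:
        written = handle.write(b'\x00\x01\x02')
    print(repr(text), written)
```

The write() call returns the number of characters (or bytes, in binary mode) written, which can be handy as a quick sanity check.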


How to Read From File

Files can be read from after being opened by calling the read() function.

A file must first be opened by calling the open() built-in function with a mode that permits reading, such as ‘r’. This function returns a file object on which the read() function can be called.

For example:

...

# open a file for reading text

with open('path/to/file.ext', 'r', encoding='utf-8') as handle:

    # read the contents of the file

    data = handle.read()

The example below opens the “/etc/services” file on a POSIX system and reports the length of the data in the file.

Note: if you are not on a POSIX system (e.g. macOS or Linux), change the path to another text file, such as the Python script itself.

# SuperFastPython.com

# read a file into memory

# the file to open

path = '/etc/services'

# open a file for reading text

with open(path, 'r', encoding='utf-8') as handle:

    # read the contents of the file as string

    data = handle.read()

    # report details about the content

    print(f'{path} has {len(data)} characters')

Running the example loads the file, reads the contents into memory as a string and reports the length of the string in terms of the number of characters.

/etc/services has 677972 characters

Alternatives to the read() function include readline() and readlines().
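For files that are large or naturally line-oriented, reading line by line is often more convenient than loading everything at once. The sketch below shows readline(), readlines(), and iterating over the file object directly. The sample content is made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'lines.txt')
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('alpha\nbeta\ngamma\n')

    with open(path, 'r', encoding='utf-8') as handle:
        # readline() returns the next line, including its newline
        first = handle.readline()
        # readlines() returns all remaining lines as a list
        rest = handle.readlines()

    with open(path, 'r', encoding='utf-8') as handle:
        # iterating over the handle yields one line at a time
        stripped = [line.rstrip('\n') for line in handle]

    print(repr(first), rest, stripped)
```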

How to Move Files

Files can be moved in Python using the shutil.move() function.

The shutil.move() function takes the source file path and the destination file path as arguments.

For example:

...

# move a file

move('src/file.ext', 'dst/file.ext')

If the source is a directory, then the move() function will move the directory and its contents to the destination.

If the destination is a file path whose name differs from the source file name, the file is renamed as part of the move.

If the destination is a directory, then the source will be moved under the destination directory.

The example below creates a new subdirectory and a file in the current directory, then moves the file under the subdirectory.

# SuperFastPython.com

# move a file

from os import makedirs

from shutil import move

# create a new file in the current working directory

with open('moving_file.txt', 'x', encoding='utf-8') as handle:

    pass

# create a new sub-directory in the current working directory

makedirs('tmp', exist_ok=True)

# move the file under the sub-directory

move('moving_file.txt', 'tmp')

Running the example first creates a new file in the current directory named “moving_file.txt” and a new subdirectory under the current working directory named ‘tmp/’. It then moves ‘./moving_file.txt’ to ‘tmp/moving_file.txt’.

Alternatives to the shutil.move() function include os.rename() and os.replace().
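To make the two destination cases concrete, the sketch below moves a file into a directory (keeping its name) and then moves it again to a full file path (renaming it). The file and directory names are placeholders. shutil.move() returns the destination path, which the example uses to track the file.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'report.txt')
    with open(src, 'w', encoding='utf-8') as handle:
        handle.write('data')
    archive = os.path.join(tmp, 'archive')
    os.makedirs(archive)

    # destination is a directory: the file keeps its name
    moved = shutil.move(src, archive)
    moved_name = os.path.relpath(moved, tmp)

    # destination is a full file path: the file is moved and renamed
    renamed = shutil.move(moved, os.path.join(tmp, 'final.txt'))
    renamed_name = os.path.basename(renamed)

    source_exists = os.path.exists(src)
    final_exists = os.path.exists(renamed)
    print(moved_name, renamed_name, source_exists, final_exists)
```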

How to Copy Files

Files can be copied in Python using the shutil.copy() function.

The shutil.copy() function takes a source file path and a destination file path.

If the destination file path is a directory, the source file will be copied into the destination directory.

The example below creates a new file in the current working directory and then copies it to a file with a different name in the current working directory.

# SuperFastPython.com

# copy a file

from shutil import copy

# create a new file in the current working directory

with open('copy_file.txt', 'x', encoding='utf-8') as handle:

    pass

# copy the file

copy('copy_file.txt', 'copy_file2.txt')

Running the example creates a file named “copy_file.txt” in the current working directory then copies it to a new file with the name “copy_file2.txt” in the current working directory.

Alternatives to the shutil.copy() function include os.sendfile(), shutil.copyfileobj(), shutil.copyfile(), and shutil.copy2().
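One practical difference among these alternatives is metadata: shutil.copy() copies the content and permission bits, while shutil.copy2() also attempts to preserve metadata such as the modification time. The sketch below backdates a source file to an arbitrary timestamp (an assumption chosen only to make the difference visible) and compares the two copies.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'source.txt')
    with open(src, 'w', encoding='utf-8') as handle:
        handle.write('payload')
    # backdate the source so a preserved timestamp is easy to spot
    old_time = 1_000_000_000
    os.utime(src, (old_time, old_time))

    # copy() gives the new file a fresh modification time
    plain = shutil.copy(src, os.path.join(tmp, 'plain.txt'))
    # copy2() attempts to carry the modification time across
    full = shutil.copy2(src, os.path.join(tmp, 'full.txt'))

    plain_mtime = os.path.getmtime(plain)
    full_mtime = os.path.getmtime(full)
    with open(full, 'r', encoding='utf-8') as handle:
        copied_text = handle.read()
    print(plain_mtime, full_mtime, copied_text)
```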

How to Rename Files

Files can be renamed in Python using the os.rename() function.

The os.rename() function takes a source file path and a destination file path.

For example:

...

# rename a file

rename('src.ext', 'dst.ext')

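If the destination file already exists, the behavior depends on the platform: on Windows a FileExistsError is raised, whereas on Unix-like systems an existing file is silently replaced if the user has permission.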

The example below will create a new file in the current working directory and will then rename it.

# SuperFastPython.com

# rename a file

from os import rename

# create a new file in the current working directory

with open('rename_file.txt', 'x', encoding='utf-8') as handle:

    pass

# rename the file

rename('rename_file.txt', 'test_file.txt')

Running the example creates a new file named “rename_file.txt” in the current working directory, then renames the file to “test_file.txt”.

Alternatives to the os.rename() function include os.renames(), os.replace(), and shutil.move().
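When overwriting an existing destination is the intent, os.replace() behaves the same way on all platforms. A minimal sketch, using placeholder file names in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'new.txt')
    dst = os.path.join(tmp, 'current.txt')
    with open(src, 'w', encoding='utf-8') as handle:
        handle.write('new content')
    with open(dst, 'w', encoding='utf-8') as handle:
        handle.write('old content')

    # replace() renames src to dst, overwriting the existing destination file
    os.replace(src, dst)

    with open(dst, 'r', encoding='utf-8') as handle:
        content = handle.read()
    source_exists = os.path.exists(src)
    print(content, source_exists)
```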

How to Delete Files

Files can be deleted in Python via the os.remove() function.

The os.remove() function takes the path of the file to delete.

For example:

...

# delete a file

remove('/path/to/file.ext')

If the file does not exist, an error is raised. Similarly, if the path is a directory an error is raised.

The example below creates a file in the current working directory, then deletes it.

# SuperFastPython.com

# delete a file

from os import remove

# create a new file in the current working directory

with open('delete_file.txt', 'x', encoding='utf-8') as handle:

    pass

# delete the file

remove('delete_file.txt')

Running the example creates a new file with the name “delete_file.txt” in the current working directory, then deletes it.

Alternatives to the os.remove() function include os.unlink(). Related functions for deleting directories include os.rmdir(), os.removedirs(), and shutil.rmtree().
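The sketch below shows one way to tolerate a missing file when deleting, and contrasts os.rmdir(), which only removes empty directories, with shutil.rmtree(), which removes a directory and everything under it. The helper remove_if_exists() and the directory layout are hypothetical.

```python
import os
import shutil
import tempfile

def remove_if_exists(path):
    """Delete a file; return True if deleted, False if it was not there."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False

with tempfile.TemporaryDirectory() as tmp:
    tree = os.path.join(tmp, 'tree')
    child = os.path.join(tree, 'child')
    os.makedirs(child)
    file_path = os.path.join(child, 'file.txt')
    with open(file_path, 'w', encoding='utf-8') as handle:
        handle.write('x')

    removed_first = remove_if_exists(file_path)   # file is deleted
    removed_second = remove_if_exists(file_path)  # nothing left to delete

    # rmdir() works here because 'child' is now empty
    os.rmdir(child)
    # rmtree() removes 'tree' and anything that remains under it
    shutil.rmtree(tree)
    tree_exists = os.path.exists(tree)
    print(removed_first, removed_second, tree_exists)
```

Because shutil.rmtree() deletes recursively and cannot be undone, it is worth double-checking the path passed to it.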

How to Get Path Base Name

The base filename can be retrieved in Python via the os.path.basename() function.

A path string to a file or directory may contain one or more directories. The file or directory at the end of the path is called the basename.

The os.path.basename() function takes a path string and returns the basename as a string.

For example:

...

# get the basename of a path

base = basename('/path/to/a/file.ext') # returns file.ext

The function operates on the string directly, so the path does not need to exist on disk.

The example below returns the basename of a contrived long path string.

# SuperFastPython.com

# get the basename of a path

from os.path import basename

# path to a file

path = '/this/is/the/path/to/a/file.ext'

# get the basename from the path

name = basename(path)

print(name)

Running the example reports the basename of the path which is the file component at the end of the path in this case.

Alternatives to the os.path.basename() function include os.path.split().
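As a short illustration of os.path.split(), the sketch below separates the same contrived path into its directory part and its basename in one call, and shows that a path ending in a separator has an empty basename.

```python
from os.path import basename, dirname, split

path = '/this/is/the/path/to/a/file.ext'

# split() returns the directory part and the basename together
head, tail = split(path)
print(head, tail)

# dirname() and basename() return the same two parts individually
same_parts = (dirname(path), basename(path)) == (head, tail)

# a trailing separator means there is no final component
empty = basename('/path/to/dir/')
print(same_parts, repr(empty))
```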

How to Join Paths

A file or directory can be joined to an existing path string via the os.path.join() function.

The os.path.join() function takes a path and one or more additional directories and/or files to join to the path.

For example:

...

# join a filename to a directory

path = join('/path/to/', 'file.ext') # returns /path/to/file.ext

Different operating systems use different characters to delimit directories and files in a path string, such as ‘/’ and ‘\’.

The join() function provides a platform agnostic way of constructing string paths in Python using the appropriate path delimiter for the platform on which the program is being run.

The example below will construct a path with a directory and ending in a file.

# SuperFastPython.com

# construct a path

from os.path import join

# create a path

name = join('tmp', 'file.ext')

print(name)

Running the example constructs a path using the delimiter that is appropriate for the platform on which the Python script is being run.

In this case, it is being run on macOS where paths are delimited by ‘/’.
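The join() function also accepts more than two components. The sketch below checks that the result matches joining with os.sep by hand, and uses the posixpath module (which applies POSIX rules on any platform) to show one behavior that can surprise: an absolute component discards everything before it. The directory names are placeholders.

```python
import os
import posixpath

# join() accepts any number of components
nested = os.path.join('tmp', 'reports', '2023', 'file.ext')
# the result uses the separator of the platform running the script
same_as_manual = nested == os.sep.join(['tmp', 'reports', '2023', 'file.ext'])

# with POSIX rules, an absolute component resets the path
reset = posixpath.join('tmp', '/etc', 'hosts')
print(nested, same_as_manual, reset)
```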

How to Split Filename and Extension

The name and extension parts of a filename can be separated using the os.path.splitext() function.

The os.path.splitext() function takes a path string, such as a filename, and returns a tuple that contains the name part and the extension part of the provided string.

The name part will include any directory elements and the extension part will include the ‘.’ extension delimiter.

For example:

...

# separate filename into name and extension

name, extension = splitext('filename.ext') # returns ('filename', '.ext')

The example below separates a filename into the name and extension parts.

# SuperFastPython.com

# separate a filename into name and extension

from os.path import splitext

# separate into name and extension

name, ext = splitext('filename.ext')

print(name, ext)

Running the example splits the filename ‘filename.ext’ into the name (‘filename’) and extension (‘.ext’) elements.
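A few edge cases are worth seeing once: only the last extension is split off, a leading dot is not treated as an extension, and a name without a dot has an empty extension. The sample names below are arbitrary.

```python
from os.path import splitext

examples = ['filename.ext', '/path/to/file.ext', 'archive.tar.gz', '.bashrc', 'README']

# map each sample name to its (name, extension) tuple
results = {name: splitext(name) for name in examples}

for name, (root, ext) in results.items():
    print(f'{name!r} -> name={root!r}, extension={ext!r}')
```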

How to Open a Zip File

A zip file can be opened by creating an instance of the zipfile.ZipFile class.

The zipfile.ZipFile class constructor takes the path to the zip file and the mode in which it is opened. Modes are similar to those used when opening files generally, such as ‘r’ for read mode and ‘w’ for write mode.

Once opened, operations can be performed on the ZipFile instance like adding or extracting files from the archive, and closing the file.

For example:

...

# open a zip file

handle = ZipFile('file.zip', 'r')

# ...

handle.close()

Once a file is open it must be closed, especially if files are being added to the archive.

The ZipFile class supports a context manager for opening and closing the file. Operations can be performed on the ZipFile instance within the context manager block and the file will be closed automatically once the block is exited, normally or otherwise.

As such, the context manager is the preferred way to open and use ZipFile instances.

For example:

...

# open a zip file

with ZipFile('file.zip', 'r') as handle:

    # ...

    pass

The example below creates a new zip file in the current working directory.

# SuperFastPython.com

# create a new zip file

from zipfile import ZipFile

# create a zip file in the current directory

with ZipFile('archive.zip', 'x') as handle:

    pass

Running the example creates a new zip file in the current working directory with the name “archive.zip” and no content.

How to Add Files to Zip

Files can be added to a zip file in Python via the write() function on an open ZipFile instance.

The write() function takes the path of the file to add to the archive.

For example:

...

# add a file to a zip archive

handle.write('/path/to/file.ext')

The example below creates a new file in the current working directory then adds that file to a new zip file archive.

# SuperFastPython.com

# add files to a zip file

from zipfile import ZipFile

# open a file for writing text

with open('new_file.txt', 'w', encoding='utf-8') as handle:

    # write string data to the file

    handle.write('We are writing to file')

# create a zip file in the current directory

with ZipFile('file_archive.zip', 'w') as handle:

    # add a text file to the archive

    handle.write('new_file.txt')

Running the example first creates a text file in the current working directory with the name “new_file.txt” that contains a string of data.

A new zip file is then created in the current working directory with the name “file_archive.zip” and the file “new_file.txt” is then added.

Alternatives to the write() function include the writestr() function and shutil.make_archive() function.
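The sketch below shows two variations on adding files: passing arcname to write() to control the name stored inside the archive, and using writestr() to add a member directly from a string. It also enables compression with ZIP_DEFLATED, which assumes the zlib module is available. The member names are placeholders.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, 'new_file.txt')
    with open(text_path, 'w', encoding='utf-8') as handle:
        handle.write('We are writing to file')

    zip_path = os.path.join(tmp, 'file_archive.zip')
    with ZipFile(zip_path, 'w', compression=ZIP_DEFLATED) as archive:
        # arcname sets the name stored in the archive instead of the disk path
        archive.write(text_path, arcname='docs/new_file.txt')
        # writestr() adds a member from in-memory data
        archive.writestr('notes.txt', 'created in memory')

    with ZipFile(zip_path, 'r') as archive:
        # namelist() reports the members in the order they were added
        names = archive.namelist()
    print(names)
```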

How to Extract a Zip File

Files can be extracted from a zip file in Python via the extract() function on an open ZipFile instance.

The extract() function takes the name of the member in the zip file to extract and, by default, extracts the file into the current working directory.

For example:

...

# unzip a file from a zip file

handle.extract('filename.ext')

The example below creates a new file in the current working directory, then adds that file to a new zip file archive, then extracts the same file from the archive into the current working directory.

# SuperFastPython.com

# extract files from a zip file

from zipfile import ZipFile

# open a file for writing text

with open('new_file.txt', 'w', encoding='utf-8') as handle:

    # write string data to the file

    handle.write('We are writing to file')

# create a zip file in the current directory

with ZipFile('file_archive.zip', 'w') as handle:

    # add a text file to the archive

    handle.write('new_file.txt')

# open the zip file for extracting files

with ZipFile('file_archive.zip', 'r') as handle:

    # extract the file from the archive, returning the path of the extracted file

    path = handle.extract('new_file.txt')

Running the example first creates a text file in the current working directory with the name “new_file.txt” that contains a string of data. A new zip file is then created in the current working directory with the name “file_archive.zip” and the file “new_file.txt” is then added.

Finally, the file “new_file.txt” is extracted from the archive “file_archive.zip” into the current working directory.

Alternatives to the extract() function include extractall() and read().
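As a sketch of those alternatives, read() returns the content of a member as bytes without writing anything to disk, and extractall() unpacks every member into a chosen directory. The archive here is built with made-up members in a temporary directory.

```python
import os
import tempfile
from zipfile import ZipFile

with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, 'file_archive.zip')
    with ZipFile(zip_path, 'w') as archive:
        archive.writestr('a.txt', 'first')
        archive.writestr('b.txt', 'second')

    out_dir = os.path.join(tmp, 'unzipped')
    with ZipFile(zip_path, 'r') as archive:
        # read() returns the member content as bytes
        raw = archive.read('a.txt')
        # extractall() unpacks every member under the given directory
        archive.extractall(path=out_dir)

    extracted = sorted(os.listdir(out_dir))
    print(raw, extracted)
```

Archives from untrusted sources deserve extra care, so it is worth inspecting member names before extracting them.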

Takeaways

In this tutorial, you discovered how to manipulate files in Python. Specifically, you learned:

  • How to list, open, read, write, rename, delete, move and copy files in Python.
  • How to manipulate paths in Python, such as getting the basename, splitting off the file extension, and constructing paths.
  • How to create zip files, and add files to an archive and unzip files from an archive.

Do you have any questions?
Leave your question in a comment below and I will reply fast with my best advice.

Photo by Paul Chambers on Unsplash