
How to Copy File in Python [5 Easy Ways]

It’s always a good idea to have a copy of a file. It helps you recover from any unintended operation on the original file.

It’s better if you have a copy of the file on a different machine, which protects you from the failure of one particular system. It’s best if you have multiple copies of the file on machines in different data centers.

In this blog post, we will cover the following:

  • Copy File in Python using shutil, os, subprocess libraries, and file operations
  • Copy File to server using scp command and paramiko library in Python
  • Copy File to S3 using boto3 library

Best Ways to copy file in python

There are quite a few options to copy a file in Python. We recommend using one of the shutil library methods below, depending on your requirement.

  • shutil.copyfile(): Use this when you want to copy only the file data, without the metadata and permissions of the source file.
  • shutil.copy(): This method copies both the file data and the permissions of the source file, but not other metadata such as the modification and access timestamps.
  • shutil.copy2(): Use this when you need the metadata and permissions of the source file along with the file data.
  • shutil.copyfileobj(): Use this only when you need neither metadata nor permissions and you have file objects rather than file paths.

If for some reason you want to copy files using shell commands in Python, use one of the methods below depending on your need.

  • subprocess.run(): Copies the file synchronously; the calling program waits until the subprocess completes.
  • subprocess.Popen(): For advanced cases. It copies the file asynchronously, so the calling program can work on other things while the subprocess copies the file.

shutil vs subprocess libraries:

1. shutil has various functions that cater to different needs, and all of them are simpler to use than the subprocess functions.

2. The same shutil methods work on all the major operating systems: Mac, Unix, and Windows. With the subprocess library, you may have to change the commands based on the operating system.

We have also listed methods that copy a file with the os library below. Avoid them in new code; they are kept mainly for older programs, and the subprocess module is the recommended replacement.

os library notes:

1. The os library functions popen2(), popen3(), and popen4() were removed in Python 3. The os.popen() and os.system() functions are still available, but the subprocess module is preferred.

2. It is recommended to use the subprocess library’s run() function or Popen class instead of the os library.

Copy File in Python – All the methods with examples

There are several options to copy file in Python. You can use shutil, os, subprocess libraries, and file operations to copy files from one directory to another in Python. We will see each method in detail below.

Method 1: shutil library

shutil is part of the Python standard library, so you don’t need to install it separately. It provides functions for file operations such as copying and removing files. You can import shutil with the below command.

import shutil

shutil.copyfile() function

The shutil copyfile() function can be used to copy a file in Python. It mainly takes the below two parameters.

Input Parameters:

  • src: String – accepts only file paths but not directory paths.
  • dst: String – accepts only file paths but not directory paths.

copyfile() notes:

1. The copyfile() function copies only the data, not the metadata (such as timestamps) of the source file.

2. It does not copy the permissions of the source file.

3. The copyfile() function creates the destination file if it doesn’t exist and writes the content to it. If the file already exists, its content is overwritten.

Code Thread:

import shutil


# shutil.copyfile method returns the path of destination file
res = shutil.copyfile('movies.csv', '/home/user1/Desktop/movies_copy1.csv')

print(res)

Output:

/home/user1/Desktop/movies_copy1.csv
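
The overwrite behaviour and the file-paths-only rule from the notes above can be checked with a minimal sketch like the one below. The temporary directory and the sample file contents are assumptions made for illustration; they are not part of the original example.

```python
import os
import shutil
import tempfile

# Work in a throwaway directory so no real files are touched (illustrative setup).
work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, 'movies.csv')
dst = os.path.join(work_dir, 'movies_copy1.csv')

with open(src, 'w') as f:
    f.write('title,year\nHeat,1995\n')
with open(dst, 'w') as f:
    f.write('old content that will be replaced\n')

# copyfile() overwrites the existing destination and returns the destination path.
result = shutil.copyfile(src, dst)

with open(dst) as f:
    copied_text = f.read()

# A directory is not a valid destination for copyfile().
try:
    shutil.copyfile(src, work_dir)
    directory_error = None
except OSError as exc:  # typically IsADirectoryError on Linux and macOS
    directory_error = exc

print(result)
print(type(directory_error).__name__)
```

Catching OSError here is a deliberate choice: the exact subclass raised for a directory destination can differ between operating systems.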

shutil.copy() function

The shutil.copy() function is similar to copyfile(). The copy() function also takes the below two parameters.

Input Parameters:

  • src: String – accepts only file paths but not directory paths.
  • dst: String – accepts both file and directory paths.

copy() notes:

1. The copy() function internally uses copyfile() and copymode() functions.

2. It uses the copymode() function to copy the permissions of the source file.

3. It copies the file data with the help of the copyfile() function but not other metadata such as timestamps. The file owner, group, and ACLs are not copied either.

Code Thread:

import shutil


# shutil.copy method returns the path of the destination file
res = shutil.copy('movies.csv', '/home/user1/Desktop/movies_copy2.csv')

print(res)

Output:

/home/user1/Desktop/movies_copy2.csv
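
Because copy() also accepts a directory as the destination, here is a small sketch of that case. The temporary directory and the backup folder name are hypothetical and only serve the demonstration.

```python
import os
import shutil
import tempfile

# Illustrative setup: a source file and an empty backup folder in a temp directory.
work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, 'movies.csv')
backup_dir = os.path.join(work_dir, 'backup')
os.mkdir(backup_dir)

with open(src, 'w') as f:
    f.write('title,year\n')

# When the destination is a directory, copy() keeps the source file name
# and returns the full path of the newly created file.
result = shutil.copy(src, backup_dir)

print(result)
```

The returned path ends with backup/movies.csv, so you can use it directly for follow-up operations on the copy.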

shutil.copy2() function

The shutil.copy2() function is identical to copy() function except that it copies metadata as well.

Input Parameters:

  • src: String – accepts only file paths but not directory paths.
  • dst: String – accepts both file and directory paths.

copy2() notes:

1. The copy2() function internally uses copyfile() and copystat() functions.

2. It copies not only the data but also preserves the metadata (such as the modification and access timestamps) of the source file.

3. It copies the permissions of the source file.

Code Thread:

import shutil


# shutil.copy2 method returns the path of the new file
res = shutil.copy2('movies.csv', 'movies_copy3.csv')

print(res)

Output:

movies_copy3.csv
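
To see the difference between copy() and copy2() in practice, the following sketch gives the source file an old modification time and then compares the two copies. The file names, the temporary directory, and the chosen timestamp are assumptions for illustration.

```python
import os
import shutil
import tempfile

# Illustrative setup in a throwaway directory.
work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, 'movies.csv')
with open(src, 'w') as f:
    f.write('title,year\n')

# Pretend the source was last modified in September 2001.
old_time = 1_000_000_000  # seconds since the epoch
os.utime(src, (old_time, old_time))

plain_copy = shutil.copy(src, os.path.join(work_dir, 'movies_copy.csv'))
full_copy = shutil.copy2(src, os.path.join(work_dir, 'movies_copy3.csv'))

# copy() leaves the copy with a fresh modification time,
# while copy2() carries the source timestamp over.
plain_mtime = int(os.path.getmtime(plain_copy))
full_mtime = int(os.path.getmtime(full_copy))

print(plain_mtime == old_time, full_mtime == old_time)
```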

shutil.copyfileobj() function

The shutil.copyfileobj() function is a lower-level function that accepts two file objects and copies the data from the source file object to the destination file object.

Input Parameters:

  • fsrc: file object type
  • fdst: file object type

copyfileobj() notes:

1. It copies only the data from the source file object to the destination, not the metadata (such as timestamps) of the source file.

2. It does not copy the permissions of the source file.

3. copyfileobj() does not return anything.

Code Thread:

import shutil


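# open() raises FileNotFoundError if the source file does not exist;
# the with statement closes both files once the copy is done
with open('movies.csv', 'rb') as src, open('movies_copy4.csv', 'wb') as dest:
    # raises AttributeError if you pass string paths instead of file objects
    shutil.copyfileobj(src, dest)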

Possible errors to look out for when working with shutil copy

Following are the possible errors to watch out for when copying files with the shutil library.

  • IsADirectoryError: When the source path is a directory instead of a file path.
  • SameFileError: When the source and destination are the same file.
  • PermissionError: When the destination directory is not writable. It is a subclass of OSError (IOError is an alias of OSError in Python 3).
  • AttributeError: Raised by copyfileobj() when a string path is given instead of a file object.
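
The errors listed above can be handled in one place. Below is a minimal sketch; the helper name safe_copy and its messages are hypothetical and not part of shutil.

```python
import shutil


def safe_copy(src, dst):
    """Copy src to dst with copy2() and return (ok, message) instead of raising."""
    try:
        shutil.copy2(src, dst)
    except shutil.SameFileError:
        return False, 'source and destination are the same file'
    except FileNotFoundError:
        return False, 'source file or destination folder does not exist'
    except IsADirectoryError:
        return False, 'source is a directory, not a file'
    except PermissionError:
        return False, 'permission denied'
    except OSError as exc:
        # Any other operating system level failure.
        return False, f'copy failed: {exc}'
    return True, 'copied'
```

The specific exceptions are listed before OSError because all of them are subclasses of OSError; placing the general handler first would hide the specific messages.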

Method 2: subprocess library

subprocess is a standard library module that provides functions to start a new process, run other programs in it, and control their outputs and errors.

With any of the below three functions/classes of subprocess, we can execute the commands to copy a file in Python.

  • call()
  • run()
  • Popen() – Popen is a class

Read more about the other functions in the documentation.

subprocess.run() function

The subprocess run() function can be used to copy a file in Python. It works synchronously, so the calling program waits until the child process completes.

Code thread for Ubuntu & Mac OS:

import subprocess

source = 'movies.csv'
destination = 'movies_copy9.csv'

# cp works in ubuntu and mac, you need to use "copy" for windows
args = ['cp', source, destination]

subprocess.run(args)

Code thread for Windows:

import subprocess

source = 'movies.csv'
destination = 'movies_copy9.csv'

# copy works in windows, you need to use "cp" for ubuntu and mac
args = ['copy', source, destination]

# "copy" is built into the Windows command shell, so shell=True is required
subprocess.run(args, shell=True)
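
If the same script has to run on more than one operating system, the two variants above can be combined. This is only a sketch: the helper name copy_with_shell_command is hypothetical, and it assumes cp is available on Unix-like systems and cmd on Windows.

```python
import os
import subprocess


def copy_with_shell_command(source, destination):
    """Copy a file with the platform's copy command; return True on success."""
    if os.name == 'nt':
        # "copy" is built into cmd.exe, so it is run through cmd /c
        args = ['cmd', '/c', 'copy', source, destination]
    else:
        args = ['cp', source, destination]

    try:
        # check=True raises CalledProcessError on a non-zero exit status
        subprocess.run(args, check=True, capture_output=True)
    except subprocess.CalledProcessError:
        return False
    return True
```

Passing check=True means a failed copy is not silently ignored, which plain subprocess.run(args) would do.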

subprocess.Popen class

The Popen class is meant for much more advanced cases. With it, the calling program starts the child process and continues its own work. There are also methods for checking on the child process, such as Popen.poll(), Popen.wait(), and Popen.communicate().

Code thread for Ubuntu & Mac OS:

import subprocess

source = 'movies.csv'
destination = 'movies_copy9.csv'

# cp works in ubuntu and mac, you need to use "copy" for windows
args = ['cp', source, destination]

subprocess.Popen(args)

Code thread for Windows:

import subprocess

source = 'movies.csv'
destination = 'movies_copy9.csv'

# copy works in windows, you need to use "cp" for ubuntu and mac
args = ['copy', source, destination]

# "copy" is built into the Windows command shell, so shell=True is required
subprocess.Popen(args, shell=True)
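
The status methods mentioned for Popen can be tried with a sketch like this one. To stay independent of the operating system, the child process here is a second Python interpreter that performs the copy; the helper name start_copy is hypothetical.

```python
import subprocess
import sys


def start_copy(source, destination):
    """Start a child Python process that copies the file; return the Popen object."""
    child_code = 'import shutil, sys; shutil.copyfile(sys.argv[1], sys.argv[2])'
    # Popen returns immediately; the copy runs in the child process.
    return subprocess.Popen([sys.executable, '-c', child_code, source, destination])
```

After calling start_copy(), the program can do other work, call poll() to check whether the child has finished (it returns None while the child is still running), and finally call wait() to get the exit code, where 0 means the copy succeeded.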

subprocess.call() function

The call() function belongs to the older subprocess API and is not recommended for new code. Use run() or Popen in place of call().

Code thread for Ubuntu & Mac OS:

import subprocess

source = 'movies.csv'
destination = 'movies_copy7.csv'

# cp works in ubuntu and mac, you need to use "copy" for windows
command = 'cp ' + source + ' ' + destination

subprocess.call(command, shell=True)

Code thread for Windows OS:

import subprocess

source = 'movies.csv'
destination = 'movies_copy7.csv'

# copy works in windows, you need to use "cp" for ubuntu and mac
command = 'copy ' + source + ' ' + destination

subprocess.call(command, shell=True)

Method 3: os library

The os library provides functions for operating system tasks like file creation and file removal in Python. It has popen() and system() functions to run commands on the command line.

Copying a file with the os library functions is discouraged in Python 3; the Python documentation recommends the subprocess module instead.

os popen() function

The os popen() function internally uses the subprocess.Popen class.

Code thread for Ubuntu & Mac OS:

import os

source = 'movies.csv'
destination = 'movies_copy5.csv'

# cp works in ubuntu and mac, you need to use "copy" for windows
command = 'cp ' + source + ' ' + destination

os.popen(command)

Code thread for Windows OS:

import os

source = 'movies.csv'
destination = 'movies_copy5.csv'

# copy works in windows, you need to use "cp" for ubuntu and mac
command = 'copy ' + source + ' ' + destination

os.popen(command)

os system() function

Code thread for Ubuntu & Mac OS:

import os

source = 'movies.csv'
destination = 'movies_copy6.csv'

# cp works in ubuntu and mac, you need to use "copy" for windows
command = 'cp ' + source + ' ' + destination

os.system(command)

Code thread for Windows OS:

import os

source = 'movies.csv'
destination = 'movies_copy6.csv'

# copy works in windows, you need to use "cp" for ubuntu and mac
command = 'copy ' + source + ' ' + destination

os.system(command)

Possible Errors:

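The os.system() and os.popen() functions do not raise a Python exception when the copy command fails, for example because a file path is invalid. The shell prints the error, os.system() returns a non-zero value, and the close() method of the object returned by os.popen() returns the exit status.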

Method 4: With file operations

Alternatively, you can use plain file operations. This is not a good approach for bigger files because it reads all lines into memory and then writes them to the destination file.

source = 'movies.csv'
destination = 'movies_copy8.csv'

# opening source and reading all lines as list of strings
with open(source) as source_file:
    lines = source_file.readlines()

# converting lines read to  one string and writing to destination file
with open(destination, 'w') as destination_file:
    lines_str = ''.join(lines)
    destination_file.write(lines_str)

Method 5: Using file operations with buffer

In this approach, we read one line at a time instead of reading all lines at once, and we hold the lines in a buffer until it reaches the buffer size. With the buffer in place, we can control the memory that the program uses.

Code Thread:

source = 'movies.csv'
destination = 'movies_copy11.csv'

# opening both files; iterating over the file object reads one line at a time
with open(source) as source_file, open(destination, 'w') as destination_file:
    # buffer to hold lines temporarily
    buffer = []

    for line in source_file:
        buffer.append(line)
        # change the buffer length as per your requirements
        if len(buffer) == 5:
            # write the buffered lines and start a new buffer
            buffer_str = ''.join(buffer)
            destination_file.write(buffer_str)
            buffer = []

    # write the lines that are still left in the buffer
    if buffer:
        destination_file.write(''.join(buffer))
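
Buffering by line count does not bound memory when individual lines are very long. A variation, sketched below under the assumption that a byte-for-byte copy is wanted, reads fixed-size binary chunks instead. The function name copy_in_chunks and the default chunk size are illustrative choices.

```python
def copy_in_chunks(source, destination, chunk_size=64 * 1024):
    """Copy source to destination, reading at most chunk_size bytes at a time.

    Returns the number of bytes copied.
    """
    total = 0
    with open(source, 'rb') as src, open(destination, 'wb') as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                # an empty bytes object means the end of the file was reached
                break
            dst.write(chunk)
            total += len(chunk)
    return total
```

Opening both files in binary mode also makes this work for non-text files such as images, which the line-based version is not suited for.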

Copy file to server in Python

So far in this article, we have seen how to copy a file to the same or a different folder on the same machine. In the following section, we will cover how to copy a file to a remote server in Python.

We can use any of the functions from the os and subprocess libraries that execute commands to copy a file to a server in Python.

The os library functions are discouraged, and run() or Popen() from subprocess is recommended over the call() function. So, we will only explain how to copy a file to a server using the subprocess.run() function and the subprocess.Popen class.

Pre-requisite:

Both subprocess.Popen() and subprocess.run() execute the scp command. For these methods to work without any issues, we should set up passwordless authentication via ssh keys.

subprocess.Popen() class

Using the subprocess.Popen class, you can execute the scp command to copy a file to a remote server, much like copying a file on the same machine using the cp/copy commands.

Code Thread:

import subprocess


source = 'movies.csv'

# replace user with the actual ssh username, ip with the actual ip, and path with the actual path
destination = 'user@ip:/path/'

# scp must be installed on the machine that runs this script
args = ['scp', source, destination]

subprocess.Popen(args)

subprocess.run() function

Like in the above method, instead of the subprocess.Popen class, you can use the subprocess.run() to execute the scp command to copy the file to a remote server.

Code Thread:

import subprocess

source = 'movies.csv'
# replace user with the actual ssh username, ip with the actual ip, and path with the actual path
destination = 'user@ip:/path/'

# scp must be installed on the machine that runs this script
args = ['scp', source, destination]

subprocess.run(args)
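
Since scp depends on the network, it can help to add a timeout and to look at the exit code. The sketch below is one way to do that; the helper name scp_upload and the 60-second default are assumptions, and the BatchMode=yes option is used so that scp fails instead of waiting for a password prompt when key authentication is not set up.

```python
import subprocess


def scp_upload(source, destination, timeout=60):
    """Run scp and return (exit_code, error_output).

    destination has the form 'user@ip:/path/'.
    """
    args = ['scp', '-o', 'BatchMode=yes', source, destination]
    # timeout raises subprocess.TimeoutExpired if scp runs too long
    completed = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    return completed.returncode, completed.stderr
```

An exit code of 0 means scp reported success; for any other value the second item of the returned tuple holds the error text printed by scp.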

Using Paramiko Library

You can also use the Paramiko library, which implements the SSH protocol, to copy files to the server in Python. Paramiko supports the below types of authentication.

  • Username and password
  • ssh keys

Command to install Paramiko:

pip3 install paramiko

Paramiko’s dependencies use the cffi module. pip normally installs it together with Paramiko, but if you get a ModuleNotFoundError for cffi when using Paramiko, set it up with the below command.

python3 -m pip install cffi

Code Thread:

import paramiko
from paramiko import SSHClient


ssh_client = SSHClient()

# loads the known host keys; we can pass a file name here, otherwise it reads the known_hosts file in the .ssh/ folder
ssh_client.load_system_host_keys()

# replace ip with the actual ip and username with actual username
ssh_client.connect('ip', username='user')
sftp_connection = ssh_client.open_sftp()

# replace the path with the actual remote path, including the file name
sftp_connection.put('movies.csv', 'path')

# closing both sftp connect and ssh client
sftp_connection.close()
ssh_client.close()

Possible Errors to watch out for:

  • SSHException: when the SSH session cannot be established, for example when the server’s host key is not in the known hosts file.
  • OSError: Failure: when you specify only the folder name, without the file name, in the remote path.

Copy file to S3 using boto3

So far, we have seen how to copy a file on the same machine and then to a remote server. In this section, we will see how to copy a file to an S3 bucket.

We will be using the boto3 library, the AWS SDK for Python, which provides functions to perform AWS operations. If you haven’t installed boto3, use the below command to install it.

pip3 install boto3

Code Thread:

import boto3

# replace access and secret keys
s3_session = boto3.Session(aws_access_key_id='access_key_id', aws_secret_access_key='secret_key')
s3_client = s3_session.client('s3')

bucket_name = 'bucket_name' #replace with s3 bucket name
file_path = 'movies.csv' #replace with file path including the file name
file_name = 'file_name.csv' #replace with target file name

try:
    s3_client.upload_file(file_path, bucket_name, file_name)
    print('File uploaded successfully.')
except Exception as e:
    print('Error while uploading file to S3 : ', e)

Output:

File uploaded successfully.

Thanks for patiently reading this far. We hope we helped you with the right information on how to copy a file, and how to copy a file to a server, in Python. Please follow us and share your thoughts via comments.

Frequently Asked Questions

  • How to copy files from one directory to another in Python?

    You can use methods from shutil, subprocess, and os libraries to copy files from one directory to another. Visit this page for complete code examples.

  • How to install shutil library on Ubuntu, Mac, or Windows?

    You don’t have to install shutil separately because it comes as part of the standard library. Once you install Python on your system, the shutil module is available automatically.

  • How to copy all files from one directory to another directory?

    We can use shutil.copytree() method and subprocess.run() function with cp -r command to copy all files recursively from one directory to another.

  • How to copy files to S3 in Python?

    We can use boto3 library in Python to copy files to S3. Detailed code with an explanation is added in this guide.
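
For the directory question in the list above, here is a short sketch of shutil.copytree(). The folder and file names are made up for the demonstration and everything happens inside a temporary directory.

```python
import os
import shutil
import tempfile

# Illustrative setup: a folder with one file and one sub-folder.
work_dir = tempfile.mkdtemp()
src_dir = os.path.join(work_dir, 'reports')
os.makedirs(os.path.join(src_dir, '2023'))

with open(os.path.join(src_dir, 'summary.csv'), 'w') as f:
    f.write('total\n')
with open(os.path.join(src_dir, '2023', 'movies.csv'), 'w') as f:
    f.write('title,year\n')

dst_dir = os.path.join(work_dir, 'reports_backup')

# copytree() creates dst_dir and copies everything below src_dir recursively.
# By default the destination directory must not exist yet.
result = shutil.copytree(src_dir, dst_dir)

# Collect the copied files as paths relative to the destination folder.
copied = sorted(
    os.path.relpath(os.path.join(root, name), dst_dir)
    for root, _, files in os.walk(dst_dir)
    for name in files
)
print(copied)
```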
