Splitting a list in Python is a valuable operation because it lets you process large datasets in smaller, independent pieces. Two benefits of splitting a list into n chunks are fault-tolerant processing and multithreading.
Fault-tolerant processing:
Once you split a list, you can process each chunk independently. If something goes wrong while processing a chunk, only the failed chunk needs to be reprocessed, not the entire list.
Multithreading:
By splitting the list, you can make better use of the multiple cores and multithreading capabilities of modern CPUs instead of processing every element in a single thread.
There are more than 10 ways to split a list in Python. Each is implemented differently and performs differently, so it helps to understand the nuances of each. All the details are listed below.
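As a rough sketch of both benefits, the snippet below processes chunks in a thread pool and keeps track of the chunks that failed so that only those can be retried. The worker `process_chunk` and the helper `process_all` are hypothetical names made up for this illustration, and the "fail on negative values" rule is just a stand-in for whatever can go wrong in real processing.

```python
from concurrent.futures import ThreadPoolExecutor


def split(elements, chunk_size):
    """Return consecutive chunks of at most chunk_size items."""
    return [elements[i:i + chunk_size]
            for i in range(0, len(elements), chunk_size)]


def process_chunk(chunk):
    """Hypothetical worker: sums a chunk and fails on negative values."""
    if any(value < 0 for value in chunk):
        raise ValueError("negative value in chunk")
    return sum(chunk)


def process_all(elements, chunk_size, workers=4):
    """Process chunks in a thread pool and return (results, failed_chunks)."""
    chunks = split(elements, chunk_size)
    results, failed = {}, []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each submitted future back to the index of its chunk.
        futures = {pool.submit(process_chunk, chunk): index
                   for index, chunk in enumerate(chunks)}
        for future, index in futures.items():
            try:
                results[index] = future.result()
            except ValueError:
                # Only this chunk needs to be reprocessed later.
                failed.append(chunks[index])
    return results, failed


results, failed = process_all([1, 2, 3, 4, -5, 6, 7], chunk_size=3)
print(results)  # {0: 6, 2: 7}
print(failed)   # [[4, -5, 6]]
```

Note that in CPython, threads mainly help with I/O-bound work because of the global interpreter lock; for CPU-bound processing, `ProcessPoolExecutor` from the same module is the usual substitute.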
- Quick Summary
- Fastest way to split a list?
- How to split a list into 2 halves?
- How to split a list into N equal chunks?
- Method 1: for loop and slice operator
- Method 2: for comprehension and slice operator
- Method 3: array_split() from NumPy Library
- Method 4: zip_longest from itertools
- Method 5: islice from itertools
- Method 6: Naive approach with for loop
- Method 7: for loop and yield keyword
- Method 8: batched from more_itertools
- Method 9: chunked from more_itertools
- Method 10: grouper from more_itertools
- Frequently Asked Questions
Quick Summary
Which method is the best?
In general, the itertools library is optimized for all kinds of iterable operations, and NumPy is CPU- and memory-optimized for operations on large numeric datasets.
In the following sections, we have listed the suitable methods for different requirements.
List with only numbers:
If your list contains only numeric (int or float) data, we recommend array_split() from the numpy module. NumPy’s operations are optimized for big datasets through vectorization: its methods avoid Python-level loops and run efficient loops in C instead.
General Purpose:
For most use cases, we suggest the islice() method from itertools to split a list into n chunks. The itertools module is optimized for all kinds of iterable operations because its loops run in C.
If you are looking for specific options like the equal or strict splitting of a list then you may find the below methods useful.
Equal-length chunks or sublists:
If you are searching for a way to split a list into n equal-length chunks and pad the last, shorter chunk when the list is not equally divisible, the following two are the right options.
- Use the zip_longest() method from itertools to split a list into equal-length sublists. If the last list is shorter, it is filled with None values by default; you can pass the fillvalue parameter to choose a different fill-in value.
- Use the grouper() method from more_itertools. By default, the last shorter chunk is filled with None; it also lets you pass a fill-in value of your choice.
![7+ Best Ways to Split a List in Python? [iSlice method is the best] 3 zip_longest vs grouper](https://codethreads.dev/wp-content/uploads/2022/12/zip_longest_vs_grouper-1.png)
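The padding behaviour can be sketched with the standard library alone. `zip_longest` takes a documented `fillvalue` keyword argument; the wrapper name `split_padded` is an assumption made up for this example.

```python
from itertools import zip_longest


def split_padded(elements, chunk_size, fillvalue=None):
    """Split into equal-length chunks, padding the last one with fillvalue."""
    # chunk_size references to the SAME iterator, so each output tuple
    # pulls chunk_size consecutive items from the input.
    iterators = [iter(elements)] * chunk_size
    return [list(chunk)
            for chunk in zip_longest(*iterators, fillvalue=fillvalue)]


print(split_padded([1, 2, 3, 4, 5], 2))
# [[1, 2], [3, 4], [5, None]]
print(split_padded([1, 2, 3, 4, 5], 2, fillvalue=0))
# [[1, 2], [3, 4], [5, 0]]
```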
Strict splitting:
In case you are looking for strict equal-length splitting, you can opt for one of the following options based on your requirements. All these methods throw an error if equal-length chunking is not possible.
- Use the grouper() method from the more_itertools library with the incomplete parameter set to ‘strict’.
- Use the chunked() method from the more_itertools library with the strict parameter set to ‘True’.
- Use the split() method from the NumPy module instead of array_split.
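To make the idea of strict splitting concrete, here is a plain-Python sketch of the same behaviour. It is not how more_itertools or NumPy implement it; `split_strict` is a hypothetical helper that simply refuses to split when the length is not a multiple of the chunk size.

```python
def split_strict(elements, chunk_size):
    """Split into chunks of exactly chunk_size items or raise ValueError."""
    if len(elements) % chunk_size != 0:
        raise ValueError(
            f"{len(elements)} items cannot be split into chunks of {chunk_size}"
        )
    return [elements[i:i + chunk_size]
            for i in range(0, len(elements), chunk_size)]


print(split_strict([1, 2, 3, 4, 5, 6], 3))  # [[1, 2, 3], [4, 5, 6]]

try:
    split_strict([1, 2, 3, 4, 5], 3)
except ValueError as error:
    print("strict split failed:", error)
```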
Fastest way to split a list?
We tested all 10 methods listed below on a list of 10 million numbers. The following are the two fastest ways to split a list.
2 Fastest ways:
1. Using islice method from itertools
2. Using array_split method from numpy
Following are the times taken for each method:
Method | Time Taken in Milliseconds | Memory Usage In MB |
---|---|---|
yield keyword and for Loop | 0.14859199291095138 | NA |
array_split() from NumPy | 0.20794500596821308 | NA |
islice() from itertools | 43.75231100129895 | NA |
zip_longest() from itertools | 241.95613400661387 | NA |
for Loop and Slice Operators | 635.7849930063821 | NA |
grouper() from more_itertools | 645.1406150008552 | NA |
batched() from more_itertools | 831.6389780084137 | NA |
for Comprehension and Slice Operators | 970.2826180146076 | NA |
chunked() from more_itertools | 1861.3601539982483 | NA |
naive for loop approach | 12574.091875023441 | NA |
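The benchmark code itself is not shown here, so the following is only a sketch of how such timings could be reproduced with `timeit`. The list size, chunk size, and repeat count are assumptions chosen to keep the run short. One detail matters when timing the yield-based method: calling a generator function only creates a generator object, so it has to be consumed (for example with `list()`) for the measurement to include the actual splitting work.

```python
import timeit


def split_slices(elements, chunk_size):
    """List comprehension with the slice operator."""
    return [elements[i:i + chunk_size]
            for i in range(0, len(elements), chunk_size)]


def split_generator(elements, chunk_size):
    """for loop with the yield keyword (lazy)."""
    for i in range(0, len(elements), chunk_size):
        yield elements[i:i + chunk_size]


numbers = list(range(100_000))  # assumed size, far smaller than 10 million
runs = 10

candidates = {
    "split_slices": lambda: split_slices(numbers, 1000),
    # list() forces the generator to produce every chunk.
    "split_generator": lambda: list(split_generator(numbers, 1000)),
}

for name, candidate in candidates.items():
    seconds = timeit.timeit(candidate, number=runs)
    print(f"{name}: {seconds / runs * 1000:.3f} ms per run")
```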
How to split a list into 2 halves?
Method 1: Straightforward approach
In this method, we use len() and the slice operator (:) to split the list in half.
Algorithm/Pseudocode:
- Find the middle index integer:
  - Find the length of the list using the len() method
  - Divide len(elements) by 2 to get the middle index, but the / operator returns a float
  - Wrap len(elements)/2 inside int() to get the middle index integer
- Use Slice Operator:
  - Get the first part: elements[0:middle_index]
  - Get the second part: elements[middle_index:]
  - Initiate a list with the first and second parts
def split(elements):
    # step 1: find the middle index
    middle_index = int(len(elements) / 2)
    # step 2: get first and second parts and initiate a list with them
    first_part = elements[0:middle_index]
    second_part = elements[middle_index:]
    halves = [first_part, second_part]
    return halves
print(split([1, 2, 3, 4, 5, 6, 7]))
# output: [[1, 2, 3], [4, 5, 6, 7]]
Method 2: Elegant approach [Extendable to n chunks]
The following function accepts two parameters:
- elements: the list to split
- chunk_size: the size of each chunk. For two halves, chunk_size can be int(len(movies)/2); to split into 3 chunks, use int(len(movies)/3), and so on. If the length is not evenly divisible, an extra, shorter chunk is produced, as in the output below.
# method that accepts list and chunk size
def split(elements, chunk_size):
    return [elements[i:i+chunk_size]
            for i in range(0, len(elements), chunk_size)]
#input data
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
result_list = split(movies, int(len(movies)/2))
print('movie sublists:', result_list)
#output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick'], ['Black Panther: Wakanda Forever', 'Jurassic World Dominion'], ['Avatar: The Way of Water']]
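Because Method 2 takes a chunk size, an odd-length list ends up in three sublists rather than two, as the output above shows. If you need exactly n sublists, one option is to take the number of chunks as the parameter and spread the remainder over the first chunks. The helper name `split_into_n` is hypothetical; it is a sketch, not part of any library.

```python
def split_into_n(elements, number_of_chunks):
    """Split into exactly number_of_chunks sublists of nearly equal size."""
    base_size, remainder = divmod(len(elements), number_of_chunks)
    chunks = []
    start = 0
    for chunk_index in range(number_of_chunks):
        # The first `remainder` chunks each take one extra element.
        size = base_size + (1 if chunk_index < remainder else 0)
        chunks.append(elements[start:start + size])
        start += size
    return chunks


print(split_into_n([1, 2, 3, 4, 5, 6, 7], 2))  # [[1, 2, 3, 4], [5, 6, 7]]
print(split_into_n([1, 2, 3, 4, 5], 3))        # [[1, 2], [3, 4], [5]]
```

Note that the longer half comes first here, whereas Method 1 puts the extra element in the second half.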
How to split a list into N equal chunks?
Method 1: for loop and slice operator
In this method, we use the following predefined functions to split a list:
- len(list) – to get the length of the list
- range(start, end, step)
- slice (:) operator
- append() function of a list
def split(elements, chunk_size):
    element_chunks = []
    for i in range(0, len(elements), chunk_size):
        current_chunk = elements[i:i+chunk_size]
        element_chunks.append(current_chunk)
    return element_chunks
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
print('movie sublists:', split(movies, 3))
# output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick', 'Black Panther: Wakanda Forever'], ['Jurassic World Dominion', 'Avatar: The Way of Water']]
- Time Complexity: O(N)
- Space Complexity: O(N) – element_chunks stores all sub-lists in memory. (In Python 3, range() is lazy and does not build a list.)
Method 2: for comprehension and slice operator
In this approach, instead of an explicit for loop, we use a list comprehension and the slice operator to split a list into chunks.
- Compact Code: With the comprehension, the code is much more compact
def split(elements, chunk_size):
    return [elements[i:i+chunk_size]
            for i in range(0, len(elements), chunk_size)]
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
print('movie sublists:', split(movies, 3))
# output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick', 'Black Panther: Wakanda Forever'], ['Jurassic World Dominion', 'Avatar: The Way of Water']]
- Time Complexity: O(N)
- Space Complexity: O(N)
Method 3: array_split() from NumPy Library
Requirement: NumPy needs to be installed, for example with the below command.
sudo pip3 install numpy
Alternatively, you can use the array_split() method from the NumPy library to split a list.
Input parameters:
- array
- indices_or_section
- axis
Return type: returns a list of NumPy arrays.
Find more information about array_split in the documentation.
# importing numpy library
import numpy as np
# method to split the list into number_of_chunks
def split(elements, number_of_chunks):
    # tolist() converts each NumPy array back to a list of plain Python values
    return [chunk.tolist()
            for chunk in np.array_split(elements, number_of_chunks)]
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
print('movie sublists:', split(movies, 3))
# output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick'], ['Black Panther: Wakanda Forever', 'Jurassic World Dominion'], ['Avatar: The Way of Water']]
- Time Complexity: O(N)
- Space Complexity: O(N)
Method 4: zip_longest from itertools
As an alternative to the above method, you can use zip_longest from the itertools library to split a list into multiple lists.
Requirement: None. itertools is part of the Python standard library, so no installation is needed.
from itertools import zip_longest
def split(elements, chunk_size):
    # chunk_size references to the same iterator, so each tuple takes consecutive items
    return [list(chunk)
            for chunk in zip_longest(*[iter(elements)] * chunk_size)]
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
print('movie sublists:', split(movies, 3))
# output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick', 'Black Panther: Wakanda Forever'], ['Jurassic World Dominion', 'Avatar: The Way of Water', None]]
Method 5: islice from itertools
Requirement: None. itertools is part of the Python standard library, so no installation is needed.
Here, we use islice() from itertools to split a list in Python.
There are two ways to call islice():
- Form 1: islice(iterable, stop)
- Form 2: islice(iterable, start, stop, step) – step is optional
We use Form 2: it slices the first parameter, iterable, based on the other parameters (start, stop, and step) and returns an iterator object.
from itertools import islice
def split(elements, chunk_size):
    return [list(islice(elements, i, i+chunk_size))
            for i in range(0, len(elements), chunk_size)]
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
print('movie sublists:', split(movies, 3))
#output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick', 'Black Panther: Wakanda Forever'], ['Jurassic World Dominion', 'Avatar: The Way of Water']]
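In the version above, every call passes the list itself to islice(), so each call starts again at the first element and skips i items before it reaches the chunk. A variant that shares one iterator across all calls avoids the repeated skipping and also works on inputs without a length, such as generators. This is a sketch of a commonly used alternative, not the code that was benchmarked; `split_shared_iterator` is a name made up for it, and it needs Python 3.8+ for the `:=` operator.

```python
from itertools import islice


def split_shared_iterator(elements, chunk_size):
    """Split any iterable into lists of at most chunk_size items."""
    iterator = iter(elements)
    chunks = []
    # Each islice() call continues where the previous one stopped;
    # an empty list means the iterator is exhausted.
    while chunk := list(islice(iterator, chunk_size)):
        chunks.append(chunk)
    return chunks


print(split_shared_iterator(range(7), 3))
# [[0, 1, 2], [3, 4, 5], [6]]
print(split_shared_iterator((n * n for n in range(5)), 2))
# [[0, 1], [4, 9], [16]]
```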
Method 6: Naive approach with for loop
This is a beginner's way to split a list in Python. Following is the algorithm.
- Step 1: Calculate the number of chunks by dividing the length of elements by the chunk size, rounding up
- Step 2: Use two loops, the outer loop to iterate the number of chunks times and the inner loop to iterate the chunk size times. We used two extra lists, one for storing all chunks and another for storing a temporary chunk.
Notes
1. This method is intuitive to understand but due to the usage of inner loops, it will be very slow. By implementation, Python’s loops have overhead and they are slow.
2. Use this technique only for educational purposes but not in real systems built with Python.
def split(elements, chunk_size):
    chunks = []
    # ceiling division, so an evenly divisible list gets no empty trailing chunk
    number_of_chunks = (len(elements) + chunk_size - 1) // chunk_size
    for chunk_number in range(number_of_chunks):
        temp_chunk = []
        for offset in range(chunk_size):
            index = (chunk_number * chunk_size) + offset
            if index < len(elements):
                element = elements[index]
                temp_chunk.append(element)
        chunks.append(temp_chunk)
    return chunks
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
print('movie sublists:', split(movies, 3))
# output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick', 'Black Panther: Wakanda Forever'], ['Jurassic World Dominion', 'Avatar: The Way of Water']]
Method 7: for loop and yield keyword
def split(elements, chunk_size):
    for i in range(0, len(elements), chunk_size):
        current_chunk = elements[i:i+chunk_size]
        yield current_chunk
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
movies = [chunk for chunk in split(movies, 3)]
print('movie sublists:', movies)
# output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick', 'Black Panther: Wakanda Forever'], ['Jurassic World Dominion', 'Avatar: The Way of Water']]
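Because of the yield keyword, split() here is a generator function: calling it does no slicing yet, and each chunk is only produced when it is requested. The short sketch below, using a list of numbers as assumed sample data, shows chunks being pulled one at a time.

```python
def split(elements, chunk_size):
    """Lazily yield consecutive chunks of at most chunk_size items."""
    for i in range(0, len(elements), chunk_size):
        yield elements[i:i + chunk_size]


chunks = split(list(range(10)), 4)  # nothing has been sliced yet
first = next(chunks)                # produces only the first chunk
rest = list(chunks)                 # produces the remaining chunks

print(first)  # [0, 1, 2, 3]
print(rest)   # [[4, 5, 6, 7], [8, 9]]
```

This is useful when chunks are processed one after another and the full list of sublists never needs to be held in memory at once.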
Method 8: batched from more_itertools
Prerequisite: If you don’t have more_itertools installed, you can install the library with the following command.
sudo pip3 install more-itertools
The batched() method from the more_itertools library can be used to split a list into multiple lists. This approach may return a shorter last sublist if the list is not equally divisible.
batched() internally uses the islice() method from itertools.
from more_itertools import batched
def split(elements, chunk_size):
    return [list(chunk)
            for chunk in batched(elements, chunk_size)]
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
print('movie sublists:', split(movies, 3))
# output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick', 'Black Panther: Wakanda Forever'], ['Jurassic World Dominion', 'Avatar: The Way of Water']]
Method 9: chunked from more_itertools
Prerequisite: You will have to set up the more_itertools module with the below command.
sudo pip3 install more-itertools
With the chunked() method, the chunks are not guaranteed to be equal: the last chunk may be shorter.
If you want only equal-length chunks, pass the strict parameter as True. With strict as True, chunked() raises a ValueError if the list is not equally divisible.
from more_itertools import chunked
def split(elements, chunk_size):
    return [list(chunk)
            for chunk in chunked(elements, chunk_size)]
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
print('movie sublists:', split(movies, 3))
# output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick', 'Black Panther: Wakanda Forever'], ['Jurassic World Dominion', 'Avatar: The Way of Water']]
Method 10: grouper from more_itertools
Prerequisite: You need to install more_itertools with the below command.
sudo pip3 install more-itertools
You can use the grouper() method to split a list in Python into equal-length chunks, padding the last sublist if it is shorter.
By default, if the last sublist is shorter, None is added at the end to make it equal in length. You can use the fillvalue parameter to pass a fill value of your choice.
from more_itertools import grouper
movies = ['Thor: Love and Thunder', 'Top Gun: Maverick',
'Black Panther: Wakanda Forever', 'Jurassic World Dominion',
'Avatar: The Way of Water']
chunk_size = 2
movie_sublists = [list(sublist) for sublist in grouper(movies, chunk_size)]
print('movie sublists:', movie_sublists)
#output: movie sublists: [['Thor: Love and Thunder', 'Top Gun: Maverick'],['Black Panther: Wakanda Forever', 'Jurassic World Dominion'],['Avatar: The Way of Water', None]]
Frequently Asked Questions
How to split a list into 2 halves in Python?
There are different ways to split a list in Python. Using the slice operator is the most straightforward way to split a list into two halves.
What is the fastest way to split a list?
We ran 10 different ways to split a list of 10 million numbers. Using the islice method from itertools is the fastest method that has performed the job in just 400 milliseconds.