Run Same Function in Parallel with Different Parameters - Python
Running the same function in parallel with different parameters involves executing the function multiple times simultaneously, each time with a different set of inputs. For example, if we want to compute the square of each number in the list [0, 1, 2, 3, 4], instead of processing them one by one, running them in parallel can significantly reduce execution time by processing multiple numbers at once. The result would be [0, 1, 4, 9, 16], computed concurrently.
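For comparison, this is the sequential baseline that the parallel approaches below speed up: each call to square runs one after another on a single core.

```python
def square(val):
    return val * val  # Compute the square of one value

inputs = [0, 1, 2, 3, 4]
# Sequential execution: one call at a time, in order
results = [square(v) for v in inputs]
print(results)  # [0, 1, 4, 9, 16]
```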
Using joblib.Parallel
joblib.Parallel efficiently parallelizes computations, optimizing CPU and memory usage, especially in machine learning and data processing. The delayed() wrapper makes it easy to build the list of parallel calls with minimal inter-process communication.
from joblib import Parallel, delayed

def square(val):
    return val * val

if __name__ == '__main__':
    inputs = [0, 1, 2, 3, 4]
    res = Parallel(n_jobs=4)(delayed(square)(val) for val in inputs)
    print(res)
Output
[0, 1, 4, 9, 16]
Explanation: joblib.Parallel parallelizes the execution of the square function across multiple inputs. The delayed() function wraps square so that each call can run independently for each value in inputs. With n_jobs=4, up to four worker processes execute in parallel, improving efficiency.
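As a variation (assuming joblib is installed), n_jobs=-1 asks joblib to use all available CPU cores, and prefer="threads" switches to the thread-based backend, which avoids process start-up cost for very lightweight tasks like this one:

```python
from joblib import Parallel, delayed

def square(val):
    return val * val

if __name__ == '__main__':
    inputs = [0, 1, 2, 3, 4]
    # n_jobs=-1 uses all available cores; prefer="threads" selects the
    # threading backend, which is cheaper for tiny, short-lived tasks
    res = Parallel(n_jobs=-1, prefer="threads")(delayed(square)(v) for v in inputs)
    print(res)  # [0, 1, 4, 9, 16]
```

For CPU-heavy functions the default process-based backend is usually the better choice, since threads share the interpreter lock.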
Using multiprocessing.Pool
multiprocessing.Pool enables asynchronous function execution across multiple processes, efficiently distributing tasks. The starmap() function handles multiple parameters, making it ideal for CPU-bound tasks.
import multiprocessing

def sq(i, v):
    return i, v * v  # Return index and square of value

if __name__ == '__main__':
    n = 4  # Number of processes
    inp = [0, 1, 2, 3, 4]  # Input list
    p = [(i, inp[i]) for i in range(len(inp))]  # Create (index, value) pairs
    with multiprocessing.Pool(processes=n) as pool:
        r = pool.starmap(sq, p)  # Parallel execution
    _, r = zip(*r)  # Extract squared values
    print(list(r))
Output
[0, 1, 4, 9, 16]
Explanation: This code uses multiprocessing.Pool to parallelize squaring a list of numbers. It defines sq(i, v) to return each value's square along with its index, builds (index, value) pairs, and distributes the computation with starmap(), which unpacks each tuple into the function's arguments.
Using concurrent.futures.ProcessPoolExecutor
ProcessPoolExecutor offers a high-level interface for parallel execution and handles process creation internally. It is simpler to use than multiprocessing.Pool at the cost of slight overhead, making it well suited to moderate parallel workloads.
import concurrent.futures

def sq(v):
    return v * v  # Return square of value

def run():
    inp = [0, 1, 2, 3, 4]  # Input list
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as ex:
        res = list(ex.map(sq, inp))  # Parallel execution
    return res

if __name__ == '__main__':
    out = run()
    print(out)
Output
[0, 1, 4, 9, 16]
Explanation: ProcessPoolExecutor parallelizes the sq function across multiple inputs. With max_workers=4, up to four processes run simultaneously. The map() function applies sq() to each value in inp, executing tasks in parallel while preserving input order in the results.
Using multiprocessing.Process with a loop
This method manually creates and manages multiple processes, each executing the target function independently. Unlike Pool, which manages processes automatically, this approach gives full control over process creation. However, it has higher overhead and is only recommended when each process operates independently without shared data.
import multiprocessing
import time

def worker_function(id):
    time.sleep(1)
    print(f"I'm the process with id: {id}")

if __name__ == '__main__':
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=worker_function, args=(i,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
Output
I'm the process with id: 0
I'm the process with id: 1
I'm the process with id: 2
I'm the process with id: 3
I'm the process with id: 4
Explanation: Each process runs worker_function, which prints its ID after a 1-second delay. The start() method launches the processes and join() ensures all complete before exiting. Because the processes run concurrently, the print order is not guaranteed and may differ from the order shown above.
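Since manually created processes only print rather than return values, collecting results requires an explicit channel. A sketch using a multiprocessing.Queue to send each (index, result) pair back to the parent, then sorting by index to restore input order:

```python
import multiprocessing

def worker(q, i, v):
    q.put((i, v * v))  # Send (index, result) back to the parent process

if __name__ == '__main__':
    inputs = [0, 1, 2, 3, 4]
    q = multiprocessing.Queue()
    procs = []
    for i, v in enumerate(inputs):
        p = multiprocessing.Process(target=worker, args=(q, i, v))
        p.start()
        procs.append(p)
    # Drain the queue before join() so children are never blocked on a full pipe
    results = [q.get() for _ in procs]
    for p in procs:
        p.join()
    results.sort()  # Restore input order using the index
    print([r for _, r in results])  # [0, 1, 4, 9, 16]
```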