I'm writing a Python program that will load a wordlist from a text file and then try unzipping an archive with each word. It wouldn't be serious if it didn't make use of all CPU cores. Because of the GIL, threading in Python isn't a great option for this kind of CPU-bound work, if I'm not mistaken.

So I want to get the number of CPU cores, split the wordlist, and use multiprocessing.Process to handle different parts of the wordlist in different processes.

But would every process get pinned to a CPU core automatically? If not, is there a way to pin them manually?

1 Answer

You can use Python's multiprocessing module (import multiprocessing as mp) and find out the number of processors with mp.cpu_count(), which should work on most platforms.
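As a rough illustration of the "count the cores, then split the wordlist" step from the question, here is a minimal sketch. The helper name split_wordlist and the sample words are hypothetical and only there for the example. Note that mp.cpu_count() reports logical CPUs, which may be more than the number of physical cores.

```python
import multiprocessing as mp


def split_wordlist(words, n_chunks):
    """Split `words` into `n_chunks` contiguous, nearly equal slices.

    Hypothetical helper for illustration; it is not part of multiprocessing.
    """
    n_chunks = max(1, n_chunks)
    size, extra = divmod(len(words), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional word each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(words[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    n_cores = mp.cpu_count()  # number of logical CPUs
    words = ["alpha", "bravo", "charlie", "delta", "echo"]  # sample data
    for i, chunk in enumerate(split_wordlist(words, n_cores)):
        print(f"chunk {i}: {len(chunk)} word(s)")
```

Each chunk could then be handed to its own multiprocessing.Process, although the pooling approach is usually less work.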

To launch programs or processes on specific CPU cores on Linux, you can use taskset, with this guide as a reference.

An alternative, cross-platform solution would be to use the psutil package for Python.
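A minimal sketch of pinning with psutil might look like the following. psutil is a third-party package, and its Process.cpu_affinity() method is only available on some platforms (Linux, Windows, and FreeBSD, but not macOS). The fallback to os.sched_setaffinity (Linux only) and the helper names allowed_cores and pin_current_process are assumptions added for this example, not something the answer prescribes.

```python
import os

try:
    import psutil  # third-party: pip install psutil
except ImportError:
    psutil = None


def _psutil_has_affinity():
    # psutil only defines Process.cpu_affinity on platforms that support it.
    return psutil is not None and hasattr(psutil.Process, "cpu_affinity")


def allowed_cores():
    """Return the sorted core indices this process may run on, or None."""
    if _psutil_has_affinity():
        return sorted(psutil.Process().cpu_affinity())
    if hasattr(os, "sched_getaffinity"):  # Linux-only stdlib fallback
        return sorted(os.sched_getaffinity(0))
    return None


def pin_current_process(cores):
    """Restrict the calling process to `cores`; return False if unsupported."""
    cores = list(cores)
    if _psutil_has_affinity():
        psutil.Process().cpu_affinity(cores)
        return True
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cores)
        return True
    return False


if __name__ == "__main__":
    cores = allowed_cores()
    print("allowed cores before pinning:", cores)
    if cores:
        pin_current_process([cores[0]])
        print("allowed cores after pinning:", allowed_cores())
```

In a multiprocessing program, each worker would call something like pin_current_process at startup with its own core index.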

However, I would suggest going with a thread or process pooling approach, since in my opinion you should let the operating system decide which CPU core runs each task. See How to utilize all cores with python multiprocessing for how to approach this problem.
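The pooling approach could be sketched roughly as below, assuming a password-protected ZIP archive that Python's zipfile module can read. The function names try_password and find_password and the chunksize value are hypothetical. Each task receives the archive path rather than an open file object, because open file handles generally cannot be pickled and sent to worker processes.

```python
import multiprocessing as mp
import sys
import zipfile
import zlib


def try_password(task):
    """Return the word if it opens the archive's first member, else None."""
    archive_path, word = task
    try:
        with zipfile.ZipFile(archive_path) as zf:
            names = zf.namelist()
            if not names:
                return None
            # Reading a member forces decryption and the CRC check.
            zf.read(names[0], pwd=word.encode("utf-8"))
        return word
    except (RuntimeError, zipfile.BadZipFile, zlib.error):
        # A wrong password usually surfaces as one of these errors.
        return None


def find_password(archive_path, words, processes=None):
    """Try the words in a process pool; the OS schedules the workers."""
    tasks = ((archive_path, w) for w in words)
    with mp.Pool(processes=processes) as pool:
        # chunksize=256 is an arbitrary value picked for this sketch.
        for result in pool.imap_unordered(try_password, tasks, chunksize=256):
            if result is not None:
                return result  # leaving the with-block stops the pool
    return None


if __name__ == "__main__":
    if len(sys.argv) == 3:
        archive, wordlist_path = sys.argv[1], sys.argv[2]
        with open(wordlist_path, encoding="utf-8", errors="ignore") as fh:
            candidates = [line.strip() for line in fh if line.strip()]
        print(find_password(archive, candidates))
    else:
        print("usage: script.py ARCHIVE.zip WORDLIST.txt")
```

Keeping the worker function at module level and putting the pool under the `if __name__ == "__main__":` guard matters on platforms that start workers by spawning a fresh interpreter, such as Windows.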

4 Comments

Thanks for answering. Does taskset work on Windows? If not, is there a Windows alternative?
@scripter I updated my answer with the psutil package; it is a cross-platform solution that will work on Windows. To use taskset on Windows, you might have to use Cygwin or something similar.
@scripter, the start command of the Windows cmd.exe shell can set the CPU affinity and preferred NUMA node when creating a process. Maybe there's a way to modify the affinity of a running process using WMI (i.e. wmic.exe). You can do it in PowerShell with something like (Get-Process -Id $target_pid).ProcessorAffinity = $affinity_mask.
Thanks for answering, everyone! I ended up writing the program in Java. In Python I kept getting errors like can't serialize ._iobufferedreader and can't start new threads, to name a few. As of now, multiprocessing in Python is unusable to me.
