pip rerun is very slow with _PIP_USE_IMPORTLIB_METADATA=1 #12079

@mauritsvanrees

Description

A pip install of a package with many dependencies, followed by the same pip install command, is much slower on Python 3.11 than on 3.10. In my case instead of about 10 seconds, it takes almost 4 minutes.

Investigation shows that this is a side effect of pip using importlib.metadata instead of pkg_resources on 3.11. Setting the environment variable _PIP_USE_IMPORTLIB_METADATA=1 on 3.10 gives the same slowdown. Setting it to 0 makes 3.11 fast again.

I have found where the problem is, and will prepare a PR.

Expected behavior

pip install Plone -c constraints.txt can take a while because it has lots of transitive dependencies.
But when this command is finished and you run it again in the same venv, it should be much faster.
Instead, it is almost four times slower than a fresh install.

pip version

23.1.2

Python version

3.11.2

OS

macOS Monterey 12.6.6, Intel

How to Reproduce

python3.11 -mvenv py311
cd py311
bin/pip install -U pip setuptools wheel
time bin/pip install Plone -c https://dist.plone.org/release/6.0.5/constraints.txt
time bin/pip install Plone -c https://dist.plone.org/release/6.0.5/constraints.txt
export _PIP_USE_IMPORTLIB_METADATA=0
time bin/pip install Plone -c https://dist.plone.org/release/6.0.5/constraints.txt

Output

The first pip install of Plone takes about 1 minute. Longer if you do not have the wheels in your cache.

The second takes about four minutes. You will see a lot of lines with "Requirement already satisfied". The first 100 or so go by fast, but after that the output slows down more and more.

The third takes about 13 seconds.

I did some measurements and have managed to narrow it down to this line in SpecifierRequirement.is_satisfied_by. Total time spent on this line is about four seconds with pkg_resources. With importlib.metadata it takes 113 seconds!
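A measurement like the one above can be taken by accumulating the wall-clock time spent in one statement across many calls. The following is only a minimal sketch of that approach, not the instrumentation actually used here: `timed`, `totals`, and `expensive_lookup` are hypothetical names, and `expensive_lookup` is a stand-in for the line under suspicion.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated seconds per label, summed over every timed call.
totals = defaultdict(float)


@contextmanager
def timed(label):
    """Add the elapsed wall-clock time of the wrapped block to totals[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[label] += time.perf_counter() - start


def expensive_lookup():
    # Hypothetical stand-in for the statement being measured.
    return sum(range(1000))


for _ in range(100):
    with timed("version lookup"):
        expensive_lookup()

print(f"version lookup: {totals['version lookup']:.6f}s in total")
```

Wrapping a single statement this way makes it possible to compare the total for the same line under both metadata backends.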

Narrowing it down further, the thing on this line that takes the most time is a simple attribute access: candidate.version. This is a property of AlreadyInstalledCandidate that returns self.dist.version on every access. The value should be calculated once and then cached, similar to what pip already does a bit higher in that file for _InstallRequirementBackedCandidate.

When I make that change, the pip install command using importlib.metadata takes only 9 seconds. This is about 25 times faster than before!
Note that this change also helps the case with _PIP_USE_IMPORTLIB_METADATA=0, that is, when using pkg_resources: it takes about 5 seconds instead of 13.
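The idea behind the change can be illustrated with a small standalone sketch. This is not pip's code: `FakeDist`, `UncachedCandidate`, `CachedCandidate`, and `count_parses` are hypothetical names, and the "expensive" version parse is simulated with a counter. It only shows why caching the value on first access removes the repeated work.

```python
class FakeDist:
    """Hypothetical stand-in for an installed distribution.

    Each access to .version re-parses the raw string, which mimics an
    expensive metadata lookup.
    """

    def __init__(self, raw_version):
        self._raw_version = raw_version
        self.parse_count = 0

    @property
    def version(self):
        self.parse_count += 1
        return tuple(int(part) for part in self._raw_version.split("."))


class UncachedCandidate:
    """Delegates to dist.version on every access."""

    def __init__(self, dist):
        self.dist = dist

    @property
    def version(self):
        return self.dist.version


class CachedCandidate:
    """Computes the version on first access and reuses it afterwards."""

    def __init__(self, dist):
        self.dist = dist
        self._version = None

    @property
    def version(self):
        if self._version is None:
            self._version = self.dist.version
        return self._version


def count_parses(candidate_cls, lookups=1000):
    """Return how often the underlying version was parsed for `lookups` accesses."""
    dist = FakeDist("6.0.5")
    candidate = candidate_cls(dist)
    for _ in range(lookups):
        candidate.version
    return dist.parse_count


print(count_parses(UncachedCandidate))  # one parse per access
print(count_parses(CachedCandidate))    # a single parse in total
```

In a resolver that checks the same installed candidate against many requirements, the uncached variant pays the lookup cost on every check, which would be consistent with the slowdown growing as more requirements are processed.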

I will create a PR.

Labels

S: needs triage (Issues/PRs that need to be triaged)
type: bug (A confirmed bug or unintended behavior)