Differentiating organic vs automated installations #5499

@mahmoud

What's the problem this feature will solve?

Currently, pip installation statistics are aggregated in Google Cloud and made available on libraries.io and pepy.tech. A lot of effort has gone into these numbers, but because of automation they mean less now than they did a few years ago.

CI and other automation, combined with maybe a bit too much reliance on PyPI's central infrastructure, have inflated the download numbers and diluted the signal with noise.

Describe the solution you'd like

We could detect when pip is being used interactively (by checking if stdin is a tty or some other mechanism), and include that in the pip install request headers, to be included in the statistics generated by the server.
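To make the idea concrete, here is a minimal sketch of the client-side check. It is not pip's actual code: the header name `X-Install-Interactive`, the helper names, and the short list of CI environment variables are all assumptions made for illustration. The tty check is the mechanism proposed above; the environment-variable check stands in for "some other mechanism".

```python
import os
import sys

# Hypothetical header name; neither pip nor PyPI defines it.
INTERACTIVE_HEADER = "X-Install-Interactive"

# A few environment variables that some CI services set (assumed, not exhaustive).
CI_ENV_VARS = ("CI", "TRAVIS", "GITHUB_ACTIONS")


def looks_interactive(stream=None, environ=None):
    """Guess whether a human is driving this process.

    Returns False if a known CI variable is set, or if the stream
    is not attached to a terminal.
    """
    stream = sys.stdin if stream is None else stream
    environ = os.environ if environ is None else environ
    if any(environ.get(name) for name in CI_ENV_VARS):
        return False
    try:
        return bool(stream.isatty())
    except (AttributeError, ValueError):
        # No isatty() method, or the stream is already closed.
        return False


def install_headers(stream=None, environ=None):
    """Build the extra request header carrying the interactivity flag."""
    flag = "1" if looks_interactive(stream, environ) else "0"
    return {INTERACTIVE_HEADER: flag}


if __name__ == "__main__":
    print(install_headers())
```

A tty check alone would misclassify some cases (for example, a human piping input into pip, or a CI runner that allocates a pseudo-terminal), so the flag would be a heuristic signal rather than a guarantee.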

This would provide us with much cleaner data for highlighting actual community activity, instead of drowning in automation trends that overly favor professionalized sectors of Python. Specifically, a library being manually installed 100 times may well indicate something much more interesting than a CI (or, unfortunately, a production) fleet installing a package 10,000 times.
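On the statistics side, the flag would let downloads be split into two buckets per package. The sketch below assumes download records arrive as `(package, interactive)` pairs; that record shape, the function names, and the package names `pkg_a` and `pkg_b` are hypothetical and do not reflect how the server stores its data.

```python
from collections import Counter


def tally(records):
    """Count interactive and automated installs per package.

    `records` is an iterable of (package_name, interactive) pairs,
    where `interactive` is a bool taken from the request flag.
    """
    counts = {}
    for package, interactive in records:
        bucket = counts.setdefault(package, Counter())
        bucket["interactive" if interactive else "automated"] += 1
    return counts


def rank_by_interactive(counts):
    """Order packages by interactive installs, highest first."""
    return sorted(counts, key=lambda p: counts[p]["interactive"], reverse=True)


if __name__ == "__main__":
    # Made-up records mirroring the 100 vs 10,000 example above.
    records = [("pkg_a", True)] * 100 + [("pkg_b", False)] * 10_000
    counts = tally(records)
    print(rank_by_interactive(counts))
```

Ranked by raw downloads, `pkg_b` would come first; ranked by interactive installs, `pkg_a` does, which is the signal the proposal is after.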

Additional context

  • I wasn't sure whether to file this on pip or on Warehouse; it seems kind of 🐔 / 🥚 to me.
  • I'm not really sure if/how other package indexes solve this, but would be very interested in hearing.
  • As an arbitrary example, I happen to know Mozilla uses PyPI for quite a few relatively internal packages. Granted, they're open-source and I'm happy to see some infrastructure synergy. But, picking at random, mozlog actually ranks OK for downloads, even though it's not a very broadly useful package, and I'm pretty sure the data will show it's mostly Mozilla infrastructure downloading it.

Thanks for your attention and keep up the good work!

Metadata

    Labels

      • auto-locked (Outdated issues that have been locked by automation)
      • state: needs discussion (This needs some more discussion)
      • type: enhancement (Improvements to functionality)
