Filename encoding error in some environments with PAX sdist #7667

@ncoghlan

Environment

  • pip version: any
  • Python version: 2.7
  • OS: Windows, non-Windows in C locale

(pip Windows CI hits this)

Description
The PAX-format sdists for wheel 0.34.1 fail to install on Python 2.7 with a UnicodeEncodeError, both on Windows and on non-Windows systems in a non-UTF-8 locale. See pypa/wheel#331.

Expected behavior
Unicode filename from the PAX tarball is correctly encoded for the local filesystem.

How to Reproduce
Attempt to install a PAX formatted tarball containing a file name that cannot be encoded to the default code page (Windows) or the default locale encoding (non-Windows).
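For anyone who wants to check an archive before installing it, here is a minimal sketch using only the standard library `tarfile` module. The helper names `build_pax_sdist` and `unencodable_members`, and the sample member path, are made up for illustration and are not part of pip or wheel. It builds a small PAX tarball in memory and lists the member names that cannot be encoded to a given target encoding, which is roughly the condition described above.

```python
import io
import tarfile


def build_pax_sdist(member_name, payload=b"data"):
    """Return the bytes of a gzipped PAX tarball with a single member.

    Hypothetical helper, only used to produce a test archive.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz", format=tarfile.PAX_FORMAT) as tf:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def unencodable_members(archive_bytes, fs_encoding):
    """List member names that cannot be encoded to fs_encoding."""
    bad = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tf:
        for name in tf.getnames():
            try:
                name.encode(fs_encoding)
            except UnicodeEncodeError:
                bad.append(name)
    return bad


# Sample path is an assumption; any non-ASCII file name will do.
archive = build_pax_sdist(u"pkg-1.0/tests/\u00e6\u0253\u00e7.txt")
problem_names = unencodable_members(archive, "ascii")
```

Passing `"ascii"` approximates the C locale; on Windows a code page name such as `"cp1252"` could be passed instead.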

In GNU tar, the affected paths are pre-mangled to something ASCII compatible, but PAX tar preserves them correctly, so the installer needs to handle them itself.
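One possible way for an installer to handle such names itself is to fall back to an ASCII-safe escaped form when strict encoding fails. The sketch below is only an illustration of that idea under my own assumptions. The function name `encode_member_name` is hypothetical, and this is not necessarily the fix that pip adopts.

```python
def encode_member_name(name, fs_encoding):
    """Encode a tar member name for the local filesystem.

    Hypothetical helper: tries a strict encode first, and if the name
    cannot be represented in fs_encoding, replaces the offending
    characters with backslash escapes instead of raising.
    """
    try:
        return name.encode(fs_encoding)
    except UnicodeEncodeError:
        return name.encode(fs_encoding, "backslashreplace")
```

The trade-off is that the extracted file name no longer matches the name recorded in the archive, so anything that looks the file up by its original name afterwards would need the same mapping.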

Output

See https://dev.azure.com/pypa/pip/_build/results?buildId=18040&view=logs&j=404e6841-f5ba-57d9-f2c8-8c5322057572&t=0219f6bf-240d-5b08-c877-377b12af5079&l=309 for a Windows example in the pip test suite.

The wheel issue linked above has some Linux examples.

Metadata

    Labels

    C: encoding (related to text encoding and likely UnicodeErrors)
    type: bug (a confirmed bug or unintended behavior)
