Dependency specifiers

This document describes the dependency specifiers format as originally specified in PEP 508.

The job of a dependency is to enable tools like pip 1 to find the right package to install. Sometimes this is very loose - just specifying a name, and sometimes very specific - referring to a specific file to install. Sometimes dependencies are only relevant in one platform, or only some versions are acceptable, so the language permits describing all these cases.

The language defined is a compact line based format which is already in widespread use in pip requirements files, though we do not specify the command line option handling that those files permit. There is one caveat - the URL reference form, specified in PEP 440 is not actually implemented in pip, but since PEP 440 is accepted, we use that format rather than pip’s current native format.

Specification

Examples

All features of the language shown with a name based lookup:

requests [security,tests] >= 2.8.1, == 2.8.* ; python_version < "2.7"

A minimal URL based lookup:

pip @ https://github.com/pypa/pip/archive/1.3.1.zip#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686

Concepts

A dependency specification always specifies a distribution name. It may include extras, which expand the dependencies of the named distribution to enable optional features. The version installed can be controlled using version limits, or giving the URL to a specific artifact to install. Finally the dependency can be made conditional using environment markers.

Grammar

We first cover the grammar briefly and then drill into the semantics of each section later.

A distribution specification is written in ASCII text. We use a parsley 2 grammar to provide a precise grammar. It is expected that the specification will be embedded into a larger system which offers framing such as comments, multiple line support via continuations, or other such features.

The full grammar including annotations to build a useful parse tree is included at the end of this document.

Versions may be specified according to the PEP 440 rules. (Note: URI is defined in std-66):

version_cmp   = wsp* '<' | '<=' | '!=' | '==' | '>=' | '>' | '~=' | '==='
version       = wsp* ( letterOrDigit | '-' | '_' | '.' | '*' | '+' | '!' )+
version_one   = version_cmp version wsp*
version_many  = version_one (wsp* ',' version_one)*
versionspec   = ( '(' version_many ')' ) | version_many
urlspec       = '@' wsp* <URI_reference>

Environment markers allow making a specification only take effect in some environments:

marker_op     = version_cmp | (wsp* 'in') | (wsp* 'not' wsp+ 'in')
python_str_c  = (wsp | letter | digit | '(' | ')' | '.' | '{' | '}' |
                 '-' | '_' | '*' | '#' | ':' | ';' | ',' | '/' | '?' |
                 '[' | ']' | '!' | '~' | '`' | '@' | '$' | '%' | '^' |
                 '&' | '=' | '+' | '|' | '<' | '>' )
dquote        = '"'
squote        = '\\''
python_str    = (squote (python_str_c | dquote)* squote |
                 dquote (python_str_c | squote)* dquote)
env_var       = ('python_version' | 'python_full_version' |
                 'os_name' | 'sys_platform' | 'platform_release' |
                 'platform_system' | 'platform_version' |
                 'platform_machine' | 'platform_python_implementation' |
                 'implementation_name' | 'implementation_version' |
                 'extra' # ONLY when defined by a containing layer
                 )
marker_var    = wsp* (env_var | python_str)
marker_expr   = marker_var marker_op marker_var
              | wsp* '(' marker wsp* ')'
marker_and    = marker_expr wsp* 'and' marker_expr
              | marker_expr
marker_or     = marker_and wsp* 'or' marker_and
                  | marker_and
marker        = marker_or
quoted_marker = ';' wsp* marker

Optional components of a distribution may be specified using the extras field:

identifier_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
identifier    = letterOrDigit identifier_end*
name          = identifier
extras_list   = identifier (wsp* ',' wsp* identifier)*
extras        = '[' wsp* extras_list? wsp* ']'

Restrictions on names for extras is defined in PEP 685.

Giving us a rule for name based requirements:

name_req      = name wsp* extras? wsp* versionspec? wsp* quoted_marker?

And a rule for direct reference specifications:

url_req       = name wsp* extras? wsp* urlspec wsp+ quoted_marker?

Leading to the unified rule that can specify a dependency.:

specification = wsp* ( url_req | name_req ) wsp*

Whitespace

Non line-breaking whitespace is mostly optional with no semantic meaning. The sole exception is detecting the end of a URL requirement.

Names

Python distribution names are currently defined in PEP 345. Names act as the primary identifier for distributions. They are present in all dependency specifications, and are sufficient to be a specification on their own. However, PyPI places strict restrictions on names - they must match a case insensitive regex or they won’t be accepted. Accordingly, in this document we limit the acceptable values for identifiers to that regex. A full redefinition of name may take place in a future metadata PEP. The regex (run with re.IGNORECASE) is:

^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$

Extras

An extra is an optional part of a distribution. Distributions can specify as many extras as they wish, and each extra results in the declaration of additional dependencies of the distribution when the extra is used in a dependency specification. For instance:

requests[security]

Extras union in the dependencies they define with the dependencies of the distribution they are attached to. The example above would result in requests being installed, and requests own dependencies, and also any dependencies that are listed in the «security» extra of requests.

If multiple extras are listed, all the dependencies are unioned together.

Versions

See PEP 440 for more detail on both version numbers and version comparisons. Version specifications limit the versions of a distribution that can be used. They only apply to distributions looked up by name, rather than via a URL. Version comparison are also used in the markers feature. The optional brackets around a version are present for compatibility with PEP 345 but should not be generated, only accepted.

Environment Markers

Environment markers allow a dependency specification to provide a rule that describes when the dependency should be used. For instance, consider a package that needs argparse. In Python 2.7 argparse is always present. On older Python versions it has to be installed as a dependency. This can be expressed as so:

argparse;python_version<"2.7"

A marker expression evaluates to either True or False. When it evaluates to False, the dependency specification should be ignored.

The marker language is inspired by Python itself, chosen for the ability to safely evaluate it without running arbitrary code that could become a security vulnerability. Markers were first standardised in PEP 345. This document fixes some issues that were observed in the design described in PEP 426.

Comparisons in marker expressions are typed by the comparison operator. The <marker_op> operators that are not in <version_cmp> perform the same as they do for strings in Python. The <version_cmp> operators use the PEP 440 version comparison rules when those are defined (that is when both sides have a valid version specifier). If there is no defined PEP 440 behaviour and the operator exists in Python, then the operator falls back to the Python behaviour. Otherwise an error should be raised. e.g. the following will result in errors:

"dog" ~= "fred"
python_version ~= "surprise"

User supplied constants are always encoded as strings with either ' or " quote marks. Note that backslash escapes are not defined, but existing implementations do support them. They are not included in this specification because they add complexity and there is no observable need for them today. Similarly we do not define non-ASCII character support: all the runtime variables we are referencing are expected to be ASCII-only.

The variables in the marker grammar such as «os_name» resolve to values looked up in the Python runtime. With the exception of «extra» all values are defined on all Python versions today - it is an error in the implementation of markers if a value is not defined.

Unknown variables must raise an error rather than resulting in a comparison that evaluates to True or False.

Variables whose value cannot be calculated on a given Python implementation should evaluate to 0 for versions, and an empty string for all other variables.

The «extra» variable is special. It is used by wheels to signal which specifications apply to a given extra in the wheel METADATA file, but since the METADATA file is based on a draft version of PEP 426, there is no current specification for this. Regardless, outside of a context where this special handling is taking place, the «extra» variable should result in an error like all other unknown variables.

Marker

Python equivalent

Sample values

os_name

os.name

posix, java

sys_platform

sys.platform

linux, linux2, darwin, java1.8.0_51 (note that «linux» is from Python3 and «linux2» from Python2)

platform_machine

platform.machine()

x86_64

platform_python_implementation

platform.python_implementation()

CPython, Jython

platform_release

platform.release()

3.14.1-x86_64-linode39, 14.5.0, 1.8.0_51

platform_system

platform.system()

Linux, Windows, Java

platform_version

platform.version()

#1 SMP Fri Apr 25 13:07:35 EDT 2014 Java HotSpot(TM) 64-Bit Server VM, 25.51-b03, Oracle Corporation Darwin Kernel Version 14.5.0: Wed Jul 29 02:18:53 PDT 2015; root:xnu-2782.40.9~2/RELEASE_X86_64

python_version

'.'.join(platform.python_version_tuple()[:2])

3.4, 2.7

python_full_version

platform.python_version()

3.4.0, 3.5.0b1

implementation_name

sys.implementation.name

cpython

implementation_version

see definition below

3.4.0, 3.5.0b1

extra

An error except when defined by the context interpreting the specification.

test

The implementation_version marker variable is derived from sys.implementation.version:

def format_full_version(info):
    version = '{0.major}.{0.minor}.{0.micro}'.format(info)
    kind = info.releaselevel
    if kind != 'final':
        version += kind[0] + str(info.serial)
    return version

if hasattr(sys, 'implementation'):
    implementation_version = format_full_version(sys.implementation.version)
else:
    implementation_version = "0"

This environment markers section, initially defined through PEP 508, supersedes the environment markers section in PEP 345.

Complete Grammar

The complete parsley grammar:

wsp           = ' ' | '\t'
version_cmp   = wsp* <'<=' | '<' | '!=' | '==' | '>=' | '>' | '~=' | '==='>
version       = wsp* <( letterOrDigit | '-' | '_' | '.' | '*' | '+' | '!' )+>
version_one   = version_cmp:op version:v wsp* -> (op, v)
version_many  = version_one:v1 (wsp* ',' version_one)*:v2 -> [v1] + v2
versionspec   = ('(' version_many:v ')' ->v) | version_many
urlspec       = '@' wsp* <URI_reference>
marker_op     = version_cmp | (wsp* 'in') | (wsp* 'not' wsp+ 'in')
python_str_c  = (wsp | letter | digit | '(' | ')' | '.' | '{' | '}' |
                 '-' | '_' | '*' | '#' | ':' | ';' | ',' | '/' | '?' |
                 '[' | ']' | '!' | '~' | '`' | '@' | '$' | '%' | '^' |
                 '&' | '=' | '+' | '|' | '<' | '>' )
dquote        = '"'
squote        = '\\''
python_str    = (squote <(python_str_c | dquote)*>:s squote |
                 dquote <(python_str_c | squote)*>:s dquote) -> s
env_var       = ('python_version' | 'python_full_version' |
                 'os_name' | 'sys_platform' | 'platform_release' |
                 'platform_system' | 'platform_version' |
                 'platform_machine' | 'platform_python_implementation' |
                 'implementation_name' | 'implementation_version' |
                 'extra' # ONLY when defined by a containing layer
                 ):varname -> lookup(varname)
marker_var    = wsp* (env_var | python_str)
marker_expr   = marker_var:l marker_op:o marker_var:r -> (o, l, r)
              | wsp* '(' marker:m wsp* ')' -> m
marker_and    = marker_expr:l wsp* 'and' marker_expr:r -> ('and', l, r)
              | marker_expr:m -> m
marker_or     = marker_and:l wsp* 'or' marker_and:r -> ('or', l, r)
                  | marker_and:m -> m
marker        = marker_or
quoted_marker = ';' wsp* marker
identifier_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
identifier    = < letterOrDigit identifier_end* >
name          = identifier
extras_list   = identifier:i (wsp* ',' wsp* identifier)*:ids -> [i] + ids
extras        = '[' wsp* extras_list?:e wsp* ']' -> e
name_req      = (name:n wsp* extras?:e wsp* versionspec?:v wsp* quoted_marker?:m
                 -> (n, e or [], v or [], m))
url_req       = (name:n wsp* extras?:e wsp* urlspec:v (wsp+ | end) quoted_marker?:m
                 -> (n, e or [], v, m))
specification = wsp* ( url_req | name_req ):s wsp* -> s
# The result is a tuple - name, list-of-extras,
# list-of-version-constraints-or-a-url, marker-ast or None


URI_reference = <URI | relative_ref>
URI           = scheme ':' hier_part ('?' query )? ( '#' fragment)?
hier_part     = ('//' authority path_abempty) | path_absolute | path_rootless | path_empty
absolute_URI  = scheme ':' hier_part ( '?' query )?
relative_ref  = relative_part ( '?' query )? ( '#' fragment )?
relative_part = '//' authority path_abempty | path_absolute | path_noscheme | path_empty
scheme        = letter ( letter | digit | '+' | '-' | '.')*
authority     = ( userinfo '@' )? host ( ':' port )?
userinfo      = ( unreserved | pct_encoded | sub_delims | ':')*
host          = IP_literal | IPv4address | reg_name
port          = digit*
IP_literal    = '[' ( IPv6address | IPvFuture) ']'
IPvFuture     = 'v' hexdig+ '.' ( unreserved | sub_delims | ':')+
IPv6address   = (
                  ( h16 ':'){6} ls32
                  | '::' ( h16 ':'){5} ls32
                  | ( h16 )?  '::' ( h16 ':'){4} ls32
                  | ( ( h16 ':')? h16 )? '::' ( h16 ':'){3} ls32
                  | ( ( h16 ':'){0,2} h16 )? '::' ( h16 ':'){2} ls32
                  | ( ( h16 ':'){0,3} h16 )? '::' h16 ':' ls32
                  | ( ( h16 ':'){0,4} h16 )? '::' ls32
                  | ( ( h16 ':'){0,5} h16 )? '::' h16
                  | ( ( h16 ':'){0,6} h16 )? '::' )
h16           = hexdig{1,4}
ls32          = ( h16 ':' h16) | IPv4address
IPv4address   = dec_octet '.' dec_octet '.' dec_octet '.' dec_octet
nz            = ~'0' digit
dec_octet     = (
                  digit # 0-9
                  | nz digit # 10-99
                  | '1' digit{2} # 100-199
                  | '2' ('0' | '1' | '2' | '3' | '4') digit # 200-249
                  | '25' ('0' | '1' | '2' | '3' | '4' | '5') )# %250-255
reg_name = ( unreserved | pct_encoded | sub_delims)*
path = (
        path_abempty # begins with '/' or is empty
        | path_absolute # begins with '/' but not '//'
        | path_noscheme # begins with a non-colon segment
        | path_rootless # begins with a segment
        | path_empty ) # zero characters
path_abempty  = ( '/' segment)*
path_absolute = '/' ( segment_nz ( '/' segment)* )?
path_noscheme = segment_nz_nc ( '/' segment)*
path_rootless = segment_nz ( '/' segment)*
path_empty    = pchar{0}
segment       = pchar*
segment_nz    = pchar+
segment_nz_nc = ( unreserved | pct_encoded | sub_delims | '@')+
                # non-zero-length segment without any colon ':'
pchar         = unreserved | pct_encoded | sub_delims | ':' | '@'
query         = ( pchar | '/' | '?')*
fragment      = ( pchar | '/' | '?')*
pct_encoded   = '%' hexdig
unreserved    = letter | digit | '-' | '.' | '_' | '~'
reserved      = gen_delims | sub_delims
gen_delims    = ':' | '/' | '?' | '#' | '(' | ')?' | '@'
sub_delims    = '!' | '$' | '&' | '\\'' | '(' | ')' | '*' | '+' | ',' | ';' | '='
hexdig        = digit | 'a' | 'A' | 'b' | 'B' | 'c' | 'C' | 'd' | 'D' | 'e' | 'E' | 'f' | 'F'

A test program - if the grammar is in a string grammar:

import os
import sys
import platform

from parsley import makeGrammar

grammar = """
    wsp ...
    """
tests = [
    "A",
    "A.B-C_D",
    "aa",
    "name",
    "name<=1",
    "name>=3",
    "name>=3,<2",
    "name@http://foo.com",
    "name [fred,bar] @ http://foo.com ; python_version=='2.7'",
    "name[quux, strange];python_version<'2.7' and platform_version=='2'",
    "name; os_name=='a' or os_name=='b'",
    # Should parse as (a and b) or c
    "name; os_name=='a' and os_name=='b' or os_name=='c'",
    # Overriding precedence -> a and (b or c)
    "name; os_name=='a' and (os_name=='b' or os_name=='c')",
    # should parse as a or (b and c)
    "name; os_name=='a' or os_name=='b' and os_name=='c'",
    # Overriding precedence -> (a or b) and c
    "name; (os_name=='a' or os_name=='b') and os_name=='c'",
    ]

def format_full_version(info):
    version = '{0.major}.{0.minor}.{0.micro}'.format(info)
    kind = info.releaselevel
    if kind != 'final':
        version += kind[0] + str(info.serial)
    return version

if hasattr(sys, 'implementation'):
    implementation_version = format_full_version(sys.implementation.version)
    implementation_name = sys.implementation.name
else:
    implementation_version = '0'
    implementation_name = ''
bindings = {
    'implementation_name': implementation_name,
    'implementation_version': implementation_version,
    'os_name': os.name,
    'platform_machine': platform.machine(),
    'platform_python_implementation': platform.python_implementation(),
    'platform_release': platform.release(),
    'platform_system': platform.system(),
    'platform_version': platform.version(),
    'python_full_version': platform.python_version(),
    'python_version': '.'.join(platform.python_version_tuple()[:2]),
    'sys_platform': sys.platform,
}

compiled = makeGrammar(grammar, {'lookup': bindings.__getitem__})
for test in tests:
    parsed = compiled(test).specification()
    print("%s -> %s" % (test, parsed))

Summary of changes to PEP 508

The following changes were made based on feedback after its initial implementation:

  • The definition of python_version was changed from platform.python_version()[:3] to '.'.join(platform.python_version_tuple()[:2]), to accommodate potential future versions of Python with 2-digit major and minor versions (e.g. 3.10). 3

References

1

pip, the recommended installer for Python packages (http://pip.readthedocs.org/en/stable/)

2

The parsley PEG library. (https://pypi.python.org/pypi/parsley/)

3

Future Python versions might be problematic with the definition of Environment Marker Variable python_version (https://github.com/python/peps/issues/560)