furl is a small Python library that makes parsing and modifying URLs easy.

Python's standard urllib and urlparse modules (urllib.parse in Python 3) provide a number of URL-related functions, but using them to perform common URL operations is tedious. Furl makes parsing and modifying URLs easy.

Furl is well tested, Unlicensed in the public domain, and supports Python 3 and PyPy3.

Furl is maintained by Alex Cochran, with support from the confidential computing folks at 🌖 Lunal.

Usage

Code time: Paths and query arguments are easy. Really easy.

>>> from furl import furl
>>> f = furl('http://www.google.com/?one=1&two=2')
>>> f /= 'path'
>>> del f.args['one']
>>> f.args['three'] = '3'
>>> f.url
'http://www.google.com/path?two=2&three=3'

Or use furl's inline modification methods.

>>> furl('http://www.google.com/?one=1').add({'two':'2'}).url
'http://www.google.com/?one=1&two=2'

>>> furl('http://www.google.com/?one=1&two=2').set({'three':'3'}).url
'http://www.google.com/?three=3'

>>> furl('http://www.google.com/?one=1&two=2').remove(['one']).url
'http://www.google.com/?two=2'
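
For comparison, here is a rough standard-library version of the first example in this section (add a path segment, drop one argument, add another). It is only a sketch of the manual steps involved, not how furl is implemented, and the helper name edit_url is made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def edit_url(url):
    """Append the segment 'path', drop the 'one' argument, and add three=3."""
    parts = urlsplit(url)
    # Append a path segment without doubling the slash.
    path = parts.path.rstrip('/') + '/path'
    # parse_qsl() returns the arguments as an ordered list of (key, value) pairs.
    pairs = [(key, value) for key, value in parse_qsl(parts.query) if key != 'one']
    pairs.append(('three', '3'))
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(pairs), parts.fragment))


print(edit_url('http://www.google.com/?one=1&two=2'))
# http://www.google.com/path?two=2&three=3
```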

Encoding is handled for you. Unicode, too.

>>> f = furl('http://www.google.com/')
>>> f.path = 'some encoding here'
>>> f.args['and some encoding'] = 'here, too'
>>> f.url
'http://www.google.com/some%20encoding%20here?and+some+encoding=here,+too'
>>> f.set(host=u'ドメイン.テスト', path=u'джк', query=u'☃=☺')
>>> f.url
'http://xn--eckwd4c7c.xn--zckzah/%D0%B4%D0%B6%D0%BA?%E2%98%83=%E2%98%BA'
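
To make the encodings above concrete, the following sketch reproduces them with the standard library alone. It illustrates the three encodings involved (percent-encoding for paths, plus-encoding for queries, IDNA for hosts) and makes no claim about furl's internals; the choice of safe=',' is an assumption made to match the output shown above.

```python
from urllib.parse import quote, quote_plus

# Path segments are percent-encoded, so spaces become %20.
path = quote('some encoding here')

# Query keys and values encode spaces as '+'. safe=',' leaves the comma
# literal (an assumption chosen to match the furl output shown above).
key = quote_plus('and some encoding')
value = quote_plus('here, too', safe=',')

# Non-ASCII host names are converted to ASCII with IDNA (Punycode).
host = 'ドメイン.テスト'.encode('idna').decode('ascii')

print(f'http://www.google.com/{path}?{key}={value}')
# http://www.google.com/some%20encoding%20here?and+some+encoding=here,+too
print(host)          # xn--eckwd4c7c.xn--zckzah
print(quote('джк'))  # %D0%B4%D0%B6%D0%BA
```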

Fragments also have a path and a query.

>>> f = furl('http://www.google.com/')
>>> f.fragment.path.segments = ['two', 'directories']
>>> f.fragment.args = {'one': 'argument'}
>>> f.url
'http://www.google.com/#two/directories?one=argument'
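
The idea that a fragment carries its own path and query can be sketched by hand with the standard library. The helper split_fragment below is hypothetical and only shows the splitting rule (path before the first ?, query after it); furl's Fragment object does more than this.

```python
from urllib.parse import parse_qsl, urlsplit


def split_fragment(url):
    """Return (path segments, query pairs) parsed from the fragment of url."""
    fragment = urlsplit(url).fragment
    # Everything before the first '?' is the fragment path; the rest is its query.
    path, _, query = fragment.partition('?')
    return path.split('/'), parse_qsl(query)


print(split_fragment('http://www.google.com/#two/directories?one=argument'))
# (['two', 'directories'], [('one', 'argument')])
```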

Installation

Installing furl with pip is easy.

$ pip install furl

API

Basics
Scheme, Username, Password, Host, Port, Network Location, and Origin
Path
- Modification
Query
- Modification
- Parameters
Fragment
Encoding
Inline modification
Miscellaneous

Basics

furl objects let you access and modify the various components of a URL.

scheme://username:password@host:port/path?query#fragment

scheme is the scheme string (all lowercase) or None. None means no scheme. An empty string means a protocol relative URL, like //www.google.com.
username is the username string for authentication.
password is the password string for authentication with username.
host is the domain name, IPv4, or IPv6 address as a string. Domain names are all lowercase.
port is an integer or None. A value of None means no port specified and the default port for the given scheme should be inferred, if possible (e.g. port 80 for the scheme http).
path is a Path object comprised of path segments.
query is a Query object comprised of key:value query arguments.
fragment is a Fragment object comprised of a Path object and Query object separated by an optional ? separator.

Scheme, Username, Password, Host, Port, Network Location, and Origin

scheme, username, password, and host are strings or None. port is an integer or None.

>>> f = furl('http://user:pass@www.google.com:99/')
>>> f.scheme, f.username, f.password, f.host, f.port
('http', 'user', 'pass', 'www.google.com', 99)

furl infers the default port for common schemes.

>>> f = furl('https://secure.google.com/')
>>> f.port
443

>>> f = furl('unknown://www.google.com/')
>>> print(f.port)
None
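
Default port inference amounts to a lookup keyed by scheme. The sketch below shows the idea with the standard library; DEFAULT_PORTS and effective_port are illustrative names, and the table holds only the two schemes mentioned in this document rather than furl's full list.

```python
from urllib.parse import urlsplit

# Illustrative subset: only the schemes mentioned in this document.
DEFAULT_PORTS = {'http': 80, 'https': 443}


def effective_port(url):
    """Return the explicit port, else the scheme's default, else None."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port('https://secure.google.com/'))            # 443
print(effective_port('http://user:pass@www.google.com:99/'))   # 99
print(effective_port('unknown://www.google.com/'))             # None
```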

netloc is the string combination of username, password, host, and port, not including port if it's None or the default port for the provided scheme.

>>> furl('http://www.google.com/').netloc
'www.google.com'

>>> furl('http://www.google.com:99/').netloc
'www.google.com:99'

>>> furl('http://user:pass@www.google.com:99/').netloc
'user:pass@www.google.com:99'

origin is the string combination of scheme, host, and port, not including port if it's None or the default port for the provided scheme.

>>> furl('http://www.google.com/').origin
'http://www.google.com'

>>> furl('http://www.google.com:99/').origin
'http://www.google.com:99'
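
The rule for omitting the port can be written out in a few lines. This is a standard-library sketch of the rule as described above, not furl's code; the function name origin_of and the two-entry DEFAULT_PORTS table are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Illustrative subset of default ports.
DEFAULT_PORTS = {'http': 80, 'https': 443}


def origin_of(url):
    """Build scheme://host[:port], leaving out a missing or default port."""
    parts = urlsplit(url)
    result = f'{parts.scheme}://{parts.hostname}'
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(parts.scheme):
        result += f':{parts.port}'
    return result


print(origin_of('http://www.google.com/'))     # http://www.google.com
print(origin_of('http://www.google.com:80/'))  # http://www.google.com
print(origin_of('http://www.google.com:99/'))  # http://www.google.com:99
```

Note that the username and password are part of netloc but not of origin, so they are dropped here.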

Path

URL paths in furl are Path objects that have segments, a list of zero or more path segments that can be modified directly. Path segments in segments are percent-decoded and all interaction with segments should take place with percent-decoded strings.

>>> f = furl('http://www.google.com/a/large%20ish/path')
>>> f.path
Path('/a/large ish/path')
>>> f.path.segments
['a', 'large ish', 'path']
>>> str(f.path)
'/a/large%20ish/path'

Modification

>>> f.path.segments = ['a', 'new', 'path', '']
>>> str(f.path)
'/a/new/path/'

>>> f.path = 'o/hi/there/with%20some%20encoding/'
>>> f.path.segments
['o', 'hi', 'there', 'with some encoding', '']
>>> str(f.path)
'/o/hi/there/with%20some%20encoding/'

>>> f.url
'http://www.google.com/o/hi/there/with%20some%20encoding/'

>>> f.path.segments = ['segments', 'are', 'maintained', 'decoded', '^`<>[]"#/?']
>>> str(f.path)
'/segments/are/maintained/decoded/%5E%60%3C%3E%5B%5D%22%23%2F%3F'
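
The split between decoded segments and the encoded string form can be sketched with urllib.parse. The helpers decode_path and encode_path are hypothetical, and quote(..., safe='') encodes more aggressively than furl may (it escapes every reserved character), so treat this as an illustration of the round trip rather than a drop-in equivalent.

```python
from urllib.parse import quote, unquote


def decode_path(path):
    """Split an absolute, percent-encoded path into decoded segments."""
    return [unquote(segment) for segment in path.split('/')[1:]]


def encode_path(segments):
    """Join decoded segments into an absolute, percent-encoded path."""
    # safe='' also escapes '/', so a slash inside a segment cannot be
    # mistaken for a segment separator.
    return '/' + '/'.join(quote(segment, safe='') for segment in segments)


print(decode_path('/a/large%20ish/path'))
# ['a', 'large ish', 'path']
print(encode_path(['a', 'new', 'path', '']))
# /a/new/path/
print(encode_path(['decoded', '^`<>[]"#/?']))
# /decoded/%5E%60%3C%3E%5B%5D%22%23%2F%3F
```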

A path that starts with / is considered absolute, and a Path can be absolute or not as specified (or set) by the boolean attribute isabsolute. URL Paths have a special restriction: they must be absolute if a netloc (username, password, host, and/or port) is present. This restriction exists because a URL path must start with / to separate itself from the netloc, if present. Fragment Paths have no such limitation, and isabsolute can be True or False without restriction.

Here's a URL Path example that illustrates how isabsolute becomes True and read-only in the presence of a netloc.

>>> f = furl('/url/path')
>>> f.path.isabsolute
True
>>> f.path.isabsolute = False
>>> f.url
'url/path'
>>> f.host = 'blaps.ru'
>>> f.url
'blaps.ru/url/path'
>>> f.path.isabsolute
True
>>> f.path.isabsolute = False
Traceback (most recent call last):
  ...
AttributeError: Path.isabsolute is True and read-only for URLs with a netloc (a username, password, host, and/or port). URL paths must be absolute if a netloc exists.
>>> f.url
'blaps.ru/url/path'

Conversely, the isabsolute attribute of Fragment Paths isn't bound by the same read-only restriction. URL fragments are always prefixed by a # character and don't need to be separated from the netloc.

>>> f = furl('http://www.google.com/#/absolute/fragment/path/')
>>> f.fragment.path.isabsolute
True
>>> f.fragment.path.isabsolute = False
>>> f.url
'http://www.google.com/#absolute/fragment/path/'
>>> f.fragment.path.isabsolute = True
>>> f.url
'http://www.google.com/#/absolute/fragment/path/'

A path that ends with / is considered a directory, and otherwise considered a file. The Path attribute isdir returns True if the path is a directory, False otherwise. Conversely, the attribute isfile returns True if the path is a file, False otherwise.

>>> f = furl('http://www.google.com/a/directory/')
>>> f.path.isdir
True
>>> f.path.isfile
False

>>> f = furl('http://www.google.com/a/file')
>>> f.path.isdir
False
>>> f.path.isfile
True

A path can be normalized with normalize(), and normalize() returns the Path object for method chaining.

>>> f = furl('http://www.google.com////a/./b/lolsup/../c/')
>>> f.path.normalize()
>>> f.url
'http://www.google.com/a/b/c/'
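
What normalization does to the example above can be sketched in plain Python: empty and . segments are dropped, .. removes the preceding segment, and a trailing slash is preserved. The function normalize_path is a hypothetical stand-in that assumes an absolute path; furl's normalize() may treat edge cases differently.

```python
def normalize_path(path):
    """Collapse repeated slashes and resolve '.' and '..' in an absolute path."""
    is_dir = path.endswith('/')
    kept = []
    for segment in path.split('/'):
        if segment in ('', '.'):
            continue  # Skip empty segments (from '//') and current-directory dots.
        if segment == '..':
            if kept:
                kept.pop()  # Step back up one directory.
            continue
        kept.append(segment)
    result = '/' + '/'.join(kept)
    # Keep the trailing slash so a directory path stays a directory path.
    if is_dir and kept:
        result += '/'
    return result


print(normalize_path('////a/./b/lolsup/../c/'))
# /a/b/c/
```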

Path segments can also be appended with the slash operator, as with pathlib.Path.

>>> from __future__ import division  # For Python 2.x.
>>>
>>> f = furl('path')
>>> f.path /= 'with'
>>> f.path = f.path / 'more' / 'path segments/'
>>> f.url
'/path/with/more/path%20segments/'

For a dictionary representation of a path, use asdict().

>>> f = furl('http://www.google.com/some/enc%20oding')
>>> f.path.asdict()
{ 'encoded': '/some/enc%20oding',
  'isabsolute': True,
  'isdir': False,
  'isfile': True,
  'segments': ['some', 'enc oding'] }

Query

URL queries in furl are Query objects that have params, a one-dimensional ordered multivalue dictionary of query keys and values. Query keys and values in params are percent-decoded and all interaction with params should take place with percent-decoded strings.

>>> f = furl('http://www.google.com/?one=1&two=2')
>>> f.query
Query('one=1&two=2')
>>> f.query.params
omdict1D([('one', '1'), ('two', '2')])
>>> str(f.query)
'one=1&two=2'
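
"Ordered multivalue" means that the order of arguments is preserved and a key may appear more than once. The sketch below shows both properties with a plain list of pairs from the standard library; it does not use or describe the omdict1D type itself, and values_for is a made-up helper.

```python
from urllib.parse import parse_qsl, urlencode

# parse_qsl() percent-decodes keys and values and keeps repeated keys in order.
pairs = parse_qsl('one=1&two=2&one=uno&q=a%20b')
print(pairs)
# [('one', '1'), ('two', '2'), ('one', 'uno'), ('q', 'a b')]


def values_for(pairs, key):
    """Return every value stored under key, in order of appearance."""
    return [value for name, value in pairs if name == key]


print(values_for(pairs, 'one'))  # ['1', 'uno']

# Encoding the pairs again keeps both the order and the duplicates.
print(urlencode(pairs))
# one=1&two=2&one=uno&q=a+b
```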

furl objects and Fragment objects (covered below) contain a Query object, and args is provided as a shortcut on these objects to access query.params.

>>> f = furl('http://www.google.com/?one=1&two
