Natsort
Simple yet flexible natural sorting in Python.
.. image:: https://img.shields.io/pypi/v/natsort.svg
    :target: https://pypi.org/project/natsort/
.. image:: https://img.shields.io/pypi/pyversions/natsort.svg
    :target: https://pypi.org/project/natsort/
.. image:: https://img.shields.io/pypi/l/natsort.svg
    :target: https://github.com/SethMMorton/natsort/blob/main/LICENSE
.. image:: https://github.com/SethMMorton/natsort/workflows/Tests/badge.svg
    :target: https://github.com/SethMMorton/natsort/actions
.. image:: https://codecov.io/gh/SethMMorton/natsort/branch/main/graph/badge.svg
    :target: https://codecov.io/gh/SethMMorton/natsort
.. image:: https://img.shields.io/pypi/dw/natsort.svg
    :target: https://pypi.org/project/natsort/
- Source Code: https://github.com/SethMMorton/natsort
- Downloads: https://pypi.org/project/natsort/
- Documentation: https://natsort.readthedocs.io/
- `Examples and Recipes`_
- `How Does Natsort Work?`_
- `API`_
- `Quick Description`_
- `Quick Examples`_
- `FAQ`_
- `Requirements`_
- `Optional Dependencies`_
- `Installation`_
- `How to Run Tests`_
- `How to Build Documentation`_
- `Dropped Deprecated APIs`_
- `History`_
NOTE: Please see the `Dropped Deprecated APIs`_ section for changes.
Quick Description
-----------------
When you sort a list of strings that contain numbers, Python's default sort orders them lexicographically, so the result may not be what you expect:
.. code-block:: pycon

    >>> a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
    >>> sorted(a)
    ['1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '2 ft 7 in', '7 ft 6 in']
Notice that the result has the order ('1', '10', '2'). This is because the list is sorted in lexicographical order, which compares digits character by character, just as it would letters (e.g. 'b', 'ba', 'c').
natsort_ provides a function `natsorted()`_ that helps sort lists
"naturally" ("naturally" is rather ill-defined, but in general it means
sorting based on meaning and not computer code point).
Using `natsorted()`_ is simple:
.. code-block:: pycon

    >>> from natsort import natsorted
    >>> a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
    >>> natsorted(a)
    ['1 ft 5 in', '2 ft 7 in', '2 ft 11 in', '7 ft 6 in', '10 ft 2 in']
`natsorted()`_ identifies numbers anywhere in a string and sorts them
naturally. Below are some other things you can do with natsort_
(also see the `Examples and Recipes`_ for a quick start guide, or the
`API`_ for complete details).
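To make the idea of "identifying numbers anywhere in a string" concrete, here is a minimal sketch that uses only the standard library. It illustrates the general technique and is not how natsort_ is actually implemented; the helper name ``simple_natural_key`` is hypothetical.

```python
import re

# Hypothetical helper for illustration only; natsort's real key handles many
# more cases (floats, signs, locale, mixed types, and so on).
_DIGITS = re.compile(r"(\d+)")


def simple_natural_key(text):
    """Split text into alternating non-digit and digit chunks.

    Digit chunks are converted to int so that 10 compares greater than 2.
    """
    parts = _DIGITS.split(text)
    # With a single capturing group, odd indices always hold the digit runs,
    # so every key has str at even positions and int at odd positions.
    return tuple(int(p) if i % 2 else p for i, p in enumerate(parts))


a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
result = sorted(a, key=simple_natural_key)
print(result)
```

Because the key alternates between ``str`` and ``int`` at fixed positions, two keys never compare a string against a number, which is why plain ``sorted()`` can consume it.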
Note: `natsorted()`_ is designed to be a drop-in replacement for the
built-in `sorted()`_ function. Like ``sorted()``, ``natsorted()``
does not sort in-place. To keep the sorted result under the same
name, you must explicitly assign the output back to that variable:
.. code-block:: pycon

    >>> a = ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
    >>> natsorted(a)
    ['1 ft 5 in', '2 ft 7 in', '2 ft 11 in', '7 ft 6 in', '10 ft 2 in']
    >>> print(a)  # 'a' was not sorted; "natsorted" simply returned a sorted list
    ['2 ft 7 in', '1 ft 5 in', '10 ft 2 in', '2 ft 11 in', '7 ft 6 in']
    >>> a = natsorted(a)  # Now 'a' will be sorted because the sorted list was assigned to 'a'
    >>> print(a)
    ['1 ft 5 in', '2 ft 7 in', '2 ft 11 in', '7 ft 6 in', '10 ft 2 in']
Please see `Generating a Reusable Sorting Key and Sorting In-Place`_ for
an alternate way to sort in-place naturally.
Quick Examples
--------------

- `Sorting Versions`_
- `Sort Paths Like My File Browser (e.g. Windows Explorer on Windows)`_
- `Sorting by Real Numbers (i.e. Signed Floats)`_
- `Locale-Aware Sorting (or "Human Sorting")`_
- `Further Customizing Natsort`_
- `Sorting Mixed Types`_
- `Handling Bytes`_
- `Generating a Reusable Sorting Key and Sorting In-Place`_
- `Other Useful Things`_
Sorting Versions
++++++++++++++++
natsort_ does not actually comprehend version numbers.
It just so happens that the most common versioning schemes are designed to
work with standard natural sorting techniques; these schemes include
MAJOR.MINOR, MAJOR.MINOR.PATCH, YEAR.MONTH.DAY. If your data
conforms to a scheme like this, then it will work out-of-the-box with
`natsorted()`_ (as of natsort_ version >= 4.0.0):
.. code-block:: pycon

    >>> a = ['version-1.9', 'version-2.0', 'version-1.11', 'version-1.10']
    >>> natsorted(a)
    ['version-1.9', 'version-1.10', 'version-1.11', 'version-2.0']
If you need to sort versions that use a more complicated scheme, please see
these `version sorting examples`_.
Sort Paths Like My File Browser (e.g. Windows Explorer on Windows)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Prior to natsort_ version 7.1.0, it was a common request to be able to
sort paths like Windows Explorer. As of natsort_ 7.1.0, the function
`os_sorted()`_ has been added to provide users the ability to sort
in the order that their file browser might sort (e.g. Windows Explorer on
Windows, Finder on macOS, Dolphin/Nautilus/Thunar/etc. on Linux).
.. code-block:: python

    import os
    from natsort import os_sorted
    print(os_sorted(os.listdir()))
    # The directory sorted like your file browser might show
The output depends on the operating system you are on.
For users not on Windows (e.g. macOS/Linux) it is strongly recommended
to also install PyICU, which will help
natsort give results that match most file browsers. If PyICU is not installed,
natsort will fall back on Python's built-in locale_ module, which gives good
results for most input but poor results for special characters.
Sorting by Real Numbers (i.e. Signed Floats)
++++++++++++++++++++++++++++++++++++++++++++
This is useful in scientific data analysis (and was the default behavior
of `natsorted()`_ for natsort_ version < 4.0.0). Use the `realsorted()`_
function:
.. code-block:: pycon

    >>> from natsort import natsorted, realsorted, ns
    >>> # Note that when interpreting as signed floats, the below numbers are
    >>> #            +5.10,                -3.00,            +5.30,              +2.00
    >>> a = ['position5.10.data', 'position-3.data', 'position5.3.data', 'position2.data']
    >>> natsorted(a)
    ['position2.data', 'position5.3.data', 'position5.10.data', 'position-3.data']
    >>> natsorted(a, alg=ns.REAL)
    ['position-3.data', 'position2.data', 'position5.10.data', 'position5.3.data']
    >>> realsorted(a)  # shortcut for natsorted with alg=ns.REAL
    ['position-3.data', 'position2.data', 'position5.10.data', 'position5.3.data']
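The difference between the two interpretations comes down to what counts as "a number" when the string is split. The following standard-library sketch shows one way a signed-float key could be built; it is an assumption-laden illustration rather than natsort_'s implementation, and ``simple_real_key`` is a hypothetical name.

```python
import re

# Hypothetical pattern: an optional sign, digits, and an optional fractional part.
_SIGNED_FLOAT = re.compile(r"([-+]?\d+(?:\.\d+)?)")


def simple_real_key(text):
    """Split text around signed decimal numbers and convert those to float."""
    parts = _SIGNED_FLOAT.split(text)
    # With a single capturing group, odd indices hold the matched numbers.
    return tuple(float(p) if i % 2 else p for i, p in enumerate(parts))


a = ['position5.10.data', 'position-3.data', 'position5.3.data', 'position2.data']
result = sorted(a, key=simple_real_key)
print(result)
```

Here ``'position5.10.data'`` yields the number 5.10 instead of the two integers 5 and 10, and the hyphen in ``'position-3.data'`` is read as a minus sign, which matches the interpretation described in the comments of the example above.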
Locale-Aware Sorting (or "Human Sorting")
+++++++++++++++++++++++++++++++++++++++++
This is where the non-numeric characters are also ordered based on their
meaning, not on their ordinal value, and locale-dependent thousands
and decimal separators are accounted for in the number.
This can be achieved with the `humansorted()`_ function:
.. code-block:: pycon

    >>> a = ['Apple', 'apple15', 'Banana', 'apple14,689', 'banana']
    >>> natsorted(a)
    ['Apple', 'Banana', 'apple14,689', 'apple15', 'banana']
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
    'en_US.UTF-8'
    >>> natsorted(a, alg=ns.LOCALE)
    ['apple15', 'apple14,689', 'Apple', 'banana', 'Banana']
    >>> from natsort import humansorted
    >>> humansorted(a)  # shortcut for natsorted with alg=ns.LOCALE
    ['apple15', 'apple14,689', 'Apple', 'banana', 'Banana']
You may find you need to explicitly set the locale to get this to work
(as shown in the example). Please see `locale issues`_ and the
`Optional Dependencies`_ section below before using the `humansorted()`_ function.
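Since the requested locale may not be installed on every system, it can help to set it defensively. The sketch below uses only Python's built-in ``locale`` module and does not involve natsort_ at all; the helper name ``set_locale_with_fallback`` is hypothetical, and the collated order it produces depends on the platform and on which locale ends up active.

```python
import locale


def set_locale_with_fallback(name="en_US.UTF-8"):
    """Try to activate the named locale; fall back to the portable "C" locale."""
    try:
        return locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        # The requested locale is not available on this system.
        return locale.setlocale(locale.LC_ALL, "C")


active = set_locale_with_fallback()
words = ['Apple', 'apple15', 'Banana', 'apple14,689', 'banana']
# locale.strxfrm transforms a string so that ordinary comparison follows the
# active locale's collation rules. No number handling happens here.
collated = sorted(words, key=locale.strxfrm)
print(active, collated)
```

If ``active`` comes back as ``"C"``, the locale-aware behavior described above will not be visible, which is one common reason for unexpected results.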
Further Customizing Natsort
+++++++++++++++++++++++++++
If you need to combine multiple algorithm modifiers (such as ``ns.REAL``,
``ns.LOCALE``, and ``ns.IGNORECASE``), you can combine the options using the
bitwise OR operator (``|``). For example:
.. code-block:: pycon

    >>> a = ['Apple', 'apple15', 'Banana', 'apple14,689', 'banana']
    >>> natsorted(a, alg=ns.REAL | ns.LOCALE | ns.IGNORECASE)
    ['Apple', 'apple15', 'apple14,689', 'Banana', 'banana']
    >>> # The ns enum provides long and short forms for each option.
    >>> ns.LOCALE == ns.L
    True
    >>> # You can also customize the convenience functions, too.
    >>> natsorted(a, alg=ns.REAL | ns.LOCALE | ns.IGNORECASE) == realsorted(a, alg=ns.L | ns.IC)
    True
    >>> natsorted(a, alg=ns.REAL | ns.LOCALE | ns.IGNORECASE) == humansorted(a, alg=ns.R | ns.IC)
    True
All of the available customizations can be found in the documentation for
the `ns enum`_.
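For readers unfamiliar with combining options through bitwise OR, the following sketch shows the general pattern with a stand-in enum. The class ``Alg``, its members, and their numeric values are all hypothetical and chosen for illustration; they are not the definition or the values of natsort_'s ``ns``.

```python
import enum


class Alg(enum.IntFlag):
    """Hypothetical stand-in for an options enum; NOT natsort's ``ns``."""

    DEFAULT = 0
    REAL = 1
    LOCALE = 2
    IGNORECASE = 4
    # Short aliases refer to the same members as the long names.
    R = REAL
    L = LOCALE
    IC = IGNORECASE


combined = Alg.REAL | Alg.LOCALE | Alg.IGNORECASE

# Each option occupies its own bit, so a bitwise AND tests whether it is set.
has_locale = bool(combined & Alg.LOCALE)
# The order in which options are combined does not matter.
same = combined == (Alg.IC | Alg.L | Alg.R)
print(int(combined), has_locale, same)
```

Because each option maps to a distinct bit, a single integer can carry any combination of options through one ``alg`` argument.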
You can also add your own custom transformation functions with the ``key``
argument. These can be used with ``alg`` if you wish.
.. code-block:: pycon

    >>> a = ['apple2.50', '2.3apple']
    >>> natsorted(a, key=lambda x: x.replace('apple', ''), alg=ns.REAL)
    ['2.3apple', 'apple2.50']
Sorting Mixed Types
+++++++++++++++++++
You can mix and match int, float, and str_ types when you sort:
.. code-block:: pycon

    >>> a = ['4.5', 6, 2.0, '5', 'a']
    >>> natsorted(a)
    [2.0, '4.5', '5', 6, 'a']
    >>> # sorted(a) would raise an "unorderable types" TypeError
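The ``TypeError`` mentioned in the last comment can be observed with the standard library alone. The sketch below also shows a naive workaround, ``key=str``, purely to illustrate why a dedicated natural-sort key is preferable; it is not something natsort_ does.

```python
a = ['4.5', 6, 2.0, '5', 'a']

try:
    sorted(a)
except TypeError as exc:
    # Python 3 refuses to order str against int or float.
    error_message = str(exc)
else:
    error_message = None

# Naive workaround: compare everything as text. This avoids the error, but
# the comparison is lexicographic, so e.g. 10 would be placed before 9.
naive = sorted(a, key=str)
naive_numbers = sorted([9, 10], key=str)
print(error_message, naive, naive_numbers)
```

The naive key happens to give a reasonable order for this particular list, but the ``[9, 10]`` case shows that it does not understand numeric magnitude.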
Handling Bytes
++++++++++++++
natsort_ does not officially support the bytes_ type, but
convenience functions are provided that help you decode to str_ first:
.. code-block:: pycon

    >>> from natsort import as_utf8
    >>> a = [b'a', 14.0, 'b']
    >>> # natsorted(a) would raise a TypeError (bytes() < str())
    >>> natsorted(a, key=as_utf8) == [14.0, b'a', 'b']
    True
    >>> a = [b'a56', b'a5', b'a6', b'a40']
    >>> # natsorted(a) would return the same results as sorted(a)
    >>> natsorted(a, key=as_utf8) == [b'a5', b'a6', b'a40', b'a56']
    True
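The "decode first" idea can be sketched with a small key function and the built-in ``sorted()``. The helper ``decode_if_bytes`` below is hypothetical and only similar in spirit to the convenience functions mentioned above; the details of natsort_'s own helpers may differ.

```python
def decode_if_bytes(value, encoding="utf-8"):
    """Hypothetical helper: decode bytes to str, return anything else unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


mixed = [b'b', 'c', b'a']
# Every key is a str, so the comparison succeeds. The key is used only for
# ordering, so the original bytes and str objects appear in the result.
result = sorted(mixed, key=decode_if_bytes)
print(result)
```

Note that the key function affects only how elements are compared; the returned list still contains the original ``bytes`` objects, just as in the examples above.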
Generating a Reusable Sorting Key and Sorting In-Place
++++++++++++++++++++++++++++++++++++++++++++++++++++++
