Rawutil
A pure-python and lightweight module to read and write binary data
Introduction
Rawutil is a module for reading and writing binary data in Python in the same way as the built-in struct module, but with more features.
Its interface is therefore compatible with struct, with a few small exceptions and many additions.
It has no dependencies outside the standard library.
What’s already in struct
- Unpack and pack fixed structures from/to bytes (pack, pack_into, unpack, unpack_from, iter_unpack, calcsize)
- Struct objects that parse once and for all a structure that may be used several times
What’s different compared to struct
- Some rarely-used format characters are not in rawutil (N, P and p are not available, n is used for a different purpose)
- There is no consideration for native size and alignment, so the @ character simply applies the system byte order with standard sizes and no alignment, just like =
- There are several differences in error handling that are described below
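To make the point about native size and alignment concrete, here is a small sketch that uses only the standard struct module (not rawutil). It compares the size of the same two fields in native mode (@) and standard mode (=); going by the list above, rawutil's @ would be expected to behave like the = case.

```python
import struct

# Native mode ("@") may insert alignment padding between the byte and the int;
# standard mode ("=") packs the fields back to back.
native_size = struct.calcsize("@BI")    # platform-dependent, commonly 8
standard_size = struct.calcsize("=BI")  # always 1 + 4 = 5

print(native_size, standard_size)
```

The native size depends on the platform, which is why it is printed here rather than assumed.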
What has been added to struct
- Reading and writing files and file-like objects
- New format characters, to handle padding, alignment, strings, ...
- Internal references in structures
- Loops in structures
- New features to handle variable byte order
Usage
Rawutil exports more or less the same interface as struct. In all of these functions, structure may be a simple format string or a Struct object.
unpack
unpack(structure, data, names=None, refdata=(), byteorder=None)
Unpacks the given data according to the structure, and returns the unpacked values as a list.
- structure is the structure of the data to unpack, as a format string or a Struct object
- data may be a bytes-like or a file-like object. If it is a file-like object, the data will be unpacked starting from the current position in the file, and the cursor will be left at the end of the data that has been read (effectively reading the data to unpack from the file).
- names may be a list of field names for a namedtuple, or a callable that takes all unpacked elements in order as arguments, like a namedtuple or a dataclass.
- refdata may be used to easily input external data into the structure, as #n references. This will be described in the References part below
- byteorder ("little" / "big") may be used to force the byteorder over the one defined in the format string
Unlike struct, this function does not raise any error if the data is larger than the structure's expected size.
Examples :
>>> unpack("4B 3s 3s", b"\x01\x02\x03\x04foobar")
[1, 2, 3, 4, b'foo', b'bar']
>>> unpack("<4s #0I", b"ABCD\x10\x00\x00\x00\x20\x00\x00\x00", names=("string", "num1", "num2"), refdata=(2, ))
RawutilNameSpace(string=b'ABCD', num1=16, num2=32)
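For comparison, the sketch below shows the struct behaviour that rawutil relaxes. It uses only the standard library: struct.unpack rejects a buffer that is longer than the format, so the caller has to slice it first. The sample data is made up for illustration.

```python
import struct

fmt = "4B 3s 3s"
data = b"\x01\x02\x03\x04foobar-extra"  # longer than the format needs
size = struct.calcsize(fmt)             # 4 + 3 + 3 = 10 bytes

try:
    struct.unpack(fmt, data)
    oversized_ok = True
except struct.error:
    oversized_ok = False  # struct requires exactly `size` bytes

# With struct, trailing bytes must be cut off explicitly.
values = struct.unpack(fmt, data[:size])
print(oversized_ok, values)
```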
unpack_from
unpack_from(structure, data, offset=None, names=None, refdata=(), getptr=False)
Unpacks the given data according to the structure starting from the given position, and returns the unpacked values as a list.
This function works exactly like unpack, with two more optional arguments :
- offset can be used to specify a starting position to read. In a file-like object, the cursor is moved to the given absolute offset, then the data to unpack is read and the cursor is left at the end of the data that has been read. If this parameter is not set, it works like unpack and reads from the current position
- getptr can be set to True to return the final position in the data, after the unpacked data. The function will then return (values, end_position). If left to False, it works like unpack and only returns the values.
Examples :
>>> unpack_from("<4s #0I", b"ABCD\x10\x00\x00\x00\x20\x00\x00\x00", names=("string", "num1", "num2"), refdata=(2, ))
RawutilNameSpace(string=b'ABCD', num1=16, num2=32)
>>> values, endpos = unpack_from("<2I", b"ABCD\x10\x00\x00\x00\x20\x00\x00\x00EFGH", offset=4, getptr=True)
>>> values
[16, 32]
>>> endpos
12
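The end position returned with getptr=True can be understood as the starting offset plus the size of the data consumed. A minimal sketch of that arithmetic, written with the standard struct module rather than rawutil, on the same data as the example above:

```python
import struct

data = b"ABCD\x10\x00\x00\x00\x20\x00\x00\x00EFGH"
fmt = "<2I"
offset = 4

values = struct.unpack_from(fmt, data, offset)
# For a fixed-length format, the end position is offset + size.
end_position = offset + struct.calcsize(fmt)

print(values, end_position, data[end_position:])
```

This only holds for fixed-length formats; for structures whose length depends on the data, the position has to come from the unpacking itself, which is what getptr is for.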
iter_unpack
iter_unpack(structure, data, names=None, refdata=())
Returns an iterator that will unpack according to the structure and return the values as a list at each iteration.
The length of the data must be a multiple of the structure’s length. If names is defined, each iteration will return a namedtuple, just like unpack and unpack_from. refdata also works the same way.
This function is present mostly to ensure compatibility with struct. It is generally better to use iterators in structures, which are faster and offer much more control.
Examples :
>>> for a, b, c in iter_unpack("3c", b"abcdefghijkl"):
...     print(a.decode("ascii"), b.decode("ascii"), c.decode("ascii"))
...
a b c
d e f
g h i
j k l
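The two rules above (data length must be a multiple of the structure's length, and one named tuple per iteration when names are given) can be sketched with the standard library alone. The Triple type and its field names are hypothetical, chosen only for this illustration; the code shows struct.iter_unpack, not rawutil itself.

```python
import struct
from collections import namedtuple

# Hypothetical record type for the illustration.
Triple = namedtuple("Triple", ("first", "second", "third"))

data = b"abcdefghijkl"  # 12 bytes, a multiple of the 3-byte structure
records = [Triple._make(chunk) for chunk in struct.iter_unpack("3c", data)]

try:
    list(struct.iter_unpack("3c", data + b"m"))  # 13 bytes, not a multiple
    accepted = True
except struct.error:
    accepted = False

print(records[0], len(records), accepted)
```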
pack
pack(structure, *data, refdata=(), byteorder=None, padding_byte=0x00)
Packs the given data in the binary format defined by structure, and returns the packed data as a bytes object.
- refdata is still there to insert external data in the structure using the #n references, and is a named argument only.
- byteorder ("little" / "big") may be used to force the byteorder over the one defined in the format string
- padding_byte is the value of the padding bytes inserted by "x" and "a" format characters
Examples :
>>> pack("<2In", 10, 100, b"String")
b'\n\x00\x00\x00d\x00\x00\x00String\x00'
>>> pack(">#0B #1I", 10, 100, 1000, 10000, 100000, refdata=(2, 3))
b"\nd\x00\x00\x03\xe8\x00\x00'\x10\x00\x01\x86\xa0"
>>> unpack(">2B3I", _)
[10, 100, 1000, 10000, 100000]
pack_into
pack_into(structure, buffer, offset, *data, refdata=(), byteorder=None)
Packs the given data into the given buffer at the given offset according to the given structure. refdata has the same usage as everywhere else.
- buffer must be a mutable bytes-like object (typically a bytearray). The data will be written directly into it at the given position
- offset specifies the position to write the data to. It is a required argument.
- byteorder ("little" / "big") may be used to force the byteorder over the one defined in the format string
- padding_byte is the value of the padding bytes inserted by "x" and "a" format characters
Examples :
>>> b = bytearray(b"AB----GH")
>>> pack_into("4s", b, 2, b"CDEF")
>>> b
bytearray(b'ABCDEFGH')
pack_file
pack_file(structure, file, *data, position=None, refdata=(), byteorder=None)
Packs the given data into the given file according to the given structure. refdata is still there for the external references data.
- file can be any binary writable file-like object.
- position can be set to pack the data at a specific position in the file. If it is left to None, the data will be packed at the current position in the file. In either case, the cursor will end up at the end of the packed data.
- byteorder ("little" / "big") may be used to force the byteorder over the one defined in the format string
- padding_byte is the value of the padding bytes inserted by "x" and "a" format characters
Examples :
>>> file = io.BytesIO(b"\x00\x00\x00\x00\x00\x00\x00\x00")
>>> rawutil.pack_file("2B", file, 60, 61) # Writes at the current position (0)
>>> rawutil.pack_file("c", file, b"A") # Writes at the current position (now 2)
>>> rawutil.pack_file("2c", file, b"y", b"z", position=6) # Writes at the given position (6)
>>> file.seek(0)
0
>>> file.read()
b'<=A\x00\x00\x00yz'
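As a rough mental model of the cursor handling described above, the following sketch reproduces the same sequence of writes with the standard struct module. The write_packed helper is hypothetical and is not part of rawutil; it only mirrors the "seek if a position is given, then write" behaviour.

```python
import io
import struct

def write_packed(fmt, file, *values, position=None):
    """Hypothetical helper: seek if a position is given, then write the packed values."""
    if position is not None:
        file.seek(position)
    file.write(struct.pack(fmt, *values))

file = io.BytesIO(b"\x00" * 8)
write_packed("2B", file, 60, 61)                  # written at position 0
write_packed("c", file, b"A")                     # written at position 2
write_packed("2c", file, b"y", b"z", position=6)  # written at position 6

result = file.getvalue()
cursor = file.tell()  # cursor sits right after the last write
print(result, cursor)
```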
calcsize
calcsize(structure, refdata=())
Returns the size of the data represented by the given structure.
Rawutil structures are not always of a fixed length, as they use internal references and variable length formats.
Hence calcsize only works on fixed-length structures, that only use :
- Fixed-length format characters (basic types with set repeat count)
- External references (#0 type references, if you provide their value in refdata)
- Iterators with fixed number of repeats (2(…) or 5[…] will work)
- Alignments (structures with a and |). As long as everything else is fixed, alignments are too.
Trying to compute the size of a structure that includes any of the following will raise a FormatError (basically, anything that depends on the data to read / write) :
- Variable-length format characters (namely n and $)
- {…} iterators, as they depend on the amount of data remaining.
- Internal references (any /1 or /p1 type references)
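To illustrate why a structure that depends on its own data has no fixed size, here is a sketch in plain struct of a layout where a count byte is followed by that many 32-bit integers. The layout and the read_counted_ints helper are hypothetical and do not use rawutil's reference syntax; they only show that the total size is known after reading the count, not before.

```python
import struct

def read_counted_ints(data):
    """Hypothetical layout: one count byte, then that many little-endian uint32."""
    (count,) = struct.unpack_from("<B", data, 0)
    fmt = f"<{count}I"                       # format depends on the data just read
    values = struct.unpack_from(fmt, data, 1)
    return list(values), 1 + struct.calcsize(fmt)

short = bytes([1]) + struct.pack("<I", 7)
longer = bytes([3]) + struct.pack("<3I", 7, 8, 9)

print(read_counted_ints(short))
print(read_counted_ints(longer))
```

The same layout yields two different total sizes, so no single size can be computed from the format alone.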
Struct
Struct(format, names=None, safe_references=True)
Struct objects let you pre-parse format strings once and for all. A bare format string has to be parsed again every time it is used, so if a structure is used more than once, wrapping it in a Struct object saves time. You can also set the element names once, and they will then be used by default every time you unpack data with that structure. Any function that accepts a format string also accepts Struct objects.
A Struct object is initialized with a format string, and can take a names parameter that may be a namedtuple or a list of names, so that data unpacked with this structure is returned in a more convenient namedtuple. It works exactly the same as the names parameter of unpack and its variants, but without having to specify it each time.
The namedtuple type can also be retrieved from the structure.names attribute, and can be used to clarify packed values :
>>> my_structure = rawutil.Struct("4s I", names=("magic", "size"))
>>> data = my_structure.names(magic=b"1234", size=64)
>>> my_structure.pack(*data)
b'1234@\x00\x00\x00'
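The names parameter is described as also accepting a callable that takes all unpacked elements in order, such as a dataclass. The sketch below shows what such a callable receives, using the standard struct module in place of rawutil; the Header class is hypothetical, and the explicit little-endian prefix is an assumption made so the output is the same on every platform.

```python
import struct
from dataclasses import dataclass

@dataclass
class Header:
    # Hypothetical fields matching the "4s I" structure above.
    magic: bytes
    size: int

raw = struct.pack("<4sI", b"1234", 64)
# The unpacked elements are passed to the callable in order.
header = Header(*struct.unpack("<4sI", raw))

print(raw, header)
```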
The safe_references parameter, when set to
