SkillAgentSearch skills...

Pyiwa

Read iWork's internal iwa format using python

Install / Use

/learn @matchaxnb/Pyiwa
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

pyiwa

Read iWork's internal iwa format using python.

pyiwa was written as a recovery tool for ugly disk crashes or accidental deletions. It is not a .pages file parser.

Actually, there's no such thing as a iWork '09+ pages file. They are directories containing another directory and a zip file which contains the actual, useful payload.

This payload is constituted of iwa files. They were reverse-engineered by @obriensp (see: https://github.com/obriensp/iWorkFileFormat).

pyiwa reads iwa files extracted from iWork zips.

First it extracts the hacked-upon-snappy compressed stream that an iwa file actually is.

Then it uses introspection to parse the various contatenated fields and stores them in a list of Protocol buffers (actually an array of them). The reverse engineering is not complete so some fields are just not understood and skipped. There is no way to match them with a known protocol buffer so we ignore them.

Once all the fields have been listed, another loop searches for fields containing a text item and prints the text items.

Therefore pyiwa could be considered, as it is, as an iwacat.

However there's more to it as one could use it to parse the formatting, fonts, etc. used to make it an iwa2*, though I doubt it's worth it.

It's also a good programming exercise in python with binary streams.

View on GitHub
GitHub Stars8
CategoryDevelopment
Updated2y ago
Forks1

Languages

Python

Security Score

70/100

Audited on Feb 28, 2024

No findings