Xpyth

A module for querying the DOM tree and writing XPath expressions using native Python syntax.


Example usage

>>> from xpyth import xpath, DOM, X

>>> xpath(X for X in DOM if X.name == 'main')
"//*[@name='main']"

>>> xpath(span for div in DOM for span in div if div.id == 'main')
"//div[@id='main']//span"

>>> xpath(a for a in DOM if '.com' not in a.href)
"//a[not(contains(@href, '.com'))]"

>>> xpath(a.href for a in DOM if any(p for p in a.ancestors if p.id))
"//a[./ancestor::p[@id]]/@href"

>>> xpath(X.data-bind for X in DOM if X.data-bind == '1')
"//*[@data-bind='1']/@data-bind"

>>> xpath(
...     form.action 
...     for form in DOM 
...     if all(
...         input 
...         for input in form.children 
...         if input.value == 'a'
...     )
... )
"//form[not(./input[not(@value='a')])]/@action"

>>> allowed_ids = list('abc')
>>> xpath(X for X in DOM if X.id in allowed_ids)
"//*[@id='a' or @id='b' or @id='c']"
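
The last example expands a Python membership test into a chain of `or` comparisons. The sketch below reproduces only that string transformation in plain Python, to make the shape of the output concrete. `membership_predicate` is a hypothetical helper written for illustration; it is not part of xpyth, and it assumes that no value contains a single quote.

```python
def membership_predicate(attribute, values):
    """Build an XPath predicate body testing one attribute against several values.

    Hypothetical helper for illustration only; not part of xpyth.
    Assumes that no value contains a single quote.
    """
    return " or ".join("@{}='{}'".format(attribute, value) for value in values)


allowed_ids = list('abc')
expression = "//*[{}]".format(membership_predicate('id', allowed_ids))
print(expression)  # //*[@id='a' or @id='b' or @id='c']
```

Building the string by hand like this is exactly the chore the generator syntax is meant to remove; the sketch only shows what the result looks like.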

Motivation

XPath is the de facto standard for querying XML and HTML documents. In Python (and most other languages), XPath expressions are represented as strings. This is a potential security threat, since queries assembled from strings are open to injection, and it also denies developers standard text-editor and IDE features such as syntax highlighting and autocomplete when writing XPaths. Furthermore, having to become familiar with XPath (or CSS selectors) presents a barrier to entry for developers who want to interact with the web.

Great inroads have been made in various programming languages in allowing the use of native list-comprehension-like syntax to generate SQL queries. xpyth piggybacks off one such effort, Pony, to extend this functionality to XPath. Now anyone familiar with Python comprehension syntax can query XML/HTML documents quickly and easily. Moreover, xpyth integrates with the popular lxml library to enable developers to go beyond the querying capabilities of XPath (when necessary).

Installation

pip install xpyth

Use with lxml

xpyth supports querying lxml ElementTrees using the query function. For example, given the document

<html>
    <div id='main' class='main'>
        <a href='http://www.google.com'>Google</a>
        <a href='http://www.chasestevens.com'>Not Google</a>
        <p>Lorem ipsum</p>
        <p id='123'>no numbers here</p>
        <p id='numbers_only'>123</p>
    </div>
    <div id='123' class='secondary'>
        <a href='http://www.google.org'>Google Charity</a>
        <a href='http://www.chasestevens.org'>Broken link!</a>
    </div>
</html>

accessible as the ElementTree tree, the following can be executed:

>>> import re
>>> from xpyth import query
>>> len(query(a for a in tree))
4
>>> query(a for a in tree if 'Not Google' not in a.text)[0].attrib.get('href')
'http://www.google.com'
>>> next(
...     node 
...     for node in 
...     query(
...         p 
...         for p in 
...         tree 
...         if p.id
...     ) 
...     if re.match(r'\D+', node.attrib.get('id'))
... ).text
'123'
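
As a cross-check of what these three queries select, the sketch below computes the same results without xpyth or lxml, using only the standard library's `xml.etree.ElementTree`. It is an illustration of the intended results under that substitution, not a picture of how `query` works internally; the names `DOCUMENT`, `root`, and the result variables are chosen here for the example.

```python
import re
import xml.etree.ElementTree as ET

# The sample document from above, parsed with the standard library.
DOCUMENT = """
<html>
    <div id='main' class='main'>
        <a href='http://www.google.com'>Google</a>
        <a href='http://www.chasestevens.com'>Not Google</a>
        <p>Lorem ipsum</p>
        <p id='123'>no numbers here</p>
        <p id='numbers_only'>123</p>
    </div>
    <div id='123' class='secondary'>
        <a href='http://www.google.org'>Google Charity</a>
        <a href='http://www.chasestevens.org'>Broken link!</a>
    </div>
</html>
""".strip()

root = ET.fromstring(DOCUMENT)

# Counterpart of query(a for a in tree): every <a> element, in document order.
links = root.findall('.//a')
link_count = len(links)

# Counterpart of the 'Not Google' filter, applied in Python to the link text.
first_href = next(a for a in links if 'Not Google' not in a.text).get('href')

# Counterpart of query(p for p in tree if p.id), followed by the same
# regular-expression filter on the id attribute.
paragraphs_with_id = root.findall('.//p[@id]')
non_numeric_text = next(
    p for p in paragraphs_with_id if re.match(r'\D+', p.get('id'))
).text

print(link_count, first_href, non_numeric_text)
```

The last step is the case the README highlights: the regular-expression test cannot be expressed in the query itself, so it runs in Python over the elements the query returned.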

Known Issues

HTML tag names that contain dashes cannot be selected, because a dashed name is not a valid loop variable in Python's generator expression syntax. HTML attributes containing dashes, e.g. data-bind, work normally.
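
The difference between the two cases comes from Python's grammar, which the standard `ast` module can show directly. The sketch below illustrates only how Python parses the two forms; how xpyth turns the parsed attribute back into `@data-bind` is not shown and is not assumed here. The name `my-tag` is an invented example.

```python
import ast

# A dashed attribute is still valid Python: it parses as the
# subtraction (X.data) - (bind), which a library can inspect.
parsed = ast.parse("X.data-bind", mode="eval")
node = parsed.body
is_subtraction = isinstance(node, ast.BinOp) and isinstance(node.op, ast.Sub)
left_attribute = node.left.attr   # the part before the dash
right_name = node.right.id        # the part after the dash

# A dashed tag name would have to be a loop target, and an
# expression such as my - tag cannot be assigned to.
try:
    ast.parse("(x for my-tag in DOM)", mode="eval")
    dashed_target_parses = True
except SyntaxError:
    dashed_target_parses = False

print(is_subtraction, left_attribute, right_name, dashed_target_parses)
```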

Support for all is buggy. For example, the following calls return incorrect expressions:

>>> xpath(X for X in DOM if all(p.id in ('a', 'b') for p in X))
"//*[not(.//p/@id='a' or //p/@id='b')]"  # expected "//*[not(.//p[./@id!='a' and ./@id!='b'])]"
>>> xpath(X for X in DOM if all('x' in p.id for p in X))
"//*[not(.contains(@id, //p))]"  # expected "//*[not(.//p[not(contains(@id, 'x'))])]"
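
The expected expressions above rest on rewriting "every p satisfies the condition" as "no p fails the condition". The sketch below checks that equivalence in plain Python on a small invented document, using the standard library's `xml.etree.ElementTree` rather than xpyth or lxml. The markup and the names `all_form` and `negated_form` are assumptions made for the example, and every `p` in the sample carries an `id`.

```python
import xml.etree.ElementTree as ET

# Invented sample: one div whose <p> ids are all allowed, one with a stray id.
SAMPLE = """
<root>
    <div id='ok'><p id='a'/><p id='b'/></div>
    <div id='bad'><p id='a'/><p id='z'/></div>
</root>
""".strip()

root = ET.fromstring(SAMPLE)
allowed = ('a', 'b')


def all_form(element):
    """True if every descendant <p> has an allowed id."""
    return all(p.get('id') in allowed for p in element.findall('.//p'))


def negated_form(element):
    """True if no descendant <p> has a disallowed id (the double negation)."""
    return not any(p.get('id') not in allowed for p in element.findall('.//p'))


divs = root.findall('.//div')
matching_all = [d.get('id') for d in divs if all_form(d)]
matching_negated = [d.get('id') for d in divs if negated_form(d)]

print(matching_all, matching_negated)
```

Until the generated expressions are fixed, filtering in Python along these lines may be a workable substitute for `all` inside the query.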

Contacts

Name: H. Chase Stevens
Twitter: @hchasestevens

