Expath
Lightning-fast XML parsing and XPath querying for Elixir, powered by Rust NIFs.
Expath provides fast XML processing through Rust's battle-tested sxd-document and sxd-xpath libraries, running 2-10x faster than SweetXml in the benchmarks below.
✨ Key Features
- 🚀 Blazing Fast: 2-10x faster than SweetXml with Rust-powered NIFs
- 🔄 Parse-Once, Query-Many: Efficient document reuse for multiple XPath queries
- 🛡️ Battle-Tested: Built on proven Rust XML libraries (sxd-document, sxd-xpath)
- 🎯 Simple API: Clean, intuitive interface with comprehensive documentation
- ⚡ Thread-Safe: Safe concurrent access to parsed documents
- 🌐 Namespace Support: Full XML namespace support for SOAP, RSS, and complex XML
- 🔧 Zero Dependencies: No external XML parsers required
🚀 Quick Start
Installation
Add expath to your list of dependencies in mix.exs:
def deps do
[
{:expath, "~> 0.2.0"}
]
end
Then run:
mix deps.get
mix deps.compile
Basic Usage
Simple XPath query
xml = """
<library>
<book id="1">
<title>The Great Gatsby</title>
<author>F. Scott Fitzgerald</author>
</book>
<book id="2">
<title>1984</title>
<author>George Orwell</author>
</book>
</library>
"""
# Extract all book titles
{:ok, titles} = Expath.select(xml, "//title/text()")
# => ["The Great Gatsby", "1984"]
# Find specific book
{:ok, [title]} = Expath.select(xml, "//book[@id='1']/title/text()")
# => ["The Great Gatsby"]
# Count books
{:ok, [count]} = Expath.select(xml, "count(//book)")
# => ["2"]
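In the examples above, results come back as lists of strings, so a numeric result such as count(//book) arrives as ["2"]. The following is a minimal sketch of turning that into an integer. CountResult is a hypothetical helper module, not part of Expath, and it assumes the string format shown in the example.

```elixir
defmodule CountResult do
  @moduledoc "Hypothetical helper, not part of Expath."

  # Converts a single-item result such as ["2"] into an integer.
  # Assumes the numeric result arrives as a string, as in the example above.
  def to_integer([value]) when is_binary(value) do
    case Float.parse(value) do
      {number, ""} -> {:ok, trunc(number)}
      _ -> {:error, :not_a_number}
    end
  end

  def to_integer(_other), do: {:error, :unexpected_result}
end

# With the library: {:ok, result} = Expath.select(xml, "count(//book)")
CountResult.to_integer(["2"])
```

Float.parse/1 is used instead of String.to_integer/1 so that a value rendered as "2.0" would also be accepted.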
Parse-Once, Query-Many (Recommended for Multiple Queries)
# Parse document once
{:ok, doc} = Expath.new(xml)
# Run multiple queries efficiently
{:ok, titles} = Expath.query(doc, "//title/text()")
{:ok, authors} = Expath.query(doc, "//author/text()")
{:ok, book_count} = Expath.query(doc, "count(//book)")
# The document is freed automatically once it is no longer referenced (garbage collected)
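When several queries run against the same parsed document, the repeated pattern matching can be folded into one helper. This is a sketch under assumptions: MultiQuery is a hypothetical module, not part of Expath, and it relies only on the {:ok, results} | {:error, reason} return shape that the README describes for Expath.query/2. The query function is injectable so the helper can be exercised with a stub.

```elixir
defmodule MultiQuery do
  @moduledoc "Hypothetical helper, not part of Expath."

  # Runs every named XPath expression against one parsed document and
  # returns {:ok, map_of_results}, or the first error encountered.
  # `query_fun` defaults to Expath.query/2 and can be swapped for a stub.
  def run(doc, queries, query_fun \\ &Expath.query/2) do
    Enum.reduce_while(queries, {:ok, %{}}, fn {name, xpath}, {:ok, acc} ->
      case query_fun.(doc, xpath) do
        {:ok, results} -> {:cont, {:ok, Map.put(acc, name, results)}}
        {:error, reason} -> {:halt, {:error, {name, reason}}}
      end
    end)
  end
end

# With the library:
# {:ok, doc} = Expath.new(xml)
# MultiQuery.run(doc, titles: "//title/text()", authors: "//author/text()")
```

Stopping at the first failure keeps the error tagged with the query name that caused it, which is easier to debug than a bare reason.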
📊 Performance Benchmarks
Real-world performance comparison with SweetXml across different document sizes:
| Document Size | Speed Improvement | Use Case |
|---------------|-------------------|----------|
| Small (644B) | 2-3x faster | API responses, config files |
| Medium (5.6KB) | 2.3x faster | RSS feeds, small datasets |
| Large (904KB) | 8-10x faster | Large documents, bulk processing |
Benchmark Results Summary
*** Large XML Performance ***
Expath (Rust NIFs) 78.27 iterations/sec (12.78 ms avg)
SweetXml 7.77 iterations/sec (128.64 ms avg)
Comparison: Expath is 10.07x faster
Run your own benchmarks:
mix run bench/benchmark.exs
📖 API Reference
Core Functions
Expath.select/2 - Single Query
Perfect for one-off XPath queries.
Expath.select(xml_string, xpath_expression)
# Returns: {:ok, results} | {:error, reason}
Expath.new/1 - Parse Document
Creates a reusable document for multiple queries.
{:ok, doc} = Expath.new(xml_string)
# Returns: {:ok, %Expath.Document{}} | {:error, reason}
Expath.query/2 - Query Parsed Document
Query a previously parsed document.
{:ok, results} = Expath.query(document, xpath_expression)
# Returns: {:ok, results} | {:error, reason}
XPath Support
Expath supports the full XPath 1.0 specification:
# Node selection
Expath.select(xml, "//book") # All book elements
Expath.select(xml, "/library/book[1]") # First book
Expath.select(xml, "//book[@id='1']") # Book with id="1"
# Text extraction
Expath.select(xml, "//title/text()") # All title text
Expath.select(xml, "//book/@id") # All id attributes
# Functions
Expath.select(xml, "count(//book)") # Count books
Expath.select(xml, "//book[position()=1]") # First book
Expath.select(xml, "//book[contains(@class,'fiction')]") # Contains filter
# Complex expressions
Expath.select(xml, "//book[price > 10]/title/text()") # Conditional selection
XML Namespace Support
Expath provides full support for XML namespaces, essential for SOAP, RSS, and complex XML documents:
# XML with namespaces
xml = """
<library xmlns:book="http://example.com/book" xmlns:meta="http://example.com/metadata">
<book:collection meta:id="sci-fi">
<book:title>1984</book:title>
<book:author>George Orwell</book:author>
</book:collection>
</library>
"""
# Define namespace mappings
namespaces = %{
"book" => "http://example.com/book",
"meta" => "http://example.com/metadata"
}
# Query with namespace support
{:ok, titles} = Expath.select(xml, "//book:title/text()", namespaces)
# => ["1984"]
{:ok, ids} = Expath.select(xml, "//book:collection/@meta:id", namespaces)
# => ["sci-fi"]
# Multiple queries with namespace support
{:ok, doc} = Expath.new(xml)
{:ok, titles} = Expath.query(doc, "//book:title/text()", namespaces)
{:ok, authors} = Expath.query(doc, "//book:author/text()", namespaces)
For comprehensive namespace documentation, see NAMESPACE_GUIDE.md.
Error Handling
Expath provides detailed error information:
# Invalid XML (detected during query)
{:error, :invalid_xml} = Expath.select("<root><unclosed>", "/*")
# Invalid XPath expression
{:error, :invalid_xpath} = Expath.select(xml, "//[invalid")
# XPath evaluation errors
{:error, :xpath_error} = Expath.query(doc, "unknown-function()")
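The three error atoms above can be handled in one place instead of being asserted with a match. The sketch below is illustrative only: XmlErrors is a hypothetical module, not part of Expath, and the message wording is made up. Reasons other than the three shown above are passed through inspect/1, because the full set of error reasons is not listed here.

```elixir
defmodule XmlErrors do
  @moduledoc "Hypothetical helper, not part of Expath."

  # Maps the error atoms shown above to human-readable messages.
  def describe(:invalid_xml), do: "the input is not well-formed XML"
  def describe(:invalid_xpath), do: "the XPath expression could not be parsed"
  def describe(:xpath_error), do: "the XPath expression failed during evaluation"
  def describe(other), do: "unexpected error: " <> inspect(other)

  # Normalizes a result tuple: results pass through, errors gain a message.
  def handle({:ok, results}), do: {:ok, results}
  def handle({:error, reason}), do: {:error, describe(reason)}
end

# With the library: Expath.select("<root><unclosed>", "/*") |> XmlErrors.handle()
XmlErrors.handle({:error, :invalid_xml})
```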
Performance
Expath is designed for high-performance XML processing:
- Native Speed: Rust NIFs provide near-native performance
- Zero-Copy: Efficient string handling between Elixir and Rust
- Resource Caching: Parse once, query many times without re-parsing
- Memory Efficient: Automatic memory management via Erlang garbage collection
Performance Example
# Large XML document
xml = File.read!("large_document.xml")
# Parse once (expensive operation)
{:ok, doc} = Expath.new(xml)
# Multiple queries (very fast - no re-parsing)
Enum.each(1..1000, fn _i ->
{:ok, _results} = Expath.query(doc, "//some/xpath")
end)
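To check the parse-once benefit on your own documents without a full benchmark run, a rough timing helper is enough. This is a sketch: QuickTimer is a hypothetical module, not part of Expath, and it measures wall-clock time with :timer.tc/1, so the numbers are only indicative.

```elixir
defmodule QuickTimer do
  @moduledoc "Hypothetical helper, not part of Expath."

  # Runs `fun` `runs` times and returns the average wall-clock time in
  # microseconds per run, measured with :timer.tc/1.
  def average_us(fun, runs \\ 100) when is_function(fun, 0) and runs > 0 do
    {total_us, _} = :timer.tc(fn -> Enum.each(1..runs, fn _ -> fun.() end) end)
    total_us / runs
  end
end

# Sketch of a comparison, assuming `xml` holds the document and `doc` the parsed handle:
# reparse = QuickTimer.average_us(fn -> Expath.select(xml, "//some/xpath") end)
# reuse   = QuickTimer.average_us(fn -> Expath.query(doc, "//some/xpath") end)
```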
Platform Support
Expath supports all platforms where Rust and Erlang are available:
- Linux (x86_64, aarch64)
- macOS (Intel, Apple Silicon)
- Windows (x86_64)
Apple Silicon (M1/M2) Setup
Expath includes special configuration for Apple Silicon Macs. If you encounter linking issues, ensure you have:
- Native Erlang installation (not x86_64 via Rosetta)
- Native Rust toolchain for aarch64-apple-darwin
The included Cargo configuration handles the necessary linker flags automatically.
Examples
RSS Feed Processing
defmodule RSSProcessor do
def process_feed(rss_xml) do
{:ok, doc} = Expath.new(rss_xml)
{:ok, titles} = Expath.query(doc, "//item/title/text()")
{:ok, links} = Expath.query(doc, "//item/link/text()")
{:ok, descriptions} = Expath.query(doc, "//item/description/text()")
# Zip the three result lists position by position into {title, link, description} tuples
[titles, links, descriptions]
|> Enum.zip()
|> Enum.map(fn {title, link, description} ->
%{title: title, link: link, description: description}
end)
end
end
Configuration File Parsing
defmodule ConfigParser do
def parse_config(xml_config) do
{:ok, doc} = Expath.new(xml_config)
# Each query returns a list; match the single expected attribute value
{:ok, [database_host]} = Expath.query(doc, "//database/@host")
{:ok, [database_port]} = Expath.query(doc, "//database/@port")
{:ok, features} = Expath.query(doc, "//features/feature/@name")
%{
database: %{host: database_host, port: database_port},
features: features
}
end
end
Data Extraction Pipeline
defmodule DataExtractor do
def extract_products(xml_data) do
{:ok, doc} = Expath.new(xml_data)
# Extract in parallel using cached document
tasks = [
Task.async(fn -> Expath.query(doc, "//product/@id") end),
Task.async(fn -> Expath.query(doc, "//product/name/text()") end),
Task.async(fn -> Expath.query(doc, "//product/price/text()") end),
Task.async(fn -> Expath.query(doc, "//product/category/text()") end)
]
[ids, names, prices, categories] =
tasks
|> Enum.map(&Task.await/1)
|> Enum.map(fn {:ok, results} -> results end)
[ids, names, prices, categories]
|> Enum.zip()
|> Enum.map(fn {id, name, price, category} ->
%{id: id, name: name, price: price, category: category}
end)
end
end
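The pipeline above zips four independently queried lists, which lines up only when every product has all four fields. Enum.zip/1 stops at the shortest list, so a single product without a category would silently shift or drop rows. The following sketch adds a length check before zipping. AlignedZip is a hypothetical helper, not part of Expath.

```elixir
defmodule AlignedZip do
  @moduledoc "Hypothetical helper, not part of Expath."

  # Zips equally long lists into maps keyed by `keys`.
  # Returns {:error, {:length_mismatch, lengths}} when the lists differ in
  # length, which signals that some element lacked one of the fields.
  def to_maps(keys, lists) when length(keys) == length(lists) do
    lengths = Enum.map(lists, &length/1)

    if lengths |> Enum.uniq() |> length() <= 1 do
      rows =
        lists
        |> Enum.zip()
        |> Enum.map(fn row -> keys |> Enum.zip(Tuple.to_list(row)) |> Map.new() end)

      {:ok, rows}
    else
      {:error, {:length_mismatch, Enum.zip(keys, lengths)}}
    end
  end
end

# In the pipeline above:
# AlignedZip.to_maps([:id, :name, :price, :category], [ids, names, prices, categories])
```

On a mismatch, the per-field lengths show which query returned fewer results than the others.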
Development
Prerequisites
- Elixir 1.18 or later
- Erlang/OTP 27 or later
- Rust 1.70 or later
- C compiler (gcc, clang, or MSVC)
Building from Source
git clone https://github.com/yourusername/expath.git
cd expath
mix deps.get
mix compile
Running Tests
mix test
Building Documentation
mix docs
Docker Development
For cross-platform testing or if you prefer containerized development, Expath includes comprehensive Docker support:
Quick Start with Docker
# Run all tests in Linux container
./scripts/docker-test.sh
# Or use docker-compose for specific tasks
docker-compose run test
docker-compose run benchmark
docker-compose run quality
Available Docker Services
- dev: Development environment with all dependencies
- test: Run the full test suite
- benchmark: Execute performance benchmarks
- quality: Run code quality checks (Credo)
Docker Commands
# Build and test everything
docker-compose up test
# Run interactive development shell
docker-compose run dev iex -S mix
# Execute benchmarks
docker-compose run benchmark
# Check code quality
docker-com
