Cxml
A lightweight C++ XML parser with support for XPath expressions.
Install / Use
/learn @tomwangcn/CxmlREADME
CXML 


A lightweight C++ XML parser with support for XPath expressions.
<small><i><a href='http://ecotrust-canada.github.io/markdown-toc/'>Table of contents generated with markdown-toc</a></i></small>
Introduction
CXML is a simple and efficient XML parser written in C++ with built-in support for XPath queries.
The project combines fundamental C++ techniques with classic data structures to deliver a practical tool for XML data processing.
Usage
Installation
git clone https://github.com/CodeCat-maker/cxml.git
Demo
#include "src/parser.hpp"
#include "src/xpath.hpp"
extern int CXML_PARSER_STATUS;
extern int XPATH_PARSE_STATUE;
int main() {
using std::cout;
using std::endl;
clock_t start, end;
start = clock();
CXMLNode *root = parse_from_string(
"<bookstore company=\"codecat\" boss=\"man\">"
" <book category=\"CHILDREN\">"
" <title>Harry Potter</title>"
" <author>J K.Rowlingk</author>"
" <year>2005</year><br>"
" <price>29.99 </price>"
" </book>"
" <book category=\"WEB\">"
" <title>Learning XML</title>"
" <author>Erik T.Ray</author>"
" <year>2003 </year>"
" <price>39.95 </price>"
" </book>"
"</bookstore>"
);
if (CXML_PARSER_STATUS == CXML_SYNTAX_ERROR) {
std::puts("> XML parsing failed");
return 0;
} else {
std::puts("> XML parsing succeeded");
}
const CXMLNode_result *result1 =
xpath("/bookstore/book[@category=CHILDREN]/@category//text()", root);
const CXMLNode_result *result2 =
xpath("/bookstore/book/title/../price/text()", root);
if (XPATH_PARSE_STATUE == XPATH_SYNTAX_ERROR) {
std::puts("> XPath parsing failed");
return 0;
} else {
std::puts("> XPath parsing succeeded");
}
cout << "Example 1: " << result1->text << endl;
cout << "Example 2: " << result2->text << endl;
end = clock();
cout << "\nExecution time: "
<< (double)(end - start) / CLOCKS_PER_SEC << " seconds";
return 0;
}
Supported XPath Syntax
/name– Selects child elementname//name– Selects descendant elementname/.– Selects the current element/..– Selects the parent element/name[@attr=value]– Selects elements with attributeattr=value/name[@attr]– Selects elements with attributeattr/name[n]– Selects the nth occurrence ofname/text()– Returns text content of the current element/@attr– Returns the value of attributeattr//text()– Returns all descendant text nodes//@attr– Returns all descendant attributes
CMakeLists
cmake_minimum_required(VERSION 3.13)
project(cxml)
set(CMAKE_CXX_STANDARD 11)
add_subdirectory(src)
add_executable(cxml main.cpp)
target_link_libraries(cxml CxmlFunction)
Build
mkdir -p build
cd build
cmake ..
make -j
Example Output
> XML parsing succeeded
> XPath parsing succeeded
Example 1: Harry Potter J K.Rowlingk 2005 29.99
Example 2: 29.99
Execution time: 0.000135 seconds
Design
1. Core Logic
1. Input Handling
- From file
- From string
2. Parsing
- Parse element name
- Parse attributes
- Parse element text
- Construct the XML tree
3. Building the XML Tree
- Stack-based construction
- Push
<tag>on open - Pop on
</tag> - Naturally maintains parent–child relationships
- Push
- Recursive descent for nested content
4. XPath Evaluation
Approach
Tokenize the expression with a two-pointer scan, enqueue operations, and evaluate in FIFO order.
Constants
get_parent_node–/..get_this_node–/.get_node_from_child_by_name–/nameget_node_from_genera_by_name–//nameget_node_by_array_and_name–/name[n]get_node_by_attr_and_name–/name[@attr]get_node_by_attrValue_and_name–/name[@attr=value]get_text_from_this–/text()get_texts_from_genera–//text()get_attr_from_this–/@attr
Queue Construction with Two-Pointer Parsing
Split the XPath string into tokens and enqueue them.
Switch-Based Execution
Dequeue operations and dispatch to the corresponding handlers.
BFS Traversal
Collect text nodes from all descendants.
DFS Traversal
Search for the first matching descendant by name or predicate.
5. Data Structures
- Stack, linked list, tree, queue, pair/tuple
6. STL Containers
vector,map,string,pair,stack,queue
Algorithms
- DFS, BFS, two-pointer parsing
2. Future Improvements
- HTML parsing support
- Extended XPath coverage
- Performance optimizations
Contributions
Contributions are welcome—feel free to fork and open a PR.
