OCGumbo
An Objective-C HTML5 parser based on Google Gumbo.
Install / Use
/learn @tracy-e/OCGumboREADME
OCGumbo - An Objective-C HTML5 parser.
OCGumbo is an Objective-C wrapper of the Google Gumbo.
Basic Usage
- Add Gumbo sources or lib to your project.
- Add OCGumbo file and import "OCGumbo.h", then use OCGumboDocument to parse an html string.
Objects
<table> <tr><th>Class</th><th>Description</th></tr> <tr><td>OCGumboDocument</td><td>the root of a document tree</td></tr> <tr><td>OCGumboElement</td><td>an element in an HTML document</td></tr> <tr><td>OCGumboText</td><td>the textual content of an element</td></tr> <tr><td>OCGumboNode </td><td>a single node in the document tree</td></tr> <tr><td>OCGumboAttribute</td><td>an attribute of an Element object</td></tr> </table>Examples
OCGumboDocument *document = [[OCGumboDocument alloc] initWithHTMLString:htmlString];
OCGumboElement *root = document.rootElement;
//document: do something with the document.
//rootElement: do something with the html tree.
Extension
Now, OCGumbo add more Query support, add "OCGumbo+Query.h" and enjoy it.
Query APIs
<table> <tr><th width="100">Method</th><th>Description</th></tr> <tr><td>.Query( )</td><td>Query children elements from current node by selector</td></tr> <tr><td>.text( )</td><td>Get the combined text contents of current object</td></tr> <tr><td>.textArray( )</td><td>Get the combined text array of current object</td></tr> <tr><td>.html( )</td><td>Get the raw contents of current element</td></tr> <tr><td>.attr( )</td><td>Get the attribute value of the element by attributeName</td></tr> <tr><td>.find( )</td><td>Find elements that match the selector in the current collection</td></tr> <tr><td>.children( )</td><td>Get immediate children of each element in the current collection matching the selector</td></tr> <tr><td>.parent( )</td><td>Get immediate parents of each element in the collection matching the selector</td></tr> <tr><td>.parents( )</td><td>Get all ancestors of each element in the collection matching the selector</td></tr> <tr><td>.first( )</td><td>Get the first element of the current collection</td></tr> <tr><td>.last( )</td><td>Get the last element of the current collection</td></tr> <tr><td>.get ( )</td><td>Get the element by index from current collection</td></tr> <tr><td>.index( )</td><td>Get the position of an element in current collection</td></tr> <tr><td>.hasClass( )</td><td>Check if any elements in the collection have the specified class</td></tr> </table>Examples
NSLog(@"options: %@", document.Query(@"body").find(@"#select").find(@"option"));
NSLog(@"title: %@", document.Query(@"title").text());
NSLog(@"attribute: %@", document.Query(@"select").first().attr(@"id"));
NSLog(@"class: %@", document.Query(@"#select").parents(@".main"));
NSLog(@"tag.class: %@", document.Query(@"div.theCls"));
NSLog(@"tag#id : %@", document.Query(@"div#theId"));
Contact
Weibo: @TracyYih
Email: tracy.cpp@gmail.com
License
Related Skills
node-connect
346.4kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
107.2kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
346.4kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
346.4kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
