HtmlTinkerX & PSParseHTML - Modern HTML Processing for .NET and PowerShell
HtmlTinkerX is available as a NuGet package from the NuGet Gallery, and PSParseHTML is available as a PowerShell module from the PowerShell Gallery.
What it's all about
HtmlTinkerX is an async C# library for processing, parsing, formatting, and optimizing HTML, CSS, and JavaScript. It covers browser automation with Playwright, document parsing with multiple engines, and resource optimization. PSParseHTML is the PowerShell module that exposes HtmlTinkerX functionality through cmdlets.
Whether you're working in C# or PowerShell, you get access to:
- 🔍 HTML Parsing - Multiple parsing engines (AngleSharp, HtmlAgilityPack)
- 🎨 Resource Optimization - Minify and format HTML, CSS, JavaScript
- 🌐 Browser Automation - Full Playwright integration for screenshots, PDFs, interaction
- 📊 Data Extraction - Tables, forms, metadata, microdata, Open Graph
- 📧 Email Processing - CSS inlining for email compatibility
- 🔧 Network Tools - HAR export, request interception, console logging
- 🍪 State Management - Cookie handling, session persistence
- 📱 Multi-Platform - .NET Framework 4.7.2, .NET Standard 2.0, .NET 8.0
🔄 PowerShell to C# Method Mapping
The following tables show how PowerShell cmdlets map to C# methods, which makes it easier to move between the two:
HTML Parsing & Processing
| PowerShell Cmdlet | C# Method | Description |
| ---------------------------- | ---------------------------------------- | ------------------------------------------- |
| ConvertFrom-HTML | HtmlParser.ParseWithAngleSharp() | Parse HTML documents |
| ConvertFrom-HtmlTable | HtmlParser.ParseTablesWithAngleSharp() | Extract tables with rowspan/colspan support |
| ConvertFrom-HTMLAttributes | HtmlParserExtensions.GetElements() | Query elements by tag, class, id, or name |
| ConvertFrom-HtmlForm | HtmlFormExtractor.ExtractForms() | Extract form data and structure |
| ConvertFrom-HtmlList | HtmlListParser.ParseLists() | Parse list elements into structured data |
| ConvertFrom-HtmlMeta | HtmlMetaParser.ExtractMeta() | Extract meta tag name/content pairs |
| ConvertFrom-HtmlMicrodata | HtmlMicrodataParser.ExtractMicrodata() | Extract schema.org structured data |
| ConvertFrom-HtmlOpenGraph | HtmlOpenGraphParser.ExtractOpenGraph() | Extract Open Graph metadata |
| Convert-HTMLToText | HtmlUtilities.ConvertToText() | Convert HTML to plain text |
| Compare-HTML | HtmlDiffer.Compare() | Compare HTML documents |
| Measure-HTMLDocument | HtmlParser.AnalyzeDocument() | Analyze document metrics |
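
The parsing rows above name AngleSharp as one of the engines. This section lists HtmlTinkerX method names but not their signatures, so the sketch below calls AngleSharp directly rather than HtmlTinkerX. It only illustrates the kind of DOM parsing and table extraction the engine performs. It assumes the `AngleSharp` NuGet package is referenced; the sample HTML and the `Program` class are hypothetical and exist only for this example.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using AngleSharp;

class Program
{
    static async Task Main()
    {
        // Hypothetical sample markup; replace with real page content.
        const string html =
            "<table>" +
            "<tr><th>Name</th><th>Size</th></tr>" +
            "<tr><td>a.txt</td><td>12</td></tr>" +
            "</table>";

        // Parse the string into a DOM using AngleSharp's default configuration.
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(html));

        // Walk every table row and join the text of its header/data cells.
        foreach (var row in document.QuerySelectorAll("table tr"))
        {
            var cells = row.QuerySelectorAll("th, td")
                           .Select(cell => cell.TextContent.Trim());
            Console.WriteLine(string.Join(" | ", cells));
        }
    }
}
```

This direct approach does not handle `rowspan` or `colspan`; per the table above, that is what `HtmlParser.ParseTablesWithAngleSharp()` and `ConvertFrom-HtmlTable` add on top.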
Resource Formatting & Optimization
| PowerShell Cmdlet | C# Method | Description |
| --------------------- | ------------------------------------ | ---------------------------------- |
| Format-HTML | HtmlFormatter.FormatHtml() | Pretty-print HTML markup |
| Format-CSS | HtmlFormatter.FormatCss() | Format CSS stylesheets |
| Format-JavaScript | HtmlFormatter.FormatJavaScript() | Beautify JavaScript with options |
| Optimize-HTML | HtmlOptimizer.OptimizeHtml() | Minify HTML content |
| Optimize-CSS | HtmlOptimizer.OptimizeCss() | Minify CSS stylesheets |
| Optimize-JavaScript | HtmlOptimizer.OptimizeJavaScript() | Minify JavaScript code |
| Optimize-Email | PreMailerClient.MoveCssInline() | Inline CSS for email compatibility |
Browser Automation & Sessions
| PowerShell Cmdlet | C# Method | Description |
| ----------------------- | ---------------------------------- | -------------------------------- |
| Start-HTMLSession | HtmlBrowser.OpenSessionAsync() | Create browser session |
| Close-HTMLSession | session.DisposeAsync() | Close browser session |
| Invoke-HTMLRendering | HtmlBrowser.OpenSessionAsync() | Render pages with authentication |
| Invoke-HTMLNavigation | HtmlBrowser.NavigateAsync() | Navigate to different URLs |
| Invoke-HTMLScript | HtmlBrowser.ExecuteScriptAsync() | Execute JavaScript in browser |
| Invoke-HTMLDomScript | HtmlScriptRunner.ExecuteScript() | Run JavaScript with AngleSharp |
| Export-BrowserState | HtmlBrowser.ExportStateAsync() | Save browser state |
| Import-BrowserState | HtmlBrowser.ImportStateAsync() | Restore browser state |
| Export-HTMLSession | HtmlBrowser.ExportSessionAsync() | Export session data |
| Import-HTMLSession | HtmlBrowser.ImportSessionAsync() | Import session data |
Screenshots & Media
| PowerShell Cmdlet | C# Method | Description |
| -------------------------- | ---------------------------------------- | ------------------------------- |
| Save-HTMLScreenshot | HtmlBrowser.SaveScreenshotAsync() | Capture page screenshots |
| Save-HTMLPdf | HtmlBrowser.SavePdfAsync() | Generate PDFs from pages |
| Start-HTMLVideoRecording | HtmlBrowser.StartVideoRecordingAsync() | Start recording browser session |
| Stop-HTMLVideoRecording | HtmlBrowser.StopVideoRecordingAsync() | Stop video recording |
Element Interaction
| PowerShell Cmdlet | C# Method | Description |
| ---------------------- | -------------------------------------------- | ------------------------------ |
| Invoke-HTMLClick | HtmlBrowser.ClickAsync() | Click elements |
| Set-HTMLInput | HtmlBrowser.SetInputAsync() | Set input field values |
| Set-HTMLSelectOption | HtmlBrowser.SetSelectAsync() | Select dropdown options |
| Set-HTMLChecked | HtmlBrowser.SetCheckedAsync() | Check/uncheck checkboxes |
| Submit-HTMLForm | HtmlBrowser.SubmitFormAsync() | Submit forms |
| Get-HTMLInteractable | HtmlBrowser.GetInteractableElementsAsync() | List clickable elements |
| Get-HTMLFormField | HtmlFormExtractor.GetFormFields() | Extract form field information |
| Get-HTMLLoginForm | HtmlFormExtractor.DetectLoginForms() | Detect login forms |
Network & Debugging
| PowerShell Cmdlet | C# Method | Description |
| ---------------------- | ------------------------------------ | ------------------------------- |
| Get-HTMLNetworkLog | HtmlBrowser.GetNetworkLog() | View network requests/responses |
| Get-HTMLConsoleLog | HtmlBrowser.GetConsoleLog() | Retrieve console messages |
| Save-HTMLHar | HtmlBrowser.SaveHarAsync() | Export network traffic to HAR |
| Start-HTMLTracing | HtmlBrowser.StartTracingAsync() | Start Playwright tracing |
| Stop-HTMLTracing | HtmlBrowser.StopTracingAsync() | Stop tracing and save |
| Register-HTMLRoute | HtmlBrowser.RegisterRouteAsync() | Intercept requests |
| `U
