SkillAgentSearch skills...

WebBrowserWaiter

A headless wrapper of the .NET WebBrowser useful for automated browser testing, web crawling or screen scrapping and includes both HTML and JavaScript processing capabilities.

Install / Use

/learn @LeastOne/WebBrowserWaiter
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

WebBrowserWaiter

Summary

A headless wrapper of the .NET WebBrowser useful in automated browser testing, or as web crawler or screen scrapper and includes both HTML and JavaScript processing capabilities.

Installation

Install-Package WebBrowserWaiter

Sample

using (var waiter = new WebBrowserWaiter())
{
    waiter.Await(
        // NOTE: wb is an instance of System.Windows.Forms.WebBrowser
        wb => wb.Navigate("https://www.google.com/search?q=pi"),
        wb => Assert.IsTrue(
            wb.DocumentText.Contains("3.14159265")
        )
    );
}

Overview

The WebBrowserWaiter serves up functionality of the .NET WebBrowser in a manor that can be more easily consumed than with the native component. If you've ever tried using the .NET WebBrowser for automated browser testing or screen scraping you've likely discovered the challenges it presents due to its Message Loop implementation. WebBrowserWaiter seeks to solve this by abstracting the nuances of the message loop and providing a more familiar (though somewhat peculiar) method of interaction.

Highlights

Here are a few of the noteworthy benefits to using WebBrowserWaiter:

  • Small. Its essentially a single type that wraps the already provided implementation of the .NET WebBrowser.
  • Simple. Unlike some other headless browser solutions there are no plugins or browser extensions to install or integrate. WebBrowserWaiter leverages the pre-integrated configuration of Internet Explorer and the WebBrowser control.
  • Powerful. Because WebBrowser is based on Internet Explorer interaction with and execution of JavaScript makes it possible to test and process modern web applications based on popular libraries and frameworks such as jQuery, AngularJS, etc.
  • Familiar. If you've used the .NET WebBrowser before you'll be able to utilize that same knowledge as the WebBrowserWaiter provides a WebBrowser instance to invoke actions on.
  • Testable. Not only can you use WebBrowserWaiter to perform integration tests on production code, but if you need to use it in production code you can easily mock and test it also.
  • Configurable. Though designed to be headless the WebBrowserWaiter can display the WebBrowser instance in an on screen .NET Form to view navigation transitions. Helpful when you need to see why your instructions may not be executing as you expected.

Compatibility

.NET 3.5 and above.

Examples

Extract Document Text

Retrieve a page and extract its text to check for an expected string. Note that the first lambda issues the navigation command which causes the waiter to wait until navigation completes before executing the second lambda which checks the document text. If the instructions were not split into separate lambda's the document text would be checked before the navigation had completed, almost assuredly providing an incorrect result.

using (var waiter = new WebBrowserWaiter())
{
    string text = null;

    waiter.Await(
        // NOTE: wb is an instance of System.Windows.Forms.WebBrowser
        wb => wb.Navigate("http://www.google.com/"),
        wb => text = wb.DocumentText
    );

    Assert.IsTrue(
        text.Contains("Google")
    );
}

Follow Redirect

Retrieve a page and assert that a 302 redirect is automatically followed to the final expected url. This example also demonstrates that if all the lambda's provided return a value (note the return null in the first lambda) an array of those values are returned as a result of the Await which can be assigned / queried directly to a variable.

using (var waiter = new WebBrowserWaiter())
{
    var url = waiter.Await(
        wb => { wb.Navigate("http://www.google.com/"); return null; },
        wb => wb.Url.ToString()
    ).Last();

    Assert.AreEqual("https://www.google.com/?gws_rd=ssl", url);
}

Get Then Submit

Get's a page, sets a form value and then submits the form.

using (var waiter = new WebBrowserWaiter())
{
    waiter.Await(
        wb => wb.Navigate("https://www.google.com/"),
        wb => {
            wb.Document.All["q"].SetAttribute("value", "time in duluth");
            wb.Document.All.Cast<HtmlElement>().First(p => p.TagName == "FORM").InvokeMember("Submit");
        }
    );

    var text = waiter.Await(
        wb => wb.DocumentText
    );

    Assert.IsTrue(
        text.Contains("Time in Duluth, MN")
    );
}

More

The goal is add more examples overtime, especially examples demonstrating JavaScript interaction and capabilities.

Warnings

Place Navigation Instructions In Separate Lambda's

If more than one navigation instruction is contained within the same lambda its likely to result in unpredictable behavior including causing the WebBrowserWaiter to hang. The following demonstrates a lambda with multiple navigations that is NOT recommended.

using (var waiter = new WebBrowserWaiter())
{
    waiter.Await(
        wb => {
            wb.Navigate(uri + "search");
            wb.Document.All.Cast<HtmlElement>().First(p => p.TagName == "FORM").InvokeMember("Submit");
        }
    );
}

Do NOT Extract WebBrowser

Its highly recommended that WebBrowser not be extracted from the WebBrowserWaiter into an outside scope. Doing so can have unpredictable results if the WebBrowser is acted upon in the outside scope as the WebBrowser is meant to be operated on from a lone ApartmentState thread. The following demonstrates an extraction that is NOT recommended.

using (var waiter = new WebBrowserWaiter())
{
    WebBrowser webBrowser = null;

    waiter.Await(
        wb => webBrowser = wb
    );

    webBrowser.Navigate("/path"); // < DO NOT ATTEMPT
}

Notes

The change from v1.0.1.0 to 1.1.0.0 contained a semi-significant behavioural change. The default two second wait following a navigation was reduced to zero seconds. Originally this wait was added to account for situations where a navigation attempt resulted in a redirect (i.e. 302, 304, etc.) and would allow the browser time to complete the redirect before returning. However this caused all requests to be delayed two seconds before returning which was most often unnecessary. Since redirects are less common the default was changed to zero for the overall speed improvement. If your use of the WebBrowserWaiter is contingent upon redirects you can still implement a wait of your own as the following examples demonstrate.

using (var waiter = new WebBrowserWaiter())
{
    // Set a default wait
	waiter.DefaultWait = TimeSpan.FromSeconds(1);

    // Specify wait an int in milliseconds per Await
    waiter.Await(
		2000,
        wb => // Do Something
    );
	
	// Specify wait as a timespan per Await
    waiter.Await(
		TimeSpan.FromSeconds(2),
        wb => // Do Something
    );
}

Troubleshooting

System.Windows.Form

You need to reference the System.Windows.Forms DLL in order to use WebBrowserWaiter.

See Warnings

Before doing anything be sure you've read and understand the warnings.

Questions

Create a GitHub issue.

Contributions

Pull requests welcome!

View on GitHub
GitHub Stars30
CategoryDevelopment
Updated1y ago
Forks9

Languages

C#

Security Score

75/100

Audited on Jan 18, 2025

No findings