Category Archives: htmlunit

HtmlUnit is a “GUI-Less browser for Java programs”. It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, etc… just like you do in your “normal” browser.

It has fairly good JavaScript support (which is constantly improving) and is able to work even with quite complex AJAX libraries, simulating either Firefox or Internet Explorer depending on the configuration you want to use.

It is typically used for testing purposes or to retrieve information from web sites.

HtmlUnit is not a generic unit testing framework. It is specifically a way to simulate a browser for testing purposes and is intended to be used within another testing framework such as JUnit or TestNG. Refer to the document “Getting Started with HtmlUnit” for an introduction.

HtmlUnit is used as the underlying “browser” by different Open Source tools like Canoo WebTest, JWebUnit, WebDriver, JSFUnit, Celerity, …

HtmlUnit was originally written by Mike Bowler of Gargoyle Software and is released under the Apache 2 license. Since then, it has received many contributions from other developers, and would not be where it is today without their assistance.

HTMLUNIT 2.12 NTLM authentication

I recently updated an application that used NTLM authentication. In an earlier post, written for 2.9, I gave detailed code on how to accomplish this. I’m relieved that the new version of HTMLUNIT has this built in. Here is the new sample code: package ntlm_demo;

WebConnectionWrapper — HTMLUNIT

This is how you can add a wrapper to HTMLUNIT to skip bad JavaScript or some other bad element: new WebConnectionWrapper(webClient) { @Override public WebResponse getResponse(WebRequest settings) throws IOException { WebResponse wr = super.getResponse(settings); String url = settings.getUrl().toExternalForm(); if (url.contains("bad.js")) { System.out.println("## Skipping " + settings.getUrl().toExternalForm()); final byte[] body = {}; final WebResponseData wrd […]

HTMLUNIT & THREADS

I see this question quite a bit on the HTMLUNIT user list: is HTMLUNIT thread safe? HTMLUNIT is, but WebClient isn’t. You have to create a new WebClient for each thread and then load the cookie manager on that WebClient. For example, here is a thread: class webThread implements Runnable { private WebClient webClient = […]

NTLMv2 with HTMLUNIT 2.9 (Snapshot)

I needed to authenticate to a Windows domain using HTMLUNIT. The documentation posted on the HTMLUNIT page here doesn’t work with NTLMv2, which our corporate domain uses. In addition, if you don’t accept the security certificate you might get this error when you try to navigate to your https page: “htmlunit javax.net.ssl.SSLPeerUnverifiedException: peer […]

htmlunit tmp files

If you wrote a web spider in htmlunit (2.9 snapshot as of this publication) you might have noticed that it creates a lot of HTMLUNIT temp files. These files should be deleted by a call to deleteOnExit() when the JVM exits. Unfortunately, in a less than perfect world, your application can crash, or, in a perfect world may […]
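
One way to cope with leftovers from a crashed run is to sweep the temp directory yourself, for example at startup. What follows is only a minimal sketch using the plain JDK, not anything HTMLUNIT provides: the class name TempFileSweeper and the method sweep are made up for illustration, and the "htmlunit" file name prefix in the usage note is an assumption you should check against the files you actually see on your system.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper (not part of HTMLUNIT): removes stale temp files by name prefix.
final class TempFileSweeper {

    private TempFileSweeper() {
    }

    /**
     * Deletes regular files directly inside dir whose name starts with prefix
     * and whose last-modified time is at least minAge in the past.
     * Returns the number of files deleted.
     */
    static int sweep(Path dir, String prefix, Duration minAge) throws IOException {
        Instant cutoff = Instant.now().minus(minAge);
        int deleted = 0;
        // Only look at entries whose file name starts with the given prefix.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(
                dir, p -> p.getFileName().toString().startsWith(prefix))) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip directories and other non-regular entries
                }
                Instant modified = Files.getLastModifiedTime(file).toInstant();
                // Leave recent files alone: a running spider may still be using them.
                if (!modified.isAfter(cutoff) && Files.deleteIfExists(file)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

A call such as TempFileSweeper.sweep(Paths.get(System.getProperty("java.io.tmpdir")), "htmlunit", Duration.ofHours(1)) would then clear files older than an hour. The age check is there so that a second spider running at the same time does not lose files it is still using.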