
HTML Parsing using jsoup

I came across jsoup recently while automating web accessibility tests with Selenium.
Selenium fetches the page HTML, and jsoup does the work of extracting the information needed to check whether the page is accessibility compliant.
When using jsoup you will mostly deal with the Document class (which extends Element) and the Elements class.

Suppose you want to find the 'class' attribute of every "div" on a web page. You could use something like this -


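import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Parse the page source fetched by Selenium (Selenium RC: getHtmlSource())
Document document = Jsoup.parse(selenium.getHtmlSource());
Elements elements = document.getElementsByTag("div");
for (Element div : elements) {
    // attr() returns an empty string when the div has no class attribute
    System.out.println(div.attr("class"));
}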


Beyond this, if you know an attribute value, you can also check whether it appears on the correct node. This can be used to automate ARIA tests for the role attribute on a web page.
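
The role check described above could be sketched roughly as follows, assuming jsoup is on the classpath. The class name RoleCheck, the method misplacedRoles, and the rule that role="navigation" belongs only on "nav" or "div" are hypothetical and used purely for illustration; they are not an accessibility standard. Only the jsoup calls (Jsoup.parse, getElementsByAttributeValue, tagName) come from the library itself.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RoleCheck {

    // Hypothetical helper: returns the tag names of elements that carry the
    // given role but are not among the tags we choose to allow for it.
    static List<String> misplacedRoles(String html, String role, Set<String> allowedTags) {
        Document document = Jsoup.parse(html);
        List<String> offenders = new ArrayList<>();
        // Find every element whose role attribute has the given value
        for (Element element : document.getElementsByAttributeValue("role", role)) {
            if (!allowedTags.contains(element.tagName())) {
                offenders.add(element.tagName());
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Sample markup standing in for the HTML that Selenium would return
        String html = "<nav role=\"navigation\"></nav><span role=\"navigation\"></span>";
        // Assumed rule for this sketch: only nav and div may carry role="navigation"
        Set<String> allowed = new HashSet<>(Arrays.asList("nav", "div"));
        System.out.println(misplacedRoles(html, "navigation", allowed));
    }
}
```

In a real test the html string would come from Selenium, and the allowed tags per role would come from whatever accessibility rules your project follows.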



For a detailed list of jsoup capabilities, visit the jsoup site at http://jsoup.org/
