
HTML Parsing using jsoup

I came across jsoup recently while automating web accessibility tests with Selenium.
Selenium fetches the page HTML, and jsoup extracts the information needed to check whether the page is accessibility compliant.
When using jsoup you will mostly deal with the Document class (which extends Element) and the Elements class.

Suppose you want to print the 'class' attribute of every "div" on a web page. You could use something like this -


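import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Parse the page source returned by Selenium
Document document = Jsoup.parse(selenium.getHtmlSource());
Elements elements = document.getElementsByTag("div");
// Print the class attribute of each div (empty string if absent)
for (Element div : elements) {
    System.out.println(div.attr("class"));
}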


Beyond this, if you know an attribute value you can also check whether it appears on the correct node. This is useful for automating ARIA tests, for example verifying the role attribute on a web page.
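
A role check along these lines could be sketched as follows. This is only an illustration, not the test suite used in this post: the class name AriaRoleCheck, the method misplacedRoles, and the rule that a given role is allowed on a single tag are all assumptions made for the example. The jsoup calls used (Jsoup.parse, getElementsByAttributeValue, tagName) are part of the library.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class AriaRoleCheck {

    // Hypothetical helper: lists every element that carries the given role
    // value but is not the tag we expect that role to appear on.
    static List<String> misplacedRoles(String html, String role, String allowedTag) {
        Document document = Jsoup.parse(html);
        List<String> problems = new ArrayList<>();
        // Find all elements whose role attribute equals the given value
        for (Element element : document.getElementsByAttributeValue("role", role)) {
            if (!element.tagName().equals(allowedTag)) {
                problems.add(element.tagName() + "[role=" + role + "]");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Sample markup invented for the example; in a real test the HTML
        // would come from Selenium.
        String html = "<div role=\"navigation\"><ul><li>Home</li></ul></div>"
                    + "<span role=\"navigation\">Misplaced</span>";
        System.out.println(misplacedRoles(html, "navigation", "div"));
    }
}
```

An empty list means every element with that role sits on the expected tag. A real accessibility rule would probably allow several tags per role, so the single allowedTag parameter is a simplification.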



For a detailed list of jsoup's capabilities, visit the jsoup site at http://jsoup.org/
