HTML parsing has always been a pressing requirement with Selenium. Although Selenium has no built-in API for HTML parsing, it integrates easily with other libraries, so it can be combined with an HTML parser to achieve the same. I have experimented with HTML parsing using Jericho, which is a Java library. To begin parsing, all Jericho needs is the HTML source, which can be obtained with the Selenium API getHtmlSource(). Below are the functions I have developed using Jericho -
Count number of tables on a page –

    // Get Source object for HTML Tables.
    Source source = new Source(selenium.getHtmlSource());
    List<Element> table = source.getAllElements(HTMLElementName.TABLE);
    Reporter.log("Number of Tables are: " + table.size());

***Reporter is part of the TestNG API.***

Retrieve table data –

    // Retrieve table data from a specific table.
    Source tableSource = new Source(table.get(3).toString());
    Reporter.log("Table data is:" + HTMLTableParser.getTableData(tableSource, false));
    Reporter.log("True Table data is:" + HTMLTableParser.getTableData(tableSource, true));

The definition of ***getTableData*** is as follows –

    /**
     * Returns the Segment or content of the HTML table
     * available between the start and end tags
     *
     * @param tableSource
     * @param rawHTMLData
     *
     * @return HTML Table data
     */
    public static List<String> getTableData(Source tableSource, Boolean rawHTMLData) {
        // Table data to be returned
        List<String> tableData = new ArrayList<String>();
        // Collect table rows
        List<Element> tableRows = tableSource.getAllElements(HTMLElementName.TR);
        // Loop through table rows
        for (int tableRowIndex = 0; tableRowIndex < tableRows.size(); tableRowIndex++) {
            Element tableRow = tableRows.get(tableRowIndex);
            List<Element> data = tableRow.getAllElements(HTMLElementName.TD);
            // Loop through table columns
            for (int tableColummnIndex = 0; tableColummnIndex < data.size(); tableColummnIndex++) {
                // [...]

    // [...]
        List<Element> tableRows = tableSource.getAllElements(HTMLElementName.TR);
        return tableRows.size();
    }

Count the number of columns in individual rows –

    Map<Integer, Integer> rowAndCoumnCount = HTMLTableParser.countTableColumnsInRows(tableSource);
    for (Map.Entry<Integer, Integer> rowAndColumnData : rowAndCoumnCount.entrySet()) {
        Reporter.log("Number of columns at row: " + rowAndColumnData.getKey() + " are: " + rowAndColumnData.getValue());
    }

    // Get data from individual columns.
    Reporter.log("Column specific table data is:" + HTMLTableParser.getTableDataForColumn(tableSource, false, 0, 1));
    Reporter.log("Column specific raw table data is:" + HTMLTableParser.getTableDataForColumn(tableSource, true, 0, 1));

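
To make the shape of a row-to-column-count map concrete, here is a minimal sketch that does not use Jericho at all. It models each table row as a plain list of cell strings, which is an assumption made purely for illustration. The class ColumnCountSketch and its method are hypothetical names and are not part of HTMLTableParser.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: rows are plain lists of cell strings, not Jericho elements.
class ColumnCountSketch {

    // Maps each zero-based row index to the number of cells found in that row.
    static Map<Integer, Integer> countColumnsInRows(List<List<String>> rows) {
        // LinkedHashMap keeps the rows in the order they appear in the table.
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int rowIndex = 0; rowIndex < rows.size(); rowIndex++) {
            counts.put(rowIndex, rows.get(rowIndex).size());
        }
        return counts;
    }

    public static void main(String[] args) {
        // An uneven table: the second row has one cell fewer than the others.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Name", "Role", "Team"),
                Arrays.asList("Asha", "QA"),
                Arrays.asList("Ravi", "Dev", "Core"));
        for (Map.Entry<Integer, Integer> entry : countColumnsInRows(rows).entrySet()) {
            System.out.println("Number of columns at row: " + entry.getKey()
                    + " are: " + entry.getValue());
        }
    }
}
```

With Jericho, the cells of a row would presumably come from getAllElements(HTMLElementName.TD) on each TR element, as in the other functions in this post.
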
The definition of ***countTableColumnsInRows*** is as follows –

    /**
     * Retrieves table data for a specific column, beginning from a specific row.
     * To return data from the first row onwards, pass rowNumber as 0.
     *
     * @param tableSource
     * @param rawHTMLData
     * @param rowNumber
     * @param columnNumber
     * @return Table Data
     */
    public static List<String> getTableDataForColumn(Source tableSource, Boolean rawHTMLData, int rowNumber, int columnNumber) {
        // Table data to be returned
        List<String> tableData = new ArrayList<String>();
        // Collect table rows
        List<Element> tableRows = tableSource.getAllElements(HTMLElementName.TR);
        // Loop through table rows
        for (int tableRowIndex = rowNumber; tableRowIndex < tableRows.size(); tableRowIndex++) {
            Element tableRow = tableRows.get(tableRowIndex);
            List<Element> data = tableRow.getAllElements(HTMLElementName.TD);
            // If supplied index is within size of table data
            // This check is useful when retrieving data from uneven html table
            if (columnNumber < data.size()) {
                // [...]

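
The size check on each row is what makes column retrieval safe on uneven tables. Below is a minimal sketch of just that logic, written against plain lists of cell strings instead of Jericho elements so that it runs on its own. The class ColumnDataSketch and its method are hypothetical names, and the sketch assumes columnNumber is not negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: rows are plain lists of cell strings, not Jericho elements.
class ColumnDataSketch {

    // Collects the cell at columnNumber from every row, starting at rowNumber.
    // Rows that are too short to have that column are skipped.
    static List<String> columnData(List<List<String>> rows, int rowNumber, int columnNumber) {
        List<String> result = new ArrayList<>();
        for (int rowIndex = rowNumber; rowIndex < rows.size(); rowIndex++) {
            List<String> cells = rows.get(rowIndex);
            // Guard against rows with fewer cells than the requested column index.
            if (columnNumber < cells.size()) {
                result.add(cells.get(columnNumber));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Name", "Role", "Team"),
                Arrays.asList("Asha", "QA"),
                Arrays.asList("Ravi", "Dev", "Core"));
        // Third column (index 2), skipping the header row.
        System.out.println(columnData(rows, 1, 2));
    }
}
```

Here the second row has only two cells, so it is skipped instead of causing an IndexOutOfBoundsException, and only the value from the last row is returned.
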
Hi Tarun,
I have got stuck in Selenium RC. I am very new to it, please help me.
I want to capture all the links in a new web page. I have tried almost all of the Selenium get commands, but it is not working. Please help me; a code sample as an example would be very much appreciated.