Crawling Your Way to QA

Testing the quality of a Web application, like testing any application, can be a cumbersome process. Companies and testers alike are always looking for ways to test more efficiently and obtain broader coverage with every development cycle. With new features arriving constantly and site trees that keep expanding, it is tough to keep up with changes in Web applications using traditional testing techniques such as manual input validation.

In this article we will look at how a single crawler could test major portions of a Web application with little effort or time from the user. With crawl-based testing tools we can test security, verify proper search engine optimization (SEO) techniques, and test the performance of most Web applications without writing as many complex scripts or test cases. Using crawlers for QA doesn’t replace the need for traditional tools. It complements those tools and human judgment by providing more information in an automated fashion.

The biggest benefits of a crawl-based tool for Web applications are its speed and ease of use. You can enter a URL, and in only a few hours you will be presented with enough information to keep developers busy for days or even weeks. While it is not the only tool you should use, a crawl-based tool adapts to whatever the site currently contains, so it should be the first tool that gets executed.

Once a new feature or content section is placed in the testing environment, the crawl-based tool can be launched against the whole site or just the new section. While the testers are executing test plans, the tool runs in the background and identifies places where changes may need to be made. After this initial assessment, a qualified tester can take an in-depth look at the results and decide what needs to be done with them. Now that you understand why a crawl-based tool can be helpful and how it could fit into your testing environment, we will take a deeper look at how it works.
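To make the idea concrete, here is a minimal sketch of the crawl loop such a tool might be built on. It is an illustration rather than any particular product's implementation: the names `crawl`, `LinkExtractor`, and `max_pages` are hypothetical, and `fetch` is an assumed caller-supplied function that returns a status code and the page's HTML.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl restricted to the host of start_url.

    fetch(url) is an assumption of this sketch: a caller-supplied function
    returning (status_code, html_text). Returns {url: status_code}.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    results = {}
    while queue and len(results) < max_pages:
        url = queue.popleft()
        status, html = fetch(url)
        results[url] = status
        if status != 200:
            continue  # nothing to parse on an error page
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return results
```

In practice `fetch` would wrap an HTTP client; keeping it as a parameter lets the same loop run against a test double. Every URL in the result with a non-200 status is a candidate for a tester to review.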

Using crawl-based programs for security testing is a common practice in the Web application security industry. Compared with SEO and performance testing tools, crawlers appear to be most prevalent in the security arena. Most enterprise Web application security testing packages are built on a crawler that applies various checks to each page to test for vulnerabilities and deviations from best practices. Current black-box testing systems can detect cross-site scripting (XSS), SQL injection (SQLi), and various other security issues. Web application security scanners are a proven technology that can immediately assist the quality assurance effort.
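As a rough illustration of one such check, the sketch below probes each query parameter of a URL for reflected XSS by substituting a marker payload and looking for it, unescaped, in the response. This is a deliberately simplified heuristic, not how any specific scanner works; `reflected_xss_candidates`, `PROBE`, and the caller-supplied `fetch` function are all assumptions made for the example. Only run checks like this against systems you are authorized to test.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Hypothetical marker payload; real scanners use many context-specific payloads.
PROBE = "<script>alert('qa-probe-1337')</script>"


def reflected_xss_candidates(url, fetch):
    """Return the names of query parameters whose value is echoed unescaped.

    fetch(url) is an assumption of this sketch: a caller-supplied function
    returning the response body as text.
    """
    parts = urlparse(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    flagged = []
    for index, (name, _value) in enumerate(params):
        # Replace one parameter at a time so a hit can be attributed to it.
        mutated = list(params)
        mutated[index] = (name, PROBE)
        probe_url = urlunparse(parts._replace(query=urlencode(mutated)))
        body = fetch(probe_url)
        if PROBE in body:
            flagged.append(name)
    return flagged
```

A flagged parameter is only a candidate: a tester still needs to confirm that the reflected markup lands in a context where a browser would execute it.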

Just as there are techniques to make a Web application more secure, there are also techniques to improve your rank in search engines. This is commonly referred to as SEO. Wikipedia defines search engine optimization (SEO) as “the process of improving the volume and quality of traffic to a web site from search engines via ‘natural’ (‘organic’ or ‘algorithmic’) search results.” This process has two parts. The first is the optimization of the Web application itself. The second is external promotion, which involves link submissions and social Web promotion, to name a few. There are applications that automate link submissions, but they are not crawl-based and are therefore outside our focus.

Currently the most widely used functions of a crawler in SEO programs are checking for broken links and checking Web site structure against Web standards. Most other tools that check code-to-text ratios, metadata, and keyword density are standalone, but they could be rolled into a crawler. Since SEO is somewhat subjective, a crawler would have to test for best practices. It would also need knowledge of “black hat” SEO techniques so that it could alert you to a page that could get your site removed from the top-ranking search engines. As you can see, a crawler cannot complete every SEO task by itself, but it can easily test many aspects of a Web application to make sure it is configured optimally for search engines and users.
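The on-page checks mentioned above (metadata, code-to-text ratio, keyword density) are simple enough to run on every page a crawler visits. The following sketch shows one possible way to compute them from raw HTML. The function `seo_report`, the class `PageText`, and the exact metric definitions are assumptions for illustration; in particular, the ratio here is visible text length divided by total HTML length, which is only one of several definitions in use.

```python
import re
from html.parser import HTMLParser


class PageText(HTMLParser):
    """Collects the title, meta description, and visible text of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.text_parts = []
        self._skip = 0          # depth inside <script>/<style>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip:
            self.text_parts.append(data)


def seo_report(html, keyword):
    """Return basic on-page SEO metrics for one HTML document."""
    parser = PageText()
    parser.feed(html)
    # Collapse runs of whitespace so layout does not inflate the text length.
    text = " ".join(" ".join(parser.text_parts).split())
    words = re.findall(r"[a-z0-9']+", text.lower())
    hits = words.count(keyword.lower())
    return {
        "title": parser.title.strip(),
        "has_meta_description": bool(parser.description),
        "text_to_code_ratio": len(text) / len(html) if html else 0.0,
        "keyword_density": hits / len(words) if words else 0.0,
    }
```

What counts as a good value for each metric is a judgment call, which is why the sketch reports numbers rather than pass/fail verdicts.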

Performance testing tools use predefined scripts to put a load on a Web server and measure how it is affected. This can also be done, to some extent, with a crawl-based testing program. As part of the performance testing section of the crawl, the test application can execute load testing followed by stress testing, all in one scan. It does this by stepping up the number of crawl threads and increasing other variables while providing input to the fields that are available in the Web application. Then, by analyzing the response times and load times of each page, it can determine how multiple users could affect the Web server. To go one step further, integrating pre-recorded scripts could introduce a more “human-like” load on the server.
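The thread-stepping approach described above might look something like the sketch below, which requests the same set of URLs at increasing concurrency levels and summarizes the response times at each level. The names `ramp_load`, `timed_fetch`, and `thread_steps` are hypothetical, and `fetch` is again an assumed caller-supplied function that performs one request.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_fetch(fetch, url):
    """Return how long one fetch(url) call takes, in seconds."""
    started = time.perf_counter()
    fetch(url)
    return time.perf_counter() - started


def ramp_load(urls, fetch, thread_steps=(1, 2, 4, 8)):
    """Request every URL at each concurrency level and summarize timings.

    fetch(url) is an assumption of this sketch: a caller-supplied function
    that performs one request. Returns one summary dict per level.
    """
    report = []
    for threads in thread_steps:
        # Each level gets its own pool so the thread count is exact.
        with ThreadPoolExecutor(max_workers=threads) as pool:
            timings = list(pool.map(lambda u: timed_fetch(fetch, u), urls))
        report.append({
            "threads": threads,
            "mean_seconds": mean(timings),
            "max_seconds": max(timings),
        })
    return report
```

Comparing `mean_seconds` across levels gives a first impression of how response time degrades as simulated users are added. A real load test would also need warm-up runs, many more requests per level, and a dedicated test environment.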

Testing the various aspects of a Web application can take many tools and is a continuous process. As you have seen, crawlers can accomplish much larger tasks than they are currently used for, and they can broaden the test coverage of a Web application. By understanding the return on investment (ROI) that a crawl-based tool can deliver, you can better judge how it could assist your company, application, and testers. As long as we are using crawlers, we should take advantage of their full spectrum of effectiveness in security, search engine optimization, and performance testing.

Wednesday, October 15th, 2008 | QA, Uncategorized
