Larry Suto has written a paper reviewing WebInspect, AppScan, and NTO Spider. From the paper:
"The study centered around testing the effectiveness of the top three web application scanners in the following 4 areas.
1. Links crawled
2. Coverage of the applications tested using Fortify Tracer
3. Number of verified vulnerability findings
4. Number of false positives"
It is important to consider the results given his criteria. As I've written about before, there are many challenges faced by automated web application security assessment tools.
"One of the most surprising result is the discrepancy in coverage and vulnerability findings between the three tools. Lesser known NTOSpider excelled in every category, and the vulnerability findings show AppScan missed 88% and WebInspect missed 95% of the legitimate vulnerabilities found by NTOSpider." - Larry
"While security professionals testing small, highly secure, simple applications may achieve acceptable results from AppScan and WebInpsect, these results indicate that they may have some concern with relying on the results of these tools for larger applications. The relatively large number of false positives, particularly for WebInspect, is also a matter of some concern. False positives can be difficult for all but the most experienced security professional to identify. If they are not identified, they can cause difficulties by weakening the credibility of the security team with application developers. Additionally, vetting false positives by hand, even by experienced security professionals is a very time intensive process that will increase the cost of the program. While WebInspect has certain tools to reduce false positives, it would appear that this remedy is not necessary if using NTOSpider (and to a lesser extent AppScan). In any case, training the tool to reduce false positives will need to be done by experienced personnel and will increase program costs."
As pointed out, false positives and the ability to address them are very important. It is also important to note that taking a product out of the box and aiming it at something doesn't always yield the best results. In fact, if a vendor claims its default configuration is the best, I'd stay away from their product.
"The false positive findings were of interest because some appeared to be caused by custom 404 error handling routines in the web application, and some simply were based on faulty assumptions."
If you are guessing at a generic filename and can't know its contents in advance, what can you do besides write a signature on the 200 response code? About the only thing you can do is compare the 200 response page to a known-bad URL (an intentional 404 page) and see if they match by X percent. A good test for evaluating a scanner is to set up a web server that responds with a 200 on every page and see how many vulnerabilities it flags.
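The "match by X percent" idea above can be sketched in a few lines. This is a minimal, hypothetical illustration, not code from any of the scanners reviewed: fetch a deliberately bogus URL once to get the server's soft-404 template, then flag any 200 response whose body is nearly identical to it. The function names and the 90% threshold are my own assumptions.

```python
# Heuristic sketch: detect "soft 404" pages that come back with a 200 status.
# Compare a candidate response body against the body returned for an
# intentionally nonexistent URL; a near-identical match means the "found"
# page is almost certainly the custom error page, i.e. a false positive.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] describing how similar two response bodies are."""
    return SequenceMatcher(None, a, b).ratio()

def is_probably_custom_404(candidate_body: str, known_bad_body: str,
                           threshold: float = 0.90) -> bool:
    """Treat a 200 response as a likely soft 404 if its body matches the
    page served for a deliberately bogus URL by >= threshold (assumed 90%)."""
    return similarity(candidate_body, known_bad_body) >= threshold

# A server that answers 200 to everything serves the same error template:
soft_404 = "<html><body>Sorry, we could not find that page.</body></html>"
hit      = "<html><body>Sorry, we could not find that page.</body></html>"
real     = "<html><body>Admin console - enter credentials to continue</body></html>"

print(is_probably_custom_404(hit, soft_404))   # near-identical -> True
print(is_probably_custom_404(real, soft_404))  # genuinely different -> False
```

A scanner run against the "always 200" test server described above should, with a check like this in place, flag essentially nothing, because every probe's body would match the baseline template.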
Scanner Review Link: http://ha.ckers.org/files/CoverageOfWebAppScanners.pdf