Let's introduce another tool into your toolbox: the ability to look only on pages that use a certain word or words in their title by incorporating the "intitle" operator into your search. At SecurityFocus, this query would narrow our results list down to only 5, an incredible tightening of our search: "site:www.securityfocus.com intitle:password cracking" (note that "password" is the only word that must be in the title; "cracking" should appear on the page as a search term, but not in the title, since I didn't place "intitle:" prior to it).

Bad guys know about the "intitle" operator, but they know something else that makes it even more powerful. Often Web servers are left configured to list the contents of directories if there is no default Web page in those directories; on top of that, those directories often contain lots of stuff that the Web site owners don't actually want to be on the Web. That makes such directory lists prime targets for snoopers. The title of these directory listings almost always start with "Index of", so let's try a new query that I guarantee will generate results that should make you sit up and worry: "intitle:"index of" site:edu password". 2,940 results, and many, if not most, would be completely useless to a potential attacker. Many, however, would yield passwords in plain text, while others could be cracked using common tools like Crack and John the Ripper.

There are other operators, but these should be enough to make the picture clear. Once you start to think about it, the potentially troublesome words and phrases that can be searched for and leveraged should begin to multiply in your mind: passwd. htpasswd. accounts. users.pwd. web_store.cgi. finances. admin. secret. fpadmin.htm. credit card. ssn. And so on. Heck, even "robots.txt" would be useful: after all, if someone doesn't want search engines to find the stuff listed in robots.txt, that stuff could very well be worth a look. Remember, robots.txt just indicates that the Web site doesn't want search engines to index the files and folders listed in robots.txt; nothing inherently stops users from accessing that content once they know it exists.

SecurityFocus HOME Columnists: Googling Up Passwords