That’s when someone from our team suggested a post about this kind of data exposure issue. We’ve mentioned this type of security problem in previous posts, as it’s a common source for security researchers to find valuable private information about any website.
Today we are going to dig into Google hacking techniques, also known as Google Dorks.
What is a Google Dork?
A Google Dork, also known as Google Dorking or Google hacking, is a valuable resource for security researchers. For the average person, Google is just a search engine used to find text, images, videos, and news. However, in the infosec world, Google is a useful hacking tool.
How would anyone use Google to hack websites?
Well, you can’t hack sites directly using Google, but as it has tremendous web-crawling capabilities, it can index almost anything within your website, including sensitive information. This means you could be exposing too much information about your web technologies, usernames, passwords, and general vulnerabilities without even knowing it.
In other words: Google “Dorking” is the practice of using Google to find vulnerable web applications and servers by using native Google search engine capabilities.
Unless you block specific resources from your website using a robots.txt file, Google indexes all the information that is present on any website. Logically, after some time any person in the world can access that information if they know what to search for.
Important note: while this information is publicly available on the Internet, and it is provided and encouraged to be used by Google on a legal basis, people with the wrong intentions could use this information to harm your online presence.
Be aware that Google also knows who you are when you perform this kind of query. For this reason and many others, it’s advised to use it only with good intentions, whether for your own research or while looking for ways to defend your website against this kind of vulnerability.
While some webmasters expose sensitive information on their own, this doesn’t mean it’s legal to take advantage of or exploit that information. If you do so you’ll be marked as a cybercriminal. It’s pretty easy to track your browsing IP, even if you’re using a VPN service. It’s not as anonymous as you think.
Before reading any further, be aware that Google will start blocking your connection if you connect from a single static IP. It will ask for captcha challenges to prevent automated queries.
Popular Google Dork operators
Google’s search engine has its own built-in query language. The following list of queries can be run to find a list of files, find information about your competition, track people, get information about SEO backlinks, build email lists, and of course, discover web vulnerabilities.
Let’s look at the most popular Google Dorks and what they do.
- cache: shows the cached version of any website, e.g. cache:securitytrails.com
- allintext: searches for specific text contained on any web page, e.g. allintext:hacking tools
- allintitle: works like allintext, but matches pages whose titles contain all the specified words, e.g. allintitle:"Security Companies"
- allinurl: fetches results whose URL contains all the specified words, e.g. allinurl:client area
- filetype: searches for files with a given extension; for example, to search for jpg files you can use filetype:jpg
- inurl: works like allinurl, but is only useful for one single keyword, e.g. inurl:admin
- intitle: searches for keywords inside the title; for example, intitle:security tools will find pages with “security” in the title, while “tools” can be anywhere else in the page
- inanchor: useful when you need to search for an exact anchor text used on any links, e.g. inanchor:"cyber security"
- intext: useful to locate pages that contain certain characters or strings inside their text, e.g. intext:"safe internet"
- link: shows the list of web pages that have links to the specified URL, e.g. link:microsoft.com
- site: shows the full list of all indexed URLs for the specified domain and subdomain, e.g. site:securitytrails.com
- *: wildcard that stands in for “anything” at its position in the query, e.g. how to * a website will return “how to…” design/create/hack, etc… “a website”
- |: logical OR operator, e.g. "security" | "tips" will show all the sites which contain “security” or “tips,” or both words
- +: used to concatenate words, useful to detect pages that use more than one specific key, e.g. security + trails
- -: the minus operator is used to avoid showing results that contain certain words, e.g. security -trails will show pages that use “security” in their text, but not those that have the word “trails”
If you’re looking for the complete set of Google operators, you can follow this SEJ post which covers almost every known dork available today.
Google Dork examples
Let’s take a look at some practical examples. You’ll be surprised how easy it is to extract private information from any source just by using Google hacking techniques.
Log files
Log files are the perfect example of how sensitive information can be found within any website. Error logs, access logs and other types of application logs are often discovered inside the public HTTP space of websites. This can help attackers find the PHP version you’re running, as well as the critical system path of your CMS or frameworks.
For this kind of dork we can combine two Google operators, allintext and filetype, for example:
allintext:username filetype:log
This will show a lot of results that include username inside all *.log files.
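If you run this kind of query often, it can help to build it programmatically. The following is a minimal Python sketch, not part of any official tooling: `build_dork` and `search_url` are hypothetical helper names, and the search URL is the standard public Google search address used here only for illustration.

```python
from urllib.parse import urlencode


def build_dork(**operators: str) -> str:
    """Join operator:value pairs into a single query string."""
    return " ".join(f"{name}:{value}" for name, value in operators.items())


def search_url(query: str) -> str:
    """URL-encode the query so it can be pasted into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_dork(allintext="username", filetype="log")
print(query)
print(search_url(query))
```

Keyword arguments keep their order, so the operators appear in the query in the order you pass them.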
In the results we discovered one particular website showing an SQL error log from a database server that included critical information:
This example exposed the current database name, user login, password and email values to the Internet. We’ve replaced the original values with “XXX”.
Vulnerable web servers
The following Google Dork can be used to detect vulnerable or hacked servers that allow appending “/proc/self/cwd/” directly to the URL of your website.
inurl:/proc/self/cwd
As you can see in the following screenshot, vulnerable server results will appear, along with their exposed directories that can be surfed from your own browser.
Open FTP servers
Google does not only index HTTP-based servers, it also indexes open FTP servers.
With the following dork, you’ll be able to explore public FTP servers, which can often reveal interesting things.
intitle:"index of" inurl:ftp
In this example, we found an important government server with their FTP space open. Chances are that this was on purpose — but it could also be a security issue.
ENV files
.env files are the ones used by popular web development frameworks to declare general variables and configurations for local and online dev environments.
One of the recommended practices is to move these .env files to somewhere that isn’t publicly accessible. However, as you will see, there are a lot of devs who don’t care about this and insert their .env file in the main public website directory.
As this is a critical dork we will not show you how to do it; instead, we will only show you the critical results:
You’ll notice that unencrypted usernames, passwords and IPs are directly exposed in the search results. You don’t even need to click the links to get the database login details.
SSH private keys
SSH private keys are used to decrypt information that is exchanged in the SSH protocol. As a general security rule, private keys must always remain on the system being used to access the remote SSH server, and shouldn’t be shared with anyone.
With the following dork, you’ll be able to find SSH private keys that were indexed by uncle Google.
Let’s move on to another interesting SSH Dork.
If this isn’t your lucky day, and you’re using a Windows operating system with PUTTY SSH client, remember that this program always logs the usernames of your SSH connections.
In this case, we can use a simple dork to fetch SSH usernames from PUTTY logs:
filetype:log username putty
Here’s the expected output:
Email lists
It’s pretty easy to find email lists using Google Dorks. In the following example, we are going to fetch Excel files which may contain a lot of email addresses.
filetype:xls inurl:"email.xls"
We filtered to check out only the .edu domain names and found a popular university with around 1800 emails from students and teachers.
site:.edu filetype:xls inurl:"email.xls"
Remember that the real power of Google Dorks comes from the unlimited combinations you can use. Spammers know this trick too, and use it on a daily basis to build and grow their spamming email lists.
Live cameras
Have you ever wondered if your private live camera could be watched not only by you but also by anyone on the Internet?
The following Google hacking techniques can help you fetch live camera web pages that are not restricted by IP.
Here’s the dork to fetch various IP based cameras:
inurl:top.htm inurl:currenttime
To find WebcamXP-based transmissions:
intitle:"webcamXP 5"
And another one for general live cameras:
inurl:"lvappl.htm"
There are a lot of live camera dorks that can let you watch any part of the world, live. You can find education, government, and even military cameras without IP restrictions.
If you get creative you can even do some white hat penetration testing on these cameras; you’ll be surprised at how you’re able to take control of the full admin panel remotely, and even re-configure the cameras as you like.
MP3, Movie, and PDF files
Nowadays almost no one downloads music after Spotify and Apple Music appeared on the market. However, if you’re one of those classic individuals who still download legal music, you can use this dork to find mp3 files:
intitle:"index of" mp3
The same applies to legal free media files or PDF documents you may need:
intitle:"index of" pdf
intext:.mp4
Weather
Google hacking techniques can be used to fetch any kind of information, and that includes many different types of electronic devices connected to the Internet.
In this case, we ran a dork that lets you fetch Weather Wing device transmissions. If you’re involved in meteorology stuff or merely curious, check this out:
intitle:"Weather Wing WS-2"
The output will show you several devices connected around the world, which share weather details such as wind direction, temperature, humidity and more.
Preventing Google Dorks
There are a lot of ways to avoid falling into the hands of a Google Dork.
These measures are suggested to prevent your sensitive information from being indexed by search engines.
- Protect private areas with a user and password authentication and also by using IP-based restrictions.
- Encrypt your sensitive information (user, passwords, credit cards, emails, addresses, IP addresses, phone numbers, etc).
- Run regular vulnerability scans against your site. These usually already include popular Google Dork queries and can be pretty effective in detecting the most common ones.
- Run regular dork queries against your own website to see if you can find any important information before the bad guys do. You can find a great list of popular dorks at the Exploit DB Dorks database.
- If you find sensitive content exposed, request its removal by using Google Search Console.
- Block sensitive content by using a robots.txt file located in your root-level website directory.
Using robots.txt configurations to prevent Google Dorking
One of the best ways to prevent Google dorks is by using a robots.txt file. Let’s see some practical examples.
The following configuration will deny all crawling from any directory within your website, which is pretty useful for private access websites that don’t rely on publicly-indexable Internet content.
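A minimal sketch of such a deny-all configuration, checked here with Python's standard `urllib.robotparser` module. The two-line robots.txt body is the conventional deny-all form; `example.com` and the `Googlebot` user agent string are placeholders for illustration.

```python
from urllib.robotparser import RobotFileParser

# Conventional deny-all robots.txt: every crawler, every path.
DENY_ALL = """\
User-agent: *
Disallow: /
"""

deny_all_parser = RobotFileParser()
deny_all_parser.parse(DENY_ALL.splitlines())

# A crawler that honours robots.txt should skip every URL on the site.
home_allowed = deny_all_parser.can_fetch("Googlebot", "https://example.com/")
page_allowed = deny_all_parser.can_fetch("Googlebot", "https://example.com/docs/page.html")
print(home_allowed, page_allowed)
```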
You can also exclude specific directories from web crawling. If you have an /admin area and you need to keep it out of search results, just place this code inside:
This will also protect all the subdirectories inside.
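As a sketch, assuming the rule is the usual `Disallow: /admin/` line, the following Python snippet uses the standard `urllib.robotparser` module to confirm that the directory and its subdirectories are excluded while the rest of the site stays crawlable. The `example.com` URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: exclude the /admin/ directory for every crawler.
ADMIN_RULES = """\
User-agent: *
Disallow: /admin/
"""

admin_parser = RobotFileParser()
admin_parser.parse(ADMIN_RULES.splitlines())

admin_allowed = admin_parser.can_fetch("*", "https://example.com/admin/users/list")
blog_allowed = admin_parser.can_fetch("*", "https://example.com/blog/post-1")
print(admin_allowed, blog_allowed)
```

Keep in mind that robots.txt only asks crawlers to stay away; the authentication and IP restrictions listed earlier are still what actually protects the area.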
Restrict access to specific files:
Restrict access to dynamic URLs that contain the ‘?’ symbol:
To restrict access to specific file extensions you can use:
In this case, all access to .php files will be denied.
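Rules for dynamic URLs and file extensions are commonly written with the `*` and `$` wildcards that Google's crawler understands, for example `Disallow: /*?` and `Disallow: /*.php$`. Those two rule strings are assumptions about what such a configuration would look like. Because Python's standard robots.txt parser may not interpret these wildcard extensions, the sketch below uses a small hand-written matcher (`rule_to_regex` and `is_blocked` are hypothetical helpers) to show which paths the rules would cover.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule using * and a trailing $ into a regex."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape literal parts and let each * match any run of characters.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow rule matches the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# Assumed rules: block URLs with a query string and URLs ending in .php
WILDCARD_RULES = ["/*?", "/*.php$"]

for path in ["/search?q=test", "/index.php", "/index.html", "/index.php5"]:
    print(path, is_blocked(path, WILDCARD_RULES))
```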
Final thoughts
Google is one of the most important search engines in the world. As we all know, it has the ability to index everything unless we explicitly deny it.
Today we learned that Google can be also used as a hacking tool, but you can stay one step ahead of the bad guys and use it regularly to find vulnerabilities in your own websites. You can even integrate this and run automated scans by using custom third-party Google SERPs APIs.
If you’re a security researcher it can be a practical tool for your cybersecurity duties when used responsibly.
While Google Dorking can be used to reveal sensitive information about your website that is located and indexable via HTTP protocol, you can also perform a full DNS audit by using the SecurityTrails toolkit.
If you’re looking for a way to do it all from a single interface—analyze your DNS records, zones, server IP map, related domains, subdomains as well as SSL Certificates—take a look into your SurfaceBrowser tool, request a demo with us today, or sign up for a free API account.
Smart searching with googleDorking
“googleDorking,” also known as “Google hacking”, is a technique used by newsrooms, investigative organisations, security auditors as well as tech savvy criminals to query various search engines for information hidden on public websites and vulnerabilities exposed by public servers. Dorking is a way of using search engines to their full capacity to penetrate web-based services to depths that are not necessarily visible at first.
All you need to carry out a googleDork is a computer, an internet connection and knowledge of the appropriate search syntax.
This guide will describe what googleDorking is and how it works across different search engines, provide tips on how to protect yourself while googleDorking and suggest ways to protect your websites and servers from those who would use these techniques for malicious purposes.
History
A brief history of the googleDork
googleDorking has been in documented use since the early 2000s. Like many of the most successful hacks, googleDorking is not technically sophisticated. It simply requires that you use certain operators — special key words supported by a given search engine — correctly and sometimes creatively. Johnny Long, aka j0hnnyhax, was a pioneer of googleDorking. Johnny first posted his definition of the newly coined term in 2002:
Johnny Long's 2002 definition of a googleDork.
In a 2011 interview, Johnny Long said, “In the years I've spent as a professional hacker, I've learned that the simplest approach is usually the best. As hackers, we tend to get down into the weeds, focusing on technology, not realizing there may be non-technical methods at our disposal that work as well or better than their high-tech counterparts. I always kept an eye out for the simplest solution to advanced challenges.”
Rather than an ordinary type of search query that focuses on a semantic way of asking questions, either directly through writing the whole question or selected key words, googleDorking is based on reverse engineering the way machines scan and index web content.
In this context, googleDorking uses search functions beyond their semantic role, which not only changes how we typically imagine using search engines, but also vastly expands the capacity of the tool in the hands of people searching for a way of exploring content and access to various services.
Such access might lead to the discovery of information that can be used for fraud or terrorism, finding information on yourself or your institution, as well as information that assists in the investigation of governments, corporations or powerful individuals. These results, rather than being characteristic of the tool or method itself, instead rely on the intentions of those using googleDorking, the questions they are asking, and what they do with the results.
Dorking exposes vulnerabilities and also unleashes the unintended, often powerful, consequences of searching search engines.
To dork or not to dork
If you are thinking about using googleDorking as an investigative technique, there are several precautions to take. Although you are free to search at will on search engines, accessing certain webpages or downloading files from them can be a prosecutable offense, especially in the United States under the extremely vague and overreaching Computer Fraud and Abuse Act (CFAA). Moreover, if you're dorking in a country with heavy internet surveillance (i.e. any country), it's possible that your searches could be recorded and used against you in the future.
As protection, we recommend using the Tor Browser or Tails when googleDorking on any search engine. Tor masks your internet traffic, divorcing your computer's identifying information from the webpages that you are accessing. Security-in-a-Box includes detailed guides on how to use the Tor Browser on Linux and on Windows. Using Tor will often make your searches more difficult. Google and other search engines might ask you to solve captchas to prove you're human. If your Tor exit node has recently been overrun with bots, search engines might block your searches entirely. In this case, you should refresh your Tor circuit until you connect to an exit node that's not blacklisted. To do so, click the onion icon in the upper-left hand corner of the browser and select “New Tor Circuit for this Site,” as shown below.
Please note that, depending on what country you are in, using Tor might flag your online activity as suspicious. This is a risk you must be willing to take when using Tor, though you can mitigate that risk to some extent by using a Tor Bridge with an obfuscated pluggable transport. Unless you are specifically targeted by an advanced attack, however, the Tor Browser is quite good at preventing anyone from associating your online identity with the websites you visit or the search terms you enter. If you cannot use Tor, you might want to find a VPN provider that you trust and use it with a privacy-aware search engine, such as DuckDuckGo.
If you decide to proceed with an investigation that involves googleDorking, the remainder of this guide will help you get started and provide a comparison of supported dorks across search engines as of March 2017.
How it works
Dorking can be employed across various search engines, not just on Google. In everyday use, search engines like Google, Bing, Yahoo, and DuckDuckGo accept a search term, or a string of search terms and return matching results. But search engines are also programmed to accept more advanced operators that refine those search terms. An operator is a key word or phrase that has particular meaning for the search engine. Operators include things like “inurl”, “intext”, “site”, “feed”, “language”, and so on. Each operator is followed by a colon which is followed by the relevant term or terms (with no space before or after the colon).
A googleDork is just a search that uses one or more of these advanced techniques to reveal something interesting.
These operators allow a search to target more specific information, such as certain strings of text in the body of a website or files hosted on a given url. Among other things, a googleDorker can locate hidden login pages, error messages that give away too much information and files that a website administrator might not realise are publicly accessible.
Not all advanced search techniques rely on operators. For example, including quotation marks around text prompts the engine to search for only the exact phrase in quotes. Using an all-caps “OR” between search terms prompts the engine to return results with one term or the other.
A simple example of a dork that does rely on an operator might be:
site:tacticaltech.org filetype:pdf
This googleDork will search https://tacticaltech.org for all PDF files hosted under that domain name.
Another example might look something like this:
If the search term contains multiple words, they should be surrounded by quotation marks:
Dorks can also be paired with a general search term. For example:
or
Here, “exposing” is the general search term, and the operators “site” and “filetype” narrow down the results returned.
Example search results are shown below:
A similar search on https://exposingtheinvisible.org turns up no documents, showing us that there are not any public PDFs hosted on that website:
You can use more than one operator, and the order generally does not matter. However, if your search isn't working, it wouldn't hurt to switch around operator names and test out the different results.
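The syntax rules described in this section (no space around the colon, quotation marks around multi-word terms, an all-caps OR between alternatives) can be sketched in a few lines of Python. This is only an illustration: `term`, `any_of` and the `example.org` domain are hypothetical names chosen for the example.

```python
def term(operator: str, value: str) -> str:
    """Format operator:value, quoting the value if it contains spaces."""
    if " " in value:
        value = f'"{value}"'
    return f"{operator}:{value}"


def any_of(*alternatives: str) -> str:
    """Join alternatives with an all-caps OR."""
    return " OR ".join(alternatives)


single = term("filetype", "pdf")
phrase = term("intext", "security plan")
combined = " ".join([any_of("budget", "finance"), term("site", "example.org"), single])

print(single)
print(phrase)
print(combined)
```

Since the order of operators generally does not matter, you can reorder the list passed to `join` when experimenting with a query that isn't working.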
Dorking for Dummies
There are many existing googleDork operators, and they vary across search engines. To give you a general idea of what can be found, we have included four dorks below. Even if two search engines support the same operators, they often return different results. Replicating these searches across various search engines is a good way to get a sense of those differences. (You might also want to have a look at our Dorking operators across Google, DuckDuckGo, Yahoo and Bing table below.)
As you explore these searches, you might locate some sensitive information, so it's a good idea to use the Tor Browser, if you can, and to refrain from downloading any files. (In addition to legal issues, it's good to keep in mind that random files on the internet sometimes contain malware. Always download with caution.)
Example 1: Finding budgets on the US Homeland Security website
This dork will bring you all Excel spreadsheets that contain the word budget:
The “filetype” operator does not recognise different versions of the same or similar formats (i.e. doc vs. docx, xls vs. xlsx vs. csv), so each of these formats must be dorked separately:
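Since each format has to be searched on its own, one way to avoid retyping is to generate one query per extension. The snippet below is a small sketch of that idea; `filetype_variants` is a hypothetical helper, and the extension list simply reuses the spreadsheet formats named above.

```python
def filetype_variants(base_query: str, extensions: list[str]) -> list[str]:
    """Return one query per file extension, since formats are matched separately."""
    return [f"{base_query} filetype:{ext}" for ext in extensions]


spreadsheet_queries = filetype_variants("budget", ["xls", "xlsx", "csv"])
for q in spreadsheet_queries:
    print(q)
```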
This dork will bring you all publicly-accessible PDF files on the NASA website:
This dork will bring you all publicly-accessible xlsx spreadsheets with the word “budget” on the United States Department of Homeland Security website:
That final query, performed across various search engines, will return different results, as illustrated below:
On Google, we had to solve a captcha:
Bing
Yahoo
DuckDuckGo
As you can see, results vary from engine to engine. Importantly, the DuckDuckGo query does not return correct results. However, using the filetype operator on its own does return correct results, just not targeted to the dhs.gov website.
But using the ext operator, which serves the same purpose on DuckDuckGo, does return results targeted to the dhs.gov website.
You will have to investigate quirks like this as you proceed.
Example 2: Finding passwords
Searching for login and password information can be useful as a defensive dork. Passwords are, in rare cases, clumsily stored in publicly accessible documents on webservers. Try the following dorks in different search engines:
In this case, the search engines again returned different results. When we tried this search without the 'site:[Your site]' term, Google returned documents that contained actual usernames and passwords for a North American high school. We have blocked out these results in the screenshot below, and notified the school that their data is vulnerable. The other search engines did not return this information on the first few pages of results. As you can see, both Yahoo and DuckDuckGo also returned some non-relevant results. This is to be expected when dorking: some queries work better than others.
Example 3: London house prices
Another interesting example targets housing price information in London. Below are the results from the following query we entered into four different search engines:
Example 4: Looking for security plans on the government of India's website
A final example will locate any documents containing the words “security plan” on Indian government websites. Below are the results from the following query we entered into four different search engines:
Perhaps now you have your own ideas about what websites you'd like to focus on with your search. You can find more ideas in this guide from the Center for Investigative Journalism. In the following section, we will share the dorks we found, and how they work across search engines.
Dork It Yourself
Below is an updated list of the relevant dorks we identified as of March 2017. This list might not be exhaustive, but the operators below should help you get started. In order to understand advanced implementation of these dorks, see the Google Hacking Databases (GHDB). We collected and tested these dorks across search engines with the help of the following resources: Bruce Clay Inc, Wikipedia, DuckDuckGo, Microsoft and Google.
DorkDorkGo
We have included the most widely-used search engines in this analysis. Our recommendation is always to use DuckDuckGo, which is a privacy-focused search engine that does not log any data about its users. However, you should still use DuckDuckGo in combination with Tor while dorking to ensure someone else is not snooping on your search. (For general searching, we also recommend using StartPage, which is a search engine that returns Google results via a privacy filter, also masking user information from Google. However, as important as it is to use privacy-aware search engines in your day-to-day browsing, Tor should offer enough protection to let you dork across search engines. It might be interesting and helpful to your investigation to see the different results that search engines return even when they share the same set of operators.)
Dorking operators across Google, DuckDuckGo, Yahoo and Bing
Key | Colour |
Query works on this search engine | |
Query does not work on this search engine |
Dork | Description | Google | DuckDuckGo | Yahoo | Bing |
cache:[url] | Shows the version of the web page from the search engine’s cache. | ||||
related:[url] | Finds web pages that are similar to the specified web page. | ||||
info:[url] | Presents some information that Google has about a web page, including similar pages, the cached version of the page, and sites linking to the page. | ||||
site:[url] | Finds pages only within a particular domain and all its subdomains. | ||||
intitle:[text] or allintitle:[text] | Finds pages that include a specific keyword as part of the indexed title tag. You must include a space between the colon and the query for the operator to work in Bing. | ||||
inurl:[text] or allinurl:[text] | Finds pages that include a specific keyword as part of their indexed URLs. | ||||
meta:[text] | Finds pages that contain the specific keyword in the meta tags. | ||||
filetype:[file extension] | Searches for specific file types. | ||||
intext:[text], allintext:[text], inbody:[text] | Searches text of page. For Bing and Yahoo the query is inbody:[text]. For DuckDuckGo the query is intext:[text]. For Google either intext:[text] or allintext:[text] can be used. | ||||
inanchor:[text] | Search link anchor text | ||||
location:[iso code] or loc:[iso code] region:[region code] | Search for specific region. For Bing use location:[iso code] or loc:[iso code] and for DuckDuckGo use region:[region code]. | ||||
contains:[text] | Identifies sites that contain links to filetypes specified (i.e. contains:pdf) | ||||
altloc:[iso code] | Searches for location in addition to one specified by language of site (i.e. pt-us or en-us) | ||||
domain:[url] | Wider than the site: operator, locates any subdomain containing the “suffix” of the main website's url | ||||
feed:[feed type, i.e. rss] | Find RSS feed related to search term | ||||
hasfeed:[url] | Finds webpages that contain both the term or terms for which you are querying and one or more RSS or Atom feeds. | ||||
imagesize:[digit, i.e. 600] | Constrains the size of returned images. | ||||
ip:[ip address] | Find sites hosted by a specific ip address | ||||
keyword:[text] | Metaoperator; that is, an operator that is used with other operators. Takes a simple list as a parameter. All the elements in the list are searched as and/or pairs together. keyword:(intitle inbody)software. This example is equivalent to intitle:software OR inbody:software. | ||||
language:[language code] | Returns websites that match the search term in a specified language | ||||
book:[title] | Searches for book titles related to keywords | ||||
maps:[location] | Searches for maps related to keywords | ||||
linkfromdomain:[url] | Shows websites that link to the specified url (with errors) |
Defensive dorking
googleDorking can be used to protect your own data and to defend websites for which you are responsible. In 2011, after googleDorking his own name, a Yale university student discovered a spreadsheet containing his personal information, including his name and social security number, along with that of 43,000 others. The file had been publicly accessible for several years but had not been exposed by search engines until 2010, when Google began to index FTP (file transfer protocol) servers. Once indexed, it was possible for anyone to find, and it might have remained accessible if the student had not informed those responsible. Similarly, within ten minutes of beginning our research for this guide, we located PDFs containing login and password details for two different schools. We alerted both schools, and the information has since been removed.
There are two types of defensive dorking. The first looks for security vulnerabilities in online services you administer yourself, such as webservers or FTP servers. The second concerns sensitive information about yourself, your sources or your colleagues that might be unintentionally exposed.
The security software company McAfee recommends six precautions that webmasters and system administrators should take, and googleDorking can sometimes help identify failure to comply with the vast majority of them:
- Keep operating systems, services and applications up to date
- Make use of security solutions that prevent intrusion
- Understand how search engine crawlers work, know what is public, and audit your exposure
- Move sensitive resources out of public locations
- Block access to all non-essential resources from external or foreign identities
- Perform frequent penetration testing
In fact, googleDorking is an example of that final point. Frequent 'penetration testing' can be undertaken by anyone who might be concerned about their data or the data of those they want to protect. To perform defensive googleDorking, we recommend starting with the following simple commands on your own websites, your name, and other websites that might contain information about you. For example:
You can repeat this search with other potentially relevant filetypes: xls, xlsx, doc, docx, etc. Or you can search for regular website content with:
See the table above for information about whether your search engine of choice uses intext: or inbody: as the text-searching operator.
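Based on the operator table, the choice between intext: and inbody: can be captured in a small lookup. The following Python sketch is only an illustration: `TEXT_OPERATOR`, `text_query` and the `example.org` domain are hypothetical, and the mapping reflects the table's description of each engine rather than any guarantee about current behaviour.

```python
# Text-search operator per engine, as described in the table above.
TEXT_OPERATOR = {
    "google": "intext",
    "duckduckgo": "intext",
    "bing": "inbody",
    "yahoo": "inbody",
}


def text_query(engine: str, site: str, text: str) -> str:
    """Build a site-restricted text search using the engine's own operator."""
    operator = TEXT_OPERATOR[engine.lower()]
    quoted = f'"{text}"' if " " in text else text
    return f"site:{site} {operator}:{quoted}"


google_query = text_query("Google", "example.org", "password")
bing_query = text_query("Bing", "example.org", "phone number")
print(google_query)
print(bing_query)
```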
You can also search for information associated with the IP address of your servers:
Other useful tests might include:
or
If you're not running a lot of websites, scanning through several pages of results should be enough to give you an idea of what's publicly available. However, you can refine this with keywords and other terms taken from the Google Hacking Databases (linked below).
To strengthen this defense, try some of the malicious attacks in the Google Hacking Databases (GHDB) on your own websites and IP addresses. Various incarnations of the GHDB can be found here (the original), here (the original “reborn”), here, and here. Note that these databases include search operators as well as search terms. While they may help attackers locate vulnerable websites, they also help administrators protect their own.
Published on 29 May 2017. Follow us @seeingsideways, get in touch, or read another of our guides here.