Ask Sourcing Ninja: What is the Deep Web?
Dear Sourcing Ninja,
Today I met a recruiter and he mentioned that he had access to something called the “Deep Web”. The recruiter tried to explain to me the difference between the normal internet I use and this secret dark web but I must admit I came out more confused than when I entered the meeting. Is there such a thing as the Deep web? If so then what is it and how do I go about navigating it?
It might surprise you to know that the Deep web (a.k.a. the dark/invisible/hidden web) does in fact exist, but also that it is not as “dark” as one might think. To begin, think of the Internet as an enormous pond made up of both the Surface web and the Deep web. Search engines are like fishing boats that skim the top layer of the pond with their nets. That top layer is the Surface web, and the lower parts of the pond, out of reach of the nets, are the ‘Deep web’.
So the Surface web is all the information that has been indexed by search engines (we’ve already touched on how search engines work here) and stored in their databases, categorised and ready for you to search. In contrast, the Deep web is everything else.
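If it helps to see the pond analogy in action, here is a minimal sketch in Python. It assumes a made-up mini-web (every page name below is hypothetical) and a very simplified crawler that can only follow links from page to page. Real search engines are far more sophisticated, so treat this as an illustration of the idea rather than how any particular engine works.

```python
from collections import deque

# Hypothetical mini-web: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "jobs"],
    "about": ["home"],
    "jobs": ["home", "job-search-form"],
    "job-search-form": [],    # results only appear after a query is submitted
    "members-only-list": [],  # sits behind a login; nothing links to it
    "search-result-42": [],   # generated on demand by the search form
}

def crawl(seed, links):
    """Follow hyperlinks breadth-first from a seed page, as a simple crawler might."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

surface = crawl("home", LINKS)   # what the "nets" can reach
deep = set(LINKS) - surface      # everything else in the pond

print(sorted(surface))  # ['about', 'home', 'job-search-form', 'jobs']
print(sorted(deep))     # ['members-only-list', 'search-result-42']
```

The two pages the crawler never reaches are still on the web. They are just not linked from anywhere it can follow, which is the sense in which they are “deep”.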
To give you a rough idea of how big the Deep web is: in 2000, Google announced it had one billion web pages indexed and thus had the largest search engine database at the time (i). Around the same time, a white paper published in the Journal of Electronic Publishing estimated that the Deep web was roughly 500 times larger than the Surface web (ii). It went on to state that the Deep web was growing far faster than the Surface web, because there weren’t any tools to uncover the hidden information. Fast forward eight years to 2008, and Google announced a new milestone: its systems had counted one trillion unique URLs on the web (iii).
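To put those figures side by side, here is a quick back-of-the-envelope calculation in Python. It assumes that “size” means a count of pages and that Google’s 2000 index stands in for the whole Surface web. Both are simplifications, and the 2008 figure counts URLs rather than pages, so the numbers are only a rough guide.

```python
# Figures quoted in the text above
SURFACE_PAGES_2000 = 1_000_000_000   # pages Google had indexed in 2000
DEEP_TO_SURFACE_RATIO = 500          # white paper estimate: Deep web ~500x the Surface web
URLS_2008 = 1_000_000_000_000        # unique URLs Google reported in 2008

# Implied Deep web size in 2000 (assumption: size measured in pages)
deep_pages_2000 = SURFACE_PAGES_2000 * DEEP_TO_SURFACE_RATIO

# How many times bigger the 2008 figure is than the 2000 index
growth_factor = URLS_2008 // SURFACE_PAGES_2000

print(f"{deep_pages_2000:,}")  # 500,000,000,000
print(growth_factor)           # 1000
```

In other words, the estimate implies roughly 500 billion Deep web pages back in 2000, and the number Google reported in 2008 was about a thousand times its index from eight years earlier.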
With that in mind, you might be thinking: what types of information in the Deep web can help me source candidates? Well, think about all the online databases you use, such as LinkedIn, Seek, Monster, ZoomInfo and Spokeo; forums; comments on blogs and articles; member lists of meet-up groups or associations; attendee lists of conferences; mailing lists; video blogs; anything with Flash content (search engines mostly index HTML files); PDF files (though Google has already started to index this file type as well); academic journals; patent lists; and many, many more. These are all examples of the Deep web.
How do you access them? Well, it really does depend on which part of the Deep web you’re looking at. There is always a ‘portal’ for you to access the information (why else would it be on the Internet?); the only question is whether it’s freely available or restricted. If it’s restricted, my advice is to use ONLY legitimate ways to get access: email the webmaster, the sales team or the administrator and ask.
Hope that helps,
[Image from Kethi Copeland]