Deep Web Research 2012
Bots, Blogs and News Aggregators is a keynote presentation that I have been delivering over the last several years, and much of my information comes from the extensive research that I have completed over the years into the "invisible" or what I like to call the "deep" web. The Deep Web covers somewhere in the vicinity of 1 trillion plus pages of information located throughout the World Wide Web in various files and formats that the current search engines on the Internet either cannot find or have difficulty accessing. The current search engines find hundreds of billions of pages at the time of this writing. In the last several years, some of the more comprehensive search engines have written algorithms to search the deeper portions of the World Wide Web by attempting to find files such as .pdf, .doc, .xls, .ppt, .ps and others. This Deep Web Research 2012 report and guide is divided into the following sections: Bot Research
MEGA-SEARCH.ME - Search engine for MEGA links

The WWW Virtual Library

How to use Google for Hacking. | Arrow Webzine
Google serves almost 80 percent of all search queries on the Internet, proving itself as the most popular search engine. However, Google makes it possible to reach not only publicly available information resources, but also some of the most confidential information that should never have been revealed. In this post I will show how to use Google for exploiting security vulnerabilities within websites. The following are some of the hacks that can be accomplished using Google.
1. There exist many security cameras used for monitoring places like parking lots, college campuses, road traffic, etc., which can be hacked using Google so that you can view the images captured by those cameras in real time. inurl:”viewerframe? Click on any of the search results (top 5 recommended) and you will gain access to the live camera, which has full controls. You now have access to the live cameras, which work in real time. intitle:”Live View / – AXIS”
2. filetype:xls inurl:”email.xls”
3. “?
4.
TasteKid | Recommends music, movies, books, games

SHODAN - Computer Search Engine
Shodan is the world's first computer search engine that lets you search the Internet for computers. Find devices based on city, country, latitude/longitude, hostname, operating system and IP. Check out Shodan Exploits if you want to search for known vulnerabilities and exploits. It lets you search across Exploit DB, Metasploit, CVE, OSVDB and Packetstorm with one simple interface. Check out the official Shodan API documentation and learn how to access Shodan from Python, Ruby or Perl. The Shodan Research website includes projects that provide new insights and interesting information using the Shodan data or API.
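The Shodan snippet above mentions finding devices by city, country, hostname and operating system. As a rough illustration of how such criteria might be combined into one query string, here is a minimal Python sketch. The helper name `build_shodan_query` is hypothetical, and the `filter:value` keywords (`city`, `country`, `hostname`, `os`) are assumed to match the criteria listed in the snippet; check the official Shodan documentation for the exact syntax. The sketch only builds a string and makes no network request.

```python
def build_shodan_query(terms="", **filters):
    """Combine free-text terms with filter:value pairs into one query string.

    Hypothetical helper for illustration; the filter names passed in are
    assumed to be ones the search engine understands.
    """
    terms = terms.strip()
    parts = [terms] if terms else []
    for name, value in filters.items():
        value = str(value)
        # Quote values containing spaces so they stay attached to their filter.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


query = build_shodan_query("apache", country="US", city="San Diego", os="Linux")
print(query)  # apache country:US city:"San Diego" os:Linux
```

A string built this way could then be pasted into the search box or handed to whichever client library you use, following the API documentation the snippet points to.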
Invisible Web Gets Deeper
By Danny Sullivan, from The Search Engine Report, Aug. 2, 2000
I've written before about the "invisible web," information that search engines cannot or refuse to index because it is locked up within databases. Now a new survey has made an attempt to measure how much information exists outside of the search engines' reach. The company behind the survey is also offering up a solution for those who want to tap into this "hidden" material. The study, conducted by search company BrightPlanet, estimates that the inaccessible part of the web is about 500 times larger than what search engines already provide access to. That sounds terrible, but as I've commented numerous times before, the size of a search engine does not necessarily equate to its relevancy or usefulness. For example, assume you wanted to do a trademark search against databases in various parts of the world. To date, meta search tools like this have been few and far between. Don't expect a web based version of LexiBot to be coming.
backstitch | Personalize the Web

The Peer to Peer Search Engine: Home

Invisible Web: What it is, Why it exists, How to find it, and Its inherent ambiguity
What is the "Invisible Web", a.k.a. the "Deep Web"? The "visible web" is what you can find using general web search engines. It's also what you see in almost all subject directories. The "invisible web" is what you cannot find using these types of tools. The first version of this web page was written in 2000, when this topic was new and baffling to many web searchers. These types of pages used to be invisible but can now be found in most search engine results: Pages in non-HTML formats (pdf, Word, Excel, PowerPoint), now converted into HTML.
Why isn't everything visible? There are still some hurdles search engine crawlers cannot leap. The Contents of Searchable Databases.
How to Find the Invisible Web: Simply think "databases" and keep your eyes open. Use Google and other search engines to locate searchable databases by searching a subject term and the word "database". Examples: "plane crash database", "languages database", "toxic chemicals database". Remember that the Invisible Web exists.
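The tip above (search a subject term together with the word "database") can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the search URL prefix is an assumption about how a search engine accepts a query in its address; swap in whichever engine you prefer.

```python
from urllib.parse import quote_plus


def database_queries(subjects):
    """Append the word 'database' to each non-empty subject term."""
    return [f"{s.strip()} database" for s in subjects if s.strip()]


def search_url(query, base="https://www.google.com/search?q="):
    """Build a search URL; the base prefix is an assumed, replaceable default."""
    return base + quote_plus(query)


for q in database_queries(["plane crash", "languages", "toxic chemicals"]):
    print(q, "->", search_url(q))
```

The three subjects are the examples given in the snippet; any subject list works the same way.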
Releaselog | RLSLOG.net

White Papers and Publications Archives | BrightPlanet
Being in the business of Deep Web harvesting, we find ourselves answering the same questions regularly. Online search is such a part of our daily lives; Kleenex is to facial tissue as Google is to search. Google has become synonymous with search in many people’s minds. However, what Google does and what Deep Web harvesting is are very different. In our latest whitepaper, we answer some of the frequently asked questions we get about Deep Web harvesting.
You’ve heard the hype, now learn the process: Harvesting, Normalizing and Enrichment, Reporting and Analytics. Download our latest whitepaper to discover how Big Data from the Deep Web becomes usable.
What is the Deep Web and why should you care?
In our latest whitepaper we dive into how healthcare research can harness Big Data available on the Deep Web to create efficiency and improve collaboration.
The Ultimate Guide to the Invisible Web
Search engines are, in a sense, the heartbeat of the internet; “Googling” has become a part of everyday speech and is even recognized by Merriam-Webster as a grammatically correct verb. It’s a common misconception, however, that Googling a search term will reveal every site out there that addresses your search. Typical search engines like Google, Yahoo, or Bing actually access only a tiny fraction (estimated at 0.03%) of the internet. The sites that traditional searches yield are part of what’s known as the Surface Web, which is made up of indexed pages that a search engine’s web crawlers are programmed to retrieve. "As much as 90 percent of the internet is only accessible through deep web websites." So where’s the rest? So what is the Deep Web, exactly?
Search Engines and the Surface Web: Understanding how surface pages are indexed by search engines can help you understand what the Deep Web is all about.
How is the Deep Web Invisible to Search Engines?
Reasons a Page is Invisible
Art
Intelligence Community Archives | BrightPlanet
At BrightPlanet we often throw around the acronym OSINT and talk about open source intelligence, but what is it, what, if anything, does it have to do with the Deep Web, and how is it being used? We answer those questions in this post.
Posted in Case Studies, Deep Web and Big Data, Intelligence Community, Law Enforcement. Tagged big data, deep web, external data, open-source intelligence, OSINT, OSINT intelligence, unclassified information.
We get a lot of questions asking how law enforcement can leverage Twitter data to enhance situational awareness and provide actionable intelligence. BrightPlanet is teaming up with GeoTime to help effectively take geolocated Twitter data and visualize movement and communication patterns.
During almost every webinar we do and in many conversations we have about our BlueJay Twitter crime scanner, we seem to always get the question, why is BlueJay only Twitter?
24 Hour Event Monitor, 30 Day Playback
Some of the links are out of date, but it is still a useful list of sites for deep web research.
by marylenegoulet Sep 28