Google It. But How Does The Search Engine Actually Work?

Google it. While the phrase is widely used, do you really know how Google’s search engine actually works? How does it decide which sites are listed first and which will help you find the information you’re looking for? In a new section entitled “How Search Works”, Google walks interested users through how billions of searches happen each day. An interactive infographic lets users learn about the search process and how Google deals with spam and other useless pages.

According to SearchEngineLand, the new area was inspired by Google’s The Story Of Send, an interactive infographic that Google released last year to explain how it handles email.

The interactive infographic has three parts: crawling & indexing, algorithms and fighting spam.

Section 1: Crawling and Indexing

Google uses software called “web crawlers” to discover publicly available webpages. The best-known crawler is “Googlebot”. The crawler, sometimes also called a “spider”, performs the crawling process by which new and updated pages are discovered and added to the Google index. It looks at webpages and follows the links on those pages, much like you would if you were browsing content on the web. The crawlers go from link to link and bring data about those webpages back to Google’s servers. The software pays special attention to new sites, changes to existing sites and dead links.
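To make the link-following idea concrete, here is a minimal sketch in Python. It is not Googlebot and does not touch the network: the tiny `TOY_WEB` dictionary, its example URLs and the `crawl` function are all made up for illustration. It only shows the general pattern of starting from one page, following links to new pages and noting links that lead nowhere.

```python
from collections import deque

# Hypothetical miniature "web": each URL maps to the links found on that page.
# A URL that is linked to but missing from this dict stands in for a dead link.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["b.example/news", "b.example/gone"],
    "b.example/news": [],
}


def crawl(seed, web):
    """Breadth-first crawl from `seed`.

    Returns (pages visited in discovery order, dead links encountered).
    """
    visited, dead = [], []
    seen = {seed}            # URLs already queued, so no page is fetched twice
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url not in web:   # nothing to fetch: record it as a dead link
            dead.append(url)
            continue
        visited.append(url)
        for link in web[url]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited, dead


pages, dead_links = crawl("a.example/", TOY_WEB)
print(pages)
print(dead_links)
```

A real crawler would also have to fetch pages over HTTP, respect site owners' crawling rules and revisit pages to pick up changes; none of that is modelled here.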

The internet can be compared to a limitless and ever-growing library. Google gathers pages during the crawl process and creates an index, so it knows exactly how to look things up. Much like the index in the back of a book, the Google index includes information about words and their locations. When you search, at the most basic level, Google’s algorithms look up your search terms in the index to find the appropriate pages.
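The book-index comparison can be sketched as a small "inverted index" in Python. This is a simplified illustration, not Google's actual data structure: the sample pages and the `build_index` and `lookup` helpers are assumptions made up for this example. Each word maps to the pages it appears on and its position on each page, and a search looks the terms up instead of rereading every page.

```python
from collections import defaultdict

# Made-up sample pages: URL -> page text.
SAMPLE_PAGES = {
    "dogs.example": "dogs are loyal pets",
    "cats.example": "cats and dogs",
    "birds.example": "birds sing",
}


def build_index(pages):
    """Map each word to a list of (url, position) pairs, like a book index."""
    index = defaultdict(list)
    for url, text in pages.items():
        for position, word in enumerate(text.lower().split()):
            index[word].append((url, position))
    return index


def lookup(index, query):
    """Return the set of URLs that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = None
    for term in terms:
        urls = {url for url, _ in index.get(term, [])}
        results = urls if results is None else results & urls
    return results


index = build_index(SAMPLE_PAGES)
print(index["dogs"])
print(lookup(index, "dogs"))
print(lookup(index, "cats dogs"))
```

Looking up "dogs" returns both pages that mention the word, while "cats dogs" narrows the result to the one page containing both terms.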

However, the search process gets a little more complicated from there. When a user searches for “dogs” they don’t necessarily want a page with the word “dogs” on it hundreds of times. They probably want pictures, videos or a list of breeds. Google’s indexing systems note many different aspects of pages, such as when they were published, whether they contain pictures and videos, and much more.

As users explore the interactive infographic, they will notice links and hidden pop-ups that reveal more information when they hover over or click certain areas.

Section 2: Algorithms

Users want answers, not trillions of webpages. Algorithms are computer programs that look for clues to give users back exactly what they want. This section of the infographic also lets users discover different aspects of the process to learn more.

Algorithms are the computer processes and formulas that take your questions and turn them into answers. Today Google’s algorithms rely on more than 200 unique signals or “clues” that make it possible to guess what you might really be looking for. These signals include things like the terms on websites, the freshness of content, your region and PageRank.
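How many signals might be combined into a single ranking can be illustrated with a toy Python example. Google does not publish its formula, so everything here is an assumption for illustration only: the four signal names loosely mirror the examples above, and the weights, the 0-to-1 scores and the weighted-sum approach are invented.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    term_match: float    # how well the page's terms match the query, 0..1
    freshness: float     # how recent the content is, 0..1
    region_match: float  # how relevant the page is to the user's region, 0..1
    link_score: float    # stand-in for a link-based signal such as PageRank, 0..1


# Hypothetical weights; real systems use far more signals and are not public.
WEIGHTS = {
    "term_match": 0.4,
    "freshness": 0.2,
    "region_match": 0.1,
    "link_score": 0.3,
}


def score(page):
    """Weighted sum of the page's signals."""
    return sum(weight * getattr(page, name) for name, weight in WEIGHTS.items())


def rank(pages):
    """Order pages from highest to lowest score."""
    return sorted(pages, key=score, reverse=True)


candidates = [
    Page("c.example", term_match=0.3, freshness=1.0, region_match=1.0, link_score=0.2),
    Page("a.example", term_match=0.9, freshness=0.2, region_match=1.0, link_score=0.8),
    Page("b.example", term_match=1.0, freshness=0.9, region_match=0.0, link_score=0.1),
]
ranked = rank(candidates)
for page in ranked:
    print(page.url, round(score(page), 2))
```

In this toy setup the page with a good term match and strong links comes out ahead of a fresher page that matches the query less well, which is the kind of trade-off a multi-signal ranking has to make.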

Section 3: Fighting Spam

Every day, millions of useless spam pages are created. Google makes a valiant effort to fight spam through a combination of computer algorithms and manual review.

Spam sites attempt to make their way to the top of search results through techniques such as repeating keywords over and over, buying links that pass PageRank or putting invisible text on the screen. This is bad for search because relevant websites get buried below the nonsense, and it’s bad for legitimate website owners because their sites become harder to find.
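One of these tricks, repeating keywords over and over, is easy to illustrate. The Python sketch below flags text in which a single word makes up an unusually large share of all words. It is a deliberately naive heuristic and not how Google detects spam: the function names, the 30% threshold and the minimum length are hypothetical values chosen for the example.

```python
from collections import Counter


def top_term_density(text):
    """Return (most frequent word, its share of all words in the text)."""
    words = text.lower().split()
    if not words:
        return "", 0.0
    term, count = Counter(words).most_common(1)[0]
    return term, count / len(words)


def looks_stuffed(text, threshold=0.3, min_words=10):
    """Flag text whose most frequent word exceeds `threshold` of all words.

    Both `threshold` and `min_words` are made-up values for illustration.
    Very short texts are never flagged, since one repeated word means little.
    """
    if len(text.split()) < min_words:
        return False
    _, density = top_term_density(text)
    return density > threshold


stuffed = "cheap deals cheap deals cheap cheap buy cheap deals cheap now cheap"
normal = "our shop lists daily deals from local restaurants and stores near you"

print(top_term_density(stuffed))
print(looks_stuffed(stuffed))
print(looks_stuffed(normal))
```

A check this simple would misfire on plenty of legitimate pages, which is one reason automated detection is paired with manual review.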

The good news is that Google’s algorithms can detect the vast majority of spam and demote it automatically. For the rest, Google has teams who manually review sites.

The different types of spam include: cloaking and/or sneaky redirects, hacked sites, hidden text and/or keyword stuffing, parked domains, pure spam, spammy free hosts and dynamic DNS providers, thin content with little or no added value, unnatural links from a site, unnatural links to a site and user-generated spam.

Source: Google

Megan Bildner

Megan is currently the Jr. Editor of the Daily Deal Media and Digital MI sites with Mogul Media. Here, she is responsible for the site's content and publication. Along with these tasks, Megan also engages with the DDM & DMI audience through social pages/groups/tweets. When she is not working on the DDM & DMI sites Megan enjoys tweeting, traveling, snowboarding, hiking, and socializing with her friends. She grew up in Michigan's Clarkston and Lake Orion and loves being on the lake, boating, and wake-boarding. You can follow her @MeganBildner or e-mail her at megan@mogulmedia.com