Google Hacking - A Crash Course by alexkeller


  • 8/7/2019 Google Hacking - A Crash Course by alexkeller


    Google Hacking: A Crash Course

    Alex Keller, Network/Systems Administrator for BSS Computing @ SFSU


    What is "Google"?

    Definition: Googol
    Pronunciation: 'gü-"gol
    Function: noun

    Google is a play on the word googol, which was coined by Milton Sirotta, nephew of American mathematician Edward Kasner, and was popularized in the book "Mathematics and the Imagination" by Kasner and James Newman. It refers to the number represented by the numeral 1 followed by 100 zeros. Google's use of the term reflects the company's mission to organize the immense, seemingly infinite amount of information available on the web.

    Originally called "Backrub", the logic behind the Google search engine was developed by graduate students Larry Page and Sergey Brin at Stanford University in 1995. Their first place of business was literally a garage. The garage location was chosen because it had a washer/dryer and a hot tub out back. By then they were already serving 10,000 searches a day.

    http://www.google.com/corporate/history.html


    How We Got Here....

    For the last 5 years, Google has been the undisputed leader in online search technology.

    Before Google, Altavista, FAST, and Inktomi had the largest databases but suffered from poorer search algorithms.

    Google's profit is partially ad driven, but sponsors do not garner higher ratings in searches.


    Searching and Beyond...

    Localization
    Language options
    Toolbar
    Blogger
    Translation
    Calculator
    Stock Quotes
    Phonebook
    Newsgroups
    Programming tools
    Intra-network searches
    Print searching
    Desktop search
    Mobile Access
    News
    Spell Checker
    Pricing


    Search Engine Supremacy

    http://searchenginewatch.com/reports/article.php/2156481


    How Big is Google?

    http://searchenginewatch.com/reports/article.php/2156481


    Searches Per Day in Millions

    Google: 250
    Yahoo/Overture: 167
    Inktomi: 80
    Looksmart: 45
    Others: 80

    http://searchenginewatch.com/reports/article.php/2156461
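
To put the chart in proportion, the short sketch below converts the searches-per-day figures into percentage shares. It assumes the five values on the slide are listed in the same order as the five labels; the function name `market_share` is made up for this illustration.

```python
# Figures from the slide, in millions of searches per day.
# Assumption: values and labels appear in the same order on the slide.
searches_per_day = {
    "Google": 250,
    "Yahoo/Overture": 167,
    "Inktomi": 80,
    "Looksmart": 45,
    "Others": 80,
}


def market_share(counts):
    """Return each engine's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in market_share(searches_per_day).items():
        print(f"{name:15s} {pct:5.1f}%")
```

Under that reading, Google handles roughly 40% of the searches shown on the chart.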


    So How Does Google Work?

    Crawls and indexes web pages et al.
    Stores copies of web pages and graphics on their caching servers
    Presents users with simple front end to query the database of cached pages
    Returns search results in an ordered fashion based upon relevancy
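
The four steps above can be sketched in miniature. This is a toy model, not Google's implementation: `PAGES` is a hypothetical stand-in for the cached copies of crawled pages, `build_index` plays the role of indexing, and `search` is the front end that returns matches ordered by a crude relevance measure (keyword hit count).

```python
from collections import defaultdict

# Hypothetical stand-in for crawled and cached pages: URL -> page text.
PAGES = {
    "http://example.com/a": "google hacking crash course",
    "http://example.com/b": "advanced google search operators",
    "http://example.com/c": "crash test results",
}


def build_index(pages):
    """Map each word to the set of URLs whose cached text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, pages, query):
    """Return URLs containing every query word, most keyword hits first."""
    words = query.lower().split()
    if not words:
        return []
    # AND is implied: keep only pages that contain all of the words.
    matches = set.intersection(*(index.get(w, set()) for w in words))

    def hits(url):
        tokens = pages[url].lower().split()
        return sum(tokens.count(w) for w in words)

    # Sort by descending hit count, then by URL for a stable order.
    return sorted(matches, key=lambda u: (-hits(u), u))


if __name__ == "__main__":
    index = build_index(PAGES)
    print(search(index, PAGES, "google"))
    print(search(index, PAGES, "crash course"))
```

The point of the index is that a query never scans the pages themselves; it only intersects precomputed word-to-page sets.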


    Anatomy of a Search

    [Diagram: server side and client side of a search]

    http://computer.howstuffworks.com/search-engine1.htm


    What Can Google Search?

    Adobe Portable Document Format (pdf)
    Adobe PostScript (ps)
    Lotus 1-2-3 (wk1, wk2, wk3, wk4, wk5, wki, wks, wku)
    Lotus WordPro (lwp)
    MacWrite (mw)
    Microsoft Excel (xls)
    Microsoft PowerPoint (ppt)
    Microsoft Word (doc)
    Microsoft Works (wks, wps, wdb)
    Microsoft Write (wri)
    Rich Text Format (rtf)
    Shockwave Flash (swf)
    Text (ans, txt)


    So What Determines Page Relevance and Rating?

    Exact Phrase: are your keywords found as an exact phrase in any pages?
    Adjacency: how close are your keywords to each other?
    Weighting: how many times do the keywords appear in the page?
    PageRank/Links: How many links point to the page? How many links are actually in the page?

    Equation: (Exact Phrase Hit)+(AdjacencyFactor)+(Weight) * (PageRank/Links)

    From: Google 201, Advanced Googology - Patrick Crispen, CSU
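
As a rough illustration of the equation, the sketch below scores two made-up pages. It rests on one assumption worth flagging: the three text signals are summed first and the sum is then multiplied by the link-based term. Read with ordinary operator precedence, the slide's formula would multiply only the weight. All input numbers are invented for the example.

```python
def relevance_score(exact_phrase_hit, adjacency_factor, weight, pagerank):
    """Toy reading of the slide's equation.

    Assumption: (exact phrase + adjacency + weight) is multiplied as a
    whole by the PageRank/Links term.
    """
    return (exact_phrase_hit + adjacency_factor + weight) * pagerank


# Invented inputs: (exact phrase hit, adjacency, weight, PageRank/Links).
candidates = {
    "page_a": (1, 0.5, 3, 2.0),  # fewer keyword hits, well linked
    "page_b": (0, 0.2, 8, 0.5),  # many keyword hits, poorly linked
}

ranked = sorted(
    candidates,
    key=lambda name: relevance_score(*candidates[name]),
    reverse=True,
)

if __name__ == "__main__":
    for name in ranked:
        print(name, relevance_score(*candidates[name]))
```

Under this reading, a well-linked page can outrank one that merely repeats the keywords more often.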


    Enough BS, How Do I Get Results?

    Pick your keywords carefully & be specific
    Do NOT exceed 10 keywords
    Use Boolean modifiers
    Use advanced operators
    Google ignores some words*: a, about, an, and, are, as, at, be, by, from, how, i, in, is, it, of, on, or, that, the, this, to, we, what, when, where, which, with

    *From: Google 201, Advanced Googology - Patrick Crispen, CSU
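
To see what a query looks like after these rules are applied, here is a small sketch that drops the ignored words listed above and keeps at most 10 keywords. It is a local approximation for plain keyword queries only (it lowercases everything, so an uppercase OR modifier would be dropped too), and `effective_keywords` is a name made up for this example.

```python
# Words the slide lists as ignored by Google.
IGNORED_WORDS = {
    "a", "about", "an", "and", "are", "as", "at", "be", "by", "from",
    "how", "i", "in", "is", "it", "of", "on", "or", "that", "the",
    "this", "to", "we", "what", "when", "where", "which", "with",
}
MAX_KEYWORDS = 10  # limit stated on the slide


def effective_keywords(query):
    """Return the words that survive the ignored-word filter,
    truncated to the 10-keyword limit."""
    words = [w for w in query.lower().split() if w not in IGNORED_WORDS]
    return words[:MAX_KEYWORDS]


if __name__ == "__main__":
    print(effective_keywords("How to be the best at Google hacking"))
```

Running a query through a filter like this before searching shows how few of the typed words actually carry weight.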


    Google's Boolean Modifiers

    AND is always implied.
    OR: Escobar (Narcotics OR Cocaine)
    "-" = NOT: Escobar -Pablo
    "+" = MUST: Escobar +Roberto
    Use quotes for exact phrase matching:
    "nobody puts baby in a corner"
    OR
    "there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns, the ones we don't know we don't know."
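
The modifiers above compose mechanically, so a query string can be assembled from its parts. The helper below is a minimal sketch of that; `build_query` and its parameter names are hypothetical, and it only formats text without contacting any search engine.

```python
def build_query(terms=(), any_of=(), exclude=(), require=(), phrase=None):
    """Assemble a query string from the slide's Boolean modifiers."""
    parts = list(terms)                               # AND is implied
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")  # OR group
    parts += ["-" + t for t in exclude]               # "-" = NOT
    parts += ["+" + t for t in require]               # "+" = MUST
    if phrase:
        parts.append('"' + phrase + '"')              # exact phrase
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query(["Escobar"], any_of=["Narcotics", "Cocaine"]))
    print(build_query(["Escobar"], exclude=["Pablo"]))
    print(build_query(["Escobar"], require=["Roberto"]))
    print(build_query(phrase="nobody puts baby in a corner"))
```

The four calls reproduce the four example queries from the slide.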


    Wildcards

    Google supports word wildcards but NOT stemming.
    "It's the end of the * as we know it" works.
    but "American Psycho*" won't get you decent results on American Psychology or American Psychophysics.
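
The difference between a word wildcard and stemming can be imitated locally with a regular expression. The sketch below is only an emulation of the behaviour described on the slide, not how Google matches text. It assumes a standalone * stands for exactly one whole word, while a * glued to a word fragment is treated as a literal character and therefore never expands the fragment.

```python
import re


def phrase_pattern(phrase):
    """Compile a phrase into a regex where a standalone * matches one
    whole word. A * attached to a fragment is matched literally, which
    mimics the lack of stemming."""
    tokens = phrase.split()
    parts = [r"\S+" if t == "*" else re.escape(t) for t in tokens]
    # Lookarounds make the phrase start and end on word boundaries.
    return re.compile(r"(?<!\S)" + r"\s+".join(parts) + r"(?!\S)", re.IGNORECASE)


if __name__ == "__main__":
    lyric = "It's the end of the world as we know it"
    print(bool(phrase_pattern("It's the end of the * as we know it").search(lyric)))
    print(bool(phrase_pattern("American Psycho*").search("American Psychology")))
```

The first check finds a match because the * fills in for a whole word; the second finds none because "Psycho*" is not expanded to "Psychology".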


    Advanced Searching

    Advanced Search Page:
    http://www.google.com/advanced_search


    Advanced Operators

    cache:
    define:
    info:
    intext:
    intitle:
    inurl:
    link:
    related:
    stocks:
    filetype:
    numrange 1973..2005
    source:
    phonebook:

    http://www.googleguide.com/advanced_operators.html

    DEMO:

    on-2-13-1973..2004

    visa 4356000000000000..4356999999999999
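
Advanced operators are plain text inside the query, so they can be combined programmatically. The sketch below joins keywords with `operator:value` pairs and URL-encodes the result. The endpoint `http://www.google.com/search` and its `q` parameter are assumptions that do not appear on the slide, and both function names are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed search endpoint; the slide itself does not give one.
SEARCH_URL = "http://www.google.com/search"


def operator_query(keywords="", **operators):
    """Join keywords with operator:value pairs, e.g. filetype:ppt."""
    parts = [keywords] if keywords else []
    parts += [f"{name}:{value}" for name, value in operators.items()]
    return " ".join(parts)


def search_url(query):
    """Return the URL that would carry this query in its q parameter."""
    return SEARCH_URL + "?" + urlencode({"q": query})


if __name__ == "__main__":
    q = operator_query("crash course", filetype="ppt", intitle="google")
    print(q)
    print(search_url(q))
    # A number range is written inline, as on the slide: 1973..2005
    print(operator_query("1973..2005", intitle="mathematics"))
```

Keeping the query assembly separate from the URL encoding makes it easy to inspect the exact string before it is sent anywhere.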


    Extras...

    Translation and Language options - over 100 to choose from: http://www.google.com/language_tools
    Stock Quotes - enter stocks:, example: stocks:GOOG
    Newsgroups - http://groups.google.com
    Calculator - "1024 minus 768" or "12 to the 10 power"
    Froogle - http://froogle.google.com
    Images - http://images.google.com
    Spell Checking - just type it in: "convienence"
    Blogger - http://www.blogger.com/start

    Extras can be found at http://www.google.com/help/features.html


    Google, doesn't make it right...

    GOOD
    FAIR - Fairness and Accuracy in Reporting: http://www.fair.org/
    Federation of American Scientists: http://www.fas.org/main/home.jsp
    OneWorld.net: http://www.oneworld.net/

    BAD
    Holocaust Never Happened? http://www.air-photo.com/english
    School of the Americas: http://carlisle-www.army.mil/usamhi/usarsa/main.htm

    UGLY
    Pixyland! http://www.pixyland.org/peterpan/photo_closeups_pp4.htm


    Bibliography and Further Research

    Search Engine Watch: http://searchenginewatch.com
    Google Hacks: 100 Industrial-Strength Tips & Tools by Tara Calishain, Rael Dornfest
    Johnny I Hack Stuff: http://johnny.ihackstuff.com
    Google: http://www.google.com
    HowStuffWorks: http://computer.howstuffworks.com/search-engine1.htm