Site Search Analytics in a Nutshell

  • date post

    17-Aug-2014
  • Category

    Technology

description

Originally presented at SXSW March 13, 2011, on panel with Fred Beecher and Austin Govella. Modified and updated for Web 2.0 Expo talk, October 12, 2011, UX Web Summit September 26, 2012; Webdagene September 10, 2013.

Transcript of Site Search Analytics in a Nutshell

  • Site Search Analytics in a Nutshell. Louis Rosenfeld, lou@louisrosenfeld.com, @louisrosenfeld. Webdagene, 10 September 2013
  • Hello, my name is Lou www.louisrosenfeld.com | www.rosenfeldmedia.com
  • Let's look at the data
  • No, let's look at the real data
    Critical elements in bold: IP address, time/date stamp, query, and # of results:
      XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02
      XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ie=UTF-8&client=www&q=license+plate&ud=1&site=AllSites&spell=1&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16
    What are users searching? How often are users failing?
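As a rough illustration of pulling the four critical elements out of entries like these, here is a minimal Python sketch. It assumes the trailing numbers are status, bytes, number of results, and elapsed seconds (the slide only says that the number of results is present), and that both URLs begin with `/search?`. The names `LOG_LINE`, `parse_search_hit`, and `SAMPLE_LINES` are made up for this example.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout: IP, two unused fields, [timestamp], "GET url HTTP/x",
# then status, bytes, number of results, elapsed seconds.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"GET (?P<url>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d+) (?P<size>\d+) (?P<results>\d+) (?P<seconds>[\d.]+)'
)


def parse_search_hit(line):
    """Return the four critical fields from one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    params = parse_qs(urlsplit(match.group("url")).query)
    return {
        "ip": match.group("ip"),
        "timestamp": match.group("timestamp"),
        "query": params.get("q", [""])[0],  # parse_qs decodes "+" as a space
        "results": int(match.group("results")),
    }


# The two entries from the slide, with line-wrap spaces removed from the URLs.
SAMPLE_LINES = [
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0'
    '&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8'
    '&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104'
    ' HTTP/1.1" 200 971 0 0.02',
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&entqr=0'
    '&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ie=UTF-8&client=www&q=license+plate'
    '&ud=1&site=AllSites&spell=1&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104'
    ' HTTP/1.1" 200 8283 146 0.16',
]

hits = [parse_search_hit(line) for line in SAMPLE_LINES]
# "How often are users failing?": queries that returned zero results.
failed = [hit["query"] for hit in hits if hit["results"] == 0]
print(hits)
print(failed)
```

Under those assumptions, the misspelled query shows up as a zero-result search, and the corrected one two seconds later does not.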
  • SSA is semantically rich data, and... Queries sorted by frequency
  • ...what users want--in their own words
  • A little goes a long way: a handful of queries/tasks/ways to navigate/features/documents meet the needs of your most important audiences
      Not all queries are distributed equally
      Nor do they diminish gradually
      The 80/20 rule isn't quite accurate
  • (and the tail is quite long) The Long Tail is much longer than you'd suspect
  • The Zipf Distribution, textually
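To get a feel for the shape described here, the following Python sketch computes what share of all searches the top queries would capture under an idealized Zipf distribution, where the frequency of a query is proportional to 1/rank. That exponent of 1 is an assumption for illustration; real query logs only approximate it. The function name `zipf_top_share` is hypothetical.

```python
def zipf_top_share(top_k, unique_queries):
    """Share of all searches captured by the top_k queries if frequency ~ 1/rank."""
    top_k = min(top_k, unique_queries)  # cannot take more queries than exist
    running = 0.0
    top_total = 0.0
    for rank in range(1, unique_queries + 1):
        running += 1.0 / rank
        if rank == top_k:
            top_total = running  # weight held by the head of the curve
    return top_total / running


# Compare the head of the curve with a much larger slice of it.
for k in (50, 500, 5000):
    print(k, round(zipf_top_share(k, 10000), 3))
```

In this idealized model, each tenfold increase in the number of queries covered adds the same amount of search volume, which is one way to see why the head is so valuable and the tail so long.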
  • Some things you can do with SSA
      1. Make it harder to get lost in deep content
      2. Make search smarter
      3. Reduce jargon
      4. Learn how your audiences differ
      5. Know when to publish what
      6. Own and enjoy your failures
      7. Avoid disaster
      8. Predict the future
  • #1 Make it harder to get lost
  • Start with basic SSA data: queries and query frequency
      Percent: volume of search activity for a unique query during a particular time period
      Cumulative Percent: running sum of percentages
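The Percent and Cumulative Percent columns can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_report` and the sample queries are hypothetical, and treating queries as identical after lowercasing and trimming is an assumption about how "unique query" is defined.

```python
from collections import Counter


def frequency_report(queries):
    """Return (rank, query, count, percent, cumulative percent) rows, most frequent first."""
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    for rank, (query, count) in enumerate(counts.most_common(), start=1):
        percent = 100.0 * count / total
        cumulative += percent  # running sum of percentages
        rows.append((rank, query, count, round(percent, 2), round(cumulative, 2)))
    return rows


# Hypothetical queries, not data from the talk.
sample_queries = ["campus map"] * 5 + ["Library Hours"] * 3 + ["parking"] * 2
report = frequency_report(sample_queries)
for row in report:
    print(row)
```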
  • Tease out common content types
    Took an hour to...
      Analyze top 50 queries (20% of all search activity)
      Ask and iterate: what kind of content would users be looking for when they searched these terms?
      Add cumulative percentages
    Result: prioritized list of potential content types
      #1) application: 11.77%
      #2) reference: 10.5%
      #3) instructions: 8.6%
      #4) main/navigation pages: 5.91%
      #5) contact info: 5.79%
      #6) news/announcements: 4.27%
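The tallying step could look something like the Python sketch below: once each top query has been tagged by hand with a likely content type, sum the query percentages per type and sort. The function name `rank_content_types` and the tagged rows are hypothetical and are not the data behind the percentages on the slide.

```python
from collections import defaultdict


def rank_content_types(tagged_queries):
    """Sum query percentages per content type and sort from largest to smallest."""
    totals = defaultdict(float)
    for _query, percent, content_type in tagged_queries:
        totals[content_type] += percent
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical (query, percent of search activity, hand-assigned content type) rows.
tagged = [
    ("apply online", 2.0, "application"),
    ("application form", 1.5, "application"),
    ("campus map", 1.0, "reference"),
    ("how to register", 0.5, "instructions"),
]
priorities = rank_content_types(tagged)
print(priorities)
```

The tagging itself stays a human judgment call; the code only does the arithmetic that turns those judgments into a prioritized list.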
  • Clear content types lead to better contextual navigation: artist descriptions, album reviews, album pages, artist bios, discography, TV listings
  • #2 Make search smarter
  • Clear content types improve search performance
      Content objects related to products
      Raw search results
  • Contextualizing advanced features
  • Session data suggest progression and context
    Search session patterns:
      1. solar energy  2. how solar energy works
      1. solar energy  2. energy
      1. solar energy  2. solar energy charts
      1. solar energy  2. explain solar energy
      1. solar energy  2. solar energy news
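One way to surface patterns like these is to group each visitor's queries into sessions and count consecutive query pairs. The Python sketch below assumes a visitor is identified by IP address and that a gap of more than 30 minutes starts a new session; both are common conventions, not something the talk specifies. The function names, IP addresses, and timestamps are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity cutoff


def build_sessions(hits, gap=SESSION_GAP):
    """Group (ip, time, query) hits into per-visitor lists of queries."""
    by_visitor = defaultdict(list)
    for ip, when, query in hits:
        by_visitor[ip].append((when, query))
    sessions = []
    for visits in by_visitor.values():
        visits.sort(key=lambda visit: visit[0])
        current, last_time = [], None
        for when, query in visits:
            if last_time is not None and when - last_time > gap:
                sessions.append(current)  # long pause: close the session
                current = []
            current.append(query)
            last_time = when
        if current:
            sessions.append(current)
    return sessions


def refinement_pairs(sessions):
    """Count (query, next query) pairs within each session."""
    pairs = Counter()
    for session in sessions:
        for first, second in zip(session, session[1:]):
            if first != second:
                pairs[(first, second)] += 1
    return pairs


start = datetime(2013, 9, 10, 9, 0)
sample_hits = [
    ("10.0.0.1", start, "solar energy"),
    ("10.0.0.1", start + timedelta(minutes=2), "how solar energy works"),
    ("10.0.0.2", start, "solar energy"),
    ("10.0.0.2", start + timedelta(minutes=1), "solar energy charts"),
    ("10.0.0.1", start + timedelta(hours=3), "energy"),
]
sessions = build_sessions(sample_hits)
pairs = refinement_pairs(sessions)
print(sessions)
print(pairs.most_common())
```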
  • Recognizing proper nouns, dates, and unique ID#s
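A very rough Python sketch of this kind of recognition follows. The patterns are deliberately simple and are assumptions, not rules from the talk: dates are matched only in a few numeric forms, an ID is taken to be up to four letters followed by three or more digits, and proper nouns are recognized only by lookup in a list you supply (`known_names` is hypothetical).

```python
import re

# Assumed date forms: 10/07/2006, 10-07-06, or 2013-09-10.
DATE_PATTERN = re.compile(r"^\d{1,2}[/-]\d{1,2}[/-]\d{2,4}$|^\d{4}-\d{2}-\d{2}$")
# Assumed ID form: optional short letter prefix, optional separator, 3+ digits.
ID_PATTERN = re.compile(r"^[A-Za-z]{0,4}[- ]?\d{3,}$")


def classify_query(query, known_names):
    """Label a query as 'date', 'id', 'proper noun', or 'other'."""
    text = query.strip()
    if DATE_PATTERN.match(text):
        return "date"
    if ID_PATTERN.match(text):
        return "id"
    if text.lower() in known_names:
        return "proper noun"
    return "other"


names = {"louis rosenfeld"}  # e.g. loaded from a staff directory
for sample in ["10/07/2006", "PO-12345", "Louis Rosenfeld", "solar energy"]:
    print(sample, "->", classify_query(sample, names))
```

A search engine could use such labels to route a query to a more specific lookup (a calendar, a record by ID, a directory) before falling back to full-text results.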
  • #3 Reduce jargon
  • Saving the brand by killing jargon at a community college
    Jargon related to online education: FlexEd, COD, College on Demand
    Marketing's solution: expensive campaign to educate public (via posters, brochures)
    The Numbers (from SSA):
      query rank   query
      #22          online*
      #101         COD
      #259         College on Demand
      #389         FlexTrack
      * "online" part of 213 queries
    Result: content relabeled, money saved
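The two numbers used here, a term's query rank and the count of queries that contain a word, can be read off a frequency-ranked query list. A minimal Python sketch, with hypothetical function names and an illustrative list that is not the college's actual data:

```python
def rank_of(term, ranked_queries):
    """Return the 1-based rank of an exact query, or None if nobody searched it."""
    wanted = term.lower()
    for rank, query in enumerate(ranked_queries, start=1):
        if query.lower() == wanted:
            return rank
    return None


def queries_containing(word, ranked_queries):
    """Return every distinct query that includes the word anywhere in it."""
    needle = word.lower()
    return [query for query in ranked_queries if needle in query.lower()]


# Hypothetical list of unique queries, most frequent first.
ranked = ["registration", "online", "online classes", "COD", "College on Demand"]
print(rank_of("COD", ranked))
print(queries_containing("online", ranked))
```

Comparing the rank of the plain word against the ranks of the branded terms is what shows whether the audience has adopted the jargon.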
  • #4 Learn how your audiences differ
  • Who cares about what?