Quality Data: Fresno State's Analytics Strategy Rob Robinson Web Developer for Fresno State...


Quality Data: Fresno State's Analytics Strategy

Rob Robinson
Web Developer for Fresno State

[email protected] @robrobinsonii

Basic Organization

• Our Web Communications team is responsible for the entire campus web presence.
– Except specific applications such as PeopleSoft Portal, Email, and Blackboard.

• Maintaining a large set of pages gives us a much bigger picture of trends in usage

• We can see campus-wide trends over time, and real-time current usage

Basic Infrastructure

• Single physical Dell machine hosted with Rackspace
– Our centralized web team is responsible for the server
• Centralized Google Analytics
– Our centralized web team is responsible for all Google Analytics accounts

Some Stats

• Total HTTP requests per day (avg)
– .html: 620,000
– All files: 2,400,000
• Total pages on server
– .html: 70,002
• Total pages in CMS: 19,781
• We will be moving to a fully responsive template this summer

Not Just Web Analytics

• Web Analytics
– Who is viewing / How are they viewing?
• Server Analytics
• User / Staff Analytics
– From OU Campus Users “Custom Report”
• Page Freshness
– From OU Campus Pages “Custom Report”
– Page age vs. page views?
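One way to explore "page age vs. page views" is to join last-modified dates with view counts and flag old pages that still draw traffic. The sketch below is only an illustration: the slides do not show the format of the OU Campus report or the analytics export, so both inputs are hypothetical dictionaries, and the thresholds are arbitrary.

```python
from datetime import date


def stale_but_popular(last_modified, views, today, min_age_days=365, min_views=100):
    """Return (path, age_days, views) for old pages that still get traffic.

    last_modified: dict of path -> date (hypothetical export of the CMS report)
    views:         dict of path -> int  (hypothetical export of page-view data)
    """
    flagged = []
    for path, modified in last_modified.items():
        age_days = (today - modified).days
        page_views = views.get(path, 0)
        if age_days >= min_age_days and page_views >= min_views:
            flagged.append((path, age_days, page_views))
    # Most-viewed stale pages first
    return sorted(flagged, key=lambda row: row[2], reverse=True)
```

Pages at the top of the result would be natural candidates for a content review.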

Problems to be Solved

• Where are our major entry points?
– (page views / entry pages)
• What are people doing on our pages?
– (searches / events)
• Given that information, can we optimize our entry points for proper navigation?
• What types of devices are being used?

Problems to be Solved

• Volume of requests over time
• Previous year or term usage (especially 1st week of classes)
– Preferably predictive indicators

Entry Points and Page Views

• Data Sources:
– Apache Access Log Data
– Google Analytics
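As a rough illustration of pulling entry points out of access log data, the sketch below treats a request as an "entry" when its referrer is missing or points to another host. It assumes the Apache combined log format (request, referrer and user agent as the quoted fields); the function name and the heuristic itself are illustrative, not the team's actual method.

```python
from collections import Counter
from urllib.parse import urlparse

SITE_HOST = "www.fresnostate.edu"  # host taken from the URLs in the slides


def entry_pages(log_lines, site_host=SITE_HOST, top=10):
    """Count requested paths whose referrer is external or missing."""
    counts = Counter()
    for line in log_lines:
        fields = line.split('"')
        if len(fields) < 5:
            continue  # not in combined log format
        request, referrer = fields[1], fields[3]
        parts = request.split()
        if len(parts) < 2:
            continue  # malformed request line
        path = parts[1]
        # A referrer of "-" parses to an empty host, so it also counts as an entry
        if urlparse(referrer).netloc != site_host:
            counts[path] += 1
    return counts.most_common(top)
```

Comparing this list with Google Analytics landing pages is one way to cross-check the two data sources.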

Searches

• Apache access logs
– We can see searches if the referrer was our Google Search Appliance, and which page the user landed on
– Regular search terms from Google are now hidden.
• Google Analytics (sometimes)
– GA does provide some searches
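The referrer-based approach can be sketched as follows: when a request's referrer is the search appliance, read the search term from the referrer's query string and pair it with the page the user landed on. The hostname `search.example.edu` is a placeholder, and reading the term from a `q` parameter is an assumption about how the appliance builds its result URLs; the log is assumed to be in Apache combined format.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

SEARCH_HOST = "search.example.edu"  # placeholder; the real appliance host is not in the slides


def search_landings(log_lines, search_host=SEARCH_HOST):
    """Count (search term, landing path) pairs for requests referred by the search host."""
    counts = Counter()
    for line in log_lines:
        fields = line.split('"')
        if len(fields) < 5:
            continue  # not in combined log format
        parts = fields[1].split()
        if len(parts) < 2:
            continue  # malformed request line
        referrer = urlparse(fields[3])
        if referrer.netloc != search_host:
            continue
        terms = parse_qs(referrer.query).get("q")  # assumed parameter name
        if terms:
            counts[(terms[0].strip().lower(), parts[1])] += 1
    return counts
```

Calling `search_landings(lines).most_common(10)` would list the most frequent term and landing-page pairs.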

Searches

• Which page did the user land on?
• What is the user searching for?
• Did the user click on what we wanted them to click on?
• Search vs. Navigation?
– Nielsen Norman Group says 18-24 year olds display search-dominant behavior.
– Converting Search Into Navigation

Event Tracking

• JavaScript / DOM events captured by GA
– Catalog Tabs

$(document).ready(function () {
    // 800 ms delay before binding, presumably so the tab widget has rendered
    window.setTimeout(function () {
        var tabs = $('#tabsaccordion-0-tab-0').parent().children();
        for (var i = 0; i < tabs.length; i += 1) {
            tabs.eq(i).on('click', function () {
                // Category: 'Tab Click'; action: page URL with the catalog prefix removed; label: tab text
                ga('send', 'event', 'Tab Click',
                    location.href.replace("http://www.fresnostate.edu/catalog", ""),
                    $(this).text().toLowerCase());
            });
        }
    }, 800);
});

• http://www.fresnostate.edu/catalog/subjects/biology/biology-ms.html#courses

Event Tracking

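• JavaScript / DOM events captured by GA
– Map Checkboxes

// Bind to every element whose id ends in "-cb"
$("[id$=-cb]").each(function () {
    $(this).change(function () {
        // Label is the checkbox id plus its new checked state, e.g. "<id> true"
        var sFormattedMessage = $(this).attr('id') + " " + $(this).is(':checked');
        // Classic GA syntax; the final `true` marks the event as non-interaction
        _gaq.push(['_trackEvent', 'CheckBox', 'Use', sFormattedMessage, null, true]);
    });
});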

• http://www.fresnostate.edu/map/

Event Tracking - Errors

• DOM errors with Google Analytics
• Classic:

window.onerror = function (message, file, line) {
    var formattedMessage = '[' + file + ' (' + line + ')] ' + message;
    _gaq.push(['_trackEvent', 'Exceptions', 'Application', formattedMessage, null, true]);
};

• Universal:

window.onerror = function (message, file, line) {
    var formattedMessage = '[' + file + ' (' + line + ')] ' + message;
    ga('send', 'event', 'Exceptions', 'Application', formattedMessage);
};

Errors and Such

• Top error pages / documents
– From Apache error log
• Large images embedded in pages
– Python and Bash
• Large images stored on server
– find /var/www/htdocs/ -size +10M -exec ls -lah {} \;
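The `find` command above lists every file over 10 MB. A comparable check in Python, narrowed to image files, might look like the sketch below. The extension list and function name are illustrative choices, not something shown in the slides.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")  # illustrative list


def large_images(root, min_bytes=10 * 1024 * 1024):
    """Return (size, path) for image files under root larger than min_bytes, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(IMAGE_EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size > min_bytes:
                found.append((size, path))
    return sorted(found, reverse=True)
```

For example, `large_images("/var/www/htdocs/")` would scan the same tree as the `find` command.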

Mobile Users

• Our IT Strategic Plan states that we should ensure our infrastructure (hardware, software, network and support services) is adequate to sustain the widening use of smartphones, tablets, and laptops

• How much of our traffic is actually coming from mobile devices ?
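A first estimate of the mobile share can come straight from the access log by inspecting the user-agent field. The sketch below assumes the Apache combined log format and uses a deliberately crude user-agent pattern; both the pattern and the function name are illustrative, and Google Analytics device reports will classify devices more carefully.

```python
import re

# Crude heuristic: substrings commonly found in mobile browser user agents
MOBILE_HINT = re.compile(r"Mobi|Android|iPhone|iPad|iPod", re.IGNORECASE)


def mobile_share(log_lines):
    """Return the fraction of parseable requests whose user agent looks mobile."""
    total = 0
    mobile = 0
    for line in log_lines:
        fields = line.split('"')
        if len(fields) < 7:
            continue  # no quoted user-agent field
        total += 1
        if MOBILE_HINT.search(fields[5]):
            mobile += 1
    return mobile / total if total else 0.0
```

Running it separately on requests for the homepage, catalog, or map would give per-section figures.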

Mobile Users – Homepage

Mobile Users – Help Center

Mobile Users – Student Affairs

Mobile Users – Catalog

Mobile Users – Map

Server Analytics

• CPU load
• Incoming and outgoing bandwidth
• Outgoing mail
• Breach attempts
• Concurrent connections sampling

Established Connections

• Get a count of all established connections to your Apache web server:
– netstat -pant | grep httpd | grep -c ESTAB
• Get a count of all connections that are in a waiting state:
– netstat -pant | grep httpd | grep -c WAIT
• Every 5 minutes, both counts are added to a JSON file named with today's date
• http://www.fresnostate.edu/web-stats/apache.php?div=performance
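The sampling step can be sketched in Python as below. It mirrors the two `grep` filters on `netstat` output and appends each sample to a JSON file named for the current date. The JSON layout (a list of objects with `time`, `established`, and `waiting` keys) and the function names are assumptions; the slides do not show the real file format.

```python
import json
import os
from datetime import datetime


def count_states(netstat_output, process="httpd"):
    """Mirror `grep httpd | grep -c ESTAB` and `grep httpd | grep -c WAIT`."""
    established = 0
    waiting = 0
    for line in netstat_output.splitlines():
        if process not in line:
            continue
        if "ESTAB" in line:
            established += 1
        if "WAIT" in line:
            waiting += 1
    return {"established": established, "waiting": waiting}


def append_sample(netstat_output, directory=".", now=None):
    """Append one sample to <directory>/YYYY-MM-DD.json and return the file path."""
    now = now or datetime.now()
    path = os.path.join(directory, now.strftime("%Y-%m-%d") + ".json")
    try:
        with open(path) as fh:
            samples = json.load(fh)
    except FileNotFoundError:
        samples = []  # first sample of the day
    sample = count_states(netstat_output)
    sample["time"] = now.strftime("%H:%M")
    samples.append(sample)
    with open(path, "w") as fh:
        json.dump(samples, fh)
    return path
```

In practice the input could come from `subprocess.run(["netstat", "-pant"], capture_output=True, text=True).stdout`, with the script scheduled every five minutes from cron.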

Google Analytics

• Popular Pages / Entry Points
• Unique Page Views
• Device Usage
• Bounce Rates
• Click / Event Tracking
• window.onerror event tracking

Real-Time Server Analytics

• # during the time period of 10:01 central time, what were the top 10 referrers
– grep '2014:10:01' /logs/web/apache/www.fresnostate.edu/access.2014-03-10-00_00_00 | awk -F\" '{print $4}' | sort | uniq -c | sort -nr | head -10
• # top 10 referrers from the last 1000 requests
– tail -1000 /logs/web/apache/www.fresnostate.edu/access.2014-03-10-00_00_00 | awk -F\" '{print $4}' | sort | uniq -c | sort -nr | head -10
• # top 10 visited pages of the last 1000 requests
– tail -1000 /logs/web/apache/www.fresnostate.edu/access.2014-03-10-00_00_00 | awk -F\" '{print $2}' | sort | uniq -c | sort -nr | head -10
• # top 10 most requested jpg files of the last 1000 requests
– tail -1000 /logs/web/apache/www.fresnostate.edu/access.2014-03-10-00_00_00 | grep 'jpg' | awk -F\" '{print $2}' | sort | uniq -c | sort -nr | head -10
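All four pipelines share one pattern: split each log line on double quotes, pick a field, and count occurrences. With `awk -F\"` on an Apache combined-format line, `$2` is the request line and `$4` is the referrer. A Python equivalent of that pattern might look like this sketch; the function and constant names are illustrative.

```python
from collections import Counter

# Zero-based positions after splitting on '"'; awk's $2, $4 and $6
REQUEST, REFERRER, USER_AGENT = 1, 3, 5


def top_values(log_lines, field, n=10, contains=None):
    """Return the n most common values of one quoted field.

    contains: optional substring filter, like piping through grep first.
    Lines without the requested field are skipped (awk would print an empty string).
    """
    counts = Counter()
    for line in log_lines:
        if contains is not None and contains not in line:
            continue
        parts = line.split('"')
        if len(parts) > field:
            counts[parts[field]] += 1
    return counts.most_common(n)
```

For instance, `top_values(lines, REQUEST, contains="jpg")` corresponds to the last pipeline above when `lines` holds the final 1000 lines of the log.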

Our Home Grown Dashboard

• http://www.fresnostate.edu/web-stats/
• Uses:
• JSON – aggregated and collected daily
• jQuery async calls to RESTful “Web Services”
• Highcharts – for graphing
• Uses GAPI for accessing Google Analytics data via web service (nothing active now…)
• Still an active prototype

Our GA Dashboard

• http://www.google.com/analytics/


Where do we go from here ?

• Combining data from OU Campus “Custom Reporting”, and our server analytics…

• What pages are not being used?
• Pattern detection…
– “big data”?
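Finding unused pages could start as a simple set difference between the pages the CMS knows about and the paths that appear in the access logs. The sketch below assumes two hypothetical inputs, a list of CMS page paths and a list of requested paths, since the slides do not show either export.

```python
def unused_pages(cms_paths, requested_paths):
    """Return CMS paths that never appear among the requested paths.

    Query strings are stripped from requested paths before comparing.
    """
    requested = {path.split("?", 1)[0] for path in requested_paths}
    return sorted(set(cms_paths) - requested)
```

The longer the log period covered by `requested_paths`, the more meaningful the result, since seasonal pages may go unvisited for months.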

Questions ???