WCMS and Campus Web Search Updates
Campus Search
• Previous campus search used free search providers
• It switched between providers as daily usage limits were exceeded
• Started with Google, then Yahoo then Bing
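The provider switching described above can be pictured as a simple quota-based fallback. This is a minimal sketch, not the code the old campus search actually ran: the `Provider` class, the `pick_provider` function, and the daily limits are all hypothetical, since the slides do not give the real quotas.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    daily_limit: int  # hypothetical quota; real limits are not stated in the slides
    used_today: int = 0

    def has_quota(self) -> bool:
        return self.used_today < self.daily_limit


def pick_provider(providers):
    """Return the first provider with quota left, or None if all are exhausted."""
    for provider in providers:
        if provider.has_quota():
            provider.used_today += 1
            return provider
    return None


# Order follows the slide: Google first, then Yahoo, then Bing.
providers = [Provider("Google", 2), Provider("Yahoo", 1), Provider("Bing", 1)]

served_by = []
for _ in range(5):
    chosen = pick_provider(providers)
    served_by.append(chosen.name if chosen else None)

print(served_by)
```

Once every provider has hit its limit the function returns `None`, which is the failure mode a dedicated appliance avoids.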
#watitis2012
Campus Search
• Google Search Appliance went live in October
• Indexes web pages in the uwaterloo.ca domain as well as documents and LDAP records
Campus Search
• 1,000,000 document license
– 680,889 documents being served, of which 59,372 “documents” are people from UWLDAP
• The initial crawl quickly used up the license
– Duplicate and “retired” servers were omitted
– Revision history of wiki pages was ignored
– Some sites had pages that presented the same content in different orders
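As a quick check on the license figures above, the arithmetic can be worked through in a few lines. The numbers are taken directly from the slide; the variable names are only for illustration.

```python
LICENSE_LIMIT = 1_000_000    # documents allowed by the license
DOCUMENTS_SERVED = 680_889   # documents currently being served
PEOPLE_RECORDS = 59_372      # served "documents" that are UWLDAP people records

license_used = DOCUMENTS_SERVED / LICENSE_LIMIT
remaining = LICENSE_LIMIT - DOCUMENTS_SERVED
people_share = PEOPLE_RECORDS / DOCUMENTS_SERVED

print(f"{license_used:.1%} of the license in use, {remaining:,} documents of headroom")
print(f"{people_share:.1%} of served documents are people records")
```

So roughly two thirds of the license is in use, and a little under a tenth of the served documents are directory entries rather than pages or files.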
Campus Search
• Content breakdown
– HTML: 514,648
– PDF: 82,303
– Text files: 37,979
– PostScript: 4,938
– MS Word: 4,308
– MS PowerPoint: 2,496
– Flash: 2,140
– MS Excel: 399
Campus Search
• We can add suggestions for acronyms
Campus Search
• Or promote pages that should be prominent
Campus Search
• If you want to have something removed from the campus search, adjust your robots.txt file accordingly or hide it behind a login, and it will disappear over time
• If you need something removed immediately, submit an RT ticket
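Before waiting for a recrawl, it can help to confirm that a robots.txt rule really blocks the page you want removed. The sketch below uses Python's standard `urllib.robotparser` for that check. The robots.txt content, the `/drafts/` path, the `example.uwaterloo.ca` host and the `example-crawler` user agent are all assumptions made for illustration, not values from the campus setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that hides everything under /drafts/ from all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The host name and user agent are placeholders for illustration only.
drafts_crawlable = parser.can_fetch(
    "example-crawler", "https://example.uwaterloo.ca/drafts/plan.html"
)
about_crawlable = parser.can_fetch(
    "example-crawler", "https://example.uwaterloo.ca/about.html"
)

print("drafts page crawlable:", drafts_crawlable)
print("about page crawlable:", about_crawlable)
```

A page that reports as not crawlable here should drop out of the index on a later crawl; for anything urgent, the RT ticket route still applies.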
Campus Search
• Future plans
– Upgrade to Google Search Appliance 7.0
– Add different content type options on search page
– Faceted search
Google Search Appliance 7.0
• Universal Search
– Search content across silos
– New SharePoint connector
• Relevance and performance improvements
– Enhancements to Google’s algorithms
– Assisted navigation to refine search results
Google Search Appliance 7.0
• Document preview
– Not just HTML, also includes MS Word, PowerPoint and PDFs
– Good for mobile use as you do not have to download an entire PDF to know it is the one you want
Google Search Appliance 7.0
• Document translation and language capabilities
– Translate titles and snippets from inside the search appliance with support for 60 languages
– Better support for crawling languages such as Arabic and Japanese
Google Search Appliance 7.0
• Index compressed files
– Now crawls and indexes compressed files in .zip, .tar, .tar.gz and .tgz
• Expert search
– Find subject matter experts on campus by searching on keywords
– Example: searching for “network security” shows a list of network security experts in the sidebar
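To make the compressed-file point concrete: indexing an archive means looking inside it and treating each member as a document. The sketch below does that with Python's standard `zipfile` and `tarfile` modules on small in-memory archives. It only illustrates the idea and says nothing about how the appliance itself is implemented; the function names and sample file names are hypothetical.

```python
import io
import tarfile
import zipfile


def list_zip_members(data: bytes):
    """Return the file names stored in a .zip archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return sorted(archive.namelist())


def list_tar_members(data: bytes):
    """Return the regular files in a .tar, .tar.gz or .tgz archive."""
    # Mode "r:*" lets tarfile detect whether the stream is gzip-compressed.
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


# Build two tiny sample archives in memory (contents are made up).
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("notes/readme.txt", "campus search")

tar_buffer = io.BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w:gz") as archive:
    payload = b"course outline"
    info = tarfile.TarInfo(name="outline.txt")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))

zip_names = list_zip_members(zip_buffer.getvalue())
tar_names = list_tar_members(tar_buffer.getvalue())
print(zip_names, tar_names)
```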
Other search enhancements
• Auto suggest on search fields
• Not just on the main search page, but on individual Drupal sites as well
Other search enhancements
• Adding content types to the Google search
Other search enhancements
• Faceted search
WCMS Updates
• Lily Yan - New content types
• Liam Morland - Opening up some sites so all CAS users can authenticate
• Chris Shantz - Prototypes for the IT Strategic Plan site and access control with Organic Groups
• Kris Olafson - Feature requests site
Six content types
• Project
• Service
• Graduate award
• Teaching tip
• Grebel publication
• Exchange board
Project content type
• Proposed, ongoing and completed projects
• It includes project title, description, members (name, role), status, audience and topic
• Projects can be searched by status, topic and audience
• Initially the IST pilot site will use this content type
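Searching projects by status, topic and audience amounts to filtering on those fields. The sketch below shows the idea in plain Python; it is not the Drupal implementation. The `Project` class, the `filter_projects` helper and the sample projects are hypothetical, with field names borrowed from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    title: str
    status: str
    topic: str
    audience: str


def filter_projects(projects, **facets):
    """Keep projects whose fields equal every supplied facet value."""
    return [
        project
        for project in projects
        if all(getattr(project, name) == value for name, value in facets.items())
    ]


# Made-up sample data for illustration only.
projects = [
    Project("Search upgrade", "ongoing", "web", "staff"),
    Project("Portal pilot", "proposed", "web", "students"),
    Project("Wireless refresh", "completed", "network", "staff"),
]

ongoing = filter_projects(projects, status="ongoing")
web_for_students = filter_projects(projects, topic="web", audience="students")
print([p.title for p in ongoing], [p.title for p in web_for_students])
```

Each extra keyword narrows the result set further, which is the same behaviour a visitor sees when combining filters on a listing page.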
Service content type
• Services offered at the University of Waterloo
• It includes service name, description, cost, support information, link to the service, audience and category
• Can be searched by categories and audience
• IST website will use this content type
Graduate award content type
• This is a custom content type for the graduate-studies site
• This content type includes all information about graduate awards
• Graduate awards can be searched by name, description, value, program, deadline, type, category and citizenship
Graduate award content type
• Site managers can view the graduate award report and download it as a .csv file
• Site managers can mass publish content
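For readers who have not worked with report exports, the sketch below shows what producing such a .csv file can look like using Python's standard `csv` module. The award rows and the choice of columns are made up for illustration, and `awards_to_csv` is a hypothetical helper; the real report is generated by the WCMS.

```python
import csv
import io

# Hypothetical rows; the column names echo some of the searchable fields.
awards = [
    {"name": "Example Award A", "value": "1000", "deadline": "2013-02-01", "type": "Scholarship"},
    {"name": "Example Award B", "value": "500", "deadline": "2013-03-15", "type": "Bursary"},
]


def awards_to_csv(rows):
    """Serialize award rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "value", "deadline", "type"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


report = awards_to_csv(awards)
print(report)
```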
Teaching tip content type
• A custom content type for the Centre for Teaching Excellence; each entry is a tip to help inform users
• This content type includes title, listing image, body, categories, tags and audience
• Teaching tips can be searched by keyword and tag, and by category
Grebel publication
• Grebel publication includes two content types (Grebel journal and Grebel journal publication)
• Full articles from past issues can be viewed in HTML and PDF, and the PDF can be downloaded
• Only the title and author's name can be viewed for the current issue
• Issues are archived by year
Exchange board content type
• This content type is for the recreation-committee site only
• It includes personal information and exchange information
• It can be searched by exchange type (for sale, for rent, free or wanted)
Authentication and Performance
• Authenticated traffic bypasses our caching servers (Varnish)
• Lack of caching contributed to our last outage
• The main site is fairly content-heavy compared to most, but we should still proceed with caution
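One way to picture why authenticated traffic misses the cache: a caching proxy generally stores responses only for anonymous requests, because a logged-in user may see personalised content. The sketch below models that rule in Python. It is not the site's Varnish configuration; `is_cacheable` is a hypothetical function, and it assumes a logged-in Drupal user carries a session cookie whose name starts with `SESS` or `SSESS`.

```python
def is_cacheable(method: str, cookies: dict) -> bool:
    """Simplified rule: only anonymous GET/HEAD requests may be served from cache."""
    # Assumption: a Drupal session cookie name begins with SESS (or SSESS over HTTPS).
    has_session = any(name.startswith(("SESS", "SSESS")) for name in cookies)
    return method in ("GET", "HEAD") and not has_session


anonymous_hit = is_cacheable("GET", {})
logged_in_hit = is_cacheable("GET", {"SESSabc123": "token"})
form_post_hit = is_cacheable("POST", {})
print(anonymous_hit, logged_in_hit, form_post_hit)
```

Every request that fails this check goes straight to the web servers, so opening more sites to authenticated users shifts load away from the cache.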
Prototypes for New Functionality
• On the IT Strategic Plan site, the following features are being evaluated:
– Organic Groups
– Commenting
– Inline Diffs
Organic Groups (OG)
● Enable users to create and manage their own 'groups'.
● Each group can have subscribers, and maintains a group home page where subscribers communicate amongst themselves. (from Drupal.org/project/og)
Organic Groups (OG)
● There are over 35,000 sites that use OG, including whitehouse.gov and groups.drupal.org
● OG allows us to have two sets of permissions for content on a site
● Can have private or public content
● Can have private or public groups
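The combination of private or public groups with private or public content can be pictured with a small visibility check. The sketch below is a deliberately simplified model and not OG's actual access logic: the `Group` and `Content` classes, the `can_view` rule, and the sample groups and users are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    private: bool
    members: set = field(default_factory=set)


@dataclass
class Content:
    title: str
    group: Group
    private: bool


def can_view(user: str, content: Content) -> bool:
    """Members see everything in their group; others see only public content in public groups."""
    if user in content.group.members:
        return True
    return not content.group.private and not content.private


# Made-up groups and content for illustration.
staff = Group("staff-planning", private=True, members={"alice"})
campus = Group("campus-feedback", private=False, members={"bob"})

draft = Content("Draft plan", staff, private=True)
summary = Content("Public summary", campus, private=False)
notes = Content("Working notes", campus, private=True)

print(can_view("alice", draft), can_view("bob", draft), can_view("alice", summary))
```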
Comments
● With authenticated access we can explore having comments available for certain content types
● Currently being used on the IT Strategic Plan site with OG content
Inline Diff
● Inline Diff lets you view changes between revisions on the node page itself, instead of having to go to the revisions page to see the difference
● The diff widget is in a block on the right; you choose the revision to review from a select box
● Changed text is yellow, added text is green, deleted text is grey
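The colour coding above maps naturally onto the three kinds of edit a diff algorithm reports. The sketch below uses Python's standard `difflib.SequenceMatcher` to classify word-level changes between two revisions. It illustrates the idea only and is not how the Drupal module computes its diff; the `highlight` function and the sample sentences are hypothetical.

```python
from difflib import SequenceMatcher

# Colours follow the slide: changed = yellow, added = green, deleted = grey.
COLOURS = {"replace": "yellow", "insert": "green", "delete": "grey"}


def highlight(old: str, new: str):
    """Return (colour, text) pairs for every word-level difference."""
    old_words, new_words = old.split(), new.split()
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Deleted words exist only in the old revision; otherwise show the new wording.
        words = old_words[i1:i2] if tag == "delete" else new_words[j1:j2]
        spans.append((COLOURS[tag], " ".join(words)))
    return spans


old_revision = "Search results appear slowly here"
new_revision = "Search pages appear here today"
changes = highlight(old_revision, new_revision)
print(changes)
```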
What’s next? You decide!
• In the new year, with our next major release, you will see a link appear on your site dashboards to request a new feature
• We will open up our project management site and you can watch our progress on your requests
Get involved!
• We are starting 3 working groups in collaboration with the Student Technology Services team
– Mobile (apps and web user experience)
– Student Portal
– Open Data
Questions?