SharePoint 101:
Which SharePoint Search is Right for You?
Miles Kehoe
New Idea Engineering
SharePoint and FAST Search
Introduction
The Products
Grand Tour
Tech Details
What Really Counts
Mapping Business Requirements to Technology
Data Capacities
Price
Search Resources
New Idea Engineering Inc.
Company Background
– Founded in 1996
– Headquarters: Santa Clara, California
– Customers in Europe and North America
Vendor neutral approach to search
– Focus is on what is best for our clients
Products and Consulting Services:
– Evaluation/Selection/Implementing Enterprise Search
– Search Best Practices
– SearchTrack Reporting & Analytics
– Search Data Quality Toolkit
– Enterprise Search Newsletter
The Book
Published Fall 2010
Covers Search:
– Business
– SharePoint
– FS4SP
– ESP 5.3
– Trends
SharePoint Search Products: The Grand Tour
FAST Search Server 2010 for Internet Sites
FAST Search Server 2010 for SharePoint
Microsoft Search Server 2010
SharePoint Server 2010
Microsoft Search Server 2010, Express
SharePoint Server 2010 for Internet Sites, Standard
SharePoint Server 2010 for Internet Sites, Enterprise
SharePoint Foundation 2010
Business Productivity (server/(e)CAL licensing) / Internet Business (server licensing)
Stand-alone / Integrated with SharePoint
Entry Level / Infrastructure / High End
Naming Conventions
FAST Search Server 2010 for Internal Applications
FAST Search Server 2010 for SharePoint
Internet Business (server licensing)
Stand-alone / Integrated with SharePoint
Entry Level / Infrastructure / High End
The ‘SharePoint’ Codebase
Business Productivity (server/(e)CAL licensing)
Microsoft Search Server 2010
SharePoint Server 2010
Microsoft Search Server 2010, Express
SharePoint Server 2010 for Internet Sites, Standard
SharePoint Server 2010 for Internet Sites, Enterprise
SharePoint Foundation 2010
FAST Search Server 2010 for Internal Applications
FAST Search Server 2010 for Internet Sites
Product Names
Full name | Short name | Abbreviation
SharePoint Codebase
Microsoft SharePoint Foundation 2010 | SharePoint Foundation 2010 | SPF
Microsoft Search Server Express 2010 | Search Server 2010 Express (MSS Express) | MSS-X
Microsoft Search Server 2010 | Search Server 2010 | MSS
Microsoft SharePoint Server 2010 | SharePoint Server 2010 | SP
Microsoft SharePoint Server 2010 for Internet Sites, Standard | SharePoint Server 2010 FIS | SP-FIS
Microsoft SharePoint Server 2010 for Internet Sites, Enterprise | SharePoint Server 2010 FIS-E | SP-FIS-E
Hybrid Codebase
Microsoft FAST Search Server 2010 for SharePoint | FAST Search for SharePoint | FS4SP
FAST Codebase
Microsoft FAST Search Server 2010 for Internal Applications | FAST Search Internal Applications | FSIA
Microsoft FAST Search Server 2010 for Internet Sites | FAST Search for Internet Sites | FSIS
Search Server
SharePoint Server for Internet Sites
FAST Search for SharePoint Internet Sites
SharePoint Server
FAST Search for SharePoint
FAST Search for Internal Applications
FAST Search For Internet Sites
Solutions for Internet Business
Solutions for Business Productivity
Integrated with
SharePoint
Stand-alone
Entry-Level Solutions
Search Server Express
SharePoint Foundation
The Marketing Fantasy
In Summary
• Two entry level (SPF, MSS-X), three infrastructure-tier (SP, SP-FIS, MSS), four high end (FS4SP, FS4SP-IS, FSIA, FSIS)
• Four stand-alone (MSS-X, MSS, FSIA, FSIS), five integrated with SharePoint (SPF, SP, SP-FIS, FS4SP, FS4SP-IS)
• Three intended/licensed for internally facing applications (SP, FSIA, FS4SP), three intended/licensed for externally facing applications (SP-FIS, FS4SP-IS, FSIS)
• Six different images/media sets (SPF, MSS/MSS-X, SP/SP-FIS/SP-FIS-E, FS4SP/FS4SP-IS, FSIA, FSIS)
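The groupings in this summary can be checked mechanically. The following is a minimal sketch that encodes the nine abbreviations with the tier, deployment, and audience given in the bullets above; the `Product` class and the helper names are illustrative assumptions, and an empty audience means the summary does not name one for that product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    abbrev: str
    tier: str        # "entry", "infrastructure", or "high-end"
    deployment: str  # "stand-alone" or "integrated" (with SharePoint)
    audience: str    # "internal", "external", or "" if the summary names none


# Values transcribed from the summary bullets; nothing here comes from a Microsoft API.
PRODUCTS = [
    Product("SPF", "entry", "integrated", ""),
    Product("MSS-X", "entry", "stand-alone", ""),
    Product("MSS", "infrastructure", "stand-alone", ""),
    Product("SP", "infrastructure", "integrated", "internal"),
    Product("SP-FIS", "infrastructure", "integrated", "external"),
    Product("FS4SP", "high-end", "integrated", "internal"),
    Product("FS4SP-IS", "high-end", "integrated", "external"),
    Product("FSIA", "high-end", "stand-alone", "internal"),
    Product("FSIS", "high-end", "stand-alone", "external"),
]


def count_by(field):
    """Count products per value of one attribute (e.g. 'tier')."""
    return Counter(getattr(p, field) for p in PRODUCTS)


def select(**criteria):
    """Return abbreviations of products matching every given attribute."""
    return [
        p.abbrev
        for p in PRODUCTS
        if all(getattr(p, key) == value for key, value in criteria.items())
    ]


if __name__ == "__main__":
    print(count_by("tier"))
    print(count_by("deployment"))
    print(select(tier="high-end", deployment="stand-alone"))
```

Filtering on two or three attributes at once, as `select` does, is usually enough to narrow nine names down to one or two candidates.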
Confused Yet?
The Products
SP and FS4SP Platforms
High Level Overview: SP vs FS4SP
Columns: Area, Feature, SharePoint, FAST
Conversational UI: Basic refiners; Deep, customizable refiners; Social definitions/social tag ranking; Document preview and thumbnails; Best Bets; Visual and Contextualized Best Bets
Ranking: Great ranking, improves with use
Customization: Basic ranking customization; Extensive ranking schema management; Keyword-based document boosts
Sorting
Contextualization: Query-based; Group-based
SharePoint Server and FS4SP
SP/FS4SP: Two different codebases united by (mostly) common features
• SharePoint environment
• PowerShell
• iFilters
• OOB experience
FS4SP adds
• Index Pipeline
• Deep facets
• Geo/Location Search
• FAST XRANK operator
• Predictable scalability
• …
Most visual & dynamic user experience
• Thumbnails & Previews
• Visual Best Bets
• Deep refiners with counts
• User context from user profile
• Multiple relevance profiles
• Sorting on any property
• Similarity Search
• Broader, better language support
• Richer query language
SP and FS4SP: End users
Great OOB search experience
• Excellent OOB relevance
• Federated results
• Metadata based refiners
• Query Suggestions, Did You Mean & Related Searches
• View in browser
• People search
• Phonetic & nickname matching
• Social behavior improves relevance
SP and FS4SP: IT
Highly capable and easy to manage
Same capabilities, plus:
• Extreme scale-out (500M docs)
• Content Processing pipeline
• Entity extraction
• Tunable relevance ranking
• Easy setup of User Context, Visual Best Bets, Promotion/Demotion
• Easy to configure sorting and refinement
Enterprise class search system integrated with SharePoint
• Enterprise Scale-out (to 100M docs)
• Full Fault Tolerance
• Native 64 bit; Hyper-V support
• Wizard-driven installation
• Consolidated search dashboard
• PowerShell support
• SCOM support
• Full search reporting
• Full set of connectors OOB
• Easy to add new sources via BDC
The power to build a whole new class of search applications
Same capabilities, plus:
• Public web parts with high end capabilities
• Add custom property extractors
• Extend content processing
• Include external data in relevance
• Build multiple relevance profiles
• Extend user context
• Use advanced query capabilities to create powerful applications
SP and FS4SP: Developers
Open/flexible search platform
• Public web parts
• New connector framework
• Integrated with Business Connectivity Services (BCS)
• Integrate search with BI, Workflow, Social, & Collaboration
• BDC tooling built into SharePoint Designer
• Application tooling in VS2010
Powerful FS4SP Features
User
• Deep refiners with counts
• Context based user profile
• Multiple relevance profiles
• Sorting on any property
• Similarity Search
• Broader, better language support
• Richer query language
IT
• Content Processing pipeline
• Entity extraction
• Tunable relevance ranking
Developers
• Add custom property extractors
• Extend content processing
• Include external data in relevance
• Build multiple relevance profiles
Indexing Pipeline
Format Conversion -> Language Detection -> Entity Extraction -> Lemmatization -> Mapper -> … -> FS4SP Index
FS4SP
• Stages coded in .Net
• Configure via UI or PowerShell
• Custom stages before ‘Mapper’
• Runs in sandbox w/ timeout
FS4SP
• Stages coded in Python (any CLI language OK)
• Configure via XML config file
• Custom stages allowed anywhere
• Runs in-line
Default
• Format Conversion
• Language and encoding detection
• Lemmatizer
• Tokenizer
• Entity Extraction
• DateTimeNormalizer
• Vectorizer
• WebAnalyzer
• PropertiesMapper
• PropertiesReporter
Optional
• XML Properties mapper
• Offensive Content Filter
• Verbatim (whole word) extractor: loads dictionary for custom extraction, e.g. product names
• Field Collapsing
• Entity Extraction ‘Persons1’
Typical Content Pipeline Stages
The Role of the Pipeline
The Content Pipeline: Processing & refinement
… -> Format Conversion -> Language Detection -> Entity Extraction -> Configurable Stages -> Mapper
REDMOND, Wash., and OSLO, Norway — Jan. 8, 2008 Microsoft Corp. (Nasdaq “MSFT”) today announced that it will make an offer to acquire Fast Search & Transfer ASA (OSE: “FAST”), a leading provider of enterprise search solutions, through a cash tender offer for 19.00 Norwegian kroner (NOK) per share. This offer represents a 42 percent premium to the closing share price on Jan. 4, 2008 (the last trading day prior to this announcement), …
Location
Date
Company
Lookaside
The ‘Virtual Document’
The initial document text content plus:
• Explicit metadata (Title/Author/etc.)
• Implicit metadata (Path/repository/filename)
• Look-aside content from pipeline (synonyms/taxonomies/other mark-up)
• Anything you can add to ‘make the needle bigger’
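The idea of a pipeline that enriches a document into a ‘virtual document’ can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FS4SP or ESP stage API: the stage functions, the `SYNONYMS` table, and the sample path are all hypothetical, and each stage simply takes a dict and returns it with more fields.

```python
import posixpath

# Hypothetical look-aside data; a real deployment would load synonyms or taxonomies.
SYNONYMS = {"acquire": ["buy", "purchase"]}


def add_implicit_metadata(doc):
    """Derive implicit metadata (repository, filename) from the document path."""
    path = doc["path"]
    doc["repository"] = path.split("/", 1)[0]
    doc["filename"] = posixpath.basename(path)
    return doc


def add_lookaside(doc):
    """Attach synonyms for any word in the text that has an entry in SYNONYMS."""
    words = {w.strip(".,()").lower() for w in doc["text"].split()}
    doc["lookaside"] = sorted(s for w in words for s in SYNONYMS.get(w, []))
    return doc


def run_pipeline(doc, stages):
    """Pass the document through each stage in order."""
    for stage in stages:
        doc = stage(doc)
    return doc


def searchable_text(doc):
    """The 'virtual document': original text plus metadata plus look-aside terms."""
    parts = [doc["text"], doc.get("title", ""), doc["repository"], doc["filename"]]
    return " ".join(parts + doc["lookaside"])


doc = run_pipeline(
    {
        "path": "intranet/press/2008/fast-offer.html",  # hypothetical path
        "title": "Microsoft to acquire FAST",           # explicit metadata
        "text": "Microsoft Corp. today announced that it will make an offer "
                "to acquire Fast Search & Transfer ASA.",
    },
    [add_implicit_metadata, add_lookaside],
)

if __name__ == "__main__":
    print(searchable_text(doc))
```

A query for "purchase" would now have something to match even though the word never appears in the original text, which is the sense in which the added content makes the needle bigger.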
Entity Extraction
FS4SP
• Create/Edit Config Files
• Update files (Persons/Places/Things)
• Wait for system to update (5 minutes)
• Voila!!
ESP
• Create/Edit dictionary file
• Compile dictionary w/ ‘dictupdate’
• Bazinga!
Dictionaries
• Names, Companies, Job titles
• Samples provided in many languages
• Whitelist and blacklist
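To show what a whitelist and a blacklist do in dictionary-based entity extraction, here is a minimal Python sketch. It is an assumption-laden illustration and does not reproduce the FS4SP config files or the ESP dictionary format: `extract_entities` and the sample term lists are hypothetical, and matching is plain case-insensitive whole-phrase search.

```python
import re


def extract_entities(text, whitelist, blacklist=()):
    """Return whitelist terms found in text as whole words or phrases.

    Terms that also appear in the blacklist are never extracted.
    Results are ordered by first position in the text.
    """
    blocked = {term.lower() for term in blacklist}
    hits = []
    for term in whitelist:
        if term.lower() in blocked:
            continue
        # Lookarounds keep "FAST" from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), term))
    return [term for _, term in sorted(hits)]


SAMPLE = (
    "Microsoft Corp. today announced that it will make an offer to acquire "
    "Fast Search & Transfer ASA, a leading provider of enterprise search solutions."
)

# Hypothetical company dictionary; "FAST" is blacklisted as too ambiguous on its own.
COMPANIES = ["Microsoft", "Fast Search & Transfer", "FAST"]

if __name__ == "__main__":
    print(extract_entities(SAMPLE, COMPANIES, blacklist=["FAST"]))
```

Without the blacklist, the short term "FAST" would also be reported, since it matches the first word of the longer company name.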
Deep Refiners
SP
• Shallow refiners only
• No count provided
• Only managed properties/metadata
• Uses top 50 results to populate refiners
FS4SP
• Provides document counts with refiners
• All refiners shown (based on config)
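The practical difference between shallow and deep refiners is easy to see with a small simulation. The sketch below is a Python illustration under assumptions, not either product's implementation: the function names and the synthetic result set are hypothetical, and the only figure taken from the slide is the sample of the top 50 results.

```python
from collections import Counter


def shallow_refiner_values(results, prop, sample_size=50):
    """Distinct values of prop seen in the first sample_size results; no counts."""
    seen = []
    for doc in results[:sample_size]:
        value = doc.get(prop)
        if value is not None and value not in seen:
            seen.append(value)
    return seen


def deep_refiner_counts(results, prop):
    """Exact document count per value of prop across the whole result set."""
    return Counter(doc[prop] for doc in results if prop in doc)


# Synthetic, relevance-ordered result set: the top hits all happen to be .docx files.
results = (
    [{"filetype": "docx"} for _ in range(60)]
    + [{"filetype": "pdf"} for _ in range(40)]
    + [{"filetype": "pptx"} for _ in range(5)]
)

if __name__ == "__main__":
    print(shallow_refiner_values(results, "filetype"))
    print(deep_refiner_counts(results, "filetype"))
```

In this example the shallow approach offers only "docx" as a refiner, while the deep approach lists all three file types with their counts, including the 45 documents the sample never saw.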
FS4SP Result List
ESP Platforms
FAST ESP
FAST ESP 5.3
• Multiple Platform Support
• Extreme Scalability
• Petabytes of Content
• >10K QPS
FSIA = ESP
• Identical to FAST ESP 5.3
• ‘Internal Applications’
• License based on ECAL + Server
FSIS = FAST ESP 5.3 + IMS / CTS
• IMS/CTS - Windows only (Visual Studio)
• ‘Internet Sites’
• License based on servers
ESP Platform
Search development environment
• ‘Some assembly required’
• Not based on SharePoint
• Connectors and WebParts available
• No OOB end-user UI
• Separate user and IT consoles
• Fully customizable pipeline
Development Tools
• Java/Python/PostgreSQL
• GUI and CLI (no PowerShell support)
Scale
• Petabytes of Content
• 10K+ QPS
ESP Admin Console
ESP Search Business Console
Search View: Results
Search View: Debugging
Search View: Debug Output
New FSIS Products
Content Transformation Services: CTS
• Design workflows for indexing from multiple sources
Interaction Management Services: IMS
• Manage query/results processing
CTS and IMS:
• Based on Visual Studio: Windows Only!
• Licensed/Intended for FSIS ONLY (for now)
Content Transformation Services
Interaction Management Services
ESP Customer: Best Buy
ESP Customer: Financial Times
But… what’s really important?
What does your customer expect?
Mapping Business Requirements to the Technology
Identify business rules for facets/refiners
If refiners are a business need:
• Choose to index the appropriate metadata fields; or
• Upgrade your data
If refiners are not a business need:
• Push the business rules into the 21st century
• Use what you have; or
• Update your content
The Data Audit
Understand the data
Repositories
• Where does the content live?
• Is there security involved?
Documents & document structure
• Do documents have good metadata?
• Do you need to extract data?
• Are there recognizable blocks of content?
Lightweight publishing content
• Look at email/wikis/blogs/support calls…
Great search doesn’t just happen…
Do search owners understand?
Staffing Expectations
• Is there a search manager? An SCOE?
• How many people will be involved day to day?
How to manage search?
• Review activity logs
• Update and manage best bets, new content
• Evangelize
User Skills
• Are users knowledge workers or casual searchers?
• Is search business critical?
So what is the right answer?
It depends
• Internal or external facing search?
• SharePoint or stand-alone?
• Casual users or knowledge workers?
• Resources for managing search?
• Search box or search derived application?
• What’s the risk of missing content?
That’s why your customer needs you!
Resources
Search Dev Newsgroup: www.SearchDev.org
Newsletter & Whitepapers: www.ideaeng.com/current, www.ideaeng.com/wp
Blog: EnterpriseSearchBlog.com