GOAT Enterprise Search
Revorg GOAT Search Solution (Powered by Lucene)

About Me
  Grover Fields
  Revorg, LLC (Owner)
  M.S. Information System (Troy University)
  B.S. Industrial Engineering (Florida A&M University)
  Stanford Project Management Courses

About Me
  10+ years of development, analysis, and implementation
  10+ years of ColdFusion experience
  2+ years of Java experience
  Commonspot, Strongmail, ClickFix (Developer)
  Email: firstname.lastname@example.org
  Web site: http://www.groverfields.com

Agenda
  What?
    What can we do with GOAT?
  Why?
    Why do we want to use GOAT and not Verity?
  How?
    How do we do that?
  Conclusion and alternative solutions

What
  What is a search engine?
    Builds an index on text
    Answers queries using that index, a la Verity
    Existing database already
  A search engine offers:
    Scalability
    Relevance ranking
    Tweaking
    Integrates different sources (email, web pages, files, DATABASES)

What is a search engine? (cont.)
  Works on words, not on substrings
    Auto != automatic, automobile
  Indexing process:
    Convert document
    Extract text and metadata
    Normalize text
    Write (inverted) index

Apache Lucene Overview
  Lucene Java 2.4
  A high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
  No GUI
  http://lucene.apache.org

Apache Lucene Overview
  Java library for indexing and searching
  No dependencies
  Works with Java 1.4 or later
  Input for indexing: Document objects
  Each document: a set of Fields, each with a field name and field content
  Stores its index as files on disk or in memory
  No document converters
  No web crawler

Lucene Java users
  HBCU.info
  LinkedIn
  IBM OmniFind Yahoo! Edition
  Technorati.com
  Eclipse
  Monster.com

Lucene Java Summary
  Java library for indexing and searching
  Lightweight / no dependencies
  Powerful, fast, and tested!
  No document conversion
  No GUI

Why?
  Cost of Enterprise Search Solution
  Need for search speed
  Java projects to work on
  Things to do

Verity Limitations
  10,000 documents for ColdFusion Developer Edition
  125,000 documents for ColdFusion Standard Edition
  250,000 documents for ColdFusion Enterprise Edition
  What do developers do in a shared hosting environment?
  Is it possible for the hosting company to limit the number of documents per Web site?
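
To make the earlier "works on words, not on substrings" point concrete, here is a minimal sketch of the normalize and write-inverted-index steps in plain Java. It is not Lucene code and not the GOAT implementation; the class name InvertedIndexSketch and its methods are hypothetical, and document conversion and text extraction are assumed to have already produced a plain string.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, for illustration): word -> ids of documents containing it. */
final class InvertedIndexSketch {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Normalize text: lower-case it, then split on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Write step: record the document id under every word the text contains. */
    void add(int docId, String text) {
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Query step: exact lookup of one whole word; no scan over document text. */
    Set<Integer> search(String word) {
        String term = word.toLowerCase(Locale.ROOT).trim();
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Auto repair tips");
        index.add(2, "Automatic transmission guide");
        index.add(3, "Automobile history");
        System.out.println(index.search("auto"));      // [1]
        System.out.println(index.search("automatic")); // [2]
    }
}
```

A query for "auto" finds only document 1, because "automatic" and "automobile" are different words in the index. A real engine such as Lucene layers analyzers, ranking, and on-disk storage on top of this basic idea.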

T-SQL Limitations?
  Search for Yahoo on my blog
    SELECT entry.id
    FROM tbl_mango_entry AS entry
    INNER JOIN tbl_mango_post AS post ON entry.id = post.id
    WHERE entry.blog_id = 'default'
      AND (entry.title LIKE '%yahoo%'
           OR entry.content LIKE '%yahoo%'
           OR entry.excerpt LIKE '%yahoo%')
      AND post.posted_on