Post on 18-Dec-2014
DirectEDGAR
Going Beyond Search: Post-Query Processing
A Case Study Presentation by Burch Kealey
Professional Researcher
Associate Professor, University of Nebraska at Omaha
Creator, directEDGAR
President, AcademicEDGAR+ LLC
Case Study
Using search as a back-office tool: finding a flexible search engine to manipulate results and assist in research processes
• Academic research is interesting but can be tedious
– Work analyzing the United States’ largest companies and how corporate governance affects their value
– Research requires identifying key bits of information from specific groups of publicly traded companies
• Much harder to accomplish than expected
– Work is tedious and time consuming
– Go to SEC site or use demo movie
• Decision = Search is the answer
• A search tool needs to be:
– Flexible in the back end
– Able to extract results, manipulate findings and re-enter them into the search tool for additional searching
– Powerful enough to search AND assist with analysis and processing of results
– Capable of advanced search functions, including strong text mining capabilities
– Affordable
• The search for search
– Evaluated X1 Technologies, Autonomy, Google and others
– None of the competitors were able to offer sophisticated search combined with the ability to process information on a post-query basis
– Hard-coded solution wouldn’t work: research needs change too frequently
– Volume of search results was overwhelming
• All search technologies seemed equally adept at finding the bits and pieces
• Focusing the search to only the relevant bits and pieces seemed impossible
• Search domain limited to SEC filings
• More than 20,000 different companies make these filings
• Research requires looking at a specific subset of that 20,000; the subset changes with every research project
• Other tools only allow very limited pre-search filtering, e.g. [“search term” and (company1 or company2 or company3 . . . companyX)]
• When X is less than about 20, no real problems
• In academic research X can be as large as 4,500
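To make the scaling problem with pre-search filtering concrete, a minimal Python sketch can build a query shaped like the bracketed example and measure how it grows with X. The placeholder company names and the `build_presearch_query` helper are hypothetical, and the query shape follows the slide's example only; no particular search engine's syntax is implied.

```python
def build_presearch_query(term, companies):
    """Build a query shaped like the slide's example:
    "search term" and (company1 or company2 or ... companyX)."""
    company_clause = " or ".join(companies)
    return f'"{term}" and ({company_clause})'


# Hypothetical placeholder names; a real project would use actual company identifiers.
small_sample = [f"company{i}" for i in range(1, 21)]     # X = 20
large_sample = [f"company{i}" for i in range(1, 4501)]   # X = 4,500

small_query = build_presearch_query("going concern", small_sample)
large_query = build_presearch_query("going concern", large_sample)

print(len(small_query), "characters for X = 20")
print(len(large_query), "characters for X = 4,500")
```

The query has to be rebuilt for every new research sample, and at X = 4,500 it is a single expression with thousands of clauses, which is the situation the pre-search approach handles poorly.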
• Current Research Project
– Ceteris paribus, are cross-sectional differences in audit fees associated with executive compensation structures more heavily weighted towards options?
– Interesting because prior research has found that executives of firms that have had financial reporting problems received more options than their peers without financial reporting problems.
– In a rational expectations equilibrium with competitive markets, auditors should anticipate the effect of this compensation strategy on financial reporting choices and price the risk competitively.
• Current Process: identify the sample
– 21,630 initial companies
– Filter criteria
• 12/31 year-end
• Trading on NYSE or NASDAQ
• No ADRs
• Market cap > 800 million
– 1,842 sample companies
– Need twelve data items
• Audit fees
• Compensation
• Going concern
• etc.
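The sample-identification step can be sketched in Python as a simple predicate over company records. This is an illustration under assumptions: the `Company` fields, the `in_sample` helper and the five records are all made up, and only the four filter criteria come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Company:
    # All field names are hypothetical; they mirror the slide's filter criteria.
    name: str
    fiscal_year_end: str        # "MM/DD"
    exchange: str
    is_adr: bool
    market_cap_millions: float


def in_sample(company):
    """Apply the four filter criteria listed on the slide."""
    return (
        company.fiscal_year_end == "12/31"
        and company.exchange in {"NYSE", "NASDAQ"}
        and not company.is_adr
        and company.market_cap_millions > 800
    )


# Made-up records for illustration only.
universe = [
    Company("Alpha Corp", "12/31", "NYSE", False, 1500.0),
    Company("Beta Inc", "06/30", "NASDAQ", False, 2500.0),   # wrong year-end
    Company("Gamma ADR", "12/31", "NYSE", True, 9000.0),     # ADR
    Company("Delta Ltd", "12/31", "NASDAQ", False, 300.0),   # below market-cap cutoff
    Company("Epsilon Co", "12/31", "NASDAQ", False, 950.0),
]

sample = [c for c in universe if in_sample(c)]
print([c.name for c in sample])
```

Because the criteria live in one function, a different research project can swap in a different predicate without touching the rest of the process.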
• Current Process: those companies in the sample with a Going Concern modification in their audit report
– Intuitive search interface
• In seconds ISYS found more than 7,500 potentially relevant documents out of more than 6,000,000 in the set
• Search yielded more than 7,500 documents.
• Even with an efficient search results window, can you imagine the time involved in evaluating the relevance of those 7,500 hits?
• Relevance has to be measured on two dimensions
– Is it in my sample? Hard: not in context (clerk’s job)
– Is this the right going concern? Easy: in context (expert’s job)
• Most tools require the expert to perform the clerk’s tasks. Not ISYS
• ISYS provides Microsoft Automation Objects
– Allows manipulation of search results
– Allows users with very basic skills to manipulate the search results in imaginative ways
– Can be embedded using an Add-In or macros for easy distribution to a wide range of users with access to the Office platform
• ISYS approach allows me to be an expert in what I want from my search while fully utilizing their expertise in search
– Eliminates time-shifting: clerical tasks are handled automatically, so there is no delay
• Simple process to do the back end filtering (sample relevancy)
From more than 7,500 documents to 676 in minutes
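The back-end filtering idea can be sketched as a set-membership test applied to the search results after the query has run. This is a Python stand-in under assumptions, not the Microsoft Automation Objects code used in the presentation: the `Hit` record, the `filter_hits_to_sample` helper, the identifiers and the file names are all hypothetical.

```python
from collections import namedtuple

# Hypothetical shape of one search hit: the filer's identifier plus a document name.
Hit = namedtuple("Hit", ["company_id", "document"])


def filter_hits_to_sample(hits, sample_ids):
    """Keep only hits whose filer is in the research sample, preserving order."""
    sample_ids = set(sample_ids)  # set lookup keeps the test cheap for large samples
    return [hit for hit in hits if hit.company_id in sample_ids]


# Made-up data for illustration.
hits = [
    Hit("0000000001", "doc_a.txt"),
    Hit("0000000002", "doc_b.txt"),
    Hit("0000000003", "doc_c.txt"),
    Hit("0000000001", "doc_d.txt"),
]
sample_ids = ["0000000001", "0000000003"]

relevant = filter_hits_to_sample(hits, sample_ids)
for hit in relevant:
    print(hit.company_id, hit.document)
```

Unlike a pre-search company list, the sample can hold thousands of identifiers without changing the query itself.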
• Evaluation of context relevance is simplified
– Each instance of the search term is highlighted
– Each new document opens at the first instance of the search term within the document
– No scrolling: buttons allow jumping
• From one instance to the next
• From one document to the next
• Actual data collection has been simplified
– Identifiers can be extracted from each document using MAO into Excel
• Evaluating Context Relevance
• Data Collection and Summary
One-to-one correspondence between each line in the spreadsheet and each line in the ISYS search results
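A minimal sketch of that one-to-one layout, assuming each search result carries a filer identifier and a document name: one output row is produced per result, in the same order. The presentation did this with automation objects writing into Excel; the Python below writes CSV text instead, and the field names, helpers and data are hypothetical.

```python
import csv
import io


def hits_to_rows(hits):
    """Produce one spreadsheet row per search hit, in the same order."""
    return [
        {"row": i, "company_id": hit["company_id"], "document": hit["document"]}
        for i, hit in enumerate(hits, start=1)
    ]


def write_csv(rows):
    """Serialize rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["row", "company_id", "document"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up hits; identifiers and file names are placeholders.
hits = [
    {"company_id": "0000000001", "document": "doc_a.txt"},
    {"company_id": "0000000003", "document": "doc_c.txt"},
]

rows = hits_to_rows(hits)
csv_text = write_csv(rows)
print(csv_text)
```

Keeping the row order identical to the result order is what lets a researcher move between a spreadsheet line and the matching document without a lookup step.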
• Summary of impact of ISYS adoption
– Without using any search technology (simply document review), this task would have taken 150 or more hours
– ISYS adoption and integration into the process cut task time to less than two hours
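The figures quoted in the slides can be turned into rough ratios with a few lines of arithmetic. The inputs are the slides' own one-sided bounds ("150 or more", "less than two", "more than 7,500"), so the results are bounds rather than exact values; the variable names are illustrative.

```python
# Figures taken from the slides; each is a one-sided bound.
hours_manual_min = 150       # manual document review: at least this many hours
hours_with_search_max = 2    # search plus post-query processing: under this many hours

# A floor divided by a ceiling gives a lower bound on the speedup.
min_speedup = hours_manual_min / hours_with_search_max

docs_found = 7500            # the initial search returned more than this many documents
docs_after_filter = 676      # documents remaining after back-end filtering

# Since docs_found is itself a floor, this share is an upper bound.
share_kept = docs_after_filter / docs_found

print(f"Speedup: more than {min_speedup:.0f}x")
print(f"Share of hits kept after filtering: at most about {share_kept:.1%}")
```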
• Best Practices/Lessons Learned
– Contrary to popular belief, search is very flexible and can be molded to fit nearly any situation (if using the right tool)
– Search vendors with exceptional support staff bring an invaluable piece to the puzzle
– When search goes beyond its input/output function to offer post-query searches and other advanced features, the value it provides is immense
Questions?
For any additional questions, please contact:
Mo Sheahan
ISYS Search Software
mo.sheahan@isys-search.com