Exploiting ERP Systems in Enterprise Search

Diego Tosato

Outline

• Introduction

• Our approach

• Experiments

• Open issues

• Conclusions

Introduction - idea

• For many companies, enterprise search on small data is much more important than web search on big data. [Liu et al. IR 2014]

This holds in particular for SMEs (small and medium enterprises).

• Recent advances in enterprise search focus on extracting concepts or entities from enterprise data. [Brauer WWW 2010, Liu et al. IR 2014, Meij et al. WSDM 2014, Graus et al. WSDM 2016]

• Among entities, Enterprise Resource Planning (ERP) entities (such as orders, invoices, and estimates) play a key role for enterprises.

[Nazemi et al. IJAMT 2012]

Introduction - idea

• ERP systems are used by organizations to collect, store, manage and interpret data from many business activities.

• They are typically composed of several modules.

• We build a graph knowledge base that we call Entity Graph (EG).

• We model the main types of ERP entities (33) and the related entity links (70).

[Figure: ERP modules: Sales, Production, Purchasing, Finance]

Our approach – system design

Our approach – Entity Graph (EG)

• EG is a directed graph

• A configuration file determines the queries to extract the relations, their direction, and the weights of each type of relation.

• EG improves enterprise search with an exploration experience complementary to faceted navigation and full text search.

EG consists of a set of nodes, a set of edges, and a set of edge weights.
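A minimal sketch of how such a graph could be built from a configuration, not the actual SeNSE implementation. The configuration layout (`query`, `direction`, `weight`), the relation names, the SQL strings, and the entity identifiers are all hypothetical; the queries are not executed here, and their results are supplied as plain rows.

```python
from collections import defaultdict

# Hypothetical configuration: one entry per relation type. The field names
# and every value below are assumptions made for illustration only.
CONFIG = {
    "order_to_customer": {
        "query": "SELECT order_id, customer_id FROM orders",
        "direction": "forward",
        "weight": 1.0,
    },
    "invoice_to_order": {
        "query": "SELECT invoice_id, order_id FROM invoices",
        "direction": "backward",
        "weight": 0.5,
    },
}


class EntityGraph:
    """Directed graph made of nodes, edges, and one weight per edge."""

    def __init__(self):
        self.nodes = set()
        self.edges = defaultdict(dict)  # source -> {target: weight}

    def add_link(self, source, target, weight):
        self.nodes.update((source, target))
        self.edges[source][target] = weight

    def neighbors(self, node):
        # Return a copy so callers cannot modify the graph by accident.
        return dict(self.edges.get(node, {}))


def build_graph(config, rows_by_relation):
    """Create the graph from already extracted (left, right) rows."""
    graph = EntityGraph()
    for relation, settings in config.items():
        for left, right in rows_by_relation.get(relation, []):
            if settings["direction"] == "forward":
                graph.add_link(left, right, settings["weight"])
            else:
                # "backward" reverses the edge produced by the query.
                graph.add_link(right, left, settings["weight"])
    return graph


# Rows that the configured queries might return (made-up identifiers).
rows = {
    "order_to_customer": [("order:1", "customer:7")],
    "invoice_to_order": [("invoice:3", "order:1")],
}
eg = build_graph(CONFIG, rows)
```

Calling `eg.neighbors("order:1")` returns the entities reachable from that order together with the weight of each link, which is the kind of lookup an exploration view would need.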

Our approach – Entity Ranking

• We follow the idea proposed by Turney et al. JAIR 2010, which led us to design a pipeline of components.

• The final rank of the results is

[Formula: final rank of an entity, defined over the set of scores, the set of weights, and the number of components]

• We instantiate the model with four component scores: date score, TF-IDF score, EG score, and PageRank score.
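A minimal sketch of one way to combine the component scores, assuming the final rank is a weighted sum of the scores. The slides name the ingredients (entity, scores, weights, number of components) but the linear form, the function names, the weight values, and the sample scores are assumptions.

```python
# Component names follow the slide; the order is arbitrary.
COMPONENTS = ("date", "tfidf", "eg", "pagerank")


def final_rank(scores, weights):
    """Weighted sum of component scores (assumed combination rule)."""
    if len(scores) != len(weights):
        raise ValueError("one weight is required per component score")
    return sum(w * s for w, s in zip(weights, scores))


def rank_entities(candidates, weights):
    """Sort entities by final rank, highest first.

    candidates maps an entity id to a dict of component scores.
    """
    scored = [
        (entity, final_rank([parts[name] for name in COMPONENTS], weights))
        for entity, parts in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative weights and scores only; none of these values come from SeNSE.
weights = [0.125, 0.5, 0.25, 0.125]
candidates = {
    "order:1": {"date": 1.0, "tfidf": 0.5, "eg": 0.0, "pagerank": 0.0},
    "invoice:3": {"date": 0.0, "tfidf": 0.5, "eg": 1.0, "pagerank": 1.0},
}
ranking = rank_entities(candidates, weights)
```

With this rule, adding a further component to the pipeline only means appending one score and one weight.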

Our approach – Prototype

• SeNSE (Skyline eNterprise Search Engine) is the name of the prototype.

– Back-end technology

– Front-end technology

Our approach – User Experience

• The prototype of our SERP (Search Engine Results Page).

Experiments

• We built three different enterprise datasets with real (small) data

– 1 million entities and 10 million entity links.

• How effective is the method?

– We computed the precision on a test set of 100 user information needs.

– Relevance judgments are obtained by merging the user rankings of the top 5 entities.

– We assigned the weights by performing a grid search. The TF-IDF score is the most important contribution.

Method                     Precision
TF-IDF score (baseline)    54%
Our approach               69%
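The evaluation steps above can be sketched as follows, assuming precision is measured on the top k results and the grid search tries every combination of a few candidate weight values. The function names, the grid values, the weighted-sum scoring, and the toy data are assumptions; they do not reproduce the reported figures.

```python
from itertools import product


def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked entities that are judged relevant."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for entity in top if entity in relevant) / len(top)


def grid_search(queries, n_components, steps=(0.0, 0.5, 1.0), k=5):
    """Try every weight combination and keep the best mean precision.

    queries is a list of (candidates, relevant) pairs, where candidates
    maps an entity id to a tuple with one score per component.
    """
    best_weights, best_precision = None, -1.0
    for weights in product(steps, repeat=n_components):
        total = 0.0
        for candidates, relevant in queries:
            ranked = sorted(
                candidates,
                key=lambda e: sum(w * s for w, s in zip(weights, candidates[e])),
                reverse=True,
            )
            total += precision_at_k(ranked, relevant, k)
        mean_precision = total / len(queries)
        if mean_precision > best_precision:
            best_weights, best_precision = weights, mean_precision
    return best_weights, best_precision


# Toy information need with two components (made-up scores and judgment).
toy_queries = [
    ({"b": (0.0, 1.0), "a": (1.0, 0.0)}, {"a"}),
]
best_weights, best_precision = grid_search(toy_queries, n_components=2, k=1)
```

An exhaustive grid grows exponentially with the number of components, which is one reason an automatic weighting method is attractive.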

Open issues

• By analyzing the links of EG, we found huge hub nodes, which are caused by certain types of entities. This is a problem for PageRank because it ranks hubs highly even though they are not necessarily relevant to every enterprise information need.
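A small sketch of the hub effect, assuming a hub is a node with many incoming links and using a plain, unweighted PageRank by power iteration. The graph, the entity identifiers, the damping factor, and the iteration count are illustrative assumptions; SeNSE may compute PageRank differently.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Unweighted PageRank by power iteration.

    edges maps a source node to the list of nodes it links to.
    """
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / count for node in nodes}
        for node in nodes:
            targets = edges.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its rank evenly.
                share = damping * rank[node] / count
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Five orders all point to the same customer, which becomes a hub.
hub_edges = {f"order:{i}": ["customer:7"] for i in range(1, 6)}
hub_edges["order:1"] = ["customer:7", "product:2"]
hub_ranks = pagerank(hub_edges)
top_entity = max(hub_ranks, key=hub_ranks.get)
```

Here the customer collects rank from every order, so it ends up on top regardless of what the user actually searched for.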

• SeNSE needs different representations of an entity to provide its services. This is not only a scalability issue but also a modeling one.

The extension of the search pipeline with further components could introduce novel representations for the entities.

[Figure: Entity = ? An entity shown as the sum of several representations.]

Open issues

• Another tricky problem concerns updating the indexed entities, because in enterprise search engines updates should be processed in near real time.

• The system has to deal with all types of updates. In particular, it has to manage the deletion of entities, which is the most difficult case.

• SeNSE implements three update policies: batch full, batch delta, and real time.
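A minimal sketch of what the three policies could look like on an in-memory index. The semantics are inferred from the policy names only (full rebuild, incremental batch, single event), and the class, method names, and event format are hypothetical.

```python
class EntityIndex:
    """Toy in-memory index: entity id -> indexed document."""

    def __init__(self):
        self.docs = {}

    def batch_full(self, snapshot):
        # Rebuild from scratch: entities missing from the snapshot
        # disappear, so deletions are handled implicitly.
        self.docs = dict(snapshot)

    def batch_delta(self, upserts, deleted_ids):
        # Apply only the changes collected since the last batch.
        self.docs.update(upserts)
        for entity_id in deleted_ids:
            self.docs.pop(entity_id, None)

    def real_time(self, event):
        # Apply a single change as soon as it is received.
        kind, entity_id, payload = event
        if kind == "delete":
            self.docs.pop(entity_id, None)
        else:
            self.docs[entity_id] = payload


index = EntityIndex()
index.batch_full({"order:1": "Order 1", "order:2": "Order 2"})
index.batch_delta({"invoice:3": "Invoice 3"}, deleted_ids=["order:2"])
index.real_time(("delete", "order:1", None))
index.real_time(("upsert", "order:4", "Order 4"))
```

In the delta and real-time cases deletions must be reported explicitly by the source system, which is consistent with deletion being the hardest case.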

Conclusions and Future Work

• We presented an enterprise search model that exploits ERP entities to enhance the enterprise search experience and its implementation: SeNSE.

• We presented the open issues arising from our industrial experience.

• In future work, we aim to clarify the benefit given by each contribution to entity ranking.

• We will implement an automatic method to compute the weights for those contributions.

www.freewayskyline.com/demosense