Dw allegro alain ozan.

Allegro’s DWH Implementation on Oracle Database Machine with OWB & OBIEE Rafał Kudliński, BI Manager Allegro Group


Page 1: Dw allegro alain ozan.

Allegro’s DWH Implementation on

Oracle Database Machine

with OWB & OBIEE

Rafał Kudliński, BI Manager Allegro Group

Page 2: Dw allegro alain ozan.

Allegro Group operates market-leading e-commerce trading platforms; general, automotive and real estate classified sites; a price comparison site; and payment web services across Eastern Europe under various brands. Allegro Group operates 46 platforms in 13 countries.

Do you really know Allegro?


Page 3: Dw allegro alain ozan.

Allegro is the most successful eCommerce trading system in Poland and the largest non-eBay auction platform worldwide.

Do you really know Allegro?

MORE THAN 14.5 MILLION USERS
Over 9500 new users every day. Over 3.5 million new users every year.

1000 EMPLOYEES
250 employees in the IT department: 150 in the development division and 100 in other IT divisions (IS, DWH, R&D, etc.).

MORE THAN 500 MILLION PAGE VIEWS (Peak)
The number of page views has doubled within the last 3 years.

MORE THAN 90 MILLION LISTED ITEMS
The number of listed items has increased by 75 million over the past 3 years.


Page 4: Dw allegro alain ozan.

Our figures are extraordinary in all areas. We use leading-edge technology wherever possible. Our data centers consume more power than 800 average households combined!

Do you really know Allegro?


Page 5: Dw allegro alain ozan.

Business requirements (Priority 1)

What is our internet services performance?

We need to play with data, simulate, and influence growth. Users expect quality and we expect more sales. We need to measure the categories. We need to know whether our pricing model is optimal. We need to know when growth is slowing down. What influences the success rate of auctions in different categories? We need to know our refunds and fees. We need to analyse the methods of payment. We need to analyse auctions on the front page: what is the conversion rate, and why? We want to measure the effect of changes in the categories. We need to benchmark countries. Which products are sold most frequently: new or used? What is the source of our profit? What is the result of a search, and what do users do after it? We need to know where users click and what they do on our site.

We decided that measuring the performance of our internet services is most important for our business growth. It covers both the service and user areas. We have to really understand what is happening on our websites and who our most valuable users are.


Page 6: Dw allegro alain ozan.

Business requirements (Priority 2 and 3)

What is our operational performance?
• We need to do our job faster, better and more efficiently; we need to know which projects to carry out and which are profitable; we need to measure activity.

What is our marketing campaign performance?
• We need to measure the campaigns; we need to know the effect of each marketing action and the source of traffic; we need to track the results of our spending.

What is our co-brand performance?
• What are the sources of new registrations?

What is our affiliate performance?
• We need to know the impact of the affiliate program.

What is the performance of the products offered?
• We need to measure products.

Operational performance and marketing campaign effectiveness are also crucial to our business. Co-brand and affiliate programs, as valuable sources of traffic and new user registrations, have to be carefully monitored as well.


Page 7: Dw allegro alain ozan.

Agenda


Page 8: Dw allegro alain ozan.

Projects in Numbers

The first project took 6 months to complete, with 8 people working on it. Support from external companies was necessary because a new technology and new software were being implemented.

• Project duration: 12 months
• Project team: 8 to 12 people
• Man-days spent: 800
• Active users: 120
• Implemented reports: 100
• Implemented KPIs: 160
• Biggest source system size: 7 TB
• Largest tables: 2.8 billion records


Page 9: Dw allegro alain ozan.

Data warehouse architecture

We load data from a real-time copy of the production system. Extraction and transformation processes load the data into the DWH production schema. Finally, aggregations are built to improve query processing performance. We use OWB as the ETL tool.

[Architecture diagram. Components shown: Production Environment (2 × IBM P590 machines) with the Allegro Production Oracle DB; Oracle Data Guard; Logical Standby Oracle DB; Click Stream recording environment (10 × DL360 machines) with MySQL DB 1 to DB 10; Data Warehouse Environment (Oracle Database Machine) with the DWH Staging Area, DWH Production and DataMart Oracle DBs; Load and ETL steps performed with OWB.]


Page 10: Dw allegro alain ozan.

Allegro DWH & BI system architecture

We use Oracle Business Intelligence Enterprise Edition (OBIEE) as our BI tool. OBIEE is connected to both the Target and DataMart schemas. We have almost 120 active users; 10 power users run ad-hoc queries.

[Architecture diagram. Components shown: Transaction Platforms and Other Systems as sources; Data Warehouse Environment (Oracle Database Machine, Oracle 11g) with the DWH Production Target and DataMart schemas; Oracle BI Server providing Ad-hoc Analysis, Interactive Dashboards, Delivers and Alerts, and the MS Office Plug-in.]


Page 11: Dw allegro alain ozan.

BI Portal presents the most important reports and KPIs describing the performance of our major auction platforms in all the countries where we operate. It shows information about open auctions, registered users, bids, sales and charges.

Allegro Performance KPIs


Page 12: Dw allegro alain ozan.

Product managers can analyze a number of measures by drilling down in the product category tree. They can filter data by selecting an auction type or a seller type.

Auction Category Analysis


Page 13: Dw allegro alain ozan.

BI Portal also contains information about IT department performance. Managers can see current budget realization, SLA, traffic and the status of the most important current IT projects.

IT Department KPIs


Page 14: Dw allegro alain ozan.

We deliver information about the number of clicks grouped by users, user locations, services, scripts and, most importantly (not available yet), by product categories. Users can drill down to detailed information.

Click Stream analysis


Page 15: Dw allegro alain ozan.

DB Machine is very efficient in all types of ETL processing. We parse, clean, merge and join almost 500 million records each day. A number of aggregation tables are calculated and refreshed.

Lesson 2 – ETL Processing – DB Machine does it all
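
As a rough illustration of the parse, clean and join steps named in this lesson, the following Python sketch runs them over a handful of click records and finishes with a small aggregation. The pipe-delimited record layout, the field names and the sample users are all hypothetical; the actual ETL is built with OWB on the DB Machine and is not shown in the slides.

```python
from collections import Counter

# Hypothetical raw click lines: "user_id|script|timestamp" (layout is an assumption).
RAW_CLICKS = [
    "101|search.php|2010-03-01 10:00:00",
    "102|item.php|2010-03-01 10:00:05",
    "bad line without separators",
    "101|item.php|2010-03-01 10:01:00",
    "999|item.php|2010-03-01 10:02:00",  # user missing from the dimension
]

# Hypothetical user dimension: user_id -> country.
USERS = {101: "PL", 102: "CZ"}


def parse(line):
    """Parse one raw line into a dict, or return None if it is malformed."""
    parts = line.split("|")
    if len(parts) != 3 or not parts[0].isdigit():
        return None
    return {"user_id": int(parts[0]), "script": parts[1], "ts": parts[2]}


def clean(records):
    """Drop records that failed parsing."""
    return [r for r in records if r is not None]


def join_users(records, users):
    """Inner-join click records with the user dimension on user_id."""
    return [
        dict(r, country=users[r["user_id"]])
        for r in records
        if r["user_id"] in users
    ]


def aggregate(records):
    """Count clicks per (country, script), as an aggregation table would."""
    return Counter((r["country"], r["script"]) for r in records)


def run_etl(raw_lines, users):
    """Chain the steps: parse -> clean -> join -> aggregate."""
    parsed = [parse(line) for line in raw_lines]
    return aggregate(join_users(clean(parsed), users))


clicks_by_country_script = run_etl(RAW_CLICKS, USERS)
```

The malformed line is dropped in the cleaning step and the click from the unknown user is dropped by the inner join, so only three clicks reach the aggregate.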


Page 16: Dw allegro alain ozan.

As usual, to get excellent performance you have to think about partitioning, compression and parallel query execution. There are no indices, so full table scan performance needs to be considered.

Lesson 3 – Data Architecture – Standard but Improved

• We use a standard star schema with a collection of fact tables
• Our largest fact tables have almost 3 billion records (billings, clicks)
• Our largest dimension tables have more than 15 million records (users, locations)
• No indices: there is no need, and in some cases using them made performance even worse
• Full Table Smart Scan works very efficiently
• We heavily use partitions (days, months) and sub-partitions (attributes)
• Compression saves space (avg. 30%) and improves performance
• Parallel query execution /*+parallel(table,8,3)*/ works very well: queries run about 10 times faster on average
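
The combination of day partitions, index-free full scans and parallel execution described in this lesson can be sketched in a few lines of Python. This is only a conceptual model: the partition layout, the row format and the function names are hypothetical, the worker threads merely stand in for the database's parallel query processes, and the default of 8 workers is an arbitrary choice echoing the 8 in the slide's hint.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date

# Hypothetical fact table partitioned by day: partition key -> list of rows.
# Each row is (category_id, amount); names and values are illustrative only.
PARTITIONS = {
    date(2010, 3, 1): [(1, 10.0), (2, 5.0)],
    date(2010, 3, 2): [(1, 7.5), (3, 2.5)],
    date(2010, 3, 3): [(2, 4.0)],
}


def prune(partitions, start, end):
    """Partition pruning: keep only partitions whose day is in [start, end]."""
    return [rows for day, rows in partitions.items() if start <= day <= end]


def scan_partition(rows):
    """Full scan of one partition: no index, every row is read."""
    return sum(amount for _category, amount in rows)


def total_amount(partitions, start, end, degree=8):
    """Scan the surviving partitions in parallel and combine partial sums."""
    selected = prune(partitions, start, end)
    if not selected:
        return 0.0
    with ThreadPoolExecutor(max_workers=degree) as pool:
        return sum(pool.map(scan_partition, selected))


total = total_amount(PARTITIONS, date(2010, 3, 1), date(2010, 3, 2))
```

A query restricted to the first two days never touches the third partition, which is why a full scan of what remains can still be cheap.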


Page 17: Dw allegro alain ozan.

Even when using DB Machine, aggregation tables are needed to achieve the necessary user interface performance. What you get is a scalable and fast aggregation and reporting environment.

Lesson 4 – User Access – fast and reliable

• We create a number of aggregation tables to avoid joins between million-record tables
• Reports and dashboards are delivered within seconds (<5s) even with >100 users working
• OBIEE works very well with DB Machine, especially in reporting and dashboarding
• The OBIEE ad-hoc Answers application is very powerful, but some users still need to use SQL to get what they want (automatically generated queries are not tuned to use all DB Machine features)
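
The idea of an aggregation table can be sketched with Python's built-in sqlite3 module: the join and GROUP BY are paid once at refresh time, and dashboard queries read only the small pre-joined result. The table names, columns and sample rows below are hypothetical, and SQLite is used purely so the example runs anywhere; it does not reflect the Oracle schema behind the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical star schema fragment: one fact table, one dimension.
    CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (sale_day TEXT, category_id INTEGER, amount REAL);

    INSERT INTO dim_category VALUES (1, 'Books'), (2, 'Automotive');
    INSERT INTO fact_sales VALUES
        ('2010-03-01', 1, 10.0),
        ('2010-03-01', 1, 5.0),
        ('2010-03-01', 2, 20.0),
        ('2010-03-02', 1, 2.5);

    -- Aggregation table: the join and GROUP BY run once, at refresh time.
    CREATE TABLE agg_sales_daily AS
    SELECT f.sale_day,
           c.name AS category,
           COUNT(*) AS sales,
           SUM(f.amount) AS amount
    FROM fact_sales f
    JOIN dim_category c ON c.category_id = f.category_id
    GROUP BY f.sale_day, c.name;
    """
)

# A dashboard query now reads the small pre-joined table only.
rows = conn.execute(
    "SELECT sale_day, category, sales, amount FROM agg_sales_daily "
    "ORDER BY sale_day, category"
).fetchall()
```

In a real warehouse the aggregate would be rebuilt or incrementally refreshed after each ETL run, which matches the "calculated and refreshed" wording used for Lesson 2.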


Page 18: Dw allegro alain ozan.

Exadata storage server brings additional value but not many additional tasks. It can be handled by a DBA without any special skills.

Lesson 5 – DB Administration – no complexity

• Just a typical RAC environment
• Fully integrated with Grid Control: storage cells are monitored with a dedicated plug-in
• Distributed command execution
• Easy storage layer administration: replace or create a disk or disk group with no more than 3 commands
• Comprehensive command shell on a storage cell
• Additional hardware/software components are needed for integration with the SAN backup environment
• Self-monitoring storage layer with email notifications


Page 19: Dw allegro alain ozan.

Support from Oracle and experienced external consultants is necessary for a successful DW & BI implementation (when using a new environment).

Lesson 7 – External support – helps and speeds up

Business Needs
• Business Discovery (performed with the help of Oracle Consulting) was very valuable for prioritizing business requirements – Jamal El Faiz

ETL Process
• Experience in massive data processing from ISE – Igor Michaljow

OBIEE
• Expertise in building robust reports and dashboards from Oracle Consulting: Alessandro Sabelli, Małgorzata Baran, Marzena Krzanowska

DB Machine administration
• Some initial configuration made by Oracle
• Support from the RAC PAC team (best practices, service requests)
• Update patches are frequently released
• An experienced internal DBA is crucial – Wojciech Semenowicz


Page 20: Dw allegro alain ozan.

Next Steps and Outlook

Right now we are working on loading Click Stream data into the DWH. We have more than 400 million page views every day. In the next few months, data from our payment and classified services will be loaded.

[Timeline diagram: PAST, PRESENT, FUTURE]


Page 21: Dw allegro alain ozan.

Recommendations

• Think big, act small, and before you start, search for the right staff!
• Plan and manage the project carefully
• Have the right sponsor and support from the business side
• Oracle Database Machine is definitely the right choice
• At the beginning, support from Oracle Consulting is crucial
• Stand for Information Democracy in your company


Page 22: Dw allegro alain ozan.

Q&A
