Real Time Analytics: HPE Vertica Meetup

  • date post

    16-Oct-2020

  • Real Time Analytics

  • Vertica

    – A SQL analytic engine

    – Built for Speed, Scale and Efficiency

    – Supports standard SQL

    – Provides rich Analytic functionality and is extensible

    – Integrates well with Big Data ecosystem tools

    – Runs on premises, in the Cloud, and on Hadoop

  • What's wrong with this picture?

    – SQL ??

    – Real-time Analytics ???

    – Real-time, continuous load ?

    – Real-time, very short response time ?

    – Big Data ????

  • Vertica – Does it scale ??? (not a fake, believe me…)

    select GET_COMPLIANCE_STATUS();

    GET_COMPLIANCE_STATUS

    --------------------------------------------------------------------------------

    Raw Data Size: 2.75PB +/- 0.30PB

    License Size : 1.95PB

    Utilization : 141%

    Audit Time : 2016-09-27 23:59:29.367875+00

    Compliance Status : ***** NOTICE OF LICENSE NON-COMPLIANCE *****

    Continued use of this database is in violation of the current license agreement.

    Maximum licensed raw data size: 1.95PB

    Current raw data size: 2.75PB

    License utilization: 141%

    IMMEDIATE ACTION IS REQUIRED, PLEASE CONTACT VERTICA
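
    The utilization figure in the audit output above follows directly from the two sizes it reports. A minimal Python sketch of that arithmetic is shown here. The helper name `license_utilization` is hypothetical (it is not a Vertica function), and treating anything above 100% as non-compliant is a simplifying assumption, not Vertica's documented audit rule.

    ```python
    def license_utilization(raw_pb: float, licensed_pb: float) -> float:
        """Return the raw data size as a percentage of the licensed size."""
        if licensed_pb <= 0:
            raise ValueError("licensed size must be positive")
        return 100.0 * raw_pb / licensed_pb

    raw_pb = 2.75       # "Raw Data Size" from the audit output
    licensed_pb = 1.95  # "License Size" from the audit output

    utilization = license_utilization(raw_pb, licensed_pb)
    compliant = utilization <= 100.0  # assumed threshold, for illustration only

    print(f"Utilization: {utilization:.0f}%  compliant: {compliant}")
    ```

    With the sizes from the slide this reproduces the reported 141%.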

  • Vertica – Is it really fast ?

    – Trillion Row Qlik-on-Vertica Dashboard

    – https://www.youtube.com/watch?v=ZnMDeg8V2sg

  • Vertica – Is it so simple ?

    – HPE Vertica and Qlik Direct Discovery: A Technical Exploration

    – https://community.dev.hpe.com/t5/Vertica-Knowledge-Base/HPE-Vertica-and-Qlik-Direct-Discovery-A-Technical-Exploration/ta-p/234332

  • Vertica – Is it so simple ?

    – No !

    – HPE Vertica and Qlik Direct Discovery: A Technical Exploration

    – Implementation Methods

    – Fact and dimension tables in-memory. Most applications are created using this approach. However, this paper does not cover the all-in-memory option because it is not suitable for big data (such as a few billion rows of fact data) and requires too much memory.

    – Fact and dimension tables in Direct Discovery (regular star schema).

    – BFFT (big flat fact table) in Direct Discovery. There are no dimension tables with BFFT.

    – Fact tables in Direct Discovery and dimensions in memory.

    – Multiple fact tables in Direct Discovery. This is not generally recommended because of complex design considerations.

  • Vertica @ Nimble Storage

  • Changing the game with the Internet of (Powerful) Things

    InfoSight

  • Nimble Storage – Some metrics

    – >7,500 customers

    – millions of virtual objects under continuous monitoring

    – Collected per day: >250 billion sensor values, >2 billion log events, >100 million configuration variables

    – Database Characteristics

    – Raw Data: 550 TB - Disk: 200 TB - On Nimble: 100 TB

    – 350K selects per day

    – 60K inserts/deletes per day

    – Configuration

    – 2 Vertica clusters – 2x8 servers – 2x8x54 cores – Nimble Storage instead of DAS
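
    A few derived numbers help put the Nimble Storage figures above in perspective. The sketch below is back-of-the-envelope arithmetic only. It assumes that "Disk" is the database footprint after Vertica's own encoding and compression, that "On Nimble" is the footprint after the storage array's data reduction, and that "2x8x54" means 2 clusters x 8 servers x 54 cores. None of these readings is confirmed by the slide.

    ```python
    SECONDS_PER_DAY = 24 * 60 * 60

    # Figures copied from the slide
    raw_tb, disk_tb, on_nimble_tb = 550, 200, 100
    selects_per_day = 350_000
    clusters, servers_per_cluster, cores_per_server = 2, 8, 54

    # Assumed interpretation: raw -> disk is database-level reduction,
    # disk -> on Nimble is array-level reduction.
    db_reduction = raw_tb / disk_tb
    array_reduction = disk_tb / on_nimble_tb

    # Average query rate; real traffic is unlikely to be this uniform.
    avg_selects_per_sec = selects_per_day / SECONDS_PER_DAY

    total_cores = clusters * servers_per_cluster * cores_per_server

    print(f"Database-level reduction: {db_reduction:.2f}x")
    print(f"Array-level reduction:    {array_reduction:.2f}x")
    print(f"Average selects/second:   {avg_selects_per_sec:.2f}")
    print(f"Total cores:              {total_cores}")
    ```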

  • More on Vertica by Nimble Storage

    – https://my.vertica.com/wp-content/uploads/2016/09/B10823_10823_Presentation_2.pdf

    – From Vertica Big Data Conference 2016 : https://my.vertica.com/big-data-conference-2016/

  • Vertica @ Criteo

  • The analytics stack at Criteo

    – Hadoop for Primary Storage and MapReduce

    – Cascading, Scalding and Hive for Data Transformation

    – Hive and Vertica for Data Warehousing

    – Tableau and ROLAP Cube for Structured Data Access

    – Vizatra for speed

  • More on Vizatra+Vertica by Criteo

    – SBTB FinagleCon 2015: Justin Coffey, Presenting Vizatra – YouTube

    – https://www.youtube.com/watch?v=uXmEhSFzNLs

  • More on Vertica

  • Vertica analytics platform

    – Fast: Boost performance by 500% or more

    – Scalable: Handles huge workloads at high speeds

    – Standard: No need to learn new languages or add complexity

    – Costs: Significantly lower cost over legacy platforms

  • About Vertica

    Massively Parallel Processing

    – Shared Nothing

    – Elastic scale-out architecture

    – Built-in high availability

    – Commodity Hardware

    – Easy setup and administration

    – And more …

    – Example cluster (from the slide diagram): 3 nodes, each with 2 x 12 cores, 128+ GB RAM and 20 TB, connected to a client network and a private data network

  • Core Vertica Technology: Built for performance and scale

  • my.vertica.com

    – Download Vertica Community Edition on my.vertica.com

    – Up to 1 TB and 3 nodes
