Massively Parallel Postgres Kubernetes Operator for ... GemFire Spark Object Storage HDFS JSON,...

  • © Copyright 2018 Pivotal Software, Inc. All rights Reserved.

    Kubernetes Operator for Massively Parallel Postgres

    Goutam Tadi (@goutamtadi), Senior Software Engineer, Pivotal Software, Inc.

    Email: gp-kubernetes@pivotal.io

    PGConf India 2019

  • Agenda

    ● Intro to Greenplum

    ● Kubernetes 101

    ● Greenplum for Kubernetes

    ○ Components

    ■ Greenplum Operator

    ■ Greenplum Cluster

    ● Demo

  • Massively Parallel Postgres

    Greenplum

  • Greenplum Data Platform

    ANALYTICAL APPLICATIONS: Custom Apps, BI / Reporting, Machine Learning, AI

    DS, Analysts, IT, Dev

    NATIVE INTERFACES

    Structured Data: JDBC, ODBC

    SQL: ANSI SQL, Teradata SQL, Other DB SQL

    ML/Statistics/Graph: Apache MADlib

    Programmatic: Python, R, Java, Perl, C

    Text: Apache SOLR

    GeoSpatial: PostGIS

    PIVOTAL GREENPLUM PLATFORM (NEXT GENERATION DATA PLATFORM)

    Massively Parallel (MPP), PostgreSQL Kernel, Petabyte Scale Loading, Query Optimizer (GPORCA), Workload Manager, Polymorphic Storage, Command Center, SQL Compatibility (Hyper-Q)

    MULTI-STRUCTURED DATA SOURCES & PIPELINES

    Local Storage, Other RDBMSes, Spark, GemFire, Cloud Object Storage, HDFS, Kafka, ETL, Spring Cloud Data Flow

    JSON, Apache AVRO, Apache Parquet and XML

    FLEXIBLE DEPLOYMENT: On-Premises, Public Clouds, Private Clouds, Fully Managed Clouds

  • Container Orchestration System

    Kubernetes

  • Greenplum on Kubernetes 101

    Kubernetes Master

    Greenplum Service

    Pod

    Postgres Container

    kubelet kube-proxy docker

    Node

    Pod

    Postgres Container

    kubelet kube-proxy docker

    Node

    Storage volumes

  • Greenplum on Kubernetes

    Node

    Pod

    segment-b-0

    kubelet kube-proxy docker

    Pod

    segment-a-0

    kubelet kube-proxy docker

    Node

    Storage volumes

    Pod

    master-0

    kubelet kube-proxy docker

    Pod

    master-1

    kubelet kube-proxy docker

  • Massively Parallel Postgres on Kubernetes

    Greenplum for Kubernetes

  • Components

    ● Greenplum Operator

    ● Greenplum Cluster

  • Greenplum Operator

    apiVersion: "greenplum.pivotal.io/v1"
    kind: "GreenplumCluster"
    metadata:
      name: my-greenplum
    spec:
      masterAndStandby: ….
      segments: ….

    Greenplum Operator

    Greenplum Cluster

    CREATE / UPDATE / DELETE
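
    The CREATE / UPDATE / DELETE arrows on this slide follow the usual operator pattern: compare the declared state with the observed state and act on the difference. The snippet below is only a toy sketch of that comparison, not the Greenplum Operator's actual code. `ClusterSpec`, `segment_count` and `plan_actions` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical, simplified stand-in for a GreenplumCluster spec."""
    name: str
    segment_count: int  # assumed field, not taken from the real CRD


def plan_actions(desired, actual):
    """Compare desired specs with observed ones; return (verb, name) pairs."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("CREATE", name))   # declared but not running
        elif actual[name] != spec:
            actions.append(("UPDATE", name))   # running with a stale spec
    for name in sorted(actual):
        if name not in desired:
            actions.append(("DELETE", name))   # running but no longer declared
    return actions


desired = {"my-greenplum": ClusterSpec("my-greenplum", segment_count=2)}
actual = {
    "my-greenplum": ClusterSpec("my-greenplum", segment_count=1),
    "old-cluster": ClusterSpec("old-cluster", segment_count=1),
}
print(plan_actions(desired, actual))
```

    A real operator would receive the desired state from the Kubernetes API server and repeat this comparison whenever a watched resource changes.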

  • Greenplum Cluster

    Namespace: default Kubernetes Cluster

    Primary StatefulSet

    segment-a-0 segment-a-1

    Mirror StatefulSet

    segment-b-0 segment-b-1

    ConfigMap

    Master StatefulSet

    master-0 master-1

    psql query
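
    The pod names on this slide match how Kubernetes names StatefulSet pods: the StatefulSet name followed by a zero-based ordinal. As a minimal sketch, assuming the three StatefulSets are named `master`, `segment-a` and `segment-b` (inferred from the pod names, not stated on the slide), the layout can be reproduced like this. `pod_names` and `cluster_pods` are hypothetical helpers.

```python
def pod_names(statefulset_name, replicas):
    """Return the pod names of a StatefulSet: <name>-<ordinal>, ordinals from 0."""
    return [f"{statefulset_name}-{i}" for i in range(replicas)]


def cluster_pods(primary_segments):
    """Pod layout as drawn on the slide (StatefulSet names are assumptions)."""
    return {
        # Two master pods; master-1 is assumed to be the standby.
        "master": pod_names("master", 2),
        # One primary pod and one mirror pod per segment.
        "segment-a": pod_names("segment-a", primary_segments),
        "segment-b": pod_names("segment-b", primary_segments),
    }


for statefulset, pods in cluster_pods(2).items():
    print(statefulset, pods)
```

    Under this naming scheme, expanding the cluster adds pods with the next ordinals, for example `segment-a-2` and `segment-b-2`, while existing pods keep their names.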

  • Benefits

    ● Declarative-style deployments

    ● Auto cluster initialization

    ● Fast deployments

    ● Easy to expand

    ● Delete the compute and retain storage for later use

  • Demo

    https://youtu.be/d8X2BXSg07Q

    ● Install Greenplum Operator

    ● Install Greenplum Cluster

    ● Failover Scenarios

    ● Expand Greenplum Cluster

  • Future

    ● Auto Failover

    ● Auto Cluster Rejoin

    ● PVC Snapshot

    ● Pod Affinity

  • Transforming How The World Builds Software

  • Resources

    http://greenplum-kubernetes.docs.pivotal.io

    https://network.pivotal.io