Integration Techniques for the Apache Cassandra non-SQL database with the Pega Platform Created by: Pawel Nowak On March 30, 2020




2020 Pegasystems Inc., Cambridge, MA

All rights reserved.

Trademarks

For Pegasystems Inc. trademarks and registered trademarks, all rights reserved. All other trademarks or service marks are property of their respective holders.

For information about the third-party software that is delivered with the product, refer to the third-party license file on your installation media that is specific to your release.

Notices

This publication describes and/or represents products and services of Pegasystems Inc. It may contain trade secrets and proprietary information that are protected by various federal, state, and international laws, and distributed under licenses restricting their use, copying, modification, distribution, or transmittal in any form without prior written authorization of Pegasystems Inc.

This publication is current as of the date of publication only. Changes to the publication may be made from time to time at the discretion of Pegasystems Inc. This publication remains the property of Pegasystems Inc. and must be returned to it upon request. This publication does not imply any commitment to offer or deliver the products or services described herein.

This publication may include references to Pegasystems Inc. product features that have not been licensed by you or your company. If you have questions about whether a particular capability is included in your installation, please consult your Pegasystems Inc. services consultant.

Although Pegasystems Inc. strives for accuracy in its publications, any publication may contain inaccuracies or typographical errors, as well as technical inaccuracies. Pegasystems Inc. shall not be liable for technical or editorial errors or omissions contained herein. Pegasystems Inc. may make improvements and/or changes to the publication at any time without notice.

Any references in this publication to non-Pegasystems websites are provided for convenience only and do not serve as an endorsement of these websites. The materials at these websites are not part of the material for Pegasystems products and use of those websites is at your own risk.

Information concerning non-Pegasystems products was obtained from the suppliers of those products, their publications, or other publicly available sources. Address questions about non-Pegasystems products to the suppliers of those products.

This publication may contain examples used in daily business operations that include the names of people, companies, products, and other third-party publications. Such examples are fictitious and any similarity to the names or other data used by an actual business enterprise or individual is coincidental.

This document is the property of:

Pegasystems Inc.
One Rogers Street
Cambridge, MA 02142-1209
USA
Phone: (617) 374-9600
Fax: (617) 374-9620
www.pega.com

Document Name: Integration Techniques for the Apache Cassandra non-SQL database with the Pega Platform
Updated: March 30th, 2020


Use Case:

This document presents implementation use cases for integrating the Apache Cassandra non-SQL database with the Pega Platform. It describes how the Apache Cassandra database can be used together with the Pega Platform to meet business requirements for processing high volumes of data.

Apache Cassandra

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

More about Apache Cassandra:

http://cassandra.apache.org/

Pega Implementation

The Apache Cassandra non-SQL database can be integrated with the Pega Platform in several ways, as either an external or an internal database instance:

1. The Pega Database instance with Pega Data Type executed by the Pega Report Definition mapped to the Pega Concrete Class
   https://community.pega.com/knowledgebase/articles/database-management/connecting-apache-cassandra-database

2. The Pega Decision Data Store (DDS) executed indirectly in the Pega Data Flow from the Pega Data Set of the DDS type, or in the Pega Activity
   https://community.pega.com/knowledgebase/articles/decision-management-overview/integrating-cassandra-your-application
   https://community.pega.com/knowledgebase/articles/decision-management-overview/techniques-integrating-cassandra-your-application-pega-722

3. The Pega Connect-Cassandra rule executed in the Pega Activity
   https://community.pega.com/sites/default/files/help_v73/methods/connect-cassandra/connect-cassandra.htm
   https://community.pega.com/sites/default/files/help_v73/rule-/rule-connect-/rule-connect-cassandra/main.htm

4. The Cassandra Query Language (CQL) script executed in the Java step from the Pega Activity
   https://collaborate.pega.com/question/using-cql-cassandra-query-language-pega-query-cassandra

The following sections present the implementation of each integration technique listed above.


The Pega Database instance with Pega Data Type executed by the Pega Report Definition mapped to the Pega Concrete Class

To integrate the Pega Platform with the Apache Cassandra database through the Database instance configuration, a “Cassandra” Database instance must be created first. Port 9042 is the default communication port for native protocol clients.
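As a small illustration of the default port mentioned above, the following Java sketch turns a comma-separated host list into socket addresses and falls back to port 9042 when no port is given. The class name `ContactPoints`, the method `parse`, and the host names are hypothetical and are not part of the Pega Platform or of any Cassandra driver. The sketch also assumes host names or IPv4 addresses only.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: not part of the Pega Platform or of a Cassandra driver. */
final class ContactPoints {

    /** Default port for Cassandra native protocol clients, as stated in the text. */
    static final int DEFAULT_NATIVE_PORT = 9042;

    /** Parses comma-separated "host" or "host:port" entries (no IPv6 literals). */
    static List<InetSocketAddress> parse(String commaSeparatedHosts) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : commaSeparatedHosts.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries, for example after a stray comma
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_NATIVE_PORT
                    : Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while the list is being built
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder host names, used only for demonstration.
        String hosts = "cassandra-1.example.com, cassandra-2.example.com:9142";
        for (InetSocketAddress address : parse(hosts)) {
            System.out.println(address.getHostString() + " -> port " + address.getPort());
        }
    }
}
```

A list built this way could be compared against the host and port values entered in the Database instance form when troubleshooting a failed connection test.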

The connection test for the Cassandra cluster should report the status “Good” when the Pega Platform connects successfully to the Apache Cassandra cluster.

After the Database instance is configured successfully, a Database Table instance must be configured to map the Apache Cassandra table to the Pega Platform. This can be done either manually or with the Data Type wizard. The steps below present the manual configuration of the Database Table instance:


The “Test connectivity” check must report no problems with the Database Table configuration.

After the Pega Database instance and the Database Table instance are configured successfully, the Pega Concrete Class can be created and mapped to the Database Table instance, as shown below:


The key of the Pega Concrete Class should be configured to match the key of the Apache Cassandra table.

Once the Pega Concrete Class is configured, a Pega Report Definition can be created on that class to query the Apache Cassandra data.


A Pega Report Definition created on a Cassandra table has some filtering and aggregation limitations that users must be aware of:

https://community.pega.com/knowledgebase/articles/database-management/apache-cassandra-database-support
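The linked article is the authoritative list of these limitations. As background, filtering restrictions of this kind commonly trace back to Cassandra itself, which without a secondary index or `ALLOW FILTERING` expects a query to restrict the whole partition key and to restrict clustering columns only in their declared order. The Java sketch below is a deliberately simplified model of that rule for equality filters only. The class `FilterCheck`, its method, and the column names are hypothetical, and the sketch does not reproduce the checks that the Pega Platform actually performs.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of Cassandra's primary-key filtering rule. */
final class FilterCheck {

    /**
     * Returns true when a set of equality filters can be answered from the
     * primary key alone: every partition key column is restricted, and the
     * remaining filters form a prefix of the clustering columns.
     * Assumes the partition key and clustering column names are distinct.
     */
    static boolean servedByPrimaryKey(List<String> partitionKey,
                                      List<String> clusteringColumns,
                                      Set<String> equalityFilters) {
        // The whole partition key must be restricted; otherwise Cassandra
        // cannot locate the partition without scanning.
        if (!equalityFilters.containsAll(partitionKey)) {
            return false;
        }
        int matched = partitionKey.size();
        // Clustering columns may be restricted only in declared order.
        for (String column : clusteringColumns) {
            if (!equalityFilters.contains(column)) {
                break;
            }
            matched++;
        }
        // A leftover filter targets a skipped clustering column or a non-key column.
        return matched == equalityFilters.size();
    }

    public static void main(String[] args) {
        List<String> partitionKey = List.of("customer_id");           // placeholder names
        List<String> clustering = List.of("order_date", "order_id");
        System.out.println(servedByPrimaryKey(partitionKey, clustering,
                Set.of("customer_id", "order_date")));
        System.out.println(servedByPrimaryKey(partitionKey, clustering,
                Set.of("customer_id", "status")));
    }
}
```

In this model the first call in `main` is accepted and the second is rejected, because `status` is not part of the key. A report filter of the second kind is the sort of query to check against the limitations article before relying on it.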

An example of the Report Viewer configuration:


The “Get row key” option must be deselected on the Data Access tab, because it is not available for the Cassandra table:


The Pega Connect-Cassandra rule executed in the Pega Activity

The Connect-Cassandra rule is another integration technique for Apache Cassandra and the Pega Platform. It makes a direct connection to the Cassandra table, as shown below:


The Connect-Cassandra rule can be invoked from a Pega Activity to get, put, or list records:


The Pega Decision Data Store (DDS) executed indirectly in the Pega Data Flow from the Pega Data Set of the DDS type

The Decision Data Store (DDS) is a further integration technique for Apache Cassandra and the Pega Platform. It is configured on the Decision Data Store landing page, as shown below:

The DDS configuration lists the available Cassandra instances (nodes), configured as either external or internal to the Pega Platform.


The DDS configuration is available under the “Edit settings” link.

When at least one DDS node is configured with the “NORMAL” status in the Pega Platform, a Data Set of the DDS type can be created to read and store Apache Cassandra data. The DDS Data Set configuration requires the Cassandra table key to be defined. The first field in that key is used as the partition key when Data Flow runs are executed.
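To make the role of the first key field concrete, the Java sketch below renders an ordered list of key fields as a CQL `PRIMARY KEY` clause, with the first field as the partition key and the remaining fields as clustering columns. This illustrates CQL key semantics only. The source does not show the DDL that the Pega Platform generates for a DDS Data Set, and the class `DdsKeySketch` and the field names are hypothetical.

```java
import java.util.List;

/** Hypothetical illustration of how an ordered key maps to a CQL primary key. */
final class DdsKeySketch {

    /**
     * Builds a CQL PRIMARY KEY clause in which the first key field is the
     * partition key and any further fields are clustering columns.
     */
    static String primaryKeyClause(List<String> keyFields) {
        if (keyFields.isEmpty()) {
            throw new IllegalArgumentException("at least one key field is required");
        }
        StringBuilder clause = new StringBuilder("PRIMARY KEY ((")
                .append(keyFields.get(0))
                .append(")");
        for (String clusteringColumn : keyFields.subList(1, keyFields.size())) {
            clause.append(", ").append(clusteringColumn);
        }
        return clause.append(")").toString();
    }

    public static void main(String[] args) {
        // Placeholder field names; prints: PRIMARY KEY ((CustomerID), EventTime)
        System.out.println(primaryKeyClause(List.of("CustomerID", "EventTime")));
    }
}
```

Because all rows that share a partition key value are stored together, choosing a first key field with many distinct values is what allows the data to spread evenly across the Cassandra nodes.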

When the DDS Data Set is saved for the first time, the wizard creates the Cassandra table linked with the DDS Data Type, as shown below:


This configuration cannot be changed after the Data Set is saved for the first time.

Like other Data Sets, the DDS Data Set can be tested using the Run -> Browse option.

The DDS Data Set can be executed from either a Pega Activity or a Pega Data Flow. Because the DDS Data Set is not a stream Data Set, the Pega Platform uses a Batch Data Flow run to execute a Data Flow that has the DDS Data Set configured as its input.


The Cassandra Query Language (CQL) script executed in the Java step from the Activity

Apache Cassandra supports CQL queries that browse and update data directly through query scripts. Because the Pega Connect-SQL rule cannot execute a non-SQL query script, the CQL script execution can instead be embedded in Java code and run through the Java driver for Apache Cassandra.

https://github.com/datastax/java-driver/tree/4.x/manual/query_builder

The Java code can be placed in a Java step of a Pega Activity, so that running the Activity executes the CQL script against Cassandra. Although feasible, this technique may not be the best choice, because the CQL queries and the custom Java steps must be maintained in the Pega Platform. For a complex CQL query over large Apache Cassandra tables, however, it may be a good option to consider for performance and query-efficiency reasons.
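The statement-preparation part of such a Java step might look like the following sketch. It builds a CQL `SELECT` with bind markers and keeps the values separate, so that values are bound by the driver instead of being concatenated into the query text. The class `CqlSelectSketch`, the keyspace, the table, and the column names are hypothetical. The driver call appears only in a comment, because it needs the DataStax Java driver 4.x on the classpath and a reachable cluster, and the way the driver is made available inside a Pega Activity is not covered by the source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for a CQL string with bind markers and its values. */
final class CqlSelectSketch {
    final String cql;
    final List<Object> values;

    private CqlSelectSketch(String cql, List<Object> values) {
        this.cql = cql;
        this.values = values;
    }

    /**
     * Builds "SELECT * FROM keyspace.table WHERE col = ? AND ...".
     * Identifiers are concatenated, so they must come from trusted
     * configuration and never from user input. Pass an ordered map
     * (for example LinkedHashMap) so markers and values line up.
     */
    static CqlSelectSketch selectByKey(String keyspace, String table,
                                       Map<String, Object> keyEquals) {
        StringBuilder cql = new StringBuilder("SELECT * FROM ")
                .append(keyspace).append('.').append(table);
        List<Object> values = new ArrayList<>();
        String separator = " WHERE ";
        for (Map.Entry<String, Object> restriction : keyEquals.entrySet()) {
            cql.append(separator).append(restriction.getKey()).append(" = ?");
            values.add(restriction.getValue());
            separator = " AND ";
        }
        return new CqlSelectSketch(cql.toString(), values);
    }

    public static void main(String[] args) {
        Map<String, Object> key = new LinkedHashMap<>();
        key.put("customer_id", 42);            // placeholder column and value
        CqlSelectSketch query = selectByKey("sales", "orders", key);
        System.out.println(query.cql + " " + query.values);

        // With the DataStax Java driver 4.x available, the query could then
        // be executed roughly as follows (not compiled here):
        //   try (CqlSession session = CqlSession.builder().build()) {
        //       ResultSet rows = session.execute(
        //           SimpleStatement.newInstance(query.cql, query.values.toArray()));
        //   }
    }
}
```

Keeping the CQL text in one small helper like this is one way to limit the maintenance cost noted above, since the Java step then contains only the call and the result handling.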