Async SOQL Guide (Pilot) Salesforce, Winter 16 @salesforcedocs Last updated: December 8, 2015

© Copyright 2000–2015 salesforce.com, inc. All rights reserved. Salesforce is a registered trademark of salesforce.com, inc., as are other names and marks. Other marks appearing herein may be trademarks of their respective owners.

CONTENTS

Overview

Running Async SOQL Queries

Async SOQL Use Cases

Supported SOQL Commands

OVERVIEW

EDITIONS

Available in: Salesforce Classic

Available in:

• Enterprise

• Performance

• Unlimited

• Developer

Note: This feature is available to select customers through a pilot program. To be nominated to join this pilot program, contact salesforce.com.

With the advent of BigObjects, it’s possible to keep billions of records on the platform. Salesforce has made new platform services available to help you work at this new scale, by combining BigObject data with your core business data.

Async SOQL is a method for running SOQL queries in the background over Salesforce entity data, including sObjects, BigObjects, and external objects (accessed via Lightning Connect). It provides a convenient way to query large amounts of data stored in Salesforce.

Async SOQL is implemented in the form of a RESTful API that enables you to run queries in the familiar syntax of the SOQL language. You can run multiple queries in the background, and monitor their completion status.

The results of each query are deposited into an object you specify, which can be an sObject or BigObject. As a result, you can subset, join, and create more complex queries that are not subject to timeout limits. This is ideal when you have millions or billions of records, and need more performant processing than is possible using synchronous SOQL.

In this initial pilot, the focus is on allowing customers to try out and evaluate this new functionality. The focus is not on scale or parallel capabilities, so you’re limited to three Async SOQL queries at a time.

Async SOQL Versus Data Pipelines

Async SOQL is related to Data Pipelines, but is focused on object-level segmentation, aggregation, and filtering, and does not allow you to interact with Salesforce files. The figure below compares the two methods.

To use Async SOQL effectively, it’s helpful to understand its key components and other related concepts. Why would you use an asynchronous SOQL query instead of standard SOQL? The following table lists the key decision criteria across the broadest array of use cases.

Reasons For | Reasons Against
Reliability | Immediacy of Result
Resilience | User Experience
Scale | < 50M Scale

Here’s a simple example: Reduce your billion-row dataset to a manageable size, so you can do something with it.

Because you can’t make aggregate queries with BigObjects using synchronous SOQL, a good strategy is to instead create a custom object, Aggregate_Queries__c, that stores aggregate data for your billion-row BigObject.

For instance, you would create an object with the following fields.

• Count__c

• Average__c

• Time__c

• Type__c (which represents each individual query or source BigObject)
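As a rough sketch of this strategy, the Python below assembles an Async SOQL request body that summarizes a BigObject into Aggregate_Queries__c. The source object Sensor_Reading__b, its fields Reading__c and Type__c, and the aliases c, a, and t are hypothetical names chosen for illustration; only Count__c, Average__c, and Type__c come from the field list above. The request body is built and printed, not sent.

```python
import json


def build_aggregate_request(source_object, value_field, group_field):
    """Return an Async SOQL request body (a dict) that summarizes a BigObject."""
    # Aliases (c, a, t) become the keys of targetFieldMap.
    query = (
        f"SELECT COUNT(Id) c, AVG({value_field}) a, {group_field} t "
        f"FROM {source_object} GROUP BY {group_field}"
    )
    return {
        "query": query,
        "targetObject": "Aggregate_Queries__c",
        "targetFieldMap": {"c": "Count__c", "a": "Average__c", "t": "Type__c"},
    }


# Hypothetical source BigObject and field names.
body = build_aggregate_request("Sensor_Reading__b", "Reading__c", "Type__c")
print(json.dumps(body, indent=2))
```

Time__c is left unmapped in this sketch because the guide does not say which source value it should hold.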

Here’s another simple example: Reduce your billion-row dataset and join it with other BigObject or sObject data to create a new dataset.

Here’s a more complex example: Run queries over multiple objects and multiple levels in order to best define individual downstream use cases. For example, you can reduce ApiEvents (a standard BigObject, currently in pilot) to extract fields out of the SOQL query and combine them with FieldHistoryArchive (a standard BigObject), to determine the state of the data when it was queried by the API.

RUNNING ASYNC SOQL QUERIES

The Force.com REST API provides a powerful, convenient, and simple Web services API for interacting with Force.com. REST API allows for ease of integration and development, and is an excellent choice for use with mobile applications and Web 2.0 projects.

Formulating your Async SOQL Query

Using Async SOQL, each query is formulated in the POST request as a JSON encoded list of three key-value pairs.

Parameter | Description
query | This specifies the parameters for the SOQL query you want to execute. Note: In this pilot, we only support a limited subset of SOQL commands. See the last section for details.
targetObject | This is an sObject or BigObject into which the results of the query are delivered.
targetFieldMap | This defines how the fields in the source object map to the fields in the target object in which the results are written.

Here’s an example of a simple Async SOQL command. It queries a source custom object, SourceObject__c, and directs the result to another custom object, TargetObject__c. You can easily map your fields by linking the fields in the source object you’re querying to the fields of the target object where you want the results to be written.

Example URI

https://instance_name—api.salesforce.com/services/data/v35.0/async-queries/

Example request body

{
  "query": "SELECT firstField__c, secondField__c FROM SourceObject__c",
  "targetObject": "TargetObject__c",
  "targetFieldMap": {
    "firstField__c": "firstFieldTarget__c",
    "secondField__c": "secondFieldTarget__c"
  }
}

Example response body

{
  "jobId": "08PD000000003kiT",
  "query": "SELECT firstField__c, secondField__c FROM SourceObject__c",
  "status": "Complete",
  "targetObject": "TargetObject__c",
  "targetFieldMap": {
    "firstField__c": "firstFieldTarget__c",
    "secondField__c": "secondFieldTarget__c"
  }
}
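The request above could be assembled in Python roughly as follows. This is a minimal sketch using only the standard library: the instance URL and access token are placeholders, the function name is illustrative, and the Bearer authorization header mirrors the curl example later in this guide. The request is built but not sent.

```python
import json
import urllib.request

ASYNC_QUERIES_PATH = "/services/data/v35.0/async-queries/"


def build_async_query_request(instance_url, access_token, query,
                              target_object, target_field_map):
    """Build (but do not send) the POST request that submits an Async SOQL job."""
    body = {
        "query": query,
        "targetObject": target_object,
        "targetFieldMap": target_field_map,
    }
    return urllib.request.Request(
        url=instance_url.rstrip("/") + ASYNC_QUERIES_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + access_token,
        },
        method="POST",
    )


request = build_async_query_request(
    instance_url="https://na1.salesforce.com",  # placeholder instance
    access_token="YOUR_ACCESS_TOKEN",           # placeholder token
    query="SELECT firstField__c, secondField__c FROM SourceObject__c",
    target_object="TargetObject__c",
    target_field_map={
        "firstField__c": "firstFieldTarget__c",
        "secondField__c": "secondFieldTarget__c",
    },
)

# Submitting the job would look like this (requires a real org and token):
# with urllib.request.urlopen(request) as response:
#     job = json.loads(response.read())
#     print(job["jobId"], job["status"])
```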

The response of an Async SOQL query displays the SOQL command, details of the field mapping, and the target object. It also displays the query status, which can have one of three values: Running, Complete, or Failed.

Note: When defining the targetFieldMap parameter, make sure the field type mappings are consistent. For example, you can’t map a compound field to a standard text field, or a number field to a decimal field, and vice versa. If the source and target fields don’t match, these considerations apply.

• Any source field can be mapped onto a target text field.

• If the source and target fields are both numerical, the target field must be of equal or greater precision than the source field. If not, the request will fail. This is to ensure no data is lost in the conversion.
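The two considerations above can be expressed as a small pre-flight check, run before a request is submitted. This is only a sketch: the FieldType class, its kind labels, and the single precision number are simplifications invented for illustration, not part of the Async SOQL API, and the real service may apply further rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldType:
    kind: str           # illustrative labels such as "text", "number", "datetime"
    precision: int = 0  # total digits; only meaningful when kind == "number"


def mapping_is_safe(source, target):
    """Apply the mapping considerations from the note above."""
    if target.kind == "text":
        # Any source field can be mapped onto a target text field.
        return True
    if source.kind == "number" and target.kind == "number":
        # The target needs equal or greater precision so no data is lost.
        return target.precision >= source.precision
    # Otherwise, assume the types must simply match.
    return source.kind == target.kind


print(mapping_is_safe(FieldType("number", 18), FieldType("number", 10)))
```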

Tracking the Status of Your Query

Here’s how the above command would look executed in Workbench.

To track the status of any Async SOQL command, specify its jobId in a GET request. The jobId is the first field in the response.
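A polling loop for this status check might look like the sketch below. Two things here are assumptions rather than statements from this guide: that the status resource lives at the async-queries URI with the job ID appended, and that the caller supplies a fetch_status function that performs the authenticated GET and returns the "status" value from the response. The loop stops as soon as the status is no longer Running.

```python
import time


def job_status_url(instance_url, job_id, api_version="v35.0"):
    """Assumed status URI: the async-queries resource followed by the job ID."""
    base = instance_url.rstrip("/")
    return f"{base}/services/data/{api_version}/async-queries/{job_id}"


def wait_for_job(fetch_status, job_id, interval_seconds=30.0, max_polls=20,
                 sleep=time.sleep):
    """Poll until the job leaves the Running status or max_polls is reached.

    fetch_status is a caller-supplied function: job_id -> status string.
    Returns the last status seen.
    """
    status = "Running"
    for attempt in range(max_polls):
        status = fetch_status(job_id)
        if status != "Running":
            break
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    return status


# Demonstration with canned statuses instead of real HTTP calls.
canned = iter(["Running", "Running", "Complete"])
final = wait_for_job(lambda job_id: next(canned), "08PD000000003kiT",
                     sleep=lambda seconds: None)
print(final)
```

Passing fetch_status and sleep as parameters keeps the loop independent of any particular HTTP client and makes it easy to exercise without a live org.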

Alternatively, you can obtain a more detailed perspective of your Async SOQL query by specifying the jobId after the BackgroundOperation entity.

Note: All Async SOQL jobs run non-deterministically, so you can schedule when they start, but not when they finish.

States of an Async SOQL Query

Your Async SOQL job goes through several state transitions that allow you to track its progress.

State | Details
New | A new job has been created and scheduled, but is not yet running.
Scheduled > Running | The job is running successfully and the org hasn’t exceeded any limits.
Scheduled > Failed | The job failed because the org exceeded the Async SOQL limits.
Running > Failed | The job failed after the system successfully submitted it.
Running > Success | The job was successfully completed.
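One way to reason about these transitions in client code is a small lookup table. The sketch below encodes the transitions listed above; the step from New to Scheduled is an interpretation of the first row rather than something the table states explicitly, and the function name is illustrative.

```python
# Allowed next states for each state, following the table above.
ALLOWED_TRANSITIONS = {
    "New": {"Scheduled"},  # assumed: a new job becomes scheduled
    "Scheduled": {"Running", "Failed"},
    "Running": {"Success", "Failed"},
    "Success": set(),      # terminal
    "Failed": set(),       # terminal
}


def is_valid_history(states):
    """Return True if each consecutive pair of states is an allowed transition."""
    return all(
        nxt in ALLOWED_TRANSITIONS.get(cur, set())
        for cur, nxt in zip(states, states[1:])
    )


print(is_valid_history(["New", "Scheduled", "Running", "Success"]))
```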

ASYNC SOQL USE CASES

Several key use cases for BigObjects have already been implemented successfully by pilot customers. This section provides details of some common use cases.

Data Archiving

Data archiving allows companies to regularly archive snapshot copies of their CRM data in BigObjects for regulatory compliance while keeping the original data in Salesforce.

To archive records for long-term retention in a BigObject, Async SOQL can easily move any number of records from an sObject into a BigObject. The data archive process can be achieved via the following programmatic flow.

1. Define source sObject records.

2. Define target BigObject(s).

3. Define sObject to BigObject field mappings.

4. Use Async SOQL to copy records from sObject to BigObject storage.

5. Conceive and orchestrate the delete process driven by the parent IDs via APIs.
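Steps 1 through 4 can be sketched as a helper that derives both the SELECT list and the targetFieldMap from a single mapping, so the two cannot drift apart. The function name is hypothetical; the object and field names are the ones used in the closed-case example in this section. Step 5, the delete process, is not covered here.

```python
def build_archive_request(source_object, target_object, field_map, where=None):
    """Build an Async SOQL request body that copies sObject rows to a BigObject.

    field_map maps source field names to target field names; its keys
    double as the SELECT list.
    """
    query = f"SELECT {', '.join(field_map)} FROM {source_object}"
    if where:
        query += f" WHERE {where}"
    return {
        "query": query,
        "targetObject": target_object,
        "targetFieldMap": dict(field_map),
    }


archive_body = build_archive_request(
    source_object="Case",                     # step 1: source sObject
    target_object="ArchivedClosedCases__b",   # step 2: target BigObject
    field_map={                               # step 3: field mappings
        "id": "originalCaseId__c",
        "CaseNumber": "caseNumber__c",
        "ClosedDate": "closedDate__c",
    },
    where="Status = 'Closed'",
)
print(archive_body["query"])
```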

Create a new job by sending a POST request to the following URI. The request body identifies the type of object to be queried, the target object, and supporting mapping. This example archives all cases whose status is closed.

Example URI

https://instance_name—api.salesforce.com/services/data/v35.0/async-queries/

Example request body

{
  "query": "SELECT id, CaseNumber, ClosedDate, ContactId, CreatedById, Description FROM Case WHERE Status = 'Closed'",
  "targetObject": "ArchivedClosedCases__b",
  "targetFieldMap": {
    "id": "originalCaseId__c",
    "CaseNumber": "caseNumber__c",
    "ClosedDate": "closedDate__c",
    "ContactId": "contactId__c",
    "CreatedById": "createdById__c",
    "Description": "Description__c"
  }
}

Example response body

{
  "jobId": "08PB000000003NS",
  "query": "SELECT id, CaseNumber, ClosedDate, ContactId, CreatedById, Description FROM Case WHERE Status = 'Closed'",
  "status": "Complete",
  "targetFieldMap": {
    "id": "originalCaseId__c",
    "CaseNumber": "caseNumber__c",
    "ClosedDate": "closedDate__c",
    "ContactId": "contactId__c",
    "CreatedById": "createdById__c",
    "Description": "Description__c"
  },
  "targetObject": "ArchivedClosedCases__b"
}

Customer 360° and Filtering

In this use case, customers load a variety of customer engagement data from external sources into Salesforce BigObjects and then process the data to enrich customer profiles in Salesforce. The idea is to store customer transactions and interactions such as point-of-sale data, orders, and line items in BigObjects, then process and correlate that data with your core CRM data. This anchoring of customer transactions and interactions in core master data in CRM allows a richer 360-degree view that translates into an enhanced customer experience.

In the following example, we want to analyze the customer data stored in the Rider record of a car sharing service. The source BigObject, Rider_Record__b, has a lookup relationship with the Contact object, allowing for an enriched view of the contact’s car riding history.

Example URI

https://instance_name—api.salesforce.com/services/data/v35.0/async-queries/

Example request body

{
  "query": "SELECT End_Location_Lat__c, End_Location_Lon__c, End_Time__c, Start_Location_Lat__c, Start_Location_Lon__c, Start_Time__c, Uber_Type__c, Rider__r.FirstName, Rider__r.LastName, Rider__r.Email FROM Rider_Record__b WHERE Star_Rating__c = '5'",
  "targetObject": "Rider_Reduced__b",
  "targetFieldMap": {
    "End_Location_Lat__c": "End_Lat__c",
    "End_Location_Lon__c": "End_Long__c",
    "Start_Location_Lat__c": "Start_Lat__c",
    "Start_Location_Lon__c": "Start_Long__c",
    "End_Time__c": "End_Time__c",
    "Start_Time__c": "Start_Time__c",
    "Uber_Type__c": "Uber_Type__c",
    "Rider__r.FirstName": "First_Name__c",
    "Rider__r.LastName": "Last_Name__c",
    "Rider__r.Email": "Rider_Email__c"
  }
}

Example response body

{
  "jobId": "08PB000000000NA",
  "query": "SELECT End_Location_Lat__c, End_Location_Lon__c, End_Time__c, Start_Location_Lat__c, Start_Location_Lon__c, Start_Time__c, Uber_Type__c, Rider__r.FirstName, Rider__r.LastName, Rider__r.Email FROM Rider_Record__b WHERE Star_Rating__c = '5'",
  "status": "Complete",
  "targetFieldMap": {
    "End_Location_Lat__c": "End_Lat__c",
    "End_Location_Lon__c": "End_Long__c",
    "Start_Location_Lat__c": "Start_Lat__c",
    "Start_Location_Lon__c": "Start_Long__c",
    "End_Time__c": "End_Time__c",
    "Start_Time__c": "Start_Time__c",
    "Uber_Type__c": "Uber_Type__c",
    "Rider__r.FirstName": "First_Name__c",
    "Rider__r.LastName": "Last_Name__c",
    "Rider__r.Email": "Rider_Email__c"
  },
  "targetObject": "Rider_Reduced__b"
}

Note: The metadata files for creating the BigObjects in this example are included in this zip file. Deploy the zip file into your org, using Workbench or the Metadata API, to try out the example.

Field Audit Trail

Field Audit Trail lets you define a policy to retain archived field history data up to ten years, independent of field history tracking. This feature helps you comply with industry regulations related to audit capability and data retention.

You define a Field Audit Trail policy, using the HistoryRetentionPolicy object, for each object you want to archive. The field history data for that object is then moved from the History related list into the FieldHistoryArchive object at periodic intervals, specified by the policy. For more information, see: Field Audit Trail Implementation Guide.

You can use Async SOQL to query archived fields, stored in the FieldHistoryArchive object. You can use the WHERE clause to filter the query by specifying comparison expressions for the FieldHistoryType, ParentId, and CreatedDate fields, as long as you specify them in that order.
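Because the filter fields must appear in a fixed order, it can help to generate the WHERE clause rather than write it by hand. The helper below is a hypothetical sketch, not part of any Salesforce library: it accepts conditions in any order, rejects fields other than the three named above, and emits them in the required sequence joined with AND.

```python
REQUIRED_ORDER = ("FieldHistoryType", "ParentId", "CreatedDate")


def build_archive_where(conditions):
    """Build a WHERE clause body for FieldHistoryArchive.

    conditions maps a field name to its comparison text,
    for example {"FieldHistoryType": "= 'Account'"}.
    """
    unknown = set(conditions) - set(REQUIRED_ORDER)
    if unknown:
        raise ValueError(f"Unsupported filter fields: {sorted(unknown)}")
    # Emit the supplied conditions in the order the guide requires.
    parts = [
        f"{field} {conditions[field]}"
        for field in REQUIRED_ORDER
        if field in conditions
    ]
    return " AND ".join(parts)


where = build_archive_where({
    "CreatedDate": "> LAST_MONTH",
    "FieldHistoryType": "= 'Account'",
})
print(where)
```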

This example queries archived accounts created within the last month.

Example URI

https://instance_name—api.salesforce.com/services/data/v35.0/async-queries/

Example request body

{
  "query": "SELECT ParentId, FieldHistoryType, Field, Id, NewValue, OldValue FROM FieldHistoryArchive WHERE FieldHistoryType = 'Account' AND CreatedDate > LAST_MONTH",
  "targetObject": "ArchivedAccounts__b",
  "targetFieldMap": {
    "ParentId": "ParentId__c",
    "FieldHistoryType": "FieldHistoryType__c",
    "Field": "Field__c",
    "Id": "Id__c",
    "NewValue": "NewValue__c",
    "OldValue": "OldValue__c"
  }
}

Example response body

{
  "jobId": "07PB000000006PN",
  "query": "SELECT ParentId, FieldHistoryType, Field, Id, NewValue, OldValue FROM FieldHistoryArchive WHERE FieldHistoryType = 'Account' AND CreatedDate > LAST_MONTH",
  "status": "Complete",
  "targetFieldMap": {
    "ParentId": "ParentId__c",
    "FieldHistoryType": "FieldHistoryType__c",
    "Field": "Field__c",
    "Id": "Id__c",
    "NewValue": "NewValue__c",
    "OldValue": "OldValue__c"
  },
  "targetObject": "ArchivedAccounts__b"
}

Note: All number fields returned from a SOQL query of archived objects are in standard notation, not scientific notation, as in the number fields in the entity history of standard objects.

Event Monitoring

Login Forensics and Data Leakage Detection, both currently in pilot, enable you to track who is accessing confidential and sensitive data in your Salesforce org. You can view information about individual events or track trends in events to swiftly identify unusual behavior and safeguard your company’s data. This is particularly useful for compliance with regulatory and audit requirements.

Note: These features are available to select customers through a pilot program. To be nominated to join this pilot program, contact salesforce.com.

In the current pilot, you can monitor data accessed through API calls. This covers many common scenarios, as more than 50% of SOQL queries occur using the SOAP, REST, or Bulk APIs. Key information about each query, such as the Username, UserId, UserAgent, and SourceIP, is stored in the ApiEvent object. You can then run SOQL queries on this object to find out details of user activity in your organization.

For example, let's say you want to know everyone who viewed the contact record of your company’s CEO. The key to this query is the CEO’s contact record Id. Let’s say the CEO’s name is Jane Doe, and her Id is 003D000000QYVZ5. (You can also query this using SOQL: SELECT Id FROM Contact WHERE Name = 'Jane Doe'). You can use the following Async SOQL query to determine all users who saw her contact information, as well as when, how, and where they saw it.

Example URI

https://instance_name—api.salesforce.com/services/data/v35.0/async-queries/

Example request body

{
  "query": "SELECT Soql, SourceIp, Username, EventTime FROM ApiEvent WHERE RecordInfo Like '%003D000000QYVZ5%'",
  "targetObject": "QueryEvents__c",
  "targetFieldMap": {
    "Soql": "QueryString__c",
    "SourceIp": "IPAddress__c",
    "Username": "User__c",
    "EventTime": "EventTime__c",
    "UserAgent": "UserAgent__c"
  }
}

Example response body

{
  "jobId": "05PB000000001PQ",
  "query": "SELECT Soql, SourceIp, Username, EventTime FROM ApiEvent WHERE RecordInfo Like '%003D000000QYVZ5%'",
  "status": "Complete",
  "targetFieldMap": {
    "Soql": "QueryString__c",
    "SourceIp": "IPAddress__c",
    "Username": "User__c",
    "EventTime": "EventTime__c",
    "UserAgent": "UserAgent__c"
  },
  "targetObject": "QueryEvents__c"
}

Alternatively, if you need to ask this question on a repeated basis for audit purposes, you can automate the query using a curl script.

curl -H "Content-Type: application/json" -X POST \
  -d '{"query": "SELECT Soql, SourceIp, UserAgent, Username, EventTime FROM ApiEvent WHERE RecordInfo Like '\''%003D000000QYVZ5%'\''", "targetObject": "AQ__c", "targetFieldMap": {"Soql":"QueryString__c", "SourceIp":"IPAddress__c", "Username":"User__c", "EventTime":"EventTime__c", "UserAgent":"UserAgent__c"}}' \
  "https://na1.salesforce.com/services/data/v35.0/async-queries" \
  -H "Authorization: Bearer 00D30000000V88A!ARYAQCZOCeABy29c3dNxRVtv433znH15gLWhLOUv7DVu.uAGFhW9WMtGXCul6q.4xVQymfh4Cjxw4APbazT8bnIfxlRvUjDg"

Another event monitoring use case is to identify all users who accessed a sensitive field, such as Social Security Number or Email. For example, you can use the following Async SOQL query to determine the users who saw social security numbers, and the records in which those numbers were exposed.

Example URI

https://instance_name—api.salesforce.com/services/data/v35.0/async-queries/

Example request body

{
  "query": "SELECT Soql, Username, RecordIds, EventTime FROM ApiEvent WHERE Soql Like '%SSN__c%'",
  "targetObject": "QueryEvents__c",
  "targetFieldMap": {
    "Soql": "Text__c",
    "Username": "User__c",
    "EventTime": "EventTime__c",
    "RecordIds": "Records_Seen__c"
  }
}

Example response body

{
  "jobId": "08PB000000001RS",
  "query": "SELECT Soql, Username, RecordIds, EventTime FROM ApiEvent WHERE Soql Like '%SSN__c%'",
  "status": "Complete",
  "targetFieldMap": {
    "Soql": "Text__c",
    "Username": "User__c",
    "EventTime": "EventTime__c",
    "RecordIds": "Records_Seen__c"
  },
  "targetObject": "QueryEvents__c"
}

SUPPORTED SOQL COMMANDS

Async SOQL supports a subset of commands in the SOQL language. This includes the most common commands, relevant to key use cases. We plan to support additional SOQL commands in future releases.

Note: For details of any command, refer to the SOQL documentation.

WHERE Clause

• Comparison operators for TEXT and NUMBER fields only

=, !=, <, <=, >, >=

• Logical operators

AND, OR

Example

SELECT AnnualRevenue FROM Account WHERE NumberOfEmployees > 1000 AND ShippingState = 'CA'

Aggregate Functions

COUNT(field), AVG(), SUM()

Example

SELECT COUNT(Id) FROM FieldHistoryArchive

GROUP BY Clause

Example

SELECT COUNT(Id) count, CreatedById createdBy FROM FieldHistoryArchive GROUP BY CreatedById

Relationship Queries

Single-level child-to-parent relationships are supported using the dot notation. These can be used with the SELECT, WHERE, and GROUP BY clauses.

Example

SELECT Account.ShippingState s, COUNT(Id) c FROM Contact GROUP BY Account.ShippingState

Using Aliases with Aggregates

Examples

• {"query":"SELECT count(Id) c, EventTime t FROM LoginEvent group by EventTime","targetObject":"AQ__c","targetFieldMap":{"c":"Count__c","t":"EventTime__c"}}

• {"query":"SELECT count(Id), EventTime FROM LoginEvent group by EventTime","targetObject":"AQ__c","targetFieldMap":{"expr0":"Count__c","EventTime":"EventTime__c"}}

• {"query":"SELECT count(Id) c, firstField__c f FROM SourceObject__c","targetObject":"TargetObject__c","targetFieldMap":{"c":"countTarget__c","f":"secondFieldTarget__c"}}

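The alias examples in this section suggest a simple rule for choosing targetFieldMap keys: use the alias when one is given, use expr0 for an unaliased aggregate, and use the field name otherwise. The sketch below encodes that rule. Numbering further unaliased aggregates as expr1, expr2, and so on follows the usual SOQL convention and is an assumption here, since the examples only show expr0; the function name and the tuple layout are illustrative.

```python
def field_map_keys(select_items):
    """Return the targetFieldMap key for each SELECT item.

    select_items is a list of (expression, alias_or_None, is_aggregate) tuples.
    """
    keys = []
    unaliased_aggregates = 0
    for expression, alias, is_aggregate in select_items:
        if alias:
            keys.append(alias)
        elif is_aggregate:
            # Unaliased aggregates are assumed to be named expr0, expr1, ...
            keys.append(f"expr{unaliased_aggregates}")
            unaliased_aggregates += 1
        else:
            keys.append(expression)
    return keys


print(field_map_keys([("count(Id)", None, True), ("EventTime", None, False)]))
print(field_map_keys([("count(Id)", "c", True), ("EventTime", "t", False)]))
```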