Bulk API

Bulk API – Salesforce: Having trouble processing large data sets?

When it comes to making customers happy, you need to make sure the data in your system is up to date, comprehensive, and easy to access. Correct data helps you predict customers' needs and take appropriate action to maintain your relationship with them.

Salesforce has a great solution for manipulating huge amounts of data: the Bulk API.

Why is the Bulk API the right solution?

Now that we know the solution, it is straightforward: if we are dealing with large data loads, the Bulk API is the best-suited option.

Nevertheless, this is not the only reason to use the Bulk API. As we know, a bulk load is a time-consuming activity because automation such as workflows, Process Builder processes, and triggers gets executed for each record.

To optimize the performance of such data loads, the Bulk API is recommended.

How does it work?

Let's address this use case using the following steps:

I am using JSON format to send the data and receive the output. Please handle exceptions as per your requirements.
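
For example, each callout in the steps below can be wrapped in a generic try/catch block. This is only an illustrative pattern, not part of the original flow; the status-code check assumes the creation callouts, which return 201 on success.

Http http = new Http();
HttpRequest request = new HttpRequest();
// ...configure the endpoint, method, headers, and body as shown in the steps below...
try {
    HttpResponse response = http.send(request);
    if (response.getStatusCode() != 201) { // the job and batch creation callouts return 201 on success
        System.debug(LoggingLevel.ERROR, 'Bulk API callout failed: ' + response.getBody());
    }
} catch (System.CalloutException e) {
    System.debug(LoggingLevel.ERROR, 'Callout error: ' + e.getMessage());
}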

Create a Job

Create a job using the user's session ID and the URL of the Salesforce org. This is the initial callout. We need to build a JSON string that tells the system which object the job is for and which operation should be performed on the data. The job is created by posting to the '/services/async/46.0/job' URL.

I have used JSONGenerator to create the string:

JSONGenerator generator = JSON.createGenerator(true);
generator.writeStartObject();
generator.writeStringField('operation', 'hardDelete'); // Hard delete permission is required
generator.writeStringField('object', sobjectStr);
generator.writeStringField('contentType', 'JSON');
generator.writeEndObject();

Http http = new Http();
HttpRequest request = new HttpRequest();
request.setEndpoint(URL.getSalesforceBaseUrl().toExternalForm() + '/services/async/46.0/job');
request.setMethod('POST');
request.setHeader('Content-Type', 'application/json; charset=UTF-8');
request.setHeader('Accept', 'application/json');
request.setHeader('X-SFDC-Session', UserInfo.getSessionId());
request.setBody(generator.getAsString());
HttpResponse response = http.send(request);

String jobId = '';
if (response.getStatusCode() == 201) {
    Map<String, Object> results = (Map<String, Object>) JSON.deserializeUntyped(response.getBody());
    jobId = String.valueOf(results.get('id')); // Job id required for the next callout
}

Send the data

Send the data to the job created in step 1 as a batch. Store the job ID from step 1 in a string (jobId below). I am using JSON to send the data. The batch is added by posting to the '/services/async/46.0/job/' + jobId + '/batch' URL.

JSONGenerator gen = JSON.createGenerator(true);
gen.writeStartArray(); // the batch body is a top-level JSON array of records
for (SObject obj : lstrecords) {
    gen.writeStartObject();
    gen.writeStringField('id', obj.Id);
    gen.writeEndObject();
}
gen.writeEndArray();
String jsonString = gen.getAsString().trim(); // trim() returns a new string, so assign it back

Http http = new Http();
HttpRequest request = new HttpRequest();
request.setEndpoint(URL.getSalesforceBaseUrl().toExternalForm() + '/services/async/46.0/job/' + jobId + '/batch');
setHeadersForCallout(http, request); // helper that sets the POST method and the same headers as in step 1
request.setBody(jsonString);
HttpResponse response = http.send(request);

String jobIdfromDelete = '';
if (response.getStatusCode() == 201) {
    Map<String, Object> results = (Map<String, Object>) JSON.deserializeUntyped(response.getBody());
    jobIdfromDelete = String.valueOf(results.get('jobId')); // Store this job id to close the job
}

Close the Job

Using the job ID, close the Bulk API job. The job is closed by posting to the '/services/async/46.0/job/' + jobIdfromDelete URL.

Http httpClose = new Http();
HttpRequest requestClose = new HttpRequest();
requestClose.setEndpoint(URL.getSalesforceBaseUrl().toExternalForm() + '/services/async/46.0/job/' + jobIdfromDelete);
setHeadersForCallout(httpClose, requestClose); // same POST method and headers as before
requestClose.setBody('{"state" : "Closed"}'); // 'Closed' lets queued batches finish; no new batches can be added
HttpResponse responseClose = httpClose.send(requestClose);

Monitor the status of the job:
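
You can monitor the job from Setup (Bulk Data Load Jobs) or poll it through the API. Below is a minimal sketch of such a polling callout, assuming the jobIdfromDelete variable from the previous steps; a GET on the job URL returns the job info, including its state and record counts.

Http httpStatus = new Http();
HttpRequest requestStatus = new HttpRequest();
requestStatus.setEndpoint(URL.getSalesforceBaseUrl().toExternalForm() + '/services/async/46.0/job/' + jobIdfromDelete);
requestStatus.setMethod('GET');
requestStatus.setHeader('X-SFDC-Session', UserInfo.getSessionId());
HttpResponse responseStatus = httpStatus.send(requestStatus);
if (responseStatus.getStatusCode() == 200) {
    Map<String, Object> jobInfo = (Map<String, Object>) JSON.deserializeUntyped(responseStatus.getBody());
    System.debug('Job state: ' + jobInfo.get('state')); // e.g. Open, Closed, Aborted
    System.debug('Records processed: ' + jobInfo.get('numberRecordsProcessed'));
}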

Limits and considerations of the Bulk API:

  1. Submit up to 10,000 batches per rolling 24-hour period.

  2. A batch can contain a maximum of 10,000 records.

  3. A batch can accept a single CSV, XML, or JSON file of up to 10 MB.

  4. The results of a closed/submitted batch can be retrieved for 7 days; after that, they are removed from the Salesforce org.

  5. Batches are processed in chunks. The API version in the Bulk API URL decides the chunk size: for API version 20.0 and earlier, the chunk size is 100 records; for later versions, it is 200 records.

  6. The Bulk API has an additional 2-minute limit to process a query.

  7. The main disadvantage is that results are not available immediately when the task completes. Monitoring or fetching the outcome requires either checking in Setup or implementing code snippets that run at specified intervals.

References:

  1. https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/asynch_api_concepts_limits.htm

  2. https://developer.salesforce.com/page/The_Salesforce_Bulk_API_-_Maximizing_Parallelism_and_Throughput_Performance_When_Integrating_or_Loading_Large_Data_Volumes

Sukanya Banekar