How to get started with Salesforce Data Cloud API: Step-by-step Guide

Article Written By: Varalatchumi Veerasamy

Are you looking to integrate external data into Salesforce? The Salesforce Data Cloud API, specifically the Ingestion API, serves as a high-performance gateway for programmatically bringing your external data into the Data Cloud. This step-by-step guide covers how to architect your data, configure secure access, and push records into the system.

To get started with the Salesforce Data Cloud API, you need to follow these core steps:

1. Set up an Ingestion API connector using a YAML schema file.

2. Create a Data Stream and map your data objects to specific categories.

3. Generate a digital certificate and configure a Salesforce Connected App with the correct OAuth scope.

4. Request a Salesforce access token using a JWT and exchange it for a tenant-specific Data Cloud token.

5. Send your data using either the Streaming API (JSON) or the Bulk API (CSV).

5 Essential Steps to Master the Salesforce Data Cloud Ingestion API

Step 1: Set Up the Schema and Connector

Before sending data, you must define its structure for Data Cloud.

  • Create a Schema File: Create a schema file in YAML format (.yml). This file outlines your event types as objects and defines their specific fields and data types (e.g., strings or dates); a minimal example follows this list.
  • Create the Connector: Navigate to Data Cloud Setup, create a new Ingestion API connector, and upload your YAML schema file. Data Cloud will use this to establish the objects and fields available for integration.
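
For illustration, Ingestion API schema files follow the OpenAPI 3.x YAML layout, with each entry under components/schemas becoming an ingestible object. Here is a minimal sketch; the object and field names (runner_profile, maid, and so on) are hypothetical placeholders:

```yaml
openapi: 3.0.3
components:
  schemas:
    runner_profile:           # each schema entry becomes an ingestible object
      type: object
      properties:
        maid:
          type: integer       # unique id, a natural primary-key candidate
        first_name:
          type: string
        email:
          type: string
        created_date:
          type: string
          format: date-time   # needed for Engagement-category objects (Step 2)
```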

Step 2: Create a Data Stream

Once your connector is established, you must configure how the data flows into the system and expose the API for external consumption.

  • Go to the Data Channels tab and create a new stream associated with your Ingestion API connector.
  • Select the objects you want to ingest and assign each to a Category: Profile (for consumer identity data like accounts, emails, or phone numbers), Engagement (for time-series data such as interactions and events), or Other (for data that doesn't fit the previous two, like product inventory).
  • Assign a primary key, and if using Engagement objects, specify a date/time field. Deploy the stream. Once deployed, Data Cloud automatically generates a corresponding Data Lake Object to house the incoming API data.

Step 3: Configure Security and Authentication

The Data Cloud API has stricter authentication requirements than other REST-based Salesforce APIs.

  • Generate a Certificate: Use OpenSSL to create a 2048-bit RSA private key and a self-signed digital certificate (see the example commands after this list).
  • Create a Connected App: In the Salesforce App Manager, create a new Connected App and enable OAuth settings. Provide the callback URL http://localhost:1717/OauthRedirect and upload the self-signed certificate you created.
  • Assign OAuth Scopes: Ensure your app has the required scopes for your use case, such as cdp_ingest_api (to manage data ingestion), cdp_profile_api, cdp_query_api, api, and refresh_token (to enable offline access and token refreshing). Copy your Consumer Key (Client ID) for the next step.
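
The exact OpenSSL invocations are up to you, but a typical sketch of the certificate step looks like this; the file names and subject field are placeholders:

```bash
# Generate a 2048-bit RSA private key (placeholder file name).
openssl genrsa -out salesforce.key 2048

# Create a self-signed certificate from that key, valid for one year;
# the subject is illustrative only.
openssl req -new -x509 -key salesforce.key -out salesforce.crt -days 365 \
  -subj "/CN=datacloud-ingest"
```

Upload the resulting salesforce.crt to the Connected App and keep salesforce.key private for the next step.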

Step 4: Request and Exchange Access Tokens

Data Cloud utilizes a two-step token exchange process.

  • Get a Salesforce Token: Encode a JSON Web Token (JWT) using your client ID and private key, then issue a POST request to the Salesforce login endpoint to acquire a standard Salesforce access token.
  • Get a Data Cloud Token: Issue a subsequent POST request to the /services/a360/token endpoint, using the Salesforce token as the subject_token. The response will provide your Data Cloud access token and a system-generated, tenant-specific instance URL (which uses the c360a.salesforce.com structure). Include this token as a bearer header in your subsequent API calls.
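
Here is a minimal Python sketch of the full two-step flow, assuming the PyJWT and requests libraries. The consumer key, username, and key file are placeholders, so treat this as illustrative rather than production-ready code:

```python
import time

import jwt        # PyJWT (pip install pyjwt cryptography)
import requests

CONSUMER_KEY = "YOUR_CONNECTED_APP_CONSUMER_KEY"   # placeholder
USERNAME = "integration.user@example.com"          # placeholder
LOGIN_URL = "https://login.salesforce.com"

with open("salesforce.key") as f:                  # private key from Step 3
    private_key = f.read()

# Step 4a: sign a short-lived JWT and exchange it for a Salesforce token.
claims = {
    "iss": CONSUMER_KEY,
    "sub": USERNAME,
    "aud": LOGIN_URL,
    "exp": int(time.time()) + 300,
}
assertion = jwt.encode(claims, private_key, algorithm="RS256")

sf = requests.post(
    f"{LOGIN_URL}/services/oauth2/token",
    data={
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": assertion,
    },
)
sf.raise_for_status()
sf_token = sf.json()

# Step 4b: exchange the Salesforce token for the tenant-specific
# Data Cloud token via the /services/a360/token endpoint.
dc = requests.post(
    f"{sf_token['instance_url']}/services/a360/token",
    data={
        "grant_type": "urn:salesforce:grant-type:external:cdp",
        "subject_token": sf_token["access_token"],
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
    },
)
dc.raise_for_status()
dc_token = dc.json()  # access_token + instance_url (a c360a.salesforce.com host)
```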

Step 5: Choose Your Ingestion Pattern and Push Data

You can interact with the Ingestion API using two distinct patterns, depending on your data volume and frequency:

  • Streaming Ingestion: Ideal for live events or micro-batches. It accepts JSON payloads up to 200 KB per request. Data sent via streaming is processed asynchronously in near real time, in micro-batches that run approximately every 3 to 15 minutes.
  • Bulk Ingestion: Best for large legacy data migrations or scheduled batch jobs. It uses CSV files with a maximum size of 150 MB. This pattern requires multiple API calls: creating a job, uploading up to 100 files to the open job, and then closing the job to trigger background processing. Both patterns are sketched in the example below.
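
To make the two patterns concrete, here is a hedged Python sketch of both calls using the requests library. The connector name (Runner_Ingest), object name (runner_profile), and file names are placeholders, and dc_token comes from the Step 4 sketch:

```python
import requests

dc_url = f"https://{dc_token['instance_url']}"
headers = {"Authorization": f"Bearer {dc_token['access_token']}"}

# Streaming pattern: small JSON micro-batches, max 200 KB per request.
payload = {"data": [{"maid": 42, "first_name": "Ada", "email": "ada@example.com"}]}
requests.post(
    f"{dc_url}/api/v1/ingest/sources/Runner_Ingest/runner_profile",
    json=payload, headers=headers,
).raise_for_status()

# Bulk pattern: create a job, upload CSV batches, then close the job.
job = requests.post(
    f"{dc_url}/api/v1/ingest/jobs", headers=headers,
    json={"object": "runner_profile", "sourceName": "Runner_Ingest",
          "operation": "upsert"},
).json()

with open("profiles.csv", "rb") as f:              # up to 150 MB per file
    requests.put(
        f"{dc_url}/api/v1/ingest/jobs/{job['id']}/batches",
        data=f, headers={**headers, "Content-Type": "text/csv"},
    ).raise_for_status()

# Setting the state to UploadComplete closes the job and starts
# background processing of the uploaded files.
requests.patch(
    f"{dc_url}/api/v1/ingest/jobs/{job['id']}",
    json={"state": "UploadComplete"}, headers=headers,
).raise_for_status()
```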

Pro Tip: Accelerate Testing with Postman: To save time during development, download the official Salesforce Data Cloud APIs collection from the Postman API Network. This collection handles the complex authentication steps for you: its pre-request scripts automatically encode the JWT, retrieve the bearer token, and refresh it when it expires. You can also use the API's synchronous record validation endpoint in Postman to quickly verify that your JSON payloads are correctly formatted before committing data to the database.

Frequently Asked Questions (FAQs)

1. What is the difference between Streaming and Bulk Ingestion in the Data Cloud API?

The Salesforce Data Cloud Ingestion API supports two patterns based on your data needs. Streaming Ingestion is designed for near real-time, micro-batch data using JSON payloads up to 200 KB per request, which Data Cloud processes asynchronously every 3 to 15 minutes. Bulk Ingestion is ideal for extensive datasets, scheduled batches, or legacy data migrations; it accepts CSV files up to 150 MB and processes them in the background once a job is closed.

2. What format must I use for my Data Cloud schema file?

Your schema file must be written in YAML format (.yml). This file defines the exact structure of the external data you want to ingest, detailing the event types as objects along with their specific fields and data types (such as strings or dates).

3. How should I categorize my objects when creating a Data Stream?

When mapping your objects in the Data Stream setup, you must assign each to one of three categories. Use Profile for consumer identity data (such as phone numbers, email addresses, or account IDs). Use Engagement for time-series and event-driven data (like customer engagements or performance readings), which requires specifying a primary time field. Use Other for data that doesn't fit the first two categories, such as product inventory.

4. What OAuth scopes are required to authenticate the Ingestion API?

When setting up your Salesforce Connected App, you must grant specific OAuth scopes to enable the JWT bearer flow. At a minimum, include cdp_ingest_api (to manage Data Cloud ingestion), api (to access Salesforce APIs for token exchange), and refresh_token or offline_access (to allow for token refreshes). If you are also querying data, you may need cdp_query_api and cdp_profile_api.

5. How can I test my Ingestion API payloads before saving them to the database?

During development, you can use a synchronous record validation endpoint to quickly verify that your JSON payloads are correctly formatted against your schema. This validation endpoint acts much like a test class; it checks the payload for errors (e.g., passing a string when a number is required) without committing any data to your Data Lake Object.
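
As a rough sketch in Python (the /actions/test path is an assumption drawn from the public API reference and may vary by version, and the connector and object names are placeholders; dc_url and headers are as in the token-exchange sketch above):

```python
import requests

# Post a deliberately malformed record to the validation action; errors
# are reported in the response without persisting anything to Data Cloud.
resp = requests.post(
    f"{dc_url}/api/v1/ingest/sources/Runner_Ingest/runner_profile/actions/test",
    json={"data": [{"maid": "not-a-number"}]},  # wrong type on purpose
    headers=headers,
)
print(resp.status_code, resp.text)
```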

Conclusion: Accelerating Your Integration with Minuscule Technologies

Integrating external data into Salesforce is no longer a complex hurdle. By learning the core mechanics of the Data Cloud Ingestion API, from YAML schema definition to the two-step JWT authentication flow, you can unify your entire data ecosystem. Whether you are pushing real-time events or migrating massive legacy datasets, the right configuration ensures your data arrives accurately, securely, and ready for action.

However, building a high-performance data pipeline requires more than just following steps; it requires an architectural vision that prevents data silos and guarantees scalability. Proper mapping of Data Lake Objects and optimized token management are critical to sustaining system health and security.

At Minuscule Technologies, we specialize in engineering the "connective tissue" between your external platforms and Salesforce Data Cloud. We don't just set up APIs; we architect robust ingestion strategies that guarantee your data is clean, unified, and ready to power your AI and analytics. Our mission is to convert your disjointed data into a streamlined engine for business growth.

Ready to turn your external data into real-time intelligence?

Don't let authentication hurdles or schema complexities stall your data strategy. Partner with Minuscule Technologies to architect a high-velocity Salesforce Data Cloud API integration that delivers results. Connect with our Data Integration Experts today to start your roadmap.

Contact Us for Free Consultation