---
template: "page.peb"
title: "Intelligent Document Processing Skills"
description: "Add Salesforce File skills to an iDialogue agent so it can semantically read ContentVersion documents, read exact text source, and create new file versions."
displayName: "Intelligent Document Processing Skills"
category: "skills"
contentType: "reference"
audience: "admin"
tags: "skills,idp,document-processing,files,salesforce,contentversion"
section: "skills"
seoTitle: "Intelligent Document Processing Skills"
seoDescription: "Learn how to add read_salesforce_file and update_salesforce_file skills to an iDialogue agent for Salesforce File document processing and source update workflows."
---

## Intelligent Document Processing Skills

## Overview
Use the Salesforce File skill set to let an iDialogue agent work with Salesforce `ContentVersion` records:

1. `create_salesforce_file` creates a brand-new text-like Salesforce File.
2. `read_salesforce_file` reads a file as semantic markdown or as exact text source, or chooses between the two in `auto` mode.
3. `update_salesforce_file` creates a new `ContentVersion` for the same `ContentDocument` when the agent must save full updated text source.
4. `attach_file` attaches generated artifacts or inline text to a Salesforce record as a File.

`read_salesforce_file` replaces the older process-then-read workflow for admin-facing agents. It creates or reuses the semantic `file.md` cache just in time.

## Who This Is For
Salesforce administrators, solution architects, and developers who configure agents to review contracts, invoices, case attachments, intake forms, web templates, exported HTML, and other Salesforce Files.

## Requirements
- Agent model must be GPT-5.4.
- Agent must have the File skills needed for the workflow, usually `read_salesforce_file` and optionally `create_salesforce_file` or `update_salesforce_file`.
- Read and update workflows must provide the target Salesforce File version as a `contentVersionId`.
- Background execution is recommended for longer document-processing runs or multi-step document workflows.

## Artifact Cache

| Artifact | Purpose |
|---|---|
| `file.md` | Semantic markdown cache for summaries, extraction, Q&A, and document review |
| `source.{ext}` | Exact decoded text source for text-like files such as HTML, CSS, JavaScript, JSON, XML, SVG, CSV, Markdown, templates, YAML, and plain text |
| `file.meta.json` | Sidecar metadata with ContentVersion fields, content type, artifact keys, checksums, charset, and the mode used |

Caches are scoped by the immutable `ContentVersionId`, so each new Salesforce File version gets its own artifact namespace.
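
As an illustration of version-scoped caching, the sketch below builds one key per artifact from a `ContentVersionId`. The `files/` prefix, the `artifact_keys` helper, and the sample Ids are assumptions made for this example, not the actual storage layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactKeys:
    semantic: str
    source: str
    meta: str


def artifact_keys(content_version_id: str, ext: str) -> ArtifactKeys:
    """Build hypothetical cache keys namespaced by ContentVersionId."""
    # Assumed prefix; the real key layout is not documented on this page.
    prefix = f"files/{content_version_id}"
    return ArtifactKeys(
        semantic=f"{prefix}/file.md",
        source=f"{prefix}/source.{ext}",
        meta=f"{prefix}/file.meta.json",
    )


# Two versions of the same document never share a namespace.
v1 = artifact_keys("068xx0000000001AAA", "html")  # placeholder Id
v2 = artifact_keys("068xx0000000002AAA", "html")  # placeholder Id
```

Because the Id of a version never changes, a key derived from it needs no invalidation logic: a new version simply resolves to new keys.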

## What Each Skill Does

### `create_salesforce_file`
Use `create_salesforce_file` when the agent needs to create a new text-like Salesforce File.
- Creates a brand-new `ContentDocument` and initial `ContentVersion`.
- Accepts the full source body, not a diff.
- Encodes the body server-side; agents do not perform base64 encoding.
- Writes `source.{ext}` and `file.meta.json`.

### `read_salesforce_file`
Use `read_salesforce_file` when the agent needs to inspect, quote, summarize, extract from, review, or modify a Salesforce File.
- `auto` chooses between `semantic` and `source` for the file and returns that one representation only.
- `semantic` creates or reuses `file.md`.
- `source` decodes exact `VersionData` for text-like files and writes `source.{ext}`.
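
The sketch below shows why `source` mode applies only to text-like files: the bytes are decoded as text, so binary formats are rejected up front. The extension allowlist is inferred from the formats this page lists, and `is_text_like` and `decode_source` are hypothetical helpers, not part of the skill.

```python
from pathlib import PurePosixPath

# Assumed allowlist based on the text-like formats named on this page;
# the skill's real detection rules may differ.
TEXT_LIKE_EXTENSIONS = {
    "html", "htm", "css", "js", "json", "xml", "svg",
    "csv", "md", "yaml", "yml", "txt",
}


def is_text_like(path_on_client: str) -> bool:
    """Return True when the file extension is in the assumed allowlist."""
    ext = PurePosixPath(path_on_client).suffix.lstrip(".").lower()
    return ext in TEXT_LIKE_EXTENSIONS


def decode_source(version_data: bytes, path_on_client: str, charset: str = "utf-8") -> str:
    """Decode raw file bytes to exact text, refusing non-text files."""
    if not is_text_like(path_on_client):
        raise ValueError(f"{path_on_client} is not text-like; use semantic mode instead")
    return version_data.decode(charset)


html = decode_source(b"<p>Hello</p>", "Welcome.HTML")
```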

### `update_salesforce_file`
Use `update_salesforce_file` when the agent needs to save the full updated file source back to Salesforce.
- Creates a new `ContentVersion`; it does not mutate the existing immutable version.
- Accepts the full updated text body, not a diff.
- Encodes the body server-side; agents do not perform base64 encoding.
- Supports `expectedSourceChecksum` so stale source updates can fail safely.
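
The checksum guard works like optimistic concurrency control. The following in-memory sketch illustrates the idea only: `FakeContentDocument` is a stand-in for a `ContentDocument`, SHA-256 is an assumed checksum algorithm, and comparing against the latest version is an assumption about how the skill behaves.

```python
import hashlib
from typing import Optional


class StaleSourceError(Exception):
    """Raised when the caller's checksum no longer matches the latest source."""


def checksum(body: str) -> str:
    # SHA-256 over UTF-8 bytes is an assumption; the real algorithm is not stated here.
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class FakeContentDocument:
    """In-memory stand-in for one ContentDocument and its immutable versions."""

    def __init__(self, initial_body: str) -> None:
        self.versions = [initial_body]

    def update(self, body: str, expected_source_checksum: Optional[str] = None) -> int:
        latest = checksum(self.versions[-1])
        if expected_source_checksum is not None and expected_source_checksum != latest:
            raise StaleSourceError("source changed since it was read; read it again")
        self.versions.append(body)  # earlier versions are never mutated
        return len(self.versions)


doc = FakeContentDocument("<h1>Hello</h1>")
seen = checksum(doc.versions[-1])         # checksum captured at read time
doc.update("<h1>Hello, team</h1>", seen)  # matches, so a second version is added
try:
    doc.update("<h1>Hi</h1>", seen)       # the same checksum is now stale
    stale_rejected = False
except StaleSourceError:
    stale_rejected = True
```

The second update fails instead of silently overwriting a change the agent never saw, which is the behavior the checksum input is meant to provide.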

## Common Inputs

| Skill | Required Input | Common Optional Inputs | Typical Result |
|---|---|---|---|
| `create_salesforce_file` | `body`, `pathOnClient` | `title`, `contentType`, `recordId`, `description` | New `ContentVersion` Id, `ContentDocument` Id, checksum, byte count |
| `read_salesforce_file` | `contentVersionId` | `mode`, `offset`, `maxChars` | One body, mode used, artifact keys, metadata, checksums |
| `update_salesforce_file` | `contentVersionId`, `body`, `reasonForChange` | `expectedSourceChecksum`, `pathOnClient`, `title` | New `ContentVersion` Id, same `ContentDocument` Id, checksum, byte count |
| `attach_file` | Target record and file content reference | File name, content type, description | Salesforce File attachment status and identifiers |
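
To make the required inputs concrete, the sketch below checks a tool-call argument dictionary against the table above before the call is made. `missing_inputs` is a hypothetical helper and the sample values are placeholders; `attach_file` is omitted because the table does not name its parameters.

```python
# Required inputs copied from the table above.
REQUIRED_INPUTS = {
    "create_salesforce_file": {"body", "pathOnClient"},
    "read_salesforce_file": {"contentVersionId"},
    "update_salesforce_file": {"contentVersionId", "body", "reasonForChange"},
}


def missing_inputs(skill: str, arguments: dict) -> list:
    """Return the required input names that are absent from the arguments."""
    return sorted(REQUIRED_INPUTS[skill] - set(arguments))


create_args = {
    "body": "<html><body>Welcome</body></html>",
    "pathOnClient": "welcome.html",
    "description": "Welcome page template",  # optional input
}
update_args = {
    "contentVersionId": "068xx0000000001AAA",  # placeholder Id
    "body": "<html><body>Welcome back</body></html>",
}

create_missing = missing_inputs("create_salesforce_file", create_args)
update_missing = missing_inputs("update_salesforce_file", update_args)
```

Here the create call is complete, while the update call would be flagged for omitting `reasonForChange`.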

## Admin Setup Checklist
1. Add `read_salesforce_file` to agents that need to inspect Salesforce Files.
2. Add `create_salesforce_file` only to agents trusted to create new text-like Salesforce Files.
3. Add `update_salesforce_file` only to agents trusted to create new file versions.
4. Make sure your invocation flow can provide the target `contentVersionId` for read/update workflows.
5. Steer agents to use `source` mode before updating HTML, CSS, JavaScript, JSON, XML, SVG, template, Markdown, YAML, or plain text files.
6. Use background execution for long-running document jobs or multi-file workflows.

## Prompt Guidance Snippet
Use this in your system or skill prompt to steer document-processing behavior:

```text
When a user asks you to review, extract, analyze, or summarize a Salesforce File,
call read_salesforce_file with mode=auto unless the user explicitly needs exact source.
Use semantic mode for document understanding.
Use create_salesforce_file when creating a brand-new text-like Salesforce File.
Use source mode before editing text-like files, and pass the returned source checksum to
update_salesforce_file as expectedSourceChecksum when saving a new version.
Do not base64 encode file bodies.
```

## Examples

### Contract Review
1. A user asks the agent to review a contract attached to a Salesforce record.
2. The agent calls `read_salesforce_file` with `mode=auto`.
3. The tool uses semantic mode for the document and creates or reuses `file.md`.
4. The agent answers the user from the semantic content.

### HTML Template Update
1. A user asks the agent to update copy in an HTML file stored in Salesforce.
2. The agent calls `read_salesforce_file` with `mode=source`.
3. The agent modifies the exact source body.
4. The agent calls `update_salesforce_file` with the full updated body, `reasonForChange`, and `expectedSourceChecksum`.
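
The steps above can be sketched as a small function that turns a source read into update arguments. The result field names `modeUsed`, `body`, and `sourceChecksum` are assumptions about the read result shape, and the Id and checksum values are placeholders.

```python
def build_update_arguments(read_result: dict, new_body: str, reason: str) -> dict:
    """Build update_salesforce_file arguments from a source-mode read result."""
    if read_result["modeUsed"] != "source":
        # Semantic markdown is not the exact file body, so it must not be saved back.
        raise ValueError("read the file in source mode before updating it")
    return {
        "contentVersionId": read_result["contentVersionId"],
        "body": new_body,  # full updated source, not a diff
        "reasonForChange": reason,
        "expectedSourceChecksum": read_result["sourceChecksum"],
    }


# Hypothetical result of read_salesforce_file with mode=source.
read_result = {
    "contentVersionId": "068xx0000000001AAA",  # placeholder Id
    "modeUsed": "source",
    "body": "<h1>Welcome</h1>",
    "sourceChecksum": "placeholder-checksum",
}

updated_body = read_result["body"].replace("Welcome", "Welcome back")
update_args = build_update_arguments(read_result, updated_body, "Update heading copy")
```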

### New HTML Template Creation
1. A user asks the agent to create a new HTML template on the current Salesforce record.
2. The agent creates the full HTML source body.
3. The agent calls `create_salesforce_file` with `body`, `pathOnClient`, and optional `description`.
4. The tool returns the new `ContentVersion` and `ContentDocument` identifiers.

### Spreadsheet Intake
1. A user asks the agent to extract values from an Excel workbook.
2. The agent calls `read_salesforce_file` with `mode=semantic`.
3. The tool creates `file.md` by extracting the workbook locally, using each worksheet's evaluated values.
4. The agent returns the extracted values.

## Troubleshooting
- Agent is answering without using the file:
  - Confirm the agent has `read_salesforce_file` enabled and the prompt tells it to call the tool.
- Source mode fails:
  - The file is likely binary or document-oriented. Use `semantic` mode for review.
- An update fails because `expectedSourceChecksum` does not match:
  - Read the source again and reapply the user-requested change to the latest source body.
- Long documents take multiple steps:
  - Use background execution and bounded reads with `offset` and `maxChars`.
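
A bounded read loop with `offset` and `maxChars` might look like the sketch below. The `read_file` callable stands in for a call to `read_salesforce_file`, and stopping on an empty chunk is an assumption about how the end of the document is signaled.

```python
from typing import Callable, Iterator


def read_in_chunks(
    read_file: Callable[[str, int, int], str],
    content_version_id: str,
    max_chars: int = 4000,
) -> Iterator[str]:
    """Yield successive chunks until the reader returns an empty string."""
    offset = 0
    while True:
        chunk = read_file(content_version_id, offset, max_chars)
        if not chunk:
            break
        yield chunk
        offset += len(chunk)  # advance by what was actually returned


# Fake reader over an in-memory document, used only for illustration.
DOCUMENT = "x" * 10_000


def fake_read(content_version_id: str, offset: int, max_chars: int) -> str:
    return DOCUMENT[offset:offset + max_chars]


chunks = list(read_in_chunks(fake_read, "068xx0000000001AAA"))
```

Advancing by the length of the returned chunk, rather than by `max_chars`, keeps the loop correct even if a read returns fewer characters than requested.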
