Webhooks
Configure webhooks to receive real-time notifications about document processing and ingestion job events.
Overview
Webhooks allow you to receive real-time notifications when specific events occur in your SourceSync system. When an event happens, we'll send an HTTP POST request to your configured webhook URL with event details.
Events
Currently, SourceSync supports the following events:
Document Events
Event | Description |
---|---|
DOCUMENTS_QUEUED_FOR_INGESTION | Documents have been queued for initial ingestion |
DOCUMENTS_QUEUED_FOR_UPDATE | Documents have been queued for content update |
DOCUMENTS_QUEUED_FOR_RESYNC | Documents have been queued for resynchronization |
DOCUMENTS_QUEUED_FOR_DELETION | Documents have been queued for deletion |
DOCUMENTS_PROCESSING | Documents are currently being processed |
DOCUMENTS_ERROR | Error occurred during document processing |
DOCUMENTS_READY | Documents have been successfully processed |
DOCUMENTS_DELETED | Documents have been deleted |
Ingestion Job Run Events
Event | Description |
---|---|
INGEST_JOB_RUN_QUEUED | Ingestion job run has been queued |
INGEST_JOB_RUN_PRE_PROCESSING | Ingestion job run is in pre-processing phase |
INGEST_JOB_RUN_PROCESSING | Ingestion job run is processing |
INGEST_JOB_RUN_COMPLETED | Ingestion job run has completed |
Webhook Workflow
The complete webhook flow, from registration to processing, consists of the following steps:
- Registration: Client registers a webhook URL with SourceSync
- Event Occurs: An event (e.g., document processing) happens in SourceSync
- Webhook Delivery: SourceSync sends a webhook to the registered URL
- Verification: Client verifies the webhook signature
- Processing: Client processes the webhook data
- Response: Client responds to confirm receipt
Step 1: Register a Webhook
First, you need to register your webhook endpoint with SourceSync:
curl -X POST https://api.sourcesync.ai/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-domain.com/webhooks",
    "name": "My Webhook"
  }'
The response will include a signingKey that you should store securely:
{
  "success": true,
  "message": "Created the webhook successfully",
  "data": {
    "webhook": {
      "id": "webhook-123",
      "name": "My Webhook",
      "url": "https://your-domain.com/webhooks",
      "signingKey": "your-signing-key",
      "status": "ACTIVE",
      "createdAt": "2024-03-14T12:34:56.789Z",
      "updatedAt": "2024-03-14T12:34:56.789Z"
    }
  }
}
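The same registration call can be sketched in Python. This is a minimal sketch using only the standard library; the helper names `build_register_request` and `extract_signing_key` are hypothetical, and the request and response shapes are assumed to match the curl example and sample response above.

```python
import json
import urllib.request

# Endpoint taken from the curl example above
API_URL = "https://api.sourcesync.ai/v1/webhooks"


def build_register_request(api_key: str, url: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers a webhook."""
    body = json.dumps({"url": url, "name": name}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_signing_key(response_body: str) -> str:
    """Read the signing key from a response shaped like the sample above."""
    return json.loads(response_body)["data"]["webhook"]["signingKey"]
```

To send the request, pass it to `urllib.request.urlopen` and hand the decoded response body to `extract_signing_key`. Store the returned key in a secret manager or environment variable rather than in source code.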
Step 2: Receive and Process Webhooks
When events occur in SourceSync (e.g., documents being processed), we'll send HTTP POST requests to your registered URL with event details.
Here's what happens:
- SourceSync detects an event (e.g., documents ready)
- SourceSync creates a webhook payload with event details
- SourceSync generates a signature using your webhook's signing key
- SourceSync sends the webhook to your registered URL
- Your server receives and processes the webhook
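To exercise your endpoint locally before real events arrive, you can simulate the signing step. The sketch below is an assumption inferred from the verification examples in this guide: it signs the compact JSON encoding of `{"payload": ..., "timestamp": ...}` with HMAC-SHA256 and hex-encodes the result. The function names are hypothetical, and SourceSync's actual server-side implementation is not documented here.

```python
import hashlib
import hmac
import json


def sign_webhook(payload: dict, timestamp: str, signing_key: str) -> str:
    """Return a hex HMAC-SHA256 over compact JSON of the payload and timestamp."""
    message = json.dumps(
        {"payload": payload, "timestamp": timestamp},
        separators=(",", ":"),  # compact encoding, no spaces
    )
    return hmac.new(
        signing_key.encode("utf-8"),
        message.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def build_test_delivery(payload: dict, timestamp: str, signing_key: str) -> tuple[dict, bytes]:
    """Build the headers and body of a simulated webhook delivery."""
    headers = {
        "Content-Type": "application/json",
        "X-SourceSync-Timestamp": timestamp,
        "X-SourceSync-Signature": sign_webhook(payload, timestamp, signing_key),
    }
    return headers, json.dumps(payload).encode("utf-8")
```

Posting the returned headers and body to your handler with a test signing key lets you check both the accept path and, by altering one byte of the body, the reject path.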
Step 3: Verify and Process the Webhook
When you receive a webhook, follow these steps:
- Verify the timestamp to prevent replay attacks
- Verify the signature to ensure authenticity
- Process the webhook data asynchronously
- Respond quickly (within 10 seconds)
app.post('/webhooks', express.json(), async (req, res) => {
  try {
    // 1. Extract headers and verify timestamp
    const signature = req.headers['x-sourcesync-signature']
    const timestamp = req.headers['x-sourcesync-timestamp']
    if (!signature || !timestamp) {
      return res.status(400).json({ error: 'Missing required headers' })
    }
    // Check timestamp freshness (prevent replay attacks)
    const eventTime = new Date(timestamp).getTime()
    const currentTime = Date.now()
    // An unparseable timestamp yields NaN, which must be rejected explicitly
    if (Number.isNaN(eventTime) || currentTime - eventTime > 5 * 60 * 1000) {
      // Invalid, or older than 5 minutes
      return res.status(400).json({ error: 'Invalid or expired timestamp' })
    }
    // 2. Verify signature
    const isValid = await verifyWebhookSignature(
      req.body,
      signature,
      timestamp,
      process.env.WEBHOOK_SIGNING_KEY,
    )
    if (!isValid) {
      return res.status(401).json({ error: 'Invalid signature' })
    }
    // 3. Respond immediately - this is crucial!
    res.status(200).json({ received: true })
    // 4. Process webhook asynchronously (after responding).
    // Catch failures here: the response has already been sent.
    const { event, data, requestId } = req.body
    processWebhookAsync(requestId, event, data).catch((err) =>
      console.error('Webhook processing failed:', err),
    )
  } catch (error) {
    console.error('Webhook error:', error)
    return res.status(500).json({ error: 'Internal server error' })
  }
})
Important Considerations
Response Time
- You must respond within 10 seconds, or SourceSync will consider the webhook delivery failed
- Process webhooks asynchronously after sending a 2xx response
- Use a job queue for resource-intensive processing
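One way to keep the handler fast is to hand each verified webhook to a background worker. The following is a minimal in-process sketch using Python's standard `queue` and `threading` modules; `handle_event`, `enqueue_webhook`, and `start_worker` are hypothetical names, and a production setup would more likely use a durable queue so that jobs survive a restart.

```python
import queue
import threading

jobs = queue.Queue()          # holds (request_id, event, data) tuples
processed = []                # stand-in for real side effects


def handle_event(request_id: str, event: str, data: dict) -> None:
    """Placeholder for resource-intensive business logic."""
    processed.append(request_id)


def worker() -> None:
    """Drain the queue until a None sentinel is received."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            handle_event(*job)
        except Exception as exc:
            print(f"Webhook job failed: {exc}")
        finally:
            jobs.task_done()


def start_worker() -> threading.Thread:
    """Start the background worker thread."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def enqueue_webhook(body: dict) -> None:
    """Call this from the HTTP handler, then return the 2xx response right away."""
    jobs.put((body["requestId"], body["event"], body["data"]))
```

The handler only pays the cost of `jobs.put`, so the response time stays well under the 10-second limit regardless of how long `handle_event` takes.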
Retry Behavior
- Failed webhook deliveries will be retried once after a 2-second delay
- If the retry fails, the webhook is permanently marked as failed
- Implement idempotency to handle potential duplicate deliveries
Security Requirements
- Always verify signatures using your webhook signing key
- Verify timestamp freshness - reject webhooks older than 5 minutes
- Use HTTPS endpoints for your webhook URLs
- Store your signing key securely - treat it like a password
- Implement IP allowlisting if possible
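IP allowlisting can be sketched with Python's standard `ipaddress` module. The networks below are documentation-reserved placeholder ranges, not SourceSync's real addresses (this guide does not list them), and `is_allowed_ip` is a hypothetical helper name.

```python
import ipaddress

# Placeholder ranges reserved for documentation; replace with the ranges
# SourceSync provides for your account (an assumption, not published here).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed_ip(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside one of the allowed networks."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a valid IP address at all
    return any(addr in network for network in ALLOWED_NETWORKS)
```

If your server sits behind a proxy or load balancer, make sure the address you check is the real client address and not the proxy's. Allowlisting complements signature verification; it does not replace it.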
Idempotency
Use the requestId field to ensure you don't process the same webhook twice:
async function processWebhookAsync(
  requestId: string,
  event: string,
  data: any,
) {
  // Check if we've already processed this requestId
  if (await hasProcessedWebhook(requestId)) {
    console.log(`Ignoring duplicate webhook: ${requestId}`)
    return
  }
  // Process the webhook based on event type
  switch (event) {
    case 'DOCUMENTS_READY':
      await handleDocumentsReady(data.documentIds)
      break
    // Handle other events...
  }
  // Mark this requestId as processed
  await markWebhookAsProcessed(requestId)
}
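The duplicate check itself needs a store that survives restarts. Here is a minimal Python sketch backed by SQLite; the table name and the `open_store` and `claim_request` helpers are hypothetical. It relies on a primary-key constraint so that checking and recording a requestId happen in a single statement.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the table that records processed request IDs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_webhooks (request_id TEXT PRIMARY KEY)"
    )
    conn.commit()
    return conn


def claim_request(conn: sqlite3.Connection, request_id: str) -> bool:
    """Record request_id; return True only for the first caller to record it."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO processed_webhooks (request_id) VALUES (?)",
        (request_id,),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when it already existed
    return cursor.rowcount == 1
```

Note the trade-off: this sketch claims the requestId before processing, which closes the race between two concurrent deliveries but means an event whose processing crashes will not be retried automatically. Marking the ID only after successful processing makes the opposite trade.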
Webhook Payloads
Document Event Payload
All document events share the following payload structure:
{
  "requestId": "unique-request-id",
  "event": "DOCUMENTS_READY", // One of the document events
  "data": {
    "documentIds": ["doc-1", "doc-2"] // Array of affected document IDs
  },
  "tenantId": "tenant-123", // or null
  "namespaceId": "namespace-456",
  "organizationId": "org-789",
  "timestamp": "2024-03-14T12:34:56.789Z"
}
Ingestion Job Run Event Payload
All ingestion job run events share the following payload structure:
{
  "requestId": "unique-request-id",
  "event": "INGEST_JOB_RUN_COMPLETED", // One of the ingestion job events
  "data": {
    "ingestJobRunId": "job-123" // ID of the ingestion job run
  },
  "tenantId": "tenant-123", // or null
  "namespaceId": "namespace-456",
  "organizationId": "org-789",
  "timestamp": "2024-03-14T12:34:56.789Z"
}
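Because both payload families share the same envelope, they can be parsed into a single typed object. The sketch below is one possible shape in Python; the `WebhookEvent` class and `parse_webhook` function are hypothetical names, and the event-family checks assume the `DOCUMENTS_` and `INGEST_JOB_RUN_` prefixes listed in the event tables.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class WebhookEvent:
    request_id: str
    event: str
    data: Dict[str, Any]
    tenant_id: Optional[str]  # may be null in the payload
    namespace_id: str
    organization_id: str
    timestamp: str

    @property
    def is_document_event(self) -> bool:
        return self.event.startswith("DOCUMENTS_")

    @property
    def is_ingest_job_run_event(self) -> bool:
        return self.event.startswith("INGEST_JOB_RUN_")


def parse_webhook(body: Dict[str, Any]) -> WebhookEvent:
    """Convert a decoded JSON body into a WebhookEvent; raises KeyError if a field is missing."""
    return WebhookEvent(
        request_id=body["requestId"],
        event=body["event"],
        data=body["data"],
        tenant_id=body.get("tenantId"),
        namespace_id=body["namespaceId"],
        organization_id=body["organizationId"],
        timestamp=body["timestamp"],
    )
```

A handler can then branch on `is_document_event` to read `data["documentIds"]`, or on `is_ingest_job_run_event` to read `data["ingestJobRunId"]`.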
Security
Webhook Signatures
Every webhook request includes a signature in the X-SourceSync-Signature header. You should verify this signature to ensure the webhook came from SourceSync:
// TypeScript implementation
async function verifyWebhookSignature(
  payload: any,
  signature: string | null,
  timestamp: string | null,
  signingKey: string,
): Promise<boolean> {
  try {
    if (!signature || !timestamp) {
      return false
    }
    // Check if the timestamp is too old
    const eventTimestamp = new Date(timestamp).getTime()
    const currentTimestamp = Date.now()
    const timeDiff = Math.abs(currentTimestamp - eventTimestamp)
    const fiveMinutesInMs = 5 * 60 * 1000
    if (timeDiff > fiveMinutesInMs) {
      console.error('Webhook timestamp is too old')
      return false
    }
    // Recreate the signed string
    const payloadWithTimestamp = JSON.stringify({ payload, timestamp })
    // Convert message and key to ArrayBuffer
    const encoder = new TextEncoder()
    const messageBuffer = encoder.encode(payloadWithTimestamp)
    const keyBuffer = encoder.encode(signingKey)
    // Import the key
    const cryptoKey = await crypto.subtle.importKey(
      'raw',
      keyBuffer,
      { name: 'HMAC', hash: 'SHA-256' },
      false,
      ['verify'],
    )
    // Convert signature from hex to ArrayBuffer
    const signatureBuffer = hexToArrayBuffer(signature)
    // Verify the signature
    return await crypto.subtle.verify(
      'HMAC',
      cryptoKey,
      signatureBuffer,
      messageBuffer,
    )
  } catch (error) {
    console.error('Signature verification failed:', error)
    return false
  }
}

// Helper function to convert hex string to ArrayBuffer
function hexToArrayBuffer(hexString: string): ArrayBuffer {
  const matches = hexString.match(/.{1,2}/g) || []
  return new Uint8Array(matches.map((byte) => parseInt(byte, 16))).buffer
}
# Python implementation
import hashlib
import hmac
import json
from datetime import datetime, timezone
from typing import Any, Dict


def verify_webhook_signature(
    payload: Dict[str, Any],
    signature: str | None,
    timestamp: str | None,
    signing_key: str,
) -> bool:
    """Verify that the webhook signature is valid."""
    try:
        if not signature or not timestamp:
            return False
        # Parse the ISO 8601 timestamp; older Python versions reject a trailing "Z"
        event_time = datetime.fromisoformat(timestamp.replace('Z', '+00:00'))
        current_time = datetime.now(timezone.utc)
        if (current_time - event_time).total_seconds() > 300:  # 5 minutes
            return False
        # Recreate the signed string
        payload_with_timestamp = json.dumps(
            {"payload": payload, "timestamp": timestamp},
            separators=(',', ':'),  # Use compact JSON encoding
        )
        # Create HMAC signature
        computed_signature = hmac.new(
            signing_key.encode('utf-8'),
            payload_with_timestamp.encode('utf-8'),
            hashlib.sha256,
        ).hexdigest()
        # Compare signatures using constant-time comparison
        return hmac.compare_digest(computed_signature, signature)
    except Exception as e:
        print(f"Signature verification failed: {e}")
        return False
Headers
Each webhook request includes these headers:
- X-SourceSync-Signature: HMAC-SHA256 signature of the payload
- X-SourceSync-Timestamp: Timestamp when the webhook was sent
- Content-Type: Always application/json
Best Practices
- Verify Signatures
  - Always verify webhook signatures using your signing key
  - Check timestamp freshness (within 5 minutes)
- Quick Response
  - Respond to webhooks quickly (within 10 seconds)
  - Process webhooks asynchronously
  - Return a 2xx status code as soon as you receive the webhook
- Handle Retries
  - We retry a failed webhook delivery once
  - Implement idempotency using the requestId
  - The retry happens after a 2-second delay
- Error Handling
  - Store failed webhooks for later processing
  - Log webhook processing errors
  - Monitor webhook delivery success rates
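Storing failed webhooks for later processing can be as simple as appending them to a dead-letter file. The sketch below is one assumed approach in Python using JSON Lines; `process_with_dead_letter` and `load_dead_letters` are hypothetical helper names, and a database table or message queue would serve the same purpose in a larger deployment.

```python
import json
import logging
from pathlib import Path
from typing import Callable, List

logger = logging.getLogger("webhooks")


def process_with_dead_letter(
    body: dict,
    handler: Callable[[dict], None],
    dead_letter_path: Path,
) -> bool:
    """Run handler(body); on failure, log the error and append body to the dead-letter file."""
    try:
        handler(body)
        return True
    except Exception:
        logger.exception(
            "Webhook processing failed for requestId=%s", body.get("requestId")
        )
        with dead_letter_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(body) + "\n")
        return False


def load_dead_letters(dead_letter_path: Path) -> List[dict]:
    """Read back every stored webhook body for reprocessing."""
    if not dead_letter_path.exists():
        return []
    with dead_letter_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

The boolean return value gives you a natural place to increment success and failure counters for monitoring.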
Example Implementation
Here's a basic Express.js webhook handler:
import express from 'express'

const app = express()

app.post('/webhooks', express.json(), async (req, res) => {
  try {
    // Get headers
    const signature = req.headers['x-sourcesync-signature']
    const timestamp = req.headers['x-sourcesync-timestamp']
    // Verify signature (verifyWebhookSignature is defined in the Security section)
    const isValid = await verifyWebhookSignature(
      req.body,
      signature,
      timestamp,
      process.env.WEBHOOK_SIGNING_KEY,
    )
    if (!isValid) {
      return res.status(401).json({ error: 'Invalid signature' })
    }
    // Start processing without awaiting it; catch failures so they are logged
    const { event, data } = req.body
    processWebhookAsync(event, data).catch((err) =>
      console.error('Webhook processing failed:', err),
    )
    // Respond immediately
    res.status(200).json({ received: true })
  } catch (error) {
    console.error('Webhook error:', error)
    res.status(500).json({ error: 'Internal server error' })
  }
})

async function processWebhookAsync(event: string, data: any) {
  switch (event) {
    case 'DOCUMENTS_READY':
      await handleDocumentsReady(data.documentIds)
      break
    case 'DOCUMENTS_ERROR':
      await handleDocumentsError(data.documentIds)
      break
    // Handle other events...
  }
}
Here's an equivalent Python implementation using Flask:
from flask import Flask, request, jsonify
import os
import threading
from datetime import datetime, timezone
import json
import hmac
import hashlib

app = Flask(__name__)


@app.route('/webhooks', methods=['POST'])
def webhook_handler():
    try:
        # Get headers
        signature = request.headers.get('x-sourcesync-signature')
        timestamp = request.headers.get('x-sourcesync-timestamp')
        # Verify signature
        is_valid = verify_webhook_signature(
            request.json,
            signature,
            timestamp,
            os.environ.get('WEBHOOK_SIGNING_KEY')
        )
        if not is_valid:
            return jsonify({'error': 'Invalid signature'}), 401
        # Process webhook asynchronously (after responding)
        webhook_data = request.json
        event = webhook_data.get('event')
        data = webhook_data.get('data')
        # Start processing in a separate thread
        threading.Thread(
            target=process_webhook_async,
            args=(event, data)
        ).start()
        # Respond immediately
        return jsonify({'received': True}), 200
    except Exception as e:
        print(f"Webhook error: {e}")
        return jsonify({'error': 'Internal server error'}), 500


def verify_webhook_signature(payload, signature, timestamp, signing_key):
    try:
        if not signature or not timestamp:
            return False
        # Check timestamp freshness
        try:
            event_time = datetime.fromisoformat(timestamp.replace('Z', '+00:00'))
            current_time = datetime.now(timezone.utc)
            if (current_time - event_time).total_seconds() > 300:  # 5 minutes
                print('Webhook timestamp is too old')
                return False
        except ValueError:
            return False
        # Recreate the signed string
        payload_with_timestamp = json.dumps(
            {'payload': payload, 'timestamp': timestamp},
            separators=(',', ':')  # Use compact JSON encoding
        )
        # Create HMAC signature
        computed_signature = hmac.new(
            signing_key.encode('utf-8'),
            payload_with_timestamp.encode('utf-8'),
            hashlib.sha256
        ).hexdigest()
        # Compare signatures using constant-time comparison
        return hmac.compare_digest(computed_signature, signature)
    except Exception as e:
        print(f"Signature verification failed: {e}")
        return False


def process_webhook_async(event, data):
    """Process webhook asynchronously to avoid blocking the response."""
    try:
        if event == 'DOCUMENTS_READY':
            handle_documents_ready(data.get('documentIds', []))
        elif event == 'DOCUMENTS_ERROR':
            handle_documents_error(data.get('documentIds', []))
        # Handle other events...
    except Exception as e:
        print(f"Error processing webhook: {e}")


def handle_documents_ready(document_ids):
    """Handle the DOCUMENTS_READY event."""
    # Your business logic here
    print(f"Processing {len(document_ids)} ready documents")


def handle_documents_error(document_ids):
    """Handle the DOCUMENTS_ERROR event."""
    # Your business logic here
    print(f"Processing {len(document_ids)} documents with errors")


if __name__ == '__main__':
    # Run the Flask app
    app.run(host='0.0.0.0', port=5000)