Webhooks

Configure webhooks to receive real-time notifications about document processing and ingestion job events.

Overview

Webhooks allow you to receive real-time notifications when specific events occur in your SourceSync system. When an event happens, we'll send an HTTP POST request to your configured webhook URL with event details.

Events

Currently, SourceSync supports the following events:

Document Events

Event                            Description
DOCUMENTS_QUEUED_FOR_INGESTION   Documents have been queued for initial ingestion
DOCUMENTS_QUEUED_FOR_UPDATE      Documents have been queued for content update
DOCUMENTS_QUEUED_FOR_RESYNC      Documents have been queued for resynchronization
DOCUMENTS_QUEUED_FOR_DELETION    Documents have been queued for deletion
DOCUMENTS_PROCESSING             Documents are currently being processed
DOCUMENTS_ERROR                  Error occurred during document processing
DOCUMENTS_READY                  Documents have been successfully processed
DOCUMENTS_DELETED                Documents have been deleted

Ingestion Job Run Events

Event                           Description
INGEST_JOB_RUN_QUEUED           Ingestion job run has been queued
INGEST_JOB_RUN_PRE_PROCESSING   Ingestion job run is in pre-processing phase
INGEST_JOB_RUN_PROCESSING       Ingestion job run is processing
INGEST_JOB_RUN_COMPLETED        Ingestion job run has completed
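
When routing on event names in code, it can help to keep them in one place. The sketch below simply collects the event names from the two tables into Python sets; the `event_category` helper is a hypothetical name used for illustration and is not part of any SourceSync SDK.

```python
# Event names copied from the two tables above.
DOCUMENT_EVENTS = frozenset({
    "DOCUMENTS_QUEUED_FOR_INGESTION",
    "DOCUMENTS_QUEUED_FOR_UPDATE",
    "DOCUMENTS_QUEUED_FOR_RESYNC",
    "DOCUMENTS_QUEUED_FOR_DELETION",
    "DOCUMENTS_PROCESSING",
    "DOCUMENTS_ERROR",
    "DOCUMENTS_READY",
    "DOCUMENTS_DELETED",
})

INGEST_JOB_RUN_EVENTS = frozenset({
    "INGEST_JOB_RUN_QUEUED",
    "INGEST_JOB_RUN_PRE_PROCESSING",
    "INGEST_JOB_RUN_PROCESSING",
    "INGEST_JOB_RUN_COMPLETED",
})


def event_category(event: str) -> str:
    """Return 'document', 'ingest_job_run', or 'unknown' for an event name."""
    if event in DOCUMENT_EVENTS:
        return "document"
    if event in INGEST_JOB_RUN_EVENTS:
        return "ingest_job_run"
    return "unknown"
```

Returning 'unknown' instead of raising lets a handler acknowledge event types it does not yet recognize.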

Webhook Workflow

The complete webhook flow, from registration to processing, has six steps:

  1. Registration: Client registers a webhook URL with SourceSync
  2. Event Occurs: An event (e.g., document processing) happens in SourceSync
  3. Webhook Delivery: SourceSync sends a webhook to the registered URL
  4. Verification: Client verifies the webhook signature
  5. Processing: Client processes the webhook data
  6. Response: Client responds to confirm receipt

Step 1: Register a Webhook

First, you need to register your webhook endpoint with SourceSync:

curl -X POST https://api.sourcesync.ai/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-domain.com/webhooks",
    "name": "My Webhook"
  }'

The response will include a signingKey that you should store securely:

{
  "success": true,
  "message": "Created the webhook successfully",
  "data": {
    "webhook": {
      "id": "webhook-123",
      "name": "My Webhook",
      "url": "https://your-domain.com/webhooks",
      "signingKey": "your-signing-key",
      "status": "ACTIVE",
      "createdAt": "2024-03-14T12:34:56.789Z",
      "updatedAt": "2024-03-14T12:34:56.789Z"
    }
  }
}
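
For readers registering from Python rather than curl, the same request might look like the sketch below. It uses only the standard library and mirrors the endpoint, headers, and body of the curl example; the helper names `build_register_request` and `extract_signing_key` are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.sourcesync.ai/v1/webhooks"


def build_register_request(api_key: str, url: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers a webhook."""
    body = json.dumps({"url": url, "name": name}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_signing_key(response_body: str) -> str:
    """Pull data.webhook.signingKey out of a registration response body."""
    return json.loads(response_body)["data"]["webhook"]["signingKey"]
```

To send the request, pass it to `urllib.request.urlopen(...)`, decode the response body, and hand it to `extract_signing_key`. Store the returned key in a secrets manager or environment variable, not in source control.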

Step 2: Receive and Process Webhooks

When events occur in SourceSync (e.g., documents being processed), we'll send HTTP POST requests to your registered URL with event details.

Here's what happens:

  1. SourceSync detects an event (e.g., documents ready)
  2. SourceSync creates a webhook payload with event details
  3. SourceSync generates a signature using your webhook's signing key
  4. SourceSync sends the webhook to your registered URL
  5. Your server receives and processes the webhook
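
To exercise your handler locally, it can be useful to produce a signed request yourself. The sketch below assumes the signature is a hex-encoded HMAC-SHA256 over the compact JSON encoding of an object with `payload` and `timestamp` keys, which is what the verification code in this document recomputes. SourceSync's actual signing code is not shown here, and `sign_webhook` and `build_test_headers` are hypothetical helper names.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def sign_webhook(payload: Dict[str, Any], timestamp: str, signing_key: str) -> str:
    """Return a hex HMAC-SHA256 over {"payload": ..., "timestamp": ...}."""
    message = json.dumps(
        {"payload": payload, "timestamp": timestamp},
        separators=(",", ":"),  # compact encoding, no spaces
    )
    return hmac.new(
        signing_key.encode("utf-8"),
        message.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def build_test_headers(
    payload: Dict[str, Any], timestamp: str, signing_key: str
) -> Dict[str, str]:
    """Headers for a locally generated test delivery."""
    return {
        "Content-Type": "application/json",
        "X-SourceSync-Signature": sign_webhook(payload, timestamp, signing_key),
        "X-SourceSync-Timestamp": timestamp,
    }
```

Because the signature covers a serialized string, the key order and spacing of the JSON must match on both sides; a test request should send the payload with the same key order that was signed.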

Step 3: Verify and Process the Webhook

When you receive a webhook, follow these steps:

  1. Verify the timestamp to prevent replay attacks
  2. Verify the signature to ensure authenticity
  3. Process the webhook data asynchronously
  4. Respond quickly (within 10 seconds)

// The route path must match the URL you registered (https://your-domain.com/webhooks)
app.post('/webhooks', express.json(), async (req, res) => {
  try {
    // 1. Extract headers and verify timestamp
    const signature = req.headers['x-sourcesync-signature']
    const timestamp = req.headers['x-sourcesync-timestamp']

    if (!signature || !timestamp) {
      return res.status(400).json({ error: 'Missing required headers' })
    }

    // Check timestamp freshness (prevent replay attacks)
    const eventTime = new Date(timestamp).getTime()
    const currentTime = Date.now()
    if (
      Number.isNaN(eventTime) ||
      Math.abs(currentTime - eventTime) > 5 * 60 * 1000
    ) {
      // Unparseable, or more than 5 minutes from the current time
      return res.status(400).json({ error: 'Webhook expired' })
    }

    // 2. Verify signature
    const isValid = await verifyWebhookSignature(
      req.body,
      signature,
      timestamp,
      process.env.WEBHOOK_SIGNING_KEY,
    )

    if (!isValid) {
      return res.status(401).json({ error: 'Invalid signature' })
    }

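    // 3. Respond immediately - this is crucial!
    res.status(200).json({ received: true })

    // 4. Process webhook asynchronously (after responding)
    const { event, data, requestId } = req.body
    processWebhookAsync(requestId, event, data).catch((error) => {
      // The response has already been sent, so log the failure
      console.error('Webhook processing failed:', error)
    })
  } catch (error) {
    console.error('Webhook error:', error)
    // Only send an error response if one has not already gone out
    if (!res.headersSent) {
      res.status(500).json({ error: 'Internal server error' })
    }
  }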
})

Important Considerations

Response Time

  • You must respond within 10 seconds, or SourceSync will consider the webhook delivery failed
  • Process webhooks asynchronously after sending a 2xx response
  • Use a job queue for resource-intensive processing
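
As an illustration of the job-queue advice, the sketch below hands each payload to a background worker thread through an in-process `queue.Queue`, so the HTTP handler can return right after queuing. The names `jobs`, `start_worker`, and `enqueue_webhook` are hypothetical, and an in-memory queue loses pending jobs on restart; a production setup would more likely use a durable queue.

```python
import queue
import threading
from typing import Any, Callable, Dict

# In-process queue shared by the HTTP handler and the worker thread.
jobs: "queue.Queue[Dict[str, Any]]" = queue.Queue()


def start_worker(handler: Callable[[Dict[str, Any]], Any]) -> threading.Thread:
    """Start a daemon thread that drains the queue, calling handler per job."""

    def run() -> None:
        while True:
            payload = jobs.get()  # blocks until a job is available
            try:
                handler(payload)
            except Exception as exc:  # keep the worker alive on handler errors
                print(f"Webhook job failed: {exc}")
            finally:
                jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def enqueue_webhook(payload: Dict[str, Any]) -> None:
    """Called from the HTTP handler; returns as soon as the job is queued."""
    jobs.put(payload)
```

With a single worker, jobs are handled in the order they were queued; `jobs.join()` blocks until every queued job has been handled, which is mostly useful in tests.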

Retry Behavior

  • Failed webhook deliveries will be retried once after a 2-second delay
  • If the retry fails, the webhook is permanently marked as failed
  • Implement idempotency to handle potential duplicate deliveries

Security Requirements

  1. Always verify signatures using your webhook signing key
  2. Verify timestamp freshness - reject webhooks older than 5 minutes
  3. Use HTTPS endpoints for your webhook URLs
  4. Store your signing key securely - treat it like a password
  5. Implement IP allowlisting if possible
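
IP allowlisting can be sketched with the standard `ipaddress` module, as below. This document does not list the addresses SourceSync sends from, so the range shown is a placeholder from the documentation-only TEST-NET-3 block; `ALLOWED_NETWORKS` and `is_allowed_ip` are hypothetical names.

```python
import ipaddress
from typing import Iterable

# Placeholder only: 203.0.113.0/24 is reserved for documentation (RFC 5737).
# Replace it with the source addresses you have confirmed for your account.
ALLOWED_NETWORKS = ["203.0.113.0/24"]


def is_allowed_ip(remote_addr: str, allowed: Iterable[str] = ALLOWED_NETWORKS) -> bool:
    """Return True if remote_addr falls inside any of the allowed networks."""
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a valid IP address
    return any(address in ipaddress.ip_network(net) for net in allowed)
```

If your server sits behind a proxy or load balancer, the address to check is the original client address that the proxy reports, not the proxy's own. Treat allowlisting as an extra layer on top of signature verification, not a replacement for it.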

Idempotency

Use the requestId field to ensure you don't process the same webhook twice:

async function processWebhookAsync(
  requestId: string,
  event: string,
  data: any,
) {
  // Check if we've already processed this requestId
  if (await hasProcessedWebhook(requestId)) {
    console.log(`Ignoring duplicate webhook: ${requestId}`)
    return
  }

  // Process the webhook based on event type
  switch (event) {
    case 'DOCUMENTS_READY':
      await handleDocumentsReady(data.documentIds)
      break
    // Handle other events...
  }

  // Mark this requestId as processed
  await markWebhookAsProcessed(requestId)
}
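
The `hasProcessedWebhook` and `markWebhookAsProcessed` helpers above are left undefined. One possible shape for them, shown here in Python as an in-memory sketch, is a store that claims a `requestId` atomically so two concurrent deliveries of the same webhook cannot both proceed. `ProcessedWebhookStore` is a hypothetical name; a real deployment would more likely back this with a database unique constraint or a cache with expiry.

```python
import threading


class ProcessedWebhookStore:
    """In-memory record of request IDs that have been claimed for processing."""

    def __init__(self) -> None:
        self._seen = set()
        self._lock = threading.Lock()

    def claim(self, request_id: str) -> bool:
        """Atomically record request_id; return False if it was already claimed."""
        with self._lock:
            if request_id in self._seen:
                return False
            self._seen.add(request_id)
            return True

    def release(self, request_id: str) -> None:
        """Forget request_id, e.g. after processing failed, so it can be retried."""
        with self._lock:
            self._seen.discard(request_id)
```

A handler would call `claim(request_id)` first, skip the webhook when it returns False, and call `release(request_id)` if processing raises.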

Webhook Payloads

Document Event Payload

All document events share the following payload structure.

{
  "requestId": "unique-request-id",
  "event": "DOCUMENTS_READY",  // One of the document events
  "data": {
    "documentIds": ["doc-1", "doc-2"]  // Array of affected document IDs
  },
  "tenantId": "tenant-123",  // or null
  "namespaceId": "namespace-456",
  "organizationId": "org-789",
  "timestamp": "2024-03-14T12:34:56.789Z"
}

Ingestion Job Run Event Payload

All ingestion job run events share the following payload structure.

{
  "requestId": "unique-request-id",
  "event": "INGEST_JOB_RUN_COMPLETED",  // One of the ingestion job events
  "data": {
    "ingestJobRunId": "job-123"  // ID of the ingestion job run
  },
  "tenantId": "tenant-123",  // or null
  "namespaceId": "namespace-456",
  "organizationId": "org-789",
  "timestamp": "2024-03-14T12:34:56.789Z"
}
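
Both payloads share the same envelope and differ only in `data`. A typed wrapper, sketched below in Python, makes that explicit. The field names come from the payloads above; the `WebhookEnvelope` class and `parse_envelope` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class WebhookEnvelope:
    request_id: str
    event: str
    data: Dict[str, Any]       # documentIds or ingestJobRunId, depending on event
    tenant_id: Optional[str]   # may be null in the payload
    namespace_id: str
    organization_id: str
    timestamp: str


def parse_envelope(body: Dict[str, Any]) -> WebhookEnvelope:
    """Map the camelCase payload fields onto a typed envelope."""
    return WebhookEnvelope(
        request_id=body["requestId"],
        event=body["event"],
        data=body["data"],
        tenant_id=body.get("tenantId"),
        namespace_id=body["namespaceId"],
        organization_id=body["organizationId"],
        timestamp=body["timestamp"],
    )
```

A missing required field raises `KeyError`, which surfaces malformed payloads early instead of passing `None` values downstream.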

Security

Webhook Signatures

Every webhook request includes a signature in the X-SourceSync-Signature header. You should verify this signature to ensure the webhook came from SourceSync:

// TypeScript implementation
async function verifyWebhookSignature(
  payload: any,
  signature: string | null,
  timestamp: string | null,
  signingKey: string,
): Promise<boolean> {
  try {
    if (!signature || !timestamp) {
      return false
    }

    // Check if the timestamp is too old
    const eventTimestamp = new Date(timestamp).getTime()
    const currentTimestamp = Date.now()
    const timeDiff = Math.abs(currentTimestamp - eventTimestamp)
    const fiveMinutesInMs = 5 * 60 * 1000

    if (timeDiff > fiveMinutesInMs) {
      console.error('Webhook timestamp is too old')
      return false
    }

    // Recreate the signed string
    const payloadWithTimestamp = JSON.stringify({ payload, timestamp })

    // Convert message and key to ArrayBuffer
    const encoder = new TextEncoder()
    const messageBuffer = encoder.encode(payloadWithTimestamp)
    const keyBuffer = encoder.encode(signingKey)

    // Import the key
    const cryptoKey = await crypto.subtle.importKey(
      'raw',
      keyBuffer,
      { name: 'HMAC', hash: 'SHA-256' },
      false,
      ['verify'],
    )

    // Convert signature from hex to ArrayBuffer
    const signatureBuffer = hexToArrayBuffer(signature)

    // Verify the signature
    return await crypto.subtle.verify(
      'HMAC',
      cryptoKey,
      signatureBuffer,
      messageBuffer,
    )
  } catch (error) {
    console.error('Signature verification failed:', error)
    return false
  }
}

// Helper function to convert hex string to ArrayBuffer
function hexToArrayBuffer(hexString: string): ArrayBuffer {
  const matches = hexString.match(/.{1,2}/g) || []
  return new Uint8Array(matches.map((byte) => parseInt(byte, 16))).buffer
}

# Python implementation
import hashlib
import hmac
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional

def verify_webhook_signature(
    payload: Dict[str, Any],
    signature: Optional[str],
    timestamp: Optional[str],
    signing_key: str
) -> bool:
    """Verify that the webhook signature is valid."""
    try:
        if not signature or not timestamp:
            return False

        # Parse the timestamp, assumed to be ISO 8601 like the payload's
        # timestamp field (e.g. "2024-03-14T12:34:56.789Z"), and reject
        # webhooks more than 5 minutes from the current time
        event_time = datetime.fromisoformat(timestamp.replace('Z', '+00:00'))
        current_time = datetime.now(timezone.utc)
        if abs((current_time - event_time).total_seconds()) > 300:  # 5 minutes
            return False

        # Recreate the signed string
        payload_with_timestamp = json.dumps(
            {"payload": payload, "timestamp": timestamp},
            separators=(',', ':')  # Use compact JSON encoding
        )

        # Create HMAC signature
        computed_signature = hmac.new(
            signing_key.encode('utf-8'),
            payload_with_timestamp.encode('utf-8'),
            hashlib.sha256
        ).hexdigest()

        # Compare signatures using constant-time comparison
        return hmac.compare_digest(computed_signature, signature)
    except Exception as e:
        print(f"Signature verification failed: {e}")
        return False

Headers

Each webhook request includes these headers:

  • X-SourceSync-Signature: HMAC-SHA256 signature of the payload
  • X-SourceSync-Timestamp: Timestamp when the webhook was sent
  • Content-Type: Always application/json

Best Practices

  1. Verify Signatures

    • Always verify webhook signatures using your signing key
    • Check timestamp freshness (within 5 minutes)
  2. Quick Response

    • Respond to webhooks quickly (within 10 seconds)
    • Process webhooks asynchronously
    • Return 2xx status code as soon as you receive the webhook
  3. Handle Retries

    • We retry a failed webhook delivery once
    • Implement idempotency using the requestId
    • The retry happens after a 2-second delay
  4. Error Handling

    • Store failed webhooks for later processing
    • Log webhook processing errors
    • Monitor webhook delivery success rates
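
The error-handling advice could be sketched as a small wrapper that logs a failure and sets the payload aside for later reprocessing. The names `process_safely` and `failed_webhooks` are hypothetical, and the in-memory list stands in for whatever durable store (database table, dead-letter queue) you actually use.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("webhooks")

# Stand-in for a durable store of webhooks that failed processing.
failed_webhooks: List[Dict[str, Any]] = []


def process_safely(
    payload: Dict[str, Any],
    handler: Callable[[Dict[str, Any]], Any],
) -> bool:
    """Run handler on payload; on failure, log it and keep the payload for later."""
    try:
        handler(payload)
        return True
    except Exception:
        # logger.exception records the traceback along with the message
        logger.exception("Webhook %s failed", payload.get("requestId"))
        failed_webhooks.append(payload)
        return False
```

Because a failed delivery is retried only once, keeping your own copy of payloads that failed during processing is the main way to recover from errors on your side.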

Example Implementation

Here's a basic Express.js webhook handler:

import express from 'express'
const app = express()

// The route path must match the URL you registered (https://your-domain.com/webhooks)
app.post('/webhooks', express.json(), async (req, res) => {
  try {
    // Get headers
    const signature = req.headers['x-sourcesync-signature']
    const timestamp = req.headers['x-sourcesync-timestamp']

    // Verify signature
    const isValid = await verifyWebhookSignature(
      req.body,
      signature,
      timestamp,
      process.env.WEBHOOK_SIGNING_KEY,
    )

    if (!isValid) {
      return res.status(401).json({ error: 'Invalid signature' })
    }

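    // Process webhook asynchronously; log failures so a rejected
    // promise does not go unhandled
    const { event, data } = req.body
    processWebhookAsync(event, data).catch((error) => {
      console.error('Webhook processing failed:', error)
    })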

    // Respond immediately
    res.status(200).json({ received: true })
  } catch (error) {
    console.error('Webhook error:', error)
    res.status(500).json({ error: 'Internal server error' })
  }
})

async function processWebhookAsync(event: string, data: any) {
  switch (event) {
    case 'DOCUMENTS_READY':
      await handleDocumentsReady(data.documentIds)
      break
    case 'DOCUMENTS_ERROR':
      await handleDocumentsError(data.documentIds)
      break
    // Handle other events...
  }
}

Here's an equivalent Python implementation using Flask:

from flask import Flask, request, jsonify
import os
import threading
from datetime import datetime, timezone
import json
import hmac
import hashlib

app = Flask(__name__)

@app.route('/webhooks', methods=['POST'])
def webhook_handler():
    try:
        # Get headers
        signature = request.headers.get('x-sourcesync-signature')
        timestamp = request.headers.get('x-sourcesync-timestamp')

        # Verify signature
        is_valid = verify_webhook_signature(
            request.json,
            signature,
            timestamp,
            os.environ.get('WEBHOOK_SIGNING_KEY')
        )

        if not is_valid:
            return jsonify({'error': 'Invalid signature'}), 401

        # Process webhook asynchronously (after responding)
        webhook_data = request.json
        event = webhook_data.get('event')
        data = webhook_data.get('data')

        # Start processing in a separate thread
        threading.Thread(
            target=process_webhook_async,
            args=(event, data)
        ).start()

        # Respond immediately
        return jsonify({'received': True}), 200

    except Exception as e:
        print(f"Webhook error: {e}")
        return jsonify({'error': 'Internal server error'}), 500

def verify_webhook_signature(payload, signature, timestamp, signing_key):
    try:
        if not signature or not timestamp:
            return False

        # Check timestamp freshness
        try:
            event_time = datetime.fromisoformat(timestamp.replace('Z', '+00:00'))
            current_time = datetime.now(timezone.utc)
            if (current_time - event_time).total_seconds() > 300:  # 5 minutes
                print('Webhook timestamp is too old')
                return False
        except ValueError:
            return False

        # Recreate the signed string
        payload_with_timestamp = json.dumps(
            {'payload': payload, 'timestamp': timestamp},
            separators=(',', ':')  # Use compact JSON encoding
        )

        # Create HMAC signature
        computed_signature = hmac.new(
            signing_key.encode('utf-8'),
            payload_with_timestamp.encode('utf-8'),
            hashlib.sha256
        ).hexdigest()

        # Compare signatures using constant-time comparison
        return hmac.compare_digest(computed_signature, signature)
    except Exception as e:
        print(f"Signature verification failed: {e}")
        return False

def process_webhook_async(event, data):
    """Process webhook asynchronously to avoid blocking the response."""
    try:
        if event == 'DOCUMENTS_READY':
            handle_documents_ready(data.get('documentIds', []))
        elif event == 'DOCUMENTS_ERROR':
            handle_documents_error(data.get('documentIds', []))
        # Handle other events...
    except Exception as e:
        print(f"Error processing webhook: {e}")

def handle_documents_ready(document_ids):
    """Handle the DOCUMENTS_READY event."""
    # Your business logic here
    print(f"Processing {len(document_ids)} ready documents")

def handle_documents_error(document_ids):
    """Handle the DOCUMENTS_ERROR event."""
    # Your business logic here
    print(f"Processing {len(document_ids)} documents with errors")

if __name__ == '__main__':
    # Run the Flask app
    app.run(host='0.0.0.0', port=5000)