File Conversion API Reference

Learn about the file conversion endpoints and how to convert various file formats to markdown.

POST /v1/convert/file

Convert File

Convert a file to markdown format.

Request Headers

  • Name
    Authorization*
    Type
    string
    Description
    Bearer token authentication. Set the value to "Bearer your_api_key", where your_api_key is your API key.
  • Name
    Accept*
    Type
    string
    Description
    Response format to request. Set to application/json.

Request Form Data

  • Name
    file*
    Type
    file
    Description
    File to convert
  • Name
    ocrConfig
    Type
    object (stringified) (optional)
    Description
    Configuration for OCR
    • Name
      strategy*
      Type
      enum<string>
      Description
      OCR strategy. Defaults to BASIC_PARSER.
      Available options: BASIC_PARSER, STANDARD_OCR

Request

curl -X POST https://api.sourcesync.ai/v1/convert/file \
  -H "Authorization: Bearer $SOURCE_SYNC_API_KEY" \
  -H "Accept: application/json" \
  -F 'file=@"/Users/Downloads/sample.pdf"' \
  -F 'ocrConfig="{\"strategy\": \"STANDARD_OCR\"}"'
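
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function name build_convert_request and the application/octet-stream part type are assumptions, while the URL, headers, and form field names come from the curl example above.

```python
import json
import uuid
import urllib.request

API_URL = "https://api.sourcesync.ai/v1/convert/file"


def build_convert_request(file_bytes, filename, api_key, strategy="BASIC_PARSER"):
    """Build (but do not send) a multipart/form-data POST for /v1/convert/file."""
    boundary = uuid.uuid4().hex  # random delimiter, unlikely to occur in the payload
    ocr_config = json.dumps({"strategy": strategy})  # ocrConfig is sent as a stringified object

    # First part: the file itself. The part content type is an assumption.
    file_part_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")

    # Second part: the ocrConfig field, followed by the closing delimiter.
    ocr_part = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="ocrConfig"\r\n\r\n'
        f"{ocr_config}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")

    return urllib.request.Request(
        API_URL,
        data=file_part_head + file_bytes + ocr_part,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to urllib.request.urlopen to send it. Building the request separately from sending it keeps the encoding step easy to inspect without touching the network.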


Response Body

  • Name
    success*
    Type
    boolean
    Description
    Indicates whether the request succeeded. Always true for successful responses.
  • Name
    message*
    Type
    string
    Description
    Human-readable message describing the result of the request
  • Name
    data*
    Type
    object
    Description
    Data returned from the API.
    • Name
      documents*
      Type
      array<object>
      Description
      Details of the converted documents
      • Name
        filename*
        Type
        string
        Description
        Name of the converted file
      • Name
        markdown*
        Type
        string
        Description
        Converted content in markdown format

Response

{
  "success": true,
  "message": "File converted successfully",
  "data": {
    "documents": [
      {
        "filename": "sample.pdf",
        "markdown": "# Sample Document\n\nThis is the converted content of the document in markdown format.\n\n## Section 1\n\nContent of section 1.\n\n## Section 2\n\nContent of section 2."
      }
    ]
  }
}
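
A minimal Python sketch of reading this response, assuming the body is the JSON shown above. The function name extract_markdown is illustrative, and the failure branch is an assumption, since the shape of error responses is not described in this section.

```python
import json


def extract_markdown(response_text):
    """Return {filename: markdown} from a successful /v1/convert/file response body."""
    payload = json.loads(response_text)
    if payload.get("success") is not True:
        # Assumption: a failed response still carries a "message" field.
        raise ValueError(payload.get("message", "conversion failed"))
    # data.documents is an array of objects with "filename" and "markdown".
    return {doc["filename"]: doc["markdown"] for doc in payload["data"]["documents"]}
```

Returning a dictionary keyed by filename is a design choice for this sketch. If two documents could share a filename, a list of pairs would be safer.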

Usage Guidelines

Synchronous Processing

The /v1/convert/file endpoint processes files synchronously and returns results immediately, which makes it ideal for:

  • Quick document previews
  • Small files
  • Testing and development environments
  • User-facing applications where immediate results are needed

File Size Limitations

Due to its synchronous nature, this endpoint has the following limitations:

  • Requests time out after 30 seconds
  • Larger files may not finish processing within that window, in which case the request times out
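
On the client side, the 30-second limit can be mirrored with an explicit timeout so a slow conversion fails predictably. The sketch below is an assumption about how a caller might do this with Python's standard library. The helper name send_with_timeout and its opener parameter are hypothetical, and the opener exists only so the function can be exercised without a network call.

```python
import socket
import urllib.error
import urllib.request

SYNC_TIMEOUT_SECONDS = 30  # mirrors the documented request timeout


def send_with_timeout(request, opener=urllib.request.urlopen, timeout=SYNC_TIMEOUT_SECONDS):
    """Send the request; return the body bytes, or None if the call timed out."""
    try:
        with opener(request, timeout=timeout) as response:
            return response.read()
    except (TimeoutError, socket.timeout):
        # Raised directly when reading the response takes too long.
        return None
    except urllib.error.URLError as exc:
        # urllib wraps connection-phase timeouts in URLError.
        if isinstance(exc.reason, (TimeoutError, socket.timeout)):
            return None
        raise
```

A None result is a signal to the caller that the file may be better suited to asynchronous processing. Other network errors are re-raised unchanged.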

For Larger Files

Recommended approach: use the /v1/ingest/file endpoint, which:

  • Processes files asynchronously
  • Can handle much larger documents
  • Provides better scaling for production workloads
  • Includes document tracking and status updates
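
One way to apply this guidance is to route files by size before uploading. The sketch below is illustrative only: the 5 MiB cut-off is a hypothetical value chosen by the caller, not a documented limit, and the request format of /v1/ingest/file is not covered in this section.

```python
CONVERT_PATH = "/v1/convert/file"  # synchronous conversion
INGEST_PATH = "/v1/ingest/file"    # asynchronous ingestion

# Hypothetical cut-off; tune it to what reliably finishes within the timeout.
SYNC_SIZE_THRESHOLD = 5 * 1024 * 1024


def pick_endpoint(size_bytes, threshold=SYNC_SIZE_THRESHOLD):
    """Return the endpoint path to use for a file of the given size."""
    return CONVERT_PATH if size_bytes <= threshold else INGEST_PATH
```

File size is only a rough proxy for processing time. A small scanned PDF processed with STANDARD_OCR may take longer than a larger text-only document.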

OCR Configuration

The /v1/convert/file endpoint supports Optical Character Recognition (OCR) for extracting text from images or documents that contain images.

  • Name
    OCR Strategy
    Description
    Choose between basic parsing and advanced OCR capabilities: BASIC_PARSER (default method, faster but less accurate for image-heavy documents) or STANDARD_OCR (more accurate OCR for documents with images or scanned content).
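
Because ocrConfig is sent as a stringified object, it can help to validate the strategy and serialize it in one place. A minimal sketch, assuming a helper named ocr_config_field (a hypothetical name); the two strategy values are the ones listed in this reference.

```python
import json

OCR_STRATEGIES = ("BASIC_PARSER", "STANDARD_OCR")


def ocr_config_field(strategy="BASIC_PARSER"):
    """Return the JSON string to send as the ocrConfig form field."""
    if strategy not in OCR_STRATEGIES:
        raise ValueError(f"strategy must be one of {OCR_STRATEGIES}, got {strategy!r}")
    return json.dumps({"strategy": strategy})
```

Rejecting unknown values before upload avoids a round trip that would otherwise end in an INVALID_OCR_CONFIG error.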

Supported File Types

  • Name
    Document Files
    Description
    PDF, DOCX, DOC, TXT, RTF, ODT
  • Name
    Presentation Files
    Description
    PPTX, PPT, ODP
  • Name
    Spreadsheet Files
    Description
    XLSX, XLS, CSV, ODS
  • Name
    Image Files
    Description
    PNG, JPG, JPEG, TIFF, GIF, BMP (OCR is applied automatically)
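
A client can check a filename against this list before uploading. The sketch below assumes the file type can be judged by extension, which is an assumption about client-side convenience rather than a statement of how the server detects formats. The names SUPPORTED_TYPES and file_category are hypothetical.

```python
from pathlib import PurePath

# Extensions taken from the supported file types listed above.
SUPPORTED_TYPES = {
    "document": {"pdf", "docx", "doc", "txt", "rtf", "odt"},
    "presentation": {"pptx", "ppt", "odp"},
    "spreadsheet": {"xlsx", "xls", "csv", "ods"},
    "image": {"png", "jpg", "jpeg", "tiff", "gif", "bmp"},
}


def file_category(filename):
    """Return the category for a filename, or None if its extension is not listed."""
    extension = PurePath(filename).suffix.lstrip(".").lower()
    for category, extensions in SUPPORTED_TYPES.items():
        if extension in extensions:
            return category
    return None
```

A None result means the upload would likely be rejected with UNSUPPORTED_FILE_TYPE, so the caller can report the problem without sending the file.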

Error Codes

  • Name
    FILE_TOO_LARGE
    Description
    The uploaded file exceeds the size limit
  • Name
    UNSUPPORTED_FILE_TYPE
    Description
    The file format is not supported for conversion
  • Name
    CONVERT_FILE_FAILED
    Description
    Internal error during file conversion process
  • Name
    INVALID_OCR_CONFIG
    Description
    Invalid OCR configuration provided
  • Name
    REQUEST_TIMEOUT
    Description
    The request timed out due to file size or processing complexity
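
These codes can be mapped to a follow-up action in client code. The sketch below is one possible mapping, not documented behavior: the action names are hypothetical, treating CONVERT_FILE_FAILED as retryable is an assumption, and the function takes the error code as a plain string because this section does not specify where the code appears in an error response.

```python
# Hypothetical action names; the error codes are the ones listed above.
ERROR_ACTIONS = {
    "FILE_TOO_LARGE": "use_async_ingest",    # switch to /v1/ingest/file
    "REQUEST_TIMEOUT": "use_async_ingest",   # switch to /v1/ingest/file
    "UNSUPPORTED_FILE_TYPE": "fix_request",  # convert the file to a supported type
    "INVALID_OCR_CONFIG": "fix_request",     # correct the ocrConfig field
    "CONVERT_FILE_FAILED": "retry_later",    # assumption: internal errors may be transient
}


def next_action(error_code):
    """Return a suggested follow-up for an error code, or "unknown" if unrecognized."""
    return ERROR_ACTIONS.get(error_code, "unknown")
```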