Getting Started
Welcome to Curacel Doc Extractor! This guide will help you make your first API call and get started with document processing.
Prerequisites
Before you begin, ensure you have:
- An API key for the production environment
- Basic knowledge of HTTP requests and JSON
Quick Start
Step 1: Set Up Your Environment
First, set up your environment with the necessary credentials:
# Set your API key
export DOC_EXTRACTOR_API_KEY="your_production_api_key_here"
# Set the base URL for production
export DOC_EXTRACTOR_BASE_URL="https://extract.curacel.co/api"
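If you call the API from a script rather than the shell, it helps to read these two variables once and fail early when either is missing. The sketch below is one way to do that; `load_config` is a hypothetical helper name, not part of any Curacel SDK.

```python
import os


def load_config(env=None):
    """Read the credentials exported in Step 1 (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("DOC_EXTRACTOR_API_KEY", "").strip()
    # Drop a trailing slash so paths can be appended with a single "/".
    base_url = env.get("DOC_EXTRACTOR_BASE_URL", "").strip().rstrip("/")
    if not api_key:
        raise RuntimeError("DOC_EXTRACTOR_API_KEY is not set")
    if not base_url:
        raise RuntimeError("DOC_EXTRACTOR_BASE_URL is not set")
    return {"api_key": api_key, "base_url": base_url}
```

Calling `load_config()` with no arguments reads from the current process environment; passing a dictionary is handy in tests.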
Step 2: Make Your First API Call
Let's extract data from a sample document:
curl -X POST "$DOC_EXTRACTOR_BASE_URL/annotate" \
  -H "X-API-Key: $DOC_EXTRACTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "type": "url",
        "file": {
          "name": "sample.pdf",
          "content": "https://example.com/sample.pdf"
        }
      }
    ],
    "fields": ["first_name", "last_name", "email", "phone_number"],
    "location": "Kenya",
    "data_type": "finance"
  }'
Step 3: Understand the Response
A successful response will look like this:
{
  "success": true,
  "data": {
    "sample.pdf": {
      "first_name": "John",
      "last_name": "Doe",
      "email": "john.doe@example.com",
      "phone_number": "+1234567890"
    }
  },
  "message": "Data extracted successfully"
}
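Once the response is decoded, you will usually want the extracted fields grouped by file. The sketch below assumes that `data` is an object keyed by file name, with one object of extracted fields per file, and that `success` is `true` on success. `fields_by_file` is a hypothetical helper, and the sample body is the example response from this step, not live output.

```python
def fields_by_file(response_body):
    """Return {file_name: {field: value}} from a decoded JSON response."""
    if not response_body.get("success"):
        raise ValueError(response_body.get("message", "extraction failed"))
    # Assumption: "data" maps each file name to its extracted fields.
    data = response_body.get("data") or {}
    return {name: dict(fields) for name, fields in data.items()}


sample = {
    "success": True,
    "data": {
        "sample.pdf": {
            "first_name": "John",
            "last_name": "Doe",
            "email": "john.doe@example.com",
            "phone_number": "+1234567890",
        }
    },
    "message": "Data extracted successfully",
}

for name, fields in fields_by_file(sample).items():
    print(name, fields["email"])
```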
Understanding the Request Structure
File Input Types
The API supports three types of file input. This guide covers two of them:
1. URL Input
{
  "type": "url",
  "file": {
    "name": "document.pdf",
    "content": "https://example.com/document.pdf"
  }
}
2. Base64 Input
{
  "type": "base64",
  "file": {
    "name": "document.pdf",
    "content": "data:application/pdf;base64,JVBERi0xLjQKJcfsj6IKNSAwIG9iago8PAovVHlwZSAvUGFnZQovUGFyZW50IDMgMCBSCi9NZWRpYUJveCBbMCAwIDU5NSA4NDJdCi9SZXNvdXJjZXMgPDwKL0ZvbnQgPDwKL0YxIDIgMCBSCj4+Cj4+Ci9Db250ZW50cyA0IDAgUgo+PgoKZW5kb2JqCg=="
  }
}
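The `content` value in the Base64 example is a data URI: the MIME type, the `;base64,` marker, then the encoded bytes. A minimal sketch of building that entry from raw file bytes follows; `base64_file_input` is a hypothetical helper, and guessing the MIME type from the file extension is an assumption about what the API expects.

```python
import base64
import mimetypes


def base64_file_input(name, raw_bytes):
    """Build a "base64" file entry shaped like the example above."""
    # Guess the MIME type from the extension; fall back to a generic type.
    mime_type = mimetypes.guess_type(name)[0] or "application/octet-stream"
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {
        "type": "base64",
        "file": {
            "name": name,
            "content": f"data:{mime_type};base64,{encoded}",
        },
    }
```

To encode a file on disk, read it in binary mode first, for example `base64_file_input("document.pdf", open("document.pdf", "rb").read())`.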
Field Configuration
Specify which fields you want to extract:
{
  "fields": [
    "first_name",
    "last_name",
    "email",
    "phone_number",
    "address",
    "date_of_birth"
  ]
}
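The file entries and the field list travel together in one request body. The sketch below combines them and tidies the field names first. `build_payload` is a hypothetical helper, and its checks (no empty lists, no duplicate names) are local conveniences rather than documented API rules.

```python
def build_payload(files, fields):
    """Combine file entries and field names into one request body."""
    cleaned = []
    for field in fields:
        name = field.strip()
        # Keep the first occurrence of each non-empty field name.
        if name and name not in cleaned:
            cleaned.append(name)
    if not files:
        raise ValueError("at least one file input is required")
    if not cleaned:
        raise ValueError("at least one field name is required")
    return {"files": list(files), "fields": cleaned}
```

The returned dictionary can be passed as the JSON body of the request, for example as the `json=` argument of `requests.post`.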
Code Examples
JavaScript/Node.js
const axios = require("axios");
async function extractDocumentData() {
  try {
    const response = await axios.post(
      "https://extract.curacel.co/api/annotate",
      {
        files: [
          {
            type: "url",
            file: {
              name: "document.pdf",
              content: "https://example.com/document.pdf",
            },
          },
        ],
        fields: ["first_name", "last_name", "email", "phone_number"],
      },
      {
        headers: {
          "X-API-Key": process.env.DOC_EXTRACTOR_API_KEY,
          "Content-Type": "application/json",
        },
      },
    );
    console.log("Extracted data:", response.data);
    return response.data;
  } catch (error) {
    console.error("Error:", error.response?.data || error.message);
    throw error;
  }
}
// Usage
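// The function rethrows after logging, so handle the rejection here
extractDocumentData().catch(() => {
  process.exitCode = 1;
});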
Python
import requests
import os
def extract_document_data():
    url = 'https://extract.curacel.co/api/annotate'
    headers = {
        'X-API-Key': os.getenv('DOC_EXTRACTOR_API_KEY'),
        'Content-Type': 'application/json'
    }
    data = {
        'files': [
            {
                'type': 'url',
                'file': {
                    'name': 'document.pdf',
                    'content': 'https://example.com/document.pdf'
                }
            }
        ],
        'fields': ['first_name', 'last_name', 'email', 'phone_number']
    }
    try:
        # Set a timeout so a stalled connection cannot hang the script
        response = requests.post(url, json=data, headers=headers, timeout=30)
        response.raise_for_status()
        result = response.json()
        print('Extracted data:', result)
        return result
    except requests.exceptions.RequestException as e:
        print('Error:', e)
        raise
# Usage
extract_document_data()
PHP
<?php
function extractDocumentData() {
    $url = 'https://extract.curacel.co/api/annotate';
    $headers = [
        // getenv() works even when variables_order leaves $_ENV empty
        'X-API-Key: ' . getenv('DOC_EXTRACTOR_API_KEY'),
        'Content-Type: application/json'
    ];
    $data = [
        'files' => [
            [
                'type' => 'url',
                'file' => [
                    'name' => 'document.pdf',
                    'content' => 'https://example.com/document.pdf'
                ]
            ]
        ],
        'fields' => ['first_name', 'last_name', 'email', 'phone_number']
    ];
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if ($httpCode === 200) {
        $result = json_decode($response, true);
        echo 'Extracted data: ' . json_encode($result) . PHP_EOL;
        return $result;
    } else {
        echo 'Error: HTTP ' . $httpCode . ' - ' . $response . PHP_EOL;
        return false;
    }
}
// Usage
extractDocumentData();
?>
Batch Processing
To process multiple documents in a single request, add one entry to the files array per document:
curl -X POST "https://extract.curacel.co/api/annotate" \
  -H "X-API-Key: your_production_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "type": "url",
        "file": {
          "name": "document1.pdf",
          "content": "https://example.com/document1.pdf"
        }
      },
      {
        "type": "url",
        "file": {
          "name": "document2.pdf",
          "content": "https://example.com/document2.pdf"
        }
      }
    ],
    "fields": ["first_name", "last_name", "email", "phone_number"]
  }'
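When the documents come from a list of URLs, the `files` array for a batch request can be generated rather than written by hand. In this sketch, `url_file_inputs` is a hypothetical helper, and taking each `name` from the last segment of the URL path is an assumption. Use whatever names you want to see in the response.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def url_file_inputs(urls):
    """Build one "url" file entry per document URL."""
    inputs = []
    for url in urls:
        # Use the last path segment as the file name, e.g. "document1.pdf".
        name = PurePosixPath(urlparse(url).path).name or "document"
        inputs.append({"type": "url", "file": {"name": name, "content": url}})
    return inputs


files = url_file_inputs([
    "https://example.com/document1.pdf",
    "https://example.com/document2.pdf",
])
```

The resulting `files` list matches the structure of the batch request shown above.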
Error Handling
Common Error Responses
400 Bad Request
{
  "status": false,
  "message": "Invalid file format or missing required fields"
}
401 Unauthorized
{
  "status": false,
  "message": "Invalid API key"
}
422 Unprocessable Entity
{
  "status": false,
  "message": "Document processing failed"
}
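In client code, these responses are easier to handle as one exception type that carries the status code and message. The sketch below covers only the three statuses listed here. `ExtractionError`, `check_response`, and the hint texts are hypothetical names and wording chosen for illustration.

```python
class ExtractionError(Exception):
    """Raised for any non-2xx response from the API."""

    def __init__(self, status_code, message):
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code
        self.message = message


# Short hints for the error statuses documented above.
HINTS = {
    400: "check the file format and required fields",
    401: "check the X-API-Key header",
    422: "the document could not be processed",
}


def check_response(status_code, body):
    """Return the body on success, otherwise raise ExtractionError."""
    if 200 <= status_code < 300:
        return body
    if isinstance(body, dict):
        message = body.get("message", "unknown error")
    else:
        message = str(body)
    hint = HINTS.get(status_code)
    if hint:
        message = f"{message} ({hint})"
    raise ExtractionError(status_code, message)
```

With the `requests` library, this could be called as `check_response(response.status_code, response.json())`, assuming the error body is valid JSON.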
Next Steps
Now that you've made your first API call:
- Explore the API Reference: Check out all available endpoints
- Test Different Document Types: Try various file formats
- Implement Error Handling: Add robust error handling to your code
- Set Up Production: Configure your production environment
- Monitor Usage: Track your API usage and limits
Support
If you need help:
- Documentation: Check our comprehensive guides
- API Reference: Explore all available endpoints
- Support: Contact us at support@curacel.ai
- Community: Join our developer community