Getting Started

Welcome to Curacel Doc Extractor! This guide will help you make your first API call and get started with document processing.

Prerequisites

Before you begin, ensure you have:

An API key for the production environment
Basic knowledge of HTTP requests and JSON

Quick Start

Step 1: Set Up Your Environment

First, set up your environment with the necessary credentials:

# Set your API key
export DOC_EXTRACTOR_API_KEY="your_production_api_key_here"

# Set the base URL for production
export DOC_EXTRACTOR_BASE_URL="https://extract.curacel.co/api"
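
In application code, the same two variables can be read once at startup and checked before any request is sent. The following is a minimal Python sketch. The `load_config` helper and its fallback to the production base URL are illustrative assumptions, not part of an official SDK.

```python
import os

# Default taken from the export above; override it with DOC_EXTRACTOR_BASE_URL.
DEFAULT_BASE_URL = "https://extract.curacel.co/api"


def load_config(env=None):
    """Return (api_key, base_url) read from the environment.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    api_key = (env.get("DOC_EXTRACTOR_API_KEY") or "").strip()
    if not api_key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("DOC_EXTRACTOR_API_KEY is not set")
    # Strip a trailing slash so endpoint paths can be appended safely.
    base_url = (env.get("DOC_EXTRACTOR_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")
    return api_key, base_url
```

Calling `load_config()` with no arguments reads the variables exported above and raises a `RuntimeError` if the key is missing.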

Step 2: Make Your First API Call

Let's extract data from a sample document:

curl -X POST "$DOC_EXTRACTOR_BASE_URL/annotate" \
  -H "X-API-Key: $DOC_EXTRACTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "type": "url",
        "file": {
          "name": "sample.pdf",
          "content": "https://example.com/sample.pdf"
        }
      }
    ],
    "fields": ["first_name", "last_name", "email", "phone_number"],
    "location": "Kenya",
    "data_type": "finance"
  }'

Step 3: Understand the Response

A successful response will look like this:

{
  "success": true,
  "data": {
    "sample.pdf": {
      "first_name": "John",
      "last_name": "Doe",
      "email": "john.doe@example.com",
      "phone_number": "+1234567890"
    }
  },
  "message": "Data extracted successfully"
}
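
Once the response is parsed, the extracted fields can be read per file. The sketch below is a hypothetical helper, not an official client. It assumes `data` is either an object keyed by file name or a list of such objects, a shape inferred from the sample rather than from a published schema.

```python
def iter_file_results(response_body):
    """Yield (file_name, fields) pairs from a parsed response body.

    Assumes `data` is an object keyed by file name, or a list of such objects.
    """
    if not response_body.get("success"):
        raise ValueError(response_body.get("message", "extraction failed"))
    data = response_body.get("data") or {}
    # Normalise both assumed shapes to a list of {file_name: fields} objects.
    chunks = data if isinstance(data, list) else [data]
    for chunk in chunks:
        for file_name, fields in chunk.items():
            yield file_name, fields


sample = {
    "success": True,
    "data": {"sample.pdf": {"first_name": "John", "last_name": "Doe"}},
    "message": "Data extracted successfully",
}
for name, fields in iter_file_results(sample):
    print(name, fields["first_name"], fields["last_name"])
```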

Understanding the Request Structure

File Input Types

The API supports the following file input types:

1. URL Input

{
  "type": "url",
  "file": {
    "name": "document.pdf",
    "content": "https://example.com/document.pdf"
  }
}

2. Base64 Input

{
  "type": "base64",
  "file": {
    "name": "document.pdf",
    "content": "data:application/pdf;base64,JVBERi0xLjQKJcfsj6IKNSAwIG9iago8PAovVHlwZSAvUGFnZQovUGFyZW50IDMgMCBSCi9NZWRpYUJveCBbMCAwIDU5NSA4NDJdCi9SZXNvdXJjZXMgPDwKL0ZvbnQgPDwKL0YxIDIgMCBSCj4+Cj4+Ci9Db250ZW50cyA0IDAgUgo+PgoKZW5kb2JqCg=="
  }
}
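
A Base64 entry like the one above can be produced from raw file bytes. The following Python sketch uses only the standard library. The `base64_file_entry` helper name is hypothetical, and guessing the MIME type from the file name is an assumption about what the API accepts.

```python
import base64
import mimetypes


def base64_file_entry(name, raw_bytes):
    """Build a "base64" file entry with the content encoded as a data URI."""
    # Guess the MIME type from the extension; fall back to a generic binary type.
    mime_type = mimetypes.guess_type(name)[0] or "application/octet-stream"
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {
        "type": "base64",
        "file": {"name": name, "content": f"data:{mime_type};base64,{encoded}"},
    }
```

For a file on disk, pass its bytes directly, for example `base64_file_entry("document.pdf", open("document.pdf", "rb").read())`.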

Field Configuration

Specify which fields you want to extract:

{
  "fields": [
    "first_name",
    "last_name",
    "email",
    "phone_number",
    "address",
    "date_of_birth"
  ]
}
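
After a request, it can be useful to check which requested fields actually came back. This guide does not state whether the API omits fields it cannot find or returns them empty, so the hypothetical helper below treats a missing key, `None`, and an empty string the same way.

```python
def missing_fields(requested, extracted):
    """Return the requested field names that are absent or empty in one file's result."""
    return [name for name in requested if extracted.get(name) in (None, "")]


requested = ["first_name", "last_name", "address", "date_of_birth"]
extracted = {"first_name": "John", "last_name": "Doe", "address": ""}
print(missing_fields(requested, extracted))
```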

Code Examples

JavaScript/Node.js

const axios = require("axios");

async function extractDocumentData() {
  try {
    const response = await axios.post(
      "https://extract.curacel.co/api/annotate",
      {
        files: [
          {
            type: "url",
            file: {
              name: "document.pdf",
              content: "https://example.com/document.pdf",
            },
          },
        ],
        fields: ["first_name", "last_name", "email", "phone_number"],
      },
      {
        headers: {
          "X-API-Key": process.env.DOC_EXTRACTOR_API_KEY,
          "Content-Type": "application/json",
        },
      },
    );

    console.log("Extracted data:", response.data);
    return response.data;
  } catch (error) {
    console.error("Error:", error.response?.data || error.message);
    throw error;
  }
}

// Usage
extractDocumentData();

Python

import requests
import os

def extract_document_data():
    url = 'https://extract.curacel.co/api/annotate'

    headers = {
        'X-API-Key': os.getenv('DOC_EXTRACTOR_API_KEY'),
        'Content-Type': 'application/json'
    }

    data = {
        'files': [
            {
                'type': 'url',
                'file': {
                    'name': 'document.pdf',
                    'content': 'https://example.com/document.pdf'
                }
            }
        ],
        'fields': ['first_name', 'last_name', 'email', 'phone_number']
    }

    try:
        # A timeout keeps the call from hanging indefinitely; adjust it to your document sizes
        response = requests.post(url, json=data, headers=headers, timeout=60)
        response.raise_for_status()

        result = response.json()
        print('Extracted data:', result)
        return result

    except requests.exceptions.RequestException as e:
        print('Error:', e)
        raise

# Usage
extract_document_data()

PHP

<?php
function extractDocumentData() {
    $url = 'https://extract.curacel.co/api/annotate';

    $headers = [
        // getenv() works even when php.ini's variables_order leaves $_ENV empty
        'X-API-Key: ' . getenv('DOC_EXTRACTOR_API_KEY'),
        'Content-Type: application/json'
    ];

    $data = [
        'files' => [
            [
                'type' => 'url',
                'file' => [
                    'name' => 'document.pdf',
                    'content' => 'https://example.com/document.pdf'
                ]
            ]
        ],
        'fields' => ['first_name', 'last_name', 'email', 'phone_number']
    ];

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $response = curl_exec($ch);
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($httpCode === 200) {
        $result = json_decode($response, true);
        echo 'Extracted data: ' . json_encode($result) . PHP_EOL;
        return $result;
    } else {
        echo 'Error: HTTP ' . $httpCode . ' - ' . $response . PHP_EOL;
        return false;
    }
}

// Usage
extractDocumentData();
?>

Batch Processing

For processing multiple documents at once:

curl -X POST "https://extract.curacel.co/api/annotate" \
  -H "X-API-Key: your_production_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "type": "url",
        "file": {
          "name": "document1.pdf",
          "content": "https://example.com/document1.pdf"
        }
      },
      {
        "type": "url",
        "file": {
          "name": "document2.pdf",
          "content": "https://example.com/document2.pdf"
        }
      }
    ],
    "fields": ["first_name", "last_name", "email", "phone_number"]
  }'
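
When the list of documents is built at runtime, the batch request body can be assembled in code. This is a minimal sketch. The `build_batch_payload` helper is hypothetical, and deriving each file name from the last segment of the URL path is an assumption made for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def build_batch_payload(urls, fields):
    """Build one request body containing a "url" file entry per document URL."""
    files = []
    for url in urls:
        # Use the last path segment as the file name; fall back to a generic one.
        name = PurePosixPath(urlparse(url).path).name or "document"
        files.append({"type": "url", "file": {"name": name, "content": url}})
    return {"files": files, "fields": list(fields)}


payload = build_batch_payload(
    ["https://example.com/document1.pdf", "https://example.com/document2.pdf"],
    ["first_name", "last_name", "email", "phone_number"],
)
```

The resulting `payload` can be sent with `requests.post(url, json=payload, headers=headers)`. If results are keyed by file name, two URLs that end in the same name could collide, so distinct names are safer.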

Error Handling

Common Error Responses

400 Bad Request

{
  "status": false,
  "message": "Invalid file format or missing required fields"
}

401 Unauthorized

{
  "status": false,
  "message": "Invalid API key"
}

422 Unprocessable Entity

{
  "status": false,
  "message": "Document processing failed"
}
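
These responses can be turned into exceptions so that calling code handles failures in one place. The sketch below is an illustration under stated assumptions. `ExtractionError` and `check_response` are hypothetical names, and treating 429 and 5xx responses as retryable is a common convention rather than documented behavior of this API.

```python
class ExtractionError(Exception):
    """Raised when the API responds with a non-success status code."""

    def __init__(self, status_code, message, retryable):
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code
        self.message = message
        self.retryable = retryable


def check_response(status_code, body):
    """Return `body` on a 2xx status, otherwise raise ExtractionError."""
    if 200 <= status_code < 300:
        return body
    # Error bodies in this guide carry a "message" key; fall back if it is absent.
    message = body.get("message", "unknown error") if isinstance(body, dict) else str(body)
    # Assumption: only rate limiting (429) and server errors (5xx) are worth retrying.
    retryable = status_code == 429 or status_code >= 500
    raise ExtractionError(status_code, message, retryable)
```

With the `requests` library, this would be called as `check_response(response.status_code, response.json())` in place of `response.raise_for_status()`.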

Next Steps

Now that you've made your first API call:

Explore the API Reference: Check out all available endpoints
Test Different Document Types: Try various file formats
Implement Error Handling: Add robust error handling to your code
Set Up Production: Configure your production environment
Monitor Usage: Track your API usage and limits

Support

If you need help:

Documentation: Check our comprehensive guides
API Reference: Explore all available endpoints
Support: Contact us at support@curacel.ai
Community: Join our developer community
