📄 Document Processing API v1.2.3

A powerful REST API service that analyzes, summarizes, and extracts information from documents using AI.

🚀 Service Status: Running on port 6082
📦 S3 Bucket for Uploads: documentsummarizer (Region: us-east-1)
🤖 AI Model: Private

🔗 Endpoints

GET /health

Checks the operational status and version of the API service.

200 OK: Service is healthy.

{
    "status": "healthy",
    "version": "1.2.3",
    "timestamp": "YYYY-MM-DDTHH:MM:SSZ",
    "service": "document-processing-api"
}
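
As a rough sketch of how a client might consume this endpoint, the snippet below parses the health payload and treats the service as up only when status is "healthy". The base URL (localhost on port 6082) and the helper names parse_health and check_health are assumptions for illustration; the documentation states only the port, not the host.

```python
import json
import urllib.request

# Assumption: the service is reachable on localhost; only the port is documented.
BASE_URL = "http://localhost:6082"


def parse_health(body: str) -> bool:
    """Return True when a /health JSON body reports a healthy service."""
    data = json.loads(body)
    return data.get("status") == "healthy"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Call GET /health and interpret the response (performs a network call)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.status == 200 and parse_health(resp.read().decode("utf-8"))
```

Keeping the parsing separate from the network call makes the check easy to test without a running service.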

POST /document-info

Identifies the type of the document, lists key information categories, and suggests custom prompt templates for detailed extraction with the /summarize endpoint. File content must be base64 encoded.

Request Body (application/json)

Parameter | Type | Required | Description
fileName | string | Yes | Name of the file (e.g., "invoice.png"). Used for MIME type detection.
fileContent | string | Yes | Base64-encoded string of the file content.

Example Request

{
    "fileName": "statement_may_2023.pdf",
    "fileContent": "BASE64_ENCODED_FILE_CONTENT_STRING"
}
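
One possible way to produce a request like the one above is sketched here: the file bytes are base64 encoded and wrapped in the documented fileName and fileContent fields. The helper names and the localhost base URL are hypothetical, and the request object is only constructed, not sent.

```python
import base64
import json
import urllib.request

# Assumption: the service is reachable on localhost; only the port is documented.
BASE_URL = "http://localhost:6082"


def build_document_info_payload(file_name: str, raw_bytes: bytes) -> dict:
    """Build the JSON body for POST /document-info from raw file bytes."""
    return {
        "fileName": file_name,
        # The API expects the file content as a base64 string.
        "fileContent": base64.b64encode(raw_bytes).decode("ascii"),
    }


def make_post_request(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Create (but do not send) a JSON POST request for the given path."""
    return urllib.request.Request(
        f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and decode the JSON response body.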

Successful Response 200 OK

{
    "document_type": "Medical Bill",
    "key_information_categories": [
        "Patient Information",
        "Provider Details",
        "Service Dates & Descriptions",
        "Charges and Payments"
    ],
    "suggested_custom_prompts": [
        {
            "prompt_name": "Basic Bill Summary",
            "description": "Extract key summary details from the medical bill.",
            "prompt_template_for_user": "Please extract the following basic details from this medical bill. Respond with a single, valid JSON object only. The JSON structure should be: {\"patient_name\": \"\", \"provider_name\": \"\", \"total_charges\": \"\", \"amount_due\": \"\"}"
        },
        {
            "prompt_name": "Detailed Charges Extraction",
            "description": "Extract a comprehensive list of all itemized charges from the medical bill.",
            "prompt_template_for_user": "Extract all itemized charges from this medical bill. Your response must be a single, valid JSON object. The JSON structure should be: {\"bill_id\": \"\", \"patient_id\": \"\", \"itemized_charges\": [{\"service_date\": \"\", \"service_description\": \"\", \"cpt_code\": \"\", \"charge_amount\": \"\"}], \"total_billed_amount\": \"\"}"
        },
        {
            "prompt_name": "Insurance Payment Information",
            "description": "Find details about any insurance payments or adjustments mentioned on the bill.",
            "prompt_template_for_user": "From this medical bill, if insurance payment details are present, extract them. Otherwise, indicate no insurance payment found. Respond in JSON. The JSON structure should be: {\"insurance_payment_found\": , \"insurance_provider\": \"\", \"payment_amount\": \"\", \"adjustment_amount\": \"\"}"
        }
    ]
}

Note: The suggested_custom_prompts array offers several alternative templates. Choose one of the prompt_template_for_user strings, adapt it as needed, and pass it as the customJsonPrompt in a POST /summarize request.
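
The hand-off described in the note could look roughly like the following. The function names pick_prompt_template and to_summarize_payload are hypothetical; the field names come from the response and request formats documented here.

```python
def pick_prompt_template(document_info: dict, prompt_name: str) -> str:
    """Return the prompt_template_for_user whose prompt_name matches."""
    for entry in document_info.get("suggested_custom_prompts", []):
        if entry.get("prompt_name") == prompt_name:
            return entry["prompt_template_for_user"]
    raise KeyError(f"No suggested prompt named {prompt_name!r}")


def to_summarize_payload(file_name: str, file_content_b64: str, template: str) -> dict:
    """Build a POST /summarize body that reuses a suggested template."""
    return {
        "fileName": file_name,
        "fileContent": file_content_b64,  # already base64 encoded
        "customJsonPrompt": template,
    }
```

In practice the template would usually be edited before reuse, for example to add or drop fields in the requested JSON structure.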

POST /summarize

Analyzes and summarizes a document. If customJsonPrompt is provided, it attempts to extract data according to that prompt. Otherwise, it provides a general summary. The file content must be base64 encoded.

Request Body (application/json)

Parameter | Type | Required | Description
fileName | string | Yes | Name of the file (e.g., "report.pdf"). Used for MIME type detection and S3 naming.
fileContent | string | Yes | Base64-encoded string of the file content.
customJsonPrompt | string | No | Your instructions for the AI, which should request JSON output. If provided, the API returns the raw JSON generated by the AI. The /document-info endpoint can help you write this prompt.
summaryLength | string | No | Desired summary length: "short", "medium", or "long". Defaults to "medium". Only used if customJsonPrompt is empty.

Example Request (Default Summary)

{
    "fileName": "annual_report.docx",
    "fileContent": "BASE64_ENCODED_FILE_CONTENT_STRING",
    "summaryLength": "medium"
}
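
The parameter rules for this endpoint can be sketched as a small payload builder. This is an illustrative client-side helper, not part of the API: the name build_summarize_payload and the choice to validate summaryLength locally are assumptions, and the server may apply its own validation.

```python
import base64
from typing import Optional

ALLOWED_LENGTHS = ("short", "medium", "long")


def build_summarize_payload(
    file_name: str,
    raw_bytes: bytes,
    custom_json_prompt: Optional[str] = None,
    summary_length: str = "medium",
) -> dict:
    """Build the JSON body for POST /summarize."""
    payload = {
        "fileName": file_name,
        "fileContent": base64.b64encode(raw_bytes).decode("ascii"),
    }
    if custom_json_prompt:
        # summaryLength is only used when customJsonPrompt is empty, so omit it.
        payload["customJsonPrompt"] = custom_json_prompt
        return payload
    if summary_length not in ALLOWED_LENGTHS:
        raise ValueError(f"summaryLength must be one of {ALLOWED_LENGTHS}")
    payload["summaryLength"] = summary_length
    return payload
```

Omitting summaryLength when a custom prompt is present keeps the request unambiguous, since the documentation says the value is ignored in that case.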

Successful Response (Default Summary) 200 OK

{
    "title": "Document Title",
    "key_points": ["Key point 1", "Key point 2"],
    "overall_summary": "A narrative summary...",
    "estimated_reading_time_minutes": 5,
    "sentiment": "Positive"
}

Example Request (Custom JSON Extraction)

{
    "fileName": "invoice_123.pdf",
    "fileContent": "BASE64_ENCODED_FILE_CONTENT_STRING",
    "customJsonPrompt": "Please extract the following details from this invoice and return your response as a single, valid JSON object only. {\"invoice_id\": \"\", \"vendor_name\": \"\", \"total_due\": \"\"}"
}

Successful Response (Custom JSON Extraction) 200 OK

Returns the raw JSON object generated by the AI based on your customJsonPrompt. For example:

{
    "invoice_id": "INV-2023-001",
    "vendor_name": "Supplier Corp",
    "total_due": "1250.75"
}
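
Because the response is whatever JSON the AI produced, a client may want to verify it before use. The sketch below checks that the keys requested in the example prompt are present and converts total_due to a Decimal. The helper name parse_invoice_response is hypothetical, and the assumption that total_due is always a plain numeric string is based only on the example above.

```python
import json
from decimal import Decimal

# Keys requested by the example customJsonPrompt above.
EXPECTED_KEYS = {"invoice_id", "vendor_name", "total_due"}


def parse_invoice_response(body: str) -> dict:
    """Validate a custom extraction response and normalize the amount."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    return {
        "invoice_id": data["invoice_id"],
        "vendor_name": data["vendor_name"],
        # Assumption: the amount arrives as a numeric string such as "1250.75".
        "total_due": Decimal(data["total_due"]),
    }
```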

⚠️ Error Responses

The API uses standard HTTP status codes for errors.

Status Code | Meaning | Example Response Body
400 Bad Request | Invalid request payload, missing required fields, or invalid base64 content. | {"error": "Invalid request body", "message": "details..."}
500 Internal Server Error | An error occurred on the server, e.g., AI service failure, S3 upload issue (async), or unexpected parsing error. | {"error": "Failed to analyze document", "message": "details..."}
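
A client could turn these error bodies into readable messages along the following lines. The helper names describe_error and is_retryable are hypothetical, and treating 5xx responses as retryable is a common client convention rather than something this API documents.

```python
import json


def describe_error(status: int, body: str) -> str:
    """Format an error response using its "error" and "message" fields."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    error = data.get("error", "Unknown error")
    message = data.get("message", "")
    text = f"HTTP {status}: {error}"
    return f"{text} ({message})" if message else text


def is_retryable(status: int) -> bool:
    """Assumption: server-side (5xx) failures may be worth retrying; 4xx are not."""
    return 500 <= status < 600
```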