PDFCanon Documentation

PDFCanon is a PDF normalization API that converts any PDF into a canonical, tamper-evident, PDF/A-compliant document. Use the REST API, official SDKs, or the MCP server to integrate normalization into your workflow.

Get started

Quickstart — Normalize your first PDF in under 5 minutes
Authentication — API keys and auth headers
API Reference — Full endpoint reference

Core concepts

PDFCanon normalizes PDFs by running them through a deterministic 11-stage pipeline:

Stage	Name	Description
0	PDF/A detection	Identify the declared compliance level of the input document
1	Tamper detection	Detect incremental-update injection, shadow content, and post-EOF data
2	Structural repair	Fix malformed cross-reference tables and trailer dictionaries
3	Digital signature detection	Identify and handle existing digital signatures per policy
4	Active content removal	Strip JavaScript, embedded executables, and launch actions
5	AcroForm handling	Flatten or preserve interactive form fields
6	Metadata canonicalization	Normalize XMP and DocInfo metadata to epoch timestamps
7	Font resource validation	Validate font resources and detect fonts or font subsets that are not embedded
8	Final rewrite	Linearize and emit a clean, canonical PDF with deterministic IDs
9	Content hash	Compute a SHA-256 hash of the extracted text content for semantic deduplication
10	PDF/A compliance validation	Validate PDF/A compliance of the output (when input declared PDF/A)
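
To illustrate how a content hash like the one in stage 9 can support semantic deduplication, here is a minimal Python sketch. It is not PDFCanon's implementation: the function names `content_hash` and `group_duplicates` are hypothetical, and the text normalization (Unicode NFC plus whitespace collapsing) is an assumption, since the text extraction and normalization rules are not specified here.

```python
import hashlib
import unicodedata


def content_hash(extracted_text: str) -> str:
    """Return a SHA-256 hex digest of text after a simple, assumed normalization."""
    # Assumption: NFC normalization and whitespace collapsing stand in for
    # whatever canonicalization the real pipeline applies to extracted text.
    normalized = unicodedata.normalize("NFC", extracted_text)
    normalized = " ".join(normalized.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(texts: dict[str, str]) -> dict[str, list[str]]:
    """Group document names by the hash of their extracted text."""
    groups: dict[str, list[str]] = {}
    for name, text in texts.items():
        groups.setdefault(content_hash(text), []).append(name)
    return groups
```

Under these assumptions, two documents whose extracted text differs only in spacing or line breaks land in the same group, even if their bytes differ.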

The output is deterministic: the same input always produces the same output hash.
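
One way to check this property locally is to normalize the same input twice and compare SHA-256 digests of the two output files. The sketch below uses only the Python standard library; the helper names `sha256_of_file` and `outputs_match` are hypothetical, and it assumes you have already saved both normalized outputs to disk.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large PDFs are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_match(first: Path, second: Path) -> bool:
    """Return True when two normalized outputs have identical SHA-256 digests."""
    return sha256_of_file(first) == sha256_of_file(second)
```

For example, `outputs_match(Path("run1.pdf"), Path("run2.pdf"))` should be true when both files come from normalizing the same input; the file names here are placeholders.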

API version

The current stable API version is 2026-01-01. All responses include an apiVersion field.
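
A client can use this field to guard against version drift. The following is a minimal sketch, assuming the response body is a JSON object with `apiVersion` at the top level; the exact response shape is not documented on this page, and `check_api_version` is a hypothetical helper name.

```python
import json

# The stable version stated on this page.
EXPECTED_API_VERSION = "2026-01-01"


def check_api_version(response_body: str, expected: str = EXPECTED_API_VERSION) -> bool:
    """Return True when the JSON body carries the expected apiVersion value."""
    payload = json.loads(response_body)
    # Assumption: apiVersion is a top-level key of a JSON object.
    if not isinstance(payload, dict):
        return False
    return payload.get("apiVersion") == expected
```

Calling `check_api_version('{"apiVersion": "2026-01-01"}')` returns `True`; a body with a different or missing version returns `False`.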

Support

Status page — status.pdfcanon.com
Dashboard — app.pdfcanon.com
GitHub — github.com/PDFCanon
