POST /api/normalize

Normalizes a PDF document through the PDFCanon pipeline. Returns a NormalizeResponse object that reports the normalization result, tamper analysis, security changes, and structural validation.

Request

POST https://api.pdfcanon.com/api/normalize

Content-Type: multipart/form-data

Headers

| Header | Required | Description |
| --- | --- | --- |
| X-Api-Key | Yes | Your API key (pdfn_...) |

Form fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file | binary | Yes | (none) | The PDF file to normalize |
| linearize | boolean | No | true | Linearize (web-optimize) the output PDF. Set false to skip. |
| remove_annotations | boolean | No | false | Remove all PDF annotations |
| signed_pdf_policy | string | No | reject | How to handle signed PDFs: reject, strip, or preserve |
| pdfa_policy | string | No | preserve | How to handle PDF/A documents: preserve or normalize_anyway |
| region | string | No | org default | Target storage region (ca-central-1, us-east-2, eu-central-1) |
| webhook_url | string (uri) | No | (none) | Optional HTTPS URL to receive a completion webhook |
| idempotency_key | string | No | (none) | Client-supplied idempotency key for safe retries (max 255 chars, 24h TTL) |
| batch_id | uuid | No | (none) | Associate this submission with an existing batch |
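
The constraints in the field table can be checked on the client before anything is uploaded. The following is a minimal sketch of that idea, not an official client: the helper name `build_form_fields` and the choice to raise `ValueError` are assumptions made for illustration, and the server remains the authority on what it accepts (for example, a listed region may still be disallowed for a given organization).

```python
import uuid
from typing import Dict, Optional
from urllib.parse import urlsplit

# Allowed values copied from the form-field table.
SIGNED_PDF_POLICIES = {"reject", "strip", "preserve"}
PDFA_POLICIES = {"preserve", "normalize_anyway"}
REGIONS = {"ca-central-1", "us-east-2", "eu-central-1"}
MAX_IDEMPOTENCY_KEY_CHARS = 255


def build_form_fields(
    linearize: bool = True,
    remove_annotations: bool = False,
    signed_pdf_policy: str = "reject",
    pdfa_policy: str = "preserve",
    region: Optional[str] = None,
    webhook_url: Optional[str] = None,
    idempotency_key: Optional[str] = None,
    batch_id: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: validate options and return the text form fields.

    The `file` part is not included here; it is attached separately when the
    multipart body is built.
    """
    if signed_pdf_policy not in SIGNED_PDF_POLICIES:
        raise ValueError(f"signed_pdf_policy must be one of {sorted(SIGNED_PDF_POLICIES)}")
    if pdfa_policy not in PDFA_POLICIES:
        raise ValueError(f"pdfa_policy must be one of {sorted(PDFA_POLICIES)}")

    # Booleans are sent as the strings "true"/"false", as in the curl examples.
    fields = {
        "linearize": "true" if linearize else "false",
        "remove_annotations": "true" if remove_annotations else "false",
        "signed_pdf_policy": signed_pdf_policy,
        "pdfa_policy": pdfa_policy,
    }
    if region is not None:
        if region not in REGIONS:
            raise ValueError(f"region must be one of {sorted(REGIONS)}")
        fields["region"] = region
    if webhook_url is not None:
        if urlsplit(webhook_url).scheme != "https":
            raise ValueError("webhook_url must be an HTTPS URL")
        fields["webhook_url"] = webhook_url
    if idempotency_key is not None:
        if not 0 < len(idempotency_key) <= MAX_IDEMPOTENCY_KEY_CHARS:
            raise ValueError("idempotency_key must be 1 to 255 characters")
        fields["idempotency_key"] = idempotency_key
    if batch_id is not None:
        # uuid.UUID raises ValueError for malformed input.
        fields["batch_id"] = str(uuid.UUID(batch_id))
    return fields
```

Omitted optional fields are left out of the result so that the server-side defaults from the table apply.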

Responses

200 OK — Synchronously normalized (small PDFs)

Returns a NormalizeResponse JSON object.

202 Accepted — Accepted for async processing

Returns a NormalizeResponse JSON object with status: "PENDING" or "IN_PROGRESS". Poll GET /api/submissions/{id} until status is SUCCESS or FAILED.
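
The polling step can be sketched as a small loop. This is an illustration under stated assumptions: `fetch` is a caller-supplied function that performs GET /api/submissions/{id} and returns the decoded JSON (its response is assumed to carry the same `status` field), the interval and timeout values are arbitrary, and treating REJECTED as a terminal state is an assumption based on its presence in the status enum.

```python
import time
from typing import Callable

# SUCCESS and FAILED are the terminal states named in the 202 description.
# REJECTED is included as an assumption: it appears in the status enum and
# does not look like an in-progress state.
TERMINAL_STATUSES = {"SUCCESS", "FAILED", "REJECTED"}


def wait_for_submission(
    fetch: Callable[[str], dict],
    submission_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` repeatedly until the submission reaches a terminal status.

    Raises TimeoutError if no terminal status is seen within `timeout_s`.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        response = fetch(submission_id)
        if response.get("status") in TERMINAL_STATUSES:
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(f"submission {submission_id} still not finished")
        sleep(interval_s)
```

Injecting `fetch` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access. Supplying a webhook_url with the request is the alternative to polling.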

Error responses

| Status | Description |
| --- | --- |
| 400 | Validation error or invalid/disallowed region |
| 401 | Invalid or missing API key |
| 402 | Monthly quota exceeded |
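
A client can map these status codes to distinct exception types so that callers can tell a bad request from a credential or quota problem. The sketch below is one possible shape; the class names and `error_for_status` are hypothetical, and because the error response body format is not documented here, `detail` is just whatever text the caller chooses to pass along.

```python
from typing import Optional


class PdfCanonError(Exception):
    """Base class for non-success responses (hypothetical name)."""

    def __init__(self, status: int, detail: str = ""):
        super().__init__(f"HTTP {status}: {detail}" if detail else f"HTTP {status}")
        self.status = status


class InvalidRequestError(PdfCanonError):
    """400: validation error or invalid/disallowed region."""


class AuthenticationError(PdfCanonError):
    """401: invalid or missing API key."""


class QuotaExceededError(PdfCanonError):
    """402: monthly quota exceeded."""


_ERRORS_BY_STATUS = {
    400: InvalidRequestError,
    401: AuthenticationError,
    402: QuotaExceededError,
}


def error_for_status(status: int, detail: str = "") -> Optional[PdfCanonError]:
    """Return an exception for an error status, or None for 200 and 202."""
    if status in (200, 202):
        return None
    # Statuses not listed in the table fall back to the base class.
    return _ERRORS_BY_STATUS.get(status, PdfCanonError)(status, detail)
```

None of the three documented errors is fixed by resending the same request unchanged, so a retry loop would normally stop when it sees one of them.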

NormalizeResponse schema

NormalizeResponse
├── apiVersion string e.g. "2026-01-01"
├── requestId string unique request ID
├── submissionId uuid
├── processingTimeMs int64
├── status enum PENDING | IN_PROGRESS | SUCCESS | FAILED | REJECTED
├── original OriginalInfo
│ ├── sha256 string SHA-256 hex digest of the original file
│ └── sizeBytes int64
├── normalized NormalizedInfo (nullable — null until processing completes)
│ ├── sha256 string
│ ├── sizeBytes int64
│ ├── pdfVersion string e.g. "1.7"
│ ├── linearized boolean whether output was linearized (web-optimized)
│ ├── contentHash string (nullable) SHA-256 of extracted text (null for image-only PDFs)
│ └── downloadUrl string (uri) presigned URL to download the artifact
├── security SecurityInfo (what was removed)
│ ├── javascriptRemoved boolean
│ ├── openActionsRemoved boolean
│ ├── embeddedFilesRemoved boolean
│ ├── richMediaRemoved boolean
│ ├── launchActionsRemoved boolean
│ ├── incrementalUpdatesRemoved boolean
│ ├── acroformFlattened boolean
│ ├── annotationsRemoved boolean
│ ├── encryptedInput boolean (was the original encrypted?)
│ ├── digitalSignaturesDetected boolean
│ ├── digitalSignaturesRemoved boolean
│ ├── signatureCount int
│ ├── signatureVerificationResults[] SignatureVerification
│ │ ├── fieldName string
│ │ ├── signerName string (nullable) Distinguished Name from certificate
│ │ ├── signedAt datetime (nullable)
│ │ ├── valid boolean
│ │ ├── certificateExpired boolean
│ │ ├── certificateChainTrusted boolean (nullable)
│ │ ├── timestampPresent boolean
│ │ ├── timestampValid boolean (nullable)
│ │ └── reason string (nullable)
│ └── overallSignatureStatus enum ALL_VALID | SOME_INVALID | NONE
├── validation ValidationInfo (structural repair)
│ ├── xrefRebuilt boolean
│ ├── objectStreamsRegenerated boolean
│ ├── brokenReferencesDetected boolean
│ ├── nonEmbeddedFontsDetected boolean
│ ├── pdfaDeclared boolean (did the input claim PDF/A?)
│ ├── pdfaLevel string (nullable) e.g. "2b", "1a"
│ ├── pdfaPreserved boolean (was PDF/A compliance maintained?)
│ ├── pdfaCompliant boolean (is the output PDF/A compliant?)
│ ├── verapdfValidated boolean (nullable) (was veraPDF validation run?)
│ └── verapdfErrors[] string ISO 19005 clause violations, if any
├── tamperAnalysis TamperAnalysis (nullable — null until processing completes)
│ ├── riskLevel enum none | low | medium | high | critical
│ ├── anomaliesDetected int
│ └── anomalies[] TamperAnomaly
│   ├── type enum INCREMENTAL_UPDATE_INJECTION | POST_EOF_DATA |
│   │     HEADER_VERSION_MISMATCH | SHADOW_CONTENT_DETECTED |
│   │     ORPHANED_SIGNATURE_FIELD
│   ├── severity enum low | medium | high | critical
│   ├── description string
│   └── location string (nullable) e.g. "Byte offset 1847293"
├── warnings[] WarningInfo
│ ├── code string e.g. NON_EMBEDDED_FONT
│ └── message string
└── failure FailureInfo (nullable)
  ├── code string e.g. POLICY_REJECTION
  ├── message string
  └── stage string (nullable) pipeline stage where failure occurred
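
The `normalized.sha256` and `normalized.sizeBytes` fields make it possible to check a downloaded artifact against the response. The sketch below assumes, by analogy with `original.sha256`, that `normalized.sha256` is the hex-encoded SHA-256 of the exact bytes served at `downloadUrl` and that `sizeBytes` is their length; `verify_artifact` is a hypothetical helper name.

```python
import hashlib


def verify_artifact(data: bytes, normalized: dict) -> bool:
    """Compare downloaded bytes with the `normalized` block of a response.

    Assumes `sha256` is a hex digest of the artifact bytes and `sizeBytes`
    is the artifact length in bytes.
    """
    if len(data) != normalized["sizeBytes"]:
        return False
    digest = hashlib.sha256(data).hexdigest()
    # Compare case-insensitively in case the API returns uppercase hex.
    return digest == str(normalized["sha256"]).lower()
```

Since `normalized` is null until processing completes, this check only applies once the status is SUCCESS.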

Full response example (SUCCESS)

{
  "apiVersion": "2026-01-01",
  "requestId": "req_01jk4m2n3p5q6r7s",
  "submissionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "processingTimeMs": 342,
  "status": "SUCCESS",
  "original": {
    "sha256": "aabbccdd...",
    "sizeBytes": 102400
  },
  "normalized": {
    "sha256": "ddeeff00...",
    "sizeBytes": 98304,
    "pdfVersion": "1.7",
    "linearized": true,
    "contentHash": "ff00aabb...",
    "downloadUrl": "https://api.pdfcanon.com/api/artifacts/ddeeff00..."
  },
  "security": {
    "javascriptRemoved": true,
    "openActionsRemoved": false,
    "embeddedFilesRemoved": true,
    "richMediaRemoved": false,
    "launchActionsRemoved": false,
    "incrementalUpdatesRemoved": true,
    "acroformFlattened": false,
    "annotationsRemoved": false,
    "encryptedInput": false,
    "digitalSignaturesDetected": false,
    "digitalSignaturesRemoved": false,
    "signatureCount": 0,
    "signatureVerificationResults": [],
    "overallSignatureStatus": "NONE"
  },
  "validation": {
    "xrefRebuilt": false,
    "objectStreamsRegenerated": true,
    "brokenReferencesDetected": false,
    "nonEmbeddedFontsDetected": true,
    "pdfaDeclared": false,
    "pdfaLevel": null,
    "pdfaPreserved": false,
    "pdfaCompliant": false,
    "verapdfValidated": null,
    "verapdfErrors": []
  },
  "tamperAnalysis": {
    "riskLevel": "high",
    "anomaliesDetected": 2,
    "anomalies": [
      {
        "type": "INCREMENTAL_UPDATE_INJECTION",
        "severity": "high",
        "description": "Document contains 7 incremental updates with conflicting page content",
        "location": "Byte offset 1847293"
      },
      {
        "type": "POST_EOF_DATA",
        "severity": "low",
        "description": "Data found after PDF EOF marker",
        "location": null
      }
    ]
  },
  "warnings": [
    {
      "code": "NON_EMBEDDED_FONT",
      "message": "Font 'Arial' is not embedded in the document"
    }
  ],
  "failure": null
}
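
A response like the one above can be reduced to the handful of facts a caller usually acts on. The sketch below is illustrative only: the helper names are hypothetical, and the ordering of risk levels is taken from the order in which the `riskLevel` enum is listed in the schema, which is an assumption about their relative severity.

```python
from typing import List

# Order follows the riskLevel enum as listed in the schema (assumed ascending).
RISK_ORDER = ["none", "low", "medium", "high", "critical"]


def applied_security_changes(response: dict) -> List[str]:
    """Names of security flags that report a change made to the document.

    Only flags ending in "Removed" or "Flattened" are counted; informational
    flags such as encryptedInput are ignored. Tolerates an empty or missing
    `security` object.
    """
    security = response.get("security") or {}
    return sorted(
        name
        for name, value in security.items()
        if value is True and name.endswith(("Removed", "Flattened"))
    )


def risk_at_least(response: dict, threshold: str) -> bool:
    """True if tamperAnalysis.riskLevel is at or above `threshold`."""
    analysis = response.get("tamperAnalysis")
    if analysis is None:  # null until processing completes
        return False
    return RISK_ORDER.index(analysis["riskLevel"]) >= RISK_ORDER.index(threshold)
```

For the SUCCESS example above, `applied_security_changes` returns `embeddedFilesRemoved`, `incrementalUpdatesRemoved`, and `javascriptRemoved`, and `risk_at_least(response, "medium")` is true because the reported level is `high`.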

Async response example (PENDING)

{
  "apiVersion": "2026-01-01",
  "requestId": "req_01jk4m2n3p5q6r7s",
  "submissionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "processingTimeMs": 0,
  "status": "PENDING",
  "original": {
    "sha256": "aabbccdd...",
    "sizeBytes": 102400
  },
  "normalized": null,
  "security": {},
  "validation": {},
  "tamperAnalysis": null,
  "warnings": [],
  "failure": null
}

Code examples

curl -X POST https://api.pdfcanon.com/api/normalize \
  -H "X-Api-Key: pdfn_your_api_key_here" \
  -F "file=@input.pdf" \
  -F "linearize=true"

To skip linearization:

curl -X POST https://api.pdfcanon.com/api/normalize \
  -H "X-Api-Key: pdfn_your_api_key_here" \
  -F "file=@input.pdf" \
  -F "linearize=false"
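
The same request can be built in Python with only the standard library. This is a rough equivalent of the curl commands above rather than an official client: `encode_multipart` and `build_normalize_request` are hypothetical helper names, the `application/pdf` part type is an assumption, and the multipart encoding is written by hand because the standard library has no ready-made encoder for it.

```python
import urllib.request
import uuid
from typing import Dict, Tuple

API_URL = "https://api.pdfcanon.com/api/normalize"


def encode_multipart(
    fields: Dict[str, str], file_bytes: bytes, filename: str = "input.pdf"
) -> Tuple[bytes, str]:
    """Encode text fields plus one `file` part as multipart/form-data.

    Returns (body, content_type). Field names, values, and `filename` are
    assumed not to contain quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        'Content-Type: application/pdf\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_normalize_request(
    api_key: str, file_bytes: bytes, fields: Dict[str, str]
) -> urllib.request.Request:
    """Build (but do not send) the POST /api/normalize request."""
    body, content_type = encode_multipart(fields, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"X-Api-Key": api_key, "Content-Type": content_type},
    )


# Sending the request (requires network access and a real key):
#   import json
#   with open("input.pdf", "rb") as f:
#       req = build_normalize_request("pdfn_your_api_key_here", f.read(),
#                                     {"linearize": "false"})
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)   # a NormalizeResponse object
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for the 400, 401, and 402 responses, so the call would normally be wrapped in a `try` block.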