Caradoc: a Pragmatic Approach to PDF Parsing and Validation
IEEE Security & Privacy LangSec Workshop 2016
Guillaume Endignoux Olivier Levillain Jean-Yves Migeon
École Polytechnique, France; EPFL, Switzerland
ANSSI, France
Thursday 26th May, 2016
Portable Document Format?

A commonly used format, but with many security issues:
- 500+ reported vulnerabilities in Adobe Reader [1] (since 1999).
- Discrepancies between implementations.
- Syntax facilitates polymorphism [2] (PDF+ZIP, PDF+JPEG, etc.).

In our work, we aim to verify PDF files starting at the syntactic level.

Two approaches to validate files:
- Blacklist: does not detect new malware...
- Whitelist: higher rejection rate, but accepted files are clean.

[1] http://www.cvedetails.com
[2] See for example PoC||GTFO
Table of contents
1 Syntactic and structural problems: a quick tour
2 Caradoc: a pragmatic solution
3 Application to real-world files
1 Syntactic and structural problems: a quick tour
PDF syntax 101
A PDF document is made of objects:
- null
- booleans: true, false
- numbers: 123, -4.56
- strings: (foo)
- names: /bar
- arrays: [1 2 3], [(foo) /bar]
- dictionaries: << /key (value) /foo 123 >>
- references: 1 0 obj ... endobj and 1 0 R
- streams: << ... >> stream ... endstream
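To make the object kinds listed above concrete, here is a minimal Python sketch that models them as ordinary values and prints them back in PDF syntax. The names `Name`, `Ref` and `to_pdf` are hypothetical helpers introduced for illustration only (they are not part of Caradoc), streams are left out, and strings are simplified to text without hexadecimal or octal forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """A PDF name such as /bar (stored without the leading slash)."""
    value: str


@dataclass(frozen=True)
class Ref:
    """An indirect reference such as `1 0 R`."""
    num: int
    gen: int = 0


def to_pdf(obj) -> str:
    """Serialize a Python value to PDF syntax (simplified, illustration only)."""
    if obj is None:
        return "null"
    if isinstance(obj, bool):  # must come before int: bool is a subclass of int
        return "true" if obj else "false"
    if isinstance(obj, (int, float)):
        return str(obj)
    if isinstance(obj, str):
        # Backslashes and parentheses are escaped inside literal strings.
        escaped = obj.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return f"({escaped})"
    if isinstance(obj, Name):
        return f"/{obj.value}"
    if isinstance(obj, Ref):
        return f"{obj.num} {obj.gen} R"
    if isinstance(obj, list):
        return "[" + " ".join(to_pdf(item) for item in obj) + "]"
    if isinstance(obj, dict):
        inner = " ".join(f"{to_pdf(k)} {to_pdf(v)}" for k, v in obj.items())
        return f"<< {inner} >>"
    raise TypeError(f"unsupported object: {obj!r}")


catalog = {Name("Type"): Name("Catalog"), Name("Pages"): Ref(2)}
print(to_pdf(catalog))
```

The frozen dataclasses are hashable, so a `Name` can be used directly as a dictionary key, which mirrors the fact that PDF dictionary keys are names.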
Structure of a PDF file
A simple file consists of a header, a sequence of objects, a reference table, a trailer and an end-of-file marker:

    %PDF-1.7

    1 0 obj
    << /Type /Catalog /Pages 2 0 R >>
    endobj

    2 0 obj
    << /Type /Pages /Count 1 /Kids [3 0 R] >>
    endobj

    xref
    0 6
    0000000000 65535 f
    0000000009 00000 n
    0000000060 00000 n
    ...

    trailer
    << /Size 6 /Root 1 0 R >>

    startxref
    428
    %%EOF
Organization of a simple PDF file.
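The reference table has a rigid line-oriented layout: the `xref` keyword, then subsections introduced by "first object number, count", then one fixed-width entry per object (10-digit offset, 5-digit generation, `n` for in use or `f` for free). The following Python sketch parses that layout strictly and rejects anything else. It is an illustration under simplifying assumptions (input already decoded as text, one entry per line), and `parse_xref` is a hypothetical name, not Caradoc's parser.

```python
import re

# One entry: 10-digit byte offset, 5-digit generation number, 'n' (in use) or 'f' (free).
ENTRY = re.compile(r"^(\d{10}) (\d{5}) ([nf])$")


def parse_xref(text: str) -> dict:
    """Parse a classic xref section into {object number: (offset, generation, in_use)}."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "xref":
        raise ValueError("missing 'xref' keyword")
    table = {}
    i = 1
    while i < len(lines):
        # Subsection header: "<first object number> <number of entries>".
        header = lines[i].split()
        if len(header) != 2 or not all(part.isdecimal() for part in header):
            raise ValueError(f"bad subsection header: {lines[i]!r}")
        first, count = int(header[0]), int(header[1])
        i += 1
        for num in range(first, first + count):
            if i >= len(lines):
                raise ValueError("truncated subsection")
            match = ENTRY.match(lines[i])
            if match is None:
                raise ValueError(f"bad entry: {lines[i]!r}")
            offset, gen, kind = match.groups()
            table[num] = (int(offset), int(gen), kind == "n")
            i += 1
    return table


sample = """xref
0 3
0000000000 65535 f
0000000009 00000 n
0000000060 00000 n
"""
table = parse_xref(sample)
```

Raising an error on the first malformed line, instead of trying to recover, matches the whitelist approach described earlier: a file is either read unambiguously or rejected.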
More complex structures:
- incremental updates,
- object streams,
- linearization.

With an incremental update, the original file (header, objects, table + trailer #1, end-of-file #1) is followed by new objects, table + trailer #2 and end-of-file #2.

Original file:

    %PDF-1.7
    xref
    0 6
    0000000000 65535 f
    0000000009 00000 n
    0000000060 00000 n
    ...
    trailer
    << /Size 6 /Root 1 0 R >>
    startxref
    428
    %%EOF

Incremental update:

    xref
    0 3
    0000000002 65535 f
    0000000567 00001 n
    0000000000 00001 f
    6 1
    0000001234 00000 n
    trailer
    << /Size 7 /Root 1 1 R /Prev 428 >>
    startxref
    1347
    %%EOF
Incremental update.
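A reader that supports incremental updates has to start from the last `startxref` offset and follow the `/Prev` entries back to the original table, letting newer entries override older ones. The sketch below shows that merge on already-parsed tables. The data layout (`sections` keyed by byte offset) and the name `merge_xref_chain` are assumptions made for this example, and only the entries visible on the slide are included.

```python
def merge_xref_chain(sections: dict, startxref: int) -> dict:
    """Merge xref sections reachable from `startxref` via /Prev; newer entries win.

    `sections` maps the byte offset of each xref section to a pair
    (entries, prev), where `entries` is {object number: (offset, gen, kind)}
    and `prev` is the /Prev offset of its trailer, or None.
    """
    merged = {}
    seen = set()
    offset = startxref
    while offset is not None:
        if offset in seen:
            raise ValueError(f"loop in /Prev chain at offset {offset}")
        seen.add(offset)
        if offset not in sections:
            raise ValueError(f"no xref section at offset {offset}")
        entries, prev = sections[offset]
        for num, entry in entries.items():
            # Sections are visited from newest to oldest: keep the first entry seen.
            merged.setdefault(num, entry)
        offset = prev
    return merged


sections = {
    428: ({0: (0, 65535, "f"), 1: (9, 0, "n"), 2: (60, 0, "n")}, None),
    1347: ({0: (2, 65535, "f"), 1: (567, 1, "n"), 2: (0, 1, "f"), 6: (1234, 0, "n")}, 428),
}
merged = merge_xref_chain(sections, 1347)
```

In this example object 1 is replaced by a new generation, object 2 is freed and object 6 is added, which is why a parser that ignores updates and one that honours them can see two different documents in the same file.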
Logical structure of a PDF file
Figure: graph of objects of a 17-page document (about 1000 objects).
Graph organization
The graph of objects is organized into sub-structures, especially trees.

Figure: page tree. Nodes shown: Catalog, root of the page tree, one intermediate node, and pages 1 to 4.
The table of contents uses doubly-linked lists.
Figure: table of contents. Nodes shown: Catalog, outline root, three chapters, three sections.
Problematic structure
An attacker may write an invalid structure.
Figure: invalid table of contents. Nodes shown: Catalog, outline root, three chapters, three sections.
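One way an outline can be invalid is a loop in its linked lists, which is the first case demonstrated in this talk. The Python sketch below walks an outline through its `/First` and `/Next` links and checks that no item is reached twice and that the `/Prev`, `/Parent` and `/Last` back-links agree with the forward links. It is a simplified illustration, not Caradoc's algorithm: objects are modelled as plain dictionaries keyed by object number, and `check_outline` is a hypothetical name.

```python
def check_outline(objects: dict, parent: int, seen=None) -> None:
    """Raise ValueError if the outline items below `parent` are not well-formed lists."""
    seen = {parent} if seen is None else seen
    prev = None
    current = objects[parent].get("First")
    while current is not None:
        if current in seen:
            raise ValueError(f"object {current} is reachable twice (loop or shared item)")
        seen.add(current)
        item = objects[current]
        if item.get("Parent") != parent:
            raise ValueError(f"object {current} has an inconsistent /Parent")
        if item.get("Prev") != prev:
            raise ValueError(f"object {current} has an inconsistent /Prev")
        check_outline(objects, current, seen)  # descend into sub-items (sections)
        prev, current = current, item.get("Next")
    if objects[parent].get("Last") != prev:
        raise ValueError(f"object {parent} has an inconsistent /Last")


# Outline root (1) with two chapters (2 and 3).
valid = {
    1: {"First": 2, "Last": 3},
    2: {"Parent": 1, "Next": 3},
    3: {"Parent": 1, "Prev": 2},
}
check_outline(valid, 1)  # passes silently

# Same outline, but the last chapter points back to the first one.
cyclic = {
    1: {"First": 2, "Last": 3},
    2: {"Parent": 1, "Next": 3},
    3: {"Parent": 1, "Prev": 2, "Next": 2},
}
```

Because every item is recorded in `seen`, the walk always terminates, even on the cyclic input where a naive reader following `/Next` would loop forever.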
Demonstration
Demonstration: two examples
- Loop in the outline structure: https://github.com/ANSSI-FR/caradoc/blob/master/test_files/negative/outlines/cycle.pdf
- Polymorphic file: https://github.com/ANSSI-FR/caradoc/blob/master/test_files/negative/polymorph/polymorph.pdf

These files were reported to software vendors.
These problems may lead to several attacks:
- Attacks on the structure (denial of service).
- Evasion techniques (attacks taking advantage of implementation discrepancies).
2 Caradoc: a pragmatic solution
Solution proposals
Caradoc verifies a document at three levels:
- File syntax.
- Object consistency (type checking).
- Higher-level verifications (graph, etc.).
Syntax restriction
At the syntax level, guarantee extraction of objects without ambiguity:
- Grammar formalization [3] (BNF).
- Structure restrictions (no updates, no linearization, etc.).
- Systematic rejection of “corrupted” files.

“When a conforming reader reads a PDF file with a damaged or missing cross-reference table, it may attempt to rebuild the table by scanning all the objects in the file.”
(ISO 32000-1:2008, annex C.2)

[3] https://github.com/ANSSI-FR/caradoc/tree/master/doc/grammar
Type checking
At object level: guarantee semantic consistency.
For this purpose: type checking algorithm.
    trailer
    << /Size 7
       /Root 1 0 R
       /Info 6 0 R >>

    1 0 obj
    << /Type /Catalog /Pages 2 0 R >>
    endobj

    2 0 obj
    << /Type /Pages /Count 1 /Kids [3 0 R] >>
    endobj

    3 0 obj <<
      /Type /Page
      /MediaBox [0 0 700 200]
      /Parent 2 0 R
      /Contents 4 0 R
      /Resources << /Font << /F1 5 0 R >> >>
    >> endobj

    4 0 obj << /Length 35 >>
    stream
    BT /F1 100 Tf (Hello world !) Tj ET
    endstream
    endobj

    5 0 obj <<
      /Name /F1
      /BaseFont /Helvetica
      /Type /Font
      /Subtype /Type1
    >> endobj

    6 0 obj <<
      /Author (G. E.)
    >> endobj
Example on a Hello World file.
Constraint propagation.
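The idea of propagating type constraints along references can be sketched as a worklist algorithm: the trailer has a known type, each of its keys dictates the type expected behind a reference, and the newly typed objects are processed in turn. The Python sketch below follows that idea on a reduced model of the Hello World file. It is only an illustration and does not reproduce Caradoc's type checker: `SCHEMA` and `infer_types` are hypothetical, references are plain object numbers, and only reference-valued keys are modelled (so the font behind the inline /Resources dictionary is left out).

```python
from collections import deque

# Hypothetical, heavily simplified schema: for each type, the type expected
# behind each reference-valued key. A list means "array of references".
SCHEMA = {
    "trailer": {"Root": "catalog", "Info": "info"},
    "catalog": {"Pages": "pages"},
    "pages": {"Kids": ["page"]},
    "page": {"Parent": "pages", "Contents": "content_stream"},
    "content_stream": {},
    "info": {},
}


def infer_types(trailer: dict, objects: dict) -> dict:
    """Propagate type constraints from the trailer; return {object number: type}."""
    types = {}
    queue = deque([("trailer", trailer)])
    while queue:
        type_name, obj = queue.popleft()
        for key, expected in SCHEMA[type_name].items():
            if key not in obj:
                continue
            is_array = isinstance(expected, list)
            if is_array and not isinstance(obj[key], list):
                raise TypeError(f"/{key} should be an array of references")
            targets = obj[key] if is_array else [obj[key]]
            target_type = expected[0] if is_array else expected
            for num in targets:
                known = types.get(num)
                if known is None:
                    # First constraint on this object: record it and visit the object later.
                    types[num] = target_type
                    queue.append((target_type, objects[num]))
                elif known != target_type:
                    raise TypeError(
                        f"object {num}: expected {target_type}, already typed as {known}"
                    )
    return types


trailer = {"Root": 1, "Info": 6}
objects = {
    1: {"Pages": 2},
    2: {"Kids": [3]},
    3: {"Parent": 2, "Contents": 4},
    4: {},
    6: {},
}
types = infer_types(trailer, objects)
```

Each object is queued at most once, so the propagation terminates even though the references form a cycle (page 3 points back to its parent 2). A conflict, such as a trailer whose /Root and /Info point to the same object, surfaces as a `TypeError`.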
Figure: types of a 17-page document. Legend: action, page, destination, annotation, resource, outline, content stream, font, name tree, other.
More complex verifications
At a higher level:
- Verification of tree structures (page tree, outlines, etc.).
- Other verifications can easily be integrated in the future (fonts, images, existing analyses, etc.).
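As an illustration of what verifying a tree structure can involve, the following Python sketch checks a page tree: every node is visited once, each child's /Parent points back to the node that lists it in /Kids, and each /Count equals the number of leaf pages below. This is a simplified sketch under assumptions (objects as plain dictionaries keyed by object number, references as integers), and `check_page_tree` is a hypothetical name rather than Caradoc's implementation.

```python
def check_page_tree(objects: dict, root: int) -> int:
    """Check that the page tree under `root` is a tree; return its number of pages."""
    seen = set()

    def visit(num: int, parent) -> int:
        if num in seen:
            raise ValueError(f"object {num} appears twice in the page tree")
        seen.add(num)
        node = objects[num]
        if node.get("Parent") != parent:
            raise ValueError(f"object {num} has an inconsistent /Parent")
        if node.get("Type") == "Page":
            return 1  # leaf: a single page
        if node.get("Type") != "Pages":
            raise ValueError(f"object {num} is neither /Page nor /Pages")
        count = sum(visit(kid, num) for kid in node.get("Kids", []))
        if node.get("Count") != count:
            raise ValueError(f"object {num} has /Count {node.get('Count')}, found {count}")
        return count

    return visit(root, None)


# Root (1) with an intermediate node (2) holding two pages, plus two direct pages.
tree = {
    1: {"Type": "Pages", "Kids": [2, 5, 6], "Count": 4},
    2: {"Type": "Pages", "Parent": 1, "Kids": [3, 4], "Count": 2},
    3: {"Type": "Page", "Parent": 2},
    4: {"Type": "Page", "Parent": 2},
    5: {"Type": "Page", "Parent": 1},
    6: {"Type": "Page", "Parent": 1},
}
pages = check_page_tree(tree, 1)
```

The recursion is kept for brevity; a production checker would likely use an explicit stack so that a very deep tree cannot exhaust the call stack.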
3 Application to real-world files
Implementation
Implementation in OCaml from the PDF specification [4].

Figure: validation workflow. Elements shown: strict parser, relaxed parser, normalization, objects, graph of references, extraction of specific objects, type checking, list of types, graph checking, other checks to develop, no error detected.

[4] https://www.adobe.com/devnet/pdf/pdf_reference.html
Real-world files
10K files collected from random queries on a web search engine.
Some files are directly accepted.
Direct validation: 10000 files → strict parser → 1465 files parsed → type checking → 536 files type checked → graph checking → 536 files with no error found.
Normalization
Many files do not pass the first stage... But they can be normalized beforehand.

The relaxed parser supports common structures: incremental updates, object streams, etc.

Normalization: 10000 files → relaxed parser → 8993 files parsed → cleaning objects → 8993 files normalized.

Some files were not normalized: encryption, unrecoverable syntax errors, etc.
Validation after normalization: 8993 normalized files → type checking → 1429 files type checked (1391 files with a type error) → graph checking → 1427 files with no error found.

Our type-checker detected typos:
- /Blackls1 instead of /BlackIs1,
- /XObjcect instead of /XObject.
We identified incorrect tree structures in the wild.
Future work
What remains to be done:
- Complete the set of types.
- Check compression filters.
- Check graphic content.
- Check fonts, images, etc.
Conclusion
Summary of our contributions:
- We identified novel issues in PDF parsers.
- We proposed and formalized a simplified syntax for PDF.
- We implemented Caradoc to parse and validate PDF files.
Project page: https://github.com/ANSSI-FR/caradoc