Metadata-Version: 2.1
Name: pikepdf
Version: 6.2.3
Summary: Read and write PDFs with Python, powered by qpdf
Home-page: https://github.com/pikepdf/pikepdf
Author: James R. Barlow
Author-email: james@purplerock.ca
License: MPL-2.0
Project-URL: Documentation, https://pikepdf.readthedocs.io/
Project-URL: Source, https://github.com/pikepdf/pikepdf
Project-URL: Tracker, https://github.com/pikepdf/pikepdf/issues
Keywords: PDF
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)
Classifier: Programming Language :: C++
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Multimedia :: Graphics
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE.txt
License-File: licenses-for-wheels.txt
Requires-Dist: Pillow (>=9.0)
Requires-Dist: deprecation
Requires-Dist: lxml (>=4.8)
Requires-Dist: packaging
Requires-Dist: importlib-metadata (>=4) ; python_version < "3.8"
Requires-Dist: typing-extensions (>=4) ; python_version < "3.8"
Provides-Extra: docs
Requires-Dist: GitPython ; extra == 'docs'
Requires-Dist: PyGithub ; extra == 'docs'
Requires-Dist: Sphinx (>=3) ; extra == 'docs'
Requires-Dist: ipython ; extra == 'docs'
Requires-Dist: matplotlib ; extra == 'docs'
Requires-Dist: pybind11 ; extra == 'docs'
Requires-Dist: requests ; extra == 'docs'
Requires-Dist: setuptools-scm ; extra == 'docs'
Requires-Dist: sphinx-design ; extra == 'docs'
Requires-Dist: sphinx-issues ; extra == 'docs'
Requires-Dist: sphinx-rtd-theme ; extra == 'docs'
Requires-Dist: tomli ; (python_version < "3.11") and extra == 'docs'
Provides-Extra: mypy
Requires-Dist: lxml-stubs ; extra == 'mypy'
Requires-Dist: types-Pillow ; extra == 'mypy'
Requires-Dist: types-requests ; extra == 'mypy'
Requires-Dist: types-setuptools ; extra == 'mypy'
Provides-Extra: test
Requires-Dist: attrs (>=20.2.0) ; extra == 'test'
Requires-Dist: coverage[toml] ; extra == 'test'
Requires-Dist: hypothesis (>=6.36) ; extra == 'test'
Requires-Dist: psutil (>=5.9) ; extra == 'test'
Requires-Dist: pybind11 ; extra == 'test'
Requires-Dist: pytest (>=6.2.5) ; extra == 'test'
Requires-Dist: pytest-cov (>=3.0.0) ; extra == 'test'
Requires-Dist: pytest-timeout (>=2.1.0) ; extra == 'test'
Requires-Dist: pytest-xdist (>=2.5.0) ; extra == 'test'
Requires-Dist: python-dateutil (>=2.8.1) ; extra == 'test'
Requires-Dist: tomli ; (python_version < "3.11") and extra == 'test'
Requires-Dist: python-xmp-toolkit (>=2.0.1) ; (sys_platform != "nt" and platform_machine == "x86_64") and extra == 'test'
<!-- SPDX-FileCopyrightText: 2022 James R. Barlow -->
<!-- SPDX-License-Identifier: MPL-2.0 -->
pikepdf
=======
**pikepdf** is a Python library for reading and writing PDF files.
pikepdf is based on [QPDF](https://github.com/qpdf/qpdf), a powerful PDF manipulation and repair library.
Python + QPDF = "py" + "qpdf" = "pyqpdf", which looks like a dyslexia test. Say it out loud, and it sounds like "pikepdf".
```python
import pikepdf

# Elegant, Pythonic API
with pikepdf.open('input.pdf') as pdf:
    num_pages = len(pdf.pages)
    del pdf.pages[-1]
    pdf.save('output.pdf')
```
**To install:**
```bash
pip install pikepdf
```
For users who want to build from source, see [installation](https://pikepdf.readthedocs.io/en/latest/index.html).
pikepdf is [documented](https://pikepdf.readthedocs.io/en/latest/index.html) and actively maintained. Binary wheels are available for all common platforms, both x86-64 and ARM64/Apple Silicon.
Commercial support is available.
Features
--------
This library is similar to PyPDF2 and pdfrw: it provides low-level access to PDF features and allows editing and content transformation of existing PDFs. Some knowledge of the PDF specification may be helpful. It cannot render a PDF to an image.
| **Feature** | **pikepdf** | **PyPDF2** | **pdfrw** |
| ------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------- | --------------------------------------- |
| Editing, manipulation and transformation of existing PDFs | ✔ | ✔ | ✔ |
| Based on an existing, mature PDF library | QPDF | ✘ | ✘ |
| Implementation | C++ and Python | Python | Python |
| PDF versions supported | 1.1 to 1.7 | 1.3? | 1.7 |
| Python versions supported | 3.7-3.10 [^1] | 2.7-3.10 | 2.6-3.6 |
| Save and load password protected (encrypted) PDFs | ✔ (except public key) | ✘ (Only obsolete RC4) | ✘ (not at all) |
| Save and load PDF compressed object streams (PDF 1.5) | ✔ | ✘ | ✘ |
| Creates linearized ("fast web view") PDFs | ✔ | ✘ | ✘ |
| Actively maintained | ![pikepdf commit activity][pikepdf-commits] | ![PyPDF2 commit activity][pypdf2-commits] | ![pdfrw commit activity][pdfrw-commits] |
| Test suite coverage | ![codecov][codecov] | ![codecovpypdf2][codecovpypdf] | unknown |
| Creates PDFs that pass PDF validation tests | ✔ | ✘ | ? |
| Modifies PDF/A without breaking PDF/A compliance | ✔ | ✘ | ? |
| Automatically repairs PDFs with internal errors | ✔ | ✘ | ✘ |
| PDF XMP metadata editing | ✔ | read-only | ✘ |
| Documentation | ✔ | ✔ | ✔ |
| Integrates with Jupyter and IPython notebooks for rapid development | ✔ | ✘ | ✘ |
[^1]: pikepdf 3.x and older support Python 3.6.
[pikepdf-commits]: https://img.shields.io/github/commit-activity/y/pikepdf/pikepdf.svg
[pypdf2-commits]: https://img.shields.io/github/commit-activity/y/mstamy2/PyPDF2.svg
[pdfrw-commits]: https://img.shields.io/github/commit-activity/y/pmaupin/pdfrw.svg
[codecov]: https://codecov.io/gh/pikepdf/pikepdf/branch/master/graph/badge.svg?token=8FJ755317J
[codecovpypdf]: https://codecov.io/gh/py-pdf/PyPDF2/branch/main/graph/badge.svg?token=id42cGNZ5Z
Testimonials
------------
> I decided to try writing a quick Python program with pikepdf to automate [something] and it "just worked". –Jay Berkenbilt, creator of QPDF
> "Thanks for creating a great pdf library, I tested out several and this is the one that was best able to work with whatever I threw at it." –@cfcurtis
In Production
-------------
* [OCRmyPDF](https://github.com/jbarlow83/OCRmyPDF) uses pikepdf to graft OCR text layers onto existing PDFs, to examine the contents of input PDFs, and to optimize PDFs.
* [PDF Arranger](https://github.com/jeromerobert/pdfarranger) is a small Python application that provides a graphical user interface to rotate, crop and rearrange PDFs.
* [PDFStitcher](https://github.com/cfcurtis/sewingutils) is a utility for stitching PDF pages into a single document (i.e. N-up or page imposition).
License
-------
pikepdf is provided under the [Mozilla Public License 2.0](https://www.mozilla.org/en-US/MPL/2.0/) (MPL), which can be found in the LICENSE file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license. Some components of the project may be under other license agreements, as indicated in their SPDX license header or the [`.dep5/reuse`](REUSE) file.
[Informally](https://www.mozilla.org/en-US/MPL/2.0/FAQ/), MPL 2.0 is not a "viral" license. It may be combined with other work, including commercial software. However, you must disclose your modifications *to pikepdf* in source code form. In other words, fork this repository on GitHub or elsewhere and commit your contributions there, and you've satisfied your obligations. MPL 2.0 is compatible with the GPL and LGPL; see the [guidelines](https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/) for notes on use in GPL.