Overview

The pypdfplot package provides a backend to Matplotlib that generates a PDF file of the plot with the generating Python script embedded.

Normally, once a Matplotlib plot is saved as a PNG or PDF file, the link between the plot and its generating Python script is lost. The philosophy behind pypdfplot is that there should be no distinction between the Python script that generates a plot and its output PDF file, much like there is no such distinction in an Origin or Excel file. As far as pypdfplot is concerned, the generating script is the plot.

When the pypdfplot backend is loaded and a figure is saved with plt.savefig(), the generating Python script is embedded into the output PDF file in such a way that, when the PDF file is renamed from .pdf to .py, the file can be read directly by a Python interpreter without alteration. The script can be modified to implement changes in the plot, after which the script is run again to produce the updated PDF file of the plot, including the updated embedded generating script.
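The renaming step can be done by hand in a file manager, or scripted. The following is a minimal sketch using only the standard library; the helper name pdf_to_py is hypothetical and not part of pypdfplot, and the file it operates on here is a throwaway placeholder rather than a real PyPDF file.

```python
import tempfile
from pathlib import Path


def pdf_to_py(path):
    """Rename a .pdf file to .py in place and return the new path.

    Hypothetical helper for illustration, not part of pypdfplot.
    """
    path = Path(path)
    new_path = path.with_suffix(".py")
    path.rename(new_path)
    return new_path


# Demonstration on a placeholder file inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    pdf_path = Path(tmp) / "example.pdf"
    pdf_path.write_text("#%PDF-1.4 placeholder\n")
    py_path = pdf_to_py(pdf_path)
    print(py_path.name)  # prints example.py
```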

The resulting file is both a valid Python file and a valid PDF file, and is conveniently called a PyPDF file. The compatibility with both Python and PDF is achieved by arranging the data blocks in the PyPDF file in a very specific order, such that the PDF part is read as a comment block in Python, and the Python part is seen as an embedded file by a PDF reader.
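The Python side of this arrangement can be illustrated without pypdfplot. The sketch below builds a short text that mimics the layout (a comment as first line, ordinary Python code, and a trailing triple-quoted string standing in for the PDF data) and checks that Python compiles and runs it. The contents of polyglot_source are made up for illustration and do not form a valid PDF.

```python
# Illustrative only: this mimics the PyPDF layout, the PDF parts are placeholders.
polyglot_source = '''#%PDF-1.4 24 0 obj << /Type /EmbeddedFile /Length 42 >> stream
result = sum(range(5))

"""
--- Do not edit below ---
endstream
endobj
%%EOF
"""
'''

# The first line starts with '#', so Python treats it as a comment.
# The trailing triple-quoted string is an expression statement whose
# value is simply discarded, so it behaves like a comment block.
code = compile(polyglot_source, "example.py", "exec")
namespace = {}
exec(code, namespace)
print(namespace["result"])  # prints 10
```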

To learn more about how to use pypdfplot, continue with the Quickstart_ section, or check out the commented examples in the `examples folder <https://github.com/dcmvdbekerom/pypdfplot/tree/develop/examples>`__.

Installation

PyPI repository

Install the package from the PyPI repository by opening a command prompt and entering:

.. code:: bash

pip install pypdfplot

GitHub or download

Alternatively, the source files can be downloaded directly from the `GitHub repository <https://github.com/dcmvdbekerom/pypdfplot>`__. After downloading the source files, navigate to the project directory and install the package by opening a command prompt and entering:

.. code:: bash

python setup.py install

Anaconda/Spyder

In order for pypdfplot to work in an Anaconda/Spyder environment, the package has to be installed from source with the "editable" option.

Download the source code following the instructions above. Open an Anaconda prompt and navigate to the directory with the source code. Now install the package by typing in the Anaconda prompt:

.. code:: bash

pip install -e .

Installing the package with the "editable" option guarantees that the libraries are reloaded each time the code is run.

This will produce a warning in the IPython console, which can be turned off by unchecking the "Show reloaded module list" box in the Tools > Preferences > Python interpreter menu in Spyder.

Next, navigate to the Graphics tab in the Tools > Preferences > IPython console menu and set the backend to "Automatic".

It is further recommended to save the figure with the keyword cleanup = False, see :ref:`savefig()`.

.. _Quickstart:

Quickstart

In this example, a plot is produced with Matplotlib and saved as a PyPDF file using the pypdfplot backend.

First, create a new Python file and call it e.g. example.py.

To produce a PyPDF-file, all you have to do is import the pypdfplot backend by adding the line import pypdfplot.backend before importing Matplotlib:

.. code:: python

import pypdfplot.backend
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-10,20,0.1)
y = x**2

plt.plot(x,y)
plt.savefig('example.pdf')

After running this script, the file example.py will have been removed and replaced by a new file example.pdf:

.. image:: https://pypdfplot.readthedocs.io/en/latest/_images/example_plot.png

As can be seen in the "Attachments" column on the left, the original example.py generating script is embedded in the PDF file.

The script can be accessed by renaming example.pdf back to example.py and opening it in a text editor:

.. code:: python

#%PDF-1.4 24 0 obj << /Type /EmbeddedFile /Length        690 >> stream
import pypdfplot.backend
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-10,20,0.1)
y = x**2

plt.plot(x,y)
plt.savefig('example.pdf')

"""
--- Do not edit below ---
endstream
endobj
1 0 obj

<< ... >>

startxref
9567
%%EOF
0000010174 LF
PyPDF-1.0
"""

It can be seen that after saving the plot with the pypdfplot backend, a comment line was added as the first line and a large comment block was appended at the end of the file. These comments contain all the data necessary for displaying the PDF and should not be altered directly by the user.
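To make the layout concrete, the sketch below separates the user-editable script from the surrounding PDF data, based only on the structure visible in the listing above. The helper extract_script and the sample text are hypothetical and not part of pypdfplot. A real PyPDF file also contains binary PDF data, so reading it as text may need extra care.

```python
MARKER = "--- Do not edit below ---"


def extract_script(pypdf_text):
    """Return the user-editable script part of a PyPDF text.

    Hypothetical helper for illustration, not part of pypdfplot.
    """
    lines = pypdf_text.splitlines()
    # Drop the leading '#%PDF-...' comment line if present.
    if lines and lines[0].startswith("#%PDF"):
        lines = lines[1:]
    # Cut at the opening triple quote directly above the marker line.
    if MARKER in lines:
        end = lines.index(MARKER)
        if end > 0 and lines[end - 1].strip() == '"""':
            end -= 1
        lines = lines[:end]
    return "\n".join(lines).strip() + "\n"


# Shortened sample in the same layout as the listing above.
sample = '''#%PDF-1.4 24 0 obj << /Type /EmbeddedFile /Length        690 >> stream
import numpy as np

x = np.arange(-10,20,0.1)

"""
--- Do not edit below ---
endstream
%%EOF
"""
'''

print(extract_script(sample))
```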

To update the plot, modify the generating Python script instead. The PDF is updated when the script is run again.

For example, let's add another plot, e.g. a sine function:

.. code:: python

#%PDF-1.4 24 0 obj << /Type /EmbeddedFile /Length        690 >> stream
import pypdfplot.backend
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-10,20,0.1)
y1 = x**2
y2 = 100*np.sin(x)

plt.plot(x,y1)
plt.plot(x,y2)
plt.savefig('example.pdf')

"""
--- Do not edit below ---
endstream
endobj
1 0 obj

<< ... >>

startxref
9567
%%EOF
0000010174 LF
PyPDF-1.0
"""

After running example.py, the file is again replaced by our updated example.pdf:

.. image:: https://pypdfplot.readthedocs.io/en/latest/_images/example_plot2.png

Functions

.. _savefig():

savefig()

Saves the current plot as a PyPDF file.

.. code:: python

savefig(fname, 
        pack_list = [],
        cleanup = True,
        multiple = 'pickle',
        force_pickle = False,
        verbose = True,
        prompt_overwrite = False,
        **kwargs)

:fname: str

Filename of the output file.

:pack_list: list, default = []

List with filenames that will be embedded in the PyPDF file. The generating script is added separately and should not be included here. See `Packing and unpacking`_ for more details.

:multiple: str, default = 'pickle'

How to handle multiple plots in a single generating script. Can be any of 'pickle', 'add_page', or 'finalize'. See `Multiple plots`_ for more details.

:cleanup: bool, default = True

Whether or not to clean up files that have been embedded in the PyPDF file. Set to False and run the script to extract the embedded files.

:force_pickle: bool, default = False

Pickles the figure and embeds a Python script that unpickles and reads the figure again. This can be useful when dealing with very large source files, see Pickling_ for more details.

:verbose: bool, default = True

Whether or not to show verbose comments during saving.

:prompt_overwrite: bool, default = False

Whether or not to prompt when the output file already exists and is about to be overwritten. If False and the output file already exists, the file will be overwritten if possible.

:**kwargs: Any keyword arguments accepted by matplotlib.pyplot.savefig()

unpack()

Extracts the files embedded in the PyPDF file. Must be called before the embedded files are read by the generating script. This can be guaranteed by importing the backend using pypdfplot.backend.unpack, which automatically calls unpack() with its default parameters. See `Packing and unpacking`_ for more details.

.. code:: python

unpack(fname = None,
       verbose = True)

:fname: str, default = None

Filename of the PyPDF file to unpack. If None, the filename of the currently executing script is taken.

:verbose: bool, default = True

Whether or not to show verbose comments during extraction.

fix_pypdf()

Fixes PyPDF files that have been severed, e.g. because they were saved as 'regular' PDF files outside of pypdfplot. See `PyPDF compliance types`_ for more details.

.. code:: python

fix_pypdf(input_fname,
          output_fname = None,
          verbose = True)

:input_fname: str

Filename of the severed PyPDF file.

:output_fname: str, default = None

Filename of the fixed output PyPDF file. If None, the input PDF file is overwritten.

:verbose: bool, default = True

Whether or not to show verbose comments during fixing.

.. _Packing and unpacking:

Packing and unpacking

In many cases you may want to plot data that is stored in a separate external file. In order for this to work, the external data file must be included, which can be achieved by packing and unpacking the data into the PyPDF file.

.. _Packing files:

Packing files

In this section we show how to write a script that reads data from an external Excel file and reads the plot title and axis labels from an external text file, where both files are embedded in the PyPDF file.

Create an Excel file data.xlsx and fill it with data, e.g. the first 10 numbers of the Fibonacci sequence:

.. image:: https://pypdfplot.readthedocs.io/en/latest/_images/excel_data.png

Next, create a text file title.txt and add names for the plot title and axes:

.. image:: https://pypdfp