Hugo with Jupyter and Jupytext

Jorge Martínez Garrido

April 10, 2022

python hugo jupyter notebooks jupytext matplotlib

This post was written using Markdown. All its Python code snippets are interpreted and their output is captured. You don’t believe me? See the code below:

from datetime import datetime

import matplotlib.pyplot as plt

plt.plot(list(range(10)), "-bo")
plt.title(f"Last update: {datetime.now().strftime('%Y-%m-%d')}")
plt.show()

Do you see the date in the title of the figure? It is today’s date. However, it does not match the publish date of this post. This is because my website gets rendered every day at 00:00. Thus, the date in the figure’s title gets updated every day too.

Want to know how I implemented this in a very simple way? Keep reading!

Introduction to the problem

Hugo is a blazing-fast static website generator written in the Go programming language. Part of that speed comes from Go being a compiled language.

As it happens with other static website generators, you can use Markdown to write any kind of content. This way, you focus on writing content instead of dealing with raw HTML.

It turns out that it is possible to write Jupyter Notebooks using Markdown too thanks to Jupytext. This means you can open and execute these Markdown notebooks within Jupyter Notebook without storing their content as JSON files.

In fact, Jupytext solves the problem of reviewing Jupyter Notebooks in your Git VCS since the notebooks are Markdown files!

If you wish to combine the power of Hugo, Jupytext and Jupyter Notebook, then you should write the following YAML at the top of your Markdown post:

---
title: Title of the post
author: Author of the post
date: YYYY-MM-DD
categories: ["Category A", "Category B"]
tags: ["Tag A", "Tag B", "Tag C"]

jupytext:
    encoding: '# -*- coding: utf-8 -*-'
    text_representation:
        extension: .md
        format_name: md
    kernelspec:
        display_name: Python 3
        language: python
        name: python3
---

All the content of your post goes below this line using plain Markdown...
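Since only posts carrying this header are meant to be treated as notebooks, it can help to detect it before converting anything. The following is a minimal sketch using only the standard library. The function names `front_matter_lines` and `has_jupytext_header` are hypothetical, and the check is a plain line scan rather than a real YAML parse, so it assumes `jupytext:` is written at the start of a line as in the header above.

```python
def front_matter_lines(text: str) -> list[str]:
    """Return the lines between the opening and closing '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # the post does not start with front matter
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return lines[1:index]
    return []  # no closing marker: treat it as having no front matter


def has_jupytext_header(text: str) -> bool:
    """True when the front matter contains an unindented 'jupytext:' key."""
    return any(line.startswith("jupytext:") for line in front_matter_lines(text))
```

A build script could call `has_jupytext_header(path.read_text())` on each post and skip the ones that return `False`.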

However, as soon as you include a Python code snippet like:

print("This code needs to be interpreted")

This code needs to be interpreted

It will only be executed if you use Jupytext to convert the current Markdown post into a Jupyter Notebook with the *.ipynb extension.

What can we do if we want to render the output of the previous piece of code? What if we want to store output figures coming from Matplotlib?

The solution to the problem

The solution is very simple: a roundtrip conversion!

Jupytext executes the Markdown notebook and converts it to a Jupyter one. Then, nbconvert converts that Jupyter notebook back into a Markdown one while preserving the executed output.

I know roundtrip conversions are inefficient, but this is the only solution I came up with! Furthermore, it has plenty of advantages:

If some problem is found when executing the notebook, the post will not be rendered. This ensures that all the code exposed in your website is up-to-date and working as expected.
You can forget about storing output data: there is no longer any need to save Matplotlib figures or similar files.
It is possible to apply the roundtrip conversion in the CI, so you only store the Markdown posts. Your development workflow is not affected!

Commands to be used

I organize all my posts as specified in the unbloated theme. Therefore, all my posts are collected under a common directory named content/posts/. Knowing this, it is possible to apply the following commands:

xvfb-run jupytext --to notebook --execute content/posts/**/*.md &&\
jupyter nbconvert --to markdown content/posts/**/*.ipynb &&\
rm -rf content/posts/**/*.ipynb

Notice this will overwrite your current posts with versions that include the captured output. Therefore, you want to apply these commands in the CI or wherever your website gets rendered.
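One caveat with the shell version: in bash, `**` only matches nested directories recursively when the `globstar` option is enabled. A way around that is to drive the same roundtrip from Python. This is only a sketch, assuming `jupytext` and `jupyter` are available on the `PATH`. The function names `roundtrip_commands` and `roundtrip` and the `dry_run` parameter are hypothetical, and `xvfb-run` is left out for brevity.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(post: Path) -> list[list[str]]:
    """Return the two commands that execute one post and convert it back."""
    notebook = post.with_suffix(".ipynb")
    return [
        # Same flags as the shell version: execute the post while converting it.
        ["jupytext", "--to", "notebook", "--execute", str(post)],
        # Convert the executed notebook back into Markdown.
        ["jupyter", "nbconvert", "--to", "markdown", str(notebook)],
    ]


def roundtrip(posts_dir: Path, dry_run: bool = False) -> list[list[str]]:
    """Apply the roundtrip to every *.md file found below posts_dir."""
    commands = []
    for post in sorted(posts_dir.rglob("*.md")):
        for command in roundtrip_commands(post):
            if not dry_run:
                # check=True stops the build as soon as one post fails to run.
                subprocess.run(command, check=True)
            commands.append(command)
        if not dry_run:
            # Remove the intermediate notebook, like the final rm in the shell version.
            post.with_suffix(".ipynb").unlink(missing_ok=True)
    return commands
```

Calling `roundtrip(Path("content/posts"))` runs the conversion for real, while `dry_run=True` only returns the commands so they can be inspected first.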

A note on xvfb

If you are using GitHub Actions or similar, it is not possible to display graphics because the server does not have a display. In other words, CI machines do not have a monitor where you can see a desktop, running GUIs or any other type of graphic content. This is the reason why xvfb (which stands for X Virtual FrameBuffer) comes into play.

By taking advantage of the xvfb-run command, it is possible to execute commands which involve graphics display in headless mode.
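If the same build script also runs on a machine that does have a display, the `xvfb-run` prefix can be added only when needed. The sketch below assumes an X11 setup where the `DISPLAY` environment variable signals an available display. The helper name `with_virtual_display` and its `environ` and `xvfb_run` parameters are hypothetical and exist mainly to make the function easy to try out.

```python
import os
import shutil


def with_virtual_display(command, environ=None, xvfb_run=None):
    """Prefix command with xvfb-run when no X display seems to be available."""
    environ = os.environ if environ is None else environ
    if xvfb_run is None:
        # Look the executable up on the PATH; None means it is not installed.
        xvfb_run = shutil.which("xvfb-run")
    if environ.get("DISPLAY") or xvfb_run is None:
        # A display exists, or xvfb-run is missing: return the command unchanged.
        return list(command)
    return [xvfb_run, *command]
```

The returned list can be passed straight to `subprocess.run`. Note that the command is returned unchanged when `xvfb-run` is not installed, so a missing package would only show up later as a display error.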

Conclusion

I am pretty surprised by how well this worked. It was as simple as adding two commands to the rendering workflow (no need to remove *.ipynb files in the CI machine).

Now, I no longer need to worry about checking the output of each command and manually copying it into the post. I still need to explore how this behaves when using Matplotlib animations or other plotting libraries such as Plotly or even PyVista. Stay tuned!