3 Ways To Convert HTML To PDF In Python

Welcome to a tutorial on how to convert HTML to PDF in Python. So you need to create a PDF document using HTML? Or maybe capture an HTML page into a PDF file? The Python standard library does not include a module for creating PDF files, but there are a few ways to get it done with third-party tools. Read on for the examples!

 

 


DOWNLOAD & NOTES

Here is the download link to the example code, so you don’t have to copy-paste everything.

 

EXAMPLE CODE DOWNLOAD

Source code on GitHub Gist

Just click on “download zip” or do a git clone. I have released it under the MIT license, so feel free to build on top of it or use it in your own project.

 


 

 

PYTHON CONVERT HTML TO PDF

All right, let us now get into the examples of converting HTML to PDF in Python.

 

1) WKHTMLTOPDF & PDFKIT

1-pdfkit.py
# (A) LOAD PDFKIT
import pdfkit
 
# (B) HTML TO PDF
pdfkit.from_string(
  "<h1>It Works!</h1><p>This is an HTML string</p>",
  "demo1.pdf"
)
 
pdfkit.from_file(
  "x-dummy.html",
  "demo2.pdf",
  options = {"enable-local-file-access": ""}
)
 
pdfkit.from_url(
  "https://en.wikipedia.org/wiki/Koto_(instrument)",
  "demo3.pdf"
)

This seems to be the most common “answer” floating around the Internet, and it involves two packages.

  • First, you will need to install wkhtmltopdf, which reads as “Webkit HTML to PDF”. Make sure to add WHERE_YOU_INSTALLED/wkhtmltopdf/bin/ to your system path, or pdfkit will not be able to find it.
  • Next, install pdfkit with pip install pdfkit.
  • That’s all, very easy to use. Just import pdfkit in your script and use it.

I can see why this is a popular option, but take note, the wkhtmltopdf project is archived at the time of writing. That means no future updates and no fixes for security loopholes, so I cannot recommend this option. Use with caution.
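If you would rather not touch the system path, pdfkit can also be told where the binary is. Here is a minimal sketch of that idea. The find_wkhtmltopdf() and html_to_pdf() helpers and the extra_dirs parameter are my own names for illustration, not part of pdfkit; the only pdfkit calls used are pdfkit.configuration() and pdfkit.from_string().

```python
import shutil

def find_wkhtmltopdf(extra_dirs=()):
    """Return the full path to the wkhtmltopdf binary, or None if not found."""
    # First look on the normal system path.
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    # Then try any extra folders (hypothetical install locations you pass in).
    for folder in extra_dirs:
        found = shutil.which("wkhtmltopdf", path=str(folder))
        if found:
            return found
    return None

def html_to_pdf(html, output, extra_dirs=()):
    """Convert an HTML string to a PDF, pointing pdfkit at the binary explicitly."""
    import pdfkit  # imported here so the lookup above also works without pdfkit
    binary = find_wkhtmltopdf(extra_dirs)
    if binary is None:
        raise FileNotFoundError("wkhtmltopdf not found; install it or add it to the path")
    config = pdfkit.configuration(wkhtmltopdf=binary)
    return pdfkit.from_string(html, output, configuration=config)
```

For example, html_to_pdf("<h1>It Works!</h1>", "demo1.pdf", extra_dirs=["WHERE_YOU_INSTALLED/wkhtmltopdf/bin"]) should behave like the first call in the script above, just without relying on the system path.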

 

 

2) PYQT WEB ENGINE

2-qt.py
# (A) LOAD MODULES
import sys
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl
 
# (B) CREATE QAPP
app = QApplication(sys.argv)
 
# (C) WEB ENGINE VIEW - PRINT URL TO PDF
webview = QWebEngineView()
webview.setZoomFactor(1)

# (C1) ONCE THE PAGE HAS LOADED, PRINT IT TO PDF
def to_pdf(ok):
  if ok:
    webview.page().printToPdf("demo4.pdf")
  else:
    app.exit(1) # PAGE FAILED TO LOAD
webview.loadFinished.connect(to_pdf)

# (C2) ONCE THE PDF IS SAVED, QUIT THE APP
webview.page().pdfPrintingFinished.connect(app.quit)

# (C3) START LOADING THE PAGE
webview.load(QUrl("https://en.wikipedia.org/wiki/Koto_(instrument)"))
 
# (D) GO!
sys.exit(app.exec_())

TLDR: Create a Qt app that loads a webpage and prints it to a PDF file, the end. You will need both PyQt5 and the web engine add-on for this: pip install PyQt5 PyQtWebEngine.

If you poke around “Webkit HTML to PDF”, they do mention that it is based on Qt WebKit. Yep, this snippet essentially does the same thing with a Qt web engine view, except that Qt WebEngine is built on Chromium and is a lot more up to date.

 

3) CHROME HEADLESS MODE

3-chrome.py
# (A) CHANGE TO YOUR OWN!
import subprocess
chrome = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"
output = "D:\\http\\demo5.pdf"
url = "https://en.wikipedia.org/wiki/Koto_(instrument)"

# (B) BUILD COMMAND AS A LIST - NO MANUAL QUOTING, WORKS OUTSIDE WINDOWS TOO
cmd = [chrome, "--headless", "--disable-gpu", f"--print-to-pdf={output}", url]

# (C) RUN COMMAND - RAISES AN ERROR IF CHROME EXITS WITH A FAILURE CODE
subprocess.run(cmd, check=True)

If you can install Chrome (or any Chromium-based browser) on the server, this snippet will run a terminal command that does the following:

  • Launch Chrome in headless mode. That is, Chrome runs without opening a window.
  • Load the specified URL.
  • Save the page to a PDF file.
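The steps above can be wrapped into a small reusable function. This is only a sketch under a few assumptions: the executable names in CANDIDATES are my guesses at common Chromium-based browsers and may not match your system, and find_browser(), build_command() and url_to_pdf() are hypothetical helper names. The command-line flags are the same ones used in the script above.

```python
import shutil
import subprocess

# Assumed executable names to try; adjust this list for your own system.
CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome", "msedge")

def find_browser(candidates=CANDIDATES):
    """Return the first Chromium-based browser found on the system path, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

def build_command(browser, url, output):
    """Build the headless print-to-PDF command as an argument list."""
    return [browser, "--headless", "--disable-gpu", f"--print-to-pdf={output}", url]

def url_to_pdf(url, output, browser=None, timeout=60):
    """Run the browser and return True if it exited without an error code."""
    browser = browser or find_browser()
    if browser is None:
        raise FileNotFoundError("No Chromium-based browser found on the system path")
    result = subprocess.run(build_command(browser, url, output), timeout=timeout)
    return result.returncode == 0
```

Passing the command as a list lets subprocess handle the quoting, and the timeout stops a stuck browser from hanging your script forever.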

For those who are lost, the URL does not have to be a public website. You can also point it to a local file with file://YOUR-HTML-FILE, or even to a local server with http://localhost/DYNAMIC-HTML-PAGE.
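Writing a file:// URL by hand is easy to get wrong, especially with Windows drive letters and spaces in the path. Here is a small sketch that lets Python build it instead; to_file_url() is a hypothetical helper name, and it only relies on pathlib from the standard library.

```python
from pathlib import Path

def to_file_url(html_path):
    """Turn a local HTML file path into a file:// URL."""
    # resolve() makes the path absolute, which as_uri() requires.
    # as_uri() also escapes characters such as spaces.
    return Path(html_path).resolve().as_uri()
```

The result of to_file_url("x-dummy.html") can then be used as the URL in the Chrome command.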

 

 


THE END

Thank you for reading, and we have come to the end. I hope this guide has helped you to better understand how to convert HTML to PDF in Python, and if you have anything to share about this guide, please feel free to comment below. Good luck and happy coding!