# Convert HTML to PDF with Python: WeasyPrint, pdfkit and Playwright
Generating PDFs from HTML is a common task: invoices, reports, certificates, receipts. Python offers several libraries depending on your use case.
## Quick Comparison
| Tool | Engine | JavaScript | Modern CSS | Installation |
|---|---|---|---|---|
| **WeasyPrint** | Python (uses Pango) | No | CSS3 + Flexbox | `pip install weasyprint` |
| **pdfkit** | wkhtmltopdf | Basic | Partial (no CSS Grid) | Requires external binary |
| **Playwright** | Chromium | Full | Full | `pip install playwright` + browser download |
| **xhtml2pdf** | Pure Python | No | Limited | `pip install xhtml2pdf` |
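## WeasyPrint: Modern CSS Without External Binaries
WeasyPrint is the most Pythonic option: it lays out HTML and CSS itself, so there is no browser or external executable to install. It does rely on the Pango system library for text rendering, which may need to be installed separately.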
### Installation
```bash
pip install weasyprint
# On Linux you may need system dependencies:
# Ubuntu/Debian:
sudo apt-get install libpango-1.0-0 libpangoft2-1.0-0
# macOS with Homebrew:
brew install pango
```
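A missing Pango library typically surfaces as an error when WeasyPrint is imported or first used. As a quick pre-flight check, the sketch below asks the standard library's `ctypes.util.find_library` whether a Pango library is discoverable. The helper name `pango_available` is hypothetical, and the lookup name `pango-1.0` is an assumption based on the package names above; on some systems (for example Homebrew installs in non-default prefixes) the lookup can report a false negative.

```python
from ctypes.util import find_library


def pango_available(name: str = "pango-1.0") -> bool:
    """Return True if the dynamic loader can locate the named library.

    The default name is an assumption; adjust it for your platform.
    """
    return find_library(name) is not None


if pango_available():
    print("Pango found on the default library search path.")
else:
    print("Pango not found; install the system package before using WeasyPrint.")
```

Treat a negative result as a hint to check the installation rather than as proof that WeasyPrint cannot run.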
### Basic Usage
```python
from weasyprint import HTML, CSS
# From HTML string
HTML(string='<h1>Hello</h1><p>My first PDF.</p>').write_pdf('output.pdf')

# From HTML file
HTML(filename='report.html').write_pdf('report.pdf')

# From URL
HTML(url='https://example.com/page').write_pdf('page.pdf')

# With separate CSS
HTML(string='<h1>Title</h1>').write_pdf(
    'output.pdf',
    stylesheets=[CSS(string='h1 { color: navy; }')]
)
```
### Templates with Jinja2
The most powerful workflow combines an HTML template with data to produce a dynamically generated PDF.
```python
from weasyprint import HTML
from jinja2 import Environment, FileSystemLoader

# Configure Jinja2
env = Environment(loader=FileSystemLoader('templates/'))
template = env.get_template('invoice.html')

# Invoice data
data = {
    "number": "2024-042",
    "client": "XYZ Corp Ltd",
    "items": [
        {"description": "Web design", "price": 800},
        {"description": "Monthly SEO", "price": 300},
    ],
    "total": 1100
}

# Render HTML
rendered_html = template.render(**data)

# Generate PDF
HTML(string=rendered_html).write_pdf('invoice_2024-042.pdf')
print("PDF generated: invoice_2024-042.pdf")
```
**Sample `templates/invoice.html`:**
```html
<h1>Invoice #{{ number }}</h1>
<p>Client: {{ client }}</p>
<table>
  <thead>
    <tr><th>Description</th><th>Price</th></tr>
  </thead>
  <tbody>
    {% for item in items %}
    <tr><td>{{ item.description }}</td><td>{{ item.price }} USD</td></tr>
    {% endfor %}
    <tr><td>TOTAL</td><td>{{ total }} USD</td></tr>
  </tbody>
</table>
```
### Print CSS (`@page`, page breaks)
WeasyPrint supports print-oriented CSS, including `@page` rules, margin boxes and page-break control:
```css
/* Page setup */
@page {
  size: A4;
  margin: 2cm 1.5cm;

  /* Footer with page number */
  @bottom-center {
    content: "Page " counter(page) " of " counter(pages);
    font-size: 10px;
    color: #666;
  }
}

/* Prevent page breaks inside important elements */
.invoice-item {
  page-break-inside: avoid;
}

/* Force new page before section */
.new-section {
  page-break-before: always;
}

/* Hide elements that shouldn't appear in PDF */
.screen-only {
  display: none;
}
```
## pdfkit: wkhtmltopdf from Python
`pdfkit` is a wrapper around `wkhtmltopdf`, a command-line tool that renders pages with an older WebKit engine:
```bash
# Install pdfkit
pip install pdfkit

# Install wkhtmltopdf (external binary)
# Windows: download from https://wkhtmltopdf.org/downloads.html
# Ubuntu: sudo apt-get install wkhtmltopdf
# macOS: brew install --cask wkhtmltopdf
```
```python
import pdfkit

options = {
    'page-size': 'A4',
    'margin-top': '1.5cm',
    'margin-bottom': '1.5cm',
    'margin-left': '1.5cm',
    'margin-right': '1.5cm',
    'encoding': 'UTF-8',
    'no-outline': None,  # flags without a value are passed as None
    'footer-right': '[page] of [topage]',
    'footer-font-size': '9',
}

# From URL
pdfkit.from_url('https://example.com', 'web.pdf', options=options)

# From HTML file
pdfkit.from_file('report.html', 'report.pdf', options=options)

# From string
pdfkit.from_string('<h1>Hello</h1>', 'simple.pdf', options=options)

# Return bytes (without saving file)
pdf_bytes = pdfkit.from_string('<h1>Hello</h1>', False, options=options)
```
### Limitation: wkhtmltopdf is no longer actively updated
The wkhtmltopdf project no longer receives active updates. For modern pages with CSS Grid, advanced Flexbox or heavy JavaScript, use Playwright instead.
## Playwright: Real Chromium for JavaScript-Heavy Pages
```bash
pip install playwright
python -m playwright install chromium
```
```python
import asyncio
from playwright.async_api import async_playwright

async def html_to_pdf(url: str, output_file: str):
    async with async_playwright() as p:
        # page.pdf() is only supported in Chromium
        browser = await p.chromium.launch()
        page = await browser.new_page()

        # Load page (wait for JS to finish executing)
        await page.goto(url, wait_until='networkidle')

        # Generate PDF
        await page.pdf(
            path=output_file,
            format='A4',
            print_background=True,  # include background colors
            margin={
                'top': '1.5cm',
                'bottom': '1.5cm',
                'left': '1.5cm',
                'right': '1.5cm'
            }
        )
        await browser.close()
    print(f"PDF saved: {output_file}")

# Run
asyncio.run(html_to_pdf('https://my-app.com/invoice/42', 'invoice42.pdf'))
```
**Synchronous version (simpler):**
```python
from playwright.sync_api import sync_playwright

def generate_pdf(html_content: str, output: str):
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.set_content(html_content, wait_until='domcontentloaded')
        page.pdf(path=output, format='A4', print_background=True)
        browser.close()

generate_pdf('<h1>Report</h1><p>Processed data.</p>', 'report.pdf')
```
### Playwright Advantages

- Supports **React, Vue, Angular** and any SPA
- CSS Grid, Flexbox, Web Fonts (including Google Fonts)
- Waits for network requests to complete (`wait_until='networkidle'`)
- Emulates devices (mobile, tablet) for responsive PDFs

## REST API for PDF Generation
If your project generates PDFs from a web server, here's a Flask example:
```python
from flask import Flask, request, send_file
from markupsafe import escape
from weasyprint import HTML
import io

app = Flask(__name__)

@app.route('/generate-pdf', methods=['POST'])
def generate_pdf():
    data = request.json
    # Escape user-supplied values before embedding them in HTML
    html = f"""
    <h1>Order #{escape(data['order_id'])}</h1>
    <p>Client: {escape(data['client'])}</p>
    <p>Total: {escape(data['total'])} USD</p>
    """
    # write_pdf() without a target returns the PDF as bytes
    pdf_bytes = HTML(string=html).write_pdf()
    return send_file(
        io.BytesIO(pdf_bytes),
        mimetype='application/pdf',
        as_attachment=True,
        download_name=f"order_{data['order_id']}.pdf"
    )

if __name__ == '__main__':
    app.run(debug=True)  # debug mode is for local development only
```
## Which Tool Should You Choose?

| Use case | Recommended tool |
|---|---|
| Invoices, reports, certificates (CSS-designed) | **WeasyPrint** |
| Existing web page without complex JS | **pdfkit** |
| Dashboard with React/Vue or dynamic content | **Playwright** |
| No browser or external binary to install | **WeasyPrint** |
| Production on Linux server | **WeasyPrint** or **Playwright** |

WeasyPrint is a good starting point for most document-style PDFs such as invoices and reports. If you need to render JavaScript, switch to Playwright.
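The recommendations in this section can be read as a small decision rule. The sketch below encodes it in a hypothetical helper, `choose_pdf_tool`; the function name and its two flags are illustrative assumptions and not part of any of the libraries discussed.

```python
def choose_pdf_tool(needs_javascript: bool = False,
                    source_is_existing_page: bool = False) -> str:
    """Suggest a tool following the use-case table in this article.

    Both parameters are hypothetical flags chosen for this sketch.
    """
    if needs_javascript:
        # Dashboards, SPAs and other dynamic content need a real browser
        return "Playwright"
    if source_is_existing_page:
        # An existing page without complex JS can go through wkhtmltopdf
        return "pdfkit"
    # CSS-designed documents such as invoices, reports and certificates
    return "WeasyPrint"


print(choose_pdf_tool())
print(choose_pdf_tool(needs_javascript=True))
print(choose_pdf_tool(source_is_existing_page=True))
```

JavaScript takes priority in this sketch because it is the one requirement that only Playwright covers fully.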