# EPUB Format: Internal Structure, Creation and Conversion
EPUB (Electronic Publication) is the open standard format for ebooks, maintained by the W3C. Most reading systems support it, including Kobo, Apple Books, and Adobe Digital Editions; Kindle devices accept it only via conversion.
## Internal Structure of an EPUB
An EPUB file is essentially a renamed ZIP archive. Extracting it reveals:
```
book.epub (= ZIP)
├── mimetype                ← must be the first file, uncompressed
├── META-INF/
│   └── container.xml       ← points to the main OPF file
└── OEBPS/
    ├── content.opf         ← manifest: metadata + file list
    ├── nav.xhtml           ← table of contents (EPUB 3)
    ├── toc.ncx             ← table of contents (EPUB 2, backwards compatibility)
    ├── chapter01.xhtml
    ├── chapter02.xhtml
    ├── styles/style.css
    └── images/cover.jpg
```
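The layout rules above can be checked with Python's standard `zipfile` module. This is a minimal sketch rather than a real validator: the function name `inspect_epub`, the `REQUIRED` tuple, and the throwaway demo archive are assumptions made for illustration.

```python
import os
import tempfile
import zipfile

# Entries every EPUB container is expected to have (illustrative subset).
REQUIRED = ("mimetype", "META-INF/container.xml")

def inspect_epub(path):
    """Return (entry names, list of structural problems found)."""
    problems = []
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()
        names = [info.filename for info in infos]
        for required in REQUIRED:
            if required not in names:
                problems.append(f"missing {required}")
        if infos and infos[0].filename == "mimetype":
            if infos[0].compress_type != zipfile.ZIP_STORED:
                problems.append("mimetype is compressed")
        else:
            problems.append("mimetype is not the first entry")
    return names, problems

# Demo on a throwaway archive so the sketch runs without a real book.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.epub")
with zipfile.ZipFile(demo_path, "w") as zf:
    zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
    zf.writestr("META-INF/container.xml", "<container/>", compress_type=zipfile.ZIP_DEFLATED)

names, problems = inspect_epub(demo_path)
print(names)     # ['mimetype', 'META-INF/container.xml']
print(problems)  # []
```

Point `inspect_epub` at an actual `.epub` file to list its entries before unpacking it.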
### The `mimetype` file
Must contain exactly (no trailing newline):
```
application/epub+zip
```
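When packaging by hand, the entry order and compression method matter as much as the content. The sketch below shows one way to write the archive with Python's `zipfile`; `pack_epub` and the temporary directory layout are hypothetical names chosen for this example.

```python
import os
import tempfile
import zipfile

def pack_epub(src_dir, out_path):
    """Zip src_dir into out_path, writing mimetype first and uncompressed."""
    with zipfile.ZipFile(out_path, "w") as zf:
        # Exact bytes, no trailing newline; stored rather than deflated.
        zf.writestr("mimetype", b"application/epub+zip", compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # deterministic entry order
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arcname != "mimetype":  # already written above
                    zf.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)

# Demo: pack a temporary directory holding one placeholder file.
work = tempfile.mkdtemp()
src = os.path.join(work, "book")
os.makedirs(os.path.join(src, "META-INF"))
with open(os.path.join(src, "META-INF", "container.xml"), "w", encoding="utf-8") as fh:
    fh.write("<container/>")
out = os.path.join(work, "book.epub")
pack_epub(src, out)

with zipfile.ZipFile(out) as zf:
    first = zf.infolist()[0]
    packed_names = zf.namelist()
    mimetype_bytes = zf.read("mimetype")
print(packed_names)  # ['mimetype', 'META-INF/container.xml']
```

The placeholder `container.xml` here is not a valid container file; it only demonstrates the ordering.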
### `META-INF/container.xml`
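```xml
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
```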
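A reading system opens `container.xml` first to learn where the package document lives. As a rough sketch of that lookup using the standard library's `xml.etree.ElementTree` (the helper name `find_opf_path` and the inline sample are assumptions for illustration):

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

def find_opf_path(container_xml):
    """Return the full-path of the first rootfile declared in container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml declares no rootfile")
    return rootfile.attrib["full-path"]

# Inline sample so the sketch runs without an EPUB on disk.
SAMPLE_CONTAINER = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

opf_path = find_opf_path(SAMPLE_CONTAINER)
print(opf_path)  # OEBPS/content.opf
```

With a real book, the bytes would come from `zipfile.ZipFile(path).read("META-INF/container.xml")`.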
### `OEBPS/content.opf`: The Manifest
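```xml
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">978-0-000-00000-0</dc:identifier>
    <dc:title>My Example Ebook</dc:title>
    <dc:creator>Author Name</dc:creator>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2024-01-15T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch01" href="chapter01.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch02" href="chapter02.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch01"/>
    <itemref idref="ch02"/>
  </spine>
</package>
```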
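The package document ties three things together: Dublin Core metadata, a manifest of files, and a spine that gives the reading order. The following sketch reads all three with `xml.etree.ElementTree`; `read_opf` and the trimmed inline sample are illustrative assumptions, not part of any library.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def read_opf(opf_xml):
    """Return (metadata dict, manifest id->href dict, spine hrefs in reading order)."""
    root = ET.fromstring(opf_xml)
    metadata = {
        field: root.findtext(f"opf:metadata/dc:{field}", namespaces=NS)
        for field in ("identifier", "title", "creator", "language")
    }
    manifest = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
    }
    # Each spine itemref points at a manifest item by id.
    spine = [
        manifest[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", NS)
    ]
    return metadata, manifest, spine

# Trimmed sample package document (hypothetical ids and file names).
SAMPLE_OPF = b"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">978-0-000-00000-0</dc:identifier>
    <dc:title>My Example Ebook</dc:title>
    <dc:creator>Author Name</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch01" href="chapter01.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch01"/>
  </spine>
</package>"""

metadata, manifest, spine = read_opf(SAMPLE_OPF)
print(metadata["title"])  # My Example Ebook
print(spine)              # ['chapter01.xhtml']
```

Note that manifest `href` values are relative to the OPF file's own directory, so they need to be joined with that directory before reading entries from the archive.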
### EPUB 3 Table of Contents (`nav.xhtml`)
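```xhtml
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>Contents</title></head>
<body>
  <nav epub:type="toc">
    <h1>Contents</h1>
    <ol>
      <li><a href="chapter01.xhtml">Chapter 1</a></li>
      <li><a href="chapter02.xhtml">Chapter 2</a></li>
    </ol>
  </nav>
</body>
</html>
```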
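In EPUB 3 the table of contents is an ordinary XHTML document whose `<nav>` element carries `epub:type="toc"`. A small sketch of extracting its entries follows; the function name `read_nav_toc` and the sample chapter labels are assumptions for illustration, and nested sub-lists are flattened rather than preserved.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"

def read_nav_toc(nav_xhtml):
    """Return [(label, href), ...] from the nav element marked epub:type="toc"."""
    root = ET.fromstring(nav_xhtml)
    for nav in root.iter(f"{{{XHTML}}}nav"):
        if "toc" in nav.get(EPUB_TYPE, "").split():
            return [
                ("".join(a.itertext()).strip(), a.get("href"))
                for a in nav.iter(f"{{{XHTML}}}a")
            ]
    return []

# Hypothetical navigation document used only for this demo.
SAMPLE_NAV = b"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <h1>Contents</h1>
      <ol>
        <li><a href="chapter01.xhtml">Chapter 1</a></li>
        <li><a href="chapter02.xhtml">Chapter 2</a></li>
      </ol>
    </nav>
  </body>
</html>"""

toc = read_nav_toc(SAMPLE_NAV)
print(toc)  # [('Chapter 1', 'chapter01.xhtml'), ('Chapter 2', 'chapter02.xhtml')]
```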
## Creating EPUB from Markdown with Pandoc
Pandoc is a versatile command-line converter that can generate EPUB from Markdown and many other input formats:
```bash
# Basic EPUB from a Markdown file
pandoc input.md -o book.epub
# With metadata and cover
pandoc input.md \
  --metadata title="My Book" \
  --metadata author="Your Name" \
  --metadata lang="en" \
  --epub-cover-image=cover.jpg \
  --css=styles/epub.css \
  -o book.epub
# From multiple chapters
pandoc chapter01.md chapter02.md chapter03.md \
  --metadata title="Complete Novel" \
  --toc --toc-depth=2 \
  -o novel.epub
# With separate metadata file (metadata.yaml)
pandoc input.md --metadata-file=metadata.yaml -o book.epub
```
**Sample `metadata.yaml`:**
```yaml
---
title: 'The Book Title'
author:
  - 'Main Author'
  - 'Co-Author'
date: '2024-01-15'
lang: en
description: 'Book description for catalogs.'
rights: 'Copyright © 2024'
publisher: 'Example Publishing'
---
```
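When the build is part of a script, the same Pandoc options can be assembled programmatically. The sketch below only constructs the argument list from flags already shown in this section; `pandoc_epub_command` and `run_pandoc` are hypothetical helper names, and the file names are placeholders.

```python
import shutil
import subprocess

def pandoc_epub_command(sources, output, metadata_file=None, cover=None, css=None, toc_depth=None):
    """Build the argv list for a pandoc EPUB build (nothing is executed here)."""
    cmd = ["pandoc", *sources]
    if metadata_file:
        cmd.append(f"--metadata-file={metadata_file}")
    if cover:
        cmd.append(f"--epub-cover-image={cover}")
    if css:
        cmd.append(f"--css={css}")
    if toc_depth is not None:
        cmd += ["--toc", f"--toc-depth={toc_depth}"]
    return cmd + ["-o", output]

def run_pandoc(cmd):
    """Run the command, failing early if pandoc is not on PATH."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(cmd, check=True)

# Placeholder inputs; the command is printed, not run.
cmd = pandoc_epub_command(
    ["chapter01.md", "chapter02.md"],
    "novel.epub",
    metadata_file="metadata.yaml",
    toc_depth=2,
)
print(" ".join(cmd))
# pandoc chapter01.md chapter02.md --metadata-file=metadata.yaml --toc --toc-depth=2 -o novel.epub
```

Passing a list to `subprocess.run` avoids shell quoting problems with titles or paths that contain spaces.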
## Converting with Calibre CLI (`ebook-convert`)
```bash
# EPUB → PDF
ebook-convert book.epub book.pdf
# EPUB → MOBI (Kindle)
ebook-convert book.epub book.mobi
# EPUB → AZW3 (modern Kindle)
ebook-convert book.epub book.azw3
# DOCX → EPUB with options
ebook-convert book.docx book.epub \
  --title "My Book" \
  --authors "Author" \
  --language en \
  --cover cover.jpg \
  --output-profile tablet
# HTML → EPUB
ebook-convert page.html book.epub --no-default-epub-cover
```
The **Calibre GUI** offers the same conversions visually, which is ideal if you prefer a graphical interface.
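For a whole folder of books, `ebook-convert` can be driven from a short script. This is a sketch under the assumption that every `.epub` in one directory should be converted to the same target format; `conversion_commands` is a hypothetical helper that only builds the commands.

```python
import tempfile
from pathlib import Path

def conversion_commands(folder, target_ext="azw3", extra_args=()):
    """Return one ebook-convert argv per .epub file found in folder."""
    commands = []
    for src in sorted(Path(folder).glob("*.epub")):
        dst = src.with_suffix(f".{target_ext}")
        commands.append(["ebook-convert", str(src), str(dst), *extra_args])
    return commands

# Demo with empty placeholder files; nothing is converted here.
demo_dir = Path(tempfile.mkdtemp())
for name in ("a.epub", "b.epub", "notes.txt"):
    (demo_dir / name).touch()

commands = conversion_commands(demo_dir, "pdf")
for argv in commands:
    print(" ".join(argv))
```

Each returned list can be handed to `subprocess.run(argv, check=True)` once Calibre is installed and `ebook-convert` is on the `PATH`.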
## Creating EPUB with Python (`ebooklib`)
```python
from ebooklib import epub
# 1. Create book
book = epub.EpubBook()
book.set_identifier('id-my-book-001')
book.set_title('My Ebook with Python')
book.set_language('en')
book.add_author('Your Name')
# 2. Create chapter
chapter = epub.EpubHtml(
    title='Chapter 1: Start',
    file_name='ch01.xhtml',
    lang='en'
)
chapter.content = '''
<h1>Chapter 1: Start</h1>
<p>This is the content of the first chapter.</p>
<p>You can include full HTML here.</p>
'''
# 3. Add CSS
style = epub.EpubItem(
    uid="style",
    file_name="style/main.css",
    media_type="text/css",
    content=b'body { font-family: Georgia, serif; line-height: 1.6; }'
)
# Link the stylesheet from the chapter
chapter.add_item(style)
# 4. Add to book
book.add_item(chapter)
book.add_item(style)
# 5. Define table of contents and spine
book.toc = (epub.Link('ch01.xhtml', 'Chapter 1', 'ch01'),)
book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())
book.spine = ['nav', chapter]
# 6. Save
epub.write_epub('my_ebook.epub', book)
print("EPUB generated successfully")
```
## Validating EPUB (`epubcheck`)
`epubcheck` is the official W3C validator:
```bash
# Install (requires Java)
# Download from: https://github.com/w3c/epubcheck/releases
# Validate
java -jar epubcheck.jar book.epub
# Expected output (no errors):
# No errors or warnings detected.
# Messages: 0 fatals / 0 errors / 0 warnings / 0 infos
```
Common errors:
- **PKG-006** / **PKG-007**: `mimetype` file is missing, not first, or compressed
- **RSC-008**: reference to a file not declared in the manifest
- **NCX-003**: missing `toc.ncx` file for EPUB 2 compatibility
## EPUB 2 vs EPUB 3
| Feature | EPUB 2 | EPUB 3 |
|---|---|---|
| First published | 2007 (2.0.1 in 2010) | 2011 (latest revision: EPUB 3.3, 2023) |
| Table of contents | NCX (XML) | nav.xhtml + optional NCX |
| HTML | XHTML 1.1 | HTML5 in XHTML syntax (XHTML5) |
| MathML / SVG | Limited | Full support |
| Audio/Video | No | Yes (HTML5 media) |
| JavaScript | No | Yes (limited by device) |
| CSS | Basic CSS 2.1 | CSS3 (support varies by reader) |
| Reader compatibility | Universal | High (Kindle still requires conversion, e.g. to AZW3) |
## Recommended Tools
| Tool | Use | Free |
|---|---|---|
| **Pandoc** | Convert Markdown/DOCX/HTML → EPUB | Yes |
| **Calibre** | Convert between all ebook formats | Yes |
| **Sigil** | Visual EPUB editor | Yes |
| **ebooklib** | Create EPUB with Python | Yes |
| **epubcheck** | Validate EPUB (official W3C validator) | Yes |
| **Vellum** (macOS) | Professional publishing | Paid |
| **Atticus** | Layout for self-publishing | Paid |
## Convert EPUB Online
For quick conversions without installing software, KaijuConverter supports EPUB as both a source and a destination format. Convert EPUB to PDF, DOCX, or between ebook formats directly in your browser.