How to Merge PDFs with PyMuPDF: A Complete Guide
Jamie Lemon·July 9, 2025

PDF merging is a common task in document processing workflows, whether you're combining reports, consolidating research papers, or creating comprehensive documentation. While there are many tools available for this task, PyMuPDF stands out as a powerful Python library that offers precise control over PDF manipulation.
In this guide, we'll explore how to merge PDFs using PyMuPDF, from basic concatenation to advanced techniques with custom page ranges and bookmarks.
Basic PDF Merging
Let's start with the simplest case: merging two or more PDFs into a single document.
import pymupdf

def merge_pdfs(pdf_list, output_path):
    """
    Merge multiple PDFs into a single document
    Args:
        pdf_list: List of PDF file paths to merge
        output_path: Path for the output merged PDF
    """
    # Create a new, empty PDF document
    merged_pdf = pymupdf.open()
    # Iterate through each PDF file
    for pdf_path in pdf_list:
        # Open the PDF
        pdf_document = pymupdf.open(pdf_path)
        # Append all pages from the current PDF
        merged_pdf.insert_pdf(pdf_document)
        # Close the current PDF
        pdf_document.close()
    # Save the merged PDF
    merged_pdf.save(output_path)
    merged_pdf.close()

# Example usage
pdf_files = ['document1.pdf', 'document2.pdf', 'document3.pdf']
merge_pdfs(pdf_files, 'merged_document.pdf')
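The order of pdf_list determines the page order of the output, so it helps to build that list deliberately. As a minimal sketch, the helper below gathers the PDF files in a folder and sorts them by name using only the standard library. The name collect_pdfs is hypothetical and not part of PyMuPDF.

```python
from pathlib import Path

def collect_pdfs(folder):
    """Return the paths of all .pdf files in folder, sorted by file name.

    Hypothetical helper for illustration; it only inspects file names
    and does not check that the files are valid PDFs.
    """
    paths = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    ]
    # Sort case-insensitively so "a.PDF" comes before "b.pdf"
    return [str(p) for p in sorted(paths, key=lambda p: p.name.lower())]
```

The returned list can be passed directly to merge_pdfs as its pdf_list argument.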
Merging Specific Page Ranges
Sometimes you don't want to merge entire documents, only specific pages. The insert_pdf method accepts from_page and to_page arguments for this; both are 0-indexed and the range is inclusive:
import pymupdf

def merge_pdf_pages(pdf_info, output_path):
    """
    Merge specific pages from multiple PDFs
    Args:
        pdf_info: List of tuples (pdf_path, start_page, end_page)
        output_path: Path for the output merged PDF
    """
    merged_pdf = pymupdf.open()
    for pdf_path, start_page, end_page in pdf_info:
        pdf_document = pymupdf.open(pdf_path)
        # Insert specific page range (0-indexed, inclusive)
        merged_pdf.insert_pdf(
            pdf_document,
            from_page=start_page,
            to_page=end_page
        )
        pdf_document.close()
    merged_pdf.save(output_path)
    merged_pdf.close()

# Example:
page_ranges = [
    ('document1.pdf', 0, 1),  # First 2 pages
    ('document2.pdf', 3, 6)   # Pages 4-7
]
merge_pdf_pages(page_ranges, 'custom_merged.pdf')
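Page ranges usually arrive as the 1-based numbers a reader sees in a viewer, while the tuples above are 0-indexed. The following sketch converts between the two and clamps the end of the range to the document length. The function name and its error behavior are assumptions made for illustration, not part of PyMuPDF.

```python
def to_zero_based_range(first_page, last_page, page_count):
    """Convert a 1-based inclusive page range to 0-based (from_page, to_page).

    Hypothetical helper: raises ValueError for ranges that cannot be
    satisfied and clamps last_page to the last page of the document.
    """
    if page_count < 1:
        raise ValueError("document has no pages")
    if first_page < 1 or last_page < first_page:
        raise ValueError(f"invalid page range {first_page}-{last_page}")
    if first_page > page_count:
        raise ValueError(f"page {first_page} is beyond the last page ({page_count})")
    # Clamp the end of the range to the document length
    last_page = min(last_page, page_count)
    return first_page - 1, last_page - 1
```

For a 10-page document, pages 4-7 map to (3, 6), which matches the second tuple in the example above.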
Advanced Merging with Error Handling
In production environments, you'll want robust error handling:
import os
import pymupdf

def merge_pdfs_robust(pdf_list, output_path, include_bookmarks=True):
    """
    Robustly merge PDFs with error handling and optional bookmark preservation
    Args:
        pdf_list: List of PDF file paths to merge
        output_path: Path for the output merged PDF
        include_bookmarks: Whether to preserve bookmarks from source PDFs
    """
    merged_pdf = pymupdf.open()
    try:
        for pdf_path in pdf_list:
            # Check if file exists
            if not os.path.exists(pdf_path):
                print(f"Warning: File {pdf_path} not found, skipping...")
                continue
            try:
                # The context manager closes the source even if an error occurs
                with pymupdf.open(pdf_path) as pdf_document:
                    # Check if PDF has pages
                    if pdf_document.page_count == 0:
                        print(f"Warning: {pdf_path} has no pages, skipping...")
                        continue
                    # Get current page count for bookmark offset
                    current_page_count = merged_pdf.page_count
                    # Insert all pages
                    merged_pdf.insert_pdf(pdf_document)
                    # Handle bookmarks if requested
                    if include_bookmarks:
                        try:
                            toc = pdf_document.get_toc()
                            if toc:
                                # Adjust bookmark page numbers (1-based) for merged document
                                adjusted_toc = [
                                    [level, title, page + current_page_count]
                                    for level, title, page in toc
                                ]
                                # Get existing TOC and extend it
                                existing_toc = merged_pdf.get_toc()
                                existing_toc.extend(adjusted_toc)
                                merged_pdf.set_toc(existing_toc)
                        except Exception as e:
                            print(f"Warning: Could not process bookmarks for {pdf_path}: {e}")
                print(f"Successfully merged: {pdf_path}")
            except Exception as e:
                print(f"Error processing {pdf_path}: {str(e)}")
                continue
        # Save the merged PDF
        if merged_pdf.page_count > 0:
            merged_pdf.save(output_path)
            print(f"Merged PDF saved to: {output_path}")
            print(f"Total pages: {merged_pdf.page_count}")
        else:
            print("No pages to merge!")
    except Exception as e:
        print(f"Error during merge process: {str(e)}")
    finally:
        merged_pdf.close()

# Example usage
pdf_files = ['report1.pdf', 'report2.pdf', 'appendix.pdf']
merge_pdfs_robust(pdf_files, 'final_report.pdf')
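The bookmark adjustment is the easiest part of this function to get wrong, so it can be worth isolating. The sketch below pulls it into a small pure function that works on the [level, title, page] lists returned by get_toc(). The name shift_toc is hypothetical, and leaving page numbers below 1 untouched rests on the assumption that such values mark bookmarks without a target page.

```python
def shift_toc(toc, page_offset):
    """Return a copy of toc with every page number increased by page_offset.

    toc is a list of [level, title, page] entries with 1-based page numbers.
    Hypothetical helper for illustration; the input list is not modified.
    """
    shifted = []
    for entry in toc:
        level, title, page = entry[:3]
        # Assumption: a page number below 1 means "no target page",
        # so it is kept as-is instead of being shifted.
        new_page = page + page_offset if page > 0 else page
        shifted.append([level, title, new_page])
    return shifted

source_toc = [[1, "Introduction", 1], [2, "Details", 3]]
print(shift_toc(source_toc, 10))  # [[1, 'Introduction', 11], [2, 'Details', 13]]
```

Here page_offset plays the role of current_page_count, the number of pages already in the merged document before the source was appended.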
Merging with Custom Page Insertion
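For more control over where pages are inserted, pass start_at, the 0-indexed position in the target document at which the first inserted page should land: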
import pymupdf

def merge_with_custom_insertion(base_pdf, insertions, output_path):
    """
    Merge PDFs with custom insertion points
    Args:
        base_pdf: Path to the base PDF document
        insertions: List of tuples (pdf_path, insert_at_page), where
            insert_at_page is the 0-indexed position in the base document
            at which the first inserted page lands
        output_path: Path for the output merged PDF
    """
    # Open the base document
    merged_pdf = pymupdf.open(base_pdf)
    # Process insertions by page number (descending) to avoid page number shifts;
    # sorted() leaves the caller's list unchanged
    ordered = sorted(insertions, key=lambda x: x[1], reverse=True)
    for pdf_path, insert_at_page in ordered:
        insert_pdf = pymupdf.open(pdf_path)
        # Insert at specified page
        merged_pdf.insert_pdf(
            insert_pdf,
            start_at=insert_at_page
        )
        insert_pdf.close()
    merged_pdf.save(output_path)
    merged_pdf.close()

# Example: Insert cover.pdf at page 1, insert appendix.pdf at page 9
insertions = [
    ('cover.pdf', 0),
    ('appendix.pdf', 8)
]
merge_with_custom_insertion('main_document.pdf', insertions, 'complete_document.pdf')
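To see why the insertions are processed from the highest position down, it can help to simulate the same logic with plain Python lists. This is an illustrative sketch only: the page labels and the insert_all function are made up, and no PDF is involved.

```python
def insert_all(base_pages, insertions):
    """Simulate inserting lists of pages into base_pages at 0-based positions.

    Positions refer to the original base document, so the highest
    position is handled first and earlier positions never shift.
    """
    pages = list(base_pages)
    ordered = sorted(insertions, key=lambda item: item[1], reverse=True)
    for new_pages, position in ordered:
        # Slice assignment inserts new_pages before index `position`
        pages[position:position] = new_pages
    return pages

base = ["p1", "p2", "p3"]
result = insert_all(base, [(["cover"], 0), (["appendix"], 3)])
print(result)  # ['cover', 'p1', 'p2', 'p3', 'appendix']
```

If the cover were inserted first instead, every later page would move down by one, and the appendix inserted at position 3 would end up before p3 rather than after it.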
Performance Optimization for Large Files
When working with large PDFs, one option is to copy pages in fixed-size chunks rather than all at once:
import pymupdf

def merge_large_pdfs(pdf_list, output_path, chunk_size=10):
    """
    Merge large PDFs with memory optimization
    Args:
        pdf_list: List of PDF file paths to merge
        output_path: Path for the output merged PDF
        chunk_size: Number of pages to process at once
    """
    merged_pdf = pymupdf.open()
    for pdf_path in pdf_list:
        pdf_document = pymupdf.open(pdf_path)
        total_pages = pdf_document.page_count
        # Process in chunks to manage memory
        for start_page in range(0, total_pages, chunk_size):
            end_page = min(start_page + chunk_size - 1, total_pages - 1)
            # Create temporary document for this chunk
            temp_doc = pymupdf.open()
            temp_doc.insert_pdf(pdf_document, from_page=start_page, to_page=end_page)
            # Insert chunk into merged document
            merged_pdf.insert_pdf(temp_doc)
            # Clean up temporary document
            temp_doc.close()
        pdf_document.close()
    merged_pdf.save(output_path)
    merged_pdf.close()

merge_large_pdfs(["large-doc-1.pdf", "large-doc-2.pdf"], "output.pdf")
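The chunk boundaries are simple arithmetic, but off-by-one errors are easy to make because to_page is inclusive. As a small sketch, the generator below reproduces the start_page and end_page calculation from the loop above so it can be checked on its own. The name chunk_ranges is hypothetical.

```python
def chunk_ranges(total_pages, chunk_size):
    """Yield 0-based inclusive (start_page, end_page) pairs covering all pages."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start_page in range(0, total_pages, chunk_size):
        # The last chunk may be shorter than chunk_size
        end_page = min(start_page + chunk_size - 1, total_pages - 1)
        yield start_page, end_page

print(list(chunk_ranges(25, 10)))  # [(0, 9), (10, 19), (20, 24)]
```

Each pair can be passed as from_page and to_page, and together the pairs cover every page exactly once.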
Adding Metadata to Merged PDFs
Don't forget to add proper metadata to your merged documents:
import pymupdf

def merge_with_metadata(pdf_list, output_path, metadata=None):
    """
    Merge PDFs and add custom metadata
    Args:
        pdf_list: List of PDF file paths to merge
        output_path: Path for the output merged PDF
        metadata: Dictionary of metadata to add
    """
    merged_pdf = pymupdf.open()
    # Merge the PDFs
    for pdf_path in pdf_list:
        pdf_document = pymupdf.open(pdf_path)
        merged_pdf.insert_pdf(pdf_document)
        pdf_document.close()
    # Add metadata
    if metadata:
        merged_pdf.set_metadata(metadata)
    else:
        # Default metadata
        default_metadata = {
            'title': 'Merged PDF Document',
            'author': 'PyMuPDF Merger',
            'subject': 'Combined PDF files',
            'creator': 'Python PyMuPDF',
            'producer': 'PyMuPDF Library'
        }
        merged_pdf.set_metadata(default_metadata)
    merged_pdf.save(output_path)
    merged_pdf.close()

# Example with custom metadata
custom_metadata = {
    'title': 'Annual Report 2024',
    'author': 'Your Company',
    'subject': 'Financial and operational results',
    'keywords': 'annual report, financial, operations'
}
merge_with_metadata(['q1.pdf', 'q2.pdf', 'q3.pdf', 'q4.pdf'],
                    'annual_report_2024.pdf',
                    custom_metadata)
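In the function above, the defaults apply only when no metadata is passed at all, so a custom dictionary that omits a key such as 'creator' gets no default for it. One possible variation, sketched here with a hypothetical build_metadata helper, is to layer the custom values over the defaults before calling set_metadata:

```python
DEFAULT_METADATA = {
    'title': 'Merged PDF Document',
    'author': 'PyMuPDF Merger',
    'subject': 'Combined PDF files',
    'creator': 'Python PyMuPDF',
    'producer': 'PyMuPDF Library',
}

def build_metadata(custom=None):
    """Return the default metadata with any custom values layered on top.

    Hypothetical helper for illustration; always returns a new dictionary.
    """
    metadata = dict(DEFAULT_METADATA)
    metadata.update(custom or {})
    return metadata

merged = build_metadata({'title': 'Annual Report 2024'})
print(merged['title'])    # Annual Report 2024
print(merged['creator'])  # Python PyMuPDF
```

The resulting dictionary uses the same keys as the examples above and could be passed as the metadata argument of merge_with_metadata.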
Best Practices and Tips
1. Always close documents: PyMuPDF documents should be explicitly closed to free memory.
2. Handle encrypted PDFs: Check if PDFs are encrypted and handle authentication; authenticate returns 0 when the password is wrong:
if pdf_document.needs_pass:
    if not pdf_document.authenticate("password"):
        raise ValueError("Incorrect password")
3. Validate input files: Always check if files exist and are valid PDFs before processing.
4. Memory management: For large documents, consider processing in chunks or using temporary files.
5. Preserve document structure: Use bookmark preservation when merging related documents.
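For tip 3, a cheap pre-check can reject obviously wrong inputs before they are opened. The sketch below uses only the standard library and looks for the %PDF- header marker near the start of the file. The function name is hypothetical, the 1024-byte window is an assumption based on how lenient PDF readers commonly behave, and passing this check does not guarantee that the file is a valid PDF.

```python
import os

def looks_like_pdf(path):
    """Return True if path is an existing file that starts like a PDF.

    Hypothetical helper: this is a quick header check, not full validation.
    """
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as f:
        # Assumption: the header marker appears within the first 1024 bytes
        return b"%PDF-" in f.read(1024)
```

Files that fail this check can be skipped with a warning, in the same way missing files are skipped in merge_pdfs_robust.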
Conclusion
PyMuPDF provides a powerful and flexible way to merge PDF documents in Python. Whether you need basic concatenation or advanced merging with custom page ranges and metadata, PyMuPDF has you covered. The library's performance and reliability make it an excellent choice for both simple scripts and production applications.
The examples in this guide should give you a solid foundation for implementing PDF merging in your own projects. Remember to always handle errors gracefully and consider memory usage when working with large documents.
For more advanced PDF manipulation features, explore PyMuPDF's extensive documentation and consider combining merging with other operations like page rotation, annotation handling, and text extraction.