How to Merge PDFs with PyMuPDF: A Complete Guide

Jamie Lemon · July 9, 2025


PDF merging is a common task in document processing workflows, whether you're combining reports, consolidating research papers, or creating comprehensive documentation. While there are many tools available for this task, PyMuPDF stands out as a powerful Python library that offers precise control over PDF manipulation.

In this guide, we'll explore how to merge PDFs using PyMuPDF, from basic concatenation to advanced techniques with custom page ranges and bookmarks.

Basic PDF Merging

Let's start with the simplest case: merging two or more PDFs into a single document.

import pymupdf

def merge_pdfs(pdf_list, output_path):
    """
    Merge multiple PDFs into a single document
    
    Args:
        pdf_list: List of PDF file paths to merge
        output_path: Path for the output merged PDF
    """
    # Create a new PDF document
    merged_pdf = pymupdf.open()
    
    # Iterate through each PDF file
    for pdf_path in pdf_list:
        # Open the PDF
        pdf_document = pymupdf.open(pdf_path)
        
        # Insert all pages from the current PDF
        merged_pdf.insert_pdf(pdf_document)
        
        # Close the current PDF
        pdf_document.close()
    
    # Save the merged PDF
    merged_pdf.save(output_path)
    merged_pdf.close()

# Example usage
pdf_files = ['document1.pdf', 'document2.pdf', 'document3.pdf']
merge_pdfs(pdf_files, 'merged_document.pdf')

Merging Specific Page Ranges

Sometimes you don't want to merge entire documents, just specific pages. PyMuPDF makes this easy with the from_page and to_page arguments of insert_pdf, which are 0-based and inclusive:

import pymupdf

def merge_pdf_pages(pdf_info, output_path):
    """
    Merge specific pages from multiple PDFs

    Args:
        pdf_info: List of tuples (pdf_path, start_page, end_page);
            page numbers are 0-based and end_page is inclusive
        output_path: Path for the output merged PDF
    """
    merged_pdf = pymupdf.open()

    for pdf_path, start_page, end_page in pdf_info:
        pdf_document = pymupdf.open(pdf_path)

        # Insert specific page range (0-indexed)
        merged_pdf.insert_pdf(
            pdf_document,
            from_page=start_page,
            to_page=end_page
        )

        pdf_document.close()

    merged_pdf.save(output_path)
    merged_pdf.close()

# Example:
page_ranges = [
    ('document1.pdf', 0, 1),  # First 2 pages
    ('document2.pdf', 3, 6)  # Pages 4-7
]
merge_pdf_pages(page_ranges, 'custom_merged.pdf')
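
Because from_page and to_page are 0-based and inclusive, it's easy to be off by one when the ranges come from people who think in 1-based page numbers. Here is a minimal sketch of a conversion helper. The name parse_page_range is hypothetical (it is not part of PyMuPDF), and the sketch assumes ranges are written like "4-7" or as a single page like "3":

```python
def parse_page_range(spec):
    """Convert a 1-based range such as "4-7" (or "3") to 0-based (start, end).

    Hypothetical helper, not part of PyMuPDF. Both ends are inclusive,
    matching how from_page and to_page are used above.
    """
    first, _, last = spec.partition("-")
    start = int(first)
    end = int(last) if last else start
    if start < 1 or end < start:
        raise ValueError(f"invalid page range: {spec!r}")
    return start - 1, end - 1


# Build the same kind of pdf_info list from human-readable ranges
page_ranges = [
    ("document1.pdf", *parse_page_range("1-2")),
    ("document2.pdf", *parse_page_range("4-7")),
]
print(page_ranges)
```

The resulting list can be passed straight to merge_pdf_pages. Rejecting ranges whose end comes before the start is a choice made for this sketch.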

Advanced Merging with Error Handling

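In production environments, you'll want robust error handling. Because insert_pdf copies pages but not the source document's table of contents, the function below also rebuilds the bookmarks by hand: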

import pymupdf
import os

def merge_pdfs_robust(pdf_list, output_path, include_bookmarks=True):
    """
    Robustly merge PDFs with error handling and optional bookmark preservation

    Args:
        pdf_list: List of PDF file paths to merge
        output_path: Path for the output merged PDF
        include_bookmarks: Whether to preserve bookmarks from source PDFs
    """
    merged_pdf = pymupdf.open()

    try:
        for i, pdf_path in enumerate(pdf_list):
            # Check if file exists
            if not os.path.exists(pdf_path):
                print(f"Warning: File {pdf_path} not found, skipping...")
                continue

            try:
                pdf_document = pymupdf.open(pdf_path)

                # Check if PDF is valid and has pages
                if pdf_document.page_count == 0:
                    print(f"Warning: {pdf_path} has no pages, skipping...")
                    pdf_document.close()
                    continue

                # Get current page count for bookmark offset
                current_page_count = merged_pdf.page_count

                # Insert all pages
                merged_pdf.insert_pdf(pdf_document)

                # Handle bookmarks if requested
                if include_bookmarks:
                    try:
                        toc = pdf_document.get_toc()
                        if toc:
                            # Adjust bookmark page numbers for merged document
                            adjusted_toc = []
                            for level, title, page in toc:
                                adjusted_toc.append([level, title, page + current_page_count])

                            # Get existing TOC and extend it
                            existing_toc = merged_pdf.get_toc()
                            existing_toc.extend(adjusted_toc)
                            merged_pdf.set_toc(existing_toc)
                    except Exception:
                        print(f"Warning: Could not process bookmarks for {pdf_path}")

                pdf_document.close()
                print(f"Successfully merged: {pdf_path}")

            except Exception as e:
                print(f"Error processing {pdf_path}: {str(e)}")
                continue

        # Save the merged PDF
        if merged_pdf.page_count > 0:
            merged_pdf.save(output_path)
            print(f"Merged PDF saved to: {output_path}")
            print(f"Total pages: {merged_pdf.page_count}")
        else:
            print("No pages to merge!")

    except Exception as e:
        print(f"Error during merge process: {str(e)}")

    finally:
        merged_pdf.close()

# Example usage
pdf_files = ['report1.pdf', 'report2.pdf', 'appendix.pdf']
merge_pdfs_robust(pdf_files, 'final_report.pdf')
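
The bookmark step relies on get_toc() returning entries of the form [level, title, page] with 1-based page numbers, so each source entry has to be shifted by the number of pages already in the merged document. Here is a sketch of that adjustment as a standalone function, assuming the simple three-item entry format. The name shift_toc is hypothetical, and leaving page numbers below 1 untouched is an assumption about how to treat bookmarks that have no target inside the document:

```python
def shift_toc(toc, page_offset):
    """Return a copy of a simple TOC with page numbers shifted by page_offset.

    Each entry is [level, title, page] with a 1-based page number.
    Entries whose page is below 1 are kept unchanged (assumed to have
    no target inside the document).
    """
    shifted = []
    for level, title, page in toc:
        new_page = page + page_offset if page >= 1 else page
        shifted.append([level, title, new_page])
    return shifted


# A source TOC appended after 10 already-merged pages
source_toc = [[1, "Introduction", 1], [2, "Scope", 2], [1, "External link", -1]]
print(shift_toc(source_toc, 10))
```

Keeping this logic in its own function makes the offset arithmetic easy to test without opening any PDF.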

Merging with Custom Page Insertion

For more control over where pages are inserted:

import pymupdf

def merge_with_custom_insertion(base_pdf, insertions, output_path):
    """
    Merge PDFs with custom insertion points

    Args:
        base_pdf: Path to the base PDF document
        insertions: List of tuples (pdf_path, insert_at_page), where
            insert_at_page is the 0-based index in the base document
            before which the pages are inserted
        output_path: Path for the output merged PDF
    """
    # Open the base document
    merged_pdf = pymupdf.open(base_pdf)

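    # Process insertions from the highest page index down so that earlier
    # insertions don't shift the positions still to be used
    # (sorted() leaves the caller's list unchanged)
    insertions = sorted(insertions, key=lambda x: x[1], reverse=True)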

    for pdf_path, insert_at_page in insertions:
        insert_pdf = pymupdf.open(pdf_path)

        # Insert at specified page
        merged_pdf.insert_pdf(
            insert_pdf,
            start_at=insert_at_page
        )

        insert_pdf.close()

    merged_pdf.save(output_path)
    merged_pdf.close()

# Example: insert cover.pdf before the first page and appendix.pdf
# before page 9 of main_document.pdf (0-based indices 0 and 8)
insertions = [
    ('cover.pdf', 0),
    ('appendix.pdf', 8)
]
merge_with_custom_insertion('main_document.pdf', insertions, 'complete_document.pdf')
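
Sorting the insertions in descending order matters because each insertion shifts every page that follows it. The effect can be seen without any PDF at all. The sketch below uses plain Python lists as stand-ins for pages. It is an illustration only, with hypothetical names, and is not PyMuPDF code:

```python
def insert_all(base_pages, insertions):
    """Insert page lists into base_pages at indices given relative to the base.

    insertions: list of (pages, index) pairs; index is a 0-based position
    in the original base list. Processing from the highest index down
    means earlier insertions never move the positions still to be used.
    """
    result = list(base_pages)
    ordered = sorted(insertions, key=lambda item: item[1], reverse=True)
    for pages, index in ordered:
        result[index:index] = pages
    return result


base = ["p1", "p2", "p3", "p4"]
merged = insert_all(base, [(["cover"], 0), (["appendix-a", "appendix-b"], 3)])
print(merged)
```

If the same insertions were applied in ascending order, the cover would push every later page back by one, and the appendix would land one page too early.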

Performance Optimization for Large Files

When working with large PDFs, one approach is to copy pages in chunks instead of all at once:

import pymupdf

def merge_large_pdfs(pdf_list, output_path, chunk_size=10):
    """
    Merge large PDFs with memory optimization

    Args:
        pdf_list: List of PDF file paths to merge
        output_path: Path for the output merged PDF
        chunk_size: Number of pages to process at once
    """
    merged_pdf = pymupdf.open()

    for pdf_path in pdf_list:
        pdf_document = pymupdf.open(pdf_path)
        total_pages = pdf_document.page_count

        # Process in chunks to manage memory
        for start_page in range(0, total_pages, chunk_size):
            end_page = min(start_page + chunk_size - 1, total_pages - 1)

            # Create temporary document for this chunk
            temp_doc = pymupdf.open()
            temp_doc.insert_pdf(pdf_document, from_page=start_page, to_page=end_page)

            # Insert chunk into merged document
            merged_pdf.insert_pdf(temp_doc)

            # Clean up temporary document
            temp_doc.close()

        pdf_document.close()

    merged_pdf.save(output_path)
    merged_pdf.close()

# Example usage
merge_large_pdfs(["large-doc-1.pdf", "large-doc-2.pdf"], "output.pdf")
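
The chunk boundaries in the loop above use inclusive end pages, which is what to_page expects. Here is a small sketch that isolates this arithmetic so it can be checked on its own. The name chunk_ranges is hypothetical:

```python
def chunk_ranges(total_pages, chunk_size):
    """Yield (start_page, end_page) pairs covering pages 0..total_pages-1.

    Both ends are 0-based and inclusive, matching from_page and to_page.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start_page in range(0, total_pages, chunk_size):
        end_page = min(start_page + chunk_size - 1, total_pages - 1)
        yield start_page, end_page


# A 25-page document split into chunks of 10 pages
print(list(chunk_ranges(25, 10)))
```

The last chunk is simply shorter when the page count is not a multiple of chunk_size, and a document with no pages yields no chunks.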

Adding Metadata to Merged PDFs

Don't forget to add proper metadata to your merged documents:

import pymupdf

def merge_with_metadata(pdf_list, output_path, metadata=None):
    """
    Merge PDFs and add custom metadata

    Args:
        pdf_list: List of PDF file paths to merge
        output_path: Path for the output merged PDF
        metadata: Dictionary of metadata to add
    """
    merged_pdf = pymupdf.open()

    # Merge the PDFs
    for pdf_path in pdf_list:
        pdf_document = pymupdf.open(pdf_path)
        merged_pdf.insert_pdf(pdf_document)
        pdf_document.close()

    # Add metadata
    if metadata:
        merged_pdf.set_metadata(metadata)
    else:
        # Default metadata
        default_metadata = {
            'title': 'Merged PDF Document',
            'author': 'PyMuPDF Merger',
            'subject': 'Combined PDF files',
            'creator': 'Python PyMuPDF',
            'producer': 'PyMuPDF Library'
        }
        merged_pdf.set_metadata(default_metadata)

    merged_pdf.save(output_path)
    merged_pdf.close()

# Example with custom metadata
custom_metadata = {
    'title': 'Annual Report 2024',
    'author': 'Your Company',
    'subject': 'Financial and operational results',
    'keywords': 'annual report, financial, operations'
}

merge_with_metadata(['q1.pdf', 'q2.pdf', 'q3.pdf', 'q4.pdf'],
                   'annual_report_2024.pdf',
                   custom_metadata)
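
The function above uses either the caller's metadata or the defaults, never a mix of the two. If you'd rather fill in only the missing fields, one option is to combine both dictionaries before calling set_metadata. Here is a minimal sketch, assuming the same default values as above. The name build_metadata is hypothetical:

```python
DEFAULT_METADATA = {
    "title": "Merged PDF Document",
    "author": "PyMuPDF Merger",
    "subject": "Combined PDF files",
    "creator": "Python PyMuPDF",
    "producer": "PyMuPDF Library",
}


def build_metadata(custom=None):
    """Return the defaults with any caller-supplied fields taking precedence."""
    return {**DEFAULT_METADATA, **(custom or {})}


metadata = build_metadata({"title": "Annual Report 2024", "keywords": "annual report"})
print(metadata["title"], "|", metadata["author"])
```

The returned dictionary is a new object each time, so callers can modify it without affecting the defaults.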

Best Practices and Tips

1. Always close documents: Close PyMuPDF documents explicitly, or open them in a with block, so that memory and file handles are released.

2. Handle encrypted PDFs: Check if PDFs are encrypted and handle authentication:

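if pdf_document.needs_pass:
    # authenticate() returns 0 when the password is wrong
    if not pdf_document.authenticate("password"):
        raise ValueError("Incorrect PDF password")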

3. Validate input files: Always check if files exist and are valid PDFs before processing.

4. Memory management: For large documents, consider processing in chunks or using temporary files.

5. Preserve document structure: Use bookmark preservation when merging related documents.
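
For tip 3, a lightweight pre-check can catch missing files and obvious non-PDFs before PyMuPDF opens them. The sketch below uses only the standard library and assumes that a PDF starts with the %PDF- marker at the very beginning of the file. Some real-world PDFs carry a few junk bytes before the marker, so treat a failed check as a warning and not as proof of corruption. The names looks_like_pdf and filter_pdfs are hypothetical:

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if path is an existing file that begins with b"%PDF-"."""
    file_path = Path(path)
    if not file_path.is_file():
        return False
    with file_path.open("rb") as handle:
        return handle.read(5) == b"%PDF-"


def filter_pdfs(paths):
    """Split paths into (usable, rejected) lists, preserving input order."""
    usable, rejected = [], []
    for path in paths:
        (usable if looks_like_pdf(path) else rejected).append(path)
    return usable, rejected
```

Passing only the usable list to a merge function keeps the merge loop free of file-system checks, while the rejected list can be logged or reported to the user.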

Conclusion

PyMuPDF provides a powerful and flexible way to merge PDF documents in Python. Whether you need basic concatenation or advanced merging with custom page ranges and metadata, PyMuPDF has you covered. The library's performance and reliability make it an excellent choice for both simple scripts and production applications.

The examples in this guide should give you a solid foundation for implementing PDF merging in your own projects. Remember to always handle errors gracefully and consider memory usage when working with large documents.

For more advanced PDF manipulation features, explore PyMuPDF's extensive documentation and consider combining merging with other operations like page rotation, annotation handling, and text extraction.