Split each PDF page in two?

I have a large number of PDF files which have two slides to a page (for printing).

The format is A4 pages, each with two slides set up like so:

-----------
| slide 1 |
-----------
| slide 2 |
-----------

How can I generate a new PDF file with one slide per page?

I'm happy to use a GUI, a CLI, scripts, or even a language's PDF library, but I do need the text on the slides to remain selectable.



Solution 1:[1]

PDF Scissors allowed me to bulk split (crop) all pages in a PDF.

Solution 2:[2]

mutool works brilliantly for this. The example below chops each page of input.pdf into 3 horizontal and 8 vertical parts, creating 24 output pages for each input page:

mutool poster -x 3 -y 8 input.pdf output.pdf

To install mutool, just install mupdf, which is probably packaged with most GNU/Linux distributions.

(Credits to marttt.)

On Debian-based Linux systems such as Ubuntu, you can install it using

sudo apt install mupdf
sudo apt install mupdf-tools
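
For the layout in the question (two slides stacked on a portrait page), the matching call would presumably be -x 1 -y 2. Since the question mentions a large number of files, here is a minimal Python sketch that runs the same command over a whole folder. The function names and folder arguments are made up for illustration, and it assumes mutool is on the PATH:

```python
import subprocess
from pathlib import Path


def poster_command(src, dst, columns=1, rows=2):
    """Build the argument list for `mutool poster` (defaults: two stacked halves)."""
    return ["mutool", "poster", "-x", str(columns), "-y", str(rows), str(src), str(dst)]


def split_directory(in_dir, out_dir, columns=1, rows=2):
    """Run `mutool poster` on every PDF in in_dir, writing results to out_dir.

    Assumes the mutool executable is installed and on the PATH.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.pdf")):
        # check=True raises CalledProcessError if mutool reports a failure
        subprocess.run(poster_command(src, out_dir / src.name, columns, rows), check=True)
```

For example, split_directory("slides", "slides_split") would write one split copy of each PDF in the hypothetical slides folder into slides_split.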

Solution 3:[3]

You can use a Python library called PyPDF. This function will split double pages no matter what the page orientation is:

import copy
import math
import pyPdf

def split_pages(src, dst):
    src_f = open(src, 'rb')
    dst_f = open(dst, 'wb')

    input = pyPdf.PdfFileReader(src_f)
    output = pyPdf.PdfFileWriter()

    for i in range(input.getNumPages()):
        p = input.getPage(i)
        q = copy.copy(p)
        q.mediaBox = copy.copy(p.mediaBox)

        x1, x2 = p.mediaBox.lowerLeft
        x3, x4 = p.mediaBox.upperRight

        x1, x2 = math.floor(x1), math.floor(x2)
        x3, x4 = math.floor(x3), math.floor(x4)
        x5, x6 = math.floor(x3/2), math.floor(x4/2)

        if x3 > x4:
            # horizontal
            p.mediaBox.upperRight = (x5, x4)
            p.mediaBox.lowerLeft = (x1, x2)

            q.mediaBox.upperRight = (x3, x4)
            q.mediaBox.lowerLeft = (x5, x2)
        else:
            # vertical
            p.mediaBox.upperRight = (x3, x4)
            p.mediaBox.lowerLeft = (x1, x6)

            q.mediaBox.upperRight = (x3, x6)
            q.mediaBox.lowerLeft = (x1, x2)

        output.addPage(p)
        output.addPage(q)

    output.write(dst_f)
    src_f.close()
    dst_f.close()

Solution 4:[4]

Briss is "a simple cross-platform (Linux, Windows, Mac OSX) application for cropping PDF files. A simple user interface lets you define exactly the crop-region by fitting a rectangle on the visually overlaid pages." It's open source (GPL).

Works well for me. The GUI is minimal, but functional. It can also be used from the command line.

Solution 5:[5]

Thanks to Matt Gumbley for his Python script. I have modified it so that it also works with PDFs that mix portrait and landscape pages, and with pages that have already been cropped:

# -*- coding: utf-8 -*-
"""
Created on Thu Feb 26 08:49:39 2015

@author: Matt Gumbley  (stackoverflow)
changed by Hanspeter Schmid to deal with already cropped pages
"""

import copy
import math
from PyPDF2 import PdfFileReader, PdfFileWriter

def split_pages2(src, dst):
    src_f = open(src, 'rb')
    dst_f = open(dst, 'wb')

    input = PdfFileReader(src_f)
    output = PdfFileWriter()

    for i in range(input.getNumPages()):
        # make two copies of the input page
        pp = input.getPage(i)
        p = copy.copy(pp)
        q = copy.copy(pp)

        # the new media boxes are the previous crop boxes
        p.mediaBox = copy.copy(p.cropBox)
        q.mediaBox = copy.copy(p.cropBox)

        x1, x2 = p.mediaBox.lowerLeft
        x3, x4 = p.mediaBox.upperRight

        x1, x2 = math.floor(x1), math.floor(x2)
        x3, x4 = math.floor(x3), math.floor(x4)
        x5, x6 = x1+math.floor((x3-x1)/2), x2+math.floor((x4-x2)/2)

        if (x3-x1) > (x4-x2):
            # horizontal
            q.mediaBox.upperRight = (x5, x4)
            q.mediaBox.lowerLeft = (x1, x2)

            p.mediaBox.upperRight = (x3, x4)
            p.mediaBox.lowerLeft = (x5, x2)
        else:
            # vertical
            p.mediaBox.upperRight = (x3, x4)
            p.mediaBox.lowerLeft = (x1, x6)

            q.mediaBox.upperRight = (x3, x6)
            q.mediaBox.lowerLeft = (x1, x2)


        p.artBox = p.mediaBox
        p.bleedBox = p.mediaBox
        p.cropBox = p.mediaBox

        q.artBox = q.mediaBox
        q.bleedBox = q.mediaBox
        q.cropBox = q.mediaBox

        output.addPage(q)
        output.addPage(p)


    output.write(dst_f)
    src_f.close()
    dst_f.close()
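
The box arithmetic in the script above can be tried out on its own, without a PDF library. This is only a sketch: split_box is a hypothetical helper, not part of PyPDF2. It returns the upper half first for portrait pages, so slide 1 comes before slide 2 in the question's layout, whereas the script above, as far as I can tell, writes the lower half first in that case.

```python
import math


def split_box(left, bottom, right, top):
    """Split a page box into two halves, each as (left, bottom, right, top).

    Landscape boxes are cut into (left half, right half); all other boxes
    are cut into (upper half, lower half). The midpoint is measured from the
    box's own lower-left corner, so boxes that do not start at (0, 0) work too.
    """
    width, height = right - left, top - bottom
    if width > height:
        mid = left + math.floor(width / 2)
        return (left, bottom, mid, top), (mid, bottom, right, top)
    mid = bottom + math.floor(height / 2)
    return (left, mid, right, top), (left, bottom, right, mid)
```

For an A4 portrait page of 595 x 842 points, split_box(0, 0, 595, 842) gives (0, 421, 595, 842) for the upper half and (0, 0, 595, 421) for the lower half.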

Solution 6:[6]

Here is how I did it with pdfrw:

import pdfrw
writer = pdfrw.PdfWriter()
for page in pdfrw.PdfReader('input.pdf').pages:
    for y in [0, 0.5]:
        newpage = pdfrw.PageMerge()
        newpage.add(page, viewrect=(0, y, 1, 0.5))
        writer.addpages([newpage.render()])
writer.write('output.pdf')

Short and working!

If you want the output rotated (for example, A4 portrait input producing two A5 portrait pages rather than landscape ones):

import pdfrw
writer = pdfrw.PdfWriter()
for page in pdfrw.PdfReader('input.pdf').pages:
    for y in [0, 0.5]:
        newpage = pdfrw.PageMerge()
        newpage.add(page, viewrect=(0, y, 1, 0.5))
        p = newpage.render()
        p.Rotate = 270
        writer.addpages([p])
writer.write('output.pdf')
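
To cut each page into more than two strips, the viewrect tuples can be generated instead of hard-coded. A minimal sketch, assuming (as the calls above suggest) that pdfrw reads viewrect values up to 1 as fractions of the page; strip_viewrects is a made-up helper name:

```python
def strip_viewrects(count):
    """Return `count` full-width strips as (x, y, width, height) fractions."""
    height = 1 / count
    return [(0, i * height, 1, height) for i in range(count)]
```

With count=2 this yields the same two rectangles as the loop over [0, 0.5] above, so each tuple can be passed as viewrect to newpage.add.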

Solution 7:[7]

If using a Java or .NET library is OK for you, you can use iText / iTextSharp.

An example for tiling an existing document can be found in the book iText in Action, 2nd edition, in the freely available chapter 6: TilingHero.java / TilingHero.cs.

Solution 8:[8]

Thanks to moraes for that answer. In my case, the resulting PDF looked fine in Adobe Reader and Mac Preview, but did not appear to have been split into separate pages at all when viewed on iOS. I used Python 2.7.8 and PyPDF2, and modified the script as follows, which worked fine. I also reordered the pages left/right, rather than right/left.

import copy
import math
from PyPDF2 import PdfFileReader, PdfFileWriter

def split_pages(src, dst):
    src_f = open(src, 'rb')
    dst_f = open(dst, 'wb')

    input = PdfFileReader(src_f)
    output = PdfFileWriter()

    for i in range(input.getNumPages()):
        p = input.getPage(i)
        q = copy.copy(p)
        q.mediaBox = copy.copy(p.mediaBox)

        x1, x2 = p.mediaBox.lowerLeft
        x3, x4 = p.mediaBox.upperRight

        x1, x2 = math.floor(x1), math.floor(x2)
        x3, x4 = math.floor(x3), math.floor(x4)
        x5, x6 = math.floor(x3/2), math.floor(x4/2)

        if x3 > x4:
            # horizontal
            p.mediaBox.upperRight = (x5, x4)
            p.mediaBox.lowerLeft = (x1, x2)

            q.mediaBox.upperRight = (x3, x4)
            q.mediaBox.lowerLeft = (x5, x2)
        else:
            # vertical
            p.mediaBox.upperRight = (x3, x4)
            p.mediaBox.lowerLeft = (x1, x6)

            q.mediaBox.upperRight = (x3, x6)
            q.mediaBox.lowerLeft = (x1, x2)


        p.artBox = p.mediaBox
        p.bleedBox = p.mediaBox
        p.cropBox = p.mediaBox

        q.artBox = q.mediaBox
        q.bleedBox = q.mediaBox
        q.cropBox = q.mediaBox

        output.addPage(q)
        output.addPage(p)

    output.write(dst_f)
    src_f.close()
    dst_f.close()

Solution 9:[9]

Try BRISS.

It lets you split each page into as many subpages as you want by defining regions with a GUI. It groups all similar pages into groups for you, so you can define regions for that group once.

It's cross-platform, free, and open-source.

(copy-pasted from https://superuser.com/a/235327/35237)

Solution 10:[10]

With mupdf-1.8-windows-x64 in the Windows 10 command prompt, you need 'poster ' (followed by a space, without the quotes) before the horizontal parameter (-x). For example, for a double-page scan to PDF:

mutool poster -x 2 -y 1 C:\Users\alfie\Documents\SNM\The_Ultimate_Medicine.pdf C:\Users\alfie\Documents\ebooks\The_Ultimate_Medicine.pdf

What a wonderful tool! Thank you so much! (And the output file, at ~9 MB, is only 52 KB bigger than the original!)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Nick
Solution 2 jaggi
Solution 3 moraes
Solution 4 Nicolas Payette
Solution 5 Hanspeter Schmid
Solution 6
Solution 7 mkl
Solution 8 Matt Gumbley
Solution 9
Solution 10 ketan