All About PDF - Your PDF Toolkit

How To Split PDF By Document Contents

Splitting a PDF is useful in many situations. For payroll processors, a common scenario is that the payroll software exports all the pay stubs as a single PDF document, which then needs to be split into individual pay stubs that can be sent to the respective employees.

One of the challenges when splitting such a PDF is that you don’t always know where to split it, because pay stubs can vary in length from one employee to the next. For example, one employee’s pay stub can be 3 pages long whereas another’s may be just 1 page. The splitting process needs to know where each employee’s pay stub begins.

In All-About-PDF, we have added a new method of splitting PDFs: you specify a location on the page and a text pattern to search for there, and when the pattern is found on a page, the program knows to split the PDF at that page. In our example, the text could be the employee number or identifier.

All-About-PDF uses standard wildcard notation for matching the text to a pattern and below are some examples for common scenarios:

  • Any word or number: * (* matches zero or more characters)

  • Any 5-character word: ????? (? matches any single character)

  • Any 5-digit number: ##### (# matches any single digit)

  • 2 characters followed by 3 digits: ??###

  • Any word starting with A: A*

  • Any word containing the letters “AIP”: *AIP*

  • Any word ending with X: *X

  • Any word starting with a digit and ending with X: #*X

  • Any word starting with A and ending with a digit followed by the letter G: A*#G

There are many ways to create a search pattern; a more detailed explanation can be found here:

https://docs.microsoft.com/en-us/dotnet/visual-basic/language-reference/operators/like-operator
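
To get a feel for how these wildcards behave, here is a minimal Python sketch that translates a pattern into a regular expression and tests a few of the examples above. It is only an illustration, not All-About-PDF's own code: the function names `like_to_regex` and `like` are made up for this example, the match is case-sensitive, and the bracketed character lists that the Like operator also supports are left out.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a Like-style wildcard pattern into a compiled regex.

    Hypothetical helper for illustration; handles only *, ? and #.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # zero or more characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        elif ch == "#":
            parts.append("[0-9]")        # exactly one digit
        else:
            parts.append(re.escape(ch))  # any other character is literal
    return re.compile("".join(parts), re.DOTALL)


def like(text: str, pattern: str) -> bool:
    """Return True if the whole text matches the wildcard pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None


if __name__ == "__main__":
    for text, pattern in [
        ("AB123", "??###"),
        ("12345", "#####"),
        ("RAIPUR", "*AIP*"),
        ("7abcX", "#*X"),
        ("Alpha7G", "A*#G"),
    ]:
        print(text, pattern, like(text, pattern))
```

Note that the pattern must match the entire text, so `#####` matches `12345` but not `123456`.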

To use this new PDF splitting feature in All-About-PDF, please follow the steps below:

  1. Open All-About-PDF and click on the Split PDF card

  2. Select the PDF document that you would like to split

  3. Select the folder where you would like the resulting split files to be saved

  4. Specify the file name of the split file - the text from the PDF will be appended to this name

  5. Select the “Split by matching text pattern…” option and provide the pattern to search for

  6. Click the “SELECT”/“RESET” button to show a preview of the PDF and then specify the location of the text on the page by selecting it with your mouse

  7. You can use the text found in the PDF, as well as the start and end pages of each split PDF, in the resulting file name. These are denoted by the special variables {{pdfselection}}, {{pagefrom}} and {{pageto}}

  8. Click the SPLIT button to begin the splitting process
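
The steps above can be sketched in a few lines of Python to show how page ranges and file names might be derived. This is an illustration under stated assumptions, not how All-About-PDF is implemented: the helpers `plan_splits` and `render_name` are hypothetical, the text found at the selected location on each page is assumed to be already extracted (or `None` when the pattern did not match), and a new file is assumed to start whenever that text changes.

```python
from typing import List, Optional, Tuple


def plan_splits(page_keys: List[Optional[str]]) -> List[Tuple[str, int, int]]:
    """Group pages into (matched_text, page_from, page_to) ranges.

    page_keys[i] is the matched text on page i+1, or None if no match.
    Assumption: a page whose matched text differs from the current
    group's text starts a new file; other pages extend the current file.
    Pages before the first match are skipped in this sketch.
    """
    splits: List[List] = []
    for page_no, key in enumerate(page_keys, start=1):
        if key is not None and (not splits or splits[-1][0] != key):
            splits.append([key, page_no, page_no])  # start a new file
        elif splits:
            splits[-1][2] = page_no                 # extend current file
    return [(key, first, last) for key, first, last in splits]


def render_name(template: str, key: str, page_from: int, page_to: int) -> str:
    """Fill in the special variables from step 7 in a file name template."""
    return (
        template.replace("{{pdfselection}}", key)
        .replace("{{pagefrom}}", str(page_from))
        .replace("{{pageto}}", str(page_to))
    )


if __name__ == "__main__":
    # Six pages: a 3-page pay stub, a 1-page one, then a 2-page one.
    keys = ["E001", None, None, "E002", "E003", "E003"]
    template = "paystub_{{pdfselection}}_{{pagefrom}}-{{pageto}}.pdf"
    for key, first, last in plan_splits(keys):
        print(render_name(template, key, first, last))
```

With the sample data, this prints `paystub_E001_1-3.pdf`, `paystub_E002_4-4.pdf` and `paystub_E003_5-6.pdf`.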

To get started with this new feature and more in All-About-PDF, download the free trial today!