The PoDoFo tools family got a new member during the last week: podofocrop. This little tool crops the white margins of pages in PDF files.
I use tools like this a lot, for example when writing scientific papers, because I try to include all figures as PDF files to get better quality in the final PDF. To do so, it is often necessary to remove the margin around a figure before including it. This is exactly what podofocrop does.
The figure below illustrates how podofocrop works. Figure (a) shows the input PDF file: a page with a little bit of text and a big white margin. In the next step, Figure (b), podofocrop calculates the bounding box for the final PDF, i.e. the area of the page that has content and therefore the size to which we want to crop the PDF. Ghostscript is used for this step and has to be in your PATH. Finally, the PDF is cropped, and the new page is much smaller, without any distracting margins.
[Figure: podofocrop workflow. (a) Input PDF page with a large white margin; (b) bounding box calculated for the page content, followed by the cropped result.]
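To make the bounding box step a bit more concrete: Ghostscript ships a bbox device, and a call such as gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=bbox input.pdf prints one %%BoundingBox and one %%HiResBoundingBox line per page on stderr. The following is only a minimal sketch of how such output could be turned into crop rectangles. The CropBox struct and parse_bbox_output are hypothetical names made up for this illustration and are not taken from the podofocrop sources; which of the two lines podofocrop actually reads is also an assumption here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a page rectangle (PoDoFo itself has PdfRect).
struct CropBox {
    double left, bottom, width, height;
};

// Parse text captured from Ghostscript's bbox device.
// Assumption: one "%%HiResBoundingBox: llx lly urx ury" line per page.
std::vector<CropBox> parse_bbox_output( const std::string& gsOutput )
{
    const std::string prefix = "%%HiResBoundingBox:";
    std::vector<CropBox> boxes;
    std::istringstream lines( gsOutput );
    std::string line;
    while( std::getline( lines, line ) ) {
        // Skip the integer %%BoundingBox lines and anything unrelated.
        if( line.compare( 0, prefix.size(), prefix ) != 0 )
            continue;
        std::istringstream fields( line.substr( prefix.size() ) );
        double llx, lly, urx, ury;
        if( fields >> llx >> lly >> urx >> ury ) {
            // Store as origin plus size instead of two corners.
            boxes.push_back( CropBox{ llx, lly, urx - llx, ury - lly } );
        }
    }
    return boxes;
}
```

Each returned entry corresponds to one page, in the order Ghostscript reported them, so the index into the vector can be used as the page index.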
If you are interested, get podofocrop from the PoDoFo SVN. In case of questions, please contact the mailing list.
Please note: Windows support is still experimental and untested so far. It will compile but there are no guarantees that it will work. Help with the Windows port would be greatly appreciated.
Finally, let me give you one little detail on the implementation, because it is so amazingly simple using PoDoFo. If you remove the code for parsing command line arguments, communicating with Ghostscript and handling errors, you see that the tool has only a few lines:
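    // One crop rectangle per page, as reported by Ghostscript.
    std::vector<PdfRect> cropBoxes = get_crop_boxes_from_gs( pszInput );
    PdfVariant var;
    PdfMemDocument doc;
    doc.Load( pszInput );
    for( int i=0;i<doc.GetPageCount();i++ ) {
        PdfPage* pPage = doc.GetPage( i );
        // Convert the rectangle to a PDF array and store it as the page's MediaBox.
        cropBoxes[i].ToVariant( var );
        pPage->GetObject()->GetDictionary().AddKey( PdfName("MediaBox"), var );
    }
    doc.Write( pszOutput );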
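For readers wondering what ends up in the MediaBox entry: a PDF rectangle is an array of four numbers giving the lower-left and upper-right corners, [llx lly urx ury]. The sketch below shows that conversion for a rectangle stored as origin plus size. It is a plain C++ illustration with hypothetical names (Rect, to_pdf_array), not PoDoFo code, and it assumes the rectangle is held as left, bottom, width and height.

```cpp
#include <sstream>
#include <string>

// Hypothetical rectangle stored as origin plus size.
struct Rect {
    double left, bottom, width, height;
};

// Format the rectangle the way it appears in a PDF page dictionary:
// lower-left corner followed by upper-right corner.
std::string to_pdf_array( const Rect& r )
{
    std::ostringstream out;
    out << '[' << r.left << ' ' << r.bottom << ' '
        << ( r.left + r.width ) << ' ' << ( r.bottom + r.height ) << ']';
    return out.str();
}
```

Because only the MediaBox is rewritten, the page content itself is untouched; viewers simply show the smaller area.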