How to extract images from a PDF
Before you start, make sure you have installed pdfminer.six. You also need a PDF that contains images. If you don’t have one, you can download this research paper, which includes images of cats and dogs, and save it as example.pdf:
$ curl https://www.robots.ox.ac.uk/~vgg/publications/2012/parkhi12a/parkhi12a.pdf --output example.pdf
Then run the pdf2txt.py command with the --output-dir option:
$ pdf2txt.py example.pdf --output-dir cats-and-dogs
This command extracts all the images from the PDF and saves them into the cats-and-dogs directory.
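If you want to run the same step from a script and check what was written, the command above can be wrapped in a few lines of Python. This is a minimal sketch, not part of pdfminer.six: the helper names build_command, list_extracted_files and extract_images are hypothetical, and it assumes pdf2txt.py is on your PATH and directly executable (on some systems you may need to invoke it through the Python interpreter instead).

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, output_dir):
    """Return the pdf2txt.py invocation shown above as an argument list."""
    return ["pdf2txt.py", str(pdf_path), "--output-dir", str(output_dir)]


def list_extracted_files(output_dir):
    """Return the names of the regular files in output_dir, sorted by name."""
    return sorted(p.name for p in Path(output_dir).iterdir() if p.is_file())


def extract_images(pdf_path, output_dir):
    """Run pdf2txt.py and return the names of the files it left in output_dir."""
    # Assumption: pdf2txt.py is on PATH and executable as a command.
    # Whatever the command prints to stdout is discarded, because only
    # the files saved in output_dir matter here.
    subprocess.run(
        build_command(pdf_path, output_dir),
        check=True,  # raise CalledProcessError if the command fails
        stdout=subprocess.DEVNULL,
    )
    return list_extracted_files(output_dir)


if __name__ == "__main__":
    for name in extract_images("example.pdf", "cats-and-dogs"):
        print(name)
```

The script prints one file name per extracted image. An empty listing suggests that no images were found in the PDF.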