Showing posts with label pdf2img. Show all posts

12/01/2021

pdf2img, pdf to image, python library

Way #1: pdf2image

--

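# pip install pdf2image  (also requires Poppler to be installed on the system)
from pdf2image import convert_from_path

pdffile = '2081033884.pdf'
# Render every page at 500 DPI; returns a list of PIL images
pages = convert_from_path(pdffile, 500)

# Save each page in JPEG format
for i, page in enumerate(pages):
    page.save(f'pdf2image_{i}.jpg', 'JPEG')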

--



Way #2: PyMuPDF

--

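# pip install pymupdf
import fitz  # PyMuPDF

pdffile = '2081033884.pdf'
doc = fitz.open(pdffile)

# Split into pages and save each one as a JPEG
for i, page in enumerate(doc.pages()):
    pix = page.get_pixmap()  # renders at the default 72 DPI
    img_filename = f"fitz_{i}.jpg"
    # dpi here only sets the resolution metadata in the file; it does not change the pixel size
    pix.pil_save(img_filename, format="jpeg", dpi=(300, 300))  # , ... more PIL parameters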

--
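
In way #2, the `dpi=(300,300)` passed to `pil_save` is forwarded to PIL and only tags the saved file; the pixmap itself is still rendered at PyMuPDF's default of 72 DPI. To get more pixels, a zoom matrix can be passed to `get_pixmap`. Here is a minimal sketch of that idea. `zoom_for_dpi` and `render_page` are hypothetical helper names for illustration, not part of PyMuPDF.

```python
BASE_DPI = 72  # PDF user space has 72 units per inch; PyMuPDF renders at this by default


def zoom_for_dpi(dpi: float) -> float:
    """Zoom factor that makes a page render at the requested DPI."""
    return dpi / BASE_DPI


def render_page(page, dpi: float = 300):
    """Render one PyMuPDF page at `dpi` and return the pixmap (hypothetical helper)."""
    import fitz  # PyMuPDF; imported here so zoom_for_dpi works without it installed

    zoom = zoom_for_dpi(dpi)
    return page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))


# Example (assumes `doc = fitz.open(pdffile)` as in way #2):
# for i, page in enumerate(doc.pages()):
#     render_page(page, 300).save(f"fitz_{i}.png")
```

A zoom of 1 keeps the default size, so 300 DPI corresponds to a zoom of roughly 4.17 in each direction.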


Thank you.

www.marearts.com

πŸ™‡πŸ»‍♂️

4/02/2020

PDF to OpenCV images, page by page, using the PyMuPDF library (Python example code)

See the example code below 😊

pip install PyMuPDF
Documentation: https://pymupdf.readthedocs.io/en/latest/

I think this is a better library than PyPDF2 🤔
..

import fitz  # PyMuPDF
import numpy as np
import cv2

fname = 'information-10-00248-v2'
doc = fitz.open(fname + '.pdf')

# Split into pages
for i, page in enumerate(doc.pages()):
    print(i)
    zoom = 1  # 1 = 72 DPI; increase for a higher-resolution render
    mat = fitz.Matrix(zoom, zoom)
    pix = page.get_pixmap(matrix=mat)  # formerly page.getPixmap()
    img_data = pix.tobytes("png")      # formerly pix.getImageData("png")

    # Save the image directly from the PNG bytes
    with open('./save_by_byte_{}_{}.png'.format(fname, i), 'wb') as f:
        f.write(img_data)

    # Decode the PNG bytes with OpenCV and save again
    nparr = np.frombuffer(img_data, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    print(img.shape)
    cv2.imwrite('./save_by_opencv_{}_{}.png'.format(fname, i), img)

..
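
The example above goes through PNG bytes to get an OpenCV image. As an alternative, the raw pixel buffer of the pixmap can probably be wrapped in a NumPy array directly, which skips the encode/decode step. The sketch below assumes an RGB (or RGBA) pixmap with no row padding; `samples_to_bgr` is a hypothetical helper name, not a PyMuPDF or OpenCV function.

```python
import numpy as np


def samples_to_bgr(samples: bytes, width: int, height: int, channels: int) -> np.ndarray:
    """Turn raw RGB(A) pixel bytes into an OpenCV-style BGR array.

    Assumes the buffer holds height * width * channels bytes in RGB(A) order.
    """
    arr = np.frombuffer(samples, dtype=np.uint8).reshape(height, width, channels)
    # Drop the alpha channel if present, then reverse RGB -> BGR.
    # ascontiguousarray makes a writable, contiguous copy that OpenCV accepts.
    return np.ascontiguousarray(arr[:, :, :3][:, :, ::-1])


# Hypothetical usage with a PyMuPDF page:
# pix = page.get_pixmap(alpha=False)
# img = samples_to_bgr(pix.samples, pix.width, pix.height, pix.n)
# cv2.imwrite("page.png", img)
```

The PNG route in the original code is simpler and handles any colorspace, so this variant is mainly worth it when many pages are processed and the extra encoding time matters.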