to convert a bunch of docx files to a single pdf
-
install unoconv and libreoffice via brew
brew install unoconv
brew install libreoffice
-
batch convert all the docx files to individual pdfs
find . -name "*.docx" -exec unoconv {} \;
found in this stackoverflow answer; also works with *.odt, *.doc, or any other format that libreoffice can open
for *.jpg, use imagemagick:
find . -name "*.jpg" -exec convert {} -background white -density 72 -page a4 {}.pdf \;
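If the conversion step is rerun often, it can help to skip documents that already have a pdf next to them. The following is a minimal python sketch of that idea, not part of the original recipe: the function names `pending_conversions` and `convert_all` are made up for illustration, and it assumes `unoconv` is on the PATH and writes each pdf beside its source file.

```python
import subprocess
from pathlib import Path


def pending_conversions(root="."):
    """Return .docx files under root that have no sibling .pdf yet, sorted."""
    return sorted(
        docx for docx in Path(root).rglob("*.docx")
        if not docx.with_suffix(".pdf").exists()
    )


def convert_all(root="."):
    """Run unoconv once per pending file (assumes unoconv is installed)."""
    for docx in pending_conversions(root):
        # -f pdf names the output format explicitly.
        subprocess.run(["unoconv", "-f", "pdf", str(docx)], check=True)
```

calling `convert_all(".")` from the top-level folder should then only convert the files that are still missing a pdf.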
-
install PyPDF2 (the script below uses the pre-3.0 API, hence the version pin):
pip install "PyPDF2<3"
then merge all the pdfs into one, with a bookmark for each named after its file, using the following python script:
import glob
from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()
# sorted() gives the merged document a stable, alphabetical order
for pdf in sorted(glob.iglob('./**/*.pdf', recursive=True)):
    # bookmark title is the path without its ".pdf" extension
    merger.append(pdf, bookmark=pdf[:-4])
merger.write("merged.pdf")
merger.close()
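two things may be worth tidying in the merge step: a second run would pick up `merged.pdf` itself, and the bookmark titles keep the leading `./`. The sketch below uses only the standard library; `collect_pdfs` and `bookmark_title` are hypothetical helper names, and the output filename is assumed to be `merged.pdf` as in the script above.

```python
import glob
import os


def collect_pdfs(root=".", output_name="merged.pdf"):
    """Return sorted .pdf paths under root, skipping the merged output itself."""
    pattern = os.path.join(root, "**", "*.pdf")
    return sorted(
        path for path in glob.iglob(pattern, recursive=True)
        if os.path.basename(path) != output_name
    )


def bookmark_title(path, root="."):
    """Return the path relative to root, without its extension."""
    relative = os.path.relpath(path, root)
    return os.path.splitext(relative)[0]
```

the loop in the merge script could then iterate over `collect_pdfs()` and pass `bookmark_title(pdf)` as the bookmark text.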