to convert a bunch of docx files to a single pdf
install unoconv and libreoffice via brew
brew install unoconv
brew install libreoffice
batch convert all the docx files to individual pdfs
find . -name "*.docx" -exec unoconv {} \;
(found in a stackoverflow answer; also works with *.odt, *.doc, or any other format that libreoffice can open)
for *.jpg use convert from imagemagick:
find . -name "*.jpg" -exec convert {} -background white -density 72 -page a4 {}.pdf \;
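The two find commands above can also be driven from one python function. This is only a sketch: the function name conversion_commands and the extension sets are made up for illustration, and it only builds the command lines (using the same unoconv and convert options as above) without running anything.

```python
from pathlib import Path

# Hypothetical groupings; extend them with whatever libreoffice / imagemagick can open.
OFFICE_EXTS = {".docx", ".doc", ".odt"}
IMAGE_EXTS = {".jpg"}


def conversion_commands(root="."):
    """Return one command (as an argument list) per convertible file under root."""
    commands = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        if ext in OFFICE_EXTS:
            # unoconv writes <name>.pdf next to the input file
            commands.append(["unoconv", "-f", "pdf", str(path)])
        elif ext in IMAGE_EXTS:
            # same options as the find/convert line above; output is <name>.jpg.pdf
            commands.append([
                "convert", str(path),
                "-background", "white", "-density", "72", "-page", "a4",
                str(path) + ".pdf",
            ])
    return commands
```

To actually convert, pass each returned list to subprocess.run(cmd, check=True). Printing the list first is a cheap dry run.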
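merge all the pdfs into one and create a bookmark for each, named after the file, using the following python script (it uses the PdfFileMerger class, which was removed in PyPDF2 3.0, so install an older PyPDF2 via pip first):
pip install "PyPDF2<3"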
import os
import glob
from PyPDF2 import PdfFileMerger, PdfFileReader
merger = PdfFileMerger()
for pdf in sorted(glob.iglob('./**/*.pdf', recursive=True)):
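    # "rb" is a file mode for open(), not a PdfFileReader argument, so it is dropped here
    merger.append(PdfFileReader(pdf), bookmark=pdf[:-4])
# the original stops before writing; the output name merged.pdf is an assumption
# (delete it before re-running, or it gets merged in as well)
merger.write("merged.pdf")
merger.close()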
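The script above relies on PyPDF2's older PdfFileMerger class. A rough equivalent with the newer pypdf package (pip install pypdf) might look like the sketch below. It assumes a recent pypdf release in which PdfWriter.append accepts an outline_item argument. The helper names collect_pdfs and merge_with_bookmarks and the output name merged.pdf are made up for illustration.

```python
from pathlib import Path


def collect_pdfs(root=".", output="merged.pdf"):
    """Sorted PDF paths under root, skipping the merged output file itself."""
    out = Path(output).resolve()
    return sorted(p for p in Path(root).rglob("*.pdf") if p.resolve() != out)


def merge_with_bookmarks(root=".", output="merged.pdf"):
    """Merge every PDF under root into output, one bookmark per source file."""
    # Imported here so collect_pdfs stays usable without pypdf installed.
    from pypdf import PdfWriter  # assumption: recent pypdf with PdfWriter.append

    writer = PdfWriter()
    for pdf in collect_pdfs(root, output):
        # pdf.stem is the filename without directories or the .pdf suffix
        writer.append(str(pdf), outline_item=pdf.stem)
    with open(output, "wb") as fh:
        writer.write(fh)
```

Using the filename stem as the bookmark title drops the leading "./" and any directory names that pdf[:-4] would keep. Skipping the output file means a second run does not merge the previous result into itself.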