https://github.com/euske/pdfminer
Cermine uses the Java iText library in its character extractor
Grobid uses xpdf via pdf2xml; Grobid itself is written in Java, though
https://www.crossref.org/labs/pdfextract/
written in Ruby; recommends Cermine
https://github.com/elifesciences/sciencebeam
uses Grobid and Apache Beam
ContentMine
https://github.com/ContentMine/norma