I don't think it's a big problem, because the invalid UTF-8 bytes appear in commented lines. You can try to convert the files with iconv -t 'utf-8' (adding -f with the files' actual encoding, since otherwise iconv assumes the locale's encoding).

Anyway, there are also:

ppchtex.tex:  Non-ISO extended-ASCII English text
sort-lan.tex: Non-ISO extended-ASCII English text
regi-ibm.tex: Non-ISO extended-ASCII English text, with LF, NEL line terminators

-- 
luigi
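
P.S. To check that the invalid bytes really sit only in commented lines, something like the following might help. It is a minimal sketch, not a tested tool: the function names comment_start and invalid_utf8_lines are made up for illustration, it treats the first unescaped % on a line as the start of a TeX comment (ignoring catcode changes and verbatim material), and it looks only at the first invalid byte of each line.

```python
import sys


def comment_start(line: bytes) -> int:
    """Return the index of the first unescaped '%' in line, or -1 if none."""
    i = 0
    while i < len(line):
        if line[i:i + 1] == b"\\":
            i += 2  # skip the escaped character (covers \% and \\)
            continue
        if line[i:i + 1] == b"%":
            return i
        i += 1
    return -1


def invalid_utf8_lines(data: bytes):
    """Yield (line_number, in_comment) for each line that is not valid UTF-8."""
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            start = comment_start(line)
            # in_comment is True when a comment starts before the first bad byte
            yield number, start != -1 and start < err.start


if __name__ == "__main__":
    # Hypothetical usage: python check_utf8.py ppchtex.tex sort-lan.tex regi-ibm.tex
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            for number, in_comment in invalid_utf8_lines(handle.read()):
                where = "comment" if in_comment else "CODE"
                print(f"{path}:{number}: invalid UTF-8 in {where}")
```

Any line reported as CODE would be worth converting by hand before relying on the file as UTF-8.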