discuss@mandoc.bsd.lv
* Bad national characters rendering in PDF
@ 2018-11-19 13:31 Marcin „sirmacik” Karpezo
  2018-11-19 13:52 ` Anthony J. Bentley
  0 siblings, 1 reply; 7+ messages in thread
From: Marcin „sirmacik” Karpezo @ 2018-11-19 13:31 UTC (permalink / raw)
  To: discuss

Hi

I'm having trouble rendering Polish characters in my PDF
exports. I've tried every option I found in the manual pages and
elsewhere, without any result. I've also tried an Emacs-like option in
the mdoc file.

Command I'm using:

mandoc -Tpdf -Kutf-8 -O paper=a4 test_case.1 >  test_case.pdf

Below you'll find URLs for downloading sample files:

What the export should look like:
https://chmurka.sirmacik.net/s/Cddu2Iju9Vb9WMB

What mandoc produces:
https://chmurka.sirmacik.net/s/YHi8nDTwVThRvBL

Test case, so you can try it for yourself:
https://chmurka.sirmacik.net/s/MzKOoB7jCGcUPGm

Please advise on what I am doing wrong, or how this could be fixed.

Thank you for your time
Marcin

--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv

* Re: Bad national characters rendering in PDF
  2018-11-19 13:31 Bad national characters rendering in PDF Marcin „sirmacik” Karpezo
@ 2018-11-19 13:52 ` Anthony J. Bentley
  2018-11-19 14:06   ` Ingo Schwarze
  2018-11-19 14:07   ` Marcin „sirmacik” Karpezo
  0 siblings, 2 replies; 7+ messages in thread
From: Anthony J. Bentley @ 2018-11-19 13:52 UTC (permalink / raw)
  To: discuss

Hi Marcin,

Marcin „sirmacik” Karpezo writes:
> I'm having trouble rendering Polish characters in my PDF
> exports. I've tried every option I found in the manual pages and
> elsewhere, without any result.

Unfortunately mandoc's PDF output doesn't handle Unicode at all, and
renders all special characters as ASCII. I am not sure what adding
such support would entail (there might be font issues?). mandoc is
great at rendering Unicode in the terminal and in HTML output, but
getting Unicode in PDF manuals is currently a matter of using some
other troff implementation.

-- 
Anthony J. Bentley

* Re: Bad national characters rendering in PDF
  2018-11-19 13:52 ` Anthony J. Bentley
@ 2018-11-19 14:06   ` Ingo Schwarze
  2018-11-19 14:19     ` Marcin „sirmacik” Karpezo
  2018-11-19 14:07   ` Marcin „sirmacik” Karpezo
  1 sibling, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2018-11-19 14:06 UTC (permalink / raw)
  To: discuss

Hi Marcin and Anthony,

Anthony J. Bentley wrote on Mon, Nov 19, 2018 at 06:52:09AM -0700:
> Marcin Karpezo writes:

>> I'm having trouble rendering Polish characters in my PDF
>> exports. I've tried every option I found in the manual pages and
>> elsewhere, without any result.

> Unfortunately mandoc's PDF output doesn't handle Unicode at all, and
> renders all special characters as ASCII. I am not sure what adding
> such support would entail (there might be font issues?).

At least that, yes.  For PostScript and PDF output, font metrics
for the Times New Roman ASCII font are currently hardcoded in
term_ps.c, the number of supported glyphs is limited to 96 glyphs
per font style, and no other font can be used.  Generalizing font
support would be a major undertaking.
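
To check whether a page stays inside that limited glyph set before
running mandoc -T pdf, a small script can list the characters that
fall outside printable ASCII. This is only a sketch: the function name
unsupported_chars is made up for illustration, and treating
U+0020..U+007E as the supported range is an assumption, not something
read out of term_ps.c.

```python
import unicodedata


def unsupported_chars(text):
    """Return the distinct characters outside printable ASCII, sorted.

    Assumption: the supported set is U+0020..U+007E. Newlines and tabs
    are skipped because they are layout, not glyphs.
    """
    return sorted(
        ch for ch in set(text)
        if ch not in "\n\t" and not (" " <= ch <= "~")
    )


# Example input: a Polish pangram such as might appear in a manual page.
sample = "Zażółć gęślą jaźń"
for ch in unsupported_chars(sample):
    # Print code point and Unicode name so the report itself stays ASCII.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

An empty result would suggest the page is safe for the PDF output;
anything listed is a character that mandoc's PDF output cannot render.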

> mandoc is
> great at rendering Unicode in the terminal and in HTML output, but
> getting Unicode in PDF manuals is currently a matter of using some
> other troff implementation.

That statement is accurate.  Groff is to be recommended for that
purpose because it has better mdoc(7) support than other troff
implementations (except mandoc, of course).

Yours,
  Ingo

* Re: Bad national characters rendering in PDF
  2018-11-19 13:52 ` Anthony J. Bentley
  2018-11-19 14:06   ` Ingo Schwarze
@ 2018-11-19 14:07   ` Marcin „sirmacik” Karpezo
  1 sibling, 0 replies; 7+ messages in thread
From: Marcin „sirmacik” Karpezo @ 2018-11-19 14:07 UTC (permalink / raw)
  To: discuss

Anthony J. Bentley dixit (2018-11-19, 06:52):

> Hi Marcin,
> 
> Marcin „sirmacik” Karpezo writes:
> > I'm having trouble rendering Polish characters in my PDF
> > exports. I've tried every option I found in the manual pages and
> > elsewhere, without any result.
> 
> Unfortunately mandoc's PDF output doesn't handle Unicode at all, and
> renders all special characters as ASCII. I am not sure what adding
> such support would entail (there might be font issues?). mandoc is
> great at rendering Unicode in the terminal and in HTML output, but
> getting Unicode in PDF manuals is currently a matter of using some
> other troff implementation.

That's bad news. At my institute we've found ourselves writing a lot of
mandoc documentation but we also need to start producing PDFs from it. 
Thanks for the info, I'll start looking for some other implementation. 

Marcin

* Re: Bad national characters rendering in PDF
  2018-11-19 14:06   ` Ingo Schwarze
@ 2018-11-19 14:19     ` Marcin „sirmacik” Karpezo
  2018-11-19 16:07       ` Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: Marcin „sirmacik” Karpezo @ 2018-11-19 14:19 UTC (permalink / raw)
  To: discuss

Ingo Schwarze dixit (2018-11-19, 15:06):

> Hi Marcin and Anthony,
> 
> Anthony J. Bentley wrote on Mon, Nov 19, 2018 at 06:52:09AM -0700:
> > Marcin Karpezo writes:
> 
> >> I'm having trouble rendering Polish characters in my PDF
> >> exports. I've tried every option I found in the manual pages and
> >> elsewhere, without any result.
> 
> > Unfortunately mandoc's PDF output doesn't handle Unicode at all, and
> > renders all special characters as ASCII. I am not sure what adding
> > such support would entail (there might be font issues?).
> 
> At least that, yes.  For PostScript and PDF output, font metrics
> for the Times New Roman ASCII font are currently hardcoded in
> term_ps.c, the number of supported glyphs is limited to 96 glyphs
> per font style, and no other font can be used.  Generalizing font
> support would be a major undertaking.
> 
> > mandoc is
> > great at rendering Unicode in the terminal and in HTML output, but
> > getting Unicode in PDF manuals is currently a matter of using some
> > other troff implementation.
> 
> That statement is accurate.  Groff is to be recommended for that
> purpose because it has better mdoc(7) support than other troff
> implementations (except mandoc, of course).
> 
> Yours,
>   Ingo

Do you by any chance have an example command to get that from Groff?
I've tried several options I found in manuals and online sources, but
the results look worse than the mandoc output. Characters are either
printed as a few alphanumerics or not printed at all.

Marcin

* Re: Bad national characters rendering in PDF
  2018-11-19 14:19     ` Marcin „sirmacik” Karpezo
@ 2018-11-19 16:07       ` Ingo Schwarze
  2018-11-19 17:17         ` Marcin Karpezo
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2018-11-19 16:07 UTC (permalink / raw)
  To: Marcin Karpezo; +Cc: discuss

Hi Marcin,

Marcin Karpezo wrote on Mon, Nov 19, 2018 at 03:19:53PM +0100:

> Do you by any chance have an example command to get that from Groff?
> I've tried several options I found in manuals and online sources, but
> the results look worse than the mandoc output. Characters are either
> printed as a few alphanumerics or not printed at all.

A default installation of the groff PostScript and PDF output devices
does not include a font that supports the Polish language.  Did you
install such a font?  If not, the gropdf(1) manual page explains
how to do that, below "FONT INSTALLATION".

Once you have a font installed, both of the following commands are
supposed to work:

 $ groff -k -mandoc -T pdf /usr/local/man/pl/man6/wesnothd.6 > /tmp/tmp.pdf
 $ pdfroff -k -mandoc /usr/local/man/pl/man6/wesnothd.6 > /tmp/tmp.pdf

(Sorry for the silly example, that's the only Polish manual page
I happen to have installed right now...)

In addition to the options shown above, you might also need the -f
option documented in the groff(1) manual page to actually select the
font family you just installed instead of the default, Times New Roman.
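
For scripted builds, the same invocation can be assembled
programmatically. The following is a minimal sketch, not a tested
recipe: the helper names groff_pdf_command and render_pdf are invented
here, and the family name MYFAM is a placeholder for whatever family
was actually installed. Only the options already mentioned above
(-k, -mandoc, -T pdf, -f) are used.

```python
import subprocess


def groff_pdf_command(source, family=None):
    """Build the groff argument list for rendering a manual page to PDF."""
    argv = ["groff", "-k", "-mandoc", "-T", "pdf"]
    if family is not None:
        # -f selects the default font family instead of groff's built-in one.
        argv += ["-f", family]
    argv.append(source)
    return argv


def render_pdf(source, target, family=None):
    """Run groff and write its standard output (the PDF) to target."""
    with open(target, "wb") as out:
        subprocess.run(groff_pdf_command(source, family), stdout=out, check=True)


# Show the command that would be run; MYFAM is a hypothetical family name.
print(" ".join(groff_pdf_command("test_case.1", family="MYFAM")))
```

Calling render_pdf("test_case.1", "test_case.pdf", family="MYFAM")
would then be the scripted equivalent of the first command above with
-f added.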

If you have no suitable font family installed, you will see error
messages similar to the following, and wrong glyphs in the output
file:

troff: wesnothd.6:29: warning: can't find special character 'u0073_0301'
troff: wesnothd.6:29: warning: can't find special character 'u007A_0307'
troff: wesnothd.6:46: warning: can't find special character 'u0065_0328'
troff: wesnothd.6:46: warning: can't find special character 'u0061_0328'
troff: wesnothd.6:53: warning: can't find special character 'u006E_0301'
troff: wesnothd.6:158: warning: can't find special character 'u0053_0301'
troff: wesnothd.6:297: warning: can't find special character 'u005A_0307'
troff: wesnothd.6:297: warning: can't find special character 'u0045_0328'
troff: wesnothd.6:297: warning: can't find special character 'u0041_0328'
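
The names in these warnings are Unicode code points in hexadecimal,
a base character followed by a combining mark, so they can be turned
back into the characters the font is missing. The sketch below does
that for warning text captured from standard error. The function names
are illustrative, and the regular expression is an assumption based
only on the message format shown above.

```python
import re
import unicodedata

# Matches names such as u0073_0301 inside the warning shown above.
WARNING_RE = re.compile(
    r"can't find special character '(u[0-9A-F]{4,6}(?:_[0-9A-F]{4,6})*)'"
)


def decode_glyph_name(name):
    """Convert a name like 'u0073_0301' into the composed character."""
    code_points = name[1:].split("_")
    decomposed = "".join(chr(int(cp, 16)) for cp in code_points)
    # NFC combines base letter and combining mark where Unicode allows it.
    return unicodedata.normalize("NFC", decomposed)


def missing_characters(stderr_text):
    """Return the distinct missing characters in order of first appearance."""
    seen = []
    for name in WARNING_RE.findall(stderr_text):
        ch = decode_glyph_name(name)
        if ch not in seen:
            seen.append(ch)
    return seen


log = (
    "troff: wesnothd.6:29: warning: can't find special character 'u0073_0301'\n"
    "troff: wesnothd.6:29: warning: can't find special character 'u007A_0307'\n"
    "troff: wesnothd.6:46: warning: can't find special character 'u0065_0328'\n"
)
for ch in missing_characters(log):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

For the warnings listed above this yields the Polish letters with
acute accent, dot above and ogonek, which is a quick way to confirm
that the problem is the font and not the input encoding.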

While mandoc tries to make *reading* manual pages simple and hence
does not require any fancy options or configuration for the -T utf8
and -T html output modes, real typesetting to PostScript or PDF is
unfortunately still somewhat tricky even in the 21st century and
still requires configuring quite a bit of stuff...  :-(

Yours,
  Ingo

* Re: Bad national characters rendering in PDF
  2018-11-19 16:07       ` Ingo Schwarze
@ 2018-11-19 17:17         ` Marcin Karpezo
  0 siblings, 0 replies; 7+ messages in thread
From: Marcin Karpezo @ 2018-11-19 17:17 UTC (permalink / raw)
  To: discuss

Ingo Schwarze dixit (2018-11-19, 17:07):

> Hi Marcin,
> 
> Marcin Karpezo wrote on Mon, Nov 19, 2018 at 03:19:53PM +0100:
> 
> > Do you by any chance have an example command to get that from Groff?
> > I've tried several options I found in manuals and online sources, but
> > the results look worse than the mandoc output. Characters are either
> > printed as a few alphanumerics or not printed at all.
> 
> A default installation of the groff PostScript and PDF output devices
> does not include a font that supports the Polish language.  Did you
> install such a font?  If not, the gropdf(1) manual page explains
> how to do that, below "FONT INSTALLATION".
> 
> Once you have a font installed, both of the following commands are
> supposed to work:
> 
>  $ groff -k -mandoc -T pdf /usr/local/man/pl/man6/wesnothd.6 > /tmp/tmp.pdf
>  $ pdfroff -k -mandoc /usr/local/man/pl/man6/wesnothd.6 > /tmp/tmp.pdf
> 
> (Sorry for the silly example, that's the only Polish manual page
> I happen to have installed right now...)
> 
> In addition to the options shown above, you might also need the -f
> option documented in the groff(1) manual page to actually select the
> font family you just installed instead of the default, Times New Roman.
> 
> If you have no suitable font family installed, you will see error
> messages similar to the following, and wrong glyphs in the output
> file:
> 
> troff: wesnothd.6:29: warning: can't find special character 'u0073_0301'
> troff: wesnothd.6:29: warning: can't find special character 'u007A_0307'
> troff: wesnothd.6:46: warning: can't find special character 'u0065_0328'
> troff: wesnothd.6:46: warning: can't find special character 'u0061_0328'
> troff: wesnothd.6:53: warning: can't find special character 'u006E_0301'
> troff: wesnothd.6:158: warning: can't find special character 'u0053_0301'
> troff: wesnothd.6:297: warning: can't find special character 'u005A_0307'
> troff: wesnothd.6:297: warning: can't find special character 'u0045_0328'
> troff: wesnothd.6:297: warning: can't find special character 'u0041_0328'
> 
> While mandoc tries to make *reading* manual pages simple and hence
> does not require any fancy options or configuration for the -T utf8
> and -T html output modes, real typesetting to PostScript or PDF is
> unfortunately still somewhat tricky even in the 21st century and
> still requires configuring quite a bit of stuff...  :-(

This is just amazing, thank you for your time and such extensive
instructions!

Marcin
