ntg-context - mailing list for ConTeXt users
* Re: texutil "malformed UTF-8 character" error
       [not found] <20040823100001.BA8FF1277A@ronja.ntg.nl>
@ 2004-08-23 11:20 ` Duncan Hothersall
  2004-08-23 11:59   ` Hans Hagen
  0 siblings, 1 reply; 7+ messages in thread
From: Duncan Hothersall @ 2004-08-23 11:20 UTC


Taco said:

> On Mon, 23 Aug 2004 09:03:36 +0100, Duncan wrote:
> 
> 
>>>(i run perl 5.8.0)
>>
>>I'm running 5.8.0 too (on Redhat).
> 
> 
> I'd definitely try to get away from 5.8.0 as soon as possible. There were
> quite a lot of (sometimes rather serious) bugs in 5.8.0, esp. in the multibyte 
> handling, but also in other areas!

Right, interesting - have upgraded to Perl 5.8.5, and now, having 
removed the \def\testbytesequence{} from cont-new.tex, the error message 
"malformed UTF-8 character" has gone away! Quite possibly the RedHat 9 
install of Perl 5.8.0 had some default of UTF-8 mode set (though I 
couldn't see any evidence of it).

Thanks for this Taco. Still not producing any register output, but at 
least that's half the problem solved.

Duncan
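
One way a UTF-8 default could have crept in without visible evidence: Red Hat 9 shipped with a UTF-8 locale (LANG=en_US.UTF-8) by default, and Perl 5.8.0 switched its file handles to UTF-8 whenever the locale named a UTF-8 codeset, a behaviour that was dropped again in 5.8.1. Whether that is what happened on this machine is an assumption; the thread does not confirm it. A minimal sketch of the locale check, in Python rather than Perl, with the hypothetical helper name locale_implies_utf8:

```python
def locale_implies_utf8(env):
    """Return True if the POSIX locale variables in `env` select a UTF-8 codeset.

    `env` is any mapping of environment variable names to values.
    The helper name is hypothetical; it is not part of texutil or Perl.
    """
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # the first non-empty variable decides
            # Locale names look like language_TERRITORY.codeset@modifier.
            codeset = value.partition(".")[2].partition("@")[0]
            return codeset.lower().replace("-", "") == "utf8"
    return False
```

Calling locale_implies_utf8(dict(os.environ)) on the affected machine would show whether the locale was a candidate. Running texutil with LC_ALL=C in the environment would be the matching experiment under Perl 5.8.0.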


* Re: Re: texutil "malformed UTF-8 character" error
  2004-08-23 11:20 ` texutil "malformed UTF-8 character" error Duncan Hothersall
@ 2004-08-23 11:59   ` Hans Hagen
  0 siblings, 0 replies; 7+ messages in thread
From: Hans Hagen @ 2004-08-23 11:59 UTC


Duncan Hothersall wrote:

> Right, interesting - have upgraded to Perl 5.8.5, and now, having 
> removed the \def\testbytesequence{} from cont-new.tex, the error 
> message "malformed UTF-8 character" has gone away! Quite possibly the 
> RedHat 9 install of Perl 5.8.0 had some default of UTF-8 mode set 
> (though I couldn't see any evidence of it).

but still strange that ^^something triggers utf-8 

Hans 
 

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------


* Re: texutil "malformed UTF-8 character" error
       [not found] <20040823125611.DABA31278B@ronja.ntg.nl>
@ 2004-08-23 14:31 ` Duncan Hothersall
  0 siblings, 0 replies; 7+ messages in thread
From: Duncan Hothersall @ 2004-08-23 14:31 UTC


Ach.

I have fixed my second problem with missing register entries - I was 
using a header defined as a derivative of \chapter as the heading of my 
table of contents, rather than a derivative of \title which it should 
have been. Making that change fixed things.

Oddly, the output from texutil is still saying

       register entries : 0 -> 0 entries 0 references

despite the fact that a full table of contents is now showing up. One of 
the reasons I didn't find how to fix this earlier was that I was looking 
in the wrong place, and that is partly because that message made me 
assume that register entries were missing, when apparently they weren't.

Did I misunderstand what this message means, or is it a feedback bug?

Thanks for all the help,

Duncan


* Re: texutil "malformed UTF-8 character" error
       [not found] <20040823024808.07B181278C@ronja.ntg.nl>
@ 2004-08-23  8:03 ` Duncan Hothersall
  0 siblings, 0 replies; 7+ messages in thread
From: Duncan Hothersall @ 2004-08-23  8:03 UTC


Hans wrote:

> in cont-new (or cont-sys) you can say:
> 
> \def\testbytesequence{}
> 
> This 'test' was added in order to determine if tex runs in 8 bit 
> mode. I wonder where the Malformed message comes from. Since when is 
> perl utf-8 by default?
> 
> (i run perl 5.8.0)

I'm running 5.8.0 too (on Redhat).

I have added \def\testbytesequence{} to the end of cont-new and the 
errors go away - but unfortunately I'm still not getting any register 
output. I still get

       register entries : 0 -> 0 entries 0 references

although the .tui file is full of entries as far as I can tell. Maybe 
something else is wrong - still working on a minimal file.

Hans wrote:

> can you check your cp8bit.tcx file? it probably isn't 8 bit -) should
> be:
> 
> 0x00   0x00  %
> 0x01   0x01  %
> 0x02   0x02  %
> 0x03   0x03  %
> 0x04   0x04  %
> 
> etc

Well on my system that file (which is dated February 29 2000) starts 
like this

more /usr/TeX/texmf/web2c/cp8bit.tcx:

%% cp8bit.tcx: transparent encoding translation table for TeX
%% input:     any 8-bit text encoding
%% internal TeX: the same encoding (nothing changes, but teTeX will display
%%              8-bit messages on console and in logfile)
%% comment:     This is required in teTeX to see 8-bit messages at console and
%%              in logfile (they are displayed in ^^xx form by default).
%%              Usage: add
%%                %& --translate-file=cp8bit.tcx
%%              as a first line of your document.
%%
%%              Prepared by Alexander Bokovoy <bokovoy@minsk.lug.net>
%%              (1999) Public domain
0x80   0x80  %
0x81   0x81  %
0x82   0x82  %
0x83   0x83  %
0x84   0x84  %
0x85   0x85  %
0x86   0x86  %
0x87   0x87  %

etc.


And the first lines of my log file for the job are:

This is pdfeTeXk, Version 3.141592-1.11a-2.1 (Web2C 7.5.2) (format=cont-en 2004.8.22)  23 AUG 2004 08:32
entering extended mode
  %&-line parsing enabled.
  (/usr/TeX/texmf/web2c/cp8bit.tcx)


Thanks for any further insight!

Duncan
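
The difference between the two listings is that Hans expects the identity mapping to start at 0x00, while the installed file's visible entries start at 0x80. A rough way to see which byte values a .tcx file leaves without an identity mapping is sketched below in Python. The function names are hypothetical, and the parser assumes only the two-column "source target %" form quoted above; it does not handle any other TCX line format. Whether the missing low bytes actually matter for this problem is not established in the thread.

```python
def tcx_mappings(text):
    """Parse TCX text into a dict {source_byte: target_byte}.

    Assumes the two-column form shown in the message; everything
    after the first '%' on a line is treated as a comment.
    """
    mappings = {}
    for line in text.splitlines():
        fields = line.split("%", 1)[0].split()
        if len(fields) >= 2:
            # int(..., 0) accepts the 0x-prefixed hex used in the file.
            mappings[int(fields[0], 0)] = int(fields[1], 0)
    return mappings


def bytes_without_identity(mappings):
    """List byte values 0..255 that are not mapped to themselves."""
    return [b for b in range(256) if mappings.get(b) != b]
```

Reading /usr/TeX/texmf/web2c/cp8bit.tcx into a string and passing it through both functions would list every byte value the file does not map to itself. For a file covering only 0x80 upwards, that list would be 0x00 to 0x7F.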


* Re: texutil "malformed UTF-8 character" error
  2004-08-22 12:54 Duncan Hothersall
  2004-08-22 12:57 ` Duncan Hothersall
@ 2004-08-22 21:43 ` Hans Hagen Outside
  1 sibling, 0 replies; 7+ messages in thread
From: Hans Hagen Outside @ 2004-08-22 21:43 UTC


Duncan Hothersall wrote:

> I'm running the latest beta on top of a TeXlive 2003 install on linux. 
> The job I'm currently running needs various tables of contents (and a 
> set of bookmarks) but texutil (v. 8.2) seems to be choking on the .tui 
> file at the very end of the run with this message:
>
>  TeXUtil 8.2 - ConTeXt / PRAGMA ADE 1992-2004
>
>                 action : processing commands, lists and registers
>                 option : sorting IJ under Y
>                 option : converting high ASCII values
>             input file : nubs-rg-bk.tui
>            output file : nubs-rg-bk.tuo
> Malformed UTF-8 character (unexpected end of string) at 
> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 1520, <TUI> line 3.
> Malformed UTF-8 character (unexpected end of string) at 
> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 979, <TUI> line 3.
> Malformed UTF-8 character (unexpected end of string) at 
> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 990, <TUI> line 3.
>        passed commands : 1136
>          remapped keys : 0
>       register entries : 0 -> 0 entries 0 references
>        synonym entries : 0 -> 0 entries
>         embedded files : 1
>
>
> As you can see, as a result of the UTF-8 errors the .tui file isn't 
> being successfully read, so I'm getting no register entries and hence 
> no tables of contents (or bookmarks).
>
> I'm in the process of composing a minimal example file, but it's tough 
> going and I wondered if anyone could point me in the right direction 
> from the information in the error message.
>
> I'd be really grateful! (Impossible deadlines approach...)

in cont-new (or cont-sys) you can say:

\def\testbytesequence{}

This 'test' was added in order to determine if tex runs in 8 bit mode. I wonder where the Malformed message comes from. Since when is perl utf-8 by default? 

(i run perl 5.8.0) 


Hans 


-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------


* Re: texutil "malformed UTF-8 character" error
  2004-08-22 12:54 Duncan Hothersall
@ 2004-08-22 12:57 ` Duncan Hothersall
  2004-08-22 21:43 ` Hans Hagen Outside
  1 sibling, 0 replies; 7+ messages in thread
From: Duncan Hothersall @ 2004-08-22 12:57 UTC


I wrote:

> Malformed UTF-8 character (unexpected end of string) at 
> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 1520, <TUI> line 3.
etc.

Forgot to say, the .tui file in question has this at line 3:

c \thisisbytesequence{^^G^^[#}

which certainly does look a bit funny.

Duncan
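
For reference, "unexpected end of string" is the kind of complaint a UTF-8 decoder makes when a multi-byte lead byte is not followed by its continuation bytes. The sketch below reproduces that class of error in Python rather than Perl. The byte values are hypothetical stand-ins, since the thread only shows line 3 in caret notation and not its raw bytes, and the helper name utf8_problem is made up for illustration.

```python
def utf8_problem(data):
    """Return None if `data` is valid UTF-8, else a short description."""
    try:
        data.decode("utf-8")
        return None
    except UnicodeDecodeError as err:
        return "byte 0x%02x at offset %d: %s" % (
            data[err.start], err.start, err.reason)


# Plain ASCII control characters (what ^^G and ^^[ name in caret
# notation) are valid UTF-8 on their own.
ascii_controls = b"\x07\x1b#"

# Hypothetical high byte: 0xC3 starts a two-byte sequence, so a string
# that ends right after it is truncated from the decoder's point of view.
truncated = b"abc\xc3"

print(utf8_problem(ascii_controls))
print(utf8_problem(truncated))
```

If the bytes on line 3 really are 0x07 and 0x1B, they would pass a check like this, which suggests looking at the raw bytes of the .tui file (for example with a hex dump) rather than at the caret form a pager displays.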


* texutil "malformed UTF-8 character" error
@ 2004-08-22 12:54 Duncan Hothersall
  2004-08-22 12:57 ` Duncan Hothersall
  2004-08-22 21:43 ` Hans Hagen Outside
  0 siblings, 2 replies; 7+ messages in thread
From: Duncan Hothersall @ 2004-08-22 12:54 UTC


I'm running the latest beta on top of a TeXlive 2003 install on linux. 
The job I'm currently running needs various tables of contents (and a 
set of bookmarks) but texutil (v. 8.2) seems to be choking on the .tui 
file at the very end of the run with this message:

  TeXUtil 8.2 - ConTeXt / PRAGMA ADE 1992-2004

                 action : processing commands, lists and registers
                 option : sorting IJ under Y
                 option : converting high ASCII values
             input file : nubs-rg-bk.tui
            output file : nubs-rg-bk.tuo
Malformed UTF-8 character (unexpected end of string) at 
/usr/TeX/texmf/scripts/context/perl/texutil.pl line 1520, <TUI> line 3.
Malformed UTF-8 character (unexpected end of string) at 
/usr/TeX/texmf/scripts/context/perl/texutil.pl line 979, <TUI> line 3.
Malformed UTF-8 character (unexpected end of string) at 
/usr/TeX/texmf/scripts/context/perl/texutil.pl line 990, <TUI> line 3.
        passed commands : 1136
          remapped keys : 0
       register entries : 0 -> 0 entries 0 references
        synonym entries : 0 -> 0 entries
         embedded files : 1


As you can see, as a result of the UTF-8 errors the .tui file isn't 
being successfully read, so I'm getting no register entries and hence no 
tables of contents (or bookmarks).

I'm in the process of composing a minimal example file, but it's tough 
going and I wondered if anyone could point me in the right direction 
from the information in the error message.

I'd be really grateful! (Impossible deadlines approach...)

Thanks all.

Duncan

