* texutil "malformed UTF-8 character" error
@ 2004-08-22 12:54 Duncan Hothersall
2004-08-22 12:57 ` Duncan Hothersall
2004-08-22 21:43 ` Hans Hagen Outside
0 siblings, 2 replies; 8+ messages in thread
From: Duncan Hothersall @ 2004-08-22 12:54 UTC (permalink / raw)
I'm running the latest beta on top of a TeXlive 2003 install on linux.
The job I'm currently running needs various tables of contents (and a
set of bookmarks) but texutil (v. 8.2) seems to be choking on the .tui
file at the very end of the run with this message:
TeXUtil 8.2 - ConTeXt / PRAGMA ADE 1992-2004
action : processing commands, lists and registers
option : sorting IJ under Y
option : converting high ASCII values
input file : nubs-rg-bk.tui
output file : nubs-rg-bk.tuo
Malformed UTF-8 character (unexpected end of string) at
/usr/TeX/texmf/scripts/context/perl/texutil.pl line 1520, <TUI> line 3.
Malformed UTF-8 character (unexpected end of string) at
/usr/TeX/texmf/scripts/context/perl/texutil.pl line 979, <TUI> line 3.
Malformed UTF-8 character (unexpected end of string) at
/usr/TeX/texmf/scripts/context/perl/texutil.pl line 990, <TUI> line 3.
passed commands : 1136
remapped keys : 0
register entries : 0 -> 0 entries 0 references
synonym entries : 0 -> 0 entries
embedded files : 1
As you can see, as a result of the UTF-8 errors the .tui file isn't
being successfully read, so I'm getting no register entries and hence no
tables of contents (or bookmarks).
I'm in the process of composing a minimal example file, but it's tough
going and I wondered if anyone could point me in the right direction
from the information in the error message.
I'd be really grateful! (Impossible deadlines approach...)
Thanks all.
Duncan
* Re: texutil "malformed UTF-8 character" error
2004-08-22 12:54 texutil "malformed UTF-8 character" error Duncan Hothersall
@ 2004-08-22 12:57 ` Duncan Hothersall
2004-08-22 21:45 ` Hans Hagen Outside
2004-08-22 21:43 ` Hans Hagen Outside
1 sibling, 1 reply; 8+ messages in thread
From: Duncan Hothersall @ 2004-08-22 12:57 UTC (permalink / raw)
I wrote:
> Malformed UTF-8 character (unexpected end of string) at
> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 1520, <TUI> line 3.
etc.
Forgot to say, the .tui file in question has this at line 3:
c \thisisbytesequence{^^G^^[#}
which certainly does look a bit funny.
Duncan
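A quick way to see which lines of a .tui file would upset a reader that expects UTF-8 is to try decoding each line separately. The following is only a minimal sketch under that assumption; the helper name `find_non_utf8_lines` is made up for illustration and is not part of texutil.

```python
def find_non_utf8_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            # A truncated or stray multi-byte sequence lands here.
            bad.append((number, line))
    return bad


# Hypothetical usage (the file name is the one from the log above):
# with open("nubs-rg-bk.tui", "rb") as handle:
#     for number, line in find_non_utf8_lines(handle.read()):
#         print(f"line {number}: {line!r}")
```

Reading the file in binary mode matters here: opening it as text would let Python's own default encoding decide what counts as malformed.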
* Re: texutil "malformed UTF-8 character" error
2004-08-22 12:54 texutil "malformed UTF-8 character" error Duncan Hothersall
2004-08-22 12:57 ` Duncan Hothersall
@ 2004-08-22 21:43 ` Hans Hagen Outside
1 sibling, 0 replies; 8+ messages in thread
From: Hans Hagen Outside @ 2004-08-22 21:43 UTC (permalink / raw)
Duncan Hothersall wrote:
> I'm running the latest beta on top of a TeXlive 2003 install on linux.
> The job I'm currently running needs various tables of contents (and a
> set of bookmarks) but texutil (v. 8.2) seems to be choking on the .tui
> file at the very end of the run with this message:
>
> TeXUtil 8.2 - ConTeXt / PRAGMA ADE 1992-2004
>
> action : processing commands, lists and registers
> option : sorting IJ under Y
> option : converting high ASCII values
> input file : nubs-rg-bk.tui
> output file : nubs-rg-bk.tuo
> Malformed UTF-8 character (unexpected end of string) at
> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 1520, <TUI> line 3.
> Malformed UTF-8 character (unexpected end of string) at
> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 979, <TUI> line 3.
> Malformed UTF-8 character (unexpected end of string) at
> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 990, <TUI> line 3.
> passed commands : 1136
> remapped keys : 0
> register entries : 0 -> 0 entries 0 references
> synonym entries : 0 -> 0 entries
> embedded files : 1
>
>
> As you can see, as a result of the UTF-8 errors the .tui file isn't
> being successfully read, so I'm getting no register entries and hence
> no tables of contents (or bookmarks).
>
> I'm in the process of composing a minimal example file, but it's tough
> going and I wondered if anyone could point me in the right direction
> from the information in the error message.
>
> I'd be really grateful! (Impossible deadlines approach...)
in cont-new (or cont-sys) you can say:
\def\testbytesequence{}
This 'test' was added in order to determine if tex runs in 8 bit mode. I wonder where the Malformed message comes from. Since when is perl utf-8 by default?
(i run perl 5.8.0)
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------
* Re: Re: texutil "malformed UTF-8 character" error
2004-08-22 12:57 ` Duncan Hothersall
@ 2004-08-22 21:45 ` Hans Hagen Outside
0 siblings, 0 replies; 8+ messages in thread
From: Hans Hagen Outside @ 2004-08-22 21:45 UTC (permalink / raw)
Duncan Hothersall wrote:
> I wrote:
>
>> Malformed UTF-8 character (unexpected end of string) at
>> /usr/TeX/texmf/scripts/context/perl/texutil.pl line 1520, <TUI> line 3.
>
> etc.
>
> Forgot to say, the .tui file in question has this at line 3:
>
> c \thisisbytesequence{^^G^^[#}
>
> which certainly does look a bit funny.
can you check your cp8bit.tcx file? it probably isn't 8 bit -) should be:
0x00 0x00 %
0x01 0x01 %
0x02 0x02 %
0x03 0x03 %
0x04 0x04 %
etc
Hans
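To check a translation file against the identity table shown above, one could parse each mapping line and report every pair whose two bytes differ. This is a rough sketch that assumes all mapping lines have the `0xNN 0xNN` shape of the excerpt and that anything else (comments, blank lines) can be skipped. The function name is hypothetical.

```python
import re

# Matches lines shaped like the excerpt, e.g. "0x00 0x00 %".
MAPPING = re.compile(r"^\s*0x([0-9A-Fa-f]{1,2})\s+0x([0-9A-Fa-f]{1,2})\b")


def non_identity_mappings(text: str):
    """Return (source, target) byte pairs for mapping lines that are not identities."""
    mismatches = []
    for line in text.splitlines():
        match = MAPPING.match(line)
        if match is None:
            continue  # comment, blank line, or anything not shaped like a mapping
        source, target = (int(group, 16) for group in match.groups())
        if source != target:
            mismatches.append((source, target))
    return mismatches


# Hypothetical usage (bytes are mapped 1:1 through latin-1 so nothing is rejected):
# with open("cp8bit.tcx", encoding="latin-1") as handle:
#     print(non_identity_mappings(handle.read()))
```

An empty result would suggest that the lines present in the file map each byte to itself. It says nothing about bytes the file leaves out.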
* Re: Re: texutil "malformed UTF-8 character" error
2004-08-23 11:20 ` Duncan Hothersall
@ 2004-08-23 11:59 ` Hans Hagen
0 siblings, 0 replies; 8+ messages in thread
From: Hans Hagen @ 2004-08-23 11:59 UTC (permalink / raw)
Duncan Hothersall wrote:
> Right, interesting - have upgraded to Perl 5.8.5, and now, having
> removed the \def\testbytesequence{} from cont-new.tex, the error
> message "malformed UTF-8 character" has gone away! Quite possibly the
> RedHat 9 install of Perl 5.8.0 had some default of UTF-8 mode set
> (though I couldn't see any evidence of it).
but still strange that ^^something triggers utf-8
Hans
* Re: Re: texutil "malformed UTF-8 character" error
[not found] <4129C7B2.2010503@capdm.com>
@ 2004-08-23 10:36 ` Duncan Hothersall
0 siblings, 0 replies; 8+ messages in thread
From: Duncan Hothersall @ 2004-08-23 10:36 UTC (permalink / raw)
> > although the .tui file is full of entries as far as I can tell. Maybe
> > something else is wrong - still working on minimal file.
>
> \starttext
>
> \placelist[chapter][criterium=text]
> \placeindex[criterium=text]
>
> \chapter{test}
>
> \index{test} test
>
> \stoptext
>
>
> can you send me the tui file?
>
> Hans
This works fine - registers gain entries and are output correctly, and
the .tui file has in it:
c \thisissectionseparator{:}
c \thisisutilityversion{2003.07.19}
c \thisisbytesequence{}
f b {tester}
c \mainreference{}{index:t}{2::0:0:0:0:0:0:0::1}{1}{}
c \initializevariable\usedcolorchannels{}
c \listentry{chapter}{1}{1}{test}{2::0:1:0:0:0:0:0::2}{2}
r e {index} {2} {} {test} {2::0:1:0:0:0:0:0::2} {2}
f e {tester}
c \initializevariable\lastpage{2}
c \initializevariable\lastpagenumber{2}
c \initializevariable\totalnofMPgraphics{0}
c \initializevariable\totalnofpositions{0}
c \initializevariable\totalnofparbackgrounds{0}
c \initializevariable\currentstrategypass{1}
Must be something I'm doing in the main file. I will continue to try to
produce a minimal non-working example. Thanks for all the help thus far.
Duncan
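When comparing a working .tui file with a failing one, a tally of the leading tags can show at a glance whether register lines are present at all. The sketch below assumes, purely from the sample above, that each line starts with a one-letter tag (`c`, `f`, `r`) and that `f` and `r` lines carry a second one-letter field. The meaning of those tags is not confirmed here, and `count_tui_tags` is a made-up name.

```python
from collections import Counter


def count_tui_tags(text: str) -> Counter:
    """Count .tui lines by leading tag, keeping 'r e' / 'f b' style pairs together."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        tag = fields[0]
        # Assumption from the sample: 'r' and 'f' lines have a one-letter subtype.
        if tag in ("r", "f") and len(fields) > 1 and len(fields[1]) == 1:
            tag = f"{tag} {fields[1]}"
        counts[tag] += 1
    return counts


# Hypothetical usage:
# with open("nubs-rg-bk.tui", encoding="latin-1") as handle:
#     print(count_tui_tags(handle.read()))
```

If the failing file shows plenty of `r e` lines while texutil still reports zero register entries, that would point at the reading step rather than at the document source.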
* Re: Re: texutil "malformed UTF-8 character" error
2004-08-23 8:03 ` Duncan Hothersall
2004-08-23 8:26 ` Taco Hoekwater
@ 2004-08-23 10:11 ` Hans Hagen
1 sibling, 0 replies; 8+ messages in thread
From: Hans Hagen @ 2004-08-23 10:11 UTC (permalink / raw)
Duncan Hothersall wrote:
> although the .tui file is full of entries as far as I can tell. Maybe
> something else is wrong - still working on minimal file.
\starttext
\placelist[chapter][criterium=text]
\placeindex[criterium=text]
\chapter{test}
\index{test} test
\stoptext
can you send me the tui file?
Hans
* Re: Re: texutil "malformed UTF-8 character" error
2004-08-23 8:03 ` Duncan Hothersall
@ 2004-08-23 8:26 ` Taco Hoekwater
2004-08-23 10:11 ` Hans Hagen
1 sibling, 0 replies; 8+ messages in thread
From: Taco Hoekwater @ 2004-08-23 8:26 UTC (permalink / raw)
Hi,
On Mon, 23 Aug 2004 09:03:36 +0100, Duncan wrote:
> > (i run perl 5.8.0)
>
> I'm running 5.8.0 too (on Redhat).
I'd definitely try to get away from 5.8.0 as soon as possible. There were
quite a lot of (sometimes rather serious) bugs in 5.8.0, esp. in the multibyte
handling, but also in other areas!
And you should also check if perl runs in UTF-8 mode by default. From the man
page:
You can enable automatic UTF-8-ification of your standard file handles,
default "open()" layer, and @ARGV by using either the "-C" command line
switch or the "PERL_UNICODE" environment variable, see perlrun for the
documentation of the "-C" switch.
Good luck, Taco
--
groeten,
Taco
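The check for a default UTF-8 mode can be started from the environment. The sketch below reports `PERL_UNICODE` if it is set at all (even to an empty string) and, as an extra assumption not stated in the man page excerpt, also flags locale variables that name UTF-8, since a UTF-8 locale is one plausible way for a distribution's Perl to end up in that mode. `utf8_hints` is a hypothetical helper name.

```python
import os


def utf8_hints(environ=None):
    """Return environment settings that might switch Perl's I/O to UTF-8."""
    environ = os.environ if environ is None else environ
    hints = {}
    # PERL_UNICODE is the variable named in the man page excerpt.
    if environ.get("PERL_UNICODE") is not None:
        hints["PERL_UNICODE"] = environ["PERL_UNICODE"]
    # Assumption: a UTF-8 locale may matter too, so report those as well.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(name, "")
        if "utf-8" in value.lower() or "utf8" in value.lower():
            hints[name] = value
    return hints


# Hypothetical usage: print(utf8_hints())
```

An empty dictionary does not prove Perl runs in byte mode; a `-C` switch on the script's command line or shebang line would not show up here.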
end of thread, other threads:[~2004-08-23 11:59 UTC | newest]
2004-08-22 12:54 texutil "malformed UTF-8 character" error Duncan Hothersall
2004-08-22 12:57 ` Duncan Hothersall
2004-08-22 21:45 ` Hans Hagen Outside
2004-08-22 21:43 ` Hans Hagen Outside
[not found] <20040823024808.07B181278C@ronja.ntg.nl>
2004-08-23 8:03 ` Duncan Hothersall
2004-08-23 8:26 ` Taco Hoekwater
2004-08-23 10:11 ` Hans Hagen
[not found] <4129C7B2.2010503@capdm.com>
2004-08-23 10:36 ` Duncan Hothersall
[not found] <20040823100001.BA8FF1277A@ronja.ntg.nl>
2004-08-23 11:20 ` Duncan Hothersall
2004-08-23 11:59 ` Hans Hagen