PATCH: insert-unicode-char

zsh-workers

* PATCH: insert-unicode-char
@ 2005-08-18 15:43 Peter Stephenson
  2005-08-18 18:48 ` Oliver Kiddle
  2005-08-18 22:08 ` Nikolai Weibull
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Stephenson @ 2005-08-18 15:43 UTC (permalink / raw)
  To: Zsh hackers list

First go at a function to compose Unicode characters based on
two-character sequences.
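
To try the function once it is installed, something like the following
should do.  This is only a sketch: it assumes insert-unicode-char sits in
a directory listed in $fpath, and the ^Xk binding is an arbitrary choice
for illustration, not something the patch sets up.

```shell
# Assumes insert-unicode-char is in a directory listed in $fpath.
autoload -Uz insert-unicode-char
# Register the function as a zle widget of the same name.
zle -N insert-unicode-char
# ^Xk is an arbitrary example binding; pick any free key sequence.
bindkey '^Xk' insert-unicode-char
```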

Index: Doc/Zsh/contrib.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/contrib.yo,v
retrieving revision 1.43
diff -u -r1.43 contrib.yo
--- Doc/Zsh/contrib.yo	27 Jun 2005 10:22:51 -0000	1.43
+++ Doc/Zsh/contrib.yo	18 Aug 2005 15:39:59 -0000
@@ -668,6 +668,112 @@
 
 example(bindkey '^Xf' insert-files)
 )
+tindex(insert-unicode-char)
+item(tt(insert-unicode-char))(
+This function allows you to compose Unicode characters to be inserted
+into the command line.  The command is followed by two keys (there is
+no prompt), of which the first indicates the type of accent or special
+character, and the second indicates the base character.  Both input
+characters are always from the ASCII character set.  For best results
+zsh should have been built with support for multibyte characters
+(configured with tt(--enable-multibyte)).
+
+The character is converted from Unicode into the local representation and
+inserted into the command line at the cursor position.
+(The conversion is done within the shell, using whatever facilities
+the C library provides.)  With a numeric argument, the character and its
+code are previewed in the status line.
+
+The function may be run outside zle in which case it prints the character
+(together with a newline) to standard output.  Input is still read from
+keystrokes.
+
+The set of accented characters is reasonably complete up to U+0180, the
+set of special characters less so, and coverage mostly stops at that
+point.  Adding new Unicode characters is easy, however.  Please send any
+additions to tt(zsh-workers@sunsite.dk).
+
+The codes for the first character are as follows:
+startsitem()
+sitem(tt(`))(
+Grave accent.
+)
+sitem(tt('))(
+Acute accent.
+)
+sitem(tt(d))(
+Double acute accent (only supported on a few letters).
+)
+sitem(tt(^))(
+Circumflex.
+)
+sitem(tt(~))(
+Tilde.
+)
+sitem(tt("))(
+Diaeresis (Umlaut).
+)
+sitem(tt(o))(
+Circle over the base character.
+)
+sitem(tt(e))(
+Ligatures ending in e or E: tt(e A) gives AE, tt(e o) gives oe, etc.
+)
+sitem(tt(j))(
+Ligatures ending in j or J: ij or IJ.
+)
+sitem(tt(c))(
+Cedilla.
+)
+sitem(tt(/))(
+Stroke through the base character.
+)
+sitem(tt(-))(
+Macron.  (A horizontal bar over the base character.)
+)
+sitem(tt(u))(
+Breve.  (A shallow dish shape over the base character.)
+)
+sitem(tt(.))(
+Dot above the base character.
+)
+sitem(tt(:))(
+A dot in the middle plane of the base character.
+)
+sitem(tt(g))(
+Ogonek.  (A little forward facing hook at the bottom right
+of the character.  The "g" stands for "Ogonek" but another
+mnemonic is that g has a squiggle below the line.)
+)
+sitem(tt(v))(
+Caron.  (A little v over the letter.)
+)
+sitem(tt(s))(
+Used only as tt(s s), a German Eszett or "scharfes S" ligature.
+)
+sitem(tt(h))(
+Icelandic (or Runic) edh (tt(h d)) or thorn (tt(h t)).
+)
+sitem(tt(m))(
+Various mathematical characters: not (tt(m \)), multiply (tt(m *)), divide
+(tt(m /)), degree (tt(m o)), +/- (tt(m +)), superscripts 1, 2, 3 (tt(m 1),
+etc.), micro (tt(m u)), quarter (tt(m q)), half (tt(m h)), three quarters
+(tt(m t)).
+)
+sitem(tt(p))(
+Various punctuation and currency characters (any non-mathematical symbol
+that is not part of a word):  soft space (tt(p _)), inverted ! (tt(p !)),
+cent (tt(p %)), pound sign (tt(p l)) (think lira, librum), currency (tt(p
+$)), yen (tt(p y)), broken bar (tt(p |)), section sign (tt(p s)), lonely
+diaeresis (tt(p ")), copyright sign (tt(p C)), Spanish feminine marker
+(tt(p f)), left guillemet (tt(p <)), soft hyphen (tt(p h)), registered
+trade mark (tt(p R)), lonely macron (tt(p -)), lonely acute (tt(p ')),
+Pilcrow (paragraph) sign (tt(p p)), middle dot (tt(p :)),
+lonely cedilla (tt(p c)), Spanish masculine marker (tt(p m)), right
+guillemet (tt(p >)), inverted ? (tt(p ?)), Euro sign (tt(p e)).
+)
+endsitem()
+)
 tindex(narrow-to-region)
 tindex(narrow-to-region-invisible)
 xitem(tt(narrow-to-region [ -p) var(pre) tt(] [ -P) var(post) tt(]))
Index: Functions/Zle/insert-unicode-char
===================================================================
RCS file: Functions/Zle/insert-unicode-char
diff -N Functions/Zle/insert-unicode-char
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ Functions/Zle/insert-unicode-char	18 Aug 2005 15:40:00 -0000
@@ -0,0 +1,214 @@
+# Accented characters.  Inputs two keys: first the code for the accent, then
+# the base character being accented.  Note that all input characters are
+# ASCII.  For best results zsh should have been built with support for
+# multibyte characters (--enable-multibyte).
+#
+# Outputs the character converted from Unicode into the local representation.
+# (The conversion is done within the shell, using whatever facilities
+# the C library provides.)
+#
+# When used as a zle widget, the character is inserted at the cursor
+# position.  With a numeric argument, preview in status line; outside zle,
+# print character (and newline) to standard output.
+#
+# The set of accented characters is reasonably complete up to U+0180, the
+# set of special characters less so, and coverage mostly stops at that
+# point.  Adding new Unicode characters is easy, however.  Please send any
+# additions to zsh-workers@sunsite.dk .
+#
+# Some of the accent codes are a little more obscure than others.
+# Only the base character changes for upper case: A with circle is "o A".
+#  `   Grave
+#  '   Acute
+#  d   Double acute
+#  ^   Circumflex
+#  ~   Tilde
+#  "   Diaeresis (Umlaut)
+#  o   Circle
+#  e   Ligatures ending in e or E: e A gives AE, e o gives oe, etc.
+#  j   Ligatures ending in j or J: ij or IJ
+#  c   Cedilla
+#  /   Stroke through character
+#  -   Macron.  (A horizontal bar over the letter.)
+#  u   Breve.  (A shallow dish shape over the letter.)
+#  .   Dot above
+#  :   Middle dot
+#  g   Ogonek.  (A little forward facing hook at the bottom right
+#      of the character.  The "g" stands for "Ogonek" but another
+#      mnemonic is that g has a squiggle below the line.)
+#  v   Caron.  (A little v over the letter.)
+#  s   s s = Eszett (lower case only)
+#  h   Icelandic (or Runic) edh (h d) or thorn (h t)
+#  m   Mathematical: not (m \), multiply (m *), divide (m /), degree (m o),
+#      +/- (m +), superscripts 1, 2, 3 (m 1 etc.), micro (m u), quarter (m q),
+#      half (m h), three quarters (m t)
+#  p   Punctuation (and currency etc.): soft space (p _), inverted ! (p !),
+#      cent (p %), pound sign (p l) (think lira, librum), currency (p $),
+#      yen (p y), broken bar (p |), section (p s), lonely diaeresis (p "),
+#      copyright (p C), Spanish feminine marker (p f), left guillemet (p
+#      <), soft hyphen (p h), registered trade mark (p R), lonely macron (p
+#      -), lonely acute (p '), Pilcrow (paragraph) (p p), middle dot (p :),
+#      lonely cedilla (p c), Spanish masculine marker (p m), right
+#      guillemet (p >), inverted ? (p ?), Euro sign (p e).
+#
+
+emulate -LR zsh
+setopt cbases extendedglob printeightbit
+
+local accent basechar ochar error
+
+if [[ -n $WIDGET ]]; then
+  error=(zle -M)
+else
+  error=print
+fi
+
+if (( ${+zsh_accented_chars} == 0 )); then
+  # The associative array zsh_accented_chars is indexed by the
+  # accent.  The values are sets of character / Unicode pairs for
+  # the character with the given accent.  The Unicode value is
+  # a hex index with no base discriminator; essentially a UCS-4 index
+  # with the leading zeroes suppressed.
+  typeset -gA zsh_accented_chars
+
+  # grave
+  accent=\`
+  zsh_accented_chars[$accent]="\
+A C0 E C8 I CC O D2 U D9 a E0 e E8 i EC o F2 u F9 N 1F8 n 1F9 \
+"
+  # acute
+  accent=\'
+  zsh_accented_chars[$accent]="\
+A C1 E C9 I CD O D3 U DA Y DD a E1 e E9 i ED o F3 u FA y FD C 106 c 107 \
+L 139 l 13A N 143 n 144 R 154 r 155 S 15A s 15B Z 179 z 17A \
+"
+  # double acute
+  accent=d
+  zsh_accented_chars[$accent]="\
+O 150 o 151 U 170 u 171\
+"
+  # circumflex
+  accent=\^
+  zsh_accented_chars[$accent]="\
+A C2 E CA I CE O D4 U DB a E2 e EA i EE o F4 u FB C 108 c 109 G 11C g 11d \
+H 124 h 125 J 134 j 135 S 15C s 15D W 174 w 175 Y 176 y 177 \
+"
+  # tilde
+  accent=\~
+  zsh_accented_chars[$accent]="\
+A C3 N D1 O D5 a E3 n F1 o F5 I 128 i 129 U 168 u 169 \
+"
+  # diaeresis / Umlaut
+  accent=\"
+  zsh_accented_chars[$accent]="\
+A C4 E CB I CF O D6 U DC a E4 e EB i EF o F6 u FC y FF Y 178 \
+"
+  # ring above
+  accent=o
+  zsh_accented_chars[$accent]="\
+A C5 a E5 U 16E u 16F \
+"
+  # ligature with e or E
+  accent=e
+  zsh_accented_chars[$accent]="\
+A C6 a E6 O 152 o 153 \
+"
+  # ligature with j or J
+  accent=j
+  zsh_accented_chars[$accent]="\
+I 132 i 133\
+"
+  # cedilla
+  accent=c
+  zsh_accented_chars[$accent]="\
+C C7 c E7 G 122 g 123 K 136 k 137 L 13B l 13C N 145 n 146 R 156 r 157 \
+S 15E s 15F T 162 t 163 \
+"
+  # stroke through
+  accent=/
+  zsh_accented_chars[$accent]="\
+O D8 o F8 D 110 d 111 H 126 h 127 L 141 l 142 T 166 t 167 b 180 \
+"
+  # macron
+  accent=-
+  zsh_accented_chars[$accent]="\
+A 100 a 101 E 112 e 113 I 12a i 12b O 14C o 14D U 16A u 16B \
+"
+  # breve
+  accent=u
+  zsh_accented_chars[$accent]="\
+A 102 a 103 E 114 e 115 G 11E g 11F I 12C i 12D O 14E o 14F U 16C u 16D \
+"
+  # dot above
+  accent=.
+  zsh_accented_chars[$accent]="\
+C 10A c 10b E 116 e 117 G 120 g 121 I 130 i 131 Z 17B z 17C \
+"
+  # middle dot
+  accent=:
+  zsh_accented_chars[$accent]="\
+L 13F l 140 \
+"
+  # ogonek
+  accent=g
+  zsh_accented_chars[$accent]="\
+A 104 a 105 E 118 e 119 I 12E i 12F U 172 u 173 \
+"
+  # caron
+  accent=v
+  zsh_accented_chars[$accent]="\
+C 10C c 10D D 10E d 10F E 11A e 11B L 13D l 13E N 147 n 148 R 158 r 159 \
+S 160 s 161 T 164 t 165 Z 17D z 17E \
+"
+  # eszett
+  accent=s
+  zsh_accented_chars[$accent]="\
+s DF \
+"
+  # edh or thorn
+  accent=h
+  zsh_accented_chars[$accent]="\
+D D0 d F0 t FE \
+"
+  # mathematical
+  accent=m
+  zsh_accented_chars[$accent]="\
+\\ AC o B0 * D7 / F7 + B1 2 B2 3 B3 u B5 1 B9 q BC h BD t BE\
+"
+  # punctuation and currency
+  accent=p
+  zsh_accented_chars[$accent]="\
+_ A0 ! A1 % A2 l A3 $ A4 y A5 | A6 s A7 \" A8 C A9 f AA < AB \
+h AD R AE - AF ' B4 p B6 : B7 c B8 m BA > BB ? BF e 20AC \
+"
+fi
+
+read -k accent || return 1
+
+if [[ -z $zsh_accented_chars[$accent] ]]; then
+  $error "No accented characters with accent: $accent"
+  return 1
+fi
+
+local -A charmap
+charmap=(${=zsh_accented_chars[$accent]})
+
+read -k basechar || return 1
+
+if [[ -z $charmap[$basechar] ]]; then
+  $error "Accent $accent not available with character $basechar"
+  return 1
+fi
+
+if [[ -z $WIDGET ]]; then
+  [[ -t 1 ]] && print
+  print "\U${(l.8..0.)charmap[$basechar]}"
+else
+  ochar="$(print -n "\U${(l.8..0.)charmap[$basechar]}")"
+
+  if (( ${+NUMERIC} )); then
+    $error "Character ${(l.8..0.)charmap[$basechar]}: $ochar"
+  else
+    LBUFFER+=$ochar
+  fi
+fi

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070


**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.

**********************************************************************



* Re: PATCH: insert-unicode-char
  2005-08-18 15:43 PATCH: insert-unicode-char Peter Stephenson
@ 2005-08-18 18:48 ` Oliver Kiddle
  2005-08-18 20:08   ` Peter Stephenson
  2005-08-18 22:08 ` Nikolai Weibull
  1 sibling, 1 reply; 7+ messages in thread
From: Oliver Kiddle @ 2005-08-18 18:48 UTC (permalink / raw)
  To: Zsh hackers list

Peter wrote:
> First go at a function to compose Unicode characters based on
> two-character sequences.

Nice.

> Index: Functions/Zle/insert-unicode-char

Presumably this works perfectly well in non-unicode locales if the
resultant character is available. Given that, is insert-unicode-char (as
opposed to something like compose-char) the best name?

I'm sure you know this but for the benefit of anyone that doesn't, it's
possible to configure a compose key with xmodmap. I think the
applicable line (for the rightmost Windows key) is:
  keycode 117 = Multi_key

Oliver



* Re: PATCH: insert-unicode-char
  2005-08-18 18:48 ` Oliver Kiddle
@ 2005-08-18 20:08   ` Peter Stephenson
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Stephenson @ 2005-08-18 20:08 UTC (permalink / raw)
  To: Zsh hackers list

Oliver Kiddle wrote:
> > Index: Functions/Zle/insert-unicode-char
> 
> Presumably this works perfectly well in non-unicode locales if the
> resultant character is available.

Yes, it should do.

> Given that, is insert-unicode-char (as
> opposed to something like compose-char) the best name?

It's built around characters available in Unicode, and requires
conversion from UCS-4, so it's not irrelevant... but it's true that
users of ISO-8859-1 probably wouldn't realise they could use it for
accented characters.  (The manual entry hints but doesn't quite say.)
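
To make the UCS-4 point concrete: the table stores each code point as hex
with the leading zeroes suppressed, and the function pads it to eight
digits before handing it to the \U escape of print.  A rough sketch of
just the padding step follows.  It uses printf rather than the function's
own ${(l.8..0.)...} expansion, and the helper name pad_ucs4 is made up
for illustration.

```shell
# pad_ucs4 is a hypothetical helper, not part of the patch.
# Pad a hex code point (e.g. E0) to the eight digits that \U expects.
pad_ucs4() {
  printf '%08X\n' "0x$1"
}

pad_ucs4 E0      # LATIN SMALL LETTER A WITH GRAVE
pad_ucs4 20AC    # EURO SIGN
```

The padded value is what ends up after \U in the print call, so whether
the character can actually be shown depends on the current locale.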

> I'm sure you know this but for the benefit of anyone that doesn't, it's
> possible to configure a compose key with xmodmap. I think the
> applicable line (for the rightmost Windows key) is:
>   keycode 117 = Multi_key

That's the right menu key on my keyboard; the Windows key is 116.
I tend to think the less I have to do with xmodmap the better, but this
time it does seem to be fairly painless.
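
For anyone following along, a minimal sketch of the xmodmap step.  The
keycode is an assumption: 116 and 117 are only the values reported in
this thread, and they differ between keyboards and X servers, so check
with xev first.

```shell
# Run xev and press the key you want to use to find its keycode.
# 116 is only an example value; substitute the one xev reports.
xmodmap -e 'keycode 116 = Multi_key'
```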

-- 
Peter Stephenson <pws@pwstephenson.fsnet.co.uk>
Work: pws@csr.com
Web: http://www.pwstephenson.fsnet.co.uk


* Re: PATCH: insert-unicode-char
  2005-08-18 15:43 PATCH: insert-unicode-char Peter Stephenson
  2005-08-18 18:48 ` Oliver Kiddle
@ 2005-08-18 22:08 ` Nikolai Weibull
  2005-08-19  9:22   ` Peter Stephenson
  1 sibling, 1 reply; 7+ messages in thread
From: Nikolai Weibull @ 2005-08-18 22:08 UTC (permalink / raw)
  To: Zsh hackers list

Peter Stephenson wrote:

> +The codes for the first character are as follows:
⋮

Have you taken a look at RFC 1345?  It has a list of digraphs for quite
a few characters.  Vim uses it, I believe.
        nikolai

-- 
Nikolai Weibull: now available free of charge at http://bitwi.se/!
Born in Chicago, IL USA; currently residing in Gothenburg, Sweden.
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}



* Re: PATCH: insert-unicode-char
  2005-08-18 22:08 ` Nikolai Weibull
@ 2005-08-19  9:22   ` Peter Stephenson
  2005-08-19 15:31     ` Peter Stephenson
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Stephenson @ 2005-08-19  9:22 UTC (permalink / raw)
  To: Zsh hackers list

Nikolai Weibull wrote:
> Have you taken a look at RFC 1345?  It has a list of digraphs for quite
> a few characters.  Vim uses it, I believe.

Yes, it might be better to use that.  I'll play around further.

pws


* Re: PATCH: insert-unicode-char
  2005-08-19  9:22   ` Peter Stephenson
@ 2005-08-19 15:31     ` Peter Stephenson
  2005-08-19 17:17       ` Wayne Davison
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Stephenson @ 2005-08-19 15:31 UTC (permalink / raw)
  To: zsh-workers

This renames insert-unicode-char to insert-composed-char (which is a bit
clumsy but I stuck with "insert" since it's common to most widgets that
insert new text) and uses RFC 1345 for the characters.  Some of the
mnemonics for accents are a bit less obvious, such as ! for grave and ? for
tilde, since RFC 1345 tried to be conservative about what was available on
the keyboard.

The RQUOTE() macro is there because nroff does something funny with right
quotes, but fixing that seems to be beyond the reach of Yodl, so the macro
has no practical effect.

There's no attempt to compose letters from non-Latin alphabets at the
moment, although RFC 1345 makes that relatively straightforward.
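
As an illustration of the new ordering (base character first, accent code
second), here is a small lookup sketch.  It is not the code in the patch,
which keeps its table in the zsh_accented_chars associative array; the
helper name digraph_codepoint is hypothetical, and only a handful of the
codes are included.

```shell
# digraph_codepoint is a hypothetical helper for illustration only.
# Print the hex Unicode code point for a few RFC 1345 two-character codes.
digraph_codepoint() {
  case $1 in
    'A!') echo C0 ;;     # A with grave
    'a!') echo E0 ;;     # a with grave
    'n?') echo F1 ;;     # n with tilde
    'c,') echo E7 ;;     # c with cedilla
    'ss') echo DF ;;     # Eszett
    'Eu') echo 20AC ;;   # Euro (not in RFC 1345)
    *) echo "no such digraph: $1" >&2; return 1 ;;
  esac
}

digraph_codepoint 'n?'   # prints F1
```

The patterns are quoted because ! and ? would otherwise be open to
history expansion or treated as glob characters.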

Index: Doc/zmacros.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/zmacros.yo,v
retrieving revision 1.2
diff -u -r1.2 zmacros.yo
--- Doc/zmacros.yo	22 Sep 2004 04:37:02 -0000	1.2
+++ Doc/zmacros.yo	19 Aug 2005 15:15:31 -0000
@@ -17,6 +17,7 @@
   DEFINEMACRO(ARG1)(ARG2)(ARG3)\
 )
 
+DEFINEMACRO(RQUOTE)(0)(CHAR(39))
 DEFINEMACRO(LPAR)(0)(CHAR(40))
 DEFINEMACRO(RPAR)(0)(CHAR(41))
 DEFINEMACRO(PLUS)(0)(CHAR(43))
Index: Doc/Zsh/contrib.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/contrib.yo,v
retrieving revision 1.45
diff -u -r1.45 contrib.yo
--- Doc/Zsh/contrib.yo	18 Aug 2005 16:50:18 -0000	1.45
+++ Doc/Zsh/contrib.yo	19 Aug 2005 15:15:31 -0000
@@ -660,25 +660,23 @@
 
 example(bindkey '^Xi' incremental-complete-word)
 )
-tindex(insert-files)
-item(tt(insert-files))(
-This function allows you type a file pattern, and see the results of the
-expansion at each step.  When you hit return, all expansions are inserted
-into the command line.
-
-example(bindkey '^Xf' insert-files)
-)
-tindex(insert-unicode-char)
-item(tt(insert-unicode-char))(
-This function allows you to compose Unicode characters to be inserted
-into the command line.  The command is followed by two keys (there is
-no prompt), of which the first indicates the type of accent or special
-character, and the second indicates the base character.  Both input
-characters are always from the ASCII character set.  For best results
-zsh should have been built with support for multibyte characters
-(configured with tt(--enable-multibyte)).
+tindex(insert-composed-char)
+item(tt(insert-composed-char))(
+This function allows you to compose characters that don't appear on the
+keyboard to be inserted into the command line.  The command is followed by
+two keys corresponding to ASCII characters (there is no prompt).  For
+accented characters, the two keys are a base character followed by a code
+for the accent, while for other special characters the two characters
+together form a mnemonic for the character to be inserted.  The
+two-character codes are a subset of those given by RFC 1345 (see for
+example tt(http://www.faqs.org/rfcs/rfc1345.html)).
+
+For best results zsh should have been built with support for multibyte
+characters (configured with tt(--enable-multibyte)); however, the function
+works for the limited range of characters available in single-byte
+character sets such as ISO-8859-1.
 
-The character is converted from Unicode into the local representation and
+The character is converted into the local representation and
 inserted into the command line at the cursor position.
 (The conversion is done within the shell, using whatever facilities
 the C library provides.)  With a numeric argument, the character and its
@@ -688,91 +686,139 @@
 (together with a newline) to standard output.  Input is still read from
 keystrokes.
 
-The set of accented characters is reasonably complete up to U+0180, the
-set of special characters less so.  However, it mostly gives up at that
-point.  Adding new Unicode characters is easy, however.  Please send any
-additions to tt(zsh-workers@sunsite.dk).
+The set of accented characters is reasonably complete up to Unicode
+character U+0180, the set of special characters less so, and coverage
+mostly stops at that point.  Adding new characters is easy, however.
+Please send any additions to tt(zsh-workers@sunsite.dk).
 
-The codes for the first character are as follows:
+The codes for the second character when used to accent the first are as
+follows.  Note that not every character can take every accent.
 startsitem()
-sitem(tt(`))(
-Grave accent.
-)
-sitem(tt('))(
-Acute accent.
-)
-sitem(tt(d))(
-Double acute accent (only supported on a few letters).
-)
-sitem(tt(^))(
-Circumflex.
-)
-sitem(tt(~))(
-Tilde.
-)
-sitem(tt("))(
-Diaeresis (Umlaut).
-)
-sitem(tt(o))(
-Circle over the base character.
-)
-sitem(tt(e))(
-Ligatures ending in e or E: tt(e A) gives AE, tt(e o) gives oe, etc.
-)
-sitem(tt(j))(
-Ligatures ending in j or J: ij or IJ.
-)
-sitem(tt(c))(
-Cedilla.
-)
-sitem(tt(/))(
-Stroke through the base character.
-)
-sitem(tt(-))(
-Macron.  (A horizonal bar over the base character.)
-)
-sitem(tt(u))(
-Breve.  (A shallow dish shape over the base character.)
+sitem(tt(!))(Grave.)
+sitem(tt(RQUOTE()))(Acute.)
+sitem(tt(>))(Circumflex.)
+sitem(tt(?))(Tilde.  (This is not tt(~) as RFC 1345 does not assume that
+character is present on the keyboard.))
+sitem(tt(-))(Macron.  (A horizontal bar over the base character.))
+sitem(tt(LPAR()))(Breve.  (A shallow dish shape over the base character.))
+sitem(tt(.))(Dot above the base character, or in the case of tt(i) no dot,
+or in the case of tt(L) and tt(l) a centered dot.)
+sitem(tt(:))(Diaeresis (Umlaut).)
+sitem(tt(,))(Cedilla.)
+sitem(tt(_))(Underline; however, there are currently no underlined characters.)
+sitem(tt(/))(Stroke through the base character.)
+sitem(tt("))(Double acute (only supported on a few letters).)
+sitem(tt(;))(Ogonek.  (A little forward facing hook at the bottom
+right of the character.  The code tt(;) is the one RFC 1345 assigns
+to this accent.))
+sitem(tt(<))(Caron.  (A little v over the letter.))
+sitem(tt(0))(Circle over the base character.)
+sitem(tt(2))(Hook over the base character.)
+sitem(tt(9))(Horn over the base character.)
+endsitem()
+
+The following other two-character sequences are understood.
+
+startitem()
+item(ASCII characters)(
+These are already present on most keyboards:
+startsitem()
+sitem(tt(<LPAR()))(Left square bracket)
+sitem(tt(//))(Backslash (reverse solidus))
+sitem(tt(RPAR()>))(Right square bracket)
+sitem(tt(LPAR()!))(Left brace (curly bracket))
+sitem(tt(!!))(Vertical bar (pipe symbol))
+sitem(tt(!RPAR()))(Right brace (curly bracket))
+sitem(tt(RQUOTE()?))(Tilde)
+endsitem()
 )
-sitem(tt(.))(
-Dot above the base character
+item(Special letters)(
+Characters found in various variants of the Latin alphabet:
+startsitem()
+sitem(tt(ss))(Eszett (scharfes S))
+sitem(tt(D-), tt(d-))(Eth)
+sitem(tt(TH), tt(th))(Thorn)
+sitem(tt(kk))(Kra)
+sitem(tt(RQUOTE()n))(RQUOTE()n)
+sitem(tt(NG), tt(ng))(Ng)
+sitem(tt(OI), tt(oi))(Oi)
+sitem(tt(yr))(yr)
+sitem(tt(ED))(ezh)
+endsitem()
 )
-sitem(tt(:))(
-A dot in the middle plane of the base character
+item(Currency symbols)(
+startsitem()
+sitem(tt(Ct))(Cent)
+sitem(tt(Pd))(Pound sterling (also lira and others))
+sitem(tt(Cu))(Currency)
+sitem(tt(Ye))(Yen)
+sitem(tt(Eu))(Euro (N.B. not in RFC 1345))
+endsitem()
 )
-sitem(tt(g))(
-Ogonek.  (A little forward facing hook at the bottom right
-of the character.  The "g" stands for "Ogonek" but another
-mnemonic is that g has a squiggle below the line.)
+item(Punctuation characters)(
+References to "right" quotes indicate the shape (like a 9 rather than 6)
+rather than their grammatical use.  (For example, a "right" low double
+quote is used to open quotations in German.)
+startsitem()
+sitem(tt(!I))(Inverted exclamation mark)
+sitem(tt(BB))(Broken vertical bar)
+sitem(tt(SE))(Section)
+sitem(tt(Co))(Copyright)
+sitem(tt(-a))(Spanish feminine ordinal indicator)
+sitem(tt(<<))(Left guillemet)
+sitem(tt(--))(Soft hyphen)
+sitem(tt(Rg))(Registered trade mark)
+sitem(tt(PI))(Pilcrow (paragraph))
+sitem(tt(-o))(Spanish masculine ordinal indicator)
+sitem(tt(>>))(Right guillemet)
+sitem(tt(?I))(Inverted question mark)
+sitem(tt(RQUOTE()6))(Left single quote)
+sitem(tt(RQUOTE()9))(Right single quote)
+sitem(tt(.9))("Right" low quote)
+sitem(tt(9+RQUOTE()))(Reversed "right" quote)
+sitem(tt("6))(Left double quote)
+sitem(tt("9))(Right double quote)
+sitem(tt(:9))("Right" low double quote)
+sitem(tt(9"))(Reversed "right" double quote)
+sitem(tt(/-))(Dagger)
+sitem(tt(/=))(Double dagger)
+endsitem()
 )
-sitem(tt(v))(
-Caron.  (A little v over the letter.)
+item(Mathematical symbols)(
+startsitem()
+sitem(tt(DG))(Degree)
+sitem(tt(+-))(+/- sign)
+sitem(tt(2S))(Superscript 2)
+sitem(tt(3S))(Superscript 3)
+sitem(tt(1S))(Superscript 1)
+sitem(tt(My))(Micro)
+sitem(tt(.M))(Middle dot)
+sitem(tt(14))(Quarter)
+sitem(tt(12))(Half)
+sitem(tt(34))(Three quarters)
+sitem(tt(*X))(Multiplication)
+sitem(tt(-:))(Division)
+endsitem()
 )
-sitem(tt(s))(
-Used only as tt(s s), a german Eszett or "scharfes S" ligature.
+item(Accents on their own)(
+startsitem()
+sitem(tt(RQUOTE()>))(Circumflex (same as caret, tt(^)))
+sitem(tt(RQUOTE()!))(Grave (same as backtick, tt(`)))
+sitem(tt(RQUOTE(),))(Cedilla)
+sitem(tt(RQUOTE():))(Diaeresis (Umlaut))
+sitem(tt(RQUOTE()m))(Macron)
+sitem(tt(RQUOTE()RQUOTE()))(Acute)
+endsitem()
 )
-sitem(tt(h))(
-Icelandic (or Runic) edh (tt(h d)) or thorn (tt(h t)).
-)
-sitem(tt(m))(
-Various mathematical characters: not (tt(m \)), multiply (tt(m *)), divide
-(tt(m /)), degree (tt(m o)), +/- (tt(m +)), superscripts 1, 2, 3 (tt(m 1),
-etc.), micro (tt(m u)), quarter (tt(m q)), half (tt(m h)), three quarters
-(tt(m t)).
-)
-sitem(tt(p))(
-Various punctuation and currency characters (any non-mathematical symbol
-that is not part of a word):  soft space (tt(p _)), inverted ! (tt(p !)),
-cent (tt(p %)), pound sign (tt(p l)) (think lira, librum), currency (tt(p
-$)), yen (tt(p y)), broken bar (tt(p |)), section sign (tt(p s)), lonely
-diaeresis (tt(p ")), copyright sign (tt(p C)), Spanish feminine marker
-(tt(p f)), left guillemet (tt(p <)), soft hyphen (tt(p h)), registered
-trade mark (tt(p R)), lonely macron (tt(p -)), lonely acute (tt(p ')),
-Pilcrow (paragraph) sign (tt(p p)), middle dot (tt(p :)),
-lonely cedilla (tt(p c)), Spanish masculine marker (tt(p m)), right
-guillemet (tt(p >)), inverted ? (tt(p ?)), Euro sign (tt(p e)).
+enditem()
 )
-endsitem()
+tindex(insert-files)
+item(tt(insert-files))(
+This function allows you to type a file pattern, and see the results of the
+expansion at each step.  When you hit return, all expansions are inserted
+into the command line.
+
+example(bindkey '^Xf' insert-files)
 )
 tindex(narrow-to-region)
 tindex(narrow-to-region-invisible)
Index: Functions/Zle/.distfiles
===================================================================
RCS file: /cvsroot/zsh/zsh/Functions/Zle/.distfiles,v
retrieving revision 1.16
diff -u -r1.16 .distfiles
--- Functions/Zle/.distfiles	21 Jul 2005 13:45:27 -0000	1.16
+++ Functions/Zle/.distfiles	19 Aug 2005 15:15:31 -0000
@@ -1,19 +1,20 @@
 DISTFILES_SRC='
 .distfiles
-cycle-completion-positions  incarg                      predict-on
-edit-command-line           incremental-complete-word   smart-insert-last-word
-history-search-end          insert-files
-copy-earlier-word
-down-line-or-beginning-search
-up-line-or-beginning-search
-narrow-to-region narrow-to-region-invisible
-read-from-minibuffer replace-string
-backward-kill-word-match backward-word-match capitalize-word-match
-down-case-word-match forward-word-match kill-word-match
-match-words-by-style select-word-style transpose-words-match
-up-case-word-match
-delete-whole-word-match quote-and-complete-word url-quote-magic
-zed-set-file-name history-pattern-search
-keeper
-which-command
+backward-kill-word-match       backward-word-match
+capitalize-word-match          copy-earlier-word
+cycle-completion-positions     delete-whole-word-match
+down-case-word-match           down-line-or-beginning-search
+edit-command-line              forward-word-match
+history-pattern-search         history-search-end
+incarg                         incremental-complete-word
+insert-composed-char           insert-files
+keeper                         kill-word-match
+match-words-by-style           narrow-to-region
+narrow-to-region-invisible     predict-on
+quote-and-complete-word        read-from-minibuffer
+replace-string                 select-word-style
+smart-insert-last-word         transpose-words-match
+up-case-word-match             up-line-or-beginning-search
+url-quote-magic                which-command
+zed-set-file-name
 '
Index: Functions/Zle/insert-composed-char
===================================================================
RCS file: Functions/Zle/insert-composed-char
diff -N Functions/Zle/insert-composed-char
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ Functions/Zle/insert-composed-char	19 Aug 2005 15:15:31 -0000
@@ -0,0 +1,407 @@
+# Accented characters.  Inputs two keys.  There are two types: those
+# with a base character followed by an accent (see below for codes for
+# accents), and those with a two-character mnemonic for the composed
+# character.  These are (with the exception of the Euro) the codes
+# given by RFC 1345.  Note that some codes in RFC 1345 require three
+# characters to be input; none of these are handled.
+#
+# For best results zsh should have been built with support for
+# multibyte characters (--enable-multibyte), but single-byte character
+# sets also work.
+#
+# Outputs the character converted from Unicode into the local representation.
+# (The conversion is done within the shell, using whatever facilities
+# the C library provides.)
+#
+# When used as a zle widget, the character is inserted at the cursor
+# position.  With a numeric argument, preview in status line; outside zle,
+# print character (and newline) to standard output.
+#
+# The set of accented characters is reasonably complete up to U+0180, the
+# set of special characters less so, and coverage mostly stops at that
+# point.  Adding new Unicode characters is easy, however.  Please send any
+# additions to zsh-workers@sunsite.dk .
+#
+# Some of the accent codes are a little more obscure than others.
+#  !   Grave
+#  '   Acute
+#  >   Circumflex
+#  ?   Tilde
+#  -   Macron.  (A horizontal bar over the letter.)
+#  (   Breve.  (A shallow dish shape over the letter.)
+#  .   Dot above, or no dot with lower case i, or dot in the middle of L or l.
+#  :   Diaeresis (Umlaut)
+#  ,   Cedilla
+#  _   Underline (none of these currently)
+#  /   Stroke through character
+#  "   Double acute
+#  ;   Ogonek.  (A little forward facing hook at the bottom right
+#      of the character.)
+#  <   Caron.  (A little v over the letter.)
+#  0   Circle
+#  2   Hook
+#  9   Horn
+# Hence A! is upper case A with a grave, c, is lower case c with cedilla.
+#
+# Some other composed characters:
+# Various ligatures:
+#  AE ae OE oe IJ ij
+#
+# ASCII characters not on all keyboards:
+#  <(           [
+#  //           \
+#  )>           ]
+#  (!           {
+#  !!           |
+#  !)           }
+#  '?           ~
+#
+# Special letters:
+#  ss		Eszett (scharfes S)
+#  D- d- TH th  Eth and thorn
+#  kk           kra
+#  'n           'n
+#  NG ng        ng
+#  OI oi        OI
+#  yr           yr
+#  ED           ezh
+#
+# Currency symbols:
+#  Ct           Cent
+#  Pd           Pound sterling
+#  Cu           Currency
+#  Ye           Yen
+#  Eu           Euro (not in RFC 1345 but logical)
+#
+# Punctuation
+#  !I           Inverted !
+#  BB           Broken vertical bar
+#  SE           Section
+#  Co           Copyright
+#  -a           Spanish feminine ordinal indicator
+#  <<           Left guillemet
+#  --           Soft hyphen
+#  Rg           Registered trade mark
+#  PI           Pilcrow (paragraph)
+#  -o           Spanish masculine ordinal indicator
+#  >>           Right guillemet
+#  ?I           Inverted question mark
+#  '6           Left single quote
+#  '9           Right single quote
+#  .9           "Right" low quote
+#  9'           Reversed "right" quote
+#  "6           Left double quote
+#  "9           Right double quote
+#  :9           "Right" low double quote
+#  9"           Reversed "right" double quote
+#  /-           Dagger
+#  /=           Double dagger
+#
+# Mathematical
+#  DG           Degree
+#  +-           +/-
+#  2S           Superscript 2
+#  3S           Superscript 3
+#  My           Micro
+#  .M           Middle dot
+#  1S           Superscript 1
+#  14           Quarter
+#  12           Half
+#  34           Three quarters
+#  *X           Multiplication
+#  -:           Division
+#
+# Accents with no base character
+# '>            Circumflex (caret)
+# '!            Grave (backtick)
+# ',            Cedilla
+# ':            Diaeresis (Umlaut)
+# 'm            Macron
+# ''            Acute
+
+emulate -LR zsh
+setopt cbases extendedglob printeightbit
+
+local accent basechar ochar error
+
+if [[ -n $WIDGET ]]; then
+  error=(zle -M)
+else
+  error=print
+fi
+
+if (( ${+zsh_accented_chars} == 0 )); then
+  # The associative array zsh_accented_chars is indexed by the
+  # accent.  The values are sets of character / Unicode pairs for
+  # the character with the given accent.  The Unicode value is
+  # a hex index with no base discriminator; essentially a UCS-4 index
+  # with the leading zeroes suppressed.
+  typeset -gA zsh_accented_chars
+
+  # grave
+  accent=\!
+  zsh_accented_chars[$accent]="\
+A C0 E C8 I CC O D2 U D9 a E0 e E8 i EC o F2 u F9 N 1F8 n 1F9 \
+"
+  # acute
+  accent=\'
+  zsh_accented_chars[$accent]="\
+A C1 E C9 I CD O D3 U DA Y DD a E1 e E9 i ED o F3 u FA y FD C 106 c 107 \
+L 139 l 13A N 143 n 144 R 154 r 155 S 15A s 15B Z 179 z 17A \
+"
+  # circumflex
+  accent=\>
+  zsh_accented_chars[$accent]="\
+A C2 E CA I CE O D4 U DB a E2 e EA i EE o F4 u FB C 108 c 109 G 11C g 11d \
+H 124 h 125 J 134 j 135 S 15C s 15D W 174 w 175 Y 176 y 177 \
+"
+  # tilde
+  accent=\?
+  zsh_accented_chars[$accent]="\
+A C3 N D1 O D5 a E3 n F1 o F5 I 128 i 129 U 168 u 169 \
+"
+  # macron (d-, D- give eth)
+  accent=-
+  zsh_accented_chars[$accent]="\
+A 100 a 101 d F0 D D0 E 112 e 113 I 12a i 12b O 14C o 14D U 16A u 16B \
+"
+  # breve
+  accent=\(
+  zsh_accented_chars[$accent]="\
+A 102 a 103 E 114 e 115 G 11E g 11F I 12C i 12D O 14E o 14F U 16C u 16D \
+"
+  # dot above, small i with no dot, or l with middle dot
+  accent=.
+  zsh_accented_chars[$accent]="\
+C 10A c 10b E 116 e 117 G 120 g 121 I 130 i 131 L 13F l 140 Z 17B z 17C \
+"
+  # diaeresis / Umlaut
+  accent=:
+  zsh_accented_chars[$accent]="\
+A C4 E CB I CF O D6 U DC a E4 e EB i EF o F6 u FC y FF Y 178 \
+"
+  # cedilla
+  accent=,
+  zsh_accented_chars[$accent]="\
+C C7 c E7 G 122 g 123 K 136 k 137 L 13B l 13C N 145 n 146 R 156 r 157 \
+S 15E s 15F T 162 t 163 \
+"
+  # underline (_) would go here
+  # stroke through
+  accent=/
+  zsh_accented_chars[$accent]="\
+O D8 o F8 D 110 d 111 H 126 h 127 L 141 l 142 T 166 t 167 b 180 \
+"
+  # double acute
+  accent=\"
+  zsh_accented_chars[$accent]="\
+O 150 o 151 U 170 u 171\
+"
+  # ogonek
+  accent=\;
+  zsh_accented_chars[$accent]="\
+A 104 a 105 E 118 e 119 I 12E i 12F U 172 u 173 \
+"
+  # caron
+  accent=\<
+  zsh_accented_chars[$accent]="\
+C 10C c 10D D 10E d 10F E 11A e 11B L 13D l 13E N 147 n 148 R 158 r 159 \
+S 160 s 161 T 164 t 165 Z 17D z 17E \
+"
+  # ring above
+  accent=0
+  zsh_accented_chars[$accent]="\
+A C5 a E5 U 16E u 16F \
+"
+  # hook above
+  accent=2
+  zsh_accented_chars[$accent]="\
+A 1EA2 a 1EA3 E 1EBA e 1EBB \
+"
+  # horn, also right quotation marks
+  accent=9
+  zsh_accented_chars[$accent]="\
+O 1A0 o 1A1 U 1Af u 1b0 ' 2019 . 201A \" 201D : 201E \
+"
+  # left quotation marks
+  accent=6
+  zsh_accented_chars[$accent]="\
+' 2018 \" 201C \
+"
+  # reversed quotation marks for convenience
+  accent=\'
+  zsh_accented_chars[$accent]+=" \
+9 201B \
+"
+  accent=\"
+  zsh_accented_chars[$accent]+=" \
+9 201F \
+"
+
+  # ligature with E
+  accent=E
+  zsh_accented_chars[$accent]="\
+A C6 O 152 \
+"
+  # ligature with e
+  accent=e
+  zsh_accented_chars[$accent]="\
+a E6 o 153 \
+"
+  # ligature with J
+  accent=J
+  zsh_accented_chars[$accent]="\
+I 132 \
+"
+  # ligature with j
+  accent=j
+  zsh_accented_chars[$accent]="\
+i 133 \
+"
+  # eszett
+  accent=s
+  zsh_accented_chars[$accent]="\
+s DF \
+"
+  # upper case thorn
+  accent=H
+  zsh_accented_chars[$accent]="\
+T DE \
+"
+  # lower case thorn
+  accent=h
+  zsh_accented_chars[$accent]="\
+t FE \
+"
+
+  # Remaining characters are handled as separate pairs.
+  # We need to remember that the assoc array is keyed by the second character.
+  # Left square bracket
+  accent=\(
+  zsh_accented_chars[$accent]+=" < 5B"
+  # Reverse solidus (backslash to you and me).
+  accent=/
+  zsh_accented_chars[$accent]+=" / 5C"
+  # Right square bracket, circumflex
+  accent=\>
+  zsh_accented_chars[$accent]+=" ) 5D ' 5E"
+  # Grave accent
+  accent=\!
+  zsh_accented_chars[$accent]+=" ' 60"
+  # digraphs for (usually) standard characters {, |, }, ~
+  accent=\!
+  zsh_accented_chars[$accent]+=" ( 7B"
+  zsh_accented_chars[$accent]+=" ! 7C"
+  accent=\)
+  zsh_accented_chars[$accent]+=" ! 7D"
+  accent=\?
+  zsh_accented_chars[$accent]+=" ' 7E"
+  # non-breaking space
+  zsh_accented_chars[S]+=" N A0"
+  # inverted exclamation mark
+  zsh_accented_chars[I]+=" ! A1"
+  # cent
+  zsh_accented_chars[t]+=" C A2"
+  # pound sterling
+  zsh_accented_chars[d]+=" P A3"
+  # currency
+  zsh_accented_chars[u]+=" C A4"
+  # yen
+  zsh_accented_chars[e]+=" Y A5"
+  # broken bar
+  zsh_accented_chars[B]+=" B A6"
+  # section
+  zsh_accented_chars[E]+=" S A7"
+  # lonely diaeresis
+  zsh_accented_chars[:]+=" ' A8"
+  # copyright
+  zsh_accented_chars[o]+=" C A9"
+  # spanish feminine ordinal
+  zsh_accented_chars[a]+=" - AA"
+  # left guillemet
+  accent=\<
+  zsh_accented_chars[$accent]+=" < AB"
+  zsh_accented_chars[O]+=" N AC"
+  # soft hyphen
+  zsh_accented_chars[-]+=" - AD"
+  # registered
+  zsh_accented_chars[g]+=" R AE"
+  # lonely macron
+  zsh_accented_chars[m]+=" ' AF"
+  # degree
+  zsh_accented_chars[G]+=" D B0"
+  # +/-
+  zsh_accented_chars[-]+=" + B1"
+  # superscripts
+  zsh_accented_chars[S]+=" 2 B2 3 B3"
+  # lonely acute
+  accent=\'
+  zsh_accented_chars[$accent]+=" ' B4"
+  # micro
+  zsh_accented_chars[y]+=" M B5"
+  # pilcrow (paragraph)
+  zsh_accented_chars[I]+=" P B6"
+  # Middle dot
+  zsh_accented_chars[M]+=" . B7"
+  # Lonely cedilla
+  zsh_accented_chars[,]+=" ' B8"
+  # Superscript one
+  zsh_accented_chars[S]+=" 1 B9"
+  # spanish masculine ordinal
+  zsh_accented_chars[o]+=" - BA"
+  # right guillemet
+  accent=\>
+  zsh_accented_chars[$accent]+=" > BB"
+  # fractions
+  zsh_accented_chars[4]+=" 1 BC 3 BE"
+  zsh_accented_chars[2]+=" 1 BD"
+  # inverted question mark
+  zsh_accented_chars[I]+=" ? BF"
+  # multiplication
+  zsh_accented_chars[X]+=" * D7"
+  # division
+  zsh_accented_chars[:]+=" - F7"
+  # kra
+  zsh_accented_chars[k]+=" k 138"
+  # apostrophe n
+  zsh_accented_chars[n]+=" ' 149"
+  # Lappish ng
+  zsh_accented_chars[G]+=" N 14A"
+  zsh_accented_chars[g]+=" n 14B"
+  # OI
+  zsh_accented_chars[I]+=" O 1A2"
+  zsh_accented_chars[i]+=" o 1A3"
+  # yr
+  zsh_accented_chars[r]+=" y 1A6"
+  # ezh
+  zsh_accented_chars[D]+=" E 1B7"
+  # euro (I invented this but it's logical)
+  zsh_accented_chars[u]+=" E 20AC"
+  # dagger and double dagger
+  zsh_accented_chars[-]+=" / 2020"
+  zsh_accented_chars[=]+=" / 2021"
+fi
+
+read -k basechar || return 1
+read -k accent || return 1
+
+local -A charmap
+charmap=(${=zsh_accented_chars[$accent]})
+
+if [[ ${#charmap} -eq 0 || -z $charmap[$basechar] ]]; then
+  $error "Combination ${basechar}${accent} is not available."
+  return 1
+fi
+
+if [[ -z $WIDGET ]]; then
+  [[ -t 1 ]] && print
+  print "\U${(l.8..0.)charmap[$basechar]}"
+else
+  ochar="$(print -n "\U${(l.8..0.)charmap[$basechar]}")"
+
+  if (( ${+NUMERIC} )); then
+    $error "Character ${(l.8..0.)charmap[$basechar]}: $ochar"
+  else
+    LBUFFER+=$ochar
+  fi
+fi
Index: Functions/Zle/insert-unicode-char
===================================================================
RCS file: Functions/Zle/insert-unicode-char
diff -N Functions/Zle/insert-unicode-char
--- Functions/Zle/insert-unicode-char	18 Aug 2005 16:45:30 -0000	1.1
+++ /dev/null	1 Jan 1970 00:00:00 -0000
@@ -1,214 +0,0 @@
-# Accented characters.  Inputs two keys: first the code for the accent, then
-# the base character being accented.  Note that all input characters are
-# ASCII.  For best results zsh should have been built with support for
-# multibyte characters (--enable-multibyte).
-#
-# Outputs the character converted from Unicode into the local representation.
-# (The conversion is done within the shell, using whatever facilities
-# the C library provides.)
-#
-# When used as a zle widget, the character is inserted at the cursor
-# position.  With a numeric argument, preview in status line; outside zle,
-# print character (and newline) to standard output.
-#
-# The set of accented characters is reasonably complete up to U+0180, the
-# set of special characters less so.  However, it mostly gives up at that
-# point.  Adding new Unicode characters is easy, however.  Please send any
-# additions to zsh-workers@sunsite.dk .
-#
-# Some of the accent codes are a little more obscure than others.
-# Only the base character changes for upper case: A with circle is "o A".
-#  `   Grave
-#  '   Acute
-#  d   Double acute
-#  ^   Circumflex
-#  ~   Tilde
-#  "   Diaeresis (Umlaut)
-#  o   Circle
-#  e   Ligatures ending in e or E: e A gives AE, e o gives oe, etc.
-#  j   Ligatures ending in j or J: ij or IJ
-#  c   Cedilla
-#  /   Stroke through character
-#  -   Macron.  (A horizonal bar over the letter.)
-#  u   Breve.  (A shallow dish shape over the letter.)
-#  .   Dot above
-#  :   Middle dot
-#  g   Ogonek.  (A little forward facing hook at the bottom right
-#      of the character.  The "g" stands for "Ogonek" but another
-#      mnemonic is that g has a squiggle below the line.)
-#  v   Caron.  (A little v over the letter.)
-#  s   s s = Eszett (lower case only)
-#  h   Icelandic (or Runic) edh (h d) or thorn (h t)
-#  m   Mathematical: not (m \), multiply (m *), divide (m /), degree (m o),
-#      +/- (m +), superscripts 1, 2, 3 (m 1 etc.), micro (m u), quarter (m q),
-#      half (m h), three quarters (m t)
-#  p   Punctuation (and currency etc.): soft space (p _), inverted ! (p !),
-#      cent (p C), pound sign (p l) (think lira, librum), currency (p $),
-#      yen (p y), broken bar (p |), section (p s), lonely diaeresis (p "),
-#      copyright (p C), Spanish feminine marker (p f), left guillemet (p
-#      <), soft hyphen (p h), registered trade mark (p R), lonely macron (p
-#      -), lonely acute (p '), Pilcrow (paragraph) (p p), middle dot (p :),
-#      lonely cedilla (p c), Spanish masculine marker (p m), right
-#      guillemet (p >), inverted ? (p ?), Euro sign (p e).
-#
-
-emulate -LR zsh
-setopt cbases extendedglob printeightbit
-
-local accent basechar ochar error
-
-if [[ -n $WIDGET ]]; then
-  error=(zle -M)
-else
-  error=print
-fi
-
-if (( ${+zsh_accented_chars} == 0 )); then
-  # The associative array zsh_accent_chars is indexed by the
-  # accent.  The values are sets of character / Unicode pairs for
-  # the character with the given accent.  The Unicode value is
-  # a hex index with no base discriminator; essentially a UCS-4 index
-  # with the leading zeroes suppressed.
-  typeset -gA zsh_accented_chars
-
-  # grave
-  accent=\`
-  zsh_accented_chars[$accent]="\
-A C0 E C8 I CC O D2 U D9 a E0 e E8 i EC o F2 u F9 N 1F8 n 1F9 \
-"
-  # acute
-  accent=\'
-  zsh_accented_chars[$accent]="\
-A C1 E C9 I CD O D3 U DA Y DD a E1 e E9 i EC o F3 u FA y FD C 106 c 107 \
-L 139 l 13A N 143 n 144 R 154 r 155 S 15A s 15B Z 179 z 17A \
-"
-  # double acute
-  accent=d
-  zsh_accented_chars[$accent]="\
-O 150 o 151 U 170 u 171\
-"
-  # circumflex
-  accent=\^
-  zsh_accented_chars[$accent]="\
-A C2 E CA I CE O D4 U DB a E2 e EA i EE o F4 u FB C 108 c 109 G 11C g 11d \
-H 124 h 125 J 134 j 135 S 15C s 15D W 174 w 175 Y 176 y 177 \
-"
-  # tilde
-  accent=\~
-  zsh_accented_chars[$accent]="\
-A C3 E CB N D1 O D5 a E3 n F1 o F5 I 128 i 129 U 168 u 169 \
-"
-  # diaeresis / Umlaut
-  accent=\"
-  zsh_accented_chars[$accent]="\
-A C4 I CF O D6 U DC a E4 e EB i EF o F6 u FC y FF Y 178 \
-"
-  # ring above
-  accent=o
-  zsh_accented_chars[$accent]="\
-A C5 a E5 U 16E u 16F \
-"
-  # ligature with e or E
-  accent=e
-  zsh_accented_chars[$accent]="\
-A C6 a E6 O 152 o 153 \
-"
-  # ligature with j or J
-  accent=j
-  zsh_accented_chars[$accent]="\
-I 132 i 133\
-"
-  # cedilla
-  accent=c
-  zsh_accented_chars[$accent]="\
-C C7 c E7 G 122 g 123 K 136 k 137 L 13B l 13C N 145 n 146 R 156 r 157 \
-S 15E s 15F T 162 t 163 \
-"
-  # stroke through
-  accent=/
-  zsh_accented_chars[$accent]="\
-O D8 o F8 D 110 d 111 H 126 h 127 L 141 l 142 T 166 t 167 b 180 \
-"
-  # macron
-  accent=-
-  zsh_accented_chars[$accent]="\
-A 100 a 101 E 112 e 113 I 12a i 12b O 14C o 14D U 16A u 16B \
-"
-  # breve
-  accent=u
-  zsh_accented_chars[$accent]="\
-A 102 a 103 E 114 e 115 G 11E g 11F I 12C i 12D O 14E o 14F U 16C u 16D \
-"
-  # dot above
-  accent=.
-  zsh_accented_chars[$accent]="\
-C 10A c 10b E 116 e 117 G 120 g 121 I 130 i 131 Z 17B z 17C \
-"
-  # middle dot
-  accent=:
-  zsh_accented_chars[$accent]="\
-L 13F l 140 \
-"
-  # ogonek
-  accent=g
-  zsh_accented_chars[$accent]="\
-A 104 a 105 E 118 e 119 I 12E i 12F U 172 u 173 \
-"
-  # caron
-  accent=v
-  zsh_accented_chars[$accent]="\
-C 10C c 10D D 10E d 10F E 11A e 11B L 13D l 13E N 147 n 148 R 158 r 159 \
-S 160 s 161 T 164 t 165 Z 17D z 17E \
-"
-  # eszett
-  accent=s
-  zsh_accented_chars[$accent]="\
-s DF \
-"
-  # edh or thorn
-  accent=h
-  zsh_accented_chars[$accent]="\
-D D0 d F0 t FE \
-"
-  # mathematical
-  accent=m
-  zsh_accented_chars[$accent]="\
-\\ AC o B0 * D7 / F7 + B1 2 B2 3 B3 u B5 1 B9 q BC h BD t BE\
-"
-  # punctuation and currency
-  accent=p
-  zsh_accented_chars[$accent]="\
-_ A0 ! A1 C A2 l A3 $ A4 y A5 | A6 s A7 \" A8 C A9 f AA < AB \
-h AD R AE - AF ' B4 p B6 : B7 c B8 m BA > BB ? BF e 20AC \
-"
-fi
-
-read -k accent || return 1
-
-if [[ -z $zsh_accented_chars[$accent] ]]; then
-  $error "No accented characters with accent: $accent"
-  return 1
-fi
-
-local -A charmap
-charmap=(${=zsh_accented_chars[$accent]})
-
-read -k basechar
-
-if [[ -z $charmap[$basechar] ]]; then
-  $error "Accent $accent not available with character $basechar"
-  return 1
-fi
-
-if [[ -z $WIDGET ]]; then
-  [[ -t 1 ]] && print
-  print "\U${(l.8..0.)charmap[$basechar]}"
-else
-  ochar="$(print -n "\U${(l.8..0.)charmap[$basechar]}")"
-
-  if (( ${+NUMERIC} )); then
-    $error "Character ${(l.8..0.)charmap[$basechar]}: $ochar"
-  else
-    LBUFFER+=$ochar
-  fi
-fi
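
For anyone reading the patch without zsh to hand, the lookup that the new function performs (pick the pair list for the second key, find the first key in it, pad the code to eight hex digits for a \U escape) can be sketched roughly as below. This is portable sh rather than the zsh in the patch, and it is only an illustration: lookup_code is a hypothetical helper name, and printf's %08X stands in for the (l.8..0.) padding, so unlike the original it also upper-cases the digits.

```shell
# Hypothetical helper, not part of the patch: find BASE in a flat list of
# alternating "character hex-code" words and print the code padded to
# eight hex digits.
lookup_code() {
  base=$1
  shift
  # The remaining arguments are character / hex-code pairs.
  while [ "$#" -ge 2 ]; do
    if [ "$1" = "$base" ]; then
      printf '%08X\n' "0x$2"
      return 0
    fi
    shift 2
  done
  # No pair matched: the combination is not available.
  return 1
}

# Sample pairs copied from the start of the "grave" entry in the patch.
code=$(lookup_code A A C0 E C8 I CC O D2 U D9)
printf '%s\n' "$code"
```

With the sample pairs this prints 000000C0, the value that would follow \U for A with a grave. A base character that is not in the list makes the helper return non-zero, which corresponds to the "is not available" message in the function.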


-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070





* Re: PATCH: insert-unicode-char
  2005-08-19 15:31     ` Peter Stephenson
@ 2005-08-19 17:17       ` Wayne Davison
  0 siblings, 0 replies; 7+ messages in thread
From: Wayne Davison @ 2005-08-19 17:17 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 702 bytes --]

On Fri, Aug 19, 2005 at 04:31:31PM +0100, Peter Stephenson wrote:
> The RQUOTE() was because nroff is doing something funny with right
> quotes, but it seems this is beyond the reach of Yodl so it doesn't
> have any practical effect.

My version of yodl is patched to transform ' into \&' instead of \',
but it appears that yodl is not touching the ' in this context:

    sitem(tt('))(

I therefore patched the sed command in the Makefile.in to tweak the
resulting nroff source to add in the missing \& prefix.  Attached is a
patch.
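
The effect of the extra sed expression can be checked in isolation. The following is a small sketch, assuming yodl emits the bold apostrophe for tt(') as \fB'\fP; the variable names are made up for the demonstration and the input line is a hand-written sample, not real yodl output.

```shell
# One line of nroff as it might be emitted for tt('): a bold apostrophe.
input='\fB'\''\fP'

# The third sed expression from the patched Makefile.in rule, on its own.
output=$(printf '%s\n' "$input" | sed -e 's/\\fB'\''\\fP/\\fB\\\&'\''\\fP/')

printf '%s\n' "$output"
```

This prints \fB\&'\fP, i.e. the apostrophe now has the zero-width \& in front of it, so nroff no longer treats \' as an accent escape.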

I'll also attach a patch for anyone using a stock version of yodl (this
patch is derived from Debian, and some other distributions already
include it too).

..wayne..

[-- Attachment #2: apostrophe.patch --]
[-- Type: text/plain, Size: 588 bytes --]

--- Doc/Makefile.in	31 Jul 2005 18:04:35 -0000	1.27
+++ Doc/Makefile.in	19 Aug 2005 17:09:37 -0000
@@ -117,7 +117,8 @@ zsh.info: $(sdir)/zsh.texi
 	esac; \
 	case '$(YODL)' in :*) touch $$target ;; *) \
 	    echo $(YODL) -o $$target -I$(sdir) -w zman.yo version.yo $< ; \
-	    $(YODL) -I$(sdir) -w zman.yo version.yo $< | sed -e '1s/\\-/-/g' -e '/^\.'\''/d' > $$target \
+	    $(YODL) -I$(sdir) -w zman.yo version.yo $< | \
+	      sed -e '1s/\\-/-/g' -e '/^\.'\''/d' -e 's/\\fB'\''\\fP/\\fB\\\&'\''\\fP/' > $$target \
 	;; esac; \
 
 # These targets are no longer made by default, nor

[-- Attachment #3: yodl.patch --]
[-- Type: text/plain, Size: 299 bytes --]

--- yodl-1.31.18.orig/macros/chartables/man.tables.yo
+++ yodl-1.31.18/macros/chartables/man.tables.yo
@@ -3,7 +3,7 @@
 DEFINECHARTABLE(standard)(
     '\\'    = "\\e"
     '.'	    = "\\&."
-    '\''    = "\\'"
+    '\''    = "\\&'"
 )
 
 COMMENT(*** Accent characters, leading to empty chars. ***)


end of thread, other threads:[~2005-08-19 17:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-08-18 15:43 PATCH: insert-unicode-char Peter Stephenson
2005-08-18 18:48 ` Oliver Kiddle
2005-08-18 20:08   ` Peter Stephenson
2005-08-18 22:08 ` Nikolai Weibull
2005-08-19  9:22   ` Peter Stephenson
2005-08-19 15:31     ` Peter Stephenson
2005-08-19 17:17       ` Wayne Davison
