Gnus development mailing list
From: Hrvoje Niksic <hniksic@srce.hr>
Subject: Re: Cool bug in URL parsing
Date: 07 May 1998 17:25:08 +0200
Message-ID: <kigemy685nv.fsf@jagor.srce.hr>
In-Reply-To: Karl Kleinpaste's message of "07 May 1998 11:15:47 -0400"

Karl Kleinpaste <karl@jprc.com> writes:

> If you haven't stopped your setup from doing highlighting of URLs
> embedded in text, here's an entertaining glitch to see.
> 
> From the end of this line containing the sequence to start a "<URL:"
> everything will be highlighted as a supposed URL until, for example,
> some quoted text shows up to provide the terminator.
> 
> > Such as on this line here.
> 
> Methinks there's a regexp that gets a /little/ too aggressive...

What makes you think this is a bug?  According to rfc1738:

APPENDIX: Recommendations for URLs in Context

   URIs, including URLs, are intended to be transmitted through
   protocols which provide a context for their interpretation.

   In some cases, it will be necessary to distinguish URLs from other
   possible data structures in a syntactic structure. In this case, is
   recommended that URLs be preceeded with a prefix consisting of the
   characters "URL:". For example, this prefix may be used to
   distinguish URLs from other kinds of URIs.

   In addition, there are many occasions when URLs are included in other
   kinds of text; examples include electronic mail, USENET news
   messages, or printed on paper. In such cases, it is convenient to
   have a separate syntactic wrapper that delimits the URL and separates
   it from the rest of the text, and in particular from punctuation
   marks that might be mistaken for part of the URL. For this purpose,
   is recommended that angle brackets ("<" and ">"), along with the
   prefix "URL:", be used to delimit the boundaries of the URL.  This
   wrapper does not form part of the URL and should not be used in
   contexts in which delimiters are already specified.

   In the case where a fragment/anchor identifier is associated with a
   URL (following a "#"), the identifier would be placed within the
   brackets as well.

   In some cases, extra whitespace (spaces, linebreaks, tabs, etc.) may
   need to be added to break long URLs across lines.  The whitespace
   should be ignored when extracting the URL.

   No whitespace should be introduced after a hyphen ("-") character.
   Because some typesetters and printers may (erroneously) introduce a
   hyphen at the end of line when breaking a line, the interpreter of a
   URL containing a line break immediately after a hyphen should ignore
   all unencoded whitespace around the line break, and should be aware
   that the hyphen may or may not actually be part of the URL.

   Examples:

      Yes, Jim, I found it under <URL:ftp://info.cern.ch/pub/www/doc;
      type=d> but you can probably pick it up from <URL:ftp://ds.in
      ternic.net/rfc>.  Note the warning in <URL:http://ds.internic.
      net/instructions/overview.html#WARNING>.
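
The extraction rule quoted above (angle brackets plus the "URL:" prefix
delimit the URL, and whitespace inside the wrapper is ignored) can be
sketched roughly as follows.  This is only an illustration in Python, not
the regexp Gnus actually uses; the names WRAPPED_URL and
extract_wrapped_urls are made up for the example, and the special
handling of a hyphen at a line break is left out.

```python
import re

# Hypothetical pattern for the wrapper recommended in the RFC 1738 appendix:
# "<URL:" opens, the next ">" closes, and the body may span several lines.
WRAPPED_URL = re.compile(r"<URL:([^<>]*)>", re.IGNORECASE)


def extract_wrapped_urls(text):
    """Return the URLs found inside <URL:...> wrappers.

    Whitespace inside the wrapper (spaces, tabs, line breaks) is dropped,
    as the appendix says it should be ignored when extracting the URL.
    """
    return ["".join(m.group(1).split()) for m in WRAPPED_URL.finditer(text)]


if __name__ == "__main__":
    sample = (
        "Yes, Jim, I found it under <URL:ftp://info.cern.ch/pub/www/doc;\n"
        "      type=d> but you can probably pick it up from <URL:ftp://ds.in\n"
        "      ternic.net/rfc>.  Note the warning in <URL:http://ds.internic.\n"
        "      net/instructions/overview.html#WARNING>.\n"
    )
    for url in extract_wrapped_urls(sample):
        print(url)
```

Under this reading, an opening "<URL:" keeps consuming text until the
next ">" appears, which is the behaviour Karl describes: the ">" of a
later quoted line ends up serving as the terminator.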

-- 
Hrvoje Niksic <hniksic@srce.hr> | Student at FER Zagreb, Croatia
--------------------------------+--------------------------------
I'm a Lisp variable -- bind me!

