From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Anthony J. Bentley"
To: discuss@mandoc.bsd.lv
Subject: Re: HTML output: section headers with diacritics not in table of contents
Comments: In-reply-to Ingo Schwarze message dated "Fri, 25 Mar 2022 13:27:00 +0100."
X-Mailinglist: mandoc-discuss
Reply-To: discuss@mandoc.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <65790.1648225307.1@desktop.ajb.soy>
Date: Fri, 25 Mar 2022 10:21:48 -0600
Message-ID: <10474-1648225308.014815@KUMT.SLa5.YYhl>

Hi Ingo,

Ingo Schwarze writes:
> Maybe mandoc should treat any \[uXXXX] sequence as a letter for
> the purposes of tagging?  The code needed for that will look rather
> awkward though, and even when implemented perfectly, the tags will
> be UTF-8 rather than ASCII-encoded.  Would links like
>
>   https://man.archlinux.org/man/diff.1.de#%C3%9CBERSICHT
>
> really be all that useful?  What do people think?

There would be no need for Mandoc to percent-encode UTF-8 here.
In HTML5, a URL fragment (that is, the portion after the '#') may contain
unescaped "URL code points," which are:

"ASCII alphanumeric, U+0021 (!), U+0024 ($), U+0026 (&), U+0027 ('),
U+0028 LEFT PARENTHESIS, U+0029 RIGHT PARENTHESIS, U+002A (*), U+002B (+),
U+002C (,), U+002D (-), U+002E (.), U+002F (/), U+003A (:), U+003B (;),
U+003D (=), U+003F (?), U+0040 (@), U+005F (_), U+007E (~), and code
points in the range U+00A0 to U+10FFFD, inclusive, excluding surrogates
and noncharacters."

-- 
Anthony J. Bentley
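
For illustration only, the quoted definition of "URL code points" can be sketched as a small check. This is a rough sketch in Python, not mandoc code: the names is_noncharacter, is_url_code_point and fragment_needs_escaping are made up for this example, and the noncharacter ranges (U+FDD0 to U+FDEF, plus the last two code points of each plane) are taken from Unicode rather than from the text quoted above.

```python
# Punctuation explicitly listed as URL code points in the quoted definition.
_URL_PUNCTUATION = frozenset("!$&'()*+,-./:;=?@_~")


def is_noncharacter(cp: int) -> bool:
    """Unicode noncharacters: U+FDD0..U+FDEF and U+xFFFE/U+xFFFF in every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_url_code_point(ch: str) -> bool:
    """Hypothetical helper: True if the single character ch is a URL code point."""
    cp = ord(ch)
    if cp < 0x80:
        # ASCII: only alphanumerics and the listed punctuation qualify.
        return ch.isalnum() or ch in _URL_PUNCTUATION
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogates are excluded.
        return False
    return 0xA0 <= cp <= 0x10FFFD and not is_noncharacter(cp)


def fragment_needs_escaping(fragment: str) -> bool:
    """Hypothetical helper: True if any character is not a URL code point."""
    return not all(is_url_code_point(ch) for ch in fragment)
```

Under these assumptions, fragment_needs_escaping("ÜBERSICHT") is false, so the fragment from the example link could be written as #ÜBERSICHT without percent-encoding, whereas a fragment containing a space would still need escaping.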