From: Thangalin
Newsgroups: gmane.comp.tex.context
Subject: Straight Quotes / Curly Quotes
Date: Thu, 17 Jun 2021 13:28:25 -0700
To: mailing list for ConTeXt users
Reply-To: mailing list for ConTeXt users

I've written a Java-based lexer/parser that converts straight quotes to
curly quotes for English prose. It's a one-pass algorithm (O(n)) that uses
neither look-behind nor regex. Here's a list of test cases it handles:

https://raw.githubusercontent.com/DaveJarvis/keenquotes/main/lib/src/test/resources/com/keenwrite/quotes/smartypants.txt

A test harness converted several Project Gutenberg texts quite well. The
folks at PG may be interested in using it themselves to help convert quotes
in older texts en masse. The source code is MIT-licensed:

https://github.com/DaveJarvis/keenquotes/

The code should port to Lua fairly easily, should anyone be interested in
adding a straight/curly quotation mark conversion module to ConTeXt.
(Similar to the LaTeX package, but without using regex.)

Cheers everyone!
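
For anyone wondering what a single-pass, regex-free conversion looks like in
outline, here is a rough sketch. It is not the keenquotes algorithm, which
uses a proper lexer and parser and handles far more cases. The class name
QuoteSketch and the helpers opensAfter and curl are made up for this
illustration. The sketch assumes a quote opens when it follows whitespace,
an opening bracket, an em dash, or another opening quote, and closes (or is
an apostrophe) otherwise. It carries only the previously emitted character
as state, so it never re-scans the input.

```java
// Illustrative sketch only; NOT the keenquotes implementation.
class QuoteSketch {

    // Hypothetical rule: a quote that follows one of these characters opens.
    static boolean opensAfter(char prev) {
        return Character.isWhitespace(prev)
            || prev == '(' || prev == '[' || prev == '{'
            || prev == '\u2014'                       // em dash
            || prev == '\u201C' || prev == '\u2018';  // nested opening quotes
    }

    // Single left-to-right pass: O(n) time, one character of carried state.
    static String curl(String text) {
        StringBuilder out = new StringBuilder(text.length());
        char prev = ' '; // treat the start of the text like whitespace
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            char emitted = c;
            if (c == '"') {
                emitted = opensAfter(prev) ? '\u201C' : '\u201D';
            } else if (c == '\'') {
                // Closing single quote and apostrophe share U+2019.
                emitted = opensAfter(prev) ? '\u2018' : '\u2019';
            }
            out.append(emitted);
            prev = emitted; // remember what was emitted, not the raw input
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(curl("\"It's here,\" she said."));
    }
}
```

A rule this simple gets leading apostrophes wrong ('twas comes out with an
opening quote), as well as primes and many other cases; ambiguities like
these are presumably why the real project uses a lexer/parser instead.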