From: Albert Krewinkel
Newsgroups: gmane.text.pandoc
Subject: Re: Status and quality of Org-reader
Date: Wed, 22 Mar 2023 17:04:39 +0100
Message-ID: <875yas68pc.fsf@zeitkraut.de>

Hi Christian,

c.buhtz-OA1p21XQzgd4Eiagz67IpQ@public.gmane.org writes:

> I'm maintainer of an org-to-html-converter application (I assume
> advertising it isn't suited here.) and do most of the parsing myself.
> But I'm at a point to think about refactoring the parsing or finding a
> better solution.

On the contrary, please do. I like to learn from other projects, and
feel that it's very important to acknowledge when there are tools that
do a better job at some conversions.

> Before I throw half of my code into the trash I would like to learn
> and hear more about the current status of the org-reader and how you
> rate it?

I'm obviously biased in my views, as I wrote most of the code for the
org reader. So take this with a grain of salt.

I believe we're doing a solid job, but it's nowhere near perfect. Org is
powerful and complex, and it's not always easy to match org concepts to
pandoc's way of handling things. Prime example: input handling that
depends on the output format, as seen in issue #5454.
https://github.com/jgm/pandoc/issues/5454

OTOH, the reader holds up quite well when it comes to #+OPTIONS handling
and metadata processing. Most markup-parsing is good, too.

The lack of a formal syntax definition and a constantly changing
reference implementation make org a moving target.

See also https://pandoc.org/org.html for an overview of what pandoc can
and cannot do.

> I was also looking into the bug tracker and only found some minor
> problems with org reading. All problems I found I can handle and
> workaround with my own code.

Some of these small things are quite hard due to the way pandoc works.
Other tickets would require just a little bit of time and could possibly
be fixed quickly; I'm not sure.

> Do you have any further suggestions about the org-reader part of
> pandoc? How many people in the pandoc project working on the
> org-reader part?

It's mostly me, with jgm fixing bugs and adding features there, too.
The org reader was the first real-life Haskell code that I wrote a
decade ago, but I tried to keep it maintainable. It's ok code IMHO.

Org is not a priority for me right now. I'd like to extract it into a
separate package some day, which could also help to fix a few issues,
but don't have any concrete plans yet.

Best,
Albert

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124