From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/32657 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Julia Diaz Newsgroups: gmane.text.pandoc Subject: Re: What is the point of isBlockElement in the JATS reader? Date: Fri, 19 May 2023 15:22:28 -0700 (PDT) Message-ID: References: <87cz2wo61j.fsf@zeitkraut.de> Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_2128_1380546341.1684534948931" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="20501"; mail-complaints-to="usenet@ciao.gmane.io" To: pandoc-discuss Original-X-From: pandoc-discuss+bncBDGPDMUERAEBBJXNT6RQMGQE4CCM3FI-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Sat May 20 00:22:33 2023 Return-path: Envelope-to: gtp-pandoc-discuss@m.gmane-mx.org Original-Received: from mail-ot1-f63.google.com ([209.85.210.63]) by ciao.gmane.io with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1q08Ub-00057R-BC for gtp-pandoc-discuss@m.gmane-mx.org; Sat, 20 May 2023 00:22:33 +0200 Original-Received: by mail-ot1-f63.google.com with SMTP id 46e09a7af769-6ab0a992002sf1179226a34.2 for ; Fri, 19 May 2023 15:22:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlegroups.com; s=20221208; t=1684534952; x=1687126952; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:reply-to:x-original-sender :mime-version:subject:references:in-reply-to:message-id:to:from:date :sender:from:to:cc:subject:date:message-id:reply-to; bh=ECDTcdw66ryDOG1d4qQnuecDYCosEMMOKvmOh5eVheg=; b=CcHLUjgXjzYVeUGQizLwIQg8hvSnRl0ViCPWuvX3vrtWqkHDVC1sxNOa1fGSiNHr10 jboDi3/FDHybuiSbpzjvit5/7QlPl+e/v9BDkPgKBsTG6piLiEu4mo0zULFXY/diVkQ7 0c2AILNKC4mO2OQ25Ej2vlvuLEVGWe/Xu7Ho6d/v8M6j2KzSgEUrcZW6JlsRFL2rHo/N lXdm4vkjJMrTBl/n5Ezuk0g3v0Zhu0OI+A0mkNobWzHSjJU7dy/rTerYAWTP+G2h0kx7 
YLP47hQhS7awZUN90B1uYMYoDiUdkIeBjgGN8yn4+aDsXLkjtSyWDu9tbd0dvdzNLZwX kPHA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684534952; x=1687126952; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:reply-to:x-original-sender :mime-version:subject:references:in-reply-to:message-id:to:from:date :from:to:cc:subject:date:message-id:reply-to; bh=ECDTcdw66ryDOG1d4qQnuecDYCosEMMOKvmOh5eVheg=; b=gV9Uae/LM+2l3I8yGKFeHV/YjCvTvnZPgepD/jz1OBO+E3RPm4HUhXRnbyqzzcMF3p SbA+20k7ZjqkKjSNFQbQ4CTr9CI+NjRI9iNEOPvcZ3AcSsG19g5Ioi2yqq09Q0JZfFcq FCXLB/cktMUU0UoPikjShGHcAi6LCTW4KxnLJfbabhRuzCP9v9Vkm+GuEto+qG4WdzVV sTfsSH+ar7T+WP12D5mhNdpP/IjHBOF659ti/ggoPQQDR6jIGAE5sJ1uVOfKFEKX8/po YQjI3n6LaxrcR9L8ydJpuKrsZX9xvbNEVYW3+lAR/C4HaytWcxd2PI/afqqPe+SJSWYw X8Qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684534952; x=1687126952; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :x-spam-checked-in-group:list-id:mailing-list:precedence:reply-to :x-original-sender:mime-version:subject:references:in-reply-to :message-id:to:from:date:x-beenthere:x-gm-message-state:sender:from :to:cc:subject:date:message-id:reply-to; bh=ECDTcdw66ryDOG1d4qQnuecDYCosEMMOKvmOh5eVheg=; b=ixSAJzoZ3u4hR/7V00+5fMPfn884fj5ezPumZTXV6heiCFSAlyboyVTwP+mnNlwxeb vewxs9jKP4jCuRlrQb1stq6C7zMSBYCOG71LgFjZq06jYWQ7ReOhkJ0Br3WC/Vny4XIO N5ffJfuWxag/DmoIo1K+wpnmWvkX/b/9VBiaGrdHZuO0JH4DYpY8anZqyUGrc1ecGWKj Sr0f/iqmdDHSDFJAHapF+6N3CH6/gHp4qRqNjpWVedugyjDVSW8RmjaUwQR5CImLycPa eLDxA93ML/7Hda6CZum+46H2BjnRFbocdbYjlQTjgrva6tEwxX Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org X-Gm-Message-State: AC+VfDwt51GcFtBTbiySpdCGaUJEvzHnt96qSQsvUY7jQuydFacifpqa YmeQzRehGYFjqGYdmJ8dwgk= X-Google-Smtp-Source: ACHHUZ63YUrTNh6i/Q5oZNp7/p59i8bzuG4t1s1o/zn+Q/Ph+Y+VJqmrm+QiaBTbEyrJf3CmJ7bETw== X-Received: by 2002:a9d:6e1a:0:b0:6ab:894:7275 with SMTP id 
e26-20020a9d6e1a000000b006ab08947275mr855272otr.3.1684534952253; Fri, 19 May 2023 15:22:32 -0700 (PDT) X-BeenThere: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Original-Received: by 2002:a05:6870:822b:b0:187:a128:fe88 with SMTP id n43-20020a056870822b00b00187a128fe88ls285194oae.1.-pod-prod-07-us; Fri, 19 May 2023 15:22:29 -0700 (PDT) X-Received: by 2002:a05:6870:7a10:b0:192:ad93:b17f with SMTP id hf16-20020a0568707a1000b00192ad93b17fmr1325611oab.4.1684534949623; Fri, 19 May 2023 15:22:29 -0700 (PDT) In-Reply-To: <87cz2wo61j.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org> X-Original-Sender: julia.diaz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org Precedence: list Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org List-ID: X-Google-Group-Id: 1007024079513 List-Post: , List-Help: , List-Archive: , List-Unsubscribe: , Xref: news.gmane.io gmane.text.pandoc:32657 Archived-At:
------=_Part_2128_1380546341.1684534948931
Content-Type: multipart/alternative; boundary="----=_Part_2129_931210618.1684534948931"
------=_Part_2129_931210618.1684534948931
Content-Type: text/plain; charset="UTF-8"

On Friday, 19 May 2023 at 07:32:51 UTC-5 Albert Krewinkel wrote:

> The JATS reader is based on the DocBook reader, AFAIK, and reuses a good
> bit of the DocBook code. The list of block tags in the DocBook reader is
> much longer, so this is most likely a leftover that could be simplified.

Looks like legacy from DocBook indeed.

I just realised something else: as the JATS reader is written now, isBlockElement never returns True. This is because the only function that calls isBlockElement is parseMixed, which is only used for the case of "p", and by definition of the JATS models a "p" cannot itself contain an inner "p" element. Thus the only case that could possibly trigger a True result for isBlockElement is impossible.

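To make the argument concrete, here is a minimal toy sketch. It is not pandoc's actual code: the Node type, the heavily abbreviated tag lists, and the names blockTags, isBlockNode and splitMixed are all hypothetical. It assumes the predicate is built by subtracting the inline tags from the candidate block tags, and that the caller splits a paragraph's children at the first block-level node.

```haskell
-- Toy model of the argument; NOT pandoc's actual code.
-- All names and the abbreviated tag lists are illustrative assumptions.

-- Stand-in for an XML child node: an element (by tag name) or text.
data Node = Elem String | Text String
  deriving (Eq, Show)

-- Abbreviated, hypothetical tag lists.
paragraphLevel, other, inlineTags :: [String]
paragraphLevel = ["code", "fig", "graphic", "table-wrap"]
other          = ["p"]
-- Everything allowed inside "p" is declared inline here, including
-- the paragraph-level display elements.
inlineTags     = ["bold", "italic", "xref"] ++ paragraphLevel

-- Block tags are the candidates minus everything declared inline.
blockTags :: [String]
blockTags = filter (`notElem` inlineTags) (paragraphLevel ++ other)

-- True only for elements whose tag survived the subtraction.
isBlockNode :: Node -> Bool
isBlockNode (Elem tag) = tag `elem` blockTags
isBlockNode (Text _)   = False

-- Split a paragraph's children at the first block-level node.
splitMixed :: [Node] -> ([Node], [Node])
splitMixed = break isBlockNode
```

In this toy model blockTags reduces to ["p"], so for any valid paragraph content, for example [Text "see ", Elem "xref", Elem "fig"], splitMixed returns the whole list as its first component and an empty remainder. Only an Elem "p" child, which the JATS content model rules out, would leave anything in the second component.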
In other words, as it is written now, not only is isBlockElement pointless, parseMixed is too. Since isBlockElement always returns False, the rest is always empty, and lines 208-211 are never reached. So we could, in all confidence, always parse the full contents of "p" with just parseInline, as done here.

I think something got mixed up in the process when isBlockElement was adapted for the JATS reader. I could not help but notice that the inlinetags list in the isBlockElement function is almost identical, and in the exact same order, to the list of allowed contents of "p" in the JATS specification (the only ones missing are a few more recent elements, mostly Q&A elements, that presumably did not exist when the JATS reader was first written). The paragraphLevel list is also an exact copy of the "Paragraph-level Display Elements" sublist on the same JATS specification page. It makes no sense to me to define these separately only to filter them out immediately and inevitably, especially when no record is kept of which list the element in question belonged to, and only a context-less Boolean value is ever returned...

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/a71b20c8-7a6c-41ac-9af0-141b908111f7n%40googlegroups.com.
------=_Part_2129_931210618.1684534948931
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
------=_Part_2129_931210618.1684534948931-- ------=_Part_2128_1380546341.1684534948931--