ntg-context - mailing list for ConTeXt users
* CSV scanners built in ConTeXt - feature or bug?
From: Jaroslav Hajtmar @ 2015-02-26  0:40 UTC
  To: mailing list for ConTeXt users

Hi ConTeXists,
A few days ago Hans pointed me to the built-in CSV splitter. I tested it,
and it will certainly come in handy for my needs. I found that if the CSV
file contains a blank line, the splitter stops processing the file at that
point (see my minimal example). I understand that a malformed file (e.g.
a different number of columns in different rows) may interrupt
processing, but I want to ask whether there is a way to process a CSV
file with blank lines through to the end of the file. I have noticed that
a file exported from Excel sometimes contains a blank line. Is stopping
at a blank line a feature of the built-in splitter, or is it a bug? Could
it be fixed, or could new functionality be added?

Thanks
Jaroslav Hajtmar



Here is a minimal example:

\starttext

\startluacode
local mycsvsplitter = utilities.parsers.rfc4180splitter{
     separator = ",",
     quote = '"',
}

local crap = io.loaddata("data.txt")

-- with header variant
local tablerows, columnname = mycsvsplitter(crap,true)
inspect(tablerows)
inspect(columnname)

-- without header variant
-- local tablerows = mycsvsplitter(crap)
-- inspect(tablerows)

for i=1,#tablerows do
     local l = tablerows[i]
     for j=1,#l do
          context(l[j]..", ")
     end
     context('\\crlf')
end

\stopluacode


\stoptext




% <-------------- the data.txt file starts here ---------------------->
first,second,third,fourth
1,"2","3","4"
"a","b","c","d"
"foo","bar""baz","boogie","xyzzy"
"    ","    ","    ","     "
"And now","followed by","several","blank lines"




"After several","empty rows","data continues","here"
11,"22","33","44"
"aa","bb","cc","dd"
% <-------------- the data.txt file ends here ---------------------->





___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________

* Re: CSV scanners built in ConTeXt - feature or bug?
From: Hans Hagen @ 2015-02-26  9:47 UTC
  To: hajtmar, mailing list for ConTeXt users

On 2/26/2015 1:40 AM, Jaroslav Hajtmar wrote:
> \starttext
>
> \startluacode
> local mycsvsplitter = utilities.parsers.rfc4180splitter{
>      separator = ",",
>      quote = '"',
> }
>
> local crap = io.loaddata("data.txt")
>
> -- with header variant
> local tablerows, columnname = mycsvsplitter(crap,true)
> inspect(tablerows)
> inspect(columnname)
>
> -- without header variant
> -- local tablerows = mycsvsplitter(crap)
> -- inspect(tablerows)
>
> for i=1,#tablerows do
>      local l = tablerows[i]
>       for j=1,#l do context(l[j]..", ")
>      end
>      context('\\crlf')
> end
>
> \stopluacode
>
>
> \stoptext

Changing line 527 in util-prs.lua to

     local wholeblob   = Ct((newline^(specification.strict and -1 or 1) * record)^0)

should do the trick.

I'm not sure what the default should be, as the standard might demand
quitting at an empty line, so that needs to be figured out (not by me,
therefore by you).
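
To see what the exponent in that line changes, here is a minimal LPeg sketch. It is not the code from util-prs.lua: `newline` and `record` below are simplified stand-ins (a record is just one non-empty line), `makesplitter` is a made-up name, and `require("lpeg")` is only needed when running outside ConTeXt. In LPeg, `patt^-1` matches at most one occurrence and `patt^1` matches one or more, so the first variant gives up in front of a blank line while the second skips over it.

```lua
-- Simplified illustration of the exponent in the patched line; assumes the
-- lpeg library is available (it is built into LuaTeX).
local lpeg = require("lpeg")
local P, C, Ct = lpeg.P, lpeg.C, lpeg.Ct

local newline = P("\r\n") + P("\n") + P("\r")
-- stand-in for the real record pattern: one non-empty line, captured as a string
local record  = C((1 - newline)^1)

-- hypothetical helper: at most one newline between records when strict,
-- one or more newlines otherwise
local function makesplitter(strict)
    return Ct(record * (newline^(strict and -1 or 1) * record)^0)
end

local data = "a,b\nc,d\n\n\ne,f\n"

local strictrows  = makesplitter(true):match(data)
local lenientrows = makesplitter(false):match(data)

print(#strictrows)   -- 2: matching stops in front of the blank lines
print(#lenientrows)  -- 3: the run of newlines is consumed as one separator
```
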

\starttext

\startluacode
local crap = [[
1,"2","3","4"
"a","b","c","d"
"foo","bar""baz","boogie","xyzzy"
"    ","    ","    ","     "
"And now","followed by","several","blank lines"


1,"2","3","4"
"a","b","c","d"
"foo","bar""baz","boogie","xyzzy"
"    ","    ","    ","     "
]]

local mycsvsplitter = {
     utilities.parsers.rfc4180splitter{
         separator = ",",
         quote = '"',
         strict = true,
     },
     utilities.parsers.rfc4180splitter{
         separator = ",",
         quote = '"',
     }
}

for i=1,#mycsvsplitter do
     local tablerows, columnname = mycsvsplitter[i](crap,true)
     context.formatted.title("Case %s",i)
     for i=1,#tablerows do
         local l = tablerows[i]
         for j=1,#l do
             context(l[j]..", ")
         end
         context('\\crlf')
     end
end
\stopluacode

\stoptext
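
If editing util-prs.lua is not an option, another possible approach is to drop the blank lines before the data reaches the splitter. The helper below is hypothetical (`dropblanklines` is not a ConTeXt function) and uses only plain Lua string functions. It assumes that no quoted field contains a line break, because it filters physical lines without looking at the quotes.

```lua
-- Hypothetical helper, not part of ConTeXt: remove blank lines from CSV text.
-- Assumption: quoted fields never contain line breaks.
local function dropblanklines(data)
    local kept = { }
    -- normalize CRLF and lone CR line endings to LF
    data = data:gsub("\r\n", "\n"):gsub("\r", "\n")
    -- walk over the lines and keep those with at least one non-space character
    for line in (data .. "\n"):gmatch("(.-)\n") do
        if line:find("%S") then
            kept[#kept + 1] = line
        end
    end
    return table.concat(kept, "\n")
end

local sample  = 'a,b\n\n\n"c","d"\r\n\r\ne,f\n'
local cleaned = dropblanklines(sample)
print(cleaned)
```

Inside \startluacode the cleaned string could then be handed to the splitter from the example above, for instance as mycsvsplitter(dropblanklines(crap),true).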


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: CSV scanners built in ConTeXt - feature or bug?
From: Alan BRASLAU @ 2015-02-26 11:20 UTC
  To: Hans Hagen; +Cc: mailing list for ConTeXt users

On Thu, 26 Feb 2015 10:47:29 +0100
Hans Hagen <pragma@wxs.nl> wrote:

> i'm not sure about the default as the standard might demand quit at 
> empty line so that needs to be figured out (not by me therefore by
> you)

Stopping at an empty line is a very MetaPost-like feature.

I don't have an opinion on what should be expected for CSV, but in MP
this allows one to pick off sets of data by successive reads from one
file.

Alan

* Re: CSV scanners built in ConTeXt - feature or bug?
From: Jaroslav Hajtmar @ 2015-02-26 13:26 UTC
  To: ntg-context

Hans and Alan,
Thanks for the replies. Now it works properly. I would like to ask
whether you are planning to fix util-prs.lua in a future release of
standalone ConTeXt.
As for the default, I would vote for the other option, i.e. the one that
does not stop at a blank line, which is not how it is set up now.
(Personally, I would prefer that strict=true meant processing all lines
of the CSV file and strict=false meant stopping at a blank line.) I will
accept either alternative, though, and will take this option into account
in my own library.
Alan describes this behavior as a MetaPost-like feature. Personally, I
think that a CSV file is basically a plain text file, and a blank line
has its place in it. The end of a text file is usually marked by an <eof>
character, so I see no reason to terminate processing before the file
really ends.

Jaroslav Hajtmar



On 26.2.2015 at 12:20, Alan BRASLAU wrote:
> On Thu, 26 Feb 2015 10:47:29 +0100
> Hans Hagen <pragma@wxs.nl> wrote:
>
>> i'm not sure about the default as the standard might demand quit at
>> empty line so that needs to be figured out (not by me therefore by
>> you)
> Stop on empty line is a very MetaPost-like feature.
>
> I don't have an opinion as to what should be expected for CSV, but
> MP thus allows one to pick-off sets of data by successive reads to one
> file.
>
> Alan
