9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] libbio and CR-LF
From: Saroj Mahapatra @ 2003-02-24 10:00 UTC
  To: 9fans

I have been thinking about adapting libbio for the Windows platform. There
are basically two approaches. One can remove CR-LF when reading from
the file, or one can keep the CR-LF and have Bget return a '\n' when it
sees CR-LF. Neither approach is perfect. In the first case, file
offset + icount is messed up. In the second case, Bunget will have to
change '\n' back to CR-LF, and it still cannot handle mixed line
terminators, CR-LF and CR (see 'getcsv' in "The Practice of Programming"
for an example). Leaving the handling of CR-LF to the application
programs is one option (though not a very pleasing one). Yet another
approach is to use programs like dos2unix and unix2dos. But if you are
dealing with a network protocol like HTTP or New York Stock Exchange
CMS format messages, dos2unix and unix2dos do not help. What do people
here think about this?
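
To make the second approach concrete, here is a minimal sketch in portable C
rather than Plan 9 C. The Reader type and rget are hypothetical names for
illustration, not libbio routines, and the input is an in-memory buffer so the
bookkeeping is easy to see. Folding CR-LF into '\n' makes the raw offset and
the count of delivered characters drift apart, and a lone CR still comes back
as itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reader over an in-memory buffer; not part of libbio. */
typedef struct {
	const char *buf;	/* raw bytes, as they appear in the file */
	size_t len;		/* number of raw bytes */
	size_t off;		/* raw offset of the next unread byte */
	size_t delivered;	/* characters handed to the caller so far */
} Reader;

/* Return the next character, folding CR-LF into '\n'; -1 at end of input. */
int
rget(Reader *r)
{
	int c;

	assert(r != NULL);
	if (r->off >= r->len)
		return -1;
	c = (unsigned char)r->buf[r->off++];
	if (c == '\r' && r->off < r->len && r->buf[r->off] == '\n') {
		r->off++;	/* consume the LF as well */
		c = '\n';
	}
	r->delivered++;
	return c;
}
```

After reading "a\r\n" the raw offset is 3 but only 2 characters have been
delivered. An unget of '\n' cannot tell from the character alone whether to
step back one byte or two, which is the difficulty described above.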

Thanks,
Saroj Mahapatra



* [9fans] Re: libbio and CR-LF
From: Douglas A. Gwyn @ 2003-02-24 10:54 UTC
  To: 9fans

Saroj Mahapatra wrote:
> I have been thinking about adapting libbio for the Windows platform. There
> are basically two approaches. One can remove CR-LF when reading from
> the file, ...

Not all Windows text files include CR.



* Re: [9fans] libbio and CR-LF
From: Russ Cox @ 2003-02-24 14:49 UTC
  To: 9fans

If the CR is there it should be reported.
Libbio should not do any funny conversions
like Windows stdio libraries traditionally have.
They confuse more than they help.

Russ




* Re: [9fans] libbio and CR-LF
From: Lucio De Re @ 2003-02-24 15:09 UTC
  To: 9fans

On Mon, Feb 24, 2003 at 09:49:22AM -0500, Russ Cox wrote:
>
> If the CR is there it should be reported.
> Libbio should not do any funny conversions
> like Windows stdio libraries traditionally have.
> They confuse more than they help.
>
Isn't the CP/M approach that when reading a "text" file, one treats
the <CR><LF> combination as end of line, where the <CR> is optional (I
seem to recall some software that used them in the reverse sequence,
mechanical typewriter-style)?

In other words, one can treat the <CR> as optional if it immediately
precedes <LF> for "text" files.  In fact, any number of <CR>s can be
discarded in such a situation.  <CR>s standing alone must either be
translated or treated as themselves.  Whatever is decided, it must be
cast in stone, of course.

That is, if a "text" mode can be specified.
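
As a rough illustration of the rule described above (discard any run of CRs
that immediately precedes an LF, leave stand-alone CRs untouched), here is a
small sketch in portable C. The name dropcrs is made up for this example, and
it works on a whole buffer in memory. A streaming version would also have to
decide what to do with a CR that falls at the very end of a chunk, which this
sketch simply keeps.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rewrite buf[0..n) in place, dropping every run of CRs that is
 * immediately followed by LF.  CRs anywhere else are kept as themselves.
 * Returns the new length.  (Hypothetical helper, not a libbio routine.)
 */
size_t
dropcrs(char *buf, size_t n)
{
	size_t in, out, run;

	assert(buf != NULL);
	in = 0;
	out = 0;
	while (in < n) {
		if (buf[in] != '\r') {
			buf[out++] = buf[in++];
			continue;
		}
		/* measure the run of CRs starting at buf[in] */
		run = 0;
		while (in + run < n && buf[in + run] == '\r')
			run++;
		if (in + run < n && buf[in + run] == '\n') {
			in += run;	/* discard the CRs; the LF is copied next */
		} else {
			for (; run > 0; run--)	/* keep stand-alone CRs */
				buf[out++] = buf[in++];
		}
	}
	return out;
}
```

Because out never overtakes in, the rewrite is safe to do in place.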

++L



* Re: [9fans] libbio and CR-LF
From: Russ Cox @ 2003-02-24 15:14 UTC
  To: 9fans

> Isn't the CP/M approach that when reading a "text" file, one treats
> the <CR><LF> combination as end of line, where the <CR> is optional (I
> seem to recall some software that used them in the reverse sequence,
> mechanical typewriter-style)?
>
> In other words, one can treat the <CR> as optional if it immediately
> precedes <LF> for "text" files.  In fact, any number of <CR>s can be
> discarded in such a situation.  <CR>s standing alone must either be
> translated or treated as themselves.  Whatever is decided, it must be
> cast in stone, of course.
>
> That is, if a "text" mode can be specified.

If the CR is there it should be reported.
Libbio should not do any funny conversions
like Windows stdio libraries traditionally have.
They confuse more than they help.




* Re: [9fans] libbio and CR-LF
From: Boyd Roberts @ 2003-02-24 15:27 UTC
  To: 9fans

Russ Cox wrote:
 > If the CR is there it should be reported.
 > Libbio should not do any funny conversions

In the general case the problem is intractable so
no special treatment should be done.  Should you
decide to do it then all the tools (that use it) will
have to be told what sort of 'file' it is.  Obviously,
this is a bad idea.

I built such functionality into a 'line' based i/o library
when it was faced with talking POP.  In this case the
application _knew_ it was going to be talking POP
and so it had sufficient context to make the right
decision.
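
The point about context can be sketched as follows. This is not the library
described above, only a guess at the shape of the idea in portable C, with
hypothetical names (Eol, contentlen). The caller states which convention the
stream uses, and the CR is stripped only when the caller has said the stream
is CR-LF terminated, as a POP client would.

```c
#include <assert.h>
#include <stddef.h>

/* Line-terminator convention, chosen by the caller (hypothetical names). */
enum Eol { EolLF, EolCRLF };

/*
 * Given a line of len bytes that ends in '\n', return the length of its
 * content with the terminator removed.  The CR is stripped only when the
 * caller has declared the stream to be CR-LF terminated.
 */
size_t
contentlen(const char *ln, size_t len, enum Eol eol)
{
	assert(ln != NULL && len > 0 && ln[len - 1] == '\n');
	len--;				/* drop the newline */
	if (eol == EolCRLF && len > 0 && ln[len - 1] == '\r')
		len--;			/* drop the CR of CR-LF */
	return len;
}
```

With EolLF the same bytes keep their CR, so nothing is guessed on behalf of
a caller that did not ask for it.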






* Re: [9fans] libbio and CR-LF
From: Lucio De Re @ 2003-02-24 15:29 UTC
  To: 9fans

On Mon, Feb 24, 2003 at 10:14:34AM -0500, Russ Cox wrote:
>
> If the CR is there it should be reported.
> Libbio should not do any funny conversions
> like Windows stdio libraries traditionally have.
> They confuse more than they help.

You're suggesting that Libbio for Windows should be consistent with
itself and not the underlying environment.  It's a subjective call,
but it makes sense in that only new code would use Libbio.

However, the programmer would have to jump through flaming hoops
to manage all files, as it can't be predicted which variety is
going to be encountered.  If the library can simplify this, I
believe it ought to.  Binary mode restores sanity, if so desired.

It's that, or a complicated shim, re-invented by each user of
Libbio.  At least in my opinion.

++L



* [9fans] CRLFication (was: libbio and CR-LF)
From: Geoff Collyer @ 2003-02-24 21:54 UTC
  To: 9fans

Coincidentally, I just wrote a bidirectional copy with optional
CRLFication.  I think it's mostly useful for letting shell scripts
talk ye olde Internet protocols, but pushing CRLFication out to the
edge of the system (to the interface with the network) seems better
than requiring programs to understand it natively.  Not that this
helps with Windows text files particularly.

I've also enclosed dial, a script that uses bicp to connect and copy
bidirectionally.  As its EXAMPLE shows, it can be used to print when
lp is broken.
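
For readers without a Plan 9 system, the outbound half of CRLFication (insert
a CR before each newline that lacks one) can be sketched in portable C. This
is only an illustration with a made-up name, crlfify, and it is not taken from
bicp, which works line by line through libbio. It operates on a single buffer,
so a CR that ended a previous buffer would not be seen.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src[0..n) to dst, inserting a CR before each LF that does not
 * already have one.  dst must have room for 2*n bytes.
 * Returns the number of bytes written.  (Illustrative helper only.)
 */
size_t
crlfify(char *dst, const char *src, size_t n)
{
	size_t i, out;

	assert(dst != NULL && src != NULL);
	out = 0;
	for (i = 0; i < n; i++) {
		if (src[i] == '\n' && (i == 0 || src[i - 1] != '\r'))
			dst[out++] = '\r';
		dst[out++] = src[i];
	}
	return out;
}
```

Existing CR-LF pairs pass through unchanged, which matches the "if there are
none" wording in the manual page in the bundle.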


# To unbundle, run this file
echo bicp
sed 's/^X//' >bicp <<'!'
X.TH BICP 1
X.SH NAME
bicp \- bidirectional copy with optional crlfication
X.SH SYNOPSIS
X.B bicp
X[
X.B -c
X]
command
X[ argument ... ]
X.SH DESCRIPTION
X.I Bicp
copies its standard input to a pipe to
X.IR command ,
and copies from that pipe to its standard output.
X.PP
Under
X.BR -c ,
X.I bicp
also deletes CRs from its standard input and
inserts CRs before newlines in its standard output if there are none.
This permits ordinary programs (including shell scripts)
to serve the usual lousy Internet protocols without
having to deal with their silly end-of-line conventions.
In effect, this pushes crlfication out to the interface with the network.
X.SH EXAMPLES
Implement the
X.I daytime
protocol.
X.EX
X.ti +0.5i
X{echo '#!/bin/rc'; echo exec bicp -c date} >/bin/service/tcp13
X.EE
X.\" .SH FILES
X.SH SEE ALSO
X.IR dial (1),
X.IR listen (8)
X.SH HISTORY
Written by Geoff Collyer.
X.SH BUGS
Arguably,
X.IR listen (8)
and
X.IR dial (2)
should provide crlfication as an optional service.
!
echo dial
sed 's/^X//' >dial <<'!'
X.TH DIAL 1
X.SH NAME
dial \- place an outgoing call and run a command on the resulting connection
X.SH SYNOPSIS
X.B dial
X[
X.B -x
X] [
X.B -p
X]
dialstring [ command ]
X.SH DESCRIPTION
X.I Dial
connects to
X.I dialstring
and runs
X.RI ` command
X.BR /net/ proto/conn /data ',
where
X.BR /net/ proto/conn
is the network directory of the resultant connection.
The default
X.I command
is
X.IR bicp (1).
X.B -x
uses
X.B /net.alt
instead of
X.BR /net .
X.B -p
retries (after a pause) until a connection is made.
X.PP
X.I Dial
and
X.I bicp
provide most of the machinery needed for a simple printer spooler.
X.SH EXAMPLES
X.EX
X# epsps [file...] - emergency printing system; print postscript on $LPDEST
aux/download -f -H/sys/lib/postscript/font -mfontmap -plw+ $* |
X	dial -p tcp!$LPDEST!printer
X.EE
X.SH FILES
X.B /net/tcp
subtree of TCP connections
X.SH SEE ALSO
X.IR bicp (1),
X.IR lp (1)
X.SH HISTORY
Written by Geoff Collyer.
!
# To unbundle, run this file
echo bicp.c
sed 's/^X//' >bicp.c <<'!'
X/*
X * bicp [-c] cmd [arg ...] - bidirectional copy
X *
X * copy from stdin to pipe-to-cmd, and simultaneously copy from
X * pipe-from-cmd to stdout.  Under -c, delete CRs so cmd doesn't see
X * them in its input and add CRs before newlines to its output.
X *
X * stdin & stdout are typically a network connection.  this is
X * intended to be like a streams line discipline.
X */

X#include <u.h>
X#include <libc.h>
X#include <bio.h>	/* too much context switching with byte-at-a-time i/o */

X#define	STREQ(a, b)	(*(a) == *(b) && strcmp((a), (b)) == 0)

enum {
X	Pcopy, Pcmd, Npipe,		// pipe indices
X};
enum {
X	Stdin, Stdout, Stderr,		// standard file descriptors
X};
enum {
X	Child,
X	Hackdelay = 500,		// in ms
X};

static int wasintr, crlfy;

static void
killmypg(void)
X{
X	int npf;
X	char fullnm[64];

X	snprint(fullnm, sizeof fullnm, "/proc/%d/notepg", getpid());
X	npf = open(fullnm, OWRITE);
X	if (npf >= 0) {
X		write(npf, "kill", 4);
X		close(npf);
X	}
X}

static void
pipeclose(int pipe[Npipe])
X{
X	close(pipe[Pcopy]);
X	close(pipe[Pcmd]);
X}

X/* chop line (ln) at CR of CRLF, if present, else at newline */
static void
chopateol(Biobuf *bp, char *ln)
X{
X	int len = Blinelen(bp);
X	char *eol = &ln[len - 1];		// points at newline

X	if (len >= 2 && eol[-1] == '\r')
X		--eol;
X	*eol = '\0';
X}

X// copy net → cmd, optionally deleting CRs
static void
copyfromnet(int net, int cmd[Npipe])
X{
X	char *ln;
X	Biobuf netbuf, *netb = &netbuf;

X	Binit(netb, net, OREAD);
X	while ((ln = Brdline(netb, '\n')) != nil && !wasintr) {
X		if (crlfy)
X			chopateol(netb, ln);
X		else
X			ln[Blinelen(netb) - 1] = '\0';
X		fprint(cmd[Pcopy], "%s\n", ln);
X	}
X	Bterm(netb);
X}

X// copy cmd → net, optionally inserting CRs before newlines
static void
copytonet(int net, int cmd[Npipe])
X{
X	char *ln;
X	Biobuf cmdbuf, *cmdb = &cmdbuf;

X	Binit(cmdb, cmd[Pcopy], OREAD);
X	while ((ln = Brdline(cmdb, '\n')) != nil) {
X		chopateol(cmdb, ln);
X		if (crlfy)
X			fprint(net, "%s\r\n", ln);
X		else
X			fprint(net, "%s\n", ln);
X	}
X	Bterm(cmdb);
X}

static void
docopy(void (*copy)(int, int [Npipe]), int net, int clfd, int cmd[Npipe])
X{
X	close(clfd);
X	close(Stderr);
X	close(cmd[Pcmd]);	// we aren't the command side
X	(*copy)(net, cmd);
X	close(cmd[Pcopy]);	// pass on the EOF to command
X	close(net);		//  and network
X	sleep(Hackdelay);	// wait for output to drain
X	killmypg();		// kill other copying process & cmd
X	exits(0);
X}

static void
forkcopy(void (*copy)(int, int [Npipe]), int net, int clfd, int cmd[Npipe])
X{
X	if (rfork(RFPROC|RFFDG|RFNOWAIT) == Child)
X		docopy(copy, net, clfd, cmd);
X}

void
notifyf(void *, char *s)
X{
X	if (STREQ(s, "interrupt")) {
X		wasintr++;
X		noted(NCONT);
X	}
X	killmypg();
X	noted(NDFLT);
X}

static void
execcmd(char **argv, int cmd[Npipe])
X{
X	char *prog;
X	char fullnm[256];

X	dup(cmd[Pcmd], Stdin);	// cross-connected pipe halves
X	dup(cmd[Pcmd], Stdout);
X	pipeclose(cmd);

X	prog = argv[0];
X	fullnm[0] = '\0';	// stays empty if prog is an absolute path
X	exec(prog, argv);
X	if (prog[0] != '/') {
X		snprint(fullnm, sizeof fullnm, "/bin/%s", prog);
X		exec(fullnm, argv);
X	}
X	killmypg();
X	sysfatal("can't exec %s nor %s: %r", prog, fullnm);
X}

void
main(int argc, char *argv[])
X{
X	int errflg = 0;
X	int cmd[Npipe];

X	ARGBEGIN {
X	case 'c':
X		++crlfy;
X		break;
X	default:
X		errflg++;
X		break;
X	} ARGEND
X	if (argc <= 0 || errflg)
X		sysfatal("usage: %s [-c] cmd [arg]...", argv0);

X	if (pipe(cmd) < 0)
X		sysfatal("pipe: %r");
X	// cmd[0] is the copying side; cmd[1] is the command side
X	rfork(RFNOTEG);

X	notify(notifyf);
X	forkcopy(copyfromnet, Stdin, Stdout, cmd);
X	if (fork() == Child)
X		execcmd(argv, cmd);
X	docopy(copytonet, Stdout, Stdin, cmd);
X}
!
# To unbundle, run this file
echo dial
sed 's/^X//' >dial <<'!'
X#!/bin/rc
X# dial [-x][-p] string [cmd] - place an outgoing call to string and
X#	run `cmd /net/.../data'.  -x uses /net.alt; -p persists.
rfork e
opt=''
persist=no
silent=no
while (test $#* -gt 0 && ~ $1 -*) {
X	switch ($1) {
X	case -x
X		opt=/net.alt/cs
X	case -p
X		persist=yes
X	case *
X		echo usage: $0 '[-x][-p] dialstr [cmd]' >[1=2]
X		exit usage
X	}
X	shift
X}
cmd=bicp		# dialcp
if (~ $#* 2)
X	cmd=$2

X# csquery needs an option to shut it up, at the very least.
X# prints /net/tcp/clone ip!port
conn=`{echo $1 | eval ndb/csquery $opt | sed 's/^> //'}
clone=$conn(1)		# clone file, opening it yields a ctl file
addr=$conn(2)
net=`{echo $clone | sed 's;/clone$;;'}

X# each open of $clone yields a distinct connection and rc lacks bourne's <>,
X# so get connection number from $clone, then open & write its ctl file.
X# apparently, closing the clone file drops the connection.
while () {
X	{
X		# the connection is reserved now
X		netdir=$net/`{read <[0=3]}
X		if (echo 'connect '$addr >$netdir/ctl >[2]/dev/null)
X			if (! ~ `{cat $netdir/remote} 0.0.0.0!0 ::!0)
X				# no ip v4 nor v6 failure?
X				# the connection will be established when
X				# $netdir/data is opened and can fail then.
X				eval exec $cmd $netdir/data
X		if (~ $persist yes) {
X			echo -n .
X			sleep 5
X		}
X		if not
X			exit 'conn refused'
X	} <[3]$clone
X}
!




Thread overview: 8+ messages
2003-02-24 10:00 [9fans] libbio and CR-LF Saroj Mahapatra
2003-02-24 10:54 ` [9fans] " Douglas A. Gwyn
2003-02-24 14:49 ` [9fans] " Russ Cox
2003-02-24 15:09   ` Lucio De Re
2003-02-24 15:14     ` Russ Cox
2003-02-24 15:27       ` Boyd Roberts
2003-02-24 15:29       ` Lucio De Re
2003-02-24 21:54 ` [9fans] CRLFication (was: libbio and CR-LF) Geoff Collyer
