Computer Old Farts Forum
From: jpl.jpl at gmail.com (John P. Linderman)
Subject: [COFF] Was [TUHS] tabs vs spaces - entab, detab
Date: Thu, 11 Mar 2021 15:02:11 -0500
Message-ID: <CAC0cEp9GVsYbjYhsYk2Hjjj90FxYFAia2Luy_vg854NTrV3Hww@mail.gmail.com>

The tab/detab horse was still twitching, so I decided to beat it a little
more.

Doug's claim that tabs saving space was an urban legend didn't ring true,
but betting against Doug is a good way to get poor quick. So I tossed
together a perl script (version run through col -x is at the end of this
note) to measure savings. A simpler script just counted tabs,
distinguishing leading tabs, which I expected to be very common, from
embedded tabs, which I expected to be rare. In retrospect, embedded tabs
are common in (my) C code, separating structure types from the element
names and trailing comments. As Norman pointed out, genuine tabs often
preserve line to line alignment in the presence of small changes. So the
fancier script distinguishes between leading tabs and embedded tabs for
various possible tab stops. Small tab stops keep heavily indented code
lines short; large tab stops can save more space when tabbing past leading
blanks. My coding style uses a "shiftwidth" of 4, which vi turns into spaces
or tabs, with "standard" tabs every 8 columns. My code therefore benefits
most with tabstops every 4 columns. A lot of code is indented 4 spaces,
which saves 3 bytes when replaced by a tab, but there is no saving with
tabstops at 8. Here's the output when run on itself (before it was
detabbed) and on a largish C program:

  /home/jpl/bin/tabsave.pl /home/jpl/bin/tabsave.pl rsort.c
/home/jpl/bin/tabsave.pl, size 1876
  2: Leading 202, Embedded 3, Total 205
  4: Leading 303, Embedded 4, Total 307
  8: Leading 238, Embedded 5, Total 243

rsort.c, size 209597
  2: Leading 13186, Embedded 4219, Total 17405
  4: Leading 19776, Embedded 5990, Total 25766
  8: Leading 16506, Embedded 6800, Total 23306
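
For scale, the totals above can be set against the detabbed file sizes. This is a small Python sketch, not part of the original Perl script; the numbers are copied from the output above, and the name percent_saved is made up for illustration.

```python
# Totals copied from the output above: file -> (size, {tabstop: total bytes saved}).
reported = {
    "tabsave.pl": (1876, {2: 205, 4: 307, 8: 243}),
    "rsort.c": (209597, {2: 17405, 4: 25766, 8: 23306}),
}


def percent_saved(size, saved):
    """Bytes saved as a percentage of the detabbed file size."""
    return 100.0 * saved / size


for name, (size, totals) in reported.items():
    for stop, saved in sorted(totals.items()):
        print(f"{name} tabstop {stop}: {percent_saved(size, saved):.1f}%")
```

That works out to roughly 8 to 12 percent of rsort.c and 11 to 16 percent of the script itself, depending on the tabstop.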

The bytes saved by using tabs, compared to the (detabbed) original size, are
not chump change with 2-, 4- or 8-column tabstops. On ordinary text, savings
are totally unimpressive, usually 0. Your savings may vary. I think the
horse is now officially deceased. -- jpl
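
The difference between code and ordinary text can also be checked from the other direction, by actually entabbing a line and measuring how much shorter it gets. The following is a rough Python sketch under the assumption that the input line contains no tabs; entab and saved are illustrative names, not anything from the script in this note.

```python
def entab(line, stop):
    """Replace runs of blanks that end at a tab stop with tabs.

    Assumes `line` contains no tabs. A single blank that happens to sit
    just before a tab stop is left alone, since a tab would save nothing.
    """
    out = []
    col = 0        # current column (0-based count of characters consumed)
    pending = 0    # blanks seen since the last non-blank or tab stop
    for ch in line:
        if ch == " ":
            pending += 1
            col += 1
            if col % stop == 0:
                out.append("\t" if pending > 1 else " ")
                pending = 0
        else:
            out.append(" " * pending)   # run did not reach a tab stop
            pending = 0
            out.append(ch)
            col += 1
    out.append(" " * pending)
    return "".join(out)


def saved(line, stop):
    """Bytes saved by entabbing one line at the given tab stop."""
    return len(line) - len(entab(line, stop))


for text in ("    x", "    int     count;", "The quick brown fox"):
    print(repr(text), {stop: saved(text, stop) for stop in (2, 4, 8)})
```

Four leading blanks save 3 bytes with tabstops at 4 and nothing with tabstops at 8, and a line of prose with single blanks between words saves nothing at any stop. Expanding the result with str.expandtabs(stop) should give back the original line.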

===

#!/usr/bin/perl -w

use strict;
my @Tab_stops = ( 2, 4, 8 );

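# For one detabbed line, return the bytes saved (leading, embedded) by
# replacing runs of blanks that end at a multiple of $stop_at with tabs.
sub check_stop {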
    my ($line, $stop_at) = @_;
    my $pos = length($line);
    my ($leading, $embedded) = (0,0);

    while ($pos >= $stop_at) {
        $pos -= ($pos % $stop_at);      # Get to previous tab stop
        my $blanks = 0;
        while ((--$pos >= 0) && (substr($line, $pos, 1) eq ' ')) {
            ++$blanks;
        }
        if ($blanks > 1) {
            my $full = int($blanks/$stop_at);
            my $partial = $blanks - $full * $stop_at;
            my $savings = (--$partial > 0) ? $partial : 0;
            $savings += $full * ($stop_at - 1);
            if ($pos < 0) {
                $leading += $savings;
            } else {
                $embedded += $savings;
            }
        }
    }
    return ($leading, $embedded);
}

# Detab one file with col -x and print the bytes tabs would save at each stop.
sub dofile {
    my $file = shift;
    my $command = "col -x < $file";
    my $notabsfh;
    unless (open($notabsfh, "-|", $command)) {
        print STDERR "Open failed on '$command': $!\n";
        return;
    }
    my $size = 0;
    my @savings;
    for (my $i = 0; $i < @Tab_stops; ++$i) { $savings[$i] = [0,0]; }
    while (my $line = <$notabsfh>) {
        my $n = length($line);
        $size += $n;
        $line =~ s/(\s*)$//;
        for (my $i = 0; $i < @Tab_stops; ++$i) {
            my @l_e = check_stop($line, $Tab_stops[$i]);
            for (my $j = 0; $j < @l_e; ++$j) {
                $savings[$i][$j] += $l_e[$j];
            }
        }
    }
    close($notabsfh);
    print("$file, size $size\n");
    for (my $i = 0; $i < @Tab_stops; ++$i) {
        print("  $Tab_stops[$i]: ");
        my $l = $savings[$i][0];
        my $e = $savings[$i][1];
        my $t = $l + $e;
        print("Leading $l, Embedded $e, Total $t\n");
    }
    print("\n");
}

sub main {
    for my $file (@ARGV) {
        dofile($file);
    }
}

main();
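
For readers who don't read Perl, the counting logic of check_stop can be transcribed into Python. This is a line-by-line rendering of the sub above as I read it, offered as a sketch rather than a replacement; it assumes the line has already been detabbed.

```python
def check_stop(line, stop_at):
    """Return (leading, embedded) bytes saved on one detabbed line."""
    line = line.rstrip()              # trailing whitespace never counts
    pos = len(line)
    leading = embedded = 0
    while pos >= stop_at:
        pos -= pos % stop_at          # back up to the previous tab stop
        blanks = 0
        pos -= 1
        while pos >= 0 and line[pos] == " ":
            blanks += 1               # count the run of blanks ending at the stop
            pos -= 1
        if blanks > 1:
            # Each full stop's worth of blanks becomes one tab; a partial
            # run at the front becomes one more tab if it is 2+ blanks.
            full, partial = divmod(blanks, stop_at)
            savings = max(partial - 1, 0) + full * (stop_at - 1)
            if pos < 0:
                leading += savings    # the run reached the start of the line
            else:
                embedded += savings
    return leading, embedded


print(check_stop("    x", 4))
print(check_stop("    x", 8))
print(check_stop("int     count;", 4))
```

The first call reports 3 leading bytes saved, the second reports nothing, and the third reports 3 embedded bytes, which matches the 4-versus-8 tabstop discussion in the note.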