From mboxrd@z Thu Jan 1 00:00:00 1970
From: jpl.jpl at gmail.com (John P. Linderman)
Date: Thu, 11 Mar 2021 15:02:11 -0500
Subject: [COFF] Was [TUHS] tabs vs spaces - entab, detab
Message-ID:

The tab/detab horse was still twitching, so I decided to beat it a
little more. Doug's claim that tabs saving space was an urban legend
didn't ring true, but betting against Doug is a good way to get poor
quick. So I tossed together a perl script (version run through col -x
is at the end of this note) to measure savings.

A simpler script just counted tabs, distinguishing leading tabs, which
I expected to be very common, from embedded tabs, which I expected to
be rare. In retrospect, embedded tabs are common in (my) C code,
separating structure types from the element names and trailing
comments. As Norman pointed out, genuine tabs often preserve line to
line alignment in the presence of small changes. So the fancier script
distinguishes between leading tabs and embedded tabs for various
possible tab stops.

Small tab stops keep heavily indented code lines short; large tab
stops can save more space when tabbing past leading blanks. My coding
style uses "set-width" of 4, which vi turns into spaces or tabs, with
"standard" tabs every 8 columns. My code therefore benefits most with
tabstops every 4 columns. A lot of code is indented 4 spaces, which
saves 3 bytes when replaced by a tab, but there is no saving with
tabstops at 8.
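The arithmetic behind that last point can be sketched in a few lines of
perl. This is only an illustration of the leading-blank case, not an
excerpt from the script at the end of this note; the helper name
leading_savings is made up for the example, and it assumes the run of
blanks starts at column 0.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of tabsave.pl): bytes saved when a run of
# $blanks leading spaces, starting at column 0, is re-encoded with tabs
# at stops every $stop columns.
sub leading_savings {
    my ($blanks, $stop) = @_;
    my $tabs = int($blanks / $stop);    # each tab covers $stop columns
    my $rest = $blanks % $stop;         # columns that must remain spaces
    return $blanks - ($tabs + $rest);   # original bytes minus new bytes
}

# A 4-space indent under each of the tab stops considered in this note.
for my $stop (2, 4, 8) {
    printf("indent 4, tab stops every %d: %d bytes saved\n",
           $stop, leading_savings(4, $stop));
}
```

For a 4-space indent this reports 2 bytes saved with stops every 2
columns, 3 with stops every 4, and 0 with stops every 8.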
Here's the output when run on itself (before it was detabbed) and on a
largish C program:

/home/jpl/bin/tabsave.pl /home/jpl/bin/tabsave.pl rsort.c

/home/jpl/bin/tabsave.pl, size 1876
 2: Leading 202, Embedded 3, Total 205
 4: Leading 303, Embedded 4, Total 307
 8: Leading 238, Embedded 5, Total 243

rsort.c, size 209597
 2: Leading 13186, Embedded 4219, Total 17405
 4: Leading 19776, Embedded 5990, Total 25766
 8: Leading 16506, Embedded 6800, Total 23306

The bytes saved by using tabs compared to the (detabbed) original size
are not chump change, with 2, 4 or 8 column tabstops. On ordinary
text, savings are totally unimpressive, usually 0. Your savings may
vary. I think the horse is now officially deceased.

-- jpl

===

#!/usr/bin/perl -w

use strict;

my @Tab_stops = ( 2, 4, 8 );

sub check_stop {
    my ($line, $stop_at) = @_;
    my $pos = length($line);
    my ($leading, $embedded) = (0,0);
    while ($pos >= $stop_at) {
        $pos -= ($pos % $stop_at); # Get to previous tab stop
        my $blanks = 0;
        while ((--$pos >= 0) && (substr($line, $pos, 1) eq ' ')) {
            ++$blanks;
        }
        if ($blanks > 1) {
            my $full = int($blanks/$stop_at);
            my $partial = $blanks - $full * $stop_at;
            my $savings = (--$partial > 0) ? $partial : 0;
            $savings += $full * ($stop_at - 1);
            if ($pos < 0) {
                $leading += $savings;
            } else {
                $embedded += $savings;
            }
        }
    }
    return ($leading, $embedded);
}

sub dofile {
    my $file = shift;
    my $command = "col -x < $file";
    my $notabsfh;
    unless (open($notabsfh, "-|", $command)) {
        printf STDERR ("Open failed on '$command': $!");
        return;
    }
    my $size = 0;
    my ($leading, $embedded) = (0,0);
    my @savings;
    for (my $i = 0; $i < @Tab_stops; ++$i) {
        $savings[$i] = [0,0];
    }
    while (my $line = <$notabsfh>) {
        my $n = length($line);
        $size += $n;
        $line =~ s/(\s*)$//;
        for (my $i = 0; $i < @Tab_stops; ++$i) {
            my @l_e = check_stop($line, $Tab_stops[$i]);
            for (my $j = 0; $j < @l_e; ++$j) {
                $savings[$i][$j] += $l_e[$j];
            }
        }
    }
    print("$file, size $size\n");
    for (my $i = 0; $i < @Tab_stops; ++$i) {
        print(" $Tab_stops[$i]: ");
        my $l = $savings[$i][0];
        my $e = $savings[$i][1];
        my $t = $l + $e;
        print("Leading $l, Embedded $e, Total $t\n");
    }
    print("\n");
}

sub main {
    for my $file (@ARGV) {
        dofile($file);
    }
}

main();
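
To put the totals reported earlier in this note in proportion, here is
a small sketch that expresses the 4-column totals as a percentage of
each file's detabbed size. The sizes and totals are copied from the
output above; the helper name percent_saved is made up for the example
and is not part of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: savings as a percentage of the detabbed size.
sub percent_saved {
    my ($total, $size) = @_;
    return 100 * $total / $size;
}

# [ file, detabbed size, total bytes saved with tab stops every 4 ],
# taken from the output quoted in this note.
my @report = (
    [ 'tabsave.pl', 1876,   307   ],
    [ 'rsort.c',    209597, 25766 ],
);

for my $row (@report) {
    my ($file, $size, $total) = @$row;
    printf("%s: %.1f%% saved\n", $file, percent_saved($total, $size));
}
```

That works out to roughly 16.4% for the script itself and 12.3% for
rsort.c.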