mailing list of musl libc
 help / color / mirror / code / Atom feed
* FreeSec crypt()
@ 2012-06-12 23:51 Solar Designer
  2012-06-13  1:18 ` Rich Felker
  0 siblings, 1 reply; 17+ messages in thread
From: Solar Designer @ 2012-06-12 23:51 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 596 bytes --]

Rich -

As discussed on IRC, here is a revision of the FreeSec crypt() code with
greatly reduced memory requirements: 10 KB for the "shared" and "local"
structs combined.  The original code required about 70 KB of .bss.

This passes the included tests, but more testing is desired - perhaps
fuzz this on random passwords/salts against other implementations.

Also, we could want to add a runtime self-test, which would detect
possible miscompiles.

Oh, and I haven't yet replaced the cast to signed char in ascii_to_bin().

Anyway, this should be better than what's currently in musl.

Alexander

[-- Attachment #2: crypt_freesec.h --]
[-- Type: text/plain, Size: 2990 bytes --]

/*
 * The following notice applies to this header file (only):
 *
 * Copyright (c) 2002,2010,2012 Solar Designer <solar at openwall.com>
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted.
 */

#ifndef _CRYPT_FREESEC_H
#define _CRYPT_FREESEC_H

struct _crypt_extended_shared {
	u_int32_t psbox[8][64];
	u_int32_t ip_maskl[16][16], ip_maskr[16][16];
	u_int32_t fp_maskl[16][16], fp_maskr[16][16];
	u_int32_t key_perm_maskl[16][16], key_perm_maskr[16][16];
	u_int32_t comp_maskl0[8][8], comp_maskr0[8][8];
	u_int32_t comp_maskl1[8][16], comp_maskr1[8][16];
};

struct _crypt_extended_local {
	int initialized;
	u_int32_t saltbits;
	u_int32_t old_salt;
	u_int32_t en_keysl[16], en_keysr[16];
	u_int32_t old_rawkey0, old_rawkey1;
	char output[21];
};

/*
 * _crypt_extended_init() must be called explicitly before first use of
 * _crypt_extended_r().  Strictly speaking, _crypt_extended_init() is not
 * reentrant unless the "shared" struct happens to be private to each call.
 * All it does is initialize some variables inside the "shared" struct to
 * constant values, so it is unlikely that anything would go wrong if this is
 * done multiple times in parallel, but correct behavior in that case is not
 * guaranteed (e.g., things may go wrong if a given CPU architecture can't
 * operate on 32-bit quantities natively, requiring read-modify-write
 * instruction sequences operating on larger quantities and thus affecting
 * nearby array elements).
 *
 * After _crypt_extended_init() has returned, the resulting "shared" struct
 * may in fact be safely shared between different threads' calls to
 * _crypt_extended_r(), which is in fact reentrant (but each concurrent call
 * must use its own instance of the "local" struct).
 *
 * Before first use of the "local" struct, its "initialized" field must be
 * set to 0.  This is compatible with the requirement of some other crypt_r()
 * implementations requiring their entire data structure to be initialized
 * with all zero bytes, so that approach may be applied instead (e.g., this
 * may be required from the callers of a wrapper function).
 *
 * _crypt_extended_r() returns NULL on error.  Although modern standards say
 * that crypt(3) does in fact return NULL on error, many applications do not
 * expect that.  Thus, it is recommended that a crypt(3)-like wrapper function
 * translate these NULL returns into strings guaranteed to be different from
 * the "setting" string, too short for matching a valid password hash, and not
 * containing any characters that would be special for the passwd file format.
 * Specifically, such a wrapper function may return "*0" on error as long as
 * the "setting" string does not start with "*0", or "*1" otherwise.
 */

void _crypt_extended_init(struct _crypt_extended_shared *shared);

char *_crypt_extended_r(const char *key, const char *setting,
	const struct _crypt_extended_shared *shared,
	struct _crypt_extended_local *local);

#endif

[-- Attachment #3: crypt_freesec.c --]
[-- Type: text/plain, Size: 22515 bytes --]

/*
 * This version is derived from the original implementation of FreeSec
 * (release 1.1) by David Burren.  I've reviewed the changes made in
 * OpenBSD (as of 2.7) and modified the original code in a similar way
 * where applicable.  I've also made it reentrant, reduced its memory
 * usage (with only minimal performance impact), and made the handling
 * of invalid salts mostly UFC-crypt compatible.
 * - Solar Designer <solar at openwall.com>
 */

/*
 * FreeSec: libcrypt for NetBSD
 *
 * Copyright (c) 1994 David Burren
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. Neither the name of the author nor the names of other contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 *	$Owl: Owl/packages/glibc/crypt_freesec.c,v 1.6 2010/02/20 14:45:06 solar Exp $
 *	$Id: crypt.c,v 1.15 1994/09/13 04:58:49 davidb Exp $
 *
 * This is an original implementation of the DES and the crypt(3) interfaces
 * by David Burren <davidb at werj.com.au>.
 *
 * An excellent reference on the underlying algorithm (and related
 * algorithms) is:
 *
 *	B. Schneier, Applied Cryptography: protocols, algorithms,
 *	and source code in C, John Wiley & Sons, 1994.
 *
 * Note that in that book's description of DES the lookups for the initial,
 * pbox, and final permutations are inverted (this has been brought to the
 * attention of the author).  A list of errata for this book has been
 * posted to the sci.crypt newsgroup by the author and is available for FTP.
 *
 * ARCHITECTURE ASSUMPTIONS:
 *	This code used to have some nasty ones, but these have been removed
 *	by now.  The code requires a 32-bit integer type, though.
 */

#include <sys/types.h>
#include <string.h>

#ifdef TEST
#include <stdio.h>
#endif

#include "crypt_freesec.h"

#define _PASSWORD_EFMT1 '_'

static const u_char IP[64] = {
	58, 50, 42, 34, 26, 18, 10,  2, 60, 52, 44, 36, 28, 20, 12,  4,
	62, 54, 46, 38, 30, 22, 14,  6, 64, 56, 48, 40, 32, 24, 16,  8,
	57, 49, 41, 33, 25, 17,  9,  1, 59, 51, 43, 35, 27, 19, 11,  3,
	61, 53, 45, 37, 29, 21, 13,  5, 63, 55, 47, 39, 31, 23, 15,  7
};

static const u_char key_perm[56] = {
	57, 49, 41, 33, 25, 17,  9,  1, 58, 50, 42, 34, 26, 18,
	10,  2, 59, 51, 43, 35, 27, 19, 11,  3, 60, 52, 44, 36,
	63, 55, 47, 39, 31, 23, 15,  7, 62, 54, 46, 38, 30, 22,
	14,  6, 61, 53, 45, 37, 29, 21, 13,  5, 28, 20, 12,  4
};

static const u_char key_shifts[16] = {
	1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1
};

static const u_char comp_perm[48] = {
	14, 17, 11, 24,  1,  5,  3, 28, 15,  6, 21, 10,
	23, 19, 12,  4, 26,  8, 16,  7, 27, 20, 13,  2,
	41, 52, 31, 37, 47, 55, 30, 40, 51, 45, 33, 48,
	44, 49, 39, 56, 34, 53, 46, 42, 50, 36, 29, 32
};

/*
 * No E box is used, as it's replaced by some ANDs, shifts, and ORs.
 */

static const u_char sbox[8][64] = {
	{
		14,  4, 13,  1,  2, 15, 11,  8,  3, 10,  6, 12,  5,  9,  0,  7,
		 0, 15,  7,  4, 14,  2, 13,  1, 10,  6, 12, 11,  9,  5,  3,  8,
		 4,  1, 14,  8, 13,  6,  2, 11, 15, 12,  9,  7,  3, 10,  5,  0,
		15, 12,  8,  2,  4,  9,  1,  7,  5, 11,  3, 14, 10,  0,  6, 13
	},
	{
		15,  1,  8, 14,  6, 11,  3,  4,  9,  7,  2, 13, 12,  0,  5, 10,
		 3, 13,  4,  7, 15,  2,  8, 14, 12,  0,  1, 10,  6,  9, 11,  5,
		 0, 14,  7, 11, 10,  4, 13,  1,  5,  8, 12,  6,  9,  3,  2, 15,
		13,  8, 10,  1,  3, 15,  4,  2, 11,  6,  7, 12,  0,  5, 14,  9
	},
	{
		10,  0,  9, 14,  6,  3, 15,  5,  1, 13, 12,  7, 11,  4,  2,  8,
		13,  7,  0,  9,  3,  4,  6, 10,  2,  8,  5, 14, 12, 11, 15,  1,
		13,  6,  4,  9,  8, 15,  3,  0, 11,  1,  2, 12,  5, 10, 14,  7,
		 1, 10, 13,  0,  6,  9,  8,  7,  4, 15, 14,  3, 11,  5,  2, 12
	},
	{
		 7, 13, 14,  3,  0,  6,  9, 10,  1,  2,  8,  5, 11, 12,  4, 15,
		13,  8, 11,  5,  6, 15,  0,  3,  4,  7,  2, 12,  1, 10, 14,  9,
		10,  6,  9,  0, 12, 11,  7, 13, 15,  1,  3, 14,  5,  2,  8,  4,
		 3, 15,  0,  6, 10,  1, 13,  8,  9,  4,  5, 11, 12,  7,  2, 14
	},
	{
		 2, 12,  4,  1,  7, 10, 11,  6,  8,  5,  3, 15, 13,  0, 14,  9,
		14, 11,  2, 12,  4,  7, 13,  1,  5,  0, 15, 10,  3,  9,  8,  6,
		 4,  2,  1, 11, 10, 13,  7,  8, 15,  9, 12,  5,  6,  3,  0, 14,
		11,  8, 12,  7,  1, 14,  2, 13,  6, 15,  0,  9, 10,  4,  5,  3
	},
	{
		12,  1, 10, 15,  9,  2,  6,  8,  0, 13,  3,  4, 14,  7,  5, 11,
		10, 15,  4,  2,  7, 12,  9,  5,  6,  1, 13, 14,  0, 11,  3,  8,
		 9, 14, 15,  5,  2,  8, 12,  3,  7,  0,  4, 10,  1, 13, 11,  6,
		 4,  3,  2, 12,  9,  5, 15, 10, 11, 14,  1,  7,  6,  0,  8, 13
	},
	{
		 4, 11,  2, 14, 15,  0,  8, 13,  3, 12,  9,  7,  5, 10,  6,  1,
		13,  0, 11,  7,  4,  9,  1, 10, 14,  3,  5, 12,  2, 15,  8,  6,
		 1,  4, 11, 13, 12,  3,  7, 14, 10, 15,  6,  8,  0,  5,  9,  2,
		 6, 11, 13,  8,  1,  4, 10,  7,  9,  5,  0, 15, 14,  2,  3, 12
	},
	{
		13,  2,  8,  4,  6, 15, 11,  1, 10,  9,  3, 14,  5,  0, 12,  7,
		 1, 15, 13,  8, 10,  3,  7,  4, 12,  5,  6, 11,  0, 14,  9,  2,
		 7, 11,  4,  1,  9, 12, 14,  2,  0,  6, 10, 13, 15,  3,  5,  8,
		 2,  1, 14,  7,  4, 10,  8, 13, 15, 12,  9,  0,  3,  5,  6, 11
	}
};

static const u_char pbox[32] = {
	16,  7, 20, 21, 29, 12, 28, 17,  1, 15, 23, 26,  5, 18, 31, 10,
	 2,  8, 24, 14, 32, 27,  3,  9, 19, 13, 30,  6, 22, 11,  4, 25
};

static const u_int32_t bits32[32] =
{
	0x80000000, 0x40000000, 0x20000000, 0x10000000,
	0x08000000, 0x04000000, 0x02000000, 0x01000000,
	0x00800000, 0x00400000, 0x00200000, 0x00100000,
	0x00080000, 0x00040000, 0x00020000, 0x00010000,
	0x00008000, 0x00004000, 0x00002000, 0x00001000,
	0x00000800, 0x00000400, 0x00000200, 0x00000100,
	0x00000080, 0x00000040, 0x00000020, 0x00000010,
	0x00000008, 0x00000004, 0x00000002, 0x00000001
};

static const u_char bits8[8] = { 0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01 };

static const u_char ascii64[] =
	 "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
/*	  0000000000111111111122222222223333333333444444444455555555556666 */
/*	  0123456789012345678901234567890123456789012345678901234567890123 */

/*
 * We match the behavior of UFC-crypt on systems where "char" is signed by
 * default (the majority), regardless of char's signedness on our system.
 */
static inline int
ascii_to_bin(char ch)
{
	signed char sch = ch;
	int retval;

	retval = sch - '.';
	if (sch >= 'A') {
		retval = sch - ('A' - 12);
		if (sch >= 'a')
			retval = sch - ('a' - 38);
	}
	retval &= 0x3f;

	return retval;
}

/*
 * When we choose to "support" invalid salts, nevertheless disallow those
 * containing characters that would violate the passwd file format.
 */
static inline int
ascii_is_unsafe(char ch)
{
	return !ch || ch == '\n' || ch == ':';
}

static void init_ip_k(struct _crypt_extended_shared *shared)
{
	int i, j, k, inbit, obit;
	u_int32_t il, ir, fl, fr;
	const u_int32_t *bits28, *bits24;
	u_char inv_key_perm[64];
	u_char inv_comp_perm[56];
	u_char init_perm[64], final_perm[64];

	bits24 = (bits28 = bits32 + 4) + 4;

	/*
	 * Set up the initial & final permutations into a useful form, and
	 * initialise the inverted key permutation.
	 */
	for (i = 0; i < 64; i++) {
		init_perm[final_perm[i] = IP[i] - 1] = i;
		inv_key_perm[i] = 255;
	}

	/*
	 * Invert the key permutation and initialise the inverted key
	 * compression permutation.
	 */
	for (i = 0; i < 56; i++) {
		inv_key_perm[key_perm[i] - 1] = i;
		inv_comp_perm[i] = 255;
	}

	/*
	 * Invert the key compression permutation.
	 */
	for (i = 0; i < 48; i++) {
		inv_comp_perm[comp_perm[i] - 1] = i;
	}

	/*
	 * Set up the OR-mask arrays for the initial and final permutations,
	 * and for the key initial and compression permutations.
	 */
	for (k = 0; k < 16; k++) {
		for (i = 0; i < 16; i++) {
			il = ir = fl = fr = 0;
			for (j = 0; j < 4; j++) {
				inbit = 4 * k + j;
				if (i & bits8[j + 4]) {
					if ((obit = init_perm[inbit]) < 32)
						il |= bits32[obit];
					else
						ir |= bits32[obit - 32];
					if ((obit = final_perm[inbit]) < 32)
						fl |= bits32[obit];
					else
						fr |= bits32[obit - 32];
				}
			}
			shared->ip_maskl[k][i] = il;
			shared->ip_maskr[k][i] = ir;
			shared->fp_maskl[k][i] = fl;
			shared->fp_maskr[k][i] = fr;

			il = ir = 0;
			for (j = 0; j < 4 - (k & 1); j++) {
				inbit = 4 * k + j;
				if (i & bits8[j + 4]) {
					if ((obit = inv_key_perm[inbit]) == 255)
						continue;
					if (obit < 28)
						il |= bits28[obit];
					else
						ir |= bits28[obit - 28];
				}
			}
			shared->key_perm_maskl[k][i] = il;
			shared->key_perm_maskr[k][i] = ir;
		}
	}
	for (k = 0; k < 8; k++) {
		for (i = 0; i < 8; i++) {
			il = ir = 0;
			for (j = 0; j < 3; j++) {
				inbit = 7 * k + j;
				if (i & bits8[j + 5]) {
					if ((obit = inv_comp_perm[inbit]) == 255)
						continue;
					if (obit < 24)
						il |= bits24[obit];
					else
						ir |= bits24[obit - 24];
				}
			}
			shared->comp_maskl0[k][i] = il;
			shared->comp_maskr0[k][i] = ir;
		}
		for (i = 0; i < 16; i++) {
			il = ir = 0;
			for (j = 3; j < 7; j++) {
				inbit = 7 * k + j;
				if (i & bits8[j + 1]) {
					if ((obit = inv_comp_perm[inbit]) == 255)
						continue;
					if (obit < 24)
						il |= bits24[obit];
					else
						ir |= bits24[obit - 24];
				}
			}
			shared->comp_maskl1[k][i] = il;
			shared->comp_maskr1[k][i] = ir;
		}
	}
}

static void init_s(struct _crypt_extended_shared *shared)
{
	int i, j, b;
	u_char u_sbox[8][64];
	u_char un_pbox[32];

	/*
	 * Invert the S-boxes, reordering the input bits.
	 */
	for (i = 0; i < 8; i++)
		for (j = 0; j < 64; j++) {
			b = (j & 0x20) | ((j & 1) << 4) | ((j >> 1) & 0xf);
			u_sbox[i][j] = sbox[i][b];
		}

	/*
	 * Invert the P-box permutation, and convert into OR-masks for
	 * handling the output of the S-box arrays setup above.
	 */
	for (i = 0; i < 32; i++)
		un_pbox[pbox[i] - 1] = i;

	for (b = 0; b < 8; b++) {
		for (i = 0; i < 64; i++) {
			u_int32_t p = 0;
			for (j = 0; j < 4; j++) {
				if (u_sbox[b][i] & bits8[j + 4])
					p |= bits32[un_pbox[4 * b + j]];
			}
			shared->psbox[b][i] = p;
		}
	}
}

void
_crypt_extended_init(struct _crypt_extended_shared *shared)
{
	init_ip_k(shared);
	init_s(shared);
}

static void
des_init_local(struct _crypt_extended_local *local)
{
	local->old_rawkey0 = local->old_rawkey1 = 0;
	local->saltbits = 0;
	local->old_salt = 0;

	local->initialized = 1;
}

static void
setup_salt(u_int32_t salt, struct _crypt_extended_local *local)
{
	u_int32_t obit, saltbit, saltbits;
	int i;

	if (salt == local->old_salt)
		return;
	local->old_salt = salt;

	saltbits = 0;
	saltbit = 1;
	obit = 0x800000;
	for (i = 0; i < 24; i++) {
		if (salt & saltbit)
			saltbits |= obit;
		saltbit <<= 1;
		obit >>= 1;
	}
	local->saltbits = saltbits;
}

static int
des_setkey(const u_char *key,
	const struct _crypt_extended_shared *shared,
	struct _crypt_extended_local *local)
{
	u_int32_t k0, k1, rawkey0, rawkey1;
	int shifts, round;

	rawkey0 =
		(u_int32_t)(u_char)key[3] |
		((u_int32_t)(u_char)key[2] << 8) |
		((u_int32_t)(u_char)key[1] << 16) |
		((u_int32_t)(u_char)key[0] << 24);
	rawkey1 =
		(u_int32_t)(u_char)key[7] |
		((u_int32_t)(u_char)key[6] << 8) |
		((u_int32_t)(u_char)key[5] << 16) |
		((u_int32_t)(u_char)key[4] << 24);

	if ((rawkey0 | rawkey1)
	    && rawkey0 == local->old_rawkey0
	    && rawkey1 == local->old_rawkey1) {
		/*
		 * Already setup for this key.
		 * This optimisation fails on a zero key (which is weak and
		 * has bad parity anyway) in order to simplify the starting
		 * conditions.
		 */
		return 0;
	}
	local->old_rawkey0 = rawkey0;
	local->old_rawkey1 = rawkey1;

	/*
	 * Do key permutation and split into two 28-bit subkeys.
	 */
	{
		int i, inbit;
		k0 = k1 = 0;
		for (i = 0, inbit = 28; i < 8; i++, inbit -= 4) {
			k0 |= shared->key_perm_maskl[i][(rawkey0 >> inbit) & 0xf] |
			     shared->key_perm_maskl[i + 8][(rawkey1 >> inbit) & 0xf];
			k1 |= shared->key_perm_maskr[i][(rawkey0 >> inbit) & 0xf] |
			     shared->key_perm_maskr[i + 8][(rawkey1 >> inbit) & 0xf];
		}
	}

	/*
	 * Rotate subkeys and do compression permutation.
	 */
	shifts = 0;
	for (round = 0; round < 16; round++) {
		u_int32_t t0, t1;

		shifts += key_shifts[round];

		t0 = (k0 << shifts) | (k0 >> (28 - shifts));
		t1 = (k1 << shifts) | (k1 >> (28 - shifts));

		{
			int i, inbit;
			u_int32_t kl, kr;
			kl = kr = 0;
			inbit = 25;
			for (i = 0; i < 4; i++) {
				kl |= shared->comp_maskl0[i][(t0 >> inbit) & 7] |
				      shared->comp_maskl0[i + 4][(t1 >> inbit) & 7];
				kr |= shared->comp_maskr0[i][(t0 >> inbit) & 7] |
				      shared->comp_maskr0[i + 4][(t1 >> inbit) & 7];
				inbit -= 4;
				kl |= shared->comp_maskl1[i][(t0 >> inbit) & 0xf] |
				      shared->comp_maskl1[i + 4][(t1 >> inbit) & 0xf];
				kr |= shared->comp_maskr1[i][(t0 >> inbit) & 0xf] |
				      shared->comp_maskr1[i + 4][(t1 >> inbit) & 0xf];
				inbit -= 3;
			}
			local->en_keysl[round] = kl;
			local->en_keysr[round] = kr;
		}
	}

	return 0;
}

static int
do_des(u_int32_t l_in, u_int32_t r_in, u_int32_t *l_out, u_int32_t *r_out,
	int count,
	const struct _crypt_extended_shared *shared,
	struct _crypt_extended_local *local)
{
	/*
	 * l_in, r_in, l_out, and r_out are in pseudo-"big-endian" format.
	 */
	u_int32_t l, r, *kl, *kr, *kl1, *kr1;
	u_int32_t f, r48l, r48r, saltbits;
	int round;

	kl1 = local->en_keysl;
	kr1 = local->en_keysr;

	/*
	 * Do initial permutation (IP).
	 */
	l = r = 0;
	if (l_in | r_in) {
		int i, inbit;
		for (i = 0, inbit = 28; i < 8; i++, inbit -= 4) {
			l |= shared->ip_maskl[i][(l_in >> inbit) & 0xf] |
			     shared->ip_maskl[i + 8][(r_in >> inbit) & 0xf];
			r |= shared->ip_maskr[i][(l_in >> inbit) & 0xf] |
			     shared->ip_maskr[i + 8][(r_in >> inbit) & 0xf];
		}
	}

	saltbits = local->saltbits;
	while (count--) {
		/*
		 * Do each round.
		 */
		kl = kl1;
		kr = kr1;
		round = 16;
		while (round--) {
			/*
			 * Expand R to 48 bits (simulate the E-box).
			 */
			r48l	= ((r & 0x00000001) << 23)
				| ((r & 0xf8000000) >> 9)
				| ((r & 0x1f800000) >> 11)
				| ((r & 0x01f80000) >> 13)
				| ((r & 0x001f8000) >> 15);

			r48r	= ((r & 0x0001f800) << 7)
				| ((r & 0x00001f80) << 5)
				| ((r & 0x000001f8) << 3)
				| ((r & 0x0000001f) << 1)
				| ((r & 0x80000000) >> 31);
			/*
			 * Do salting for crypt() and friends, and
			 * XOR with the permuted key.
			 */
			f = (r48l ^ r48r) & saltbits;
			r48l ^= f ^ *kl++;
			r48r ^= f ^ *kr++;
			/*
			 * Do S-box lookups (which shrink it back to 32 bits)
			 * and do the P-box permutation at the same time.
			 */
			f = shared->psbox[0][r48l >> 18]
			  | shared->psbox[1][(r48l >> 12) & 0x3f]
			  | shared->psbox[2][(r48l >> 6) & 0x3f]
			  | shared->psbox[3][r48l & 0x3f]
			  | shared->psbox[4][r48r >> 18]
			  | shared->psbox[5][(r48r >> 12) & 0x3f]
			  | shared->psbox[6][(r48r >> 6) & 0x3f]
			  | shared->psbox[7][r48r & 0x3f];
			/*
			 * Now that we've permuted things, complete f().
			 */
			f ^= l;
			l = r;
			r = f;
		}
		r = l;
		l = f;
	}

	/*
	 * Do final permutation (inverse of IP).
	 */
	{
		int i, inbit;
		u_int32_t lo, ro;
		lo = ro = 0;
		for (i = 0, inbit = 28; i < 8; i++, inbit -= 4) {
			lo |= shared->fp_maskl[i][(l >> inbit) & 0xf] |
			     shared->fp_maskl[i + 8][(r >> inbit) & 0xf];
			ro |= shared->fp_maskr[i][(l >> inbit) & 0xf] |
			     shared->fp_maskr[i + 8][(r >> inbit) & 0xf];
		}
		*l_out = lo;
		*r_out = ro;
	}

	return 0;
}

static int
des_cipher(const u_char *in, u_char *out, u_int32_t salt, int count,
	const struct _crypt_extended_shared *shared,
	struct _crypt_extended_local *local)
{
	u_int32_t l_out, r_out, rawl, rawr;
	int retval;

	setup_salt(salt, local);

	rawl =
		(u_int32_t)(u_char)in[3] |
		((u_int32_t)(u_char)in[2] << 8) |
		((u_int32_t)(u_char)in[1] << 16) |
		((u_int32_t)(u_char)in[0] << 24);
	rawr =
		(u_int32_t)(u_char)in[7] |
		((u_int32_t)(u_char)in[6] << 8) |
		((u_int32_t)(u_char)in[5] << 16) |
		((u_int32_t)(u_char)in[4] << 24);

	retval = do_des(rawl, rawr, &l_out, &r_out, count, shared, local);

	out[0] = l_out >> 24;
	out[1] = l_out >> 16;
	out[2] = l_out >> 8;
	out[3] = l_out;
	out[4] = r_out >> 24;
	out[5] = r_out >> 16;
	out[6] = r_out >> 8;
	out[7] = r_out;

	return retval;
}

char *
_crypt_extended_r(const char *key, const char *setting,
	const struct _crypt_extended_shared *shared,
	struct _crypt_extended_local *local)
{
	int i;
	u_int32_t count, salt, l, r0, r1, keybuf[2];
	u_char *p, *q;

	if (!local->initialized)
		des_init_local(local);

	/*
	 * Copy the key, shifting each character up by one bit
	 * and padding with zeros.
	 */
	q = (u_char *) keybuf;
	while (q - (u_char *) keybuf < sizeof(keybuf)) {
		*q++ = *key << 1;
		if (*key)
			key++;
	}
	if (des_setkey((u_char *) keybuf, shared, local))
		return NULL;

	if (*setting == _PASSWORD_EFMT1) {
		/*
		 * "new"-style:
		 *	setting - underscore, 4 chars of count, 4 chars of salt
		 *	key - unlimited characters
		 */
		for (i = 1, count = 0; i < 5; i++) {
			int value = ascii_to_bin(setting[i]);
			if (ascii64[value] != setting[i])
				return NULL;
			count |= value << (i - 1) * 6;
		}
		if (!count)
			return NULL;

		for (i = 5, salt = 0; i < 9; i++) {
			int value = ascii_to_bin(setting[i]);
			if (ascii64[value] != setting[i])
				return NULL;
			salt |= value << (i - 5) * 6;
		}

		while (*key) {
			/*
			 * Encrypt the key with itself.
			 */
			if (des_cipher((u_char *) keybuf, (u_char *) keybuf,
			    0, 1, shared, local))
				return NULL;
			/*
			 * And XOR with the next 8 characters of the key.
			 */
			q = (u_char *) keybuf;
			while (q - (u_char *) keybuf < sizeof(keybuf) && *key)
				*q++ ^= *key++ << 1;

			if (des_setkey((u_char *) keybuf, shared, local))
				return NULL;
		}
		memcpy(local->output, setting, 9);
		local->output[9] = '\0';
		p = (u_char *) local->output + 9;
	} else {
		/*
		 * "old"-style:
		 *	setting - 2 chars of salt
		 *	key - up to 8 characters
		 */
		count = 25;

		if (ascii_is_unsafe(setting[0]) || ascii_is_unsafe(setting[1]))
			return NULL;

		salt = (ascii_to_bin(setting[1]) << 6)
		     |  ascii_to_bin(setting[0]);

		local->output[0] = setting[0];
		local->output[1] = setting[1];
		p = (u_char *) local->output + 2;
	}
	setup_salt(salt, local);
	/*
	 * Do it.
	 */
	if (do_des(0, 0, &r0, &r1, count, shared, local))
		return NULL;
	/*
	 * Now encode the result...
	 */
	l = (r0 >> 8);
	*p++ = ascii64[(l >> 18) & 0x3f];
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];

	l = (r0 << 16) | ((r1 >> 16) & 0xffff);
	*p++ = ascii64[(l >> 18) & 0x3f];
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];

	l = r1 << 2;
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];
	*p = 0;

	return local->output;
}

#ifndef TEST_STATIC
#define TEST_STATIC /* not static */
#endif

#ifdef TEST
static char *
_crypt_extended(const char *key, const char *setting)
{
	TEST_STATIC int initialized = 0;
	TEST_STATIC struct _crypt_extended_shared shared;
/* "local" must be static since it holds our own return value */
	static struct _crypt_extended_local local;

	if (!initialized) {
		memset(&shared, 's', sizeof(shared));
		memset(&local, 'l', sizeof(local));
		_crypt_extended_init(&shared);
		initialized = 1;
		local.initialized = 0;
	}
	return _crypt_extended_r(key, setting, &shared, &local);
}

static const struct {
	char *hash;
	char *pw;
} tests[] = {
/* "new"-style */
	{"_J9..CCCCXBrJUJV154M", "U*U*U*U*"},
	{"_J9..CCCCXUhOBTXzaiE", "U*U***U"},
	{"_J9..CCCC4gQ.mB/PffM", "U*U***U*"},
	{"_J9..XXXXvlzQGqpPPdk", "*U*U*U*U"},
	{"_J9..XXXXsqM/YSSP..Y", "*U*U*U*U*"},
	{"_J9..XXXXVL7qJCnku0I", "*U*U*U*U*U*U*U*U"},
	{"_J9..XXXXAj8cFbP5scI", "*U*U*U*U*U*U*U*U*"},
	{"_J9..SDizh.vll5VED9g", "ab1234567"},
	{"_J9..SDizRjWQ/zePPHc", "cr1234567"},
	{"_J9..SDizxmRI1GjnQuE", "zxyDPWgydbQjgq"},
	{"_K9..SaltNrQgIYUAeoY", "726 even"},
	{"_J9..SDSD5YGyRCr4W4c", ""},
	{"_01234567IBjxKliXXRQ", "\xc3\x80" "1234abcd"},
	{"_012345678OSGpGQRVHA", "\xc3\x80" "9234abcd"},
/* "old"-style, valid salts */
	{"CCNf8Sbh3HDfQ", "U*U*U*U*"},
	{"CCX.K.MFy4Ois", "U*U***U"},
	{"CC4rMpbg9AMZ.", "U*U***U*"},
	{"XXxzOu6maQKqQ", "*U*U*U*U"},
	{"SDbsugeBiC58A", ""},
	{"./xZjzHv5vzVE", "password"},
	{"0A2hXM1rXbYgo", "password"},
	{"A9RXdR23Y.cY6", "password"},
	{"ZziFATVXHo2.6", "password"},
	{"zZDDIZ0NOlPzw", "password"},
	{"99PxawtsTfX56", "\xc3\x80" "1234abcd"},
	{"99jcVcGxUZOWk", "\xc3\x80" "9234abcd"},
/* "old"-style, "reasonable" invalid salts, UFC-crypt behavior expected */
	{"\001\002wyd0KZo65Jo", "password"},
	{"a_C10Dk/ExaG.", "password"},
	{"~\377.5OTsRVjwLo", "password"},
/* The below are erroneous inputs, so NULL return is expected/required */
	{"", ""}, /* no salt */
	{" ", ""}, /* setting string is too short */
	{"a:", ""}, /* unsafe character */
	{"\na", ""}, /* unsafe character */
	{"_/......", ""}, /* setting string is too short for its type */
	{"_........", ""}, /* zero iteration count */
	{"_/!......", ""}, /* invalid character in count */
	{"_/......!", ""}, /* invalid character in salt */
	{NULL, NULL}
};

int main(void)
{
	int i;

	for (i = 0; tests[i].hash; i++) {
		char *hash = _crypt_extended(tests[i].pw, tests[i].hash);
		if (!hash && strlen(tests[i].hash) < 13)
			continue; /* expected failure */
		if (!strcmp(hash, tests[i].hash))
			continue; /* expected success */
		puts("FAILED");
		return 1;
	}

	puts("PASSED");

	return 0;
}
#endif

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-12 23:51 FreeSec crypt() Solar Designer
@ 2012-06-13  1:18 ` Rich Felker
  2012-06-13  6:10   ` Szabolcs Nagy
  2012-06-13 12:07   ` Solar Designer
  0 siblings, 2 replies; 17+ messages in thread
From: Rich Felker @ 2012-06-13  1:18 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1308 bytes --]

On Wed, Jun 13, 2012 at 03:51:13AM +0400, Solar Designer wrote:
> Rich -
> 
> As discussed on IRC, here is a revision of the FreeSec crypt() code with
> greatly reduced memory requirements: 10 KB for the "shared" and "local"
> structs combined.  The original code required about 70 KB of .bss.

Thanks. Here's a _really_ quick draft, untested, of the direction I
wanted to take it with making the tables static-initialized. Note that
doing so allowed removing all of the table-creation code and most of
the existing tables. For me, it's weighing in at ~12k. I also removed
some unused code for things like keeping salt/keys between multiple
runs since the data is not preserved across multiple runs anyway, and
the lookup tables for bitshifts.

> Also, we could want to add a runtime self-test, which would detect
> possible miscompiles.

I understand your motivation for doing this with security-critical
things, but really most/all of libc is security-critical, and we can't
have runtime miscompilation tests all over the place. Moreover, the
vast majority of cases of GCC "miscompiling" something turn out to be
code that's invoking undefined behavior; the only non-UB example I've
encountered while working on musl is gcc 4.7's bad breakage, which is
so bad you can't even get programs to start...

Rich

[-- Attachment #2: crypt.c --]
[-- Type: text/plain, Size: 41527 bytes --]

/*
 * This version is derived from the original implementation of FreeSec
 * (release 1.1) by David Burren.  I've reviewed the changes made in
 * OpenBSD (as of 2.7) and modified the original code in a similar way
 * where applicable.  I've also made it reentrant, reduced its memory
 * usage (with only minimal performance impact), and made the handling
 * of invalid salts mostly UFC-crypt compatible.
 * - Solar Designer <solar at openwall.com>
 */

/*
 * FreeSec: libcrypt for NetBSD
 *
 * Copyright (c) 1994 David Burren
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. Neither the name of the author nor the names of other contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 *	$Owl: Owl/packages/glibc/crypt_freesec.c,v 1.6 2010/02/20 14:45:06 solar Exp $
 *	$Id: crypt.c,v 1.15 1994/09/13 04:58:49 davidb Exp $
 *
 * This is an original implementation of the DES and the crypt(3) interfaces
 * by David Burren <davidb at werj.com.au>.
 *
 * An excellent reference on the underlying algorithm (and related
 * algorithms) is:
 *
 *	B. Schneier, Applied Cryptography: protocols, algorithms,
 *	and source code in C, John Wiley & Sons, 1994.
 *
 * Note that in that book's description of DES the lookups for the initial,
 * pbox, and final permutations are inverted (this has been brought to the
 * attention of the author).  A list of errata for this book has been
 * posted to the sci.crypt newsgroup by the author and is available for FTP.
 *
 * ARCHITECTURE ASSUMPTIONS:
 *	This code used to have some nasty ones, but these have been removed
 *	by now.  The code requires a 32-bit integer type, though.
 */

#include <sys/types.h>
#include <string.h>

struct _crypt_extended_local {
	u_int32_t saltbits;
	u_int32_t en_keysl[16], en_keysr[16];
};

#define _PASSWORD_EFMT1 '_'

static const u_char key_shifts[16] = {
	1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1
};

static const uint32_t psbox[8][64] = {
	{
		0x00808200,0x00000000,0x00008000,0x00808202,
		0x00808002,0x00008202,0x00000002,0x00008000,
		0x00000200,0x00808200,0x00808202,0x00000200,
		0x00800202,0x00808002,0x00800000,0x00000002,
		0x00000202,0x00800200,0x00800200,0x00008200,
		0x00008200,0x00808000,0x00808000,0x00800202,
		0x00008002,0x00800002,0x00800002,0x00008002,
		0x00000000,0x00000202,0x00008202,0x00800000,
		0x00008000,0x00808202,0x00000002,0x00808000,
		0x00808200,0x00800000,0x00800000,0x00000200,
		0x00808002,0x00008000,0x00008200,0x00800002,
		0x00000200,0x00000002,0x00800202,0x00008202,
		0x00808202,0x00008002,0x00808000,0x00800202,
		0x00800002,0x00000202,0x00008202,0x00808200,
		0x00000202,0x00800200,0x00800200,0x00000000,
		0x00008002,0x00008200,0x00000000,0x00808002,
	},{
		0x40084010,0x40004000,0x00004000,0x00084010,
		0x00080000,0x00000010,0x40080010,0x40004010,
		0x40000010,0x40084010,0x40084000,0x40000000,
		0x40004000,0x00080000,0x00000010,0x40080010,
		0x00084000,0x00080010,0x40004010,0x00000000,
		0x40000000,0x00004000,0x00084010,0x40080000,
		0x00080010,0x40000010,0x00000000,0x00084000,
		0x00004010,0x40084000,0x40080000,0x00004010,
		0x00000000,0x00084010,0x40080010,0x00080000,
		0x40004010,0x40080000,0x40084000,0x00004000,
		0x40080000,0x40004000,0x00000010,0x40084010,
		0x00084010,0x00000010,0x00004000,0x40000000,
		0x00004010,0x40084000,0x00080000,0x40000010,
		0x00080010,0x40004010,0x40000010,0x00080010,
		0x00084000,0x00000000,0x40004000,0x00004010,
		0x40000000,0x40080010,0x40084010,0x00084000,
	},{
		0x00000104,0x04010100,0x00000000,0x04010004,
		0x04000100,0x00000000,0x00010104,0x04000100,
		0x00010004,0x04000004,0x04000004,0x00010000,
		0x04010104,0x00010004,0x04010000,0x00000104,
		0x04000000,0x00000004,0x04010100,0x00000100,
		0x00010100,0x04010000,0x04010004,0x00010104,
		0x04000104,0x00010100,0x00010000,0x04000104,
		0x00000004,0x04010104,0x00000100,0x04000000,
		0x04010100,0x04000000,0x00010004,0x00000104,
		0x00010000,0x04010100,0x04000100,0x00000000,
		0x00000100,0x00010004,0x04010104,0x04000100,
		0x04000004,0x00000100,0x00000000,0x04010004,
		0x04000104,0x00010000,0x04000000,0x04010104,
		0x00000004,0x00010104,0x00010100,0x04000004,
		0x04010000,0x04000104,0x00000104,0x04010000,
		0x00010104,0x00000004,0x04010004,0x00010100,
	},{
		0x80401000,0x80001040,0x80001040,0x00000040,
		0x00401040,0x80400040,0x80400000,0x80001000,
		0x00000000,0x00401000,0x00401000,0x80401040,
		0x80000040,0x00000000,0x00400040,0x80400000,
		0x80000000,0x00001000,0x00400000,0x80401000,
		0x00000040,0x00400000,0x80001000,0x00001040,
		0x80400040,0x80000000,0x00001040,0x00400040,
		0x00001000,0x00401040,0x80401040,0x80000040,
		0x00400040,0x80400000,0x00401000,0x80401040,
		0x80000040,0x00000000,0x00000000,0x00401000,
		0x00001040,0x00400040,0x80400040,0x80000000,
		0x80401000,0x80001040,0x80001040,0x00000040,
		0x80401040,0x80000040,0x80000000,0x00001000,
		0x80400000,0x80001000,0x00401040,0x80400040,
		0x80001000,0x00001040,0x00400000,0x80401000,
		0x00000040,0x00400000,0x00001000,0x00401040,
	},{
		0x00000080,0x01040080,0x01040000,0x21000080,
		0x00040000,0x00000080,0x20000000,0x01040000,
		0x20040080,0x00040000,0x01000080,0x20040080,
		0x21000080,0x21040000,0x00040080,0x20000000,
		0x01000000,0x20040000,0x20040000,0x00000000,
		0x20000080,0x21040080,0x21040080,0x01000080,
		0x21040000,0x20000080,0x00000000,0x21000000,
		0x01040080,0x01000000,0x21000000,0x00040080,
		0x00040000,0x21000080,0x00000080,0x01000000,
		0x20000000,0x01040000,0x21000080,0x20040080,
		0x01000080,0x20000000,0x21040000,0x01040080,
		0x20040080,0x00000080,0x01000000,0x21040000,
		0x21040080,0x00040080,0x21000000,0x21040080,
		0x01040000,0x00000000,0x20040000,0x21000000,
		0x00040080,0x01000080,0x20000080,0x00040000,
		0x00000000,0x20040000,0x01040080,0x20000080,
	},{
		0x10000008,0x10200000,0x00002000,0x10202008,
		0x10200000,0x00000008,0x10202008,0x00200000,
		0x10002000,0x00202008,0x00200000,0x10000008,
		0x00200008,0x10002000,0x10000000,0x00002008,
		0x00000000,0x00200008,0x10002008,0x00002000,
		0x00202000,0x10002008,0x00000008,0x10200008,
		0x10200008,0x00000000,0x00202008,0x10202000,
		0x00002008,0x00202000,0x10202000,0x10000000,
		0x10002000,0x00000008,0x10200008,0x00202000,
		0x10202008,0x00200000,0x00002008,0x10000008,
		0x00200000,0x10002000,0x10000000,0x00002008,
		0x10000008,0x10202008,0x00202000,0x10200000,
		0x00202008,0x10202000,0x00000000,0x10200008,
		0x00000008,0x00002000,0x10200000,0x00202008,
		0x00002000,0x00200008,0x10002008,0x00000000,
		0x10202000,0x10000000,0x00200008,0x10002008,
	},{
		0x00100000,0x02100001,0x02000401,0x00000000,
		0x00000400,0x02000401,0x00100401,0x02100400,
		0x02100401,0x00100000,0x00000000,0x02000001,
		0x00000001,0x02000000,0x02100001,0x00000401,
		0x02000400,0x00100401,0x00100001,0x02000400,
		0x02000001,0x02100000,0x02100400,0x00100001,
		0x02100000,0x00000400,0x00000401,0x02100401,
		0x00100400,0x00000001,0x02000000,0x00100400,
		0x02000000,0x00100400,0x00100000,0x02000401,
		0x02000401,0x02100001,0x02100001,0x00000001,
		0x00100001,0x02000000,0x02000400,0x00100000,
		0x02100400,0x00000401,0x00100401,0x02100400,
		0x00000401,0x02000001,0x02100401,0x02100000,
		0x00100400,0x00000000,0x00000001,0x02100401,
		0x00000000,0x00100401,0x02100000,0x00000400,
		0x02000001,0x02000400,0x00000400,0x00100001,
	},{
		0x08000820,0x00000800,0x00020000,0x08020820,
		0x08000000,0x08000820,0x00000020,0x08000000,
		0x00020020,0x08020000,0x08020820,0x00020800,
		0x08020800,0x00020820,0x00000800,0x00000020,
		0x08020000,0x08000020,0x08000800,0x00000820,
		0x00020800,0x00020020,0x08020020,0x08020800,
		0x00000820,0x00000000,0x00000000,0x08020020,
		0x08000020,0x08000800,0x00020820,0x00020000,
		0x00020820,0x00020000,0x08020800,0x00000800,
		0x00000020,0x08020020,0x00000800,0x00020820,
		0x08000800,0x00000020,0x08000020,0x08020000,
		0x08020020,0x08000000,0x00020000,0x08000820,
		0x00000000,0x08020820,0x00020020,0x08000020,
		0x08020000,0x08000800,0x08000820,0x00000000,
		0x08020820,0x00020800,0x00020800,0x00000820,
		0x00000820,0x00020020,0x08000000,0x08020800,
	},
};
static const uint32_t ip_maskl[16][16] = {
	{
		0x00000000,0x00010000,0x00000000,0x00010000,
		0x01000000,0x01010000,0x01000000,0x01010000,
		0x00000000,0x00010000,0x00000000,0x00010000,
		0x01000000,0x01010000,0x01000000,0x01010000,
	},{
		0x00000000,0x00000001,0x00000000,0x00000001,
		0x00000100,0x00000101,0x00000100,0x00000101,
		0x00000000,0x00000001,0x00000000,0x00000001,
		0x00000100,0x00000101,0x00000100,0x00000101,
	},{
		0x00000000,0x00020000,0x00000000,0x00020000,
		0x02000000,0x02020000,0x02000000,0x02020000,
		0x00000000,0x00020000,0x00000000,0x00020000,
		0x02000000,0x02020000,0x02000000,0x02020000,
	},{
		0x00000000,0x00000002,0x00000000,0x00000002,
		0x00000200,0x00000202,0x00000200,0x00000202,
		0x00000000,0x00000002,0x00000000,0x00000002,
		0x00000200,0x00000202,0x00000200,0x00000202,
	},{
		0x00000000,0x00040000,0x00000000,0x00040000,
		0x04000000,0x04040000,0x04000000,0x04040000,
		0x00000000,0x00040000,0x00000000,0x00040000,
		0x04000000,0x04040000,0x04000000,0x04040000,
	},{
		0x00000000,0x00000004,0x00000000,0x00000004,
		0x00000400,0x00000404,0x00000400,0x00000404,
		0x00000000,0x00000004,0x00000000,0x00000004,
		0x00000400,0x00000404,0x00000400,0x00000404,
	},{
		0x00000000,0x00080000,0x00000000,0x00080000,
		0x08000000,0x08080000,0x08000000,0x08080000,
		0x00000000,0x00080000,0x00000000,0x00080000,
		0x08000000,0x08080000,0x08000000,0x08080000,
	},{
		0x00000000,0x00000008,0x00000000,0x00000008,
		0x00000800,0x00000808,0x00000800,0x00000808,
		0x00000000,0x00000008,0x00000000,0x00000008,
		0x00000800,0x00000808,0x00000800,0x00000808,
	},{
		0x00000000,0x00100000,0x00000000,0x00100000,
		0x10000000,0x10100000,0x10000000,0x10100000,
		0x00000000,0x00100000,0x00000000,0x00100000,
		0x10000000,0x10100000,0x10000000,0x10100000,
	},{
		0x00000000,0x00000010,0x00000000,0x00000010,
		0x00001000,0x00001010,0x00001000,0x00001010,
		0x00000000,0x00000010,0x00000000,0x00000010,
		0x00001000,0x00001010,0x00001000,0x00001010,
	},{
		0x00000000,0x00200000,0x00000000,0x00200000,
		0x20000000,0x20200000,0x20000000,0x20200000,
		0x00000000,0x00200000,0x00000000,0x00200000,
		0x20000000,0x20200000,0x20000000,0x20200000,
	},{
		0x00000000,0x00000020,0x00000000,0x00000020,
		0x00002000,0x00002020,0x00002000,0x00002020,
		0x00000000,0x00000020,0x00000000,0x00000020,
		0x00002000,0x00002020,0x00002000,0x00002020,
	},{
		0x00000000,0x00400000,0x00000000,0x00400000,
		0x40000000,0x40400000,0x40000000,0x40400000,
		0x00000000,0x00400000,0x00000000,0x00400000,
		0x40000000,0x40400000,0x40000000,0x40400000,
	},{
		0x00000000,0x00000040,0x00000000,0x00000040,
		0x00004000,0x00004040,0x00004000,0x00004040,
		0x00000000,0x00000040,0x00000000,0x00000040,
		0x00004000,0x00004040,0x00004000,0x00004040,
	},{
		0x00000000,0x00800000,0x00000000,0x00800000,
		0x80000000,0x80800000,0x80000000,0x80800000,
		0x00000000,0x00800000,0x00000000,0x00800000,
		0x80000000,0x80800000,0x80000000,0x80800000,
	},{
		0x00000000,0x00000080,0x00000000,0x00000080,
		0x00008000,0x00008080,0x00008000,0x00008080,
		0x00000000,0x00000080,0x00000000,0x00000080,
		0x00008000,0x00008080,0x00008000,0x00008080,
	},
};
static const uint32_t ip_maskr[16][16] = {
	{
		0x00000000,0x00000000,0x00010000,0x00010000,
		0x00000000,0x00000000,0x00010000,0x00010000,
		0x01000000,0x01000000,0x01010000,0x01010000,
		0x01000000,0x01000000,0x01010000,0x01010000,
	},{
		0x00000000,0x00000000,0x00000001,0x00000001,
		0x00000000,0x00000000,0x00000001,0x00000001,
		0x00000100,0x00000100,0x00000101,0x00000101,
		0x00000100,0x00000100,0x00000101,0x00000101,
	},{
		0x00000000,0x00000000,0x00020000,0x00020000,
		0x00000000,0x00000000,0x00020000,0x00020000,
		0x02000000,0x02000000,0x02020000,0x02020000,
		0x02000000,0x02000000,0x02020000,0x02020000,
	},{
		0x00000000,0x00000000,0x00000002,0x00000002,
		0x00000000,0x00000000,0x00000002,0x00000002,
		0x00000200,0x00000200,0x00000202,0x00000202,
		0x00000200,0x00000200,0x00000202,0x00000202,
	},{
		0x00000000,0x00000000,0x00040000,0x00040000,
		0x00000000,0x00000000,0x00040000,0x00040000,
		0x04000000,0x04000000,0x04040000,0x04040000,
		0x04000000,0x04000000,0x04040000,0x04040000,
	},{
		0x00000000,0x00000000,0x00000004,0x00000004,
		0x00000000,0x00000000,0x00000004,0x00000004,
		0x00000400,0x00000400,0x00000404,0x00000404,
		0x00000400,0x00000400,0x00000404,0x00000404,
	},{
		0x00000000,0x00000000,0x00080000,0x00080000,
		0x00000000,0x00000000,0x00080000,0x00080000,
		0x08000000,0x08000000,0x08080000,0x08080000,
		0x08000000,0x08000000,0x08080000,0x08080000,
	},{
		0x00000000,0x00000000,0x00000008,0x00000008,
		0x00000000,0x00000000,0x00000008,0x00000008,
		0x00000800,0x00000800,0x00000808,0x00000808,
		0x00000800,0x00000800,0x00000808,0x00000808,
	},{
		0x00000000,0x00000000,0x00100000,0x00100000,
		0x00000000,0x00000000,0x00100000,0x00100000,
		0x10000000,0x10000000,0x10100000,0x10100000,
		0x10000000,0x10000000,0x10100000,0x10100000,
	},{
		0x00000000,0x00000000,0x00000010,0x00000010,
		0x00000000,0x00000000,0x00000010,0x00000010,
		0x00001000,0x00001000,0x00001010,0x00001010,
		0x00001000,0x00001000,0x00001010,0x00001010,
	},{
		0x00000000,0x00000000,0x00200000,0x00200000,
		0x00000000,0x00000000,0x00200000,0x00200000,
		0x20000000,0x20000000,0x20200000,0x20200000,
		0x20000000,0x20000000,0x20200000,0x20200000,
	},{
		0x00000000,0x00000000,0x00000020,0x00000020,
		0x00000000,0x00000000,0x00000020,0x00000020,
		0x00002000,0x00002000,0x00002020,0x00002020,
		0x00002000,0x00002000,0x00002020,0x00002020,
	},{
		0x00000000,0x00000000,0x00400000,0x00400000,
		0x00000000,0x00000000,0x00400000,0x00400000,
		0x40000000,0x40000000,0x40400000,0x40400000,
		0x40000000,0x40000000,0x40400000,0x40400000,
	},{
		0x00000000,0x00000000,0x00000040,0x00000040,
		0x00000000,0x00000000,0x00000040,0x00000040,
		0x00004000,0x00004000,0x00004040,0x00004040,
		0x00004000,0x00004000,0x00004040,0x00004040,
	},{
		0x00000000,0x00000000,0x00800000,0x00800000,
		0x00000000,0x00000000,0x00800000,0x00800000,
		0x80000000,0x80000000,0x80800000,0x80800000,
		0x80000000,0x80000000,0x80800000,0x80800000,
	},{
		0x00000000,0x00000000,0x00000080,0x00000080,
		0x00000000,0x00000000,0x00000080,0x00000080,
		0x00008000,0x00008000,0x00008080,0x00008080,
		0x00008000,0x00008000,0x00008080,0x00008080,
	},
};
static const uint32_t fp_maskl[16][16] = {
	{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x40000000,0x00400000,0x40400000,
		0x00004000,0x40004000,0x00404000,0x40404000,
		0x00000040,0x40000040,0x00400040,0x40400040,
		0x00004040,0x40004040,0x00404040,0x40404040,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x10000000,0x00100000,0x10100000,
		0x00001000,0x10001000,0x00101000,0x10101000,
		0x00000010,0x10000010,0x00100010,0x10100010,
		0x00001010,0x10001010,0x00101010,0x10101010,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x04000000,0x00040000,0x04040000,
		0x00000400,0x04000400,0x00040400,0x04040400,
		0x00000004,0x04000004,0x00040004,0x04040004,
		0x00000404,0x04000404,0x00040404,0x04040404,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x01000000,0x00010000,0x01010000,
		0x00000100,0x01000100,0x00010100,0x01010100,
		0x00000001,0x01000001,0x00010001,0x01010001,
		0x00000101,0x01000101,0x00010101,0x01010101,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x80000000,0x00800000,0x80800000,
		0x00008000,0x80008000,0x00808000,0x80808000,
		0x00000080,0x80000080,0x00800080,0x80800080,
		0x00008080,0x80008080,0x00808080,0x80808080,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x20000000,0x00200000,0x20200000,
		0x00002000,0x20002000,0x00202000,0x20202000,
		0x00000020,0x20000020,0x00200020,0x20200020,
		0x00002020,0x20002020,0x00202020,0x20202020,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x08000000,0x00080000,0x08080000,
		0x00000800,0x08000800,0x00080800,0x08080800,
		0x00000008,0x08000008,0x00080008,0x08080008,
		0x00000808,0x08000808,0x00080808,0x08080808,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x02000000,0x00020000,0x02020000,
		0x00000200,0x02000200,0x00020200,0x02020200,
		0x00000002,0x02000002,0x00020002,0x02020002,
		0x00000202,0x02000202,0x00020202,0x02020202,
	},
};
static const uint32_t fp_maskr[16][16] = {
	{
		0x00000000,0x40000000,0x00400000,0x40400000,
		0x00004000,0x40004000,0x00404000,0x40404000,
		0x00000040,0x40000040,0x00400040,0x40400040,
		0x00004040,0x40004040,0x00404040,0x40404040,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x10000000,0x00100000,0x10100000,
		0x00001000,0x10001000,0x00101000,0x10101000,
		0x00000010,0x10000010,0x00100010,0x10100010,
		0x00001010,0x10001010,0x00101010,0x10101010,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x04000000,0x00040000,0x04040000,
		0x00000400,0x04000400,0x00040400,0x04040400,
		0x00000004,0x04000004,0x00040004,0x04040004,
		0x00000404,0x04000404,0x00040404,0x04040404,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x01000000,0x00010000,0x01010000,
		0x00000100,0x01000100,0x00010100,0x01010100,
		0x00000001,0x01000001,0x00010001,0x01010001,
		0x00000101,0x01000101,0x00010101,0x01010101,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x80000000,0x00800000,0x80800000,
		0x00008000,0x80008000,0x00808000,0x80808000,
		0x00000080,0x80000080,0x00800080,0x80800080,
		0x00008080,0x80008080,0x00808080,0x80808080,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x20000000,0x00200000,0x20200000,
		0x00002000,0x20002000,0x00202000,0x20202000,
		0x00000020,0x20000020,0x00200020,0x20200020,
		0x00002020,0x20002020,0x00202020,0x20202020,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x08000000,0x00080000,0x08080000,
		0x00000800,0x08000800,0x00080800,0x08080800,
		0x00000008,0x08000008,0x00080008,0x08080008,
		0x00000808,0x08000808,0x00080808,0x08080808,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x02000000,0x00020000,0x02020000,
		0x00000200,0x02000200,0x00020200,0x02020200,
		0x00000002,0x02000002,0x00020002,0x02020002,
		0x00000202,0x02000202,0x00020202,0x02020202,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},
};
static const uint32_t key_perm_maskl[16][16] = {
	{
		0x00000000,0x00000000,0x00000010,0x00000010,
		0x00001000,0x00001000,0x00001010,0x00001010,
		0x00100000,0x00100000,0x00100010,0x00100010,
		0x00101000,0x00101000,0x00101010,0x00101010,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000020,0x00000020,
		0x00002000,0x00002000,0x00002020,0x00002020,
		0x00200000,0x00200000,0x00200020,0x00200020,
		0x00202000,0x00202000,0x00202020,0x00202020,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000040,0x00000040,
		0x00004000,0x00004000,0x00004040,0x00004040,
		0x00400000,0x00400000,0x00400040,0x00400040,
		0x00404000,0x00404000,0x00404040,0x00404040,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000080,0x00000080,
		0x00008000,0x00008000,0x00008080,0x00008080,
		0x00800000,0x00800000,0x00800080,0x00800080,
		0x00808000,0x00808000,0x00808080,0x00808080,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000001,0x00000100,0x00000101,
		0x00010000,0x00010001,0x00010100,0x00010101,
		0x01000000,0x01000001,0x01000100,0x01000101,
		0x01010000,0x01010001,0x01010100,0x01010101,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000002,0x00000200,0x00000202,
		0x00020000,0x00020002,0x00020200,0x00020202,
		0x02000000,0x02000002,0x02000200,0x02000202,
		0x02020000,0x02020002,0x02020200,0x02020202,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000004,0x00000400,0x00000404,
		0x00040000,0x00040004,0x00040400,0x00040404,
		0x04000000,0x04000004,0x04000400,0x04000404,
		0x04040000,0x04040004,0x04040400,0x04040404,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000008,0x00000800,0x00000808,
		0x00080000,0x00080008,0x00080800,0x00080808,
		0x08000000,0x08000008,0x08000800,0x08000808,
		0x08080000,0x08080008,0x08080800,0x08080808,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},
};
static const uint32_t key_perm_maskr[16][16] = {
	{
		0x00000000,0x00000001,0x00000000,0x00000001,
		0x00000000,0x00000001,0x00000000,0x00000001,
		0x00000000,0x00000001,0x00000000,0x00000001,
		0x00000000,0x00000001,0x00000000,0x00000001,
	},{
		0x00000000,0x00000000,0x00100000,0x00100000,
		0x00001000,0x00001000,0x00101000,0x00101000,
		0x00000010,0x00000010,0x00100010,0x00100010,
		0x00001010,0x00001010,0x00101010,0x00101010,
	},{
		0x00000000,0x00000002,0x00000000,0x00000002,
		0x00000000,0x00000002,0x00000000,0x00000002,
		0x00000000,0x00000002,0x00000000,0x00000002,
		0x00000000,0x00000002,0x00000000,0x00000002,
	},{
		0x00000000,0x00000000,0x00200000,0x00200000,
		0x00002000,0x00002000,0x00202000,0x00202000,
		0x00000020,0x00000020,0x00200020,0x00200020,
		0x00002020,0x00002020,0x00202020,0x00202020,
	},{
		0x00000000,0x00000004,0x00000000,0x00000004,
		0x00000000,0x00000004,0x00000000,0x00000004,
		0x00000000,0x00000004,0x00000000,0x00000004,
		0x00000000,0x00000004,0x00000000,0x00000004,
	},{
		0x00000000,0x00000000,0x00400000,0x00400000,
		0x00004000,0x00004000,0x00404000,0x00404000,
		0x00000040,0x00000040,0x00400040,0x00400040,
		0x00004040,0x00004040,0x00404040,0x00404040,
	},{
		0x00000000,0x00000008,0x00000000,0x00000008,
		0x00000000,0x00000008,0x00000000,0x00000008,
		0x00000000,0x00000008,0x00000000,0x00000008,
		0x00000000,0x00000008,0x00000000,0x00000008,
	},{
		0x00000000,0x00000000,0x00800000,0x00800000,
		0x00008000,0x00008000,0x00808000,0x00808000,
		0x00000080,0x00000080,0x00800080,0x00800080,
		0x00008080,0x00008080,0x00808080,0x00808080,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x01000000,0x01000000,
		0x00010000,0x00010000,0x01010000,0x01010000,
		0x00000100,0x00000100,0x01000100,0x01000100,
		0x00010100,0x00010100,0x01010100,0x01010100,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x02000000,0x02000000,
		0x00020000,0x00020000,0x02020000,0x02020000,
		0x00000200,0x00000200,0x02000200,0x02000200,
		0x00020200,0x00020200,0x02020200,0x02020200,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x04000000,0x04000000,
		0x00040000,0x00040000,0x04040000,0x04040000,
		0x00000400,0x00000400,0x04000400,0x04000400,
		0x00040400,0x00040400,0x04040400,0x04040400,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x08000000,0x08000000,
		0x00080000,0x00080000,0x08080000,0x08080000,
		0x00000800,0x00000800,0x08000800,0x08000800,
		0x00080800,0x00080800,0x08080800,0x08080800,
	},
};
static const uint32_t comp_maskl0[8][8] = {
	{
		0x00000000,0x00020000,0x00000001,0x00020001,
		0x00080000,0x000a0000,0x00080001,0x000a0001,
	},{
		0x00000000,0x00001000,0x00000000,0x00001000,
		0x00000040,0x00001040,0x00000040,0x00001040,
	},{
		0x00000000,0x00400000,0x00000020,0x00400020,
		0x00008000,0x00408000,0x00008020,0x00408020,
	},{
		0x00000000,0x00100000,0x00000800,0x00100800,
		0x00000000,0x00100000,0x00000800,0x00100800,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},
};
static const uint32_t comp_maskr0[8][8] = {
	{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00200000,0x00020000,0x00220000,
		0x00000002,0x00200002,0x00020002,0x00220002,
	},{
		0x00000000,0x00000000,0x00100000,0x00100000,
		0x00000004,0x00000004,0x00100004,0x00100004,
	},{
		0x00000000,0x00004000,0x00000800,0x00004800,
		0x00000000,0x00004000,0x00000800,0x00004800,
	},{
		0x00000000,0x00400000,0x00008000,0x00408000,
		0x00000008,0x00400008,0x00008008,0x00408008,
	},
};
static const uint32_t comp_maskl1[8][16] = {
	{
		0x00000000,0x00000010,0x00004000,0x00004010,
		0x00040000,0x00040010,0x00044000,0x00044010,
		0x00000100,0x00000110,0x00004100,0x00004110,
		0x00040100,0x00040110,0x00044100,0x00044110,
	},{
		0x00000000,0x00800000,0x00000002,0x00800002,
		0x00000200,0x00800200,0x00000202,0x00800202,
		0x00200000,0x00a00000,0x00200002,0x00a00002,
		0x00200200,0x00a00200,0x00200202,0x00a00202,
	},{
		0x00000000,0x00002000,0x00000004,0x00002004,
		0x00000400,0x00002400,0x00000404,0x00002404,
		0x00000000,0x00002000,0x00000004,0x00002004,
		0x00000400,0x00002400,0x00000404,0x00002404,
	},{
		0x00000000,0x00010000,0x00000008,0x00010008,
		0x00000080,0x00010080,0x00000088,0x00010088,
		0x00000000,0x00010000,0x00000008,0x00010008,
		0x00000080,0x00010080,0x00000088,0x00010088,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},
};
static const uint32_t comp_maskr1[8][16] = {
	{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
		0x00000000,0x00000000,0x00000000,0x00000000,
	},{
		0x00000000,0x00000000,0x00000080,0x00000080,
		0x00002000,0x00002000,0x00002080,0x00002080,
		0x00000001,0x00000001,0x00000081,0x00000081,
		0x00002001,0x00002001,0x00002081,0x00002081,
	},{
		0x00000000,0x00000010,0x00800000,0x00800010,
		0x00010000,0x00010010,0x00810000,0x00810010,
		0x00000200,0x00000210,0x00800200,0x00800210,
		0x00010200,0x00010210,0x00810200,0x00810210,
	},{
		0x00000000,0x00000400,0x00001000,0x00001400,
		0x00080000,0x00080400,0x00081000,0x00081400,
		0x00000020,0x00000420,0x00001020,0x00001420,
		0x00080020,0x00080420,0x00081020,0x00081420,
	},{
		0x00000000,0x00000100,0x00040000,0x00040100,
		0x00000000,0x00000100,0x00040000,0x00040100,
		0x00000040,0x00000140,0x00040040,0x00040140,
		0x00000040,0x00000140,0x00040040,0x00040140,
	},
};

static const u_char ascii64[] =
	 "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
/*	  0000000000111111111122222222223333333333444444444455555555556666 */
/*	  0123456789012345678901234567890123456789012345678901234567890123 */

/*
 * We match the behavior of UFC-crypt on systems where "char" is signed by
 * default (the majority), regardless of char's signedness on our system.
 */
static inline int
ascii_to_bin(int ch)
{
	int sch = ch>127 ? -(256-ch) : ch;
	int retval;

	retval = sch - '.';
	if (sch >= 'A') {
		retval = sch - ('A' - 12);
		if (sch >= 'a')
			retval = sch - ('a' - 38);
	}
	retval &= 0x3f;

	return retval;
}

/*
 * When we choose to "support" invalid salts, nevertheless disallow those
 * containing characters that would violate the passwd file format.
 */
static inline int
ascii_is_unsafe(int ch)
{
	return !ch || ch == '\n' || ch == ':';
}

static void
setup_salt(u_int32_t salt, struct _crypt_extended_local *local)
{
	u_int32_t obit, saltbit, saltbits;
	int i;

	saltbits = 0;
	saltbit = 1;
	obit = 0x800000;
	for (i = 0; i < 24; i++) {
		if (salt & saltbit)
			saltbits |= obit;
		saltbit <<= 1;
		obit >>= 1;
	}
	local->saltbits = saltbits;
}

static int
des_setkey(const u_char *key,
	struct _crypt_extended_local *local)
{
	u_int32_t k0, k1, rawkey0, rawkey1;
	int shifts, round;

	rawkey0 =
		(u_int32_t)(u_char)key[3] |
		((u_int32_t)(u_char)key[2] << 8) |
		((u_int32_t)(u_char)key[1] << 16) |
		((u_int32_t)(u_char)key[0] << 24);
	rawkey1 =
		(u_int32_t)(u_char)key[7] |
		((u_int32_t)(u_char)key[6] << 8) |
		((u_int32_t)(u_char)key[5] << 16) |
		((u_int32_t)(u_char)key[4] << 24);

	/*
	 * Do key permutation and split into two 28-bit subkeys.
	 */
	{
		int i, inbit;
		k0 = k1 = 0;
		for (i = 0, inbit = 28; i < 8; i++, inbit -= 4) {
			k0 |= key_perm_maskl[i][(rawkey0 >> inbit) & 0xf] |
			     key_perm_maskl[i + 8][(rawkey1 >> inbit) & 0xf];
			k1 |= key_perm_maskr[i][(rawkey0 >> inbit) & 0xf] |
			     key_perm_maskr[i + 8][(rawkey1 >> inbit) & 0xf];
		}
	}

	/*
	 * Rotate subkeys and do compression permutation.
	 */
	shifts = 0;
	for (round = 0; round < 16; round++) {
		u_int32_t t0, t1;

		shifts += key_shifts[round];

		t0 = (k0 << shifts) | (k0 >> (28 - shifts));
		t1 = (k1 << shifts) | (k1 >> (28 - shifts));

		{
			int i, inbit;
			u_int32_t kl, kr;
			kl = kr = 0;
			inbit = 25;
			for (i = 0; i < 4; i++) {
				kl |= comp_maskl0[i][(t0 >> inbit) & 7] |
				      comp_maskl0[i + 4][(t1 >> inbit) & 7];
				kr |= comp_maskr0[i][(t0 >> inbit) & 7] |
				      comp_maskr0[i + 4][(t1 >> inbit) & 7];
				inbit -= 4;
				kl |= comp_maskl1[i][(t0 >> inbit) & 0xf] |
				      comp_maskl1[i + 4][(t1 >> inbit) & 0xf];
				kr |= comp_maskr1[i][(t0 >> inbit) & 0xf] |
				      comp_maskr1[i + 4][(t1 >> inbit) & 0xf];
				inbit -= 3;
			}
			local->en_keysl[round] = kl;
			local->en_keysr[round] = kr;
		}
	}

	return 0;
}

static int
do_des(u_int32_t l_in, u_int32_t r_in, u_int32_t *l_out, u_int32_t *r_out,
	int count,
	struct _crypt_extended_local *local)
{
	/*
	 * l_in, r_in, l_out, and r_out are in pseudo-"big-endian" format.
	 */
	u_int32_t l, r, *kl, *kr, *kl1, *kr1;
	u_int32_t f, r48l, r48r, saltbits;
	int round;

	kl1 = local->en_keysl;
	kr1 = local->en_keysr;

	/*
	 * Do initial permutation (IP).
	 */
	l = r = 0;
	if (l_in | r_in) {
		int i, inbit;
		for (i = 0, inbit = 28; i < 8; i++, inbit -= 4) {
			l |= ip_maskl[i][(l_in >> inbit) & 0xf] |
			     ip_maskl[i + 8][(r_in >> inbit) & 0xf];
			r |= ip_maskr[i][(l_in >> inbit) & 0xf] |
			     ip_maskr[i + 8][(r_in >> inbit) & 0xf];
		}
	}

	saltbits = local->saltbits;
	while (count--) {
		/*
		 * Do each round.
		 */
		kl = kl1;
		kr = kr1;
		round = 16;
		while (round--) {
			/*
			 * Expand R to 48 bits (simulate the E-box).
			 */
			r48l	= ((r & 0x00000001) << 23)
				| ((r & 0xf8000000) >> 9)
				| ((r & 0x1f800000) >> 11)
				| ((r & 0x01f80000) >> 13)
				| ((r & 0x001f8000) >> 15);

			r48r	= ((r & 0x0001f800) << 7)
				| ((r & 0x00001f80) << 5)
				| ((r & 0x000001f8) << 3)
				| ((r & 0x0000001f) << 1)
				| ((r & 0x80000000) >> 31);
			/*
			 * Do salting for crypt() and friends, and
			 * XOR with the permuted key.
			 */
			f = (r48l ^ r48r) & saltbits;
			r48l ^= f ^ *kl++;
			r48r ^= f ^ *kr++;
			/*
			 * Do S-box lookups (which shrink it back to 32 bits)
			 * and do the P-box permutation at the same time.
			 */
			f = psbox[0][r48l >> 18]
			  | psbox[1][(r48l >> 12) & 0x3f]
			  | psbox[2][(r48l >> 6) & 0x3f]
			  | psbox[3][r48l & 0x3f]
			  | psbox[4][r48r >> 18]
			  | psbox[5][(r48r >> 12) & 0x3f]
			  | psbox[6][(r48r >> 6) & 0x3f]
			  | psbox[7][r48r & 0x3f];
			/*
			 * Now that we've permuted things, complete f().
			 */
			f ^= l;
			l = r;
			r = f;
		}
		r = l;
		l = f;
	}

	/*
	 * Do final permutation (inverse of IP).
	 */
	{
		int i, inbit;
		u_int32_t lo, ro;
		lo = ro = 0;
		for (i = 0, inbit = 28; i < 8; i++, inbit -= 4) {
			lo |= fp_maskl[i][(l >> inbit) & 0xf] |
			     fp_maskl[i + 8][(r >> inbit) & 0xf];
			ro |= fp_maskr[i][(l >> inbit) & 0xf] |
			     fp_maskr[i + 8][(r >> inbit) & 0xf];
		}
		*l_out = lo;
		*r_out = ro;
	}

	return 0;
}

static int
des_cipher(const u_char *in, u_char *out, u_int32_t salt, int count,
	struct _crypt_extended_local *local)
{
	u_int32_t l_out, r_out, rawl, rawr;
	int retval;

	setup_salt(salt, local);

	rawl =
		(u_int32_t)(u_char)in[3] |
		((u_int32_t)(u_char)in[2] << 8) |
		((u_int32_t)(u_char)in[1] << 16) |
		((u_int32_t)(u_char)in[0] << 24);
	rawr =
		(u_int32_t)(u_char)in[7] |
		((u_int32_t)(u_char)in[6] << 8) |
		((u_int32_t)(u_char)in[5] << 16) |
		((u_int32_t)(u_char)in[4] << 24);

	retval = do_des(rawl, rawr, &l_out, &r_out, count, local);

	out[0] = l_out >> 24;
	out[1] = l_out >> 16;
	out[2] = l_out >> 8;
	out[3] = l_out;
	out[4] = r_out >> 24;
	out[5] = r_out >> 16;
	out[6] = r_out >> 8;
	out[7] = r_out;

	return retval;
}

char *
_crypt_extended_r(const char *key, const char *setting, char *output)
{
	int i;
	u_int32_t count, salt, l, r0, r1, keybuf[2];
	u_char *p, *q;
	struct _crypt_extended_local local = { 0 };

	/*
	 * Copy the key, shifting each character up by one bit
	 * and padding with zeros.
	 */
	q = (u_char *) keybuf;
	while (q - (u_char *) keybuf < sizeof(keybuf)) {
		*q++ = *key << 1;
		if (*key)
			key++;
	}
	if (des_setkey((u_char *) keybuf, &local))
		return NULL;

	if (*setting == _PASSWORD_EFMT1) {
		/*
		 * "new"-style:
		 *	setting - underscore, 4 chars of count, 4 chars of salt
		 *	key - unlimited characters
		 */
		for (i = 1, count = 0; i < 5; i++) {
			int value = ascii_to_bin(setting[i]);
			if (ascii64[value] != setting[i])
				return NULL;
			count |= value << (i - 1) * 6;
		}
		if (!count)
			return NULL;

		for (i = 5, salt = 0; i < 9; i++) {
			int value = ascii_to_bin(setting[i]);
			if (ascii64[value] != setting[i])
				return NULL;
			salt |= value << (i - 5) * 6;
		}

		while (*key) {
			/*
			 * Encrypt the key with itself.
			 */
			if (des_cipher((u_char *) keybuf, (u_char *) keybuf,
			    0, 1, &local))
				return NULL;
			/*
			 * And XOR with the next 8 characters of the key.
			 */
			q = (u_char *) keybuf;
			while (q - (u_char *) keybuf < sizeof(keybuf) && *key)
				*q++ ^= *key++ << 1;

			if (des_setkey((u_char *) keybuf, &local))
				return NULL;
		}
		memcpy(output, setting, 9);
		output[9] = '\0';
		p = (u_char *) output + 9;
	} else {
		/*
		 * "old"-style:
		 *	setting - 2 chars of salt
		 *	key - up to 8 characters
		 */
		count = 25;

		if (ascii_is_unsafe(setting[0]) || ascii_is_unsafe(setting[1]))
			return NULL;

		salt = (ascii_to_bin(setting[1]) << 6)
		     |  ascii_to_bin(setting[0]);

		output[0] = setting[0];
		output[1] = setting[1];
		p = (u_char *) output + 2;
	}
	setup_salt(salt, &local);
	/*
	 * Do it.
	 */
	if (do_des(0, 0, &r0, &r1, count, &local))
		return NULL;
	/*
	 * Now encode the result...
	 */
	l = (r0 >> 8);
	*p++ = ascii64[(l >> 18) & 0x3f];
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];

	l = (r0 << 16) | ((r1 >> 16) & 0xffff);
	*p++ = ascii64[(l >> 18) & 0x3f];
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];

	l = r1 << 2;
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];
	*p = 0;

	return output;
}

static char *
_crypt_extended(const char *key, const char *setting)
{
	static char output[21];
	return _crypt_extended_r(key, setting, output);
}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13  1:18 ` Rich Felker
@ 2012-06-13  6:10   ` Szabolcs Nagy
  2012-06-13 12:43     ` Solar Designer
  2012-06-13 12:58     ` Rich Felker
  2012-06-13 12:07   ` Solar Designer
  1 sibling, 2 replies; 17+ messages in thread
From: Szabolcs Nagy @ 2012-06-13  6:10 UTC (permalink / raw)
  To: musl

* Rich Felker <dalias@aerifal.cx> [2012-06-12 21:18:42 -0400]:
> On Wed, Jun 13, 2012 at 03:51:13AM +0400, Solar Designer wrote:
> > Rich -
> > 
> > As discussed on IRC, here is a revision of the FreeSec crypt() code with
> 
> Thanks. Here's a _really_ quick draft, untested, of the direction I
> wanted to take it with making the tables static-initialized. Note that

my comments:

> #include <sys/types.h>
> #include <string.h>
> 
> struct _crypt_extended_local {
> 	u_int32_t saltbits;
> 	u_int32_t en_keysl[16], en_keysr[16];
> };
> 

-#include <sys/types.h>
+#include <stdint.h>

s/u_int32_t/uint32_t/g
s/u_char/unsigned char/g

> static inline int
> ascii_to_bin(int ch)
> {
> 	int sch = ch>127 ? -(256-ch) : ch;
> 	int retval;
> 
> 	retval = sch - '.';
> 	if (sch >= 'A') {
> 		retval = sch - ('A' - 12);
> 		if (sch >= 'a')
> 			retval = sch - ('a' - 38);
> 	}
> 	retval &= 0x3f;
> 
> 	return retval;
> }
> 

s/inline//

on [-128,255] the following code gives the same result:

static int ascii_to_bin2(int c)
{
	if (c >= 128)
		c += -128 + 18;
	else if (c >= 97)
		c += -97 + 38;
	else if (c >= 65)
		c += -65 + 12;
	else
		c += 128 + 18;
	return (unsigned)c % 64;
}


> char *
> _crypt_extended_r(const char *key, const char *setting, char *output)
> {
...
> 	while (q - (u_char *) keybuf < sizeof(keybuf)) {
> 		*q++ = *key << 1;

implementation-defined signed shift

> 		for (i = 1, count = 0; i < 5; i++) {
> 			int value = ascii_to_bin(setting[i]);
> 			if (ascii64[value] != setting[i])
> 				return NULL;
> 			count |= value << (i - 1) * 6;
> 		}

signed shift (harmless)

> 			while (q - (u_char *) keybuf < sizeof(keybuf) && *key)
> 				*q++ ^= *key++ << 1;

signed shift



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13  1:18 ` Rich Felker
  2012-06-13  6:10   ` Szabolcs Nagy
@ 2012-06-13 12:07   ` Solar Designer
  2012-06-13 14:53     ` Rich Felker
  1 sibling, 1 reply; 17+ messages in thread
From: Solar Designer @ 2012-06-13 12:07 UTC (permalink / raw)
  To: musl

On Tue, Jun 12, 2012 at 09:18:42PM -0400, Rich Felker wrote:
> Thanks. Here's a _really_ quick draft, untested, of the direction I
> wanted to take it with making the tables static-initialized. Note that
> doing so allowed removing all of the table-creation code and most of
> the existing tables. For me, it's weighing in at ~12k.

This sounds good, yet it's an increase from ~5k for the version I posted
(~6k with tests).  I understand that you're avoiding having memory pages
that would be written to by each DES-crypt-using process and thus not
shared - but that would only happen when this code is actually used,
which I think it usually won't be (this is backwards compatibility
stuff; we need to introduce modern crypt() flavors for actual use).

That said, the decision is yours to make, and you appear to have made it.

> I also removed
> some unused code for things like keeping salt/keys between multiple
> runs since the data is not preserved across multiple runs anyway,

It can be preserved if you make the (tiny) struct _crypt_extended_local
static.

> and the lookup tables for bitshifts.

As discussed on IRC, I had tried that, but reverted my changes because
they resulted in code size increase (greater than the savings from not
having those tables).  Apparently, the table lookups helped avoid mov's
(on 2-operand archs like x86) and add/sub (becomes part of effective
address calculation).  Did you check how this affected total size in
your case?  Or do you prefer cleaner source code anyway (makes sense)?

> > Also, we could want to add a runtime self-test, which would detect
> > possible miscompiles.
> 
> I understand your motivation for doing this with security-critical
> things, but really most/all of libc is security-critical, and we can't
> have runtime miscompilation tests all over the place. Moreover, the
> vast majority of cases of GCC "miscompiling" something turn out to be
> code that's invoking undefined behavior; the only non-UB example I've
> encountered while working on musl is gcc 4.7's bad breakage, which is
> so bad you can't even get programs to start...

I agree that the vast majority of cases may involve UB, but so what - we
can't be 100% certain we don't have any cases of UB in our source code
even if we review it very carefully.  We might miss things, and besides
source code changes over time may introduce UB (e.g., changes in
variable declarations may result in code that continues to work
correctly, but actually depends on undefined behavior).

We have a fairly large chunk of code here (a few KB), which we can have
self-tested by a few hundred bytes of code more.  I think this is worth
it.  Besides, the self-test may serve to zeroize sensitive data - and in
a more reliable way than a memset() would (which may be optimized out,
and it would not necessarily zeroize stack and registers).  In fact, I
am using this approach to zeroization in crypt_blowfish.  Yes, the
self-test needs to run _after_ hashing the real password, and then
depending on self-test result you either return the originally computed
value or you return an error indication (what to return from crypt() on
error is a separate non-trivial topic).

>  * This version is derived from the original implementation of FreeSec
...

Can you please add a comment documenting your changes, too?

I notice that you fully dropped the tests - are you going to have
compile-time tests elsewhere?

Thanks,

Alexander


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13  6:10   ` Szabolcs Nagy
@ 2012-06-13 12:43     ` Solar Designer
  2012-06-13 12:58     ` Rich Felker
  1 sibling, 0 replies; 17+ messages in thread
From: Solar Designer @ 2012-06-13 12:43 UTC (permalink / raw)
  To: musl

On Wed, Jun 13, 2012 at 08:10:32AM +0200, Szabolcs Nagy wrote:
> static int ascii_to_bin2(int c)
> {
> 	if (c >= 128)
> 		c += -128 + 18;
> 	else if (c >= 97)
> 		c += -97 + 38;
> 	else if (c >= 65)
> 		c += -65 + 12;
> 	else
> 		c += 128 + 18;
> 	return (unsigned)c % 64;
> }

I think we should use a lookup table instead, which would also reduce
the timing leaks we have here.  Salts are not secret (not any more than
the hashes), but they're not supposed to be known to an attacker in
advance (before the hashes are possibly leaked/stolen) and knowing them
(but not yet the hashes) may enable char-by-char timing attacks on
hash encoding comparisons (thereby leaking the hashes after _many_
tries).  This is almost purely theoretical for a number of reasons,
yet if we strive for perfection, we should do the right thing in this
respect as well.

Perhaps the same lookup table should also replace ascii_is_unsafe() by
returning a value outside of the 0 .. 63 range in the unsafe cases.
We'd need to check for such out of range values, then.

Thank you for your comments!

Alexander


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13  6:10   ` Szabolcs Nagy
  2012-06-13 12:43     ` Solar Designer
@ 2012-06-13 12:58     ` Rich Felker
  2012-06-13 13:18       ` Solar Designer
  1 sibling, 1 reply; 17+ messages in thread
From: Rich Felker @ 2012-06-13 12:58 UTC (permalink / raw)
  To: musl

On Wed, Jun 13, 2012 at 08:10:32AM +0200, Szabolcs Nagy wrote:
> > char *
> > _crypt_extended_r(const char *key, const char *setting, char *output)
> > {
> ....
> > 	while (q - (u_char *) keybuf < sizeof(keybuf)) {
> > 		*q++ = *key << 1;
> 
> implementation-defined signed shift

Undefined. Right shifts are implementation-defined. Left-shifts are
defined only for positive operands that don't overflow.

Note that even if the behavior were defined, this code seems to have
different behavior for high bytes depending on the signedness of char.
That looks like a major bug; I thought that bug in crypt
implementations was fixed years ago.

> > 		for (i = 1, count = 0; i < 5; i++) {
> > 			int value = ascii_to_bin(setting[i]);
> > 			if (ascii64[value] != setting[i])
> > 				return NULL;
> > 			count |= value << (i - 1) * 6;
> > 		}
> 
> signed shift (harmless)

Yes this one seems to be harmless unless it can overflow, and from the
loop bounds it seems impossible to overflow.

> > 			while (q - (u_char *) keybuf < sizeof(keybuf) && *key)
> > 				*q++ ^= *key++ << 1;
> 
> signed shift

This one looks like it has the same bug as the first.

Rich


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13 12:58     ` Rich Felker
@ 2012-06-13 13:18       ` Solar Designer
  2012-06-13 14:56         ` Rich Felker
  0 siblings, 1 reply; 17+ messages in thread
From: Solar Designer @ 2012-06-13 13:18 UTC (permalink / raw)
  To: musl

On Wed, Jun 13, 2012 at 08:58:39AM -0400, Rich Felker wrote:
> On Wed, Jun 13, 2012 at 08:10:32AM +0200, Szabolcs Nagy wrote:
> > > 	while (q - (u_char *) keybuf < sizeof(keybuf)) {
> > > 		*q++ = *key << 1;
> > 
> > implementation-defined signed shift
> 
> Undefined. Right shifts are implementation-defined. Left-shifts are
> defined only for positive operands that don't overflow.

Right.  I'm going to fix that.

> Note that even if the behavior were defined, this code seems to have
> different behavior for high bytes depending on the signedness of char.

Do you mean high bits (not bytes)?  Why would signedness of char matter
if the behavior of the signed char overflowing left shift were defined?

Alexander


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13 12:07   ` Solar Designer
@ 2012-06-13 14:53     ` Rich Felker
  2012-06-24  7:21       ` Solar Designer
  0 siblings, 1 reply; 17+ messages in thread
From: Rich Felker @ 2012-06-13 14:53 UTC (permalink / raw)
  To: musl

On Wed, Jun 13, 2012 at 04:07:54PM +0400, Solar Designer wrote:
> > I also removed
> > some unused code for things like keeping salt/keys between multiple
> > runs since the data is not preserved across multiple runs anyway,
> 
> It can be preserved if you make the (tiny) struct _crypt_extended_local
> static.

I suppose this is worth considering. I was assuming it would be
difficult to preserve (require otherwise-unneeded locking), but for
crypt_r each caller could be responsible for its own copy.

However, on the other hand I'm not sure I see the benefit. Why would
you call crypt or crypt_r multiple times with the same key/salt except
possibly for password cracking (in which case you'll want a highly
optimized-for-speed implementation)?

> > and the lookup tables for bitshifts.
> 
> As discussed on IRC, I had tried that, but reverted my changes because
> they resulted in code size increase (greater than the savings from not
> having those tables).  Apparently, the table lookups helped avoid mov's
> (on 2-operand archs like x86) and add/sub (becomes part of effective
> address calculation).  Did you check how this affected total size in
> your case?  Or do you prefer cleaner source code anyway (makes sense)?

I didn't compare, but yes I do prefer cleaner source. I'm a bit
surprised that you found the difference in code size to be >136 bytes
though. Are you sure the code size changed by that much?

> > > Also, we could want to add a runtime self-test, which would detect
> > > possible miscompiles.
> > 
> > I understand your motivation for doing this with security-critical
> > things, but really most/all of libc is security-critical, and we can't
> > have runtime miscompilation tests all over the place. Moreover, the
> > vast majority of cases of GCC "miscompiling" something turn out to be
> > code that's invoking undefined behavior; the only non-UB example I've
> > encountered while working on musl is gcc 4.7's bad breakage, which is
> > so bad you can't even get programs to start...
> 
> I agree that the vast majority of cases may involve UB, but so what - we
> can't be 100% certain we don't have any cases of UB in our source code
> even if we review it very carefully.

For purely computational code that doesn't recurse or loop arbitrarily
long based on the input or read and write all over memory, static
analysis can determine the absence of UB. Having no UB, and having the
code tested with both signed and unsigned char (easy with gcc -f
options) should give an extremely high level of confidence that cases
which have been tested on one system/compiler will behave identically
on another system or another compiler.

> We might miss things, and besides
> source code changes over time may introduce UB (e.g., changes in
> variable declarations may result in code that continues to work
> correctly, but actually depends on undefined behavior).

And the same is true elsewhere in much more critical code, like
snprintf, strcpy, strchr, etc. all of which could create huge vulns if
miscompiled. What I was saying is that if you're going to call into
question the compiler, I think you need some basis for deciding which
functions need to double-check that their results are correct, unless
your intent is to make ALL of your code fault-tolerant in the form of
double-checking its results. The latter would certainly be an
interesting project in itself, but I think it goes way beyond the
scope (and contrary to many of the goals, including size, speed, and
simplicity) of musl. If one really wants to make such a system, I
think it would make a lot more sense to have the checking
automatically generated by the compiler/toolchain (e.g. use 2 separate
compilers and compare the results constantly).

> We have a fairly large chunk of code here (a few KB), which we can have
> self-tested by a few hundred bytes of code more.  I think this is worth
> it.

I'm open to seeing them if they're really that small and non-invasive.

> Besides, the self-test may serve to zeroize sensitive data - and in
> a more reliable way than a memset() would (which may be optimized out,
> and it would not necessarily zeroize stack and registers).  In fact, I
> am using this approach to zeroization in crypt_blowfish.  Yes, the
> self-test needs to run _after_ hashing the real password, and then
> depending on self-test result you either return the originally computed


> value or you return an error indication (what to return from crypt() on
> error is a separate non-trivial topic).

I saw the notes on this. What real-world code breaks from the
conformant behavior?

> >  * This version is derived from the original implementation of FreeSec
> ....
> 
> Can you please add a comment documenting your changes, too?

Sure, I'll do that before committing. What I sent to the list was just
a quick draft, and I certainly don't want to misrepresent my modified
version as if it were your original, especially if it has bugs. I
really should have added a comment before even sending it...

> I notice that you fully dropped the tests - are you going to have
> compile-time tests elsewhere?

I just did it while shuffling around the code -- I think I moved them
and the adjacent table-gen code I wrote to a separate file at the same
time. I'm fairly indifferent to putting them in the file, but I'd much
rather have a test for the external API in an external test module as
the actual tests we use.

Rich


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13 13:18       ` Solar Designer
@ 2012-06-13 14:56         ` Rich Felker
  2012-06-13 16:45           ` Solar Designer
  0 siblings, 1 reply; 17+ messages in thread
From: Rich Felker @ 2012-06-13 14:56 UTC (permalink / raw)
  To: musl

On Wed, Jun 13, 2012 at 05:18:07PM +0400, Solar Designer wrote:
> > Note that even if the behavior were defined, this code seems to have
> > different behavior for high bytes depending on the signedness of char.
> 
> Do you mean high bits (not bytes)?

"High bytes" is a poorly defined term I used to mean bytes with their
high bit set, or bytes outside the ASCII range, etc.

> Why would signedness of char matter
> if the behavior of the signed char overflowing left shift were defined?

Well if char is signed, (char)0x80 << 1 is -256. If char is unsigned,
(char)0x80 << 1 is 256.

Rich


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13 14:56         ` Rich Felker
@ 2012-06-13 16:45           ` Solar Designer
  2012-06-13 17:27             ` Rich Felker
  2012-06-13 17:32             ` Szabolcs Nagy
  0 siblings, 2 replies; 17+ messages in thread
From: Solar Designer @ 2012-06-13 16:45 UTC (permalink / raw)
  To: musl

On Wed, Jun 13, 2012 at 10:56:03AM -0400, Rich Felker wrote:
> On Wed, Jun 13, 2012 at 05:18:07PM +0400, Solar Designer wrote:
> > > Note that even if the behavior were defined, this code seems to have
> > > different behavior for high bytes depending on the signedness of char.
...
> > Why would signedness of char matter
> > if the behavior of the signed char overflowing left shift were defined?
> 
> Well if char is signed, (char)0x80 << 1 is -256. If char is unsigned,
> (char)0x80 << 1 is 256.

Sure, but we had:

	const char *key;
	u_char *q;
	*q++ = *key << 1;

so while *key << 1 is either -256 or 256 (promoted to int or unsigned
int), those high bits get dropped on the assignment to *q anyway,
resulting in the same value there either way.  No?

Alexander


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13 16:45           ` Solar Designer
@ 2012-06-13 17:27             ` Rich Felker
  2012-06-13 17:32             ` Szabolcs Nagy
  1 sibling, 0 replies; 17+ messages in thread
From: Rich Felker @ 2012-06-13 17:27 UTC (permalink / raw)
  To: musl

On Wed, Jun 13, 2012 at 08:45:46PM +0400, Solar Designer wrote:
> On Wed, Jun 13, 2012 at 10:56:03AM -0400, Rich Felker wrote:
> > On Wed, Jun 13, 2012 at 05:18:07PM +0400, Solar Designer wrote:
> > > > Note that even if the behavior were defined, this code seems to have
> > > > different behavior for high bytes depending on the signedness of char.
> ....
> > > Why would signedness of char matter
> > > if the behavior of the signed char overflowing left shift were defined?
> > 
> > Well if char is signed, (char)0x80 << 1 is -256. If char is unsigned,
> > (char)0x80 << 1 is 256.
> 
> Sure, but we had:
> 
> 	const char *key;
> 	u_char *q;
> 	*q++ = *key << 1;
> 
> so while *key << 1 is either -256 or 256 (promoted to int or unsigned
> int), those high bits get dropped on the assignment to *q anyway,
> resulting in the same value there either way.  No?

You're right on that. Ideally the functions should just take arguments
of type unsigned char *, and the crypt/crypt_r wrapper should cast the
original char * to unsigned char *. This is the same way all the
standard string functions (like strcmp) are required to work.

Casting/converting the _value_ after reading it also happens to work,
and is sufficient for musl's purposes (we assume, per POSIX, that
CHAR_BIT is 8, but also that signed types are twos complement), but
only reinterpreting by casting the pointer before reading it is 100%
portable. On a non-twos-complement machine, reading a signed char is
lossy (it can only obtain 255 possible values, not 256).

Rich


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13 16:45           ` Solar Designer
  2012-06-13 17:27             ` Rich Felker
@ 2012-06-13 17:32             ` Szabolcs Nagy
  2012-06-13 17:36               ` Rich Felker
  1 sibling, 1 reply; 17+ messages in thread
From: Szabolcs Nagy @ 2012-06-13 17:32 UTC (permalink / raw)
  To: musl

* Solar Designer <solar@openwall.com> [2012-06-13 20:45:46 +0400]:
> On Wed, Jun 13, 2012 at 10:56:03AM -0400, Rich Felker wrote:
> > Well if char is signed, (char)0x80 << 1 is -256. If char is unsigned,
> > (char)0x80 << 1 is 256.
> 
> Sure, but we had:
> 
> 	const char *key;
> 	u_char *q;
> 	*q++ = *key << 1;
> 
> so while *key << 1 is either -256 or 256 (promoted to int or unsigned
> int), those high bits get dropped on the assignment to *q anyway,
> resulting in the same value there either way.  No?

yes the code happens to work whenever -128<<1 is -256

and i assume -256 is what most compilers will give
usually in case of two's complement int representation

but -128<<1 is UB and should be fixed anyway


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13 17:32             ` Szabolcs Nagy
@ 2012-06-13 17:36               ` Rich Felker
  0 siblings, 0 replies; 17+ messages in thread
From: Rich Felker @ 2012-06-13 17:36 UTC (permalink / raw)
  To: musl

On Wed, Jun 13, 2012 at 07:32:48PM +0200, Szabolcs Nagy wrote:
> * Solar Designer <solar@openwall.com> [2012-06-13 20:45:46 +0400]:
> > On Wed, Jun 13, 2012 at 10:56:03AM -0400, Rich Felker wrote:
> > > Well if char is signed, (char)0x80 << 1 is -256. If char is unsigned,
> > > (char)0x80 << 1 is 256.
> > 
> > Sure, but we had:
> > 
> > 	const char *key;
> > 	u_char *q;
> > 	*q++ = *key << 1;
> > 
> > so while *key << 1 is either -256 or 256 (promoted to int or unsigned
> > int), those high bits get dropped on the assignment to *q anyway,
> > resulting in the same value there either way.  No?
> 
> yes the code happens to work whenever -128<<1 is -256
> 
> and i assume -256 is what most compilers will give
> usually in case of two's complement int representation
> 
> but -128<<1 is UB and should be fixed anyway

Note that x<<1 is always equal to x*2 when both are defined, and the
latter is defined in our case since the range of x is much smaller
than half the range of int. So if the underlying pointer type issue
isn't fixed, just changing <<1 to *2 would work.

In general, portable code wanting to use x<<n with negative values of
x could replace it with x*(1<<n) and hope the compiler is smart enough
to generate code equivalent to the traditional behavior of x<<n (i.e.
hope it can apply the associativity to * and <<).

Rich


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-13 14:53     ` Rich Felker
@ 2012-06-24  7:21       ` Solar Designer
  2012-06-24  7:32         ` Solar Designer
  2012-06-25  3:51         ` Rich Felker
  0 siblings, 2 replies; 17+ messages in thread
From: Solar Designer @ 2012-06-24  7:21 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 4594 bytes --]

On Wed, Jun 13, 2012 at 10:53:18AM -0400, Rich Felker wrote:
> On Wed, Jun 13, 2012 at 04:07:54PM +0400, Solar Designer wrote:
> > > I also removed
> > > some unused code for things like keeping salt/keys between multiple
> > > runs since the data is not preserved across multiple runs anyway,
> > 
> > It can be preserved if you make the (tiny) struct _crypt_extended_local
> > static.
> 
> I suppose this is worth considering. I was assuming it would be
> difficult to preserve (require otherwise-unneeded locking), but for
> crypt_r each caller could be responsible for its own copy.
> 
> However, on the other hand I'm not sure I see the benefit. Why would
> you call crypt or crypt_r multiple times with the same key/salt except
> possibly for password cracking (in which case you'll want a highly
> optimized-for-speed implementation)?

You're right.  Additionally, preserving this requires that we keep
sensitive data around (from the previous password hashed).  I've dropped
this stuff now.

> > > and the lookup tables for bitshifts.
> > 
> > As discussed on IRC, I had tried that, but reverted my changes because
> > they resulted in code size increase (greater than the savings from not
> > having those tables).  Apparently, the table lookups helped avoid mov's
> > (on 2-operand archs like x86) and add/sub (becomes part of effective
> > address calculation).  Did you check how this affected total size in
> > your case?  Or do you prefer cleaner source code anyway (makes sense)?
> 
> I didn't compare, but yes I do prefer cleaner source. I'm a bit
> surprised that you found the difference in code size to be >136 bytes
> though. Are you sure the code size changed by that much?

Yes, and I was surprised too.  However, I've re-introduced those changes
now, and along with other changes I made there appears to be almost no
size increase from them.

> > > > Also, we could want to add a runtime self-test, which would detect
> > > > possible miscompiles.
> > > 
> > > I understand your motivation for doing this with security-critical
> > > things, but really most/all of libc is security-critical, and we can't
> > > have runtime miscompilation tests all over the place. Moreover, the
> > > vast majority of cases of GCC "miscompiling" something turn out to be
> > > code that's invoking undefined behavior; the only non-UB example I've
> > > encountered while working on musl is gcc 4.7's bad breakage, which is
> > > so bad you can't even get programs to start...
> > 
> > I agree that the vast majority of cases may involve UB, but so what - we
> > can't be 100% certain we don't have any cases of UB in our source code
> > even if we review it very carefully.
> 
> For purely computational code that doesn't recurse or loop arbitrarily
> long based on the input or read and write all over memory, static
> analysis can determine the absence of UB.

Are you performing such static analysis on musl?

> > We have a fairly large chunk of code here (a few KB), which we can have
> > self-tested by a few hundred bytes of code more.  I think this is worth
> > it.
> 
> I'm open to seeing them if they're really that small and non-invasive.

I've added a runtime self-test now.  The cost appears to be 200 to 300
bytes of code and read-only data.

Another reason to have this for crypt() when we don't have it for many
other functions is that it is more difficult to recover from improperly
hashed passwords (than from some other bug in another libc interface).
Rather than just fix the bug, we might have to add backwards
compatibility code.  Even worse, those improperly computed hashes may be
a security risk that is not necessarily avoided by simply installing
fixed software.

> > value or you return an error indication (what to return from crypt() on
> > error is a separate non-trivial topic).
> 
> I saw the notes on this. What real-world code breaks from the
> conformant behavior?

I think the majority of daemons, etc. that do authentication with
crypt() don't handle possible NULL returns.  This is just starting to
change now.  They don't really break under normal conditions because
normally crypt() returns the hashed password and not NULL; but if it
does somehow return NULL (e.g., because the salt read from the shadow
file is invalid), then I'd expect most users of crypt() to crash.

...Attached is my latest revision of crypt_freesec.  I've reduced the
table sizes even further (7 KB, may be precomputed) and I made certain
other changes as discussed.  I'd appreciate another review, and some
fuzzing against another implementation wouldn't hurt.

Alexander

[-- Attachment #2: crypt_freesec.h --]
[-- Type: text/plain, Size: 2493 bytes --]

/*
 * The following notice applies to this header file (only):
 *
 * Copyright (c) 2002,2010,2012 Solar Designer <solar at openwall.com>
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted.
 */

#ifndef _CRYPT_FREESEC_H
#define _CRYPT_FREESEC_H

struct _crypt_extended_local {
	unsigned char output[21];
};

struct _crypt_extended_shared {
	uint32_t psbox[8][64];
	uint32_t ip_maskl[16][16], ip_maskr[16][16];
	uint32_t fp_maskl[8][16], fp_maskr[8][16];
	uint32_t key_perm_maskl[8][16], key_perm_maskr[12][16];
	uint32_t comp_maskl0[4][8], comp_maskr0[4][8];
	uint32_t comp_maskl1[4][16], comp_maskr1[4][16];
};

/*
 * _crypt_extended_init() must be called explicitly before first use of
 * _crypt_extended_r().  Strictly speaking, _crypt_extended_init() is not
 * reentrant unless the "shared" struct happens to be private to each call.
 * All it does is initialize some variables inside the "shared" struct to
 * constant values, so it is unlikely that anything would go wrong if this is
 * done multiple times in parallel, but correct behavior in that case is not
 * guaranteed (e.g., things may go wrong if a given CPU architecture can't
 * operate on 32-bit quantities natively, requiring read-modify-write
 * instruction sequences operating on larger quantities and thus affecting
 * nearby array elements).
 *
 * After _crypt_extended_init() has returned, the resulting "shared" struct
 * may in fact be safely shared between different threads' calls to
 * _crypt_extended_r(), which is in fact reentrant (but each concurrent call
 * must use its own instance of the "local" struct).
 *
 * _crypt_extended_r() returns NULL on error.  Although modern standards say
 * that crypt(3) does in fact return NULL on error, many applications do not
 * expect that.  Thus, it is recommended that a crypt(3)-like wrapper function
 * translate these NULL returns into strings guaranteed to be different from
 * the "setting" string, too short for matching a valid password hash, and not
 * containing any characters that would be special for the passwd file format.
 * Specifically, such a wrapper function may return "*0" on error as long as
 * the "setting" string does not start with "*0", or "*1" otherwise.
 */

void _crypt_extended_init(struct _crypt_extended_shared *shared);

char *_crypt_extended_r(const char *key, const char *setting,
    struct _crypt_extended_local *local,
    const struct _crypt_extended_shared *shared);

#endif

[-- Attachment #3: crypt_freesec.c --]
[-- Type: text/plain, Size: 22332 bytes --]

/*
 * This version is derived from the original implementation of FreeSec
 * (release 1.1) by David Burren.  I've made it reentrant, reduced its memory
 * usage from about 70 KB to about 7 KB (with only minimal performance impact
 * and keeping code size about the same), made the handling of invalid salts
 * mostly UFC-crypt compatible, added a quick runtime self-test (which also
 * serves to zeroize the stack from sensitive data), and added optional tests.
 * - Solar Designer <solar at openwall.com>
 */

/*
 * FreeSec: libcrypt for NetBSD
 *
 * Copyright (c) 1994 David Burren
 * Copyright (c) 2000,2002,2010,2012 Solar Designer
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. Neither the name of the author nor the names of other contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 *	$Owl: Owl/packages/glibc/crypt_freesec.c,v 1.6 2010/02/20 14:45:06 solar Exp $
 *	$Id: crypt.c,v 1.15 1994/09/13 04:58:49 davidb Exp $
 *
 * This is an original implementation of the DES and the crypt(3) interfaces
 * by David Burren.  It has been heavily re-worked by Solar Designer.
 */

#include <stdint.h>
#include <string.h>

#ifdef DEBUG
#include <assert.h>
#endif

#ifdef TEST
#include <stdio.h>
#endif

#include "crypt_freesec.h"

struct expanded_key {
	uint32_t l[16], r[16];
};

#define _PASSWORD_EFMT1 '_'

static const unsigned char IP[64] = {
	58, 50, 42, 34, 26, 18, 10,  2, 60, 52, 44, 36, 28, 20, 12,  4,
	62, 54, 46, 38, 30, 22, 14,  6, 64, 56, 48, 40, 32, 24, 16,  8,
	57, 49, 41, 33, 25, 17,  9,  1, 59, 51, 43, 35, 27, 19, 11,  3,
	61, 53, 45, 37, 29, 21, 13,  5, 63, 55, 47, 39, 31, 23, 15,  7
};

static const unsigned char key_perm[56] = {
	57, 49, 41, 33, 25, 17,  9,  1, 58, 50, 42, 34, 26, 18,
	10,  2, 59, 51, 43, 35, 27, 19, 11,  3, 60, 52, 44, 36,
	63, 55, 47, 39, 31, 23, 15,  7, 62, 54, 46, 38, 30, 22,
	14,  6, 61, 53, 45, 37, 29, 21, 13,  5, 28, 20, 12,  4
};

static const unsigned char key_shifts[16] = {
	1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1
};

static const unsigned char comp_perm[48] = {
	14, 17, 11, 24,  1,  5,  3, 28, 15,  6, 21, 10,
	23, 19, 12,  4, 26,  8, 16,  7, 27, 20, 13,  2,
	41, 52, 31, 37, 47, 55, 30, 40, 51, 45, 33, 48,
	44, 49, 39, 56, 34, 53, 46, 42, 50, 36, 29, 32
};

/*
 * No E box is used, as it's replaced by some ANDs, shifts, and ORs.
 */

static const unsigned char sbox[8][64] = {
	{
		14,  4, 13,  1,  2, 15, 11,  8,  3, 10,  6, 12,  5,  9,  0,  7,
		 0, 15,  7,  4, 14,  2, 13,  1, 10,  6, 12, 11,  9,  5,  3,  8,
		 4,  1, 14,  8, 13,  6,  2, 11, 15, 12,  9,  7,  3, 10,  5,  0,
		15, 12,  8,  2,  4,  9,  1,  7,  5, 11,  3, 14, 10,  0,  6, 13
	},
	{
		15,  1,  8, 14,  6, 11,  3,  4,  9,  7,  2, 13, 12,  0,  5, 10,
		 3, 13,  4,  7, 15,  2,  8, 14, 12,  0,  1, 10,  6,  9, 11,  5,
		 0, 14,  7, 11, 10,  4, 13,  1,  5,  8, 12,  6,  9,  3,  2, 15,
		13,  8, 10,  1,  3, 15,  4,  2, 11,  6,  7, 12,  0,  5, 14,  9
	},
	{
		10,  0,  9, 14,  6,  3, 15,  5,  1, 13, 12,  7, 11,  4,  2,  8,
		13,  7,  0,  9,  3,  4,  6, 10,  2,  8,  5, 14, 12, 11, 15,  1,
		13,  6,  4,  9,  8, 15,  3,  0, 11,  1,  2, 12,  5, 10, 14,  7,
		 1, 10, 13,  0,  6,  9,  8,  7,  4, 15, 14,  3, 11,  5,  2, 12
	},
	{
		 7, 13, 14,  3,  0,  6,  9, 10,  1,  2,  8,  5, 11, 12,  4, 15,
		13,  8, 11,  5,  6, 15,  0,  3,  4,  7,  2, 12,  1, 10, 14,  9,
		10,  6,  9,  0, 12, 11,  7, 13, 15,  1,  3, 14,  5,  2,  8,  4,
		 3, 15,  0,  6, 10,  1, 13,  8,  9,  4,  5, 11, 12,  7,  2, 14
	},
	{
		 2, 12,  4,  1,  7, 10, 11,  6,  8,  5,  3, 15, 13,  0, 14,  9,
		14, 11,  2, 12,  4,  7, 13,  1,  5,  0, 15, 10,  3,  9,  8,  6,
		 4,  2,  1, 11, 10, 13,  7,  8, 15,  9, 12,  5,  6,  3,  0, 14,
		11,  8, 12,  7,  1, 14,  2, 13,  6, 15,  0,  9, 10,  4,  5,  3
	},
	{
		12,  1, 10, 15,  9,  2,  6,  8,  0, 13,  3,  4, 14,  7,  5, 11,
		10, 15,  4,  2,  7, 12,  9,  5,  6,  1, 13, 14,  0, 11,  3,  8,
		 9, 14, 15,  5,  2,  8, 12,  3,  7,  0,  4, 10,  1, 13, 11,  6,
		 4,  3,  2, 12,  9,  5, 15, 10, 11, 14,  1,  7,  6,  0,  8, 13
	},
	{
		 4, 11,  2, 14, 15,  0,  8, 13,  3, 12,  9,  7,  5, 10,  6,  1,
		13,  0, 11,  7,  4,  9,  1, 10, 14,  3,  5, 12,  2, 15,  8,  6,
		 1,  4, 11, 13, 12,  3,  7, 14, 10, 15,  6,  8,  0,  5,  9,  2,
		 6, 11, 13,  8,  1,  4, 10,  7,  9,  5,  0, 15, 14,  2,  3, 12
	},
	{
		13,  2,  8,  4,  6, 15, 11,  1, 10,  9,  3, 14,  5,  0, 12,  7,
		 1, 15, 13,  8, 10,  3,  7,  4, 12,  5,  6, 11,  0, 14,  9,  2,
		 7, 11,  4,  1,  9, 12, 14,  2,  0,  6, 10, 13, 15,  3,  5,  8,
		 2,  1, 14,  7,  4, 10,  8, 13, 15, 12,  9,  0,  3,  5,  6, 11
	}
};

static const unsigned char pbox[32] = {
	16,  7, 20, 21, 29, 12, 28, 17,  1, 15, 23, 26,  5, 18, 31, 10,
	 2,  8, 24, 14, 32, 27,  3,  9, 19, 13, 30,  6, 22, 11,  4, 25
};

static const unsigned char ascii64[] =
    "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
/*   0000000000111111111122222222223333333333444444444455555555556666 */
/*   0123456789012345678901234567890123456789012345678901234567890123 */

/*
 * We match the behavior of UFC-crypt on systems where "char" is signed by
 * default (the majority), regardless of char's signedness on our system.
 */
static uint32_t ascii_to_bin(int ch)
{
	int sch = (ch < 0x80) ? ch : -(0x100 - ch);
	int retval;

	retval = sch - '.';
	if (sch >= 'A') {
		retval = sch - ('A' - 12);
		if (sch >= 'a')
			retval = sch - ('a' - 38);
	}
	retval &= 0x3f;

	return retval;
}

/*
 * When we choose to "support" invalid salts, nevertheless disallow those
 * containing characters that would violate the passwd file format.
 */
static inline int ascii_is_unsafe(unsigned char ch)
{
	return !ch || ch == '\n' || ch == ':';
}

static void init_ip_k(struct _crypt_extended_shared *shared)
{
	unsigned int i, j, k, ibit, obit;
	uint32_t il, ir, fl, fr;
	unsigned char inv_key_perm[64];
	unsigned char inv_comp_perm[56];
	unsigned char init_perm[64], final_perm[64];

	/*
	 * Set up the initial & final permutations into a useful form, and
	 * initialise the inverted key permutation.
	 */
	for (i = 0; i < 64; i++) {
		init_perm[final_perm[i] = IP[i] - 1] = i;
		inv_key_perm[i] = 255;
	}

	/*
	 * Invert the key permutation and initialise the inverted key
	 * compression permutation.
	 */
	for (i = 0; i < 56; i++) {
		inv_key_perm[key_perm[i] - 1] = i;
		inv_comp_perm[i] = 255;
	}

	/*
	 * Invert the key compression permutation.
	 */
	for (i = 0; i < 48; i++) {
		inv_comp_perm[comp_perm[i] - 1] = i;
	}

	/*
	 * Set up the OR-mask arrays for the initial and final permutations,
	 * and for the key initial and compression permutations.
	 */
	for (k = 0; k < 16; k++) {
		for (i = 0; i < 16; i++) {
			il = ir = fl = fr = 0;
			for (j = 0; j < 4; j++) {
				ibit = 4 * k + j;
				if (i & (8 >> j)) {
					if ((obit = init_perm[ibit]) < 32)
						il |= 0x80000000 >> obit;
					else
						ir |= 0x80000000 >> (obit - 32);
					if ((obit = final_perm[ibit]) < 32)
						fl |= 0x80000000 >> obit;
					else
						fr |= 0x80000000 >> (obit - 32);
				}
			}
			shared->ip_maskl[k][i] = il;
			shared->ip_maskr[k][i] = ir;
			if (k & 1) {
				shared->fp_maskl[k >> 1][i] = fl;
#ifdef DEBUG
				assert(!fr);
#endif
			} else {
				shared->fp_maskr[k >> 1][i] = fr;
#ifdef DEBUG
				assert(!fl);
#endif
			}

			il = ir = 0;
			for (j = 0; j < 4 - (k & 1); j++) {
				ibit = 4 * k + j;
				if (i & (8 >> j)) {
					if ((obit = inv_key_perm[ibit]) == 255)
						continue;
					if (obit < 28)
						il |= 0x08000000 >> obit;
					else
						ir |= 0x08000000 >> (obit - 28);
				}
			}
			if (!(k & 1))
				shared->key_perm_maskl[k >> 1][i] = il;
#ifdef DEBUG
			else
				assert(!il);
#endif
			if (k < 8)
				shared->key_perm_maskr[k][i] = ir;
			else if (k & 1)
				shared->key_perm_maskr[4 + (k >> 1)][i] = ir;
#ifdef DEBUG
			else
				assert(!ir);
#endif
		}
	}
	for (k = 0; k < 8; k++) {
		for (i = 0; i < 8; i++) {
			il = ir = 0;
			for (j = 0; j < 3; j++) {
				ibit = 7 * k + j;
				if (i & (4 >> j)) {
					if ((obit = inv_comp_perm[ibit]) == 255)
						continue;
					if (obit < 24)
						il |= 0x00800000 >> obit;
					else
						ir |= 0x00800000 >> (obit - 24);
				}
			}
			if (k < 4) {
				shared->comp_maskl0[k][i] = il;
#ifdef DEBUG
				assert(!ir);
#endif
			} else {
				shared->comp_maskr0[k - 4][i] = ir;
#ifdef DEBUG
				assert(!il);
#endif
			}
		}
		for (i = 0; i < 16; i++) {
			il = ir = 0;
			for (j = 3; j < 7; j++) {
				ibit = 7 * k + j;
				if (i & (0x40 >> j)) {
					if ((obit = inv_comp_perm[ibit]) == 255)
						continue;
					if (obit < 24)
						il |= 0x00800000 >> obit;
					else
						ir |= 0x00800000 >> (obit - 24);
				}
			}
			if (k < 4) {
				shared->comp_maskl1[k][i] = il;
#ifdef DEBUG
				assert(!ir);
#endif
			} else {
				shared->comp_maskr1[k - 4][i] = ir;
#ifdef DEBUG
				assert(!il);
#endif
			}
		}
	}
}

static void init_s(struct _crypt_extended_shared *shared)
{
	unsigned int i, j, b;
	unsigned char u_sbox[8][64];
	unsigned char un_pbox[32];

	/*
	 * Invert the S-boxes, reordering the input bits.
	 */
	for (i = 0; i < 8; i++)
		for (j = 0; j < 64; j++) {
			b = (j & 0x20) | ((j & 1) << 4) | ((j >> 1) & 0xf);
			u_sbox[i][j] = sbox[i][b];
		}

	/*
	 * Invert the P-box permutation, and convert into OR-masks for
	 * handling the output of the S-box arrays setup above.
	 */
	for (i = 0; i < 32; i++)
		un_pbox[pbox[i] - 1] = i;

	for (b = 0; b < 8; b++) {
		for (i = 0; i < 64; i++) {
			uint32_t p = 0;
			for (j = 0; j < 4; j++) {
				if (u_sbox[b][i] & (8 >> j))
					p |= 0x80000000 >> un_pbox[4 * b + j];
			}
			shared->psbox[b][i] = p;
		}
	}
}

void _crypt_extended_init(struct _crypt_extended_shared *shared)
{
	init_ip_k(shared);
	init_s(shared);
}

static uint32_t setup_salt(uint32_t salt)
{
	uint32_t obit, saltbit, saltbits;
	unsigned int i;

	saltbits = 0;
	saltbit = 1;
	obit = 0x800000;
	for (i = 0; i < 24; i++) {
		if (salt & saltbit)
			saltbits |= obit;
		saltbit <<= 1;
		obit >>= 1;
	}

	return saltbits;
}

static void des_setkey(const unsigned char *key, struct expanded_key *ekey,
    const struct _crypt_extended_shared *shared)
{
	uint32_t k0, k1, rawkey0, rawkey1;
	unsigned int shifts, round, i, ibit;

	rawkey0 =
	    (uint32_t)key[3] |
	    ((uint32_t)key[2] << 8) |
	    ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[0] << 24);
	rawkey1 =
	    (uint32_t)key[7] |
	    ((uint32_t)key[6] << 8) |
	    ((uint32_t)key[5] << 16) |
	    ((uint32_t)key[4] << 24);

	/*
	 * Do key permutation and split into two 28-bit subkeys.
	 */
	k0 = k1 = 0;
	for (i = 0, ibit = 28; i < 4; i++, ibit -= 4) {
		unsigned int j = i << 1;
		k0 |= shared->key_perm_maskl[i][(rawkey0 >> ibit) & 0xf] |
		      shared->key_perm_maskl[i + 4][(rawkey1 >> ibit) & 0xf];
		k1 |= shared->key_perm_maskr[j][(rawkey0 >> ibit) & 0xf];
		ibit -= 4;
		k1 |= shared->key_perm_maskr[j + 1][(rawkey0 >> ibit) & 0xf] |
		      shared->key_perm_maskr[i + 8][(rawkey1 >> ibit) & 0xf];
	}

	/*
	 * Rotate subkeys and do compression permutation.
	 */
	shifts = 0;
	for (round = 0; round < 16; round++) {
		uint32_t t0, t1;
		uint32_t kl, kr;

		shifts += key_shifts[round];

		t0 = (k0 << shifts) | (k0 >> (28 - shifts));
		t1 = (k1 << shifts) | (k1 >> (28 - shifts));

		kl = kr = 0;
		ibit = 25;
		for (i = 0; i < 4; i++) {
			kl |= shared->comp_maskl0[i][(t0 >> ibit) & 7];
			kr |= shared->comp_maskr0[i][(t1 >> ibit) & 7];
			ibit -= 4;
			kl |= shared->comp_maskl1[i][(t0 >> ibit) & 0xf];
			kr |= shared->comp_maskr1[i][(t1 >> ibit) & 0xf];
			ibit -= 3;
		}
		ekey->l[round] = kl;
		ekey->r[round] = kr;
	}
}

/*
 * l_in, r_in, l_out, and r_out are in pseudo-"big-endian" format.
 */
static void do_des(uint32_t l_in, uint32_t r_in,
    uint32_t *l_out, uint32_t *r_out,
    uint32_t count, uint32_t saltbits, const struct expanded_key *ekey,
    const struct _crypt_extended_shared *shared)
{
	uint32_t l, r;

	/*
	 * Do initial permutation (IP).
	 */
	l = r = 0;
	if (l_in | r_in) {
		unsigned int i, ibit;
		for (i = 0, ibit = 28; i < 8; i++, ibit -= 4) {
			l |= shared->ip_maskl[i][(l_in >> ibit) & 0xf] |
			     shared->ip_maskl[i + 8][(r_in >> ibit) & 0xf];
			r |= shared->ip_maskr[i][(l_in >> ibit) & 0xf] |
			     shared->ip_maskr[i + 8][(r_in >> ibit) & 0xf];
		}
	}

	while (count--) {
		/*
		 * Do each round.
		 */
		unsigned int round = 16;
		const uint32_t *kl = ekey->l;
		const uint32_t *kr = ekey->r;
		uint32_t f;
		while (round--) {
			uint32_t r48l, r48r;
			/*
			 * Expand R to 48 bits (simulate the E-box).
			 */
			r48l	= ((r & 0x00000001) << 23)
				| ((r & 0xf8000000) >> 9)
				| ((r & 0x1f800000) >> 11)
				| ((r & 0x01f80000) >> 13)
				| ((r & 0x001f8000) >> 15);

			r48r	= ((r & 0x0001f800) << 7)
				| ((r & 0x00001f80) << 5)
				| ((r & 0x000001f8) << 3)
				| ((r & 0x0000001f) << 1)
				| ((r & 0x80000000) >> 31);
			/*
			 * Do salting for crypt() and friends, and
			 * XOR with the permuted key.
			 */
			f = (r48l ^ r48r) & saltbits;
			r48l ^= f ^ *kl++;
			r48r ^= f ^ *kr++;
			/*
			 * Do S-box lookups (which shrink it back to 32 bits)
			 * and do the P-box permutation at the same time.
			 */
			f = shared->psbox[0][r48l >> 18]
			  | shared->psbox[1][(r48l >> 12) & 0x3f]
			  | shared->psbox[2][(r48l >> 6) & 0x3f]
			  | shared->psbox[3][r48l & 0x3f]
			  | shared->psbox[4][r48r >> 18]
			  | shared->psbox[5][(r48r >> 12) & 0x3f]
			  | shared->psbox[6][(r48r >> 6) & 0x3f]
			  | shared->psbox[7][r48r & 0x3f];
			/*
			 * Now that we've permuted things, complete f().
			 */
			f ^= l;
			l = r;
			r = f;
		}
		r = l;
		l = f;
	}

	/*
	 * Do final permutation (inverse of IP).
	 */
	{
		unsigned int i, ibit;
		uint32_t lo, ro;
		lo = ro = 0;
		for (i = 0, ibit = 28; i < 4; i++, ibit -= 4) {
			ro |= shared->fp_maskr[i][(l >> ibit) & 0xf] |
			      shared->fp_maskr[i + 4][(r >> ibit) & 0xf];
			ibit -= 4;
			lo |= shared->fp_maskl[i][(l >> ibit) & 0xf] |
			      shared->fp_maskl[i + 4][(r >> ibit) & 0xf];
		}
		*l_out = lo;
		*r_out = ro;
	}
}

static void des_cipher(const unsigned char *in, unsigned char *out,
    uint32_t count, uint32_t saltbits, const struct expanded_key *ekey,
    const struct _crypt_extended_shared *shared)
{
	uint32_t l_out, r_out, rawl, rawr;

	rawl =
	    (uint32_t)in[3] |
	    ((uint32_t)in[2] << 8) |
	    ((uint32_t)in[1] << 16) |
	    ((uint32_t)in[0] << 24);
	rawr =
	    (uint32_t)in[7] |
	    ((uint32_t)in[6] << 8) |
	    ((uint32_t)in[5] << 16) |
	    ((uint32_t)in[4] << 24);

	do_des(rawl, rawr, &l_out, &r_out, count, saltbits, ekey, shared);

	out[0] = l_out >> 24;
	out[1] = l_out >> 16;
	out[2] = l_out >> 8;
	out[3] = l_out;
	out[4] = r_out >> 24;
	out[5] = r_out >> 16;
	out[6] = r_out >> 8;
	out[7] = r_out;
}

static char *_crypt_extended_r_uut(const char *_key, const char *_setting,
    struct _crypt_extended_local *local,
    const struct _crypt_extended_shared *shared)
{
	const unsigned char *key = (const unsigned char *)_key;
	const unsigned char *setting = (const unsigned char *)_setting;
	struct expanded_key ekey;
	union {
		unsigned char c[8];
		uint32_t i[2];
	} keybuf;
	unsigned char *p, *q;
	uint32_t count, salt, l, r0, r1;
	unsigned int i;

	/*
	 * Copy the key, shifting each character left by one bit and padding
	 * with zeroes.
	 */
	q = keybuf.c;
	while (q <= &keybuf.c[sizeof(keybuf.c) - 1]) {
		*q++ = *key << 1;
		if (*key)
			key++;
	}
	des_setkey(keybuf.c, &ekey, shared);

	if (*setting == _PASSWORD_EFMT1) {
		/*
		 * "new"-style:
		 *	setting - underscore, 4 chars of count, 4 chars of salt
		 *	key - unlimited characters
		 */
		for (i = 1, count = 0; i < 5; i++) {
			uint32_t value = ascii_to_bin(setting[i]);
			if (ascii64[value] != setting[i])
				return NULL;
			count |= value << (i - 1) * 6;
		}
		if (!count)
			return NULL;

		for (i = 5, salt = 0; i < 9; i++) {
			uint32_t value = ascii_to_bin(setting[i]);
			if (ascii64[value] != setting[i])
				return NULL;
			salt |= value << (i - 5) * 6;
		}

		while (*key) {
			/*
			 * Encrypt the key with itself.
			 */
			des_cipher(keybuf.c, keybuf.c, 1, 0, &ekey, shared);
			/*
			 * And XOR with the next 8 characters of the key.
			 */
			q = keybuf.c;
			while (q <= &keybuf.c[sizeof(keybuf.c) - 1] && *key)
				*q++ ^= *key++ << 1;
			des_setkey(keybuf.c, &ekey, shared);
		}

		memcpy(local->output, setting, 9);
		local->output[9] = '\0';
		p = local->output + 9;
	} else {
		/*
		 * "old"-style:
		 *	setting - 2 chars of salt
		 *	key - up to 8 characters
		 */
		count = 25;

		if (ascii_is_unsafe(setting[0]) || ascii_is_unsafe(setting[1]))
			return NULL;

		salt = (ascii_to_bin(setting[1]) << 6)
		     |  ascii_to_bin(setting[0]);

		local->output[0] = setting[0];
		local->output[1] = setting[1];
		p = local->output + 2;
	}

	/*
	 * Do it.
	 */
	do_des(0, 0, &r0, &r1, count, setup_salt(salt), &ekey, shared);

	/*
	 * Now encode the result...
	 */
	l = (r0 >> 8);
	*p++ = ascii64[(l >> 18) & 0x3f];
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];

	l = (r0 << 16) | ((r1 >> 16) & 0xffff);
	*p++ = ascii64[(l >> 18) & 0x3f];
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];

	l = r1 << 2;
	*p++ = ascii64[(l >> 12) & 0x3f];
	*p++ = ascii64[(l >> 6) & 0x3f];
	*p++ = ascii64[l & 0x3f];
	*p = 0;

	return (char *)local->output;
}

char *_crypt_extended_r(const char *key, const char *setting,
    struct _crypt_extended_local *local,
    const struct _crypt_extended_shared *shared)
{
	const char *test_key = "\x80\xff\x80\x01 "
	    "\x7f\x81\x80\x80\x0d\x0a\xff\x7f \x81 test";
	const char *test_setting = "_0.../9Zz";
	const char *test_hash = "_0.../9ZzX7iSJNd21sU";
	struct _crypt_extended_local local_test;
	char *retval;
	const char *p;

	if (*setting != _PASSWORD_EFMT1) {
		test_setting = "\x80x";
		test_hash = "\x80x22/wK52ZKGA";
	}

	/*
	 * Hash the supplied password.
	 */
	retval = _crypt_extended_r_uut(key, setting, local, shared);

	/*
	 * Perform a quick self-test.  It is important that we make both calls
	 * to _crypt_extended_r_uut() from the same scope such that they likely
	 * use the same stack locations, which makes the second call overwrite
	 * the first call's sensitive data on the stack and makes it more
	 * likely that any alignment related issues would be detected.
	 */
	p = _crypt_extended_r_uut(test_key, test_setting, &local_test, shared);
	if (p && !strcmp(p, test_hash))
		return retval;

	/*
	 * Should not happen.
	 */
	return NULL;
}

#ifndef TEST_STATIC
#define TEST_STATIC /* not static */
#endif

#ifdef TEST
static char *_crypt_extended(const char *key, const char *setting)
{
	TEST_STATIC int initialized = 0;
	TEST_STATIC struct _crypt_extended_shared shared;
/* "local" must be static since it holds our own return value */
	static struct _crypt_extended_local local;

	if (!initialized) {
		memset(&shared, 's', sizeof(shared));
		memset(&local, 'l', sizeof(local));
		_crypt_extended_init(&shared);
		initialized = 1;
	}
	return _crypt_extended_r(key, setting, &local, &shared);
}

static const struct {
	char *hash;
	char *pw;
} tests[] = {
/* "new"-style */
	{"_J9..CCCCXBrJUJV154M", "U*U*U*U*"},
	{"_J9..CCCCXUhOBTXzaiE", "U*U***U"},
	{"_J9..CCCC4gQ.mB/PffM", "U*U***U*"},
	{"_J9..XXXXvlzQGqpPPdk", "*U*U*U*U"},
	{"_J9..XXXXsqM/YSSP..Y", "*U*U*U*U*"},
	{"_J9..XXXXVL7qJCnku0I", "*U*U*U*U*U*U*U*U"},
	{"_J9..XXXXAj8cFbP5scI", "*U*U*U*U*U*U*U*U*"},
	{"_J9..SDizh.vll5VED9g", "ab1234567"},
	{"_J9..SDizRjWQ/zePPHc", "cr1234567"},
	{"_J9..SDizxmRI1GjnQuE", "zxyDPWgydbQjgq"},
	{"_K9..SaltNrQgIYUAeoY", "726 even"},
	{"_J9..SDSD5YGyRCr4W4c", ""},
	{"_01234567IBjxKliXXRQ", "\xc3\x80" "1234abcd"},
	{"_012345678OSGpGQRVHA", "\xc3\x80" "9234abcd"},
/* "old"-style, valid salts */
	{"CCNf8Sbh3HDfQ", "U*U*U*U*"},
	{"CCX.K.MFy4Ois", "U*U***U"},
	{"CC4rMpbg9AMZ.", "U*U***U*"},
	{"XXxzOu6maQKqQ", "*U*U*U*U"},
	{"SDbsugeBiC58A", ""},
	{"./xZjzHv5vzVE", "password"},
	{"0A2hXM1rXbYgo", "password"},
	{"A9RXdR23Y.cY6", "password"},
	{"ZziFATVXHo2.6", "password"},
	{"zZDDIZ0NOlPzw", "password"},
	{"99PxawtsTfX56", "\xc3\x80" "1234abcd"},
	{"99jcVcGxUZOWk", "\xc3\x80" "9234abcd"},
/* "old"-style, "reasonable" invalid salts, UFC-crypt behavior expected */
	{"\001\002wyd0KZo65Jo", "password"},
	{"a_C10Dk/ExaG.", "password"},
	{"~\377.5OTsRVjwLo", "password"},
/* The below are erroneous inputs, so NULL return is expected/required */
	{"", ""}, /* no salt */
	{" ", ""}, /* setting string is too short */
	{"a:", ""}, /* unsafe character */
	{"\na", ""}, /* unsafe character */
	{"_/......", ""}, /* setting string is too short for its type */
	{"_........", ""}, /* zero iteration count */
	{"_/!......", ""}, /* invalid character in count */
	{"_/......!", ""}, /* invalid character in salt */
	{NULL, NULL}
};

int main(void)
{
	unsigned int i;

	for (i = 0; tests[i].hash; i++) {
		char *hash = _crypt_extended(tests[i].pw, tests[i].hash);
		if (!hash && strlen(tests[i].hash) < 13)
			continue; /* expected failure */
		if (hash && !strcmp(hash, tests[i].hash))
			continue; /* expected success */
		puts("FAILED");
		return 1;
	}

	puts("PASSED");

	return 0;
}
#endif

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-24  7:21       ` Solar Designer
@ 2012-06-24  7:32         ` Solar Designer
  2012-06-25  3:51         ` Rich Felker
  1 sibling, 0 replies; 17+ messages in thread
From: Solar Designer @ 2012-06-24  7:32 UTC (permalink / raw)
  To: musl

On Sun, Jun 24, 2012 at 11:21:12AM +0400, Solar Designer wrote:
> struct _crypt_extended_shared {
> 	uint32_t psbox[8][64];
> 	uint32_t ip_maskl[16][16], ip_maskr[16][16];
> 	uint32_t fp_maskl[8][16], fp_maskr[8][16];
> 	uint32_t key_perm_maskl[8][16], key_perm_maskr[12][16];
> 	uint32_t comp_maskl0[4][8], comp_maskr0[4][8];
> 	uint32_t comp_maskl1[4][16], comp_maskr1[4][16];
> };

Actually, we should re-order the first two lines.  That is make it:

struct _crypt_extended_shared {
	uint32_t ip_maskl[16][16], ip_maskr[16][16];
	uint32_t psbox[8][64];
	uint32_t fp_maskl[8][16], fp_maskr[8][16];
	uint32_t key_perm_maskl[8][16], key_perm_maskr[12][16];
	uint32_t comp_maskl0[4][8], comp_maskr0[4][8];
	uint32_t comp_maskl1[4][16], comp_maskr1[4][16];
};

That's because IP is not always used (it's only used for "new"-style
hashes, not for "old"-style ones where the initial block is always 0
and thus IP is skipped).  When it is used, it's used right before
psbox[] is.

Alexander


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-24  7:21       ` Solar Designer
  2012-06-24  7:32         ` Solar Designer
@ 2012-06-25  3:51         ` Rich Felker
  2012-06-29  5:25           ` Rich Felker
  1 sibling, 1 reply; 17+ messages in thread
From: Rich Felker @ 2012-06-25  3:51 UTC (permalink / raw)
  To: musl

On Sun, Jun 24, 2012 at 11:21:12AM +0400, Solar Designer wrote:
> On Wed, Jun 13, 2012 at 10:53:18AM -0400, Rich Felker wrote:
> > However, on the other hand I'm not sure I see the benefit. Why would
> > you call crypt or crypt_r multiple times with the same key/salt except
> > possibly for password cracking (in which case you'll want a highly
> > optimized-for-speed implementation)?
> 
> You're right.  Additionally, preserving this requires that we keep
> sensitive data around (from the previous password hashed).  I've dropped
> this stuff now.

Ah, nice catch!

> > > I agree that the vast majority of cases may involve UB, but so what - we
> > > can't be 100% certain we don't have any cases of UB in our source code
> > > even if we review it very carefully.
> > 
> > For purely computational code that doesn't recurse or loop arbitrarily
> > long based on the input or read and write all over memory, static
> > analysis can determine the absence of UB.
> 
> Are you performing such static analysis on musl?

Not yet/only to a minimal extent so far.

I haven't spent much time on it, but users/contributors have often
posted clang's analysis outputs for discussion on IRC, and it looks
like a promising direction.

> > > value or you return an error indication (what to return from crypt() on
> > > error is a separate non-trivial topic).
> > 
> > I saw the notes on this. What real-world code breaks from the
> > conformant behavior?
> 
> I think the majority of daemons, etc. that do authentication with
> crypt() don't handle possible NULL returns.  This is just starting to
> change now.  They don't really break under normal conditions because
> normally crypt() returns the hashed password and not NULL; but if it
> does somehow return NULL (e.g., because the salt read from the shadow
> file is invalid), then I'd expect most users of crypt() to crash.

Indeed, the first program I checked, dropbear, uses the return value
without checking it. It's rather disheartening that something as
important as authentication code is not checking the return value of
each function that could possibly fail. With TCB shadow especially,
password hashes could be highly invalid...

Anyway, nowhere does POSIX say crypt has to fail at all; as far as I
can tell, it's completely possible and reasonable to make an
implementation that always succeeds even on invalid salt. As long as
you ensure that the output can never be matched, it's probably fine to
do this.

> ....Attached is my latest revision of crypt_freesec.  I've reduced the
> table sizes even further (7 KB, may be precomputed) and I made certain
> other changes as discussed.  I'd appreciate another review, and some
> fuzzing against another implementation wouldn't hurt.

I put this off until after the release so as not to break anything at
the last minute, but I'll try to get it integrated soon.

Rich


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: FreeSec crypt()
  2012-06-25  3:51         ` Rich Felker
@ 2012-06-29  5:25           ` Rich Felker
  0 siblings, 0 replies; 17+ messages in thread
From: Rich Felker @ 2012-06-29  5:25 UTC (permalink / raw)
  To: musl

On Sun, Jun 24, 2012 at 11:51:03PM -0400, Rich Felker wrote:
> > ....Attached is my latest revision of crypt_freesec.  I've reduced the
> > table sizes even further (7 KB, may be precomputed) and I made certain
> > other changes as discussed.  I'd appreciate another review, and some
> > fuzzing against another implementation wouldn't hurt.
> 
> I put this off until after the release so as not to break anything at
> the last minute, but I'll try to get it integrated soon.

I've committed a modified version (with a comment explaining that it's
modified, this time :) to the musl git repo. It's using static
initialized tables instead of runtime generation on the stack. Despite
it being mildly controversial, I left in the runtime test for now,
mainly since it serves the double purpose of clearing potentially
sensitive data from the stack in a clever way without much additional
cost.

I did not commit the standalone tests in-tree, but I did run them and
they all pass. I may add them to libc-testsuite soon; the only reason
I haven't done so yet is that I'd want a way to selectively disable
some that won't be supported on all systems, and as of yet
libc-testsuite does not have a good framework for making some tests
optional.

I've also made crypt_r public through crypt.h.

Feedback is welcome. If anyone's interested in helping get md5, sha,
or other hash types integrated (while keeping the size cost down) that
would be great too.

Rich


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2012-06-29  5:25 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-12 23:51 FreeSec crypt() Solar Designer
2012-06-13  1:18 ` Rich Felker
2012-06-13  6:10   ` Szabolcs Nagy
2012-06-13 12:43     ` Solar Designer
2012-06-13 12:58     ` Rich Felker
2012-06-13 13:18       ` Solar Designer
2012-06-13 14:56         ` Rich Felker
2012-06-13 16:45           ` Solar Designer
2012-06-13 17:27             ` Rich Felker
2012-06-13 17:32             ` Szabolcs Nagy
2012-06-13 17:36               ` Rich Felker
2012-06-13 12:07   ` Solar Designer
2012-06-13 14:53     ` Rich Felker
2012-06-24  7:21       ` Solar Designer
2012-06-24  7:32         ` Solar Designer
2012-06-25  3:51         ` Rich Felker
2012-06-29  5:25           ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).