From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/8471 Path: news.gmane.org!not-for-mail From: Zack Weinberg Newsgroups: gmane.comp.lib.glibc.alpha,gmane.comp.gcc.devel,gmane.linux.lib.musl.general Subject: Compiler support for erasure of sensitive data Date: Wed, 9 Sep 2015 12:36:01 -0400 Message-ID: <55F05FF1.3000405@panix.com> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Trace: ger.gmane.org 1441816635 31190 80.91.229.3 (9 Sep 2015 16:37:15 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 9 Sep 2015 16:37:15 +0000 (UTC) To: gcc@gcc.gnu.org, llvmdev@cs.uiuc.edu, GNU C Library , musl@lists.openwall.com Original-X-From: libc-alpha-return-63070-glibc-alpha=m.gmane.org@sourceware.org Wed Sep 09 18:37:08 2015 Return-path: Envelope-to: glibc-alpha@plane.gmane.org Original-Received: from server1.sourceware.org ([209.132.180.131] helo=sourceware.org) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1ZZiMf-0007m5-CV for glibc-alpha@plane.gmane.org; Wed, 09 Sep 2015 18:36:53 +0200 DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:to:from:subject:message-id:date:mime-version :content-type:content-transfer-encoding; q=dns; s=default; b=Qwf l2VwjAeG55+1XEYfPfPcrWrTEUcVUXhPrRctsUucKWaZZpgoYFqpgnFcn/NwKXfq M1SP/Bs4sAX7r3IR8eF+wyWxl/sCRR1r3QfxLIILmVxwBSjCbJH33+wtP/6gexvP Z+kx5vI7Mi0LxE/qYz60x5V8lXYLpWz+Xe0k+mq0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:to:from:subject:message-id:date:mime-version :content-type:content-transfer-encoding; s=default; bh=ZUF4/51dJ f0n+CrsdeSipmnsbno=; b=FXqDJN6eG4L2hP6XE8BiHgS7B6kMFne7aqigIIn7R 07/5KYVzO+fsQBRca/f1gCJDln7hDiuJ/xAM8FyCbjR6RAbrnTtZ+dpI64IpuAMM JlZ+QRijt/xxYlyRwTwdVT/BtOALURkl7wCRpTzNQfNF63rjS8tqVRZKnFK9xj7w HU= Original-Received: (qmail 112240 invoked by alias); 9 Sep 2015 16:36:09 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Original-Sender: libc-alpha-owner@sourceware.org Original-Received: (qmail 112218 invoked by uid 89); 9 Sep 2015 16:36:08 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_20,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-qg0-f48.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=to:from:subject:message-id:date:user-agent:mime-version :content-type:content-transfer-encoding; bh=gOvB28WqpA/acs9U0bvySQQSQNvYvHDEX3lOtZCLqwA=; b=QqdAAkSset1qo2IKlcl8+zF91usQvNk6cgmlnFYZtNYY02HvUbIqwjtnIr/AmXPabu o2AvKoc91GOPuSbvDi6H+9Pmoia3xqhENnqCG8/E7J5qOY2gJmqqHpaNFpMPPaShoKLv NSld5VGiSAbRFfV7aCm+MytY1Ar7L8n36VhBymHuL+5V9JpXKIngcskHK0RrS33PfyzH PX88P9fxBuLZMoFhXpmW7IFtSjoX1Ix031oX/JDcTntavX92vO4f62kqIEZW5kuz0YXW 0jmHBDPtvuGHiPHUExA087hwq103jvuDDEROULsFh0mHbujJaHLYqApUxh+Yr/UqDUKf /dig== X-Received: by 10.140.96.135 with SMTP id k7mr37426337qge.34.1441816564094; Wed, 09 Sep 2015 09:36:04 -0700 (PDT) X-Enigmail-Draft-Status: N1010 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.1.0 Xref: news.gmane.org gmane.comp.lib.glibc.alpha:55350 gmane.comp.gcc.devel:141199 gmane.linux.lib.musl.general:8471 Archived-At: Over the past couple months I have been working on adding a function to glibc for erasure of sensitive data after it's used. This function already exists in various other libraries under various names -- explicit_bzero, explicit_memset, SecureZeroMemory, memset_s, OPENSSL_cleanse, etc -- and the problem it solves is, in this code: void encrypt_with_phrase(const char *phrase, const char *in, char *out, size_t n) { char key[16]; genkey(phrase, key); encrypt(key, in, out, n); memset(key, 0, 16); } the compiler is entitled to delete the final call to memset, on the grounds that `key` is dead after the call, so no conforming C program can access that memory to prove that it has been cleared. However, a crash dump, a debugger, or a plain old bug in the program might expose that region of the stack to prying eyes. This is listed at https://cwe.mitre.org/data/definitions/14.html as a "common weakness". explicit_bzero (under whatever name) is just bzero with an additional, documented guarantee that the compiler will not optimize it out even if the memory region is dead afterward. To further underline that this is a common scenario, there are 152 uses of either OPENSSL_cleanse or explicit_bzero in libressl 2.2.3 (they appear to be in the middle of changing the name). During development of my patches, we encountered two problems which cannot be fully resolved without compiler assistance, so I want to open a conversation about getting that assistance added to GCC and Clang. The first, simpler problem is strictly optimization. explicit_bzero can be optimized to memset followed by a vacuous use of the memory region (generating no machine instructions, but preventing the stores from being deleted as dead); this is valuable because the sensitive data is often small and fixed-size, so the memset can in turn be replaced by inline code. (This also facilitates implementation of -D_FORTIFY_SOURCE checks for explicit_bzero.) Again looking at libressl, 92 of those 152 uses are improved by a crude version of this optimization: void explicit_bzero(void *, size_t); extern inline __attribute__((gnu_inline, always_inline)) void explicit_bzero_constn(void *ptr, size_t len) { typedef struct {char x[len];} memblk; memset(ptr, 0, len); asm("" : : "m" (*(memblk __attribute__((may_alias)) *)ptr)); } #define explicit_bzero(s, n) \ (__extension__(__builtin_constant_p(n) && (n) > 0 \ ? explicit_bzero_constn(s, n) \ : explicit_bzero(s, n))) I call this "crude" because it only works in GCC, when compiling C, and when the length parameter is compile-time constant. GCC issues no error for this code when 'len' is not compile-time constant, but it is not documented to work reliably. When compiling C++, GCC does not accept a structure containing an array whose size is not *lexically* constant; even if the body of explicit_bzero_constn is moved into the macro so that the whole thing is guarded by __builtin_constant_p, using explicit_bzero with a non-constant size will cause a compile error. The same is true for Clang whether compiling C or C++. This problem could be solved with a very simple feature addition: extern inline __attribute__((gnu_inline, always_inline)) void explicit_bzero(void *ptr, size_t len) { memset(ptr, 0, len); __builtin_use_memory(ptr, len); } where __builtin_use_memory compiles to no machine instructions, but is treated as a non-deletable use of all LEN bytes starting at PTR, whether or not LEN is constant. The harder problem, unfortunately, is one of correctness. Consider now this example (courtesy Florian Weimer): extern void explicit_bzero(void *, size_t); struct key { unsigned long long hi, lo; }; extern struct key get_key(void); extern void use_key(struct key); void without_clear(void) { struct key k = get_key(); use_key(k); } void with_clear(void) { struct key k = get_key(); use_key(k); explicit_bzero(&k, sizeof k); } On AMD64 this will be compiled to something like without_clear: subq $8, %rsp call get_key movq %rdx, %rsi movq %rax, %rdi call use_key addq $8, %rsp ret with_clear: subq $24, %rsp call get_key movq %rax, %rdi movq %rdx, %rsi movq %rax, (%rsp) movq %rdx, 8(%rsp) call use_key movq %rsp, %rdi movl $16, %esi call explicit_bzero addq $24, %rsp ret The ABI dictates basically everything you see. The call to explicit_bzero has forced the compiler to *create* a second copy of the variable `k` on the stack, just so it can be erased -- and the copy in registers survives (at least for a short time), which is not what the programmer wanted. With or without explicit_bzero, we have no way of getting rid of the copy in registers. More complicated scenarios of course exist. I think the ideal feature addition to address this would be void safe(void) { struct key __attribute__((sensitive)) k = get_key(); use_key(k); } which should generate safe: subq $8, %rsp call get_key movq %rdx, %rsi movq %rax, %rdi call use_key xorq %rdx, %rdx xorq %rax, %rax xorq %rsi, %rsi xorq %rdi, %rdi addq $8, %rsp ret (use_key *could have* clobbered the values in all four of those registers, but it is not required to. If use_key gets inlined, of course the compiler could track what registers actually contain sensitive values.) We would have to nail down precise semantics for __attribute__((sensitive)). It needs to propagate to invisible copies made by the compiler, and to *partial* copies in register or stack slots. Doing a full-on taint propagation, however, might be too much. (In the original example, one would *not* want the compiler to erase `out`, even though what gets put into `out` is data-dependent on the value of `key`.) It also needs to work on heap allocations as well as declared variables. And there are uses in both glibc (crypt(3)) and libressl where the sensitive datum is one field of a larger struct, other fields of which must *not* be erased. For compatibility with existing code that uses explicit_bzero or similar, a second attribute would be desirable: extern void explicit_bzero(void *, size_t) __attribute__((erase_sensitive(1, 2))); informs the compiler that this function performs an erasure of arg#2 bytes of sensitive data pointed to by arg#1. This has the effect of back-propagating __attribute__((sensitive)) to the object pointed-to by argument #1. The compiler _is_ then allowed to omit a call to the function provided that its own erasures are at least as thorough. For heap allocations (which don't go out of scope) the call to this function marks the point where erasure of copies should occur. (I'm proposing an attribute rather than a builtin function because of the many different names used for this operation by existing code.) Comments? Please note that I do not have anything like the time required to implement any of this myself (and I'm ten years out of practice on GCC and have no experience whatsoever with Clang, anyway). I'm hoping this catches someone's interest. zw