Date: Tue, 27 Oct 2020 16:43:23 -0400
From: Konstantin Ryabitsev
To: cgit@lists.zx2c4.com
Subject: Optional reachability checks for direct object access
Message-ID: <20201027204323.lznnkv37b5fa774q@chatter.i7.local>
List-Id: List for cgit developers and users

Hi, all:

It is common for git hosting environments to configure all forks of the
same repo to use an "object storage" repository. For example, this is
what allows git.kernel.org's 600+ forks of linux.git to take up only
10GB on disk as opposed to 800GB.

One of the side-effects of this setup is that any object in the shared
repository can be accessed from any of the forks, which periodically
confuses people into believing that something terrible has happened.
Case in point:

https://github.com/torvalds/linux/blob/b4061a10fc29010a610ff2b5b20160d7335e69bf/drivers/hid/hid-samsung.c#L113-L118

Now, this could be fixed by performing reachability checks, but they
are expensive, so it makes sense to have this off in most cases.
However, I think it would be a nice feature to be able to enable this
per-repository -- for example, to avoid someone getting confused when
they see odd objects in what would appear to be the official Linux
repository. Git-upload-pack implements this, which is why it's possible
to set uploadpack.allowReachableSHA1InWant
(https://git-scm.com/docs/git-config#Documentation/git-config.txt-uploadpackallowReachableSHA1InWant).

The easiest way I can think of to check whether an object is reachable
from a repository is to run "git branch --contains [commit-id]" and
check whether anything is returned. E.g., commit
184c79e4acf5a55f1d6febcf4cf1b545880c2fde doesn't exist in
torvalds/linux.git, but git finds it just fine through alternates:

$ git cat-file -t 184c79e4acf5a55f1d6febcf4cf1b545880c2fde
commit

And you can access it via cgit:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=184c79e4acf5a55f1d6febcf4cf1b545880c2fde

However, it doesn't exist in that tree:

$ git branch --contains 184c79e4acf5a55f1d6febcf4cf1b545880c2fde
$

If cgit could optionally perform this (or a similar) check and return
404 when a repo is configured to only display reachable objects, that
would help avoid confusion. Afaict, it's not even that expensive when
commit-graphs are regularly built. For example, the above check only
takes 0m0.034s of sys time, according to "time".

What do you think?

-K
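
P.S. To make the check above concrete, here is a rough sketch of it as a
Python helper. This is only an illustration, not how cgit would do it:
the function name is_commit_reachable and its "run" parameter are
hypothetical, and it simply shells out to the same "git branch
--contains" command. It assumes the id names a commit, treats any git
error as "not reachable", and only considers local branches, so tags
and other refs are ignored.

```python
import subprocess


def is_commit_reachable(repo_path, commit_id, run=subprocess.run):
    """Return True if at least one local branch in repo_path contains commit_id.

    Hypothetical helper: `run` defaults to subprocess.run and is only a
    parameter so the function can be exercised without a real repository.
    """
    result = run(
        ["git", "-C", repo_path, "branch", "--contains", commit_id],
        capture_output=True,
        text=True,
        check=False,
    )
    # Assumption: a non-zero exit (for example, an id that is not a valid
    # commit) is treated the same as "not reachable".
    if result.returncode != 0:
        return False
    # Empty output means no local branch contains the commit.
    return bool(result.stdout.strip())
```

A caller could return 404 whenever this yields False for a repo that is
configured to show only reachable objects.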