Date: Mon, 21 Oct 2024 10:52:57 +0000 (UTC)
From: "Eregon (Benoit Daloze) via ruby-core"
To: ruby-core@ml.ruby-lang.org
Cc: "Eregon (Benoit Daloze)"
Subject: [ruby-core:119571] [Ruby master Feature#20792] String#forcible_encoding?

Issue #20792 has been updated by Eregon (Benoit Daloze).

Right. But if it is valid in that encoding, wouldn't you always or almost always then want the String (or a copy of it) in that encoding?
If you do, `.with_encoding` would be more efficient as it keeps the computed coderange.
If you don't, then indeed just a predicate would avoid the String instance allocation (Strings are copy-on-write, so it's just one object allocation; string bytes are not copied).

Do you have an example where you wouldn't want a String in that encoding?
The [linked example](https://github.com/ruby/prism/blob/d6e9b8de36b4d18debfe36e4545116539964ceeb/lib/prism/parse_result.rb#L15-L30) needs a UTF-8 String on line 19.
So `with_encoding` seems a perfect fit there and more efficient than the predicate (1 vs 2 coderange scans).
The code also has to explicitly work around `force_encoding` being in-place (a common inconvenience with `force_encoding`) on line 27, which `with_encoding` addresses.

----------------------------------------
Feature #20792: String#forcible_encoding?
https://bugs.ruby-lang.org/issues/20792#change-110185

* Author: kddnewton (Kevin Newton)
* Status: Open
----------------------------------------
I would like to add a method to String called `forcible_encoding?(encoding)`. This would return true or false depending on whether the receiver can be forced into the given encoding without breaking the string. It would effectively be an alias for:

```ruby
def forcible_encoding?(enc)
  original = encoding
  result = force_encoding(enc).valid_encoding?
  force_encoding(original)
  result
end
```

I would like this method because there are extremely rare but possible circumstances where source files are marked as binary but contain UTF-8-encoded characters. In that case I would like to check if it's possible to cleanly force UTF-8 before actually doing it.

The code I'm trying to replace is here: https://github.com/ruby/prism/blob/d6e9b8de36b4d18debfe36e4545116539964ceeb/lib/prism/parse_result.rb#L15-L30.

The pull request for the code is here: https://github.com/ruby/ruby/pull/11851.
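
To make the two options discussed in this thread concrete, here is a minimal sketch using only existing String methods (`dup`, `force_encoding`, `valid_encoding?`, `b`). The helper names `forcible_encoding?` (written here as a top-level function taking the string as an argument) and `forced_copy_or_nil` are hypothetical and exist only for illustration. `with_encoding` is a proposal mentioned in the reply and is not assumed to exist, so the second helper approximates the idea with `dup.force_encoding`, without any of the coderange reuse the reply describes.

```ruby
# Hypothetical helpers for illustration only; neither is part of Ruby's String API.

# Non-mutating sketch of the proposed predicate: retag a copy and validate it,
# so the caller's string keeps its original encoding.
def forcible_encoding?(string, enc)
  string.dup.force_encoding(enc).valid_encoding?
end

# Sketch of the "give me the retagged copy" alternative raised in the reply:
# returns a copy tagged with enc when the bytes are valid, otherwise nil.
def forced_copy_or_nil(string, enc)
  copy = string.dup.force_encoding(enc)
  copy.valid_encoding? ? copy : nil
end

binary = "caf\xC3\xA9".b   # valid UTF-8 bytes tagged as binary
broken = "\xFF\xFE".b      # bytes that are not valid UTF-8

p forcible_encoding?(binary, Encoding::UTF_8)  # => true
p forcible_encoding?(broken, Encoding::UTF_8)  # => false

# The original string is left untouched; only the copy is retagged.
utf8 = forced_copy_or_nil(binary, Encoding::UTF_8)
p binary.encoding == Encoding::BINARY          # => true
p utf8.encoding == Encoding::UTF_8             # => true
```

Because both helpers work on a copy, they also accept frozen strings, which the in-place version in the issue description would not.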