To: ding@gnus.org
From: Eric Abrahamsen
Subject: Re: View docx/doc documents from Gnus in Docview
Date: Mon, 11 Sep 2023 10:01:15 -0700
Message-ID: <877cowod9g.fsf@ericabrahamsen.net>
References: <8734zoskoe.fsf@thaodan.de> <87o7iango3.fsf@ericabrahamsen.net>
	<87il8hajv0.fsf@thaodan.de> <87jzsxogb0.fsf@ericabrahamsen.net>
	<87pm2p5sjc.fsf@ust.hk>

Andrew Cohen writes:

> Sorry for not replying sooner (I am swamped with real work and have
> little time for other things); I have had this working for myself so I
> thought I can provide some advice.
>
> Firstly, telling gnus to use doc-view for these documents is easy: you
> need to modify the variable 'mailcap-user-mime-data (which controls user
> overrides for various mime types). Here is an example (this will use
> doc-view-mode for mime types of ms-excel and
> openxmlformats-officedocument.wordprocessingml.document, and use eww for
> html.) You should modify this for your own needs:
>
> (setq mailcap-user-mime-data
>       '(((viewer . doc-view-mode)
>          (test . window-system)
>          (type . "application/vnd.ms-excel"))
>         ((viewer . doc-view-mode)
>          (test . window-system)
>          (type . "application/vnd.openxmlformats-officedocument.wordprocessingml.document"))
>         ((viewer . eww)
>          (test . (fboundp 'eww))
>          (type . "text/html"))))
>
> But unfortunately this won't work properly due to a deficiency in
> doc-view.
> Doc-view has only a fairly primitive mechanism for figuring
> out the type of the document; since docx documents are mostly zip
> archives, and many other file formats are also zip archives, doc-view
> will notice they are zip files and treat them as epub (for me, at
> least). The right way to fix this is to smarten up doc-view to
> correctly identify the file type. This isn't hard, but I don't have time
> to do it right now (maybe someone else is willing?). In the meantime
> you can use the following hack which works for me: replace the function
> 'doc-view-set-doc-type with the modified version below.
>
> (defun doc-view-set-doc-type ()
>   "Figure out the current document type (`doc-view-doc-type')."
>   (let* ((buffer-file-name (or buffer-file-name (buffer-name (current-buffer))))
>          (name-types
>           (when buffer-file-name
>             (cdr (assoc-string
>                   (file-name-extension buffer-file-name)
>                   '(
>                     ;; DVI
>                     ("dvi" dvi)
>                     ;; PDF
>                     ("pdf" pdf) ("epdf" pdf)
>                     ;; EPUB
>                     ("epub" epub)
>                     ;; PostScript
>                     ("ps" ps) ("eps" ps)
>                     ;; DjVu
>                     ("djvu" djvu)
>                     ;; OpenDocument formats.
>                     ("odt" odf) ("ods" odf) ("odp" odf) ("odg" odf)
>                     ("odc" odf) ("odi" odf) ("odm" odf) ("ott" odf)
>                     ("ots" odf) ("otp" odf) ("otg" odf)
>                     ;; Microsoft Office formats (also handled by the odf
>                     ;; conversion chain).
>                     ("doc" odf) ("docx" odf) ("xls" odf) ("xlsx" odf)
>                     ("ppt" odf) ("pps" odf) ("pptx" odf) ("rtf" odf)
>                     ;; CBZ
>                     ("cbz" cbz)
>                     ;; FB2
>                     ("fb2" fb2)
>                     ;; (Open)XPS
>                     ("xps" xps) ("oxps" oxps))
>                   t))))
>          (content-types
>           (save-excursion
>             (goto-char (point-min))
>             (cond
>              ((looking-at "%!") '(ps))
>              ((looking-at "%PDF") '(pdf))
>              ((looking-at "\367\002") '(dvi))
>              ((looking-at "AT&TFORM") '(djvu))
>              ;; The following pattern actually is for recognizing
>              ;; zip-archives, so that this same association is used for
>              ;; cbz files. This is fine, as cbz files should be handled
>              ;; like epub anyway.
>              ((looking-at "PK") '(epub odf))))))
>     (setq-local
>      doc-view-doc-type
>      (car (or (nreverse (seq-intersection name-types content-types #'eq))
>               (when (and name-types content-types)
>                 (error "Conflicting types: name says %s but content says %s"
>                        name-types content-types))
>               name-types content-types
>               (error "Cannot determine the document type"))))))

Thanks for this! The version of `doc-view-set-doc-type` in master looks
almost exactly like what you've posted here, with the exception of the
let* for

  (buffer-file-name (or buffer-file-name (buffer-name (current-buffer))))

at the top. Could someone have fixed it in the meantime?
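
If that binding really is the only difference, one way to try the hack
without redefining the whole function might be an :around advice that
supplies the same fallback. This is a minimal, untested sketch: the name
my/doc-view-fall-back-to-buffer-name is made up for illustration, and it
assumes the rest of `doc-view-set-doc-type' in your Emacs already matches
the version quoted above.

```elisp
;; Hypothetical helper, not part of doc-view: bind `buffer-file-name'
;; to the buffer name when the buffer visits no file (for example an
;; attachment opened from Gnus), then call the stock function.
(defun my/doc-view-fall-back-to-buffer-name (orig-fun &rest args)
  "Call ORIG-FUN with `buffer-file-name' falling back to the buffer name."
  (let ((buffer-file-name (or buffer-file-name (buffer-name))))
    (apply orig-fun args)))

;; Wrap the existing function instead of replacing its definition.
(advice-add 'doc-view-set-doc-type :around
            #'my/doc-view-fall-back-to-buffer-name)
```

It can be taken out again with
(advice-remove 'doc-view-set-doc-type #'my/doc-view-fall-back-to-buffer-name).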