From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: ding@gnus.org
Subject: Re: View docx/doc documents from Gnus in Docview
Date: Mon, 11 Sep 2023 10:01:15 -0700
Message-ID: <877cowod9g.fsf@ericabrahamsen.net>
In-Reply-To: <87pm2p5sjc.fsf@ust.hk>

Andrew Cohen <acohen@ust.hk> writes:
> Sorry for not replying sooner (I am swamped with real work and have
> little time for other things); I have had this working for myself so I
> thought I could provide some advice.
>
> Firstly, telling gnus to use doc-view for these documents is easy: you
> need to modify the variable 'mailcap-user-mime-data (which controls user
> overrides for various mime types). Here is an example (this will use
> doc-view-mode for mime types of ms-excel and
> openxmlformats-officedocument.wordprocessingml.document, and use eww for
> html.) You should modify this for your own needs:
>
> (setq mailcap-user-mime-data
>       '(((viewer . doc-view-mode)
>          (test . window-system)
>          (type . "application/vnd.ms-excel"))
>         ((viewer . doc-view-mode)
>          (test . window-system)
>          (type . "application/vnd.openxmlformats-officedocument.wordprocessingml.document"))
>         ((viewer . eww)
>          (test . (fboundp 'eww))
>          (type . "text/html"))))
>
> But unfortunately this won't work properly due to a deficiency in
> doc-view. Doc-view has only a fairly primitive mechanism for figuring
> out the type of the document; since docx documents are mostly zip
> archives, and many other file formats are also zip archives, doc-view
> will notice they are zip files and treat them as epub (for me, at
> least). The right way to fix this is to smarten up doc-view to
> correctly identify the file type. This isn't hard, but I don't have time
> to do it right now (maybe someone else is willing?). In the meantime
> you can use the following hack which works for me: replace the function
> 'doc-view-set-doc-type with the modified version below.
>
> (defun doc-view-set-doc-type ()
>   "Figure out the current document type (`doc-view-doc-type')."
>   (let* ((buffer-file-name (or buffer-file-name (buffer-name (current-buffer))))
>          (name-types
>           (when buffer-file-name
>             (cdr (assoc-string
>                   (file-name-extension buffer-file-name)
>                   '(
>                     ;; DVI
>                     ("dvi" dvi)
>                     ;; PDF
>                     ("pdf" pdf) ("epdf" pdf)
>                     ;; EPUB
>                     ("epub" epub)
>                     ;; PostScript
>                     ("ps" ps) ("eps" ps)
>                     ;; DjVu
>                     ("djvu" djvu)
>                     ;; OpenDocument formats.
>                     ("odt" odf) ("ods" odf) ("odp" odf) ("odg" odf)
>                     ("odc" odf) ("odi" odf) ("odm" odf) ("ott" odf)
>                     ("ots" odf) ("otp" odf) ("otg" odf)
>                     ;; Microsoft Office formats (also handled by the odf
>                     ;; conversion chain).
>                     ("doc" odf) ("docx" odf) ("xls" odf) ("xlsx" odf)
>                     ("ppt" odf) ("pps" odf) ("pptx" odf) ("rtf" odf)
>                     ;; CBZ
>                     ("cbz" cbz)
>                     ;; FB2
>                     ("fb2" fb2)
>                     ;; (Open)XPS
>                     ("xps" xps) ("oxps" oxps))
>                   t))))
>          (content-types
>           (save-excursion
>             (goto-char (point-min))
>             (cond
>              ((looking-at "%!") '(ps))
>              ((looking-at "%PDF") '(pdf))
>              ((looking-at "\367\002") '(dvi))
>              ((looking-at "AT&TFORM") '(djvu))
>              ;; The following pattern actually is for recognizing
>              ;; zip-archives, so that this same association is used for
>              ;; cbz files. This is fine, as cbz files should be handled
>              ;; like epub anyway.
>              ((looking-at "PK") '(epub odf))))))
>     (setq-local
>      doc-view-doc-type
>      (car (or (nreverse (seq-intersection name-types content-types #'eq))
>               (when (and name-types content-types)
>                 (error "Conflicting types: name says %s but content says %s"
>                        name-types content-types))
>               name-types content-types
>               (error "Cannot determine the document type"))))))

Thanks for this!

The version of `doc-view-set-doc-type` in master looks almost exactly
like what you've posted here, with the exception of the let* for

  (buffer-file-name (or buffer-file-name (buffer-name (current-buffer))))

at the top. Could someone have fixed it in the meantime?
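
To make the effect of that one binding concrete, here is a minimal
sketch of the name/content resolution step in isolation. The helper
`my/doc-type-guess' is hypothetical and not part of doc-view, its
extension table is a cut-down stand-in for the real one, and it assumes
(I have not verified this) that a Gnus attachment buffer has a nil
`buffer-file-name' but a buffer name that ends in the attachment's
extension.

```elisp
(require 'seq)

(defun my/doc-type-guess (file-name buffer-name)
  "Return a doc type symbol for a zip-based document.
FILE-NAME may be nil, in which case BUFFER-NAME supplies the extension.
Hypothetical helper that mirrors only the resolution step quoted above."
  (let* ((name (or file-name buffer-name))
         ;; Cut-down extension table, for illustration only.
         (name-types
          (cdr (assoc-string (or (file-name-extension name) "")
                             '(("epub" epub) ("docx" odf) ("xlsx" odf))
                             t)))
         ;; What the "PK" (zip archive) check yields in the quoted version.
         (content-types '(epub odf)))
    (car (or (nreverse (seq-intersection name-types content-types #'eq))
             name-types
             content-types))))

;; The extension is recovered from the buffer name, so odf wins.
(my/doc-type-guess nil "report.docx")    ; → odf
;; No usable extension: only the zip check is left, and epub comes first.
(my/doc-type-guess nil "*attachment*")   ; → epub
```

If that assumption holds, it would explain the behavior described
above: without a name to consult, a docx attachment is identified by
its zip header alone and ends up treated as epub.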

Thread overview: 9+ messages
2023-09-08 16:18 Björn Bidar
2023-09-10 16:20 ` Eric Abrahamsen
2023-09-10 19:50 ` Björn Bidar
2023-09-10 21:43 ` Eric Abrahamsen
2023-09-11 2:53 ` Andrew Cohen
2023-09-11 17:01 ` Eric Abrahamsen [this message]
2023-09-12 0:37 ` Andrew Cohen
2023-09-12 17:59 ` Eric Abrahamsen
2023-09-11 19:01 ` Björn Bidar