ntg-context - mailing list for ConTeXt users
From: "Michal Vlasák via ntg-context" <ntg-context@ntg.nl>
To: <ntg-context@ntg.nl>
Cc: "Michal Vlasák" <lahcim8@gmail.com>
Subject: Multimedia, PDF and ConTeXt
Date: Tue, 27 Jul 2021 00:49:27 +0200
Message-ID: <CD3FSWRSNXEL.CYFXL9DFR2G1@phobos>

Dear ConTeXt community,

recently, as part of my bachelor thesis, I looked into the state of
multimedia (audio, video, 3D) and other relatively obscure PDF
features in connection with TeX.

I have already put off this write-up long enough. Hopefully it clears
up some of the current uncertainty about these features and their
presence in ConTeXt (see the abstract of Hans' talk for the upcoming
ConTeXt meeting, chapter 8 of "On and on" and remarks in files like
"lpdf-wid.lmt").

The following text discusses the options for including multimedia in PDF
files from ConTeXt. First the different PDF mechanisms are introduced
and compared, then their support in ConTeXt is summarized. My ideas
for future steps follow, and a patch that fixes some bugs is included
below. Lastly I link to some resources on the topic.

The various versions of the PDF standard have over the years developed
several ways of including "multimedia" in PDF files. The simplest are
XObjects which allow raster and vector graphics -- this is a well known
and well supported feature in both PDF writers and viewers. However,
later revisions of the PDF standard added what are essentially five
different mechanisms for including video, audio or 3D files (each
mechanism supports a different subset of these three). To evaluate
these mechanisms from the perspective of ConTeXt, I use the following
criteria:

 - support in PDF standard (deprecation, etc.),
 - supported media types (audio, video[+optional audio], 3D),
 - support of different source types (embedded file, external file, URL),
 - what is possible to achieve ("usefulness") and at what cost
   ("complexity"),
 - current support in ConTeXt,
 - and the most important: support in PDF viewers.

Perhaps the "source types" need a bit of explanation. Essentially all
file references from PDF can be one of three types:

 - Embedded file. The referenced file is *in* the PDF file, which means
   that it can also be compressed as part of it (not very useful for
   multimedia, though, since media files are usually compressed
   already). This is nice because the result is self-contained -- the
   media file can't get lost and there is only a single file to
   distribute.
 - URL file. The file is referenced solely by its URL. This takes
   almost no space in the PDF file, but the availability of the media
   file is then not tied to that of the PDF file.
 - External file. A file path is included in the PDF file. The file
   doesn't have to be available over the internet, but it has to be
   distributed along with the PDF file (and the relative path has to
   match).

The "usefulness" aspect includes the possibility of interaction or
scripting, e.g. media player buttons ("controls"), scripting with
JavaScript, or some control through PDF actions (\goto, and triggers
like page open, which allow auto-play).
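
As a sketch of the page-open trigger: the rendering window from the test
file further below could presumably be given page actions. The
"closepageaction" key is visible in the patch context for scrn-wid.mklx;
"openpageaction" is only my assumption of its counterpart, and
"autoplaywindow", "myvideo" and "video.mp4" are placeholder names:

    \setupinteraction[state=start]

    \useexternalrendering[myvideo][video/mp4][video.mp4][embed=yes]

    % openpageaction is an assumed key; closepageaction appears in
    % scrn-wid.mklx. The action names are the ones used with \goto.
    \definerenderingwindow
      [autoplaywindow]
      [width=\textwidth,
       height=.5\textwidth,
       openpageaction=StartRendering{myvideo},
       closepageaction=StopRendering{myvideo}]

    \starttext
    \noindent
    \placerenderingwindow[autoplaywindow][myvideo]
    \stoptext

Whether a viewer honours such triggers is a separate question, see the
viewer notes for each mechanism.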

The viewers I tested were: Acrobat Reader DC, Foxit Reader, Sumatra PDF
on Windows and Evince, Okular, Xpdf, MuPDF, Firefox and Google Chrome on
Linux.

Now to the different mechanisms:

1) Sound objects

 - First appeared in PDF 1.2 (1996), but has since been deprecated (PDF
   1.5, 2003) and become unsupported (PDF 2.0, 2017).

 - Only audio is supported.

 - "Raw" and in practice uncompressed PCM audio can be embedded (i.e.
   ".wav" format without the metadata).
   Otherwise an external file may be used (this one has to be in a real
   audio format - like ".wav" - i.e. with metadata).

 - Users usually don't have raw audio. So embedding requires
   preprocessing. Some control using PDF actions is possible.

 - Not supported in ConTeXt.

 - None of the viewers supports the external files. Only Acrobat Reader
   supports the embedded raw audio.

2) Movie objects

 - First appeared in PDF 1.2 (1996), but has since been deprecated (PDF
   1.5, 2003) and become unsupported (PDF 2.0, 2017).

 - Both video and audio are supported.

 - Any source (embedded, external and URL file).

 - In all regards superior to sound objects. It is still relatively
   simple and allows some customization and control (media player
   controls, PDF actions).

 - This is the backing mechanism for including video and audio in
   ConTeXt (\externalfigure, \useexternalsoundtrack).

 - Supported only in Evince and Okular (with their usual quirks, see
   below). Notably, Acrobat Reader no longer supports this mechanism.
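
For illustration, a minimal sketch of this route through the figure
mechanism. "video.mp4" and the dimensions are placeholders, and I am
assuming nothing beyond the usual width/height keys of \externalfigure:

    \setupinteraction[state=start]

    \starttext
    % the video is placed like any other figure; according to the
    % summary above the backend wraps it in a movie object
    \externalfigure[video.mp4][width=8cm,height=4.5cm]
    \stoptext

Given the viewer support described above, the result would be expected
to play only in Evince and Okular.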

3) Multimedia ("Renditions")

 - First appeared in PDF 1.5 (2003). Adobe Acrobat considers them
   "legacy".

 - Both video and audio supported (as well as other unspecified types of
   multimedia, like images and Flash, but not really, see below).

 - In theory all source types should be possible.

 - This mechanism was supposed to replace sound and movie objects,
   hence their deprecation. The mechanism is complex (the spec is 10
   times longer than that for movie objects). It expects the PDF
   viewers to work with plugins and introduces ways of determining
   whether a media file is really playable in some plugin. It even
   allows including several media files (to serve as fallbacks should
   the primary one be unsupported by the viewer). Another complication
   is that the rectangle where the media will be played ("screen") is
   separated from the media itself ("rendition"). In theory this allows
   mixing and matching them, but in practice it is a lot of unnecessary
   complexity, at least in my opinion.

   This mechanism allows multimedia player controls, as well as PDF
   actions. The PDF action can be either one of the predefined ones or
   entirely specified in JavaScript (extra API is available for this).

 - This is the mechanism behind \useexternalrendering. This has been
   used for Flash (.swf files) and "manual" audio + video insertion as
   far as I can tell.

 - Evince and Okular support this (with usual quirks), but not for
   external files (Evince segfaults). Acrobat and Foxit support this
   mechanism as well, but Acrobat only allows embedded files. Okular by
   mistake auto-plays all media [1].

4) 3D art

 - First appeared in PDF 1.6 (2004).

 - Only 3D files supported. This means U3D and later PRC files. The 3D
   objects described in the files are shown in a scene whose parameters
   (like camera position, angle, background color, etc.) can be
   configured.

 - The source is not a file, but a "PDF stream" (which is essentially
   an embedded file with different metadata, but which also allows
   "external files" to contain the stream data).

 - The 3D functionality is nice. It allows a great amount of
   interactivity (playing with the camera, selectively disabling 3D
   objects, etc.) and also scriptability (switching between predefined
   "views" with PDF actions and a _lot_ of possibilities with
   JavaScript scripts).

 - This is the mechanism used for u3d and prc files in the ConTeXt
   "figure" mechanism (\externalfigure).

 - Apart from the external streams (see above) everything works in Adobe
   Acrobat. Foxit Reader also has support, but it is limited (no
   support for JavaScript and printing).
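
A minimal sketch of what this looks like from the user side, assuming a
file "model.u3d" (a placeholder name) and only the width/height keys.
Scene parameters such as views or camera settings are left out, because
I would only be guessing at their keys:

    \setupinteraction[state=start]

    \starttext
    % u3d and prc files go through the same figure interface
    \externalfigure[model.u3d][width=8cm,height=8cm]
    \stoptext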


5) Rich Media

 - First appeared in Adobe extension level 3 to PDF 1.7 (2008). Later
   included in PDF 2.0 (2017). It was meant to replace both the
   multimedia (renditions) and 3D art mechanisms with a unified
   mechanism based on Flash, thus also supporting arbitrary Flash
   applications.

 - Supports video, audio and 3D.

 - Only embedded files are supported.

 - While the mechanism is heavily based on Flash (which has been dead
   since December 2020), it also allows "plain" Rich Media without
   Flash.

   The old idea was that the PDF viewer would support Flash (and playing
   its video as well as mp4), but the audio/video wouldn't be played
   directly by the PDF viewer, but by a Flash application (embedded in
   the PDF along with the media file). This means that the mechanism has
   inherent complexity that is not justified nowadays (essentially four
   levels of indirection for a plain audio / video file).

   While the same thing should have been true for 3D files I couldn't
   find any real usage like that. Instead it seems that 3D files with
   Rich Media have always been used like with the "3D art" mechanism
   (but with different wrappers).

   There is essentially no scriptability for audio and video. (Note that
   in this regard 3D files work just like with the "3D art" mechanism).
   There also isn't an easy way to display multimedia player controls (a
   hack works for Acrobat). One thing that it allows is playing the
   media in a customizable window, even full screen (not only in a part
   of a page like the previous mechanisms allow).

 - ConTeXt uses this mechanism for Flash (SWF) files in the figure
   mechanism. This also allows audio/video (a Flash media player, like
   "vplayer", is inserted and the media file is its parameter), see for
   example "java-imp-vplayer.mkiv".

 - Both Flash and "plain" Rich Media are supported by Acrobat Reader.
   Okular only supports Flash Rich Media. How is this possible,
   considering that Flash Player is dead? Well, both viewers have a
   compatibility layer that detects an embedded Flash media player
   file and, instead of using it, plays the video natively. This is
   good, because there are a lot of documents out there which use
   Flash based Rich Media. But there is absolutely no need to create
   new documents with embedded Flash player applications: they only
   take up space and aren't even used.

   Okular notably doesn't support plain Rich Media. The support is easy
   to add, but my proposed patch [2] depends on changes to poppler. The
   poppler developers want to take this chance to improve the Rich Media
   representation [3] but I haven't gotten to that, yet.

   Support similar to Okular's should be relatively easy to add to
   Evince as well.

   The 3D support is the same as with 3D art for Acrobat Reader.
   Weirdly, Foxit Reader doesn't support 3D files wrapped in Rich
   Media, although there doesn't seem to be any good reason for it.


All in all, of the five mechanisms two are deprecated and unfortunately
no longer supported in the most used PDF viewer, and two others are
needlessly complex and in reality limited. For example, while the
multimedia mechanism supports JavaScript, (AFAIK) only Acrobat Reader
implements it, so one has to choose between broad viewer support and
the full feature set.

The support for video and audio in Okular and Evince is based on
GStreamer. Explaining GStreamer is tricky, but essentially it allows the
viewers to play any media type as long as the right plugin is installed.
These plugins are distributed in bundles, and three of them cover all
reasonable formats and more. But while the media file format support is
great, these viewers don't really support PDF actions or JavaScript for
more control over the media playback.

Acrobat and Foxit both use Windows Media Player for playing the video.
Both support controls, but behave differently -- Acrobat displays the
controls outside of the multimedia annotation, Foxit within...

As if that weren't enough, there is more trouble with playing
multimedia in Acrobat Reader and Foxit Reader. They nag you to allow
the media playback every time. You can choose to trust the file once or
from now on, but somebody who opens a foreign PDF with video isn't
going to get a smooth experience.

Another thing (but I don't remember well) is that there is a check box
in Acrobat Reader that enables the "legacy" Multimedia mechanism. I
don't remember its state in an unaltered installation.

After evaluating these mechanisms I came to the conclusion that a PDF
writer today is best off:

 - Embedding video and audio using the "multimedia" ("renditions")
   mechanism. It is supported in proprietary and open source viewers
   alike. Customization and scripting / PDF actions are out of the
   question, though.

 - Embedding 3D files using the "Rich Media" mechanism. While the
   difference from "3D art" is essentially just a few wrappers, it has
   real advantages (data sources are files, not streams, and multiple
   JavaScript script files are supported) that I find nice enough for
   the implementation and users alike.


Further sources on this topic are the LaTeX-centric [4] and [5]. I go
into more detail in the former. In the latter the "plain" Rich Media
and "multimedia" ("renditions") mechanisms are suggested as
replacements for the Flash media player approach.


And now for the future. What should ConTeXt do? On one hand all
available mechanisms are flawed in one way or another. On the other
hand some users may still find the functionality useful. My suggestion
is to either delete all the support for audio/video or:

 1) Delete the "Movie objects" implementation of figures. It is not
    supported in the viewers where users expect it to work [6].

 2) Delete all mentions of Flash. There is no reason to create new
    documents with embedded Flash files, even though they may work in
    some viewers. Plain Rich Media can be used instead, with hopefully
    soon equal support [2].

 3) The "externalrendering" mechanism (multimedia/renditions) can stay.
    If the insertion of audio/video as "figures" is to stay, then I
    suggest to use multimedia/renditions for it (in simplified form).

Note that the 3D support in ConTeXt is completely fine and works in
Acrobat and Foxit.

The "externalrendering" part currently has three "bugs". A previous
discussion on this list provides some context [7]. The following is
currently "wrong":

 - Currently ConTeXt wraps the PDF file specification for an embedded
   file inside another file specification (i.e. embedded files don't
   work).
 - As a result of "externalrendering" inheriting from \framed, the
   PDF annotation late_lua whatsit is centered inside the frame and so
   the annotation itself is offset by half its width to the right.
 - ConTeXt doesn't explicitly allow the viewer to create temporary
   files, hence the playback fails in Acrobat Reader.

Hopefully the patch included below fixes all three. But note that while
I love ConTeXt I don't know it well and may be terribly wrong. I was
also aiming at a minimal diff for inclusion in this e-mail. This is a
test file for it:

    \starttext
    \setupinteraction[state=start]
    
    \useexternalrendering[myvideo][video/mp4][video.mp4][embed=yes]
    \useexternalrendering[myvideo2][video/mp4][https://gitlab.com/agrahn/media9/uploads/c7e2ae944fbd711df4ad7bd58000f83a/nightseq1.mp4]
    \useexternalrendering[myvideo3][video/mp4][video.mp4]
    
    \definerenderingwindow[myrenderingwindow][width=\textwidth, height=\textwidth]
    
    \noindent
    \placerenderingwindow[myrenderingwindow][myvideo]
    
    \goto{START}[StartRendering{myvideo}]
    \goto{STOP} [StopRendering{myvideo}]
    \goto{PAUSE}[PauseRendering{myvideo}]
    
    \vfil\break\noindent
    \placerenderingwindow[myrenderingwindow][myvideo2]
    
    \vfil\break\noindent
    \placerenderingwindow[myrenderingwindow][myvideo3]
    
    \stoptext

All three file source types are demonstrated. Any "video.mp4" in the
directory you compile in will do. (Works as expected in Okular on
Linux.)


This was a dump of the knowledge that I gained from writing my thesis.
Sadly it's in Czech, but part of it is PDF code snippets and tables
summarizing viewer support, which I can translate and provide if there
is interest. But a large part of what I deem practical today is implemented
and documented here:
http://mirrors.ctan.org/macros/luatex/optex/pdfextra/pdfextra-doc.pdf.
The source is probably hard to read, because of the "_" and "." prefixes
in the control sequences, but those can be ignored. I posted some "real"
documents in [3] and [5].

If more documents / snippets / explanations are needed I hope I can
provide them.

Sadly, while working on this, I didn't have access to the PDF 2.0
standard. My information mostly comes from the PDF 1.7 standard and
publicly known information about PDF 2.0 - the Rich Media mechanism got
included in PDF 2.0, but I am not sure to what extent the Flash part
was included. I also don't know if there really is anything new, but
nothing suggests it. Regardless, viewer support isn't complete even for
something standardized over 20 years ago, so I don't expect a
revolution in the PDF viewers, considering the price of the
standard(s).


[1]: https://bugs.kde.org/show_bug.cgi?id=436709
[2]: https://invent.kde.org/graphics/okular/-/merge_requests/426
[3]: https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/855
[4]: https://tex.stackexchange.com/questions/516029/media9-is-becoming-obsolete-dec-2020-any-alternatives-for-embedding-video-audio/516102
[5]: https://gitlab.com/agrahn/media9/-/issues/9
[6]: https://wiki.contextgarden.net/Command/externalfigure
[7]: https://www.mail-archive.com/ntg-context@ntg.nl/msg88639.html


Best regards,
Michal Vlasák


--- a/tex/texmf-context/tex/context/base/mkxl/lpdf-wid.lmt
+++ b/tex/texmf-context/tex/context/base/mkxl/lpdf-wid.lmt
@@ -689,22 +689,26 @@
      --          B = start,
      --     }
      -- }
-     -- local parameters = pdfdictionary {
-     --     Type = pdfconstant(MediaPermissions),
-     --     TF   = pdfstring("TEMPALWAYS") }, -- TEMPNEVER TEMPEXTRACT TEMPACCESS TEMPALWAYS
-     -- }
+        local parameters = pdfdictionary {
+            Type = pdfconstant("MediaPermissions"),
+            TF   = pdfstring("TEMPALWAYS"), -- TEMPNEVER TEMPEXTRACT TEMPACCESS TEMPALWAYS
+            -- TEMPALWAYS - allows temporary files (needed for Acrobat / Windows Media Player)
+        }
         local descriptor = pdfdictionary {
             Type = pdfconstant("Filespec"),
             F    = filename,
         }
         if isurl then
             descriptor.FS = pdfconstant("URL")
+            descriptor = pdfreference(pdfflushobject(descriptor))
         elseif option[v_embed] then
-            descriptor.EF = codeinjections.embedfile {
+            descriptor = codeinjections.embedfile {
                 file     = filename,
                 mimetype = mimetype, -- yes or no
                 compress = false,
             }
+        else
+            descriptor = pdfreference(pdfflushobject(descriptor))
         end
         local clip = pdfdictionary {
             Type = pdfconstant("MediaClip"),
@@ -712,8 +716,8 @@
             N    = label,
             CT   = mimetype,
             Alt  = pdfarray { "", "file not found" }, -- language id + message
-            D    = pdfreference(pdfflushobject(descriptor)),
-         -- P    = pdfreference(pdfflushobject(parameters)),
+            D    = descriptor,
+            P    = pdfreference(pdfflushobject(parameters)),
         }
         local rendition = pdfdictionary {
             Type = pdfconstant("Rendition"),
--- a/tex/texmf-context/tex/context/base/mkxl/scrn-wid.mklx
+++ b/tex/texmf-context/tex/context/base/mkxl/scrn-wid.mklx
@@ -649,6 +649,7 @@
     \letrenderingwindowparameter\c!closepageaction\empty
     \setrenderingwindowparameter\c!width          {\d_scrn_rendering_width }%
     \setrenderingwindowparameter\c!height         {\d_scrn_rendering_height}%
+    \setrenderingwindowparameter\c!align          {\v!flushleft}% don't center annotation whatsit
 \to \everypresetrenderingwindow
 
 \permanent\tolerant\protected\def\placerenderingwindow[#window]#spacer[#rendering]% do all in lua