caml-list - the Caml user's mailing list
* [Caml-list] ANN: CamlPDF 1.7
@ 2013-08-15 11:21 John Whitington
  2013-08-15 14:21 ` oliver
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: John Whitington @ 2013-08-15 11:21 UTC (permalink / raw)
  To: caml users

Hi,

The first new release of the CamlPDF library for a while is here:

http://www.github.com/johnwhitington/camlpdf

(Or, shortly, via OPAM.)

The documentation is online here:

http://www.coherentpdf.com/camlpdf

A little introduction is here:

http://www.coherentpdf.com/introduction_to_camlpdf.pdf

Most importantly, CamlPDF is now open source, being under a standard 
LGPL with linking exception licence.

This release is much cleaner: development has moved to Github for 
transparency, ocamlfind is supported, and it should compile 
out-of-the-box with no dependencies on Windows, Mac and Linux. 
Documentation is much improved.

And, of course, there's lots of new functionality, such as 256 bit AES 
encryption, reading of malformed files, support for writing with object 
streams, and new modules for merging files, bookmarks, destinations. 
It's also very much faster.

There have, however, been significant non-backward-compatible API 
changes. These will be minimized in the future. Contact me directly or 
via the Github issue system if you need help updating code from a 
previous version.

Thanks,

-- 
John Whitington
Director, Coherent Graphics Ltd
http://www.coherentpdf.com/



* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 11:21 [Caml-list] ANN: CamlPDF 1.7 John Whitington
@ 2013-08-15 14:21 ` oliver
  2013-08-15 14:28   ` John Whitington
  2013-08-15 18:40 ` oliver
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: oliver @ 2013-08-15 14:21 UTC (permalink / raw)
  To: John Whitington; +Cc: caml users

Hi,

On Thu, Aug 15, 2013 at 12:21:44PM +0100, John Whitington wrote:
> Hi,
> 
> The first new release of the CamlPDF library for a while is here:
> 
> http://www.github.com/johnwhitington/camlpdf
[...]


This is really wonderful (hot) software :-)

Thanks a lot! :-)

But I had problems installing/using it.

The lib was installed under /usr/local/lib/ocaml/...
which is fine.

But the permissions were set so that only
root could access it.

As root, I had to run the following (in the directory that contains the camlpdf directory):

  # chmod go+r camlpdf
  # cd camlpdf
  # chmod go+r *


After that, it could be used.
I tried just pdfdraft and it did its job.

So, thanks again for this library. :-)
I will be eager to explore it.

Ciao,
   Oliver


* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 14:21 ` oliver
@ 2013-08-15 14:28   ` John Whitington
  2013-08-15 16:17     ` Gerd Stolpmann
  0 siblings, 1 reply; 15+ messages in thread
From: John Whitington @ 2013-08-15 14:28 UTC (permalink / raw)
  To: oliver; +Cc: caml users

Hi,

oliver wrote:
> Hi,
>
> On Thu, Aug 15, 2013 at 12:21:44PM +0100, John Whitington wrote:
>> Hi,
>>
>> The first new release of the CamlPDF library for a while is here:
>>
>> http://www.github.com/johnwhitington/camlpdf
> [...]
>
>
> This is really wonderful (hot) software :-)
>
> Thanks a lot! :-)
>
> But I had problems installing/using it.
>
> The lib was installed under /usr/local/lib/ocaml/...
> which is fine.
>
> But the permissions were set so that only
> root could access it.
>
> As root, I had to run the following (in the directory that contains the camlpdf directory):
>
>    # chmod go+r camlpdf
>    # cd camlpdf
>    # chmod go+r *
>
>
> After that, it could be used.
> I tried just pdfdraft and it did its job.
>
> So, thanks again for this library. :-)
> I will be eager to explore it.

The installation is handled by ocamlfind, via OCamlMakefile in a 
completely standard way, so there shouldn't be any problem.

However, I know very little about build and packaging systems. Does 
someone on the list recognize this symptom and how it might be fixed at 
source?

With Thanks,

-- 
John Whitington
Director, Coherent Graphics Ltd
http://www.coherentpdf.com/



* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 14:28   ` John Whitington
@ 2013-08-15 16:17     ` Gerd Stolpmann
  2013-08-15 18:39       ` oliver
  0 siblings, 1 reply; 15+ messages in thread
From: Gerd Stolpmann @ 2013-08-15 16:17 UTC (permalink / raw)
  To: John Whitington; +Cc: oliver, caml users


Looks like Oliver installed as root, and maybe there was a restrictive
umask. In that case, setting a better umask (e.g. umask 002) before
"make install" would be the solution. But that's the user's problem,
IMHO.
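
A minimal sketch of the umask arithmetic behind this diagnosis. It assumes the installer requests the usual default modes (0o666 for files, 0o777 for directories) and does not set permissions explicitly; the function names are made up for illustration.

```ocaml
(* Sketch: how a umask filters the permission bits a program requests.
   Assumption: files are requested with 0o666 and directories with 0o777,
   the usual defaults when a tool does not set modes explicitly. *)

(* Bits set in the umask are cleared from the requested mode. *)
let apply_umask ~umask requested = requested land (lnot umask)

(* Equivalent of "chmod go+r": add the read bit for group and others. *)
let add_go_read mode = mode lor 0o044

let () =
  List.iter
    (fun umask ->
      Printf.printf "umask %03o: file %03o, directory %03o\n" umask
        (apply_umask ~umask 0o666)
        (apply_umask ~umask 0o777))
    [ 0o002; 0o022; 0o077 ];
  (* A file created under umask 077 and then repaired with chmod go+r. *)
  Printf.printf "after chmod go+r: %03o\n"
    (add_go_read (apply_umask ~umask:0o077 0o666))
```

With a umask of 077 the installed files end up readable by root only, which matches the symptom; with 002 or 022 they stay world-readable.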

Gerd

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
My OCaml site:          http://www.camlcity.org
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------



* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 16:17     ` Gerd Stolpmann
@ 2013-08-15 18:39       ` oliver
  2013-08-18 12:04         ` Adrien Nader
  0 siblings, 1 reply; 15+ messages in thread
From: oliver @ 2013-08-15 18:39 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: John Whitington, caml users

On Thu, Aug 15, 2013 at 06:17:19PM +0200, Gerd Stolpmann wrote:
> Looks like Oliver installed as root, and maybe there was a restrictive
> umask. In that case, setting a better umask (e.g. umask 002) before
> "make install" would be the solution. But that's the user's problem,
> IMHO.



Yes, of course I installed as root (via sudo).
Umask might be the reason, yes.
And I also had that problem with some other libraries.


An installation can also set the
permissions of the installed files to what is needed.
This might be added to this kind of install procedure.


Ciao,
   Oliver


* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 11:21 [Caml-list] ANN: CamlPDF 1.7 John Whitington
  2013-08-15 14:21 ` oliver
@ 2013-08-15 18:40 ` oliver
  2013-08-15 18:42   ` oliver
  2013-08-16 10:53 ` Armaël Guéneau
  2013-08-21 12:01 ` oliver
  3 siblings, 1 reply; 15+ messages in thread
From: oliver @ 2013-08-15 18:40 UTC (permalink / raw)
  To: John Whitington; +Cc: caml users

On Thu, Aug 15, 2013 at 12:21:44PM +0100, John Whitington wrote:
> Hi,
> 
> The first new release of the CamlPDF library for a while is here:
> 
> http://www.github.com/johnwhitington/camlpdf
[...]

I'm just exploring the lib a bit.

When I use Pdf.objiter and just print the index-values,
they go from highest to lowest values.
Is this intended?
I would expect to get the lowest index first.

Ciao,
   Oliver


* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 18:40 ` oliver
@ 2013-08-15 18:42   ` oliver
  0 siblings, 0 replies; 15+ messages in thread
From: oliver @ 2013-08-15 18:42 UTC (permalink / raw)
  To: John Whitington; +Cc: caml users

On Thu, Aug 15, 2013 at 08:40:45PM +0200, oliver wrote:
> On Thu, Aug 15, 2013 at 12:21:44PM +0100, John Whitington wrote:
> > Hi,
> > 
> > The first new release of the CamlPDF library for a while is here:
> > 
> > http://www.github.com/johnwhitington/camlpdf
> [...]
> 
> I'm just exploring the lib a bit.
> 
> When I use Pdf.objiter and just print the index-values,
> they go from highest to lowest values.
> Is this intended?
> I would expect to get the lowest index first.
[...]

Ah, forget that last mail...

there is also Pdf.objiter_inorder!
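
A small sketch of why an unordered iterator and an in-order one can differ. Only the names Pdf.objiter and Pdf.objiter_inorder come from this thread; the table below is a stand-in, and the idea that objects live in an unordered map is an assumption, not a statement about CamlPDF's internals.

```ocaml
(* Sketch only: a stand-in object table keyed by object number.
   Assumption: the real library keeps its objects in some unordered map;
   the strings below are placeholder object contents, not real PDF data. *)
let objects : (int, string) Hashtbl.t = Hashtbl.create 16

let () =
  List.iter
    (fun (n, body) -> Hashtbl.replace objects n body)
    [ (3, "page"); (1, "catalog"); (4, "font"); (2, "page tree") ]

(* Unordered traversal: Hashtbl.iter makes no promise about key order. *)
let iter_unordered f tbl = Hashtbl.iter f tbl

(* Ordered traversal: collect the keys, sort them, then visit in order. *)
let iter_inorder f tbl =
  Hashtbl.fold (fun n _ acc -> n :: acc) tbl []
  |> List.sort compare
  |> List.iter (fun n -> f n (Hashtbl.find tbl n))

let () =
  print_endline "unordered:";
  iter_unordered (fun n body -> Printf.printf "  %d: %s\n" n body) objects;
  print_endline "in order:";
  iter_inorder (fun n body -> Printf.printf "  %d: %s\n" n body) objects
```

The ordered variant pays for a sort, which is one plausible reason a library would offer both.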


Ciao,
   Oliver


* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 11:21 [Caml-list] ANN: CamlPDF 1.7 John Whitington
  2013-08-15 14:21 ` oliver
  2013-08-15 18:40 ` oliver
@ 2013-08-16 10:53 ` Armaël Guéneau
  2013-08-16 11:06   ` John Whitington
  2013-08-21 12:01 ` oliver
  3 siblings, 1 reply; 15+ messages in thread
From: Armaël Guéneau @ 2013-08-16 10:53 UTC (permalink / raw)
  To: caml-list


Hi,

Le 15/08/2013 13:21, John Whitington a écrit :
> The first new release of the CamlPDF library for a while is here:
>
> http://www.github.com/johnwhitington/camlpdf
>
> (Or, shortly, via OPAM.)
Thanks!

I have been playing with CamlPDF a bit, trying to do text extraction.
I'm a total novice about the PDF format, so I might be doing it wrong,
but I was wondering if there were facilities, in CamlPDF, to handle
diacritics and ligatures.

For example, when reading the PDF operators for "Université", I get

Pdfops_TJ (Pdf.Array [Pdf.String "Universit"; Pdf.String "\019"; Pdf.Real 486.; 
Pdf.String "e"])

For "efficient", with "ffi" being ligated, I get

   Pdfops_TJ (Pdf.Array [Pdf.String "e\014cient"])

How can I convert these back, especially the ligature? I tried to use the
conversion functions of Pdftext, like codepoints_of_text followed by
utf8_of_codepoints, but that didn't seem to work. It's highly possible
that I'm also doing it wrong here.

Armaël



* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-16 10:53 ` Armaël Guéneau
@ 2013-08-16 11:06   ` John Whitington
  2013-08-16 11:45     ` Armaël Guéneau
  0 siblings, 1 reply; 15+ messages in thread
From: John Whitington @ 2013-08-16 11:06 UTC (permalink / raw)
  To: Armaël Guéneau; +Cc: caml-list

Hi,

Armaël Guéneau wrote:
> Hi,
>
> Le 15/08/2013 13:21, John Whitington a écrit :
>> The first new release of the CamlPDF library for a while is here:
>>
>> http://www.github.com/johnwhitington/camlpdf
>>
>> (Or, shortly, via OPAM.)
> Thanks!
>
> I have been playing with CamlPDF a bit, trying to do text extraction.
> I'm a total novice about the PDF format, so I might be doing it wrong,
> but I was wondering if there were facilities, in CamlPDF, to handle
> diacritics and ligatures.
>
> For example, when reading the PDF operators for "Université", I get
>
> Pdfops_TJ (Pdf.Array [Pdf.String "Universit"; Pdf.String "\019";
> Pdf.Real 486.; Pdf.String "e"])

So character 19 (the "\019" escape is decimal) looks like a floating acute in that
encoding, followed by a kern of 486/1000 of a text-space unit to shift leftward,
followed by an 'e'. So, this is an accented character built by composition of glyphs.

> For "efficient", with "ffi" being ligated, I get
>
> Pdfops_TJ (Pdf.Array [Pdf.String "e\014cient"])

In the font in use here, character 14 ("\014", again decimal) appears to be a
single glyph for the ffi ligature.
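
To make the structure of these TJ arrays concrete, here is a minimal sketch. The tj_item type is hypothetical and only mirrors the shape of the values printed above; it is not CamlPDF's own type. OCaml's "\ddd" string escapes are decimal, so "\019" is byte 19 and "\014" is byte 14.

```ocaml
(* Stand-in for the two kinds of element seen in the TJ arrays above.
   This type is hypothetical; it only mirrors the shape of the printed values. *)
type tj_item =
  | Str of string (* bytes to show, still in the font's own encoding *)
  | Kern of float (* adjustment in thousandths of a text-space unit *)

(* Keep the string bytes in order and drop the kerning adjustments. *)
let raw_bytes items =
  String.concat ""
    (List.filter_map (function Str s -> Some s | Kern _ -> None) items)

(* List each byte as a decimal character code. *)
let codes s = List.map Char.code (List.of_seq (String.to_seq s))

let () =
  let universite = [ Str "Universit"; Str "\019"; Kern 486.; Str "e" ] in
  let efficient = [ Str "e\014cient" ] in
  List.iter
    (fun items ->
      codes (raw_bytes items)
      |> List.map string_of_int
      |> String.concat " "
      |> print_endline)
    [ universite; efficient ]
```

The bytes that come out are still glyph codes in the font's encoding, not Unicode; turning them into readable text needs the font's own mapping.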

> How can I convert these back, especially the ligature? I tried to use the
> conversion functions of Pdftext, like codepoints_of_text followed by
> utf8_of_codepoints, but that didn't seem to work. It's highly possible
> that I'm also doing it wrong here.

Drop me a note off-list with the example file and I'll take a look.

Text extraction is almost always possible with modern PDFs. With older 
PDFs, text extraction sometimes isn't possible. You can test this by 
trying to copy & paste the text in Adobe Reader -- if it comes out 
correct, CamlPDF will be able to extract the text too -- if not, it won't.

With Thanks,

-- 
John Whitington
Director, Coherent Graphics Ltd
http://www.coherentpdf.com/



* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-16 11:06   ` John Whitington
@ 2013-08-16 11:45     ` Armaël Guéneau
  2013-08-16 14:26       ` John Whitington
  0 siblings, 1 reply; 15+ messages in thread
From: Armaël Guéneau @ 2013-08-16 11:45 UTC (permalink / raw)
  To: caml-list

> So character 19 (the "\019" escape is decimal) looks like a floating acute in that
> encoding, followed by a kern of 486/1000 of a text-space unit to shift leftward,
> followed by an 'e'. So, this is an accented character built by composition of glyphs.
>
>> For "efficient", with "ffi" being ligated, I get
>>
>> Pdfops_TJ (Pdf.Array [Pdf.String "e\014cient"])
>
> In the font in use here, character 14 ("\014", again decimal) appears to be a
> single glyph for the ffi ligature.
Yes, ok. How do you know that? I mean, without knowing the displayed text.
Is there a way, knowing the glyph code (here, 19 or 14), to convert
it to something more "readable"? Like, hum, ['] for the floating acute, and [ffi]
for the ligature.

I tried to copy paste the text from the pdf using evince, and the floating acute
is indeed rendered separately, but the ligature is properly converted to "ffi".

I guess the interpretation of the glyph code depends on the font, but I can't
find how to do that with CamlPDF - using glyphnames_of_text returned
only "/.notdef"...


* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-16 11:45     ` Armaël Guéneau
@ 2013-08-16 14:26       ` John Whitington
  0 siblings, 0 replies; 15+ messages in thread
From: John Whitington @ 2013-08-16 14:26 UTC (permalink / raw)
  To: Armaël Guéneau; +Cc: caml-list

Hi,

Armaël Guéneau wrote:
>> So character 19 (the "\019" escape is decimal) looks like a floating acute in that
>> encoding, followed by a kern of 486/1000 of a text-space unit to shift leftward,
>> followed by an 'e'. So, this is an accented character built by composition of glyphs.
>>
>>> For "efficient", with "ffi" being ligated, I get
>>>
>>> Pdfops_TJ (Pdf.Array [Pdf.String "e\014cient"])
>>
>> In the font in use here, character 14 ("\014", again decimal) appears to be a
>> single glyph for the ffi ligature.
> Yes, ok. How do you know that? I mean, without knowing the displayed text.
> Is there a way, knowing the glyph code (here, 19 or 14), to convert
> it to something more "readable"? Like, hum, ['] for the floating acute,
> and [ffi] for the ligature.

Sometimes, sometimes not. This is what the Pdftext module does. Modern 
PDFs have a /ToUnicode entry for each font, which is a special data structure 
mapping bytes or sequences of bytes directly to Unicode codepoints or 
sequences of Unicode codepoints.

In the absence of this, one can fall back to the other parts of the font 
metadata (or even the font itself) which might give the encoding in use.
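
The lookup order described here can be sketched roughly as follows. Both tables are hypothetical stand-ins with made-up entries; in a real file they would be read from the font's /ToUnicode data and its encoding, and this is not how Pdftext is implemented.

```ocaml
(* Hypothetical per-font tables; real ones would come from the PDF file. *)
let to_unicode : (int * int list) list =
  [ (14, [ 0x66; 0x66; 0x69 ]) ] (* byte 14 -> "ffi" as three code points *)

let fallback_encoding : (int * int list) list =
  [ (19, [ 0x00B4 ]) ] (* byte 19 -> ACUTE ACCENT, assumed for illustration *)

(* Try the ToUnicode-style table first, then the fallback encoding,
   and finally pass printable ASCII through unchanged. *)
let codepoints_of_byte b =
  match List.assoc_opt b to_unicode with
  | Some cps -> Some cps
  | None -> (
      match List.assoc_opt b fallback_encoding with
      | Some cps -> Some cps
      | None -> if b >= 32 && b < 127 then Some [ b ] else None)

(* Encode code points as UTF-8; unmapped bytes become U+FFFD. *)
let utf8_of_bytes s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      let cps =
        Option.value (codepoints_of_byte (Char.code c)) ~default:[ 0xFFFD ]
      in
      List.iter (fun cp -> Buffer.add_utf_8_uchar buf (Uchar.of_int cp)) cps)
    s;
  Buffer.contents buf

let () = print_endline (utf8_of_bytes "e\014cient")
```

A byte found in neither table has no recoverable meaning, which is the situation where extraction fails.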

> I tried to copy paste the text from the pdf using evince, and the
> floating acute is indeed rendered separately, but the ligature is
> properly converted to "ffi".

> I guess the interpretation of the glyph code depends on the font, but I
> can't find how to do that with CamlPDF - using glyphnames_of_text
> returned only "/.notdef"...

So it looks like only the ffi part will be doable. I can't really 
comment more without seeing the PDF and your code...

Thanks,

-- 
John Whitington
Director, Coherent Graphics Ltd
http://www.coherentpdf.com/



* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 18:39       ` oliver
@ 2013-08-18 12:04         ` Adrien Nader
  2013-08-18 14:04           ` Florent Monnier
  0 siblings, 1 reply; 15+ messages in thread
From: Adrien Nader @ 2013-08-18 12:04 UTC (permalink / raw)
  To: oliver; +Cc: Gerd Stolpmann, John Whitington, caml users

On Thu, Aug 15, 2013, oliver wrote:
> On Thu, Aug 15, 2013 at 06:17:19PM +0200, Gerd Stolpmann wrote:
> > Looks like Oliver installed as root, and maybe there was a restrictive
> > umask. In that case, setting a better umask (e.g. umask 002) before
> > "make install" would be the solution. But that's the user's problem,
> > IMHO.
> 
> 
> 
> Yes, of course I installed as root (via sudo).
> Umask might be the reason, yes.
> And I also had that problem with some other libraries.

It's a fairly common mistake. It's not very dangerous, until you change
your libc that is. Installing through "su" or without the umask is the
right way to do it.

> An installation can also set the
> permissions of the installed files to what is needed.
> This might be added to this kind of install procedure.

It's better to just use the root account, with the right perms (which is
the default) since you'll never get full coverage.

-- 
Adrien Nader


* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-18 12:04         ` Adrien Nader
@ 2013-08-18 14:04           ` Florent Monnier
  2013-08-18 18:23             ` oliver
  0 siblings, 1 reply; 15+ messages in thread
From: Florent Monnier @ 2013-08-18 14:04 UTC (permalink / raw)
  To: caml users

2013/08/18, Adrien Nader wrote:
[...]
> It's a fairly common mistake. It's not very dangerous, until you change
> your libc that is. Installing through "su" or without the umask is the
> right way to do it.

It's even better to use "su -" so that you don't pollute the env by
importing the one from the user account.

-- 
Regards


* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-18 14:04           ` Florent Monnier
@ 2013-08-18 18:23             ` oliver
  0 siblings, 0 replies; 15+ messages in thread
From: oliver @ 2013-08-18 18:23 UTC (permalink / raw)
  To: Florent Monnier; +Cc: caml users

On Sun, Aug 18, 2013 at 04:04:47PM +0200, Florent Monnier wrote:
> 2013/08/18, Adrien Nader wrote:
> [...]
> > It's a fairly common mistake. It's not very dangerous, until you change
> > your libc that is. Installing through "su" or without the umask is the
> > right way to do it.
> 
> It's even better to use "su -" so that you don't pollute the env by
> importing the one from the user account.

Both umasks are the same.

There was a time when people used the install program
for installing files; it allows setting the permissions
of the files during the installation process.

==> man(1) install

I wonder if modern, "fancy" tools no longer allow this kind of
control, which this old tool did allow...


Ciao,
   Oliver


* Re: [Caml-list] ANN: CamlPDF 1.7
  2013-08-15 11:21 [Caml-list] ANN: CamlPDF 1.7 John Whitington
                   ` (2 preceding siblings ...)
  2013-08-16 10:53 ` Armaël Guéneau
@ 2013-08-21 12:01 ` oliver
  3 siblings, 0 replies; 15+ messages in thread
From: oliver @ 2013-08-21 12:01 UTC (permalink / raw)
  To: John Whitington; +Cc: caml users

On Thu, Aug 15, 2013 at 12:21:44PM +0100, John Whitington wrote:
> Hi,
> 
> The first new release of the CamlPDF library for a while is here:
> 
> http://www.github.com/johnwhitington/camlpdf
[...]

How can PDF actions (e.g. opening another file)
be done with this library?
I could not find a module that is used for PDF actions.

Or can this be achieved with one of the other provided modules?


Ciao,
   Oliver

