ntg-context - mailing list for ConTeXt users
* Bad PDF to text crawlers
@ 2015-08-19 21:05 Kip Warner
  2015-08-19 21:35 ` Peter Münster
  2015-08-20 17:57 ` creating multirow curly brace in tables to symbolize row span Henry House
  0 siblings, 2 replies; 5+ messages in thread
From: Kip Warner @ 2015-08-19 21:05 UTC
  To: ntg-context



Hey list,

I have an important document online that I would prefer to keep as a PDF
and not in another format. Unfortunately, bots frequently extract a text
version from it and offer that to people looking for the document, which
is beyond my control. The extracted text looks absolutely awful and has
been a major pain, leaving readers with a really bad understanding of
the document's contents.

Assuming the bots will always find the PDF, I was thinking there must be
some way, depending on how they are implemented, of tricking them into
extracting only a small invisible layer containing hidden text that
directs the reader to where the original high-quality ConTeXt PDF can be
downloaded.

Any suggestions?

-- 
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com


___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________

