List for cgit developers and users
* [PATCH v2] filters: Improved syntax-highlighting.py
@ 2014-01-13 11:02 stefan
  2014-01-13 14:21 ` Jason
  0 siblings, 1 reply; 5+ messages in thread
From: stefan @ 2014-01-13 11:02 UTC (permalink / raw)


- Switched back to python2 because of a problem in pygments with python3.
  The next release of pygments should fix this problem.
  See the issue here:
  https://bitbucket.org/birkenfeld/pygments-main/issue/901/problems-with-python3
- Just read stdin, decode it as utf-8 and ignore undecodable bytes. This
  ensures that even corrupted files do not cause any errors in the filter.
- Improved language guessing:
  -> First use guess_lexer_for_filename for better detection of the
     programming languages used (even mixed cases will be detected, e.g.
     php + html).
  -> If nothing was found, check whether there is a shebang and use guess_lexer.
  -> As default/fallback choose TextLexer.
- Using inline CSS instead of this sys.stdout.print() hack.
- Using trac theme for pygments (it is very clean and not intrusive like the
  default or pastie theme).
- I had to fix cgit.css because of an alignment issue with the line-numbers
  table.

Signed-off-by: Stefan Tatschner <stefan at sevenbyte.org>
---
 cgit.css                       |  1 +
 filters/syntax-highlighting.py | 48 +++++++++++++++++++++++++-----------------
 2 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/cgit.css b/cgit.css
index 71b0b9b..ef99b5d 100644
--- a/cgit.css
+++ b/cgit.css
@@ -289,6 +289,7 @@ div#cgit table.blob td.linenumbers {
 
 div#cgit table.blob pre {
 	padding: 0; margin: 0;
+	line-height: 125%;
 }
 
 div#cgit table.blob td.linenumbers a,
diff --git a/filters/syntax-highlighting.py b/filters/syntax-highlighting.py
index 72d9097..b95baed 100755
--- a/filters/syntax-highlighting.py
+++ b/filters/syntax-highlighting.py
@@ -1,13 +1,16 @@
-#!/usr/bin/env python3
+#!/usr/bin/env python2
 
-# This script uses Pygments and Python3. You must have both installed for this to work.
+# This script uses Pygments and Python2. You must have both installed
+# for this to work.
+#
 # http://pygments.org/
 # http://python.org/
 #
-# It may be used with the source-filter or repo.source-filter settings in cgitrc.
+# It may be used with the source-filter or repo.source-filter settings
+# in cgitrc.
 #
-# The following environment variables can be used to retrieve the configuration
-# of the repository for which this script is called:
+# The following environment variables can be used to retrieve the
+# configuration of the repository for which this script is called:
 # CGIT_REPO_URL        ( = repo.url       setting )
 # CGIT_REPO_NAME       ( = repo.name      setting )
 # CGIT_REPO_PATH       ( = repo.path      setting )
@@ -18,22 +21,29 @@
 
 
 import sys
-import cgi
-import codecs
-from pygments.lexers import get_lexer_for_filename
 from pygments import highlight
+from pygments.util import ClassNotFound
+from pygments.lexers import TextLexer
+from pygments.lexers import guess_lexer
+from pygments.lexers import guess_lexer_for_filename
 from pygments.formatters import HtmlFormatter
 
-sys.stdin = codecs.getreader("utf-8")(sys.stdin.detach())
-sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
-doc = sys.stdin.read()
+
+# Read stdin and decode it as UTF-8, ignoring any undecodable bytes.
+data = sys.stdin.read().decode(encoding='utf-8', errors='ignore')
+filename = sys.argv[1]
+formatter = HtmlFormatter(encoding='utf-8', style='trac', noclasses=True)
+
 try:
-	lexer = get_lexer_for_filename(sys.argv[1])
-	formatter = HtmlFormatter(style='pastie')
-	sys.stdout.write("<style>")
-	sys.stdout.write(formatter.get_style_defs('.highlight'))
-	sys.stdout.write("</style>")
+    lexer = guess_lexer_for_filename(filename, data, encoding='utf-8')
+except ClassNotFound:
+    # check if there is any shebang
+    if data[0:2] == '#!':
+        lexer = guess_lexer(data, encoding='utf-8')
+    else:
+        lexer = TextLexer(encoding='utf-8')
+except TypeError:
+    lexer = TextLexer(encoding='utf-8')
 
-	highlight(doc, lexer, formatter, sys.stdout)
-except:
-	sys.stdout.write(str(cgi.escape(doc).encode("ascii", "xmlcharrefreplace"), "ascii"))
+# highlight! :-)
+highlight(data, lexer, formatter, outfile=sys.stdout)
-- 
1.8.5.2
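
The language-guessing order described in the commit message can be
sketched on its own, outside the filter. This is only an illustration,
written for Python 3 and not part of the patch: decode_lossy() and
choose_lexer() are hypothetical helper names, and the lookup functions
are passed in as parameters so the fallback order can be tried without
Pygments installed. pygments_lexer() shows how the real
guess_lexer_for_filename, guess_lexer and TextLexer could be plugged in.
Unlike the patch, the sketch also falls back to plain text when the
shebang guess itself fails.

```python
def decode_lossy(raw):
    """Decode bytes as UTF-8, silently dropping undecodable bytes."""
    return raw.decode("utf-8", errors="ignore")


def choose_lexer(filename, text, by_filename, by_content, fallback,
                 not_found=Exception):
    """Pick a lexer: filename match first, then shebang guess, then fallback.

    by_filename(filename, text) and by_content(text) are expected to raise
    `not_found` when they cannot decide; fallback() must always succeed.
    """
    try:
        return by_filename(filename, text)
    except TypeError:
        # Same as the patch: a TypeError goes straight to the fallback.
        return fallback()
    except not_found:
        pass
    # No filename match: only guess from content if there is a shebang.
    if text.startswith("#!"):
        try:
            return by_content(text)
        except not_found:
            pass
    return fallback()


def pygments_lexer(filename, text):
    """Wire choose_lexer() to Pygments (imported lazily, needs Pygments)."""
    from pygments.lexers import (TextLexer, guess_lexer,
                                 guess_lexer_for_filename)
    from pygments.util import ClassNotFound
    return choose_lexer(filename, text, guess_lexer_for_filename,
                        guess_lexer, TextLexer, ClassNotFound)
```

Keeping the lookups as parameters is a choice made for this sketch only;
the filter itself calls the Pygments functions directly.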




* [PATCH v2] filters: Improved syntax-highlighting.py
  2014-01-13 11:02 [PATCH v2] filters: Improved syntax-highlighting.py stefan
@ 2014-01-13 14:21 ` Jason
       [not found]   ` <52D3FC8C.7040809@sevenbyte.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Jason @ 2014-01-13 14:21 UTC (permalink / raw)


Thanks for all your hard work on this. Sorry for the extended back and
forth. More comments, alas alas, below.

On Mon, Jan 13, 2014 at 12:02 PM, Stefan Tatschner <stefan at sevenbyte.org> wrote:
> - Using inline CSS instead of this sys.stdout.print() hack.

Please don't do this. Inline CSS makes for much bigger files, and
there's nothing in the HTML5 spec that forbids us from putting <style>
tags there in the first place.
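
The size argument can be illustrated without Pygments. The sketch below
is a toy renderer, not Pygments output: STYLES, render_inline() and
render_with_stylesheet() are hypothetical names, and the class names and
colors are made up. It only shows that inline styles repeat the full CSS
text on every token, while a <style> block pays for each rule once.

```python
from html import escape

# Hypothetical token kinds and CSS, not taken from any Pygments style.
STYLES = {
    "k": "color: #008800; font-weight: bold",
    "s": "color: #bb4444",
}


def render_inline(tokens):
    """Emit one <span style="..."> per token (CSS repeated every time)."""
    return "".join(
        '<span style="%s">%s</span>' % (STYLES[kind], escape(text))
        for kind, text in tokens
    )


def render_with_stylesheet(tokens):
    """Emit a single <style> block, then one <span class="..."> per token."""
    rules = "".join(
        ".highlight .%s { %s }\n" % (kind, css)
        for kind, css in sorted(STYLES.items())
    )
    body = "".join(
        '<span class="%s">%s</span>' % (kind, escape(text))
        for kind, text in tokens
    )
    return '<style>%s</style><div class="highlight">%s</div>' % (rules, body)


if __name__ == "__main__":
    sample = [("k", "def"), ("s", "'x'")] * 200
    print(len(render_inline(sample)), len(render_with_stylesheet(sample)))
```

For short files the fixed cost of the <style> block can outweigh the
saving; the difference grows with the number of highlighted tokens.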

> - Using trac theme for pygments (it is very clean and not intrusive like the
>   default or pastie theme).

This is probably best as a separate commit, since some folks might complain.

> - I had to fix cgit.css because of an alignment issue with the line-numbers
>   table.

This is this issue, right?
https://code.google.com/p/chromium/issues/detail?id=141945
http://data.zx2c4.com/italics-broken.png
http://data.zx2c4.com/italics-broken.html

Why does changing the line-height to 125% fix it exactly?



* [PATCH v2] filters: Improved syntax-highlighting.py
       [not found]     ` <CAHmME9quebs1U7-QxZQ7eZNS=j=dMRzr0zsqfdO=V=EM3rqPtw@mail.gmail.com>
@ 2014-01-13 14:50       ` Jason
  2014-01-13 14:53         ` stefan
  0 siblings, 1 reply; 5+ messages in thread
From: Jason @ 2014-01-13 14:50 UTC (permalink / raw)


On Mon, Jan 13, 2014 at 3:47 PM, Stefan Tatschner <stefan at sevenbyte.org> wrote:
> No, that's not the issue. By default 'pygments' adds a <pre
> style="line-height: 125%"> tag so I had to change the line-height of the
> other column as well.

Strange. It doesn't do that on the current script. Is this because of
the inline styles? In which case, since we're reverting to the <style>
way of doing things, we can just drop this part too from this commit.



* [PATCH v2] filters: Improved syntax-highlighting.py
  2014-01-13 14:50       ` Jason
@ 2014-01-13 14:53         ` stefan
  0 siblings, 0 replies; 5+ messages in thread
From: stefan @ 2014-01-13 14:53 UTC (permalink / raw)


On 13.01.2014 15:50, Jason A. Donenfeld wrote:
> On Mon, Jan 13, 2014 at 3:47 PM, Stefan Tatschner <stefan at sevenbyte.org> wrote:
>> No, that's not the issue. By default 'pygments' adds a <pre
>> style="line-height: 125%"> tag so I had to change the line-height of the
>> other column as well.
> 
> Strange. It doesn't do that on the current script. Is this because of
> the inline styles? In which case, since we're reverting to the <style>
> way of doing things, we can just drop this part too from this commit.

You're right, this is because of the inline styles. If we just write the
CSS definitions to stdout, it doesn't matter and we can remove this as
well.





* [PATCH v2] filters: Improved syntax-highlighting.py
       [not found]   ` <52D3FC8C.7040809@sevenbyte.org>
@ 2014-01-13 14:49     ` stefan
       [not found]     ` <CAHmME9quebs1U7-QxZQ7eZNS=j=dMRzr0zsqfdO=V=EM3rqPtw@mail.gmail.com>
  1 sibling, 0 replies; 5+ messages in thread
From: stefan @ 2014-01-13 14:49 UTC (permalink / raw)


On 13.01.2014 15:21, Jason A. Donenfeld wrote:
> On Mon, Jan 13, 2014 at 12:02 PM, Stefan Tatschner <stefan at sevenbyte.org> wrote:
>> - Using inline CSS instead of this sys.stdout.print() hack.
> 
> Please don't do this. Inline CSS makes for much bigger files, and
> there's nothing in the HTML5 spec that forbids us from putting <style>
> tags there in the first place.

Ok. I will revert that change. I think the best solution would be to put
the style definitions in a separate file, but I will go with the
former implementation.

>> - Using trac theme for pygments (it is very clean and not intrusive like the
>>   default or pastie theme).
> 
> This is probably best as a separate commit, since some folks might complain.

Ok. I will do this in a separate commit.

>> - I had to fix cgit.css because of an alignment issue with the line-numbers
>>   table.
> 
> This is this issue, right?
> https://code.google.com/p/chromium/issues/detail?id=141945
> http://data.zx2c4.com/italics-broken.png
> http://data.zx2c4.com/italics-broken.html
> 
> Why does changing the line-height to 125% fix it exactly?

No, that's not the issue. By default 'pygments' adds a <pre
style="line-height: 125%"> tag so I had to change the line-height of the
other column as well. To be honest, in my opinion it also looks a bit
better with that kind of expanded line-height.

Stefan



