List for cgit developers and users
* [PATCH v3 1/2] filters: Improved syntax-highlighting.py
@ 2014-01-13 21:10 stefan
  2014-01-13 21:10 ` [PATCH v3 2/2] filters: Choose 'trac' theme in pygments stefan
  2014-01-13 21:50 ` [PATCH v3 1/2] filters: Improved syntax-highlighting.py Jason
  0 siblings, 2 replies; 10+ messages in thread
From: stefan @ 2014-01-13 21:10 UTC (permalink / raw)


- Switched back to python2 because of a problem in pygments with python3.
  The next release of pygments should fix this problem.
  See the issue here:
  https://bitbucket.org/birkenfeld/pygments-main/issue/901/problems-with-python3
- Just read stdin, decode it as utf-8 and ignore any undecodable characters.
  This ensures that even corrupted files do not cause errors in the filter.
- Improved language guessing:
  -> First use guess_lexer_for_filename for better detection of the
     programming languages used (even mixed cases are detected, e.g. php + html).
  -> If nothing was found, check whether there is a shebang and use guess_lexer.
  -> As default/fallback choose TextLexer.

Signed-off-by: Stefan Tatschner <stefan at sevenbyte.org>
---
 filters/syntax-highlighting.py | 52 +++++++++++++++++++++++++++---------------
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/filters/syntax-highlighting.py b/filters/syntax-highlighting.py
index 72d9097..53081a4 100755
--- a/filters/syntax-highlighting.py
+++ b/filters/syntax-highlighting.py
@@ -1,13 +1,16 @@
-#!/usr/bin/env python3
+#!/usr/bin/env python2
 
-# This script uses Pygments and Python3. You must have both installed for this to work.
+# This script uses Pygments and Python2. You must have both installed
+# for this to work.
+#
 # http://pygments.org/
 # http://python.org/
 #
-# It may be used with the source-filter or repo.source-filter settings in cgitrc.
+# It may be used with the source-filter or repo.source-filter settings
+# in cgitrc.
 #
-# The following environment variables can be used to retrieve the configuration
-# of the repository for which this script is called:
+# The following environment variables can be used to retrieve the
+# configuration of the repository for which this script is called:
 # CGIT_REPO_URL        ( = repo.url       setting )
 # CGIT_REPO_NAME       ( = repo.name      setting )
 # CGIT_REPO_PATH       ( = repo.path      setting )
@@ -18,22 +21,33 @@
 
 
 import sys
-import cgi
-import codecs
-from pygments.lexers import get_lexer_for_filename
 from pygments import highlight
+from pygments.util import ClassNotFound
+from pygments.lexers import TextLexer
+from pygments.lexers import guess_lexer
+from pygments.lexers import guess_lexer_for_filename
 from pygments.formatters import HtmlFormatter
 
-sys.stdin = codecs.getreader("utf-8")(sys.stdin.detach())
-sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
-doc = sys.stdin.read()
+
+# read stdin and decode as utf-8. ignore any undecodable characters.
+data = sys.stdin.read().decode(encoding='utf-8', errors='ignore')
+filename = sys.argv[1]
+formatter = HtmlFormatter(encoding='utf-8', style='pastie')
+
 try:
-	lexer = get_lexer_for_filename(sys.argv[1])
-	formatter = HtmlFormatter(style='pastie')
-	sys.stdout.write("<style>")
-	sys.stdout.write(formatter.get_style_defs('.highlight'))
-	sys.stdout.write("</style>")
+    lexer = guess_lexer_for_filename(filename, data, encoding='utf-8')
+except ClassNotFound:
+    # check if there is any shebang
+    if data[0:2] == '#!':
+        lexer = guess_lexer(data, encoding='utf-8')
+    else:
+        lexer = TextLexer(encoding='utf-8')
+except TypeError:
+    lexer = TextLexer(encoding='utf-8')
 
-	highlight(doc, lexer, formatter, sys.stdout)
-except:
-	sys.stdout.write(str(cgi.escape(doc).encode("ascii", "xmlcharrefreplace"), "ascii"))
+# highlight! :-)
+# printout pygments' css definitions as well
+sys.stdout.write('<style>')
+sys.stdout.write(formatter.get_style_defs('.highlight'))
+sys.stdout.write('</style>')
+highlight(data, lexer, formatter, outfile=sys.stdout)
-- 
1.8.5.2
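
The fallback order described in the commit message can be sketched as a standalone function. This is a minimal illustration rather than the filter itself: it assumes Python 3 with Pygments installed, it takes text that has already been decoded, and the name `choose_lexer` is hypothetical.

```python
from pygments.lexers import TextLexer, guess_lexer, guess_lexer_for_filename
from pygments.util import ClassNotFound


def choose_lexer(filename, text):
    """Pick a lexer for `text` (hypothetical helper, not part of the patch).

    Order: filename plus content, then shebang-based guessing, then plain text.
    """
    try:
        # Considers the filename pattern and the content together.
        return guess_lexer_for_filename(filename, text)
    except ClassNotFound:
        # No lexer matches the filename; only guess from content if a
        # shebang line gives the guesser something reliable to work with.
        if text.startswith('#!'):
            return guess_lexer(text)
        return TextLexer()
    except TypeError:
        # Mirrors the patch: fall back to plain text on a TypeError.
        return TextLexer()
```

For example, `choose_lexer('data.unknownext', 'plain words')` should end up at `TextLexer`, because neither the filename nor a shebang identifies a language.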




* [PATCH v3 2/2] filters: Choose 'trac' theme in pygments
  2014-01-13 21:10 [PATCH v3 1/2] filters: Improved syntax-highlighting.py stefan
@ 2014-01-13 21:10 ` stefan
  2014-01-14  1:33   ` Jason
  2014-01-13 21:50 ` [PATCH v3 1/2] filters: Improved syntax-highlighting.py Jason
  1 sibling, 1 reply; 10+ messages in thread
From: stefan @ 2014-01-13 21:10 UTC (permalink / raw)


Use the trac theme for pygments. It is very clean and not as
intrusive as the default or pastie theme. In particular, I do
not like the 'pastie' theme very much because of its very
strange rendering of multiline strings (the red background
thing).

Signed-off-by: Stefan Tatschner <stefan at sevenbyte.org>
---
 filters/syntax-highlighting.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/filters/syntax-highlighting.py b/filters/syntax-highlighting.py
index 53081a4..67855d1 100755
--- a/filters/syntax-highlighting.py
+++ b/filters/syntax-highlighting.py
@@ -32,7 +32,7 @@ from pygments.formatters import HtmlFormatter
 # read stdin and decode as utf-8. ignore any undecodable characters.
 data = sys.stdin.read().decode(encoding='utf-8', errors='ignore')
 filename = sys.argv[1]
-formatter = HtmlFormatter(encoding='utf-8', style='pastie')
+formatter = HtmlFormatter(encoding='utf-8', style='trac')
 
 try:
     lexer = guess_lexer_for_filename(filename, data, encoding='utf-8')
-- 
1.8.5.2
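
To compare candidate themes locally, the CSS that the filter would emit for each style can be generated directly. A minimal sketch, assuming Python 3 with Pygments installed; `style_css` is a hypothetical helper and is not part of the patch.

```python
from pygments.formatters import HtmlFormatter
from pygments.styles import get_all_styles


def style_css(style_name, selector='.highlight'):
    """Return the CSS rules Pygments generates for one style (hypothetical helper)."""
    return HtmlFormatter(style=style_name).get_style_defs(selector)


# Names of the styles known to the installed Pygments version.
available_styles = sorted(get_all_styles())
```

Calling `style_css('pastie')` and `style_css('trac')` and diffing the results shows which rules differ between the two themes.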




* [PATCH v3 1/2] filters: Improved syntax-highlighting.py
  2014-01-13 21:10 [PATCH v3 1/2] filters: Improved syntax-highlighting.py stefan
  2014-01-13 21:10 ` [PATCH v3 2/2] filters: Choose 'trac' theme in pygments stefan
@ 2014-01-13 21:50 ` Jason
  2014-01-13 22:13   ` stefan
  1 sibling, 1 reply; 10+ messages in thread
From: Jason @ 2014-01-13 21:50 UTC (permalink / raw)


Perfect! Applied. Thanks for going through all the revisions.



* [PATCH v3 1/2] filters: Improved syntax-highlighting.py
  2014-01-13 21:50 ` [PATCH v3 1/2] filters: Improved syntax-highlighting.py Jason
@ 2014-01-13 22:13   ` stefan
  2014-01-13 22:16     ` Jason
  0 siblings, 1 reply; 10+ messages in thread
From: stefan @ 2014-01-13 22:13 UTC (permalink / raw)


On 13.01.2014 22:50, Jason A. Donenfeld wrote:
> Perfect! Applied. Thanks for going through all the revisions.

thanks as well. :)
I have one simple question (just out of curiosity; I do not want to bash
anybody).

Why did you apply my patch with tabs instead of spaces? I was wondering
because I adjusted the python script according to pep8 [1] and I'm sure
the patchfile was with spaces. Maybe you have an automatic convert script?

[1] http://www.python.org/dev/peps/pep-0008/#tabs-or-spaces


* [PATCH v3 1/2] filters: Improved syntax-highlighting.py
  2014-01-13 22:13   ` stefan
@ 2014-01-13 22:16     ` Jason
  2014-01-13 22:19       ` stefan
  0 siblings, 1 reply; 10+ messages in thread
From: Jason @ 2014-01-13 22:16 UTC (permalink / raw)


On Mon, Jan 13, 2014 at 11:13 PM, Stefan Tatschner <stefan at sevenbyte.org> wrote:
> Why did you apply my patch with tabs instead of spaces? I was wondering
> because I adjusted the python script according to pep8 [1] and I'm sure
> the patchfile was with spaces. Maybe you have an automatic convert script?

I like tabs. They're for tabbing. There's a character for doing the
thing that they do. I like to use that character. The rest of cgit is
the same way. So, python scripts in the tree follow suit.



* [PATCH v3 1/2] filters: Improved syntax-highlighting.py
  2014-01-13 22:16     ` Jason
@ 2014-01-13 22:19       ` stefan
  2014-02-26 17:34         ` Jason
  0 siblings, 1 reply; 10+ messages in thread
From: stefan @ 2014-01-13 22:19 UTC (permalink / raw)


On 13.01.2014 23:16, Jason A. Donenfeld wrote:
> On Mon, Jan 13, 2014 at 11:13 PM, Stefan Tatschner <stefan at sevenbyte.org> wrote:
>> Why did you apply my patch with tabs instead of spaces? I was wondering
>> because I adjusted the python script according to pep8 [1] and I'm sure
>> the patchfile was with spaces. Maybe you have an automatic convert script?
> 
> I like tabs. They're for tabbing. There's a character for doing the
> thing that they do. I like to use that character. The rest of cgit is
> the same way. So, python scripts in the tree follow suit.

got it. :)



* [PATCH v3 2/2] filters: Choose 'trac' theme in pygments
  2014-01-13 21:10 ` [PATCH v3 2/2] filters: Choose 'trac' theme in pygments stefan
@ 2014-01-14  1:33   ` Jason
  2014-01-14 10:39     ` stefan
  0 siblings, 1 reply; 10+ messages in thread
From: Jason @ 2014-01-14  1:33 UTC (permalink / raw)


Personally, I think the trac colors are a bit ugly. I like pastie
best. But this is just preference.

Here's a comparison site:
http://blog.favrik.com/2011/02/22/preview-all-pygments-styles-for-your-code-highlighting-needs/

If folks want to take some kind of vote, I'll go with majority opinion.

pastie: 1
trac: 1



* [PATCH v3 2/2] filters: Choose 'trac' theme in pygments
  2014-01-14  1:33   ` Jason
@ 2014-01-14 10:39     ` stefan
  0 siblings, 0 replies; 10+ messages in thread
From: stefan @ 2014-01-14 10:39 UTC (permalink / raw)


On 14.01.2014 02:33, Jason A. Donenfeld wrote:
> Personally, I think the trac colors are a bit ugly. I like pastie
> best. But this is just preference.

I personally also like the 'default' theme. It is pretty much the same as
'pastie' but it does not add a background color to strings. This
background color looks really strange with multiline strings. It might
be another solution. If we use 'default' I think we will have to
override the background color of <div class="highlight"> to white.
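
One way to get a white background would be to append a rule after the generated style definitions, so that it takes precedence over the style's own background rule. A minimal sketch, assuming Python 3 with Pygments installed; the helper name `css_with_white_background` is hypothetical and nothing like it exists in the filter.

```python
from pygments.formatters import HtmlFormatter


def css_with_white_background(style_name='default', selector='.highlight'):
    """Generate the style's CSS and append a white-background override.

    Hypothetical helper for illustration. The appended rule uses the same
    selector as the generated one and comes later, so it wins the cascade.
    """
    css = HtmlFormatter(style=style_name).get_style_defs(selector)
    return css + '\n%s { background: #ffffff; }\n' % selector
```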


* [PATCH v3 1/2] filters: Improved syntax-highlighting.py
  2014-01-13 22:19       ` stefan
@ 2014-02-26 17:34         ` Jason
  2014-02-26 18:05           ` stefan
  0 siblings, 1 reply; 10+ messages in thread
From: Jason @ 2014-02-26 17:34 UTC (permalink / raw)


Hey Stefan,

Can you keep the list up to date with pygments upstream? Has that
python3 change made it through? Before 0.10.1 release (probably
tomorrow), should we switch the highlight script back to python3? Or
is pygments upstream still not released?

Thanks,
Jason



* [PATCH v3 1/2] filters: Improved syntax-highlighting.py
  2014-02-26 17:34         ` Jason
@ 2014-02-26 18:05           ` stefan
  0 siblings, 0 replies; 10+ messages in thread
From: stefan @ 2014-02-26 18:05 UTC (permalink / raw)


Hi Jason,

On 26.02.2014 18:34, Jason A. Donenfeld wrote:
> Can you keep the list up to date with pygments upstream? Has that
> python3 change made it through? Before 0.10.1 release (probably
> tomorrow), should we switch the highlight script back to python3? Or
> is pygments upstream still not released?

I am keeping an eye on pygments upstream. :) If there are any changes I will
submit a patch and port the script to python3. Currently pygments 1.7 is
under development [1] and I have no idea when it will be ready...

[1] http://pygments.org/docs/changelog/#version-1-7

Stefan
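
A port to python3 would mostly have to change how stdin is read, because the patch calls `.decode()` on the result of `sys.stdin.read()`, which only works on Python 2 byte strings. A minimal sketch of the Python 3 equivalent, assuming the same policy of ignoring undecodable bytes is kept; `read_source` is a hypothetical name and this is not code from this thread.

```python
import io
import sys


def read_source(stream):
    """Read raw bytes from a binary stream and decode them as UTF-8.

    Hypothetical helper. Bytes that are not valid UTF-8 are dropped,
    so corrupted input cannot raise a UnicodeDecodeError.
    """
    return stream.read().decode('utf-8', errors='ignore')


# In a filter script the binary stdin would be passed in:
#     data = read_source(sys.stdin.buffer)
sample = read_source(io.BytesIO(b'caf\xc3\xa9 \xff!'))
```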


Thread overview: 10+ messages
2014-01-13 21:10 [PATCH v3 1/2] filters: Improved syntax-highlighting.py stefan
2014-01-13 21:10 ` [PATCH v3 2/2] filters: Choose 'trac' theme in pygments stefan
2014-01-14  1:33   ` Jason
2014-01-14 10:39     ` stefan
2014-01-13 21:50 ` [PATCH v3 1/2] filters: Improved syntax-highlighting.py Jason
2014-01-13 22:13   ` stefan
2014-01-13 22:16     ` Jason
2014-01-13 22:19       ` stefan
2014-02-26 17:34         ` Jason
2014-02-26 18:05           ` stefan
