List for cgit developers and users
* syntax-highlighting.py
@ 2013-11-04 20:46 stefan
  2014-01-08 14:56 ` RESEND: syntax-highlighting.py stefan
  0 siblings, 1 reply; 7+ messages in thread
From: stefan @ 2013-11-04 20:46 UTC (permalink / raw)


Hi there,
I wrote a Python script for cgit's syntax-highlighting filter that
solves a few problems with the script shipped in cgit [1]. I have been
using it in production for half a year.

- It uses Python 2 because there are still a lot of problems with
  Python 3 and Pygments out there. [2], [3]
- I use guess_lexer_for_filename for better detection of the
  language (e.g. mixed HTML and PHP, HTML (Django)...).
  guess_lexer_for_filename does not work with Python 3 yet.
- If the script cannot determine the language, it checks whether there
  is a shebang line.
- It maps cmakelists.txt and pkgbuild (Arch Linux package system) files
  to the correct lexer. Other filenames can be added easily.
- The CSS has to be defined in the CSS file to avoid inline CSS.
- `sys.stdin.read().decode(encoding='utf-8', errors='ignore')` is used
  so that the script keeps working even on files with a corrupted
  encoding.
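
The selection order described in the list above can be sketched roughly
as follows. This is not the script from [1], only a minimal Python 3
sketch under a few assumptions: a current Pygments in which
guess_lexer_for_filename works on Python 3, and the filename mapping
being consulted first. The two mapping tables and the helper names
`decode_input`, `alias_from_shebang` and `choose_lexer` are hypothetical;
only `guess_lexer_for_filename`, `get_lexer_by_name` and `ClassNotFound`
are real Pygments names.

```python
import re

# Hypothetical table: lower-cased file name -> Pygments lexer alias.
FILENAME_ALIASES = {"pkgbuild": "bash", "cmakelists.txt": "cmake"}

# Hypothetical table: interpreter named in a shebang -> Pygments lexer alias.
SHEBANG_ALIASES = {"sh": "bash", "bash": "bash", "python": "python", "perl": "perl"}


def decode_input(raw):
    """Decode bytes as UTF-8, silently dropping bytes that are not valid."""
    return raw.decode("utf-8", errors="ignore")


def alias_from_shebang(text):
    """Return a lexer alias based on the first line's shebang, or None."""
    first_line = text.split("\n", 1)[0]
    match = re.match(r"#!\s*(\S+)(?:\s+(\S+))?", first_line)
    if match is None:
        return None
    interpreter = match.group(1).rsplit("/", 1)[-1]
    # "#!/usr/bin/env python3" names the real interpreter as the argument.
    if interpreter == "env" and match.group(2):
        interpreter = match.group(2)
    # Drop a trailing version suffix such as the "3" in "python3".
    interpreter = re.sub(r"[\d.]+$", "", interpreter)
    return SHEBANG_ALIASES.get(interpreter)


def choose_lexer(filename, text):
    """Pick a lexer: filename table, then Pygments' guess, then shebang."""
    # Imported lazily so the helpers above can be used without Pygments.
    from pygments.lexers import get_lexer_by_name, guess_lexer_for_filename
    from pygments.util import ClassNotFound

    basename = filename.rsplit("/", 1)[-1].lower()
    alias = FILENAME_ALIASES.get(basename)
    if alias is None:
        try:
            return guess_lexer_for_filename(filename, text)
        except ClassNotFound:
            # Last resort: shebang line, otherwise plain text.
            alias = alias_from_shebang(text) or "text"
    return get_lexer_by_name(alias)
```

A filter would then call something like
`choose_lexer(sys.argv[1], decode_input(sys.stdin.buffer.read()))`.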

If you like my implementation I would be happy to create a patchfile. :)

Stefan

Links:
[1]:
https://github.com/statschner/cgit_stuff/blob/4f847ddcd5aa7de0948bdb6b200a966b6389d94a/cgit_pygments.py
[2]:
https://bitbucket.org/birkenfeld/pygments-main/issue/901/problems-with-python3
[3]:
https://bitbucket.org/birkenfeld/pygments-main/issue/847/test-failures-with-python-33


* RESEND: syntax-highlighting.py
  2013-11-04 20:46 syntax-highlighting.py stefan
@ 2014-01-08 14:56 ` stefan
  2014-01-08 15:14   ` Jason
  0 siblings, 1 reply; 7+ messages in thread
From: stefan @ 2014-01-08 14:56 UTC (permalink / raw)


On 04.11.2013 21:46, Stefan Tatschner wrote:
Hi there,
I wrote a Python script for cgit's syntax-highlighting filter that
solves a few problems with the script shipped in cgit [1]. I have been
using it in production for half a year.

- It uses Python 2 because there are still a lot of problems with
  Python 3 and Pygments out there. [2], [3]
- I use guess_lexer_for_filename for better detection of the
  language (e.g. mixed HTML and PHP, HTML (Django)...).
  guess_lexer_for_filename does not work with Python 3 yet.
- If the script cannot determine the language, it checks whether there
  is a shebang line.
- It maps cmakelists.txt and pkgbuild (Arch Linux package system) files
  to the correct lexer. Other filenames can be added easily.
- The CSS has to be defined in the CSS file to avoid inline CSS.
- `sys.stdin.read().decode(encoding='utf-8', errors='ignore')` is used
  so that the script keeps working even on files with a corrupted
  encoding.

If you like my implementation I would be happy to create a patchfile. :)

Stefan

Links:
[1]:
http://cgit.sevenbyte.org/cgit_stuff/tree/cgit_pygments.py?id=4f847ddcd5aa7de0948bdb6b200a966b6389d94a
[2]:
https://bitbucket.org/birkenfeld/pygments-main/issue/901/problems-with-python3
[3]:
https://bitbucket.org/birkenfeld/pygments-main/issue/847/test-failures-with-python-33



* RESEND: syntax-highlighting.py
  2014-01-08 14:56 ` RESEND: syntax-highlighting.py stefan
@ 2014-01-08 15:14   ` Jason
       [not found]     ` <CAHmME9p-89eXYBW0V-nn32XNrkFadFHD6r==qDjf9Pwk+jDNPg@mail.gmail.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Jason @ 2014-01-08 15:14 UTC (permalink / raw)


Hi Stefan,

We ship with this script already:
http://git.zx2c4.com/cgit/tree/filters/syntax-highlighting.py

It's pretty similar. Yours might have some fixes or bells and whistles
that mine does not? If so, would you mind submitting a patch that
improves the one we already have?

Thanks,
Jason



* RESEND: syntax-highlighting.py
       [not found]     ` <CAHmME9p-89eXYBW0V-nn32XNrkFadFHD6r==qDjf9Pwk+jDNPg@mail.gmail.com>
@ 2014-01-10 13:07       ` stefan
  2014-01-10 16:19         ` Jason
  0 siblings, 1 reply; 7+ messages in thread
From: stefan @ 2014-01-10 13:07 UTC (permalink / raw)


On 08.01.2014 16:19, Jason A. Donenfeld wrote:
> Okay, reading this more closely, it seems the in-tree one could benefit from:
> 
> - Expanded list of filename mappings, made more generic than what you
> have in your script, but basically the same idea. -- { "pkgbuild":
> "bashlexer", "cmakelists.txt": "cmakelexer" }. Is there a way to do
> this that ties directly into Pygments' guess_lexer_for_filename? If
> not, could you submit a patch upstream?

I have created an upstream pull request and it got merged after 30
seconds [1]. Moreover it seems like the cmakelists.txt thing was also
fixed [2].

Should I remove the filename mapping thing in my patch according to
upstream or should I keep this until the next pygments release?


[1]
https://bitbucket.org/birkenfeld/pygments-main/commits/7cc1e6d0143445302445360fb8b963a7644f1912
[2]
https://bitbucket.org/birkenfeld/pygments-main/src/8fda165f5da04bdf5040fc7b8d3e1589d8fc1b4f/pygments/lexers/text.py?at=default#cl-1609


* RESEND: syntax-highlighting.py
  2014-01-10 13:07       ` stefan
@ 2014-01-10 16:19         ` Jason
  0 siblings, 0 replies; 7+ messages in thread
From: Jason @ 2014-01-10 16:19 UTC (permalink / raw)


On Fri, Jan 10, 2014 at 2:07 PM, Stefan Tatschner <stefan at sevenbyte.org> wrote:
> I have created an upstream pull request and it got merged after 30
> seconds [1]. Moreover it seems like the cmakelists.txt thing was also
> fixed [2].

Nice! Great!

>
> Should I remove the filename mapping thing in my patch according to
> upstream or should I keep this until the next pygments release?

Yea, go ahead and remove that, and resubmit. When pygments releases,
it'll automatically have that improved functionality.



* RESEND: syntax-highlighting.py
  2014-01-08 16:28   ` Jason
@ 2014-01-09 19:52     ` stefan
  0 siblings, 0 replies; 7+ messages in thread
From: stefan @ 2014-01-09 19:52 UTC (permalink / raw)


On 08.01.2014 17:28, Jason A. Donenfeld wrote:
>> What do you want to do with the CSS definitions? Just put them into the
>> default cgit CSS file or maybe include them separately?
> 
> We already handle that on line 34 of [1], so no need to do anything
> differently. Please base your work off [1] generally.
>
> [1] http://git.zx2c4.com/cgit/tree/filters/syntax-highlighting.py

Actually I do not like this implementation of CSS output in the script.
In the final HTML output there is something like this:

<td class='lines'><pre><code><style>.highlight .hll { background-color: #ffffcc }
.highlight  { background: #ffffff; }
.highlight .c { color: #888888 } /* Comment */
.highlight .err { color: #a61717; background-color: #e3d2d2 }

[...]

</style><div class="highlight"><pre>

As you can see, the CSS definitions for the syntax highlighting end up
inside a table, inside a pre, inside a code tag. I do not think this is
the right place for CSS definitions.
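
A minimal sketch of the alternative, assuming a current Pygments on
Python 3: generate the rules once and ship them in a stylesheet, and
let the filter emit only class-based markup. The function names
`stylesheet_text` and `highlight_without_style` are hypothetical;
`HtmlFormatter`, `get_style_defs`, `get_lexer_by_name` and `highlight`
are the real Pygments API.

```python
def stylesheet_text(selector=".highlight"):
    """Return the CSS rules that would otherwise be printed inline."""
    # Imported lazily so this module can be loaded without Pygments.
    from pygments.formatters import HtmlFormatter

    return HtmlFormatter().get_style_defs(selector)


def highlight_without_style(text, lexer_alias="python"):
    """Return highlighted HTML that relies on an external stylesheet."""
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import get_lexer_by_name

    # The default HtmlFormatter uses CSS classes and emits no <style> tag.
    return highlight(text, get_lexer_by_name(lexer_alias), HtmlFormatter())


if __name__ == "__main__":
    # Run once and append the output to the site's CSS file.
    print(stylesheet_text())
```

The output of `stylesheet_text()` would be added to the CSS file that
cgit already serves, so the generated page carries no style block.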

What do you think?

Stefan


* RESEND: syntax-highlighting.py
  2014-01-08 16:22 ` Fwd: " stefan
@ 2014-01-08 16:28   ` Jason
  2014-01-09 19:52     ` stefan
  0 siblings, 1 reply; 7+ messages in thread
From: Jason @ 2014-01-08 16:28 UTC (permalink / raw)


On Wed, Jan 8, 2014 at 5:22 PM, Stefan Tatschner <stefan at sevenbyte.org> wrote:
> I have looked at the Pygments source code and found this [0]. I just
> have to add the filenames to the lexer definitions. I will try to create
> an upstream pull request. The only problem is that the latest release of
> Pygments is almost 1 year old. I think it will take a while until the
> changes get merged...
>
> As a workaround we could create such a dict as you suggested and check
> if the filenames match.

Another workaround might be just to monkey patch _mapping.LEXERS. Any
of these possibilities is probably fine.


> First I run guess_lexer_for_filename (line 26). It checks the content
> of the file and also looks at the filename. After that I check whether
> there is a shebang, since the script has already looked at the content
> with guess_lexer_for_filename. I have tested this without the shebang
> detection and especially with plaintext files it often returns crap...

Okay fair enough. Shebang guessing only.


> It will take a few days because currently I have exams... Should I work
> on the master branch or something else?

Yes the master branch.

>
> What do you want to do with the CSS definitions? Just put them into the
> default cgit CSS file or maybe include them separately?

We already handle that on line 34 of [1], so no need to do anything
differently. Please base your work off [1] generally.


Thanks a bunch! Looking forward to seeing the results.

Jason

[1] http://git.zx2c4.com/cgit/tree/filters/syntax-highlighting.py


