Hi. I want to convert the HTML file file1.html to file1.tex. The HTML file contains &nbsp; entities.

How can I write a Python script as a filter to remove non-breaking spaces (&nbsp;)?

This is my code:

    #!/usr/bin/env python

    """
    Pandoc filter for removing the &nbsp; string from the text
    """

    from pandocfilters import toJSONFilter, Para


    def debug(content):
        file = open('debug.txt', 'w')
        for item in content:
            file.write("%s\n" % item)


    def nbsp(key, value, format, meta):
        uniString = unicode(value, "UTF-8")
        uniString = value.replace("&nbsp;", " ")
        return uniString


    if __name__ == "__main__":
        toJSONFilter(nbsp)

But running the command:

    pandoc file1.html --filter ./nbsp.py -o file1.tex

gives me errors.
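
For reference, here is a minimal sketch of what I think the action function may need to look like. It rests on two assumptions I have not verified: that pandoc's HTML reader has already decoded &nbsp; into the Unicode character U+00A0 by the time the filter runs, and that the text lives in Str nodes whose value is already a string. The constant name NBSP is my own. The returned dict is meant to match what pandocfilters.Str(...) builds.

```python
# -*- coding: utf-8 -*-
NBSP = u"\u00a0"  # assumed: the character pandoc produces for &nbsp;


def nbsp(key, value, format, meta):
    """Action for pandocfilters.toJSONFilter: replace U+00A0 in Str nodes."""
    # Only Str nodes carry plain text; for other keys (Para, Emph, ...)
    # the value is a list or dict, so string methods do not apply.
    if key == "Str" and NBSP in value:
        # Return a replacement AST node rather than a bare string.
        # This dict has the same shape as pandocfilters.Str(...).
        return {"t": "Str", "c": value.replace(NBSP, u" ")}
    # Returning None leaves the node unchanged.
    return None
```

If that is right, passing this nbsp to toJSONFilter in place of my version above should work with the same pandoc command, and the unicode(...) call would not be needed at all.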
