On Jul 9, 2009, at 3:02 AM, Ethan Grammatikidis wrote:
; hget http://google.gr/
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-7">
i'm pretty sure that ISO-8859-7 != utf-8.
I guess that's server-side mucking about based on user-agent not reporting utf-8 capability or something stupid. Firefox page info feature reports the page as utf-8, and on inspection of the source:
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=UTF-8">
I wonder if there's some 'preferred encoding' message the UA can send to the server.
Accept-Charset is the HTTP header that you want, but to do it `right' you probably want to muck about with HTTP's q-value weighting system. The short version is that you have to tell the server you're OK with UTF-8, or it will fall back on its best-guess techniques, with a default fallback of an ISO-8859 charset.
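
For illustration only, here is a rough Python sketch of what such a header might look like with q-values. The helper name accept_charset and the particular weights are my own assumptions, not anything hget does; the request is only built, never sent.

```python
from urllib.request import Request

def accept_charset(prefs):
    """Format (charset, q) pairs as an Accept-Charset header value.

    Hypothetical helper: a q of 1 is the default, so it is omitted.
    """
    parts = []
    for charset, q in prefs:
        if q >= 1:
            parts.append(charset)
        else:
            parts.append(f"{charset};q={q:g}")
    return ", ".join(parts)

# Prefer UTF-8, tolerate ISO-8859-7, accept anything else as a last resort.
# The weights are arbitrary examples.
header = accept_charset([("utf-8", 1.0), ("iso-8859-7", 0.5), ("*", 0.1)])

# Build (but do not send) a request carrying the header.
req = Request("http://google.gr/", headers={"Accept-Charset": header})
print(req.get_header("Accept-charset"))
```

Whether a given server honours the header is up to the server, so treat the reply's declared charset as the final word.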
*Chad