Arc Forum
Fix: UTF-8 in app server
16 points by olavk 6165 days ago | 4 comments
The web server does not seem to support non-ASCII text correctly, but that should be very easy to fix. E.g. if you write

   (defop hello req (pr "hello world \u263B"))
You get some strange-looking text in your browser. This seems to be because Arc is generating UTF-8 encoded output (which I think is MzScheme's default encoding) but not declaring the encoding in the HTTP header, which makes most browsers default to interpreting it as ISO-8859-1.
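
To make the mismatch concrete, here is a minimal sketch in Python (not Arc, and not part of the server) of what the browser presumably does: the UTF-8 bytes for the example string are decoded as ISO-8859-1. The variable names are only illustrative.

```python
# The string from the defop example, encoded the way the server emits it.
text = "hello world \u263B"
wire_bytes = text.encode("utf-8")

# With no declared charset, a browser may fall back to ISO-8859-1,
# where each of the three bytes of the smiley becomes its own character.
misread = wire_bytes.decode("iso-8859-1")

# Decoding with the encoding that was actually used recovers the text.
correct = wire_bytes.decode("utf-8")

print(repr(misread))
print(repr(correct))
```

The single character U+263B takes three bytes in UTF-8, so the misread version shows three unrelated characters in its place.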

It can be fixed by changing srv.arc line 105 to

    Content-Type: text/html;charset=utf-8
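
As a rough check of what a client can read from the two header values, the following Python sketch parses each one with the standard library's `email.message.Message`. The helper name `declared_charset` is made up for illustration; real browsers use their own parsers, so this only approximates their behavior.

```python
from email.message import Message

def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

# Without the parameter there is nothing for the client to go on.
print(declared_charset("text/html"))

# With the proposed header the encoding is explicit.
print(declared_charset("text/html;charset=utf-8"))
```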


2 points by byronsalty 6164 days ago | link

So should this be a srv startup option (to support utf-8)? And then all Content-Type: text/html should become Content-Type: text/html;charset=utf-8?

That would be easy to add to the header stuff I was working on yesterday. Can probably do that tonight.

-----

2 points by olavk 6164 days ago | link

Cool. I don't think it should be an option though, since the server generates utf-8 anyway - it just doesn't label it correctly. I can't imagine when it would be useful _not_ to indicate the encoding.

-----

7 points by kens 6164 days ago | link

Not indicating the encoding leaves you vulnerable to an XSS attack. For instance, the following looks harmless, but if you don't set the encoding explicitly it can get executed if your browser is set to UTF-7, or auto-detects to UTF-7:

+ADw-script+AD4-alert('XSS')+ADw-/script+AD4-

Edit to add some explanation: if displayed as UTF-7, the above will pop up an "XSS" alert box. It's just an example; it doesn't actually do anything bad but it shows the potential for malicious XSS. A key point is that HTML-escaping your output or filtering out HTML tags isn't enough, since innocuous-looking characters can cause problems if the encoding is misinterpreted.
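
A small Python sketch of the same point, assuming an escaper that only rewrites `&`, `<` and `>` (here `html.escape` with `quote=False`): the payload passes through untouched, yet decoding the bytes as UTF-7 yields a script tag. This illustrates the encoding issue only and does not model any particular browser.

```python
import html

payload = "+ADw-script+AD4-alert('XSS')+ADw-/script+AD4-"

# The payload contains no &, < or >, so this escaper has nothing to change.
escaped = html.escape(payload, quote=False)

# If the bytes are interpreted as UTF-7, the +...- runs decode to < and >.
as_utf7 = escaped.encode("ascii").decode("utf-7")

print(escaped == payload)
print(as_utf7)
```

Declaring the charset in the header removes the guesswork that makes this reinterpretation possible.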

-----

1 point by sacado 6165 days ago | link

Yep, it's working! Thanks!

-----