JavaMail

JavaMail's character set support is janky. Is this just because it predates the whole InputStream/OutputStream vs Reader/Writer rationalisation? If you blithely write:

message.setContent("this is a string, it cost £20 to write", "text/plain");


It may emit quoted-printable (QP) content like so:

------=_Part_0_15040737.1106653904705
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

this is a string, it cost =A320 to write


which is broken, argh: =A3 is the Latin-1 byte for £, whereas UTF-8 would have produced =C2=A3. The "UTF-8" in the charset parameter is just taken from the system property mail.mime.charset, or failing that file.encoding, but the content has been encoded as Latin-1. To fix it, I had to write:

message.setContent(new String(bodyString.getBytes("UTF-8"), "ISO-8859-1"), "text/plain");


which is just ridiculous. Between this, the pre-Collections API and other nastiness, I'm seriously considering writing my own mail-encoding routines, although I'm not really happy about it. Last time I tried to use GNUmail, it had other bugs, which was also frustrating.
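
To see why the double-conversion trick works, here is a minimal sketch using only the JDK, with no JavaMail involved. The class name CharsetMismatchDemo and the helpers toQuotedPrintable and smuggleUtf8 are made up for illustration, and the QP encoder is a simplification (no line wrapping or soft line breaks), not JavaMail's actual encoder. It assumes, as the output above suggests, that the body ends up being encoded as Latin-1 whatever the header claims.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Simplified quoted-printable rendering (hypothetical helper):
    // spaces and printable ASCII other than '=' pass through, everything else becomes =XX.
    static String toQuotedPrintable(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xFF;
            boolean literal = (v == ' ') || (v >= 33 && v <= 126 && v != '=');
            if (literal) {
                sb.append((char) v);
            } else {
                sb.append(String.format("=%02X", v));
            }
        }
        return sb.toString();
    }

    // The workaround from the post: take the UTF-8 bytes and pretend each byte
    // is one Latin-1 character, so a later Latin-1 encode reproduces the UTF-8 bytes.
    static String smuggleUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String body = "cost \u00A320"; // \u00A3 is the pound sign

        // What was emitted: Latin-1 bytes under a UTF-8 label.
        System.out.println(toQuotedPrintable(body.getBytes(StandardCharsets.ISO_8859_1)));
        // prints "cost =A320"

        // What a UTF-8 label should carry.
        System.out.println(toQuotedPrintable(body.getBytes(StandardCharsets.UTF_8)));
        // prints "cost =C2=A320"

        // The workaround: the Latin-1 encode now yields the UTF-8 byte sequence.
        System.out.println(toQuotedPrintable(
                smuggleUtf8(body).getBytes(StandardCharsets.ISO_8859_1)));
        // prints "cost =C2=A320"
    }
}
```

The trick only holds while the library really does encode as Latin-1; if it ever starts honouring the declared charset, the smuggled string gets encoded twice. Naming the charset explicitly, for example via MimeMessage.setText(String, String) or a content type of "text/plain; charset=UTF-8", is probably the less fragile route, though that has not been checked against the JavaMail version discussed here.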

Comments

inferis
Jan. 25th, 2005 12:39 pm (UTC)
Character encoding is still one of those strange arcane arts to me. I'm quite aware of the uses, but how does one actually determine the encoding on a Windows machine? I've *never* seen a tool for this...
araqnid
Feb. 19th, 2005 12:53 pm (UTC)
Well, it should presumably be part of the territory configuration: e.g. GB English is CodePage 1252 or whatever. No, I don't know where it's configured either :| Eclipse certainly seems to be able to figure out that the host encoding on my Windows machine here is Cp1252, though.

I thought Windows had lots of duplicated system calls, one "encoded" version and one Unicode version for each? I do seem to remember that recent filesystems (NTFS and Joliet) specify what encoding filenames are in.

My preference for mail would be to always encode things as Unicode, but apparently that's a very Euro-centric view and JP/KR/CN users don't like Unicode as much as we do :}
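
On the question of finding out the host encoding: from Java, at least, the JVM will report what it believes the platform default is, which is presumably what Eclipse is reading. A rough sketch follows; the class name HostEncoding and its two helper methods are made up for illustration. Note the assumption that newer JVMs (18 and later) default to UTF-8 regardless of the host, and report the host's own encoding separately through the native.encoding property, which older JVMs may not set at all.

```java
import java.nio.charset.Charset;

public class HostEncoding {

    // The charset the JVM uses when none is given explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    // The host's own encoding, where the JVM exposes it; older JVMs may not set this property.
    static String nativeEncodingOrUnknown() {
        return System.getProperty("native.encoding", "(not reported by this JVM)");
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + defaultCharsetName());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        System.out.println("native.encoding: " + nativeEncodingOrUnknown());
    }
}
```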

Profile

araqnid
Steve Haslam