Encoding question

0 votes
asked Aug 26 by john-r-3444 (120 points)

My question is very similar to this one: Why do I see strange or non-printable characters in my message (e.g. "Ã" or a "�")?

I see many "â—�S" and " —Â" strings and a smaller number of mangled degree and copyright symbols. These characters appear in the output of source.getNode(). In the XML that they are pulled from, they appear properly as bullets, degree symbols, dashes (possibly em dashes) and copyright characters.

I have found that I can replace the garbled strings with the proper ones, but it seems silly to try to do this for every possible problematic character. Can I force an encoding so that most of these are fixed?

1 Answer

0 votes

You can set the encoding on the source node:

You can also set a preprocess script to convert the encoding to UTF-8:
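
As a rough illustration of why these strings appear and what a conversion step undoes, here is a generic Python sketch. It is not the engine's own scripting API. `fix_mojibake` is a hypothetical helper, and Windows-1252 as the mistakenly applied encoding is an assumption based on the symptoms described in the question.

```python
def fix_mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Undo one round of UTF-8 bytes having been decoded as `wrong_encoding`.

    Hypothetical helper; the default wrong encoding is an assumption.
    """
    try:
        # Recover the original bytes, then decode them as the UTF-8 they really were.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not a clean round trip (already correct, or damaged): leave unchanged.
        return text


# How the garbling arises: a bullet (U+2022) stored as UTF-8 but read as Windows-1252.
garbled_bullet = "\u2022".encode("utf-8").decode("cp1252")

# Reversing the mistake restores the bullet.
restored_bullet = fix_mojibake(garbled_bullet)
```

Strings that already contain the replacement character "�" have lost bytes and cannot be restored this way, which is why reading the data with the correct encoding in the first place is preferable to repairing it afterwards.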


answered Aug 26 by brandon-w-8204 (26,350 points)