My question is very similar to this one: Why do I see strange or non-printable characters in my message (e.g. "Ã" or a "�")?
I see many "â—�S" and " —Â" sequences, and a smaller number of mangled degree and copyright symbols. These characters appear in the output of source.getNode(). In the XML they are pulled from, they appear correctly as bullets, degree symbols, dashes (possibly em dashes), and copyright characters.
I have found that I can replace the ugly strings with the proper ones, but it seems silly to do this for every possible problematic character. Can I force an encoding so that most of these are fixed?
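
To illustrate what I suspect is happening, here is a minimal Python sketch. It is not my actual code: it assumes the XML is stored as UTF-8 and is being decoded somewhere as Windows-1252 (cp1252), which I have not confirmed. The sample text and the helper names `read_as` and `repair` are made up for the illustration.

```python
# Hypothetical sample text: bullet, degree sign, em dash, copyright sign.
# Written with escapes so this snippet itself cannot be mis-encoded.
SAMPLE = "\u2022 20\u00b0C \u2014 \u00a9"


def read_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes with an explicitly chosen encoding."""
    return raw.decode(encoding)


def repair(mangled: str, wrong: str = "cp1252", right: str = "utf-8") -> str:
    """Undo one round of wrong decoding.

    Only works if the assumed encodings are correct and no bytes were
    lost along the way; otherwise encode() or decode() raises an error.
    """
    return mangled.encode(wrong).decode(right)


raw = SAMPLE.encode("utf-8")       # bytes as they would sit in the XML file

mangled = read_as(raw, "cp1252")   # wrong encoding: produces the ugly strings
correct = read_as(raw, "utf-8")    # forcing the right encoding at read time
fixed = repair(mangled)            # after-the-fact repair of mangled text
```

If that assumption holds, forcing the encoding at the point where the bytes are first decoded (the `correct` line) looks cleaner than repairing afterwards, and either one would avoid a per-character replace table. Strings that already contain "�" have probably lost information and may not be recoverable this way.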