This can happen when one character encoding standard (or character set) is used to write the document and a different one is used to read it.
The symbol � generally means the original document was ISO-8859-1, and you are using UTF-8 to read it.
The symbol Ã (usually followed by a second stray character, as in Ã©) generally means the original document was UTF-8, and you are using ISO-8859-1 to read it.
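Both symptoms can be reproduced in a few lines. The following is a minimal sketch in Python; the sample word "café" and the variable names are illustrative assumptions, not part of the original document.

```python
# Sample text with one non-ASCII character (é is U+00E9). Illustrative only.
text = "café"

# Case 1: the document was written as ISO-8859-1 but is read as UTF-8.
latin1_bytes = text.encode("iso-8859-1")      # b'caf\xe9'
# 0xE9 alone is not valid UTF-8, so the decoder substitutes U+FFFD.
read_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")
print(read_as_utf8)                           # caf�

# Case 2: the document was written as UTF-8 but is read as ISO-8859-1.
utf8_bytes = text.encode("utf-8")             # b'caf\xc3\xa9'
# Each of the two UTF-8 bytes is shown as its own ISO-8859-1 character.
read_as_latin1 = utf8_bytes.decode("iso-8859-1")
print(read_as_latin1)                         # cafÃ©
```

In the first case the original byte is lost once it has been replaced, so the fix is to re-read the source with the right character set rather than to repair the output.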
For example, if you receive a base64-encoded document, you may need to change the character set you use when reading the document.
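Base64 decoding yields raw bytes and carries no information about the character set, so the character set has to be chosen when those bytes are turned into text. The sketch below assumes a hypothetical payload and a hypothetical helper named `decode_text`; the fallback order (strict UTF-8 first, then ISO-8859-1) is an assumption for illustration. If the sender states the character set, use that instead.

```python
import base64

# Hypothetical payload: "café" written as ISO-8859-1, then base64-encoded.
payload = base64.b64encode("café".encode("iso-8859-1"))

# Base64 decoding gives bytes, not text.
raw = base64.b64decode(payload)


def decode_text(data, charsets=("utf-8", "iso-8859-1")):
    """Try each character set in order; return (text, charset_used).

    UTF-8 is tried first because strict decoding rejects invalid input.
    ISO-8859-1 accepts any byte sequence, so it acts as the last resort.
    """
    for charset in charsets:
        try:
            return data.decode(charset), charset
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the given character sets could decode the data")


text, used = decode_text(raw)
print(text, used)
```

Because ISO-8859-1 never raises a decoding error, a successful fallback does not prove the guess was right; it only avoids the replacement symbol.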