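Microsoft Word's auto-formatting inserts special characters, several of which are illegal in XML and, thus, in InfoPath. Of the characters listed below, 0B, 0C, 0E, 1E and 1F are not permitted in XML 1.0 at all. Tab (09) and paragraph (0D) are legal XML whitespace, non-breaking space (A0) is not an ASCII character and is legal only if the document's encoding represents it correctly, and an ampersand (26) is legal only when escaped as &amp;. Here is a list of the characters Word produces, followed by their hex values: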
- Tab (hex 09)
- New line (hex 0B)
- Page break (hex 0C)
- Paragraph (hex 0D)
- Column break (hex 0E)
- Non-breaking hyphen (hex 1E)
- Optional hyphen (hex 1F)
- Non-breaking space (hex A0)
- Ampersand (hex 26)
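To see which of the characters above an XML parser will reject outright, the check can be sketched in Python. This is a minimal illustration based on the Char production of the XML 1.0 specification; the names `is_xml10_char` and `WORD_CHARS` are hypothetical and are not part of Word or InfoPath.

```python
def is_xml10_char(ch: str) -> bool:
    """Return True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

# Characters from the list above, keyed by the name used there.
WORD_CHARS = {
    "Tab": 0x09,
    "New line": 0x0B,
    "Page break": 0x0C,
    "Paragraph": 0x0D,
    "Column break": 0x0E,
    "Non-breaking hyphen": 0x1E,
    "Optional hyphen": 0x1F,
    "Non-breaking space": 0xA0,
    "Ampersand": 0x26,
}

# Names of the characters that may not appear in an XML 1.0 document at all.
# Note: the ampersand passes this check, but it must still be escaped
# as &amp; wherever it appears in XML content.
illegal = [name for name, cp in WORD_CHARS.items() if not is_xml10_char(chr(cp))]
print(illegal)
```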
To work around this problem, there are two options:
a) Replace any illegal characters before conversion to XML (i.e. before InfoPath sees them).
b) Change the AutoFormat options in Word to prevent it from generating these characters.
i. In Word 2003, go to Tools->AutoCorrect Options->AutoFormat->AutoFormat As You Type and uncheck "Hyphens..." and any other options that result in special characters.
ii. In Word 2007, press Alt-F, click Word Options->Proofing->AutoCorrect Options->AutoFormat->AutoFormat As You Type and uncheck "Hyphens..." and any other options that result in special characters.
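Option a) could be sketched as follows in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `sanitize_for_xml` is hypothetical, and the chosen replacements (breaks become line feeds, the non-breaking hyphen becomes a plain hyphen, the optional hyphen is dropped) are assumptions to adapt to your own content. Ampersand escaping uses `escape` from the standard library module `xml.sax.saxutils`.

```python
import re
from xml.sax.saxutils import escape

# Assumed replacements for Word's formatting characters (adjust as needed).
REPLACEMENTS = {
    "\x0b": "\n",  # new line (manual line break) -> line feed
    "\x0c": "\n",  # page break -> line feed
    "\x0e": "\n",  # column break -> line feed
    "\x1e": "-",   # non-breaking hyphen -> plain hyphen
    "\x1f": "",    # optional hyphen -> removed
}

# Matches any character outside the XML 1.0 Char production.
ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def sanitize_for_xml(text: str) -> str:
    """Make text pasted from Word safe to embed as XML character data."""
    for bad, good in REPLACEMENTS.items():
        text = text.replace(bad, good)
    # Drop any other illegal control characters that were not mapped above.
    text = ILLEGAL_XML_CHARS.sub("", text)
    # Escape &, < and > so the result is valid XML content.
    return escape(text)

print(sanitize_for_xml("Q&A\x0bsecond line"))
```

Tab, paragraph and non-breaking space are left untouched here because they are legal in XML 1.0, provided the resulting document is written with an encoding that can represent the non-breaking space (for example UTF-8).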