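Microsoft Word's auto-formatting inserts ASCII control characters, many of which are illegal in XML and, thus, in InfoPath. (Below hex 20, XML 1.0 permits only tab, line feed, and carriage return; every other control character in that range is rejected.) Here is a list of the ASCII control characters that Word uses, followed by their ASCII hex values: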
Tab (hex 09)
New line (hex 0B)
Page break (hex 0C)
Paragraph (hex 0D)
Column break (hex 0E)
Non-breaking hyphen (hex ...
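One way to deal with these characters is to filter text pasted from Word before it goes into an XML document. The following is a minimal sketch in Python, not an InfoPath feature: the function name `strip_illegal_xml_chars` and the `replacement` parameter are hypothetical, and the pattern simply keeps whatever the XML 1.0 character range allows.

```python
import re

# Matches anything outside the XML 1.0 character range:
# tab, line feed, carriage return, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str, replacement: str = "") -> str:
    """Return text with characters that XML 1.0 forbids replaced (hypothetical helper)."""
    return _ILLEGAL_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    # Word's manual line break (hex 0B) and page break (hex 0C) are removed;
    # the tab (hex 09) is legal in XML and is left alone.
    pasted = "Name:\tJane\x0bDoe\x0cNext page"
    print(repr(strip_illegal_xml_chars(pasted)))
```

Passing a replacement such as "\n" instead of the empty string keeps a visible break where Word had a manual line break or page break, which may be preferable to silently joining the surrounding words.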