How to Copy Content from Microsoft Word into Plone
This How-to applies to: Any version.
Many of us use Word (or similar word processors such as OpenOffice or TextEdit) to create, share, and collaborate on content. Word is a great tool for intra-office use, and it works well with PowerPoint, Excel, and other Microsoft Office programs. However, Word was never intended to generate web pages or to be used with a content management system such as Plone.
Copying from Word brings tag attributes, stylesheet information, and invalid HTML into Plone. You cannot see this information unless you switch to the edit HTML mode in the visual editor.
This isn't Plone's or the Internet's fault - Word documents contain formatting information that was designed to be understood by other Microsoft Office programs, not web browsers. Still, the bulk of the content an organization wants to post online often comes from Word documents.
Hence the frequently asked question: Can I copy and paste my page content from Word directly into Plone?
The short answer is no. The longer answer is yes, but you need to prepare the content first. You must strip the Word-specific formatting by passing the text through a plain-text editor: Notepad on Windows, or TextEdit for Mac users.
Notepad
The best method is to simply copy content from Word and paste it into Notepad. From there, copy the content out of Notepad and paste it into Plone. This removes all the extra formatting that comes from copying out of Word. You will lose formatting such as bold, italics, font sizes, and so on, but you can use the toolbar in Plone's visual editor to recreate it. This method is the "cleanest" way to move content from Word into Plone.
Mac users can use TextEdit to accomplish the same thing. Be sure that TextEdit is in "plain text" mode (Format > Make Plain Text) before you paste.
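For anyone who prefers to script this step, the effect of the Notepad round trip can be approximated in Python. The following is only a minimal sketch using the standard library: the names PlainTextExtractor and to_plain_text are made up for illustration and are not part of Plone, and the sketch assumes you already have the Word-generated HTML as a string.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Keep text nodes only; drop tags, comments, and <style>/<script> bodies."""

    SKIPPED = {"style", "script"}          # element bodies to discard entirely
    LINE_BREAKERS = {"p", "div", "li"}     # closing these ends a line of text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.LINE_BREAKERS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Comments never reach this method, so Word's conditional
        # comments are dropped without any extra handling.
        if not self._skip_depth:
            self.parts.append(data)


def to_plain_text(html):
    """Return the visible text of an HTML fragment, one paragraph per line."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = '<p class="MsoNormal"><b>Hello</b> world</p><p>Second paragraph</p>'
    print(to_plain_text(sample))
```

As with Notepad, bold, italics, and other formatting are lost and have to be recreated with the toolbar in Plone's visual editor.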
Examples of Word Formatting
Below is an example of what problematic Word formatting looks like. If you happen to see this kind of formatting information either rendered on a page, or in the HTML view of your Plone page, you (or someone you know) probably copied the content from Word without going through Notepad first.
<w:LatentStyles DefLockedState="false" LatentStyleCount="156">
</w:LatentStyles>
</xml><![endif]--><!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
}
</style>
<![endif]--><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
This kind of embedded style information conflicts with the stylesheet information in Plone. As a result, your pages may render with unexpected fonts, spacing, or layout, or show stray markup as visible text.
To get rid of it, you will have to edit the HTML of the page (via the HTML icon in the visual editor) and delete the Word markup by hand, or simply delete the page and start over!
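If the page is long, cleaning the HTML by hand can be tedious. The sketch below shows one possible way to strip the most common Word artifacts from an HTML string with regular expressions. It is an illustration under assumptions, not a Plone feature: the function name strip_word_markup and the patterns are hypothetical, they target only the kinds of markup shown in the example above (conditional comments, style blocks, o:/v:/w: tags, Mso classes, and mso- style attributes), and regular expressions are not a full HTML parser, so review the result before saving.

```python
import re

# Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
CONDITIONAL_COMMENT = re.compile(
    r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE
)
# Whole <style> ... </style> blocks
STYLE_BLOCK = re.compile(r"<style\b[^>]*>.*?</style>", re.DOTALL | re.IGNORECASE)
# Office-namespaced tags such as <o:p>, </w:LatentStyles>, <o:idmap ... />
OFFICE_TAG = re.compile(r"</?[ovw]:[^>]*>", re.IGNORECASE)
# class="MsoNormal" and similar class attributes
MSO_CLASS = re.compile(r'\s+class="?Mso\w*"?', re.IGNORECASE)
# style="..." attributes that contain mso- properties
MSO_STYLE_ATTR = re.compile(r'\s+style="[^"]*mso-[^"]*"', re.IGNORECASE)

# Order matters: conditional comments go first because they often
# wrap the style blocks and Office tags.
PATTERNS = (CONDITIONAL_COMMENT, STYLE_BLOCK, OFFICE_TAG, MSO_CLASS, MSO_STYLE_ATTR)


def strip_word_markup(html):
    """Remove common Word-specific markup, leaving ordinary HTML in place."""
    for pattern in PATTERNS:
        html = pattern.sub("", html)
    return html


if __name__ == "__main__":
    dirty = (
        '<!--[if gte mso 9]><xml><o:shapedefaults v:ext="edit" '
        'spidmax="1026" /></xml><![endif]-->'
        '<p class="MsoNormal" style="mso-pagination:widow-orphan">'
        "Hello<o:p></o:p></p>"
    )
    print(strip_word_markup(dirty))
```

Unlike the Notepad method, this approach keeps ordinary tags such as bold and italics, but it may miss Word markup that does not match these patterns.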