Import Word XML content

Import Word XML

Developer

Gavin Cooney

Attachment: importxml.zip (ZIP Archive). Download the Open Office input plugin. Last modified Jan 04, 2006 by David Whiterod.

This plugin allows you to import large amounts of content from Microsoft Word into FarCry.

Using an Open Office document, you can add a whole section of the site (navigation nodes and pages with content) quickly and easily. You may have to go back and clean up the resulting web pages, but this method is well suited to putting large sections of a site online.

You can download Open Office from http://www.openoffice.org

You can start from either an MS Word document or a text document; the important thing is how the text is styled. As with the Quick Site Builder, you are specifying a hierarchy for the site to use, but instead of '-' signs you use heading styles. In addition, the pages that are created can have content.

Sample site layout
Science
-Activities
--Activity 1
--Activity 2
--Activity 3

For example

If you wanted to have a structure and pages like those above, you would write a document and style it like this:

Science (Heading 1)
Activities (Heading 2)
Activity 1 (Heading 3)
Text on activity page. More text on activity page.
Activity 2 (Heading 3)
Activity 3 (Heading 3)

Here we have 'Science' styled as 'Heading 1', which will create a navigation node and page called 'Science' at the top level. Then we have 'Activities' styled as 'Heading 2', which will create a child node and page below 'Science'. Finally we have 'Activity 1' to 'Activity 3' styled as 'Heading 3', each of which will create a child node and page below 'Activities'.

The text 'Text on activity page. More text on activity page.' will appear in the body section of the page 'Activity 1', because it follows that heading. This process allows you to take existing documents and, with some simple styling, publish them through FarCry.
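The mapping from heading styles to navigation nodes can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's own code: the function names `build_tree` and `outline`, and the (level, text) input format, are assumptions made for the example.

```python
def build_tree(lines):
    """Turn (heading_level, text) pairs into a nested navigation tree.

    heading_level is 1, 2, 3... for a heading style, or None for body text.
    (Hypothetical input format, chosen for this sketch.)
    """
    root = {"title": "root", "body": [], "children": []}
    stack = [(0, root)]  # (level, node) path from the root to the current page
    for level, text in lines:
        if level is None:
            # Body text belongs to the most recently created page.
            stack[-1][1]["body"].append(text)
            continue
        # Climb back up until the top of the stack is a shallower heading.
        while stack[-1][0] >= level:
            stack.pop()
        node = {"title": text, "body": [], "children": []}
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root["children"]


def outline(nodes, depth=0):
    """Render the tree in the '-' notation used by the Quick Site Builder."""
    out = []
    for node in nodes:
        out.append("-" * depth + node["title"])
        out.extend(outline(node["children"], depth + 1))
    return out


document = [
    (1, "Science"),
    (2, "Activities"),
    (3, "Activity 1"),
    (None, "Text on activity page. More text on activity page."),
    (3, "Activity 2"),
    (3, "Activity 3"),
]
tree = build_tree(document)
print("\n".join(outline(tree)))
```

Run on the styled document above, the sketch prints the same layout as the sample site layout, and the body text ends up attached to 'Activity 1'.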

Notes

  • You may also want 'Heading 1' and 'Heading 2' style headings within the body text that should not become new pages. For these, do not apply a heading style; instead use the HTML heading tags, eg <h2>This is a Heading 2</h2>, or put 'H2:' in front of the heading, eg H2:This is a Heading 2.
  • The Open Office process will recognize headings, but some things will not display correctly, such as text simply styled in bold (best to remove this styling) or website addresses (these will import correctly if they are placed on a line of their own).
  • Only very basic styles are supported. These include:
    Text (just paragraphs)
    tables
    unordered lists
    ordered lists
    nested lists
    lists inside a table etc
    hyperlinks
  • If you want to add any custom HTML to the pages you are importing, just type it into the Word document. Any line starting with a "<" is treated as HTML, so you can easily enclose a number of paragraphs in a div:
    <div class="myclass">
    paragraphs
    </div>
    This also lets you add bold text and spans, provided the line starts with a tag such as <p>:
    <p>this line needs some <strong>bold text</strong>. </p>
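The line rules in these notes can be sketched as a small Python function. It is a hypothetical illustration only: the name `render_line`, the extension of the 'H2:' prefix to other heading levels, and the wrapping of plain lines in <p> tags are assumptions, not the plugin's documented behaviour.

```python
import re

# 'H2:' is the only prefix named in the notes; accepting H1 to H6 is an assumption.
HEADING_PREFIX = re.compile(r"^H([1-6]):\s*(.*)$")


def render_line(line):
    """Convert one line of imported body text to HTML (illustrative sketch)."""
    text = line.strip()
    if text.startswith("<"):
        # Lines starting with "<" are treated as HTML and passed through as-is.
        return text
    match = HEADING_PREFIX.match(text)
    if match:
        level, heading = match.groups()
        return "<h{0}>{1}</h{0}>".format(level, heading)
    # Assumption: anything else becomes an ordinary paragraph (no escaping here).
    return "<p>{}</p>".format(text)


for sample in ['<div class="myclass">', "H2:This is a Heading 2", "paragraphs"]:
    print(render_line(sample))
```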

Once you have styled your document in MS Word (the styling can also be done in Open Office), open the document in Open Office and save it as a native Open Office document (.sxw).

Click on the 'Admin 2' tab and click on 'Import Open Office Doc'. Then use the drop-down menu to select where you would like these navigation nodes and pages to sit (much like in the Quick Site Builder).

Then click 'Choose file' and navigate to the Open Office file you would like to use.

Once this is done, select the 'Default type and display method' that applies to most of the pages you will create. This defaults to dmHTML displayStandard.

Now click on 'Build site structure'. This will read your Open Office document, creating navigation from your headings and body sections from your content.

Before it finalises this creation, FarCry will ask you to select a 'Type and display method' for each page individually. If you already selected the relevant Default type and display method in the previous step, you can just select the types for the files that don't fit into these categories.

Once you are happy with these pages, confirm your selection and these pages and navigation nodes will be added to the tree.

This is possible because Open Office "word" documents are actually a series of XML files zipped into a .sxw file. Thanks to Mark Lynch (http://www.lynchconsulting.com.au) for helping me with the unzipping of the Open Office document on the server.
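To see the "zipped XML" structure for yourself, a short Python sketch along these lines opens the archive and parses content.xml, the member that holds the document text in an .sxw file. It is not the plugin's server-side code; the function name `read_content_xml` is made up for the example, and the demo uses a simplified in-memory stand-in rather than real Open Office markup (which uses namespaced elements).

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_content_xml(source):
    """Return (member names, parsed content.xml root) for an .sxw archive.

    source may be a file path or a file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
        root = ET.fromstring(archive.read("content.xml"))
    return names, root


# Stand-in archive built in memory so the sketch runs without a real document.
# The XML below is deliberately simplified and is NOT the Open Office schema.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "content.xml",
        "<document><h level='1'>Science</h><p>Body text</p></document>",
    )
    archive.writestr("styles.xml", "<styles/>")
buffer.seek(0)

names, root = read_content_xml(buffer)
print(names)
print([child.tag for child in root])
```

With a real document you would pass the path of the saved .sxw file instead of the in-memory buffer.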

Installation instructions

  • Add these files into <your-site-folder>/customadmin/importxml
  • Then use something like this for your customadmin.xml in the parent folder
 
<?xml version="1.0" encoding="utf-8"?>
<customtabs>
  <parenttab permission="MainNavAdminTab">Site Builder
    <subtabs permission="MainNavAdminTab">Import Word XML
      <menutitle>Open Office Word Doc XML</menutitle>
      <menuitem>
        <label>Import Open Office Doc</label>
        <link>/farcry/admin/customadmin.cfm?module=importxml/index.cfm</link>
      </menuitem>
      <menuitem>
        <label>View Open Office Doc HTML</label>
        <link>/farcry/admin/customadmin.cfm?module=importxml/viewWordHtml.cfm</link>
      </menuitem>
    </subtabs>
  </parenttab>
</customtabs>