Compare Word Documents

Michael Valenta June 11, 2008

On this page, we describe support for comparing Word documents using Eclipse. The purpose of this page is twofold. First, we describe the goals and features of the Word document compare support. After that, we dig a bit into how we integrated the Word comparison with the Eclipse Compare framework. However, we do not cover the OLE portion of the implementation.

Obtaining the Word Comparison Plug-in

The current plan is to ship this bundle with Eclipse 3.5. However, the bundle is compatible with 3.3 and 3.4. You can download the built bundle from here and put it in the plugins directory of your Eclipse install. The source code will be added to CVS once HEAD is opened for 3.5 development.

Comparing Word Documents

In this section, we describe our goals for the Word document merge viewer. We then describe the features of the viewer and show some screen shots of the result.

Goals and Features

When we started implementing the Word merge viewer, we had the following goals in mind:

  • Compare changes using the revision support that is provided by the Word application
  • Offer the user flexibility in whether they want to edit in-place (i.e. embedded in Eclipse) or in a separate Word Application window
  • Provide a means to save changes back to the Eclipse workspace.

To that end, we implemented a merge viewer with the following features:

  • The merge viewer is associated with the "doc" file extension and a "Word Document" content type. The use of a content type will allow the user to associate other file extensions with the merge viewer.
  • The merge viewer opens the Word-based comparison in-place by default, but there is a toggle in the toolbar that allows the document to be opened in a separate window.
  • The Word application is opened on a temporary file containing the results of the comparison of the two files or file states being compared. Saving the compare editor (or clicking on the Save button in the compare viewer toolbar) will save any edits that have been made to the temporary file and then copy the contents of the temporary file into the Eclipse workspace file.

What it Looks Like

The following screen shot shows the Word comparison in-place. You can edit the text and save in the compare editor, which will save the result to the 'important.doc' file that is being compared.

Wordinplace.png

This next screen shot shows the compare viewer in "external" mode, where the Word document is open in a separate window.

Wordseparate.png

We don't have a screen shot of the separate Word application window since it is simply a full-blown Word application. Notice that the "In-place" toggle button in the above screen shot has changed from the previous screen shot. Also notice that the editor is now dirty and the save action in the toolbar is enabled. The editor is dirty because we have made changes using the Word application. Even though the document is shown in a separate window, we still keep a link to the document so we know when it has been modified. Saving in the compare editor will still work (as indicated by the text that appears in the compare viewer).

Caveats

A limitation of the viewer is that it doesn't offer the ability to save if neither or both sides of the comparison are editable. For the "both editable" case, we didn't feel we could adequately present the save options in the UI. The workaround is to open the comparison in a separate window so you can then save any changes to any location on disk using the Word Save As menu action.

Implementing a Compare Merge Viewer

We will now look at how to integrate a compare merge viewer into Eclipse using our Word compare viewer as an example. First we'll discuss the extension points we need to extend and then we'll discuss the compare viewer implementation.

Compare Framework Extension Points

The first extension point we are going to extend is not in the Compare framework: it is the content types extension point. The Compare framework allows content merge viewers to be associated with either a file extension or a content type. The advantage of using a content type is that the user can associate other file extensions with the content type manually. Here is what our Word Document content type definition looks like:

<extension point="org.eclipse.core.contenttype.contentTypes">
   <content-type
         name="Word Document"
         id="org.eclipse.compare.wordDoc"
         file-extensions="doc">
   </content-type>
</extension>

Once we have the content type, we can then define our content merge viewer using the org.eclipse.compare.contentMergeViewers extension point and then associate that viewer with our content type. The XML for this looks like:

<extension point="org.eclipse.compare.contentMergeViewers">
   <viewer
         class="org.eclipse.compare.internal.win32.WordViewerCreator"
         extensions="doc"
         id="org.eclipse.compare.wordMergeViewer">
   </viewer>
   <contentTypeBinding
         contentMergeViewerId="org.eclipse.compare.wordMergeViewer"
         contentTypeId="org.eclipse.compare.wordDoc">
   </contentTypeBinding>
</extension>

The class that is registered with the extension point is a factory for creating our content merge viewer. The code for our class is fairly simple:

public class WordViewerCreator implements IViewerCreator {
   public Viewer createViewer(Composite parent, CompareConfiguration config) {
      return new WordMergeViewer(parent, config);
   }
}

In the next section, we'll look at several aspects of the content merge viewer class.

Compare Merge Viewer Basics

In this section, we look at how to define a viewer and embed it in the compare editor. In the first part of the section, we will look at configuring the viewer. In the second part, we will look at how to work with the input to the comparison and how to write changes back to that input.

Implementing the Viewer

The following code snippet shows the definition of our viewer class.

public class WordMergeViewer extends Viewer 
      implements IFlushable, IPropertyChangeNotifier {

Notice that the class is a subclass of Viewer and implements IFlushable and IPropertyChangeNotifier. The compare editor can host any subclass of Viewer inside its edit area. There are some specialized subclasses that clients can use, ContentMergeViewer and TextMergeViewer, but for this example we don't need them. These specialized subclasses are described further in the Eclipse ISV doc. As for the interfaces, we'll describe these in more detail later.

The next part of the viewer we want to look at is the constructor. The following code shows the general shape of a compare viewer constructor.

public WordMergeViewer(Composite parent, CompareConfiguration configuration) {
   this.configuration = configuration;
   createControl(parent);
   getControl().addDisposeListener(new DisposeListener() {
      public void widgetDisposed(DisposeEvent e) {
         handleDispose();
      }
   });
   getControl().setData(CompareUI.COMPARE_VIEWER_TITLE, "Word Document Compare");
   IToolBarManager toolBarManager = CompareViewerPane.getToolBarManager(parent);
   if (toolBarManager != null) {
      toolBarManager.removeAll();
      initializeToolbar(toolBarManager);
   }
   updateEnablements();
}

Some interesting points of this constructor are:

  • The configuration contains information about the comparison (labels, which sides are editable, etc.)
  • The createControl(parent) method creates the control for our viewer. In our case, it creates an area to host the Word Document using OLE
  • We register a dispose listener with the viewer control (i.e. composite) and call a dispose handler on our viewer when the control is disposed. We do this to clean up any additional OS resources (e.g. the Word Document OLE objects)
  • We set the CompareUI.COMPARE_VIEWER_TITLE property of the control to the title we want to appear in the toolbar above the viewer area.
  • We use the CompareViewerPane.getToolBarManager(parent) method to obtain the toolbar of the viewer so we can add our buttons to it. We first need to clear the toolbar.
  • We finally call our updateEnablements method which is a general pattern used to update any button states

We'll now look at the IFlushable interface. Basically, this interface is used to indicate that the viewer has content that needs to be written back to the compare input when a save is requested. Our implementation of flush looks like this (we'll look into how we write the data back to the compare input in the next section).

public void flush(IProgressMonitor monitor) {
   Display.getDefault().syncExec(new Runnable() {
      public void run() {
         if (isDirty())
           saveDocument();
      }
   });
}

The second interface we implemented was IPropertyChangeNotifier. That interface is used to indicate to clients that our viewer can fire events (i.e. it defines add and remove listener methods). The main event of interest to the compare framework is the dirty state of the viewer. Whenever our dirty state changes, we call the following method, which fires a property change event to any listeners.

protected void setDirty(boolean dirty) {
   if (isDirty != dirty) {
      isDirty = dirty;
      updateEnablements();
      firePropertyChange(CompareEditorInput.DIRTY_STATE, 
             Boolean.valueOf(!isDirty), Boolean.valueOf(isDirty));
   }
}
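
To make the notification pattern concrete outside of Eclipse, here is a minimal, self-contained sketch. DirtyListener and DirtyStateNotifier are hypothetical stand-ins for the JFace listener interface and for our viewer, and the DIRTY_STATE string is only a placeholder for CompareEditorInput.DIRTY_STATE. The point is simply that an event is fired only when the dirty flag actually changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a JFace-style property change listener.
interface DirtyListener {
    void propertyChanged(String property, boolean oldValue, boolean newValue);
}

// Hypothetical stand-in for the listener bookkeeping a viewer would do.
class DirtyStateNotifier {
    // Placeholder for CompareEditorInput.DIRTY_STATE (assumed name).
    static final String DIRTY_STATE = "DIRTY_STATE";

    private final List<DirtyListener> listeners = new ArrayList<>();
    private boolean dirty;

    void addListener(DirtyListener listener) {
        if (!listeners.contains(listener)) {
            listeners.add(listener);
        }
    }

    void removeListener(DirtyListener listener) {
        listeners.remove(listener);
    }

    boolean isDirty() {
        return dirty;
    }

    void setDirty(boolean newDirty) {
        if (dirty == newDirty) {
            return; // no transition, so no event
        }
        boolean oldDirty = dirty;
        dirty = newDirty;
        // Iterate over a copy so a listener may remove itself while being notified.
        for (DirtyListener listener : new ArrayList<>(listeners)) {
            listener.propertyChanged(DIRTY_STATE, oldDirty, newDirty);
        }
    }
}
```

Calling setDirty(true) twice in a row notifies listeners only once, which keeps the compare editor from being told about a change that did not happen.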

That's pretty much it for configuring the viewer. In the next section we'll discuss how to manipulate the compare input.

Working with the Compare Input

The input to the viewer is an instance of ICompareInput. However, the Viewer#getInput() method returns an Object, so we need to cast when we work with the input. The methods of interest on this interface are:

  • getLeft() and getRight() for getting the two sides of the comparison. There is also a getAncestor() that is used when the comparison is 3-way.
  • getKind(), which returns a bitwise combination of constants that describe the change

The input kind is broken into two parts: the type and the direction. The values to be used are constants on the Differencer class:

  • The ADDITION, DELETION and CHANGE constants define the type. The type is obtained by ANDing the kind with the CHANGE_TYPE_MASK (e.g. if ((input.getKind() & Differencer.CHANGE_TYPE_MASK) == Differencer.CHANGE))
  • The LEFT, RIGHT and CONFLICTING constants indicate the direction of the change. The direction is obtained by ANDing the kind with the DIRECTION_MASK. A result of 0 indicates a two-way comparison.

As an example to illustrate this, consider the following method:

protected boolean isOneSided() {
   if (input instanceof ICompareInput) {
      ICompareInput ci = (ICompareInput) input;
      int type = ci.getKind() & Differencer.CHANGE_TYPE_MASK;
      return type != Differencer.CHANGE;
   }
   return false;
}

This method returns true if we have an addition or deletion. We use it in our Word comparison to determine if we need to do a comparison or not. That is, for additions or deletions, we just open Word on the side that exists.
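
The same masking approach works for the direction part of the kind. The following is a self-contained sketch rather than code from the viewer: KindDecoder is a hypothetical class, and its constants are local copies that we assume mirror the values defined on Differencer (the conflict constant is named CONFLICTING there, as far as we know). Check them against the Differencer class before relying on them.

```java
// Hypothetical helper; the constants are assumed to mirror
// org.eclipse.compare.structuremergeviewer.Differencer.
class KindDecoder {
    static final int ADDITION = 1;
    static final int DELETION = 2;
    static final int CHANGE = 3;
    static final int CHANGE_TYPE_MASK = 3;

    static final int LEFT = 4;
    static final int RIGHT = 8;
    static final int CONFLICTING = 12;
    static final int DIRECTION_MASK = 12;

    // Extracts the change type by masking off the direction bits.
    static String type(int kind) {
        switch (kind & CHANGE_TYPE_MASK) {
            case ADDITION: return "addition";
            case DELETION: return "deletion";
            case CHANGE: return "change";
            default: return "no change";
        }
    }

    // Extracts the direction by masking off the type bits.
    // A masked value of 0 means the comparison is two-way.
    static String direction(int kind) {
        switch (kind & DIRECTION_MASK) {
            case LEFT: return "left";
            case RIGHT: return "right";
            case CONFLICTING: return "conflicting";
            default: return "two-way";
        }
    }
}
```

For example, a kind of RIGHT | CHANGE decodes to type "change" and direction "right", while a plain CHANGE decodes to direction "two-way".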

Another compare interface that is useful is ITypedElement. This is an interface that is often used for the left, right and ancestor properties of the compare input. It provides a name, type and image. We use it when showing the name of the element to the user in a message:

String name = "unknown";
if (input.getLeft() instanceof ITypedElement) {
   ITypedElement te = (ITypedElement) input.getLeft();
   name = te.getName();
}

For our example, we need to make sure that the left and right sides being compared are available in files on the local file system. We use the IResourceProvider interface to determine if the element has an associated resource. If a resource cannot be found that way, we fall back to the IAdaptable interface. The method we use looks like this:

protected IFile getEclipseFile(Object element) {
   if (element instanceof IResourceProvider) {
      IResourceProvider rp = (IResourceProvider) element;
      IResource resource = rp.getResource();
      if (resource != null && resource.getType() == IResource.FILE) {
         return (IFile)resource;
      }
   }
   if (element instanceof IAdaptable) {
      IAdaptable a = (IAdaptable) element;
      Object result = a.getAdapter(IResource.class);
      if (result == null) {
         result = a.getAdapter(IFile.class);
      }
      if (result instanceof IFile) {
         return (IFile) result;
      }
   }
   return null;
}

If there is an IFile, we convert it to a local file and use that; otherwise, we try to extract the contents from the element using the IStreamContentAccessor interface.

private File cacheContents(ITypedElement element) throws IOException, CoreException {
   if (element instanceof IStreamContentAccessor) {
      IStreamContentAccessor sca = (IStreamContentAccessor) element;
      InputStream contents = sca.getContents();
      if (contents != null) {
         try {
            return createTempFile(contents);
         } finally {
            try {
               contents.close();
            } catch (IOException e) {
               // Ignore: a failure to close the stream is not fatal here
            }
         }
      }
   }
   return null;
}
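
The createTempFile helper is not shown on this page. As a rough sketch, assuming it only needs to copy the stream into a temporary file that Word can open, it might look like the following. TempFileCache and the "compare" file prefix are hypothetical names, and java.nio.file.Files requires Java 7 or later, so the original implementation would have needed a manual copy loop instead.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical holder class for the helper.
class TempFileCache {
    // Copies the stream into a new temporary .doc file and returns that file.
    // The caller remains responsible for closing the stream.
    static File createTempFile(InputStream contents) throws IOException {
        File file = File.createTempFile("compare", ".doc");
        file.deleteOnExit(); // best-effort cleanup when the JVM exits
        // The temp file already exists (empty), so it must be replaced.
        Files.copy(contents, file.toPath(), StandardCopyOption.REPLACE_EXISTING);
        return file;
    }
}
```

Keeping the ".doc" suffix matters in this sketch because the file is handed to Word, which may rely on the extension when opening it.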

That is the essence of the code we need to populate the Word document (except, of course, for the OLE code, which we are not going to go into here). The only remaining part is how to write any changes back into the compare input. The first aspect of this is determining whether either side of the comparison is editable. The following snippet shows how to use the configuration parameters and the left element to determine if the left side of the comparison is editable.

private IEditableContent getEditableLeft() {
   ICompareInput compareInput = getCompareInput();
   if (compareInput != null) {
      ITypedElement left = compareInput.getLeft();
      if (left instanceof IEditableContent && configuration.isLeftEditable()) {
         return (IEditableContent) left;
      }
   }
   return null;
}
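
The getSaveTarget() method used when saving is not shown on this page. Given the caveat described earlier (saving is only offered when exactly one side is editable), one plausible shape for that decision is sketched below. SaveTargetChooser and chooseSaveTarget are hypothetical names, and the generic parameter stands in for IEditableContent so that the sketch compiles without Eclipse on the classpath.

```java
// Hypothetical helper; T stands in for IEditableContent.
class SaveTargetChooser {
    // Returns the single editable side, or null when neither side
    // or both sides are editable (in which case saving is not offered).
    static <T> T chooseSaveTarget(T editableLeft, T editableRight) {
        if (editableLeft != null && editableRight == null) {
            return editableLeft;
        }
        if (editableRight != null && editableLeft == null) {
            return editableRight;
        }
        return null;
    }
}
```

In the viewer, the two arguments would presumably come from getEditableLeft() and a matching method for the right side, each of which returns null when that side is not editable.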

The IEditableContent interface is used to write bytes back into the compare input. Here is what the saveDocument() method of our compare viewer looks like:

protected void saveDocument() {
   try {
      File result = getResultFile();
      wordArea.saveAsDocument(result.getAbsolutePath());
      // Forward the saved content to the save target
      synchronized (result) {
         if (result.exists()) {
            IEditableContent saveTarget = getSaveTarget();
            saveTarget.setContent(asBytes(result));
            resultFileTimestamp = result.lastModified();
         }
      }
      updateEnablements();
   } catch (IOException e) {
      handleError(e);
   } catch (SWTException e) {
      handleError(e);
   }
}

Our save works by writing the edited Word document to a temporary file returned by getResultFile(). We then extract the bytes from this file and put them into the IEditableContent element of the compare input (i.e. saveTarget.setContent(asBytes(result))).
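
The asBytes helper is not shown on this page either. Assuming it simply reads the whole temporary file into memory, it could be as small as the sketch below. FileBytes is a hypothetical class name, and java.nio.file.Files requires Java 7 or later, so the original implementation likely read the file with a stream loop instead.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical holder class for the helper.
class FileBytes {
    // Reads the complete contents of the file into a byte array.
    static byte[] asBytes(File file) throws IOException {
        return Files.readAllBytes(file.toPath());
    }
}
```

Reading the whole file at once is reasonable for typical documents, since setContent takes a byte array anyway; very large documents would make the memory cost more noticeable.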

Acknowledgments

Special thanks go to Duong Nguyen for providing the initial OLE code for opening and comparing Word documents.