Mylyn/Rich Editor For Wiki Markup

From Eclipsepedia

Rich Editor For Wiki Markup

This page contains the project proposal created by [Harshana Martin] for the idea "Rich Editor For Wiki Markup" from the [GSoC 2010 idea list]. I would be extremely happy to receive comments from readers.


The Mylyn WikiText project provides an extensible framework and tools for parsing, editing and presenting lightweight markup. As an important component of WikiText, the wiki text editor lets users create and edit wiki files written in several wiki markup languages, such as MediaWiki, Textile, Confluence, TracWiki and TWiki. The problem with the existing editor is that users must either know these markup languages or consult the cheat sheets while they edit, which makes the editor less user friendly and less usable. This project aims to provide a rich, WYSIWYG wiki text editor that can handle multiple wiki markup languages as well as HTML, so that users need no prior knowledge of the markup. The stakeholders of the project are software developers, wiki page authors and many others, including people who want to convert the same document to several other formats.


Eclipse Mylyn is the Application Lifecycle Management (ALM) framework for Eclipse. It provides task management tools, a task-focused interface and a set of Agile, ALM and developer collaboration tools. Mylyn consists of a set of sub-projects such as Tasks, Context, SCM, Build and Docs. The Mylyn WikiText project belongs to the Docs sub-project.

The Mylyn WikiText project provides an extensible framework and tools for parsing, editing and presenting lightweight markup. It allows users to create wiki files in several wikitext languages, such as MediaWiki, Textile, Confluence, TracWiki and TWiki, and to convert these files to several output formats, such as HTML, Eclipse Help, DocBook, DITA, XSL-FO and PDF.

Mylyn WikiText provides an editor for editing markup languages within Eclipse and integrates with the Mylyn task editor, making it markup aware. This wiki text editor consists of two views: the source view, where the source is edited using wiki markup, and the preview view, which shows a preview of the wiki file currently being edited.

As mentioned above, the source view of the wiki text editor is where the user edits wiki files. While editing, the user has to type the markup of the wiki language directly in the editor to do the formatting. For example, to mark a heading the user has to write “h1. This is an example heading”, which the preview renders as the heading “This is an example heading”. To create a numbered list, the user has to write the following:

"# Number 1"
"## Number 1.1"
"# Number 2"

and the preview will show:

  1. Number 1
    1. Number 1.1
  2. Number 2

In this case I have created a Textile wiki file and used Textile markup in the source view to create the sample document. This is not very user friendly, because creating and editing the page requires typing the markup by hand. The user spends more time looking up the correct markup in the cheat sheet or some other reference, and less time creating and editing the actual document. This reduces the user's productivity and makes the editor uncomfortable to use.

That is why the WikiText project needs a new rich wiki text editor, a WYSIWYG editor, as an alternative to the current one, so that users can create, edit and present wiki documents without typing the markup of their wiki language. Users will work in a familiar, user-friendly editor and will still be able to see the document they are editing in the preview view. The new rich text editor will also let users fall back to the current wiki text editor, which works directly on the markup, when they need advanced wiki formatting. This is useful for users who already know the markup and prefer to use it, a common practice among experienced HTML designers and wiki writers.

The WikiText component works by parsing the markup with regular expressions specific to the markup language and then converting it to HTML or another format through a specific DocumentBuilder implementation. For example, to convert a wiki file written in Textile to HTML, an instance of the MarkupParser class from the org.eclipse.mylyn.wikitext.core.parser package is created with its markupLanguage attribute set to the Textile language (org.eclipse.mylyn.wikitext.textile.core.TextileLanguage) and its builder attribute set to an HtmlDocumentBuilder from the org.eclipse.mylyn.wikitext.core.parser.builder package. The MarkupParser class also has a parseToHtml() method, which converts the wiki text to HTML output.
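The split between a regular-expression parser and a pluggable builder can be illustrated with a small sketch that uses only the JDK. It is not WikiText code: SketchBuilder, HtmlSketchBuilder and TextileSketchParser are hypothetical names, only headings and plain paragraphs are recognized, and no HTML escaping is done.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a document builder: receives parse events, produces output. */
interface SketchBuilder {
    void heading(int level, String text);
    void paragraph(String text);
    String result();
}

/** Builder that emits HTML for the events it receives (no escaping, for brevity). */
final class HtmlSketchBuilder implements SketchBuilder {
    private final StringBuilder out = new StringBuilder();

    @Override public void heading(int level, String text) {
        out.append("<h").append(level).append('>').append(text)
           .append("</h").append(level).append('>');
    }

    @Override public void paragraph(String text) {
        out.append("<p>").append(text).append("</p>");
    }

    @Override public String result() {
        return out.toString();
    }
}

/** Recognizes a tiny Textile subset with a regular expression and drives a builder. */
final class TextileSketchParser {
    // Matches "h1. Title" through "h6. Title".
    private static final Pattern HEADING = Pattern.compile("h([1-6])\\.\\s+(.*)");

    static String parse(String markup, SketchBuilder builder) {
        for (String line : markup.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // blank lines only separate blocks
            }
            Matcher m = HEADING.matcher(line);
            if (m.matches()) {
                builder.heading(Integer.parseInt(m.group(1)), m.group(2));
            } else {
                builder.paragraph(line);
            }
        }
        return builder.result();
    }
}
```

Swapping HtmlSketchBuilder for another SketchBuilder implementation changes the output format without touching the parser, which is the property the rich editor relies on.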

The aim of this project is to implement a WYSIWYG editor that can be used to create, edit and parse wiki files in multiple markup languages, such as MediaWiki, Textile, Confluence, TracWiki and TWiki, as well as HTML, which is not a wiki markup language but has the same kind of markup features. The editor will work on a model: parsing the wiki markup with a DocumentBuilder creates a DOM model of the document. Since Mylyn WikiText already has a set of parsers for markup languages, the editor can be used with any of them in this way. When saving and when generating the preview, the editor will run a markup-specific emitter (which also acts as a validator for that wiki markup language or HTML) that generates markup source from the DOM. In this way it should be relatively easy to support wiki markup and HTML with the same editor. If an incompatible construct or incompatible markup is detected, the rich editor will fall back to the current WikiText editor.
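As a rough illustration of one DOM serving several markup languages, the sketch below emits the same tree as Textile and as MediaWiki source, and signals a fallback by returning an empty Optional when it meets an element it cannot express. All class names are hypothetical, only headings and paragraphs are handled, and the element names ("body", "h1" to "h6", "p") are an assumption about how the model might look.

```java
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical emitter: turns the shared DOM into source text for one markup language. */
abstract class EmitterSketch {
    /** Language-specific heading syntax. */
    abstract String heading(int level, String text);

    /** Returns empty if the DOM holds a construct this sketch cannot express. */
    Optional<String> emit(Document dom) {
        StringBuilder out = new StringBuilder();
        for (Node n = dom.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            String name = n.getNodeName();
            if (name.matches("h[1-6]")) {
                out.append(heading(name.charAt(1) - '0', n.getTextContent()));
            } else if (name.equals("p")) {
                out.append(n.getTextContent());
            } else {
                return Optional.empty(); // caller would fall back to the source editor
            }
            out.append("\n\n");
        }
        return Optional.of(out.toString().trim());
    }
}

final class TextileEmitterSketch extends EmitterSketch {
    @Override String heading(int level, String text) {
        return "h" + level + ". " + text;
    }
}

final class MediaWikiEmitterSketch extends EmitterSketch {
    @Override String heading(int level, String text) {
        StringBuilder marks = new StringBuilder();
        for (int i = 0; i < level; i++) {
            marks.append('=');
        }
        return marks + " " + text + " " + marks;
    }
}

/** Helpers for building the shared DOM with the JDK's own DOM implementation. */
final class SharedDomSketch {
    static Document newDocument() {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            dom.appendChild(dom.createElement("body"));
            return dom;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Element append(Document dom, String name, String text) {
        Element e = dom.createElement(name);
        e.setTextContent(text);
        dom.getDocumentElement().appendChild(e);
        return e;
    }
}
```

Returning an empty Optional rather than throwing keeps the fallback decision in the caller, which is where the switch to the existing source editor would be made.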

The project therefore consists of two major parts.

  1. Implementing the WYSIWYG editor UI.
  2. Implementing the editor backend including the DOM model, parsers, DocumentBuilder and integrating them with the existing system.

I will carry out the project in several steps.

  1. Identify the good features that I should adopt and the defects that I should fix.
  2. Study the WikiText codebase closely and gain a thorough understanding of how the existing editor works.
  3. Implement the WYSIWYG editor UI with the existing WikiText JFace viewer.
  4. Implement the DOM model, parsers and DocumentBuilders, and integrate them with the WikiText project.

Developing a text editor that has a backing model is a complex task. The following section explains how I anticipate the editor interacting with the model. DOM is a model that allows repeated, non-sequential and bidirectional access to its nodes. It keeps the complete tree structure of a document in memory, which is what makes this repeated arbitrary access possible. In the editor I plan to build the DOM tree according to the commands from the user. When editing an existing wiki file, we first create the DOM for the document, then identify the DOM node the user intends to edit, modify the DOM tree accordingly, obtain the new markup from the DOM and generate the preview from that markup.

Let's consider creating a Textile file with the new WYSIWYG editor. We create a new file and open it in the editor. Since we know this is a Textile file, the UI offers the corresponding Textile facilities through icons. As an example of such a UI, think of the FCKeditor used in the rich editor for the Eclipse wiki: there is a set of icons for bold, italic, tables and so on, and clicking one adds the corresponding wiki markup to the wiki file. I am thinking of using the same approach to create the DOM. Let's say the user needs to add an h1 heading. The user clicks the button or icon for an h1 heading, so we know that he/she is going to add an h1 heading. I then create a new DOM node with the element name "h1" and the heading string as its text, so that every DOM node carries both the markup and the content of that markup. What remains is to identify the position where the user intends to add the heading and to insert the DOM node there.

If the user has modified the current document, I retrieve the DOM node corresponding to that position and modify the DOM tree according to the change. This can take a few forms. If the user has turned a piece of normal text into a heading, I create a new DOM node for it and insert it into the tree at that point. If the user changes only the text and not the formatting, the DOM tree keeps its shape and only the text content of the corresponding DOM node changes. In this way we can create the DOM for any markup language that maps onto such a tree of elements, such as MediaWiki, Textile and even HTML.
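The three kinds of edit described above (inserting a new block, changing the formatting of a block, changing only its text) could look roughly like this on a standard org.w3c.dom tree. EditCommandSketch and its method names are hypothetical, and the sketch assumes a flat list of block elements under a single "body" root.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical mapping from editor commands to DOM changes. */
final class EditCommandSketch {
    private final Document dom;
    private final Element root;

    EditCommandSketch() {
        try {
            dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        root = dom.createElement("body");
        dom.appendChild(root);
    }

    /** Toolbar command: insert a block such as "h1" before the block at index, or append. */
    Element insertBlock(int index, String name, String text) {
        Element block = dom.createElement(name);
        block.setTextContent(text);
        NodeList children = root.getChildNodes();
        Node reference = (index >= 0 && index < children.getLength()) ? children.item(index) : null;
        root.insertBefore(block, reference); // a null reference appends at the end
        return block;
    }

    /** Formatting change: swap the block for one with a new element name, keeping its text. */
    Element changeFormat(Element block, String newName) {
        Element replacement = dom.createElement(newName);
        replacement.setTextContent(block.getTextContent());
        root.replaceChild(replacement, block);
        return replacement;
    }

    /** Text-only change: the shape of the tree stays the same. */
    void changeText(Element block, String newText) {
        block.setTextContent(newText);
    }

    /** Compact view of the tree for inspection, e.g. "h1:Title|p:Intro". */
    String outline() {
        StringBuilder out = new StringBuilder();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (out.length() > 0) {
                out.append('|');
            }
            out.append(n.getNodeName()).append(':').append(n.getTextContent());
        }
        return out.toString();
    }
}
```

A DOM element cannot be renamed in place with the Level 2 API used here, which is why changeFormat creates a replacement node, matching the "create a new DOM node and insert it" step in the text.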

The next problem is how to identify the DOM node corresponding to the place where the user is editing. If the user is creating the document from scratch this is not a big issue, since the editing position is normally the bottom of the current document. In that case I just need to append new DOM nodes after the last node of the DOM tree and so extend the tree. The difficulty arises when the user is editing a document that already exists. There we first parse the document and create the DOM tree for it, and then identify the DOM node where the user is editing. This identification has to be done dynamically.

I can use the DOM model and caret positions to handle this issue. A caret is a place within a document view that represents where things can be inserted into the document model. The caret has a dot, which is the location of the caret in the document model, and a mark, which represents the other end of a selection. If there is no selection, the dot and the mark have the same value; if they differ, there is a selection. In Swing this caret is exposed by JTextComponent. Our document will be shown with SWT components, and SWT has no JTextComponent, but its StyledText widget reports the caret offset and the selection range in the same way. So while the user is typing, the system has access to the caret position, and I can identify the location in the editor where the user is currently typing.

I can then use a DOM attribute to store the location of the text for each markup element. Whenever the user types in the document, I retrieve the location he/she is typing at and search the DOM tree for the node whose stored location covers that caret position, using a simple search over the node attributes. This way I can identify the DOM node.

There is a slight problem with this approach, because the caret position depends on the viewer. Different users will have different caret positions for the same document because of the different viewers they use. We can work around this by updating the caret position attributes when a user opens a file in the viewer/editor, and by updating them again when the user resizes the viewer/editor, while the text is being redrawn.
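A minimal sketch of the attribute-based lookup follows. It assumes each block is displayed on its own line, so a block covers the character offsets from its start to its end inclusive and the next block starts one character later. The attribute names "data-start" and "data-end" and the class CaretLookupSketch are hypothetical; in the real editor the caret offset would come from the text widget.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical lookup of the DOM block under a caret offset. */
final class CaretLookupSketch {
    static final String START = "data-start"; // assumed attribute names
    static final String END = "data-end";

    static Element newRoot() {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = dom.createElement("body");
            dom.appendChild(root);
            return root;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Element addBlock(Element root, String name, String text) {
        Element block = root.getOwnerDocument().createElement(name);
        block.setTextContent(text);
        root.appendChild(block);
        return block;
    }

    /** Recomputes the offsets of every block; call after opening or changing the document. */
    static void assignOffsets(Element root) {
        int offset = 0;
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element block = (Element) n;
            int length = block.getTextContent().length();
            block.setAttribute(START, Integer.toString(offset));
            block.setAttribute(END, Integer.toString(offset + length));
            offset += length + 1; // +1 for the line break that separates blocks in the view
        }
    }

    /** Simple linear search; returns null when the offset lies outside every block. */
    static Element blockAt(Element root, int caretOffset) {
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE || !((Element) n).hasAttribute(START)) {
                continue;
            }
            Element block = (Element) n;
            int start = Integer.parseInt(block.getAttribute(START));
            int end = Integer.parseInt(block.getAttribute(END));
            if (caretOffset >= start && caretOffset <= end) {
                return block;
            }
        }
        return null;
    }
}
```

The end offset is treated as inclusive because the caret can sit just after the last character of a block. Calling assignOffsets again after each change is the sketch's counterpart to the attribute updates described above.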

The next thing to explain is how I am going to obtain the markup from the DOM model. As explained in the section above, every DOM node is created by a command from the user, so the name of the DOM node equals the name of the markup element and its text equals the text of that element. A DOM node is therefore a meaningful representation of a piece of markup in the document. To retrieve the markup from the DOM tree, I traverse the tree, extract the node name and text from every DOM node, and validate the resulting markup against the selected language. This is the markup-specific emitter mentioned in an earlier section; only markup that is valid for the language is selected by this validator/emitter. I can thus obtain the markup and its values from the DOM tree and save them to the source file. When the source file has to be converted to another format, such as Eclipse Help or HTML, the existing conversion process used by the current editor can be reused unchanged, since it already works.
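The traversal and validation step might be sketched as follows for Textile, including the nested numbered list from the earlier example. The class TextileDomEmitterSketch, the set of supported element names and the choice to skip (and record) unsupported elements are all assumptions for illustration. The sketch also assumes that Textile list markers are followed by a space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical validator/emitter: walks the DOM and writes Textile for supported nodes only. */
final class TextileDomEmitterSketch {
    private static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("body", "h1", "h2", "h3", "p", "ol", "li"));

    private final StringBuilder out = new StringBuilder();
    private final List<String> rejected = new ArrayList<>();

    String emit(Element root) {
        visit(root, 0);
        return out.toString().trim();
    }

    /** Names of the elements that were not valid for this language and were skipped. */
    List<String> rejected() {
        return rejected;
    }

    private void visit(Element e, int listDepth) {
        String name = e.getNodeName();
        if (!SUPPORTED.contains(name)) {
            rejected.add(name); // invalid for this language: skip the whole subtree
            return;
        }
        int depth = name.equals("ol") ? listDepth + 1 : listDepth;
        if (name.matches("h[1-3]")) {
            out.append(name).append(". ").append(directText(e)).append("\n\n");
        } else if (name.equals("p")) {
            out.append(directText(e)).append("\n\n");
        } else if (name.equals("li")) {
            for (int i = 0; i < depth; i++) {
                out.append('#'); // one '#' per nesting level
            }
            out.append(' ').append(directText(e)).append('\n');
        }
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                visit((Element) n, depth);
            }
        }
        if (name.equals("ol") && listDepth == 0) {
            out.append('\n'); // blank line after a top-level list
        }
    }

    /** Text held directly by the element, excluding the text of nested elements. */
    private static String directText(Element e) {
        StringBuilder text = new StringBuilder();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                text.append(n.getNodeValue());
            }
        }
        return text.toString().trim();
    }

    static Element newRoot() {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = dom.createElement("body");
            dom.appendChild(root);
            return root;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    static Element child(Element parent, String name, String text) {
        Document dom = parent.getOwnerDocument();
        Element e = dom.createElement(name);
        if (text != null) {
            e.appendChild(dom.createTextNode(text));
        }
        parent.appendChild(e);
        return e;
    }
}
```

An emitter for another language would keep the same traversal and replace only the supported set and the per-element syntax.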