Suppose you’re writing context-sensitive help for a cross-platform application that’s translated into six languages, including Japanese. If you were writing help for an Eclipse-based application, it would be easy. The Eclipse user assistance infrastructure is powerful and flexible. The format is simple, has internationalization built in, and can be produced from a variety of sources including DocBook, DITA, and some traditional HATs (Help Authoring Tools). Even if you’re writing help for a native Windows application, you could just produce a .chm file and leave it to the localization vendor to figure out how to produce the Japanese version. But often you find yourself writing help for a Web application, and in that case the choice isn’t as clear.
Sure, you can now create a .war version of an Eclipse infocenter, but that’s probably overkill for a help system that’s going to contain a single document. Moreover, many tech writers won’t have the technical skills to get it to work, and even if they do, they’ll probably meet resistance from the developers. Your project manager might worry about taxing the app server’s performance, or that if the team supports another app server they’ll have trouble getting your little .war file to work in the new environment. Often developers just want a bundle of HTML, CSS, and JS files from their writers and nothing more complicated.
Now, there are a few cross-platform formats that offer all the stuff you expect to find in a help system: a table of contents pane on the left, with a search tab and maybe an index tab alongside it. Ideally it will highlight the search results in the page. The search needs to support stemming and needs to support languages other than English, including Chinese, Japanese, and Korean (“CJK”). CJK is hard, by the way, because those languages don’t put spaces between their words, so the strategy used by most client-side search engines fails. The indexer can’t easily create a list of the terms in the help set and record which pages each term occurs on, because there’s nothing to use as a delimiter when tokenizing the words. In addition to all that, you might also want a button to hide the table of contents to maximize the content area, and you certainly want a way to sync the table of contents with the content pane, so the user can see where a topic sits in the table of contents when jumping from topic to topic via cross references or other links. Oh, and you must have a way to deep-link into the help set so you can do contextual help. It’s not such a hard thing, but for too long our options have been limited to a few commercial tools of questionable overall quality. In fact, on this very blog, Janet has called this the “holy grail”.
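To make the tokenizing problem concrete, here is a minimal sketch in Python of a term-to-page index. It splits Latin text into words as usual and, as an assumption on my part, handles CJK text by indexing overlapping two-character pairs (bigrams), which is one common workaround when there are no spaces to split on. The function names, the page names, and the rough character ranges are all made up for illustration; this is not the code that any particular help tool ships.

```python
from collections import defaultdict
from itertools import groupby


def is_cjk(ch: str) -> bool:
    """Rough check for common CJK ranges (illustrative, not exhaustive)."""
    code = ord(ch)
    return (
        0x4E00 <= code <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= code <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= code <= 0xD7AF   # Hangul syllables
    )


def char_kind(ch: str) -> str:
    """Classify a character as CJK, ordinary word character, or separator."""
    if is_cjk(ch):
        return "cjk"
    if ch.isalnum():
        return "word"
    return "sep"


def tokenize(text: str) -> list[str]:
    """Words for space-delimited scripts, overlapping bigrams for CJK runs."""
    tokens = []
    for kind, group in groupby(text, key=char_kind):
        run = "".join(group)
        if kind == "word":
            tokens.append(run.lower())
        elif kind == "cjk":
            if len(run) == 1:
                tokens.append(run)
            else:
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of pages on which it occurs."""
    index = defaultdict(set)
    for page, text in pages.items():
        for term in tokenize(text):
            index[term].add(page)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the pages that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


# Hypothetical two-page help set.
pages = {
    "install.html": "Installing the help system",
    "search_ja.html": "ヘルプを検索する",
}
index = build_index(pages)
```

Splitting the Japanese page on whitespace yields a single token containing the whole sentence, so a query for 検索 would never match it. With bigrams, `search(index, "検索")` finds `search_ja.html`, at the cost of a larger index and the occasional false match across word boundaries.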
Enter the Google Summer of Code 2010 and Kasun Gajasinghe, a student at the University of Moratuwa in Sri Lanka. This past spring, nudged along by Dick Hamilton and Stefan Seefeld, the DocBook Open Repository project applied to be a mentoring organization for the first time. Why hadn’t we done this before? No idea. I guess we all had our heads down in our own problems and it didn’t occur to anyone that Summer of Code would be a great way to advance DocBook development while introducing some bright students to the ins and outs of open source development. In any case, I’m very happy that Dick and Stefan got the ball rolling and were willing to administer our participation in GSoC as well as mentor students, and I’m especially happy that Kasun submitted his proposal to provide webhelp for DocBook.
The features that he implemented include:
- Full text search with:
  - Stemming support for English, French, and German. Stemming support can be added for other languages by implementing a stemmer.
  - Support for Chinese, Japanese, and Korean using code from the Lucene search engine.
  - Search highlighting that shows where the searched-for term appears in the results.
  - Search results that can include brief descriptions of the target.
- Table of contents pane with collapsible TOC tree.
- Auto-synchronization of content pane and TOC.
- TOC and search pane implemented without the use of a frameset.
- An Ant build.xml file to generate output.
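For readers who haven’t met stemming before, here is a rough Python sketch of the idea: reduce each word to a common stem before indexing and before searching, so that “searching” and “searched” both match “search”. The suffix-stripping function below is a deliberately crude toy, and the per-language registry is a hypothetical design of my own to show what “implementing a stemmer” for another language could look like. Neither is the stemming code or the plug-in mechanism that webhelp actually uses.

```python
from typing import Callable


def toy_english_stem(word: str) -> str:
    """Strip a few common English suffixes (a toy, not a real stemming algorithm)."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical registry: language code -> stemming function.
STEMMERS: dict[str, Callable[[str], str]] = {"en": toy_english_stem}


def stem(word: str, lang: str) -> str:
    """Stem with the registered stemmer, or just lowercase if none exists."""
    stemmer = STEMMERS.get(lang, str.lower)
    return stemmer(word)
```

Adding a language in this sketch means registering one more function under its language code; languages without an entry simply fall back to lowercasing.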
Kasun's work from his Summer of Code project is now part of version 1.76.1 of the DocBook XSLs. Please take a look and share your thoughts below.
Thanks, Janet, for letting me hijack your blog!