Rebuild Gearpump documentation with MkDocs

2016-09-25Home

(Disclamer: I am a committer of Apache Gearpump(incubating))

What's wrong with Jekyll

Apache Gearpump(incubating) website is currently built with Jekyll, the most popular static site generator. Most of the contents are written in Markdown while Jekyll allows you to mix in HTMLs. With the default "Bootstrap" style theme, the website has a two-level top navigation bar. That is good for a general blog but not for documentations because the items are hidden in the pull-down menu and there is no table of contents for each page. First-time visitors won't know what to look for or which menu to click on.

I prefer the "ReadTheDocs" style like Confluent's Doc site where all the items are shown on the sidebar. If you click on one item, it will be collapsed into a multi-level table of contents and it's easy to go back and forth. I've been searching for such themes in Jekyll but the closest I've found is Jekyll Doc Theme. However, the table of contents of a page lies at the head of contents which I think is not convenient for navigation. Confluent's Doc site is built with Sphinx and the sources are written in reStructuredText. I didn't want to raise the bar for contributors by learning another markup language so I started to look for alternative generators that support Markdown.

MkDocs is by far the best answer with both Markdown sources and built-in "ReadTheDocs" theme. Then I started to rebuild documentations with MkDocs. It's not a smooth transition and I'd like to record the roadblocks.

Roadblocks

Global Variables

We may have version numbers in the documentation here and there. For example,

### To run WordCount example
bin/gear app -jar examples/wordcount-2.11-0.8.1-assembly.jar org.apache.gearpump.streaming.examples.wordcount.WordCount

There are two version numbers here, scala version(2.11) and gearpump version(0.8.1). We don't want to hardcode the numbers which would mean updating all the numbers whenever a new version of Gearpump is released. We want variables. In Jekyll, we can define version numbers as variables in configuration files. Unfortunately, MkDocs hasn't supported variables for Markdown source yet and the work is still under discussion.

In the discussion thread, someone shared a work-around where markdown sources are pre-processed with a template system like mustache and variables will be replaced with corresponding values. The previous example is now written as

{% raw %}

### To run WordCount example
bin/gear app -jar examples/wordcount-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}-assembly.jar org.apache.gearpump.streaming.examples.wordcount.WordCount

{% endraw %}

and the variables are defined in a yaml file

---
GEARPUMP_VERSION: "0.8.2-SNAPSHOT"
SCALA_BINARY_VERSION: "2.11"
SCALA_VERSION: "2.11.8"
---

Also, I added scripts to copy documentations to a temporary directory and traverse down the directory pre-processing source with mustache.

Finally, update docs_dir configuration from the default docs to tmp in mkdocs.yml.

docs_dir: tmp

My first thought was to pre-process the HTMLs in site but they got overwritten when I run mkdocs serve.

Code Blocks

Existing code examples are written in fenced code blocks from GitHub Flavored Markdown. MkDocs uses Python Markdown library to translate Markdown files into HTML. Python-Markdown has fenced_code extensions but there is a warning that

Fenced Code Blocks are only supported at the document root level. Therefore, they cannot be nested inside lists or blockquotes.

A downside of markdown is many implementations do not enforce markdown syntax, for example, "each subsequent paragraph in a list item must be indented by either 4 spaces or one tab". This is enforced in Python-Markdown.

For code blocks inside list, we indent by 8 spaces or two tabs and use codehilite extensions.

1. item1 

		:::scala
		# Code goes here ...

2. item2
3. item3

There is a bug when using both fenced_code and codehilite extensions. Meanwhile I think it's better to stick to one style so I replaced all fenced code blocks with codehilite syntax.

Remember to add the markdown extension in mkdocs.yml,

markdown_extensions:
    - codehilite:
        use_pygments: False

I switched off Pygments highlights since it caused a line of code to overflow if it is too long.

Unsolved issues

With the above roadblocks moved aside, the new documentation is good to go although some minor issues remain unsolved yet.

Summary

Here is a comparison of documentations before and after the change.

before

after

The details are at https://github.com/apache/incubator-gearpump/pull/88.

This seems a lot of work but I believe it's worth transiting to a doc style that helps users to learn Gearpump more easily.