Continuous Checking Markup Documents with Travis, Asciidoctor, IntelliJ IDEA and RedPen

I introduced the idea of CI integration for document writing in a previous post. This spring, we finally built a continuous checking environment with Travis for the RedPen user’s manual. The following is an image of the RedPen manual.

Screen Shot 2016-07-07 at 16.50.27

The manual is written in AsciiDoc and the source file is maintained in GitHub.

What do we check?

In the CI system, we checked two aspects of the document.

  1. Document build
    The RedPen user’s manual is written in AsciiDoc. The AsciiDoc files are converted to HTML with Asciidoctor. In the CI system, we check whether the conversion succeeds.

  2. Document quality
    The quality of the document is checked with RedPen, a linting tool for markup text. We checked the document with the following settings. For the details of the RedPen configuration, please refer to the RedPen user’s manual.

<redpen-conf lang="en">
    <validators>
        <validator name="ParagraphNumber">
            <property name="max_num" value="5"/>
        </validator>
        <validator name="ParagraphStartWith">
            <property name="start_from" value=""/>
        </validator>
        <validator name="SectionLength">
            <property name="max_num" value="1000"/>
        </validator>
        <validator name="WordFrequency">
            <property name="deviation_factor" value="5.0"/>
            <property name="min_word_count" value="2000"/>
        </validator>
        <validator name="CommaNumber">
            <property name="max_num" value="3"/>
        </validator>
        <validator name="DoubleNegative"/>
        <validator name="EndOfSentence"/>
        <validator name="Hyphenation">
            <property name="list" value=""/>
            <property name="dict" value=""/>
        </validator>
        <validator name="InvalidExpression">
            <property name="list" value=""/>
            <property name="dict" value=""/>
        </validator>
        <validator name="InvalidSymbol"/>
        <validator name="InvalidWord">
            <property name="list" value=""/>
            <property name="dict" value=""/>
        </validator>
        <validator name="NumberFormat">
            <property name="decimal_delimiter_is_comma" value="false"/>
            <property name="ignore_years" value="true"/>
        </validator>
        <validator name="Quotation">
            <property name="use_ascii" value="false"/>
        </validator>
        <validator name="SentenceLength">
            <property name="max_len" value="150"/>
        </validator>
        <validator name="SuccessiveWord"/>
        <validator name="SuggestExpression">
            <property name="dict" value=""/>
        </validator>
        <validator name="SymbolWithSpace"/>
        <validator name="WeakExpression"/>
        <validator name="WordNumber">
            <property name="max_num" value="30"/>
        </validator>
        <validator name="JavaScript">
            <property name="script-path" value="js"/>
        </validator>
    </validators>
</redpen-conf>

As we can see, checks such as sentence length are primitive, but these points are important for readability. I will enhance the tests in the near future.
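To review which checks a configuration enables, the file can be read with any XML parser. The following is a minimal sketch in Python, not part of RedPen itself; the function name summarize_validators and the shortened SAMPLE_CONF are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the configuration above, inlined so the sketch is
# self-contained. In practice the text would be read from the config file.
SAMPLE_CONF = """
<redpen-conf lang="en">
    <validators>
        <validator name="SentenceLength">
            <property name="max_len" value="150"/>
        </validator>
        <validator name="CommaNumber">
            <property name="max_num" value="3"/>
        </validator>
        <validator name="DoubleNegative"/>
    </validators>
</redpen-conf>
"""


def summarize_validators(conf_xml):
    """Return {validator name: {property name: value}} for a redpen-conf document."""
    root = ET.fromstring(conf_xml)
    summary = {}
    for validator in root.iter("validator"):
        # Validators without <property> children map to an empty dict.
        props = {p.get("name"): p.get("value") for p in validator.iter("property")}
        summary[validator.get("name")] = props
    return summary


if __name__ == "__main__":
    for name, props in summarize_validators(SAMPLE_CONF).items():
        print(name, props)
```

A summary like this makes it easier to compare the thresholds used in CI with those used in the editor.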

Setting CI

For CI testing of the RedPen manual, we use Travis CI, a popular continuous integration service. The following is the Travis CI setting for document checking.

language: ruby

rvm:
- 2.0.0-p598

jdk:
- oraclejdk8

env:
- URL=https://github.com/redpen-cc/redpen/releases/download/redpen-1.6.1

install:
- wget $URL/redpen-1.6.1.tar.gz
- tar xvf redpen-1.6.1.tar.gz
- export PATH=$PATH:$PWD/redpen-distribution-1.6.1/bin
- gem install asciidoctor
- gem install coderay
- gem install --pre asciidoctor-pdf
- sudo apt-get update && sudo apt-get install oracle-java8-installer

script:
- make check # Apply RedPen
- make html # Generate HTML document

The install block installs the tools for building and checking the document. The script block checks the quality of the RedPen document and tries to build the HTML files with Asciidoctor.

As we can see, the checks are run with make commands. The following is the Makefile.

# Makefile for RedPen documentation
#

# You can set these variables from the command line.
BUILDDIR = build

ASCIIDOCTOR = asciidoctor
.PHONY: help clean check html

help:
	@echo "Please use \`make <target>' where <target> is one of"
	@echo " html to make standalone HTML files"

clean:
	-rm -rf $(BUILDDIR)/*

check:
	redpen -f asciidoc source/*.adoc

html:
	mkdir -p $(BUILDDIR)/html
	cp source/*.jpg source/*.png $(BUILDDIR)/html/
	cp -r source/styles/redpen $(BUILDDIR)/html/
	$(ASCIIDOCTOR) -a source-highlighter=coderay -a stylesdir=styles -a target-version=1.6 -d book -b html5 source/index.adoc -D$(BUILDDIR)/html
	@echo "Build finished. The HTML pages are in $(BUILDDIR)/html"
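The same two steps can also be driven from a script, for example when make is not available. The following is a rough Python sketch under stated assumptions: the function names are hypothetical, the command-line flags are copied from the Makefile above, the image and stylesheet copying of the html target is omitted, and stopping at the first failure is a choice made for this sketch.

```python
import glob
import subprocess


def check_command(source_dir="source"):
    """argv for the RedPen check, mirroring the `check` target."""
    files = sorted(glob.glob(f"{source_dir}/*.adoc"))
    return ["redpen", "-f", "asciidoc", *files]


def html_command(source_dir="source", build_dir="build"):
    """argv for the Asciidoctor build, mirroring the last step of the `html` target."""
    return [
        "asciidoctor",
        "-a", "source-highlighter=coderay",
        "-a", "stylesdir=styles",
        "-a", "target-version=1.6",
        "-d", "book",
        "-b", "html5",
        f"{source_dir}/index.adoc",
        "-D", f"{build_dir}/html",
    ]


def run_all(runner=subprocess.call):
    """Run the check, then the build; return the first non-zero exit code, or 0."""
    for argv in (check_command(), html_command()):
        code = runner(argv)
        if code != 0:
            return code
    return 0


if __name__ == "__main__":
    raise SystemExit(run_all())
```

Passing the runner as a parameter keeps the sketch testable without RedPen or Asciidoctor installed.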

Check documents in an editor

I have been writing the document in IntelliJ IDEA with plugins for document writing. IntelliJ IDEA is a popular Java IDE, but it is also useful for writing documents. In addition, we can check documents with the IntelliJ IDEA plugin for RedPen, reusing the configuration file from the CI. For details of the IntelliJ IDEA plugin for RedPen, please see this blog post.

The image below shows the plugin in use.

Screen Shot 2016-07-07 at 15.51.11

Results

We got the green badge from Travis, as shown in the following image.

travis-result
Travis Badge

Summary and Future Work

This article shows how to implement continuous checking of markup documents with CI services and editors. The system checks the build with Asciidoctor and the quality with RedPen. We will enhance the checks by adding more validators.

Annotation for Suppressing Errors Reported from RedPen, a Linting tool for Markup text

RedPen version 1.6 supports error suppression with text annotations. This feature gives us a good balance between quality and productivity in document writing.

Sometimes we do not want to fix errors reported by RedPen, a text linting tool. In most cases, the reason is that the cost of removing the error is high, or that the writer breaks the writing standard at a particular point on purpose. For such cases, error suppression by annotation is useful. The annotations are added just before the sections containing the errors.

Currently, error suppression is supported for four formats (AsciiDoc, Markdown, Re:VIEW, LaTeX). In the following section, I will show a sample text containing the error suppression annotation.

The sample uses AsciiDoc, a popular format that is adopted by GitBook.

Sample: error suppression in AsciiDoc text

For AsciiDoc text, writers add the suppress annotation in an attribute block. The annotation is [suppress]. For example, the following AsciiDoc text suppresses all the errors in the section.

[suppress]
= Instances
Some software tools work in more than one machine, and such distributed (cluster)systems can handle huge data or tasks, because such software tools make use of large amount of computer resources, such as CPU, Disk, and Memory.

When we apply RedPen to the AsciiDoc file, we get the following messages.

$ redpen sample.asciidoc
redpen redpen-suppress.asciidoc
[2016-06-14 16:10:43.850][INFO ] cc.redpen.Main - Configuration file: /usr/local/Cellar/redpen/1.6.1/libexec/conf/redpen-conf-en.xml
[2016-06-14 16:10:43.856][INFO ] cc.redpen.config.ConfigurationLoader - Loading config from specified config file: "/usr/local/Cellar/redpen/1.6.1/libexec/conf/redpen-conf-en.xml"
[2016-06-14 16:10:43.867][INFO ] cc.redpen.config.ConfigurationLoader - Succeeded to load configuration file
[2016-06-14 16:10:43.867][INFO ] cc.redpen.config.ConfigurationLoader - Language is set to "en"
[2016-06-14 16:10:43.867][WARN ] cc.redpen.config.ConfigurationLoader - No variant configuration...
[2016-06-14 16:10:43.868][INFO ] cc.redpen.config.ConfigurationLoader - No "symbols" block found in the configuration
[2016-06-14 16:10:43.872][INFO ] cc.redpen.config.SymbolTable - Default symbol settings are loaded
[2016-06-14 16:10:43.923][INFO ] cc.redpen.parser.SentenceExtractor - "[., ?, !]" are added as a end of sentence characters
[2016-06-14 16:10:43.924][INFO ] cc.redpen.parser.SentenceExtractor - "[', "]" are added as a right quotation characters
[2016-06-14 16:10:44.064][INFO ] org.reflections.Reflections - Reflections took 71 ms to scan 1 urls, producing 4 keys and 46 values
[2016-06-14 16:10:44.231][INFO ] cc.redpen.util.DictionaryLoader - Succeeded to load UnexpandedAcronymValidator default dictionary.
[2016-06-14 16:10:44.237][INFO ] cc.redpen.util.DictionaryLoader - Succeeded to load weak expressions.
[2016-06-14 16:10:44.243][INFO ] cc.redpen.util.DictionaryLoader - Succeeded to load word frequencies.
[2016-06-14 16:10:44.245][INFO ] cc.redpen.validator.JavaScriptValidator - JavaScript validators directory: js

We can see that there are no errors in the output.

When we want to suppress only specific errors, we add the validator names after suppress. The following example suppresses only two types of errors (Contraction and WeakExpression) in the section.

[suppress='Contraction WeakExpression']
= Instances
Some software tools work in more than one machine, and such distributed (cluster)systems can handle huge data or tasks, because such software tools make use of large amount of computer resources, such as CPU, Disk, and Memory.

When we apply RedPen to the AsciiDoc file, we get the following messages.

redpen sample2.asciidoc
[2016-06-14 16:13:38.005][INFO ] cc.redpen.Main - Configuration file: /usr/local/Cellar/redpen/1.6.1/libexec/conf/redpen-conf-en.xml
[2016-06-14 16:13:38.010][INFO ] cc.redpen.config.ConfigurationLoader - Loading config from specified config file: "/usr/local/Cellar/redpen/1.6.1/libexec/conf/redpen-conf-en.xml"
[2016-06-14 16:13:38.019][INFO ] cc.redpen.config.ConfigurationLoader - Succeeded to load configuration file
[2016-06-14 16:13:38.019][INFO ] cc.redpen.config.ConfigurationLoader - Language is set to "en"
[2016-06-14 16:13:38.019][WARN ] cc.redpen.config.ConfigurationLoader - No variant configuration...
[2016-06-14 16:13:38.020][INFO ] cc.redpen.config.ConfigurationLoader - No "symbols" block found in the configuration
[2016-06-14 16:13:38.023][INFO ] cc.redpen.config.SymbolTable - Default symbol settings are loaded
[2016-06-14 16:13:38.082][INFO ] cc.redpen.parser.SentenceExtractor - "[., ?, !]" are added as a end of sentence characters
[2016-06-14 16:13:38.083][INFO ] cc.redpen.parser.SentenceExtractor - "[', "]" are added as a right quotation characters
[2016-06-14 16:13:38.200][INFO ] org.reflections.Reflections - Reflections took 63 ms to scan 1 urls, producing 4 keys and 46 values
[2016-06-14 16:13:38.349][INFO ] cc.redpen.util.DictionaryLoader - Succeeded to load UnexpandedAcronymValidator default dictionary.
[2016-06-14 16:13:38.353][INFO ] cc.redpen.util.DictionaryLoader - Succeeded to load weak expressions.
[2016-06-14 16:13:38.361][INFO ] cc.redpen.util.DictionaryLoader - Succeeded to load word frequencies.
[2016-06-14 16:13:38.363][INFO ] cc.redpen.validator.JavaScriptValidator - JavaScript validators directory: js
redpen-suppress-2.asciidoc:3: ValidationError[SentenceLength], The length of the sentence (226) exceeds the maximum of 120. at line: Some software tools work in more than one machi\
ne, and such distributed (cluster)systems can handle huge data or tasks, because such software tools make use of large amount of computer resources, such as CPU, Disk, and Memory.
redpen-suppress-2.asciidoc:3: ValidationError[CommaNumber], The number of commas (6) exceeds the maximum of 3. at line: Some software tools work in more than one machine, and such \
distributed (cluster)systems can handle huge data or tasks, because such software tools make use of large amount of computer resources, such as CPU, Disk, and Memory.
redpen-suppress-2.asciidoc:3: ValidationError[SymbolWithSpace], Need whitespace after symbol ")". at line: Some software tools work in more than one machine, and such distributed (\
cluster)systems can handle huge data or tasks, because such software tools make use of large amount of computer resources, such as CPU, Disk, and Memory.
[2016-06-14 16:13:38.411][ERROR] cc.redpen.Main - The number of errors "3" is larger than specified (limit is "1").

We can see that only the errors not specified in the annotation are reported.
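The behavior of the two annotation forms can be summarized in a few lines of code. The following Python sketch only illustrates the semantics described above; it is not how RedPen implements the feature. The names parse_suppress and filter_errors, and the list of validator names used as input, are hypothetical.

```python
import re

# Matches [suppress] or [suppress='Name1 Name2'] on a line of its own.
SUPPRESS_RE = re.compile(r"^\[suppress(?:=(['\"])(?P<names>.*?)\1)?\]\s*$")


def parse_suppress(line):
    """Return None for a non-annotation line, an empty set for a bare
    [suppress] (suppress everything), or the set of listed validator names."""
    match = SUPPRESS_RE.match(line.strip())
    if match is None:
        return None
    names = match.group("names")
    return set(names.split()) if names else set()


def filter_errors(validator_names, suppressed):
    """Keep only the errors that the annotation does not suppress."""
    if suppressed is None:      # no annotation: report everything
        return list(validator_names)
    if not suppressed:          # bare [suppress]: report nothing
        return []
    return [name for name in validator_names if name not in suppressed]


if __name__ == "__main__":
    # Hypothetical errors raised for one section, identified by validator name.
    errors = ["SentenceLength", "CommaNumber", "SymbolWithSpace", "WeakExpression"]
    annotation = parse_suppress("[suppress='Contraction WeakExpression']")
    print(filter_errors(errors, annotation))
```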

Summary and Future work

This article demonstrates error suppression by text annotation. The next release of the RedPen IntelliJ plugin is going to support a quick fix that inserts the suppress annotation.

We added Russian to RedPen, now it’s your turn…

RedPen proved to be an extremely flexible and universal tool. We managed to add basic Russian language support without any major changes to the code. In fact, the only thing that we had to implement was Russian language auto-detection. The rest worked out of the box with a simple modification to the default symbols configuration: in Russian, «quote» is used instead of “quote” and № instead of #. Most likely, it will work equally well for Ukrainian and Belarusian, and maybe for some other languages using the Cyrillic script. This is how it looks in the RedPen IntelliJ plugin:

russian

In fact, custom languages can be added to RedPen in IntelliJ IDEA without making any modifications to RedPen itself. Use Settings -> Editor -> RedPen -> Import.

import-settings

For example, to add the Russian language with correct double quotation marks, a number sign, and a couple of validators, all you need to do is import a file with the following contents:

<redpen-conf lang="ru">
    <validators>
        <validator name="InvalidSymbol"/>
        <validator name="NumberFormat">
            <property name="decimal_delimiter_is_comma" value="false"/>
            <property name="ignore_years" value="true"/>
        </validator>
    </validators>
    <symbols>
        <symbol name="NUMBER_SIGN" value="№" invalid-chars="##" before-space="true"/>
        <symbol name="LEFT_DOUBLE_QUOTATION_MARK" value="«" invalid-chars="&quot;" before-space="true"/>
        <symbol name="RIGHT_DOUBLE_QUOTATION_MARK" value="»" invalid-chars="&quot;" after-space="true"/>
    </symbols>
</redpen-conf>

Only symbols that are different from English configuration should be listed. As for validators, you need to list all that you would like to use.

Most European/Western languages should also work fine with RedPen either using the default English configuration or a slight modification of it.

If you decide to add your own custom language, it is a good idea to export the English language configuration via Settings -> Editor -> RedPen -> Export and change the language name in the lang attribute of the redpen-conf tag in the resulting file. It will serve as a good template to start with.

Then, optionally, you can add spelling and/or other dictionaries by specifying file names in either dict or list validator properties. Dictionary files are just text files with words listed one per line.
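The template step and the dictionary format can both be scripted. The following Python sketch assumes an exported configuration is available as a string; the function names retarget_language and dictionary_text are hypothetical, and sorting and de-duplicating the words is a convenience of this sketch, not a RedPen requirement.

```python
import xml.etree.ElementTree as ET


def retarget_language(conf_xml, new_lang):
    """Return a copy of an exported redpen-conf with its lang attribute changed."""
    root = ET.fromstring(conf_xml)
    if root.tag != "redpen-conf":
        raise ValueError("expected a <redpen-conf> root element")
    root.set("lang", new_lang)
    return ET.tostring(root, encoding="unicode")


def dictionary_text(words):
    """Render a dictionary file body: one word per line."""
    unique = sorted({w.strip() for w in words if w.strip()})
    return "".join(word + "\n" for word in unique)


if __name__ == "__main__":
    exported = (
        '<redpen-conf lang="en"><validators>'
        '<validator name="InvalidSymbol"/>'
        "</validators></redpen-conf>"
    )
    print(retarget_language(exported, "ru"))
    print(dictionary_text(["oranges", "apples", "oranges"]), end="")
```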

RedPen IntelliJ IDEA Plugin

To make usage of RedPen among developers even easier we created an Intellij IDEA Plugin that also works with recent releases of other JetBrains IDEs. This plugin integrates RedPen text validation by adding a new RedPen inspection.

editor

By default, RedPen validation errors are underlined with red (Intellij error style), but you can change it to yellow (warning) or any other highlighting style in Settings -> Editor -> Inspections -> Code style issues -> RedPen Validation.

Alternatively, raw validation error messages can be listed by pressing Ctrl+Alt+Shift+R or via IDEA menu Analyze -> RedPen: List Errors having a file selected either in editor or in the Project pane.

Installation

The plugin is available in JetBrains Plugin Repository and can be installed the same way as any other IDEA plugin.

Just open Settings -> Plugins -> Browse Repository, and search for RedPen to install.

File formats

RedPen plugin supports the following file formats provided that the relevant plugins are installed:

  • Plain Text
  • Properties and Resource Bundles
  • Markdown
  • AsciiDoc

Language support

The plugin supports all default RedPen languages and variants (currently, English and Japanese). Language and variant are auto-detected for each file, but can be manually overridden per file via status bar widget. Manually chosen language will be saved to .idea/redpen/files.xml and therefore selection will be preserved within the project.

status-widget

Quick fixes

Some validation errors can be fixed via quick fix (Alt+Enter when cursor is on an error). If no specific fix is available, it will at least offer you to remove the erroneous text. We will be adding more specific quick fixes in later releases.

quick-fix

RedPen configuration

RedPen is highly customizable with its configuration files, where you can define specific validators, change their properties or configure valid and invalid symbols for your writing style.

All the same can be done using RedPen configuration in Settings -> Editor -> RedPen.

Screenshot from 2016-03-10 13-17-45

Validators can be disabled by unchecking them, and their properties can be edited by double-clicking on them in the table. Different properties are separated by semicolons, so you can use comma-separated values for e.g. list properties, which makes short custom dictionaries possible (see Advanced topics for details). Spaces after = are not trimmed, which allows you to have space-only values for e.g. the start_from property.

Screenshot from 2016-03-10 13-18-01

If you already have configuration files in xml format that you previously used with command-line version of RedPen, you can import them in the Settings dialog using the Import button. In a similar way Export button allows you to save current configuration snapshot for future use in other projects.

Configuration is edited or imported for each language and variant separately. If you have changed the default configuration for some language and variant pair, it will be stored per project under .idea/redpen directory, so it can be shared with fellow developers by committing it to version control.

Advanced topics

In case you want to edit raw xml configuration files under .idea/redpen, make sure you either reload the project or switch focus away from IDEA for the changes to take effect.

Many RedPen validators support custom dictionaries. In most cases, they provide two properties, list and dict.

You can use the list property to provide a short inline dictionary, just separate words with commas, e.g. list=apples,oranges. Do not put spaces between the words.

Longer custom dictionaries can be put into separate files under .idea/redpen directory. Once the file is there, you can use the dict property to specify its name, e.g. dict=mywords.txt
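The property syntax used in the settings table (semicolons between properties, commas inside a list value, spaces after = kept) can be modeled as follows. This is a minimal Python sketch of the described rules; the function name parse_properties is hypothetical, and the plugin's actual parsing may differ in edge cases.

```python
def parse_properties(text):
    """Split 'key=value;key=value' into a dict, keeping values verbatim."""
    props = {}
    for chunk in text.split(";"):
        if "=" not in chunk:
            continue  # skip empty or malformed chunks
        key, value = chunk.split("=", 1)
        # Only the key is trimmed; spaces after '=' are part of the value.
        props[key.strip()] = value
    return props


if __name__ == "__main__":
    props = parse_properties("list=apples,oranges;dict=mywords.txt")
    print(props["list"].split(","))  # inline dictionary words
    print(props["dict"])             # dictionary file name
```

Keeping the value verbatim is what makes a space-only value such as start_from possible.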

JavaScriptValidator is a special one: it allows you to write additional custom validators in JavaScript. By default, you can put such scripts in the .idea/redpen/js directory or override the location using the script_path property. All custom validators from *.js files will be activated if JavaScriptValidator is enabled in Settings.

RedPen WordPress Plugin is now available

As a major step towards popularization of RedPen we created a WordPress Plugin. It was never easier to auto-detect mistakes in posts before they get published. RedPen Plugin allows validating posts as you type by marking validation errors in-place in Visual WordPress editor or highlighting them on-click in Text WordPress editor. Mistake explanation can be found by hovering marked text in the post or in a list below, which shows all currently present mistakes.

screenshot-1

To make it easier to maintain multilingual websites the language of the current post is detected automatically up to a variant (e.g. different Japanese symbol widths: zenkaku or hankaku). However, manual language change is also available if auto-detection fails (e.g. for cases when multiple languages are used in a single post).

Validation starts working out of the box after simple installation from WordPress Plugin Directory. Advanced users of RedPen will find it easy to customize RedPen settings to match their needs: all validators and symbols can be easily configured via convenient GUI. Don’t hesitate to modify the configuration, it can always be reset to default by a single click.

screenshot-2

The plugin is integrated with the RedPen Server via REST API. By default, the plugin uses public RedPen installation at Heroku for validation. However, if you are uncomfortable sending your text for validation to an external server, you can easily configure the plugin to use your own instance of RedPen Server. Server location is configurable via Settings > Writing.

screenshot-3

Converting RedPen documentation to AsciiDoc

We recently converted the RedPen documentation from RST (reStructured Text) to AsciiDoc, primarily to provide a working example of AsciiDoc markup to use as a test-case for RedPen.

This initial attempt to convert the RST documents involved using pandoc v1.13.2, an automated markup translator.

We processed the resulting .adoc files using AsciiDoctor.

Pandoc correctly converted in-line formatting, and automatically provided anchors for all headings. It quickly enabled us to get basic AsciiDoc versions of the existing files.

However, although pandoc is a very useful and powerful tool, it encountered a few problems converting our RST files to AsciiDoc. This was not totally unexpected, since there are markup options that cannot be directly translated between reStructured Text and AsciiDoc. However, some of the problems encountered meant that a significant amount of text had to be reconverted by hand.

The issues we encountered were:

Missing tables

Several of the tables in the source documents were totally absent in the AsciiDoc files pandoc created. For example, this RST text:

SentenceLength validator checks the length of sentences in the input document. If the length of the sentence is greater than the specified maximum length, the validator generates a warning.

.. table::

  ============== ============= ============================
  Property       Default Value Description
  ============== ============= ============================
  ``"max_len"``  50            Maximum length of sentence.
  ============== ============= ============================

was translated by pandoc to:

 
SentenceLength validator checks the length of sentences in the input
document. If the length of the sentence is greater than the specified
maximum length, the validator generates a warning.

The table is completely absent. The correct AsciiDoc table is as follows:

[options="header"]
|====
|Property        |Default Value  |Description
|``max_len``     |50             |Maximum length of sentence.
|====
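Tables like this one are regular enough that the manual reconversion can be partly scripted. The following Python sketch converts an RST simple table into an AsciiDoc table by reading the column boundaries from the border line; it is an illustration only (the function name is hypothetical), it handles neither grid tables nor multi-line cells, and it copies cell text verbatim, so the quotes around max_len would still need to be removed by hand.

```python
import re
import textwrap

BORDER = re.compile(r"^=+( +=+)*$")

SAMPLE_RST = """
  ============== ============= ============================
  Property       Default Value Description
  ============== ============= ============================
  ``"max_len"``  50            Maximum length of sentence.
  ============== ============= ============================
"""


def rst_simple_table_to_asciidoc(rst):
    """Convert one RST simple table (borders made of '=') to an AsciiDoc table."""
    lines = [ln.rstrip() for ln in textwrap.dedent(rst).splitlines() if ln.strip()]
    if len(lines) < 3 or not BORDER.match(lines[0]):
        raise ValueError("not an RST simple table")
    # Each run of '=' in the top border marks where a column starts.
    spans = [m.span() for m in re.finditer(r"=+", lines[0])]

    def cells(line):
        out = []
        for i, (start, _end) in enumerate(spans):
            stop = None if i == len(spans) - 1 else spans[i + 1][0]
            out.append(line[start:stop].strip())
        return out

    rows = [cells(ln) for ln in lines if not BORDER.match(ln)]
    # Three border lines (top, after header, bottom) mean a header row exists.
    has_header = sum(1 for ln in lines if BORDER.match(ln)) >= 3
    result = ['[options="header"]'] if has_header else []
    result.append("|====")
    result.extend("|" + " |".join(row) for row in rows)
    result.append("|====")
    return "\n".join(result)


if __name__ == "__main__":
    print(rst_simple_table_to_asciidoc(SAMPLE_RST))
```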

Source blocks

Source blocks in AsciiDoc should be formatted as follows:

[source,xml]
----
<validators>
    <validator name="SentenceLength">
        <property name="max_len" value="200"/>
    </validator>
    <validator name="InvalidSymbol" />
    <validator name="SpaceWithSymbol" />
    <validator name="SectionLength">
        <property name="max_num" value="2000"/>
    </validator>
    <validator name="ParagraphNumber" />
 </validators>
---- 

However, the converted version was translated as:

code,sourceCode,xml------------------------------------------------------------------------------------------------
code,sourceCode,xml
<validators>
    <validator name="SentenceLength">
        <property name="max_len" value="200"/>
    </validator>
    <validator name="InvalidSymbol" />
    <validator name="SpaceWithSymbol" />
    <validator name="SectionLength">
        <property name="max_num" value="2000"/>
    </validator>
    <validator name="ParagraphNumber" />
 </validators>
------------------------------------------------------------------------------------------------

The format did not render properly when processed with AsciiDoctor.

Unsupported :ref: and :doc:

The RST source files included :ref: and :doc: references. For example, this text:

RedPen supports default symbols for "en" and "ja", which are 
described in :ref:`en-default-symbol-setting` 
and :ref:`ja-default-symbol-setting`.

was converted to:

RedPen supports default symbols for ``en'' and ``ja'', which are
described in en-default-symbol-setting 
and ja-default-symbol-setting.

We were hoping for something more like the following which, although not 100% correct, would preserve the notion that a link was involved:

RedPen supports default symbols for "en" and "ja", which are 
described in <<en-default-symbol-setting>> 
and <<ja-default-symbol-setting>>.
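A rewrite of this kind is easy to apply to the RST source before or after running pandoc. The following Python sketch is one possible approach, not something pandoc provides: it turns :ref: roles into AsciiDoc cross references and leaves :doc: roles untouched, since those point at documents rather than anchors. The function name convert_refs is hypothetical.

```python
import re

ROLE_RE = re.compile(r":ref:`([^`]+)`")
TITLED_RE = re.compile(r"(.*?)\s*<([^<>]+)>", re.S)


def _to_xref(match):
    body = match.group(1).strip()
    titled = TITLED_RE.fullmatch(body)
    if titled and titled.group(1):
        # :ref:`Link text <label>` becomes <<label,Link text>>
        return f"<<{titled.group(2)},{titled.group(1)}>>"
    if titled:
        return f"<<{titled.group(2)}>>"
    # :ref:`label` becomes <<label>>
    return f"<<{body}>>"


def convert_refs(text):
    """Replace RST :ref: roles with AsciiDoc <<...>> cross references."""
    return ROLE_RE.sub(_to_xref, text)


if __name__ == "__main__":
    sample = (
        "described in :ref:`en-default-symbol-setting` \n"
        "and :ref:`ja-default-symbol-setting`."
    )
    print(convert_refs(sample))
```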

Heading levels

Although pandoc converted our headings perfectly correctly, it used the AsciiDoc heading style which is arguably the least flexible. Pandoc chose the following (correct) translation:

Heading Level 1
---------------
Heading Level 2
~~~~~~~~~~~~~~~
Heading Level 3
^^^^^^^^^^^^^^^

However, we would have preferred the easier-to-edit alternate format:

= Heading Level 1
== Heading Level 2
=== Heading Level 3
==== Heading Level 4

Other considerations

AsciiDoctor does not currently support the creation of a table of contents that spans multiple source documents. Given this limitation, we decided to combine all RedPen documents into a single HTML page. Although the page is larger, it is easier to navigate on some devices, and does not require any additional tools to build a surrounding multi-document index or menu.

Summary

Converting documents between different markup formats is reasonably straightforward, and tools such as pandoc are excellent choices for making headway quickly and easily. However, such tools do not always support all markup formats equally, and there may still be plenty of manual validation and editing to get your documents in good order.

Although pandoc v1.13.2 is not the current version, it was the version available at the time on Fedora 23. However, at the time of writing, the latest version available via http://pandoc.org/try/ still appears to produce the same translation as we encountered.

Released RedPen v1.4 (LaTeX support)

We released RedPen v1.4. We hope that you will download it from the following URL and try using it.

https://github.com/redpen-cc/redpen/releases/tag/v1.4.0

The centerpiece of release v1.4 is support for the LaTeX format.

LaTeX support

In this release, we provided experimental support for LaTeX as an input format. Many people have requested LaTeX support, starting from the initial development of RedPen. Since the v0.6 release, we took one year to support LaTeX, and we finally succeeded.

Unfortunately, the LaTeX support is limited in the following ways:

  1. RedPen LaTeX parser does not work well when macros are used to add your own tags
  2. It does not support a complete check of sentence in lists and tables

Although these are big constraints, we believe that RedPen can still be used to inspect papers and documents.

Enhancement of functions (Validators)

In v1.4, we also concentrated on enhancing the validator functions. The added functions fall into three groups by language support: functions supported in both Japanese and English, functions supported in English only, and functions supported in Japanese only.

Functions supported in both Japanese and English

  • DoubleNegative In both Japanese and English, double negative statements are difficult to understand. If a double negative is present in the text, an error is output.

Functions supported in English only

  • FrequentSentenceStart When writing a document in English, many sentences can start with We. Even when there is no problem with the content, this looks bad, so it is good to rewrite such sentences promptly. Consider the following example.


We propose a novel method. We demonstrate the effectiveness of the method.

In the above example, We is used twice in a row. Without changing the meaning, we will edit the sentences to avoid using the same subject repeatedly.


We propose a novel method. The effectiveness of the method is demonstrated in the experiments.

  • UnexpandedAcronym This function checks documents for the presence of acronyms and also for the original words that they represent.
  • WordFrequency If the word frequency within the document differs from the usual, an error is output.
  • Hyphenation If hyphen usage is not correct, an error is output.
  • NumberFormat If number formats differ from correct usage in English, an error is output.
  • ParenthesizedSentence This function inspects for usage of parentheses. If there are nested parentheses or more parentheses than specified, an error is output.
  • WeakExpression If the text has an ambiguous English expression, an error is output. For example, words such as “completely” and “huge” should be replaced with more accurate representations.
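The idea behind FrequentSentenceStart can be shown with a small example. The following Python sketch flags a sentence that starts with the same word as the sentence before it; this is a simplification for illustration and is not RedPen's actual algorithm, and the sentence splitting and function names are assumptions.

```python
import re


def sentence_starts(text):
    """Return the first word of each sentence, splitting on '.', '?' and '!'."""
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]
    return [s.split()[0] for s in sentences]


def repeated_starts(text):
    """Return (sentence index, word) for each sentence that starts with the
    same word as the previous sentence."""
    starts = sentence_starts(text)
    return [
        (index, word)
        for index, (previous, word) in enumerate(zip(starts, starts[1:]), start=1)
        if previous.lower() == word.lower()
    ]


if __name__ == "__main__":
    before = "We propose a novel method. We demonstrate the effectiveness of the method."
    after = (
        "We propose a novel method. "
        "The effectiveness of the method is demonstrated in the experiments."
    )
    print(repeated_starts(before))
    print(repeated_starts(after))
```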

Functions supported in Japanese only

  • Okurigana If Japanese okurigana word endings are used incorrectly, an error is output.
  • DoubledJoshi If a particle is used more than once in a sentence, the sentence might be difficult to read.

Prospect of version 1.5

We will continue development toward RedPen v1.5 and beyond. We have not yet set the priorities, but v1.5 is expected to include a mechanism for easily testing functions written in JavaScript.