REST API for Document Validation

RedPen is an open source command-line tool for proofreading documents, and it also provides a server. This article introduces the functions of the RedPen server.

The RedPen server provides not only a Web UI but also a REST API, which enables users to check their documents without installing RedPen on their computers.

One of the features of the RedPen server API is its configurability. Users can validate their documents according to their own configuration settings. In addition, the server can be deployed on Heroku with a few clicks.

RedPen Server Web UI

A RedPen server is available at the following URL:

http://redpen.herokuapp.com/

The image below shows the current RedPen Web UI page.

redpen-ui

When a user visits the RedPen server at the URL above, the top left box is automatically preloaded with sample text that contains many mistakes. RedPen shows the errors in the top left box as red bars. The bottom left window provides detailed error information.

When users paste their documents into the top left box, any validation errors are displayed in the bottom left box.

We can configure the settings in the right box. Specifically, we can configure the validation items and the character (symbol) settings. For detailed configuration information, please see the RedPen configuration document page.

RedPen Server API

The RedPen server provides a REST API that enables users to apply RedPen validation without installing RedPen.

Currently the RedPen server API provides three types of validation.

  • /rest/config/redpens

This function returns validation errors using preconfigured redpens.

  • /document/validate

This function validates a document with the user’s configuration and then returns the errors.

  • /document/validate/json

This function is similar to the /document/validate function, but the configuration is written in JSON format.

Sample: REST API (/document/validate)

The /document/validate function has several parameters.

  • document contains the text of the document RedPen should validate
  • documentParser specifies the input document format. Valid options are PLAIN, MARKDOWN, and WIKI.
  • lang specifies the language used to tokenize the document. Currently, values of ja (Japanese) and en (English/Whitespace) are supported.
  • format determines the format for the results. This can be either json (the default), json2, plain, plain2 or xml.
  • config contains the contents of a RedPen XML configuration file.
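
The parameters above can be assembled into a single form-encoded POST request. The following is a minimal Python sketch using only the standard library, not an official client. The helper names build_validate_request and validate are hypothetical, the endpoint URL is the one used by the curl command in this article, and the sketch assumes the server accepts the same form-encoded body that curl --data sends.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the curl example in this article; change it for your own server.
VALIDATE_URL = "http://redpen.herokuapp.com/rest/document/validate/"


def build_validate_request(document, config_xml, lang="en",
                           document_parser="PLAIN", result_format="json",
                           url=VALIDATE_URL):
    """Build (but do not send) a form-encoded POST request for /document/validate."""
    fields = {
        "document": document,               # text to validate
        "documentParser": document_parser,  # PLAIN, MARKDOWN, or WIKI
        "lang": lang,                       # ja or en
        "format": result_format,            # json, json2, plain, plain2, or xml
        "config": config_xml,               # contents of a RedPen XML configuration
    }
    body = urlencode(fields).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/x-www-form-urlencoded"})


def validate(document, config_xml, **options):
    """Send the request and return the raw response text (needs network access)."""
    request = build_validate_request(document, config_xml, **options)
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it makes it easy to inspect the encoded body before anything goes over the network. Unlike a plain curl --data call, urlencode percent-encodes the XML configuration, so characters such as & or = inside the config cannot be mistaken for field separators.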

Now let us try the REST function with both configuration and text. The following is a sample RedPen configuration file (redpen-conf-en.xml) bundled with the RedPen package.

<redpen-conf lang="en">
  <validators>
    <validator name="SentenceLength">
      <property name="max_len" value="100"/>
    </validator>
    <validator name="InvalidSymbol"/>
    <validator name="SymbolWithSpace"/>
    <validator name="SectionLength">
      <property name="max_char_num" value="2000"/>
    </validator>
    <validator name="ParagraphNumber"/>
    <validator name="Spelling"/>
    <validator name="Contraction" />
    <validator name="DoubledWord" />
    <validator name="SuccessiveWord" />
    <validator name="EndOfSentence" />
    <validator name="SpaceBeginningOfSentence" />
  </validators>
</redpen-conf>
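
Before sending a configuration like this to the server, it can be handy to check which validators it enables. The following is a small Python sketch that reads the XML with the standard library. The function name list_validators is hypothetical, and the sketch assumes only the element and attribute names visible in the sample above (validator, property, name, value).

```python
import xml.etree.ElementTree as ET


def list_validators(config_xml):
    """Return {validator name: {property name: value}} for a RedPen XML configuration."""
    root = ET.fromstring(config_xml)
    validators = {}
    for validator in root.iter("validator"):
        # Collect the optional <property name="..." value="..."/> children.
        properties = {prop.get("name"): prop.get("value")
                      for prop in validator.findall("property")}
        validators[validator.get("name")] = properties
    return validators


# A shortened version of the sample configuration shown above.
SAMPLE = """<redpen-conf lang="en">
  <validators>
    <validator name="SentenceLength">
      <property name="max_len" value="100"/>
    </validator>
    <validator name="Spelling"/>
  </validators>
</redpen-conf>"""

print(list_validators(SAMPLE))
# prints {'SentenceLength': {'max_len': '100'}, 'Spelling': {}}
```

Parsing the file locally also catches malformed XML (ET.fromstring raises ParseError) before the request is made.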

Next we validate a short input sentence with the RedPen server. The following command sends the document and configuration.

curl --data document="Twas brillig and the slithy toves did gyre and gimble in the wabe" \
  --data lang=en --data format=PLAIN2 \
  --data config="`cat redpen-conf-en.xml`" \
  redpen.herokuapp.com/rest/document/validate/
  Line: 1, Offset: 0
    Sentence: Twas brillig and the slithy toves did gyre and gimble in the wabe
      Spelling: Found possibly misspelled word "brillig".
      Spelling: Found possibly misspelled word "slithy".
      Spelling: Found possibly misspelled word "toves".
      Spelling: Found possibly misspelled word "gyre".
      Spelling: Found possibly misspelled word "gimble".
      [...] "and".

As we see, the RedPen server returns several errors in the input sentence.
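
For longer documents, it may be easier to summarize the report than to read it line by line. The following Python sketch counts errors per validator in the text output. It is based only on the layout visible in the sample response above (a "Line:" header, a "Sentence:" line, then one "Validator: message" line per error), so treat that layout as an assumption rather than a documented format. The function name count_errors is hypothetical.

```python
import re
from collections import Counter

# An error line looks like: '      Spelling: Found possibly misspelled word "toves".'
ERROR_LINE = re.compile(r"^\s*(?P<validator>[A-Za-z]+): (?P<message>.+)$")

# These labels introduce a sentence; they are not validator names.
HEADERS = {"Line", "Sentence"}


def count_errors(report):
    """Count reported errors per validator name in a plain-text report."""
    counts = Counter()
    for line in report.splitlines():
        match = ERROR_LINE.match(line)
        if match and match.group("validator") not in HEADERS:
            counts[match.group("validator")] += 1
    return counts
```

Lines that do not match the "Name: message" shape are simply skipped, so a truncated or unexpected line does not break the count.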

Deploying your RedPen Server with the Heroku Button

In the previous sample, I used a server that is already deployed on Heroku, but this server is not powerful enough to handle validation requests from many users at once.

If you need a short response time, you can of course deploy the RedPen server in your own environment, but this can be tedious.

For users who would like to deploy their own server easily, RedPen provides a Heroku Button. With it, users can deploy the RedPen server with just a few clicks.

The Heroku Button can be found in the README of the RedPen source.

heroku-button

When we click the button, the following page is shown.

heroku-deploy

When the user clicks the Deploy for free button, the RedPen server is deployed in a few minutes.
