
RSuite, MarkLogic and the DITA Open Toolkit

April 17, 2015


RSuite DITA Transform Demo


The video above gives a brief introduction to RSuite and the DITA Open Toolkit.

I had the great pleasure of speaking at last week’s MarkLogic World 2015 event in San Francisco. I’ll be writing much more about my MarkLogic MEAN Stack presentation later, but for now I’d like to give a shout-out to RSuite.

While at MarkLogic World, I also had the great pleasure of reconnecting with the wonderful RSuite team. I’m a big advocate of the RSuite content management system.

This past summer, I had an amazing opportunity to work at HarperCollins, where I helped deploy their new content management services using RSuite. My curiosity about RSuite stemmed from my work as a professional services consultant at MarkLogic, which has a loyal base of customers in the media/publishing industry.

Media/publishing customers choose MarkLogic to store their content, which consists of text-based documents and binary assets (photos, audio, video) along with their respective metadata.

MarkLogic lowered its pricing in 2013, which made it more affordable to store binary assets in MarkLogic. Keeping the binary assets together with the text-based content also greatly simplifies the infrastructure and reduces management overhead.

The RSuite CMS has the following features:

  1. Workflow
  2. DITA Transforms – provides “multi-channel” output.
  3. Role Based Security
  4. Distribution

The RSuite secret sauce is the DITA Open Toolkit. The other key component is MarkLogic.

The workflow engine is provided by jBPM, which drives the workflow as a finite state machine and uses MySQL to store the workflow configurations.
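To make the finite state machine idea concrete, here is a minimal sketch of an editorial workflow as a table of transitions. The state and action names are hypothetical and are not taken from RSuite or jBPM; a real jBPM process definition is far richer than this.

```python
from typing import Iterable

# Hypothetical editorial workflow modeled as a finite state machine.
# State and action names are illustrative only, not RSuite or jBPM names.
TRANSITIONS = {
    ("draft", "submit"): "in_review",
    ("in_review", "approve"): "approved",
    ("in_review", "reject"): "draft",
    ("approved", "publish"): "published",
}


def next_state(state: str, action: str) -> str:
    """Return the state reached by applying an action, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not allowed in state {state!r}"
        ) from None


def run_workflow(start: str, actions: Iterable[str]) -> str:
    """Apply a sequence of actions in order and return the final state."""
    state = start
    for action in actions:
        state = next_state(state, action)
    return state
```

Calling `run_workflow("draft", ["submit", "approve", "publish"])` walks a document from draft to published, while an out-of-order action such as publishing a draft raises an error instead of silently skipping review.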

The DITA Open Toolkit provides the “multi-channel output” feature needed by most publishers. This is the ability to render the same content to many formats, such as PDF, EPUB, XHTML, Adobe InDesign, and Word (.docx).
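As a rough illustration of multi-channel output, the sketch below builds one DITA Open Toolkit command line per output format from the same map. It assumes a DITA-OT release that ships the `dita` command with the `-i` (input), `-f` (format) and `-o` (output directory) options; the file name `book.ditamap` and the `out` directory are hypothetical. Only `pdf` and `xhtml` are shown because other formats may require additional plugins.

```python
def build_dita_commands(ditamap, formats, out_root="out"):
    """Build one `dita` command (as an argument list) per output format.

    The commands are only constructed here, not executed, so the sketch
    runs without a DITA-OT installation.
    """
    commands = []
    for fmt in formats:
        commands.append(
            ["dita", "-i", ditamap, "-f", fmt, "-o", f"{out_root}/{fmt}"]
        )
    return commands


# Hypothetical map file; the same source is rendered once per channel.
for command in build_dita_commands("book.ditamap", ["pdf", "xhtml"]):
    print(" ".join(command))
```

Each argument list could be passed to `subprocess.run` on a machine where the toolkit is installed. The point is that the source map stays the same and only the format argument changes.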

DITA stands for Darwin Information Typing Architecture. It is an XML data model for authoring and publishing.

Eliot Kimber is the force behind DITA. Here are some useful links.


Publishers should avoid using XHTML as the storage format for book content, for many reasons. The industry standard formats are DITA and DocBook because they provide a higher level of abstraction that makes “multi-channel output” much easier. These standards are also more flexible when providing custom publishing services.

The DITA format is especially interesting because of the specialization feature that makes XML structures polymorphic.
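One way to see the polymorphism is through the DITA `class` attribute, which lists an element’s ancestry as space-separated `module/element` tokens (for example, a concept topic carries `- topic/topic concept/concept `). The sketch below parses such values. The `recipe/recipe` specialization is a hypothetical example, and the helper names are illustrative.

```python
def specialization_chain(class_attr):
    """Split a DITA class attribute into (module, element) pairs, base type first.

    The leading "-" or "+" token marks a structural or domain specialization
    and is skipped here.
    """
    tokens = class_attr.split()
    return [tuple(token.split("/", 1)) for token in tokens[1:]]


def is_a(class_attr, module_element):
    """Return True if the element is, or is specialized from, module_element."""
    return module_element in class_attr.split()[1:]


CONCEPT = "- topic/topic concept/concept "
# Hypothetical specialization derived from concept, for illustration only.
RECIPE = "- topic/topic concept/concept recipe/recipe "

print(specialization_chain(RECIPE))
print(is_a(RECIPE, "topic/topic"))
```

Because a processor can match on the base token (`topic/topic`), a stylesheet written for generic topics can still handle a specialized element it has never seen, which is what makes the specializations behave polymorphically.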

Some key points about DITA:

  1. Topic Oriented
  2. Each Topic is a separate XML file
  3. DocBook is Book Oriented
  4. DITA Initial Spec in 2001
  5. DocBook Initial Spec in 1991
  6. Core DITA Topic Types are Concept, Task, and Reference
  7. Specialization: This is subtyping where new topics are derived from existing topics.
  8. The term “Darwin” is used because the polymorphic specializations provide an evolution path.
  9. A DITA map is an XML document that stitches the topic XML documents together.
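The last point, a map stitching topics together, can be sketched with a small map and Python’s standard XML parser. The map content and file names below are hypothetical, and the sketch only reads `topicref` elements and their `href` attributes; real maps support many more elements and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA map: each topicref points at a separate topic file,
# and nesting expresses the hierarchy of the assembled publication.
MAP_XML = """
<map>
  <title>Sample Guide</title>
  <topicref href="intro.dita">
    <topicref href="install.dita"/>
  </topicref>
  <topicref href="reference.dita"/>
</map>
"""


def topic_order(map_xml):
    """Return (depth, href) pairs for every topicref, in document order."""
    root = ET.fromstring(map_xml.strip())
    ordered = []

    def walk(element, depth):
        for child in element:
            if child.tag == "topicref":
                href = child.get("href")
                if href:
                    ordered.append((depth, href))
                walk(child, depth + 1)

    walk(root, 0)
    return ordered


for depth, href in topic_order(MAP_XML):
    print("  " * depth + href)
```

The topics themselves stay in their own files. Reordering or reusing them in another publication only requires a different map.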


I had an opportunity to chat with Norm Walsh about it at this week’s MarkLogic World event. Norm is the author of DocBook: The Definitive Guide. He’s also an active member of a few of the XML/JSON standards committees.

DITA is a competing standard to DocBook. Norm wrote this interesting blog post about DITA back in October 2005.

My question is which of the two has better support for semantic annotations. Most content these days is semantically enriched using multiple ontologies, so that SPARQL queries can be used to provide Dynamic Semantic Publishing services.
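To show the shape of such a query, here is a minimal sketch of triple pattern matching in plain Python, which is the core idea behind a SPARQL basic graph pattern. It is a stand-in for illustration, not a SPARQL engine, and the triples, predicate names and document names are all hypothetical.

```python
# Hypothetical annotations: (subject, predicate, object) triples.
TRIPLES = [
    ("chapter1", "mentions", "London"),
    ("chapter2", "mentions", "Paris"),
    ("chapter3", "mentions", "Dickens"),
    ("London", "type", "City"),
    ("Paris", "type", "City"),
    ("Dickens", "type", "Person"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def documents_mentioning(triples, entity_type):
    """Join two patterns, roughly like this SPARQL query:

    SELECT ?doc WHERE { ?doc :mentions ?e . ?e a :City }
    """
    entities = {s for s, _, _ in match(triples, p="type", o=entity_type)}
    return sorted(
        s for s, _, o in match(triples, p="mentions") if o in entities
    )


print(documents_mentioning(TRIPLES, "City"))
```

A dynamic publishing service could use a query like this to assemble a page of every chapter that mentions a city, without anyone maintaining that list by hand.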

Norm was quick to tell me that DocBook 5.1 now supports RDFa. I’ll definitely investigate.

For DITA, there’s an interesting DITA to RDF transform in the works here.

Bob DuCharme, author of Learning SPARQL, has this nice blog post on using RDFa with DITA and DocBook.

In addition to the screencast above, the following screencasts will take a deeper dive into the RSuite software and DITA Open Toolkit.

Please take a look. Hopefully, the screencasts will shed some more light on the need to store content using a higher level of abstraction (DITA or DocBook).

The screencasts will also show the value of RSuite as a full-blown Content Management and Digital Asset Management (DAM) solution.


DITA Open Toolkit Demo




RSuite Architecture and Code

