xmlguru.cz

Spreading the XML paradigm around

Choosing DocBook table model

2006-03-14

For a long time DocBook supported only the CALS table model. Since V4.3 it also supports the HTML table model. Which model should you use? And how do you remove the other table model from the schema?


To be honest, I was never happy that the HTML table model was added to DocBook. Why? Because it is inconsistent with the rest of DocBook, it does not use namespaces to identify itself, and it has bloated the DocBook DTD with attributes like onmouseover.

Example 1. Sample CALS table and its rendering

<table>
  <title>Test table</title>
  <tgroup cols="2">
    <colspec colname="c1"/>
    <colspec colname="c2"/>
    <thead>
      <row>
        <entry>A</entry>
        <entry>B</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>42</entry>
        <entry>123</entry>
      </row>
      <row>
        <entry namest="c1" nameend="c2">spanned</entry>
      </row>
    </tbody>
  </tgroup>
</table>

Table 1. Test table

A        B
42       123
spanned (across both columns)


The CALS table model has a very long history, and some of its features can be considered archaic. For example, CALS requires you to specify the number of columns, because table rendering used to be quite a hard and expensive task. Knowing the total number of columns in advance allowed more efficient rendering routines to be used. This is no longer true, but the attribute is still there.

Another small inconvenience is the need to name columns if you want to merge cells. You are right that this is a bit of overkill for simple cases, but it can be an advantage for complex tables with many spans when you have to insert a new column inside a spanned area. Indirection is good, but it usually means more work. And anyway, you are editing DocBook tables in a WYSIWYG editor and you are shielded from seeing the XML code. ;-)
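To make the point about inserting a column more concrete, here is a minimal sketch of the table from Example 1 after a new column has been added between the two original ones. The column name c1a and the cell values New and 7 are made up for this illustration.

```xml
<table>
  <title>Test table</title>
  <!-- cols raised from 2 to 3 -->
  <tgroup cols="3">
    <colspec colname="c1"/>
    <!-- new column; the name "c1a" is an arbitrary choice for this sketch -->
    <colspec colname="c1a"/>
    <colspec colname="c2"/>
    <thead>
      <row>
        <entry>A</entry>
        <entry>New</entry>
        <entry>B</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>42</entry>
        <entry>7</entry>
        <entry>123</entry>
      </row>
      <row>
        <!-- untouched: still runs from c1 to c2, which is now three columns -->
        <entry namest="c1" nameend="c2">spanned</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```

The spanned entry did not have to change, because it refers to columns by name rather than by count. With a numeric span you would have to find and adjust every spanning cell that crosses the new column.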

Let's compare the sample table from Example 1 with the same table expressed in HTML.

Example 2. Sample HTML table

<table>
  <caption>Test table</caption>
  <thead>
    <tr>
      <th>A</th>
      <th>B</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>42</td>
      <td>123</td>
    </tr>
    <tr>
      <td colspan="2">spanned</td>
    </tr>
  </tbody>
</table>

The structure of the table is almost identical. You can see that the title of the table is expressed by the caption element. This is inconsistent with the rest of the DocBook formal objects, which use title. You are also not allowed to use the titleabbrev and info elements in an HTML table. I personally consider this the biggest disadvantage. It simply hides some general DocBook functionality from you and makes the schema inconsistent.
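For comparison, here is a rough sketch of a CALS table that uses the elements an HTML table cannot carry. It assumes the DocBook V5.0 content model, in which title, titleabbrev and info may appear together at the start of a table. The abbreviated title and the releaseinfo value are invented for the illustration.

```xml
<table xmlns="http://docbook.org/ns/docbook">
  <title>Test table</title>
  <!-- short title, usable e.g. in cross-references or running headers -->
  <titleabbrev>Test</titleabbrev>
  <!-- metadata wrapper; the content here is only an illustrative placeholder -->
  <info>
    <releaseinfo>draft</releaseinfo>
  </info>
  <tgroup cols="2">
    <tbody>
      <row>
        <entry>42</entry>
        <entry>123</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```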

There is no need to specify the total number of columns, and cells are merged directly with the colspan attribute. Apart from that, the only real difference between the table models is the renaming of row to tr and of entry to td/th.

There are also some reasons for using HTML tables in DocBook: more users are familiar with them, and you can copy and paste them from Web pages. But I don't consider this a real argument. Why not allow HTML inline elements as well, then? Why not allow h1 instead of bridgehead? OK, the table model is something a little bit different, but it should have been added in the XHTML namespace.

I will leave the decision about which table model to use up to you. But it would be very strange to mix the two models in the same document. Because of this, it is a good idea to create a customized DocBook schema that permits only one table model. I must say that I really like RELAX NG (thanks, James Clark and Murata Makoto) and the modularity of the DocBook V5.0 schemas (thanks, Norm). To disable the HTML table model, the following simple customization is sufficient:

include "docbook.rnc"
{
  db.html.table = notAllowed
  db.html.informaltable = notAllowed
}
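If you keep your customizations in the RELAX NG XML syntax instead of the compact syntax, the same override can be sketched roughly as follows. This assumes that the schema is available as docbook.rng next to the customization file; the file name and location are assumptions, so adjust them to your setup.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="docbook.rng">
    <!-- Redefine both HTML table patterns so that they can never match -->
    <define name="db.html.table">
      <notAllowed/>
    </define>
    <define name="db.html.informaltable">
      <notAllowed/>
    </define>
  </include>
</grammar>
```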

If you prefer HTML tables and want to disable CALS tables, take it as homework. Given my attitude, I'm not going to help you with removing CALS and using HTML tables in DocBook, but the process is completely analogous. You can even do things like using CALS for tables and HTML for informal tables. Or you can jump straight out of the window.

Copyright © Jiří Kosek, 2006–2018