Spreading the XML paradigm around
2006-03-14
Today I started developing a fairly large DocBook V5.0 customization. Along the way I found that I don't know all the corners of RELAX NG well. If you are interested in creating DocBook subsets by removing elements and attributes, you might find this short article useful.
Norm already posted a nice piece about creating subsets of DocBook. His solution allows you to easily remove unwanted elements from the DocBook grammar. For example, if you want DocBook without recursive section elements (because you prefer sect1 through sect5), you can easily create a custom schema that forbids the use of this element.
include "docbook.rnc" { db.section = notAllowed }
This relies on the fact that for each DocBook element in the schema, there is a corresponding named pattern with the db. prefix. You can use the RELAX NG notAllowed pattern to disable this named pattern, effectively removing the element from the schema.
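To see why this works, consider a toy grammar. It is a hypothetical sketch, not part of DocBook: the file names toy.rnc and toy-subset.rnc and all pattern names are invented for illustration. An element pattern is typically referenced from a choice, and RELAX NG simplification drops a notAllowed branch from a choice, so the remaining alternatives are unaffected.

```rnc
# toy.rnc: a hypothetical three-element grammar (not DocBook)
start = doc
doc = element doc { (para | note)* }
para = element para { text }
note = element note { text }
```

```rnc
# toy-subset.rnc: the same grammar with the note element removed
include "toy.rnc" { note = notAllowed }
# The content of doc now behaves like: para*
```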
This works well for elements, but you might also want to remove some attributes. Suppose that we want to remove the linking attributes (linkend and XLink) from general elements. Attributes in the DocBook schema are also defined using several named patterns. The pattern db.common.linking.attributes defines the attributes we want to remove. My first attempt at removing those attributes followed the element removal process:
include "docbook.rnc" { db.common.linking.attributes = notAllowed }
But if you try to validate any DocBook V5.0 document against this schema, you get errors like “unknown element "article" from namespace "http://docbook.org/ns/docbook"”. Why? Because of the way RELAX NG simplification works. The notAllowed pattern is propagated up to the element level, because a declaration like this (slightly simplified from the actual schema):
db.article = element article
{
  db.common.linking.attributes,
  db.….attributes,
  content model
}
will be substituted by:
db.article = element article
{
  notAllowed,
  db.….attributes,
  content model
}
and because a group that contains notAllowed can never match, the article element is not allowed to appear in a document.
Fortunately, RELAX NG offers the empty pattern, which is ignored when it occurs inside a group. Thus we can use it to disable attributes.
# This DocBook variant omits section element and
# linking attributes from general non-linking elements
include "docbook.rnc" {
  db.section = notAllowed
  db.common.linking.attributes = empty
}
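The difference between the two patterns can be reproduced on a much smaller schema. The following is a minimal sketch under stated assumptions: the file names toy-attrs.rnc and toy-attrs-subset.rnc and the patterns common.attrs and link.attrs are hypothetical and only mimic the shape of the DocBook declarations.

```rnc
# toy-attrs.rnc: hypothetical grammar with attribute patterns used in a group
start = element doc { common.attrs, link.attrs, text }
common.attrs = attribute role { text }?
link.attrs = attribute linkend { text }?
```

```rnc
# toy-attrs-subset.rnc: drop the linking attribute, keep the element
include "toy-attrs.rnc" { link.attrs = empty }
# The content of doc now behaves like: common.attrs, text
# With link.attrs = notAllowed instead, the whole group could never match,
# so no doc element would validate.
```

Validating a document such as a doc element carrying a linkend attribute against each variant should show the attribute rejected in the first case and the element itself rejected in the second.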
To summarize: if you need to remove an element from DocBook, use notAllowed. If you need to remove one or more attributes from DocBook, use the empty pattern.