Customization of the transformation with Groovy

To customize the transformation, you write a Groovy script for the element you want to edit. There are two hooks where you can influence the transformation. The first runs right after the HTML code has been parsed and before the conversion starts. The second runs at the end of the transformation, when the DocBook tree is traversed one last time.

Customization of the HTML tree

The customization of the HTML tree can be used to remove certain types of elements, e.g. to suppress all <script> tags. To accomplish this, create a file named script.groovy in a directory of your choice. When running herold, use the command line argument --html-script-path to point to this directory. The script is called for every node that represents a script tag. Its binding contains the variable node, which is of type org.dbdoclet.xiphias.dom.NodeImpl and references the DOM node. The following program listing shows the script:

node.getParentNode().removeChild(node);

At this stage you can manipulate the DOM tree of the HTML code to remove everything unnecessary, such as advertisements, banners, or tables.
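To try the removal logic outside of herold, the following is a minimal sketch that applies it to a small hand-written document. It rests on several assumptions: it uses the JDK's standard org.w3c.dom classes as a stand-in for org.dbdoclet.xiphias.dom.NodeImpl, it imitates the per-node call with an explicit loop, and the names removeNode, snapshot and remaining are illustrative only. The null check is an added precaution and not something herold is known to require.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Node

// Stand-in for the body of script.groovy: detach the node from its parent.
// The null check guards against a node that is already detached (assumption).
def removeNode = { Node node ->
    def parent = node.getParentNode()
    if (parent != null) {
        parent.removeChild(node)
    }
}

// Build a tiny document with the JDK's DOM implementation (not herold's parser).
def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
def xml = '<html><head><script>alert(1)</script></head>' +
          '<body><p>Text</p><script/></body></html>'
def doc = builder.parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))

// getElementsByTagName returns a live list, so copy it before removing nodes.
def scripts = doc.getElementsByTagName('script')
def snapshot = (0..<scripts.getLength()).collect { scripts.item(it) }

// Imitate one script call per matching node.
snapshot.each { removeNode(it) }

def remaining = doc.getElementsByTagName('script').getLength()
println "script elements left: ${remaining}"
```

In a real script.groovy only the body of removeNode would be needed, since the script receives the node through its binding and, as described above, is called once for each matching node.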