XPL and Pipelines

1. Introduction

This section describes the XML Pipeline Definition Language (XPL) used by Presentation Server. The XPL interpreter is itself implemented as an XML processor, the Pipeline processor. For an introduction to pipelines, see the Orbeon Presentation Server Tutorial.

2. Namespace

All the elements defined by XPL must be in the namespace with the URI http://www.orbeon.com/oxf/pipeline. For consistency, XPL elements should use the p prefix. This document assumes that this prefix is used.

3. <p:config> element

The root element of an XPL document (config) defines:

  • Zero or more input or output parameters to the pipeline with <p:param>
  • The list of statements that need to be executed for this pipeline. A statement defines a processor with its connections to other processors in the pipeline using <p:processor>, a condition using <p:choose>, or an iteration using <p:for-each>.

The <p:config> element and its content are defined in the Relax NG schema with:

  <start>
      <ref name="config"/>
  </start>
  <define name="config">
      <element name="p:config">
          <optional>
              <attribute name="id"/>
          </optional>
          <ref name="param"/>
          <ref name="statement"/>
      </element>
  </define>
  <define name="statement">
      <interleave>
          <zeroOrMore>
              <ref name="processor"/>
          </zeroOrMore>
          <zeroOrMore>
              <ref name="choose"/>
          </zeroOrMore>
          <zeroOrMore>
              <ref name="for-each"/>
          </zeroOrMore>
      </interleave>
  </define>

4. <p:param> element

The <p:param> element defines the inputs and outputs of the pipeline. Each input and output has a name. Two inputs cannot share a name, nor can two outputs, but an input and an output may have the same name. Every input name defines an id that can later be referenced with the href attribute, for example when connecting processors. Output names can be referenced with the ref attribute on <p:output>.

For example, a pipeline with two inputs (data and foo) and two outputs (bar and data) is declared in the XPL document below:

  <p:config xmlns:p="http://www.orbeon.com/oxf/pipeline">
      <p:param type="input" name="data"/>
      <p:param type="input" name="foo"/>
      <p:param type="output" name="bar"/>
      <p:param type="output" name="data"/>
  </p:config>
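
As a sketch of how these names are then used, the pipeline below reads its data input through href="#data" and writes to its data output through ref="data". The oxf prefix binding and the oxf:xslt processor are taken from the <p:processor> section of this document; the stylesheet name is a hypothetical placeholder.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <!-- An input and an output may share the same name -->
    <p:param type="input" name="data"/>
    <p:param type="output" name="data"/>

    <p:processor name="oxf:xslt">
        <!-- Hypothetical stylesheet, resolved relative to this XPL document -->
        <p:input name="config" href="stylesheet.xsl"/>
        <!-- "#data" refers to the id defined by the pipeline input "data" -->
        <p:input name="data" href="#data"/>
        <!-- ref="data" connects this output to the pipeline output "data" -->
        <p:output name="data" ref="data"/>
    </p:processor>
</p:config>
```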

The <p:param> element and its content are defined in the Relax NG schema with:

  <define name="param">
      <zeroOrMore>
          <element name="p:param">
              <interleave>
                  <attribute name="name"/>
                  <attribute name="type"/>
              </interleave>
          </element>
      </zeroOrMore>
  </define>

5. <p:processor> element

The <p:processor> element places a processor in the pipeline and connects it to other processors, pipeline inputs, or pipeline outputs.

  • The kind of processor created is specified with the name attribute, which is an XML qualified name. A qualified name is composed of two parts:

    • A prefix: The prefix is mapped to a URI defining a namespace.
    • A local name: This name is a name in the namespace defined by the prefix.

    This mechanism allows grouping related processors in a namespace. For example, all the basic Presentation Server processors are grouped in the http://www.orbeon.com/oxf/processors namespace. This namespace is typically mapped to the oxf prefix. Processors are then referred to using names such as oxf:xslt or oxf:scope-serializer.

    The name maps to a processor factory. Processor factories are registered through the processors.xml file described in Packaging and Deployment.

    Note: For backward compatibility, the uri attribute is still supported.

  • The <p:input> element connects the input of the processor to an inline document in the <p:input> element or to another document referenced with the href attribute.

  • The <p:output> element defines an id corresponding to that output with the id attribute, or connects the output to a pipeline output with the ref attribute.

  • Optionally, <p:input> and <p:output> can have a schema-href or schema-uri attribute. Those attributes specify a schema that is used by the Pipeline processor to validate the corresponding input or output. schema-href references a document using the href syntax. schema-uri specifies the URI of a schema that is mapped to a specific schema in the Presentation Server properties file.

  • Optionally, <p:input> and <p:output> can have a debug attribute. When this attribute is present, the document that passes through that input or output is logged with Log4J. This is useful during development to see the XML going through the pipeline.

The following example feeds an XSLT processor with an inline document and an external stylesheet.

  <p:processor name="oxf:xslt" xmlns:p="http://www.orbeon.com/oxf/pipeline">
      <p:input name="config" href="stylesheet.xsl"/>
      <p:input name="data" schema-href="oxf:/address-book-schema.xml">
          <address-book>
              <card>
                  <name>John Smith</name>
                  <email>js@example.com</email>
              </card>
              <card>
                  <name>Fred Bloggs</name>
                  <email>fb@example.net</email>
              </card>
          </address-book>
      </p:input>
      <p:output name="data" id="address-book"/>
  </p:processor>
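
As a variation, the following sketch shows the schema-uri and debug attributes described in the list above. The input file name and the schema URI are hypothetical, and the URI would have to be mapped to an actual schema in the Presentation Server properties file. The debug value is assumed here to be a free-form label; the text above only states that the presence of the attribute triggers logging.

```xml
<p:processor name="oxf:xslt"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <p:input name="config" href="stylesheet.xsl"/>
    <!-- Validate the incoming document against a schema identified by URI
         (hypothetical URI, to be mapped in the properties file) -->
    <p:input name="data" href="address-book.xml"
             schema-uri="http://example.org/schemas/address-book"/>
    <!-- Log the transformed document with Log4J
         (the attribute value is an assumed label) -->
    <p:output name="data" id="address-book" debug="after-xslt"/>
</p:processor>
```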

The <p:processor> element and its content are defined in the Relax NG schema with:

  <define name="processor">
      <element name="p:processor">
          <attribute name="name"/>
          <interleave>
              <zeroOrMore>
                  <element name="p:input">
                      <attribute name="name"/>
                      <ref name="debug"/>
                      <ref name="schemas"/>
                      <optional>
                          <choice>
                              <attribute name="href"/>
                              <ref name="anyElement"/>
                          </choice>
                      </optional>
                  </element>
              </zeroOrMore>
              <zeroOrMore>
                  <element name="p:output">
                      <attribute name="name"/>
                      <ref name="schemas"/>
                      <ref name="debug"/>
                      <choice>
                          <attribute name="id"/>
                          <attribute name="ref"/>
                      </choice>
                  </element>
              </zeroOrMore>
          </interleave>
      </element>
  </define>

6. <p:choose> element

The <p:choose> element can be used to execute different processors depending on a specific condition. The general syntax is very close to that of <xsl:choose> in XSLT:

  <p:choose href="#condition-document" xmlns:p="http://www.orbeon.com/oxf/pipeline">
      <p:when test="first-condition">...</p:when>
      <p:when test="second-condition">...</p:when>
      <p:otherwise>...</p:otherwise>
  </p:choose>

The conditions are expressed in XPath and operate on the XML document specified by the href attribute on p:choose. Each branch can contain regular processor declarations as well as nested conditions.

Outputs declared in a branch are subject to the following conditions:

  • An output id cannot override an output id in scope before the corresponding choose element
  • The scope of an output id is local to the branch if it is connected inside that branch
  • The set of output ids not connected inside a branch become visible to processors declared after the corresponding choose element
  • The set of output ids not connected inside the branch must be consistent among all branches

The last condition means that if a branch has two non-connected outputs such as output1 and output2, then all other branches must declare the same outputs. On the other hand, inputs in branches do not have to refer to the same outputs.
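
The scoping rules can be illustrated with the following sketch. All file names, the request input, and the XPath test are hypothetical. Both branches leave the id result unconnected, so it becomes visible after the <p:choose>; the id sorted is connected inside its branch and therefore stays local to it.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <p:param type="input" name="request"/>
    <p:param type="output" name="data"/>

    <p:choose href="#request">
        <p:when test="/request/format = 'summary'">
            <p:processor name="oxf:xslt">
                <p:input name="config" href="summary.xsl"/>
                <p:input name="data" href="company.xml"/>
                <!-- Not connected in this branch: exposed after the choose -->
                <p:output name="data" id="result"/>
            </p:processor>
        </p:when>
        <p:otherwise>
            <p:processor name="oxf:xslt">
                <p:input name="config" href="sort.xsl"/>
                <p:input name="data" href="company.xml"/>
                <!-- Connected below, inside the branch: local to the branch -->
                <p:output name="data" id="sorted"/>
            </p:processor>
            <p:processor name="oxf:xslt">
                <p:input name="config" href="detail.xsl"/>
                <p:input name="data" href="#sorted"/>
                <!-- Same unconnected id as in the other branch -->
                <p:output name="data" id="result"/>
            </p:processor>
        </p:otherwise>
    </p:choose>

    <!-- "result" can be referenced here, whichever branch was executed -->
    <p:processor name="oxf:xslt">
        <p:input name="config" href="page.xsl"/>
        <p:input name="data" href="#result"/>
        <p:output name="data" ref="data"/>
    </p:processor>
</p:config>
```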

The <p:choose> element and its content are defined in the Relax NG schema with:

  <define name="choose">
      <element name="p:choose">
          <attribute name="href"/>
          <oneOrMore>
              <element name="p:when">
                  <attribute name="test"/>
                  <ref name="statement"/>
              </element>
          </oneOrMore>
          <optional>
              <element name="p:otherwise">
                  <ref name="statement"/>
              </element>
          </optional>
      </element>
  </define>

7. <p:for-each> element

With <p:for-each> you can execute processors multiple times based on the content of a document. Consider this example: an XML document contains information about employees, each described in an emp element. This document is stored in a file called company.xml:

  <company>
      <emp>
          <firstname>John</firstname>
          <lastname>Smith</lastname>
      </emp>
      <emp>
          <firstname>Judy</firstname>
          <lastname>Matthews</lastname>
      </emp>
      <emp>
          <firstname>Gloria</firstname>
          <lastname>Schwartz</lastname>
      </emp>
  </company>

You want to apply a stylesheet (stored in transform-employee.xsl) to each employee. You can do this with the following pipeline:

  <p:config xmlns:p="http://www.orbeon.com/oxf/pipeline">
      <p:for-each href="company.xml" select="/company/emp" root="new-company" id="company-out">
          <p:processor name="oxf:xslt">
              <p:input name="data" href="current()"/>
              <p:input name="config" href="transform-employee.xsl"/>
              <p:output name="data" ref="company-out"/>
          </p:processor>
      </p:for-each>
      <!-- The id "company-out" can now be referenced by other -->
      <!-- processors in the pipeline. -->
  </p:config>

[Figure: how the iteration is done in the above example]

The following rules apply to <p:for-each>:

  • In a <p:for-each> you can have multiple processors connected together, <p:choose> statements, and nested <p:for-each> statements, just like outside of a <p:for-each>.
  • The output of a processor (or of a nested <p:for-each>) inside the <p:for-each> must be "connected to the for-each" using a ref="..." attribute. The value of the ref attribute must match the value of the id attribute of the <p:for-each>.
  • You access the current part of the XML document being iterated with current() in an href expression. If you have nested <p:for-each> statements, current() applies to the <p:for-each> that directly contains the current() expression.
  • The processors inside a <p:for-each> can access ids declared before the <p:for-each> statement.
  • The aggregated document (the "output of the <p:for-each>") is available in the rest of the pipeline under the id declared in the id attribute. Alternatively, you can connect the output of the <p:for-each> directly to an output of the current pipeline with a ref attribute, as on the <p:output> element of a processor. Processors inside the <p:for-each> then connect to it as follows: if only ref is specified, they must reference the value of ref; if id is specified, with or without ref, they must reference the value of id.
  • The <p:for-each> can have optional attributes: input-debug, input-schema-href, input-schema-uri, output-debug, output-schema-href and output-schema-uri. The attributes starting with "input" (respectively "output") work like the attributes of the same name without the prefix on the <p:input> element (respectively the <p:output> element). The attributes starting with "input" apply to the document referenced by the href expression. The attributes starting with "output" apply to the output of the <p:for-each>.
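
As a sketch of the ref variant described above, the following pipeline connects the output of the iteration directly to the pipeline output data. Because only ref is specified on the iteration element, the processor inside it references the value of ref. The input-debug value is an assumed label and is not taken from the original example.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <p:param type="output" name="data"/>

    <!-- ref (instead of id) sends the aggregated document to the pipeline
         output "data"; input-debug logs the document read from company.xml -->
    <p:for-each href="company.xml" select="/company/emp" root="new-company"
                ref="data" input-debug="company-in">
        <p:processor name="oxf:xslt">
            <p:input name="data" href="current()"/>
            <p:input name="config" href="transform-employee.xsl"/>
            <!-- Must match the ref value of the enclosing for-each -->
            <p:output name="data" ref="data"/>
        </p:processor>
    </p:for-each>
</p:config>
```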

8. href attribute

The href attribute is used to:

  • Reference external documents
  • Refer to outputs of other processors
  • Aggregate documents using the aggregate() function
  • Select part of a document using XPointer

The complete syntax of the href attribute is described below in a Backus-Naur Form (BNF)-like syntax:

    href              ::= ( local_reference | uri | aggregation ) [ xpointer ]
    local_reference   ::= "#" id
    aggregation       ::= "aggregate(" root_element_name "," agg_parameter ")"
    root_element_name ::= "'"  name "'"
    agg_parameter     ::= href [ "," agg_parameter ]
    xpointer          ::= "#xpointer(" xpath_expression ")"
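
The following fragment lists href values read directly off the grammar above. The processor name, its namespace, the input names, and the ids doc, first and second are hypothetical; whether a given release accepts every combination has not been verified here.

```xml
<p:processor name="ex:example"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:ex="http://example.org/processors">
    <!-- local_reference -->
    <p:input name="a" href="#doc"/>
    <!-- uri -->
    <p:input name="b" href="file:/dir/file.xml"/>
    <!-- uri followed by xpointer -->
    <p:input name="c" href="company.xml#xpointer(/company/site)"/>
    <!-- local_reference followed by xpointer -->
    <p:input name="d" href="#doc#xpointer(/company/name)"/>
    <!-- aggregation: each agg_parameter is itself an href -->
    <p:input name="e" href="aggregate('root', #doc, company.xml#xpointer(/company/site))"/>
    <!-- nested aggregation, allowed because agg_parameter is an href -->
    <p:input name="f" href="aggregate('outer', aggregate('inner', #first, #second), #doc)"/>
</p:processor>
```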

8.1. URI

The URI syntax is defined in RFC 2396. A URI is used to reference an external document. A URI can be:

  • Absolute, if a protocol is specified, for instance file:/dir/file.xml.
  • Relative, if no protocol is specified, for instance ../file.xml. The document is resolved relative to the URL of the XPL document where the href is declared, as specified in RFC 1808.
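
As a worked example of relative resolution, assume the XPL document containing the fragment below is located at file:/app/pipelines/main.xpl. This location and all file names are hypothetical.

```xml
<p:processor name="oxf:xslt"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <!-- Absolute URI: used as is -->
    <p:input name="config" href="file:/app/styles/stylesheet.xsl"/>
    <!-- Relative URI: resolved against file:/app/pipelines/main.xpl,
         which gives file:/app/data/file.xml -->
    <p:input name="data" href="../data/file.xml"/>
    <p:output name="data" id="result"/>
</p:processor>
```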

8.2. Aggregation

Multiple documents can be aggregated with the aggregate() function. The name of the root element that will contain the aggregated document is specified in the first argument. The documents to aggregate are specified in the following arguments. There is no restriction on the number of documents that can be aggregated.

For example, you have a document (with output id first):

  <employee>John</employee>

And a second document (with output id second):

  <employee>Marc</employee>

Those two documents can be aggregated using aggregate('employees', #first, #second). This produces the following document:

  <employees>
      <employee>John</employee>
      <employee>Marc</employee>
  </employees>
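
In a pipeline, the aggregation expression is simply the value of an href attribute. The sketch below assumes that the ids first and second are defined earlier in the same pipeline; the stylesheet name and the output id are hypothetical.

```xml
<p:processor name="oxf:xslt"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <p:input name="config" href="employees.xsl"/>
    <!-- The processor receives the aggregated <employees> document -->
    <p:input name="data" href="aggregate('employees', #first, #second)"/>
    <p:output name="data" id="employee-list"/>
</p:processor>
```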

8.3. XPointer

The XPointer syntax is used to select parts of a document. For example, if you have a document in a file called company.xml:

  <company>
      <name>Orbeon</name>
      <site>
          <web>http://www.orbeon.com/</web>
          <ftp>ftp://ftp.orbeon.com/</ftp>
      </site>
  </company>

The expression company.xml#xpointer(/company/site) produces the document:

  <site>
      <web>http://www.orbeon.com/</web>
      <ftp>ftp://ftp.orbeon.com/</ftp>
  </site>

8.4. Multiple References to an Identifier

The same id may be referenced multiple times in the same XPL document. For example, the id doc is referenced by two processors in the following example:

  <p:config xmlns:p="http://www.orbeon.com/oxf/pipeline">
      <p:processor uri="A">
          <p:output name="data" id="doc"/>
      </p:processor>
      <p:processor uri="B">
          <p:input name="data" href="#doc"/>
      </p:processor>
      <p:processor uri="C">
          <p:input name="data" href="#doc"/>
      </p:processor>
  </p:config>

The documents seen by B and C are identical.

[Figure: the output doc of processor A connected to the data inputs of processors B and C]