help

Introduction

The primary function of tX is to transform the bracketed hierarchical structure of an XML file into an indented hierarchical structure, which is easier to view and interact with. While a bracketed file format creates a structure based on the placement of surrounding brackets (or parentheses, or tags), an indented file format creates a hierarchy based on how far a line is indented in comparison with the lines before it. If a line is indented the same amount as the line before it, it resides on the same level as that line. If it is indented more, it is a child of the preceding line. If it is indented less, it is neither a child nor a sibling of the line immediately before it; instead, it is a child of the nearest preceding line that is indented one level less than it. Python is an example of a language which uses indentation rather than brackets to denote a hierarchy.
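The indentation rules above can be illustrated with a short Python sketch. It is not tX's own code; the function name `parse_indented` and the choice of four spaces per level are assumptions made for this example. A stack of open ancestors is kept, and each new line is attached to the nearest preceding line that is less indented.

```python
def parse_indented(text, indent="    "):
    """Build a tree of (label, children) tuples from indented lines.

    Illustrative only: 'indent' (four spaces per level) is an assumption.
    """
    root = ("<root>", [])
    stack = [(-1, root)]  # (depth, node) pairs for the currently open ancestors
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        label = line.lstrip(" ")
        depth = (len(line) - len(label)) // len(indent)
        # Close every open line that is indented as much as, or more than, this one.
        while stack[-1][0] >= depth:
            stack.pop()
        node = (label, [])
        stack[-1][1][1].append(node)  # attach to the nearest less-indented line
        stack.append((depth, node))
    return root[1]


tree = parse_indented("a\n    b\n        c\n    d\ne")
```

Here `b` and `d` become children of `a`, `c` becomes a child of `b`, and `e` sits on the same level as `a`.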

By using blank table cells as indentation to denote parent-child relationships, and by placing each semantic unit of an XML file in its own cell, an XML file can be displayed without the syntactic elements that exist solely to create its structure. The result is a table that contains only data. This is especially useful for viewing long sets of data that originate from databases, because it removes the constant repetition of the same characters and attribute keys.


How an XML file is read

The best way to show how an XML file is read is by example. Consider the following XML, which is a portion of an Ant build file:

<project name="suncertify" default="build" basedir="."> some data...
        <description>build file</description>
        <!-- set global properties for this build -->
        <property name="src" location="code"/>
        <property name="compile" location="classes"/>
        <property name="doc" location="docs/javadoc"/>
</project>

Notice that the three "property" tags consistently repeat the attribute keys "name" and "location". This is common in XML-formatted data because its semantics are often predefined in a document type definition (DTD), which restricts the allowed element tags, their attributes, and their children. It occurs even when no DTD is specified, because XML data is usually intended to be written or read by a computer program that expects a regular, highly structured format. The result is data in which the same labels are repeated alongside every value, which is fine for a computer but unnecessary for a person. People are accustomed to reading data in tables. When they see aligned columns and rows, their eyes naturally scan upward and to the left for the labels of the data they are viewing.

A more natural way to present the property tags would be:


            name        location
property    src         code
property    compile     classes
property    doc         docs/javadoc


If the attribute names change partway through, the header line can simply be printed again above the rows that use different names.
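A minimal Python sketch of this header rule follows. The function name `rows_with_headers` and the last entry in the sample data (a property with a "value" attribute) are hypothetical and were added only to show a change of keys. A header row is emitted only when the list of attribute keys differs from that of the previous element.

```python
def rows_with_headers(elements):
    """elements: list of (tag, attrs) pairs, where attrs is a dict in display order.

    Returns table rows as lists of cells; '' stands for a blank cell.
    """
    rows = []
    last_keys = None
    for tag, attrs in elements:
        keys = list(attrs)
        if keys and keys != last_keys:
            rows.append([""] + keys)  # blank cell sits above the tag column
        last_keys = keys
        rows.append([tag] + list(attrs.values()))
    return rows


properties = [
    ("property", {"name": "src", "location": "code"}),
    ("property", {"name": "compile", "location": "classes"}),
    ("property", {"name": "version", "value": "1.0"}),  # hypothetical row with different keys
]
table = rows_with_headers(properties)
```

The first two properties share one header row, and a second header row appears only above the third.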

For the description tag, a single label to the left or above the text would be preferred:

description    build file


We can combine the two views and assume that a cell with no header above it holds the tag's text value:


              name          default       basedir
project       suncertify    build         .             some data...


Child tags can be associated with their parents by indenting them one cell below them. Other child data which are not element tags, such as comments, text, and CDATA, can be added as single lines indented below their parents. Ideally, a user should be able to look at the result and tell where each cell originated in the original XML. This can be accomplished by assigning predefined font and color styles to each of the possible cell types: elements, attribute keys, attribute values, text, CDATA, and comments.


              basedir       name          default
project       .             suncertify    build         some data...
              description   build file
              set global properties for this build
                            location      name
              property      code          src
              property      classes       compile
              property      docs/javadoc  doc
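The reading process described in this section can be sketched in Python with the standard `xml.dom.minidom` module. This is an illustration under stated assumptions, not tX's implementation: the function name `xml_to_rows` is made up, rows are plain lists of strings with '' for a blank cell, cell styling is left out, and attributes appear in document order, which may differ from the order tX displays.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

LEAF_TYPES = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE, Node.COMMENT_NODE)


def xml_to_rows(xml_text):
    """Flatten XML into table rows; each row is a list of cells, '' meaning blank."""
    rows = []

    def visit(element, depth, prev_keys):
        attrs = element.attributes.items()  # list of (key, value) pairs
        keys = [key for key, _ in attrs]
        indent = [""] * depth
        if keys and keys != prev_keys:
            rows.append(indent + [""] + keys)  # header row above the values
        row = indent + [element.tagName] + [value for _, value in attrs]
        # Drop whitespace-only text, which exists only to format the source file.
        children = [c for c in element.childNodes
                    if not (c.nodeType == Node.TEXT_NODE and not c.data.strip())]
        # Leading text or CDATA shares the element's own line.
        if children and children[0].nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            row.append(children.pop(0).data.strip())
        rows.append(row)
        last_keys = None
        for child in children:
            if child.nodeType == Node.ELEMENT_NODE:
                last_keys = visit(child, depth + 1, last_keys)
            elif child.nodeType in LEAF_TYPES:
                # Comments and remaining text become single indented lines.
                rows.append([""] * (depth + 1) + [child.data.strip()])
                last_keys = None
        return keys

    visit(minidom.parseString(xml_text).documentElement, 0, None)
    return rows


sample = """<project name="suncertify" default="build" basedir="."> some data...
        <description>build file</description>
        <!-- set global properties for this build -->
        <property name="src" location="code"/>
        <property name="compile" location="classes"/>
        <property name="doc" location="docs/javadoc"/>
</project>"""

for row in xml_to_rows(sample):
    print(row)
```

Each printed list corresponds to one table line: leading '' entries are the blank indentation cells, and a row that begins with one more blank than the element rows below it is an attribute key line.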


How an XML file is interpreted from a table

Now that you know how a table is created from an XML file, you should have a good idea of how to create a well-formed table that can be transformed into an XML file. A line in a table falls into one of five categories: element, list of attribute keys, comment, text, or CDATA. Comment, text, and CDATA lines are single cells with no children. You specify them by typing text into a table cell and selecting the appropriate formatting option from the menu or toolbar. An element line consists of an element cell, zero or more attribute values, and at most one text cell or CDATA cell. An attribute key line consists only of attribute keys, each placed directly above the attribute value it corresponds to. If a value has no corresponding key, it is given an anonymous one when the data is saved.
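The pairing of attribute keys with the values below them can be sketched as follows. Everything here is illustrative: cells are modelled as hypothetical (kind, text) tuples with None for a blank cell, and the anonymous key format "attr1", "attr2", ... is an assumption, since the naming scheme tX actually uses is not described here.

```python
def element_attributes(key_line, element_line):
    """Pair each attribute value with the key in the same column of the key line.

    Both lines are lists of (kind, text) cells, with None for a blank cell.
    """
    attrs = {}
    anonymous = 0
    for col, cell in enumerate(element_line):
        if cell is None or cell[0] != "value":
            continue  # only attribute value cells need a key
        above = key_line[col] if col < len(key_line) else None
        if above is not None and above[0] == "key":
            key = above[1]
        else:
            anonymous += 1
            key = f"attr{anonymous}"  # assumed naming, not tX's documented scheme
        attrs[key] = cell[1]
    return attrs


key_line = [None, ("key", "name"), ("key", "location")]
element_line = [("element", "property"), ("value", "src"), ("value", "code"), ("value", "extra")]
attrs = element_attributes(key_line, element_line)
```

The first two values pick up "name" and "location" from the cells above them, while the third value has nothing above it and receives a generated key.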

If a line is indented, it is the child of a previous line. If the first column occupied by the line is N and its row is M, its parent is the first nonempty cell in column N-1 found when traversing upward from row M to row 1. If N is 0, the line is the root. It follows that at most one cell is allowed in the first column, and that cell must be an element cell, because an XML file permits only a single root element. Since non-elements cannot have children, no non-element should have rows with greater indentation immediately below it. The children of an element should be indented by a single cell below it.
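The parent lookup can be sketched in Python as shown below. The function name `find_parents` is hypothetical, rows and columns are both 0-based in this sketch, and the table is assumed to contain only content lines (attribute key lines removed beforehand), since key lines are not children of anything.

```python
def find_parents(table):
    """Map each row index to its parent's row index (None for the root).

    table is a list of rows; each row is a list of cell strings, '' meaning blank.
    """
    def first_col(row):
        return next((i for i, cell in enumerate(row) if cell), None)

    parents = {}
    root_seen = False
    for m, row in enumerate(table):
        n = first_col(row)
        if n is None:
            continue  # completely blank row
        if n == 0:
            if root_seen:
                raise ValueError(f"row {m}: only one root is allowed")
            root_seen = True
            parents[m] = None
            continue
        # Walk upward looking for the first nonempty cell in column n - 1.
        for above in range(m - 1, -1, -1):
            candidate = table[above]
            if n - 1 < len(candidate) and candidate[n - 1]:
                parents[m] = above
                break
        else:
            raise ValueError(f"row {m}: no parent found in column {n - 1}")
    return parents


table = [
    ["project", "suncertify", "build", ".", "some data..."],
    ["", "description", "build file"],
    ["", "set global properties for this build"],
    ["", "property", "src", "code"],
]
parents = find_parents(table)
```

In this table every line after the first resolves to row 0, the "project" line. A second cell in the first column would raise an error, which mirrors the single-root rule.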


Encoding

Encoding within value cells is handled by the application. Reserved characters such as <, >, &, ', and " are converted to their XML equivalents. There are some caveats about what is permitted within table cells. Within CDATA segments, inserting the string "]]>" can lead to unexpected results. This string denotes the end of a CDATA segment, so inserting it as data can close the segment prematurely.
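For readers who want to see what such escaping looks like, here is a small Python sketch using the standard `xml.sax.saxutils.escape` function. It shows general XML practice rather than tX's own code. The helper names are made up, and splitting "]]>" across two CDATA segments is a common workaround that tX is not stated to perform.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > itself; the quote characters are added explicitly.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def escape_value(text):
    """Escape reserved characters for use in XML text or attribute values."""
    return escape(text, QUOTES)


def cdata_section(text):
    """Wrap text in CDATA, splitting any ']]>' so it cannot end the segment early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


escaped = escape_value('a < "b" & c')
wrapped = cdata_section("x]]>y")
```

The split works because the first segment ends after "]]" and the second begins with ">", so a parser reading both segments recovers the original "]]>" as data.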

Cells which act as names (element cells and attribute key cells) must follow the syntax rules for names in XML. This means that element names and attribute keys must:

  • start with a letter, underscore "_", or colon ":"

  • contain no spaces

  • contain only letters, digits, periods ".", hyphens "-", underscores "_", colons ":", and the other characters permitted by the XML specification

For more information on the allowed format of names see: http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Name
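A quick check of these rules can be written as a regular expression. The pattern below is an ASCII-only approximation: the XML specification also permits many non-ASCII letters and characters, which this sketch rejects. The names `NAME_RE` and `is_valid_name` are made up for this example.

```python
import re

# ASCII-only approximation of the XML Name rule: a letter, "_" or ":" first,
# then any mix of letters, digits, ".", "-", "_" and ":".
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")


def is_valid_name(cell):
    """Return True if the cell text is acceptable as an element or attribute name."""
    return NAME_RE.fullmatch(cell) is not None
```

For example, "property" and "build-file" pass, while "2fast" (starts with a digit) and "has space" (contains a space) do not.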
