XBUP - Extensible Binary Universal Protocol

Concept: Introduction of the Types

This document is part of the eXtensible Binary Universal Protocol project documentation. It describes the declaration of block and attribute types and how to use them.

Type Introduction

The previous parts of the documentation described the encoding of numbers and the block tree structure, in which blocks carry sequences of attributes. A certain duality exists between attributes and subblocks: attributes can be expressed as subblocks, but attributes are limited to finite values. To process the data, the meaning of individual attributes and subblocks must be defined. Since the definition must be finite, it has to be limited to a finite number of items of both kinds, yet still allow long enumerable sequences of attributes. It is also necessary to consider exactly how relationships such as generalization / specialization will be expressed. It seems appropriate to consider, for example:

In the first case, the blocks can be considered different because they use a different unit of measurement, although many programming languages and databases still use only bare values without mentioning any specific unit. In the second and third cases it is probably more appropriate to express the relationship as a link to a subobject, because such differentiation would lead to a huge number of types.

Block Type

The following part describes how to recognize the meaning of the data contained in individual blocks. Again, a few alternatives should be considered:

Perhaps the best way is to identify the type of data that the block represents using its attributes. Individual block types should be divided into groups by importance.

Obviously, these attributes should be placed first in the block. Therefore, if the block has at least two attributes, the first two values are known as UBBlockType and are as follows:

1 UBNatural - BlockGroup
2 UBNatural - BlockType

Blocks are organized by type into groups (Groups), and the BlockGroup value determines which group the block belongs to. The value 0 means a basic block, that is, a block that programs can process natively using built-in support. The BlockType value determines the specific type of block within the group. The allowed ranges of values, and thus the meaning of the block groups, are determined by the definition block.
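As an illustration of the UBBlockType layout, the following minimal Python sketch reads the two leading attributes of a block. It assumes the attributes have already been decoded into a list of non-negative integers; the names `BlockType`, `read_block_type` and `is_basic_block` are hypothetical and not part of the specification. Blocks with fewer than two attributes are reported as having no type here, because their handling is left open in this document.

```python
from typing import NamedTuple, Optional, Sequence


class BlockType(NamedTuple):
    """Hypothetical container for the two leading UBBlockType attributes."""
    group: int       # BlockGroup; 0 denotes a basic block
    block_type: int  # BlockType within the group


def read_block_type(attributes: Sequence[int]) -> Optional[BlockType]:
    """Return the block type, or None for a block with fewer than two attributes."""
    if len(attributes) < 2:
        return None  # incomplete block: interpretation is not decided here
    return BlockType(attributes[0], attributes[1])


def is_basic_block(attributes: Sequence[int]) -> bool:
    """A block is basic when its BlockGroup value is 0."""
    block_type = read_block_type(attributes)
    return block_type is not None and block_type.group == 0
```

For example, `read_block_type([0, 3, 7])` yields group 0 and type 3, with the remaining attribute 7 left for the block's own definition to interpret.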

If the block has fewer than two attributes, there are several possible ways to use such incomplete blocks:

The special case of a single-attribute block can be used in several ways:

Document Type

Several options can also be considered for recognizing the document as a whole, or the type of the stream:

In addition to the blocks themselves, the stream should begin with a sequence of bits indicating the coding method used. As mentioned in the part on coding, at least the size of the cluster used must be determined.

Byte - ClusterSize = FEh

The ClusterSize value, which may be arbitrary, must be given at the beginning of the file, since the encoding of the following values depends on it. For the sake of universality, this value is encoded in unary. The advantage is that its code has the same size as the cluster, usually for ClusterSize = 7. Using this value also excludes the use of purely unary coding.

For development purposes the file header is extended with several other characters. Text characters are placed in the header for compatibility with existing operating systems and are readable only for ClusterSize 8x + 7. These values exist for purely technical reasons and may be removed in later releases.

UBNatural - ProtocolVersion = 00h
DWord(4xUBNatural) - ProtocolSignature = 58 42 00 XXh (“XB” + development version)

Protocol version 0 is reserved for the protocol development stage. The development version then specifies the particular structure of the file, and any incompatible change shall be reflected in a change to this value.

If the file contains no other data, it is called an empty file. Otherwise, the data is processed as a single block, and any data after that block is called the extended area. This should allow the protocol to be used for specifying XBUP bitstreams of potentially infinite length. One reason is that some operating systems do not use the file name extension to distinguish the file type and usually inspect the first characters of the content instead. The header can be interpreted as a 32-bit identification number and a 16-bit version number. It is assumed that the final version will have a different file header.
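The development header described above can be checked with a short Python sketch. It assumes that every header value occupies exactly one byte, which matches the reading of the header as a 32-bit identification number followed by a 16-bit version number; the function name `parse_development_header` and the constant names are hypothetical.

```python
CLUSTER_SIZE_CODE = 0xFE                      # ClusterSize = FEh
PROTOCOL_VERSION = 0x00                       # ProtocolVersion = 00h (development)
SIGNATURE_PREFIX = bytes([0x58, 0x42, 0x00])  # "XB" followed by 00h


def parse_development_header(data: bytes):
    """Validate the 6-byte development header.

    Returns (development_version, remaining_data). Assumes one byte per
    header value, which is an assumption of this sketch.
    """
    if len(data) < 6:
        raise ValueError("stream too short for the development header")
    if data[0] != CLUSTER_SIZE_CODE:
        raise ValueError("unexpected ClusterSize code")
    if data[1] != PROTOCOL_VERSION:
        raise ValueError("unexpected ProtocolVersion")
    if data[2:5] != SIGNATURE_PREFIX:
        raise ValueError("unexpected ProtocolSignature")
    return data[5], data[6:]
```

When the returned remainder is empty, the input corresponds to what the text calls an empty file; otherwise the remainder starts with the single root block, possibly followed by the extended area.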

The principle of Default Zero

One of the interesting features of block attribute interpretation is the possibility of using the principle of implied zero: if an attribute is not present in the attribute part, this is equivalent to it being present with a zero value. The principle can be used to shorten the whole block, because attributes that are usually 0 can be placed at the end of the attribute sequence and omitted in practice. It can also serve as an argument when deciding the order of block attributes.

This technique also helps with compatibility, where a new version extends the previous one by adding new attributes with special meaning. It would also be appropriate to apply this principle when defining the rules for ordering attributes.

When this technique is used, it is possible to establish an unambiguous record with the minimum number of attributes: only the attributes up to the last non-zero value are present, and the zero values that would follow are omitted from the block.
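The principle can be sketched in Python as a pair of helpers, assuming attributes are held as a list of non-negative integers. The function names `trim_trailing_zeros` and `get_attribute` are hypothetical. Note that trimming may leave fewer than two attributes, which interacts with the question of incomplete blocks discussed earlier.

```python
from typing import List, Sequence


def trim_trailing_zeros(attributes: Sequence[int]) -> List[int]:
    """Return the minimal record: attributes up to the last non-zero value."""
    end = len(attributes)
    while end > 0 and attributes[end - 1] == 0:
        end -= 1
    return list(attributes[:end])


def get_attribute(attributes: Sequence[int], index: int) -> int:
    """Read an attribute by 0-based position; a missing attribute reads as zero."""
    return attributes[index] if index < len(attributes) else 0
```

Both forms of a block then read identically: `get_attribute` returns the same value at every position for `[1, 0, 2, 0, 0]` and for its trimmed form `[1, 0, 2]`.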

Groups of Blocks

Another consideration is how to organize the groups.

The number of blocks in a group may hypothetically be infinite. However, it is appropriate to keep to a finite number, so that values are not carelessly stored in an endless sequence of definitions.

Relations Between Blocks

The basic relation defined between blocks in the tree is parent-child, as follows from the definition of a tree, which may not fully cover the needs of data representation. An important aspect here is, for example, dynamic context, which allows various blocks to be substituted for one another. Block relations to other individual data items (parameters) can be addressed in several ways:

Since the protocol is designed to be dynamic, the static variant requires the dynamics to exist on the definition side, which is not viable. The variant using block type recognition would need to scan the data and possibly cache it, as it might be necessary to return to it. Full or direct referencing raises the demand for data capacity, contradicts the concept of a blob of data, and requires the existence of links that are not otherwise necessary. Direct references could provide the necessary dynamics at the reasonable price of a single attribute per parameter.

The static option was chosen as the solution, since it best suits the concept of a data block as a blob. To support lists of blocks, however, it is extended with the possibility of using an attribute to determine the number of items of the same type in a sequence of subblocks. Reordering can be implemented using links. Parameter types can be handled in a number of ways, such as:

Revision

To ensure backward / forward compatibility, it is useful to support adding new definitions to the specification while maintaining the existing ones. This can be provided either by a higher-level protocol or by defining revisions already at this level. The revision technique defines how the document is processed so that an application can handle newer / older revisions.

Possible approaches for the definition of a block type:

Attribute Types

The next step is the introduction of the types of attributes.

In the case of multivalued attributes, the question is how to deal with unboundedly long sequences. Here, too, the size of the used area could be specified, but using the number of attributes seems more appropriate, also because of a possible conflict with the principle of implicit zero.

It is also possible to introduce a connection between attributes and blocks that represent just one value. Under this comparison an attribute would represent a block without parameters, and attribute types could be presented as a sequence of such blocks. It is also worth considering whether a tree hierarchy can be applied to attributes just as it is to blocks.

Attribute Type Examples

Simple attributes include sequences of UBNumber values with a fixed number of elements. The basic and already mentioned types are UBNatural, UBInteger and their variants extended with infinity constants, and the UBRatio type. These types can then be extended with a meaning defined by a specification, for example using units or any other specific meaning.

Pointer

This type (UBPointer) is the basis for solving the problem of linking documents. Unlike in XML, it is not appropriate here to use subnode search for internal links in the document, because identifying a specific node could be a problem, especially with regard to possible transformations. It is possible to choose between several solutions:

The chosen solution for the UBPointer attribute type is realized as the following value:

UBNatural - SubBlockIndex

The value refers to one of the block's own subblocks by its index in the order of subblocks. If the referenced block is not present, a corresponding WrongPointer error is raised. Blocks are indexed from 1, and the value 0 means an empty pointer.
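A minimal Python sketch of this resolution rule follows. It assumes the subblocks of a block are available as an ordered list; the function name `resolve_pointer` is hypothetical, and the WrongPointer error named in the text is modelled here as a Python exception.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


class WrongPointer(Exception):
    """Raised when a UBPointer refers to a subblock that is not present."""


def resolve_pointer(subblocks: Sequence[T], sub_block_index: int) -> Optional[T]:
    """Resolve a UBPointer against the subblocks of its own block.

    Index 0 is the empty pointer; subblocks are numbered from 1.
    """
    if sub_block_index == 0:
        return None
    if sub_block_index < 0 or sub_block_index > len(subblocks):
        raise WrongPointer(f"no subblock with index {sub_block_index}")
    return subblocks[sub_block_index - 1]
```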

An alternative approach is UBAccPointer, which is similar to the previous option, except that a zero value is taken to mean the position following the last position referenced by a UBAccPointer in this node.

Boolean Type

The UBBoolean type is a simple UBNatural value restricted to 0 and 1, established for storing logical values.

Alternatively, you can use a UBBitField value, which treats the bits of a UBNatural value as an array of bits.

Fractions

The goal here is to enable the implementation of fractional values. These values are determined by calculation and therefore should not be included among the basic types.

The UBFraction type implements fractions in the interval <0,1> using non-negative integer values and without division by zero. From the perspective of the corresponding real values, some values repeat. Values are stored in the following sequence:

1/1, 1/2, 2/2, 1/3, 2/3, 3/3, 1/4, 2/4, 3/4, 4/4, ...
Sequence[n=1..][m=1..n](m/n)
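The enumeration above can be sketched in Python as a mapping between a position in the sequence and the pair (m, n). The sketch assumes positions are numbered from 1, so that position 1 is 1/1; the source does not state the numbering, and the function names `fraction_at` and `position_of` are hypothetical.

```python
from typing import Tuple


def fraction_at(position: int) -> Tuple[int, int]:
    """Return (m, n) for the fraction m/n at the given 1-based position."""
    if position < 1:
        raise ValueError("positions are numbered from 1 in this sketch")
    n = 1
    m = position
    # Row n of the sequence holds the n fractions 1/n .. n/n.
    while m > n:
        m -= n
        n += 1
    return m, n


def position_of(m: int, n: int) -> int:
    """Inverse mapping: rows 1..n-1 hold n*(n-1)/2 fractions in total."""
    if not 1 <= m <= n:
        raise ValueError("requires 1 <= m <= n")
    return n * (n - 1) // 2 + m
```

The repetition noted in the text is visible here: positions 1, 3 and 6 yield 1/1, 2/2 and 3/3, which all denote the same real value.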

The UBIntFraction type is an extension that adds a whole (integer) part.

Attribute Sequences

Because complex blocks are needed, it was necessary to define specific sequences of attributes that express compound information. The individual items have their own names and form a certain hierarchy.

There are several possible ways to deal with such attribute groups:

Whether this encoding should introduce a new level or merge some characteristics into one level is not yet entirely clear and will be decided later. In the meantime, it is possible to continue without solving this problem.

Examples of some types of multivalued attributes follow. Some of them were already mentioned in the part on encoding or in the BlockType part of this document.

Real and Complex Numbers

Real number UBReal is already described in the section dealing with the encoding:

UBInteger - Base
UBInteger - Mantis

There are also complex numbers available:

UBReal - RealPart
UBReal - IrationalPart

Extensions of these types that include constants for infinity are also available, namely UBEReal and UBEComplex. Alternatively, these types can be restricted to positive or integer variants, such as UBPositiveReal (UBCutInteger / UBTruncate).

Version of Block

Blocks for which compatibility is required are declared using the following UBVersion type, a sequence of two attributes that determine the version of the block:

UBNatural - MajorVersion
UBNatural - MinorVersion

If both values are zero, the block is assumed to have no version. A MajorVersion value of 0 denotes a test version. In the extended variant, UBVersionExt, the following attribute usually follows:

UBPointer - AlternativeBlock

It is a reference to another block of the same type but with a different version. Realizing the version likewise requires two values: the first determines backward compatibility and the second forward compatibility. For the same MajorVersion value, an increasing MinorVersion must guarantee that the sequence of attributes is only extended with new items.

List

The UBList list is a structure defining a finite list of attributes:

UBNatural - ItemsCount
UBNumber - Value 1
...
UBNumber - Value n

Alternatively, should ItemsCount be allowed to be of type UBENatural?
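Reading a UBList from an attribute sequence can be sketched in Python as follows, assuming the attributes are already decoded into integers and ItemsCount is a plain UBNatural. The function name `read_list` and the `offset` parameter are hypothetical. The sketch treats a list that runs past the end of the attributes as an error; combining it with the default-zero principle would be a separate design decision.

```python
from typing import List, Sequence, Tuple


def read_list(attributes: Sequence[int], offset: int = 0) -> Tuple[List[int], int]:
    """Read ItemsCount followed by that many values.

    Returns (values, next_offset), where next_offset is the position of the
    first attribute after the list.
    """
    if offset >= len(attributes):
        raise ValueError("ItemsCount is missing")
    count = attributes[offset]
    start = offset + 1
    end = start + count
    if end > len(attributes):
        raise ValueError("list is longer than the available attributes")
    return list(attributes[start:end]), end
```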

Dynamic Sequences of Attributes

It seems to be appropriate to allow the creation of items represented using a variable number of attributes. Implementation of these sequences is somewhat problematic:

Path

The UBPath type is defined as a sequence of UBNatural values and is intended primarily for representing a path in the tree.

UBNatural - PathCount
UBPointer - Path0Node
UBPointer - Path1Node
UBPointer - Path2Node
...

Using the previous type, UBLink can be constructed as a reference to another block in the document.

UBNatural - UpCount
UBPath - LinkPath
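One way to read a UBLink is sketched below in Python. It assumes that UpCount is the number of steps taken toward the root before descending, and that each path node is a 1-based UBPointer index into the current block's subblocks; neither detail is spelled out in the source. The `Block` class and the function names are hypothetical.

```python
from typing import List, Optional, Sequence


class Block:
    """Hypothetical in-memory tree node with a parent reference."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: Optional["Block"] = None
        self.children: List["Block"] = []

    def add(self, child: "Block") -> "Block":
        child.parent = self
        self.children.append(child)
        return child


def follow_path(start: Block, path: Sequence[int]) -> Block:
    """Descend from start, treating each entry as a 1-based subblock index."""
    node = start
    for pointer in path:
        if pointer < 1 or pointer > len(node.children):
            raise LookupError(f"no subblock with index {pointer}")
        node = node.children[pointer - 1]
    return node


def resolve_link(start: Block, up_count: int, path: Sequence[int]) -> Block:
    """Go up_count levels toward the root, then follow the path downward."""
    node = start
    for _ in range(up_count):
        if node.parent is None:
            raise LookupError("UpCount leads above the root block")
        node = node.parent
    return follow_path(node, path)
```

With a root that has subblocks `a` and `b`, and `b` having a subblock `c`, the link with UpCount 1 and path `[2, 1]` evaluated at `a` leads to `c`.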

List of Linked Items

The following UBPointerList type is similar to UBList, except that the individual items of the list are referenced using UBPointer values, which allows them to be placed in a different order than the one defined by the index. It is also possible to insert additional blocks between the individual items.

UBNatural - ItemsCount
UBPointer - Item0
UBPointer - Item1
UBPointer - Item2
...
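The reordering effect of UBPointerList can be illustrated with a short Python sketch. It assumes the attributes are decoded integers, that each item is a 1-based subblock index with 0 as the empty pointer, and that the subblocks are available as an ordered list; the function name `resolve_pointer_list` is hypothetical.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def resolve_pointer_list(attributes: Sequence[int],
                         subblocks: Sequence[T]) -> List[Optional[T]]:
    """Return the subblocks in the order given by the pointer list."""
    if not attributes:
        return []  # a missing ItemsCount reads as zero in this sketch
    count = attributes[0]
    pointers = attributes[1:1 + count]
    if len(pointers) < count:
        raise ValueError("fewer pointers than ItemsCount")
    items: List[Optional[T]] = []
    for pointer in pointers:
        if pointer == 0:
            items.append(None)  # empty pointer
        elif pointer > len(subblocks):
            raise LookupError(f"no subblock with index {pointer}")
        else:
            items.append(subblocks[pointer - 1])
    return items
```

Subblocks that are not referenced by any pointer are simply skipped, which corresponds to the additional blocks that may sit between list items.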

Attribute Types Hierarchy

The specification of a block from the previous level of the document can define a list referring to the blocks that represent the various attributes. A block representing an attribute should allow the attribute type to be specified as follows.

Attributes, like blocks, are defined in a tree structure. The root attribute type is UBNumber. The currently proposed structure of attributes is as follows: