Documentation

1 How to read this documentation

An important feature of the DataForge framework is its ability to work with meta-data. This documentation is automatically generated by a grain script from markdown and html pieces. Such pieces are called shards. A shard consists of data in one of the supported formats and a meta-data header. The ordering of shards is automatically inferred from the meta-data by the script.

The meta-data of each shard includes the version of that shard, its last update date, its label for reference, and ordering information.

2 Meta-data

One of the main ideas of DataForge is that all changeable information is represented by Meta objects (called annotation in earlier versions). Meta is a simple tree-like object that can conveniently store different data.

The naming of meta elements and values follows the basic DataForge [navigation](naming and navigation) convention: elements and values can be addressed by paths such as child_name.grand_child_name.value_name. One can even use numbers as queries in such paths, as in child_name.grand_child_name[3].value_name.
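The path convention above can be illustrated with a small parser. This is a minimal sketch in Python, not DataForge code: the function name `parse_path` and the exact token grammar (a name optionally followed by one numeric query in square brackets) are assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple

# One path token: a name, optionally followed by a numeric query in brackets.
# This grammar is an assumption; the real convention may allow more.
_TOKEN = re.compile(r"^([^.\[\]]+)(?:\[(\d+)\])?$")


def parse_path(path: str) -> List[Tuple[str, Optional[int]]]:
    """Split a dotted path into (name, index) pairs; index is None if absent."""
    tokens = []
    for part in path.split("."):
        match = _TOKEN.match(part)
        if match is None:
            raise ValueError(f"malformed path token: {part!r}")
        name, index = match.group(1), match.group(2)
        tokens.append((name, int(index) if index is not None else None))
    return tokens


print(parse_path("child_name.grand_child_name[3].value_name"))
```

Each pair names one step down the tree; a missing index corresponds to requesting a single element without a query.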

2.1 Meta-data object structure

The Meta object is a tree-like structure that can contain other Meta objects as branches (called elements) and Value objects as leaves. Both Values and Meta elements are organized in String-keyed maps, and each map entry is a list of the appropriate type. Requesting a single Value or Meta element returns the first element of that list.

Note that such lists are always immutable. Trying to change them may cause an error.
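The structure described above can be sketched as follows. This is an illustrative Python model under stated assumptions, not the DataForge implementation: the class layout and the method names `get_value`, `get_values`, `get_meta` and `get_metas` are hypothetical, and tuples stand in for the immutable lists.

```python
from types import MappingProxyType


class Meta:
    """Hypothetical immutable tree node holding two string-keyed maps."""

    def __init__(self, name, values=None, elements=None):
        self.name = name
        # Tuples and read-only mapping views stand in for immutable lists/maps.
        self._values = MappingProxyType(
            {key: tuple(items) for key, items in (values or {}).items()})
        self._elements = MappingProxyType(
            {key: tuple(items) for key, items in (elements or {}).items()})

    def get_values(self, key):
        """Return the whole (immutable) list of values stored under key."""
        return self._values[key]

    def get_value(self, key):
        """A single-value request returns the first item of the list."""
        return self._values[key][0]

    def get_metas(self, key):
        return self._elements[key]

    def get_meta(self, key):
        return self._elements[key][0]


detector = Meta("detector", values={"channel": [1, 2, 3]})
root = Meta("root", values={"title": ["run 42"]},
            elements={"detector": [detector]})

print(root.get_value("title"))
print(root.get_meta("detector").get_value("channel"))
print(root.get_meta("detector").get_values("channel"))
```

In this sketch an attempt to modify a returned list, for example `root.get_values("title").append("x")`, raises `AttributeError` because tuples have no mutating methods.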

While Meta itself does not have any write methods and is considered immutable, some of its extensions do have methods that can change the meta structure. One should be careful not to use a mutable meta element where an immutable one is needed.

To edit meta conveniently, there is the MetaBuilder class.

2.2 Configuration

The configuration is a very important extension of the basic Meta class. It is essentially a mutable meta that incorporates external observers. It is also important that, while a simple Meta knows its children but can be attached freely to any ancestor, a configuration has one designated ancestor that is notified when the configuration is changed.

Note that putting elements or values into a configuration follows the same naming convention as getting them from it. Putting to some_name.something puts into the node some_name if it exists; otherwise, a new node some_name is created first.
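The behaviour described above (observers, a single designated ancestor, and node creation on put) can be sketched like this. It is a simplified Python model, not the DataForge class: the names `put_value`, `get_value` and `observers` are hypothetical, each key holds a single value rather than a list, and numeric queries in paths are ignored.

```python
from typing import Any, Callable, Dict, List, Optional


class Configuration:
    """Hypothetical mutable meta node with observers and one parent."""

    def __init__(self, name: str, parent: Optional["Configuration"] = None):
        self.name = name
        self.parent = parent  # the one designated ancestor
        self.values: Dict[str, Any] = {}
        self.children: Dict[str, "Configuration"] = {}
        self.observers: List[Callable[[str, Any], None]] = []

    def put_value(self, path: str, value: Any) -> None:
        head, _, tail = path.partition(".")
        if tail:
            # Reuse the node `head` if it exists; otherwise create it.
            child = self.children.get(head)
            if child is None:
                child = self.children[head] = Configuration(head, parent=self)
            child.put_value(tail, value)
        else:
            self.values[head] = value
            self._notify(head, value)

    def get_value(self, path: str) -> Any:
        head, _, tail = path.partition(".")
        if tail:
            return self.children[head].get_value(tail)
        return self.values[head]

    def _notify(self, path: str, value: Any) -> None:
        # Tell local observers, then pass the change up to the ancestor.
        for observer in self.observers:
            observer(path, value)
        if self.parent is not None:
            self.parent._notify(f"{self.name}.{path}", value)


changes = []
config = Configuration("root")
config.observers.append(lambda path, value: changes.append((path, value)))
config.put_value("some_name.something", 1)  # creates the node some_name
config.put_value("some_name.other", 2)      # reuses the node some_name
print(changes)
```

The observer registered on the root sees both changes with their full paths, because each child forwards notifications to its designated ancestor.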

3 General

3.1 Context

Context encapsulation.

This section is under construction...

Navigation

This section is under construction...

4 Data flow

Data flow models

4.1 Actions

Actions: push data flow model

4.2 Tasks

Tasks: pull data flow model

5 Data format

The DataForge functionality is largely based on metadata exchange, and therefore the main medium for messages between different parts of the system is the Meta object and its derivatives. But sometimes one needs to transfer not only metadata but some binary or object data as well.

To do so, one should use the Envelope format. It is a combined format for both text metadata and data in a single block. An Envelope container consists of three main components:

Properties. A set of key-value bindings defining envelope format: metadata format, encoding and length, data format and length and general envelope version.
Meta. Text metadata in any supported format.
Data. Any binary or textual data. The rules for reading this data can be derived either from the properties header or from the envelope meta.

5.1 Envelope format

An envelope is a logical structure and its physical binary representation could be different for different purposes, but the default for envelope files or streams is the following:

Tag. The first 30 bytes of the file or stream are reserved for the binary representation of the envelope properties:
1. #! - two ASCII symbols, beginning of binary string.
2. 4 bytes - properties type field: envelope format type and version. Depending on this value the rest of the binary string could be interpreted differently.
3. 4 bytes - currently reserved.
4. 4 bytes - properties metaType field: metadata type and encoding.
5. 4 bytes - properties metaLength field: metadata length in bytes including new lines and other separators.
6. 4 bytes - properties dataType field: data format and type. This field is not necessary for some applications and could be used for other purposes.
7. 4 bytes - properties dataLength field: the data length in bytes.
8. !# - two ASCII symbols, end of binary string.
9. \r\n - two bytes, new line.
The values are read as binary and transformed into 4-byte unsigned tag codes.
Properties override. Properties can be overridden with text values using the following notation:

#? <property key> : <property value>; <new line>

Any whitespace before the start of <property value> is ignored. The ; symbol is optional, but everything after it is ignored. Every property must be on a separate line. It is recommended to use the universal new line sequence \r\n to ensure correct behavior on any system.

Properties are accepted either in their textual representation or as tag codes.
Metadata block. Metadata in any accepted format. Additional formats can be provided by modules. The default metadata format is UTF-8 encoded XML (tag code 0x00000000). The JSON format is provided by the storage module.

Note that the metaLength property is very important and in most cases mandatory. It can be set to 0xffffffff or -1 to force the envelope reader to derive the meta length automatically, but different readers do this in different ways, so it is strongly discouraged if the data block is not empty.
Data block. Any other data. If the dataLength property is set to 0xffffffff or -1, the data block is assumed to end at the end of the file or stream. The data block has no limitations on its content. It could even contain other envelopes!
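The default layout above can be sketched with Python's `struct` module. This is an illustration under stated assumptions, not a reference reader: the byte order (big-endian here) is not specified in the text, the envelope type code `0x00000001` and the property value `JSON` are hypothetical placeholders, and the function names are invented for the example. Only the field order, the field sizes, the `#!`/`!#` markers and the XML tag code `0x00000000` come from the description.

```python
import struct

# "#!", six 4-byte unsigned fields, "!#", "\r\n".  Big-endian is an assumption.
TAG_FORMAT = ">2sIIIIII2s2s"
TAG_SIZE = struct.calcsize(TAG_FORMAT)  # 2 + 6 * 4 + 2 + 2 = 30 bytes
ENVELOPE_TYPE = 0x00000001              # hypothetical type/version code
XML_META = 0x00000000                   # default metadata format (from the text)


def pack_tag(type_code, meta_type, meta_length, data_type, data_length):
    """Build the 30-byte binary tag; the reserved field is written as 0."""
    return struct.pack(TAG_FORMAT, b"#!", type_code, 0, meta_type,
                       meta_length, data_type, data_length, b"!#", b"\r\n")


def unpack_tag(raw):
    """Read the properties back from the first 30 bytes of an envelope."""
    (start, type_code, _reserved, meta_type, meta_length,
     data_type, data_length, end, newline) = struct.unpack(
        TAG_FORMAT, raw[:TAG_SIZE])
    if start != b"#!" or end != b"!#" or newline != b"\r\n":
        raise ValueError("not an envelope tag")
    return {"type": type_code, "metaType": meta_type,
            "metaLength": meta_length, "dataType": data_type,
            "dataLength": data_length}


def parse_override(line):
    """Parse one '#? key : value;' line into a (key, value) pair."""
    line = line.rstrip("\r\n")
    if not line.startswith("#?"):
        raise ValueError("not a property override line")
    body = line[2:].split(";", 1)[0]  # everything after ';' is ignored
    key, separator, value = body.partition(":")
    if not separator:
        raise ValueError("missing ':' in property override")
    return key.strip(), value.lstrip()  # whitespace before the value is ignored


meta = b'<meta name="example"/>\r\n'
data = b"\x01\x02\x03"
envelope = pack_tag(ENVELOPE_TYPE, XML_META, len(meta), 0, len(data)) + meta + data

tag = unpack_tag(envelope)
meta_end = TAG_SIZE + tag["metaLength"]
print(tag)
print(envelope[TAG_SIZE:meta_end])
print(envelope[meta_end:meta_end + tag["dataLength"]])
print(parse_override("#? metaType : JSON; ignored comment\r\n"))
```

Because the example envelope has no property override lines, the metadata block starts immediately after the tag; a reader that supports overrides would consume the `#?` lines first.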

5.2 Envelope codes

The currently used envelope properties and codes are the following:

This section is under construction...

6 Storage plugin

The storage plugin defines an interface between DataForge and different data storage systems such as databases, remote servers, or other means of saving and loading data.

The main object in the storage system is called Storage. It represents a connection to some data-storing back-end. In terms of SQL databases (which are not used by DataForge by default), it is the equivalent of a database. A Storage can provide different Loaders. A Loader governs direct data pushing and pulling. In terms of SQL, it is the equivalent of a table.

Note: the DataForge storage system is designed to be used with experimental data, and therefore loaders are optimized to put data online and then analyze it. Operations that modify existing data are not supported by basic loaders.

The storage system is hierarchical: each storage can have any number of child storages, so it is basically a tree. Each child storage has a reference to its parent. A storage without a parent is called a root storage. The system can support any number of root storages at a time using the storage context plugin.
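The hierarchy can be pictured with a small tree model. This is a minimal Python sketch, not the DataForge Storage API: the class below, its method names, and the dot-separated full name are assumptions made for illustration.

```python
class Storage:
    """Hypothetical storage node: any number of children, at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self  # register with the parent

    def is_root(self):
        """A storage without a parent is a root storage."""
        return self.parent is None

    def root(self):
        """Follow parent references up to the root storage."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def full_name(self):
        """Join the names from the root down to this storage."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return ".".join(reversed(parts))


root = Storage("root")
run = Storage("run_1", parent=root)
calibration = Storage("calibration", parent=run)

print(calibration.full_name())
print(calibration.root() is root)
```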

By default, the DataForge storage module supports the following loader types:

PointLoader. Direct equivalent of an SQL table. It can push or pull DataPoint objects. A PointLoader contains information about the DataFormat of its DataPoints. This format is treated as a minimum requirement for pushing a DataPoint, but an implementation may simply cut all fields that are not contained in the loader format.
EventLoader. Can push DataForge events.
StateLoader. The only loader that allows changing data. It holds a set of key-value pairs. Each subsequent push overrides the corresponding state.
BinaryLoader. A named set of fragment objects. The type of these objects is defined by a generic type parameter, and the API does not define the format or procedure to read or write these objects.
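The difference between an append-only point loader and a state loader can be sketched as follows. These Python classes are illustrative stand-ins, not the DataForge loaders: the method names `push`, `pull` and `get`, and the use of plain dictionaries for data points, are assumptions.

```python
class PointLoader:
    """Hypothetical append-only loader with a fixed list of field names."""

    def __init__(self, name, fields):
        self.name = name
        self.fields = tuple(fields)  # stands in for the DataFormat
        self._points = []

    def push(self, point):
        # The format is a minimum requirement: all its fields must be present.
        missing = [field for field in self.fields if field not in point]
        if missing:
            raise ValueError(f"point lacks required fields: {missing}")
        # Fields outside the loader format are cut.
        self._points.append({field: point[field] for field in self.fields})

    def pull(self):
        return list(self._points)


class StateLoader:
    """Hypothetical key-value loader; each push overrides the previous state."""

    def __init__(self, name):
        self.name = name
        self._states = {}

    def push(self, key, value):
        self._states[key] = value

    def get(self, key):
        return self._states[key]


points = PointLoader("spectrum", ["x", "y"])
points.push({"x": 1.0, "y": 2.0, "comment": "extra field"})
print(points.pull())

states = StateLoader("status")
states.push("voltage", 1.0)
states.push("voltage", 2.0)
print(states.get("voltage"))
```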

Loaders as well as Storages implement the Responder interface and can accept requests in the form of envelopes.

6.1 File storage

The FileStorage is the default implementation of the storage API.

This section is under construction...

7 Control plugin

The DataForge control subsystem defines a general API for data acquisition processes. It can be used to issue commands to devices, read data, and communicate with the storage system or other devices.

The center of the control API is the Device class. A device has the following important features:

States: each device has a number of states that can be accessed by the `getState` method. States can either be stored as internal variables or calculated on demand. State calculation is synchronous!
Listeners: external objects that listen to device state changes and events. By default, listeners are held by weak references, so they can be finalized at any time if not used elsewhere.
Connections: any external device connectors that are used by the device. The difference between a listener and a connection is that the device is obligated to notify all registered listeners about all changes, whereas a connection is used by the device at its own discretion. Also, usually only one connection is used for each single purpose.
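The three features above can be sketched in a few lines. This is a simplified Python model, not the DataForge Device class: `get_state` mirrors the `getState` method mentioned in the text, while `set_state`, `add_computed_state`, `add_listener`, `connect` and the listener callback `on_state_change` are hypothetical names. A `weakref.WeakSet` stands in for the weakly referenced listeners.

```python
import weakref


class Device:
    """Hypothetical device with states, weak listeners and connections."""

    def __init__(self, name):
        self.name = name
        self._stored = {}                    # states kept as internal variables
        self._computed = {}                  # states calculated on demand
        self._listeners = weakref.WeakSet()  # listeners are weakly referenced
        self._connections = {}               # one connection per purpose

    def add_listener(self, listener):
        self._listeners.add(listener)

    def connect(self, purpose, connection):
        self._connections[purpose] = connection

    def add_computed_state(self, name, compute):
        self._computed[name] = compute

    def get_state(self, name):
        # Computed states are evaluated synchronously in the caller's thread.
        if name in self._computed:
            return self._computed[name]()
        return self._stored[name]

    def set_state(self, name, value):
        self._stored[name] = value
        # The device must notify every registered listener about the change.
        for listener in list(self._listeners):
            listener.on_state_change(self, name, value)


class RecordingListener:
    """Example listener that records every state change it is told about."""

    def __init__(self):
        self.seen = []

    def on_state_change(self, device, name, value):
        self.seen.append((device.name, name, value))


device = Device("pump")
listener = RecordingListener()  # must stay referenced to keep receiving updates
device.add_listener(listener)
device.set_state("voltage", 5.0)
device.add_computed_state("doubled", lambda: device.get_state("voltage") * 2.0)

print(listener.seen)
print(device.get_state("doubled"))
```

Because the set holds only weak references, a listener that is no longer referenced anywhere else simply stops receiving notifications once it is collected, which matches the finalization behaviour described above.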
