1 How to read this documentation

An important feature of the DataForge framework is its ability to work with metadata. This documentation is automatically generated by a grain script from Markdown and HTML pieces. Such pieces are called shards. A shard consists of data in one of the supported formats and a metadata header. The ordering of shards is automatically inferred from the metadata by the script.

The metadata of each shard includes the version of that shard, its last update date, its label for referencing, and ordering information.


2 Meta-data

One of the main ideas of DataForge is that all changeable information is represented by Meta objects (called annotations in earlier versions). A Meta is a simple tree-like object that can conveniently store different kinds of data.

The naming of meta elements and values follows the basic DataForge [navigation](naming and navigation) convention: elements and values can be addressed by paths like `child_name.grand_child_name.value_name`. One can even use numbers as queries in such paths, e.g. `child_name.grand_child_name[3].value_name`.
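The real name parsing lives in the DataForge naming API; purely as an illustration of the convention above, a path with optional numeric queries can be tokenized like this (the `MetaPath` helper is hypothetical, not part of the library):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaPath {
    /** One path segment: a name plus an optional numeric query, e.g. "grand_child_name[3]". */
    public record Segment(String name, int index) {}

    private static final Pattern SEGMENT = Pattern.compile("([^.\\[\\]]+)(?:\\[(\\d+)\\])?");

    /** Split "a.b[3].c" into segments; index -1 means "no explicit query" (take the first element). */
    public static List<Segment> parse(String path) {
        List<Segment> segments = new ArrayList<>();
        for (String token : path.split("\\.")) {
            Matcher m = SEGMENT.matcher(token);
            if (!m.matches()) throw new IllegalArgumentException("Bad segment: " + token);
            int index = m.group(2) == null ? -1 : Integer.parseInt(m.group(2));
            segments.add(new Segment(m.group(1), index));
        }
        return segments;
    }

    public static void main(String[] args) {
        List<Segment> s = parse("child_name.grand_child_name[3].value_name");
        System.out.println(s.size());          // 3
        System.out.println(s.get(1).name());   // grand_child_name
        System.out.println(s.get(1).index());  // 3
    }
}
```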


2.1 Meta-data object structure

Meta structure

The Meta object is a tree-like structure which can contain other meta objects as branches (called elements) and Value objects as leaves. Both values and elements are organized in String-keyed maps, and each map entry is a list of the appropriate type. Requesting a single Value or Meta element returns the first element of the corresponding list.


Note that such lists are always immutable. Trying to change one will cause an error.


While Meta itself does not have any write methods and is considered immutable, some of its extensions do have methods that change the meta structure. One should be careful not to use a mutable meta element where an immutable one is required.

In order to conveniently edit meta, there is the MetaBuilder class.
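The structure described above can be sketched in a few lines. This is a toy model for illustration only, with assumed names (`SimpleMeta`, `Builder`), not the actual Meta/MetaBuilder API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MetaSketch {

    /** Toy immutable meta node: string-keyed lists of values and of child nodes. */
    static final class SimpleMeta {
        private final Map<String, List<Object>> values;
        private final Map<String, List<SimpleMeta>> elements;

        SimpleMeta(Map<String, List<Object>> values, Map<String, List<SimpleMeta>> elements) {
            this.values = values;     // lists are already immutable copies made by the builder
            this.elements = elements;
        }

        /** Requesting a single value returns the first element of the underlying list. */
        Object getValue(String key) { return values.get(key).get(0); }

        SimpleMeta getElement(String key) { return elements.get(key).get(0); }

        List<Object> getValues(String key) { return values.get(key); }
    }

    /** Toy builder in the spirit of MetaBuilder: mutable while building, immutable result. */
    static final class Builder {
        private final Map<String, List<Object>> values = new LinkedHashMap<>();
        private final Map<String, List<SimpleMeta>> elements = new LinkedHashMap<>();

        Builder putValue(String key, Object value) {
            values.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
            return this;
        }

        Builder putElement(String key, SimpleMeta element) {
            elements.computeIfAbsent(key, k -> new ArrayList<>()).add(element);
            return this;
        }

        SimpleMeta build() {
            Map<String, List<Object>> v = new LinkedHashMap<>();
            values.forEach((k, list) -> v.put(k, List.copyOf(list)));   // immutable snapshots
            Map<String, List<SimpleMeta>> e = new LinkedHashMap<>();
            elements.forEach((k, list) -> e.put(k, List.copyOf(list)));
            return new SimpleMeta(v, e);
        }
    }

    public static void main(String[] args) {
        SimpleMeta child = new Builder().putValue("value_name", 42).build();
        SimpleMeta meta = new Builder()
                .putValue("version", "1.0")
                .putValue("version", "1.1")      // each key maps to a list of values
                .putElement("child_name", child)
                .build();
        System.out.println(meta.getValue("version"));   // 1.0 (first element of the list)
        System.out.println(meta.getElement("child_name").getValue("value_name")); // 42
    }
}
```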


2.2 Configuration

The Configuration is a very important extension of the basic Meta class. It is essentially a mutable meta that supports external observers. It is also important that, while a simple Meta knows its children and can be attached freely to any ancestor, a Configuration has one designated ancestor that is notified whenever the configuration changes.


Note that putting elements or values into a configuration follows the same naming convention as getting them from it: putting to `some_name.something` will put to the node `some_name` if it exists; otherwise, a new node is created.
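The observer behavior can be sketched as follows. This is a minimal illustration of the idea (mutable store plus change notifications), not the actual Configuration API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

public class ConfigSketch {

    /** Toy mutable configuration: stores values and notifies observers
        (e.g. a designated ancestor) whenever a value changes. */
    static final class Config {
        private final Map<String, Object> values = new LinkedHashMap<>();
        private final List<BiConsumer<String, Object>> observers = new ArrayList<>();

        void observe(BiConsumer<String, Object> observer) { observers.add(observer); }

        void setValue(String key, Object value) {
            values.put(key, value);
            for (BiConsumer<String, Object> o : observers) o.accept(key, value);
        }

        Object getValue(String key) { return values.get(key); }
    }

    public static void main(String[] args) {
        Config config = new Config();
        List<String> notifications = new ArrayList<>();
        config.observe((key, value) -> notifications.add(key + " -> " + value));
        config.setValue("detector.voltage", 1500);
        config.setValue("detector.voltage", 1600);  // every change triggers the observers
        System.out.println(notifications.size());                 // 2
        System.out.println(config.getValue("detector.voltage"));  // 1600
    }
}
```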



3 General


3.1 Context

Context encapsulation.

This section is under construction...


Navigation

This section is under construction...


4 Data flow

Data flow models


4.1 Actions

Actions implement the push data flow model.


4.2 Tasks

Tasks implement the pull data flow model.


5 Data format

The DataForge functionality is largely based on metadata exchange, and therefore the main medium for messages between different parts of the system is the Meta object and its derivatives. But sometimes one needs to transfer not only metadata but binary or object data as well.

In order to do so, one should use the Envelope format. It is a combined format holding both textual metadata and data in a single block. An Envelope container consists of three main components:

  1. Properties. A set of key-value bindings defining the envelope layout: the metadata format, encoding, and length; the data format and length; and the general envelope version.
  2. Meta. Textual metadata in any supported format.
  3. Data. Any binary or textual data. The rules for reading this data can be derived either from the properties header or from the envelope meta.
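The three components can be pictured as a simple in-memory container. The sketch below is purely illustrative: the property key names (`metaType`, `metaLength`, `dataLength`) are assumptions for the example, not the actual DataForge envelope codes, and no binary layout is implied:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class EnvelopeSketch {

    /** Toy envelope: properties header, textual metadata, and a raw data block. */
    record Envelope(Map<String, String> properties, String meta, byte[] data) {}

    public static void main(String[] args) {
        byte[] payload = "raw detector counts".getBytes(StandardCharsets.UTF_8);
        String metaText = "<meta><detector name=\"test\"/></meta>";
        Envelope envelope = new Envelope(
                Map.of(
                        "metaType", "XML",                          // hypothetical property keys
                        "metaLength", String.valueOf(metaText.length()),
                        "dataLength", String.valueOf(payload.length)
                ),
                metaText,   // textual metadata component
                payload     // binary data component
        );
        System.out.println(envelope.properties().get("dataLength")); // 19
    }
}
```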


5.1 Envelope format

An envelope is a logical structure; its physical binary representation may differ depending on the purpose, but the default layout for envelope files and streams is the following:


5.2 Envelope codes

The currently used envelope properties and codes are the following:

This section is under construction...


6 Storage plugin

The storage plugin defines an interface between DataForge and different data storage systems, such as databases, remote servers, or other means of saving and loading data.

The main object in the storage system is called Storage. It represents a connection to some data-storing back-end. In terms of SQL databases (which are not used by DataForge by default), it is the equivalent of a database. A Storage can provide different Loaders. A Loader governs direct data pushing and pulling; in SQL terms, it is the equivalent of a table.


Note: the DataForge storage system is designed to be used with experimental data, and therefore loaders are optimized to put data online and then analyze it. Operations that modify existing data are not supported by basic loaders.


The storage system is hierarchical: each storage can have any number of child storages, so it is basically a tree. Each child storage holds a reference to its parent. A storage without a parent is called a root storage. The system can support any number of root storages at a time using the storage context plugin.
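The parent/root relationship can be sketched with a minimal tree node. This is a toy model of the hierarchy described above, with assumed names, not the DataForge Storage API:

```java
import java.util.ArrayList;
import java.util.List;

public class StorageSketch {

    /** Toy storage node: each storage keeps a reference to its parent;
        a storage without a parent is a root storage. */
    static final class Storage {
        final String name;
        final Storage parent;
        final List<Storage> children = new ArrayList<>();

        Storage(String name, Storage parent) {
            this.name = name;
            this.parent = parent;
            if (parent != null) parent.children.add(this);
        }

        boolean isRoot() { return parent == null; }

        /** Walk parent references up to the root of this storage tree. */
        Storage root() { return isRoot() ? this : parent.root(); }
    }

    public static void main(String[] args) {
        Storage root = new Storage("experiment", null);
        Storage run = new Storage("run_2024", root);
        Storage detector = new Storage("detector", run);
        System.out.println(detector.root().name); // experiment
        System.out.println(root.isRoot());        // true
        System.out.println(run.children.size());  // 1
    }
}
```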

By default, the DataForge storage module supports the following loader types:

  1. PointLoader. The direct equivalent of an SQL table. It can push or pull DataPoint objects. A PointLoader contains information about the DataPoint DataFormat. This format is assumed to be just a minimum requirement for DataPoint pushing; an implementation may simply drop all fields that are not contained in the loader format.
  2. EventLoader. Can push DataForge events.
  3. StateLoader. The only loader that allows data to be changed. It holds a set of key-value pairs; each subsequent push overrides the corresponding state.
  4. BinaryLoader. A named set of fragment objects. The type of these objects is defined by a generic parameter, and the API does not define the format or procedure for reading or writing them.
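The override semantics of the StateLoader can be sketched as a last-write-wins key-value store. Again, this is a conceptual illustration with assumed names, not the actual loader API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StateLoaderSketch {

    /** Toy state store: each push overrides the state under the same key;
        pulls always return the latest pushed value. */
    static final class StateStore {
        private final Map<String, Object> states = new LinkedHashMap<>();

        void push(String key, Object value) { states.put(key, value); }

        Object pull(String key) { return states.get(key); }
    }

    public static void main(String[] args) {
        StateStore loader = new StateStore();
        loader.push("hv.voltage", 1400);
        loader.push("hv.voltage", 1500);  // overrides the previous state
        System.out.println(loader.pull("hv.voltage")); // 1500
    }
}
```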

Loaders, as well as Storages, implement the Responder interface and can accept requests in the form of envelopes.


6.1 File storage

The FileStorage is the default implementation of the storage API.

This section is under construction...


7 Control plugin

The DataForge control subsystem defines a general API for data acquisition processes. It can be used to issue commands to devices, read data, and communicate with the storage system or other devices.

The center of the control API is the Device class. A device has the following important features: