Next steps

Decode header details and pass them as -v "varname" on the command line of the main process
Store header details and logistic details in a special file for the model
Consecutive filtering of data on payload level:
  -- filter by record type
  -- time resolution (seconds, sum over the day)
  -- digital twin starts with the variables "mileage" and "speed"; customer/driver info, loco information, wheel sensor correction

The concept of using standard tools is not intended for production use, but for demonstration and teaching. Using (named) pipes would allow the processes to be separated better in time and CPU.

2 Interpreting USAR Files (using only standard tools)

The XML file contains the information for the RedBox recording process and not only the data structure of the rng files.

The XML config file contains the definitions of variable types, variable names and record types in separate sections.

modified tar format (USAR) = tar of { RNG Files + cfg File}
RNG Header (512 bytes) + Records
HDLC FRAME (DLE+STX/ETX) per record
Packet DLE stuffed
Packet CRC checked
RecType, distance, timestamp + Payload
Payload (array of signal elements including bit fields; signals have types and scale factors)
Correction factors and validation information applied

All binary and bit interpretation is done with character tools based on hex output or "1001" strings, similar to bitstruct. The data processing is done with gawk (ver > 4.1) scripts. With standard tools, it should be possible to define a set of variables and to extract them, similar to an output (CSV export) configuration in ADS4. The "Rules", i.e. additional algorithmic statements, can also be included in an awk script. Currently this command line is used:

od -t x2 -v -j 512 --endian=big $datapath/$filemask | tr "[a-f]" "[A-F]" | gawk -f h0.awk | gawk -f h1.awk | ...

2.1 UsarExtractor

Use UsarExtractor to extract the tar archive with the rng files and mfr1.cfg (XML format). The program runs under 64-bit Linux and on Cocalc.com; the source code is available somewhere in git. Workaround for a "checksum error": use ADS4 to pre-load the usar file, then copy the rng and cfg files from the temp folder. The tool chain also works with such a single rng file.

2.2 Create Records

2.2.1 Convert rng binary file to hex with od

Use od for the hex output. The od tool has the advantage that it can adjust the byte order, which is needed here.
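To illustrate why the byte order matters, the following is a minimal Python sketch of what the od step produces. It is not part of the tool chain; the function name hex_words is made up for this example, and the zero padding of an odd final byte is an assumption. od additionally prints an offset column, which is left out here.

```python
def hex_words(data: bytes, skip: int = 512, byteorder: str = "big") -> list:
    """Return the file content after the header as 4-digit hex words.

    skip=512 mirrors 'od -j 512' (skip the RNG header),
    byteorder="big" mirrors 'od -t x2 --endian=big'.
    """
    body = data[skip:]
    if len(body) % 2:
        body += b"\x00"          # assumption: odd final byte is zero-padded
    words = []
    for i in range(0, len(body), 2):
        value = int.from_bytes(body[i:i + 2], byteorder)
        words.append(f"{value:04x}")
    return words


# Hypothetical file: 512 header bytes followed by DLE STX and two data bytes.
sample = bytes(512) + bytes([0x10, 0x02, 0xAB, 0xCD])
print(hex_words(sample))                      # words in file byte order
print(hex_words(sample, byteorder="little"))  # bytes swapped inside each word
```

With little-endian words the two bytes of every word are swapped, so a later search for the byte sequence 10 02 would no longer match the text stream.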

2.2.2 Convert hex to upper case with tr

The tr step is only needed if upper-case hex digits are required by the following processing steps.

2.2.3 Separate / concat hex bytes with h0.awk

The hex character stream is split into bytes with the separator character ';' so that the next step can search byte-wise. This prevents false matches such as finding 10031002 inside 0100310020, where the match starts at an odd character offset, i.e. in the middle of a byte.
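The false match can be reproduced with a few lines of Python. This is only an illustration of the idea, not the content of h0.awk; the function name split_bytes is made up for this example.

```python
def split_bytes(hex_stream: str, sep: str = ";") -> str:
    """Insert a separator between every two hex characters (one byte)."""
    pairs = [hex_stream[i:i + 2] for i in range(0, len(hex_stream), 2)]
    return sep.join(pairs)


stream = "0100310020"    # bytes 01 00 31 00 20, no frame marker inside
marker = "10031002"      # DLE ETX DLE STX as plain hex text

# Plain text search: matches at character offset 1, in the middle of a byte.
print(stream.find(marker))

# Byte-wise search: after separation the marker is no longer found.
separated = split_bytes(stream)
print(separated)
print(split_bytes(marker) in separated)
```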

2.2.4 Separate hex in records with h1.awk

Here the DLE unstuffing, gsub("10;10","10");, and the search for the record separator are done. The separator currently used is RS=";10;03;10;02;": the combined byte sequence 0x10 0x03 0x10 0x02 (DLE ETX followed by DLE STX) marks the end of the last record and the start of the new record. Field separators are removed by gsub(";","",$0);. The output is hex records with the line separator "\n", to be used with other tools.
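The same three operations can be sketched in Python to check the logic on a small input. This is an illustration under the assumptions stated in the comments, not a copy of h1.awk; the function name split_records is made up for this example.

```python
RECORD_SEPARATOR = ";10;03;10;02;"   # DLE ETX DLE STX, as used for RS


def split_records(separated: str) -> list:
    """Split a ';'-separated hex byte stream into hex records.

    Mirrors the steps described for h1.awk: split at the record separator,
    replace stuffed DLE pairs "10;10" by "10", remove the ';' separators.
    Assumption: the DLE STX in front of the very first record and the
    DLE ETX after the last one are not handled here.
    """
    records = []
    for chunk in separated.split(RECORD_SEPARATOR):
        chunk = chunk.replace("10;10", "10")   # DLE unstuffing
        chunk = chunk.replace(";", "")         # drop byte separators
        if chunk:
            records.append(chunk)
    return records


# Hypothetical stream: first record contains one stuffed DLE.
stream = "AA;10;10;BB;10;03;10;02;CC;DD"
print("\n".join(split_records(stream)))
```

Because the bytes are separated by ';', the pattern "10;10" can only match two complete, adjacent 0x10 bytes.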

2.2.5 Interpret records with h2.awk

Output definitions and details for the display of curves in a graph are defined by ADS4 in an XML file. This file can be used as a definition of the record types and fields which should be decoded. For each field a decoding statement is generated. Currently the statements are generated from a shorter definition format without references, using an awk script (generate.awk). The output code, including some scaling, can be managed by this step. (Currently, a code example can be found in bits.awk.)
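The principle of such generated decoding statements can be sketched in Python with "1001" strings. The field names, bit offsets, widths and scale factors below are purely hypothetical; the real layout comes from the XML definition and is not reproduced here.

```python
def hex_to_bits(hex_record: str) -> str:
    """Convert a hex record into a "1001"-style bit string."""
    return "".join(f"{int(ch, 16):04b}" for ch in hex_record)


def decode(hex_record: str, layout: list) -> dict:
    """Decode fields given as (name, bit offset, bit width, scale) tuples."""
    bits = hex_to_bits(hex_record)
    values = {}
    for name, offset, width, scale in layout:
        raw = int(bits[offset:offset + width], 2)
        values[name] = raw * scale
    return values


# Hypothetical layout, for illustration only.
EXAMPLE_LAYOUT = [
    ("rec_type", 0, 8, 1),      # 8-bit record type
    ("speed", 8, 16, 0.1),      # 16-bit value, scale factor 0.1
    ("door_open", 24, 1, 1),    # single bit of a bit field
]

print(decode("0101F480", EXAMPLE_LAYOUT))
```

Working on the bit string keeps bit fields and byte-aligned values in one uniform mechanism, at the cost of speed.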

2.2.6 Use histogram and mean value for records length to check process

The record lengths are accumulated in a histogram with 256 bins, which is written to the file hist.txt at the end. (Currently, the code can be found in h1.awk.)
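A minimal Python sketch of this check is shown below. It counts records per length in bytes; collecting longer records in the last bin and the function names are assumptions of this example, not taken from h1.awk.

```python
def length_histogram(records: list, bins: int = 256) -> list:
    """Count hex records per length in bytes (two hex characters per byte)."""
    hist = [0] * bins
    for record in records:
        n_bytes = len(record) // 2
        hist[min(n_bytes, bins - 1)] += 1   # assumption: clamp long records
    return hist


def mean_length(hist: list) -> float:
    """Mean record length in bytes, computed from the histogram."""
    total = sum(hist)
    if total == 0:
        return 0.0
    return sum(length * count for length, count in enumerate(hist)) / total


records = ["AA10BB", "CCDD", "EEFF"]      # hypothetical hex records
hist = length_histogram(records)
print(hist[2], hist[3], mean_length(hist))
```

An unexpected peak at very short or very long lengths is a hint that the record separator or the DLE unstuffing does not work as intended.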

2.2.7 gawk extension and programming tricks used

Table lookup uses awk arrays; string indices are standard awk.
Variables used inside a function are global unless they are declared as parameters (arrggg...).
The @include "decode.inc" statement is used; the file is generated by generate.awk.
Record type RT=0x00 is used for the definition of header values.
True multi-dimensional arrays are a gawk extension; standard awk emulates them by joining the indices x,y into a single string key (x SUBSEP y).
A check with Alpine Linux is still open; only the awk sandbox is provided there.
Currently the first record is lost.

2.2.8 CardHeader.bin

The tool chain can be used to convert CardHeader.bin (RT=0xFF) into gawk -v varname=value sequences, so that all header information is available in awk; it is supplied via the command line in $details. Check cardheader.awk and the corresponding line in the script file.
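How such a -v sequence can be built is sketched below in Python. The header names and values are hypothetical, and the function name to_awk_args is made up for this example; cardheader.awk itself is not reproduced.

```python
import shlex


def to_awk_args(header: dict) -> str:
    """Build a '-v name=value ...' string that is safe to paste into a shell."""
    parts = []
    for name, value in header.items():
        # shlex.quote adds quotes only where the shell needs them.
        parts.append("-v " + shlex.quote(f"{name}={value}"))
    return " ".join(parts)


# Hypothetical header values, for illustration only.
details = to_awk_args({"mileage": 12345, "loco": "BR 185"})
print(details)
```

Note that awk interprets escape sequences in the value of a -v assignment, so values containing backslashes need extra care.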

3 Minimal system model (Digital twin) for mileage and speed

Dynamic model of a mass point and Kalman filter to reduce data
Customer/driver info, loco information, wheel sensor correction
Multiple scales/resolutions of information
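As a starting point, a mass point with the state (mileage, speed) and a textbook constant-velocity Kalman filter could look like the Python sketch below. It assumes that only mileage is measured at a fixed time step; the noise parameters q and r, the initial covariance and all function names are illustrative assumptions, not values from the RedBox data.

```python
def kalman_step(state, cov, z, dt=1.0, q=0.01, r=1.0):
    """One predict/update step for the state (mileage s, speed v).

    z is the measured mileage, q the acceleration noise variance,
    r the measurement noise variance (both assumed values).
    """
    s, v = state
    (p00, p01), (p10, p11) = cov
    # Predict: s' = s + v*dt, v' = v, covariance F P F^T + Q
    s = s + v * dt
    a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q * dt ** 4 / 4
    a01 = p01 + dt * p11 + q * dt ** 3 / 2
    a10 = p10 + dt * p11 + q * dt ** 3 / 2
    a11 = p11 + q * dt ** 2
    # Update with the mileage measurement (H = [1, 0])
    innovation = z - s
    s_var = a00 + r
    k0 = a00 / s_var
    k1 = a10 / s_var
    s = s + k0 * innovation
    v = v + k1 * innovation
    cov = ((a00 - k0 * a00, a01 - k0 * a01),
           (a10 - k1 * a00, a11 - k1 * a01))
    return (s, v), cov


def estimate(mileage_samples, dt=1.0):
    """Run the filter over a list of mileage samples, return (mileage, speed)."""
    state = (mileage_samples[0], 0.0)
    cov = ((1.0, 0.0), (0.0, 100.0))   # assumed initial uncertainty
    for z in mileage_samples[1:]:
        state, cov = kalman_step(state, cov, z, dt)
    return state


# Synthetic example: constant 20 m/s, one sample per second.
samples = [20.0 * k for k in range(51)]
print(estimate(samples))
```

The filtered state can then be stored at a coarser time resolution than the raw records, which is one way to reduce the data volume.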