
Tcobs

🗜 Compression with elimination of zeroes ⓿, optimized for data with a bit more 00 and FF bytes, as messages often carry 16, 32 or 64 bit numbers with small values.


TCOBS v1 & v2

<div id="top"></div>


<!-- ABOUT THE PROJECT -->

1. <a name='AboutTheproject'></a>About The project


  • TCOBS is a variant of COBS combined with real-time RLE data compression especially for short messages containing integers.
  • The maximum overhead with TCOBS (v1 or v2) is 1 byte for each started block of 31 input bytes in the worst case, when no compression is possible. This results from the 5 chaining bits in the NOP-sigil bytes 🖇 needed for incompressible data. For an input buffer of size iz, the maximum required output buffer size is oz = iz * 32/31 + 1. (Example: a 1000-byte buffer can be encoded with at most 33 additional bytes.) This is more than the original COBS, which adds 1 byte for each started block of 254 bytes, but if the data contain integers, as communication packets often do, the encoded data will statistically be shorter with TCOBS than with legacy COBS.
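
The buffer-size rule above can be turned into a small helper. This is a minimal sketch, not part of the TCOBS API: the function name `tcobs_max_encoded_size` is hypothetical, and it assumes the formula is meant with integer (truncating) division.

```c
#include <stddef.h>

/* Worst-case output size for an input of iz bytes, following the formula
 * oz = iz * 32/31 + 1 from the text. The name is hypothetical and the use
 * of integer (truncating) division is an assumption.
 * iz + iz / 31 equals iz * 32 / 31 in integer arithmetic, but it avoids
 * overflowing the multiplication for very large iz. */
size_t tcobs_max_encoded_size(size_t iz)
{
    return iz + iz / 31 + 1;
}
```

For `iz = 1000` this yields 1033, which matches the 33 additional bytes from the example.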

1.1. <a name='Assumptions'></a>Assumptions

  • Most messages, like Trices, consist of 16 or fewer bytes.
  • Some messages or user data are longer.
  • Several zeros in a row are a common pattern (example: 00 00 00 05).
  • Several 0xFF bytes in a row are a common pattern too (example: -1 as a 32-bit value).
  • Other byte values may also occur several times in a row.
  • TCOBS does not know the inner data structure and is therefore usable on any user data.
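
The two byte patterns named in the assumptions can be reproduced with a short sketch. Both helper names (`put_u32_be`, `longest_run`) are hypothetical, and big-endian byte order is an assumption chosen to match the `00 00 00 05` example.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in big-endian byte order (the byte order is an
 * assumption made for this illustration). */
void put_u32_be(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Return the length of the longest run of `value` in buf. */
size_t longest_run(const uint8_t *buf, size_t len, uint8_t value)
{
    size_t best = 0, cur = 0;
    for (size_t i = 0; i < len; i++) {
        cur = (buf[i] == value) ? cur + 1 : 0;
        if (cur > best) {
            best = cur;
        }
    }
    return best;
}
```

Serializing the value 5 gives a run of three 00 bytes, and serializing -1 cast to `uint32_t` gives a run of four FF bytes.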
<p align="right">(<a href="#top">back to top</a>)</p>

2. <a name='Preface'></a> Preface

  • TCOBS was originally developed as an optional part of Trice, which is what the T stands for. It aims to compress the binary Trice data and frame it in one step.
    • The T also symbolizes the joining of the two orthogonal tasks, compression and framing.
    • Additionally, the usage of ternary and quaternary numbers in TCOBSv2 is reflected in the letter T.
  • TCOBSv2 improves on TCOBSv1 and is well suited to data streams with long sequences of equal bytes.
    • TCOBSv1 compression is expected to be less effective than TCOBSv2 compression.
  • The data are assumed to contain 00-bytes and FF-bytes a bit more often than other bytes.
  • The compression aims at a reasonable data reduction with minimal computing effort rather than at the absolute minimum size. The method shown here simply counts repeated bytes and transforms them into shorter sequences. It works well on very short messages of 2 or 4 bytes as well as on very long buffers. The compressed buffer no longer contains any 00-bytes, which is the aim of COBS. <!-- In the worst case, if no repeated bytes occur at all, the encoded data can be about 3% longer (1 byte per each 31 input bytes). -->
  • TCOBS can be used stand-alone in any project for package framing with data reduction.
  • The intended use cases are speed, limited bandwidth, and long-term data recording in the field.
  • TCOBS is inspired by rlercobs. The idea of the ending sigil byte comes from rCOBS. It allows straightforward encoding without lookahead, which makes the embedded device code simpler.
  • TCOBS uses various chained sigil bytes to achieve an additional lossless compression if possible.
  • Each encoded package ends with a sigil byte.
  • 0 is usable as a delimiter byte between packages, because the encoded packages no longer contain any 0. It is up to the user to insert the optional delimiters for framing after each package or after several packages.
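
The "count repeated bytes" step mentioned in the list above can be sketched as a single forward pass. This is only an illustration of run counting, not the TCOBS encoder: the real encoder maps runs to chained sigil bytes, which is not shown here, and the names `byte_run` and `scan_runs` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One run of equal bytes: the byte value and how often it repeats. */
typedef struct {
    uint8_t value;
    size_t count;
} byte_run;

/* Split buf into runs of equal bytes in one forward pass. Only the byte
 * at the current position is inspected, so no lookahead is needed.
 * Returns the number of runs written; scanning stops early when the
 * runs array (capacity cap) is full. */
size_t scan_runs(const uint8_t *buf, size_t len, byte_run *runs, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (n > 0 && runs[n - 1].value == buf[i]) {
            runs[n - 1].count++;      /* same byte again: extend the run */
        } else {
            if (n == cap) {
                return n;             /* output full: stop early */
            }
            runs[n].value = buf[i];   /* different byte: start a new run */
            runs[n].count = 1;
            n++;
        }
    }
    return n;
}
```

For the input `00 00 00 05 FF FF` this produces three runs: 00 three times, 05 once, and FF twice.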

2.1. <a name='Whynotin2steps'></a> Why not in 2 steps?

  • Usually it is better to divide this task and do compression and COBS encoding separately. This is good if size and time do not really matter.
  • Each single transformation adds a separate control byte, so a combined transformation adds just 1 byte instead of 2.
  • The messages expected for TCOBS are typically 2 to 300 bytes long, although there is no limit, and run-length encoding makes sense for real-time compression at these sizes.
  • Separating compression and COBS costs more time (2 processing loops) and does not allow squeezing out the last byte.
  • The TCOBS algorithm needs only one processing loop and is expected to yield smaller transfer packets at higher speed.
<p align="right">(<a href="#top">back to top</a>)</p>

3. <a name='DataDisruptionHandling'></a>Data Disruption Handling

  • In case of data disruption, the receiver will wait for the next 0-delimiter byte. As a result, it will get the start of one package A joined to the end of a different package Z.

    <a href="https://github.com/rokath/tcobs"> <img src="./docs/ref/COBSDataDisruption.svg" alt="Logo" width="1200" height="120"> </a>
  • For the decoder it makes no difference whether a package starts or ends with a sigil byte. Either way, such a joined package will very likely fail to decode, and the decoder will report a data disruption. A false match cannot be ruled out completely, however.

    • If the decoded data are structured, one can estimate the false match probability and increase the safety with an additional package CRC before encoding, if needed.
  • The receiver continuously calls a Read() function. The received buffer can contain several 0-delimited packages, and the receiver assumes all of them to be valid because there is no known significant time delay between package start and end.

  • If a package start was received and the next package end reception is more than ~100ms away, a data disruption is likely and the receiver should ignore these data.

    • Specify a maximum inter-byte delay inside a single package, for example ~50ms.
    • To minimize the loss in case of data disruption, each message should be TCOBS encoded and 0-byte delimited separately.
    • On the other hand, more frequent 0-byte delimiters increase the transmit overhead slightly.
  • Of course, when the receiver starts, the first buffer can contain broken TCOBS data, but we have to live with that on a PC. Even so, there is a reasonable likelihood that the data inconsistency is detected, as explained above.
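
The receiver side described above can be sketched as a routine that walks a received buffer and hands out one 0-delimited package at a time. This is a minimal sketch under stated assumptions: `next_package` is a hypothetical name, the timeout handling is not shown, and each returned package would still have to be passed to the TCOBS decoder, which is where a data disruption would be reported.

```c
#include <stddef.h>
#include <stdint.h>

/* Find the next complete package in buf, starting at *pos.
 * A package is a non-empty byte sequence terminated by a 0 delimiter.
 * On success, returns a pointer to the package, stores its length
 * (without the delimiter) in *pkg_len and advances *pos past the delimiter.
 * Returns NULL when no complete package remains; *pos then points at the
 * start of the incomplete tail, which the caller keeps for the next read. */
const uint8_t *next_package(const uint8_t *buf, size_t len,
                            size_t *pos, size_t *pkg_len)
{
    while (*pos < len) {
        size_t start = *pos;
        size_t end = start;
        while (end < len && buf[end] != 0) {
            end++;
        }
        if (end == len) {
            return NULL;              /* no delimiter yet: incomplete package */
        }
        *pos = end + 1;               /* skip the delimiter */
        if (end > start) {
            *pkg_len = end - start;
            return buf + start;
        }
        /* empty package (consecutive delimiters): keep searching */
    }
    return NULL;
}
```

A caller would loop over `next_package` until it returns NULL, decode each package, and keep the bytes from `*pos` onward for the next `Read()` result.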

<p align="right">(<a href="#top">back to top</a>)</p>

4. <a name='CurrentState'></a>Current State

  • [x] The TCOBSv1 & TCOBSv2 code is stable and ready to use without limitations.

| Property | TCOBSv1 | TCOBSv2 |
|--------------------------------------------------------|-----------------|-----------------|
| Code amount | 🟢 less | 🟡 more |
| Speed assumption (not measured yet) | 🟢 faster | 🟢 fast |
| Compression on short messages from 2 bytes length | 🟢 yes | 🟢 yes |
| Compression on messages with many equal bytes in a row | 🟡 good | 🟢 better |
| Encoding C language support | 🟢 yes | 🟢 yes |
| Decoding C language support | 🟢 yes |
