TDengine is a popular open-source data platform purpose-built for time-series data. With over 19,000 stars on GitHub and hundreds of new running instances daily, TDengine is used in over 50 countries worldwide. The company has raised $69M in venture capital, including a $47M Series B round in 2021 from Matrix Partners China, Sequoia Capital China, GGV Capital, and Index Capital.
Table of Contents
Introduction
Compression methods
TDengine implementation

Introduction

If the original data and the decompressed data are exactly the same, the compression method is considered lossless. A compression method that alters the data is considered lossy.
Enterprise-ready cloud solution, providing robust backup, multi-cloud replication, user privilege controls and behavior auditing, VPC peering, and IP whitelisting features. TDengine Cloud delivers the carrier-grade performance and stability that you need to support your business.
Simplified setup and management, dramatically reducing the tools needed to start, operate, and manage your time-series database at scale. As a managed service, TDengine Cloud saves you time by taking care of clustering, backup, and data retention on its own.
Easier data analytics and sharing, enabling you to gain insight from your data more conveniently than ever. You can quickly access data in TDengine Cloud with Python, Java, Go, Rust, and Node.js connectors; create dashboards and applications that subscribe to your topics and streams; and replicate data across your enterprise with edge-to-cloud and cloud-to-cloud synchronization.
Fast and easy data ingestion, supporting standard SQL with connectors for popular programming languages, schemaless insert protocols, and an MQTT broker with which you can send data to TDengine without writing any custom code. With TDengine Cloud, you can choose the method for writing data into your time-series database that is most convenient for you and your business scenario.
Join the community
Register at cloud.tdengine.com today for a free account and walk through a short tutorial to quickly understand the capabilities and advantages of using TDengine to unlock the power of your time-series data.
Today, TDengine, the open-source, cloud-native time-series database (TSDB) optimized for IoT, was named one of the Top 50 Most Promising Startups by tech-sector media organization The Information.
For a delete operation, we need to record the start and end of the deleted time interval as well as the version number of the delete request. As shown above, a delete request corresponds to a purple rectangle on a two-dimensional graph, and all points inside this rectangle are deleted. In TDengine 3.0, deleting time-series data appends a tuple of (st, et, version) records; at query time, the query engine performs a final merge of the written data with the recorded delete tuples to produce the result after deletion.
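As a rough sketch of this merge-on-read idea (a simplified illustration, not TDengine's actual code; the types and names below are hypothetical), the query-time merge can be thought of as filtering each written point against the recorded delete tuples:

```python
from typing import NamedTuple

class Point(NamedTuple):
    ts: int       # timestamp of the data point
    version: int  # version (write order) of the data point
    value: float

class DeleteTuple(NamedTuple):
    st: int       # start of the deleted time interval
    et: int       # end of the deleted time interval
    version: int  # version of the delete request

def merge_on_read(points, deletes):
    """Keep a point only if no delete request at or after its version covers its timestamp."""
    result = []
    for p in points:
        deleted = any(d.st <= p.ts <= d.et and d.version >= p.version
                      for d in deletes)
        if not deleted:
            result.append(p)
    return result

# Example: a delete request covering the interval [10, 20] with version 5
points = [Point(5, 1, 1.0), Point(15, 2, 2.0), Point(25, 3, 3.0)]
deletes = [DeleteTuple(10, 20, 5)]
print(merge_on_read(points, deletes))  # the point at ts=15 is filtered out
```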
The Top 50 Most Promising Startups is a yearly collection of up-and-coming enterprises in fields such as data, fintech, and cybersecurity that are considered by industry sources to have the most potential for future growth and valuation.
The TDengine Server and Client can now be run on x64 systems that are running macOS, Windows 10 and 11, Windows Server 2016 and 2019, CentOS 7.9 and 8, or Ubuntu 18 and 20, as well as ARM64 systems running macOS or CentOS. For the latest information on operating system support, see the documentation.
Scale your system with a powerful hybrid solution: with the included TDengine Data Reference, you can continue using PI Vision to view and manipulate all of your data, whether it is stored in PI or in TDengine. The TDengine PI Connector causes no disruption to your existing workflows and helps you make the best use of your PI System without having to purchase new points.
Losslessly compressed data can be completely reconstructed into the original raw data after decompression. This compression method is used in scenarios where data accuracy is essential, such as compressing entire hard drives or executable files, and it may also be used for multimedia. However, lossless methods have relatively low compression ratios. Common lossless compression methods include differential coding, run-length encoding (RLE), Huffman coding, LZW compression, and arithmetic coding.
On the other hand, with lossy compression, the compressed data cannot be reconstructed into the original data – only into an approximation of that data. This compression method is used in scenarios where data accuracy is not important, such as compressing multimedia files. However, lossy compression enables high compression ratios. Common lossy compression methods include predictive codecs, fractal compression, wavelet compression, JPEG, and MPEG.
Compression methods

Commonly used compression methods for time-series data are described as follows:
Huffman coding
This is one of the most widespread compression schemes, first described by David Huffman in his 1952 paper "A Method for the Construction of Minimum-Redundancy Codes". Huffman coding works by finding the optimal prefix code for a given alphabet. Note that each prefix code represents a symbol, and no symbol's code may be the prefix of any other symbol's code. For example, if 0 is the code for our first symbol, A, then no other symbol's code can start with 0. This property is what makes decoding of the bitstream unambiguous.
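As a rough illustration (a minimal Python sketch of the classic algorithm, not any particular database's implementation), the following builds a prefix-code table from symbol frequencies:

```python
import heapq
from collections import Counter

def huffman_codes(data: str) -> dict:
    # Each heap entry: (total frequency, tie-breaker, [(symbol, code), ...])
    heap = [(freq, i, [(sym, "")]) for i, (sym, freq) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Merge the two least frequent subtrees: prepend 0 on the left, 1 on the right.
        merged = [(s, "0" + c) for s, c in left] + [(s, "1" + c) for s, c in right]
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return dict(heap[0][2])

codes = huffman_codes("AAAABBBCCD")
print(codes)  # more frequent symbols receive shorter codes, e.g. 'A' -> '0', 'B' -> '10'
```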
Run-length encoding (RLE)
A lossless data compression technique, independent of the nature of the data, that replaces runs of repeated values with a count followed by a single copy of the value. For example, the string "AAAABBBCCDEEEE" consists of 4 As, 3 Bs, 2 Cs, 1 D, and 4 Es; after RLE, it can be compressed to 4A3B2C1D4E. Its advantages are simplicity, speed, and the ability to compress long runs of repeated data into very small units. Its disadvantage is just as obvious: data with little repetition compresses poorly.
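A minimal RLE sketch in Python, using the textual form shown above (illustration only; a real codec would operate on binary values and encode counts compactly):

```python
from itertools import groupby

def rle_encode(data: str) -> str:
    """Run-length encode a string: 'AAAABBBCCDEEEE' -> '4A3B2C1D4E'."""
    return "".join(f"{len(list(group))}{symbol}" for symbol, group in groupby(data))

print(rle_encode("AAAABBBCCDEEEE"))  # 4A3B2C1D4E
```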
XOR
This algorithm is designed around the characteristics of floating-point values stored in the IEEE 754 format. The first value is stored uncompressed, and each subsequent value is XORed with the preceding value. If the result is zero (the two values are equal), only a single 0 bit is stored; otherwise, the XOR result itself is stored. The algorithm is strongly affected by data fluctuation: the more severe the fluctuation, the worse the compression.
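A minimal sketch of the XOR step in Python (illustration only; a production codec would additionally bit-pack each result by recording its leading and trailing zero counts rather than storing full 64-bit words):

```python
import struct

def xor_encode(values):
    """XOR each IEEE 754 double with its predecessor; the first value is kept as-is."""
    bits = [struct.unpack(">Q", struct.pack(">d", v))[0] for v in values]
    encoded = [bits[0]]  # first value stored uncompressed
    for prev, cur in zip(bits, bits[1:]):
        encoded.append(prev ^ cur)  # zero when consecutive values are identical
    return encoded

print([hex(x) for x in xor_encode([3.0, 3.0, 3.5])])
# ['0x4008000000000000', '0x0', '0x4000000000000']
```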
Delta
Differential encoding is also called incremental or delta encoding. The first value is left unchanged, and each subsequent value is converted into its difference (delta) from the previous value. The principle is similar to XOR in that both compute the difference between adjacent values. This technique is widely used, for example wherever you need to track the history of changes to a file (version control systems such as Git). However, it is rarely used alone in time-series databases; it is generally combined with RLE, Simple8b, or Zig-zag encoding for better compression.
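A minimal delta-encoding sketch in Python (illustration only), using the same sequence as the delta-of-delta example below:

```python
def delta_encode(values):
    """Store the first value as-is, then each value's difference from its predecessor."""
    return values[:1] + [cur - prev for prev, cur in zip(values, values[1:])]

print(delta_encode([2, 4, 4, 6, 8]))  # [2, 2, 0, 2, 2]
```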
Delta-of-delta

Also known as second-order differential encoding, this applies delta encoding again on top of delta encoding and is well suited to monotonically increasing or decreasing sequences. For example, 2, 4, 4, 6, 8 becomes 2, 2, 0, 2, 2 after delta encoding and 2, 0, -2, 2, 0 after delta-of-delta encoding. It is usually also combined with RLE, Simple8b, or Zig-zag.
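Continuing the sketch above, delta-of-delta simply applies the same transform twice (again, illustration only):

```python
def delta_of_delta_encode(values):
    """Apply delta encoding twice: 2,4,4,6,8 -> 2,2,0,2,2 -> 2,0,-2,2,0."""
    deltas = values[:1] + [cur - prev for prev, cur in zip(values, values[1:])]
    return deltas[:1] + [cur - prev for prev, cur in zip(deltas, deltas[1:])]

print(delta_of_delta_encode([2, 4, 4, 6, 8]))  # [2, 0, -2, 2, 0]
```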