The Top 50 Most Promising Startups is an annual list of up-and-coming companies in fields such as data, fintech, and cybersecurity that industry sources consider to have the most potential for future growth and valuation.

Today TDengine, the open-source, cloud-native time-series database (TSDB) optimized for IoT, was named one of the Top 50 Most Promising Startups by the tech-sector media organization The Information.

TDengine is a popular open-source data platform purpose-built for time-series data. With over 19,000 stars on GitHub and hundreds of new instances deployed daily, TDengine is used in over 50 countries worldwide. The company has raised $69M in venture capital, including a $47M Series B round in 2021 from Matrix Partners China, Sequoia Capital China, GGV Capital, and Index Capital.

Time-series data storage engine
Update and deletion of time-series data
In TDengine 2.0, the update and delete functions were added after the storage engine had already been developed, so they are relatively simple and limited. Updates in 2.0 work on the distribution of timestamps along the time axis: updating a row means appending a new record with the same timestamp after the existing one. The query engine then merges these out-of-order records to produce the updated result. The implementation of delete is even simpler and resembles physical deletion: the data to be deleted is removed directly from memory and disk, which is relatively inefficient.

The advantage of this approach is that the query engine can be reused, which makes it simple and easy to develop. The disadvantages are also obvious: not only is real-time computation poor and computing power wasted on merging at query time, but, more importantly, out-of-order data cannot be handled properly. Data is sorted by timestamp after it enters the TSDB; when an out-of-order record arrives that belongs before existing data, the engine cannot place it correctly, which leads to disorder. In addition, because the query engine is reused, it does not process the newly arrived out-of-order data specially, so query results can change.
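
To illustrate the merge-on-read behavior described above, here is a small conceptual sketch in Python (not TDengine source code; all names are invented for illustration) of how appended records with duplicate timestamps and delete markers could be resolved at query time:

# Conceptual sketch only: records are appended in arrival order, and duplicate
# timestamps and delete markers are resolved when the data is read.
from typing import List, Optional, Tuple

Record = Tuple[int, Optional[float]]  # (timestamp, value); value=None marks a delete

def merge_on_read(appended: List[Record]) -> List[Tuple[int, float]]:
    latest = {}  # timestamp -> last value written for that timestamp
    for ts, value in appended:
        latest[ts] = value  # a later record with the same timestamp overrides the earlier one
    # Drop deleted timestamps and return the rows sorted by timestamp
    return sorted((ts, v) for ts, v in latest.items() if v is not None)

# Arrival order: an out-of-order row (ts=100), an update of ts=200, and a delete of ts=300
rows = [(200, 1.0), (300, 2.0), (100, 0.5), (200, 1.5), (300, None)]
print(merge_on_read(rows))  # [(100, 0.5), (200, 1.5)]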

Table of Contents
Introduction
Compression methods
TDengine implementation
Introduction
If the original data and decompressed data are exactly the same, the compression method can be considered lossless. A compression method that alters data is considered lossy.

"I’m thrilled to see that TDengine has been selected as one of the most promising startups," said Jeff Tao, founder and core developer of TDengine. "This is another indication that industry insiders understand the potential of TDengine to power modern data workflows in IoT and other rapidly growing markets. I’m confident that TDengine is well positioned to continue making technical as well as business breakthroughs going forward."

The first example is a computation with scalar functions partitioned by tbname, in which each vnode in the source database is processed individually and distributed to the target database, where the stream task writes the data to the corresponding subtable.

The subtables in the supertable are aggregated individually in a session window:

CREATE STREAM current_stream
TRIGGER WINDOW_CLOSE
WATERMARK 60s
INTO current_stream_output_stb AS
SELECT
    _wstart AS start,
    _wend AS end,
    avg(current) AS avg_current
FROM meters
PARTITION BY TBNAME
SESSION(ts, 30s);
The third example is the session window. In addition to tumbling windows and sliding (hop) windows, streams support session windows and state windows, which are defined exactly as in normal TDengine queries. Here again we use the PARTITION BY TBNAME clause, which indicates that each subtable computes its session windows independently and writes the results to the destination table.

SELECT ts, pm25 - DIFF(pm25), co - DIFF(co), no2 - DIFF(no2), windspeed - DIFF(windspeed), pm25, co, no2, windspeed FROM weather.p1;
Using the Python connector, you can load the results of any of the above queries into a pandas DataFrame.
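
For example, here is a minimal sketch using the taospy connector's DB-API cursor (the connection parameters and database name below are placeholders, not values from the article):

import pandas as pd
import taos  # TDengine Python connector (taospy)

# Placeholder connection parameters; adjust for your deployment.
conn = taos.connect(host="localhost", user="root", password="taosdata", database="weather")
cursor = conn.cursor()
cursor.execute("SELECT ts, pm25, co, no2, windspeed FROM weather.p1")  # or any of the queries above
columns = [col[0] for col in cursor.description]  # column names from the DB-API metadata
df = pd.DataFrame(cursor.fetchall(), columns=columns)
print(df.head())
cursor.close()
conn.close()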

Losslessly compressed data can be completely reconstructed into the original raw data after decompression. This compression method is used in scenarios where data accuracy is important, such as compressing entire hard drives or executable files, and it may also be used for multimedia compression. However, lossless methods have relatively low compression ratios. Common lossless compression methods include differential coding, run-length encoding (RLE), Huffman coding, LZW compression, and arithmetic coding.
With lossy compression, on the other hand, the compressed data cannot be reconstructed into the original data, only into an approximation of it. This compression method is used in scenarios where perfect accuracy is not required, such as compressing multimedia files. However, lossy compression enables much higher compression ratios. Common lossy compression methods include predictive codecs, fractal compression, wavelet compression, JPEG, and MPEG.
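
As a toy illustration of two of the lossless methods named above (differential coding and run-length encoding, not TDengine's actual implementation), the sketch below delta-encodes a column of regularly spaced timestamps and then run-length encodes the result:

def delta_encode(values):
    # Keep the first value, then store only the difference from the previous value.
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def run_length_encode(values):
    # Collapse runs of identical values into [value, count] pairs.
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded

timestamps = [1000, 2000, 3000, 4000, 5000]  # collected at a fixed interval
deltas = delta_encode(timestamps)            # [1000, 1000, 1000, 1000, 1000]
print(run_length_encode(deltas))             # [[1000, 5]]: the whole column compresses to one pair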
Compression methods
Commonly used compression methods for time series data are described as follows:
