A comprehensive understanding of the operational principles of infrastructures such as IoTeX, DePHY, and peaq

Geek Web3
2024-03-18 21:32:49
Understanding the working principles of DePIN and Web3 IoT projects from the perspective of protocol design.

Written by: Pika, Sui Public Chain Ambassador, DePIN Researcher

Edited by: Faust, Geek Web3

Introduction: Although the DePIN track is currently very popular, there are still technical barriers to integrating DePIN-related IoT devices into the blockchain on a large scale. Generally speaking, connecting IoT hardware to the blockchain involves the following three key stages:

  1. Trusted operation of hardware devices;

  2. Collecting, verifying, and providing data;

  3. Distributing data to different applications.

Different attack scenarios and countermeasures exist in these three stages, requiring the introduction of various mechanism designs. This article reviews and analyzes the entire process of IoT devices generating trusted data, verifying and storing data, producing proofs through computation, and rolling up data to the blockchain from the perspective of project workflow and protocol design. If you are an entrepreneur in the DePIN track, I hope this article can help your project development in terms of methodology and technical design.

In the following text, we take the air quality monitoring scenario as an example, analyzing the three DePIN infrastructures: IoTeX, DePHY, and peaq, to explain how DePIN infrastructures work. Such infrastructure platforms can connect IoT devices with blockchain/Web3 facilities, helping project parties quickly launch DePIN application projects.

Trusted Operation of Hardware Devices

The trusted operation of hardware devices covers two things: trust in the device's identity, and verifiable, tamper-proof execution of the programs running on it.

Basic Working Model of DePIN

In most incentive schemes of DePIN projects, operators of hardware devices provide services externally, using this as leverage to request rewards from the incentive system. For example, in Helium, network hotspot devices obtain HNT rewards by providing signal coverage. However, before obtaining incentives from the system, DePIN devices need to present evidence to prove that they have indeed exerted a certain "effort" as required.

This type of proof, used to demonstrate that one has provided a certain service or engaged in certain activities in the real world, is called Proof of Physical Work (PoPW). In the protocol design of DePIN projects, physical work proof plays a crucial role, and various attack scenarios and corresponding countermeasures exist.

DePIN projects rely on the blockchain to complete incentive distribution and token allocation. Similar to the public-private key system in traditional public chains, the identity verification process of DePIN devices also requires the use of public and private keys. The private key is used to generate and sign the "proof of physical work," while the public key is used externally to verify the aforementioned proof or serves as the device's identity tag (Device ID).
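The sign-then-verify flow described above can be sketched with nothing but a hash function. The example below is a minimal illustration, assuming a Lamport one-time signature as a stand-in for whatever scheme a real device would use (typically ECDSA or Ed25519); the payload fields and the `device_id` helper are hypothetical and do not come from any of the projects discussed here.

```python
import hashlib
import json
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Lamport one-time key pair: 256 pairs of secrets and the hashes of those secrets."""
    private_key = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public_key = [(_h(a), _h(b)) for a, b in private_key]
    return private_key, public_key


def _bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private_key, message: bytes):
    """Reveal one secret per digest bit. A Lamport key must sign only one message."""
    return [private_key[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public_key, message: bytes, signature) -> bool:
    """Anyone holding the public key can check the proof without the private key."""
    return all(
        _h(sig) == public_key[i][bit]
        for i, (bit, sig) in enumerate(zip(_bits(message), signature))
    )


def device_id(public_key) -> str:
    """Hypothetical Device ID: a short fingerprint of the public key."""
    return _h(b"".join(a + b for a, b in public_key)).hex()[:16]


if __name__ == "__main__":
    private_key, public_key = generate_keypair()
    # Illustrative proof-of-physical-work payload; field names are assumptions.
    proof = json.dumps(
        {"device": device_id(public_key), "pm25": 12.4, "ts": 1710000000},
        sort_keys=True,
    ).encode()
    signature = sign(private_key, proof)
    print(verify(public_key, proof, signature))
```

The point of the sketch is the division of roles: the private key never leaves the signer, while the public key doubles as the verifier's input and the source of the device's identity tag.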

In addition, directly receiving token incentives with the device's on-chain address is not convenient, so DePIN project parties often deploy a smart contract on-chain, which records the on-chain account addresses of different device holders, similar to a one-to-one or many-to-one relationship in a database. In this way, the token rewards that the off-chain physical devices should receive can be directly credited to the on-chain accounts of the device holders.
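The device-to-holder mapping can be pictured as a small lookup table. The following is a plain Python sketch of that bookkeeping, not contract code; the class name, method names, and addresses are hypothetical.

```python
from collections import defaultdict


class DeviceRegistry:
    """Toy model of the on-chain record that maps devices to holder accounts."""

    def __init__(self):
        self.owner_of = {}                # device_id -> holder address
        self.balances = defaultdict(int)  # holder address -> accumulated rewards

    def bind(self, device_id: str, owner_address: str) -> None:
        # Each device is bound to exactly one holder; a holder may own many devices.
        if device_id in self.owner_of:
            raise ValueError("device already bound")
        self.owner_of[device_id] = owner_address

    def credit_reward(self, device_id: str, amount: int) -> str:
        # Rewards earned by the off-chain device go to the holder's account.
        owner = self.owner_of[device_id]
        self.balances[owner] += amount
        return owner


if __name__ == "__main__":
    registry = DeviceRegistry()
    registry.bind("device-1", "0xHolderA")
    registry.bind("device-2", "0xHolderA")
    registry.credit_reward("device-1", 5)
    registry.credit_reward("device-2", 7)
    print(dict(registry.balances))
```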

Sybil Attack

Most platforms that provide incentive mechanisms will encounter Sybil attacks (sometimes rendered as "witch attacks"), where someone controls a large number of accounts or devices, or generates different identity proofs to disguise themselves as multiple individuals and claim multiple rewards. Taking the air quality monitoring example mentioned earlier, the more devices providing this service, the more rewards the system distributes. Someone can quickly generate large amounts of air quality test data and corresponding device signatures through technical means, creating a large number of physical work proofs to profit. This could lead to high inflation of the DePIN project's tokens, so such cheating must be curtailed.

Sybil resistance, when it avoids privacy-compromising methods such as KYC, typically relies on PoW or PoS. In the Bitcoin protocol, miners must expend a large amount of computational resources to obtain mining rewards, while PoS public chains require network participants to stake a significant amount of assets.

In the DePIN field, Sybil resistance can be summarized as "raising the cost of generating physical work proofs." Since the generation of physical work proofs relies on valid device identity information (a private key), raising the cost of obtaining that identity information prevents low-cost cheating that mass-produces work proofs.

To achieve this goal, a relatively effective solution is to allow DePIN device manufacturers to monopolize the generation of identity information, customizing devices and recording a unique identity tag for each device. This is akin to having the public security bureau uniformly record the identity information of all citizens, where only those verifiable in the public security bureau's database are eligible to receive government subsidies.

(Image Source: DigKey)

In the production phase, DePIN device manufacturers use a program to generate a pool of root keys over a sufficiently long period, then randomly select root keys from the pool and write them into chips using eFuse technology. eFuse (electronic fuse) is a technology for storing information in integrated circuits; once written, the information usually cannot be tampered with or erased, which provides strong security assurance.

In this production process, neither the device holder nor the manufacturer can know the device's private key or the root key. The hardware device can derive and use a working key from the root key in a TEE (Trusted Execution Environment) isolated environment, which includes the private key used for signing information and the public key used for external verification of the device's identity. Individuals or programs outside the TEE environment cannot perceive the details of the keys.
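Deriving a working key from a root key is commonly done with a key derivation function. The sketch below assumes HKDF with SHA-256 (RFC 5869) as that function, which the source does not specify, and uses an HMAC tag as a symmetric stand-in for the asymmetric signature a real device would produce. `SimulatedTEE` and the `signing-key-v1` label are hypothetical, and a Python class offers no real isolation; it only models which values are allowed to cross the boundary.

```python
import hashlib
import hmac


def hkdf_sha256(root_key: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869): extract a pseudorandom key, then expand it to `length` bytes."""
    prk = hmac.new(salt or b"\x00" * 32, root_key, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


class SimulatedTEE:
    """Models the isolation boundary: callers get signatures, never key material."""

    def __init__(self, root_key: bytes):
        self.__root_key = root_key  # stands in for the eFuse-stored root key

    def sign(self, message: bytes) -> str:
        # The working key is derived on demand and never returned to the caller.
        working_key = hkdf_sha256(self.__root_key, b"signing-key-v1")
        # Assumption: HMAC stands in for a real asymmetric signature here.
        return hmac.new(working_key, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    tee = SimulatedTEE(root_key=b"\x01" * 32)
    print(tee.sign(b'{"pm25": 12.4}'))
```

Using a distinct `info` label per purpose means one root key can yield several independent working keys without any of them revealing the root.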

In the above model, anyone who wants to obtain token incentives must purchase devices from the designated manufacturers. Sybil attackers wishing to bypass the manufacturers and generate a large number of work proofs at low cost would need to breach a manufacturer's security system and register the public keys of their self-generated keys among the network's permitted devices. It is therefore difficult to launch low-cost Sybil attacks unless a device manufacturer colludes.
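The manufacturer-controlled allowlist can be sketched as follows. This is a simplified model under stated assumptions: the class and method names are hypothetical, and a real network would keep this record in a contract or DID registry rather than in memory.

```python
class PermittedDevices:
    """Toy allowlist: only approved manufacturers may register device public keys."""

    def __init__(self, manufacturer_ids):
        self._manufacturers = set(manufacturer_ids)
        self._registered_by = {}  # device public key -> manufacturer id

    def register(self, manufacturer_id: str, device_public_key: str) -> None:
        if manufacturer_id not in self._manufacturers:
            raise PermissionError("unknown manufacturer")
        self._registered_by[device_public_key] = manufacturer_id

    def is_eligible(self, device_public_key: str) -> bool:
        # A self-generated key that no manufacturer registered earns nothing.
        return device_public_key in self._registered_by

    def manufacturer_of(self, device_public_key: str):
        # Keeping the registrant on record is what allows misuse to be traced back.
        return self._registered_by.get(device_public_key)


if __name__ == "__main__":
    network = PermittedDevices(["acme-sensors"])
    network.register("acme-sensors", "pubkey-001")
    print(network.is_eligible("pubkey-001"), network.is_eligible("pubkey-forged"))
```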

Once suspicious signs of wrongdoing by device manufacturers are detected, they can be exposed through social consensus, which often leads to the DePIN project itself suffering collateral damage. However, in most cases, device manufacturers, as the core beneficiaries of the DePIN network protocol, generally lack the motivation to commit wrongdoing, because ensuring the orderly operation of the network protocol means that the profits from selling mining machines will exceed those from DePIN mining, so they are more inclined to act ethically.

(Image Source: Pintu Academy)

If hardware devices are not uniformly supplied by centralized manufacturers, then when any device connects to the DePIN network, the system must first confirm that the device possesses the characteristics required by the protocol. For example, the system will check whether newly added devices have dedicated hardware modules, since devices without such modules usually cannot pass certification. Equipping a device with these modules costs money, which raises the cost of Sybil attacks and thereby achieves Sybil resistance. In this case, operating devices honestly is the more rational choice than attempting a Sybil attack.

Data Tampering Attacks

Suppose the system considers air quality data more valuable, and pays more for it, when the data shows greater volatility. Then every device has ample motivation to fabricate data that deliberately shows high volatility. Even devices certified by centralized manufacturers could slip in their own changes during data computation, rewriting the collected raw data.

How can we ensure that DePIN devices are honest and trustworthy and do not arbitrarily modify the collected data? This requires Trusted Firmware technology, of which TEE (Trusted Execution Environment) and SPE (Secure Processing Environment) are the best-known examples. These hardware-level technologies ensure that data is processed on the device by pre-verified programs, with no hidden changes slipped in during computation.

(Image Source: Trustonic)

In brief, a TEE (Trusted Execution Environment) is typically implemented in a processor or processor core to protect sensitive data and execute sensitive operations. Code and data inside the TEE are protected by hardware-level security against malware, malicious attacks, and unauthorized access. Hardware wallets such as Ledger and Keystone rely on comparable hardware-isolated secure chips.

Most modern chips support TEE, especially those designed for mobile devices, IoT devices, and cloud services. Generally, high-performance processors, secure chips, smartphone SoCs (System on Chip), and cloud server chips integrate TEE technology, as the applications involved often have a high demand for security.

However, not all hardware supports Trusted Firmware; some lower-end microcontrollers, sensor chips, and custom embedded chips may lack TEE support. For these low-cost chips, methods such as probe attacks can be used to obtain the identity information stored within the chip, thereby fabricating device identities and behaviors. For instance, an attacker could obtain the private key data stored on the chip and then use the private key to sign tampered or fabricated data, masquerading as data generated by the device itself.

However, probe attacks rely on specialized equipment and precise operations, as well as data analysis processes, making the attack cost too high, far exceeding the cost of directly acquiring such low-cost chips from the market. Compared to profiting from cracking and fabricating identity information of low-end devices through probe attacks, attackers would be more willing to directly purchase more low-cost devices.

Data Source Attack Scenarios

The TEE described above ensures that a hardware device produces its results truthfully, but this only proves that the data was not maliciously processed after entering the device. It cannot ensure that the data source is trustworthy before computation, a challenge similar to the one oracle protocols face.

For example, if an air quality monitor is placed near a factory emitting waste gas, but someone covers the monitor with a sealed glass jar at night, the data collected by this air quality monitor will definitely be untrue. However, such attack scenarios often yield no profit, and attackers usually have no need to do so, as it is labor-intensive and thankless. For the DePIN network protocol, as long as the device meets the honest and trustworthy computation process and exerts the required workload to meet the incentive protocol, it should theoretically receive rewards.

Solution Introduction

IoTeX

IoTeX provides the W3bStream development tool to connect IoT devices to the blockchain and Web3. The W3bStream IoT SDK includes basic components such as communication and messaging, identity and credential services, and cryptographic services.

The W3bStream IoT SDK has well-developed cryptographic functionality, with components including the PSA Crypto API, cryptographic primitives, cryptographic services, a HAL, tooling, and a Root of Trust.

With these modules, data generated by devices can be signed in a secure or less secure manner on various hardware devices and transmitted over the network to subsequent data layers for verification.

DePHY

DePHY provides DID (Device ID) authentication services on the IoT side. DID is minted by the manufacturer, and each device has exactly one corresponding DID. The metadata of the DID can be customized and may include device serial numbers, models, warranty information, etc.

For hardware devices that support TEE, the manufacturer initially generates a key pair and uses eFuse to write the key into the chip, while DePHY's DID service can help manufacturers generate a DID from the device's public key. The private key generated by the manufacturer is written only into the IoT device, and no party other than the manufacturer has ever held it.

Trusted Firmware signs messages securely and keeps the hardware-side private key confidential. So if forged device private keys are discovered in the network, it can generally be assumed that the device manufacturer acted maliciously, and the forgery can be traced back to that manufacturer. This makes trust traceable.

After purchasing a device, DePHY users can obtain the device's activation information and then call the on-chain activation contract to associate the hardware device's DID with their on-chain address, thereby connecting to the DePHY network protocol. After the IoT device undergoes the DID setup process, it can achieve bidirectional data flow between users and devices.

When users send control commands to devices through their on-chain accounts, the process is as follows:

  1. Confirm that the user has access control permissions. Since the device's access control permissions are written in metadata on the DID, permissions can be confirmed by checking the DID;

  2. Allow users and devices to establish a private channel connection to support user control of the device. In addition to Nostr relays, DePHY relayers also include peer-to-peer network nodes that can support point-to-point channels, with other nodes in the network assisting in relaying traffic. This allows users to control devices in real time off-chain.

When IoT devices send data to the blockchain, the subsequent data layer reads the device's permission status from the DID. Only registered and permitted devices, such as those registered by the manufacturer, can upload data.
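The permission check in step 1 can be sketched as a lookup against the DID's metadata. The record layout below, including the `controllers` field, is a hypothetical illustration and not DePHY's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceDID:
    did: str
    owner: str                                    # on-chain address bound at activation
    metadata: dict = field(default_factory=dict)  # manufacturer-defined, customizable


def can_control(did_record: DeviceDID, user_address: str) -> bool:
    """A command is allowed only if the DID names the sender as owner or controller."""
    controllers = did_record.metadata.get("controllers", [])  # hypothetical field
    return user_address == did_record.owner or user_address in controllers


if __name__ == "__main__":
    record = DeviceDID(
        did="did:example:sensor-42",
        owner="0xOwner",
        metadata={"model": "AQ-1", "controllers": ["0xFamilyMember"]},
    )
    for sender in ("0xOwner", "0xFamilyMember", "0xStranger"):
        print(sender, can_control(record, sender))
```

Only after this check passes would the relayer go on to open the private channel described in step 2.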

Another interesting feature of this DID service is that it provides authentication for the functional characteristics (traits) of IoT devices. This certification can identify whether IoT hardware devices possess certain specific functions, qualifying them to participate in incentive activities of specific blockchain networks. For example, a WiFi transmitter can be considered to provide wireless network connectivity if it is identified as having LoRaWAN functionality (trait), thus allowing participation in the Helium network. Similar traits include GPS trait, TEE trait, etc.
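Trait-based eligibility amounts to a subset test: a device qualifies for a network when its certified traits include everything that network requires. The network names and requirement sets below are invented for illustration; only the trait names (LoRaWAN, GPS, TEE) come from the text.

```python
# Hypothetical requirement table: network name -> traits a device must have.
REQUIRED_TRAITS = {
    "lorawan-coverage-network": {"LoRaWAN"},
    "air-quality-network": {"GPS", "TEE"},
}


def eligible_networks(device_traits) -> list:
    """Return the networks whose required traits are all present on the device."""
    traits = set(device_traits)
    return sorted(
        name for name, required in REQUIRED_TRAITS.items() if required <= traits
    )


if __name__ == "__main__":
    print(eligible_networks({"LoRaWAN", "GPS"}))
    print(eligible_networks({"GPS", "TEE", "LoRaWAN"}))
```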

In terms of expanding services, DePHY's DID also supports participation in staking, linking programmable wallets, etc., facilitating participation in on-chain activities.

peaq

peaq's solution is quite unique, dividing its approach into three levels: device-originated authentication, pattern recognition verification, and oracle-based authentication.

1. Device-originated authentication. peaq also provides functions to generate key pairs, sign information using the private key on the device, and bind the device address (peaq ID) to the user address. However, their open-source code lacks an implementation of Trusted Firmware functionality. Simply signing device information with a private key guarantees neither the integrity of device operation nor that the data has not been tampered with. peaq resembles an optimistic Rollup, assuming devices will not act maliciously and then verifying the trustworthiness of data in subsequent stages.

2. Pattern recognition verification. The second approach combines machine learning and pattern recognition. By learning from previous data to obtain a model, when new data is input, it is compared with the previous model to determine its trustworthiness. However, statistical models can only identify anomalous data and cannot determine whether IoT devices are operating honestly.

For example, an air quality monitor placed in a basement in City A may produce data that differs from other air quality monitors, but this does not necessarily indicate data fabrication; the device may still be operating honestly. On the other hand, as long as the rewards are substantial enough, hackers may also be willing to use methods like GAN to generate data that is difficult for machine learning to identify, especially when the discriminative model is publicly shared.
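The limitation described above is easy to see with the simplest possible statistical model. The sketch below uses a z-score against a device's own history as a stand-in for the machine learning model; the threshold of 3 standard deviations and the sample readings are illustrative assumptions, not peaq's method.

```python
import statistics


def anomaly_score(history, reading: float) -> float:
    """How many sample standard deviations the new reading lies from the historical mean."""
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return 0.0 if reading == mean else float("inf")
    return abs(reading - mean) / stdev


def is_anomalous(history, reading: float, threshold: float = 3.0) -> bool:
    # Flags unusual data only; it says nothing about whether the device was honest.
    return anomaly_score(history, reading) > threshold


if __name__ == "__main__":
    history = [10, 11, 9, 10, 12, 10, 11, 9]  # made-up PM2.5 readings
    print(is_anomalous(history, 10.5))
    print(is_anomalous(history, 40))
```

An honest monitor in an unusual location would be flagged by such a check, while fabricated data that stays close to the learned distribution would pass, which is exactly the weakness noted above.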

3. Oracle-based authentication. The third approach involves selecting more trusted data sources as oracles to compare and verify data collected from other DePIN devices. For instance, if a precise air quality monitor is deployed in City A, data collected from other air quality monitors that deviates too much will be deemed untrustworthy.

This method introduces and relies on authority for the blockchain, but it may also lead to sampling biases in the oracle data sources, causing the entire network's data sampling to be biased.
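The oracle comparison can be sketched as a tolerance check against the reference reading. The 25% tolerance and the function name are assumptions made for illustration.

```python
def trusted_against_oracle(reading: float, oracle_reading: float,
                           max_relative_deviation: float = 0.25) -> bool:
    """Accept a device reading only if it stays close to the trusted reference."""
    if oracle_reading == 0:
        return reading == 0
    deviation = abs(reading - oracle_reading) / abs(oracle_reading)
    return deviation <= max_relative_deviation


if __name__ == "__main__":
    oracle_pm25 = 32.0  # hypothetical reading from the precise monitor in City A
    for device_pm25 in (35.0, 80.0):
        print(device_pm25, trusted_against_oracle(device_pm25, oracle_pm25))
```

Every device is judged relative to one reference value, so any bias in where or how the oracle samples propagates to the whole network, as noted above.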

Based on the current information, peaq's infrastructure cannot guarantee the trustworthiness of devices and data on the IoT side. (Note: The author reviewed peaq's official website, development documentation, GitHub repository, and a draft of the only white paper from 2018. Even after emailing the development team, no additional explanatory materials were obtained before publication.)

Data Generation and Publication (DA)

The second stage of the DePIN workflow primarily involves collecting and verifying the data transmitted by IoT devices and storing it for subsequent stages, ensuring that the data can be delivered intact and verifiably to specific recipients. This is referred to as the Data Availability layer (DA layer).

IoT devices often broadcast data and signature authentication information through protocols such as HTTP and MQTT. When the DePIN infrastructure's data layer receives information from the device side, it needs to verify the credibility of the data and aggregate and store the verified data.

MQTT (MQ Telemetry Transport) is a lightweight, open, publish/subscribe messaging protocol designed for constrained devices, such as sensors and embedded systems, that communicate over low-bandwidth and unstable networks, making it very suitable for IoT applications.

Verifying IoT device messages involves two checks: authentication of the device's trusted execution, and authentication of the message itself.

Device trusted execution authentication can be combined with TEE. TEE ensures the secure collection of data by isolating the data collection code in the device's protected area.

Another method is zero-knowledge proof, which allows the device to prove the accuracy of its data collection without revealing the underlying data details. This solution varies by device; for high-performance devices, ZKP can be generated locally, while for constrained devices, it can be generated remotely.

After authenticating the trust of the device, the message signature can be verified using DID, confirming that the message was generated by that device.
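The data layer's acceptance logic can be sketched as two gates followed by storage: is the device registered and permitted, and does the signature match the key on record. Everything named below is hypothetical, including the message layout and the registry format, and HMAC again stands in for the asymmetric signature a real DID would verify.

```python
import hashlib
import hmac
import json


def canonical(payload: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(payload, sort_keys=True).encode()


def hmac_sign(key: bytes, payload: dict) -> str:
    return hmac.new(key, canonical(payload), hashlib.sha256).hexdigest()


def hmac_verify(key: bytes, payload: dict, signature: str) -> bool:
    return hmac.compare_digest(hmac_sign(key, payload), signature)


def verify_and_store(message: dict, did_registry: dict, store: list,
                     verify_signature=hmac_verify) -> str:
    record = did_registry.get(message["did"])
    if record is None or not record.get("permitted", False):
        return "rejected: unknown or unpermitted device"
    if not verify_signature(record["key"], message["payload"], message["signature"]):
        return "rejected: bad signature"
    store.append(message["payload"])
    return "accepted"


if __name__ == "__main__":
    key = b"device-key"
    registry = {"did:example:1": {"key": key, "permitted": True}}
    payload = {"pm25": 12.4, "ts": 1710000000}
    message = {"did": "did:example:1", "payload": payload,
               "signature": hmac_sign(key, payload)}
    stored = []
    print(verify_and_store(message, registry, stored), stored)
```

The `verify_signature` parameter is injectable so that the same pipeline could run with a real public-key verifier in place of the stand-in.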

Solution Introduction

IoTeX

W3bStream divides this stage into trusted data collection, verification, data cleaning, and data storage.

  • Trusted data collection and verification utilize TEE and zero-knowledge proof methods.
  • Data cleaning refers to standardizing and normalizing the data formats uploaded from different types of devices for easier storage and processing.
  • In the data storage phase, different application projects can choose different storage systems by configuring storage adapters.
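Data cleaning in this sense can be sketched as mapping each vendor's format onto one common record. The two input formats and all field names below are invented for illustration; they are not W3bStream schemas.

```python
def normalize_reading(raw: dict) -> dict:
    """Convert a vendor-specific PM2.5 reading into one common record shape."""
    if "pm2_5_ug_m3" in raw:        # hypothetical vendor A: micrograms per cubic metre
        value = float(raw["pm2_5_ug_m3"])
    elif "pm25_mg_m3" in raw:       # hypothetical vendor B: milligrams per cubic metre
        value = float(raw["pm25_mg_m3"]) * 1000.0
    else:
        raise ValueError("unrecognized reading format")
    timestamp = raw["timestamp"] if "timestamp" in raw else raw["ts"]
    return {
        "device_id": str(raw["id"]),
        "pm25_ug_m3": round(value, 3),
        "timestamp": int(timestamp),
    }


if __name__ == "__main__":
    print(normalize_reading({"id": "a-1", "pm2_5_ug_m3": 12.4, "timestamp": 1710000000}))
    print(normalize_reading({"id": 7, "pm25_mg_m3": "0.012", "ts": "1710000000"}))
```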

In the current implementation of W3bStream, IoT devices can either send data directly to W3bStream's service endpoint, or have a server collect the data first and then forward it to W3bStream's service endpoint.

When receiving incoming data, W3bStream acts like a central distribution scheduler, distributing the incoming data to different programs for processing, while DePIN projects within the W3bStream ecosystem will apply for registration on W3bStream and define event trigger logic (Event Strategy) and processing programs (Applet).

Each IoT device has a device account belonging to one and only one project on W3bStream. Therefore, when messages from IoT devices reach the W3bStream service endpoint, they can be routed to the specific project based on the registration binding information, and the data's credibility can be verified.

The event trigger logic mentioned above defines which events can fire a trigger: data received from HTTP API endpoints, messages on MQTT topics, event records detected on the blockchain, and the blockchain reaching a given height. Each trigger is bound to a corresponding processing program that handles it.

The processing program (Applet) defines one or more execution functions, compiled into WASM format. Data cleaning and formatting can be executed through the Applet. The processed data is stored in a key-value database defined by the project.
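The routing described in this section can be sketched as two lookups: device to project, then (project, event type) to handler. This is a simplified model, not the W3bStream API; real Applets are compiled to WASM, whereas plain Python callables stand in for them here, and all names are hypothetical.

```python
class EventRouter:
    """Toy dispatcher: one project per device, one handler per (project, event type)."""

    def __init__(self):
        self.project_of_device = {}
        self.handlers = {}

    def bind_device(self, device_id: str, project: str) -> None:
        self.project_of_device[device_id] = project

    def on(self, project: str, event_type: str, applet) -> None:
        # Plays the role of an event strategy entry bound to a processing program.
        self.handlers[(project, event_type)] = applet

    def dispatch(self, device_id: str, event_type: str, payload: dict):
        project = self.project_of_device[device_id]
        applet = self.handlers.get((project, event_type))
        return None if applet is None else applet(payload)


if __name__ == "__main__":
    kv_store = {}

    def store_reading(payload):
        kv_store[payload["ts"]] = payload["pm25"]  # processed data lands in a key-value store
        return "stored"

    router = EventRouter()
    router.bind_device("sensor-1", "air-quality")
    router.on("air-quality", "mqtt:readings", store_reading)
    print(router.dispatch("sensor-1", "mqtt:readings", {"ts": 1710000000, "pm25": 12.4}))
    print(kv_store)
```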

DePHY

The DePHY project adopts a more decentralized approach to handling and providing data, which it refers to as the DePHY Message Network.

The DePHY Message Network consists of permissionless DePHY relayer nodes. IoT devices can input data through any DePHY relayer node's RPC port, and the input data will first call middleware to verify data credibility using DID.

Data that passes verification must be synchronized between relayer nodes to reach consensus. The DePHY Message Network uses the Nostr protocol to achieve this. Originally intended for building decentralized social media, Nostr turns out to be well suited to synchronizing DePIN data.

In the DePHY network, the data fragments stored by each IoT device can be organized into a Merkle tree, with nodes synchronizing the root of this Merkle tree and the tree hash among themselves. When a relayer obtains the aforementioned Merkle root and tree hash, it can quickly identify which data is still missing, facilitating retrieval from other relayers. This method can efficiently achieve consensus confirmation (Finalize).

The operation of nodes in the DePHY Message Network is permissionless; anyone can stake assets and run a DePHY network node. The more nodes there are, the higher the network's security and accessibility. DePHY nodes can receive rewards through Zero-Knowledge Contingent Payments: when an application with data indexing needs requests data from a DePHY relayer node, the amount it pays the relayer depends on a ZK proof of whether the data can be retrieved.

At the same time, anyone can connect to the DePHY network to listen to and read data. Nodes operated by project parties can set filtering rules to only store data related to their projects. By accumulating raw data, the DePHY Message Network can serve as a data availability layer for subsequent tasks.

The DePHY protocol requires relayer nodes to store the received data locally for a certain period during operation before transferring cold data to a permanent storage platform like Arweave. If all data is treated as hot data, it will ultimately raise the storage costs for nodes, increasing the operational threshold for full nodes, making it difficult for ordinary people to run full nodes.

Through the design of classifying hot and cold data processing, DePHY can significantly reduce the operational costs of full nodes in the message network, better handling massive amounts of IoT data.
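The hot and cold split can be sketched as a partition by record age. The retention window is a made-up parameter, and `archive` is a placeholder for an upload to permanent storage such as Arweave; no real storage API is called here.

```python
def split_hot_cold(records, now: int, hot_window_seconds: int):
    """Partition records into recent (hot) and older (cold) by timestamp."""
    hot = [r for r in records if now - r["ts"] <= hot_window_seconds]
    cold = [r for r in records if now - r["ts"] > hot_window_seconds]
    return hot, cold


def rotate(local_store: list, archive: list, now: int, hot_window_seconds: int) -> None:
    """Keep hot data locally and move cold data to the archive placeholder."""
    hot, cold = split_hot_cold(local_store, now, hot_window_seconds)
    archive.extend(cold)     # stand-in for an upload to permanent storage
    local_store[:] = hot     # the relayer's local disk keeps only hot data


if __name__ == "__main__":
    local = [{"ts": 100, "pm25": 9.0}, {"ts": 900, "pm25": 11.0}, {"ts": 1000, "pm25": 12.0}]
    permanent = []
    rotate(local, permanent, now=1000, hot_window_seconds=200)
    print(len(local), len(permanent))
```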

peaq

The previous two solutions execute data collection and storage off-chain and then roll up to the blockchain. This is because the volume of data generated by IoT applications is immense, and there are communication latency requirements. If DePIN transactions were executed directly on the blockchain, the data processing capacity would be limited, and storage costs would be high.

Simply waiting for node consensus leads to intolerable latency issues. However, peaq takes a different approach by building its own public chain to directly carry and execute these computations and transactions. It is developed based on Substrate, and when the mainnet goes live, the increasing number of DePIN devices will ultimately lead to performance bottlenecks for peaq, making it unable to handle such a large volume of computation and transaction requests.

Since peaq lacks Trusted Firmware functionality, it is fundamentally unable to verify data credibility effectively. As for data storage, peaq's development documentation describes how to integrate IPFS distributed storage into the Substrate-based blockchain.

Distributing Data to Different Applications

The third stage of the DePIN workflow is to extract data from the data availability layer based on the needs of blockchain applications, efficiently synchronizing the execution results to the blockchain through computation or zero-knowledge proof.

Solution Introduction

IoTeX

W3bStream refers to this stage as Data Proof Aggregation. This part of the network consists of many aggregator nodes forming a computing resource pool shared by all DePIN projects.

Each aggregator node will record its working status on the blockchain, whether busy or idle. When a computation demand from a DePIN project comes in, an idle aggregator node is selected based on the on-chain status monitor to handle it.

The selected aggregator node will first retrieve the required data from the storage layer; then, based on the needs of the DePIN project, it will perform computations on this data and generate proofs of the computation results; finally, it will send the proof results to the blockchain for verification by smart contracts. After completing the workflow, the aggregator node returns to an idle state.
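The idle and busy lifecycle of an aggregator node can be sketched as a small state table. The in-memory dictionary stands in for the on-chain status record, the "first idle node" selection rule is an assumption, and the three callables are placeholders for the storage, proving, and submission steps.

```python
class AggregatorPool:
    """Toy model of the shared pool: each node is either 'idle' or 'busy'."""

    def __init__(self, node_ids):
        self.status = {node_id: "idle" for node_id in node_ids}

    def acquire(self):
        # Pick the first idle node and mark it busy; None if every node is busy.
        for node_id, state in self.status.items():
            if state == "idle":
                self.status[node_id] = "busy"
                return node_id
        return None

    def release(self, node_id) -> None:
        self.status[node_id] = "idle"

    def run_task(self, fetch, compute, submit):
        node_id = self.acquire()
        if node_id is None:
            return None
        try:
            data = fetch()              # 1. retrieve data from the storage layer
            result = compute(data)      # 2. compute and (in reality) generate a proof
            submit(node_id, result)     # 3. send the result for on-chain verification
            return node_id
        finally:
            self.release(node_id)       # 4. return to the idle state


if __name__ == "__main__":
    pool = AggregatorPool(["node-a", "node-b"])
    submitted = []
    pool.run_task(lambda: [1, 2, 3], sum, lambda n, r: submitted.append((n, r)))
    print(submitted, pool.status)
```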

When generating proofs, the aggregator node will use a layered aggregation circuit. The layered aggregation circuit consists of four parts:

  • Data compression circuit: Similar to a Merkle tree, it verifies that all collected data comes from a specific Merkle tree root.
  • Signature batch verification circuit: It verifies the validity of data from devices in batches, with each piece of data associated with a signature.
  • DePIN computation circuit: It proves that DePIN devices correctly executed certain instructions according to specific computation logic, such as verifying steps in a healthcare project or verifying the energy produced in a solar power plant.
  • Proof aggregation circuit: It aggregates all proofs into a single proof for final verification by Layer 1 smart contracts.
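The statement that the data compression circuit proves, namely that a piece of data belongs under a given Merkle root, can be checked outside a circuit in a few lines. The sketch below is ordinary Python, not a zero-knowledge circuit, and the padding rule (duplicate the last node on odd levels) is an assumption.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_proof(leaves, index: int):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    level = [_h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_inclusion(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from the leaf upward and compare with the claimed root."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


if __name__ == "__main__":
    readings = [b"r0", b"r1", b"r2", b"r3", b"r4"]
    root, proof = merkle_proof(readings, 2)
    print(verify_inclusion(b"r2", proof, root), verify_inclusion(b"forged", proof, root))
```

The proof grows with the logarithm of the number of readings, which is why committing to a single root is an effective way to compress a large batch of device data.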

Data proof aggregation is crucial for ensuring the integrity and verifiability of computations in DePIN projects, providing a reliable and efficient method for verifying off-chain computations and data processing.

The revenue aspect of IoTeX also primarily occurs in this stage, where users can stake IOTX tokens to run aggregator nodes. The more aggregators participate, the more computational processing power is available, forming a sufficiently powerful computing layer.

DePHY

In terms of data distribution, DePHY provides co-processors to listen for finalized messages from the DePHY message network, perform state changes, and then package and compress the data for submission to the blockchain.

State changes are smart contract functions used to process messages. They are customized by each DePIN project team and may include zkVM-based or TEE-based solutions for computing on and packaging data. The DePHY team provides this part to DePIN project teams as project scaffolding for development and deployment, offering a high degree of freedom.
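A co-processor of this kind can be sketched as a fold over finalized messages followed by compression. The transition function, state layout, and use of zlib below are illustrative assumptions, not DePHY's scaffolding; a real deployment might run the transition inside a zkVM or TEE.

```python
import json
import zlib


def apply_messages(state: dict, messages, transition) -> dict:
    """Fold each finalized message into the state using a project-defined transition."""
    for message in messages:
        state = transition(state, message)
    return state


def running_total(state: dict, message: dict) -> dict:
    # Hypothetical project logic: accumulate a count and a sum of PM2.5 readings.
    return {"count": state["count"] + 1, "total": state["total"] + message["pm25"]}


def package_for_chain(state: dict) -> bytes:
    """Serialize deterministically and compress before submission."""
    return zlib.compress(json.dumps(state, sort_keys=True).encode())


if __name__ == "__main__":
    finalized = [{"pm25": 10}, {"pm25": 20}]
    state = apply_messages({"count": 0, "total": 0}, finalized, running_total)
    blob = package_for_chain(state)
    print(state, len(blob))
```

Because the transition is passed in as a parameter, each project can supply its own logic while reusing the same listening and packaging steps.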

In addition to the co-processor provided by DePHY, DePIN project parties can also connect the DA layer data to the computing layer of other infrastructures based on the project scaffolding to achieve on-chain integration.

Comprehensive Analysis

Although the DePIN track is hot, there are still technical barriers for IoT devices to be integrated on a large scale into the blockchain. This article reviews and analyzes the entire process of IoT devices generating trusted data, verifying and storing data, producing proofs through computation, and rolling up data to the blockchain from a technical implementation perspective, thereby supporting the integration of IoT devices into Web3 applications. If you are an entrepreneur in the DePIN track, I hope this article can assist you in methodology and technical design for project development.

Among the three analyzed DePIN infrastructures, peaq still fits the comments made about it online six years ago: it is just hype. Both DePHY and IoTeX have chosen to collect IoT device data off-chain and then roll it up to the chain, which connects IoT device data to the blockchain with low latency while ensuring data credibility.

DePHY and IoTeX each have their own focuses; DePHY's DID includes verification of hardware function traits and bidirectional data transmission, while the DePHY message network emphasizes a decentralized data availability layer, serving more as a loosely coupled functional module integrated with DePIN projects; IoTeX has a high level of development completeness, with a complete development workflow, focusing more on binding processing programs to different events, leaning towards the computation layer. DePIN project parties can choose different technical solutions to combine based on actual needs.
