An Overview of TON's Technical Features and Smart Contract Development Paradigms
Original source: Web3 Mario X account
Author: @Web3 Mario
Introduction: With Binance launching Notcoin, the largest game in the TON ecosystem, and the massive wealth effect triggered by its fully circulating token model, TON has attracted significant attention in a short period. After chatting with friends, I learned that TON has a relatively high technical barrier to entry and that its DApp development paradigm differs greatly from that of mainstream public chains. I therefore spent some time studying the topic and would like to share some insights. In short, TON's core design philosophy is to rebuild the traditional blockchain protocol from the bottom up, sacrificing interoperability in exchange for the highest possible concurrency and scalability.
TON's Core Design Philosophy - High Concurrency and High Scalability
It can be said that all the complex technical choices in TON stem from the pursuit of high concurrency and high scalability, which is easy to understand given its background. TON, or The Open Network, is a decentralized computing network that includes an L1 blockchain and multiple components. TON was initially developed by Nikolai Durov, co-founder of Telegram, and his team, and has since been supported and maintained by a global community of independent contributors. Its inception dates back to 2017, when the Telegram team began exploring blockchain solutions for their own use. At that time, no existing L1 blockchain could support Telegram's nine-digit user base, so they decided to design their own blockchain, originally called Telegram Open Network. To obtain the resources needed to realize TON, Telegram launched the sale of the Gram token (later renamed Toncoin) in the first quarter of 2018. In 2020, due to regulatory issues, the Telegram team withdrew from the TON project. Subsequently, a small group of open-source developers and winners of Telegram's competition took over the TON codebase, renamed the project The Open Network, and have continued to actively develop the blockchain to this day, adhering to the principles outlined in the original TON white paper.
Since TON was designed to serve as a decentralized execution environment for Telegram, it naturally faces two challenges: highly concurrent requests and massive data volumes. Even Solana, which claims the highest TPS among current chains, reaches a maximum of only about 65,000 TPS, which is clearly insufficient for the million-level TPS the Telegram ecosystem would require. Meanwhile, given Telegram's scale, the amount of data generated has long since reached astronomical levels, and in a highly redundant distributed system it is unrealistic to require every node in the network to keep a complete copy of that data.
To address these two issues, TON has made two optimizations to mainstream blockchain protocols:
By adopting the "Infinite Sharding Paradigm," TON addresses the data redundancy problem, so that the network can handle massive data while easing performance bottlenecks;
By introducing a fully parallel execution environment based on the Actor model, TON significantly raises network TPS.
Building a Blockchain - Providing Each Account with a Dedicated Account Chain Through Infinite Sharding
Currently, we know that sharding has become the mainstream solution for most blockchain protocols to enhance performance and reduce costs, and TON has taken this to the extreme by proposing the Infinite Sharding Paradigm. This paradigm allows the blockchain to dynamically increase or decrease the number of shards based on network load. This enables TON to maintain high performance while handling large-scale transactions and smart contract operations. Theoretically, TON can establish a dedicated account chain for each account and ensure consistency between these chains through certain rules.
Abstractly, there are four layers of chain structure in TON:
Account Chain: This layer is the chain formed by the series of transactions related to a specific account. Transactions can form a chain because a state machine with fixed execution rules always reaches the same result when it receives the same instructions in the same order. All blockchain-based distributed systems therefore need to order transactions into a chain, and TON is no exception. The account chain is the most basic unit in the TON network, but it is usually a virtual concept: an account chain is unlikely to exist as an independent chain of its own.
Shard Chain: In most contexts, the shard chain is the actual unit of composition in TON, which is a collection of account chains.
Work Chain: A work chain is a set of shard chains that share custom rules. For example, one could create an EVM-based work chain to run Solidity smart contracts. Theoretically, anyone in the community can create their own work chain. In practice, building one is quite a complex task: the creator must pay an (expensive) fee and obtain approval from 2/3 of the validators.
Master Chain: Finally, there is a special chain in TON called the master chain, which is responsible for providing finality for all shard chains. Once the hash of a shard chain block is included in a block of the master chain, that shard chain block and all its parent blocks are considered final, meaning they can be regarded as fixed and immutable content referenced by subsequent blocks of all shard chains.
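The finality rule described for the master chain can be illustrated with a small sketch. It is only a toy model under stated assumptions: the ShardBlock class, the SHA-256 hashing of a text payload, and the finalized_by helper are hypothetical names introduced here, not TON's actual block format.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class ShardBlock:
    """Toy shard chain block (hypothetical layout, not TON's real format)."""
    parent_hash: Optional[str]
    payload: str

    @property
    def hash(self) -> str:
        raw = f"{self.parent_hash}|{self.payload}".encode()
        return hashlib.sha256(raw).hexdigest()


def finalized_by(committed_hash: str, blocks: Dict[str, ShardBlock]) -> Set[str]:
    """Return the committed block hash plus the hashes of all its ancestors."""
    final: Set[str] = set()
    cursor: Optional[str] = committed_hash
    while cursor is not None:
        final.add(cursor)
        cursor = blocks[cursor].parent_hash
    return final


# A shard chain of four blocks.
b0 = ShardBlock(None, "genesis")
b1 = ShardBlock(b0.hash, "tx batch 1")
b2 = ShardBlock(b1.hash, "tx batch 2")
b3 = ShardBlock(b2.hash, "tx batch 3")
shard_blocks = {b.hash: b for b in (b0, b1, b2, b3)}

# Assume a master chain block has included the hash of b2 only.
final_hashes = finalized_by(b2.hash, shard_blocks)
```

Here b0, b1, and b2 become final as soon as the hash of b2 is included, while b3 stays pending until a later master chain block references it or one of its descendants.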
By adopting this paradigm, the TON network possesses the following three characteristics:
Dynamic Sharding: TON can automatically split and merge shard chains to adapt to changes in load. This means new blocks are always generated quickly, and transactions do not experience long wait times.
High Scalability: Through the Infinite Sharding Paradigm, TON can support an almost unlimited number of shards, theoretically up to 2 to the power of 60 shard chains per work chain.
Adaptability: When the load in a certain part of the network increases, that part can be subdivided into more shards to handle the increased transaction volume. Conversely, when the load decreases, shards can be merged to improve efficiency.
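The split-and-merge behaviour described above can be sketched roughly as follows. This assumes that a shard is identified by a binary prefix of the account ID and that an account belongs to the shard whose prefix it starts with; the load thresholds, the short bit strings, and the function names are illustrative assumptions rather than TON's real parameters.

```python
from typing import Dict, List


def shard_of(account_bits: str, shards: List[str]) -> str:
    """Find the shard whose binary prefix matches the account ID."""
    for prefix in shards:
        if account_bits.startswith(prefix):
            return prefix
    raise ValueError("shard prefixes do not cover this account")


def split(shards: List[str], prefix: str) -> List[str]:
    """Replace one shard with its two children (prefix + '0', prefix + '1')."""
    rest = [p for p in shards if p != prefix]
    return sorted(rest + [prefix + "0", prefix + "1"])


def merge(shards: List[str], parent: str) -> List[str]:
    """Replace two sibling shards with their common parent prefix."""
    left, right = parent + "0", parent + "1"
    rest = [p for p in shards if p not in (left, right)]
    return sorted(rest + [parent])


def rebalance(shards: List[str], load: Dict[str, int],
              split_above: int = 100, merge_below: int = 20) -> List[str]:
    """Split overloaded shards, then merge sibling pairs that are both idle."""
    result = list(shards)
    for prefix in shards:
        if load[prefix] > split_above:
            result = split(result, prefix)
    for parent in sorted({p[:-1] for p in shards if p}):
        left, right = parent + "0", parent + "1"
        both_untouched = all(s in shards and s in result for s in (left, right))
        if both_untouched and load[left] + load[right] < merge_below:
            result = merge(result, parent)
    return result


# Shard "0" is overloaded; shards "10" and "11" are nearly idle.
new_shards = rebalance(["0", "10", "11"], {"0": 150, "10": 5, "11": 3})
```

In this run the busy shard "0" is split into "00" and "01", the idle siblings "10" and "11" are merged into "1", and every account still maps to exactly one shard.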
In such a multi-chain system, the first challenge is cross-chain communication, especially given the infinite sharding capability. Once the number of shards reaches a certain scale, routing messages between chains becomes difficult. Imagine a network of 4 nodes, each maintaining an independent work chain and linked to two of the others. The links mean that each node, in addition to ordering the transactions of its own work chain, must also listen for and process state changes on the chains it is linked to, which TON implements by listening to messages in their output queues.
Suppose account A in work chain 1 wants to send a message to account C in work chain 3. This involves message routing, and in this example, there are two routing paths: work chain 1 -> work chain 2 -> work chain 3 and work chain 1 -> work chain 4 -> work chain 3.
When facing more complex situations, an efficient and low-cost routing algorithm is needed to quickly complete message communication. TON has chosen the so-called "Hypercube Routing Algorithm" to achieve cross-chain message communication routing discovery. The hypercube structure refers to a special network topology, where an n-dimensional hypercube consists of 2^n vertices, each uniquely identified by an n-bit binary number. In this structure, any two vertices that differ by only one bit in their binary representation are considered adjacent. For example, in a 3-dimensional hypercube, vertex 000 and vertex 001 are adjacent because they differ only in the last bit. The above example illustrates a 2-dimensional hypercube.
In the hypercube routing protocol, the routing process of messages from the source work chain to the target work chain is conducted by comparing the binary representations of the source and target work chain addresses. The routing algorithm finds the minimum distance between these two addresses (i.e., the number of differing bits in the binary representation) and forwards the information step by step through adjacent work chains until it reaches the target work chain. This method ensures that data packets are transmitted along the shortest path, thereby improving the communication efficiency of the network.
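The bit-comparison routing described above can be sketched in a few lines. This is a generic hypercube routing illustration, not TON's implementation; the integer vertex labels and the mapping of work chains 1 to 4 onto vertices 00, 01, 11, and 10 are assumptions made for the example.

```python
from typing import List


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two vertex labels differ."""
    return bin(a ^ b).count("1")


def hypercube_route(src: int, dst: int, dims: int) -> List[int]:
    """Walk from src to dst, fixing one differing bit per hop."""
    path = [src]
    current = src
    for bit in range(dims):
        mask = 1 << bit
        if (current ^ dst) & mask:
            current ^= mask  # move to the neighbor that differs in this bit
            path.append(current)
    return path


# 2-dimensional case: vertex 00 to vertex 11 takes two hops.
route_2d = hypercube_route(0b00, 0b11, dims=2)

# 3-dimensional case: vertex 000 to vertex 101 also takes two hops.
route_3d = hypercube_route(0b000, 0b101, dims=3)
```

Each hop changes exactly one bit, so the number of hops equals the number of differing bits, which is the shortest possible path in a hypercube. Fixing the bits in a different order gives the alternative path of the same length (00 -> 10 -> 11 in the 2-dimensional case).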
Of course, to simplify this process, TON has also proposed an optimistic technical solution. When a user can provide a valid proof of a certain routing path, typically a Merkle trie root, nodes can directly acknowledge the credibility of the message submitted by that user. This is also known as instant hypercube routing.
Thus, we can see that addresses in TON are distinctly different from those in other blockchain protocols. Most mainstream protocols use the hash of a public key generated by an elliptic curve algorithm as the address, because the address only needs to be unique and carries no routing function. In TON, the address consists of two parts, (workchain_id, account_id), where workchain_id is encoded according to the hypercube routing algorithm, which will not be elaborated here.
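As a rough illustration of the two-part address, the sketch below parses the raw textual form, assumed here to be a decimal workchain_id, a colon, and a 256-bit account_id written as 64 hex characters. The TonAddress type, the parse_raw_address helper, and the sample address are hypothetical; the user-friendly base64 form is not covered.

```python
from typing import NamedTuple


class TonAddress(NamedTuple):
    workchain_id: int
    account_id: bytes  # assumed to be 32 bytes (256 bits)


def parse_raw_address(text: str) -> TonAddress:
    """Parse 'workchain_id:account_id_hex' into its two parts."""
    workchain_text, sep, account_hex = text.partition(":")
    if not sep or len(account_hex) != 64:
        raise ValueError("expected '<workchain_id>:<64 hex characters>'")
    return TonAddress(int(workchain_text), bytes.fromhex(account_hex))


# Made-up sample address on work chain 0.
sample = parse_raw_address("0:" + "ab" * 32)
```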
Another point that may raise questions is that you might have noticed the master chain and each work chain have link relationships. So can't all cross-chain information be relayed through the master chain, similar to Cosmos? In TON's design philosophy, the master chain is only used to handle the most critical tasks, namely maintaining the finality of numerous work chains. Routing messages through the master chain is possible, but the transaction fees incurred would be quite expensive.
Finally, a brief mention of its consensus algorithm: TON adopts a BFT+PoS approach, meaning any staker has the opportunity to participate in block packaging. TON's election governance contract randomly selects a cluster of validators from all stakers at regular intervals. The selected nodes, known as validators, will package blocks using the BFT algorithm. If they package incorrect information or act maliciously, their staked tokens will be forfeited; conversely, they will receive block rewards. This is essentially a relatively common choice, so it will not be elaborated here.
Smart Contracts Based on the Actor Model and Fully Parallel Execution Environment
Another point where TON differs from mainstream blockchain protocols is its smart contract execution environment. To break through the TPS limitations of mainstream blockchain protocols, TON adopts a bottom-up design approach, using the Actor model to reconstruct smart contracts and their execution methods, enabling fully parallel execution capabilities.
We know that most mainstream blockchain protocols adopt a single-threaded serial execution environment. Taking Ethereum as an example, its execution environment, the EVM, is a state machine that takes transactions as input. Once the block-producing node has ordered the transactions by packaging them into a block, it executes them in that order through the EVM. The entire process is serial and single-threaded, meaning only one transaction can be executed at a time. The advantage of this approach is that once the transaction order is confirmed, the execution results are consistent across a widely distributed cluster. Moreover, because only one transaction executes at a time, no other transaction can modify the state being accessed during execution, which is what makes interoperability between smart contracts possible. For example, when we use Uniswap to buy ETH with USDT, the liquidity in that trading pair is a fixed value while the transaction executes, so the result can be derived from a mathematical model. If that were not the case, for instance if another LP added liquidity in the middle of a bonding curve calculation, the result would be based on outdated data, which is clearly unacceptable.
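The Uniswap example can be made concrete with a small calculation. The sketch assumes a fee-free constant-product curve (reserve_in * reserve_out stays constant) and uses made-up reserves, with ETH counted in thousandths to keep the arithmetic in integers; it is not Uniswap's actual contract code.

```python
def swap_output(reserve_in: int, reserve_out: int, amount_in: int) -> int:
    """Fee-free constant-product quote, rounded down."""
    return reserve_out * amount_in // (reserve_in + amount_in)


# Pool state when the quote is computed: 1,000,000 USDT and 500,000 milli-ETH.
quote_before = swap_output(1_000_000, 500_000, 10_000)

# Pool state if another LP doubled the liquidity in the middle of the trade.
quote_after = swap_output(2_000_000, 1_000_000, 10_000)
```

The same 10,000 USDT buys 4950 milli-ETH against the first state and 4975 against the second, so a result computed from one state and settled against the other would be wrong. Serial execution rules this out by construction.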
However, this architecture has obvious limitations, namely the TPS bottleneck, which seems outdated in the current multi-core processor environment. It's like using a modern PC to play some old computer games, such as Red Alert; when the number of combat units reaches a certain amount, you still find it lagging. This is a software architecture issue.
You may have heard that some protocols are already paying attention to this issue and have proposed their own parallel solutions. Taking Solana, which claims to have the highest TPS, as an example, it also has parallel execution capabilities. However, its design philosophy differs from that of TON. In Solana, the core idea is to divide all transactions into several groups based on execution dependency relationships, with no shared state data between different groups. This means there are no identical dependencies, allowing transactions within different groups to be executed in parallel without worrying about conflicts. However, transactions within the same group still follow the traditional serial execution method.
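The grouping idea described for Solana can be sketched with a union-find over the accounts each transaction touches. This is a simplification for illustration: it treats every access as a conflict and ignores the distinction between reads and writes, and the transaction and account names are made up.

```python
from typing import Dict, List, Set


def group_by_shared_state(tx_accounts: Dict[str, Set[str]]) -> List[List[str]]:
    """Group transactions so that no account is touched by two groups."""
    parent: Dict[str, str] = {}

    def find(account: str) -> str:
        parent.setdefault(account, account)
        while parent[account] != account:
            parent[account] = parent[parent[account]]  # path halving
            account = parent[account]
        return account

    # Accounts touched by the same transaction end up in the same set.
    for accounts in tx_accounts.values():
        first, *rest = sorted(accounts)
        for other in rest:
            parent[find(first)] = find(other)

    groups: Dict[str, List[str]] = {}
    for tx, accounts in tx_accounts.items():
        groups.setdefault(find(min(accounts)), []).append(tx)
    return sorted(groups.values())


groups = group_by_shared_state({
    "tx1": {"alice", "pool_A"},
    "tx2": {"bob", "pool_A"},
    "tx3": {"carol", "pool_B"},
    "tx4": {"dave"},
})
```

Here tx1 and tx2 both touch pool_A and must run serially within one group, while that group, tx3, and tx4 can run in parallel with each other.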
In contrast, TON completely abandons the serial execution architecture and rebuilds the execution environment on a development paradigm designed specifically for parallelism: the Actor model. First proposed by Carl Hewitt in 1973, the Actor model is a model of concurrent computation that replaces the complexity of shared state in traditional concurrent programs with message passing. An "Actor" is the basic unit of work: it has its own private state and behavior, shares no state with other Actors, and can process received messages, create new Actors, send more messages, and decide how to respond to subsequent messages. An Actor model must have the following characteristics:
Encapsulation and Independence: Each Actor is completely independent when processing messages and can handle messages in parallel without interfering with each other.
Message Passing: Actors interact solely through sending and receiving messages, and message passing is asynchronous.
Dynamic Structure: Actors can create more Actors at runtime, allowing the Actor model to scale the system as needed.
TON designs its smart contract model on this architecture: each smart contract in TON is an Actor with completely independent storage, and it does not rely on any external data. Messages sent to the same smart contract are still executed in the order in which they sit in its receiving queue, while different contracts can run in parallel, so transactions in TON can be executed efficiently in parallel without worrying about conflicts.
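A minimal sketch of the Actor idea follows: each actor has private state and a mailbox, processes its own messages in queue order, and influences other actors only by sending them messages. The Counter and Logger classes and the single-threaded round-robin scheduler are illustrative assumptions; a real runtime would process different mailboxes in parallel.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

Message = Tuple[str, object]
Outgoing = List[Tuple[str, Message]]  # (recipient name, message)


class Actor:
    def __init__(self) -> None:
        self.mailbox: Deque[Message] = deque()

    def handle(self, message: Message) -> Outgoing:
        raise NotImplementedError


class Counter(Actor):
    def __init__(self) -> None:
        super().__init__()
        self.total = 0  # private state, never read directly by other actors

    def handle(self, message: Message) -> Outgoing:
        kind, value = message
        if kind == "add":
            self.total += int(value)
            return []
        if kind == "report":  # value names the actor that wants the total
            return [(str(value), ("total", self.total))]
        return []


class Logger(Actor):
    def __init__(self) -> None:
        super().__init__()
        self.seen: List[Message] = []

    def handle(self, message: Message) -> Outgoing:
        self.seen.append(message)
        return []


def run(actors: Dict[str, Actor]) -> None:
    """Deliver messages until every mailbox is empty."""
    progressed = True
    while progressed:
        progressed = False
        for actor in actors.values():
            if actor.mailbox:
                for recipient, out in actor.handle(actor.mailbox.popleft()):
                    actors[recipient].mailbox.append(out)
                progressed = True


counter, logger = Counter(), Logger()
counter.mailbox.extend([("add", 2), ("add", 3), ("report", "logger")])
run({"counter": counter, "logger": logger})
```

The logger never touches the counter's state. It learns the total only from the message the counter chooses to send, which is the same constraint a TON contract faces with respect to other contracts.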
However, this design also brings some new implications. For DApp developers, their accustomed development paradigms will be disrupted, specifically as follows:
1. Asynchronous Calls Between Smart Contracts: In TON, it is impossible to make atomic calls to external contracts or access external contract data within a smart contract. In Solidity, for example, calling function 1 of contract A to call function 2 of contract B, or accessing certain state data through the read-only function 3 of contract C, is an atomic process executed within a single transaction, which is very straightforward. However, in TON, this cannot be achieved. Any interaction with external smart contracts will be executed asynchronously by packaging new transactions, known as internal messages initiated by smart contracts. During execution, it cannot block to obtain execution results.
For instance, if we develop a DEX following the common EVM paradigm, there is typically a unified router contract that manages trade routing, while each Pool separately manages the LP data of a specific trading pair. Suppose there are two pools, USDT-DAI and DAI-ETH. When a user wants to buy ETH directly with USDT, the router contract can call the two pools one after the other within a single transaction, completing an atomic trade. In TON, this is not so easily achieved, and a new development paradigm has to be considered. If the existing paradigm were reused, the request would consist of one external message initiated by the user and three internal messages (note that this is for illustrative purposes; in real development, even the ERC-20 paradigm would need to be redesigned).
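The message flow of such a two-hop swap can be traced with a toy simulation. The contract names, the fee-free constant-product pricing, the reserves, and the particular chain of hops (router, then the USDT-DAI pool, then the DAI-ETH pool, then the user's wallet) are assumptions made for illustration; the text above only fixes the count of one external and three internal messages.

```python
from collections import deque
from typing import Deque, Dict, List, NamedTuple


class Message(NamedTuple):
    kind: str  # "external" or "internal"
    src: str
    dst: str
    body: dict


# Hypothetical reserves: [token in, token out] for each pool.
RESERVES: Dict[str, List[int]] = {
    "pool_usdt_dai": [1_000_000, 1_000_000],
    "pool_dai_eth": [1_000_000, 500_000],
}
NEXT_HOP = {"pool_usdt_dai": "pool_dai_eth"}


def swap(pool: str, amount_in: int) -> int:
    """Fee-free constant-product swap that updates the pool's own state."""
    r_in, r_out = RESERVES[pool]
    out = r_out * amount_in // (r_in + amount_in)
    RESERVES[pool] = [r_in + amount_in, r_out - out]
    return out


def handle(msg: Message) -> List[Message]:
    """Each call models one contract processing one message in its own transaction."""
    if msg.dst == "router":
        body = {"amount_in": msg.body["amount_in"], "recipient": msg.src}
        return [Message("internal", "router", "pool_usdt_dai", body)]
    if msg.dst in RESERVES:
        out = swap(msg.dst, msg.body["amount_in"])
        recipient = msg.body["recipient"]
        if msg.dst in NEXT_HOP:
            body = {"amount_in": out, "recipient": recipient}
            return [Message("internal", msg.dst, NEXT_HOP[msg.dst], body)]
        return [Message("internal", msg.dst, recipient, {"received": out})]
    return []  # the user's wallet simply receives the funds


def run(first: Message) -> List[Message]:
    queue: Deque[Message] = deque([first])
    trace: List[Message] = []
    while queue:
        msg = queue.popleft()
        trace.append(msg)
        queue.extend(handle(msg))
    return trace


trace = run(Message("external", "user", "router", {"amount_in": 10_000}))
kinds = [m.kind for m in trace]
```

Each hop is a separate transaction, so by the time the second pool runs, the first pool's state change is already committed. Nothing in this flow is atomic.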
2. Careful consideration is needed for handling execution error situations that arise during cross-contract calls, designing corresponding bounce functions for each inter-contract call. We know that in mainstream EVM, when a transaction encounters an issue during execution, the entire transaction will be rolled back to its initial state. This is easy to understand in a serial single-threaded model. However, in TON, due to the asynchronous execution of inter-contract calls, even if an error occurs in a subsequent step, the previously successfully executed transactions have already been executed and confirmed, which can lead to problems. Therefore, TON has established a special message type called a bounce message, which allows the triggered contract to reset certain states in the triggering contract through a reserved bounce function when an error occurs in the subsequent execution process triggered by an internal message.
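The bounce pattern can be sketched as follows. The Vault and Receiver classes, the query_id bookkeeping, and the deliver helper are hypothetical; in a real network the bounced message would travel back as a separate message and be processed in a later transaction rather than inside a try/except.

```python
from typing import Dict


class Vault:
    """Sending contract: debits optimistically and restores on bounce."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.pending: Dict[int, int] = {}  # query_id -> amount in flight

    def send(self, query_id: int, amount: int) -> dict:
        self.balance -= amount
        self.pending[query_id] = amount
        return {"query_id": query_id, "amount": amount, "bounce": True}

    def on_bounce(self, msg: dict) -> None:
        # Nothing is rolled back automatically, so the state is repaired by hand.
        self.balance += self.pending.pop(msg["query_id"])


class Receiver:
    def __init__(self, accepts: bool) -> None:
        self.accepts = accepts
        self.received = 0

    def on_message(self, msg: dict) -> None:
        if not self.accepts:
            raise RuntimeError("receiver rejected the message")
        self.received += msg["amount"]


def deliver(sender: Vault, receiver: Receiver, msg: dict) -> None:
    """Model of the network: a failed message with the bounce flag goes back."""
    try:
        receiver.on_message(msg)
    except RuntimeError:
        if msg["bounce"]:
            sender.on_bounce(msg)


vault = Vault(balance=100)
rejecting, accepting = Receiver(accepts=False), Receiver(accepts=True)
deliver(vault, rejecting, vault.send(query_id=1, amount=30))
balance_after_bounce = vault.balance
deliver(vault, accepting, vault.send(query_id=2, amount=30))
```

After the first delivery fails, the bounce handler restores the 30 units. The second delivery succeeds and the debit stands.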
3. In some complex situations, the first received transaction may not necessarily be completed first, so such temporal relationships cannot be assumed. In a system with asynchronous and parallel smart contract calls, defining the order of operations can be challenging. This is why each message in TON has its logical time, Lamport time (abbreviated as lt). It is used to understand which event triggered another and what validators need to process first. For a simple model, the first received transaction will definitely be executed first.
In this model, A and B represent two smart contracts: if msg1_lt < msg2_lt, then tx1_lt < tx2_lt, which establishes the temporal relationship.
However, in more complex situations, this rule may be broken. The official documentation provides an example with three contracts A, B, and C. In a single transaction, A sends two internal messages, msg1 and msg2: one to B and the other to C. Although they are created in a definite order (msg1 first, then msg2), we cannot be sure that msg1 will be processed before msg2, because the routes from A to B and from A to C may differ in length and in validator sets. If these contracts are located on different shard chains, one of the messages may take several blocks to reach its target contract. Thus, there are two possible outcomes: the message to B is processed first, or the message to C is processed first.
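The two outcomes can be reproduced with a toy model in which each message carries the logical time at which it was created and a route length in hops. The Msg type and the rule that fewer hops means earlier processing are simplifying assumptions made for illustration, not TON's delivery algorithm.

```python
from typing import List, NamedTuple


class Msg(NamedTuple):
    name: str
    dst: str
    created_lt: int  # logical time assigned when the message was created
    hops: int        # assumed route length to the destination


def processing_order(messages: List[Msg]) -> List[str]:
    """Shorter routes are processed first; ties fall back to creation order."""
    ranked = sorted(messages, key=lambda m: (m.hops, m.created_lt))
    return [m.name for m in ranked]


# A creates msg1 (to B) before msg2 (to C) in the same transaction.
b_is_near = processing_order([Msg("msg1", "B", 1, 1), Msg("msg2", "C", 2, 3)])
c_is_near = processing_order([Msg("msg1", "B", 1, 3), Msg("msg2", "C", 2, 1)])

# When both messages take the same route to the same contract, lt decides.
same_route = processing_order([Msg("msg2", "B", 2, 2), Msg("msg1", "B", 1, 2)])
```

Creation order alone does not decide which of B and C runs first, so contract logic should not depend on it.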
4. In TON, the persistent storage of a smart contract is a directed acyclic graph whose unit is the Cell. Data is packed compactly into a Cell according to encoding rules and extends downward through references to further Cells. This differs from the hashmap-based structure of state data in the EVM. Because the data access algorithms differ, TON charges different Gas prices for data at different depths: the deeper the Cell, the more Gas is needed to process it. This creates a DoS attack pattern in TON, in which malicious users send a large number of junk messages to occupy all the shallow Cells of a smart contract, raising storage costs for honest users. In the EVM, by contrast, hashmap lookups are O(1) and Gas costs are uniform, so this issue does not arise. TON DApp developers should therefore avoid unbounded data types in smart contracts; when an unbounded data type is unavoidable, it should be split up through sharding.
- There are also some features that are less unique: smart contracts must pay rent for storage, smart contracts in TON are inherently upgradeable, and account abstraction is native, meaning that every wallet address in TON is a smart contract that simply remains uninitialized until it is deployed. Developers need to pay close attention to these.
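To make the depth issue concrete, the sketch below models a Cell as a small bit string plus references to child Cells, using the limits of 1023 data bits and 4 references per Cell. The Cell class and the linked_list helper are illustrative assumptions and do not reproduce TON's serialization or its Gas schedule.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_BITS = 1023  # data bits per cell
MAX_REFS = 4     # references to child cells per cell


@dataclass
class Cell:
    bits: str = ""
    refs: List["Cell"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if len(self.bits) > MAX_BITS:
            raise ValueError("a cell holds at most 1023 bits")
        if len(self.refs) > MAX_REFS:
            raise ValueError("a cell holds at most 4 references")

    def depth(self) -> int:
        """Number of reference hops from this cell to its deepest descendant."""
        if not self.refs:
            return 0
        return 1 + max(child.depth() for child in self.refs)


def linked_list(n_items: int) -> Optional[Cell]:
    """Store n items as a chain: each cell holds one item and points to the rest."""
    cell: Optional[Cell] = None
    for i in range(n_items):
        cell = Cell(bits=format(i, "08b"), refs=[cell] if cell is not None else [])
    return cell


small = linked_list(5)
large = linked_list(50)
```

An unbounded structure like this one grows deeper with every item appended, which is why the text above recommends keeping per-contract data bounded.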
The above are some insights I have gained while studying TON-related technologies recently. I hope to share them with you. If there are any mistakes, please feel free to correct me. Meanwhile, I believe that with Telegram's massive traffic resources, the TON ecosystem will certainly bring some new applications to Web3. If anyone is interested in TON DApp development, feel free to contact me to discuss together.