\documentclass[12pt,oneside]{article}
\usepackage[T1]{fontenc}
%\usepackage{euler}
\usepackage{amssymb, amsmath, amsfonts, stmaryrd}
\usepackage[mathscr]{euscript}
\usepackage{mathrsfs}
\usepackage{theorem}
\usepackage[english]{babel}
\usepackage{bm}
\usepackage[all]{xy}
%\usepackage{chngcntr}
%\CompileMatrices
\usepackage[bookmarks=false,pdfauthor={Nikolai Durov},pdftitle={Telegram Open Network}]{hyperref}
\usepackage{fancyhdr}
\usepackage{caption}
%
\setlength{\headheight}{15.2pt}
\pagestyle{fancy}
\renewcommand{\headrulewidth}{0.5pt}
%
\def\makepoint#1{\medbreak\noindent{\bf #1.\ }}
\def\zeropoint{\setcounter{subsection}{-1}}
\def\zerosubpoint{\setcounter{subsubsection}{-1}}
\def\nxpoint{\refstepcounter{subsection}%
  \smallbreak\makepoint{\thesubsection}}
\def\nxsubpoint{\refstepcounter{subsubsection}%
  \smallbreak\makepoint{\thesubsubsection}}
\def\nxsubsubpoint{\refstepcounter{paragraph}%
  \makepoint{\theparagraph}}
%\setcounter{secnumdepth}{4}
%\counterwithin{paragraph}{subsubsection}
\def\refpoint#1{{\rm\textbf{\ref{#1}}}}
\let\ptref=\refpoint
\def\embt(#1.){\textbf{#1.}}
\def\embtx(#1){\textbf{#1}}
\long\def\nodo#1{}
%
%\def\markbothsame#1{\markboth{#1}{#1}}
\fancyhf{}
\fancyfoot[C]{\thepage}
\def\markbothsame#1{\fancyhead[C]{#1}}
\def\mysection#1{\section{#1}\fancyhead[C]{\textsc{Chapter \textbf{\thesection.} #1}}}
\def\mysubsection#1{\subsection{#1}\fancyhead[C]{\small{\textsc{\textrm{\thesubsection.} #1}}}}
\def\myappendix#1{\section{#1}\fancyhead[C]{\textsc{Appendix \textbf{\thesection.} #1}}}
%
\let\tp=\textit
\let\vr=\textit
\def\workchainid{\vr{workchain\_id\/}}
\def\shardpfx{\vr{shard\_prefix}}
\def\accountid{\vr{account\_id\/}}
\def\currencyid{\vr{currency\_id\/}}
\def\uint{\tp{uint}}
\def\opsc#1{\operatorname{\textsc{#1}}}
\def\blkseqno{\opsc{blk-seqno}}
\def\blkprev{\opsc{blk-prev}}
\def\blkhash{\opsc{blk-hash}}
\def\Hash{\opsc{Hash}}
\def\Sha{\opsc{sha256}}
\def\height{\opsc{height}}
\def\len{\opsc{len}}
\def\leaf{\opsc{Leaf}}
\def\node{\opsc{Node}}
\def\root{\opsc{Root}}
\def\emptyroot{\opsc{EmptyRoot}}
\def\code{\opsc{code}}
\def\Ping{\opsc{Ping}}
\def\Store{\opsc{Store}}
\def\FindNode{\opsc{Find\_Node}}
\def\FindValue{\opsc{Find\_Value}}
\def\Bytes{\tp{Bytes}}
\def\Transaction{\tp{Transaction}}
\def\Account{\tp{Account}}
\def\State{\tp{State}}
\def\Maybe{\opsc{Maybe}}
\def\List{\opsc{List}}
\def\Block{\tp{Block}}
\def\Blockchain{\tp{Blockchain}}
\def\isValidBc{\tp{isValidBc}}
\def\evtrans{\vr{ev\_trans}}
\def\evblock{\vr{ev\_block}}
\def\Hashmap{\tp{Hashmap}}
\def\Type{\tp{Type}}
\def\nat{\tp{nat\/}}
\def\hget{\vr{hget\/}}
\def\bbB{{\mathbb{B}}}
\def\st#1{{\mathbf{#1}}}
%
\hfuzz=0.8pt

\title{Telegram Open Network}
\author{Dr.\ Nikolai Durov}% a.k.a. K.O.T.
\begin{document}

%\pagestyle{myheadings}
\maketitle

\begin{abstract}
  The aim of this text is to provide a first description of the
  Telegram Open Network (TON) and related blockchain, peer-to-peer,
  distributed storage and service hosting technologies. To reduce the
  size of this document to reasonable proportions, we focus mainly on
  the unique and defining features of the TON platform that are
  important for it to achieve its stated goals.
\end{abstract}

\section*{Introduction}
\markbothsame{Introduction}

The {\em Telegram Open Network (TON)} is a fast, secure and scalable
blockchain and network project, capable of handling millions of
transactions per second if necessary, and both user-friendly and
service provider-friendly. We aim for it to be able to host all
reasonable applications currently proposed and conceived. One might
think about TON as a huge distributed supercomputer, or rather a huge
``superserver'', intended to host and provide a variety of services.

This text is not intended to be the ultimate reference with respect to
all implementation details. Some particulars are likely to change
during the development and testing phases.

\clearpage
\tableofcontents

\clearpage
\mysection{Brief Description of TON Components}\label{sect:ton.components}

The {\em Telegram Open Network (TON)} is a combination of the
following components:
\begin{itemize}
\item A flexible multi-blockchain platform ({\em TON Blockchain};
  cf.\ Chapter~\ptref{sect:blockchain}), capable of processing
  millions of transactions per second, with Turing-complete smart
  contracts, upgradable formal blockchain specifications,
  multi-cryptocurrency value transfer, support for micropayment
  channels and off-chain payment networks. {\em TON Blockchain\/}
  presents some new and unique features, such as the ``self-healing''
  vertical block\-chain mechanism (cf.~\ptref{sp:inv.sh.blk.corr}) and
  Instant Hypercube Routing (cf.~\ptref{sp:instant.hypercube}), which
  enable it to be fast, reliable, scalable and self-consistent at the
  same time.
\item A peer-to-peer network ({\em TON P2P Network}, or just {\em TON
  Network}; cf.\ Chapter~\ptref{sect:network}), used for accessing the
  TON Block\-chain, sending transaction candidates, and receiving
  updates about only those parts of the blockchain a client is
  interested in (e.g., those related to the client's accounts and
  smart contracts), but also able to support arbitrary distributed
  services, blockchain-related or not.
\item A distributed file storage technology ({\em TON Storage};
  cf.~\ptref{sp:ex.ton.storage}), accessible through {\em TON
    Network}, used by the TON Blockchain to store archive copies of
  blocks and status data (snapshots), but also available for storing
  arbitrary files for users or other services running on the platform,
  with torrent-like access technology.
\item A network proxy/anonymizer layer ({\em TON Proxy};
  cf.~\ptref{sp:ex.ton.proxy} and~\ptref{sp:tunnels}), similar to
  $I^2P$ (the Invisible Internet Project), used to hide the identity
  and IP addresses of {\em TON Network\/} nodes if necessary (e.g.,
  nodes committing transactions from accounts with large amounts of
  cryptocurrency, or high-stake blockchain validator nodes that wish
  to hide their exact IP address and geographical location as a
  measure against DDoS attacks).
\item A Kademlia-like distributed hash table ({\em TON DHT};
  cf.~\ptref{sect:kademlia}), used as a ``torrent tracker'' for {\em
    TON Storage} (cf.~\ptref{sp:distr.torr.tr}), as an ``input tunnel
  locator'' for {\em TON Proxy\/} (cf.~\ptref{sp:loc.abs.addr}), and
  as a service locator for {\em TON Services}
  (cf.~\ptref{sp:loc.serv}).
\item A platform for arbitrary services ({\em TON Services};
  cf.\ Chapter~\ptref{sect:services}), residing in and available
  through {\em TON Network\/} and {\em TON Proxy}, with formalized
  interfaces (cf.~\ptref{sp:pub.int.smartc}) enabling browser-like or
  smartphone application interaction. These formal interfaces and
  persistent service entry points can be published in the TON
  Blockchain (cf.~\ptref{sp:ui.ton.dns}); actual nodes providing
  service at any given moment can be looked up through the {\em TON
    DHT\/} starting from information published in the TON Blockchain
  (cf.~\ptref{sp:loc.serv}). Services may create smart contracts in
  the TON Blockchain to offer some guarantees to their clients
  (cf.~\ptref{sp:mixed.serv}).
\item {\em TON DNS\/} (cf.~\ptref{sp:ton.dns}), a service for
  assigning human-readable names to accounts, smart contracts,
  services and network nodes.
\item {\em TON Payments\/} (cf.\ Chapter~\ptref{sect:payments}), a
  platform for micropayments, micropayment channels and a micropayment
  channel network. It can be used for fast off-chain value transfers,
  and for paying for services powered by {\em TON Services}.
\item TON will allow easy integration with third-party messaging and
  social networking applications, thus making blockchain technologies
  and distributed services finally available and accessible to
  ordinary users (cf.~\ptref{sp:ton.www}), rather than just to a
  handful of early cryptocurrency adopters. We will provide an example
  of such an integration in another of our projects, the Telegram
  Messenger (cf.~\ptref{sp:telegram.integr}).
\end{itemize}

While the TON Blockchain is the core of the TON project, and the other
components might be considered as playing a supporting role for the
blockchain, they turn out to have useful and interesting functionality
in their own right. Combined, they allow the platform to host more
versatile applications than would be possible using the TON Blockchain
alone (cf.~\ptref{sp:blockchain.facebook}
and~\ptref{sect:ton.service.impl}).

\clearpage
\mysection{TON Blockchain}\label{sect:blockchain}

We start with a description of the Telegram Open Network (TON)
Blockchain, the core component of the project. Our approach here is
``top-down'': we give a general description of the whole first, and
then provide more detail on each component.

For simplicity, we speak here about {\em the\/} TON Blockchain, even
though in principle several instances of this blockchain protocol may
be running independently (for example, as a result of hard forks). We
consider only one of them.

\mysubsection{TON Blockchain as a Collection of 2-Blockchains}

The TON Blockchain is actually a {\em collection\/} of blockchains
(even a collection of {\em blockchains of blockchains}, or {\em
  2-blockchains}---this point will be clarified later
in~\ptref{sp:inv.sh.blk.corr}), because no single blockchain project
is capable of achieving our goal of processing millions of
transactions per second, as opposed to the now-standard dozens of
transactions per second.

\nxsubpoint\label{sp:list.blkch.typ}
\embt(List of blockchain types.) The blockchains in this collection
are:
\begin{itemize}
\item The unique {\em master blockchain}, or {\em masterchain\/} for
  short, containing general information about the protocol and the
  current values of its parameters, the set of validators and their
  stakes, the set of currently active workchains and their ``shards'',
  and, most importantly, the set of hashes of the most recent blocks
  of all workchains and shardchains.
\item Several (up to $2^{32}$) {\em working blockchains}, or {\em
  workchains\/} for short, which are actually the ``workhorses'',
  containing the value-transfer and smart-contract
  transactions. Different workchains may have different ``rules'',
  meaning different formats of account addresses, different formats of
  transactions, different virtual machines (VMs) for smart contracts,
  different basic cryptocurrencies and so on. However, they all must
  satisfy certain basic interoperability criteria to make interaction
  between different work\-chains possible and relatively simple. In
  this respect, the TON Blockchain is {\em heterogeneous\/}
  (cf.~\ptref{sp:blkch.hom.het}), similarly to the EOS
  (cf.~\ptref{sp:discuss.EOS}) and PolkaDot
  (cf.~\ptref{sp:discuss.PolkaDot}) projects.
\item Each workchain is in turn subdivided into up to $2^{60}$ {\em
  shard blockchains}, or {\em shardchains\/} for short, having the
  same rules and block format as the workchain itself, but responsible
  only for a subset of accounts, depending on the first several (most
  significant) bits of the account address. In other words, a form of
  sharding is built into the system
  (cf.~\ptref{sp:shard.supp}). Because all these shardchains share a
  common block format and rules, the TON Blockchain is {\em
    homogeneous\/} in this respect (cf.~\ptref{sp:blkch.hom.het}),
  similarly to what has been discussed in one of the Ethereum scaling
  proposals.\footnote{\url{https://github.com/ethereum/wiki/wiki/Sharding-FAQ}}
\item Each block in a shardchain (and in the masterchain) is actually
  not just a block, but a small blockchain. Normally, this ``block
  blockchain'' or ``vertical blockchain'' consists of exactly one
  block, and then we might think this is just the corresponding block
  of the shardchain (also called ``horizontal block\-chain'' in this
  situation). However, if it becomes necessary to fix incorrect
  shardchain blocks, a new block is committed into the ``vertical
  block\-chain'', containing either the replacement for the invalid
  ``horizontal blockchain'' block, or a ``block difference'',
  containing only a description of those parts of the previous version
  of this block that need to be changed. This is a TON-specific
  mechanism to replace detected invalid blocks without making a true
  fork of all shardchains involved; it will be explained in more
  detail in~\ptref{sp:inv.sh.blk.corr}. For now, we just remark that
  each shardchain (and the masterchain) is not a conventional
  blockchain, but a {\em blockchain of blockchains}, or {\em
    2D-blockchain}, or just a {\em 2-blockchain}.
\end{itemize}

\nxsubpoint\label{sp:ISP} \embt(Infinite Sharding Paradigm.)  Almost
all blockchain sharding proposals are ``top-down'': one first imagines
a single blockchain, and then discusses how to split it into several
interacting shardchains to improve performance and achieve
scalability.

The TON approach to sharding is ``bottom-up'', explained as follows.

Imagine that sharding has been taken to its extreme, so that exactly
one account or smart contract remains in each shardchain. Then we have
a huge number of ``account-chains'', each describing the state and
state transitions of only one account, and sending value-bearing
messages to each other to transfer value and information.

Of course, it is impractical to have hundreds of millions of
blockchains, with updates (i.e., new blocks) usually appearing quite
rarely in each of them. In order to implement them more efficiently,
we group these ``account-chains'' into ``shardchains'', so that each
block of the shardchain is essentially a collection of blocks of
account-chains that have been assigned to this shard. Thus the
``account-chains'' have only a purely virtual or logical existence
inside the ``shardchains''.

We call this perspective the {\em Infinite Sharding Paradigm}. It
explains many of the design decisions for the TON Blockchain.

\nxsubpoint\label{sp:msg.IHR} \embt(Messages. Instant Hypercube Routing.)
The Infinite Sharding Para\-digm instructs us to regard each account
(or smart contract) as if it were in its own shardchain by
itself. Then the only way one account might affect the state of
another is by sending a {\em message\/} to it (this is a special
instance of the so-called Actor model, with accounts as Actors;
cf.~\ptref{sp:actors}). Therefore, a system of messages between
accounts (and shardchains, because the source and destination accounts
are, generally speaking, located in different shardchains) is of
paramount importance to a scalable system such as the TON
Blockchain. In fact, a novel feature of the TON Blockchain, called
{\em Instant Hypercube Routing\/} (cf.~\ptref{sp:instant.hypercube}),
enables it to deliver and process a message created in a block of one
shardchain into the very next block of the destination shardchain,
{\em regardless of the total number of shardchains in the system.}

\nxsubpoint \embt(Quantity of masterchains, workchains and
shardchains.) A TON Blockchain contains exactly one
masterchain. However, the system can potentially accommodate up to
$2^{32}$ workchains, each subdivided into up to $2^{60}$ shardchains.
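\smallbreak
As a quick check of these bounds (our own arithmetic, not an
additional protocol limit), the two limits together cap the number of
simultaneously existing shardchains at
\[
  2^{32}\cdot 2^{60} = 2^{92} \approx 4.95\cdot 10^{27},
\]
which is comfortably above the hundreds of millions of
``account-chains'' considered in~\ptref{sp:ISP}.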

\nxsubpoint \embt(Workchains can be virtual blockchains, not true
blockchains.) Because a workchain is usually subdivided into
shardchains, the existence of the workchain is ``virtual'', meaning
that it is not a true blockchain in the sense of the general
definition provided in~\ptref{sp:gen.blkch.def} below, but just a
collection of shardchains. When only one shardchain corresponds to a
workchain, this unique shardchain may be identified with the
workchain, which in this case becomes a ``true'' blockchain, at least
for some time, thus gaining a superficial similarity to the customary
single-blockchain design. However, the Infinite Sharding Paradigm
(cf.~\ptref{sp:ISP}) tells us that this similarity is indeed
superficial: it is just a coincidence that the potentially huge number
of ``account-chains'' can temporarily be grouped into one blockchain.

\nxsubpoint \embt(Identification of workchains.)  Each workchain is
identified by its {\em number\/} or {\em workchain identifier\/}
($\workchainid:\uint_{32}$), which is simply an unsigned 32-bit
integer. Workchains are created by special transactions in the
masterchain, defining the (previously unused) workchain identifier and
the formal description of the workchain, sufficient at least for the
interaction of this workchain with other workchains and for
superficial verification of this workchain's blocks.

\nxsubpoint \embt(Creation and activation of new workchains.)  The
creation of a new workchain may be initiated by essentially any member
of the community, ready to pay the (high) masterchain transaction fees
required to publish the formal specification of a new
workchain. However, in order for the new workchain to become active, a
two-thirds consensus of validators is required, because they will need
to upgrade their software to process blocks of the new workchain, and
signal their readiness to work with the new workchain by special
masterchain transactions. The party interested in the activation of
the new workchain might provide some incentive for the validators to
support the new workchain by means of some rewards distributed by a
smart contract.

\nxsubpoint\label{sp:shard.ident} \embt(Identification of
shardchains.)  Each shardchain is identified by a pair
$(w,s)=(\workchainid, \shardpfx)$, where $\workchainid:\uint_{32}$
identifies the corresponding workchain, and
$\shardpfx:\st2^{0\ldots60}$ is a bit string of length at most 60,
defining the subset of accounts for which this shardchain is
responsible. Namely, all accounts with $\accountid$ starting with
$\shardpfx$ (i.e., having $\shardpfx$ as their most significant bits)
will be assigned to this shardchain.
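\smallbreak
For illustration only (the bit strings below are made-up values, not
taken from any real network), suppose an account in workchain $w=0$
has an $\accountid$ whose binary expansion begins with $1011\ldots$
Then the prefix rule reads as follows:
\begin{itemize}
\item $\shardpfx=\emptyset$ (the empty bit string) matches every
  account, so a shard $(0,\emptyset)$ would contain this account;
\item $\shardpfx=101$ matches, because $101$ is an initial segment of
  $1011\ldots$;
\item $\shardpfx=100$ does not match, because the third bit differs.
\end{itemize}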

\nxsubpoint \embt(Identification of account-chains.)  Recall that
account-chains have only a virtual existence
(cf.~\ptref{sp:ISP}). However, they have a natural
identifier---namely, $(\workchainid,\accountid)$---because any
account-chain contains information about the state and updates of
exactly one account (either a simple account or smart contract---the
distinction is unimportant here).

\nxsubpoint\label{sp:dyn.split.merge} \embt(Dynamic splitting and
merging of shardchains; cf.~\ptref{sect:split.merge}.)  A less
sophisticated system might use {\em static sharding}---for example, by
using the top eight bits of the $\accountid$ to select one of 256
pre-defined shards.

An important feature of the TON Blockchain is that it implements {\em
  dynamic sharding}, meaning that the number of shards is not
fixed. Instead, shard $(w,s)$ can be automatically subdivided into
shards $(w,s.0)$ and $(w,s.1)$ if some formal conditions are met
(essentially, if the transaction load on the original shard is high
enough for a prolonged period of time). Conversely, if the load stays
too low for some period of time, the shards $(w,s.0)$ and $(w,s.1)$
can be automatically merged back into shard $(w,s)$.

Initially, only one shard $(w,\emptyset)$ is created for workchain
$w$. Later, it is subdivided into more shards, if and when this
becomes necessary (cf.~\ptref{sp:split.necess}
and~\ptref{sp:merge.necess}).
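\smallbreak
A small worked example may help; the particular sequence of splits
and merges is hypothetical. Writing $s.0$ and $s.1$ for the prefix $s$
extended by one bit, as above, and starting from the single shard
$(w,\emptyset)$, one possible history is
\[
  \{(w,\emptyset)\}
  \;\to\; \{(w,0),\,(w,1)\}
  \;\to\; \{(w,0),\,(w,10),\,(w,11)\}
  \;\to\; \{(w,0),\,(w,1)\},
\]
where the first two steps are splits and the last one merges
$(w,10)$ and $(w,11)$ back into $(w,1)$. At every stage, each
$\accountid$ starts with exactly one of the listed prefixes, so each
account of workchain $w$ belongs to exactly one shard.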

\nxsubpoint\label{sp:basic.workchain} \embt(Basic workchain or
Workchain Zero.)  While up to $2^{32}$ workchains can be defined with
their specific rules and transactions, we initially define only one,
with $\workchainid=0$. This workchain, called Workchain Zero or the
basic workchain, is the one used to work with {\em TON smart
  contracts\/} and transfer {\em TON coins}, also known as {\em
  Grams\/} (cf.\ Appendix~\ref{app:coins}). Most applications are
likely to require only Workchain Zero. Shardchains of the basic
workchain will be called {\em basic shardchains}.

\nxsubpoint \embt(Block generation intervals.)  We expect a new block
to be generated in each shardchain and the masterchain approximately
once every five seconds. This will lead to reasonably small
transaction confirmation times. New blocks of all shardchains are
generated approximately simultaneously; a new block of the masterchain
is generated approximately one second later, because it must contain
the hashes of the latest blocks of all shardchains.

\nxsubpoint\label{sp:sc.hash.mc} \embt(Using the masterchain to make
workchains and shardchains tightly coupled.)  Once the hash of a block
of a shardchain is incorporated into a block of the masterchain, that
shardchain block and all its ancestors are considered ``canonical'',
meaning that they can be referenced from the subsequent blocks of all
shardchains as something fixed and immutable. In fact, each new
shardchain block contains a hash of the most recent masterchain block,
and all shardchain blocks referenced from that masterchain block are
considered immutable by the new block.

Essentially, this means that a transaction or a message committed in a
shardchain block may be safely used in the very next blocks of the
other shardchains, without needing to wait for, say, twenty
confirmations (i.e., twenty blocks generated after the original block
in the same blockchain) before forwarding a message or taking other
actions based on a previous transaction, as is common in most proposed
``loosely-coupled'' systems (cf.~\ptref{sp:blkch.interact}), such as
EOS. This ability to use transactions and messages in other
shardchains a mere five seconds after being committed is one of the
reasons we believe our ``tightly-coupled'' system, the first of its
kind, will be able to deliver unprecedented performance
(cf.~\ptref{sp:shard.supp} and~\ptref{sp:blkch.interact}).
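\smallbreak
To put rough numbers on this comparison (a back-of-the-envelope
estimate that assumes, purely for illustration, the same five-second
block interval in both designs), waiting for twenty confirmations
would delay a cross-chain message by about
\[
  20 \times 5\,\mathrm{s} = 100\,\mathrm{s},
\]
whereas the tightly-coupled design aims at a delay of roughly one
block interval, i.e., about $5\,\mathrm{s}$.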

\nxsubpoint \embt(Masterchain block hash as a global state.)
According to~\ptref{sp:sc.hash.mc}, the hash of the last masterchain
block completely determines the overall state of the system from the
perspective of an external observer. One does not need to monitor the
state of all shardchains separately.

\nxsubpoint \embt(Generation of new blocks by validators;
cf.~\ptref{sect:validators}.)  The TON Blockchain uses a
Proof-of-Stake (PoS) approach for generating new blocks in the
shardchains and the masterchain. This means that there is a set of,
say, up to a few hundred {\em validators}---special nodes that have
deposited {\em stakes\/} (large amounts of TON coins) by a special
masterchain transaction to be eligible for new block generation and
validation.

Then a smaller subset of validators is assigned to each shard $(w,s)$
in a deterministic pseudorandom way, changing approximately every 1024
blocks. This subset of validators suggests and reaches consensus on
what the next shardchain block would be, by collecting suitable
proposed transactions from the clients into new valid block
candidates. For each block, there is a pseudorandomly chosen order on
the validators to determine whose block candidate has the highest
priority to be committed at each turn.
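\smallbreak
Combined with the expected block interval of about five seconds, this
gives a rough estimate (ours, not a protocol constant) of how long a
validator subset stays assigned to a shard:
\[
  1024 \times 5\,\mathrm{s} = 5120\,\mathrm{s} \approx 85\ \text{minutes}.
\]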

Validators and other nodes check the validity of the proposed block
candidates; if a validator signs an invalid block candidate, it may be
automatically punished by losing part or all of its stake, or by being
suspended from the set of validators for some time. After that, the
validators should reach consensus on the choice of the next block,
essentially by an efficient variant of the BFT (Byzantine Fault
Tolerant; cf.~\ptref{sp:dpos.bft}) consensus protocol, similar to
PBFT~\cite{PBFT} or Honey Badger BFT~\cite{HoneyBadger}. If consensus
is reached, a new block is created, and validators divide between
themselves the transaction fees for the transactions included, plus
some newly-created (``minted'') coins.

Each validator can be elected to participate in several validator
subsets; in this case, it is expected to run all validation and
consensus algorithms in parallel.

After all new shardchain blocks are generated or a timeout is passed,
a new masterchain block is generated, including the hashes of the
latest blocks of all shardchains. This is done by BFT consensus of
{\em all\/} validators.\footnote{Actually, two-thirds by stake is
  enough to achieve consensus, but an effort is made to collect as
  many signatures as possible.}
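\smallbreak
As a toy illustration of the ``two-thirds by stake'' criterion (the
stakes are invented, and whether the threshold must be reached or
strictly exceeded is not specified here), consider four validators
with stakes $40$, $30$, $20$ and $10$, so that the total stake is
$100$. Signatures from the first two validators already carry
\[
  \frac{40+30}{100} = 0.7
\]
of the stake, which exceeds $2/3\approx 0.667$ even though they are
only half of the validators by count. The last three validators
together carry $(30+20+10)/100=0.6$, which falls short of $2/3$.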

More detail on the TON PoS approach and its economic model is
provided in section~\ptref{sect:validators}.

\nxsubpoint \embt(Forks of the masterchain.)  A complication that
arises from our tightly-coupled approach is that switching to a
different fork in the masterchain will almost necessarily require
switching to another fork in at least some of the shardchains. On the
other hand, as long as there are no forks in the masterchain, no forks
in the shardchains are even possible, because no blocks in the
alternative forks of the shardchains can become ``canonical'' by
having their hashes incorporated into a masterchain block.

The general rule is that {\em if masterchain block $B'$ is a
  predecessor of $B$, $B'$ includes hash $\Hash(B'_{w,s})$ of
  $(w,s)$-shardchain block $B'_{w,s}$, and $B$ includes hash
  $\Hash(B_{w,s})$, then $B'_{w,s}$ {\bf must} be a predecessor of
  $B_{w,s}$; otherwise, the masterchain block $B$ is invalid.}

We expect masterchain forks to be rare, next to non-existent, because
in the BFT paradigm adopted by the TON Blockchain they can happen only
in the case of incorrect behavior by a {\em majority\/} of validators
(cf.~\ptref{sp:validators} and~\ptref{sp:new.master.blk}), which would
imply significant stake losses by the offenders. Therefore, no true
forks in the shardchains should be expected. Instead, if an invalid
shardchain block is detected, it will be corrected by means of the
``vertical blockchain'' mechanism of the 2-blockchain
(cf.~\ptref{sp:inv.sh.blk.corr}), which can achieve this goal without
forking the ``horizontal blockchain'' (i.e., the shardchain). The same
mechanism can be used to fix non-fatal mistakes in the masterchain
blocks as well.

\nxsubpoint\label{sp:inv.sh.blk.corr} \embt(Correcting invalid
shardchain blocks.)  Normally, only valid shardchain blocks will be
committed, because validators assigned to the shardchain must reach a
two-thirds Byzantine consensus before a new block can be
committed. However, the system must allow for detection of previously
committed invalid blocks and their correction.

Of course, once an invalid shardchain block is found---either by a
validator (not necessarily assigned to this shardchain) or by a
``fisherman'' (any node of the system that made a certain deposit to
be able to raise questions about block validity;
cf.~\ptref{sp:fish})---the invalidity claim and its proof are
committed into the masterchain, and the validators that have signed
the invalid block are punished by losing part of their stake and/or
being temporarily suspended from the set of validators (the latter
measure is important for the case of an attacker stealing the private
signing keys of an otherwise benign validator).

However, this is not sufficient, because the overall state of the
system (TON Block\-chain) turns out to be invalid because of the
invalid shardchain block previously committed. This invalid block must
be replaced by a newer valid version.

Most systems would achieve this by ``rolling back'' to the last block
before the invalid one in this shardchain and the last blocks
unaffected by messages propagated from the invalid block in each of
the other shardchains, and creating a new fork from these blocks. This
approach has the disadvantage that a large number of otherwise correct
and committed transactions are suddenly rolled back, and it is unclear
whether they will be included later at all.

The TON Blockchain solves this problem by making each ``block'' of
each shardchain and of the masterchain (``horizontal blockchains'') a
small blockchain (``vertical blockchain'') by itself, containing
different versions of this ``block'', or their
``differences''. Normally, the vertical blockchain consists of exactly
one block, and the shardchain looks like a classical
blockchain. However, once the invalidity of a block is confirmed and
committed into a masterchain block, the ``vertical blockchain'' of the
invalid block is allowed to grow by a new block in the vertical
direction, replacing or editing the invalid block. The new block is
generated by the current validator subset for the shardchain in
question.

The rules for a new ``vertical'' block to be valid are quite
strict. In particular, if a virtual ``account-chain block''
(cf.~\ptref{sp:ISP}) contained in the invalid block is valid by
itself, it must be left unchanged by the new vertical block.

Once a new ``vertical'' block is committed on top of the invalid
block, its hash is published in a new masterchain block (or rather in
a new ``vertical'' block, lying above the original masterchain block
where the hash of the invalid shardchain block was originally
published), and the changes are propagated further to any shardchain
blocks referring to the previous version of this block (e.g., those
having received messages from the incorrect block). This is fixed by
committing new ``vertical'' blocks in vertical blockchains for all
blocks previously referring to the ``incorrect'' block; new vertical
blocks will refer to the most recent (corrected) versions
instead. Again, strict rules forbid changing account-chains that are
not really affected (i.e., that receive the same messages as in the
previous version). In this way, fixing an incorrect block generates
``ripples'' that are ultimately propagated towards the most recent
blocks of all affected shardchains; these changes are reflected in new
``vertical'' masterchain blocks as well.

Once the ``history rewriting'' ripples reach the most recent blocks,
the new shardchain blocks are generated in one version only, being
successors of the newest block versions only. This means that they
will contain references to the correct (most recent) vertical blocks
from the very beginning.

The masterchain state implicitly defines a map transforming the hash
of the first block of each ``vertical'' blockchain into the hash of
its latest version. This enables a client to identify and locate any
vertical blockchain by the hash of its very first (and usually the
only) block.

\nxsubpoint \embt(TON coins and multi-currency workchains.)  The TON
Block\-chain supports up to $2^{32}$ different ``cryptocurrencies'',
``coins'', or ``tokens'', distinguished by a 32-bit $\currencyid$. New
cryptocurrencies can be added by special transactions in the
masterchain. Each workchain has a basic cryptocurrency, and can have
several additional cryptocurrencies.

There is one special cryptocurrency with $\currencyid=0$, namely, the
{\em TON coin}, also known as the {\em Gram\/}
(cf.\ Appendix~\ref{app:coins}). It is the basic cryptocurrency of
Workchain Zero. It is also used for transaction fees and validator
stakes.

In principle, other workchains may collect transaction fees in other
tokens. In this case, some smart contract for automated conversion of
these transaction fees into Grams should be provided.

\nxsubpoint \embt(Messaging and value transfer.)  Shardchains
belonging to the same or different workchains may send {\em
  messages\/} to each other. While the exact form of the messages
allowed depends on the receiving workchain and receiving account
(smart contract), there are some common fields making inter-workchain
messaging possible. In particular, each message may have some {\em
  value} attached, in the form of a certain amount of Grams (TON
coins) and/or other registered cryptocurrencies, provided they are
declared as acceptable cryptocurrencies by the receiving workchain.

The simplest form of such messaging is a value transfer from one
(usually not a smart-contract) account to another.

\nxsubpoint\label{sp:tonvm} \embt(TON Virtual Machine.)  The {\em TON
  Virtual Machine}, also abbreviated as {\em TON VM\/} or {\em TVM\/},
is the virtual machine used to execute smart-contract code in the
masterchain and in the basic workchain. Other workchains may use other
virtual machines alongside or instead of the TVM.

Here we list some of its features. They are discussed further
in~\ptref{sp:pec.tvm}, \ptref{sp:tvm.cells} and elsewhere.

\begin{itemize}
\item TVM represents all data as a collection of {\em (TVM) cells\/}
  (cf.~\ptref{sp:tvm.cells}). Each cell contains up to 128 data bytes
  and up to 4 references to other cells. As a consequence of the
  ``everything is a bag of cells'' philosophy
  (cf.~\ptref{sp:everything.is.BoC}), this enables TVM to work with
  all data related to the TON Blockchain, including blocks and
  blockchain global state if necessary.
\item TVM can work with values of arbitrary algebraic data types
  (cf.~\ptref{sp:pec.tvm}), represented as trees or directed acyclic
  graphs of TVM cells. However, it is agnostic towards the existence
  of algebraic data types; it just works with cells.
\item TVM has built-in support for hashmaps (cf.~\ptref{sp:patricia}).
\item TVM is a stack machine. Its stack keeps either 64-bit integers
  or cell references.
\item 64-bit, 128-bit and 256-bit arithmetic is supported. All $n$-bit
  arithmetic operations come in three flavors: for unsigned integers,
  for signed integers and for integers modulo $2^n$ (no automatic
  overflow checks in the last case).
\item TVM has unsigned and signed integer conversion from $n$-bit to
  $m$-bit, for all $0\leq m,n\leq 256$, with overflow checks.
\item All arithmetic operations perform overflow checks by default,
  greatly simplifying the development of smart contracts.
\item TVM has ``multiply-then-shift'' and ``shift-then-divide''
  arithmetic operations with intermediate values computed in a larger
  integer type; this simplifies implementing fixed-point arithmetic.
\item TVM offers support for bit strings and byte strings.
\item Support for 256-bit Elliptic Curve Cryptography (ECC) for some
  predefined curves, including Curve25519, is present.
\item Support for Weil pairings on some elliptic curves, useful for
  fast implementation of zk-SNARKs, is also present.
\item Support for popular hash functions, including $\Sha$, is
  present.
\item TVM can work with Merkle proofs
  (cf.~\ptref{sp:ton.smart.pc.supp}).
\item TVM offers support for ``large'' or ``global'' smart
  contracts. Such smart contracts must be aware of sharding
  (cf.~\ptref{sp:loc.glob.smct} and \ptref{sp:tvm.data.shard}). Usual
  (local) smart contracts can be sharding-agnostic.
\item TVM supports closures.
\item A ``spineless tagless $G$-machine'' \cite{STGM} can be easily
  implemented inside TVM.
\end{itemize}
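
To illustrate why a ``multiply-then-shift'' operation with a wider
intermediate value helps with fixed-point arithmetic, here is a
minimal Python sketch. The 32-bit fractional part, the 64-bit word
size and all function names are assumptions made for this example
only; they are not TVM primitives.
\begin{verbatim}
FRAC_BITS = 32                 # assumed fixed-point format (Q32.32)
MASK64 = 2 ** 64 - 1           # assumed machine word: 64 bits

def to_fixed(x):
    """Convert a number to the assumed fixed-point format."""
    return int(round(x * 2 ** FRAC_BITS))

def mul_shift(a, b, shift=FRAC_BITS):
    """Multiply, then shift; the full product is kept in between."""
    return (a * b) >> shift

def mul_shift_truncated(a, b, shift=FRAC_BITS):
    """Same, but the product is first cut down to 64 bits."""
    return ((a * b) & MASK64) >> shift

a, b = to_fixed(3.0), to_fixed(2.5)
# The wide intermediate product gives 3.0 * 2.5 = 7.5 ...
print(mul_shift(a, b) == to_fixed(7.5))            # True
# ... while the truncated product silently yields 0.5 instead.
print(mul_shift_truncated(a, b) == to_fixed(0.5))  # True
\end{verbatim}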
Several high-level languages can be designed for TVM, in addition to
the ``TVM assembly''. All these languages will have static types and
will support algebraic data types.  We envision the following
possibilities:
\begin{itemize}
\item A Java-like imperative language, with each smart contract
  resembling a separate class.
\item A lazy functional language (think of Haskell).
\item An eager functional language (think of ML).
\end{itemize}
 
\nxsubpoint\label{sp:config.param} \embt(Configurable parameters.)  An
important feature of the TON Block\-chain is that many of its
parameters are {\em configurable}. This means that they are part of
the masterchain state, and can be changed by certain special
proposal/vote/result transactions in the masterchain, without any need
for hard forks. Changing such a parameter requires the proposal to be
supported by two-thirds of validator votes and by more than half of
the votes cast by all other participants who choose to take part in
the voting process.

\mysubsection{Generalities on Blockchains}

\nxsubpoint\label{sp:gen.blkch.def} \embt(General blockchain
definition.)  In general, any {\em (true) blockchain\/} is a sequence
of {\em blocks}, each block $B$ containing a reference $\blkprev(B)$
to the previous block (usually by including the hash of the previous
block into the header of the current block), and a list of {\em
  transactions}. Each transaction describes some transformation of the
{\em global blockchain state}; the transactions listed in a block are
applied sequentially to compute the new state starting from the old
state, which is the resulting state after the evaluation of the
previous block.

\nxsubpoint \embt(Relevance for the TON Blockchain.)  Recall that the
{\em TON Block\-chain\/} is not a true blockchain, but a
collection of 2-blockchains (i.e., of blockchains of
blockchains; cf.~\ptref{sp:list.blkch.typ}), so the above
is not directly applicable to it. However, we start with
these generalities on true blockchains to use them as
building blocks for our more sophisticated constructions.

\nxsubpoint \embt(Blockchain instance and blockchain type.)  One often
uses the word {\em blockchain\/} to denote both a general {\em
  blockchain type\/} and its specific {\em blockchain instances},
defined as sequences of blocks satisfying certain conditions. For
example, \ptref{sp:gen.blkch.def} refers to blockchain instances.

In this way, a blockchain type is usually a ``subtype'' of the type
$\Block^*$ of lists (i.e., finite sequences) of blocks, consisting of
those sequences of blocks that satisfy certain compatibility and
validity conditions:
\begin{equation}
  \Blockchain \subset \Block^*
\end{equation}

A better way to define $\Blockchain$ would be to say that
$\Blockchain$ is a {\em dependent couple type}, consisting of couples
$(\bbB,v)$, with first component $\bbB:\Block^*$ being of type
$\Block^*$ (i.e., a list of blocks), and the second component
$v:\isValidBc(\bbB)$ being a proof or a witness of the validity of
$\bbB$. In this way,
\begin{equation}
  \Blockchain\equiv\Sigma_{(\bbB:\Block^*)}\isValidBc(\bbB)
\end{equation}
We use here the notation for dependent sums of types borrowed from~\cite{HoTT}.

\nxsubpoint \embt(Dependent type theory, Coq and TL.)  Note that we
are using (Martin-L\"of) dependent type theory here, similar to that
used in the Coq\footnote{\url{https://coq.inria.fr}} proof
assistant. A simplified version of dependent type theory is also used
in {\em TL (Type
  Language)},\footnote{\url{https://core.telegram.org/mtproto/TL}}
which will be used in the formal specification of the TON Blockchain
to describe the serialization of all data structures and the layouts
of blocks, transactions, and the like.

In fact, dependent type theory gives a useful formalization of what a
proof is, and such formal proofs (or their serializations) might
come in handy when one needs to provide a proof of invalidity for some
block, for example.

\nxsubpoint\label{sp:TL} \embt(TL, or the Type Language.)  Since TL
(Type Language) will be used in the formal specifications of TON
blocks, transactions, and network datagrams, it warrants a brief
discussion.

TL is a language suitable for describing dependent algebraic {\em
  types}, which are allowed to have numeric (natural) and type
parameters. Each type is described by means of several {\em
  constructors}. Each constructor has a (human-readable) identifier
and a {\em name}, which is a bit string (32-bit integer by
default). Apart from that, the definition of a constructor contains a
list of fields along with their types.

A collection of constructor and type definitions is called a {\em
  TL-scheme}. It is usually kept in one or several files with the
suffix \texttt{.tl}.

An important feature of TL-schemes is that they determine an
unambiguous way of serializing and deserializing values (or objects)
of the algebraic types defined. Namely, when a value needs to be
serialized into a stream of bytes, first the name of the constructor
used for this value is serialized. Recursively computed serializations
of each field follow.
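
The serialization rule just described can be sketched in a few lines
of Python. The constructor table, the 32-bit names in it, the
little-endian byte order and the treatment of plain integers as
fields serialized without a constructor name are all assumptions made
for this illustration; they are not taken from an actual TL-scheme.
\begin{verbatim}
# Hypothetical constructors: identifier -> (32-bit name, arity).
CONSTRUCTORS = {
    "nothing": (0x0000A002, 0),
    "just":    (0x0000A003, 1),
    "pair":    (0x0000A001, 2),
}

def serialize(value):
    """Serialize an int, or a tuple (identifier, field, ...)."""
    if isinstance(value, int):
        # Assumed: a plain field is a signed 32-bit integer.
        return value.to_bytes(4, "little", signed=True)
    ident, *fields = value
    name, arity = CONSTRUCTORS[ident]
    if len(fields) != arity:
        raise ValueError("wrong number of fields for " + ident)
    out = name.to_bytes(4, "little")      # constructor name first
    for field in fields:                  # then each field, recursively
        out += serialize(field)
    return out

data = serialize(("just", ("pair", 1, 2)))
print(data.hex())
\end{verbatim}
Because every value starts with the name of its constructor, a
deserializer reading the same stream knows at each step which fields
to expect next.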

The description of a previous version of TL, suitable for serializing
arbitrary objects into sequences of 32-bit integers, is available at
\url{https://core.telegram.org/mtproto/TL}. A new version of TL,
called {\em TL-B}, is being developed for the purpose of describing
the serialization of objects used by the TON Project. This new version
can serialize objects into streams of bytes and even bits (not just
32-bit integers), and offers support for serialization into a tree of
TVM cells (cf.~\ptref{sp:tvm.cells}). A description of TL-B will be a
part of the formal specification of the TON Blockchain.

\nxsubpoint\label{sp:blk.transf} \embt(Blocks and transactions as
state transformation operators.)  Normally, any blockchain (type)
$\Blockchain$ has an associated global state (type) $\State$, and a
transaction (type) $\Transaction$. The semantics of a blockchain are
to a large extent determined by the transaction application function:
\begin{equation}
  \evtrans':\Transaction\times\State\to\State^?
\end{equation}
Here $X^?$ denotes $\Maybe X$, the result of applying the $\Maybe$
monad to type $X$. This is similar to our use of $X^*$ for $\List
X$. Essentially, a value of type $X^?$ is either a value of type $X$
or a special value $\bot$ indicating the absence of an actual value
(think about a null pointer). In our case, we use $\State^?$ instead
of $\State$ as the result type because a transaction may be invalid if
invoked from certain original states (think about attempting to
withdraw more money from an account than is actually there).

We might prefer a curried version of $\evtrans'$:
\begin{equation}
  \evtrans:\Transaction\to\State\to\State^?
\end{equation}

Because a block is essentially a list of transactions, the block
evaluation function
\begin{equation}
  \evblock:\Block\to\State\to\State^?
\end{equation}
can be derived from $\evtrans$. It takes a block $B:\Block$ and the
previous blockchain state $s:\State$ (which might include the hash of
the previous block) and computes the next blockchain state
$s'=\evblock(B)(s):\State^?$, which is either a true state or a special
value $\bot$ indicating that the next state cannot be computed (i.e.,
that the block is invalid if evaluated from the starting state
given---for example, the block includes a transaction trying to debit
an empty account).
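
The derivation of $\evblock$ from $\evtrans$ can be sketched in
Python as a fold that stops at the first invalid transaction. In this
sketch \texttt{None} plays the role of $\bot$; representing the state
as a dictionary of balances and a transaction as a (source,
destination, amount) triple is an assumption made for the example
only.
\begin{verbatim}
def ev_trans(tx, state):
    """Apply one transfer; return None (bottom) if it is invalid."""
    src, dst, amount = tx
    if amount > state.get(src, 0):
        return None                   # not enough funds
    new_state = dict(state)           # the old state is left intact
    new_state[src] = new_state.get(src, 0) - amount
    new_state[dst] = new_state.get(dst, 0) + amount
    return new_state

def ev_block(block, state):
    """Apply the transactions of a block one after another."""
    for tx in block:
        state = ev_trans(tx, state)
        if state is None:             # one invalid transaction
            return None               # invalidates the whole block
    return state

s0 = {"a": 10}
print(ev_block([("a", "b", 4), ("b", "c", 1)], s0))
print(ev_block([("a", "b", 11)], s0))
\end{verbatim}
The first call yields a new state, while the second yields
\texttt{None} because the only transaction tries to withdraw more
than the account holds.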

\nxsubpoint \embt(Block sequence numbers.)  Each block $B$ in the
blockchain can be referred to by its {\em sequence number}
$\blkseqno(B)$, starting from zero for the very first block, and
incremented by one whenever passing to the next block. More formally,
\begin{equation}
  \blkseqno(B)=\blkseqno\bigl(\blkprev(B)\bigr)+1
\end{equation}
Notice that the sequence number does not identify a block uniquely in
the presence of {\em forks}.

\nxsubpoint \embt(Block hashes.)  Another way of referring to a block
$B$ is by its hash $\blkhash(B)$, which is actually the hash of the
{\em header\/} of block $B$ (however, the header of the
block usually contains hashes that depend on all content
of block $B$). Assuming that there are no collisions for
the hash function used (or at least that they are very
improbable), a block is uniquely identified by its hash.

\nxsubpoint \embt(Hash assumption.)  During formal analysis of
blockchain algorithms, we assume that there are no collisions for the
$k$-bit hash function $\Hash:\Bytes^*\to\st2^{k}$ used:
\begin{equation}\label{eq:hash.coll}
  \Hash(s)=\Hash(s')\Rightarrow s=s'\quad\text{for any $s$,
    $s'\in\Bytes^*$}
\end{equation}
Here $\Bytes=\{0,\ldots,255\}=\st2^8$ is the type of bytes, or the set
of all byte values, and $\Bytes^*$ is the type or set of arbitrary
(finite) lists of bytes; while $\st2=\{0,1\}$ is the bit type, and
$\st2^k$ is the set (or actually the type) of all $k$-bit sequences
(i.e., of $k$-bit numbers).

Of course, \eqref{eq:hash.coll} is impossible mathematically, because
a map from an infinite set to a finite set cannot be injective. A more
rigorous assumption would be
\begin{equation}\label{eq:hash.coll.prec}
  \forall s, s': s\neq s', P\bigl(\Hash(s)=\Hash(s')\bigr)=2^{-k}
\end{equation}
However, this is not so convenient for the proofs. If
\eqref{eq:hash.coll.prec} is used at most $N$ times in a proof with
$2^{-k}N<\epsilon$ for some small $\epsilon$ (say,
$\epsilon=10^{-18}$), we can reason as if \eqref{eq:hash.coll} were
true, provided we accept a failure probability $\epsilon$ (i.e., the
final conclusions will be true with probability at least
$1-\epsilon$).
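
To get a feeling for the numbers, take $k=256$ and
$\epsilon=10^{-18}$ as an illustration. The condition
$2^{-k}N<\epsilon$ then reads
\begin{equation*}
  N<\epsilon\cdot2^{256}\approx10^{-18}\cdot1.16\cdot10^{77}
  =1.16\cdot10^{59},
\end{equation*}
so a proof could invoke \eqref{eq:hash.coll.prec} roughly $10^{59}$
times before the accepted failure probability $\epsilon$ is
exhausted.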

Final remark: in order to make the probability statement
of~\eqref{eq:hash.coll.prec} really rigorous, one must introduce a
probability distribution on the set $\Bytes^*$ of all byte
sequences. A way of doing this is by assuming all byte sequences of
the same length $l$ equiprobable, and setting the probability of
observing a sequence of length $l$ equal to $p^l-p^{l+1}$ for some
$p\to1-$. Then \eqref{eq:hash.coll.prec} should be understood as a
limit of conditional probability $P\bigl(\Hash(s)=\Hash(s')|s\neq
s'\bigr)$ when $p$ tends to one from below.
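
As a quick consistency check, these weights do form a probability
distribution for any fixed $0\leq p<1$, since the sum over all
lengths telescopes:
\begin{equation*}
  \sum_{l\geq0}\bigl(p^l-p^{l+1}\bigr)
  =\lim_{L\to\infty}\bigl(1-p^{L+1}\bigr)=1 .
\end{equation*}
Each of the $256^l$ byte sequences of length $l$ then has
probability $(1-p)\,p^l/256^l$.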

\nxsubpoint\label{sp:hash.change} \embt(Hash used for the TON
Blockchain.)  We are using the 256-bit $\Sha$ hash for the TON
Blockchain for the time being. If it turns out to be weaker than
expected, it can be replaced by another hash function in the
future. The choice of the hash function is a configurable parameter of
the protocol, so it can be changed without hard forks as explained
in~\ptref{sp:config.param}.

\mysubsection{Blockchain State, Accounts and Hashmaps}

We have noted above that any blockchain defines a certain global
state, and each block and each transaction defines a transformation of
this global state. Here we describe the global state used by TON
blockchains.

\nxsubpoint \embt(Account IDs.)  The basic account IDs used by TON
blockchains---or at least by the masterchain and Workchain Zero---are
256-bit integers, assumed to be public keys for 256-bit Elliptic Curve
Cryptography (ECC) for a specific elliptic curve. In this way,
\begin{equation}
  \accountid:\Account=\uint_{256}=\st2^{256}
\end{equation}
Here $\Account$ is the account {\em type}, while $\accountid:\Account$
is a specific variable of type $\Account$.

Other workchains can use other account ID formats, 256-bit or
otherwise. For example, one can use Bitcoin-style account IDs, equal
to $\Sha$ of an ECC public key.

However, the bit length $l$ of an account ID must be fixed during the
creation of the workchain (in the masterchain), and it must be at
least 64, because the first 64 bits of $\accountid$ are used for
sharding and message routing.

\nxsubpoint \embt(Main component: {\em Hashmaps}.)  The principal
component of the TON blockchain state is a {\em hashmap}. In some
cases we consider (partially defined) ``maps''
$h:\st2^n\dashrightarrow\st2^m$. More generally, we might be
interested in hashmaps $h:\st2^n\dashrightarrow X$ for a composite
type $X$. However, the source (or index) type is almost always
$\st2^n$.

Sometimes, we have a ``default value'' $\vr{empty}:X$, and the hashmap
$h:\st2^n\to X$ is ``initialized'' by its ``default value''
$i\mapsto\vr{empty}$.

\nxsubpoint \embt(Example: TON account balances.)  An important
example is given by TON account balances. It is a hashmap
\begin{equation}
  \vr{balance}:\Account\to\uint_{128}
\end{equation}
mapping $\Account=\st2^{256}$ into a Gram (TON coin) balance of type
$\uint_{128}=\st2^{128}$. This hashmap has a default value of zero,
meaning that initially (before the first block is processed) the
balance of all accounts is zero.

\nxsubpoint \embt(Example: smart-contract persistent storage.)
Another example is given by smart-contract persistent storage, which
can be (very approximately) represented as a hashmap
\begin{equation}
  \vr{storage}:\st2^{256}\dashrightarrow\st2^{256}
\end{equation}
This hashmap also has a default value of zero, meaning that
uninitialized cells of persistent storage are assumed to be zero.

\nxsubpoint \embt(Example: persistent storage of all smart contracts.)
Because we have more than one smart contract, distinguished by
$\accountid$, each having its separate persistent storage, we must
actually have a hashmap
\begin{equation}
  \vr{Storage}:\Account\dashrightarrow(\st2^{256}\dashrightarrow\st2^{256})
\end{equation}
mapping $\accountid$ of a smart contract into its persistent storage.

\nxsubpoint \embt(Hashmap type.)  The hashmap is not just an abstract
(partially defined) function $\st2^n\dashrightarrow X$; it has a
specific representation. Therefore, we suppose that we have a special
hashmap type
\begin{equation}
  \Hashmap (n,X):\Type
\end{equation}
corresponding to a data structure encoding a (partial) map
$\st2^n\dashrightarrow X$.  We can also write
\begin{equation}
  \Hashmap (n:\nat) (X:\Type) : \Type
\end{equation}
or
\begin{equation}
  \Hashmap:\nat\to\Type\to\Type
\end{equation}
We can always transform $h:\Hashmap(n,X)$ into a map
$\hget(h):\st2^n\to X^?$. Henceforth, we usually write $h[i]$ instead
of $\hget(h)(i)$:
\begin{equation}
  h[i]:\equiv\hget(h)(i):X^?\quad\text{for any $i:\st2^n$,
    $h:\Hashmap(n,X)$}
\end{equation}

\nxsubpoint\label{sp:patricia} \embt(Definition of hashmap type as a
Patricia tree.)  Logically, one might define $\Hashmap(n,X)$ as an
(incomplete) binary tree of depth $n$ with edge labels $0$ and $1$ and
with values of type $X$ in the leaves. Another way to describe the
same structure would be as a {\em (bitwise) trie\/} for binary strings
of length equal to $n$.

In practice, we prefer to use a compact representation of this trie,
by compressing each vertex having only one child with its parent. The
resulting representation is known as a {\em Patricia tree\/} or a {\em
  binary radix tree\/}. Each intermediate vertex now has exactly two
children, labeled by two non-empty binary strings, beginning with zero
for the left child and with one for the right child.

In other words, there are two types of (non-root) nodes in a Patricia
tree:
\begin{itemize}
\item $\leaf(x)$, containing value $x$ of type $X$.
\item $\node(l,s_l,r,s_r)$, where $l$ is the (reference to the) left
  child or subtree, $s_l$ is the bitstring labeling the edge
  connecting this vertex to its left child (always beginning with 0),
  $r$ is the right subtree, and $s_r$ is the bitstring labeling the
  edge to the right child (always beginning with 1).
\end{itemize}
A third type of node, to be used only once at the root of the Patricia
tree, is also necessary:
\begin{itemize}
\item $\root(n,s_0,t)$, where $n$ is the common length of index
  bitstrings of $\Hashmap(n,X)$, $s_0$ is the common prefix of all
  index bitstrings, and $t$ is a reference to a $\leaf$ or a $\node$.
\end{itemize}
If we want to allow the Patricia tree to be empty, a fourth type of
(root) node would be used:
\begin{itemize}
\item $\emptyroot(n)$, where $n$ is the common length of all index
  bitstrings.
\end{itemize}

We define the height of a Patricia tree by
\begin{align}
  \height(\leaf(x))&=0\\
  \height\bigl(\node(l,s_l,r,s_r)\bigr)&=\height(l)+\len(s_l)=\height(r)+\len(s_r)\\
  \height\bigl(\root(n,s_0,t)\bigr)&=\len(s_0)+\height(t)=n
\end{align}
The last two expressions in each of the last two formulas must be
equal. We use Patricia trees of height $n$ to represent values of type
$\Hashmap(n,X)$.
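
For instance, consider a hashmap of type $\Hashmap(3,X)$ holding
values $x$, $y$, $z$ at the indices $010$, $011$ and $110$ (an
illustrative choice). The indices have no common prefix, so $s_0$ is
the empty string, and the tree is
\begin{equation*}
  \root\bigl(3,s_0,\node(l,01,\leaf(z),110)\bigr),\qquad
  l=\node\bigl(\leaf(x),0,\leaf(y),1\bigr).
\end{equation*}
The heights are consistent: $\height(l)=0+1=1$, the outer node has
height $\height(l)+\len(01)=1+2=3=0+\len(110)$, and the root has
height $\len(s_0)+3=3=n$.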

If there are $N$ leaves in the tree (i.e., our hashmap contains $N$
values), then there are exactly $N-1$ intermediate vertices. Inserting
a new value always involves splitting an existing edge by inserting a
new vertex in the middle and adding a new leaf as the other child of
this new vertex. Deleting a value from a hashmap does the opposite: a
leaf and its parent are deleted, and the parent's parent and its other
child become directly linked.
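
Insertion by edge splitting can be sketched in Python as follows.
This is only an illustration under simplifying assumptions: indices
are strings of the characters 0 and 1, all of the same length $n$;
the root is represented as a pair consisting of the prefix $s_0$ and
a subtree (or \texttt{None} for the empty tree); deletion is omitted.
All names are chosen for this example.
\begin{verbatim}
from dataclasses import dataclass
from typing import Any, Optional, Tuple, Union

@dataclass
class Leaf:
    value: Any

@dataclass
class Node:
    l: "Tree"
    s_l: str          # label of the left edge, begins with "0"
    r: "Tree"
    s_r: str          # label of the right edge, begins with "1"

Tree = Union[Leaf, Node]
Edge = Tuple[str, Tree]   # an edge label and the subtree below it

def common_prefix(a: str, b: str) -> str:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]

def insert(edge: Optional[Edge], key: str, value: Any) -> Edge:
    """Insert value under key (the part not yet consumed)."""
    if edge is None:                    # empty tree
        return (key, Leaf(value))
    label, t = edge
    p = common_prefix(label, key)
    if len(p) == len(label):            # label fully matched
        rest = key[len(p):]
        if isinstance(t, Leaf):         # same key: overwrite
            return (label, Leaf(value))
        if rest[0] == "0":              # descend to the left
            s, c = insert((t.s_l, t.l), rest, value)
            return (label, Node(c, s, t.r, t.s_r))
        s, c = insert((t.s_r, t.r), rest, value)
        return (label, Node(t.l, t.s_l, c, s))
    # Otherwise split the edge where key and label diverge.
    old = (label[len(p):], t)
    new = (key[len(p):], Leaf(value))
    if old[0][0] == "0":
        (s_l, l), (s_r, r) = old, new
    else:
        (s_l, l), (s_r, r) = new, old
    return (p, Node(l, s_l, r, s_r))

def lookup(edge: Optional[Edge], key: str):
    """Return the value stored under key, or None."""
    while edge is not None:
        label, t = edge
        if not key.startswith(label):
            return None
        key = key[len(label):]
        if isinstance(t, Leaf):
            return t.value
        edge = (t.s_l, t.l) if key[0] == "0" else (t.s_r, t.r)
    return None

def count(t: Tree) -> Tuple[int, int]:
    """Return (leaves, intermediate vertices) of a subtree."""
    if isinstance(t, Leaf):
        return (1, 0)
    ll, li = count(t.l)
    rl, ri = count(t.r)
    return (ll + rl, li + ri + 1)

root = None
for k, v in [("010", "x"), ("011", "y"), ("110", "z")]:
    root = insert(root, k, v)
print(lookup(root, "011"), count(root[1]))
\end{verbatim}
With the three indices used here, the count is three leaves and two
intermediate vertices, in agreement with the $N-1$ rule above.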

\nxsubpoint\label{sp:merkle.patr.hash} \embt(Merkle--Patricia trees.)
When working with blockchains, we want to be able to compare Patricia
trees (i.e., hashmaps) and their subtrees, by reducing them to a
single hash value. The classical way of achieving this is given by the
Merkle tree. Essentially, we want to describe a way of hashing objects
$h$ of type $\Hashmap(n,X)$ with the aid of a hash function $\Hash$
defined for binary strings, provided we know how to compute hashes
$\Hash(x)$ of objects $x:X$ (e.g., by applying the hash function
$\Hash$ to a binary serialization of object~$x$).

One might define $\Hash(h)$ recursively as follows:
\begin{align}\label{eq:hash.leaf}
  \Hash\bigl(\leaf(x)\bigr):=&\Hash(x)\\
  \label{eq:hash.node}
  \Hash\bigl(\node(l,s_l,r,s_r)\bigr):=&\Hash\bigl(\Hash(l).\Hash(r).\code(s_l).\code(s_r)\bigr)\\
  \Hash\bigl(\root(n,s_0,t)\bigr):=&\Hash\bigl(\code(n).\code(s_0).\Hash(t)\bigr)
\end{align}
Here $s.t$ denotes the concatenation of (bit) strings $s$ and $t$, and
$\code(s)$ is a prefix code for all bit strings $s$. For example, one
might encode 0 by 10, 1 by 11, and the end of the string by 0.%
\footnote{One can show that this encoding is optimal for approximately
  half of all edge labels of a Patricia tree with random or
  consecutive indices. Remaining edge labels are likely to be long
  (i.e., almost 256 bits long). Therefore, a nearly optimal encoding
  for edge labels is to use the above code with prefix 0 for ``short''
  bit strings, and encode 1, then nine bits containing length $l=|s|$
  of bitstring~$s$, and then the $l$ bits of $s$ for ``long''
  bitstrings (with $l\geq10$).}

We will see later (cf.~\ptref{sp:pec.tvm} and \ptref{sp:tvm.cells})
that this is a (slightly tweaked) version of recursively defined
hashes for values of arbitrary (dependent) algebraic types.
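
The recursive definition above can be turned into a short Python
sketch. It uses the simple prefix code from the text (not the
optimized one from the footnote) and SHA-256 as $\Hash$. Several
details are assumptions made for the example only: trees are nested
tuples, values $x$ are already given as byte strings, bit strings are
padded with zero bits to a whole number of bytes, and $\code(n)$ is
taken to be $n$ written as a two-byte integer.
\begin{verbatim}
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def code(s: str) -> str:
    """Prefix code from the text: 0 -> 10, 1 -> 11, end -> 0."""
    return "".join("1" + bit for bit in s) + "0"

def pack_bits(bits: str) -> bytes:
    """Assumed convention: pad with zero bits to a whole byte."""
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8], 2)
                 for i in range(0, len(bits), 8))

def tree_hash(t) -> bytes:
    """t is ("leaf", x), ("node", l, s_l, r, s_r)
    or ("root", n, s0, subtree)."""
    if t[0] == "leaf":
        return H(t[1])
    if t[0] == "node":
        _, l, s_l, r, s_r = t
        labels = pack_bits(code(s_l) + code(s_r))
        return H(tree_hash(l) + tree_hash(r) + labels)
    _, n, s0, sub = t
    # Assumed: code(n) is n as a 2-byte big-endian integer.
    head = n.to_bytes(2, "big") + pack_bits(code(s0))
    return H(head + tree_hash(sub))

inner = ("node", ("leaf", b"x"), "0", ("leaf", b"y"), "1")
tree = ("root", 3, "", ("node", inner, "01", ("leaf", b"z"), "110"))
print(tree_hash(tree).hex())
\end{verbatim}
Changing any leaf value or any edge label changes the hashes of all
nodes on the path to the root, and hence the hash of the whole tree.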

\nxsubpoint \embt(Recomputing Merkle tree hashes.)  This way of
recursively defining $\Hash(h)$, called a {\em Merkle tree hash}, has
the advantage that, if one explicitly stores $\Hash(h')$ along with
each node $h'$ (resulting in a structure called a {\em Merkle tree},
or, in our case, a {\em Merkle--Patricia tree}), one needs to
recompute at most $n$ hashes when an element is added to, deleted
from or changed in the hashmap.

In this way, if one represents the global blockchain state by a
suitable Merkle tree hash, it is easy to recompute this state hash
after each transaction.

\nxsubpoint\label{sp:merkle.proof} \embt(Merkle proofs.)  Under the
assumption \eqref{eq:hash.coll} of ``injectivity'' of the chosen hash
function $\Hash$, one can construct a proof that, for a given value
$z$ of $\Hash(h)$, $h:\Hashmap(n,X)$, one has $\hget(h)(i)=x$ for some
$i:\st2^n$ and $x:X$. Such a proof will consist of the path in the
Merkle--Patricia tree from the leaf corresponding to $i$ to the root,
augmented by the hashes of all siblings of all nodes occurring on this
path.

In this way, a light node%
\footnote{A {\em light node\/} is a node that does not keep track of
  the full state of a shardchain; instead, it keeps minimal
  information such as the hashes of the several most recent blocks,
  and relies on information obtained from full nodes when it becomes
  necessary to inspect some parts of the full state.} %
knowing only the value of $\Hash(h)$ for some hashmap $h$ (e.g.,
smart-contract persistent storage or global blockchain state) might
request from a full node%
\footnote{A {\em full node\/} is a node keeping track of the complete
  up-to-date state of the shardchain in question.} %
not only the value $x=h[i]=\hget(h)(i)$, but such a
value along with a Merkle proof starting from the already known value
$\Hash(h)$. Then, under assumption \eqref{eq:hash.coll}, the light
node can check for itself that $x$ is indeed the correct value of
$h[i]$.

In some cases, the client may want to obtain the value
$y=\Hash(x)=\Hash(h[i])$ instead---for example, if $x$ itself is very
large (e.g., a hashmap itself). Then a Merkle proof for $(i,y)$ can be
provided instead. If $x$ is a hashmap as well, then a second Merkle
proof starting from $y=\Hash(x)$ may be obtained from a full node, to
provide a value $x[j]=h[i][j]$ or just its hash.
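
The check performed by the light node can be sketched in Python.
The proof format below is hypothetical: for each node on the path,
listed from the leaf upwards, it gives the side the path came from,
the hash of the sibling and both edge labels. As before, SHA-256
stands for $\Hash$, bit strings are padded with zero bits to whole
bytes, and $\code(n)$ is taken to be $n$ as a two-byte integer; all
of these are assumptions made for the example.
\begin{verbatim}
import hashlib

def H(data):
    return hashlib.sha256(data).digest()

def code(s):
    """Prefix code from the text: 0 -> 10, 1 -> 11, end -> 0."""
    return "".join("1" + bit for bit in s) + "0"

def pack_bits(bits):
    bits += "0" * (-len(bits) % 8)      # assumed zero padding
    return bytes(int(bits[i:i + 8], 2)
                 for i in range(0, len(bits), 8))

def node_hash(hl, s_l, hr, s_r):
    return H(hl + hr + pack_bits(code(s_l) + code(s_r)))

def root_hash(n, s0, ht):
    return H(n.to_bytes(2, "big") + pack_bits(code(s0)) + ht)

def verify(z, key, x, steps, n, s0):
    """Check a proof that h[key] = x, given only z = Hash(h)."""
    h, suffix = H(x), ""
    for side, sibling, s_l, s_r in steps:
        if side == "l":     # the path is the left child here
            h = node_hash(h, s_l, sibling, s_r)
            suffix = s_l + suffix
        else:               # the path is the right child here
            h = node_hash(sibling, s_l, h, s_r)
            suffix = s_r + suffix
    # The labels along the path must spell out the index itself.
    return (root_hash(n, s0, h) == z
            and s0 + suffix == key and len(key) == n)

# Full node: hashmap with indices 010, 011, 110.
hx, hy, hz = H(b"x"), H(b"y"), H(b"z")
inner = node_hash(hx, "0", hy, "1")
z = root_hash(3, "", node_hash(inner, "01", hz, "110"))

# Light node: knows only z, receives value b"y" and the proof.
proof = [("r", hx, "0", "1"), ("l", hz, "01", "110")]
print(verify(z, "011", b"y", proof, 3, ""))
\end{verbatim}
The proof contains one sibling hash per node on the path, so its size
grows with the length of the path and not with the number of values
stored in the hashmap.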

\nxsubpoint \embt(Importance of Merkle proofs for a multi-chain system
such as TON.)  Notice that a node normally cannot be a full node for
all shardchains existing in the TON environment. It usually is a full
node only for some shardchains---for instance, those containing its
own account, a smart contract it is interested in, or those that this
node has been assigned to be a validator of. For other shardchains, it
must be a light node---otherwise the storage, computing and network
bandwidth requirements would be prohibitive. This means that such a
node cannot directly check assertions about the state of other
shardchains; it must rely on Merkle proofs obtained from full nodes
for those shardchains, which is as safe as checking by itself unless
\eqref{eq:hash.coll} fails (i.e., a hash collision is found).

\nxsubpoint\label{sp:pec.tvm} \embt(Peculiarities of TON VM.)  The TON
VM or TVM (TON Virtual Machine), used to run smart contracts in
the masterchain and Workchain Zero, is considerably different from
customary designs inspired by the EVM (Ethereum Virtual Machine): it
works not just with 256-bit integers, but actually with (almost)
arbitrary ``records'', ``structures'', or ``sum-product types'',
making it more suitable to execute code written in high-level
(especially functional) languages. Essentially, TVM uses tagged data
types, not unlike those used in implementations of Prolog or Erlang.

One might imagine first that the state of a TVM smart contract is not
just a hashmap $\st2^{256}\to\st2^{256}$, or
$\Hashmap(256,\st2^{256})$, but (as a first step) $\Hashmap(256,X)$,
where $X$ is a type with several constructors, enabling it to store,
apart from 256-bit integers, other data structures, including other
hashmaps $\Hashmap(256,X)$ in particular. This would mean that a cell
of TVM (persistent or temporary) storage---or a variable or an element
of an array in TVM smart-contract code---may contain not only an
integer, but a whole new hashmap. Of course, this would mean that a
cell contains not just 256 bits, but also, say, an 8-bit tag,
describing how these 256 bits should be interpreted.

In fact, values do not need to be precisely 256-bit. The value format
used by TVM consists of a sequence of raw bytes and references to
other structures, mixed in arbitrary order, with some descriptor bytes
inserted in suitable locations to be able to distinguish pointers from
raw data (e.g., strings or integers); cf.~\ptref{sp:tvm.cells}.

This raw value format may be used to implement arbitrary sum-product
algebraic types. In this case, the value would contain a raw byte
first, describing the ``constructor'' being used (from the perspective
of a high-level language), and then other ``fields'' or ``constructor
arguments'', consisting of raw bytes and references to other
structures depending on the constructor chosen
(cf.~\ptref{sp:TL}). However, TVM does not know anything about the
correspondence between constructors and their arguments; the mixture
of bytes and references is explicitly described by certain descriptor
bytes.\footnote{These two descriptor bytes, present in any TVM cell,
  describe only the total number of references and the total number of
  raw bytes; references are kept together either before or after all
  raw bytes.}

The Merkle tree hashing is extended to arbitrary such structures: to
compute the hash of such a structure, all references are recursively
replaced by hashes of objects referred to, and then the hash of the
resulting byte string (descriptor bytes included) is computed.

In this way, the Merkle tree hashing for hashmaps, described in
\ptref{sp:merkle.patr.hash}, is just a special case of hashing for
arbitrary (dependent) algebraic data types, applied to type
$\Hashmap(n,X)$ with two constructors.\footnote{Actually, $\leaf$ and
  $\node$ are constructors of an auxiliary type,
  $\tp{HashmapAux}(n,X)$. Type $\Hashmap(n,X)$ has constructors
  $\root$ and $\emptyroot$, with $\root$ containing a value of type
  $\tp{HashmapAux}(n,X)$.}

\nxsubpoint \embt(Persistent storage of TON smart contracts.)
Persistent storage of a TON smart contract essentially consists of its
``global variables'', preserved between calls to the smart
contract. As such, it is just a ``product'', ``tuple'', or ``record''
type, consisting of fields of the correct types, corresponding to one
global variable each. If there are too many global variables, they
cannot fit into one TON cell because of the global restriction on TON
cell size. In such a case, they are split into several records and
organized into a tree, essentially becoming a ``product of products''
or ``product of products of products'' type instead of just a product
type.

\nxsubpoint\label{sp:tvm.cells} \embt(TVM Cells.)  Ultimately, the TON
VM keeps all data in a collection of {\em (TVM) cells}. Each cell
contains two descriptor bytes first, indicating how many bytes of raw
data are present in this cell (up to 128) and how many references to
other cells are present (up to four). Then these raw data bytes and
references follow. Each cell is referenced exactly once, so we might
have included in each cell a reference to its ``parent'' (the only
cell referencing this one). However, this reference need not be
explicit.

In this way, the persistent data storage cells of a TON smart contract
are organized into a tree,\footnote{Logically; the ``bag of cells''
  representation described in~\ptref{sp:bag.of.cells} identifies all
  duplicate cells, transforming this tree into a directed acyclic
  graph (dag) when serialized.} with a reference to the root of this
tree kept in the smart-contract description. If necessary, a Merkle
tree hash of this entire persistent storage is recursively computed,
starting from the leaves and then simply replacing all references in a
cell with the recursively computed hashes of the referenced cells, and
subsequently computing the hash of the byte string thus obtained.
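
As an illustration of the recursive hashing just described, here is a
minimal Python sketch. It is not the actual TVM cell format: the layout
of the two descriptor bytes (reference count first, then data length),
the use of SHA-256, and the names \texttt{Cell} and
\texttt{merkle\_hash} are assumptions made for this example only.

\begin{verbatim}
import hashlib
from dataclasses import dataclass, field
from typing import List

MAX_DATA_BYTES = 128  # per-cell limit on raw data (from the text)
MAX_REFS = 4          # per-cell limit on references (from the text)


@dataclass
class Cell:
    data: bytes = b""
    refs: List["Cell"] = field(default_factory=list)

    def __post_init__(self):
        if len(self.data) > MAX_DATA_BYTES:
            raise ValueError("too many raw bytes in one cell")
        if len(self.refs) > MAX_REFS:
            raise ValueError("too many references in one cell")

    def merkle_hash(self) -> bytes:
        # Two descriptor bytes; this layout is an assumption.
        descr = bytes([len(self.refs), len(self.data)])
        # Replace each reference by the hash of the referenced cell.
        children = b"".join(c.merkle_hash() for c in self.refs)
        return hashlib.sha256(descr + self.data + children).digest()
\end{verbatim}

Changing any cell in the tree changes the hash of every cell on the
path from it to the root, which is what makes the root hash a
commitment to the whole persistent storage.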

\nxsubpoint\label{sp:gen.merkle.proof} \embt(Generalized Merkle proofs
for values of arbitrary algebraic types.)  Because the TON VM
represents a value of arbitrary algebraic type by means of a tree
consisting of (TVM) cells, and each cell has a well-defined
(recursively computed) Merkle hash, depending in fact on the whole
subtree rooted in this cell, we can provide ``generalized Merkle
proofs'' for (parts of) values of arbitrary algebraic types, intended
to prove that a certain subtree of a tree with a known Merkle hash
takes a specific value or a value with a specific hash. This
generalizes the approach of \ptref{sp:merkle.proof}, where only Merkle
proofs for $x[i]=y$ have been considered.
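
One possible shape of such a proof can be sketched in Python. The proof
format below (for each cell on the path, its raw data, the hashes of its
other children, and the position of the proven child), the hash function
and all names are assumptions chosen for illustration; the text does not
prescribe a concrete encoding.

\begin{verbatim}
import hashlib


def cell_hash(data: bytes, child_hashes: list) -> bytes:
    # Descriptor layout and SHA-256 are assumptions of this sketch.
    descr = bytes([len(child_hashes), len(data)])
    return hashlib.sha256(descr + data + b"".join(child_hashes)).digest()


def verify_proof(root_hash: bytes, subtree_hash: bytes, path: list) -> bool:
    """Check that a subtree with hash `subtree_hash` occurs in the tree.

    `path` lists, from the subtree's parent up to the root, tuples
    (data, sibling_hashes, index): the parent's raw data, the hashes
    of its other children, and the child position being proven.
    """
    h = subtree_hash
    for data, siblings, index in path:
        children = siblings[:index] + [h] + siblings[index:]
        h = cell_hash(data, children)
    return h == root_hash
\end{verbatim}

The verifier needs only the known root hash and the cells along one
path, not the rest of the tree.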

\nxsubpoint\label{sp:tvm.data.shard} \embt(Support for sharding in TON
VM data structures.)  We have just outlined how the TON VM, without
being overly complicated, supports arbitrary (dependent) algebraic
data types in high-level smart-contract languages. However, sharding
of large (or global) smart contracts requires special support at the
level of the TON VM. To this end, a special version of the hashmap type
has been added to the system, amounting to a ``map''
$\Account\dashrightarrow X$. This ``map'' might seem equivalent to
$\Hashmap(m,X)$, where $\Account=\st2^m$. However, when a shard is
split in two, or two shards are merged, such hashmaps are
automatically split in two, or merged back, so as to keep only those
keys that belong to the corresponding shard.
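
To make the automatic splitting and merging concrete, here is a small
Python sketch. Account identifiers are modelled as bit strings and the
map as an ordinary dictionary; these representations and the function
names are assumptions for illustration and do not reflect the actual
TVM hashmap encoding.

\begin{verbatim}
def split_by_shard(table: dict, prefix: str):
    """Split the map kept by shard `prefix` between its two children.

    Keys are account identifiers written as bit strings; the children
    are the shards with prefixes prefix+'0' and prefix+'1'.
    """
    left = {k: v for k, v in table.items()
            if k.startswith(prefix + "0")}
    right = {k: v for k, v in table.items()
             if k.startswith(prefix + "1")}
    return left, right


def merge_shards(left: dict, right: dict) -> dict:
    """Merge the maps of two sibling shards back into one."""
    return {**left, **right}
\end{verbatim}

Each child keeps exactly the keys whose identifiers fall into its
shard, and merging the two halves restores the original map.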

\nxsubpoint \embt(Payment for persistent storage.)  A noteworthy
feature of the TON Blockchain is the payment exacted from smart
contracts for storing their persistent data (i.e., for enlarging the
total state of the blockchain). It works as follows:

Each block declares two rates, denominated in the principal currency of
the blockchain (usually the Gram): the price for keeping one cell in
the persistent storage, and the price for keeping one raw byte in some
cell of the persistent storage. Statistics on the total numbers of
cells and bytes used by each account are stored as part of its state,
so by multiplying these numbers by the two rates declared in the block
header, we can compute the payment to be deducted from the account
balance for keeping its data between the previous block and the
current one.

However, payment for persistent storage usage is not exacted for every
account and smart contract in each block; instead, the sequence number
of the block where this payment was last exacted is stored in the
account data, and when any action is done with the account (e.g., a
value is transferred, or a message is received and processed by a smart
contract), the storage usage payment for all blocks since the previous
such payment is deducted from the account balance before performing
any further actions. If the account's balance would become negative
after this, the account is destroyed.

A workchain may declare some number of raw data bytes per account to
be ``free'' (i.e., not participating in the persistent storage
payments) in order to make ``simple'' accounts, which keep only their
balance in one or two cryptocurrencies, exempt from these constant
payments.

Notice that, if nobody sends any messages to an account, its
persistent storage payments are not collected, and it can exist
indefinitely. However, anybody can send, for instance, an empty
message to destroy such an account. A small incentive, collected from
part of the original balance of the account to be destroyed, can be
given to the sender of such a message. We expect, however, that the
validators would destroy such insolvent accounts for free, simply to
decrease the global blockchain state size and to avoid keeping large
amounts of data without compensation.

Payments collected for keeping persistent data are distributed among
the validators of the shardchain or the masterchain (proportionally to
their stakes in the latter case).
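
The deferred collection rule described above can be sketched in Python
as follows. The sketch assumes integer currency units and that the
account's cell and byte counts stay constant between two collections
(they can only change when the account is acted upon, which triggers a
collection first); the function names and the representation of rates
as a list of pairs are hypothetical.

\begin{verbatim}
def storage_due(cells, raw_bytes, rates, free_bytes=0):
    """Payment owed for all blocks since the last collection.

    `rates` holds one (price_per_cell, price_per_byte) pair for each
    block since the block in which payment was last exacted.
    """
    billable = max(0, raw_bytes - free_bytes)
    return sum(pc * cells + pb * billable for pc, pb in rates)


def charge(balance, cells, raw_bytes, rates, free_bytes=0):
    """Return the new balance, or None if the account is destroyed."""
    new_balance = balance - storage_due(cells, raw_bytes, rates,
                                        free_bytes)
    # A balance that would become negative destroys the account.
    return new_balance if new_balance >= 0 else None
\end{verbatim}

For example, an account using 3 cells and 100 raw bytes that was last
charged five blocks ago, with rates of 2 per cell and 1 per byte in each
of those blocks, owes $5\cdot(2\cdot3+1\cdot100)=530$ units.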

\nxsubpoint\label{sp:loc.glob.smct} \embt(Local and global smart
contracts; smart-contract instances.)  A smart contract normally
resides in just one shard, selected according to the smart contract's
$\accountid$, similarly to an ``ordinary'' account. This is usually
sufficient for most applications. However, some ``high-load'' smart
contracts may want to have an ``instance'' in each shardchain of some
workchain. To achieve this, they must propagate their creating
transaction into all shardchains, for instance, by committing this
transaction into the ``root'' shardchain $(w,\emptyset)$\footnote{A
  more expensive alternative is to publish such a ``global'' smart
  contract in the masterchain.} of the workchain $w$ and paying a
large commission.\footnote{This is a sort of ``broadcast'' feature for
  all shards, and as such, it must be quite expensive.}

This action effectively creates instances of the smart contract in
each shard, with separate balances. Initially, the balance
transferred in the creating transaction is distributed simply by
giving the instance in shard $(w,s)$ a $2^{-|s|}$ fraction of the total
balance. When a shard splits into two child shards, balances of all
instances of global smart contracts are split in half; when two shards
merge, balances are added together.

In some cases, splitting/merging instances of global smart contracts
may involve (delayed) execution of special methods of these smart
contracts. By default, the balances are split and merged as described
above, and some special ``account-indexed'' hashmaps are also
automatically split and merged (cf.~\ptref{sp:tvm.data.shard}).
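
As a worked example of the default balance rule (with an illustrative
shard configuration, not one taken from a real workchain), suppose that
the workchain $w$ currently consists of the three shards with binary
prefixes $(w,0)$, $(w,10)$ and $(w,11)$, and that the creating
transaction carries a total balance $B$. The three instances then
receive
\[
  2^{-1}B + 2^{-2}B + 2^{-2}B = B,
\]
so the whole balance is distributed. If $(w,0)$ later splits into
$(w,00)$ and $(w,01)$, the balance $B/2$ of its instance is halved, and
each child instance holds $B/4=2^{-2}B$, in agreement with the
$2^{-|s|}$ rule (provided no other transfers took place in the
meantime).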

\nxsubpoint \embt(Limiting splitting of smart contracts.)  A global
smart contract may limit its splitting depth $d$ upon its creation, in
order to make persistent storage expenses more predictable. This means
that, if shardchain $(w,s)$ with $|s|\geq d$ splits in two, only one
of the two new shardchains inherits an instance of the smart
contract. This shardchain is chosen deterministically: each global
smart contract has some ``$\accountid$'', which is essentially the
hash of its creating transaction, and its instances have the same
$\accountid$ with the first $\leq d$ bits replaced by suitable values
needed to fall into the correct shard. This $\accountid$ selects which
shard will inherit the smart-contract instance after splitting.

\nxsubpoint\label{sp:account.state} \embt(Account/Smart-contract
state.)  We can summarize all of the above to conclude that an account
or smart-contract state consists of the following:
\begin{itemize}
\item A balance in the principal currency of the blockchain
\item A balance in other currencies of the blockchain
\item Smart-contract code (or its hash)
\item Smart-contract persistent data (or its Merkle hash)
\item Statistics on the number of persistent storage cells and raw
  bytes used
\item The last time (actually, the masterchain block number) when
  payment for smart-contract persistent storage was collected
\item The public key needed to transfer currency and send messages
  from this account (optional; by default equal to $\accountid$
  itself). In some cases, more sophisticated signature checking code
  may be located here, similar to what is done for Bitcoin transaction
  outputs; then the $\accountid$ will be equal to the hash of this
  code.
\end{itemize}
We also need to keep somewhere, either in the account state or in some
other account-indexed hashmap, the following data:
\begin{itemize}
\item The output message queue of the account
  (cf.~\ptref{sp:out.queue})
\item The collection of (hashes of) recently delivered messages
  (cf.~\ptref{sp:deliver.q})
\end{itemize}

Not all of these are really required for every account; for example,
smart-contract code is needed only for smart contracts, but not for
``simple'' accounts. Furthermore, while any account must have a
non-zero balance in the principal currency (e.g., Grams for the
masterchain and shardchains of the basic workchain), it may have
balances of zero in other currencies. In order to avoid keeping unused
data, a sum-product type (depending on the workchain) is defined
(during the workchain's creation), which uses different tag bytes
(e.g., TL constructors; cf.~\ptref{sp:TL}) to distinguish between
different ``constructors'' used. Ultimately, the account state is
itself kept as a collection of cells of the TVM persistent storage.

\mysubsection{Messages Between Shardchains}

An important component of the TON Blockchain is the {\em messaging
  system\/} between blockchains. These blockchains may be shardchains
of the same workchain, or of different workchains.

\nxsubpoint \embt(Messages, accounts and transactions: a bird's eye
view of the system.)  {\em Messages\/} are sent from one account to
another. Each {\em transaction\/} consists of an account receiving one
message, changing its state according to certain rules, and generating
several (maybe one or zero) new messages to other accounts. Each
message is generated and received (delivered) exactly once.

This means that messages play a fundamental role in the system,
comparable to that of accounts (smart contracts). From the perspective
of the Infinite Sharding Paradigm (cf.~\ptref{sp:ISP}), each account
resides in its separate ``account-chain'', and the only way it can
affect the state of some other account is by sending a message.

\nxsubpoint\label{sp:actors} \embt(Accounts as processes or actors;
Actor model.)  One might think about accounts (and smart contracts) as
``processes'', or ``actors'', that are able to process incoming
messages, change their internal state and generate some outbound
messages as a result. This is closely related to the so-called {\em
  Actor model}, used in languages such as Erlang (however, actors in
Erlang are usually called ``processes''). Since new actors (i.e.,
smart contracts) are also allowed to be created by existing actors as
a result of processing an inbound message, the correspondence with the
Actor model is essentially complete.

\nxsubpoint \embt(Message recipient.)  Any message has its {\em
  recipient}, characterized by the {\em target workchain identifier
  $w$} (assumed by default to be the same as that of the originating
shardchain), and the {\em recipient account $\accountid$}. The exact
format (i.e., number of bits) of $\accountid$ depends on $w$; however,
the shard is always determined by its first (most significant) 64
bits.

\nxsubpoint\label{sp:msg.sender} \embt(Message sender.)  In most
cases, a message has a {\em sender}, characterized again by a
$(w',\accountid')$ pair. If present, it is located after the message
recipient and message value. Sometimes, the sender is unimportant or
it is somebody outside the blockchain (i.e., not a smart contract),
in which case this field is absent.

Notice that the Actor model does not require the messages to have an
implicit sender. Instead, messages may contain a reference to the
Actor to which an answer to the request should be sent; usually it
coincides with the sender. However, it is useful to have an explicit
unforgeable sender field in a message in a cryptocurrency (Byzantine)
environment.

\nxsubpoint \embt(Message value.)  Another important characteristic of
a message is its attached {\em value}, in one or several
cryptocurrencies supported both by the source and by the target
workchain. The value of the message is indicated near its very
beginning, immediately after the message recipient; it is essentially
a list of $(\currencyid,\vr{value})$ pairs.

Notice that ``simple'' value transfers between ``simple'' accounts are
just empty (no-op) messages with some value attached to them. On the
other hand, a slightly more complicated message body might contain a
simple text or binary comment (e.g., about the purpose of the
payment).

\nxsubpoint\label{sp:ext.msg} \embt(External messages, or ``messages
from nowhere''.)  Some messages arrive into the system ``from
nowhere''---that is, they are not generated by an account (smart
contract or not) residing in the blockchain. The most typical example
arises when a user wants to transfer some funds from an account
controlled by her to some other account. In this case, the user sends
a ``message from nowhere'' to her own account, requesting it to
generate a message to the receiving account, carrying the specified
value. If this message is correctly signed, her account receives it
and generates the required outbound messages.

In fact, one might consider a ``simple'' account as a special case of
a smart contract with predefined code. This smart contract receives
only one type of message. Such an inbound message must contain a list
of outbound messages to be generated as a result of delivering
(processing) the inbound message, along with a signature. The smart
contract checks the signature, and, if it is correct, generates the
required messages.

Of course, there is a difference between ``messages from nowhere'' and
normal messages, because the ``messages from nowhere'' cannot bear
value, so they cannot pay for their ``gas'' (i.e., their processing)
themselves. Instead, they are tentatively executed with a small gas
limit before even being suggested for inclusion in a new shardchain
block; if the execution fails (the signature is incorrect), the
``message from nowhere'' is deemed incorrect and is discarded. If the
execution does not fail within the small gas limit, the message may be
included in a new shardchain block and processed completely, with the
payment for the gas (processing capacity) consumed exacted from the
receiver's account. ``Messages from nowhere'' can also define some
transaction fee, which is deducted from the receiver's account on top
of the gas payment for redistribution to the validators.

In this sense, ``messages from nowhere'' or ``external messages'' take
the role of transaction candidates used in other blockchain systems
(e.g., Bitcoin and Ethereum).

\nxsubpoint \embt(Log messages, or ``messages to nowhere''.)
Similarly, sometimes a special message can be generated and routed to
a specific shardchain not to be delivered to its recipient, but to be
logged in order to be easily observable by anybody receiving updates
about the shard in question. These logged messages may be output in a
user's console, or trigger an execution of some script on an off-chain
server. In this sense, they represent the external ``output'' of the
``blockchain supercomputer'', just as the ``messages from nowhere''
represent the external ``input'' of the ``blockchain supercomputer''.

\nxsubpoint \embt(Interaction with off-chain services and external
blockchains.)  These external input and output messages can be used
for interacting with off-chain services and other (external)
blockchains, such as Bitcoin or Ethe\-reum. One might create tokens or
cryptocurrencies inside the TON Block\-chain pegged to Bitcoins,
Ethers or any ERC-20 tokens defined in the Ethe\-reum blockchain, and
use ``messages from nowhere'' and ``messages to nowhere'', generated
and processed by scripts residing on some third-party off-chain
servers, to implement the necessary interaction between the TON
Blockchain and these external blockchains.

\nxsubpoint \embt(Message body.)  The {\em message body\/} is simply a
sequence of bytes, the meaning of which is determined only by the
receiving workchain and/or smart contract. For blockchains using TON
VM, this could be the serialization of any TVM cell, generated
automatically via the \texttt{Send()} operation. Such a serialization
is obtained simply by recursively replacing all references in a TON VM
cell with the cells referred to. Ultimately, a string of raw bytes
appears, which is usually prepended by a 4-byte ``message type'' or
``message constructor'', used to select the correct method of the
receiving smart contract.

Another option would be to use TL-serialized objects
(cf.~\ptref{sp:TL}) as message bodies. This might be especially useful
for communication between different workchains, one or both of which
are not necessarily using the TON VM.

\nxsubpoint \embt(Gas limit and other workchain/VM-specific
parameters.)  Sometimes a message needs to carry information about the
gas limit, the gas price, transaction fees and similar values that
depend on the receiving workchain and are relevant only for the
receiving workchain, but not necessarily for the originating
workchain. Such parameters are included in or before the message body,
sometimes (depending on the workchain) with special 4-byte prefixes
indicating their presence (which can be defined by a TL-scheme;
cf.~\ptref{sp:TL}).

\nxsubpoint \embt(Creating messages: smart contracts and
transactions.)  There are two sources of new messages. Most messages
are created during smart-contract execution (via the \texttt{Send()}
operation in TON VM), when some smart contract is invoked to process
an incoming message. Alternatively, messages may come from the outside
as ``external messages'' or ``messages from nowhere''
(cf.~\ptref{sp:ext.msg}).%
\footnote{The above needs to be literally true only for the basic
  workchain and its shardchains; other workchains may provide other
  ways of creating messages.}

\nxsubpoint \embt(Delivering messages.)  When a message reaches the
shardchain containing its destination account,\footnote{As a degenerate
  case, this shardchain may coincide with the originating
  shardchain---for example, if we are working inside a workchain which
  has not yet been split.} it is ``delivered'' to its destination
account. What
happens next depends on the workchain; from an outside perspective, it
is important that such a message can never be forwarded further from
this shardchain.

For shardchains of the basic workchain, delivery consists in adding
the message value (minus any gas payments) to the balance of the
receiving account, and possibly in invoking a message-dependent method
of the receiving smart contract afterwards, if the receiving account
is a smart contract. In fact, a smart contract has only one entry
point for processing all incoming messages, and it must distinguish
between different types of messages by looking at their first few
bytes (e.g., the first four bytes containing a TL constructor;
cf.~\ptref{sp:TL}).
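
The single entry point with dispatch on the first four bytes might look
as follows in a Python sketch. The big-endian interpretation of the
4-byte prefix, the fallback to a default handler for short or
unrecognized bodies, and all names are assumptions for illustration;
they are not part of the TON VM interface.

\begin{verbatim}
import struct


def recv_message(handlers: dict, default, body: bytes):
    """Single entry point: select a method by the 4-byte prefix.

    `handlers` maps a 32-bit constructor number to a function that
    takes the rest of the body; `default` receives the whole body
    (for example, an empty body of a simple value transfer).
    """
    if len(body) >= 4:
        # Reading the prefix as big-endian is an assumption.
        (tag,) = struct.unpack(">I", body[:4])
        if tag in handlers:
            return handlers[tag](body[4:])
    return default(body)
\end{verbatim}

A contract would register one handler per message constructor it
understands and treat everything else uniformly in the default branch.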

\nxsubpoint \embt(Delivery of a message is a transaction.)  Because
the delivery of a message changes the state of an account or smart
contract, it is a special {\em transaction\/} in the receiving
shardchain, and is explicitly registered as such. Essentially, {\em
  all\/} TON Blockchain transactions consist in the delivery of one
inbound message to its receiving account (smart contract), neglecting
some minor technical details.

\nxsubpoint \embt(Messages between instances of the same smart
contract.)  Recall that a smart contract may be {\em local\/} (i.e.,
residing in one shardchain as any ordinary account does) or {\em
  global\/} (i.e., having instances in all shards, or at least in all
shards up to some known depth $d$;
cf.~\ptref{sp:loc.glob.smct}). Instances of a global smart contract
may exchange special messages to transfer information and value
between each other if required. In this case, the (unforgeable) sender
$\accountid$ becomes important (cf.~\ptref{sp:msg.sender}).

\nxsubpoint \embt(Messages to any instance of a smart contract;
wildcard addresses.)  Sometimes a message (e.g., a client request)
needs to be delivered to any instance of a global smart contract, usually
the closest one (if there is one residing in the same shardchain as
the sender, it is the obvious candidate). One way of doing this is by
using a ``wildcard recipient address'', with the first $d$ bits of the
destination $\accountid$ allowed to take arbitrary values. In
practice, one will usually set these $d$ bits to the same values as in
the sender's $\accountid$.

\nxsubpoint \embt(Input queue is absent.)  All messages received by a
blockchain (usually a shardchain; sometimes the masterchain)---or,
essentially, by an ``account-chain'' residing inside some
shardchain---are immediately delivered (i.e., processed by the
receiving account). Therefore, there is no ``input queue'' as
such. Instead, if not all messages destined for a specific shardchain
can be processed because of limitations on the total size of blocks
and gas usage, some messages are simply left to accumulate in the
output queues of the originating shardchains.

\nxsubpoint\label{sp:out.queue} \embt(Output queues.)  From the
perspective of the Infinite Sharding Paradigm (cf.~\ptref{sp:ISP}),
each account-chain (i.e., each account) has its own output queue,
consisting of all messages it has generated, but not yet delivered to
their recipients. Of course, account-chains have only a virtual
existence; they are grouped into shardchains, and a shardchain has an
output ``queue'', consisting of the union of the output queues of all
accounts belonging to the shardchain.

This shardchain output ``queue'' imposes only a partial order on its
member messages. Namely, a message generated in a preceding block must
be delivered before any message generated in a subsequent block, and
any messages generated by the same account and having the same
destination must be delivered in the order of their generation.
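
The two ordering constraints can be expressed as a small Python check.
The \texttt{Msg} record, with its per-account generation counter
\texttt{seq}, is a hypothetical representation introduced only for this
sketch.

\begin{verbatim}
from collections import namedtuple

# block: number of the generating block; src/dst: accounts;
# seq: generation order among the messages of the same account.
Msg = namedtuple("Msg", "block src dst seq")


def respects_queue_order(delivery: list) -> bool:
    """Check a proposed delivery order against the partial order."""
    for i, later in enumerate(delivery):
        for earlier in delivery[:i]:
            # Messages of a preceding block must come first.
            if earlier.block > later.block:
                return False
            # Same sender and destination: keep generation order.
            same_pair = (earlier.src, earlier.dst) == (later.src, later.dst)
            if same_pair and earlier.seq > later.seq:
                return False
    return True
\end{verbatim}

Messages of the same block with different senders or destinations are
left unordered by these rules, so several delivery orders are valid.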

\nxsubpoint\label{sp:intershard.msgs} \embt(Reliable and fast
inter-chain messaging.)  It is of paramount importance for a scalable
multi-blockchain project such as TON to be able to forward and deliver
messages between different shardchains (cf.~\ptref{sp:msg.IHR}), even
if there are millions of them in the system. The messages should be
delivered {\em reliably\/} (i.e., messages should not be lost or
delivered more than once) and {\em quickly}. The TON Blockchain
achieves this goal by using a combination of two ``message routing''
mechanisms.

\nxsubpoint\label{sp:hypercube} \embt(Hypercube routing: ``slow path''
for messages with assured delivery.)  The TON Blockchain uses
``hypercube routing'' as a slow, but safe and reliable way of
delivering messages from one shardchain to another, using several
intermediate shardchains for transit if necessary. If messages were
instead delivered directly,
the validators of any given shardchain would need to keep track of the
state of (the output queues of) all other shardchains, which would
require prohibitive amounts of computing power and network bandwidth
as the total quantity of shardchains grows, thus limiting the
scalability of the system. Therefore, it is not possible to deliver
messages directly from any shard to every other. Instead, each shard
is ``connected'' only to shards differing in exactly one hexadecimal
digit of their $(w,s)$ shard identifiers
(cf.~\ptref{sp:shard.ident}). In this way, all shardchains constitute
a ``hypercube'' graph, and messages travel along the edges of this
hypercube.

If a message is sent to a shard different from the current one, one of
the hexadecimal digits (chosen deterministically) of the current shard
identifier is replaced by the corresponding digit of the target shard,
and the resulting identifier is used as the proximate target to
forward the message to.\footnote{This is not necessarily the final
  version of the algorithm used to compute the next hop for hypercube
  routing. In particular, hexadecimal digits may be replaced by
  $r$-bit groups, with $r$ a configurable parameter, not necessarily
  equal to four.}
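
The next-hop rule can be sketched in Python as follows. The sketch
assumes that shard identifiers are written as hexadecimal strings of
equal length and that the digit to replace is always the leftmost
differing one; the text only requires the choice to be deterministic, so
this particular choice, like the function names, is an assumption.

\begin{verbatim}
def next_hop(current: str, target: str) -> str:
    """Replace the leftmost differing hex digit of `current`."""
    if len(current) != len(target):
        raise ValueError("identifiers must have equal length")
    for i, (c, t) in enumerate(zip(current, target)):
        if c != t:
            return current[:i] + t + current[i + 1:]
    return current  # already in the target shard


def route(source: str, target: str) -> list:
    """List every shard visited on the way from source to target."""
    path = [source]
    while path[-1] != target:
        path.append(next_hop(path[-1], target))
    return path
\end{verbatim}

With these assumptions, a message from shard \texttt{a3f} to shard
\texttt{b3c} visits \texttt{a3f}, \texttt{b3f} and \texttt{b3c}: the
identifiers differ in two digits, so there is one intermediate shard.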

The main advantage of hypercube routing is that the block validity
conditions imply that validators creating blocks of a shardchain must
collect and process messages from the output queues of ``neighboring''
shardchains, on pain of losing their stakes. In this way, any message
can be expected to reach its final destination sooner or later; a
message cannot be lost in transit or delivered twice.

Notice that hypercube routing introduces some additional delays and
expenses, because of the necessity to forward messages through several
intermediate shardchains. However, the number of these intermediate
shardchains grows very slowly, as the logarithm $\log N$ (more
precisely, $\lceil\log_{16}N\rceil-1$) of the total number of
shardchains $N$. For example, if $N\approx250$, there will be at most
one intermediate hop; and for $N\approx4000$ shardchains, at most
two. With four intermediate hops, we can support up to one million
shardchains. We think this is a very small price to pay for the
essentially unlimited scalability of the system. In fact, it is not
necessary to pay even this price:

\nxsubpoint\label{sp:instant.hypercube} \embt(Instant Hypercube
Routing: ``fast path'' for messages.)  A novel feature of the TON
Blockchain is that it introduces a ``fast path'' for forwarding
messages from one shardchain to any other, making it possible in most
cases to bypass the ``slow'' hypercube routing of \ptref{sp:hypercube}
altogether and deliver the message into the very next block of the
final destination shardchain.

The idea is as follows. During the ``slow'' hypercube routing, the
message travels (in the network) along the edges of the hypercube, but
it is delayed (for approximately five seconds) at each intermediate
vertex to be committed into the corresponding shardchain before
continuing its voyage.

To avoid unnecessary delays, one might instead relay the message along
with a suitable Merkle proof along the edges of the hypercube, without
waiting to commit it into the intermediate shardchains. In fact, the
network message should be forwarded from the validators of the ``task
group'' (cf.~\ptref{sp:val.task.grp}) of the original shard to the
designated block producer (cf.~\ptref{sp:rot.gen.prio}) of the ``task
group'' of the destination shard; this might be done directly without
going along the edges of the hypercube. When this message with the
Merkle proof reaches the validators (more precisely, the collators;
cf.~\ptref{sp:collators}) of the destination shardchain, they can
commit it into a new block immediately, without waiting for the
message to complete its travel along the ``slow path''. Then a
confirmation of delivery along with a suitable Merkle proof is sent
back along the hypercube edges, and it may be used to stop the travel
of the message along the ``slow path'', by committing a special
transaction.

Note that this ``instant delivery'' mechanism does not replace the
``slow'' but failproof mechanism described
in~\ptref{sp:hypercube}. The ``slow path'' is still needed because the
validators cannot be punished for losing or simply deciding not to
commit the ``fast path'' messages into new blocks of their
blockchains.\footnote{However, the validators have some incentive to do
  so as soon as possible, because they will be able to collect all
  forwarding fees associated with the message that have not yet been
  consumed along the slow path.}

Therefore, both message forwarding methods are run in parallel, and
the ``slow'' mechanism is aborted only if a proof of success of the
``fast'' mechanism is committed into an intermediate shardchain.%
\footnote{In fact, one might temporarily or permanently disable the
  ``instant delivery'' mechanism altogether, and the system would
  continue working, albeit more slowly.}

\nxsubpoint\label{sp:collect.input.msg} \embt(Collecting input
messages from output queues of neighboring shardchains.)  When a new
block for a shardchain is proposed, some of the output messages of the
neighboring (in the sense of the routing hypercube of
\ptref{sp:hypercube}) shardchains are included in the new block as
``input'' messages and immediately delivered (i.e., processed). There
are certain rules as to the order in which these neighbors' output
messages must be processed. Essentially, an ``older'' message (coming
from a shardchain block referring to an older masterchain block) must
be delivered before any ``newer'' message; and for messages coming
from the same neighboring shardchain, the partial order of the output
queue described in \ptref{sp:out.queue} must be observed.
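
As a purely illustrative formalization of the first rule (the notation
$\mu(m)$ is introduced only for this sketch and is not used elsewhere),
write $\mu(m)$ for the sequence number of the masterchain block
referred to by the shardchain block that generated the output
message~$m$. Then, for any two messages $m$ and $m'$ imported from
neighboring output queues,
\begin{equation*}
  \mu(m)<\mu(m') \quad\Longrightarrow\quad \text{$m$ is delivered before $m'$},
\end{equation*}
while messages with $\mu(m)=\mu(m')$ coming from the same neighbor are
ordered by the partial order of that neighbor's output queue. The
relative order of messages with equal $\mu$ coming from different
neighbors is not fixed by the rules stated here.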

\nxsubpoint\label{sp:out.q.del} \embt(Deleting messages from output
queues.)  Once an output queue message is observed as having been
delivered by a neighboring shardchain, it is explicitly deleted from
the output queue by a special transaction.

\nxsubpoint\label{sp:deliver.q} \embt(Preventing double delivery of
messages.)  To prevent double delivery of messages taken from the
output queues of the neighboring shardchains, each shardchain (more
precisely, each account-chain inside it) keeps the collection of
recently delivered messages (or just their hashes) as part of its
state. When a delivered message is observed to be deleted from the
output queue by its originating neighboring shardchain
(cf.~\ptref{sp:out.q.del}), it is deleted from the collection of
recently delivered messages as well.

\nxsubpoint \embt(Forwarding messages intended for other shardchains.)
Hypercube routing (cf.~\ptref{sp:hypercube}) means that sometimes
outbound messages are delivered not to the shardchain containing the
intended recipient, but to a neighboring shardchain lying on the
hypercube path to the destination. In this case, ``delivery'' consists
in moving the inbound message to the outbound queue. This is reflected
explicitly in the block as a special {\em forwarding transaction},
containing the message itself. Essentially, this looks as if the
message had been received by somebody inside the shardchain, and one
identical message had been generated as a result.

\nxsubpoint \embt(Payment for forwarding and keeping a message.)  The
forwarding transaction actually spends some gas (depending on the size
of the message being forwarded), so a gas payment is deducted from the
value of the message being forwarded on behalf of the validators of
this shardchain. This forwarding payment is normally considerably
smaller than the gas payment exacted when the message is finally
delivered to its recipient, even if the message has been forwarded
several times because of hypercube routing. Furthermore, as long as a
message is kept in the output queue of some shardchain, it is part of
the shardchain's global state, so a payment for keeping global data
for a long time may also be collected by special transactions.

\nxsubpoint \embt(Messages to and from the masterchain.)  Messages can
be sent directly from any shardchain to the masterchain, and vice
versa. However, gas prices for sending messages to and for processing
messages in the masterchain are quite high, so this ability will be
used only when truly necessary---for example, by the validators to
deposit their stakes. In some cases, a minimal deposit (attached
value) for messages sent to the masterchain may be defined, which is
returned only if the message is deemed ``valid'' by the receiving
party.

Messages cannot be automatically routed through the masterchain. A
message whose destination has $\workchainid\neq-1$ ($-1$ being the
special $\workchainid$ indicating the masterchain) cannot be delivered
to the masterchain.

In principle, one can create a message-forwarding smart contract
inside the masterchain, but the price of using it would be
prohibitive.

\nxsubpoint \embt(Messages between accounts in the same shardchain.)
In some cases, a message is generated by an account belonging to some
shardchain, destined to another account in the same shardchain. For
example, this happens in a new workchain which has not yet split into
several shardchains because the load is manageable.

Such messages might be accumulated in the output queue of the
shardchain and then processed as incoming messages in subsequent
blocks (any shard is considered a neighbor of itself for this
purpose). However, in most cases it is possible to deliver these
messages within the originating block itself.

In order to achieve this, a partial order is imposed on all
transactions included in a shardchain block, and the transactions
(each consisting in the delivery of a message to some account) are
processed respecting this partial order. In particular, a transaction
is allowed to process some output message of a preceding transaction
with respect to this partial order.

In this case, the message body is not copied twice. Instead, the
originating and the processing transactions refer to a shared copy of
the message.

\mysubsection{Global Shardchain State. ``Bag of Cells'' Philosophy.}

Now we are ready to describe the global state of a TON blockchain, or
at least of a shardchain of the basic workchain.

We start with a ``high-level'' or ``logical'' description, which
consists in saying that the global state is a value of algebraic type
$\tp{ShardchainState}$.

\nxsubpoint\label{sp:shard.state} \embt(Shardchain state as a
collection of account-chain states.)  According to the Infinite
Sharding Paradigm (cf.~\ptref{sp:ISP}), any shardchain is just a
(temporary) collection of virtual ``account-chains'', containing
exactly one account each. This means that, essentially, the global
shardchain state must be a hashmap
\begin{equation}\label{eq:simple.shard.st}
  \tp{ShardchainState}:=(\Account\dashrightarrow\tp{AccountState})
\end{equation}
where all $\accountid$ appearing as indices of this hashmap must begin
with prefix $s$, if we are discussing the state of shard $(w,s)$
(cf.~\ptref{sp:shard.ident}).

In practice, we might want to split $\tp{AccountState}$ into several
parts (e.g., keep the account output message queue separate to
simplify its examination by the neighboring shardchains), and have
several hashmaps $(\Account\dashrightarrow\tp{AccountStatePart}_i)$
inside the $\tp{ShardchainState}$. We might also add a small number of
``global'' or ``integral'' parameters to the $\tp{ShardchainState}$
(e.g., the total balance of all accounts belonging to this
shard, or the total number of messages in all output queues).

However, \eqref{eq:simple.shard.st} is a good first approximation of
what the shardchain global state looks like, at least from a
``logical'' (``high-level'') perspective. The formal description of
algebraic types $\tp{AccountState}$ and $\tp{ShardchainState}$ can be
done with the aid of a TL-scheme (cf.~\ptref{sp:TL}), to be provided
elsewhere.

\nxsubpoint\label{sp:split.merge.state} \embt(Splitting and merging
shardchain states.)  Notice that the Infinite Sharding Paradigm
description of the shardchain state \eqref{eq:simple.shard.st} shows
how this state should be processed when shards are split or merged. In
fact, these state transformations turn out to be very simple
operations with hashmaps.
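
A minimal sketch of these operations, under the simplified
model~\eqref{eq:simple.shard.st} and ignoring any additional
``integral'' parameters, is as follows. The notation $S_{(w,s)}$ for
the state of shard $(w,s)$ is used only here, and $s.i$ denotes the
bit string $s$ with the bit $i$ appended. Splitting $(w,s)$ into
$(w,s.0)$ and $(w,s.1)$ amounts to restricting the hashmap to the keys
beginning with the longer prefix,
\begin{equation*}
  S_{(w,s.i)}=\bigl\{(a\mapsto x)\in S_{(w,s)} : \text{$a$ begins with $s.i$}\bigr\},
  \quad i\in\{0,1\},
\end{equation*}
and merging is the disjoint union
$S_{(w,s)}=S_{(w,s.0)}\sqcup S_{(w,s.1)}$. If the hashmap is stored as
a binary prefix tree, which is an assumption of this sketch, the two
halves are simply the two subtrees below the node corresponding to the
prefix~$s$.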

\nxsubpoint \embt(Account-chain state.)  The (virtual) account-chain
state is just the state of one account, described by type
$\tp{AccountState}$. Usually it has all or some of the fields listed
in~\ptref{sp:account.state}, depending on the specific constructor
used.

\nxsubpoint \embt(Global workchain state.)  Similarly to
\eqref{eq:simple.shard.st}, we may define the global {\em workchain\/}
state by the same formula, but with $\accountid$'s allowed to take any
values, not just those belonging to one shard. Remarks similar to
those made in~\ptref{sp:shard.state} apply in this case as well: we
might want to split this hashmap into several hashmaps, and we might
want to add some ``integral'' parameters such as the total balance.

Essentially, the global workchain state {\em must\/} be given by the
same type $\tp{ShardchainState}$ as the shardchain state, because it
is the shardchain state we would obtain if all existing shardchains of
this workchain suddenly merged into one.

\nxsubpoint\label{sp:bag.of.cells} \embt(Low-level perspective: ``bag
of cells''.)  There is a ``low-level'' description of the
account-chain or shardchain state as well, complementary to the
``high-level'' description given above. This description is quite
important, because it turns out to be pretty universal, providing a
common basis for representing, storing, serializing and transferring
by network almost all data used by the TON Blockchain (blocks,
shardchain states, smart-contract storage, Merkle proofs, etc.). At
the same time, such a universal ``low-level'' description, once
understood and implemented, allows us to concentrate our attention on
the ``high-level'' considerations only.

Recall that the TVM represents values of arbitrary algebraic types
(including, for instance, $\tp{ShardchainState}$
of~\eqref{eq:simple.shard.st}) by means of a tree of {\em TVM cells},
or {\em cells\/} for short (cf.~\ptref{sp:tvm.cells}
and~\ptref{sp:TL}).

Any such cell consists of two {\em descriptor bytes}, defining certain
flags and values $0\leq b\leq 128$, the quantity of raw bytes, and
$0\leq c\leq 4$, the quantity of references to other cells. Then $b$
raw bytes and $c$ cell references follow.\footnote{One can show that, if Merkle proofs for all data stored in a tree of cells are needed equally often, one should use cells with $b+ch\approx 2(h+r)$ to minimize average Merkle proof size, where $h=32$ is the hash size in bytes, and $r\approx4$ is the ``byte size'' of a cell reference. In other words, a cell should contain either two references and a few raw bytes, or one reference and about 36 raw bytes, or no references at all with 72 raw bytes.}

The exact format of cell references depends on the implementation and
on whether the cell is located in RAM, on disk, in a network packet,
in a block, and so on. A useful abstract model consists in imagining
that all cells are kept in content-addressable memory, with the
address of a cell equal to its ($\Sha$) hash. Recall that the (Merkle)
hash of a cell is computed exactly by replacing the references to its
child cells by their (recursively computed) hashes and hashing the
resulting byte string.

In this way, if we use cell hashes to reference cells (e.g., inside
descriptions of other cells), the system simplifies somewhat, and the
hash of a cell starts to coincide with the hash of the byte string
representing it.
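
In formulas, a sketch of this recursive definition might read as
follows. The names $d_1,d_2$ for the descriptor bytes, $x_1,\ldots,x_b$
for the raw bytes and $C_1,\ldots,C_c$ for the referenced cells are
introduced only for this illustration, and the exact byte layout is an
assumption:
\begin{equation*}
  \Hash(C):=\Sha\bigl(d_1\,d_2\,x_1\ldots x_b\,\Hash(C_1)\ldots\Hash(C_c)\bigr).
\end{equation*}
For a cell without references ($c=0$), this is just the $\Sha$ of its
$2+b$ descriptor and data bytes. A cell with $c$ references is hashed
as a string of $2+b+32c$ bytes, assuming 32-byte hashes.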

Now we see that {\em any object representable by TVM, the global
  shardchain state included, can be represented as a ``bag of
  cells''}---i.e., {\em a collection of cells along with a ``root''
  reference to one of them\/} (e.g., by hash). Notice that duplicate
cells are removed from this description (the ``bag of cells'' is a set
of cells, not a multiset of cells), so the abstract tree
representation might actually become a directed acyclic graph (dag)
representation.

One might even keep this state on disk in a $B$- or $B+$-tree,
containing all cells in question (maybe with some additional data,
such as subtree height or reference counter), indexed by cell
hash. However, a naive implementation of this idea would result in the
state of one smart contract being scattered among distant parts of the
disk file, something we would rather avoid.%
\footnote{A better implementation would be to keep the state of the
  smart contract as a serialized string, if it is small, or in a
  separate $B$-tree, if it is large; then the top-level structure
  representing the state of a blockchain would be a $B$-tree, whose
  leaves are allowed to contain references to other $B$-trees.}

Now we are going to explain in some detail how almost all objects used
by the TON Blockchain can be represented as ``bags of cells'', thus
demonstrating the universality of this approach.

\nxsubpoint \embt(Shardchain block as a ``bag of cells''.)  A
shardchain block itself can also be described by an algebraic type,
and stored as a ``bag of cells''. Then a naive binary representation
of the block may be obtained simply by concatenating the byte strings
representing each of the cells in the ``bag of cells'', in arbitrary
order. This representation might be improved and optimized, for
instance, by providing a list of offsets of all cells at the beginning
of the block, and replacing hash references to other cells with 32-bit
indices in this list whenever possible. However, one should imagine
that a block is essentially a ``bag of cells'', and all other
technical details are just minor optimization and implementation
issues.

\nxsubpoint\label{sp:obj.update} \embt(Update to an object as a ``bag
of cells''.)  Imagine that we have an old version of some object
represented as a ``bag of cells'', and that we want to represent a new
version of the same object, supposedly not too different from the
previous one. One might simply represent the new state as another
``bag of cells'' with its own root, {\em and remove from it all cells
  occurring in the old version}. The remaining ``bag of cells'' is
essentially an {\em update\/} to the object. Everybody who has the old
version of this object and the update can compute the new version,
simply by uniting the two bags of cells, and removing the old root
(decreasing its reference counter and de-allocating the cell if the
reference counter becomes zero).
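
A toy example, with cell names chosen only for illustration: let the
old version be the bag $B=\{r,a,b,c\}$ with root $r$, where $r$ refers
to $a$ and $b$, and $b$ refers to $c$. Suppose the new version differs
only in the data stored in $b$. Then $b$ is replaced by a new cell
$b'$ (still referring to $c$), and, because the hash of $b'$ differs
from that of $b$, the root is replaced by a new cell $r'$ referring to
$a$ and $b'$. The new bag is $B'=\{r',a,b',c\}$, and the update is
\begin{equation*}
  U:=B'\setminus B=\{r',b'\}.
\end{equation*}
A holder of $B$ recovers $B'$ as the set of cells of $B\cup U$
reachable from the new root $r'$. The cells $r$ and $b$ are no longer
reachable and may be de-allocated, unless some other object still
refers to them.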

\nxsubpoint \embt(Updates to the state of an account.)  In particular,
updates to the state of an account, or to the global state of a
shardchain, or to any hashmap can be represented using the idea
described in~\ptref{sp:obj.update}. This means that when we receive a
new shardchain block (which is a ``bag of cells''), we interpret this
``bag of cells'' not just by itself, but by uniting it first with the
``bag of cells'' representing the previous state of the shardchain. In
this sense each block may ``contain'' the whole state of the
blockchain.

\nxsubpoint \embt(Updates to a block.)  Recall that a block itself is
a ``bag of cells'', so, if it becomes necessary to edit a block, one
can similarly define a ``block update'' as a ``bag of cells'',
interpreted in the presence of the ``bag of cells'' which is the
previous version of this block. This is roughly the idea behind the
``vertical blocks'' discussed in~\ptref{sp:inv.sh.blk.corr}.

\nxsubpoint\label{sp:merkle.as.BoC} \embt(Merkle proof as a ``bag of
cells''.)  Notice that a (generalized) Merkle proof---for example, one
asserting that $x[i]=y$ starting from a known value of $\Hash(x)=h$
(cf.~\ptref{sp:merkle.proof} and~\ptref{sp:gen.merkle.proof})---may
also be represented as a ``bag of cells''. Namely, one simply needs to
provide a subset of cells corresponding to a path from the root of
$x:\Hashmap(n,X)$ to its desired leaf with index $i:\st2^n$ and value
$y:X$. References to children of these cells not lying on this path
will be left ``unresolved'' in this proof, represented by cell
hashes. One can also provide a simultaneous Merkle proof of, say,
$x[i]=y$ and $x[i']=y'$, by including in the ``bag of cells'' the
cells lying on the union of the two paths from the root of $x$ to
leaves corresponding to indices $i$ and~$i'$.
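
The verification performed by the receiver of such a proof can be
sketched as follows (the names $C_0,\ldots,C_k$ are introduced only
for this illustration). Let $C_0,C_1,\ldots,C_k$ be the cells on the
path from the root of $x$ to the leaf with index~$i$. The receiver
recomputes the hashes bottom-up and accepts if
\begin{equation*}
  \Hash(C_0)=h \quad\text{and}\quad
  \text{$C_{j-1}$ contains the reference $\Hash(C_j)$ for $1\leq j\leq k$},
\end{equation*}
and if the leaf $C_k$ stores the value~$y$ under the index~$i$. Since
$\Hash(C_{j-1})$ depends on $\Hash(C_j)$, replacing any cell on the
path changes the root hash, unless a collision for $\Sha$ is
found. If each cell on the path consumes at least one bit of the
$n$-bit key, which is an assumption of this sketch, then $k\leq n$.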

\nxsubpoint\label{sp:merkle.query.resp} \embt(Merkle proofs as query
responses from full nodes.)  In essence, a full node with a complete
copy of a shardchain (or account-chain) state can provide a Merkle
proof when requested by a light node (e.g., a network node running a
light version of the TON Blockchain client), enabling the receiver to
perform some simple queries without external help, using only the
cells provided in this Merkle proof. The light node can send its
queries in a serialized format to the full node, and receive the
correct answers with Merkle proofs---or just the Merkle proofs,
because the requester should be able to compute the answers using only
the cells included in the Merkle proof. This Merkle proof would
consist simply of a ``bag of cells'', containing only those cells
belonging to the shardchain's state that have been accessed by the
full node while executing the light node's query. This approach can be
used in particular for executing ``get queries'' of smart contracts
(cf.~\ptref{sp:tent.exec.get}).

\nxsubpoint\label{sp:aug.upd} \embt(Augmented update, or state update
with Merkle proof of validity.)  Recall (cf.~\ptref{sp:obj.update})
that we can describe the changes in an object state from an old value
$x:X$ to a new value $x':X$ by means of an ``update'', which is simply
a ``bag of cells'', containing those cells that lie in the subtree
representing new value $x'$, but not in the subtree representing old
value $x$, because the receiver is assumed to have a copy of the old
value $x$ and all its cells.

However, if the receiver does not have a full copy of $x$, but knows
only its (Merkle) hash $h=\Hash(x)$, it will not be able to check the
validity of the update (i.e., that all ``dangling'' cell references in
the update do refer to cells present in the tree of $x$). One would
like to have ``verifiable'' updates, augmented by Merkle proofs of
existence of all referred cells in the old state. Then anybody knowing
only $h=\Hash(x)$ would be able to check the validity of the update
and compute the new $h'=\Hash(x')$ by itself.

Because our Merkle proofs are ``bags of cells'' themselves
(cf.~\ptref{sp:merkle.as.BoC}), one can construct such an {\em
  augmented update\/} as a ``bag of cells'', containing the old root
of $x$, some of its descendants along with paths from the root of $x$
to them, and the new root of $x'$ and all its descendants that are not
part of $x$.

\nxsubpoint \embt(Account state updates in a shardchain block.)  In
particular, account state updates in a shardchain block should be
augmented as discussed in~\ptref{sp:aug.upd}. Otherwise, somebody
might commit a block containing an invalid state update, referring to
a cell absent in the old state; proving the invalidity of such a block
would be problematic (how is the challenger to prove that a cell is
{\em not\/} part of the previous state?).

Now, if all state updates included in a block are augmented, their
validity is easily checked, and their invalidity is also easily shown
as a violation of the recursive defining property of (generalized)
Merkle hashes.

\nxsubpoint\label{sp:everything.is.BoC} \embt(``Everything is a bag of
cells'' philosophy.)  The previous considerations show that everything
we need to store or transfer, either in the TON Block\-chain or in the
network, is representable as a ``bag of cells''. This is an important
part of the TON Blockchain design philosophy. Once the ``bag of
cells'' approach is explained and some ``low-level'' serializations of
``bags of cells'' are defined, one can simply define everything (block
format, shardchain and account state, etc.) on the high level of
abstract (dependent) algebraic data types.

The unifying effect of the ``everything is a bag of cells'' philosophy
considerably simplifies the implementation of seemingly unrelated
services; cf.~\ptref{sp:ton.smart.pc.supp} for an example involving
payment channels.

\nxsubpoint \embt(Block ``headers'' for TON blockchains.)  Usually, a
block in a block\-chain begins with a small header, containing the
hash of the previous block, its creation time, the Merkle hash of the
tree of all transactions contained in the block, and so on. Then the
block hash is defined to be the hash of this small block
header. Because the block header ultimately depends on all data
included in the block, one cannot alter the block without changing its
hash.

In the ``bag of cells'' approach used by the blocks of TON
blockchains, there is no designated block header. Instead, the block
hash is defined as the (Merkle) hash of the root cell of the
block. Therefore, the top (root) cell of the block might be considered
a small ``header'' of this block.

However, the root cell might not contain all the data usually expected
from such a header. Essentially, one wants the header to contain some
of the fields defined in the $\Block$ datatype. Normally, these fields
will be contained in several cells, including the root. These are the
cells that together constitute a ``Merkle proof'' for the values of
the fields in question. One might insist that a block contain these
``header cells'' in the very beginning, before any other cells. Then
one would need to download only the first several bytes of a block
serialization in order to obtain all of the ``header cells'', and to
learn all of the expected fields.

\mysubsection{Creating and Validating New Blocks}\label{sect:validators}

The TON Blockchain ultimately consists of shardchain and masterchain
blocks. These blocks must be created, validated and propagated through
the network to all parties concerned, in order for the system to
function smoothly and correctly.

\nxsubpoint\label{sp:validators} \embt(Validators.)  New blocks are
created and validated by special designated nodes, called {\em
  validators}. Essentially, any node wishing to become a validator may
become one, provided it can deposit a sufficiently large stake (in TON
coins, i.e., Grams; cf.\ Appendix~\ptref{app:coins}) into the
masterchain. Validators obtain some ``rewards'' for good work, namely,
the transaction, storage and gas fees from all transactions (messages)
committed into newly generated blocks, and some newly minted coins,
reflecting the ``gratitude'' of the whole community to the validators
for keeping the TON Blockchain working. This income is distributed
among all participating validators proportionally to their stakes.

However, being a validator is a high responsibility. If a validator
signs an invalid block, it can be punished by losing part or all of
its stake, and by being temporarily or permanently excluded from the
set of validators. If a validator does not participate in creating a
block, it does not receive its share of the reward associated with
that block. If a validator abstains from creating new blocks for a
long time, it may lose part of its stake and be suspended or
permanently excluded from the set of validators.

All this means that the validator does not get its money ``for
nothing''. Indeed, it must keep track of the states of all or some
shardchains (each validator is responsible for validating and creating
new blocks in a certain subset of shardchains), perform all
computations requested by smart contracts in these shardchains,
receive updates about other shardchains and so on. This activity
requires considerable disk space, computing power and network
bandwidth.

\nxsubpoint \embt(Validators instead of miners.)  Recall that the TON
Blockchain uses the Proof-of-Stake approach, instead of the
Proof-of-Work approach adopted by Bitcoin, the current version of
Ethereum, and most other cryptocurrencies. This means that one cannot
``mine'' a new block by presenting some proof-of-work (computing a lot
of otherwise useless hashes) and obtain some new coins as a
result. Instead, one must become a validator and spend one's computing
resources to store and process TON Blockchain requests and data. In
short, {\em one must be a validator to mine new coins.} In this
respect, {\em validators are the new miners.}

However, there are some other ways to earn coins apart from being a
validator.

\nxsubpoint\label{sp:nominators} \embt(Nominators and ``mining
pools''.)  To become a validator, one would normally need to buy and
install several high-performance servers and acquire a good Internet
connection for them. This is not as expensive as the ASIC equipment
currently required to mine Bitcoins. However, one definitely cannot
mine new TON coins on a home computer, let alone a smartphone.

In the Bitcoin, Ethereum and other Proof-of-Work cryptocurrency mining
communities there is a notion of {\em mining pools}, where a lot of
nodes, having insufficient computing power to mine new blocks by
themselves, combine their efforts and share the reward afterwards.

A corresponding notion in the Proof-of-Stake world is that of a {\em
  nominator}. Essentially, this is a node lending its money to help a
validator increase its stake; the validator then distributes the
corresponding share of its reward (or some previously agreed fraction
of it---say, 50\%) to the nominator.

In this way, a nominator can also take part in the ``mining'' and
obtain some reward proportional to the amount of money it is willing
to deposit for this purpose. It receives only a fraction of the
corresponding share of the validator's reward, because it provides
only the ``capital'', but does not need to buy computing power,
storage and network bandwidth.
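
For instance, with purely hypothetical figures: suppose a validator's
stake of $1000$ Grams consists of $600$ Grams of its own and $400$
Grams lent by one nominator, and that the validator earns a reward of
$100$ Grams for the period. The share of the reward corresponding to
the nominator's capital is $100\cdot 400/1000=40$ Grams. If the agreed
fraction is $50\%$, the nominator receives $20$ Grams, and the
validator keeps the remaining $80$ Grams.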

However, if the validator loses its stake because of invalid behavior,
the nominator loses its share of the stake as well. In this sense the
nominator {\em shares the risk}. It must choose its nominated
validator wisely, otherwise it can lose money. In this sense,
nominators make a weighted decision and ``vote'' for certain
validators with their funds.

On the other hand, this nominating or lending system enables one to
become a validator without investing a large amount of money into
Grams (TON coins) first. In other words, it prevents those keeping
large amounts of Grams from monopolizing the supply of validators.

\nxsubpoint\label{sp:fish} \embt(Fishermen: obtaining money by
pointing out others' mistakes.)  Another way to obtain some rewards
without being a validator is by becoming a {\em
  fisherman}. Essentially, any node can become a fisherman by making a
small deposit in the masterchain. Then it can use special masterchain
transactions to publish (Merkle) invalidity proofs of some (usually
shardchain) blocks previously signed and published by validators. If
other validators agree with this invalidity proof, the offending
validators are punished (by losing part of their stake), and the
fisherman obtains some reward (a fraction of coins confiscated from
the offending validators). Afterwards, the invalid (shardchain) block
must be corrected as outlined
in~\ptref{sp:inv.sh.blk.corr}. Correcting invalid masterchain blocks
may involve creating ``vertical'' blocks on top of previously
committed masterchain blocks (cf.~\ptref{sp:inv.sh.blk.corr}); there
is no need to create a fork of the masterchain.

Normally, a fisherman would need to become a full node for at least
some shardchains, and spend some computing resources by running the
code of at least some smart contracts. While a fisherman does not need
to have as much computing power as a validator, we think that a
natural candidate to become a fisherman is a would-be validator that
is ready to process new blocks, but has not yet been elected as a
validator (e.g., because of a failure to deposit a sufficiently large
stake).

\nxsubpoint\label{sp:collators} \embt(Collators: obtaining money by
suggesting new blocks to validators.)  Yet another way to obtain some
rewards without being a validator is by becoming a {\em
  collator}. This is a node that prepares and suggests to a validator
new shardchain block candidates, complemented (collated) with data
taken from the state of this shardchain and from other (usually
neighboring) shardchains, along with suitable Merkle proofs. (This is
necessary, for example, when some messages need to be forwarded from
neighboring shardchains.) Then a validator can easily check the
proposed block candidate for validity, without having to download the
complete state of this or other shardchains.

Because a validator needs to submit new (collated) block candidates to
obtain some (``mining'') rewards, it makes sense to pay some part of
the reward to a collator willing to provide suitable block
candidates. In this way, a validator may free itself from the necessity
of watching the state of the neighboring shardchains, by outsourcing
it to a collator.

However, we expect that during the system's initial deployment phase
there will be no separate designated collators, because all validators
will be able to act as collators for themselves.

\nxsubpoint \embt(Collators or validators: obtaining money for
including user transactions.)  Users can open micropayment channels to
some collators or validators and pay small amounts of coins in
exchange for the inclusion of their transactions in the shardchain.

\nxsubpoint\label{sp:global.valid} \embt(Global validator set
election.)  The ``global'' set of validators is elected once each
month (actually, every $2^{19}$ masterchain blocks). This set is
determined and universally known one month in advance.
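
As a rough consistency check, assume (this figure is not stated in
this section) that a masterchain block is generated about every five
seconds. Then
\begin{equation*}
  2^{19}\cdot 5\,\text{s}=2\,621\,440\,\text{s}\approx 30.3\ \text{days},
\end{equation*}
which is indeed about one month.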

In order to become a validator, a node must transfer some TON coins
(Grams) into the masterchain, and then send them to a special smart
contract as its suggested stake $s$. Another parameter, sent along
with the stake, is $l\geq 1$, the maximum validating load this node is
willing to accept relative to the minimal possible. There is also a
global upper bound (another configurable parameter) $L$ on $l$, equal
to, say, 10.

2219
Then the global set of validators is elected by this smart contract,
2220
simply by selecting up to $T$ candidates with maximal suggested stakes
2221
and publishing their identities. Originally, the total number of
2222
validators is $T=100$; we expect it to grow to 1000 as the load
2223
increases. It is a configurable parameter
2224
(cf.~\ptref{sp:config.param}).
2225

The actual stake of each validator is computed as follows: If the top
$T$ proposed stakes are $s_1\geq s_2\geq\cdots\geq s_T$, the actual
stake of the $i$-th validator is set to $s'_i:=\min(s_i,l_i\cdot s_T)$. In
this way, $s'_i/s'_T\leq l_i$, so the $i$-th validator does not obtain
more than $l_i\leq L$ times the load of the weakest validator (because
the load is ultimately proportional to the stake).

Then elected validators may withdraw the unused part of their stake,
$s_i-s'_i$. Unsuccessful validator candidates may withdraw all of
their proposed stake.
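
As an illustration of the stake computation above, consider a
hypothetical election with $T=4$ (the numbers are invented for this
example and are not taken from any actual election). Suppose the
proposed stakes are $(s_1,\ldots,s_4)=(100,60,30,10)$ and the declared
maximal loads are $(l_1,\ldots,l_4)=(10,3,2,1)$, so that $s_T=s_4=10$.
Then
\begin{align*}
  s'_1 &= \min(100,\,10\cdot 10) = 100, & s_1-s'_1 &= 0,\\
  s'_2 &= \min(60,\,3\cdot 10) = 30,    & s_2-s'_2 &= 30,\\
  s'_3 &= \min(30,\,2\cdot 10) = 20,    & s_3-s'_3 &= 10,\\
  s'_4 &= \min(10,\,1\cdot 10) = 10,    & s_4-s'_4 &= 0.
\end{align*}
In this example the second and third validators may withdraw 30 and 10
coins, respectively, and no validator carries more than $l_i$ times
the load of the weakest one.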

Each validator publishes its {\em public signing key}, not necessarily
equal to the public key of the account the stake came
from.\footnote{It makes sense to generate and use a new key pair for
  every validator election.}

The stakes of the validators are frozen until the end of the period for
which they have been elected, and one month more, in case new disputes
arise (i.e., an invalid block signed by one of these validators is
found). After that, the stake is returned, along with the validator's
share of coins minted and fees from transactions processed during this
time.

\nxsubpoint\label{sp:val.task.grp} \embt(Election of validator ``task
groups''.)  The whole global set of validators (where each validator
is considered present with multiplicity equal to its stake---otherwise
a validator might be tempted to assume several identities and split
its stake among them) is used only to validate new masterchain
blocks. The shardchain blocks are validated only by specially selected
subsets of validators, taken from the global set of validators chosen
as described in~\ptref{sp:global.valid}.

These validator ``subsets'' or ``task groups'', defined for every
shard, are rotated each hour (actually, every $2^{10}$ masterchain
blocks), and they are known one hour in advance, so that every
validator knows which shards it will need to validate, and can prepare
for that (e.g., by downloading missing shardchain data).
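
To relate these block counts to wall-clock time, assume one
masterchain block roughly every five seconds (the approximate figure
stated for masterchain blocks elsewhere in this chapter; the real
interval may differ). Under that assumption,
\[
  2^{19}\cdot 5\,\mathrm{s}=2\,621\,440\,\mathrm{s}\approx 30.3\ \text{days},
  \qquad
  2^{10}\cdot 5\,\mathrm{s}=5120\,\mathrm{s}\approx 85\ \text{minutes},
\]
so the ``month'' of~\ptref{sp:global.valid} and the ``hour'' used here
should be read as order-of-magnitude labels rather than exact
durations.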

The algorithm used to select validator task groups for each shard
$(w,s)$ is deterministic pseudorandom. It uses pseudorandom numbers
embedded by validators into each masterchain block (generated by a
consensus using threshold signatures) to create a random seed, and
then computes for example
$\Hash(\code(w).\code(s).\vr{validator\_id}.\vr{rand\_seed})$ for each
validator. Then validators are sorted by the value of this hash, and
the first several are selected, so as to have at least $20/T$ of the
total validator stakes and consist of at least 5 validators.
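
The selection rule can be written down more explicitly. The following
is only a sketch of one possible reading of the description above; the
symbols $h_v$, $S$, and $k$ are introduced here for illustration, and
the treatment of validators as present with multiplicity equal to
their stake is ignored. Let $s'_v$ be the actual stake of validator
$v$, let $S=\sum_v s'_v$ be the total stake, and let
\[
  h_v := \Hash(\code(w).\code(s).\vr{validator\_id}_v.\vr{rand\_seed}).
\]
Order the validators so that $h_{v_1}\leq h_{v_2}\leq\cdots$. The task
group for shard $(w,s)$ is then $\{v_1,\ldots,v_k\}$, where
\[
  k := \min\Bigl\{\,k\geq 5 \;:\; \sum_{j=1}^{k} s'_{v_j}\geq
  \frac{20}{T}\,S\Bigr\}.
\]
For the initial value $T=100$, the stake threshold $20/T$ equals
one-fifth of the total stake.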

This selection could be done by a special smart contract. In that
case, the selection algorithm would easily be upgradable without hard
forks by the voting mechanism mentioned
in~\ptref{sp:config.param}. All other ``constants'' mentioned so far
(such as $2^{19}$, $2^{10}$, $T$, 20, and 5) are also configurable
parameters.

\nxsubpoint\label{sp:rot.gen.prio} \embt(Rotating priority order on
each task group.)  There is a certain ``priority'' order imposed
on the members of a shard task group, depending on the hash of the
previous masterchain block and (shardchain) block sequence
number. This order is determined by generating and sorting some hashes
as described above.

When a new shardchain block needs to be generated, the shard task
group validator selected to create this block is normally the first
one with respect to this rotating ``priority'' order. If it fails to
create the block, the second or third validator may do
it. Essentially, all of them may suggest their block candidates, but
the candidate suggested by the validator having the highest priority
should win as the result of the Byzantine Fault Tolerant (BFT) consensus
protocol.

\nxsubpoint\label{sp:sh.blk.cand.prop} \embt(Propagation of shardchain
block candidates.)  Because shardchain task group membership is
known one hour in advance, their members can use that time to build a
dedicated ``shard validators multicast overlay network'', using the
general mechanisms of the TON Network (cf.~\ptref{sect:overlay}). When
a new shardchain block needs to be generated---normally one or two
seconds after the most recent masterchain block has been
propagated---everybody knows who has the highest priority to generate
the next block (cf.~\ptref{sp:rot.gen.prio}). This validator will
create a new collated block candidate, either by itself or with the
aid of a collator (cf.~\ptref{sp:collators}). The validator must check
(validate) this block candidate (especially if it has been prepared by
some collator) and sign it with its (validator) private key. Then the
block candidate is propagated to the remainder of the task group
using the prearranged multicast overlay network (the task group
creates its own private overlay network as explained
in~\ptref{sect:overlay}, and then uses a version of the streaming
multicast protocol described in~\ptref{sp:streaming.multicast} to
propagate block candidates).

A truly BFT way of doing this would be to use a Byzantine multicast
protocol, such as the one used in Honey Badger BFT~\cite{HoneyBadger}:
encode the block candidate with an $(N,2N/3)$-erasure code, send $1/N$
of the resulting data directly to each member of the group, and expect
them to multicast directly their part of the data to all other members
of the group.
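
To make the parameters concrete, here is a small worked example under
the assumption that the $(N,2N/3)$-erasure code is a
maximum-distance-separable code such as Reed--Solomon (the text above
does not fix the code, and the group size below is hypothetical). For
a task group of $N=21$ members and a block candidate of size $B$, the
code has $K=2N/3=14$ data fragments:
\[
  \text{fragment size}=\frac{B}{K}=\frac{B}{14},
  \qquad
  \text{total encoded size}=N\cdot\frac{B}{K}=\frac{3}{2}\,B.
\]
Each member receives one fragment directly, and any $14$ of the $21$
fragments suffice to reconstruct the candidate, so reconstruction
still succeeds if up to $N-K=7$ members (one-third of the group)
withhold their fragments.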

However, a faster and more straightforward way of doing this
(cf.\ also \ptref{sp:streaming.multicast}) is to split the block
candidate into a sequence of signed one-kilobyte blocks (``chunks''),
augment their sequence by a Reed--Solomon or a fountain code (such as
the RaptorQ code~\cite{RaptorQ} \cite{Raptor}), and start transmitting
chunks to the neighbors in the ``multicast mesh'' (i.e., the overlay
network), expecting them to propagate these chunks further. Once a
validator obtains enough chunks to reconstruct the block candidate
from them, it signs a confirmation receipt and propagates it through
its neighbors to the whole of the group. Then its neighbors stop
sending new chunks to it, but may continue to send the (original)
signatures of these chunks, believing that this node can generate the
subsequent chunks by applying the Reed--Solomon or fountain code by
itself (having all data necessary), combine them with signatures, and
propagate to its neighbors that are not yet ready.

If the ``multicast mesh'' (overlay network) remains connected after
removing all ``bad'' nodes (recall that up to one-third of nodes are
allowed to be bad in a Byzantine way, i.e., behave in an arbitrary
malicious fashion), this algorithm will propagate the block candidate
as quickly as possible.

The designated high-priority block creator is not the only validator
that may multicast its block candidate to the whole of the group. The
second and third validators by priority may start multicasting their
block candidates, either immediately or after failing to receive a
block candidate from the top priority validator. However, normally
only the block candidate with maximal priority will be signed by all
(actually, by at least two-thirds of the task group) validators and
committed as a new shardchain block.

\nxsubpoint \embt(Validation of block candidates.)  Once a block
candidate is received by a validator and the signature of its
originating validator is checked, the receiving validator checks the
validity of this block candidate, by performing all transactions in it
and checking that their result coincides with the one claimed. All
messages imported from other blockchains must be supported by suitable
Merkle proofs in the collated data, otherwise the block candidate is
deemed invalid (and, if a proof of this is committed to the
masterchain, the validators having already signed this block candidate
may be punished). On the other hand, if the block candidate is found
to be valid, the receiving validator signs it and propagates its
signature to other validators in the group, either through the ``mesh
multicast network'', or by direct network messages.

We would like to emphasize that {\em a validator does not need access
  to the states of this or neighboring shardchains in order to check
  the validity of a (collated) block candidate}.%
\footnote{A possible exception is the state of output queues of the
  neighboring shardchains, needed to guarantee the message ordering
  requirements described in~\ptref{sp:collect.input.msg}, because the
  size of Merkle proofs might become prohibitive in this case.}  This
allows the validation to proceed very quickly (without disk accesses),
and lightens the computational and storage burden on the validators
(especially if they are willing to accept the services of outside
collators for creating block candidates).

\nxsubpoint\label{sp:new.shardc.blk} \embt(Election of the next block
candidate.)  Once a block candidate collects at least two-thirds (by
stake) of the validity signatures of validators in the task group, it
is eligible to be committed as the next shardchain block. A BFT
protocol is run to achieve consensus on the block candidate chosen
(there may be more than one proposed), with all ``good'' validators
preferring the block candidate with the highest priority for this
round. As a result of running this protocol, the block is augmented by
signatures of at least two-thirds of the validators (by stake). These
signatures testify not only to the validity of the block in question,
but also to its being elected by the BFT protocol. After that, the
block (without collated data) is combined with these signatures,
serialized in a deterministic way, and propagated through the network
to all parties concerned.
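
The eligibility condition is counted by stake rather than by head
count. As a sketch, with $G$ denoting the task group, $\Sigma(B)\subseteq
G$ the validators whose signatures on candidate $B$ have been
collected, and $s'_v$ the actual stake of $v$ (notation introduced
here for illustration), the condition reads
\[
  \sum_{v\in\Sigma(B)} s'_v \;\geq\; \frac{2}{3}\sum_{v\in G} s'_v.
\]
For example, in a hypothetical group of five validators with stakes
$(100,30,20,10,10)$ the threshold is $\tfrac23\cdot 170\approx 113.3$;
the signatures of the first and third validators alone (total stake
$120$) meet it, whereas those of the four smaller validators together
(total stake $70$) do not.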

\nxsubpoint \embt(Validators must keep the blocks they have signed.)
During their membership in the task group and for at least one hour
(or rather $2^{10}$ blocks) afterward, the validators are expected to
keep the blocks they have signed and committed.  The failure to
provide a signed block to other validators may be punished.

\nxsubpoint \embt(Propagating the headers and signatures of new
shardchain blocks to all validators.)  Validators propagate the
headers and signatures of newly-generated shardchain blocks to the
{\em global\/} set of validators, using a multicast mesh network
similar to the one created for each task group.

\nxsubpoint\label{sp:new.master.blk} \embt(Generation of new
masterchain blocks.)  After all (or almost all) new shardchain blocks
have been generated, a new masterchain block may be generated. The
procedure is essentially the same as for shardchain blocks
(cf.~\ptref{sp:new.shardc.blk}), with the difference that {\em all\/}
validators (or at least two-thirds of them) must participate in this
process. Because the headers and signatures of new shardchain blocks
are propagated to all validators, hashes of the newest blocks in each
shardchain can and must be included in the new masterchain block. Once
these hashes are committed into the masterchain block, outside
observers and other shardchains may consider the new shardchain blocks
committed and immutable (cf.~\ptref{sp:sc.hash.mc}).

\nxsubpoint \embt(Validators must keep the state of the masterchain.)  A
noteworthy difference between the masterchain and the shardchains is
that all validators are expected to keep track of the masterchain
state, without relying on collated data. This is important because the
knowledge of validator task groups is derived from the masterchain
state.

\nxsubpoint \embt(Shardchain blocks are generated and propagated in
parallel.)  Normally, each validator is a member of several shardchain
task groups; their quantity (hence the load on the validator) is
approximately proportional to the validator's stake. This means that
the validator runs several instances of the new shardchain block
generation protocol in parallel.

\nxsubpoint \embt(Mitigation of block retention attacks.)  Because the
total set of validators inserts a new shardchain block's hash into the
masterchain after having seen only its header and signatures, there is
a small probability that the validators that have generated this block
will conspire and try to avoid publishing the new block in its
entirety. This would result in the inability of validators of
neighboring shardchains to create new blocks, because they must know
at least the output message queue of the new block, once its hash has
been committed into the masterchain.

In order to mitigate this, the new block must collect signatures from
some other validators (e.g., two-thirds of the union of task groups of
neighboring shardchains) testifying that these validators do have
copies of this block and are willing to send them to any other
validators if required. Only after these signatures are presented may
the new block's hash be included in the masterchain.

\nxsubpoint \embt(Masterchain blocks are generated later than
shardchain blocks.)  Masterchain blocks are generated approximately
once every five seconds, as are shardchain blocks. However, while the
generation of new blocks in all shardchains runs essentially at the
same time (normally triggered by the release of a new masterchain
block), the generation of new masterchain blocks is deliberately
delayed, to allow the inclusion of hashes of newly-generated
shardchain blocks in the masterchain.

\nxsubpoint\label{sp:slow.valid} \embt(Slow validators may receive
lower rewards.)  If a validator is ``slow'', it may fail to validate
new block candidates, and two-thirds of the signatures required to
commit the new block may be gathered without its participation. In
this case, it will receive a lower share of the reward associated with
this block.

This provides an incentive for the validators to optimize their
hardware, software, and network connection in order to process user
transactions as fast as possible.

However, if a validator fails to sign a block before it is committed,
its signature may be included in one of the next blocks, and then a
part of the reward (exponentially decreasing depending on how many
blocks have been generated since---e.g., $0.9^k$ if the validator is
$k$ blocks late) will still be given to this validator.
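
For the sample decay factor $0.9^k$ mentioned above (an illustrative
value, not a fixed protocol constant), the fraction of the reward
retained by a late signer is
\[
  0.9^{1}=0.9,\qquad 0.9^{2}=0.81,\qquad 0.9^{5}\approx 0.590,\qquad
  0.9^{10}\approx 0.349,
\]
so a validator that is ten blocks late would still receive roughly a
third of its share under this schedule.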

\nxsubpoint\label{sp:val.sign.depth} \embt(``Depth'' of validator
signatures.)  Normally, when a validator signs a block, the signature
testifies only to the {\em relative validity\/} of a block: this block
is valid provided all previous blocks in this and other shardchains
are valid. The validator cannot be punished for taking for granted
invalid data committed into previous blocks.

However, the validator signature of a block has an integer parameter
called ``depth''. If it is non-zero, it means that the validator
asserts the (relative) validity of the specified number of previous
blocks as well. This is a way for ``slow'' or ``temporarily offline''
validators to catch up and sign some of the blocks that have been
committed without their signatures. Then some part of the block reward
will still be given to them (cf.~\ptref{sp:slow.valid}).

\nxsubpoint\label{sp:abs.val.from.rel} \embt(Validators are
responsible for {\em relative\/} validity of signed shardchain blocks;
absolute validity follows.)  We would like to emphasize once again
that a validator's signature on a shardchain block $B$ testifies to
only the {\em relative\/} validity of that block (or maybe of $d$
previous blocks as well, if the signature has ``depth'' $d$,
cf.~\ptref{sp:val.sign.depth}; but this does not affect the following
discussion much). In other words, the validator asserts that the next
state $s'$ of the shardchain is obtained from the previous state $s$
by applying the block evaluation function $\evblock$ described
in~\ptref{sp:blk.transf}:
\begin{equation}\label{eq:ev.block.2}
  s'=\evblock(B)(s)
\end{equation}
In this way, the validator that signed block $B$ cannot be punished if
the original state $s$ turns out to be ``incorrect'' (e.g., because of
the invalidity of one of the previous blocks). A fisherman
(cf.~\ptref{sp:fish}) should complain only if it finds a block that is
{\em relatively\/} invalid. The PoS system as a whole endeavors to
make every block {\em relatively\/} valid, not {\em recursively (or
  absolutely)} valid. Notice, however, that {\em if all blocks in a
  blockchain are relatively valid, then all of them and the blockchain
  as a whole are absolutely valid}; this statement is easily shown
using mathematical induction on the length of the blockchain. In this
way, easily verifiable assertions of {\em relative\/} validity of
blocks together demonstrate the much stronger {\em absolute validity\/}
of the whole blockchain.
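
The induction can be spelled out as follows. This is a sketch: the
notation $s_0,\ldots,s_n$ for the successive states and
$B_1,\ldots,B_n$ for the blocks is introduced here, and the initial
state $s_0$ is assumed to be valid by definition. Relative validity of
$B_i$ means
\[
  s_i=\evblock(B_i)(s_{i-1})\neq\bot,\qquad 1\leq i\leq n.
\]
Base case: $s_0$ is valid. Inductive step: if $s_{i-1}$ is valid and
$B_i$ is relatively valid, then $s_i$ is obtained from a valid state
by a valid block and is therefore valid. Hence all of
$s_1,\ldots,s_n$ are valid, i.e., every block $B_i$ is absolutely
valid.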

Note that by signing a block~$B$ the validator asserts that the block
is valid given the original state $s$ (i.e., that the result
of~\eqref{eq:ev.block.2} is not the value $\bot$ indicating that the
next state cannot be computed). In this way, the validator must
perform minimal formal checks of the cells of the original state that
are accessed during the evaluation of~\eqref{eq:ev.block.2}.

For example, imagine that the cell expected to contain the original
balance of an account accessed from a transaction committed into a
block turns out to have zero raw bytes instead of the expected 8 or
16. Then the original balance simply cannot be retrieved from the
cell, and an ``unhandled exception'' happens while trying to process
the block. In this case, the validator should not sign such a block on
pain of being punished.

\nxsubpoint \embt(Signing masterchain blocks.)  The situation with the
masterchain blocks is somewhat different: by signing a masterchain
block, the validator asserts not only its relative validity, but also
the relative validity of all preceding blocks up to the very first
block when this validator assumed its responsibility (but not further
back).

\nxsubpoint \embt(The total number of validators.)  The upper
limit $T$ for the total number of validators to be elected
(cf.~\ptref{sp:global.valid}) cannot become, in the system described
so far, more than, say, several hundred or a thousand, because all
validators are expected to participate in a BFT consensus protocol to
create each new masterchain block, and it is not clear whether such
protocols can scale to thousands of participants. Even more
importantly, masterchain blocks must collect the signatures of at
least two-thirds of all the validators (by stake), and these
signatures must be included in the new block (otherwise all other
nodes in the system would have no reason to trust the new block
without validating it by themselves). If more than, say, one thousand
validator signatures had to be included in each masterchain
block, this would imply more data in each masterchain block, to be
stored by all full nodes and propagated through the network, and more
processing power spent to check these signatures (in a PoS system,
full nodes do not need to validate blocks by themselves, but they need
to check the validators' signatures instead).
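
A rough estimate illustrates the cost. Assume, purely for
illustration, $64$-byte signatures, $T=1000$ validators of comparable
stake (so that about two-thirds of them, i.e., $667$, must sign), and
one masterchain block every five seconds; none of these figures is
fixed by the text above. Then
\[
  667\cdot 64\ \text{bytes}\approx 42.7\ \text{kB per block},
  \qquad
  \frac{86\,400}{5}\cdot 42.7\ \text{kB}\approx 738\ \text{MB per day}
\]
of signature data alone, before any block content is counted.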

While limiting $T$ to a thousand validators seems more than sufficient
for the first phase of the deployment of the TON Blockchain, a
provision must be made for future growth, when the total number of
shardchains becomes so large that several hundred validators will not
suffice to process all of them. To this end, we introduce an
additional configurable parameter $T'\leq T$ (originally equal
to~$T$), and only the top $T'$ elected validators (by stake) are
expected to create and sign new masterchain blocks.

\nxsubpoint \embt(Decentralization of the system.)  One might suspect
that a Proof-of-Stake system such as the TON Blockchain, relying on
$T\approx1000$ validators to create all shardchain and masterchain
blocks, is bound to become ``too centralized'', as opposed to
conventional Proof-of-Work blockchains like Bitcoin or Ethereum, where
everybody (in principle) might mine a new block, without an explicit
upper limit on the total number of miners.

However, popular Proof-of-Work blockchains, such as Bitcoin and
Ether\-eum, currently require vast amounts of computing power (high
``hash rates'') to mine new blocks with non-negligible probability of
success. Thus, the mining of new blocks tends to be concentrated in the
hands of several large players, who invest huge amounts of money into
datacenters filled with custom-designed hardware optimized for mining;
and in the hands of several large mining pools, which concentrate and
coordinate the efforts of larger groups of people who are not able to
provide a sufficient ``hash rate'' by themselves.

Therefore, as of 2017, more than 75\% of new Ethereum or Bitcoin
blocks are produced by fewer than ten miners. In fact, the two largest
Ethereum mining pools produce together more than half of all new
blocks! Clearly, such a system is much more centralized than one
relying on $T\approx1000$ nodes to produce new blocks.

One might also note that the investment required to become a TON
Blockchain validator---i.e., to buy the hardware (say, several
high-performance servers) and the stake (which can be easily collected
through a pool of nominators if necessary;
cf.~\ptref{sp:nominators})---is much lower than that required to
become a successful stand-alone Bitcoin or Ethereum miner. In fact,
the parameter $L$ of~\ptref{sp:global.valid} will force nominators not
to join the largest ``mining pool'' (i.e., the validator that has
amassed the largest stake), but rather to look for smaller validators
currently accepting funds from nominators, or even to create new
validators, because this would allow a higher proportion $s'_i/s_i$ of
the validator's---and by extension also the nominator's---stake to be
used, hence yielding larger rewards from mining. In this way, the TON
Proof-of-Stake system actually {\em encourages\/} decentralization
(creating and using more validators) and {\em punishes\/}
centralization.
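
A small numerical example (with hypothetical figures) shows the
effect. Suppose the smallest elected stake is $s_T=10$ and the bound
is $L=10$, so that no validator can have an actual stake above
$L\cdot s_T=100$. A validator that has amassed $s_i=400$ with
$l_i=L$ then has
\[
  s'_i=\min(400,\,10\cdot 10)=100,\qquad s'_i/s_i=0.25,
\]
so only a quarter of the pooled funds earns rewards. The same $400$
coins split among four validators with proposed stakes of $100$ each
would be used in full ($s'_i/s_i=1$ for each), provided all four are
elected and $s_T$ is unchanged.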

\nxsubpoint\label{sp:rel.rel} \embt(Relative reliability of a block.)
The {\em (relative) reliability\/} of a block is simply the total
stake of all validators that have signed this block. In other words,
this is the amount of money certain actors would lose if this block
turns out to be invalid. If one is concerned with transactions
transferring value lower than the reliability of the block, one can
consider them to be safe enough. In this sense, the relative
reliability is a measure of trust an outside observer can have in a
particular block.

Note that we speak of the {\em relative\/} reliability of a block,
because it is a guarantee that the block is valid {\em provided the
  previous block and all other shardchains' blocks referred to are
  valid\/} (cf.~\ptref{sp:abs.val.from.rel}).

The relative reliability of a block can grow after it is
committed---for example, when belated validators' signatures are added
(cf.~\ptref{sp:val.sign.depth}). On the other hand, if one of these
validators loses part or all of its stake because of its misbehavior
related to some other blocks, the relative reliability of a block may
{\em decrease}.

\nxsubpoint \embt(``Strengthening'' the blockchain.)  It is important
to provide incentives for validators to increase the relative
reliability of blocks as much as possible. One way of doing this is by
allocating a small reward to validators for adding signatures to
blocks of other shardchains. Even ``would-be'' validators, who have
deposited a stake insufficient to get into the top $T$ validators by
stake and to be included in the global set of validators
(cf.~\ptref{sp:global.valid}), might participate in this activity (if
they agree to keep their stake frozen instead of withdrawing it after
having lost the election). Such would-be validators might double as
fishermen (cf.~\ptref{sp:fish}): if they have to check the validity of
certain blocks anyway, they might as well opt to report invalid blocks
and collect the associated rewards.

\nxsubpoint\label{sp:rec.rel} \embt(Recursive reliability of a block.)
One can also define the {\em recursive reliability\/} of a block to be
the minimum of its relative reliability and the recursive
reliabilities of all blocks it refers to (i.e., the masterchain block,
the previous shardchain block, and some blocks of neighboring
shardchains). In other words, if the block turns out to be invalid,
either because it is invalid by itself or because one of the blocks it
depends on is invalid, then at least this amount of money would be
lost by someone. If one is truly unsure whether to trust a specific
transaction in a block, one should compute the {\em recursive\/}
reliability of this block, not just the {\em relative\/} one.
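
In symbols, and only as a sketch with notation introduced here: write
$\Sigma(B)$ for the set of validators that have signed block $B$,
$s'_v$ for the stake of validator $v$, and $\mathrm{Ref}(B)$ for the
set of blocks that $B$ refers to. Then the relative reliability
of~\ptref{sp:rel.rel} and the recursive reliability are
\[
  R_{\mathrm{rel}}(B)=\sum_{v\in\Sigma(B)}s'_v,
  \qquad
  R_{\mathrm{rec}}(B)=\min\Bigl(R_{\mathrm{rel}}(B),\
  \min_{B'\in\mathrm{Ref}(B)}R_{\mathrm{rec}}(B')\Bigr),
\]
where the inner minimum is taken to be $+\infty$ when
$\mathrm{Ref}(B)$ is empty. For instance, if
$R_{\mathrm{rel}}(B)=500$ and $B$ refers to two blocks with recursive
reliabilities $800$ and $300$, then $R_{\mathrm{rec}}(B)=300$.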

It does not make sense to go too far back when computing recursive
reliability, because, if we look too far back, we will see blocks
signed by validators whose stakes have already been unfrozen and
withdrawn. In any case, we do not allow the validators to
automatically reconsider blocks that are that old (i.e., created more
than two months ago, if current values of configurable parameters are
used), and create forks starting from them or correct them with the
aid of ``vertical blockchains'' (cf.~\ptref{sp:inv.sh.blk.corr}), even
if they turn out to be invalid. We assume that a period of two months
provides ample opportunities for detecting and reporting any invalid
blocks, so that if a block is not challenged during this period, it is
unlikely to be challenged at all.

\nxsubpoint \embt(Consequence of Proof-of-Stake for light nodes.)  An
important consequence of the Proof-of-Stake approach used by the TON
Blockchain is that a light node (running light client software) for
the TON Blockchain does not need to download the ``headers'' of all
shardchain or even masterchain blocks in order to be able to check by
itself the validity of Merkle proofs provided to it by full nodes as
answers to its queries.

Indeed, because the most recent shardchain block hashes are included
in the masterchain blocks, a full node can easily provide a Merkle
proof that a given shardchain block is valid starting from a known
hash of a masterchain block. Next, the light node needs to know only
the very first block of the masterchain (where the very first set of
validators is announced), which (or at least the hash of which) might
be built into the client software, and only one masterchain block
approximately every month afterwards, where newly-elected validator
sets are announced, because this block will have been signed by the
previous set of validators. Starting from that, it can obtain the
several most recent masterchain blocks, or at least their headers and
validator signatures, and use them as a base for checking Merkle
proofs provided by full nodes.

\mysubsection{Splitting and Merging
  Shardchains}\label{sect:split.merge}

One of the most characteristic and unique features of the TON
Blockchain is its ability to automatically split a shardchain in two
when the load becomes too high, and merge them back if the load
subsides (cf.~\ptref{sp:dyn.split.merge}). We must discuss it in some
detail because of its uniqueness and its importance to the scalability
of the whole project.

\nxsubpoint \embt(Shard configuration.)  Recall that, at any given
moment, each workchain $w$ is split into one or several
shardchains $(w,s)$ (cf.~\ptref{sp:shard.ident}). These shardchains
may be represented by leaves of a binary tree, with root
$(w,\emptyset)$, and each non-leaf node $(w,s)$ having children
$(w,s.0)$ and $(w,s.1)$. In this way, every account belonging to
workchain $w$ is assigned to exactly one shard, and everybody who
knows the current shardchain configuration can determine the shard
$(w,s)$ containing account $\accountid$: it is the only shard with
binary string $s$ being a prefix of $\accountid$.
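
For example, consider a hypothetical workchain $w$ whose shard binary
tree has the three leaves $(w,00)$, $(w,01)$, and $(w,1)$. The binary
strings $00$, $01$, and $1$ are prefix-free, and every sufficiently
long bit string starts with exactly one of them, so an account with
$\accountid=0110\ldots$ belongs to shard $(w,01)$, while one with
$\accountid=1011\ldots$ belongs to shard $(w,1)$. If $(w,1)$ is later
split, the set of leaf prefixes changes as
\[
  \{00,\;01,\;1\}\ \longrightarrow\ \{00,\;01,\;10,\;11\},
\]
and the second account then belongs to shard $(w,10)$.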

The shard configuration---i.e., this {\em shard binary tree}, or the
collection of all active $(w,s)$ for a given $w$ (corresponding to the
leaves of the shard binary tree)---is part of the masterchain state
and is available to everybody who keeps track of the
masterchain.\footnote{Actually, the shard configuration is completely
  determined by the last masterchain block; this simplifies getting
  access to the shard configuration.}

\nxsubpoint \embt(Most recent shard configuration and state.)  Recall
that hashes of the most recent shardchain blocks are included in each
masterchain block. These hashes are organized in a shard binary tree
(actually, a collection of trees, one for each workchain). In this
way, each masterchain block contains the most recent shard
configuration.

\nxsubpoint \embt(Announcing and performing changes in the shard
configuration.)  The shard configuration may be changed in two ways:
either a shard $(w,s)$ can be {\em split\/} into two shards $(w,s.0)$
and $(w,s.1)$, or two ``sibling'' shards $(w,s.0)$ and $(w,s.1)$ can
be {\em merged\/} into one shard $(w,s)$.

These split/merge operations are announced several (e.g., $2^6$; this
is a configurable parameter) blocks in advance, first in the
``headers'' of the corresponding shardchain blocks, and then in the
masterchain block that refers to these shardchain blocks. This advance
announcement is needed for all parties concerned to prepare for the
planned change (e.g., build an overlay multicast network to distribute
new blocks of the newly-created shardchains, as discussed
in~\ptref{sect:overlay}). Then the change is committed, first into the
(header of the) shardchain block (in case of a split; for a merge,
blocks of both shardchains should commit the change), and then
propagated to the masterchain block. In this way, the masterchain
block defines not only the most recent shard configuration {\em
  before\/} its creation, but also the next immediate shard
configuration.

\nxsubpoint \embt(Validator task groups for new shardchains.)  Recall
that each shard, i.e., each shardchain, is normally assigned a subset
of validators (a validator task group) dedicated to creating and
validating new blocks in the corresponding shardchain
(cf.~\ptref{sp:val.task.grp}). These task groups are elected for some
period of time (approximately one hour) and are known some time in
advance (also approximately one hour), and are immutable during this
period.\footnote{Unless some validators are temporarily or permanently
  banned because of signing invalid blocks---then they are
  automatically excluded from all task groups.}

However, the actual shard configuration may change during this period
because of split/merge operations. One must assign task groups to
newly created shards. This is done as follows:

Notice that any active shard $(w,s)$ will either be a descendant of
some uniquely determined original shard $(w,s')$, meaning that $s'$ is
a prefix of $s$, or it will be the root of a subtree of original
shards $(w,s')$, where $s$ will be a prefix of every $s'$. In the
first case, we simply take the task group of the original shard
$(w,s')$ to double as the task group of the new shard $(w,s)$. In the
latter case, the task group of the new shard $(w,s)$ will be the union
of task groups of all original shards $(w,s')$ that are descendants of
$(w,s)$ in the shard tree.

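The two cases above can be sketched as follows. This is a minimal
illustration, assuming that shard prefixes are strings of \texttt{0}
and \texttt{1} and that the original configuration is given as a
dictionary from prefixes to sets of validator identifiers; the name
\texttt{task\_group} and this data layout are hypothetical.

\begin{verbatim}
def task_group(s, original_groups):
    """original_groups maps each original prefix s' to a set of validators."""
    # First case: s descends from (or equals) a unique original shard s'.
    for sp, group in original_groups.items():
        if s.startswith(sp):
            return set(group)
    # Second case: s is the root of a subtree of original shards,
    # so its task group is the union of their task groups.
    result = set()
    for sp, group in original_groups.items():
        if sp.startswith(s):
            result |= group
    return result
\end{verbatim}
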
In this way, every active shard $(w,s)$ gets assigned a well-defined
subset of validators (task group). When a shard is split, both
children inherit the whole of the task group from the original
shard. When two shards are merged, their task groups are also merged.

Anyone who keeps track of the masterchain state can compute validator
task groups for each of the active shards.

\nxsubpoint \embt(Limit on split/merge operations during the period of
responsibility of original task groups.)  Ultimately, the new shard
configuration will be taken into account, and new dedicated validator
subsets (task groups) will automatically be assigned to each
shard. Before that happens, one must impose a certain limit on
split/merge operations; otherwise, an original task group may end up
validating $2^k$ shardchains for a large $k$ at the same time, if the
original shard quickly splits into $2^k$ new shards.

This is achieved by imposing limits on how far the active shard
configuration may be removed from the original shard configuration
(the one used to select validator task groups currently in
charge). For example, one might require that the distance in the shard
tree from an active shard $(w,s)$ to an original shard $(w,s')$ must
not exceed 3, if $s'$ is a predecessor of $s$ (i.e., $s'$ is a prefix
of binary string $s$), and must not exceed 2, if $s'$ is a successor
of $s$ (i.e., $s$ is a prefix of $s'$). Otherwise, the split or merge
operation is not permitted.

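A possible check for this rule is sketched below, using the example
limits 3 and 2 from the text. The sketch assumes prefixes are strings
of \texttt{0} and \texttt{1}, measures distance in the shard tree as
the difference of prefix lengths, and assumes that $s$ is related to
the original configuration in one of the two ways described; the names
are hypothetical.

\begin{verbatim}
MAX_SPLIT_DEPTH = 3  # example value from the text
MAX_MERGE_DEPTH = 2  # example value from the text

def is_permitted(s, original_shards):
    """Check whether active shard s is close enough to the original shards."""
    for sp in original_shards:
        if s.startswith(sp):
            # s' is a predecessor of s: s was obtained by splitting.
            return len(s) - len(sp) <= MAX_SPLIT_DEPTH
    # Otherwise s is a prefix of several original shards (merging).
    depths = [len(sp) - len(s) for sp in original_shards
              if sp.startswith(s)]
    return all(d <= MAX_MERGE_DEPTH for d in depths)
\end{verbatim}
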
Roughly speaking, one is imposing a limit on the number of times a
shard can be split (e.g., three) or merged (e.g., two) during the
period of responsibility of a given collection of validator task
groups. Apart from that, after a shard has been created by merging or
splitting, it cannot be reconfigured for some period of time (some
number of blocks).

\nxsubpoint\label{sp:split.necess} \embt(Determining the necessity of
split operations.)  The split operation for a shardchain is triggered
by certain formal conditions (e.g., if for 64 consecutive blocks the
shardchain blocks are at least $90\%$ full). These conditions are
monitored by the shardchain task group. If they are met, first a
``split preparation'' flag is included in the header of a new
shardchain block (and propagated to the masterchain block referring to
this shardchain block). Then, several blocks afterwards, the ``split
commit'' flag is included in the header of the shardchain block (and
propagated to the next masterchain block).

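The example condition above (64 consecutive blocks at least $90\%$
full) might be evaluated as in the following sketch. The function name
\texttt{should\_split} and the representation of block fullness as a
list of ratios are assumptions made for illustration; the actual
conditions are configurable.

\begin{verbatim}
def should_split(fill_ratios, window=64, threshold=0.90):
    """fill_ratios: fullness (0.0 to 1.0) of recent blocks, oldest first."""
    recent = fill_ratios[-window:]
    # Require a full window of consecutive blocks above the threshold.
    return len(recent) == window and all(r >= threshold for r in recent)
\end{verbatim}
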
\nxsubpoint \embt(Performing split operations.)  After the ``split
commit'' flag is included in a block $B$ of shardchain $(w,s)$, there
cannot be a subsequent block $B'$ in that shardchain. Instead, two
blocks $B'_0$ and $B'_1$ of shardchains $(w,s.0)$ and $(w,s.1)$,
respectively, will be created, both referring to block $B$ as their
previous block (and both of them will indicate by a flag in the header
that the shard has just been split). The next masterchain block will
contain hashes of blocks $B'_0$ and $B'_1$ of the new shardchains; it
is not allowed to contain the hash of a new block $B'$ of shardchain
$(w,s)$, because a ``split commit'' event has already been committed
into the previous masterchain block.

Notice that both new shardchains will be validated by the same
validator task group as the old one, so they will automatically have a
copy of their state. The state splitting operation itself is quite
simple from the perspective of the Infinite Sharding Paradigm
(cf.~\ptref{sp:split.merge.state}).

\nxsubpoint\label{sp:merge.necess} \embt(Determining the necessity of
merge operations.)  The necessity of shard merge operations is also
detected by certain formal conditions (e.g., if for 64 consecutive
blocks the sum of the sizes of the two blocks of sibling shardchains
does not exceed $60\%$ of maximal block size). These formal conditions
should also take into account the total gas spent by these blocks and
compare it to the current block gas limit; otherwise, the blocks may
happen to be small because there are some computation-intensive
transactions that prevent the inclusion of more transactions.

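A sketch of such a merge condition follows. The size threshold of
$60\%$ and the window of 64 blocks are the examples from the text; the
gas threshold, the pairing of sibling blocks by position, and all names
are assumptions made purely for illustration.

\begin{verbatim}
def should_merge(blocks_a, blocks_b, max_size, gas_limit,
                 window=64, size_threshold=0.60, gas_threshold=0.60):
    """blocks_a, blocks_b: (size, gas_used) pairs of the two sibling
    shardchains, oldest first."""
    pairs = list(zip(blocks_a[-window:], blocks_b[-window:]))
    if len(pairs) < window:
        return False
    for (size_a, gas_a), (size_b, gas_b) in pairs:
        # Combined size must leave room in a single merged block.
        if size_a + size_b > size_threshold * max_size:
            return False
        # Assumed gas check: small blocks with heavy computation
        # must not trigger a merge.
        if gas_a + gas_b > gas_threshold * gas_limit:
            return False
    return True
\end{verbatim}
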
These conditions are monitored by validator task groups of both
sibling shards $(w,s.0)$ and $(w,s.1)$. Notice that siblings are
necessarily neighbors with respect to hypercube routing
(cf.~\ptref{sp:hypercube}), so validators from the task group of any
shard will be monitoring the sibling shard to some extent anyway.

When these conditions are met, either one of the validator subgroups
can suggest to the other that they merge by sending a special
message. Then they combine into a provisional ``merged task group'',
with combined membership, capable of running BFT consensus algorithms
and of propagating block updates and block candidates if necessary.

If they reach consensus on the necessity and readiness of merging,
``merge prepare'' flags are committed into the headers of some blocks
of each shardchain, along with the signatures of at least two-thirds
of the validators of the sibling's task group (and are propagated to
the next masterchain blocks, so that everybody can get ready for the
imminent reconfiguration). However, they continue to create separate
shardchain blocks for some predefined number of blocks.

\nxsubpoint \embt(Performing merge operations.)  After that, when the
validators from the union of the two original task groups are ready to
become validators for the merged shardchain (this might involve a
state transfer from the sibling shardchain and a state merge
operation), they commit a ``merge commit'' flag in the headers of
blocks of their shardchain (this event is propagated to the next
masterchain blocks), and stop creating new blocks in separate
shardchains (once the merge commit flag appears, creating blocks in
separate shardchains is forbidden). Instead, a merged shardchain block
is created (by the union of the two original task groups), referring
to both of its ``preceding blocks'' in its ``header''. This is
reflected in the next masterchain block, which will contain the hash
of the newly created block of the merged shardchain. After that, the
merged task group continues creating blocks in the merged shardchain.

\mysubsection{Classification of Blockchain
  Projects}\label{sect:class.blkch}

We will conclude our brief discussion of the TON Blockchain by
comparing it with existing and proposed blockchain projects. Before
doing this, however, we must introduce a sufficiently general
classification of blockchain projects. The comparison of particular
blockchain projects, based on this classification, is postponed
until~\ptref{sect:compare.blkch}.

\nxsubpoint \embt(Classification of blockchain projects.)  As a first
step, we suggest some classification criteria for blockchains (i.e.,
for blockchain projects). Any such classification is somewhat
incomplete and superficial, because it must ignore some of the most
specific and unique features of the projects under
consideration. However, we feel that this is a necessary first step in
providing at least a very rough and approximate map of the blockchain
projects territory.

The list of criteria we consider is the following:
\begin{itemize}
\item Single-blockchain vs.\ multi-blockchain architecture
  (cf.~\ptref{sp:single.multi})
\item Consensus algorithm: Proof-of-Stake vs.\ Proof-of-Work
  (cf.~\ptref{sp:pow.pos})
\item For Proof-of-Stake systems, the exact block generation,
  validation and consensus algorithm used (the two principal options
  are DPOS vs.\ BFT; cf.~\ptref{sp:dpos.bft})
\item Support for ``arbitrary'' (Turing-complete) smart contracts
  (cf.~\ptref{sp:smartc.supp})
\end{itemize}
Multi-blockchain systems have additional classification criteria
(cf.~\ptref{sp:class.multichain}):
\begin{itemize}
\item Type and rules of member blockchains: homogeneous, heterogeneous
  (cf.~\ptref{sp:blkch.hom.het}), mixed
  (cf.~\ptref{sp:mixed.het.hom}). Confederations
  (cf.~\ptref{sp:het.confed}).
\item Absence or presence of a {\em masterchain}, internal or external
  (cf.~\ptref{sp:pres.masterch})
\item Native support for sharding (cf.~\ptref{sp:shard.supp}). Static
  or dynamic sharding (cf.~\ptref{sp:dyn.stat.shard}).
\item Interaction between member blockchains: loosely-coupled and
  tightly-coupled systems (cf.~\ptref{sp:blkch.interact})
\end{itemize}

\nxsubpoint\label{sp:single.multi} \embt(Single-blockchain
vs.\ multi-blockchain projects.)  The first classification criterion
is the quantity of blockchains in the system. The oldest and simplest
projects consist of a {\em single blockchain\/} (``singlechain
projects'' for short); more sophisticated projects use (or, rather,
plan to use) {\em multiple blockchains\/} (``multichain projects'').

Singlechain projects are generally simpler and better tested; they
have withstood the test of time. Their main drawback is low
performance, or at least transaction throughput, which is on the level
of ten (Bitcoin) to less than one hundred\footnote{More like 15, for
  the time being. However, some upgrades are being planned to make
  Ethereum transaction throughput several times larger.}  (Ethereum)
transactions per second for general-purpose systems. Some specialized
systems (such as Bitshares) are capable of processing tens of
thousands of specialized transactions per second, at the expense of
requiring the blockchain state to fit into memory, and limiting the
processing to a predefined special set of transactions, which are then
executed by highly-optimized code written in languages like C++ (no
VMs here).

Multichain projects promise the scalability everybody craves. They may
support larger total states and more transactions per second, at the
expense of making the project much more complex, and its
implementation more challenging. As a result, there are few multichain
projects already running, but most proposed projects are
multichain. We believe that the future belongs to multichain projects.

\nxsubpoint\label{sp:pow.pos} \embt(Creating and validating blocks:
Proof-of-Work vs.\ Proof-of-Stake.)  Another important distinction is
the algorithm and protocol used to create and propagate new blocks,
check their validity, and select one of several forks if they appear.

The two most common paradigms are {\em Proof-of-Work (PoW)} and {\em
  Proof-of-Stake (PoS)}. The Proof-of-Work approach usually allows any
node to create (``mine'') a new block (and obtain some reward
associated with mining a block) if it is lucky enough to solve an
otherwise useless computational problem (usually involving the
computation of a large number of hashes) before other competitors
manage to do this. In the case of forks (for example, if two nodes
publish two otherwise valid but different blocks to follow the
previous one) the longest fork wins. In this way, the guarantee of
immutability of the blockchain is based on the amount of {\em work\/}
(computational resources) spent to generate the blockchain: anybody
who would like to create a fork of this blockchain would need to re-do
this work to create alternative versions of the already committed
blocks. For this, one would need to control more than $50\%$ of the
total computing power spent creating new blocks; otherwise, the
alternative fork will have exponentially low chances of becoming the
longest.

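To make ``exponentially low'' concrete, one can use the standard
gambler's-ruin estimate that also appears in the analysis of the
Bitcoin whitepaper: an attacker controlling a fraction $q<1/2$ of the
computing power who is $z$ blocks behind ever catches up with
probability $(q/(1-q))^z$. The sketch below is a simplified model
given for illustration; the function name is hypothetical.

\begin{verbatim}
def catch_up_probability(q, z):
    """Chance that an attacker with power share q overtakes z blocks."""
    p = 1.0 - q  # share of the honest nodes
    if q >= p:
        return 1.0  # with at least half of the power, success is certain
    return (q / p) ** z

for z in (1, 3, 6):
    print(z, catch_up_probability(0.10, z))
\end{verbatim}
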
The Proof-of-Stake approach is based on large {\em stakes\/}
(denominated in cryptocurrency) made by some special nodes ({\em
  validators}) to assert that they have checked ({\em validated\/})
some blocks and have found them correct. Validators sign blocks, and
receive some small rewards for this; however, if a validator is ever
caught signing an incorrect block, and a proof of this is presented,
part or all of its stake is forfeit. In this way, the guarantee of
validity and immutability of the blockchain is given by the total
volume of stakes put by validators on the validity of the blockchain.

The Proof-of-Stake approach is more natural in the respect that it
incentivizes the validators (which replace PoW miners) to perform
useful computation (needed to check or create new blocks, in
particular, by performing all transactions listed in a block) instead
of computing otherwise useless hashes. In this way, validators would
purchase hardware that is better adapted to processing user
transactions, in order to receive rewards associated with these
transactions, which seems quite a useful investment from the
perspective of the system as a whole.

However, Proof-of-Stake systems are somewhat more challenging to
implement, because one must provide for many rare but possible
conditions. For example, some malicious validators might conspire to
disrupt the system to extract some profit (e.g., by altering
their own cryptocurrency balances). This leads to some non-trivial
game-theoretic problems.

In short, Proof-of-Stake is more natural and more promising,
especially for multichain projects (because Proof-of-Work would
require prohibitive amounts of computational resources if there are
many blockchains), but must be more carefully thought out and
implemented. Most currently running blockchain projects, especially
the oldest ones (such as Bitcoin and at least the original Ethereum),
use Proof-of-Work.

\nxsubpoint\label{sp:dpos.bft} \embt(Variants of Proof-of-Stake. DPOS
vs.\ BFT.)  While Proof-of-Work algorithms are very similar to each
other and differ mostly in the hash functions that must be computed
for mining new blocks, there are more possibilities for Proof-of-Stake
algorithms. They merit a sub-classification of their own.

Essentially, one must answer the following questions about a
Proof-of-Stake algorithm:
\begin{itemize}
\item Who can produce (``mine'') a new block---any full node, or only
  a member of a (relatively) small subset of validators?  (Most PoS
  systems require new blocks to be generated and signed by one of
  several designated validators.)
\item Do validators guarantee the validity of the blocks by their
  signatures, or are all full nodes expected to validate all blocks by
  themselves? (Scalable PoS systems must rely on validator signatures
  instead of requiring all nodes to validate all blocks of all
  blockchains.)
\item Is there a designated producer for the next blockchain block,
  known in advance, such that nobody else can produce that block
  instead?
\item Is a newly-created block originally signed by only one validator
  (its producer), or must it collect a majority of validator
  signatures from the very beginning?
\end{itemize}

While there seem to be $2^4$ possible classes of PoS algorithms
depending on the answers to these questions, the distinction in
practice boils down to two major approaches to PoS. In fact, most
modern PoS algorithms, designed to be used in scalable multi-chain
systems, answer the first two questions in the same fashion: only
validators can produce new blocks, and they guarantee block validity
without requiring all full nodes to check the validity of all blocks
by themselves.

As to the two last questions, their answers turn out to be highly
correlated, leaving essentially only two basic options:
\begin{itemize}
\item {\em Delegated Proof-of-Stake (DPOS)}: There is a universally
  known designated producer for every block; no one else can produce
  that block; the new block is originally signed only by its producing
  validator.
\item {\em Byzantine Fault Tolerant (BFT)} PoS algorithms: There is a
  known subset of validators, any of which can suggest a new block;
  the choice of the actual next block among several suggested
  candidates, which must be validated and signed by a majority of
  validators before being released to the other nodes, is achieved by
  a version of a Byzantine Fault Tolerant consensus protocol.
\end{itemize}

\nxsubpoint\label{sp:dpos.bft.compare} \embt(Comparison of DPOS and
BFT PoS.)  The BFT approach has the advantage that a newly-produced
block has {\em from the very beginning\/} the signatures of a majority
of validators testifying to its validity. Another advantage is that,
if a majority of validators executes the BFT consensus protocol
correctly, no forks can appear at all. On the other hand, BFT
algorithms tend to be quite convoluted and require more time for the
subset of validators to reach consensus. Therefore, blocks cannot be
generated too often. This is why we expect the TON Blockchain (which
is a BFT project from the perspective of this classification) to
produce a block only once every five seconds. In practice, this
interval might be decreased to 2--3 seconds (though we do not promise
this), but not further, if validators are spread across the globe.

The DPOS algorithm has the advantage of being quite simple and
straightforward. It can generate new blocks quite often---say, once
every two seconds, or maybe even once every second,\footnote{Some
  people even claim DPOS block generation times of half a second,
  which does not seem realistic if validators are scattered across
  several continents.} because of its reliance on designated block
producers known in advance.

However, DPOS requires all nodes---or at least all validators---to
validate all blocks received, because a validator producing and
signing a new block confirms not only the {\em relative\/} validity of
this block, but also the validity of the previous block it refers to,
and all the blocks further back in the chain (maybe up to the
beginning of the period of responsibility of the current subset of
validators). There is a predetermined order on the current subset of
validators, so that for each block there is a designated producer
(i.e., validator expected to generate that block); these designated
producers are rotated in a round-robin fashion. In this way, a block
is at first signed only by its producing validator; then, when the
next block is mined, and its producer chooses to refer to this block
and not to one of its predecessors (otherwise its block would lie in a
shorter chain, which might lose the ``longest fork'' competition in
the future), the signature of the next block is essentially an
additional signature on the previous block as well. In this way, a new
block gradually collects the signatures of more validators---say,
twenty signatures in the time needed to generate the next twenty
blocks. A full node will either need to wait for these twenty
signatures, or validate the block by itself, starting from a
sufficiently confirmed block (say, twenty blocks back), which might
not be so easy.

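The gradual accumulation of signatures can be illustrated with the
following sketch. It assumes an idealized round-robin schedule indexed
by block height, with no missed blocks and no forks; the function
names and this indexing are hypothetical simplifications and do not
describe any particular DPOS implementation.

\begin{verbatim}
def designated_producer(validators, height):
    """Round-robin choice of the producer for the block at this height."""
    return validators[height % len(validators)]

def implicit_signers(validators, block_height, tip_height):
    """Validators whose blocks build on the block at block_height,
    including its own producer."""
    return {designated_producer(validators, h)
            for h in range(block_height, tip_height + 1)}

validators = ["v%d" % i for i in range(21)]  # hypothetical validator set
# The block at height 100 has one signer at first, and twenty
# distinct signers once the chain tip reaches height 119.
print(len(implicit_signers(validators, 100, 100)))
print(len(implicit_signers(validators, 100, 119)))
\end{verbatim}
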
The obvious disadvantage of the DPOS algorithm is that a new block
(and transactions committed into it) achieves the same level of trust
(``recursive reliability'' as discussed in~\ptref{sp:rec.rel}) only
after twenty more blocks are mined, compared to the BFT algorithms,
which deliver this level of trust (say, twenty signatures)
immediately. Another disadvantage is that DPOS uses the ``longest fork
wins'' approach for switching to other forks; this makes forks quite
probable if at least some producers fail to produce subsequent blocks
after the one we are interested in (or we fail to observe these blocks
because of a network partition or a sophisticated attack).

We believe that the BFT approach, while more complicated to
implement and requiring longer time intervals between blocks than
DPOS, is better adapted to ``tightly-coupled''
(cf.~\ptref{sp:blkch.interact}) multichain systems, because other
blockchains can start acting almost immediately after seeing a
committed transaction (e.g., generating a message intended for them)
in a new block, without waiting for twenty confirmations of validity
(i.e., the next twenty blocks), or waiting for the next six blocks to
be sure that no forks appear and verifying the new block by themselves
(verifying blocks of other blockchains may become prohibitive in a
scalable multi-chain system). Thus they can achieve scalability while
preserving high reliability and availability
(cf.~\ptref{sp:shard.supp}).

On the other hand, DPOS might be a good choice for a
``loosely-coupled'' multi-chain system, where fast interaction between
blockchains is not required, e.g., if each blockchain
(``workchain'') represents a separate distributed exchange, and
inter-blockchain interaction is limited to rare transfers of tokens
from one workchain into another (or, rather, trading one altcoin
residing in one workchain for another at a rate approaching
$1:1$). This is what is actually done in the BitShares project, which
uses DPOS quite successfully.

To summarize, while DPOS can {\em generate\/} new blocks and {\em
  include transactions\/} into them {\em faster\/} (with smaller
intervals between blocks), these transactions reach the level of trust
required to use them in other blockchains and off-chain applications
as ``committed'' and ``immutable'' {\em much more slowly\/} than in
the BFT systems---say, in thirty seconds%
\footnote{For instance, EOS, one of the best DPOS projects proposed up
  to this date, promises a 45-second confirmation and inter-blockchain
  interaction delay (cf.~\cite{EOSWP}, ``Transaction Confirmation''
  and ``Latency of Interchain Communication'' sections).}
instead of five. Faster transaction {\em inclusion\/} does not mean
faster transaction {\em commitment}. This could become a huge problem
if fast inter-blockchain interaction is required. In that case, one
must abandon DPOS and opt for BFT PoS instead.

\nxsubpoint\label{sp:smartc.supp} \embt(Support for Turing-complete
code in transactions, i.e., essentially arbitrary smart contracts.)
Blockchain projects normally collect some {\em transactions\/} in
their blocks, which alter the blockchain state in a way deemed useful
(e.g., transfer some amount of cryptocurrency from one account to
another). Some blockchain projects might allow only some specific
predefined types of transactions (such as value transfers from one
account to another, provided correct signatures are presented). Others
might support some limited form of scripting in the
transactions. Finally, some blockchains support the execution of
arbitrarily complex code in transactions, enabling the system (at
least in principle) to support arbitrary applications, provided the
performance of the system permits. This is usually associated with
``Turing-complete virtual machines and scripting languages'' (meaning
that any program that can be written in any other computing language
may be re-written to be performed inside the blockchain), and ``smart
contracts'' (which are programs residing in the blockchain).

Of course, support for arbitrary smart contracts makes the system
truly flexible. On the other hand, this flexibility comes at a cost:
the code of these smart contracts must be executed on some virtual
machine, and this must be done every time for each transaction in the
block when somebody wants to create or validate a block. This slows
down the performance of the system compared to the case of a
predefined and immutable set of types of simple transactions, which
can be optimized by implementing their processing in a language such
as C++ (instead of some virtual machine).

Ultimately, support for Turing-complete smart contracts seems to be
desirable in any general-purpose blockchain project; otherwise, the
designers of the blockchain project must decide in advance which
applications their blockchain will be used for. In fact, the lack of
support for smart contracts in the Bitcoin blockchain was the
principal reason why a new blockchain project, Ethereum, had to be
created.

In a (heterogeneous; cf.~\ptref{sp:blkch.hom.het}) multi-chain system,
one might have ``the best of both worlds'' by supporting
Turing-complete smart contracts in some blockchains (i.e.,
workchains), and a small predefined set of highly-optimized
transactions in others.

\nxsubpoint\label{sp:class.multichain} \embt(Classification of
multichain systems.)  So far, the classification was valid both for
single-chain and multi-chain systems. However, multi-chain systems
admit several more classification criteria, reflecting the
relationship between the different blockchains in the system. We now
discuss these criteria.

\nxsubpoint\label{sp:blkch.hom.het} \embt(Blockchain types:
homogeneous and heterogeneous systems.)  In a multi-chain system, all
blockchains may be essentially of the same type and have the same
rules (i.e., use the same format of transactions, the same virtual
machine for executing smart-contract code, share the same
cryptocurrency, and so on), and this similarity is explicitly
exploited, but with different data in each blockchain. In this case,
we say that the system is {\em homogeneous}. Otherwise, different
blockchains (which will usually be called {\em workchains\/} in this
case) can have different ``rules''. Then we say that the system is
{\em heterogeneous}.

\nxsubpoint\label{sp:mixed.het.hom} \embt(Mixed
heterogeneous-homogeneous systems.)  Sometimes we have a mixed system,
where there are several sets of types or rules for blockchains, but
many blockchains with the same rules are present, and this fact is
explicitly exploited. Then it is a mixed {\em
  heterogeneous-homogeneous system}. To our knowledge, the TON
Blockchain is the only example of such a system.

\nxsubpoint\label{sp:het.confed} \embt(Heterogeneous systems with
several workchains having the same rules, or {\em confederations}.)
In some cases, several blockchains (work\-chains) with the same rules
can be present in a heterogeneous system, but the interaction between
them is the same as between blockchains with different rules (i.e.,
their similarity is not exploited explicitly). Even if they appear to
use ``the same'' cryptocurrency, they in fact use different
``altcoins'' (independent incarnations of the
cryptocurrency). Sometimes one can even have certain mechanisms to
convert these altcoins at a rate close to $1:1$. However, this does not
make the system homogeneous in our view; it remains heterogeneous. We
say that such a heterogeneous collection of workchains with the same
rules is a {\em confederation}.

While making a heterogeneous system that allows one to create several
work\-chains with the same rules (i.e., a confederation) may seem a
cheap way of building a scalable system, this approach has a lot of
drawbacks, too. Essentially, if someone hosts a large project in many
workchains with the same rules, she does not obtain a large project,
but rather a lot of small instances of this project. This is like
having a chat application (or a game) that allows having at most 50
members in any chat (or game) room, but ``scales'' by creating new
rooms to accommodate more users when necessary. As a result, a lot of
users can participate in the chats or in the game, but can we say that
such a system is truly scalable?

\nxsubpoint\label{sp:pres.masterch} \embt(Presence of a masterchain,
external or internal.)  Sometimes, a multi-chain project has a
distinguished ``masterchain'' (sometimes called ``control
blockchain''), which is used, for example, to store the overall
configuration of the system (the set of all active blockchains, or
rather workchains), the current set of validators (for a
Proof-of-Stake system), and so on. Sometimes other blockchains are
``bound'' to the masterchain, for example by committing the hashes of
their latest blocks into it (this is something the TON Blockchain
does, too).

3267
In some cases, the masterchain is {\em external}, meaning that it is
3268
not a part of the project, but some other pre-existing blockchain,
3269
originally completely unrelated to its use by the new project and
3270
agnostic of it. For example, one can try to use the Ethereum
3271
blockchain as a masterchain for an external project, and publish
3272
special smart contracts into the Ethereum blockchain for this purpose
3273
(e.g., for electing and punishing validators).
3274

\nxsubpoint\label{sp:shard.supp} \embt(Sharding support.)  Some
blockchain projects (or systems) have native support for {\em
  sharding}, meaning that several (necessarily homogeneous;
cf.~\ptref{sp:blkch.hom.het}) blockchains are thought of as {\em
  shards\/} of a single (from a high-level perspective) virtual
blockchain. For example, one can create 256 shard blockchains
(``shardchains'') with the same rules, and keep the state of an
account in exactly one shard selected depending on the first byte of
its $\accountid$.
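
As a minimal sketch of the 256-shard example above, the following
Python fragment selects the shard of an account from the first byte of
its identifier. The names \texttt{shard\_of} and \texttt{NUM\_SHARDS},
as well as the way the sample identifiers are produced (SHA-256 of a
label), are assumptions made for illustration only; they are not part
of any actual TON interface.

\begin{verbatim}
import hashlib

NUM_SHARDS = 256  # one shard per possible value of the first byte


def shard_of(account_id: bytes) -> int:
    """Return the index of the shard that keeps this account's state."""
    if len(account_id) != 32:
        raise ValueError("expected a 256-bit account_id")
    return account_id[0] % NUM_SHARDS


# Hypothetical account identifiers, derived here by hashing a label.
alice = hashlib.sha256(b"alice").digest()
bob = hashlib.sha256(b"bob").digest()
print(shard_of(alice), shard_of(bob))
\end{verbatim}

Because the shard index depends only on the account identifier, any
node can compute it locally, without consulting a directory of
accounts.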

Sharding is a natural approach to scaling blockchain systems, because,
if it is properly implemented, users and smart contracts in the system
need not be aware of the existence of sharding at all. In fact, one
often wants to add sharding to an existing single-chain project (such
as Ethereum) when the load becomes too high.

An alternative approach to scaling would be to use a ``confederation''
of heterogeneous workchains as described in~\ptref{sp:het.confed},
allowing each user to keep her account in one or several workchains of
her choice, and transfer funds from her account in one workchain to
another workchain when necessary, essentially performing a $1:1$
altcoin exchange operation. The drawbacks of this approach have
already been discussed in~\ptref{sp:het.confed}.

However, sharding is not so easy to implement in a fast and reliable
fashion, because it implies a lot of messages between different
shardchains. For example, if accounts are evenly distributed between
$N$ shards, and the only transactions are simple fund transfers from
one account to another, then only a small fraction ($1/N$) of all
transactions will be performed within a single blockchain; almost all
($1-1/N$) transactions will involve two blockchains, requiring
inter-blockchain communication. If we want these transactions to be
fast, we need a fast system for transferring messages between
shardchains. In other words, the blockchain project needs to be
``tightly-coupled'' in the sense described
in~\ptref{sp:blkch.interact}.
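
The fraction $1/N$ can be checked by a short computation, under the
additional assumption (ours, made only for this illustration) that the
source and the destination account of a transfer fall into each of the
$N$ shards independently and with equal probability $1/N$:
\[
  P(\text{same shard}) = \sum_{i=1}^{N} \frac{1}{N}\cdot\frac{1}{N}
  = \frac{1}{N},
  \qquad
  P(\text{two shards}) = 1 - \frac{1}{N}.
\]
For instance, with $N=256$ shards only $1/256\approx 0.4\%$ of the
transfers stay inside one shardchain, while $255/256\approx 99.6\%$ of
them require a message between two shardchains.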

\nxsubpoint\label{sp:dyn.stat.shard} \embt(Dynamic and static
sharding.)  Sharding might be {\em dynamic\/} (if additional shards
are automatically created when necessary) or {\em static\/} (when
there is a predefined number of shards, which is changeable only
through a hard fork at best). Most sharding proposals are static; the
TON Blockchain uses dynamic sharding (cf.~\ptref{sect:split.merge}).
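
The difference between the two kinds of sharding can be illustrated by
a toy model in Python. It is only a sketch under our own assumptions:
shards are identified by binary prefixes of the account identifier, a
shard is split in two once its load exceeds a limit, and the load is
assumed to divide evenly between the halves. The names \texttt{split},
\texttt{rebalance} and \texttt{shard\_for} are hypothetical, and this
is not the split/merge procedure of the TON Blockchain described
in~\ptref{sect:split.merge}.

\begin{verbatim}
def split(prefix: str):
    """Split the shard `prefix` into its two child shards."""
    return prefix + "0", prefix + "1"


def rebalance(loads: dict, limit: int) -> dict:
    """One pass of dynamic sharding: split every overloaded shard."""
    result = {}
    for prefix, load in loads.items():
        if load > limit:
            left, right = split(prefix)
            result[left] = load // 2           # assumed even split
            result[right] = load - load // 2
        else:
            result[prefix] = load
    return result


def shard_for(account_bits: str, shards) -> str:
    """Find the unique shard whose prefix matches the account."""
    for n in range(len(account_bits) + 1):
        if account_bits[:n] in shards:
            return account_bits[:n]
    raise KeyError(account_bits)


loads = {"": 10}                  # a single shard covering all accounts
loads = rebalance(loads, limit=4)
loads = rebalance(loads, limit=4)
print(sorted(loads), shard_for("0110", loads))
\end{verbatim}

In a static scheme the set of prefixes would be fixed once and for
all; here it grows only where the load requires it.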

\nxsubpoint\label{sp:blkch.interact} \embt(Interaction between
blockchains: loosely-coupled and tightly-coupled systems.)
Multi-blockchain projects can be classified according to the supported
level of interaction between the constituent blockchains.

The lowest level of support is the absence of any interaction between
different blockchains whatsoever. We do not consider this case here,
because we would rather say that these blockchains are not parts of
one blockchain system, but just separate instances of the same
blockchain protocol.

The next level of support is the absence of any specific support for
messaging between blockchains, making interaction possible in
principle, but awkward. We call such systems ``loosely-coupled''; in
them one must send messages and transfer value between blockchains as
if they were blockchains belonging to completely separate
blockchain projects (e.g., Bitcoin and Ethereum; imagine two parties
who want to exchange some Bitcoins, kept in the Bitcoin blockchain, for
Ethers, kept in the Ethereum blockchain). In other words, one must
include the outbound message (or its generating transaction) in a
block of the source blockchain. Then one (or some other party) must
wait for enough confirmations (e.g., a given number of subsequent
blocks) to consider the originating transaction to be ``committed''
and ``immutable'', so as to be able to perform external actions based
on its existence. Only then may a transaction relaying the message
into the target blockchain (perhaps along with a reference and a
Merkle proof of existence for the originating transaction) be
committed.
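
The relaying step described above can be sketched in Python as
follows. This is an illustration under stated assumptions, not the
block format of any particular blockchain: the tree is a plain binary
Merkle tree over SHA-256 in which an odd node is paired with itself,
and the threshold of six confirmations in \texttt{may\_relay} is chosen
for illustration only. All function names are hypothetical.

\begin{verbatim}
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def next_level(level):
    """Hash adjacent pairs; an odd last node is paired with itself."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]   # assumes at least one leaf
    while len(level) > 1:
        level = next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes on the path from leaf `index` up to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        sibling = index ^ 1
        proof.append((padded[sibling], sibling < index))
        level = next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


def may_relay(tx, proof, root, confirmations, required=6):
    """Relay only a proven transaction that is buried deep enough."""
    return confirmations >= required and verify(tx, proof, root)


txs = [b"tx0", b"tx1", b"tx2"]
root = merkle_root(txs)
print(may_relay(txs[2], merkle_proof(txs, 2), root, confirmations=6))
\end{verbatim}

The target blockchain needs only the root hash stored in a
sufficiently confirmed block of the source blockchain in order to
check such a proof; it does not need the full source block.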

If one does not wait long enough before transferring the message, or
if a fork happens anyway for some other reason, the joint state of
the two blockchains turns out to be inconsistent: a message is
delivered into the second blockchain that has never been generated in
(the ultimately chosen fork of) the first blockchain.

Sometimes partial support for messaging is added, by standardizing the
format of messages and the location of input and output message queues
in the blocks of all workchains (this is especially useful in
heterogeneous systems). While this facilitates messaging to a certain
extent, it is conceptually not too different from the previous case,
so such systems are still ``loosely-coupled''.

By contrast, ``tightly-coupled'' systems include special mechanisms to
provide fast messaging between all blockchains. The desired behavior
is to be able to deliver a message into another workchain immediately
after it has been generated in a block of the originating
blockchain. On the other hand, ``tightly-coupled'' systems are also
expected to maintain overall consistency in the case of forks. While
these two requirements appear to be contradictory at first glance, we
believe that the mechanisms used by the TON Blockchain (the inclusion
of shardchain block hashes into masterchain blocks; the use of
``vertical'' blockchains for fixing invalid blocks,
cf.~\ptref{sp:inv.sh.blk.corr}; hypercube routing,
cf.~\ptref{sp:hypercube}; Instant Hypercube Routing,
cf.~\ptref{sp:instant.hypercube}) enable it to be a
``tightly-coupled'' system, perhaps the only one so far.

Of course, building a ``loosely-coupled'' system is much simpler;
however, fast and efficient sharding (cf.~\ptref{sp:shard.supp})
requires the system to be ``tightly-coupled''.

\nxsubpoint\label{sp:blkch.gen} \embt(Simplified
classification. Generations of blockchain projects.)  The
classification we have suggested so far splits all blockchain projects
into a large number of classes. However, the classification criteria
we use happen to be quite correlated in practice. This enables us to
suggest a simplified ``generational'' approach to the classification
of blockchain projects, as a very rough approximation of reality,
with some examples. Projects that have not been implemented and
deployed yet are shown in {\em italics}; the most important
characteristics of a generation are shown in {\bf bold}.
\begin{itemize}
\item First generation: Single-chain, {\bf PoW}, no support for smart
  contracts. Examples: Bitcoin (2009) and a lot of otherwise
  uninteresting imitators (Litecoin, Monero, \dots).
\item Second generation: Single-chain, PoW, {\bf smart-contract
  support}. Example: Ethereum (2013; deployed in 2015), at least in
  its original form.
\item Third generation: Single-chain, {\bf PoS}, smart-contract
  support. Example: {\em future Ethereum} (2018 or later).
\item Alternative third ($3'$) generation: {\bf Multi-chain}, PoS, no
  support for smart contracts, loosely-coupled. Example: BitShares
  (2013--2014; uses DPoS).
\item Fourth generation: {\bf Multi-chain, PoS, smart-contract
  support}, loosely-coupled. Examples: {\em EOS\/} (2017; uses DPoS),
  {\em PolkaDot\/} (2016; uses BFT).
\item Fifth generation: Multi-chain, PoS with BFT, smart-contract
  support, {\bf tightly-coupled, with sharding}. Example: {\em TON\/}
  (2017).
\end{itemize}
While not all blockchain projects fall precisely into one of these
categories, most of them do.

\nxsubpoint\label{sp:genome.change.never} \embt(Complications of
changing the ``genome'' of a blockchain project.)  The above
classification defines the ``genome'' of a blockchain project. This
genome is quite ``rigid'': it is almost impossible to change it once
the project is deployed and is used by a lot of people. One would need
a series of hard forks (which would require the approval of the
majority of the community), and even then the changes would need to be
very conservative in order to preserve backward compatibility (e.g.,
changing the semantics of the virtual machine might break existing
smart contracts). An alternative would be to create new ``sidechains''
with different rules, and bind them somehow to the blockchain
(or the blockchains) of the original project. One might use the
blockchain of the existing single-blockchain project as an external
masterchain for an essentially new and separate project.\footnote{For
  example, the Plasma project plans to use the Ethereum blockchain as
  its (external) masterchain; it does not interact much with Ethereum
  otherwise, and it could have been suggested and implemented by a
  team unrelated to the Ethereum project.}

Our conclusion is that the genome of a project is very hard to change
once it has been deployed. Even starting with PoW and planning to
replace it with PoS in the future is quite complicated.\footnote{As of
  2017, Ethereum is still struggling to transition from PoW to a
  combined PoW+PoS system; we hope it will become a truly PoS system
  someday.} Adding shards to a project originally designed without
support for them seems almost impossible.\footnote{There are sharding
  proposals for Ethereum dating back to 2015; it is unclear how they
  might be implemented and deployed without disrupting Ethereum or
  creating an essentially independent parallel project.} In fact,
adding support for smart contracts into a project (namely, Bitcoin)
originally designed without support for such features has been deemed
impossible (or at least undesirable by the majority of the Bitcoin
community) and eventually led to the creation of a new blockchain
project, Ethereum.

\nxsubpoint \embt(Genome of the TON Blockchain.)  Therefore, if one
wants to build a scalable blockchain system, one must choose its
genome carefully from the very beginning. If the system is meant to
support some additional specific functionality in the future not known
at the time of its deployment, it should support ``heterogeneous''
workchains (having potentially different rules) from the start. For
the system to be truly scalable, it must support sharding from the
very beginning; sharding makes sense only if the system is
``tightly-coupled'' (cf.~\ptref{sp:blkch.interact}), so this in turn
implies the existence of a masterchain, a fast system of
inter-blockchain messaging, usage of BFT PoS, and so on.

When one takes into account all these implications, most of the design
choices made for the TON Blockchain project appear natural, and almost
the only ones possible.

\mysubsection{Comparison to Other Blockchain
  Projects}\label{sect:compare.blkch}

We conclude our brief discussion of the TON Blockchain and its most
important and unique features by trying to find a place for it on a
map containing existing and proposed blockchain projects. We use the
classification criteria described in~\ptref{sect:class.blkch} to
discuss different blockchain projects in a uniform way and construct
such a ``map of blockchain projects''. We represent this map as
Table~\ref{tab:blkch.proj}, and then briefly discuss a few projects
separately to point out their peculiarities that may not fit into the
general scheme.

\begin{table}
  \captionsetup{font=scriptsize}
  \begin{tabular}{|c|cc|ccc|ccc|}
    \hline
    Project & Year & G. & Cons. & Sm. & Ch. & R. & Sh. & Int. \\
    \hline
    Bitcoin & 2009 & 1 & PoW & no & 1 \\
    Ethereum & 2013, 2015 & 2 & PoW & yes & 1 \\
    NXT & 2014 & 2+ & PoS & no & 1 \\
    Tezos & 2017, ? & 2+ & PoS & yes & 1 \\
    Casper & 2015, (2017) & 3 & PoW/PoS & yes & 1 \\
    \hline
    BitShares & 2013, 2014 & $3'$ & DPoS & no & m & ht. & no & L \\
    EOS & 2016, (2018) & 4 & DPoS & yes & m & ht. & no & L \\
    PolkaDot & 2016, (2019) & 4 & PoS BFT & yes & m & ht. & no & L \\
    Cosmos & 2017, ? & 4 & PoS BFT & yes & m & ht. & no & L \\
    TON & 2017, (2018) & 5 & PoS BFT & yes & m & mix & dyn. & T \\
    \hline
  \end{tabular}
  \caption{A summary of some notable blockchain projects. The columns
    are: {\em Project} -- project name; {\em Year} -- year announced
    and year deployed; {\em G.} -- generation
    (cf.~\ptref{sp:blkch.gen}); {\em Cons.} -- consensus algorithm
    (cf.~\ptref{sp:pow.pos} and~\ptref{sp:dpos.bft}); {\em Sm.} --
    support for arbitrary code (smart contracts;
    cf.~\ptref{sp:smartc.supp}); {\em Ch.} -- single/multiple
    blockchain system (cf.~\ptref{sp:single.multi}); {\em R.} --
    heterogeneous/homogeneous multichain systems
    (cf.~\ptref{sp:blkch.hom.het}); {\em Sh.} -- sharding support
    (cf.~\ptref{sp:shard.supp}); {\em Int.} -- interaction between
    blockchains, (L)oose or (T)ight (cf.~\ptref{sp:blkch.interact}).
  }\label{tab:blkch.proj}
\end{table}

\nxsubpoint \embt(Bitcoin \cite{BitcWP}; \url{https://bitcoin.org/}.)
{\em Bitcoin\/} (2009) is the first and the most famous
block\-chain project. It is a typical {\em first-generation}
blockchain project: it is single-chain, it uses Proof-of-Work with a
``longest-fork-wins'' fork selection algorithm, and it
does not have a Turing-complete scripting language
(however, simple scripts without loops are supported). The
Bitcoin blockchain has no notion of an account; it uses
the UTXO (Unspent Transaction Output) model instead.

\nxsubpoint \embt(Ethereum \cite{EthWP}; \url{https://ethereum.org/}.)
{\em Ethereum\/} (2015) is the first blockchain with
support for Turing-complete smart contracts. As such, it
is a typical {\em second-generation\/} project, and the
most popular among them. It uses Proof-of-Work on a single
blockchain, but has smart contracts and accounts.

\nxsubpoint \embt(NXT; \url{https://nxtplatform.org/}.)  {\em NXT\/}
(2014) is the first PoS-based blockchain and currency. It is still
single-chain, and has no smart contract support.

\nxsubpoint \embt(Tezos; \url{https://www.tezos.com/}.)  {\em Tezos\/}
(2018 or later) is a proposed PoS-based single-blockchain project. We
mention it here because of its unique feature: its block
interpretation function $\evblock$ (cf.~\ptref{sp:blk.transf}) is not
fixed, but is determined by an OCaml module, which can be upgraded by
committing a new version into the blockchain (and collecting some
votes for the proposed change). In this way, one will be able to
create custom single-chain projects by first deploying a ``vanilla''
Tezos blockchain, and then gradually changing the block interpretation
function in the desired direction, without any need for hard forks.

This idea, while intriguing, has the obvious drawback that it forbids
any optimized implementations in other languages like C++, so a
Tezos-based blockchain is destined to have lower performance. We think
that a similar result might have been obtained by publishing a formal
{\em specification\/} of the proposed block interpretation function
$\evblock$, without fixing a particular {\em implementation}.

\nxsubpoint
\embt(Casper.)%
\footnote{\url{https://blog.ethereum.org/2015/08/01/introducing-casper-friendly-ghost/}}
{\em Casper\/} is an upcoming PoS algorithm for Ethereum; its gradual
deployment in 2017 (or 2018), if successful, will change Ethereum into
a single-chain PoS or mixed PoW+PoS system with smart contract
support, transforming Ethereum into a {\em third-generation\/}
project.

\nxsubpoint \embt(BitShares \cite{BitShWP};
\url{https://bitshares.org}.)  {\em BitShares\/} (2014) is a platform
for distributed blockchain-based exchanges. It is a heterogeneous
multi-blockchain DPoS system without smart contracts; it achieves its
high performance by allowing only a small set of predefined
specialized transaction types, which can be efficiently implemented in
C++, assuming the blockchain state fits into memory. It is also the
first blockchain project to use Delegated Proof-of-Stake (DPoS),
demonstrating its viability at least for some specialized purposes.

\nxsubpoint\label{sp:discuss.EOS} \embt(EOS \cite{EOSWP};
\url{https://eos.io}.)  {\em EOS\/} (2018 or later) is a proposed
heterogeneous multi-blockchain DPoS system {\em with\/} smart contract
support and with some minimal support for messaging (still
loosely-coupled in the sense described
in~\ptref{sp:blkch.interact}). It is an attempt by the same team that
has previously successfully created the BitShares and SteemIt
projects, demonstrating the strong points of the DPoS consensus
algorithm. Scalability will be achieved by creating specialized
workchains for projects that need it (e.g., a distributed exchange
might use a workchain supporting a special set of optimized
transactions, similarly to what BitShares did) and by creating
multiple workchains with the same rules ({\em confederations\/} in the
sense described in~\ptref{sp:het.confed}). The drawbacks and
limitations of this approach to scalability have been discussed in
{\em loc.~cit.} Cf.\ also \ptref{sp:dpos.bft.compare},
\ptref{sp:shard.supp}, and \ptref{sp:blkch.interact} for a more
detailed discussion of DPoS, sharding, interaction between workchains
and their implications for the scalability of a blockchain system.

At the same time, even if one will not be able to ``create a
Facebook inside a blockchain''
(cf.~\ptref{sp:blockchain.facebook}), EOS or otherwise, we think
that EOS might become a convenient platform for some
highly-specialized weakly interacting distributed applications,
similar to BitShares (decentralized exchange) and SteemIt
(decentralized blog platform).

\nxsubpoint\label{sp:discuss.PolkaDot} \embt(PolkaDot \cite{PolkaWP};
\url{https://polkadot.io/}.)  {\em PolkaDot\/} (2019 or later) is one
of the best thought-out and most detailed proposed multichain
Proof-of-Stake projects; its development is led by one of the
Ethereum co-founders. This project is one of the closest projects to
the TON Blockchain on our map. (In fact, we are indebted for our
terminology for ``fishermen'' and ``nominators'' to the PolkaDot
project.)

PolkaDot is a heterogeneous loosely-coupled multichain Proof-of-Stake
project, with Byzantine Fault Tolerant (BFT) consensus for generation
of new blocks and a masterchain (which might be external---e.g., the
Ethereum blockchain). It also uses hypercube routing, somewhat like
(the slow version of) TON's as described in~\ptref{sp:hypercube}.

Its unique feature is its ability to create not only {\em public}, but
also {\em private\/} blockchains. These private blockchains would also
be able to interact with other public blockchains, PolkaDot or
otherwise.

As such, PolkaDot might become a platform for large-scale {\em
  private\/} block\-chains, which might be used, for example, by bank
consortiums to quickly transfer funds to each other, or for any other
uses a large corporation might have for private blockchain technology.

However, PolkaDot has no sharding support and is not
tightly-coupled. This somewhat hampers its scalability, which is
similar to that of EOS. (Perhaps a bit better, because PolkaDot uses
BFT PoS instead of DPoS.)

\nxsubpoint \embt(Universa; \url{https://universa.io}.)  The only
reason we mention this unusual blockchain project here is that it
is the only project so far to make in passing an explicit reference to
something similar to our Infinite Sharding Paradigm
(cf.~\ptref{sp:ISP}). Its other peculiarity is that it bypasses all
complications related to Byzantine Fault Tolerance by promising that
only trusted and licensed partners of the project will be admitted as
validators, hence they will never commit invalid blocks. This is an
interesting decision; however, it essentially makes a blockchain
project deliberately {\em centralized}, something blockchain projects
usually want to avoid (why does one need a blockchain at all to work
in a trusted centralized environment?).

\nxsubpoint \embt(Plasma; \url{https://plasma.io}.)  {\em Plasma\/}
(2019?) is an unconventional blockchain project from another
co-founder of Ethereum. It is supposed to mitigate some limitations of
Ethereum without introducing sharding. In essence, it is a separate
project from Ethereum, introducing a hierarchy of (heterogeneous)
workchains, bound to the Ethereum blockchain (to be used as an
external masterchain) at the top level. Funds can be transferred from
any blockchain up in the hierarchy (starting from the Ethereum
blockchain as the root), along with a description of a job to be
done. Then the necessary computations are done in the child workchain
(possibly requiring forwarding of parts of the original job further
down the tree), their results are passed up, and a reward is
collected. The problem of achieving consistency and validating these
workchains is circumvented by a (payment channel-inspired) mechanism
allowing users to unilaterally withdraw their funds from a misbehaving
workchain to its parent workchain (albeit slowly), and re-allocate
their funds and their jobs to another workchain.

In this way, Plasma might become a platform for distributed
computations bound to the Ethereum blockchain, something like a
``mathematical co-processor''. However, this does not seem like a way
to achieve true general-purpose scalability.

\nxsubpoint \embt(Specialized blockchain projects.)  There are also
some specialized blockchain projects, such as FileCoin (a system that
incentivizes users to offer their disk space for storing the files of
other users who are willing to pay for it), Golem (a blockchain-based
platform for renting and lending computing power for specialized
applications such as 3D-rendering) or SONM (another similar computing
power-lending project). Such projects do not introduce anything
conceptually new on the level of blockchain organization; rather, they
are particular blockchain applications, which could be implemented by
smart contracts running in a general-purpose blockchain, provided it
can deliver the required performance. As such, projects of this kind
are likely to use one of the existing or planned blockchain projects
as their base, such as EOS, PolkaDot or TON. If a project needs
``true'' scalability (based on sharding), it would do better to use TON;
if it is content to work in a ``confederated'' context by defining a
family of workchains of its own, explicitly optimized for its purpose,
it might opt for EOS or PolkaDot.

\nxsubpoint \embt(The TON Blockchain.)  The TON (Telegram Open
Network) Block\-chain (planned 2018) is the project we are describing
in this document. It is designed to be the first fifth-generation
blockchain project---that is, a BFT PoS multichain project, mixed
homogeneous/heterogeneous, with support for (shardable) custom
workchains, with native sharding support, and tightly-coupled (in
particular, capable of forwarding messages between shards almost
instantly while preserving a consistent state of all shardchains). As
such, it will be a truly scalable general-purpose blockchain project,
capable of accommodating essentially any applications that can be
implemented in a blockchain at all. When augmented by the other
components of the TON Project (cf.~\ptref{sect:ton.components}), its
possibilities expand even further.

\nxsubpoint\label{sp:blockchain.facebook} \embtx(Is it possible to
``upload Facebook into a blockchain''?)  Sometimes people claim that
it will be possible to implement a social network on the scale of
Facebook as a distributed application residing in a
blockchain. Usually a favorite blockchain project is cited as a
possible ``host'' for such an application.

We cannot say that this is a technical impossibility. Of course, one
needs a tightly-coupled blockchain project with true sharding (i.e.,
TON) in order for such a large application not to work too slowly
(e.g., deliver messages and updates from users residing in one
shardchain to their friends residing in another shardchain with
reasonable delays). However, we think that this is not needed and will
never be done, because the price would be prohibitive.

Let us consider ``uploading Facebook into a blockchain'' as a thought
experiment; any other project of similar scale might serve as an
example as well. Once Facebook is uploaded into a blockchain, all
operations currently done by Facebook's servers will be serialized as
transactions in certain blockchains (e.g., TON's shardchains), and
will be performed by all validators of these blockchains. Each
operation will have to be performed, say, at least twenty times, if we
expect every block to collect at least twenty validator signatures
(immediately or eventually, as in DPoS systems). Similarly, all data
kept by Facebook's servers on their disks will be kept on the disks of
all validators for the corresponding shardchain (i.e., in at least
twenty copies).

Because the validators are essentially the same servers (or perhaps
clusters of servers, but this does not affect the validity of this
argument) as those currently used by Facebook, we see that the total
hardware expenses associated with running Facebook in a blockchain are
at least twenty times higher than if it were implemented in the
conventional way.

In fact, the expenses would be much higher still, because the
blockchain's virtual machine is slower than the ``bare CPU'' running
optimized compiled code, and its storage is not optimized for
Facebook-specific problems. One might partially mitigate this problem
by crafting a specific workchain with some special transactions
adapted for Facebook; this is the approach of BitShares and EOS to
achieving high performance, available in the TON Blockchain as
well. However, the general blockchain design would still impose some
additional restrictions by itself, such as the necessity to register
all operations as transactions in a block, to organize these
transactions in a Merkle tree, to compute and check their Merkle
hashes, to propagate this block further, and so on.

Therefore, a conservative estimate is that one would need 100 times
more servers of the same performance as those used by Facebook now in
order to validate a blockchain project hosting a social network of
that scale. Somebody will have to pay for these servers, either the
company owning the distributed application (imagine seeing 700 ads on
each Facebook page instead of 7) or its users. Either way, this does
not seem economically viable.
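
The arithmetic behind this estimate can be spelled out as follows.
The symbols are introduced here only for illustration: $C_{\mathrm{conv}}$
is the hardware cost of the conventional implementation, $v$ the number
of validators that repeat every operation and store every byte, and
$s$ the slowdown caused by the virtual machine and by block
bookkeeping. The text states only $v\geq 20$ and the overall factor of
100; the value $s=100/20=5$ is merely the ratio of these two numbers,
not a measured quantity.
\[
  C_{\mathrm{chain}} \;\geq\; v\cdot s\cdot C_{\mathrm{conv}}
  \;=\; 20\cdot 5\cdot C_{\mathrm{conv}}
  \;=\; 100\, C_{\mathrm{conv}}.
\]
If the same increase had to be covered by advertising alone, the $7$
ads per page of the example would become $7\cdot 100 = 700$.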

We believe that {\em it is not true that everything should be uploaded
  into the blockchain}. For example, it is not necessary to keep user
photographs in the blockchain; registering the hashes of these
photographs in the blockchain and keeping the photographs in a
distributed off-chain storage (such as FileCoin or TON Storage) would
be a better idea. This is the reason why TON is not just a blockchain
project, but a collection of several components (TON P2P Network, TON
Storage, TON Services) centered around the TON Blockchain as outlined
in Chapters~\ptref{sect:ton.components} and~\ptref{sect:services}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%                  NETWORK
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\clearpage
\mysection{TON Networking}\label{sect:network}

Any blockchain project requires not only a specification of block
format and blockchain validation rules, but also a network protocol
used to propagate new blocks, send and collect transaction candidates
and so on. In other words, a specialized peer-to-peer network must be
set up by every blockchain project. This network must be peer-to-peer,
because blockchain projects are normally expected to be decentralized,
so one cannot rely on a centralized group of servers and use
conventional client-server architecture, as, for instance, classical
online banking applications do. Even light clients (e.g., light
cryptocurrency wallet smartphone applications), which must connect to
full nodes in a client-server--like fashion, are actually free to
connect to another full node if their previous peer goes down,
provided the protocol used to connect to full nodes is standardized
enough.

3782
While the networking demands of single-blockchain projects, such as
3783
Bitcoin or Ethereum, can be met quite easily (one essentially needs to
3784
construct a ``random'' peer-to-peer overlay network, and propagate all
3785
new blocks and transaction candidates by a gossip protocol),
3786
multi-blockchain projects, such as the TON Blockchain, are much more
3787
demanding (e.g., one must be able to subscribe to updates of only some
3788
shardchains, not necessarily all of them). Therefore, the networking
3789
part of the TON Blockchain and the TON Project as a whole merits at
3790
least a brief discussion.
3791

3792
On the other hand, once the more sophisticated network protocols
3793
needed to support the TON Blockchain are in place, it turns out that
3794
they can easily be used for purposes not necessarily related to the
3795
immediate demands of the TON Blockchain, thus providing more
3796
possibilities and flexibility for creating new services in the TON
3797
ecosystem.

\mysubsection{Abstract Datagram Network Layer}\label{sect:ANL}

The cornerstone in building the TON networking protocols is the {\em
  (TON) Abstract (Datagram) Network Layer}. It enables all nodes to
assume certain ``network identities'', represented by 256-bit
``abstract network addresses'', and communicate (send datagrams to
each other, as a first step) using only these 256-bit network
addresses to identify the sender and the receiver. In particular, one
does not need to worry about IPv4 or IPv6 addresses, UDP port numbers,
and the like; they are hidden by the Abstract Network Layer.

\nxsubpoint\label{sp:abs.addr} \embt(Abstract network addresses.)
An {\em abstract network address}, or an {\em abstract address}, or
just {\em address\/} for short, is a 256-bit integer, essentially
equal to a 256-bit ECC public key. This public key can be generated
arbitrarily, thus creating as many different network identities as the
node likes. However, one must know the corresponding {\em private\/}
key in order to receive (and decrypt) messages intended for such an
address.

In fact, the address is {\em not\/} the public key itself; instead, it
is a 256-bit hash ($\Hash=\Sha$) of a serialized TL-object
(cf.~\ptref{sp:TL}) that can describe several types of public keys and
addresses depending on its constructor (first four bytes). In the
simplest case, this serialized TL-object consists just of a 4-byte
magic number and a 256-bit elliptic curve cryptography (ECC) public
key; in this case, the address will equal the hash of this 36-byte
structure. One might use, however, 2048-bit RSA keys, or any other
scheme of public-key cryptography instead.
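%

As a minimal sketch of the simplest case above (not the actual TL
serialization), the address computation can be written in Python as
follows. It assumes that $\Sha$ denotes SHA-256; the 4-byte
constructor value \texttt{MAGIC\_ECC\_PUBKEY} and the function name
are hypothetical placeholders.

\begin{verbatim}
```python
import hashlib

# Hypothetical 4-byte TL constructor; the real value is not given here.
MAGIC_ECC_PUBKEY = bytes.fromhex("01020304")


def abstract_address(public_key: bytes) -> bytes:
    """Return SHA-256 of the 36-byte preimage (magic || 256-bit key)."""
    if len(public_key) != 32:
        raise ValueError("expected a 256-bit (32-byte) public key")
    preimage = MAGIC_ECC_PUBKEY + public_key
    return hashlib.sha256(preimage).digest()


if __name__ == "__main__":
    demo_key = bytes(range(32))  # placeholder bytes, not a real key
    print(abstract_address(demo_key).hex())
```
\end{verbatim}

Since the address is only a hash, it does not reveal the public key;
this is why the preimage has to be communicated separately.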

When a node learns another node's abstract address, it must also
receive its ``preimage'' (i.e., the serialized TL-object, the hash of
which equals that abstract address) or else it will not be able to
encrypt and send datagrams to that address.

\nxsubpoint \embt(Lower-level networks. UDP implementation.)  From the
perspective of almost all TON Networking components, the only thing
that exists is a network (the Abstract Datagram Network Layer) able
to (unreliably) send datagrams from one abstract address to
another. In principle, the Abstract Datagram Network Layer (ADNL)
can be implemented over different existing network
technologies. However, we are going to implement it over UDP in
IPv4/IPv6 networks (such as the Internet or intranets), with an
optional TCP fallback if UDP is not available.

\nxsubpoint\label{sp:net.simple.dg} \embt(Simplest case of ADNL over
UDP.)  The simplest case of sending a datagram from a sender's
abstract address to any other abstract address (with known preimage)
can be implemented as follows.

Suppose that the sender somehow knows the IP-address and the UDP port
of the receiver who owns the destination abstract address, and that
both the receiver and the sender use abstract addresses derived from
256-bit ECC public keys.

In this case, the sender simply augments the datagram to be sent with
its ECC signature (made with its private key) and its source address
(or the preimage of the source address, if the receiver is not yet
known to have that preimage). The result is encrypted with the
recipient's public key, prefixed with the recipient's 256-bit abstract
address in plaintext, embedded into a UDP datagram, and sent to the
known IP address and port of the recipient. Because the first 256 bits
of the UDP datagram contain the recipient's abstract address, the
recipient can identify which private key should be used to decrypt the
remainder of the datagram. Only after that is the sender's identity
revealed.

\nxsubpoint\label{sp:net.simplest.dg} \embt(Less secure way, with the
sender's address in plaintext.)  Sometimes a less secure scheme is
sufficient, in which the recipient's and the sender's addresses are
kept in plaintext in the UDP datagram. The sender's private key and
the recipient's public key are combined using ECDH (Elliptic Curve
Diffie--Hellman) to generate a 256-bit shared secret, which is then
used, along with a random 256-bit nonce also included in the
unencrypted part, to derive the AES keys used for encryption.
Integrity may be provided, for instance, by concatenating the hash of
the original plaintext data to the plaintext before encryption.
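%

The key derivation and the integrity framing described above might be
sketched in Python as follows. This is an illustration under stated
assumptions, not the actual ADNL construction: the shared secret is
taken as given (the ECDH step is not shown), the derivation function
(SHA-256 over a label, the secret and the nonce) is chosen here only
for concreteness, and the AES encryption itself is omitted because it
is not part of the Python standard library. All function names are
hypothetical.

\begin{verbatim}
```python
import hashlib
import os


def derive_aes_material(shared_secret: bytes, nonce: bytes):
    """Derive a 256-bit key and a 128-bit IV (illustrative KDF only)."""
    if len(shared_secret) != 32 or len(nonce) != 32:
        raise ValueError("secret and nonce must both be 256 bits")
    key = hashlib.sha256(b"key" + shared_secret + nonce).digest()
    iv = hashlib.sha256(b"iv" + shared_secret + nonce).digest()[:16]
    return key, iv


def add_integrity(plaintext: bytes) -> bytes:
    """Concatenate the hash of the plaintext with the plaintext."""
    return hashlib.sha256(plaintext).digest() + plaintext


def check_integrity(framed: bytes) -> bytes:
    """Return the plaintext if the leading hash matches, else raise."""
    digest, plaintext = framed[:32], framed[32:]
    if hashlib.sha256(plaintext).digest() != digest:
        raise ValueError("integrity check failed")
    return plaintext


if __name__ == "__main__":
    secret = os.urandom(32)  # stands in for the ECDH output
    nonce = os.urandom(32)   # travels in the unencrypted part
    key, iv = derive_aes_material(secret, nonce)
    framed = add_integrity(b"hello")  # this is what would be encrypted
    print(len(key), len(iv), check_integrity(framed))
```
\end{verbatim}

Because the nonce changes with every datagram, the cached shared
secret can be reused while each datagram still gets fresh key
material.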

This approach has the advantage that, if more than one datagram is
expected to be exchanged between the two addresses, the shared secret
can be computed only once and then cached; then slower elliptic curve
operations will no longer be required for encrypting or decrypting the
next datagrams.

\nxsubpoint\label{sp:net.channels} \embt(Channels and channel
identifiers.)  In the simplest case, the first 256 bits of a UDP
datagram carrying an embedded TON ADNL datagram will be equal to the
recipient's address. However, in general they constitute a {\em
  channel identifier}. There are different types of channels. Some of
them are point-to-point; they are created by two parties who wish to
exchange a lot of data in the future and generate a shared secret by
exchanging several packets encrypted as described
in~\ptref{sp:net.simple.dg} or~\ptref{sp:net.simplest.dg}, by running
classical or elliptic curve Diffie--Hellman (if extra security is
required), or simply by one party generating a random shared secret
and sending it to the other party.

After that, a channel identifier is derived from the shared secret
combined with some additional data (such as the sender's and
recipient's addresses), for instance by hashing, and that identifier
is used as the first 256 bits of UDP datagrams carrying data encrypted
with the aid of that shared secret.

\nxsubpoint\label{sp:tunnels} \embt(Channel as a tunnel identifier.)
In general, a ``channel'', or ``channel identifier'', simply selects a
way of processing an inbound UDP datagram, known to the receiver. If
the channel is the receiver's abstract address, the processing is done
as outlined in~\ptref{sp:net.simple.dg} or \ptref{sp:net.simplest.dg};
if the channel is an established point-to-point channel discussed
in~\ptref{sp:net.channels}, the processing consists in decrypting the
datagram with the aid of the shared secret as explained in {\em
  loc.~cit.}, and so on.
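%

The role of the channel identifier as a selector of a processing
routine can be sketched as a lookup table keyed by the first 256 bits
of the inbound datagram. The sketch below is only illustrative: the
class \texttt{ChannelTable}, the function
\texttt{derive\_channel\_id}, and the exact hash input are assumptions
made here, and the handlers are stubs that would perform the actual
decryption in a real node.

\begin{verbatim}
```python
import hashlib
from typing import Callable, Dict

Handler = Callable[[bytes], str]


class ChannelTable:
    """Maps 256-bit channel identifiers to processing routines."""

    def __init__(self) -> None:
        self._handlers: Dict[bytes, Handler] = {}

    def register(self, channel_id: bytes, handler: Handler) -> None:
        if len(channel_id) != 32:
            raise ValueError("channel identifier must be 256 bits")
        self._handlers[channel_id] = handler

    def dispatch(self, udp_payload: bytes) -> str:
        """Select a handler by the first 32 bytes of the datagram."""
        channel_id, body = udp_payload[:32], udp_payload[32:]
        handler = self._handlers.get(channel_id)
        if handler is None:
            return "dropped: unknown channel"
        return handler(body)


def derive_channel_id(secret: bytes, sender: bytes,
                      recipient: bytes) -> bytes:
    """Illustrative derivation: hash of secret and both addresses."""
    return hashlib.sha256(secret + sender + recipient).digest()


if __name__ == "__main__":
    own, peer = b"\x11" * 32, b"\x22" * 32
    table = ChannelTable()
    # Channel equal to our own abstract address: public-key path.
    table.register(own, lambda body: "decrypt with private key")
    # Established point-to-point channel: shared-secret path.
    p2p = derive_channel_id(b"\x33" * 32, peer, own)
    table.register(p2p, lambda body: "decrypt with shared secret")
    print(table.dispatch(p2p + b"payload"))
```
\end{verbatim}

In this sketch the identifier depends on the order of the two
addresses, so each direction of a point-to-point channel gets its own
identifier; this is a design choice of the example, not something the
text prescribes.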

In particular, a channel identifier can actually select a ``tunnel'',
when the immediate recipient simply forwards the received message to
somebody else---the actual recipient or another proxy. Some encryption
or decryption steps (reminiscent of ``onion routing'' \cite{Onion} or
even ``garlic
routing''\footnote{\url{https://geti2p.net/en/docs/how/garlic-routing}})
might be done along the way, and another channel identifier might be
used for re-encrypted forwarded packets (for example, a peer-to-peer
channel could be employed to forward the packet to the next recipient
on the path).

In this way, some support for ``tunneling'' and
``proxying''---somewhat similar to that provided by the Tor or I2P
projects---can be added on the level of the TON Abstract Datagram
Network Layer, without affecting the functionality of all higher-level
TON network protocols, which would be agnostic of such an
addition. This opportunity is exploited by the {\em TON Proxy\/}
service (cf.~\ptref{sp:ex.ton.proxy}).

\nxsubpoint\label{sp:net.startup} \embt(Zero channel and the bootstrap
problem.)  Normally, a TON ADNL node will have some ``neighbor
table'', containing information about other known nodes, such as their
abstract addresses and their preimages (i.e., public keys) and their
IP addresses and UDP ports. Then it will gradually extend this table
by using information learned from these known nodes as answers to
special queries, and sometimes prune obsolete records.

However, when a TON ADNL node just starts up, it may happen that it
does not know any other node, and can learn only the IP address and
UDP port of a node, but not its abstract address. This happens, for
example, if a light client is not able to access any of the previously
cached nodes and any nodes hardcoded into the software, and must ask
the user to enter an IP address or a DNS domain of a node, to be
resolved through DNS.

In this case, the node will send packets to a special ``zero channel''
of the node in question. This does not require knowledge of the
recipient's public key (but the message should still contain the
sender's identity and signature), so the message is transferred
without encryption.  It should normally be used only to obtain an
identity (maybe a one-time identity created especially for this
purpose) of the receiver, and then to start communicating in a safer
way.

Once at least one node is known, it is easy to populate the ``neighbor
table'' and ``routing table'' with more entries, learning them from
answers to special queries sent to the already known nodes.

Not all nodes are required to process datagrams sent to the zero
channel, but those used to bootstrap light clients should support this
feature.

\nxsubpoint \embt(TCP-like stream protocol over ADNL.)  The ADNL,
being an unreliable (small-size) datagram protocol based on 256-bit
abstract addresses, can be used as a base for more sophisticated
network protocols. One can build, for example, a TCP-like stream
protocol, using ADNL as an abstract replacement for IP. However, most
components of the TON Project do not need such a stream protocol.

\nxsubpoint\label{sp:RLDP} \embt(RLDP, or Reliable Large Datagram
Protocol over ADNL.)  A reliable arbitrary-size datagram protocol
built upon the ADNL, called RLDP, is used instead of a TCP-like
protocol. This reliable datagram protocol can be employed, for
instance, to send RPC queries to remote hosts and receive answers from
them (cf.~\ptref{sp:pure.net.serv}).

\mysubsection{TON DHT: Kademlia-like Distributed Hash
  Table}\label{sect:kademlia}

The {\em TON Distributed Hash Table (DHT)\/} plays a crucial role in
the networking part of the TON Project, being used to locate other
nodes in the network. For example, a client wanting to commit a
transaction into a shardchain might want to find a validator or a
collator of that shardchain, or at least some node that might relay
the client's transaction to a collator. This can be done by looking up
a special key in the TON DHT. Another important application of the TON
DHT is that it can be used to quickly populate a new node's neighbor
table (cf.~\ptref{sp:net.startup}), simply by looking up a random key,
or the new node's address. If a node uses proxying and tunneling for
its inbound datagrams, it publishes the tunnel identifier and its
entry point (e.g., IP address and UDP port) in the TON DHT; then all
nodes wishing to send datagrams to that node will obtain this contact
information from the DHT first.

The TON DHT is a member of the family of {\em Kademlia-like distributed
  hash tables\/}~\cite{Kademlia}.

\nxsubpoint \embt(Keys of the TON DHT.)  The {\em keys\/} of the TON
DHT are simply 256-bit integers. In most cases, they are computed as
$\Sha$ of a TL-serialized object (cf.~\ptref{sp:TL}), called {\em
  preimage\/} of the key, or {\em key description}. In some cases, the
abstract addresses of the TON Network nodes (cf.~\ptref{sp:abs.addr})
can also be used as keys of the TON DHT, because they are also
256-bit, and they are also hashes of TL-serialized objects. For
example, if a node is not afraid of publishing its IP address, it can
be found by anybody who knows its abstract address by simply looking
up that address as a key in the DHT.

\nxsubpoint \embt(Values of the DHT.)  The {\em values\/} assigned to
these 256-bit keys are essentially arbitrary byte strings of limited
length. The interpretation of such byte strings is determined by the
preimage of the corresponding key; it is usually known both by the
node that looks up the key, and by the node that stores the key.

\nxsubpoint \embt(Nodes of the DHT. Semi-permanent network
identities.)  The key-value mapping of the TON DHT is kept on the {\em
  nodes\/} of the DHT---essentially, all members of the TON
Network. To this end, any node of the TON Network (perhaps with the
exception of some very light nodes), apart from any number of
ephemeral and permanent abstract addresses described
in~\ptref{sp:abs.addr}, has at least one ``semi-permanent address'',
which identifies it as a member of the TON DHT. This {\em
  semi-permanent\/} or {\em DHT address\/} should not be changed
too often, otherwise other nodes would be unable to locate the keys
they are looking for. If a node does not want to reveal its ``true''
identity, it generates a separate abstract address to be used only for
the purpose of participating in the DHT. However, this abstract
address must be public, because it will be associated with the node's
IP address and port.

\nxsubpoint \embt(Kademlia distance.)  Now we have both 256-bit keys
and 256-bit (semi-permanent) node addresses. We introduce the
so-called {\em XOR distance\/} or {\em Kademlia distance~$d_K$} on the
set of 256-bit sequences, given by
\begin{equation}
  d_K(x,y):=(x\oplus y)\quad\text{interpreted as an unsigned 256-bit
    integer}
\end{equation}
Here $x\oplus y$ denotes the bitwise eXclusive OR (XOR) of two bit
sequences of the same length.

The Kademlia distance introduces a metric on the set $\st2^{256}$ of
all 256-bit sequences. In particular, we have $d_K(x,y)=0$ if and only
if $x=y$, $d_K(x,y)=d_K(y,x)$, and $d_K(x,z)\leq
d_K(x,y)+d_K(y,z)$. Another important property is that {\em there is
  only one point at any given distance from~$x$}: $d_K(x,y)=d_K(x,y')$
implies $y=y'$.
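%

For illustration, consider the same construction on 4-bit sequences
instead of 256-bit ones (a simplification made here only to keep the
numbers small). For $x=1010_2$ and $y=0110_2$ we obtain
\begin{equation*}
  d_K(x,y)=1010_2\oplus 0110_2=1100_2=12 .
\end{equation*}
The uniqueness property holds because XOR is its own inverse: if
$d=d_K(x,y)$, then $y=x\oplus d$, so $y$ is determined by $x$
and~$d$; in the example, $1010_2\oplus 1100_2=0110_2=y$. The triangle
inequality follows from $x\oplus z=(x\oplus y)\oplus(y\oplus z)$
together with $a\oplus b\leq a+b$ for non-negative integers $a$
and~$b$.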

\nxsubpoint \embt(Kademlia-like DHTs and the TON DHT.)  We say that a
distributed hash table (DHT) with 256-bit keys and 256-bit node
addresses is a {\em Kademlia-like DHT\/} if it is expected to keep the
value of key $K$ on the $s$ Kademlia-nearest nodes to $K$ (i.e., the
$s$ nodes with the smallest Kademlia distance from their addresses to
$K$).

Here $s$ is a small parameter, say, $s=7$, needed to improve the
reliability of the DHT (if we kept the key on only one node, the
nearest one to~$K$, the value of that key would be lost as soon as
that node went offline).
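%

Selecting the $s$ Kademlia-nearest nodes amounts to sorting the known
addresses by their XOR with the key. A minimal Python sketch, with
addresses modeled as non-negative integers and hypothetical function
names, could look like this:

\begin{verbatim}
```python
from typing import Iterable, List


def kademlia_distance(x: int, y: int) -> int:
    """XOR distance between two addresses given as integers."""
    return x ^ y


def nearest_nodes(key: int, nodes: Iterable[int],
                  s: int = 7) -> List[int]:
    """Return the s node addresses with the smallest distance to key."""
    ordered = sorted(nodes, key=lambda a: kademlia_distance(a, key))
    return ordered[:s]


if __name__ == "__main__":
    # 4-bit toy addresses; real ones are 256-bit integers.
    known = [0b1011, 0b0010, 0b1110, 0b1000]
    print(nearest_nodes(0b1010, known, s=2))
```
\end{verbatim}

Because no two distinct addresses lie at the same distance from the
key, the ordering (and hence the set of responsible nodes) is
unambiguous.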

The TON DHT is a Kademlia-like DHT, according to this definition. It
is implemented over the ADNL protocol described in~\ptref{sect:ANL}.

\nxsubpoint \embt(Kademlia routing table.)  Any node participating in
a Kademlia-like DHT usually maintains a {\em Kademlia routing
  table}. In the case of TON DHT, it consists of $n=256$ buckets,
numbered from $0$ to $n-1$. The $i$-th bucket will contain information
about some known nodes (a fixed number $t$ of ``best'' nodes, and
maybe some extra candidates) that lie at a Kademlia distance from
$2^i$ to $2^{i+1}-1$ from the node's address $a$.\footnote{If there
  are sufficiently many nodes in a bucket, it can be subdivided
  further into, say, eight sub-buckets depending on the top four bits
  of the Kademlia distance (the leading bit is always~1 within a
  bucket, so the next three bits select the sub-bucket). This would
  speed up DHT lookups.} This
information includes their (semi-permanent) addresses, IP addresses
and UDP ports, and some availability information such as the time and
the delay of the last ping.
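%

The bucket containing a given node follows directly from the position
of the highest set bit of the Kademlia distance. The following Python
sketch (with a hypothetical function name, and addresses modeled as
integers) illustrates this:

\begin{verbatim}
```python
def bucket_index(own_address: int, other_address: int) -> int:
    """Return i such that 2**i <= d_K(own, other) <= 2**(i+1) - 1."""
    distance = own_address ^ other_address
    if distance == 0:
        raise ValueError("a node is not stored in its own table")
    return distance.bit_length() - 1


if __name__ == "__main__":
    a = 0b1010
    for b in (0b1011, 0b1110, 0b0010):
        print(bin(b), "-> bucket", bucket_index(a, b))
```
\end{verbatim}

Higher-numbered buckets cover exponentially larger ranges of
distances, so a node keeps relatively detailed knowledge of its close
neighborhood and only a sample of the distant nodes.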

When a Kademlia node learns about any other Kademlia node as a result
of some query, it includes it into a suitable bucket of its routing
table, first as a candidate. Then, if some of the ``best'' nodes in
that bucket fail (e.g., do not respond to ping queries for a long
time), they can be replaced by some of the candidates. In this way the
Kademlia routing table stays populated.

New nodes from the Kademlia routing table are also included in the
ADNL neighbor table described in~\ptref{sp:net.startup}. If a ``best''
node from a bucket of the Kademlia routing table is used often, a
channel in the sense described in~\ptref{sp:net.channels} can be
established to facilitate the encryption of datagrams.

A special feature of the TON DHT is that it tries to select nodes with
the smallest round-trip delays as the ``best'' nodes for the buckets
of the Kademlia routing table.

\nxsubpoint \embt(Kademlia network queries.)  A Kademlia node usually
supports the following network queries:
\begin{itemize}
\item $\Ping$ -- Checks node availability.
\item $\Store(key,value)$ -- Asks the node to keep $value$ as a value
  for key $key$. For TON DHT, the $\Store$ queries are slightly more
  complicated (cf.~\ptref{sp:DHT.store}).
\item $\FindNode(key,l)$ -- Asks the node to return $l$
  Kademlia-nearest known nodes (from its Kademlia routing table) to
  $key$.
\item $\FindValue(key,l)$ -- The same as above, but if the node knows
  the value corresponding to key $key$, it just returns that value.
\end{itemize}

When any node wants to look up the value of a key $K$, it first
creates a set $S$ of $s'$ nodes (for some small value of $s'$, say,
$s'=5$), nearest to $K$ with respect to the Kademlia distance among
all known nodes (i.e., they are taken from the Kademlia routing
table). Then a $\FindValue$ query is sent to each of them, and nodes
mentioned in their answers are included in $S$. Then the $s'$ nodes
from $S$, nearest to $K$, are also sent a $\FindValue$ query if this
hasn't been done before, and the process continues until the value is
found or the set $S$ stops growing. This is a sort of ``beam search''
for the node nearest to $K$ with respect to Kademlia distance.
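%

The lookup procedure can be illustrated by a small in-memory
simulation. Everything in the sketch below is a simplification made
for this example: the class \texttt{Node} stands in for a remote peer,
its \texttt{find\_value} method imitates a $\FindValue$ query without
any networking, the routing table is flattened into a plain set, and
the loop stops once the $s'$ nearest members of $S$ have all been
queried (which is when further rounds could no longer change the
outcome).

\begin{verbatim}
```python
from typing import Dict, Optional, Set


class Node:
    """Toy in-memory DHT node; a stand-in for a remote peer."""

    def __init__(self, address: int) -> None:
        self.address = address
        self.known: Set[int] = set()      # flattened routing table
        self.values: Dict[int, bytes] = {}

    def find_value(self, key: int, l: int):
        """Return ('value', v) if stored, else ('nodes', l nearest)."""
        if key in self.values:
            return "value", self.values[key]
        nearest = sorted(self.known, key=lambda a: a ^ key)[:l]
        return "nodes", nearest


def lookup(network: Dict[int, Node], start: Node, key: int,
           s_prime: int = 5) -> Optional[bytes]:
    """Query the s' nearest known nodes until the value is found."""
    candidates: Set[int] = set(start.known)   # the set S
    queried: Set[int] = set()
    while True:
        nearest = sorted(candidates, key=lambda a: a ^ key)[:s_prime]
        pending = [a for a in nearest if a not in queried]
        if not pending:
            return None   # nothing new to ask: value not found
        for address in pending:
            queried.add(address)
            peer = network.get(address)
            if peer is None:
                continue  # models an unreachable node
            kind, payload = peer.find_value(key, s_prime)
            if kind == "value":
                return payload
            candidates.update(payload)


if __name__ == "__main__":
    chain = [1, 2, 4, 8, 12, 13]
    net = {a: Node(a) for a in chain}
    for a, b in zip(chain, chain[1:]):
        net[a].known.add(b)           # each node knows the next one
    net[13].values[13] = b"stored at node 13"
    print(lookup(net, net[1], 13))
```
\end{verbatim}

A lookup for setting a value would proceed in the same way, but would
collect the nearest nodes instead of returning early.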

If the value of some key $K$ is to be set, the same procedure is run
for $s'\geq s$, with $\FindNode$ queries instead of $\FindValue$, to
find $s$ nearest nodes to $K$. Afterwards, $\Store$ queries are sent
to all of them.

There are some less important details in the implementation of a
Kademlia-like DHT (for example, any node should look up $s$ nearest
nodes to itself, say, once every hour, and re-publish all stored keys
to them by means of $\Store$ queries). We will ignore them for the
time being.

\nxsubpoint \embt(Booting a Kademlia node.)  When a Kademlia node goes
online, it first populates its Kademlia routing table by looking up
its own address. During this process, it identifies the $s$ nearest
nodes to itself. It can download from them all $(key,value)$ pairs
known to them to populate its part of the DHT.

\nxsubpoint\label{sp:DHT.store} \embt(Storing values in TON DHT.)
Storing values in TON DHT is slightly different from a general
Kademlia-like DHT. When someone wishes to store a value, she must
provide not only the key $K$ itself to the $\Store$ query, but also
its {\em preimage\/}---i.e., a TL-serialized string (with one of
several predefined TL-constructors at the beginning) containing a
``description'' of the key. This key description is later kept by the
node, along with the key and the value.

The key description describes the ``type'' of the object being stored,
its ``owner'', and its ``update rules'' in case of future updates. The
owner is usually identified by a public key included in the key
description. If it is included, normally only updates signed by the
corresponding private key will be accepted. The ``type'' of the stored
object is normally just a byte string. However, in some cases it can
be more sophisticated---for example, an input tunnel description
(cf.~\ptref{sp:tunnels}), or a collection of node addresses.

The ``update rules'' can also be different. In some cases, they simply
permit replacing the old value with the new value, provided the new
value is signed by the owner (the signature must be kept as part of
the value, to be checked later by any other nodes after they obtain
the value of this key). In other cases, the old value somehow affects
the new value. For example, it can contain a sequence number, and the
old value is overwritten only if the new sequence number is larger (to
prevent replay attacks).
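%

The sequence-number update rule can be sketched as follows. The
sketch is hedged in several ways: the record layout
(\texttt{StoredValue}), the signed payload format, and all function
names are assumptions made for this example, and the ``signature'' is
a keyed hash (HMAC) used purely as a stand-in so that the code runs
with the Python standard library; a real node would verify a
public-key signature against the owner's key from the key description.

\begin{verbatim}
```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Dict

Verifier = Callable[[bytes, bytes], bool]


@dataclass(frozen=True)
class StoredValue:
    seqno: int
    data: bytes
    signature: bytes  # kept with the value so others can re-check it


def signed_payload(seqno: int, data: bytes) -> bytes:
    """Bytes covered by the signature (illustrative layout)."""
    return seqno.to_bytes(8, "big") + data


def try_update(store: Dict[bytes, StoredValue], key: bytes,
               new: StoredValue, verify: Verifier) -> bool:
    """Accept only owner-signed updates with a larger seqno."""
    payload = signed_payload(new.seqno, new.data)
    if not verify(payload, new.signature):
        return False              # not signed by the owner
    old = store.get(key)
    if old is not None and new.seqno <= old.seqno:
        return False              # stale or replayed update
    store[key] = new
    return True


def toy_sign(secret: bytes, payload: bytes) -> bytes:
    """Stand-in for a real signature: HMAC-SHA256 with a secret."""
    return hmac.new(secret, payload, hashlib.sha256).digest()


def make_toy_verifier(secret: bytes) -> Verifier:
    def verify(payload: bytes, signature: bytes) -> bool:
        return hmac.compare_digest(toy_sign(secret, payload),
                                   signature)
    return verify
```
\end{verbatim}

Replaying an old, correctly signed value fails the sequence-number
check, while a forged value with a fresh sequence number fails the
signature check.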

\nxsubpoint\label{sp:distr.torr.tr} \embt(Distributed ``torrent
trackers'' and ``network interest groups'' in TON DHT.)  Yet another
interesting case is when the value contains a list of nodes---perhaps
with their IP addresses and ports, or just with their abstract
addresses---and the ``update rule'' consists in including the
requester in this list, provided she can confirm her identity.

This mechanism might be used to create a distributed ``torrent
tracker'', where all nodes interested in a certain ``torrent'' (i.e.,
a certain file) can find other nodes that are interested in the same
torrent, or already have a copy.

{\em TON Storage\/} (cf.~\ptref{sp:ex.ton.storage}) uses this
technology to find the nodes that have a copy of a required file
(e.g., a snapshot of the state of a shardchain, or an old
block). However, its more important use is to create ``overlay
multicast subnetworks'' and ``network interest groups''
(cf.~\ptref{sect:overlay}). The idea is that only some nodes are
interested in the updates of a specific shardchain. If the number of
shardchains becomes very large, finding even one node interested in
the same shard may become complicated. This ``distributed torrent
tracker'' provides a convenient way to find some of these
nodes. Another option would be to request them from a validator, but
this would not be a scalable approach, and validators might choose not
to respond to such queries coming from arbitrary unknown nodes.

\nxsubpoint \embt(Fall-back keys.)  Most of the ``key types''
described so far have an extra 32-bit integer field in their TL
description, normally equal to zero. However, if the key obtained by
hashing that description cannot be retrieved from or updated in the
TON DHT, the value in this field is increased, and a new attempt is
made. In this way, one cannot ``capture'' and ``censor'' a key (i.e.,
perform a key retention attack) by creating a lot of abstract
addresses lying near the key under attack and controlling the
corresponding DHT nodes.
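%

A minimal sketch of the fall-back mechanism is given below. The
serialization used here (the 32-bit field simply appended to the
description in little-endian order), the limit on the number of
attempts, and the callback \texttt{try\_lookup} are all assumptions
made for illustration; the real key descriptions are TL-serialized
objects.

\begin{verbatim}
```python
import hashlib
from typing import Callable, Optional, Tuple


def dht_key(description: bytes, fallback: int) -> bytes:
    """Hash a key description together with its fall-back field."""
    field = fallback.to_bytes(4, "little")
    return hashlib.sha256(description + field).digest()


def lookup_with_fallback(
        description: bytes,
        try_lookup: Callable[[bytes], Optional[bytes]],
        max_attempts: int = 4) -> Optional[Tuple[int, bytes]]:
    """Try fall-back values 0, 1, 2, ... until a lookup succeeds."""
    for fallback in range(max_attempts):
        value = try_lookup(dht_key(description, fallback))
        if value is not None:
            return fallback, value
    return None


if __name__ == "__main__":
    # Toy DHT in which the primary key (fallback 0) is unavailable.
    table = {dht_key(b"demo description", 1): b"value"}
    print(lookup_with_fallback(b"demo description", table.get))
```
\end{verbatim}

Each fall-back value hashes to an unrelated 256-bit key, so an
attacker who surrounds one key with its own nodes gains nothing for
the next one.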

\nxsubpoint\label{sp:loc.serv} \embt(Locating services.)  Some
services, located in the TON Network and available through the
(higher-level protocols built upon the) TON ADNL described
in~\ptref{sect:ANL}, may want to publish their abstract addresses
somewhere, so that their clients would know where to find them.

However, publishing the service's abstract address in the TON
Blockchain may not be the best approach, because the abstract address
might need to be changed quite often, and because it could make sense
to provide several addresses, for reliability or load balancing
purposes.

An alternative is to publish a public key into the TON Blockchain, and
use a special DHT key indicating that public key as its ``owner'' in
the TL description string (cf.~\ptref{sp:TL}) to publish an up-to-date
list of the service's abstract addresses. This is one of the
approaches exploited by TON Services.

\nxsubpoint \embt(Locating owners of TON blockchain accounts.)  In
most cases, owners of TON blockchain accounts would not like to be
associated with abstract network addresses, and especially IP
addresses, because this can violate their privacy. In some cases,
however, the owner of a TON blockchain account may want to publish
one or several abstract addresses where she could be contacted.

A typical case is that of a node in the TON Payments ``lightning
network'' (cf.~\ptref{sect:lightning}), the platform for instant
cryptocurrency transfers. A public TON Payments node may want not only
to establish payment channels with other peers, but also to publish an
abstract network address that could be used to contact it at a later
time for transferring payments along the already-established channels.

One option would be to include an abstract network address in the
smart contract creating the payment channel. A more flexible option is
to include a public key in the smart contract, and then use the DHT as
explained in~\ptref{sp:loc.serv}.

The most natural way would be to use the same private key that
controls the account in the TON Blockchain to sign and publish updates
in the TON DHT about the abstract addresses associated with that
account. This is done almost in the same way as described
in~\ptref{sp:loc.serv}; however, the DHT key employed would require a
special key description, containing only the $\accountid$ itself,
equal to $\Sha$ of the ``account description'', which contains the
public key of the account. The signature, included in the value of
this DHT key, would contain the account description as well.

In this way, a mechanism for locating abstract network addresses of
some owners of the TON Blockchain accounts becomes available.

\nxsubpoint\label{sp:loc.abs.addr} \embt(Locating abstract addresses.)
Notice that the TON DHT, while being implemented over TON ADNL, is
itself used by the TON ADNL for several purposes.

The most important of them is to locate a node or its contact data
starting from its 256-bit abstract address. This is necessary because
the TON ADNL should be able to send datagrams to arbitrary 256-bit
abstract addresses, even if no additional information is provided.

To this end, the 256-bit abstract address is simply looked up as a key
in the DHT. Either a node with this address (i.e., using this address
as a public semi-permanent DHT address) is found, in which case its
IP address and port can be learned; or, an input tunnel description
may be retrieved as the value of the key in question, signed by the
correct private key, in which case this tunnel description would be
used to send ADNL datagrams to the intended recipient.

Notice that in order to make an abstract address ``public'' (reachable
from any node in the network), its owner must either use it as a
semi-permanent DHT address, or publish (in the DHT key equal to the
abstract address under consideration) an input tunnel description with
another of its public abstract addresses (e.g., the semi-permanent
address) as the tunnel's entry point. Another option would be to
simply publish its IP address and UDP port.

\mysubsection{Overlay Networks and Multicasting
  Messages}\label{sect:overlay}

In a multi-blockchain system like the TON Blockchain, even full nodes
would normally be interested in obtaining updates (i.e., new blocks)
only about some shardchains. To this end, a special overlay
(sub)network must be built inside the TON Network, on top of the ADNL
protocol discussed in~\ptref{sect:ANL}, one for each shardchain.

Therefore, the need to build arbitrary overlay subnetworks, open to
any nodes willing to participate, arises. Special gossip protocols,
built upon ADNL, will be run in these overlay networks. In particular,
these gossip protocols may be used to propagate (broadcast) arbitrary
data inside such a subnetwork.

\nxsubpoint \embt(Overlay networks.)  An {\em overlay (sub)network\/}
is simply a (virtual) network implemented inside some larger
network. Usually only some nodes of the larger network participate in
the overlay subnetwork, and only some ``links'' between these nodes,
physical or virtual, are part of the overlay subnetwork.

In this way, if the encompassing network is represented as a graph
(perhaps a full graph in the case of a datagram network such as ADNL,
where any node can easily communicate with any other), the overlay
subnetwork is a {\em subgraph\/} of this graph.

In most cases, the overlay network is implemented using some protocol
built upon the network protocol of the larger network. It may use the
same addresses as the larger network, or use custom addresses.

\nxsubpoint\label{sp:ton.overlays} \embt(Overlay networks in TON.)
Overlay networks in TON are built upon the ADNL protocol discussed
in~\ptref{sect:ANL}; they use 256-bit ADNL abstract addresses as
addresses in the overlay networks as well. Each node usually selects
one of its abstract addresses to double as its address in the overlay
network.

In contrast to ADNL, the TON overlay networks usually do not support
sending datagrams to arbitrary other nodes. Instead, some
``semipermanent links'' are established between some nodes (called
``neighbors'' with respect to the overlay network under
consideration), and messages are usually forwarded along these links
(i.e., from a node to one of its neighbors). In this way, a TON
overlay network is a (usually not full) subgraph inside the (full)
graph of the ADNL network.

Links to neighbors in TON overlay networks can be implemented using
dedicated peer-to-peer ADNL channels (cf.~\ptref{sp:net.channels}).

Each node of an overlay network maintains a list of neighbors (with
respect to the overlay network), containing their abstract addresses
(which identify them in the overlay network) and some link
data (e.g., the ADNL channel used to communicate with them).

\nxsubpoint \embt(Private and public overlay networks.)  Some overlay
networks are {\em public}, meaning that any node can join them at
will. Others are {\em private}, meaning that only certain nodes can be
admitted (e.g., those that can prove their identities as validators).
Some private overlay networks can even be unknown to the ``general
public''. The information about such overlay networks is made
available only to certain trusted nodes; for example, it can be
encrypted with a public key, and only nodes having a copy of the
corresponding private key will be able to decrypt this information.

\nxsubpoint \embt(Centrally controlled overlay networks.)  Some
overlay networks are {\em centrally controlled}, by one or several
nodes, or by the owner of some widely-known public key. Others are
{\em decentralized}, meaning that there are no specific nodes
responsible for them.
4342

4343
\nxsubpoint \embt(Joining an overlay network.)  When a node wants to
join an overlay network, it must first learn its 256-bit {\em network
  identifier}, usually equal to $\Sha$ of the {\em description\/} of
the overlay network---a TL-serialized object (cf.~\ptref{sp:TL}) which
may contain, for instance, the central authority of the overlay
network (i.e., its public key and perhaps its abstract
address\footnote{Alternatively, the abstract address might be stored
  in the DHT as explained in~\ptref{sp:loc.serv}.}), a string with the
name of the overlay network, a TON Blockchain shard identifier if this
is an overlay network related to that shard, and so on.

Sometimes it is possible to recover the overlay network description
starting from the network identifier, simply by looking it up in the
TON DHT. In other cases (e.g., for private overlay networks), one must
obtain the network description along with the network identifier.

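\medbreak
The identifier computation can be sketched as follows, assuming that
$\Sha$ denotes SHA-256 and that the description has already been
TL-serialized into a byte string. The function name and the sample
byte string are hypothetical placeholders, not part of any TON
interface.

\begin{verbatim}
import hashlib

def network_id(serialized_description: bytes) -> bytes:
    """Return the 256-bit identifier of an overlay network description."""
    return hashlib.sha256(serialized_description).digest()

# Hypothetical stand-in for a TL-serialized description object.
description = b"overlay:example-shard-blocks"
identifier = network_id(description)
\end{verbatim}

Any node holding the same serialized description derives the same
identifier, so the identifier can also serve to verify a description
obtained from an untrusted source.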
\nxsubpoint\label{sp:loc.1.mem} \embt(Locating one member of the
overlay network.)  After a node learns the network identifier and the
network description of the overlay network it wants to join, it must
locate at least one node belonging to that network.

This is also needed for nodes that do not want to join the overlay
network, but only want to communicate with it; for example, there
might be an overlay network dedicated to collecting and propagating
transaction candidates for a specific shardchain, and a client might
want to connect to any node of this network to suggest a transaction.

The method used for locating members of an overlay network is defined
in the description of that network. Sometimes (especially for private
networks) one must already know a member node to be able to join. In
other cases, the abstract addresses of some nodes are contained in the
network description. A more flexible approach is to indicate in the
network description only the central authority responsible for the
network, and then the abstract addresses will be available through
values of certain DHT keys, signed by that central authority.

Finally, truly decentralized public overlay networks can use the
``distributed torrent-tracker'' mechanism described
in~\ptref{sp:distr.torr.tr}, also implemented with the aid of the TON
DHT.

\nxsubpoint\label{sp:loc.many.mem} \embt(Locating more members of the
overlay network. Creating links.)  Once one node of the overlay
network is found, a special query may be sent to that node requesting
a list of other members, for instance, neighbors of the node being
queried, or a random selection thereof.

This enables the joining member to populate its ``adjacency'' or
``neighbor list'' with respect to the overlay network, by selecting
some newly-learned network nodes and establishing links to them (i.e.,
dedicated ADNL point-to-point channels, as outlined
in~\ptref{sp:ton.overlays}). After that, special messages are sent to
all neighbors indicating that the new member is ready to work in the
overlay network. The neighbors then add their links to the new member
to their neighbor lists.

\nxsubpoint\label{sp:rand.mem} \embt(Maintaining the neighbor list.)
An overlay network node must update its neighbor list from time to
time. Some neighbors, or at least links (channels) to them, may stop
responding; in this case, these links must be marked as ``suspended'',
some attempts to reconnect to such neighbors must be made, and, if
these attempts fail, the links must be destroyed.

On the other hand, every node sometimes requests from a randomly
chosen neighbor its list of neighbors (or some random selection
thereof), and uses it to partially update its own neighbor list, by
adding some newly-discovered nodes to it, and removing some of the old
ones, either randomly or depending on their response times and
datagram loss statistics.

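\medbreak
The maintenance rules above can be illustrated by a minimal Python
sketch. The class name, the retry budget and the size limit are
assumptions made for this example; the text does not fix any of them,
and old neighbors are evicted here purely at random rather than by
response time or loss statistics.

\begin{verbatim}
import random

MAX_RETRIES = 3  # hypothetical reconnect budget, not given in the text

class NeighborList:
    def __init__(self, max_size=8):
        self.max_size = max_size
        self.failures = {}  # abstract address -> consecutive failed probes

    def add(self, addr):
        if len(self.failures) < self.max_size:
            self.failures.setdefault(addr, 0)

    def report_probe(self, addr, ok):
        """Record a probe result; suspend on failure, destroy after retries."""
        if addr not in self.failures:
            return
        if ok:
            self.failures[addr] = 0
            return
        self.failures[addr] += 1
        if self.failures[addr] > MAX_RETRIES:
            del self.failures[addr]  # reconnect attempts failed: destroy link

    def suspended(self):
        return {a for a, n in self.failures.items() if n > 0}

    def merge(self, learned, rng=random):
        """Partially update the list from a neighbor's reply."""
        for addr in learned:
            if addr in self.failures:
                continue
            if len(self.failures) >= self.max_size:
                victim = rng.choice(sorted(self.failures))
                del self.failures[victim]  # random eviction of an old entry
            self.failures[addr] = 0
\end{verbatim}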
\nxsubpoint \embt(The overlay network is a random subgraph.)  In this
way, the overlay network becomes a random subgraph inside the ADNL
network. If the degree of each vertex is at least three (i.e., if each
node is connected to at least three neighbors), this random graph is
known to be {\em connected\/} with a probability almost equal to
one. More precisely, the probability of a random graph with $n$
vertices being {\em dis\/}connected is exponentially small, and this
probability can be completely neglected if, say, $n\geq20$. (Of
course, this does not apply in the case of a global network partition,
when nodes on different sides of the partition have no chance to learn
about each other.) On the other hand, if $n$ is smaller than 20, it
would suffice to require each vertex to have, say, at least ten
neighbors.

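\medbreak
The connectivity claim can be explored empirically with a small
simulation. The sketch below assumes one particular random model
(every node links to a fixed number of peers chosen uniformly at
random, and links are symmetric), which only approximates the process
described above; all function names are hypothetical.

\begin{verbatim}
import random
from collections import deque

def random_overlay(n, degree, rng):
    """Each node links to `degree` random peers; links are symmetric."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        others = [u for u in range(n) if u != v]
        for u in rng.sample(others, degree):
            adj[v].add(u)
            adj[u].add(v)
    return adj

def is_connected(adj):
    """Breadth-first search from node 0 over the adjacency map."""
    seen, queue = {0}, deque([0])
    while queue:
        v = queue.popleft()
        for u in adj[v] - seen:
            seen.add(u)
            queue.append(u)
    return len(seen) == len(adj)

def connected_fraction(n, degree, trials, seed=0):
    """Fraction of sampled overlays that turn out to be connected."""
    rng = random.Random(seed)
    hits = sum(is_connected(random_overlay(n, degree, rng))
               for _ in range(trials))
    return hits / trials
\end{verbatim}

Calling, for instance, \verb|connected_fraction(20, 3, 1000)| gives an
empirical estimate for $n=20$ under this model; it does not replace
the analytic statement.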
\nxsubpoint\label{sp:ov.opt.low.lat} \embt(TON overlay networks are
optimized for lower latency.)  TON overlay networks optimize the
``random'' network graph generated by the previous method as
follows. Every node tries to retain at least three neighbors with the
minimal round-trip time, changing this list of ``fast neighbors'' very
rarely. At the same time, it also has at least three other ``slow
neighbors'' that are chosen completely randomly, so that the overlay
network graph would always contain a random subgraph. This is required
to maintain connectivity and prevent splitting of the overlay network
into several unconnected regional subnetworks. At least three
``intermediate neighbors'', which have intermediate round-trip times,
bounded by a certain constant (actually, a function of the round-trip
times of the fast and the slow neighbors), are also chosen and
retained.

In this way, the graph of an overlay network still maintains enough
randomness to be connected, but is optimized for lower latency and
higher throughput.

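\medbreak
A possible selection of the three neighbor classes is sketched below.
The text only states that the bound for intermediate neighbors is some
function of the round-trip times of the fast and the slow neighbors;
the midpoint rule used here, like the function name and its
parameters, is an assumption made purely for illustration.

\begin{verbatim}
import random

def choose_neighbors(rtt_ms, k=3, rng=random):
    """Pick k fast, up to k intermediate and k random 'slow' peers.

    rtt_ms maps an abstract address to a measured round-trip time.
    """
    by_rtt = sorted(rtt_ms, key=rtt_ms.get)
    fast = by_rtt[:k]                      # minimal round-trip times
    rest = by_rtt[k:]
    slow = rng.sample(rest, min(k, len(rest)))  # uniformly random peers
    # Hypothetical bound: midpoint between the slowest fast peer and
    # the slowest of the randomly chosen peers.
    if fast and slow:
        bound = (rtt_ms[fast[-1]] + max(rtt_ms[a] for a in slow)) / 2
    else:
        bound = float("inf")
    middle = [a for a in rest
              if a not in slow and rtt_ms[a] <= bound][:k]
    return fast, middle, slow
\end{verbatim}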
\nxsubpoint \embt(Gossip protocols in an overlay network.)  An overlay
network is often used to run one of the so-called {\em gossip
  protocols}, which achieve some global goal while letting every node
interact only with its neighbors. For example, there are gossip
protocols to construct an approximate list of all members of a (not
too large) overlay network, or to compute an estimate of the number of
members of an (arbitrarily large) overlay network, using only a
bounded amount of memory at each node (cf.~\cite[4.4.3]{DistrSys} or
\cite{Birman} for details).

\nxsubpoint \embt(Overlay network as a broadcast domain.)  The most
important gossip protocol running in an overlay network is the {\em
  broadcast protocol}, intended to propagate broadcast messages
generated by any node of the network, or perhaps by one of the
designated sender nodes, to all other nodes.

There are in fact several broadcast protocols, optimized for different
use cases. The simplest of them receives new broadcast messages and
relays them to all neighbors that have not yet sent a copy of that
message themselves.

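\medbreak
The simplest broadcast protocol can be modeled in a few lines. The
following is a toy in-memory sketch with hypothetical names, in which
a direct method call stands in for sending a message over a link; it
is not an actual network implementation.

\begin{verbatim}
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.seen = {}  # message id -> neighbors that sent us a copy

    def receive(self, msg_id, sender=None):
        first_time = msg_id not in self.seen
        senders = self.seen.setdefault(msg_id, set())
        if sender is not None:
            senders.add(sender)
        if not first_time:
            return  # duplicate: every message is relayed only once
        for peer in self.neighbors:
            if peer not in senders:  # skip peers that already have it
                peer.receive(msg_id, sender=self)

def link(a, b):
    """Create a symmetric link between two nodes."""
    a.neighbors.append(b)
    b.neighbors.append(a)
\end{verbatim}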
\nxsubpoint \embt(More sophisticated broadcast protocols.)  Some
applications may warrant more sophisticated broadcast protocols. For
instance, for broadcasting messages of substantial size, it makes
sense to send to the neighbors not the newly-received message itself,
but its hash (or a collection of hashes of new messages). The neighbor
may request the message itself after learning a previously unseen
message hash, to be transferred, say, using the reliable large
datagram protocol (RLDP) discussed in~\ptref{sp:RLDP}. In this way,
the new message will be downloaded from one neighbor only.

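\medbreak
The hash-announcement idea can be sketched as follows. The class and
method names are hypothetical, SHA-256 is assumed as the hash
function, and a direct dictionary lookup stands in for the RLDP
transfer of the message body.

\begin{verbatim}
import hashlib

class Peer:
    def __init__(self):
        self.store = {}     # hash -> message body
        self.downloads = 0  # number of bodies actually fetched

    def publish(self, body: bytes) -> bytes:
        digest = hashlib.sha256(body).digest()
        self.store[digest] = body
        return digest

    def on_announce(self, digest: bytes, announcer: "Peer") -> None:
        """Fetch the body only if this hash has not been seen before."""
        if digest in self.store:
            return
        body = announcer.store[digest]  # stands in for an RLDP transfer
        if hashlib.sha256(body).digest() != digest:
            raise ValueError("body does not match the announced hash")
        self.store[digest] = body
        self.downloads += 1
\end{verbatim}

Announcements of an already known hash cost only the size of the hash,
which is what keeps the body from being downloaded more than once.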
\nxsubpoint \embt(Checking the connectivity of an overlay network.)
The connectivity of an overlay network can be checked if there is a
known node (e.g., the ``owner'' or the ``creator'' of the overlay
network) that must be in this overlay network. Then the node in
question simply broadcasts from time to time short messages containing
the current time, a sequence number and its signature. Any other node
can be sure that it is still connected to the overlay network if it
has received such a broadcast not too long ago. This protocol can be
extended to the case of several well-known nodes; for example, they
will all send such broadcasts, and all other nodes will expect to
receive broadcasts from more than half of the well-known nodes.

In the case of an overlay network used for propagating new blocks (or
just new block headers) of a specific shardchain, a good way for a
node to check connectivity is to keep track of the most recent block
received so far. Because a block is normally generated every five
seconds, if no new block is received for more than, say, thirty
seconds, the node probably has been disconnected from the overlay
network.

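\medbreak
The block-based check can be sketched as a small monitor object. The
thirty-second timeout is the illustrative value from the text; the
class name and its interface are hypothetical, and timestamps are
passed in explicitly so that the logic does not depend on a real
clock.

\begin{verbatim}
class ConnectivityMonitor:
    """Report disconnection if no fresh block arrived recently."""

    def __init__(self, timeout_s=30.0):
        self.timeout_s = timeout_s
        self.last_seqno = -1
        self.last_time = None

    def on_block(self, seqno, now):
        if seqno > self.last_seqno:  # ignore stale or repeated blocks
            self.last_seqno = seqno
            self.last_time = now

    def is_connected(self, now):
        return (self.last_time is not None
                and now - self.last_time <= self.timeout_s)
\end{verbatim}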
\nxsubpoint\label{sp:streaming.multicast} \embt(Streaming broadcast
protocol.)  Finally, there is a {\em streaming broadcast protocol\/}
for TON overlay networks, used, for example, to propagate block
candidates among validators of some shardchain (``shardchain task
group''), who, of course, create a private overlay network for that
purpose. The same protocol can be used to propagate new shardchain
blocks to all full nodes for that shardchain.

This protocol has already been outlined
in~\ptref{sp:sh.blk.cand.prop}: the new (large) broadcast message is
split into, say, $N$ one-kilobyte chunks; the sequence of these chunks
is augmented to $M\geq N$ chunks by means of an erasure code such as
a Reed--Solomon or a fountain code (e.g., the RaptorQ code
\cite{RaptorQ,Raptor}), and these $M$ chunks are streamed to
all neighbors in ascending chunk number order. The participating nodes
collect these chunks until they can recover the original large message
(one would have to successfully receive at least $N$ of the chunks for
this), and then instruct their neighbors to stop sending new chunks of
the stream, because now these nodes can generate the subsequent chunks
on their own, having a copy of the original message. Such nodes
continue to generate the subsequent chunks of the stream and send them
to their neighbors, unless the neighbors in turn indicate that this is
no longer necessary.

In this way, a node does not need to download a large message in its
entirety before propagating it further. This minimizes broadcast
latency, especially when combined with the optimizations described
in~\ptref{sp:ov.opt.low.lat}.

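\medbreak
The role of the erasure code can be illustrated by the simplest
possible instance: a single XOR parity chunk, so that $M=N+1$ and any
$N$ of the $M$ chunks suffice to recover the message. This toy code is
only a stand-in chosen for brevity; it is neither a Reed--Solomon nor
a RaptorQ code, and all names below are hypothetical.

\begin{verbatim}
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(message: bytes, chunk_size: int = 1024):
    """Split a non-empty message into N chunks plus one parity chunk."""
    padded = message + b"\0" * (-len(message) % chunk_size)
    chunks = [padded[i:i + chunk_size]
              for i in range(0, len(padded), chunk_size)]
    return chunks + [reduce(xor, chunks)]

def decode(received: dict, n: int, length: int) -> bytes:
    """Rebuild the message from any n of the n + 1 chunks.

    `received` maps a chunk index (0..n, where n is the parity chunk)
    to the chunk contents.
    """
    missing = [i for i in range(n) if i not in received]
    if len(missing) > 1 or (missing and n not in received):
        raise ValueError("not enough chunks")
    if missing:
        received = dict(received)
        # XOR of all other chunks, parity included, yields the lost one.
        received[missing[0]] = reduce(xor, received.values())
    return b"".join(received[i] for i in range(n))[:length]
\end{verbatim}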
\nxsubpoint \embt(Constructing new overlay networks based on existing
ones.)  Sometimes one does not want to construct an overlay network
from scratch. Instead, one or several previously existing overlay
networks are known, and the membership of the new overlay network is
expected to overlap significantly with the combined membership of
these overlay networks.

An important example arises when a TON shardchain is split in two, or
two sibling shardchains are merged into one
(cf.~\ptref{sect:split.merge}). In the first case, the overlay
networks used for propagating new blocks to full nodes must be
constructed for each of the new shardchains; however, each of these
new overlay networks can be expected to be contained in the block
propagation network of the original shardchain (and comprise
approximately half its members). In the second case, the overlay
network for propagating new blocks of the merged shardchain will
consist approximately of the union of members of the two overlay
networks related to the two sibling shardchains being merged.

In such cases, the description of the new overlay network may contain
an explicit or implicit reference to a list of related existing
overlay networks. A node wishing to join the new overlay network may
check whether it is already a member of one of these existing
networks, and query its neighbors in these networks whether they are
interested in the new network as well. In case of a positive answer,
new point-to-point channels can be established to such neighbors, and
they can be included in the neighbor list for the new overlay network.

This mechanism does not totally supplant the general mechanism
described in~\ptref{sp:loc.1.mem} and \ptref{sp:loc.many.mem}; rather,
both are run in parallel and are used to populate the neighbor
list. This is needed to prevent inadvertent splitting of the new
overlay network into several unconnected subnetworks.

\nxsubpoint\label{sp:net.within.net} \embt(Overlay networks within
overlay networks.)  Another interesting case arises in the
implementation of {\em TON Payments} (a ``lightning network'' for
instant off-chain value transfers; cf.~\ptref{sect:lightning}). In
this case, first an overlay network containing all transit nodes of
the ``lightning network'' is constructed. However, some of these nodes
have established payment channels in the blockchain; they must always
be neighbors in this overlay network, in addition to any ``random''
neighbors selected by the general overlay network algorithms described
in~\ptref{sp:loc.1.mem}, \ptref{sp:loc.many.mem}
and~\ptref{sp:rand.mem}. These ``permanent links'' to the neighbors
with established payment channels are used to run specific lightning
network protocols, thus effectively creating an overlay subnetwork
(not necessarily connected, if things go awry) inside the encompassing
(almost always connected) overlay network.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%                  SERVICES
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\clearpage
\mysection{TON Services and Applications}\label{sect:services}

We have discussed the TON Blockchain and TON Networking technologies
at some length. Now we explain some ways in which they can be combined
to create a wide range of services and applications, and discuss some
of the services that will be provided by the TON Project itself,
either from the very beginning or at a later time.

\mysubsection{TON Service Implementation Strategies}%
\label{sect:ton.service.impl}

We start with a discussion of how different blockchain and
network-related applications and services may be implemented inside
the TON ecosystem. First of all, a simple classification is in order:

\nxsubpoint \embt(Applications and services.)  We will use the words
``application'' and ``service'' interchangeably. However, there is a
subtle and somewhat vague distinction: an {\em application\/} usually
provides some services directly to human users, while a {\em
  service\/} is usually exploited by other applications and
services. For example, TON Storage is a service, because it is
designed to keep files on behalf of other applications and services,
even though a human user might use it directly as well. A hypothetical
``Facebook in a blockchain'' (cf.~\ptref{sp:blockchain.facebook}) or
Telegram messenger, if made available through the TON Network (i.e.,
implemented as a ``ton-service''; cf.~\ptref{sp:telegram.ton.serv}),
would rather be an {\em application}, even though some ``bots'' might
access it automatically without human intervention.

\nxsubpoint\label{sp:on.off.chain} \embt(Location of the application:
on-chain, off-chain or mixed.)  A service or an application designed
for the TON ecosystem needs to keep its data and process that data
somewhere. This leads to the following classification of applications
(and services):
\begin{itemize}
\item {\em On-chain\/} applications (cf.~\ptref{sp:pure.blockchain}):
  All data and processing are in the TON Blockchain.
\item {\em Off-chain\/} applications (cf.~\ptref{sp:pure.net.serv}):
  All data and processing are outside the TON Blockchain, on servers
  available through the TON Network.
\item {\em Mixed\/} applications (cf.~\ptref{sp:mixed.serv}): Some,
  but not all, data and processing are in the TON Blockchain; the rest
  are on off-chain servers available through the TON Network.
\end{itemize}

\nxsubpoint \embt(Centralization: centralized and decentralized, or
distributed, applications.)  Another classification criterion is
whether the application (or service) relies on a centralized server
cluster, or is really ``distributed'' (cf.~\ptref{sp:fog}). All
on-chain applications are automatically decentralized and
distributed. Off-chain and mixed applications may exhibit different
degrees of centralization.

\medbreak
Now let us consider the above possibilities in more detail.

\nxsubpoint\label{sp:pure.blockchain} \embt(Pure ``on-chain''
applications: distributed applications, or ``dapps'', residing in the
blockchain.)  One of the possible approaches, mentioned
in~\ptref{sp:on.off.chain}, is to deploy a ``distributed application''
(commonly abbreviated as ``dapp'') completely in the TON Blockchain,
as one smart contract or a collection of smart contracts. All data
will be kept as part of the permanent state of these smart contracts,
and all interaction with the project will be done by means of (TON
Blockchain) messages sent to or received from these smart contracts.

We have already discussed in~\ptref{sp:blockchain.facebook} that this
approach has its drawbacks and limitations. It has its advantages,
too: such a distributed application needs no servers to run on or to
store its data (it runs ``in the blockchain''---i.e., on the
validators' hardware), and enjoys the blockchain's extremely high
(Byzantine) reliability and accessibility. The developer of such a
distributed application does not need to buy or rent any hardware; all
she needs to do is develop some software (i.e., the code for the smart
contracts). After that, she will effectively rent the computing power
from the validators, and will pay for it in Grams, either herself or
by putting this burden on the shoulders of her users.

\nxsubpoint\label{sp:pure.net.serv} \embt(Pure network services:
``ton-sites'' and ``ton-services''.)  Another extreme option is to
deploy the service on some servers and make it available to the users
through the ADNL protocol described in~\ptref{sect:ANL}, and maybe
some higher level protocol such as the RLDP discussed
in~\ptref{sp:RLDP}, which can be used to send RPC queries to the
service in any custom format and obtain answers to these queries. In
this way, the service will be totally off-chain, and will reside in
the TON Network, almost without using the TON Blockchain.

The TON Blockchain might be used only to locate the abstract address
or addresses of the service, as outlined in~\ptref{sp:loc.serv},
perhaps with the aid of a service such as the TON DNS
(cf.~\ptref{sp:ton.dns}) to facilitate translation of domain-like
human-readable strings into abstract addresses.

To the extent the ADNL network (i.e., the TON Network) is similar to
the Invisible Internet Project ($I^2P$), such (almost) purely network
services are analogous to the so-called ``eep-services'' (i.e.,
services that have an $I^2P$-address as their entry point, and are
available to clients through the $I^2P$ network). We will say that
such purely network services residing in the TON Network are
``ton-services''.

An ``eep-service'' may implement HTTP as its client-server protocol;
in the TON Network context, a ``ton-service'' might simply use RLDP
(cf.~\ptref{sp:RLDP}) datagrams to transfer HTTP queries and the
responses to them. If it uses the TON DNS to allow its abstract
address to be
looked up by a human-readable domain name, the analogy to a web site
becomes almost perfect. One might even write a specialized browser, or
a special proxy (``ton-proxy'') that is run locally on a user's
machine, accepts arbitrary HTTP queries from an ordinary web browser
the user employs (once the local IP address and the TCP port of the
proxy are entered into the browser's configuration), and forwards
these queries through the TON Network to the abstract address of the
service. Then the user would have a browsing experience similar to
that of the World Wide Web (WWW).

In the $I^2P$ ecosystem, such ``eep-services'' are called
``eep-sites''. One can easily create ``ton-sites'' in the TON
ecosystem as well. This is facilitated somewhat by the existence of
services such as the TON DNS, which exploit the TON Blockchain and the
TON DHT to translate (TON) domain names into abstract addresses.

\nxsubpoint\label{sp:telegram.ton.serv} \embt(Telegram Messenger as a
ton-service; MTProto over RLDP.)  We would like to mention in passing
that the MTProto
protocol,\footnote{\url{https://core.telegram.org/mtproto}} used by
Telegram Messenger\footnote{\url{https://telegram.org/}} for
client-server interaction, can be easily embedded into the RLDP
protocol discussed in~\ptref{sp:RLDP}, thus effectively transforming
Telegram into a ton-service. Because the TON Proxy technology can be
switched on transparently for the end user of a ton-site or a
ton-service, being implemented on a lower level than the RLDP and ADNL
protocols (cf.~\ptref{sp:tunnels}), this would make Telegram
effectively unblockable. Of course, other messaging and social
networking services might benefit from this technology as well.

\nxsubpoint\label{sp:mixed.serv} \embt(Mixed services: partly
off-chain, partly on-chain.)  Some services might use a mixed
approach: do most of the processing off-chain, but also have some
on-chain part (for example, to register their obligations towards
their users, and vice versa). In this way, part of the state would
still be kept in the TON Blockchain (i.e., an immutable public
ledger), and any misbehavior of the service or of its users could be
punished by smart contracts.

\nxsubpoint\label{sp:ex.ton.storage} \embt(Example: keeping files
off-chain; TON Storage.)  An example of such a service is given by
{\em TON Storage}. In its simplest form, it allows users to store
files off-chain, by keeping on-chain only a hash of the file to be
stored, and possibly a smart contract where some other parties agree
to keep the file in question for a given period of time for a
pre-negotiated fee. In fact, the file may be subdivided into chunks of
some small size (e.g., 1 kilobyte) and augmented by an erasure code
such as a Reed--Solomon or a fountain code; a Merkle tree hash may
then be constructed for the augmented sequence of chunks, and this
Merkle tree
hash might be published in the smart contract instead of or along with
the usual hash of the file. This is somewhat reminiscent of the way
files are stored in a torrent.

An even simpler form of storing files is completely off-chain: one
might essentially create a ``torrent'' for a new file, and use TON DHT
as a ``distributed torrent tracker'' for this torrent
(cf.~\ptref{sp:distr.torr.tr}). This might actually work pretty well
for popular files. However, one does not get any availability
guarantees. For example, a hypothetical ``blockchain Facebook''
(cf.~\ptref{sp:blockchain.facebook}), which would opt to keep the
profile photographs of its users completely off-chain in such
``torrents'', might risk losing photographs of ordinary (not
especially popular) users, or at least risk being unable to present
these photographs for prolonged periods. The TON Storage technology,
which is mostly off-chain, but uses an on-chain smart contract to
enforce availability of the stored files, might be a better match for
this task.

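\medbreak
A Merkle tree hash over a sequence of chunks can be computed as in the
following sketch. It assumes SHA-256 and a common convention of
duplicating the last hash on levels of odd length; the actual tree
layout used by TON Storage is not specified in this text, so both
choices, like the function name, are assumptions.

\begin{verbatim}
import hashlib

def merkle_root(chunks):
    """Merkle tree hash over a non-empty list of byte chunks."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last hash on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
\end{verbatim}

With such a root published on-chain, a storage node can later prove
possession of any single chunk by presenting that chunk together with
the sibling hashes on its path to the root.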
\nxsubpoint\label{sp:fog} \embt(Decentralized mixed services, or ``fog
services''.)  So far, we have discussed {\em centralized\/} mixed
services and applications. While their on-chain component is processed
in a decentralized and distributed fashion, being located in the
blockchain, their off-chain component relies on some servers
controlled by the service provider in the usual centralized
fashion. Instead of using some dedicated servers, computing power
might be rented from a cloud computing service offered by one of the
large companies. However, this would not lead to decentralization of
the off-chain component of the service.

A decentralized approach to implementing the off-chain component of a
service consists in creating a {\em market}, where anybody possessing
the required hardware and willing to rent their computing power or
disk space would offer their services to those needing them.

For example, there might exist a registry (which might also be called
a ``market'' or an ``exchange'') where all nodes interested in keeping
files of other users publish their contact information, along with
their available storage capacity, availability policy, and
prices. Those needing these services might look them up there, and, if
the other party agrees, create smart contracts in the blockchain and
upload files for off-chain storage. In this way a service like {\em
  TON Storage\/} becomes truly decentralized, because it does not need
to rely on any centralized cluster of servers for storing files.

\nxsubpoint \embt(Example: ``fog computing'' platforms as
decentralized mixed services.)  Another example of such a
decentralized mixed application arises when one wants to perform some
specific computations (e.g., 3D rendering or training neural
networks), often requiring specific and expensive hardware. Then those
having such equipment might offer their services through a similar
``exchange'', and those needing such services would rent them, with
the obligations of the sides registered by means of smart
contracts. This is similar to what ``fog computing'' platforms, such
as Golem (\url{https://golem.network/}) or SONM
(\url{https://sonm.io/}), promise to deliver.

\nxsubpoint\label{sp:ex.ton.proxy} \embt(Example: TON Proxy is a fog
service.)  {\em TON Proxy\/} provides yet another example of a fog
service, where nodes wishing to offer their services (with or without
compensation) as tunnels for ADNL network traffic might register,
and those needing them might choose one of these nodes depending on
the price, latency and bandwidth offered. Afterwards, one might use
payment channels provided by {\em TON Payments\/} for processing
micropayments for the services of those proxies, with payments
collected, for instance, for every 128~KiB transferred.

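\medbreak
The selection and billing logic can be sketched as follows. The offer
fields, the selection rule (cheapest offer within a latency budget)
and the function names are hypothetical; only the 128~KiB billing unit
is taken from the example in the text.

\begin{verbatim}
UNIT = 128 * 1024  # bytes covered by one micropayment

def micropayments_due(bytes_transferred: int) -> int:
    """Number of completed 128 KiB units that should be paid for."""
    return bytes_transferred // UNIT

def pick_proxy(offers, max_latency_ms):
    """Cheapest offer within the latency budget.

    Ties on price are broken in favor of higher bandwidth; None is
    returned if no offer meets the latency budget.
    """
    ok = [o for o in offers if o["latency_ms"] <= max_latency_ms]
    return min(ok, key=lambda o: (o["price"], -o["bandwidth"]),
               default=None)
\end{verbatim}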
\nxsubpoint \embt(Example: TON Payments is a fog service.)  The TON
Payments platform (cf.~\ptref{sect:payments}) is also an example of
such a decentralized mixed application.

\mysubsection{Connecting Users and Service
  Providers}\label{sect:reg.markt}

We have seen in~\ptref{sp:fog} that ``fog services'' (i.e., mixed
decentralized services) will usually need some {\em markets}, {\em
  exchanges\/} or {\em registries}, where those needing specific
services might meet those providing them.

Such markets are likely to be implemented as on-chain, off-chain or
mixed services themselves, centralized or distributed.

\nxsubpoint \embt(Example: connecting to TON Payments.)  For example,
if one wants to use TON Payments (cf.~\ptref{sect:payments}), the
first step would be to find at least some existing transit nodes of
the ``lightning network'' (cf.~\ptref{sect:lightning}), and establish
payment channels with them, if they are willing. Some nodes can be
found with the aid of the ``encompassing'' overlay network, which is
supposed to contain all transit lightning network nodes
(cf.~\ptref{sp:net.within.net}). However, it is not clear whether
these nodes will be willing to create new payment channels. Therefore,
a registry is needed where nodes ready to create new links can publish
their contact information (e.g., their abstract addresses).

\nxsubpoint \embt(Example: uploading a file into TON Storage.)
Similarly, if one wants to upload a file into TON Storage, one
must locate some nodes willing to sign a smart contract binding them
to keep a copy of that file (or of any file below a certain size
limit, for that matter). Therefore, a registry of nodes offering their
services for storing files is needed.

\nxsubpoint \embt(On-chain, mixed and off-chain registries.)  Such a
registry of service providers might be implemented completely
on-chain, with the aid of a smart contract which would keep the
registry in its permanent storage. However, this would be quite slow
and expensive. A mixed approach is more efficient, where the
relatively small and rarely changed on-chain registry is used only to
point out some nodes (by their abstract addresses, or by their public
keys, which can be used to locate actual abstract addresses as
described in~\ptref{sp:loc.serv}), which provide off-chain
(centralized) registry services.

Finally, a decentralized, purely off-chain approach might consist of a
public overlay network (cf.~\ptref{sect:overlay}), where those willing
to offer their services, or those looking to buy somebody's services,
simply broadcast their offers, signed by their private keys. If the
service to be provided is very simple, even broadcasting the offers
might not be necessary: the approximate membership of the overlay
network itself might be used as a ``registry'' of those willing to
provide a particular service. Then a client requiring this service
might locate (cf.~\ptref{sp:loc.many.mem}) and query some nodes of
this overlay network, and then query their neighbors, if the nodes
already known are not ready to satisfy its needs.

\nxsubpoint\label{sp:side.chain.reg} \embt(Registry or exchange in a
4864
side-chain.)  Another approach to implementing decentralized mixed
4865
registries consists in creating an independent specialized blockchain
4866
(``side-chain''), maintained by its own set of self-proclaimed
4867
validators, who publish their identities in an on-chain smart contract
4868
and provide network access to all interested parties to this
4869
specialized blockchain, collecting transaction candidates and
4870
broadcasting block updates through dedicated overlay networks
4871
(cf.~\ptref{sect:overlay}). Then any full node for this sidechain can
4872
maintain its own copy of the shared registry (essentially equal to the
4873
global state of this side-chain), and process arbitrary queries
4874
related to this registry.

\nxsubpoint \embt(Registry or exchange in a workchain.)  Another
option is to create a dedicated workchain inside the TON Blockchain,
specialized for creating registries, markets and exchanges. This might
be more efficient and less expensive than using smart contracts
residing in the basic workchain
(cf.~\ptref{sp:basic.workchain}). However, this would still be more
expensive than maintaining registries in side-chains
(cf.~\ptref{sp:side.chain.reg}).

\mysubsection{Accessing TON Services}

We have discussed in~\ptref{sect:ton.service.impl} the different
approaches one might employ for creating new services and applications
residing in the TON ecosystem. Now we discuss how these services might
be accessed, and some of the ``helper services'' that will be provided
by TON, including {\em TON DNS\/} and {\em TON Storage}.

\nxsubpoint\label{sp:ton.dns} \embt(TON DNS: a mostly on-chain
hierarchical domain name service.)  The {\em TON DNS\/} is a
predefined service, which uses a collection of smart contracts to keep
a map from human-readable domain names to (256-bit) addresses of ADNL
network nodes and TON Blockchain accounts and smart contracts.

While anybody might in principle implement such a service using the
TON Blockchain, it is useful to have such a predefined service with a
well-known interface, to be used by default whenever an application or
a service wants to translate human-readable identifiers into
addresses.

\nxsubpoint \embt(TON DNS use cases.)  For example, a user looking to
transfer some cryptocurrency to another user or to a merchant may
prefer to remember a TON DNS domain name of the account of that user
or merchant, instead of keeping their 256-bit account identifiers at
hand and copy-pasting them into the recipient field in their light
wallet client.

Similarly, TON DNS may be used to locate account identifiers of smart
contracts or entry points of ton-services and ton-sites
(cf.~\ptref{sp:pure.net.serv}), enabling a specialized client
(``ton-browser''), or a usual internet browser combined with a
specialized ton-proxy extension or stand-alone application, to deliver
a WWW-like browsing experience to the user.

\nxsubpoint \embt(TON DNS smart contracts.)  The TON DNS is
implemented by means of a tree of special (DNS) smart contracts. Each
DNS smart contract is responsible for registering subdomains of some
fixed domain. The ``root'' DNS smart contract, where level-one domains
of the TON DNS system will be kept, is located in the masterchain. Its
account identifier must be hardcoded into all software that wishes to
access the TON DNS database directly.

Any DNS smart contract contains a hashmap, mapping variable-length
null-terminated UTF-8 strings into their ``values''. This hashmap is
implemented as a binary Patricia tree, similar to that described
in~\ptref{sp:patricia} but supporting variable-length bitstrings as
keys.
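
As an illustration only (the label \texttt{ton} is a hypothetical
example, and we assume that the most significant bit of each byte
comes first), a null-terminated UTF-8 key is turned into a bitstring
by concatenating the bits of its bytes:
\[
\underbrace{\texttt{01110100}}_{\texttt{t}}\;
\underbrace{\texttt{01101111}}_{\texttt{o}}\;
\underbrace{\texttt{01101110}}_{\texttt{n}}\;
\underbrace{\texttt{00000000}}_{\text{terminator}}
\]
This is a key of $8\cdot(3+1)=32$ bits; longer labels yield longer
keys. The terminating zero byte also guarantees that no key is a proper
prefix of another, as long as the labels themselves contain no zero
bytes.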

\nxsubpoint \embt(Values of the DNS hashmap, or TON DNS records.)  As
to the values, they are ``TON DNS records'' described by a TL-scheme
(cf.~\ptref{sp:TL}). They consist of a ``magic number'', selecting one
of the options supported, and then either an account identifier, or a
smart-contract identifier, or an abstract network address
(cf.~\ptref{sect:ANL}), or a public key used to locate abstract
addresses of a service (cf.~\ptref{sp:loc.serv}), or a description of
an overlay network, and so on. An important case is that of another
DNS smart contract: in such a case, that smart contract is used to
resolve subdomains of its domain. In this way, one can create separate
registries for different domains, controlled by the owners of those
domains.

These records may also contain an expiration time, a caching time
(usually very large, because updating values in a blockchain too often
is expensive), and in most cases a reference to the owner of the
subdomain in question. The owner has the right to change this record
(in particular, the owner field, thus transferring the domain to
somebody else's control), and to prolong it.

\nxsubpoint \embt(Registering new subdomains of existing domains.)  In
order to register a new subdomain of an existing domain, one simply
sends to the smart contract acting as the registrar of that domain a
message containing the subdomain (i.e., the key) to be registered, the
value in one of several predefined formats, an identity of the owner,
an expiration date, and some amount of cryptocurrency as determined by
the domain's owner.

Subdomains are registered on a ``first-come, first-served'' basis.

\nxsubpoint\label{sp:dns.get} \embt(Retrieving data from a DNS smart
contract.)  In principle, any full node for the masterchain or
shardchain containing a DNS smart contract might be able to look up
any subdomain in the database of that smart contract, if the structure
and the location of the hashmap inside the persistent storage of the
smart contract are known.

However, this approach would work only for certain DNS smart
contracts. It would fail miserably if a non-standard DNS smart
contract were used.

Instead, an approach based on {\em general smart contract
  interfaces\/} and {\em get methods\/} (cf.~\ptref{sp:get.methods})
is used. Any DNS smart contract must define a ``get method'' with a
``known signature'', which is invoked to look up a key. Since this
approach makes sense for other smart contracts as well, especially
those providing on-chain and mixed services, we explain it in some
detail in~\ptref{sp:get.methods}.

\nxsubpoint \embt(Translating a TON DNS domain.)  Once any full node,
acting by itself or on behalf of some light client, can look up
entries in the database of any DNS smart contract, arbitrary TON DNS
domain names can be recursively translated, starting from the
well-known and fixed root DNS smart contract (account) identifier.

For example, if one wants to translate \texttt{A.B.C}, one looks up
keys \texttt{.C}, \texttt{.B.C}, and \texttt{A.B.C} in the root domain
database. If the first of them is not found, but the second is, and
its value is a reference to another DNS smart contract, then
\texttt{A} is looked up in the database of that smart contract and the
final value is retrieved.
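
To make this concrete, consider a hypothetical name
\texttt{wallet.alice.ton}; the name and the registry contents assumed
below are chosen purely for illustration. The look-up just described
might then proceed as follows:
\begin{quote}
root DNS contract, key \texttt{.ton}: not found;\\
root DNS contract, key \texttt{.alice.ton}: found, and its value
refers to another DNS smart contract~$D$;\\
DNS contract $D$, key \texttt{wallet}: found, and its value is the
record returned to the client.
\end{quote}
Each step is an ordinary key look-up in a single DNS smart contract,
so the same get method (cf.~\ptref{sp:dns.get}) can serve all of them.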

\nxsubpoint \embt(Translating TON DNS domains for light nodes.)  In
this way, a full node for the masterchain---and also for all
shardchains involved in the domain look-up process---might translate
any domain name into its current value without external help. A light
node might request a full node to do this on its behalf and return the
value, along with a Merkle proof
(cf.~\ptref{sp:merkle.query.resp}). This Merkle proof would enable the
light node to verify that the answer is correct, so such TON DNS
responses cannot be ``spoofed'' by a malicious interceptor, in
contrast to the usual DNS protocol.

Because no node can be expected to be a full node with respect to all
shardchains, actual TON DNS domain translation would involve a
combination of these two strategies.

\nxsubpoint \embt(Dedicated ``TON DNS servers''.)  One might provide a
simple ``TON DNS server'', which would receive RPC ``DNS'' queries
(e.g., via the ADNL or RLDP protocols described in~\ptref{sect:ANL}),
requesting that the server translate a given domain, process these
queries by forwarding some subqueries to other (full) nodes if
necessary, and return answers to the original queries, augmented by
Merkle proofs if required.

Such ``DNS servers'' might offer their services (for free or not) to
any other nodes and especially light clients, using one of the methods
described in~\ptref{sect:reg.markt}. Notice that these servers, if
considered part of the TON DNS service, would effectively transform it
from a distributed on-chain service into a distributed mixed service
(i.e., a ``fog service'').

This concludes our brief overview of the TON DNS service, a scalable
on-chain registry for human-readable domain names of TON Blockchain
and TON Network entities.

\nxsubpoint \embt(Accessing data kept in smart contracts.)  We have
already seen that it is sometimes necessary to access data stored in a
smart contract without changing its state.

If one knows the details of the smart-contract implementation, one can
extract all the needed information from the smart contract's
persistent storage, available to all full nodes of the shardchain the
smart contract resides in. However, this is quite an inelegant way of
doing things, depending very much on the smart-contract
implementation.

\nxsubpoint\label{sp:get.methods} \embt(``Get methods'' of smart
contracts.)  A better way would be to define some {\em get methods\/}
in the smart contract, that is, some types of inbound messages that do
not affect the state of the smart contract when delivered, but
generate one or more output messages containing the ``result'' of the
get method. In this way, one can obtain data from a smart contract,
knowing only that it implements a get method with a known signature
(i.e., a known format of the inbound message to be sent and outbound
messages to be received as a result).

This way is much more elegant and in line with object-oriented
programming (OOP). However, it has an obvious defect so far: one must
actually commit a transaction into the blockchain (sending the get
message to the smart contract), wait until it is committed and
processed by the validators, extract the answer from a new block, and
pay for gas (i.e., for executing the get method on the validators'
hardware). This is a waste of resources: get methods do not change the
state of the smart contract anyway, so they need not be executed in
the blockchain.

\nxsubpoint\label{sp:tent.exec.get} \embt(Tentative execution of get
methods of smart contracts.)  We have already remarked
(cf.~\ptref{sp:ext.msg}) that any full node can tentatively execute
any method of any smart contract (i.e., deliver any message to a smart
contract), starting from a given state of the smart contract, without
actually committing the corresponding transaction. The full node can
simply load the code of the smart contract under consideration into
the TON VM, initialize its persistent storage from the global state of
the shardchain (known to all full nodes of the shardchain), and
execute the smart-contract code with the inbound message as its input
parameter. The output messages created will yield the result of this
computation.

In this way, any full node can evaluate arbitrary get methods of
arbitrary smart contracts, provided their signature (i.e., the format
of inbound and outbound messages) is known. The node may keep track of
the cells of the shardchain state accessed during this evaluation, and
create a Merkle proof of the validity of the computation performed,
for the benefit of a light node that might have asked the full node to
do so (cf.~\ptref{sp:merkle.query.resp}).

\nxsubpoint \embt(Smart-contract interfaces in TL-schemes.)  Recall
that the methods implemented by a smart contract (i.e., the input
messages accepted by it) are essentially some TL-serialized objects,
which can be described by a TL-scheme (cf.~\ptref{sp:TL}). The
resulting output messages can be described by the same TL-scheme as
well. In this way, the interface provided by a smart contract to other
accounts and smart contracts may be formalized by means of a
TL-scheme.

In particular, (a subset of) get methods supported by a smart
contract can be described by such a formalized smart-contract
interface.

\nxsubpoint\label{sp:pub.int.smartc} \embt(Public interfaces of a
smart contract.)  Notice that a formalized smart-contract interface,
either in the form of a TL-scheme (represented as a TL source file;
cf.~\ptref{sp:TL}) or in serialized form,\footnote{TL-schemes can be
  TL-serialized themselves;
  cf.\ \url{https://core.telegram.org/mtproto/TL-tl}.} can be
published---for example, in a special field in the smart-contract
account description, stored in the blockchain, or separately, if this
interface will be referred to many times. In the latter case a hash of
the supported public interface may be incorporated into the
smart-contract description instead of the interface description
itself.

An example of such a public interface is that of a DNS smart contract,
which is supposed to implement at least one standard get method for
looking up subdomains (cf.~\ptref{sp:dns.get}). A standard method for
registering new subdomains can also be included in the standard public
interface of DNS smart contracts.

\nxsubpoint\label{sp:ui.smartc} \embt(User interface of a smart
contract.)  The existence of a public interface for a smart contract
has other benefits, too. For example, a wallet client application may
download such an interface while examining a smart contract at the
request of a user, and display a list of public methods (i.e., of
available actions) supported by the smart contract, perhaps with some
human-readable comments if any are provided in the formal
interface. After the user selects one of these methods, a form may be
automatically generated according to the TL-scheme, where the user
will be prompted for all fields required by the chosen method and for
the desired amount of cryptocurrency (e.g., Grams) to be attached to
this request. Submitting this form will create a new blockchain
transaction containing the message just composed, sent from the user's
blockchain account.

In this way, the user will be able to interact with arbitrary smart
contracts from the wallet client application in a user-friendly way by
filling in and submitting certain forms, provided these smart contracts
have published their interfaces.

\nxsubpoint\label{sp:ui.ton.serv} \embt(User interface of a
``ton-service''.)  It turns out that ``ton-services'' (i.e., services
residing in the TON Network and accepting queries through the ADNL and
RLDP protocols of~\ptref{sect:network}; cf.~\ptref{sp:pure.net.serv})
may also profit from having public interfaces, described by TL-schemes
(cf.~\ptref{sp:TL}). A client application, such as a light wallet or a
``ton-browser'', might prompt the user to select one of the methods
and to fill in a form with parameters defined by the interface,
similarly to what has just been discussed in~\ptref{sp:ui.smartc}. The
only difference is that the resulting TL-serialized message is not
submitted as a transaction in the blockchain; instead, it is sent as
an RPC query to the abstract address of the ``ton-service'' in
question, and the response to this query is parsed and displayed
according to the formal interface (i.e., a TL-scheme).

\nxsubpoint\label{sp:ui.ton.dns} \embt(Locating user interfaces via
TON DNS.)  The TON DNS record containing an abstract address of a
ton-service or a smart-contract account identifier might also contain
an optional field describing the public (user) interface of that
entity, or several supported interfaces. Then the client application
(be it a wallet, a ton-browser or a ton-proxy) will be able to
download the interface and interact with the entity in question (be it
a smart contract or a ton-service) in a uniform way.

\nxsubpoint \embt(Blurring the distinction between on-chain and off-chain
services.)  In this way, the distinction between on-chain, off-chain
and mixed services (cf.~\ptref{sp:on.off.chain}) is blurred for the
end user: she simply enters the domain name of the desired service
into the address line of her ton-browser or wallet, and the rest is
handled seamlessly by the client application.

\nxsubpoint\label{sp:telegram.integr} \embt(A light wallet and TON
entity explorer can be built into Telegram Messenger clients.)  An
interesting opportunity arises at this point. A light wallet and TON
entity explorer, implementing the above functionality, can be embedded
into the Telegram Messenger smartphone client application, thus
bringing the technology to more than 200 million people. Users would
be able to send hyperlinks to TON entities and resources by including
TON URIs (cf.~\ptref{sp:ton.uri}) in messages; such hyperlinks, if
selected, will be opened internally by the Telegram client application
of the receiving party, and interaction with the chosen entity will
begin.

\nxsubpoint \embt(``ton-sites'' as ton-services supporting an HTTP
interface.)  A {\em ton-site\/} is simply a ton-service that supports
an HTTP interface, perhaps along with some other interfaces. This
support may be announced in the corresponding TON DNS record.

\nxsubpoint \embt(Hyperlinks.)  Notice that the HTML pages returned by
ton-sites may contain {\em ton-hyperlinks}---that is, references to
other ton-sites, smart contracts and accounts by means of specially
crafted URI schemes (cf.~\ptref{sp:ton.uri})---containing either
abstract network addresses, account identifiers, or human-readable TON
DNS domains. Then a ``ton-browser'' might follow such a hyperlink when
the user selects it, detect the interface to be used, and display a
user interface form as outlined in~\ptref{sp:ui.smartc}
and~\ptref{sp:ui.ton.serv}.

\nxsubpoint\label{sp:ton.uri} \embt(Hyperlink URLs may specify some
parameters.)  The hyperlink URLs may contain not only a (TON) DNS
domain or an abstract address of the service in question, but also the
name of the method to be invoked and some or all of its parameters. A
possible URI scheme for this might look as follows:
\begin{quote}
\texttt{ton://}\textit{<domain>}\texttt{/}\textit{<method>}\texttt{?}%
\textit{<field1>}\texttt{=}\textit{<value1>}\texttt{\&}%
\textit{<field2>}\texttt{=}\dots
\end{quote}
When the user selects such a link in a ton-browser, either the action
is performed immediately (especially if it is a get method of a smart
contract, invoked anonymously), or a partially filled form is
displayed, to be explicitly confirmed and submitted by the user (this
may be required for payment forms).
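
For instance, a link invoking a hypothetical method \texttt{buy} of a
hypothetical service \texttt{shop.example}, with two illustrative
parameters, might read
\begin{quote}
\texttt{ton://shop.example/buy?item=42\&quantity=3}
\end{quote}
Here the domain, the method name and both fields are assumptions made
for the sake of the example; in practice they would be taken from the
public interface published by the service in question.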

\nxsubpoint \embt(POST actions.)  A ton-site may embed into the HTML
pages it returns some usual-looking POST forms, with POST actions
referring either to ton-sites, ton-services or smart contracts by
means of suitable (TON) URLs. In that case, once the user fills in and
submits that custom form, the corresponding action is taken, either
immediately or after an explicit confirmation.

\nxsubpoint\label{sp:ton.www} \embt(TON WWW.)  All of the above will
lead to the creation of a whole web of cross-referencing entities,
residing in the TON Network, which would be accessible to the end user
through a ton-browser, providing the user with a WWW-like browsing
experience. For end users, this will finally make blockchain
applications fundamentally similar to the web sites they are already
accustomed to.

\nxsubpoint \embt(Advantages of TON WWW.)  This ``TON WWW'' of
on-chain and off-chain services has some advantages over its
conventional counterpart. For example, payments are inherently
integrated in the system. User identity can always be presented to the
services (by means of automatically generated signatures on the
transactions and RPC requests generated), or hidden at will. Services
would not need to check and re-check user credentials; these
credentials can be published in the blockchain once and for all. User
network anonymity can be easily preserved by means of TON Proxy, and
all services will be effectively unblockable. Micropayments are also
possible and easy, because ton-browsers can be integrated with the TON
Payments system.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%                  PAYMENTS
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\clearpage
\mysection{TON Payments}\label{sect:payments}

The last component of the TON Project we will briefly discuss in this
text is {\em TON Payments}, the platform for (micro)payment channels
and ``lightning network'' value transfers. It would enable ``instant''
payments, without the need to commit all transactions into the
blockchain, pay the associated transaction fees (e.g., for the gas
consumed), and wait five seconds until the block containing the
transactions in question is confirmed.

The overall overhead of such instant payments is so small that one can
use them for micropayments. For example, a TON file-storing service
might charge the user for every 128 KiB of downloaded data, or a paid
TON Proxy might require some tiny micropayment for every 128 KiB of
traffic relayed.
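
To get a feeling for the granularity involved, consider a download of
1~GiB billed in units of 128~KiB (the file size is an arbitrary
illustrative choice):
\[
\frac{1~\text{GiB}}{128~\text{KiB}}=\frac{2^{30}}{2^{17}}=2^{13}=8192 .
\]
Such a download would thus involve 8192 separate micropayments, which
suggests why committing each of them into the blockchain as an
individual transaction would be impractical.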

While {\em TON Payments\/} is likely to be released later than the
core components of the TON Project, some considerations need to be
made at the very beginning. For example, the TON Virtual Machine (TON
VM; cf.~\ptref{sp:tonvm}), used to execute the code of TON Blockchain
smart contracts, must support some special operations with Merkle
proofs. If such support is not present in the original design, adding
it at a later stage might become problematic
(cf.~\ptref{sp:genome.change.never}). We will see, however, that the
TON VM comes with natural support for ``smart'' payment channels
(cf.~\ptref{sp:ton.smart.pc.supp}) out of the box.

\mysubsection{Payment Channels}

We start with a discussion of point-to-point payment channels, and how they can be implemented in the TON Blockchain.

\nxsubpoint \embt(The idea of a payment channel.)  Suppose two
parties, $A$ and $B$, know that they will need to make a lot of
payments to each other in the future. Instead of committing each
payment as a transaction in the blockchain, they create a shared
``money pool'' (or perhaps a small private bank with exactly two
accounts), and contribute some funds to it: $A$ contributes $a$
coins, and $B$ contributes $b$ coins. This is achieved by creating a
special smart contract in the blockchain, and sending the money to it.

Before creating the ``money pool'', the two sides agree to a certain
protocol. They will keep track of the {\em state\/} of the pool---that
is, of their balances in the shared pool. Originally, the state is
$(a,b)$, meaning that $a$ coins actually belong to~$A$, and $b$ coins
belong to~$B$. Then, if $A$ wants to pay $d$ coins to $B$, they can
simply agree that the new state is $(a',b')=(a-d,b+d)$. Afterwards,
if, say, $B$ wants to pay $d'$ coins to $A$, the state will become
$(a'',b'')=(a'+d',b'-d')$, and so on.
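
A small numerical illustration (all amounts are arbitrary): let $a=10$
and $b=5$. If $A$ first pays $d=3$ coins to $B$, and then $B$ pays
$d'=1$ coin back to $A$, the state evolves as
\[
(a,b)=(10,5)\;\longrightarrow\;(a',b')=(7,8)\;\longrightarrow\;(a'',b'')=(8,7).
\]
Every update preserves the total, $a+b=a'+b'=a''+b''=15$, which is
exactly the amount held by the smart contract.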

All this updating of balances inside the pool is done completely
off-chain. When the two parties decide to withdraw their due funds
from the pool, they do so according to the final state of the
pool. This is achieved by sending a special message to the smart
contract, containing the agreed-upon final state $(a^*,b^*)$ along
with the signatures of both~$A$ and $B$. Then the smart contract sends
$a^*$ coins to $A$, $b^*$ coins to $B$ and self-destructs.

This smart contract, along with the network protocol used by $A$ and
$B$ to update the state of the pool, is a simple {\em payment channel
  between $A$ and~$B$.} According to the classification described
in~\ptref{sp:on.off.chain}, it is a {\em mixed\/} service: part of its
state resides in the blockchain (the smart contract), but most of its
state updates are performed off-chain (by the network protocol). If
everything goes well, the two parties will be able to perform as many
payments to each other as they want (with the only restriction being
that the ``capacity'' of the channel is not overrun---i.e., their
balances in the payment channel both remain non-negative), committing
only two transactions into the blockchain: one to open (create) the
payment channel (smart contract), and another to close (destroy) it.

\nxsubpoint \embt(Trustless payment channels.)  The previous example
was somewhat unrealistic, because it assumed that both parties are
willing to cooperate and will never cheat to gain some
advantage. Imagine, for example, that $A$ chooses not to sign the
final balance $(a',b')$ with $a'<a$. This would put $B$ in a difficult
situation.

To protect against such scenarios, one usually tries to develop {\em
  trustless\/} payment channel protocols, which do not require the
parties to trust each other, and make provisions for punishing any
party who would attempt to cheat.

This is usually achieved with the aid of signatures. The payment
channel smart contract knows the public keys of $A$ and $B$, and it
can check their signatures if needed. The payment channel protocol
requires the parties to sign the intermediate states and send the
signatures to each other. Then, if one of the parties cheats---for
instance, pretends that some state of the payment channel never
existed---its misbehavior can be proved by showing its signature on
that state. The payment channel smart contract acts as an ``on-chain
arbiter'', able to process complaints of the two parties about each
other, and punish the guilty party by confiscating all of its money
and awarding it to the other party.

\nxsubpoint\label{sp:simple.sync.pc} \embt(Simple bidirectional
synchronous trustless payment channel.)  Consider the following, more
realistic example: Let the state of the payment channel be described
by a triple $(\delta_i,i,o_i)$, where $i$ is the sequence number of the
state (it is originally zero, and then it is increased by one when a
subsequent state appears), $\delta_i$ is the {\em channel imbalance\/}
(meaning that $A$ and $B$ own $a+\delta_i$ and $b-\delta_i$ coins,
respectively), and $o_i$ is the party allowed to generate the next
state (either $A$ or $B$). Each state must be signed both by $A$ and
$B$ before any further progress can be made.

Now, if $A$ wants to transfer $d$ coins to $B$ inside the payment
channel, and the current state is $S_i=(\delta_i,i,o_i)$ with $o_i=A$,
then it simply creates a new state $S_{i+1}=(\delta_i-d,i+1,o_{i+1})$,
signs it, and sends it to $B$ along with its signature. Then $B$
confirms it by signing and sending a copy of its signature to
$A$. After that, both parties have a copy of the new state with both
of their signatures, and a new transfer may occur.

If $A$ wants to transfer coins to $B$ in a state $S_i$ with $o_i=B$,
then it first asks $B$ to commit a subsequent state $S_{i+1}$ with the
same imbalance $\delta_{i+1}=\delta_i$, but with $o_{i+1}=A$. After
that, $A$ will be able to make its transfer.
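
For example, with $a=10$, $b=5$ and the initial state $S_0=(0,0,A)$,
one possible history is shown below; the amounts, and the choice of
the next owner $o_{i+1}$ in each step, are illustrative assumptions.
\[
\begin{array}{lll}
\text{state} & \text{how it arises} & \text{balances } (a+\delta_i,\; b-\delta_i)\\
S_0=(0,0,A) & \text{initial state} & (10,5)\\
S_1=(-3,1,B) & A\ \text{pays 3 to}\ B & (7,8)\\
S_2=(-2,2,B) & B\ \text{pays 1 to}\ A & (8,7)\\
S_3=(-2,3,A) & B\ \text{hands over the turn} & (8,7)\\
S_4=(-6,4,A) & A\ \text{pays 4 to}\ B & (4,11)
\end{array}
\]
Each of these states becomes effective only after both $A$ and $B$
have signed it.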

When the two parties agree to close the payment channel, they both put
their special {\em final\/} signatures on the state $S_k$ they believe
to be final, and invoke the {\em clean\/} or {\em two-sided
finalization method\/} of the payment channel smart contract by sending
it the final state along with both final signatures.

If the other party does not agree to provide its final signature, or
simply if it stops responding, it is possible to close the channel
unilaterally. For this, the party wishing to do so will invoke the
{\em unilateral finalization\/} method, sending to the smart contract
its version of the final state, its final signature, and the most
recent state having a signature of the other party. After that, the
smart contract does not immediately act on the final state
received. Instead, it waits for a certain period of time (e.g., one
day) for the other party to present its version of the final
state. When the other party submits its version and it turns out to be
compatible with the already submitted version, the ``true'' final
state is computed by the smart contract and used to distribute the
money accordingly. If the other party fails to present its version of
the final state to the smart contract, then the money is redistributed
according to the only copy of the final state presented.

If one of the two parties cheats---for example, by signing two
different states as final, or by signing two different next
states $S_{i+1}$ and $S'_{i+1}$, or by signing an invalid new state
$S_{i+1}$ (e.g., with imbalance $\delta_{i+1}<-a$ or $>b$)---then the
other party may submit proof of this misbehavior to a third method of
the smart contract. The guilty party is punished immediately by losing
its share in the payment channel completely.
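
The bounds on the imbalance used here follow directly from the
requirement that both balances remain non-negative:
\[
a+\delta_{i+1}\geq 0 \quad\text{and}\quad b-\delta_{i+1}\geq 0
\qquad\Longleftrightarrow\qquad
-a\leq\delta_{i+1}\leq b .
\]
For instance, with $a=10$ and $b=5$ (illustrative values), any signed
state with $\delta_{i+1}=-11$ or $\delta_{i+1}=6$ would be invalid.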

This simple payment channel protocol is {\em fair\/} in the sense that
any party can always get its due, with or without the cooperation of
the other party, and is likely to lose all of its funds committed to
the payment channel if it tries to cheat.

\nxsubpoint\label{sp:sync.pc.as.blockch} \embt(Synchronous payment
channel as a simple virtual blockchain with two validators.)  The
above example of a simple synchronous payment channel can be recast as
follows. Imagine that the sequence of states $S_0$, $S_1$, \dots,
$S_n$ is actually the sequence of blocks of a very simple
blockchain. Each block of this blockchain contains essentially only
the current state of the blockchain, and maybe a reference to the
previous block (i.e., its hash). Both parties $A$ and $B$ act as
validators for this blockchain, so every block must collect both of
their signatures. The state $S_i$ of the blockchain defines the
designated producer $o_i$ for the next block, so there is no race
between $A$ and $B$ for producing the next block. Producer $A$ is
allowed to create blocks that transfer funds from $A$ to $B$ (i.e.,
decrease the imbalance: $\delta_{i+1}\leq\delta_i$), and $B$ can only
transfer funds from $B$ to $A$ (i.e., increase $\delta$).
5411

If the two validators agree on the final block (and the final state)
of the blockchain, it is finalized by collecting special ``final''
signatures of the two parties, and submitting them along with the
final block to the channel smart contract for processing and
re-distributing the money accordingly.

If a validator signs an invalid block, or creates a fork, or signs two
different final blocks, it can be punished by presenting a proof of
its misbehavior to the smart contract, which acts as an ``on-chain
arbiter'' for the two validators; then the offending party will lose
all its money kept in the payment channel, which is analogous to a
validator losing its stake.

\nxsubpoint\label{sp:async.pc} \embt(Asynchronous payment channel as a
virtual blockchain with two workchains.)  The synchronous payment
channel discussed in \ptref{sp:simple.sync.pc} has a certain
disadvantage: one cannot begin the next transaction (money transfer
inside the payment channel) before the previous one is confirmed by
the other party. This can be fixed by replacing the single virtual
blockchain discussed in~\ptref{sp:sync.pc.as.blockch} by a system of
two interacting virtual workchains (or rather shardchains).

The first of these workchains contains only transactions by $A$, and
its blocks can be generated only by~$A$; its states are
$S_i=(i,\phi_i,j,\psi_j)$, where $i$ is the block sequence number
(i.e., the count of transactions, or money transfers, performed by $A$
so far), $\phi_i$ is the total amount transferred from $A$ to $B$ so
far, $j$ is the sequence number of the most recent valid block in
$B$'s blockchain that $A$ is aware of, and $\psi_j$ is the amount of
money transferred from $B$ to $A$ in its $j$ transactions. A signature
of $B$ put onto its $j$-th block should also be a part of this
state. Hashes of the previous block of this workchain and of the
$j$-th block of the other workchain may also be included. Validity
conditions for $S_i$ include $\phi_i\geq 0$, $\phi_i\geq\phi_{i-1}$ if
$i>0$, $\psi_j\geq0$, and $-a\leq\psi_j-\phi_i\leq b$.

Similarly, the second workchain contains only transactions by $B$, and
its blocks are generated only by~$B$; its states are
$T_j=(j,\psi_j,i,\phi_i)$, with similar validity conditions.

Now, if $A$ wants to transfer some money to $B$, it simply creates a
new block in its workchain, signs it, and sends it to $B$, without
waiting for confirmation.

The payment channel is finalized by $A$ signing (its version of) the
final state of its blockchain (with its special ``final signature''),
$B$ signing the final state of its blockchain, and presenting these
two final states to the clean finalization method of the payment
channel smart contract. Unilateral finalization is also possible, but
in that case the smart contract will have to wait for the other party
to present its version of the final state, at least for some grace
period.
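
For illustration, consider a hypothetical channel with $a=10$ and
$b=5$ (these figures, and the settlement rule used at the end, are
assumptions made only for this example). Suppose $A$ makes two
transfers of 3 and 4 coins without waiting for $B$, producing
$S_1=(1,3,0,0)$ and $S_2=(2,7,0,0)$; both are valid because
$-10\leq 0-7\leq 5$. Having seen $S_1$, $B$ sends 2 coins back and
produces $T_1=(1,2,1,3)$. Once $A$ has received $T_1$ with $B$'s
signature, it may issue $S_3=(3,9,1,2)$, for which
\begin{equation*}
  \psi_1-\phi_3=2-9=-7,\qquad -a=-10\leq -7\leq 5=b.
\end{equation*}
Note that $\phi=12$ is admissible only for a state that acknowledges
$T_1$: $2-12=-10\geq -a$, whereas $0-12=-12<-a$. If the channel is
finalized with $\phi=9$ and $\psi=2$, the final imbalance is
$\delta=\psi-\phi=-7$; assuming the contract pays $a+\delta$ to $A$
and $b-\delta$ to $B$, party $A$ would receive 3 coins and $B$ would
receive 12.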

\nxsubpoint \embt(Unidirectional payment channels.)  If only $A$ needs
to make payments to $B$ (e.g., $B$ is a service provider, and $A$ its
client), then a unidirectional payment channel can be
created. Essentially, it is just the first workchain described
in~\ptref{sp:async.pc} without the second one. Conversely, one can say
that the asynchronous payment channel described in \ptref{sp:async.pc}
consists of two unidirectional payment channels, or ``half-channels'',
managed by the same smart contract.

\nxsubpoint\label{sp:pc.promises} \embt(More sophisticated payment
channels. Promises.)  We will see later in~\ptref{sp:ch.money.tr} that
the ``lightning network'' (cf.~\ptref{sect:lightning}), which enables
instant money transfers through chains of several payment channels,
requires higher degrees of sophistication from the payment channels
involved.

In particular, we want to be able to commit ``promises'', or
``conditional money transfers'': $A$ agrees to send $c$ coins to $B$,
but $B$ will get the money only if a certain condition is fulfilled,
for instance, if $B$ can present some string $u$ with $\Hash(u)=v$ for
a known value of $v$. Otherwise, $A$ can get the money back after a
certain period of time.

Such a promise could easily be implemented on-chain by a simple smart
contract. However, we want promises and other kinds of conditional
money transfers to be possible off-chain, in the payment channel,
because they considerably simplify money transfers along a chain of
payment channels existing in the ``lightning network''
(cf.~\ptref{sp:ch.money.tr}).

The ``payment channel as a simple blockchain'' picture outlined
in~\ptref{sp:sync.pc.as.blockch} and~\ptref{sp:async.pc} becomes
convenient here. Now we consider a more complicated virtual
blockchain, the state of which contains a set of such unfulfilled
``promises'', and the amount of funds locked in such promises. This
blockchain---or the two workchains in the asynchronous case---will
have to refer explicitly to the previous blocks by their
hashes. Nevertheless, the general mechanism remains the same.
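
As a sketch only (the tuple layout and the symbols $c_k$, $v_k$,
$t_k$, $\Pi^A_i$ and $\Pi^B_i$ are introduced here for illustration
and are not fixed by the protocol), the state of a synchronous channel
with promises might be written as
\begin{equation*}
  S_i=(\delta_i,\,i,\,o_i,\,\Pi^A_i,\,\Pi^B_i),
\end{equation*}
where $\delta_i$ is the imbalance, $o_i$ the designated producer of
the next block, and $\Pi^A_i$ and $\Pi^B_i$ are the sets of
unfulfilled promises $(c_k,v_k,t_k)$ made by $A$ and by $B$,
respectively: $c_k$ coins are payable if some $u$ with $\Hash(u)=v_k$
is presented before time $t_k$. Since locked funds must not be spent
twice, the validity condition $-a\leq\delta_i\leq b$ would then
tighten to
\begin{equation*}
  -a+\sum_{(c_k,v_k,t_k)\in\Pi^A_i}c_k\;\leq\;\delta_i\;\leq\;
  b-\sum_{(c_k,v_k,t_k)\in\Pi^B_i}c_k .
\end{equation*}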

\nxsubpoint\label{sp:sm.pc.chal} \embt(Challenges for the
sophisticated payment channel smart contracts.)  Notice that, while
the final state of a sophisticated payment channel is still small, and
the ``clean'' finalization is simple (if the two sides have agreed on
their amounts due, and both have signed their agreement, nothing else
remains to be done), the unilateral finalization method and the method
for punishing fraudulent behavior need to be more complex. Indeed, they
must be able to accept Merkle proofs of misbehavior, and to check
whether the more sophisticated transactions of the payment channel
blockchain have been processed correctly.

In other words, the payment channel smart contract must be able to
work with Merkle proofs, to check their ``hash validity'', and must
contain an implementation of $\evtrans$ and $\evblock$ functions
(cf.~\ptref{sp:blk.transf}) for the payment channel (virtual)
blockchain.

\nxsubpoint\label{sp:ton.smart.pc.supp} \embt(TON VM support for
``smart'' payment channels.)  The TON VM, used to run the code of TON
Blockchain smart contracts, is up to the challenge of executing the
smart contracts required for ``smart'', or sophisticated, payment
channels (cf.~\ptref{sp:sm.pc.chal}).

At this point the ``everything is a bag of cells'' paradigm
(cf.~\ptref{sp:everything.is.BoC}) becomes extremely convenient. Since
all blocks (including the blocks of the ephemeral payment channel
blockchain) are represented as bags of cells (and described by some
algebraic data types), and the same holds for messages and Merkle
proofs as well, a Merkle proof can easily be embedded into an inbound
message sent to the payment channel smart contract. The ``hash
condition'' of the Merkle proof will be checked automatically, and
when the smart contract accesses the ``Merkle proof'' presented, it
will work with it as if it were a value of the corresponding algebraic
data type---albeit incomplete, with some subtrees of the tree replaced
by special nodes containing the Merkle hash of the omitted
subtree. Then the smart contract will work with that value, which
might represent, for instance, a block of the payment channel
(virtual) blockchain along with its state, and will evaluate the
$\evblock$ function (cf.~\ptref{sp:blk.transf}) of that blockchain on
this block and the previous state. Then either the computation
finishes, and the final state can be compared with that asserted in
the block, or an ``absent node'' exception is thrown while attempting
to access an absent subtree, indicating that the Merkle proof is
invalid.

In this way, the implementation of the verification code for smart
payment channel blockchains turns out to be quite straightforward
using TON Blockchain smart contracts. One might say that {\em the TON
  Virtual Machine comes with built-in support for checking the
  validity of other simple blockchains.} The only limiting factor is
the size of the Merkle proof to be incorporated into the inbound
message to the smart contract (i.e., into the transaction).

\nxsubpoint\label{sp:pc.within.pc} \embt(Simple payment channel within
a smart payment channel.)  We would like to discuss the possibility of
creating a simple (synchronous or asynchronous) payment channel inside
an existing payment channel.

While this may seem somewhat convoluted, it is not much harder to
understand and implement than the ``promises'' discussed
in~\ptref{sp:pc.promises}. Essentially, instead of promising to pay
$c$ coins to the other party if a solution to some hash problem is
presented, $A$ promises to pay up to $c$ coins to $B$ according to the
final settlement of some other (virtual) payment channel
blockchain. Generally speaking, this other payment channel blockchain
need not even be between $A$ and $B$; it might involve some other
parties, say, $C$ and $D$, willing to commit $c$ and $d$ coins into
their simple payment channel, respectively. (This possibility is
exploited later in~\ptref{sp:virt.pc}.)

If the encompassing payment channel is asynchronous
(cf.~\ptref{sp:async.pc}), two promises need
to be committed into the two workchains: $A$ will promise to pay
$-\delta$ coins to $B$ if the final settlement of the ``internal''
simple payment channel yields a negative final imbalance $\delta$ with
$0\leq-\delta\leq c$; and $B$ will have to promise to pay $\delta$ to
$A$ if $\delta$ is positive. On the other hand, if the encompassing
payment channel is synchronous, this can be done by committing a single
``simple payment channel creation'' transaction with parameters
$(c,d)$ into the single payment channel blockchain by~$A$ (which would
freeze $c$ coins belonging to~$A$), and then committing a special
``confirmation transaction'' by~$B$ (which would freeze $d$ coins
of~$B$).

We expect the internal payment channel to be extremely simple (e.g.,
the simple synchronous payment channel discussed
in~\ptref{sp:simple.sync.pc}), to minimize the size of Merkle proofs
to be submitted. The external payment channel will have to be
``smart'' in the sense described in~\ptref{sp:pc.promises}.

\mysubsection{Payment Channel Network, or ``Lightning
  Network''}\label{sect:lightning}

Now we are ready to discuss the ``lightning network'' of TON Payments
that enables instant money transfers between any two participating
nodes.

\nxsubpoint \embt(Limitations of payment channels.)  A payment channel
is useful for parties who expect a lot of money transfers between
them. However, if one needs to transfer money only once or twice to a
particular recipient, creating a payment channel with her would be
impractical. Among other things, this would imply freezing a
significant amount of money in the payment channel, and would require
at least two blockchain transactions anyway.

\nxsubpoint \embt(Payment channel networks, or ``lightning
networks''.)  Payment channel networks overcome the limitations of
payment channels by enabling money transfers along {\em chains} of
payment channels. If $A$ wants to transfer money to $E$, she does not
need to establish a payment channel with $E$. It would be sufficient
to have a chain of payment channels linking $A$ with $E$ through
several intermediate nodes---say, four payment channels: from $A$ to
$B$, from $B$ to $C$, from $C$ to $D$ and from $D$ to $E$.

\nxsubpoint \embt(Overview of payment channel networks.)  Recall that
a {\em payment channel network}, known also as a ``lightning
network'', consists of a collection of participating nodes, some of
which have established long-lived payment channels between them. We
will see in a moment that these payment channels will have to be
``smart'' in the sense of~\ptref{sp:pc.promises}. When a
participating node $A$ wants to transfer money to any other
participating node $E$, she tries to find a path linking $A$ to $E$
inside the payment channel network. When such a path is found, she
performs a ``chain money transfer'' along this path.

\nxsubpoint\label{sp:ch.money.tr} \embt(Chain money transfers.)
Suppose that there is a chain of payment channels from $A$ to $B$,
from $B$ to $C$, from $C$ to $D$, and from $D$ to $E$. Suppose,
further, that $A$ wants to transfer $x$ coins to $E$.

A simplistic approach would be to transfer $x$ coins to $B$ along
the existing payment channel, and ask him to forward the money further
to $C$. However, it is not evident why $B$ would not simply take the
money for himself. Therefore, one must employ a more sophisticated
approach, not requiring all parties involved to trust each other.

This can be achieved as follows. $A$ generates a large random number
$u$ and computes its hash $v=\Hash(u)$. Then she creates a promise to
pay $x$ coins to $B$ if a number $u$ with hash $v$ is presented
(cf.~\ptref{sp:pc.promises}), inside her payment channel
with~$B$. This promise contains $v$, but not $u$, which is still kept
secret.

After that, $B$ creates a similar promise to $C$ in their payment
channel. He is not afraid to give such a promise, because he is aware
of the existence of a similar promise given to him by $A$. If $C$ ever
presents a solution of the hash problem to collect $x$ coins promised
by $B$, then $B$ will immediately submit this solution to $A$ to
collect $x$ coins from $A$.

Then similar promises of $C$ to $D$ and of $D$ to $E$ are
created. When the promises are all in place, $A$ triggers the transfer
by communicating the solution $u$ to all parties involved---or just to
$E$.

Some minor details are omitted in this description. For example, these
promises must have different expiration times, and the amount promised
might slightly differ along the chain ($B$ might promise only
$x-\epsilon$ coins to $C$, where $\epsilon$ is a small pre-agreed
transit fee). We ignore such details for the time being, because they
are not too relevant for understanding how payment channels work and
how they can be implemented in TON.
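
To make the omitted details concrete, here is one possible assignment
of amounts and expiration times for a transfer of $x=100$ coins with a
transit fee of $\epsilon=1$ coin per hop. The figures, the start time
$t_0$ and the step $\Delta t$ are hypothetical; the only essential
feature is that each promise expires later than the one following it
in the chain, so that an intermediate node that has paid out still has
time to present $u$ and collect from its predecessor.
\begin{center}
\begin{tabular}{lll}
Promise & Amount & Expires at \\
\hline
$A\to B$ & $x=100$ & $t_0+4\,\Delta t$ \\
$B\to C$ & $x-\epsilon=99$ & $t_0+3\,\Delta t$ \\
$C\to D$ & $x-2\epsilon=98$ & $t_0+2\,\Delta t$ \\
$D\to E$ & $x-3\epsilon=97$ & $t_0+\Delta t$ \\
\end{tabular}
\end{center}
All four promises carry the same value $v=\Hash(u)$. In this example
$E$ receives 97 coins, $A$ pays 100, and each of $B$, $C$ and $D$
keeps one coin.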

\nxsubpoint\label{sp:virt.pc} \embt(Virtual payment channels inside a
chain of payment channels.)  Now suppose that $A$ and $E$ expect to
make a lot of payments to each other. They might create a new payment
channel between them in the blockchain, but this would still be quite
expensive, because some funds would be locked in this payment
channel. Another option would be to use chain money transfers
described in~\ptref{sp:ch.money.tr} for each payment. However, this
would involve a lot of network activity and a lot of transactions in
the virtual blockchains of all payment channels involved.

An alternative is to create a virtual payment channel inside the chain
linking $A$ to $E$ in the payment channel network. For this, $A$ and
$E$ create a (virtual) blockchain for their payments, as if they were
going to create a payment channel in the blockchain. However, instead
of creating a payment channel smart contract in the blockchain, they
ask all intermediate payment channels---those linking $A$ to $B$, $B$
to $C$, etc.---to create simple payment channels inside them, bound
to the virtual blockchain created by $A$ and $E$
(cf.~\ptref{sp:pc.within.pc}). In other words, now a promise to
transfer money according to the final settlement between $A$ and~$E$
exists inside every intermediate payment channel.

If the virtual payment channel is unidirectional, such promises can be
implemented quite easily, because the final imbalance $\delta$ is
going to be non-positive, so simple payment channels can be created
inside intermediate payment channels in the same order as described
in~\ptref{sp:ch.money.tr}. Their expiration times can also be set in
the same way.

If the virtual payment channel is bidirectional, the situation is
slightly more complicated. In that case, one should split the promise
to transfer $\delta$ coins according to the final settlement into two
half-promises, as explained in \ptref{sp:pc.within.pc}: to transfer
$\delta^-=\max(0,-\delta)$ coins in the forward direction, and to
transfer $\delta^+=\max(0,\delta)$ in the backward direction. These
half-promises can be created in the intermediate payment channels
independently, one chain of half-promises in the direction from $A$
to~$E$, and the other chain in the opposite direction.
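
For example, a final imbalance $\delta=-3$ gives $\delta^-=3$ and
$\delta^+=0$, so only the forward chain of half-promises pays out,
while $\delta=2$ gives $\delta^-=0$ and $\delta^+=2$. In general,
\begin{equation*}
  \delta=\delta^+-\delta^-,\qquad \delta^+\cdot\delta^-=0,
\end{equation*}
so at most one of the two chains transfers any money at settlement.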

\nxsubpoint\label{sp:lnet.find.path} \embt(Finding paths in the
lightning network.)  One point remains undiscussed so far: how will
$A$ and $E$ find a path connecting them in the payment network?  If
the payment network is not too large, an OSPF-like protocol can be
used: all nodes of the payment network create an overlay network
(cf.~\ptref{sp:net.within.net}), and then every node propagates all
available link (i.e., participating payment channel) information to
its neighbors by a gossip protocol. Ultimately, all nodes will have a
complete list of all payment channels participating in the payment
network, and will be able to find the shortest paths by
themselves---for example, by applying a version of Dijkstra's
algorithm modified to take into account the ``capacities'' of the
payment channels involved (i.e., the maximal amounts that can be
transferred along them). Once a candidate path is found, it can be
probed by a special ADNL datagram containing the full path, and asking
each intermediate node to confirm the existence of the payment channel
in question, and to forward this datagram further according to the
path. After that, a chain can be constructed, and a protocol for chain
transfers (cf.~\ptref{sp:ch.money.tr}), or for creating a virtual
payment channel inside a chain of payment channels
(cf.~\ptref{sp:virt.pc}), can be run.
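
One simple way to account for capacities (an assumption made here for
illustration; the exact modification of Dijkstra's algorithm is not
fixed above) is to discard every channel whose capacity in the
direction of the transfer is below the amount $x$ to be sent, and to
run the ordinary shortest-path search on what remains. For instance,
with hypothetical channels $A\to B$ of capacity 80, $B\to E$ of
capacity 30, $B\to C$ of capacity 60 and $C\to E$ of capacity 70, a
transfer of $x=50$ coins cannot use the two-hop path $A\to B\to E$,
because $30<50$; the search therefore returns the three-hop path
$A\to B\to C\to E$, whose bottleneck capacity is
\begin{equation*}
  \min(80,60,70)=60\geq 50 .
\end{equation*}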

\nxsubpoint \embt(Optimizations.)  Some optimizations are possible
here. For example, only transit nodes of the lightning network need to
participate in the OSPF-like protocol discussed
in~\ptref{sp:lnet.find.path}. Two ``leaf'' nodes wishing to connect
through the lightning network would communicate to each other the
lists of transit nodes they are connected to (i.e., with which they
have established payment channels participating in the payment
network). Then paths connecting transit nodes from one list to transit
nodes from the other list can be inspected as outlined above
in~\ptref{sp:lnet.find.path}.

\nxsubpoint \embt(Conclusion.)  We have outlined how the blockchain
and network technologies of the TON project are adequate to the task
of creating {\em TON Payments}, a platform for off-chain instant money
transfers and micropayments. This platform can be extremely useful
for services residing in the TON ecosystem, allowing them to easily
collect micropayments when and where required.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%                  CONCLUSION
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\clearpage
\section*{Conclusion}
\markbothsame{\textsc{Conclusion}}
\addcontentsline{toc}{section}{Conclusion}

We have proposed a scalable multi-blockchain architecture capable of
supporting a massively popular cryptocurrency and decentralized
applications with user-friendly interfaces.

To achieve the necessary scalability, we proposed the {\em TON
  Blockchain}, a ``tightly-coupled'' multi-blockchain system
(cf.~\ptref{sp:blkch.interact}) with a bottom-up approach to sharding
(cf.~\ptref{sp:shard.supp} and~\ptref{sp:ISP}). To further increase
potential performance, we introduced the 2-blockchain mechanism for
replacing invalid blocks (cf.~\ptref{sp:inv.sh.blk.corr}) and Instant
Hypercube Routing for faster communication between shards
(cf.~\ptref{sp:instant.hypercube}). A brief comparison of the TON
Blockchain to existing and proposed blockchain projects
(cf.~\ptref{sect:class.blkch} and~\ptref{sect:compare.blkch})
highlights the benefits of this approach for systems that seek to
handle millions of transactions per second.

The {\em TON Network}, described in Chapter~\ptref{sect:network},
covers the networking demands of the proposed multi-blockchain
infrastructure. This network component may also be used in combination
with the blockchain to create a wide spectrum of applications and
services that would be impossible using the blockchain alone
(cf.~\ptref{sp:blockchain.facebook}). These services, discussed in
Chapter~\ptref{sect:services}, include {\em TON DNS}, a service for
translating human-readable object identifiers into their addresses;
{\em TON Storage}, a distributed platform for storing arbitrary files;
{\em TON Proxy}, a service for anonymizing network access and
accessing TON-powered services; and {\em TON Payments\/}
(cf.\ Chapter~\ptref{sect:payments}), a platform for instant off-chain
money transfers across the TON ecosystem that applications may use for
micropayments.

The TON infrastructure allows for specialized light client wallet and
``ton-browser'' desktop and smartphone applications that enable a
browser-like experience for the end user (cf.~\ptref{sp:ton.www}),
making cryptocurrency payments and interaction with smart contracts
and other services on the TON Platform accessible to a mass
audience. Such a light client can be integrated into the Telegram
Messenger client (cf.~\ptref{sp:telegram.integr}), thus eventually
bringing a wealth of blockchain-based applications to hundreds of
millions of users.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%                  BIBLIOGRAPHY
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\clearpage
\markbothsame{\textsc{References}}

\begin{thebibliography}{99}

\bibitem{Birman}
  {\sc K.~Birman}, {\sl Reliable Distributed Systems: Technologies, Web Services and Applications}, Springer, 2005.

\bibitem{EthWP}
  {\sc V.~Buterin}, {\sl Ethereum: A next-generation smart contract and decentralized application platform}, \url{https://github.com/ethereum/wiki/wiki/White-Paper}, 2013.

\bibitem{BenOr}
  {\sc M.~Ben-Or, B.~Kelmer, T.~Rabin}, {\sl Asynchronous secure computations with optimal resilience}, in {\em Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing}, p.~183--192. ACM, 1994.

\bibitem{PBFT}
  {\sc M.~Castro, B.~Liskov, et al.}, {\sl Practical byzantine fault tolerance}, {\it Proceedings of the Third Symposium on Operating Systems Design and Implementation\/} (1999), p.~173--186, available at \url{http://pmg.csail.mit.edu/papers/osdi99.pdf}.

\bibitem{EOSWP}
  {\sc EOS.IO}, {\sl EOS.IO technical white paper}, \url{https://github.com/EOSIO/Documentation/blob/master/TechnicalWhitePaper.md}, 2017.

\bibitem{Onion}
  {\sc D.~Goldschlag, M.~Reed, P.~Syverson}, {\sl Onion Routing for Anonymous and Private Internet Connections}, {\it Communications of the ACM}, {\bf 42}, num.~2 (1999), \url{http://www.onion-router.net/Publications/CACM-1999.pdf}.

\bibitem{Byzantine}
  {\sc L.~Lamport, R.~Shostak, M.~Pease}, {\sl The byzantine generals problem}, {\it ACM Transactions on Programming Languages and Systems}, {\bf 4/3} (1982), p.~382--401.

\bibitem{BitShWP}
  {\sc S.~Larimer}, {\sl The history of BitShares}, \url{https://docs.bitshares.org/bitshares/history.html}, 2013.

\bibitem{RaptorQ}
  {\sc M.~Luby, A.~Shokrollahi, et al.}, {\sl RaptorQ forward error correction scheme for object delivery}, IETF RFC 6330, \url{https://tools.ietf.org/html/rfc6330}, 2011.

\bibitem{Kademlia}
  {\sc P.~Maymounkov, D.~Mazi\`eres}, {\sl Kademlia: A peer-to-peer information system based on the XOR metric}, in {\em IPTPS '01 revised papers from the First International Workshop on Peer-to-Peer Systems}, p.~53--65, available at \url{http://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf}, 2002.

\bibitem{HoneyBadger}
  {\sc A.~Miller, Yu Xia, et al.}, {\sl The honey badger of BFT protocols}, Cryptology e-print archive 2016/199, \url{https://eprint.iacr.org/2016/199.pdf}, 2016.

\bibitem{BitcWP}
  {\sc S.~Nakamoto}, {\sl Bitcoin: A peer-to-peer electronic cash system}, \url{https://bitcoin.org/bitcoin.pdf}, 2008.

\bibitem{STGM}
  {\sc S.~Peyton Jones}, {\sl Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine}, {\it Journal of Functional Programming\/} {\bf 2} (2), p.~127--202, 1992.

\bibitem{Raptor}
  {\sc A.~Shokrollahi, M.~Luby}, {\sl Raptor Codes}, {\it IEEE Transactions on Information Theory\/} {\bf 6}, no.\ 3--4 (2006), p.~212--322.

\bibitem{DistrSys}
  {\sc M.~van Steen, A.~Tanenbaum}, {\sl Distributed Systems, 3rd ed.}, 2017.

\bibitem{HoTT}
  {\sc The Univalent Foundations Program}, {\sl Homotopy Type Theory: Univalent Foundations of Mathematics}, Institute for Advanced Study, 2013, available at \url{https://homotopytypetheory.org/book}.

\bibitem{PolkaWP}
  {\sc G.~Wood}, {\sl PolkaDot: vision for a heterogeneous multi-chain framework}, draft~1, \url{https://github.com/w3f/polkadot-white-paper/raw/master/PolkaDotPaper.pdf}, 2016.

\end{thebibliography}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%                  APPENDICES
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\clearpage
\appendix
\myappendix{The TON Coin, or the Gram}\label{app:coins}

The principal cryptocurrency of the TON Blockchain, and in particular
of its masterchain and basic workchain, is the {\em TON Coin}, also
known as the {\em Gram\/} (GRM). It is used to make deposits required
to become a validator; transaction fees, gas payments (i.e.,
smart-contract message processing fees) and persistent storage
payments are also usually collected in Grams.

\nxpoint \embt(Subdivision and terminology.)  A {\em Gram\/} is
subdivided into one billion ($10^9$) smaller units, called {\em
  nanograms}, {\em ngrams} or simply {\em nanos}. All transfers and
account balances are expressed as non-negative integer multiples of
nanos. Other units include:
\begin{itemize}
\item A {\em nano}, {\em ngram} or {\em nanogram} is the smallest
  unit, equal to $10^{-9}$ Grams.
\item A {\em micro\/} or {\em microgram\/} equals one thousand
  ($10^3$) nanos.
\item A {\em milli\/} is one million ($10^6$) nanos, or one thousandth
  part ($10^{-3}$) of a Gram.
\item A {\em Gram\/} equals one billion ($10^9$) nanos.
\item A {\em kilogram}, or {\em kGram}, equals one thousand ($10^3$)
  Grams.
\item A {\em megagram}, or {\em MGram}, equals one million ($10^6$)
  Grams, or $10^{15}$ nanos.
\item Finally, a {\em gigagram}, or {\em GGram}, equals one billion
  ($10^9$) Grams, or $10^{18}$ nanos.
\end{itemize}

There will be no need for larger units, because the initial supply of
Grams will be limited to five billion ($5\cdot10^9$) Grams (i.e., 5
Gigagrams).

\nxpoint \embt(Smaller units for expressing gas prices.)  If the
necessity for smaller units arises, ``specks'' equal to $2^{-16}$
nanograms will be used. For example, gas prices may be indicated in
specks. However, the actual fee to be paid, computed as the product of
the gas price and the amount of gas consumed, will always be rounded
down to the nearest multiple of $2^{16}$ specks and expressed as an
integer number of nanos.
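
As a worked example with hypothetical figures, let the gas price be
$100\,000$ specks per gas unit (about $1.526$ nanos) and let a
transaction consume $1234$ gas units. Then
\begin{equation*}
  100\,000\cdot 1234=123\,400\,000\ \text{specks},\qquad
  \left\lfloor\frac{123\,400\,000}{2^{16}}\right\rfloor
  =\left\lfloor 1882.93\ldots\right\rfloor=1882,
\end{equation*}
so the fee actually charged is $1882$ nanos, that is,
$1882\cdot 2^{16}=123\,338\,752$ specks; the remaining $61\,248$
specks are dropped by the rounding.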

\nxpoint \embt(Original supply, mining rewards and inflation.)  The
total supply of Grams is originally limited to $5$ Gigagrams (i.e.,
five billion Grams or $5\cdot10^{18}$ nanos).

This supply will increase very slowly, as rewards to validators for
mining new masterchain and shardchain blocks accumulate. These rewards
would amount to approximately $20\%$ (the exact number may be adjusted
in the future) of the validator's stake per year, provided the validator
diligently performs its duties, signs all blocks, never goes offline
and never signs invalid blocks. In this way, the validators will have
enough profit to invest into better and faster hardware needed to
process the ever growing quantity of users' transactions.

We expect that at most $10\%$\footnote{The maximum total amount of
  validator stakes is a configurable parameter of the blockchain, so
  this restriction can be enforced by the protocol if necessary.} of
the total supply of Grams, on average, will be bound in validator
stakes at any given moment. This will produce an inflation rate of
$2\%$ per year, and as a result, will double the total supply of Grams
(to ten Gigagrams) in 35 years. Essentially, this inflation represents
a payment made by all members of the community to the validators for
keeping the system up and running.
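
The arithmetic behind these figures is as follows. With rewards of
about $20\%$ of the stake per year and about $10\%$ of the supply
staked, the yearly increase of the supply is
\begin{equation*}
  0.20\cdot 0.10=0.02,
\end{equation*}
or roughly $10^8$ Grams in the first year. At a constant rate of
$2\%$ per year the supply doubles after
\begin{equation*}
  \frac{\log 2}{\log 1.02}\approx\frac{0.6931}{0.0198}\approx 35\ \text{years}.
\end{equation*}
Both percentages are the approximate values quoted above and may be
adjusted, so the result is an estimate.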

On the other hand, if a validator is caught misbehaving, a part or all
of its stake will be taken away as a punishment, and a larger portion
of it will subsequently be ``burned'', decreasing the total supply of
Grams. This would lead to deflation. A smaller portion of the fine may
be redistributed to the validator or the ``fisherman'' who committed a
proof of the guilty validator's misbehavior.

\nxpoint\label{sp:gram.price} \embt(Original price of Grams.)  The
price of the first Gram to be sold will equal approximately
$\$0.1$ (USD). Every subsequent Gram to be sold (by the TON Reserve,
controlled by the TON Foundation) will be priced one billionth higher
than the previous one. In this way, the $n$-th Gram to be put into
circulation will be sold at approximately
\begin{equation}\label{eq:gram.price}
  p(n)\approx 0.1\cdot (1+10^{-9})^n\quad\text{USD},
\end{equation}
or an approximately equivalent (because of quickly changing market
exchange rates) amount of other (crypto)currencies, such as BTC or
ETH.

\nxsubpoint\label{sp:exp.priced} \embt(Exponentially priced
cryptocurrencies.)  We say that the Gram is an {\em exponentially
  priced cryptocurrency}, meaning that the price of the $n$-th Gram to
be put into circulation is approximately $p(n)$ given by the formula
\begin{equation}
  p(n)=p_0\cdot e^{\alpha n}
\end{equation}
with specific values $p_0=0.1$ USD and $\alpha=10^{-9}$.

More precisely, a small fraction $dn$ of a new coin is worth
$p(n)\,dn$ dollars, once $n$ coins are put into circulation. (Here $n$
is not necessarily an integer.)

Other important parameters of such a cryptocurrency include $n$, the
total number of coins in circulation, and $N\geq n$, the total number
of coins that can exist. For the Gram, $N=5\cdot 10^9$.
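As a rough consistency check between the exponential form and
\eqref{eq:gram.price} (the sample values of $n$ below are chosen purely
for illustration), note that
$\log(1+10^{-9})=10^{-9}-\tfrac12\cdot10^{-18}+\cdots$, hence
\[
  (1+10^{-9})^n=e^{n\log(1+10^{-9})}
  \approx e^{\alpha n}\cdot e^{-n\cdot 10^{-18}/2},
\]
and for $n\leq N=5\cdot10^9$ the two expressions differ by a relative
amount of at most about $2.5\cdot10^{-9}$. With $p_0=0.1$ and
$\alpha=10^{-9}$ one obtains, for example,
\[
  p(10^9)=0.1\,e\approx 0.272,\quad
  p(2.5\cdot10^9)=0.1\,e^{2.5}\approx 1.22,\quad
  p(5\cdot10^9)=0.1\,e^{5}\approx 14.8\quad\text{USD}.
\]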

\nxsubpoint \embt(Total price of first $n$ coins.)  The total price
$T(n)=\int_0^n p(m)\,dm\approx p(0)+p(1)+\cdots+p(n-1)$ of the first
$n$ coins of an exponentially priced cryptocurrency (e.g., the Gram)
to be put into circulation can be computed by
\begin{equation}
  T(n)=p_0\cdot\alpha^{-1}(e^{\alpha n}-1)\quad.
\end{equation}
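For the reader's convenience, the closed form follows by direct
integration; the numerical values given afterwards are illustrative
evaluations with the Gram parameters $p_0=0.1$ and $\alpha=10^{-9}$,
not figures quoted from elsewhere:
\[
  T(n)=\int_0^n p_0\,e^{\alpha x}\,dx
  =\frac{p_0}{\alpha}\bigl[e^{\alpha x}\bigr]_0^n
  =p_0\cdot\alpha^{-1}(e^{\alpha n}-1).
\]
Since $p_0\cdot\alpha^{-1}=10^8$ USD, this gives
$T(10^9)=10^8\,(e-1)\approx 1.72\cdot10^8$ USD and
$T(2.5\cdot10^9)=10^8\,(e^{2.5}-1)\approx 1.12\cdot10^9$ USD.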

\nxsubpoint \embt(Total price of next $\Delta n$ coins.)  The total
price $T(n+\Delta n)-T(n)$ of $\Delta n$ coins put into circulation
after $n$ previously existing coins can be computed by
\begin{equation}\label{eq:T.m.n}
  T(n+\Delta n)-T(n)=p_0\cdot\alpha^{-1}(e^{\alpha(n+\Delta n)}-e^{\alpha n})
  =p(n)\cdot\alpha^{-1}(e^{\alpha\,\Delta n}-1)\quad.
\end{equation}
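For instance (hypothetical figures, used only to illustrate the
formula), with $n=10^9$ Grams already in circulation and
$\Delta n=10^8$ new Grams,
\[
  T(n+\Delta n)-T(n)=p(10^9)\cdot 10^{9}\,(e^{0.1}-1)
  \approx 0.2718\cdot 10^9\cdot 0.1052
  \approx 2.86\cdot 10^7\ \text{USD},
\]
i.e., an average of about $0.286$ USD per Gram, slightly above the
starting price $p(10^9)\approx 0.272$ USD, as one would expect for a
price that rises during the purchase.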

\nxsubpoint \embt(Buying next coins with total value $T$.)  Suppose
that $n$ coins have already been put into circulation, and that one
wants to spend $T$ (dollars) on buying new coins. The quantity of
newly-obtained coins $\Delta n$ can be computed by putting $T(n+\Delta
n)-T(n)=T$ into \eqref{eq:T.m.n}, yielding
\begin{equation}\label{eq:new.coins}
  \Delta n=\alpha^{-1}\log\left(1+\frac{T\cdot\alpha}{p(n)}\right)\quad.
\end{equation}
Of course, if $T\lll p(n)\alpha^{-1}$, then $\Delta n\approx T/p(n)$.
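The inversion step can be spelled out as follows (the sample purchase
below is a hypothetical illustration, not a figure from the sale
itself):
\[
  T=p(n)\cdot\alpha^{-1}(e^{\alpha\,\Delta n}-1)
  \iff
  e^{\alpha\,\Delta n}=1+\frac{T\cdot\alpha}{p(n)}
  \iff
  \Delta n=\alpha^{-1}\log\left(1+\frac{T\cdot\alpha}{p(n)}\right).
\]
For example, spending $T=10^6$ USD at $n=0$ gives
$T\cdot\alpha/p(0)=0.01$ and
$\Delta n=10^9\log(1.01)\approx 9.95\cdot10^6$ Grams, compared with the
first-order estimate $T/p(0)=10^7$ Grams.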

\nxsubpoint \embt(Market price of Grams.)  Of course, if the free
market price falls below $p(n):=0.1\cdot (1+10^{-9})^n$, once $n$
Grams are put into circulation, nobody would buy new Grams from the
TON Reserve; they would choose to buy their Grams on the free market
instead, without increasing the total quantity of Grams in
circulation. On the other hand, the market price of a Gram cannot
become much higher than $p(n)$; otherwise it would make sense to
obtain new Grams from the TON Reserve. This means that the market
price of Grams would not be subject to sudden spikes (and drops); this
is important because stakes (validator deposits) are frozen for at
least one month, and gas prices cannot change too fast either. So, the
overall economic stability of the system requires some mechanism that
would prevent the exchange rate of the Gram from changing too
drastically, such as the one described above.

\nxsubpoint \embt(Buying back the Grams.)  If the market price of the
Gram falls below $0.5\cdot p(n)$, when there are a total of $n$ Grams
in circulation (i.e., not kept on a special account controlled by the
TON Reserve), the TON Reserve reserves the right to buy some Grams
back and decrease $n$, the total quantity of Grams in
circulation. This may be required to prevent sudden falls of the
Gram exchange rate.

\nxsubpoint \embt(Selling new Grams at a higher price.)  The TON
Reserve will sell only up to one half (i.e., $2.5\cdot10^9$ Grams) of
the total supply of Grams according to the price
formula~\eqref{eq:gram.price}.  It reserves the right not to sell any
of the remaining Grams at all, or to sell them at a higher price than
$p(n)$, but never at a lower price (taking into account the uncertainty
of quickly changing exchange rates). The rationale here is that once
at least half of all Grams have been sold, the total value of the Gram
market will be sufficiently high, and it will be more difficult for
outside forces to manipulate the exchange rate than it may be at the
very beginning of the Gram's deployment.

\nxpoint\label{sp:unalloc.gr} \embt(Using unallocated Grams.)  The TON
Reserve will use the bulk of ``unallocated'' Grams (approximately
$5\cdot10^9-n$ Grams)---i.e., those residing in the special account of
the TON Reserve and some other accounts explicitly linked to it---only
as validator stakes (because the TON Foundation itself will likely
have to provide most of the validators during the first deployment
phase of the TON Blockchain), and for voting in the masterchain for or
against proposals concerning changes in the ``configurable
parameters'' and other protocol changes, in the way determined by the
TON Foundation (i.e., its creators---the development team). This also
means that the TON Foundation will have a majority of votes during the
first deployment phase of the TON Blockchain, which may be useful if a
lot of parameters end up needing to be adjusted, or if the need arises
for hard or soft forks. Later, when less than half of all Grams remain
under control of the TON Foundation, the system will become more
democratic. Hopefully it will have become more mature by then, without
the need to adjust parameters too frequently.

\nxsubpoint\label{sp:dev.grams} \embt(Some unallocated Grams will be
given to developers.)  A predefined (relatively small) quantity of
``unallocated'' Grams (e.g., 200 Megagrams, equal to 4\% of the total
supply) will be transferred during the deployment of the TON
Blockchain to a special account controlled by the TON Foundation, and
then some ``rewards'' may be paid from this account to the developers
of the open source TON software, with a minimum two-year vesting
period.

\nxsubpoint\label{sp:TON.own.grams} \embt(The TON Foundation needs
Grams for operational purposes.)  Recall that the TON Foundation will
receive the fiat and cryptocurrency obtained by selling Grams from the
TON Reserve, and will use them for the development and deployment of
the TON Project. For instance, the original set of validators, as well
as an initial set of TON Storage and TON Proxy nodes, may be installed
by the TON Foundation.

While this is necessary for the quick start of the project, the
ultimate goal is to make the project as decentralized as possible. To
this end, the TON Foundation may need to encourage installation of
third-party validators and TON Storage and TON Proxy nodes---for
example, by paying them for storing old blocks of the TON Blockchain
or proxying network traffic of a selected subset of services. Such
payments will be made in Grams; therefore, the TON Foundation will
need a significant amount of Grams for operational purposes.

\nxsubpoint \embt(Taking a pre-arranged amount from the Reserve.) The
TON Foundation will transfer to its account a small part of the TON
Reserve---say, 10\% of all coins (i.e., 500 Megagrams) after the end
of the initial sale of Grams---to be used for its own purposes as
outlined in~\ptref{sp:TON.own.grams}. This is best done simultaneously
with the transfer of the funds intended for TON developers, as
mentioned in~\ptref{sp:dev.grams}.

After the transfers to the TON Foundation and the TON developers, the
TON Reserve price $p(n)$ of the Gram will immediately rise by a
certain amount, known in advance. For example, if 10\% of all coins
are transferred for the purposes of the TON Foundation, and 4\% are
transferred for the encouragement of the developers, then the total
quantity $n$ of coins in circulation will immediately increase by
$\Delta n=7\cdot10^8$, with the price of the Gram multiplying by
$e^{\alpha\,\Delta n}=e^{0.7}\approx 2$ (i.e., doubling).

The remaining ``unallocated'' Grams will be used by the TON Reserve
as explained above in~\ptref{sp:unalloc.gr}. If the TON Foundation
needs any more Grams thereafter, it will simply convert into Grams
some of the funds it had previously obtained during the sale of the
coins, either on the free market or by buying Grams from the TON
Reserve.  To prevent excessive centralization, the TON Foundation will
never endeavour to have more than 10\% of the total amount of Grams
(i.e., 500 Megagrams) on its account.

\nxpoint\label{sp:bulk.sales} \embt(Bulk sales of Grams.)  When a lot
of people simultaneously want to buy large amounts of Grams from the
TON Reserve, it makes sense not to process their orders immediately,
because this would lead to results very dependent on the timing of
specific orders and their processing sequence.

Instead, orders for buying Grams may be collected during some
pre-defined period of time (e.g., a day or a month) and then processed
all together at once. If $k$ orders arrive, the $i$-th order being
worth $T_i$ dollars, then the total amount $T=T_1+T_2+\cdots+T_k$ is
used to buy $\Delta n$ new coins according to \eqref{eq:new.coins}, and
the sender of the $i$-th order is allotted $\Delta n\cdot T_i/T$ of
these coins. In this way, all buyers obtain their Grams at the same
average price of $T/\Delta n$ USD per Gram.
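As an illustration (the three order sizes and the starting point $n=0$
are hypothetical, chosen only to make the arithmetic easy to follow),
suppose that orders worth $T_1=10^6$, $T_2=3\cdot10^6$ and
$T_3=6\cdot10^6$ USD are collected in one round, so that $T=10^7$ USD
and $T\cdot\alpha/p(0)=0.1$. Then \eqref{eq:new.coins} gives
\[
  \Delta n=10^9\log(1.1)\approx 9.531\cdot10^7\ \text{Grams},\qquad
  T/\Delta n\approx 0.1049\ \text{USD per Gram},
\]
and the three buyers are allotted approximately $9.53\cdot10^6$,
$2.859\cdot10^7$ and $5.719\cdot10^7$ Grams, respectively, regardless
of the order in which their requests arrived within the round.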

After that, a new round of collecting orders for buying new Grams
begins.

When the total value of Gram buying orders becomes low enough, this
system of ``bulk sales'' may be replaced with a system of immediate
sales of Grams from the TON Reserve according to
formula~\eqref{eq:new.coins}.

The ``bulk sales'' mechanism will probably be used extensively during
the initial phase of collecting investments in the TON Project.

\end{document}