add tex sources for the papers

arthur.breitman@gmail.com 2016-10-16 16:35:49 -07:00
parent 6c9cfde9be
commit c5fbd00dda
6 changed files with 2960 additions and 0 deletions


@ -0,0 +1,6 @@
These are the sources of the original Tezos paper by L.M. Goodman with very minor edits and typo fixes.
They are being made available in order to facilitate translation of the papers.
A word of caution, though the project is substantially similar to the protocol described in the paper,
some implementation details have changed. At this point, the protocol is defined by its reference
implementation in OCaml.


@ -0,0 +1,98 @@
\begin{thebibliography}{10}
\bibitem{Nomic}
Peter Suber.
\newblock Nomic: A game of self-amendment.
\newblock {\url{http://legacy.earlham.edu/~peters/writing/nomic.htm}}, 1982.
\bibitem{Bitcoin}
Satoshi Nakamoto.
\newblock Bitcoin: A peer-to-peer electronic cash system.
\newblock {\url{https://bitcoin.org/bitcoin.pdf}}, 2008.
\bibitem{Ethereum}
Vitalik~Buterin et~al.
\newblock A next-generation smart contract and decentralized application
platform.
\newblock
{\url{https://github.com/ethereum/wiki/wiki/%5BEnglish%5D-White-Paper}},
2014.
\bibitem{CryptoNote}
Nicolas van Saberhagen.
\newblock CryptoNote v 2.0.
\newblock {\url{https://cryptonote.org/whitepaper.pdf}}, 2013.
\bibitem{Zerocash}
Matthew~Green et~al.
\newblock Zerocash: Decentralized anonymous payments from bitcoin.
\newblock
{\url{http://zerocash-project.org/media/pdf/zerocash-extended-20140518.pdf}},
2014.
\bibitem{schelling}
Thomas Schelling.
\newblock {\em The Strategy of conflict}.
\newblock Cambridge: Harvard University Press, 1960.
\bibitem{51pct}
Bitcoin Wiki.
\newblock Weaknesses.
\newblock
{\url{https://en.bitcoin.it/wiki/Attacks#Attacker_has_a_lot_of_computing_power}},
2014.
\bibitem{centralized}
Gavin Andresen.
\newblock Centralized mining.
\newblock {\url{https://bitcoinfoundation.org/2014/06/13/centralized-mining/}},
2014.
\bibitem{btccommons}
Bitcoin Wiki.
\newblock Tragedy of the commons.
\newblock {\url{https://en.bitcoin.it/wiki/Tragedy_of_the_Commons}}, 2014.
\bibitem{dominantassurance}
Bitcoin Wiki.
\newblock Dominant assurance contracts.
\newblock {\url{https://en.bitcoin.it/wiki/Dominant_Assurance_Contracts}},
2014.
\bibitem{doge}
Simon de~la Rouviere.
\newblock Not actually capped at 100 billion?
\newblock {\url{https://github.com/dogecoin/dogecoin/issues/23}}, 2013.
\bibitem{shootout}
Debian project.
\newblock Computer language benchmarks game.
\newblock
{\url{http://benchmarksgame.alioth.debian.org/u32/benchmark.php?test=all&lang=all&data=u32}},
2014.
\bibitem{semantic}
Scott Owens.
\newblock A sound semantics for {OCaml} light.
\newblock {\url{http://www.cl.cam.ac.uk/~so294/ocaml/paper.pdf}}, 2008.
\bibitem{distrib_impossible}
Ben Laurie.
\newblock Decentralised currencies are probably impossible, but let's at least
make them efficient.
\newblock {\url{http://www.links.org/files/decentralised-currencies.pdf}},
2011.
\bibitem{Slasher}
Vitalik Buterin.
\newblock Slasher: A punitive proof-of-stake algorithm.
\newblock
{\url{https://blog.ethereum.org/2014/01/15/slasher-a-punitive-proof-of-stake-algorithm/}},
2014.
\bibitem{Futarchy}
Robin Hanson.
\newblock Shall we vote on values, but bet on beliefs?
\newblock {\url{http://hanson.gmu.edu/futarchy3.pdf}}, 2013.
\end{thebibliography}


@ -0,0 +1,855 @@
\documentclass[letterpaper]{article}
\author{L.M. Goodman}
\date{August 3, 2014}
\title{Tezos: A Self-Amending Crypto-Ledger \\ Position Paper}
\usepackage[utf8]{inputenc}
%%\setlength{\parskip}{\baselineskip}
\usepackage{amsfonts}
\usepackage{url}
\usepackage[hidelinks]{hyperref}
%\usepackage{hyperref}
\usepackage{listings}
\usepackage{color}
\usepackage{epigraph}
%\epigraphfontsize{\small\itshape}
\setlength\epigraphwidth{4.6cm}
\setlength\epigraphrule{0pt}
\begin{document}
\maketitle
%\epigraphfontsize{\small\itshape}
%\renewcommand{\abstractname}{Introduction}
\epigraph{\emph{``Laissez faire les propri\'{e}taires.''}}
{--- \textup{Pierre-Joseph Proudhon}}
\begin{abstract}
The popularization of Bitcoin, a decentralized crypto-currency, has inspired the
production of several alternative, or ``alt'', currencies. Ethereum, CryptoNote,
and Zerocash all represent unique contributions to the crypto-currency space.
Although most alt currencies harbor their own source of innovation, they have
no means of adopting the innovations of other currencies which may succeed them.
We aim to remedy the potential for atrophied evolution in the crypto-currency
space by presenting Tezos, a generic and self-amending crypto-ledger.
Tezos can instantiate any blockchain-based protocol. Its seed protocol specifies
a procedure for stakeholders to approve amendments to the protocol,
\emph{including} amendments to the amendment procedure itself.
Upgrades to Tezos are staged through a testing environment to allow
stakeholders to recall potentially problematic amendments.
The philosophy of Tezos is inspired by Peter Suber's Nomic\cite{Nomic},
a game built around a fully introspective set of rules.
In this paper, we hope to elucidate the potential benefits of Tezos,
our choice to implement it as a proof-of-stake system, and our choice to write it
in OCaml.
\end{abstract}
\newpage
\tableofcontents
\section{Motivation}
In our development of Tezos, we aspire to address four problems we perceive with
Bitcoin\cite{Bitcoin}:
\begin{itemize}
\item[-] The ``hard fork'' problem, or the inability for Bitcoin to dynamically
innovate due to coordination issues.
\item[-] Cost and centralization issues raised by Bitcoin's proof-of-work
system.
\item[-] The limited expressiveness of Bitcoin's transaction language, which has
pushed smart contracts onto other chains.
\item[-] Security concerns regarding the implementation of a crypto-currency.
\end{itemize}
\subsection{The Protocol Fork Problem}
\subsubsection{Keeping Up With Innovation}
In the wake of Bitcoin's success, many developers and entrepreneurs have
released alternative crypto-currencies (``altcoins''). While some of these
altcoins did not diverge dramatically from Bitcoin's original
code\footnote{wow, such unoriginal}, some presented interesting improvements.
For example, Litecoin introduced a memory-hard proof-of-work
function\footnote{scrypt mining ASICs are now available} and a shorter block
confirmation time. Similarly, Ethereum has designed
stateful contracts and a Turing-complete transaction language\cite{Ethereum}.
More important contributions include privacy-preserving ring signatures
(CryptoNote)\cite{CryptoNote} and untraceable transactions using SNARKs
(Zerocash)\cite{Zerocash}.
The rise of altcoins has inspired a vast competition in software innovation.
Cheerleaders for this Hayekian growth, however, miss a fundamental point: for a
cryptocurrency to be an effective form of money, it needs to be a stable store
of value. Innovation within a ledger preserves value by protecting
the network effect that gives the currency its value.
To illustrate the problem of many competing altcoins, let us compare a
crypto-currency and a smart phone. When purchasing a smart phone, the consumer
is paying for certain features, such as the ability to play music, check email,
message his friends, and conduct phone calls.
Every few weeks, a newer smartphone model is released on the market which often
contains enhanced features. Though consumers who have the older model may be
jealous of those with the latest model, the introduction of newer smartphones
does not render older smartphones dysfunctional.
This dynamic would change, however, if the newest phones could not communicate
with older models. If the many models and styles of smartphone could not be used
together seamlessly, the value of each smartphone would be reduced to the number
of people with the same model.
Crypto-currencies suffer from the same fate as smartphones which are
incompatible with one another; each derives its value from a network effect,
or the number of users who have given it value. To this end, any innovation that
occurs outside of a crypto-currency will either fail to build enough network
effect to be noticed, or it will succeed but undermine the value of the savings
in the old currency. If smartphones were incompatible with older models, there
would be either very little innovation or extremely disruptive innovation
forcing older phones into obsolescence.
Side-chains are an attempt to allow innovations which will retain
compatibility with Bitcoin by pegging the value of a new currency to Bitcoin and
creating a two-way convertibility. Unfortunately, it's unclear whether they
will be flexible enough to accommodate protocols substantially different from
Bitcoin. The only alternative so far is to fork the protocol.
\subsubsection{Economics of Forks}
To understand the economics of forks, one must first understand that monetary
value is primarily a social consensus. It is tempting to equate a
cryptocurrency with its rules and its ledger, but currencies are actually focal
points: they draw their value from the common knowledge that they are accepted
as money. While this may seem circular, there is nothing paradoxical about it.
From a game theoretic perspective, the perception of a token as a store of value
is stable so long as it is widespread. Note that, as a ledger, Bitcoin is
a series of 1s and 0s. The choice to treat the amounts encoded within unspent
outputs as balances is a purely \emph{social} consensus, not a property of the
protocol itself.
Changes in the protocol are referred to as ``forks''\footnote{not to be confused
with blockchain forks which happen \emph{within} a protocol}. They are so called
because, in principle, users have the option to keep using the old protocol.
Thus, during a fork, the currency splits in two: an old version and a new
version.
A successful fork does not merely require software engineering, but
the coordination of a critical mass of users. This coordination is hard
to achieve in practice. Indeed, after a fork, two ledgers exist and users
are confronted with a dilemma. How should they value each branch of the fork?
This is a coordination game where the answer is to primarily value the branch
other users are expected to primarily value. Of course, said users are likely
to follow the same strategy and value the branch for the same reason. These
games were analyzed by economist Thomas Schelling and focal points are
sometimes referred to as ``Schelling points''\cite{schelling}.
Unfortunately, there is no guarantee that this Schelling point will be the most
desirable choice for the stakeholders; it will merely be the ``default'' choice.
A ``default'' could be to follow the lead of a core development team or the
decrees of a government regardless of their merit.
An attacker capable of changing social consensus
controls the currency for all intents and purposes.
The option to stick with the original protocol is widely irrelevant
if the value of its tokens is annihilated by a consensus shift.%
\footnote{The argument that there can never be more than 21 million bitcoin
because ``if a fork raised the cap, then it wouldn't be Bitcoin anymore''
isn't very substantive, for Bitcoin is what the consensus says it is.}
Core development teams are potentially a dangerous source of centralization.
Though users can fork any open source project,
that ability offers no protection against an attacker
with enough clout to alter the social consensus.
Even assuming the likely benevolence of a core development team,
it represents a weak point on which an attacker could exercise leverage.
Tezos guards against the vulnerabilities wrought by this source of centralization
through radically decentralized protocol forks.
It uses its own cryptoledger to let stakeholders coordinate on forks.
This allows coordination and enshrines the principle that
forks are not valid unless they are endogenous,
making it much harder to attack the protocol by moving the consensus.
Suppose for instance that a popular developer announces his intention to fork
Tezos without making use of the protocol's internal procedure. ``Why would he
attempt to bypass this process?'' stakeholders might ask. Most certainly,
because he knew that he wouldn't be able to build consensus around his proposed
fork \emph{within} Tezos.
This signals to the stakeholders that their preferred consensus would be to
reject this fork, and the Schelling point is thus to refuse it, no matter the
clout of that developer.
\subsection{Shortcomings of Proof-of-Work}
The proof-of-work mechanism used by Bitcoin is a careful balance
of incentives meant to prevent the double spending problem.
While it has nice theoretical properties in the absence of miner
collusion, it suffers in practice from severe shortcomings.
\subsubsection{Mining Power Concentration}
There are several problems with proof-of-work as a foundation for
crypto-currencies. The most salient problem, which is all too relevant
as of 2014, is the existence of centralized mining pools, which concentrate
power in the hands of a few individuals.
The proof-of-work mechanism is decentralized, which means that users do not
need to \emph{explicitly} trust anyone to secure the currency. However,
\emph{implicitly}, Bitcoin has yielded a system where all users have to trust
the benevolence of one or two pool operators to secure the currency.
A conspiracy of miners holding more than 50\% of the hashing power
is known as a 51\% attack\cite{51pct}. It allows the attackers
to prevent transactions from being made, to undo transactions,
to steal recently minted coins and to double spend\cite{centralized}.
A centralized mint signing blocks would be just as secure as,
and far less wasteful than, a miner controlling 51\% of the hashing power.
If a centralized mint is unacceptable to Bitcoin users,
they should not tolerate \textit{de facto} centralization of mining power.
The concentration of mining power is no coincidence:
large mining pools face less variance in their returns than their competitors
and can thus afford to grow their operation more.
In turn, this growth increases their market share and lowers their variance.
To make things worse, the large mining pool ghash.io
has hinted at a business model where they would prioritize ``premium''
transactions submitted directly to them. This means that large miners would earn
proportionally more than smaller miners. Sadly, p2pool has had trouble
attracting hashing power as most miners selfishly prefer the convenience of
centralized mining-pools.
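The variance argument above can be made concrete under a simplifying assumption
that is not part of the original analysis: suppose each block is won
independently by a pool with probability equal to its share $p$ of the hashing
power. Over $n$ blocks, the number $X$ of blocks found by the pool is then
binomially distributed, so that
\[
\mathbb{E}[X] = np, \qquad
\mathrm{Var}[X] = np(1-p), \qquad
\frac{\sqrt{\mathrm{Var}[X]}}{\mathbb{E}[X]} = \sqrt{\frac{1-p}{np}}.
\]
Under this model, over $n = 1000$ blocks a pool with $p = 0.01$ sees a relative
deviation of about $0.31$ in its block count, whereas a pool with $p = 0.30$
sees only about $0.05$. The relative dispersion shrinks as the share grows,
which is the sense in which larger pools face less variance.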
Many have argued that fears of market concentration are
overblown. They are generalizing hastily from the real world economy.
Real businesses compete in a rapidly changing landscape
where Schumpeterian creative destruction exercises
constant evolutionary pressure on incumbents.
Real businesses need local knowledge, face organizational issues
and principal agent problems. Bitcoin mining is a purely synthetic economic
sector centered around hashing power, a purely fungible commodity.
It would be mistaken to hastily generalize and think that such a sterile
environment is endowed with the same organic robustness that characterizes a
complex, fertile, economy.\footnote{It is possible that a new technology
will supplant ASICs, which themselves replaced FPGA boards. However, the pace of
this type of innovation is nowhere near fast enough to prevent miners from forming
dominating positions for long periods of time; and such innovation would benefit
but a new (or the same) small clique of people who initially possess the new
technology or eventually amass the capital to repeat the same pattern.}
Furthermore, the economic argument generally holds that natural monopolies have
few incentives to abuse their position. The same could be said about a Bitcoin
miner --- after all, why would a dominant miner destroy the value of their
investments by compromising the currency?
Unfortunately, this still creates a huge systemic risk as such miners can be
compromised by a dishonest attacker. The cost of executing a double spending
attack against the network is \emph{no more} than the cost of subverting a few
large mining pools.
There have been proposals intended to address this issue by tweaking the
protocol so it would be impossible for pool organizers to trust their members
not to cheat. However, these proposals only prevent pools from gathering mining
force from anonymous participants with whom there is no possibility of
retaliation. Pooling is still possible between non-anonymous people:
organizers may operate all the mining hardware while participants hold shares,
or organizers may track cheaters by requiring inclusion of an identifying nonce
in the blocks they are supposed to hash. The result of such proposals would thus
be to increase variance for anonymous mining operations and to push towards
further concentration in the hands of mining cartels.
Proof-of-stake, as used by Tezos, does not suffer from this problem:
inasmuch as it is possible to hold 51\% of the mining power,
this implies holding 51\% of the currency,
which is not only much more onerous than controlling 51\% of hashing power but
implies fundamentally better \emph{incentives}.
\subsubsection{Bad incentives}
There is an even deeper problem with proof-of-work, one that is much harder to
mitigate than the concentration of mining power: a misalignment of incentives
between miners and stakeholders.
Indeed, in the long run, the total mining revenues will be the sum of all the
transaction fees paid to the miners. Since miners compete to produce hashes,
the amount of money spent on mining will be slightly smaller than the revenues.
In turn, the amount spent on transactions depends on the supply and demand for
transactions. The supply of transactions on the blockchain is determined by the
block size and is fixed.
Unfortunately, there is reason to expect that the demand for transactions will
fall to very low levels. People are likely to make use of off-chain transaction
mechanisms via trusted third parties, particularly for small amounts, in order
to alleviate the need to wait for confirmations. Payment processors may only
need to clear with each other infrequently.
This scenario is not only economically likely, it seems necessary given the
relatively low transaction rate supported by Bitcoin. Since blockchain
transactions will have to compete with off-chain transactions, the amount spent on
transactions will approach its cost, which, given modern infrastructure, should
be close to zero.
Attempting to impose minimum transaction fees may only exacerbate the problem
and cause users to rely on off-chain transactions more. As the amount paid in
transaction fees collapses, so will the miners' revenues, and so will the cost
of executing a 51\% attack. To put it in a nutshell, the security of a
proof-of-work blockchain suffers from a commons problem\cite{btccommons}.
Core developer Mike Hearn has suggested the use of special transactions to
subsidize mining using a pledge type of fund raising\cite{dominantassurance}.
A robust currency should not need to rely on charity to operate securely.
Proof-of-stake fixes these bad incentives by aligning the incentives of the
miners and stakeholders: by very definition, the miners \emph{are} the
stakeholders, and are thus interested in keeping the transaction costs low.
At the same time, because proof-of-stake mining is not based on destruction of
resources, the transaction costs (whether direct fees or indirect inflation)
are entirely captured by miners, who can cover their operating costs
without having to compete through wealth destruction.
\subsubsection{Cost}
An alternative is to keep permanent mining rewards, as Dogecoin\cite{doge} has
considered. Unfortunately, proof-of-work arbitrarily increases the costs to the
users without increasing the profits of the miners, incurring a deadweight loss.
Indeed, since miners compete to produce hashes, the amount of money they spend
on mining will be slightly smaller than the revenues, and in the long run,
the profits they make will be commensurate with the value of their transaction
services, while the cost of mining is lost to everyone.
This is not simply a nominal effect: real economic goods (time in fabs,
electricity, engineering efforts) are being removed from the economy for the
sake of proof-of-work mining. As of June 2014, Bitcoin's annual inflation stands
at a little over 10\% and about \$2.16M are being burned daily for the
sake of maintaining a system that provides little to no security over a
centralized system in the hands of ghash.io.
The very security of a proof-of-work scheme rests on this actual cost being
higher than what an attacker is willing to pay, which is bound to increase
with the success of the currency.
Proof-of-stake eliminates this source of waste without lowering the cost of
attacks --- indeed, it automatically scales up the cost of an attack as the
currency appreciates. Because the thing you must prove to mine is not
destruction of existing resources but provision of existing resources,
a proof-of-stake currency does not rely on destroying massive resources
as it gains in popularity.
\subsubsection{Control}
Last but not least, the proof-of-work system puts the miners,
not the stakeholders, in charge. Forks for instance require the consent of a
majority of the miners. This poses a potential conflict of interest: a majority
of miners could decide to hold the blockchain hostage until stakeholders consent
to a protocol fork increasing the mining rewards; more generally, they will hold
onto the hugely wasteful system that empowers them longer than is economically
beneficial for users.
\subsection{Smart Contracts}
Though Bitcoin does allow for smart contracts, most of its opcodes
have been historically disabled and the possibilities are limited.
Ethereum introduced a smart contract system with some critical differences:
their scripting language is Turing complete and they substitute
stateful accounts for Bitcoin's unspent outputs.
While emphasis has been put on the Turing complete aspect of the language,
the second property is by far the most interesting and powerful of the two.
In Bitcoin, an output can be thought of as having only two states: spent and
unspent. In Ethereum, accounts (protected by a key) hold a balance, a contract
code and a data store. The state of an account's storage can be mutated
by making a transaction towards this account. The transaction specifies an
amount and the parameters passed to the contract code.
A downside of a Turing complete scripting language for the contracts
is that the number of steps needed to execute a script is potentially unbounded,
and determining it in advance is, in general, undecidable.
To address this problem, Ethereum has devised a system by which the miner
validating the transaction requires a fee proportional to the complexity
and number of steps needed to execute the contract.
Yet, for the blockchain to be secure, \emph{all} the active nodes need to
validate the transaction. A malicious miner could include in his block a
transaction that he crafted specially to run into an infinite loop and pay
himself an exorbitant fee for validating this transaction. Other miners could
waste a very long time validating this transaction. Worse, they could just
slack and fail to validate it. In practice though, most of the interesting
smart contracts can be implemented with very simple business logic and do not
need to perform complex calculations.
Our solution is to cap the maximum number of steps that a program is allowed to
run for in a single transaction. Since blocks have a size limit that caps the
number of transactions per block, there is also a cap on the number of
computation steps per block. This rate limitation foils CPU-usage
denial-of-service attacks. Meanwhile, legitimate users can issue multiple
transactions to compute more steps than allowed in a single transaction,
though at a limited rate. Miners may decide to exclude an overly long execution
if they feel the included fee is too small. Since the Tezos protocol is
amendable, the cap can be increased in future revisions and new cryptographic
primitives included in the scripting language as the need develops.
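The step cap can be illustrated with a minimal sketch in Python. The toy stack
machine, its opcode names, and the value of \verb!MAX_STEPS! are hypothetical
and chosen only for illustration; they are not the Tezos scripting language or
its actual limits.

```python
from typing import List, Tuple

MAX_STEPS = 1000  # hypothetical per-transaction cap, not a Tezos constant

Instr = Tuple[str, int]  # (opcode, argument)


class StepLimitExceeded(Exception):
    """Raised when a script runs for more steps than the cap allows."""


def run(program: List[Instr], max_steps: int = MAX_STEPS) -> List[int]:
    """Execute a toy stack program, aborting once max_steps is exceeded."""
    stack: List[int] = []
    pc = 0     # program counter
    steps = 0  # instructions executed so far
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise StepLimitExceeded(f"more than {max_steps} steps")
        op, arg = program[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "JMP":  # unconditional jump to instruction index arg
            pc = arg
            continue
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    return stack
```

With such a cap, a script that loops forever, such as a single \verb!JMP! to
itself, is rejected after a bounded amount of work, so the validation cost per
transaction is bounded regardless of what the script does.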
\subsection{Correctness}
Bitcoin underpins an \$8B valuation with a modest code base. As security
researcher Dan Kaminsky explains, Bitcoin looks like a security nightmare on
paper. A \verb!C++! code base with a custom binary protocol, powering nodes
connected to the Internet while holding e-cash, sounds like a recipe for
disaster. \verb!C++! programs are often riddled with memory corruption bugs.
When they are connected to the Internet, this creates vulnerabilities
exploitable by remote attackers. E-cash gives an immediate payoff to any
attacker clever enough to discover and exploit such a vulnerability.
Fortunately, Bitcoin's implementation has proven very resilient to attacks
thus far, with some exceptions. In August 2010, a bug where the sum of two
outputs overflowed to a negative number allowed attackers to create two
outputs of $92233720368.54$ coins from an input of $0.50$ coins.
More recently, massive vulnerabilities such as the Heartbleed bug
have been discovered in the OpenSSL libraries. These vulnerabilities have
one thing in common: they happened because languages like \verb!C! and
\verb!C++! do not perform any checks on the operations they perform. For the
sake of efficiency, they may access random parts of the memory, add integers
larger than natively supported, etc. While these vulnerabilities have spared
Bitcoin, they do not bode well for the security of the system.
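The overflow described above can be reproduced with a short Python sketch.
It is an illustration, not Bitcoin's code: Python integers do not overflow, so
the hypothetical helper \verb!wrap_int64! simulates two's-complement wraparound
of a signed 64-bit integer, and the output amounts use the rounded figure
quoted in the text, expressed in base units of $10^{-8}$ coins.

```python
INT64_MIN = -(2**63)
COIN = 10**8  # base units per coin


def wrap_int64(x: int) -> int:
    """Reduce x to the value a signed 64-bit integer would hold after wraparound."""
    return (x - INT64_MIN) % 2**64 + INT64_MIN


def unchecked_total(outputs):
    """Sum outputs with silent 64-bit wraparound, as unchecked arithmetic would."""
    total = 0
    for value in outputs:
        total = wrap_int64(total + value)
    return total


def checked_total(outputs):
    """Sum outputs, rejecting any total that does not fit in 64 bits."""
    total = sum(outputs)
    if total != wrap_int64(total):
        raise OverflowError("output total overflows 64 bits")
    return total


# Two outputs of 92233720368.54 coins each, spending an input of 0.50 coins.
outputs = [9223372036854 * 10**6] * 2
input_value = COIN // 2

# The wrapped sum is negative, so a naive "outputs <= inputs" check passes.
assert unchecked_total(outputs) < 0 <= input_value
```

The unchecked sum wraps to a small negative number, which is why a comparison
against the input value does not catch the problem; the checked variant
rejects the same transaction outright.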
Other languages do not exhibit those problems. OCaml is a functional programming
language developed by the INRIA since 1996 (and itself based on earlier
efforts). Its speed is comparable to that of \verb!C++! and it generally
features among the fastest programming languages in benchmarks\cite{shootout}.
More importantly, OCaml is strongly typed and offers a powerful type inference
system. Its expressive syntax and semantics, including powerful pattern matching
and higher-order modules, make it easy to concisely and correctly describe the
type of logic underpinning blockchain based protocols.
OCaml's semantics are fairly rigorous and a very large subset has been
formalized\cite{semantic}, which removes any ambiguity as to what is the
intended behavior of amendments.
In addition, Coq, one of the most advanced proof checkers,
is able to extract OCaml code from proofs. As Tezos matures, it will be
possible to automatically extract key parts of the protocol's code from
mathematical proofs of correctness.
Examples of spectacular software failure abound. The Heartbleed bug caused
millions of dollars in damages. In 2012, a single bug at high-frequency trading
firm Knight Capital caused nearly half a billion dollars worth of losses. In 1996, an
arithmetic overflow bug caused the crash of Ariane 5, a rocket that had cost
\$7B to develop; the cost of the rocket and the cargo was estimated at \$500M.
All of these bugs could have been prevented with the use of formal verification.
Formal verification has progressed by leaps and bounds in recent years;
it is time to use it in real systems.
\section{Abstract Blockchains}
Tezos attempts to represent a blockchain protocol in the most general way
possible while attempting to remain as efficient as a native protocol.
The goal of a blockchain is to represent a single state being concurrently
edited. In order to avoid conflicts between concurrent edits, it represents the
state as a ledger, that is as a series of transformations applied to an initial
state. These transformations are the ``blocks'' of the blockchain, and --- in
the case of Bitcoin --- the state is mostly the set of unspent outputs. Since
the blocks are created asynchronously by many concurrent nodes, a block tree is
formed. Each leaf of the tree represents a possible state and the end of a
different blockchain. Bitcoin specifies that only one branch should be
considered the valid branch: the one with the greatest total difficulty.
Blocks, as their name suggests, actually bundle together
multiple operations (known as transactions in the case of Bitcoin).
These operations are sequentially applied to the state.
\subsection{Three Protocols}
It is important to distinguish three protocols in cryptoledgers:
the network protocol, the transaction protocol, and the consensus protocol.
The role of the meta shell is to handle the network protocol
in as agnostic a way as possible while delegating the transaction and consensus
protocol to an abstracted implementation.
\subsubsection{Network Protocol}
The network protocol in Bitcoin is essentially the gossip network that allows
the broadcasting of transactions, the downloading and publishing of blocks,
the discovery of peers, etc. It is where most development occurs. For instance,
Bloom filters were introduced in 2012 through BIP0037 to speed up the simple
payment verification for clients which do not download the whole blockchain.
Changes to the network protocol are relatively uncontroversial. There
may be initial disagreements on the desirability of these changes, but all
parties' interests are fundamentally aligned overall.
These changes do not need to happen in concert either. One could devise a way to
integrate Bitcoin transactions steganographically into pictures of cats posted
on the Internet. If enough people started publishing transactions this way,
miners would start parsing cat pictures to find transactions to include in the
blockchain.
While a healthy network requires compatibility, competing innovation in the
network protocol generally strengthens a cryptocurrency.
\subsubsection{Transaction Protocol}
The transaction protocol describes what makes transactions valid. It is defined
in Bitcoin, for instance, through a scripting language. First, coins are created
by miners when they find a block. The miner then attaches a script to the coins
that he mined.
Such a script is known as an ``unspent output''. Transactions combine outputs
by providing arguments for which their scripts evaluate to true. These arguments
can be thought of as keys and the scripts as padlocks.
In simple transactions, such scripts are merely signature-checking scripts but
more complex scripts can be formed. These outputs are added up and allocated
among a set of new outputs. If the amount of output spent is greater than the
amount allocated, the difference can be claimed by the miner.
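The accounting described above can be sketched in a few lines of Python. This
is a toy model under stated assumptions: the \verb!Output! class, the
\verb!miner_fee! function, and the use of plain Python callables in place of
Bitcoin scripts are all hypothetical and only mirror the ``keys and padlocks''
picture.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Output:
    amount: int                    # value in base units
    script: Callable[[str], bool]  # the "padlock": True if the argument unlocks it


def miner_fee(spent: List[Output], keys: List[str], new_amounts: List[int]) -> int:
    """Validate a toy transaction and return the fee left for the miner."""
    if len(spent) != len(keys):
        raise ValueError("one argument is required per spent output")
    if any(amount < 0 for amount in new_amounts):
        raise ValueError("new outputs must be non-negative")
    for output, key in zip(spent, keys):
        if not output.script(key):
            raise ValueError("script evaluated to false")
    fee = sum(o.amount for o in spent) - sum(new_amounts)
    if fee < 0:
        raise ValueError("transaction allocates more than it spends")
    return fee
```

For example, spending outputs of 70 and 50 units into new outputs of 100 and 15
units leaves 5 units that the miner can claim.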
Changes to the transaction protocol are more controversial than changes to
the network protocol. While a small group of people could unilaterally start
using the cat-picture broadcast algorithm, changing the transaction protocol
is trickier. Such changes typically do not affect the block validity
and thus only require the cooperation of a majority of the miners.
These are generally referred to as ``soft forks''.
Some relatively uncontroversial changes still stand a chance to be implemented
there. For instance, a fix to the transaction malleability issue would be a
transaction protocol level change. The introduction of Zerocash, also a
transaction protocol level change, risks being too controversial to be
undertaken.
\subsubsection{Consensus Protocol}
The consensus protocol of Bitcoin describes the way consensus is built
around the most difficult chain and the miner reward schedules.
It allows miners to draw transactions from the coin base,
it dictates how difficulty changes over time,
it indicates which blocks are valid
and which are part of the ``official'' chain.
This is by far the most central and most difficult to change protocol,
often requiring a ``hard-fork'', that is a fork invalidating old blocks.
For instance, the proof-of-work system is part of the consensus protocol,
as is the reliance on SHA256 as its proof-of-work function.
\subsection{Network Shell}
Tezos separates those three protocols.
The transaction protocol and the consensus protocol
are implemented in an isolated module plugged
into a generic network shell responsible for maintaining the blockchain.
In order for the protocol to remain generic, we define the following interface.
We want our blockchain to represent the current ``state'' of the economy,
which we call in Tezos the \textbf{Context}.
This could include the balances of the various accounts
and other information such as the current block number.
Blocks are seen as operators that transform an old state into a new state.
In this respect, a protocol can be described by only two functions:
\begin{itemize}
\item[-] \textbf{apply} which takes a Context and a block and returns
either a valid Context or an invalid result (should the block be invalid)
\item[-] \textbf{score} which takes a Context and returns a score
allowing us to compare various leaves of the blockchain
to determine the canonical one.
In Bitcoin, we would simply record the total difficulty
of the chain inside the Context and return this value.
\end{itemize}
Strikingly, these two functions alone can implement \emph{any} blockchain based
crypto-ledger. In addition, we attach those functions to the context itself
and expose the following two functions to the protocol:
\begin{itemize}
\item[-] \textbf{set\_test\_protocol} which replaces the protocol used in the
test-net with a new protocol (typically one that has been adopted through a
stakeholder vote).
\item[-] \textbf{promote\_test\_protocol} which replaces the current protocol
with the protocol currently being tested.
\end{itemize}
These two procedures allow the protocol to validate its own replacement.
While the seed protocol relies on a simple super-majority rule with a quorum,
more complex rules can be adopted in the future.
For instance, the stakeholders could vote
to require certain properties to be respected by any future protocol.
This could be achieved by integrating a proof checker within the protocol
and requiring that every amendment include a proof of constitutionality.
\section{Proof-of-Stake}
Tezos can implement any type of blockchain algorithm:
proof-of-work, proof-of-stake, or even centralized.
Due to the shortcomings of the proof-of-work mechanism,
the Tezos seed protocol implements a proof-of-stake system.
There are considerable theoretical hurdles to designing a working
proof-of-stake system; we will explain our way of dealing with
them.\footnote{A full, technical, description of our proof-of-stake system is
given in the Tezos white paper.}
\subsection{Is Proof-of-Stake Impossible?}
There are very serious theoretical hurdles to any proof-of-stake system.
The main argument against the very possibility of a proof-of-stake system
is the following:
a new user downloads a client and connects for the first time to the network.
He receives a tree of blocks with two large branches
starting from the genesis hash.
Both branches display a thriving economic activity,
but they represent two fundamentally different histories.
One has clearly been crafted by an attacker, but which one is the real chain?
In the case of Bitcoin, the canonical blockchain is the one representing the
largest amount of work. This does not mean that rewriting history is impossible,
but it is costly to do so, especially as one's hashing power could be used
towards mining blocks on the real blockchain.
In a proof-of-stake system where blocks are signed by stakeholders,
a former stakeholder (who has since cashed out) could use his old signatures
to costlessly fork the blockchain
--- this is known as the nothing-at-stake problem.
\subsection{Mitigations}
While this theoretical objection seems ironclad, there are effective mitigations.
An important insight is to consider that there are roughly two kinds of forks:
very deep ones that rewrite a substantial fraction of the history
and short ones that attempt to double spend.
On the surface there is only a quantitative difference between the two
but in practice the incentives, motivations,
and mitigation strategies are different.
\subsubsection{Checkpoints}
Occasional checkpoints can be an effective way to prevent very long blockchain reorganizations.
Checkpoints are a hack. As Ben Laurie points out, Bitcoin's use of checkpoints
taints its status as a fully decentralized currency\cite{distrib_impossible}.
Yet, in practice, annual or even semi-annual checkpoints hardly seem problematic.
Forming a consensus over a single hash value over a period of months is
something that human institutions are perfectly capable of safely accomplishing.
This hash can be published in major newspapers around the world,
carved on the tables of freshmen students, spray painted under bridges,
included in songs, impressed on fresh concrete, tattooed on pet ferrets...
there are countless ways to record occasional checkpoints
in a way that makes forgery impossible.
In contrast, the problem of forming a consensus over a period of minutes
is more safely solved by a decentralized protocol.
\subsubsection{Statistical Detection}
Transactions can reference blocks belonging to the canonical blockchain,
thus implicitly signing the chain. An attacker attempting to forge a
long reorganization can only produce transactions involving coins he
controlled as of the last checkpoint. A long, legitimate, chain would
typically show activity in a larger fraction of the coins and can thus
be distinguished, statistically, from the forgery.
This family of techniques (often called TAPOS, for
``transactions as proof of stake'') does not work well for short forks where the sample
is too small to perform a reliable statistical test. However, it can be combined
with a technique dealing with short-term forks to form a composite selection
algorithm robust to both types of forks.
%% \paragraph{Cementing}
%% Cementing --- a technique which consists in refusing to
%% consider and relay blocks causing medium to large reorganizations ---
%% can be quite effective.
%% The main theoretical weakness of cementing is that
%% it prevents a node from ever converging to the right blockchain
%% if it first accepts the wrong fork.
%% However, this requires the ability to isolate a node on the network.
%% Given this ability, it is possible to trick the node into accepting
%% a transaction that will be double spent on the main chain ---
%% this is true of Bitcoin and almost all blockchain based systems.
%% Such attacks can generally be detected statistically.
%% If the attack is detected, it suffices to stop accepting payments and to deactivate cementing
%% until convergence with the main chain has been achieved.
%% In the case of a new node bootstrapping on the network,
%% the cementing can be activated once the user is convinced
%% that his client has found the main chain (either by waiting long enough
%% or by requesting hashes from a few trusted sources).
%% Note that this bootstrapping procedure does not involve any more trust
%% or centralization than is already involved
%% in the process of merely downloading the client.
\subsection{The Nothing-At-Stake Problem}
An interesting approach to solving the nothing-at-stake problem was
outlined by Vitalik Buterin in the algorithm Slasher\cite{Slasher}.
However, Slasher still relies on a proof-of-work mechanism to mine blocks
and assumes a bound on the length of feasible forks.
We retain the main idea which consists in punishing double signers.
If signing rewards are delayed, they can be withheld
if any attempt at double spending is detected.
This is enough to prevent a selfish stakeholder
from opportunistically attempting to sign a fork
for the sake of collecting a reward should the fork succeed.
However, once rewards have been paid,
this incentive to behave honestly disappears;
thus, we use a delay long enough for TAPOS to become
statistically significant or for checkpointing to take place.
In order to incentivize stakeholders to behave honestly,
we introduce a ticker system. A prospective miner must
burn a certain amount of coins in order to exercise his
mining right. This amount is automatically returned to
him if he fails to mine, or after a long delay.
In order to allow stakeholders not to be permanently connected
to the Internet and not to expose private keys, a different
signature key is used.
\subsection{Threat Models}
No system is unconditionally safe, not Bitcoin, not even public key
cryptography. Systems are designed to be safe for a given \emph{threat model}. How well
that model captures reality is, \emph{in fine}, an empirical question.
Bitcoin does offer an interesting guarantee: it attempts to tolerate amoral
but selfish participants. As long as miners do not collude, it is not necessary
to assume that any participant is honest, merely that they prefer making money
to destroying the network. However, non-collusion, a key condition, is too
often forgotten, and the claim of Bitcoin's
``trustlessness'' is zealously repeated without much thought.
With checkpointing (be it yearly), the same properties can be achieved by
a proof-of-stake system.
Without checkpointing, proof-of-stake systems cannot make this claim. Indeed,
it would be theoretically possible for an attacker to purchase old keys from
a large number of former stakeholders, with no consequence to them. In this case,
a stronger assumption is needed about participants, namely that a majority of current or
former stakeholders cannot be cheaply corrupted into participating in an
attack on the network. In this case, the role of ``stake'' in the proof-of-stake is
only to avoid adverse selection by malicious actors in the consensus group.
\section{Potential Developments}
In this section, we explore some ideas
that we're specifically interested in integrating into the Tezos protocol.
\subsection{Privacy Preserving Transactions}
One of the most pressing protocol updates will be
the introduction of privacy preserving transactions.
We know of two ways to achieve this:
ring signatures and non-interactive zero-knowledge proofs of knowledge
(NIZKPK).
\subsubsection{Ring Signatures}
CryptoNote has built a protocol using ring signatures to preserve privacy.
Users are able to spend coins
without revealing which of $N$ addresses spent the coins.
Double spenders are revealed and the transaction deemed invalid.
This works similarly to the coin-join protocol
\emph{without} requiring the cooperation of the addresses involved in
obfuscating the transaction.
One of the main advantages of ring signatures is that they are
simpler to implement than NIZKPK and rely on more mature cryptographic
primitives which have stood the test of time.
\subsubsection{Non-Interactive Zero-Knowledge Proofs of Knowledge}
Matthew Green et al. proposed the use of NIZKPK
to achieve transaction untraceability in a blockchain based cryptocurrency.
The latest proposition, Zerocash, maintains
a set of coins with attached secrets in a Merkle tree.
Committed coins are redeemed by providing a NIZKPK
of the secret attached to a coin in the tree.
It uses a relatively new primitive, SNARKs,
to build very small proofs which can be efficiently checked.
This technique is attractive but suffers from two drawbacks.
First, the cryptographic primitives involved are fairly new
and have not been scrutinized as heavily
as the relatively simple elliptic curve cryptography involved in Bitcoin.
Second, the construction of these proofs relies on the common reference string (CRS) model.
This effectively means that a trusted setup is required,
though the use of secure multi-party computation
can reduce the risk that such a setup be compromised.
\subsection{Amendment Rules}
\subsubsection{Constitutionalism}
While this is more advanced, it is possible to integrate a proof checker
within the protocol so that only amendments carrying a formal proof that
they respect particular properties can be adopted. In effect this enforces
a form of constitutionality.
\subsubsection{Futarchy}
Robin Hanson has proposed that we vote on values and bet on beliefs.
He calls such a system ``Futarchy''\cite{Futarchy}. The main idea
is that values are best captured by a majoritarian consensus while the choice
of policies conducive to realizing those values is best left to a prediction
market.
This system can quite literally be implemented in Tezos. Stakeholders would
first vote on a trusted datafeed representing the satisfaction of a value.
This might be for example the exchange rate of coins against a basket
of international currencies. An internal prediction market would be formed
to estimate the change in this indicator conditional on various code
amendments being adopted. The market-making in those contracts can be
subsidized by issuing coins to market makers in order to improve price discovery
and liquidity. In the end, the amendment deemed most likely to improve the
indicator would be automatically adopted.
\subsection{Solving Collective Action Problems}
The collective action problem arises when multiple parties would benefit from
taking an action but none benefit from individually undertaking the action.
This is also known as the free-rider problem.
There are several actions that the holders of a cryptocurrency could undertake
to raise its profile or defend it against legal challenges.
\subsubsection{Raising Awareness}
As of July 2014, the market capitalization of Bitcoin was around \$8B.
By spending about 0.05\% of the Bitcoin monetary mass every month,
Bitcoin could make highly visible
charitable donations of \$1M \emph{every single week}.
Would, as of 2014, an entire year of weekly charitable donations
raise the value of Bitcoin by more than 0.6\%?
We think the answer is clearly, and resoundingly ``yes''.
Bitcoin stakeholders would be doing well while doing good.
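The arithmetic behind these figures can be checked directly. Taking the \$8B
capitalization quoted above and assuming a 52-week year:
$$0.05\% \times \$8\mbox{B} = \$4\mbox{M per month},
\qquad
\frac{12 \times \$4\mbox{M}}{52} \approx \$0.92\mbox{M per week},$$
$$12 \times 0.05\% = 0.6\% \mbox{ of the monetary mass per year}.$$
The weekly figure is thus roughly \$1M, and the donations pay for themselves
if they raise the value of Bitcoin by more than the 0.6\% spent over the year.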
However, Bitcoin stakeholders are unable to undertake such an operation
because of the difficulty of forming large binding contracts. This type
of collective action problem is solved in Tezos.
A protocol amendment can set up a procedure by which
stakeholders may vote every month on a few addresses
where 0.05\% of the monetary mass would be spent.
The stakeholders' consensus might be to avoid dilution
by voting on an invalid address,
but it could also be that the money would be better spent as a charitable gift.
\subsubsection{Funding Innovation}
Financing of innovation would also be facilitated
by incorporating bounties directly within the protocol.
A protocol could define unit tests and automatically award a reward
to any code proposal that passes these tests.
Conversely, an innovator designing a new protocol
could include a reward to himself within the protocol.
While his protocol could be copied and the reward stripped,
the stakeholders' consensus would likely be to reward the original creator.
Stakeholders are playing a repeated game
and it would be foolish to defect by refusing a reasonable reward.
\section*{Conclusion}
We've presented issues with the existing cryptocurrencies
and offered Tezos as a solution.
While the irony of preventing the fragmentation of cryptocurrencies
by releasing a new one does not escape us,%\cite{xkcd_standards}
Tezos truly aims to be the \emph{last} cryptocurrency.
No matter what innovations other protocols produce,
it will be possible for Tezos stakeholders to adopt these innovations.
Furthermore, the ability to solve collective action problems
and easily implement protocols in OCaml will make Tezos one of the most reactive cryptocurrencies.
\bibliographystyle{unsrt}
\bibliography{biblio}
\end{document}

@ -0,0 +1,31 @@
\begin{thebibliography}{1}
\bibitem{Slasher}
Vitalik Buterin.
\newblock Slasher: A punitive proof-of-stake algorithm.
\newblock
{\url{https://blog.ethereum.org/2014/01/15/slasher-a-punitive-proof-of-stake-algorithm/}},
2014.
\bibitem{CoA}
Iddo~Bentov, Ariel~Gabizon, and Alex~Mizrahi.
\newblock Cryptocurrencies without proof of work.
\newblock {\url{http://www.cs.technion.ac.il/~idddo/CoA.pdf}}, 2014.
\bibitem{Nomic}
Peter Suber.
\newblock Nomic: A game of self-amendment.
\newblock {\url{http://legacy.earlham.edu/~peters/writing/nomic.htm}}, 1982.
\bibitem{LWT}
J\'er\^ome Vouillon.
\newblock Lwt: a cooperative thread library.
\newblock 2008.
\bibitem{language}
Tezos project.
\newblock Formal specification of the tezos smart contract language.
\newblock {\url{http://www.tezos.com/language.txt}}, 2014.
\end{thebibliography}

@ -0,0 +1,817 @@
%%% COMPILE WITH XELATEX, NOT PDFLATEX
\documentclass[letterpaper]{article}
\author{L.M. Goodman}
\date{September 2, 2014}
\title{Tezos --- a self-amending crypto-ledger \\ White paper}
%\usepackage[utf8]{inputenc}
%%\setlength{\parskip}{\baselineskip}
\usepackage{amsfonts}
\usepackage{listings}
\usepackage{color}
\usepackage{courier}
\usepackage{epigraph}
\usepackage{fontspec}
\usepackage{newunicodechar}
\usepackage{graphicx}
\usepackage{siunitx}
\usepackage{url}
\usepackage[hidelinks]{hyperref}
%\epigraphfontsize{\small\itshape}
\setlength\epigraphwidth{4.6cm}
\setlength\epigraphrule{0pt}
%\DeclareUnicodeCharacter{42793}{\tz{}}
%\newunicodechar{}{\anchor}
%
\usepackage{url}
\lstset{basicstyle=\footnotesize\ttfamily,breaklines=true}
\newcommand{\tz}{{\fontspec{DejaVu Sans} \small{ꜩ}}}
\begin{document}
\maketitle
\epigraph{\emph{``Our argument is not flatly circular,
but something like it.''}}
{--- \textup{Willard van Orman Quine}}
\begin{abstract}
We present Tezos, a generic and self-amending crypto-ledger. Tezos can
instantiate any blockchain based ledger. The operations of a regular blockchain
are implemented as a purely functional module abstracted into a shell
responsible for network operations. Bitcoin, Ethereum, Cryptonote, etc. can all
be represented within Tezos by implementing the proper interface to the network
layer.
Most importantly, Tezos supports meta upgrades: the protocols can evolve by
amending their own code. To achieve this, Tezos begins with a seed protocol
defining a procedure for stakeholders to approve amendments to the protocol,
\emph{including} amendments to the voting procedure itself. This is not unlike
philosopher Peter Suber's Nomic\cite{Nomic}, a game built around a fully
introspective set of rules.
In addition, Tezos's seed protocol is based on a pure proof-of-stake system
and supports Turing-complete smart contracts. Tezos is implemented in OCaml,
a powerful functional programming language offering speed, unambiguous
syntax and semantics, and an ecosystem making Tezos a good candidate for formal
proofs of correctness.
Familiarity with the Bitcoin protocol and basic cryptographic primitives is
assumed in the rest of this paper.
\end{abstract}
\newpage
\tableofcontents
\newpage
\section{Introduction}
In the first part of this paper, we will discuss the concept of abstract
blockchains and the implementation of a self-amending crypto-ledger.
In the second part, we will describe our proposed seed protocol.
\section{Self-amending cryptoledger}
A blockchain protocol can be decomposed into three distinct protocols:
\begin{itemize}
\item[-] The network protocol discovers blocks and broadcasts transactions.
\item[-] The transaction protocol specifies what makes a transaction valid.
\item[-] The consensus protocol forms consensus around a unique chain.
\end{itemize}
Tezos implements a generic network shell. This shell is agnostic to the
transaction protocol and to the consensus protocol. We refer to the transaction
protocol and the consensus protocol together as a ``blockchain protocol''. We
will first give a mathematical representation of a blockchain protocol and then
describe some of the implementation choices
in Tezos.
\subsection{Mathematical representation}
A blockchain protocol is fundamentally a monadic implementation of concurrent
mutations of a global state. This is achieved by defining ``blocks'' as
operators acting on this global state. The free monoid of blocks acting on the
genesis state forms a tree structure. A global, canonical, state is defined as
the maximal leaf for a specified ordering.
This suggests the following abstract representation:
\begin{itemize}
\item[-]Let $(\mathbf{S},\leq)$ be a totally ordered, countable, set of possible
states.
\item[-]Let $\oslash \notin \mathbf{S}$ represent a special, invalid, state.
\item[-]Let $\mathbf{B} \subset \mathbf{S}^{\mathbf{S} \cup \{\oslash\}}$ be the
set of blocks. The set of \emph{valid} blocks is
$\mathbf{B} \cap \mathbf{S}^{\mathbf{S}}$.
\end{itemize}
The total order on $\mathbf{S}$ is extended so that
$\forall s \in \mathbf{S}, \oslash < s$.
This order determines which leaf in the block tree is considered to be the
canonical one. Blocks in $\mathbf{B}$ are seen as operators acting on the state.
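As a toy illustration of this model, the following OCaml sketch represents
states as integers and the invalid state $\oslash$ as \textbf{None}. The names
\textbf{run}, \textbf{compare\_state}, \textbf{canonical}, \textbf{add} and
\textbf{require\_at\_least} are ours and purely illustrative; the sketch
selects the greatest leaf, consistent with $\oslash$ ranking below every valid
state.
\begin{lstlisting}
(* Toy model: a state is an integer; None plays the role of the
   invalid state. *)
type state = int
type block = state -> state option

(* Apply a chain of blocks to a starting state; once the state is
   invalid it stays invalid. *)
let run (genesis : state) (chain : block list) : state option =
  List.fold_left
    (fun acc b -> match acc with None -> None | Some s -> b s)
    (Some genesis) chain

(* Extended order: the invalid state ranks below every valid state. *)
let compare_state (a : state option) (b : state option) : int =
  match a, b with
  | None, None -> 0
  | None, Some _ -> -1
  | Some _, None -> 1
  | Some x, Some y -> compare x y

(* The canonical leaf is the greatest one under the extended order. *)
let canonical (leaves : state option list) : state option =
  List.fold_left
    (fun best l -> if compare_state l best > 0 then l else best)
    None leaves

(* Hypothetical blocks: add n to the state, or reject small states. *)
let add n : block = fun s -> Some (s + n)
let require_at_least n : block =
  fun s -> if s >= n then Some s else None

let () =
  let a = run 0 [add 1; add 2] in
  let b = run 0 [add 1; require_at_least 5; add 10] in
  let c = run 0 [add 4] in
  (* b is invalid, so the choice is between a and c. *)
  assert (canonical [a; b; c] = c)
\end{lstlisting}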
All in all, any blockchain protocol\footnote{GHOST is an approach which orders
the leaves based on properties of the tree. Such an approach is problematic for
both theoretical and practical reasons. It is almost always better to emulate it
by inserting proofs of mining in the main chain.} (be it Bitcoin, Litecoin,
Peercoin, Ethereum, Cryptonote, etc.) can be fully determined by the tuple:
$$\left(\mathbf{S},\leq,\oslash,
\mathbf{B} \subset \mathbf{S}^{\mathbf{S} \cup \{\oslash\}}\right)$$
The networking protocol is fundamentally identical for these blockchains.
``Mining'' algorithms are but an emergent property of the network,
given the incentives for block creation.
In Tezos, we make a blockchain protocol introspective
by letting blocks act on the protocol itself.
We can then express the set of protocols recursively as
$$\mathcal{P} = \left\{\left(\mathbf{S},\leq,\oslash,\mathbf{B} \subset
\mathbf{S}^{(\mathbf{S} \times \mathcal{P})\cup \{\oslash\}} \right)\right\}$$
\subsection{The network shell}
This formal mathematical description doesn't tell us \emph{how} to build the
block tree. This is the role of the network shell, which acts as an interface
between a gossip network and the protocol.
The network shell works by maintaining the best chain known to the client. It is
aware of three types of objects. The first two are transactions and blocks, which
are only propagated through the network if deemed valid. The third are
protocols, OCaml modules used to amend the existing protocol. They will be
described in more detail later on. For now we will focus on transactions and
blocks.
The most arduous part of the network shell is to protect nodes against
denial-of-service attacks.
\subsubsection{Clock}
Every block carries a timestamp visible to the network shell. Blocks that appear
to come from the future are buffered if their timestamps are within a few
minutes of the system time and rejected otherwise. The protocol design must
tolerate reasonable clock drifts in the clients and must assume that timestamps
can be falsified.
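A minimal sketch of this timestamp check follows. The five-minute tolerance is
an assumption (the text only says ``a few minutes''), and \textbf{verdict},
\textbf{max\_future\_drift} and \textbf{classify} are hypothetical names, not
part of the shell's interface.
\begin{lstlisting}
type verdict = Accept | Buffer | Reject

(* Assumed tolerance, in seconds. *)
let max_future_drift = 300.0

(* Accept only means that the block passes the clock check; it may
   still be rejected by the protocol. *)
let classify ~(now : float) ~(timestamp : float) : verdict =
  if timestamp <= now then Accept
  else if timestamp -. now <= max_future_drift then Buffer
  else Reject

let () =
  assert (classify ~now:1000.0 ~timestamp:990.0 = Accept);
  assert (classify ~now:1000.0 ~timestamp:1120.0 = Buffer);
  assert (classify ~now:1000.0 ~timestamp:5000.0 = Reject)
\end{lstlisting}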
\subsubsection{Chain selection algorithm}
The shell maintains a single chain rather than a full tree of blocks. This chain
is only overwritten if the client becomes aware of a strictly better chain.
Maintaining a tree would be more parsimonious in terms of network communications
but would be susceptible to denial-of-service attacks where an attacker produces
a large number of low-scoring but valid forks.
Yet, it remains possible for a node to lie about the score of a given
chain, a lie that the client may only uncover after having processed a
potentially large number of blocks. However, such a node can be subsequently
ignored.
Fortunately, a protocol can have the property that low scoring chains exhibit a
low rate of block creation. Thus, the client would only consider a few blocks of
a ``weak'' fork before concluding that the announced score was a lie.
\subsubsection{Network level defense}
In addition, the shell is ``defensive''.
It attempts to connect to many peers across various IP ranges. It detects
disconnected peers and bans malicious nodes.
To protect against certain denial-of-service attacks, the protocol provides the
shell with context-dependent bounds on the size of blocks and transactions.
\subsection{Functional representation}
\subsubsection{Validating the chain}
We can efficiently capture almost all the genericity
of our abstract blockchain structure with the following OCaml types.
To begin with, a block header is defined as:
\lstset{
language=[Objective]Caml
}
\begin{lstlisting}
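type raw_block_header = {
  pred: Block_hash.t;                  (* hash of the preceding block *)
  header: Bytes.t;                     (* opaque, protocol-specific content *)
  operations: Operation_hash.t list;   (* hashes of the included operations *)
  timestamp: float;                    (* visible to the network shell *)
}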
\end{lstlisting}
We are purposefully not typing the header field more strongly so it can
represent arbitrary content. However, we do type the fields necessary for the
operation of the shell. These include the hash of the preceding block, a list of
operation hashes and a timestamp. In practice, the operations included in a
block are transmitted along with the blocks at the network level. Operations
themselves are represented as arbitrary blobs.
\begin{lstlisting}
type raw_operation = Bytes.t
\end{lstlisting}
The state is represented with the help of a \textbf{Context} module which
encapsulates a disk-based immutable key-value store. The structure of a
key-value store is versatile and allows us to efficiently represent a wide
variety of states.
\begin{lstlisting}
module Context : sig
  type t
  type key = string list
  (* get returns None when the key is unbound. *)
  val get: t -> key -> Bytes.t option Lwt.t
  (* set and del return a new context. *)
  val set: t -> key -> Bytes.t -> t Lwt.t
  val del: t -> key -> t Lwt.t
  (*...*)
end
\end{lstlisting}
To avoid blocking on disk operations, the functions use the asynchronous monad
Lwt\cite{LWT}. Note that the operations on the context are purely functional:
\textbf{get} uses the \textbf{option} monad rather than throwing an exception
while \textbf{set} and \textbf{del} both return a new \textbf{Context}.
The \textbf{Context} module uses a combination of memory caching and disk
storage to efficiently provide the appearance of an immutable store.
We can now define the module type of an arbitrary blockchain protocol:
\begin{lstlisting}
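type score = Bytes.t list

module type PROTOCOL = sig
  type block_header   (* protocol-specific, fully typed header *)
  type operation      (* protocol-specific, fully typed operation *)

  (* Both parsers return None on malformed input. *)
  val parse_block_header : raw_block_header -> block_header option
  val parse_operation : Bytes.t -> operation option

  (* Returns None if the block is invalid. *)
  val apply :
    Context.t ->
    block_header option ->
    (Operation_hash.t * operation) list ->
    Context.t option Lwt.t

  val score : Context.t -> score Lwt.t
  (*...*)
end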
\end{lstlisting}
We no longer compare states directly as in the mathematical model; instead we
project the \textbf{Context} onto a list of bytes using the \textbf{score}
function. Lists of bytes are ordered first by length, then by
lexicographic order. This is a fairly generic structure, similar to the one used
in software versioning, which is quite versatile in representing various
orderings.
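One possible reading of this ordering is sketched below: scores are compared
by the number of elements in the list first, then element by element using the
lexicographic order of \textbf{Bytes.compare}. How individual elements are
compared is our assumption, and \textbf{compare\_score} is a hypothetical
name.
\begin{lstlisting}
type score = Bytes.t list

let compare_score (a : score) (b : score) : int =
  (* Shorter lists always rank below longer ones. *)
  let c = compare (List.length a) (List.length b) in
  if c <> 0 then c
  else
    (* Same length: the first differing element decides. *)
    let rec lex xs ys =
      match xs, ys with
      | [], [] -> 0
      | x :: xt, y :: yt ->
        let c = Bytes.compare x y in
        if c <> 0 then c else lex xt yt
      | _ -> assert false  (* unreachable: lengths are equal *)
    in
    lex a b

let () =
  let s l = List.map Bytes.of_string l in
  (* A one-element score ranks below any two-element score. *)
  assert (compare_score (s ["z"]) (s ["a"; "b"]) < 0);
  (* Equal lengths: lexicographic comparison. *)
  assert (compare_score (s ["a"; "b"]) (s ["a"; "c"]) < 0);
  assert (compare_score (s ["a"]) (s ["a"]) = 0)
\end{lstlisting}
Because the result depends only on the two lists, this comparison is a total
preorder regardless of which protocol produced each score.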
Why not define a comparison function within the protocol modules? First, it
would be hard to enforce the requirement that such a function represent a
\emph{total} order. The score projection always verifies this (ties can be
broken based on the hash of the last block). Second, in principle we need
the ability to compare states across distinct protocols. Specific protocol
amendment rules are likely to make this extremely unlikely to ever happen,
but the network shell does not know that.
The operations \textbf{parse\_block\_header} and \textbf{parse\_operation} are
exposed to the shell and allow it to pass fully typed operations and blocks to
the protocol but also to check whether these operations and blocks are
well-formed, before deciding to relay operations or to add blocks to the local
block tree database.
The apply function is the heart of the protocol:
\begin{itemize}
\item[-]When it is passed a block header and the associated list of operations,
it computes the changes made to the context and returns a modified copy.
Internally, only the difference is stored, as in a versioning system,
using the block's hash as a version handle.
\item[-]When it is only passed a list of operations, it greedily attempts
to apply as many operations as possible. This function is not necessary for the
protocol itself but is of great use to miners attempting to form valid blocks.
\end{itemize}
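The two modes of \textbf{apply} can be illustrated on a toy ledger. The sketch
below is a deliberate simplification and not the Tezos implementation: it is
synchronous (no Lwt), ignores block headers, keeps balances in an in-memory
map, and treats ``greedy'' as an in-order pass that skips failing operations.
All names (\textbf{Ctx}, \textbf{apply\_op}, \textbf{apply\_block},
\textbf{apply\_greedy}) are hypothetical.
\begin{lstlisting}
module Ctx = Map.Make (String)

type context = int Ctx.t   (* account name -> balance *)
type operation = { src : string; dst : string; amount : int }

let balance (ctx : context) (account : string) : int =
  Option.value (Ctx.find_opt account ctx) ~default:0

(* Apply one transfer; None if the amount is negative or would
   overdraw the source account. *)
let apply_op (ctx : context) (op : operation) : context option =
  if op.amount < 0 || balance ctx op.src < op.amount then None
  else
    let ctx = Ctx.add op.src (balance ctx op.src - op.amount) ctx in
    Some (Ctx.add op.dst (balance ctx op.dst + op.amount) ctx)

(* Block mode: every operation must apply, else the block is invalid. *)
let apply_block (ctx : context) (ops : operation list) : context option =
  List.fold_left
    (fun acc op -> Option.bind acc (fun c -> apply_op c op))
    (Some ctx) ops

(* Greedy mode: keep every operation that applies, skip the others. *)
let apply_greedy (ctx : context) (ops : operation list)
  : context * operation list =
  let final, kept =
    List.fold_left
      (fun (c, kept) op ->
         match apply_op c op with
         | Some next -> (next, op :: kept)
         | None -> (c, kept))
      (ctx, []) ops
  in
  (final, List.rev kept)

let () =
  let ctx = Ctx.add "alice" 10 Ctx.empty in
  let t1 = { src = "alice"; dst = "bob"; amount = 7 } in
  let t2 = { src = "alice"; dst = "carol"; amount = 7 } in
  (* As a block, t1 followed by t2 overdraws alice. *)
  assert (Option.is_none (apply_block ctx [t1; t2]));
  (* Greedily, t1 is kept and t2 is skipped. *)
  let final, kept = apply_greedy ctx [t1; t2] in
  assert (kept = [t1] && balance final "alice" = 3)
\end{lstlisting}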
\subsubsection{Amending the protocol}
Tezos's most powerful feature is its ability to implement protocols capable
of self-amendment. This is achieved by exposing two functions to the
protocol:
\begin{itemize}
\item[-] \textbf{set\_test\_protocol} which replaces the protocol
used in the testnet with a new protocol (typically one that has been adopted
through a stakeholder vote).
\item[-] \textbf{promote\_test\_protocol} which replaces the current
protocol with the protocol currently being tested.
\end{itemize}
These functions transform a Context by changing the associated protocol.
The new protocol takes effect when the following block is applied to the chain.
\begin{lstlisting}
module Context : sig
  type t
  (*...*)
  (* Replace the protocol used on the testnet. *)
  val set_test_protocol: t -> Protocol_hash.t -> t Lwt.t
  (* Make the protocol under test the current protocol. *)
  val promote_test_protocol: t -> t Lwt.t
end
\end{lstlisting}
The \textbf{protocol\_hash} is the \textbf{sha256} hash of a tarball of
\textbf{.ml} and \textbf{.mli} files. These files are compiled on the
fly. They have access to a small standard library but are sandboxed
and may not make any system call.
These functions are called through the \textbf{apply} function of the protocol
which returns the new \textbf{Context}.
Many conditions can trigger a change of protocol. In its simplest version,
a stakeholder vote triggers a change of protocol. More complicated rules
can be progressively voted in. For instance, if the stakeholders so desire, they
may pass an amendment that will require further amendments to provide a
computer-checkable proof that the new amendment respects certain properties.
This is effectively an algorithmic check of ``constitutionality''.
\subsubsection{RPC}
To make building a GUI easier, the protocol exposes a JSON-RPC
API. The API itself is described by a JSON schema indicating the types of the
various procedures. Typically, functions such as \textbf{get\_balance} can
be implemented in the RPC.
\begin{lstlisting}
type service = {
name : string list ;
input : json_schema option ;
output : json_schema option ;
implementation : Context.t -> json -> json option Lwt.t
}
\end{lstlisting}
The name is a list of strings to allow namespaces in the procedures. Input and
output are optionally described by a JSON schema.
Note that the call is made on a given context which is typically a recent ancestor
of the highest scoring leaf. For instance, querying the context six blocks above
the highest scoring leaf displays the state of the ledger with six confirmations.
The UI itself can be tailored to a specific version of the protocol, or generically
derived from the JSON specification.
\section{Seed protocol}
Much like blockchains start from a genesis hash, Tezos starts with a seed
protocol. This protocol can be amended to reflect virtually any blockchain based
algorithm.
\subsection{Economy}
\subsubsection{Coins}
There are initially $\num{10000000000}$ (ten billion) coins, divisible up to two
decimal places. We suggest that a single coin be referred to as a ``tez''
and the smallest unit simply as a cent. We also suggest using the
symbol \tz~(\verb!\ua729!, ``Latin small letter tz'') to represent a tez.
Therefore 1 cent = \tz\num{0.01} = one hundredth of a tez.
\subsubsection{Mining and signing rewards}
\paragraph{Principle}
We conjecture that the security of any decentralised currency requires
incentivizing the participants with a pecuniary reward. As explained in the
position paper, relying on transaction costs alone suffers from a tragedy of the
commons. In Tezos, we rely on the combination of a bond and a reward.
Bonds are one year security deposits purchased by miners.
In the event of a double signing, these bonds are forfeited.
After a year, the miners receive a reward along with their bond to compensate
for their opportunity cost. The security is primarily being provided by the
value of the bond and the reward need only be a small percentage of that value.
The purpose of the bonds is to diminish the amount of reward needed, and perhaps
to use the loss aversion effect to the network's advantage.
\paragraph{Specifics}
In the seed protocol, mining a block offers a reward of \tz\num{512} and
requires a bond of \tz\num{1536}. Signing a block offers a reward of
$32\Delta T^{-1}$ tez where $\Delta T$ is the time interval in minutes between
the block being signed and its predecessor. There are up to 16 signatures per block
and signing requires no bond.
Thus, assuming a mining rate of one block per minute, about 8\% of the initial
money mass should be held in the form of safety bonds after the first year.
The reward schedule implies at most a 5.4\% \emph{nominal} inflation
rate. \emph{Nominal} inflation is neutral: it neither enriches nor
impoverishes anyone\footnote{In contrast, Bitcoin's mining inflation impoverishes
Bitcoin holders as a whole, and central banking enriches the financial
sector at the expense of savers.}.
Note that the period of a year is determined from the block's timestamps, not
from the number of blocks. This is to remove uncertainty as to the length of
the commitment made by miners.
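
As a rough check of these figures, assume exactly one block per minute, a
365-day year (that is, \num{525600} blocks) and sixteen signatures per block,
each earning the full \tz\num{32} reward ($\Delta T = 1$). These assumptions are
ours and only serve to reproduce the orders of magnitude. Since bonds are
returned after a year, the bonds outstanding after the first year are about one
year's worth:
$$\frac{\num{1536} \times \num{525600}}{10^{10}} = \frac{\num{807321600}}{10^{10}} \approx 8.1\%.$$
Likewise, the rewards paid out over a year are at most
$$\frac{(512 + 16 \times 32) \times \num{525600}}{10^{10}} = \frac{\num{538214400}}{10^{10}} \approx 5.4\%$$
of the initial money mass.
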
\paragraph{Looking forward}
The proposed reward gives miners a 33\% return on their bond ($512 / 1536 \approx 33\%$).
This return needs to be high in the early days as miners and signers commit
to hold a potentially volatile asset for an entire year.
However, as Tezos matures, this return could be gradually lowered to the
prevailing interest rate. A nominal rate of inflation below 1\% could safely be
achieved, though it's not clear there would be any point in doing so.
\subsubsection{Lost coins}
In order to reduce uncertainty regarding the monetary mass, addresses
showing no activity for over a year (as determined by timestamps)
are destroyed along with the coins they contain.
\subsubsection{Amendment rules}
Amendments are adopted over election cycles lasting $N = 2^{17} = \num{131072}$
blocks each. Given a one-minute block interval, this is about three
calendar months. The election cycle is itself divided into four quarters of
$2^{15} = \num{32768}$ blocks. This cycle is relatively short to encourage early
improvements, but it is expected that further amendments will increase the
length of the cycle. Adoption requires a certain quorum to be met. This quorum
starts at $Q = 80\%$ but dynamically adapts to reflect the average
participation. This is necessary if only to deal with lost coins.
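
For reference, with one block per minute the durations work out as
$$\frac{2^{17}}{60 \times 24} = \frac{\num{131072}}{\num{1440}} \approx 91.0 \mbox{ days}, \qquad \frac{2^{15}}{60 \times 24} = \frac{\num{32768}}{\num{1440}} \approx 22.8 \mbox{ days},$$
so that each quarter of the election cycle lasts a little over three weeks.
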
\paragraph{First quarter}
Protocol amendments are suggested by submitting the hash of a tarball of
\verb!.ml! and \verb!.mli! files representing a new protocol. Stakeholders may
approve of any number of these protocols. This is known as ``approval voting'',
a particularly robust voting procedure.
\paragraph{Second quarter}
The amendment receiving the most approval in the first quarter is now subject to
a vote. Stakeholders may vote for or against the amendment, or choose to explicitly
abstain. Abstentions count towards the quorum.
\paragraph{Third quarter} If the quorum is met (including explicit abstentions),
and the amendment received $80\%$ of yays, the amendment is approved and
replaces the test protocol. Otherwise, it is rejected.
Assuming the quorum reached was $q$, the minimum quorum $Q$ is updated as such:
$$Q \leftarrow 0.8 Q + 0.2 q.$$
The goal of this update is to avoid lost coins causing the voting procedure to
become stuck over time. The minimum quorum is an exponential moving average of
the quorum reached over each previous election.
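
The approval rule and the quorum update can be sketched as follows. This is a
minimal illustration and not the reference implementation: the record
\verb!tally!, the use of floating point numbers, and the choice of computing
the $80\%$ supermajority over yays and nays only (excluding abstentions) are
assumptions of ours.
\begin{lstlisting}
(* Hypothetical tally of one vote, measured in stake. *)
type tally = {
  yays : float ;
  nays : float ;
  abstentions : float ;
  total_stake : float ;
}

(* Fraction of the stake that took part, abstentions included. *)
let participation t =
  (t.yays +. t.nays +. t.abstentions) /. t.total_stake

(* Quorum met and 80% of yays among yays and nays (assumption). *)
let approved ~min_quorum t =
  participation t >= min_quorum
  && t.yays >= 0.8 *. (t.yays +. t.nays)

(* Q <- 0.8 Q + 0.2 q *)
let update_quorum ~min_quorum ~reached =
  0.8 *. min_quorum +. 0.2 *. reached
\end{lstlisting}
For instance, if the minimum quorum is $80\%$ and only $60\%$ of the stake
participates, the next minimum quorum is $0.8 \times 0.8 + 0.2 \times 0.6 = 76\%$.
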
\paragraph{Fourth quarter} Assuming the amendment was approved, it will have
been running in the testnet since the beginning of the third quarter.
The stakeholders vote a second time to confirm they wish to promote the test
protocol to the main protocol. This also requires the quorum to be met and an
$80\%$ supermajority.
We deliberately chose a conservative approach to amendments. However,
stakeholders are free to adopt amendments loosening or tightening this policy
should they deem it beneficial.
\subsection{Proof-of-stake mechanism}
\subsubsection{Overview}
Our proof-of-stake mechanism is a mix of several ideas, including
Slasher\cite{Slasher}, chain-of-activity\cite{CoA}, and proof-of-burn.
The following is a brief overview of the algorithm, the components of which
are explained in more detail below.
Each block is mined by a random stakeholder (the miner) and includes
multiple signatures of the previous block provided by random
stakeholders (the signers). Mining and signing both offer a small reward but
also require making a one-year safety deposit to be
forfeited in the event of a double mining or double signing.
The protocol unfolds in cycles of \num{2048} blocks. At the beginning of each
cycle, a random seed is derived from numbers that block miners chose and committed
to in the penultimate cycle, and revealed in the last. Using this random seed,
a follow-the-coin strategy is used to allocate mining rights and signing rights
to specific addresses for the next cycle. See Figure~\ref{fig:pos_figure}.
\begin{figure}[b!]
\centering
\includegraphics[width=0.8\textwidth]{pos_figure.eps}
\caption{Four cycles of the proof-of-stake mechanism}
\label{fig:pos_figure}
\end{figure}
\subsubsection{Clock}
The protocol imposes minimum delays between blocks. In principle, each block
can be mined by any stakeholder. However, for a given block, each stakeholder
is subject to a random minimum delay. The stakeholder receiving the highest
priority may mine the block one minute after the previous block. The
stakeholder receiving the second highest priority may mine the block two
minutes after the previous block, the third, three minutes, and so on.
This guarantees that a fork where only a small fraction of stakeholders
contribute will exhibit a low rate of block creation. If this weren't
the case, a CPU denial-of-service attack would be possible by
tricking nodes into verifying a very long chain claimed to have a very high
score.
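
The effect of these delays on a fork can be sketched as follows. The function
names are hypothetical, priorities are numbered from 1 (the highest), and the
delay is expressed in minutes.
\begin{lstlisting}
(* The stakeholder with priority k must wait k minutes. *)
let min_delay_minutes priority = priority

(* Earliest time at which the next block can appear when only the
   stakeholders holding the given priorities take part.
   Returns max_int if nobody takes part. *)
let earliest_delay participating_priorities =
  List.fold_left
    (fun acc p -> min acc (min_delay_minutes p))
    max_int participating_priorities
\end{lstlisting}
On the main chain, where the highest priorities are present,
\verb!earliest_delay [1; 2; 3]! is one minute. On a fork where only the
stakeholders with priorities 7 and 12 contribute, \verb!earliest_delay [12; 7]!
is seven minutes, so the fork grows much more slowly.
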
\subsubsection{Generating the random seed}
Every block mined carries a hash commitment to a random number chosen by the
miner. These numbers must be revealed in the next cycle under penalty of
forfeiting the safety bond. This harsh penalty is meant to prevent selective
withholding of the numbers, which could be used to attack the entropy of the seed.
Malicious miners in the next cycle could attempt to censor such reveals. However,
since multiple numbers may be revealed in a single block, they are very unlikely
to succeed.
All the revealed numbers in a cycle are combined in a hash list and the seed is
derived from the root using the \verb!scrypt! key derivation function. The key
derivation should be tuned so that deriving the seed takes on the order of a
fraction of a percent of the average validation time for a block on a typical
desktop PC.
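
The commit and reveal steps can be sketched as follows. This is only an
illustration under several assumptions of ours: the standard library's
\verb!Digest! module (MD5) stands in for the protocol's hash function, the
hash list is modelled as a simple hash chain, and the final \verb!scrypt!
derivation is left out since it is not available in the standard library.
\begin{lstlisting}
(* Stand-in hash function (MD5); NOT suitable for real use. *)
let hash (s : string) : string = Digest.to_hex (Digest.string s)

(* Commitment included in a block by its miner. *)
let commit number = hash number

(* A reveal is valid if it matches the earlier commitment. *)
let valid_reveal ~commitment ~number =
  String.equal (hash number) commitment

(* Combine the revealed numbers, in order, into a single root.
   The seed would then be derived from this root. *)
let combine revealed =
  List.fold_left (fun acc n -> hash (acc ^ n)) (hash "") revealed
\end{lstlisting}
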
\subsubsection{Follow-the-coin procedure}
In order to randomly select a stakeholder, we use a follow-the-coin procedure.
\paragraph{Principle}
The idea is known in Bitcoin as follow-the-satoshi. The procedure works
``as-if'' every satoshi ever minted had a unique serial number. Satoshis are
implicitly ordered by creation time; a random satoshi is drawn and tracked
through the blockchain. Of course, individual cents are not tracked directly.
Instead, rules are applied to describe what happens when inputs are combined and
spent over multiple outputs.
In the end, the algorithm keeps track of a set of intervals associated with each
key. Each interval represents a ``range'' of satoshis.
Unfortunately, over time, the database becomes more and more fragmented,
increasing bloat on the client side.
\paragraph{Coin Rolls}
We optimize the previous algorithm by constructing large ``coin rolls'' made up
of \num{10000} tez. There are thus about one million rolls in existence. A
database maps every roll to its current owner.
Each address holds a certain set of specific rolls as well as some loose change.
When we desire to spend a fraction of a full roll, the roll is broken and
its serial number is sent to a LIFO queue of rolls, a sort of ``limbo''. Every
transaction is processed in a way that minimizes the number of broken rolls.
Whenever an address holds enough coins to form a roll, a serial number is pulled
from the queue and the roll is formed again.
The LIFO priority ensures that an attacker working on a secret fork cannot
change the coins he holds by shuffling change between accounts.
A slight drawback of this approach is that stake is rounded down to the
nearest integer number of rolls. However, this provides a massive improvement
in efficiency over the follow-the-satoshi approach.
While the rolls are numbered, this approach does not preclude the use of
fungibility preserving protocols like Zerocash. Such protocols can use
the same ``limbo'' queue technique.
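
The ``limbo'' can be sketched as a simple stack of serial numbers. The names
below are hypothetical and the bookkeeping of balances is left out. Note that
with ten billion tez and rolls of \num{10000} tez, serial numbers range over
$10^{10} / 10^{4} = 10^{6}$ values.
\begin{lstlisting}
(* Serial numbers of broken rolls, most recently broken first. *)
type limbo = int list

(* Breaking a roll sends its serial number to the limbo. *)
let break_roll serial (limbo : limbo) : limbo = serial :: limbo

(* Forming a roll pulls the most recently broken serial number,
   or returns None if the limbo is empty. *)
let form_roll (limbo : limbo) : (int * limbo) option =
  match limbo with
  | [] -> None
  | serial :: rest -> Some (serial, rest)
\end{lstlisting}
For example, after breaking rolls 3 and then 7, the next roll to be formed
receives serial number 7.
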
\paragraph{Motivation}
This procedure is functionally different from merely drawing a random address
weighted by balance.
Indeed, in a secretive fork, a miner could attempt to control the generation of
the random seed and to assign itself signing and minting rights by creating the
appropriate addresses ahead of time. This is much harder to achieve if rolls
are randomly selected, as the secretive fork cannot fake ownership of certain
rolls and must thus try to preimage the hash function applied to the seed to
assign itself signing and minting rights.
Indeed, in a cycle of length $N=\num{2048}$, someone holding a fraction $f$ of
the rolls will receive on average $f N$ mining rights, and the effective
fraction received, $f_0$, will have a relative standard deviation of
$$\sqrt{\frac{1}{N}}\sqrt{\frac{1-f}{f}}.$$
If an attacker can perform a brute-force search through $W$ different seeds,
then his expected advantage is at most\footnote{This is a standard bound
on the expectation of the maximum of $W$ normally distributed variables.}
$$\left(\sqrt{\frac{2\log(W)}{N}}\sqrt{\frac{1-f}{f}}\right)fN$$
blocks. For instance, an attacker controlling $f = 10\%$ of the rolls should
expect to mine about $205$ blocks per cycle. In a secret fork where he attempts
to control the seed, assuming he computed over a trillion hashes, he could
assign himself about $302$ blocks, or about $14.7\%$ of the blocks. Note that:
\begin{itemize}
\item[-] The hash from which the seed is derived is an expensive key derivation
function, rendering brute-force search impractical.
\item[-] To make linear gains in blocks mined, the attacker needs to expend a
quadratically exponential effort.
\end{itemize}
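
The bound can be rewritten in a form that makes the cost of the attack more
apparent:
$$\left(\sqrt{\frac{2\log(W)}{N}}\sqrt{\frac{1-f}{f}}\right)fN = \sqrt{2 N f (1-f) \log(W)}.$$
As a quick evaluation under our own assumptions (natural logarithm and
$W = 10^{12}$), with $N = \num{2048}$ and $f = 10\%$ this gives
$$\sqrt{2 \times 2048 \times 0.1 \times 0.9 \times \log(10^{12})} \approx \sqrt{368.6 \times 27.6} \approx 101$$
additional blocks, which is of the same order as the gain quoted above; the
exact figure depends on the value of $W$ assumed. Since the advantage grows as
$\sqrt{\log(W)}$, doubling it requires raising $W$ to the fourth power.
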
\subsubsection{Mining blocks}
The random seed is used to repeatedly select a roll. The first roll selected
allows its stakeholder to mine a block after one minute, the second one after
two minutes --- and so on.
When a stakeholder observes the seed and realizes he can mint a high priority
block in the next cycle, he can make a security deposit.
To avoid a potentially problematic situation where no stakeholder made a
safety deposit to mine a particular block, after a 16-minute delay, the
block may be mined without a deposit.
Bonds are implicitly returned to their buyers immediately in any chain
where they do not mine the block.
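
One possible reading of these rules is sketched below. The function
\verb!may_mine! and its arguments are hypothetical, and the way the priority
delay and the 16-minute rule combine is our interpretation.
\begin{lstlisting}
(* Delay, in minutes, after which no deposit is needed. *)
let no_deposit_delay = 16

(* A stakeholder with the given priority (1 is the highest) may
   mine once its own delay has elapsed, provided it made a deposit
   or the no-deposit delay has elapsed as well. *)
let may_mine ~priority ~elapsed_minutes ~has_deposit =
  elapsed_minutes >= priority
  && (has_deposit || elapsed_minutes >= no_deposit_delay)
\end{lstlisting}
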
\subsubsection{Signing blocks}
As it is, we almost have a working proof of stake system.
We could define a chain's weight to be the number of blocks.
However, this would open the door to a form of selfish mining.
We thus introduce a signing scheme. While a block is being minted, the random
seed is used to randomly assign 16 signing rights to 16 rolls.
The stakeholders who received signing rights observe the blocks being minted and
then submit signatures of that block. Those signatures are then included in
the next block, by miners attempting to secure their parent's inclusion in the
blockchain.
The signing reward received by signers is inversely proportional to the time
interval between the block and its predecessor.
Signers thus have a strong incentive to sign what they genuinely believe to be
the best block produced at one point. They also have a strong incentive to agree
on which block they will sign as signing rewards are only paid if the block ends
up included in the blockchain.
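
Using the seed protocol's formula of $32\Delta T^{-1}$ tez per signature, the
reward can be sketched as below. The function name is ours, and rounding to
whole cents is ignored.
\begin{lstlisting}
(* Signing reward in tez, for a block produced delta_t_minutes
   after its predecessor. *)
let signing_reward ~delta_t_minutes = 32.0 /. delta_t_minutes
\end{lstlisting}
Signing a block mined after one minute thus pays \tz\num{32}, whereas signing a
block mined after four minutes only pays \tz\num{8}.
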
If the highest priority block isn't mined (perhaps because the miner isn't
online), there could be an incentive for signers to wait for a while, just
in case the miner is late. However, other signers may then decide to sign the
best priority block, and a new block could include those signatures, leaving out
the holdouts. Thus, signers are unlikely to follow this strategy.
Conversely, we could imagine an equilibrium where signers panic and start
signing the first block they see, for fear that other signers will do so and
that a new block will be built immediately. This is however a very contrived
situation which benefits no one. There is no incentive for signers to think this
equilibrium is likely, let alone to modify the code of their program to act
this way. A malicious stakeholder attempting to disrupt the operations would only
hurt itself by attempting to follow this strategy, as others would be unlikely
to follow suit.
\subsubsection{Weight of the chain}
The weight is the number of signatures.
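
A minimal sketch of this rule follows; the record \verb!block! is a
hypothetical simplification that only keeps the number of signatures
included in each block.
\begin{lstlisting}
(* Simplified block: its height and the number of signatures
   it carries (at most 16 in the seed protocol). *)
type block = { height : int ; signatures : int }

(* The weight of a chain is its total number of signatures. *)
let weight chain =
  List.fold_left (fun acc b -> acc + b.signatures) 0 chain
\end{lstlisting}
Under this rule, a chain of three blocks carrying two signatures each (weight 6)
loses to a chain of two fully signed blocks (weight 32), even though it is
longer.
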
\subsubsection{Denunciations}
In order to avoid the double minting of a block or the double signing of a
block, a miner may include in his block a denunciation.
This denunciation takes the form of two signatures. Each minting signature
or block signature signs the height of the block, making the proof of malfeasance quite concise.
While we could allow anyone to denounce malfeasance, there is really no point in
allowing anyone other than the block miner to do so. Indeed, a miner can
simply copy any proof of malfeasance and pass it off as his own
discovery.\footnote{A zero-knowledge proof would allow anyone to benefit from
denouncing malfeasances, but it's not particularly clear this carries much
benefit.}
Once a party has been found guilty of double minting or double signing,
the safety bond is forfeited.
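
The check performed on a denunciation can be sketched as follows. The record
\verb!signed_height! is hypothetical, and both signatures are assumed to have
already been verified cryptographically.
\begin{lstlisting}
(* A verified signature by [signer] over a block at [height]. *)
type signed_height = {
  signer : string ;      (* id of the signing key *)
  height : int ;         (* height that was signed *)
  block_hash : string ;  (* block that was signed *)
}

(* Two signatures prove malfeasance if the same key signed two
   different blocks at the same height. *)
let is_double_signing a b =
  String.equal a.signer b.signer
  && a.height = b.height
  && not (String.equal a.block_hash b.block_hash)
\end{lstlisting}
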
\subsection{Smart contracts}
\subsubsection{Contract type}
In lieu of unspent outputs, Tezos uses stateful accounts. When those
accounts specify executable code, they are known more generally as
contracts. Since an account is a type of contract (one with no
executable code), we refer to both as ``contracts'' in full generality.
Each contract has a ``manager'', which in the case of an account is
simply the owner. If the contract is flagged as spendable, the manager
may spend the funds associated with the contract. In addition, each
contract may specify the hash of a public key used to sign or
mine blocks in the proof-of-stake protocol. The private key may or
may not be controlled by the manager.
Formally, a contract is represented as:
\begin{lstlisting}
type contract = {
counter: int; (* counter to prevent replay attacks *)
manager: id; (* hash of the contract's manager public key *)
balance: Int64.t; (* balance held *)
signer: id option; (* id of the signer *)
code: opcode list; (* contract code as a list of opcodes *)
storage: data list; (* storage of the contract *)
spendable: bool; (* may the money be spent by the manager? *)
delegatable: bool; (* may the manager change the signing key? *)
}
\end{lstlisting}
The handle of a contract is the hash of its initial content. Attempting
to create a contract whose hash would collide with an existing contract
is an invalid operation and cannot be included in a valid block.
Note that data is represented as the following union type:
\begin{lstlisting}
type data =
| STRING of string
| INT of int
\end{lstlisting}
where \verb!INT! is a signed 64-bit integer and \verb!STRING! is an array of
up to \num{1024} bytes. The storage capacity is limited to \num{16384} bytes,
counting the integers as eight bytes and the strings as their length.
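
The size accounting can be sketched as follows. The helper names
\verb!size_of!, \verb!storage_size! and \verb!valid_storage! are ours.
\begin{lstlisting}
type data =
  | STRING of string
  | INT of int

let max_string_bytes = 1024
let max_storage_bytes = 16384

(* Integers count as eight bytes, strings as their length. *)
let size_of = function
  | STRING s -> String.length s
  | INT _ -> 8

let storage_size storage =
  List.fold_left (fun acc d -> acc + size_of d) 0 storage

(* Every string and the storage as a whole must respect the limits. *)
let valid_storage storage =
  List.for_all
    (function
      | STRING s -> String.length s <= max_string_bytes
      | INT _ -> true)
    storage
  && storage_size storage <= max_storage_bytes
\end{lstlisting}
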
\subsubsection{Origination}
The origination operation may be used to create a new contract. It specifies
the code of the contract and the initial content of the contract's storage. If
the handle is already the handle of an existing contract, the origination is
rejected (there is no reason for this to ever happen, unless by mistake or
malice).
A contract needs a minimum balance of $\tz~\num{1}$ to remain active. If the
balance falls below this number, the contract is destroyed.
\subsubsection{Transactions}
A transaction is a message sent from one contract to another contract. This
message is represented as:
\begin{lstlisting}
type transaction = {
amount: amount; (* amount being sent *)
parameters: data list; (* parameters passed to the script *)
(* counter (invoice id) to avoid replay attacks *)
counter: int;
destination: contract hash;
}
\end{lstlisting}
Such a transaction can be sent from a contract if signed using the manager's key
or can be sent programmatically by code executing in the contract. When the
transaction is received, the amount is added to the destination contract's
balance and the destination contract's code is executed. This code can make use
of the parameters passed to it; it can read and write the contract's storage,
change the signature key and post transactions to other contracts.
The role of the counter is to prevent replay attacks. A transaction is only
valid if the contract's counter is equal to the transaction's counter. Once a
transaction is applied, the counter increases by one, preventing the transaction
from being reused.
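
The counter check can be sketched as follows. The records are simplified and
hypothetical; we assume the counter is that of the sending contract, and the
balance check is an addition of ours.
\begin{lstlisting}
(* Simplified sending contract and transaction. *)
type account = { counter : int ; balance : int }
type transfer = { tx_counter : int ; amount : int }

(* Returns the updated sender, or None if the transaction is
   invalid (wrong counter or insufficient funds). *)
let apply (source : account) (tx : transfer) : account option =
  if tx.tx_counter <> source.counter || tx.amount > source.balance
  then None
  else Some { counter = source.counter + 1 ;
              balance = source.balance - tx.amount }
\end{lstlisting}
Applying the same transaction a second time fails, since the sender's counter
has moved on.
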
The transaction also includes the block hash of a recent block that the client
considers valid. If an attacker ever succeeds in forcing a long reorganization
with a fork, he will be unable to include such transactions, making the fork
obviously fake. This is a last line of defense: TAPOS (transactions as proof of stake) is a great system to
prevent long reorganizations but not a very good system to prevent short term
double spending.
The pair (account\_handle, counter) is roughly the equivalent of an unspent
output in Bitcoin.
\subsubsection{Storage fees}
Since storage imposes a cost on the network, a minimum fee of \tz~1 is assessed
for each byte increase in the storage. For instance, if after the execution of
a transaction, an integer has been added to the storage and ten characters have
been appended to an existing string in the storage, then \tz~18 will be withdrawn
from the contract's balance and destroyed.
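
The fee computation can be sketched as follows. The function name is ours, and
we assume that a decrease in storage size gives no refund.
\begin{lstlisting}
(* Fee in tez: one tez per byte of storage increase. *)
let storage_fee ~size_before ~size_after =
  max 0 (size_after - size_before)
\end{lstlisting}
In the example above, the storage grows by $8 + 10 = 18$ bytes, hence a fee of
\tz~18.
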
\subsubsection{Code}
The language is stack based, with high level data types and primitives and strict
static type checking. Its design is inspired by Forth, Scheme, ML and Cat.
A full specification of the instruction set is available in~\cite{language}.
This specification gives the complete instruction set, type system and semantics
of the language. It is meant as a precise reference manual, not an easy introduction.
\subsubsection{Fees}
So far, this system is similar to the way Ethereum handles transactions. However,
we differ in the way we handle fees. Ethereum allows arbitrarily long programs
to execute by requiring a fee that increases linearly with the program's
execution time. Unfortunately, while this does provide an incentive for one
miner to verify the transaction, it does not provide such an incentive to other
miners, who must also verify this transaction. In practice, most of the
interesting programs that can be used for smart contracts are very short.
Thus, we simplify the construction by imposing a hard cap on the number of steps
we allow the programs to run for.
If the hard cap proves too tight for some programs, they can break the execution
in multiple steps and use multiple transactions to execute fully. Since Tezos is
amendable, this cap can be changed in the future, or advanced primitives can be
introduced as new opcodes.
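
The hard cap can be sketched with a generic step-bounded runner. This is not
the actual interpreter: the \verb!step! function, which returns \verb!None!
once the program has halted, and the \verb!outcome! type are hypothetical.
\begin{lstlisting}
(* Result of a capped execution, with the last state reached. *)
type 'a outcome =
  | Halted of 'a
  | Cap_reached of 'a

(* Run [step] until the program halts or [max_steps] steps have
   been executed, whichever comes first. *)
let rec run_capped ~max_steps ~step state =
  match step state with
  | None -> Halted state
  | Some _ when max_steps = 0 -> Cap_reached state
  | Some next -> run_capped ~max_steps:(max_steps - 1) ~step next
\end{lstlisting}
A program that hits the cap could save the state returned in
\verb!Cap_reached! to its storage and resume from it in a later transaction.
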
If the account permits, the signature key may be changed by issuing a signed
message requesting the change.
\section{Conclusion}
We feel we've built an appealing seed protocol. However, Tezos's true potential
lies in putting the stakeholders in charge of deciding on a protocol that they
feel best serves them.
\bibliographystyle{plain}
\bibliography{biblio}
\end{document}