Skip to main content

· 5 min read
Mark Seaborn

Since the advent of peer-to-peer networks in the 1970s, we have seen them applied in a variety of use cases on the internet. From the first usages in the Usenet network, to the infamous Napster network, astute developers have created peer networks to enable sharing of information and data without the use of central servers. Over the past few years, we have seen a new increase in the use of peer to peer networks and formalization of algorithms to implement these networks. The Pyrsia network is an example of a new application of peer to peer networks. Pyrsia is an open-source project that is designed to build trust that an open-source distribution is free of malicious code via a consensus algorithm, then share that distribution amongst the peers. An important goal of this project is to create a peer network that is reliable and "fast". To do this, the team added a unique feature to the peer network protocol that helps one peer in the network choose the best peer to download software distributions.

Herein I present the mechanism the Pyrsia team has created to help improve the selection process for downloading software distributions from peers. The idea is to improve upon the Kademila peer selection process and make "smarter" choices from Pyrsia peers based on a "peer metric". The mechanism is layered atop the Kademila peer selection process, which in Pyrisa returns a list of peers known to be in possession of an open-source distribution of interest. The peer metric is a real time assessment of the work-load of any given machine in the network, both related and unrelated to any current software transfers within the Pyrsia network.

The following table defines terms used in this blog.

TermDefinition
RequesterIn this context a requestor is a Pyrsia client requesting a file to be downloaded or uploaded to/from its peer(s).
Peer (Pyrsia Peer)Any Pyrsia client that is participating in the network.
Neighboring NodeAny node that is returned by the Kademila algorithm as having a software distribution desired by a peer.

As previously stated, a primary goal of the peer-to-peer software distribution sharing function of Pyrsia is that it efficiently uses peers. The efficient use of peers can be broken down into two aspects, efficiencies related to uploading software distributions to peers and efficiencies related to downloading software distributions from peers. This post limits the discussion to downloading software distributions from peers.

The primary rule defined by our selection process is that requesters must not overwhelm peers by requesting too many software distributions too often from any one neighboring node. Though hard resource usage limits are a configuration function for Pyrsia Peers, Pyrsia’s peer selection mechanism should also attempt to balance requests across the network of peers. Additionally, the peer selection mechanism should operate such that file requesters are not starved for bandwidth in the network by unresponsive peers, preventing fast downloads. While the first efficiency could be guaranteed by Pyrsia policy, the second one is more nebulous because Pyrsia cannot guarantee enough stable network peers exist at request times to achieve “fast downloads”.

The terms “fast downloads” and “requesting too much data” must be further defined to quantify what is meant. The means of describing these terms can be defined as a function of the attributes within the environment. There are measurable attributes of both the network and the Pyrsia peers participating in the networks that can help us define metrics to balance requests. These metrics will ultimately determine the quality of peers in the requesters' peer list. The term quality in this context is related to the ability of a peer to satisfy a request for download and will be defined by the function Q(x):

$Q(x) = \sum_{n=1} ^{pa\ count} pa_n * weight

where pa is a peer attribute of the environment and weight is a number that determines the expense of the attribute. The quality number will be calculated on demand by peers returned in the list of peers generated by the Kadimela algorithm. The quality function can be used for either upload quality or download quality depending on the need. The metric is currently used during the selection process for downloading software distributions. Finally, we need to define what characteristics are important to drive analytics for decisions about transactions on the network. The following table lists the attribute used by to Pyrsia generate the peer metric in Pyrsia. The peer with the lowest number is considered to be the ideal candidate for the software distribution download.

Peer AttributeDefinition
Peer Network LoadA measurement of the current network bandwidth usage in terms packets in and packet out summed over all network interfaces.
CPU LoadThe average CPU load over the last minute
Disk I/O LoadA measure of the current packets being read and written summed over all current processes on the system

This system of measuring the quality of peers will evolved over time and as other attributes are defined, they will be integrated into the Pyrsia network.

· 7 min read
Christopher McArthur

Blockchain technologies are trending. There’s a lot of information about what they do and how they work but often these are abstract high level overviews. Looking beneath the surface, these overviews skip some of the most fascinating parts of the implementations that bring these technologies to life.

What is a Blockchain

IBM has one of my favorite definitions:

 Blockchain is a shared, immutable ledger that facilitates the process of recording transactions and tracking assets [... on] a network. An asset can be tangible (a house, car, cash, land) or intangible (intellectual property, patents, copyrights, branding). Virtually anything of value can be tracked [... ] on a blockchain network, reducing risk [...] for all involved.

There’s a few key items to highlight from this definition are:

  • A shared, immutable ledger
  • Recording and tracking assets

Simply put, a Blockchain is a list of changes to assets, called transactions, which are grouped in blocks that are recognized by all participants in the network.

Unfortunately, it leaves out one major element. How do blockchain networks reduce risk? How do blockchain networks agree on the shared ledger?

Consensus

In the world of blockchain, consensus is the agreement of which block is next. Collectively all the participants in the blockchain’s network should come to the same conclusion.

You’ve probably heard of “Proof of Work”, usually called mining, and “Proof of Stake”, these two are the most popular in terms of market share in the crypto markets.

These consensus algorithms were some of the first to be popularized by projects like Bitcoin and Ethereum. These are far from the only ones, Proof of Authority, Proof of Burn, Proof of Capacity and Proof of History are just from others mentioned on Investopedia's Website.

Let’s dive deeper into the two popular algorithms to see how they obtain consensus.

Proof of Work

Proof of Work is based on a mathematical expression which is very costly to compute a magic number, called a nonce, but easily verified. Each participant, typically referred to as a miner, begins the calculation with the transitions it’s heard on the network since the last published block. Participants race each other to find the right answer and the winner traditionally gets a reward. It’s difficult to cheat and very rewarding to operate in good faith.

At the time of writing, January 28th 2022, the reward is 6.25 BTC which is worth 236,755 USD. It’s easy to draw parallels to the 1849 Gold Rush.

BitCoin Value

When miners hear of a new block, they immediately stop their calculation and begin listening for more transactions. This is approximately a 10 minute window for Bitcoin.

With a small number of competitors it’s pretty straightforward but when tens of thousands of nodes are involved, what happens if two miners finish at the same time. Who wins?

Proof of Stake

Proof of Stake is an investment strategy where committing more capital means you’re more likely to get the reward. Peercoin, a very early PoS implementation, kept the mining of PoW but required less computational complexity the more coinage was staked. If two participants offer the same investment in the next block, which one is rewarded?

If someone is able to offer more capital for their block to be accepted, are they able to always win? Yes, this is a special type of security exploit called a 51% attack. If the confirmation of the next block is tied to a resource, then an entity which holds a majority stake can take control of the blockchain. Proof of Work is also susceptible to this type of attack.

Stale and Orphan Blocks

Most blockchain’s are actually trees, not linked lists which is what probably comes to mind for young data scientists. This probably comes as a shock but it’s the secret ingredient to solving our racing condition when two participants propose the next block in PoW.

If two nodes broadcast different versions of the next block simultaneously, some nodes may receive one or the other first. In that case, they work on the first one they received, but save the other branch in case it becomes longer. The tie will be broken when the next proof-of-work is found and one branch becomes longer.

Bitcoin: A Peer-to-Peer Electronic Cash System Section 5

Chains are made of links that connect to others. This data structure is referred to as a linked list, when a link points to both it’s parent and child is a doubly linked list. Three sequential blocks being published would make the following chain:

If block 4 was published at the same time as our block 3 a node would have the following tree:

Since it’s extremely unlikely that a second pair of blocks would also be published at the same time, the tie is broken when the next sequential block is published. This is block 5 in the diagram above.

Which branch should we follow

Well, intuitively the “strongest” branch of our tree is the one we should stick with. The strength comes from the amount of work that has gone into making the branch. More work means it’s less likely that someone has cheated or lied.

Dynamic Validator Sets

Ethereum’s new PoS system named Consensus Layer, formally Ethereum 2.0, is still in development but it’s leading implementation is Casper FFG [2] (there’s also this alternative) which uses a Byzantine Fault Tolerance consensus.

In Proof of Stake, the participants are called validators who’s role is similar to miners in Proof of Work. The validator is staking money on the block that it thinks should be added to the chain. If the block gets appended, then they get a reward that is proportional to the bet that they placed on the block. [1] Validators are responsible for following the forking rules when staking checkpoints. Validators decide which block is the best by following a set of rules.

https://arxiv.org/pdf/1710.09437.pdf

The proposal set the stake deposit at 32 ETH to be eligible to act as a validator. At the time of writing that has a value of 81,374.81 USD even after a recent drop in value.

ETH Value

There’s a few key concepts employed in Casper FFG, many are outlined in the EIPS 1011’s Glossary

  • Checkpoint is the block in the finalization stage
  • Epoch is the range of blocks between checkpoints. This grow by one block for each new epoch
  • Dynasty refers to the number of finalized checkpoints in the chain. (Note: checkpoints do not reach finality unless a super-majority of votes are obtained in favor)

The validators are randomly selected (this is not detailed in the proposals but details are available here) based on who has deposited the correct funds. Participants must deposit the correct sum 2 dynasties in advance to join a validator set. In order to leave they must send a withdrawal request 2 dynasties in advance, however the funds are not returned for approximately 4 months since the withdrawal was requested.

Consensus, or finality, is not reached unless ⅔, a super-majority of votes have been gathered stating the checkpoint meets the forking rules “follow the chain containing the justified [block] of the greatest height”.

When a checkpoint is finalized all the blocks in it’s epoch are implicitly finalized as well. This also marks the start of a new dynasty, casting the previous on to history it can no longer be modified and reaches immutability. This is also when validators can enter or exit the staking on checkpoints.

References

https://developer.bitcoin.org/devguide/p2p_network.html?highlight=stale%20orphan#orphan-blocks

https://developer.bitcoin.org/devguide/block_chain.html?highlight=stale#block-height-and-forking

https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/

https://vitalik.ca/general/2017/12/31/pos_faq.html

https://medium.com/unitychain/intro-to-casper-ffg-9ed944d98b2d

https://medium.com/@ppio/consensus-byzantine-fault-tolerance-402258ec7a60

https://ethresear.ch/t/casper-ffg-with-one-message-type-and-simpler-fork-choice-rule/103/31