The Byzantine Generals problem was first introduced in Computer Science paper published in 1982. The problem discussed in the paper is that reliable computer systems must be able to function effectively in the presence of faulty components that may send conflicting information to different parts of the system. This issue is even more accurate when we talk about decentralized computer networks.
Imagine the following thought experiment the Byzantine army has surrounded an enemy city. The Army is organized into several units. Each unit is commanded by a general and they all need to come up with a coordinated plan of action. However, they are located away from each other and the only means to communicate among themselves is via messages to make things more complicated one or more of the generals are possibly traitors. The presence of disloyal generals means that misleading messages could be sent. Aiming to disrupt any coordinated plan of action is it attack or retreat to find a successful solution to this conundrum.
The Byzantine army needs to find its path. The coordinated action one way or another to achieve this. The Byzantine army needs an algorithm that works effectively towards a coordinated outcome where the loyal generals follow it and the traders don't. Now that you're familiar with the problem let's see its solution. It's called the Byzantine fault tolerance algorithm. Over the years there have been several proposed theoretical solutions involving game theory and math. The first practical implementation of Byzantine fault tolerance algorithm came with the bitcoins.
Proof of work in this case the generals are nodes on the bitcoin network also known as Meiners a network node is a connection point that can receive create store and send data across the network. In other words nodes are the connected dots that make up a network to simplify. Think of it in the following way in the image we traditionally use to depict a block chain. Every single computer is a separate node. They are all connected and can receive create store the end data to each other in the context of the Byzantine fault tolerance algorithm. The important concept to grasp is that these mining nodes start from the assumption that nobody else on the network can be trusted. Proof of Work security network consensus even in the presence of non-compliant nodes that is even if there are some Byzantine Generals who are not acting in the Army's best interest coordinated action can still be achieved.
Let's see how this mechanism works in bitcoin as we all know Bitcoin is a peer to peer network where all activities are done by its users through appropriate software and hardware. These activities include making transactions receiving transactions and verifying and transmitting transactions. Now, this is where we need to introduce the concept of mining which many of you have probably heard. Mining is an activity carried out by network participants which involves proof of work and results in generating new coins as a reward for the miner who successfully did this proof of work.
First for each new block proof of work requires a hefty number of calculations done by a computer aimed at solving cryptographic hash puzzles. These are the same puzzles described earlier enabling the network to function and to continue exchanging transaction messages with other network participants. Let's dig in to the nuts and bolts of this mechanism to figure out how it works. First, let's see how miners create new blocks mining nodes collect and aggregate new transaction data. Upon receiving such data each node independently verifies each and every transaction against a long list of criteria including tracking the source of the digital money being spent checking for double spending of the same money checking if the total transaction volume is within the allowed range of zero to 21 million bitcoins as 21 is the maximum total supply of bitcoin allowed by the system and the list goes on. The Bitcoin software installed on the node performs a number of other checks and balances verified transactions are aggregated into transaction pools also called Memory pools or memory pools where they wait until they are included into a block as miners compete with each other to be the first one to come up with a new valid block. They need to make sure the transactions in their memory pools have not already been included in previous blocks.
After collecting and arranging verified transactions in a candidate block the miner needs to construct the block header which includes a few important components. A summary of all the transaction data in the candidate block a link to the previous block in the chain also known as a parent block a time stamp showing the time of creation of the block and a valid proof of work. The summaries of what's included in a given block are done through hash functions which process data in such a way that results in a standardized Unique Identification Code. This identification code is also referred to as a digital fingerprint. In this way, the system has a unique identifier for each block of transactions.
Here are some examples of block blockaders as viewed in block explorer Dom and block chain dot info. As you can see on top just below the block number there is a long alphanumeric string called block hash or just hash alphanumeric means that it consists of both letters and numbers. This is a type of encoding data and is the output result of processing the blockheaded data below through bitcoins cryptographic hash function. You may have heard the name of this function sh a 256 which stands for the secure hash algorithm. You probably remember that we mentioned and briefly explain hash functions in the section about cryptography we are discussing them again here because they play such an important role in the proof of work.
As we've learned a hash function can digest any kind of data of any size into a fixed length string of characters which can serve as a unique digital fingerprint or identifier. Moreover, these cryptographic hash functions only work in one direction. Once we have the output we cannot simply invert the function plug in the output and get input data on the other end. To illustrate what it means to invert a function consider the four basic mathematical operations addition and subtraction are inverse functions of each other multiplication and division are inverse functions of each other. We can always construct equations with these functions defined any unknown variable for example 3 times X equals 15 x equals 15 divided by three equals five. Many mathematical functions can be inverted in a similar way. However this is not the case with cryptographic hash functions. The only feasible way an unknown random variable can be found in a cryptographic functions input data set is by trying different values for the unknown. One by one given all the known parameters in order to find out what works.
This is basically a brute force approach of trying potentially all possible combinations and this is precisely the element of work as used in proof of work. The work comprises all the iterative computation a computer needs to do to find a solution of the cryptographic puzzle.