Byzantine Fault Tolerance For Unstoppable Distributed System

In this article, we will discuss the Famous Byzantine general’s problem which led to figuring out the solution with Byzantine fault tolerance as a consensus to avoid getting into the Byzantine general’s problem and ultimately developing a practical Byzantine fault tolerance.

Understanding Byzantine general’s problem

This problem was invented by Leslie Lamport, Robert Shostak, and Marshall Pease in the year 1982. This problem is an excellent example of the problems faced by various parties involved in a decentralized system in reaching a consensus. The Byzantine problem goes like this

The Byzantine army is divided into numerous troops, each of which is commanded by a general.
The troop generals have to agree on the next movement of the plan.
Generals communicate via messenger to come to an agreement on a coordinated plan of action whereby all troops coordinate and launch attacks from all directions to secure victory.
Here it is likely that traitors will try to interfere with their plan by intercepting and faulting the messages.
This challenge is meant to force all of the true commanders to come to a consensus without the imposters interfering with their schemes.
Regardless of the traitors’ actions here, the generals must come to a consensus or decision.
At no point in time should the system or network allow for a 51% attack.

So the consensus method called Byzantine Fault Tolerance (BFT) prevents a system from becoming involved in the Byzantine Generals’ dilemma. It also implies that if a node or the system fails, everything should still function. Furthermore, BFT seeks to lessen the network’s exposure to fraudulent byzantine nodes, or generic nodes.

The on-field solution to Byzantine General’s problem

Let’s say, the generals have all agreed upon a protocol that requires the solution of a difficult puzzle before the attack message can be appended. The proposing general finds the cryptography puzzle challenging, but other generals find it simple to review. Here, a general on the battlefield makes up a plan and wants to tell everyone else to attack on a specific day and time.

By the plan. The following will be the steps:

· He’s going to add an afterthought to the original structure.

· Next, he will see the outcome of hashing the plan that was appended with the nonce. Then, he will continue adding the nonce and determining whether he has the desired hash.

Next, forward the messengers to additional pertinent generals. Any alteration to their message will result in an entirely different hash, which the other generals can easily identify, even if the messengers are apprehended.

The solution to the Byzantine general problem

The Byzantine fault-tolerant algorithms encompass a collection of algorithms utilized to address the Byzantine general’s problem. Many early algorithms for solving the Byzantine general problem were proposed for example oral information (OM), the messages with signatures (SM) solution, Full-time domain-matching algorithm pursuit (FTMP), and The Scale-Invariant Feature Transform algorithm (SIFT). Among them, oral information (OM) and the messages with signatures (SM) solution were proposed by Lamport.

Practical Byzantine fault tolerance (PBFT)

In 1999, Castro et al introduced the practical Byzantine fault tolerance (PBFT) consensus algorithm. It describes the first practical state machine replication protocol. It was more operationally efficient and had less algorithmic complexity than the conventional SM algorithm, which made it an improvement. The three sub-algorithms that make up the PBFT algorithm are view change, garbage collection, and regular consensus. When the system is functioning normally, it employs the regular consensus algorithm, and superfluous log messages are eliminated through garbage collection.

If the leader, or primary node, fails, the view change mechanism is applied to identify a new main node. A protocol known as PBFT is created for networks with N nodes that are partially synchronous, where N is equal to 3f + 1, and f represents the number of faulty or Byzantine nodes. Replay and spoofing attacks are stopped by the system by using cryptographic techniques like Message authentication codes, hash functions, and public key signatures.

Working of PBFT

PBFT usually uses a traditional mechanism that consists of three segments: pre-prepare, prepare, and commit, to come to a consensus on a request. The primary replica assigns the client’s request a sequence number n during the pre-prepare phase, after which the resulting pre-prepared message is sent to all replicas. Simultaneously, the main replica will append the message to its log. Replica I will start the preparation phase by multicasting a prepared message to every other replica after receiving a pre-prepared message. Additional copies will confirm that the preparation is accurate.

After receiving a pre-prepared message, the replica starts the preparatory phase by multicasting a prepared message to every other replica via multicast. When other replicas receive the prepared message, they will check to see if it is correct and log it in their logs accordingly. Replica will start broadcasting the commit message if it receives at least 2f +1 valid prepared messages from different replicas.During the commit phase, the replica validates the correctness of commit messages from other replicas in the system.

It will send the client a request acknowledgment message only after receiving at least 2f + 1 accurate commit messages, including its own. Once the acknowledgment message is sent, the client can conclude that the system has achieved a consensus on the request. The replica receives 2f + 1 acknowledgment messages.

Solving the 3f+1

If all of the “f” nodes fail byzantine problem, the system must handle two different kinds of issues. Sending the message maliciously in the first place and not sending it at all in the second. Thus, the system must continue to work properly after “(n-f)” nodes. But, the “f” replicas that did not reply might not be defective, which would mean that the “f” replicas that did reply might be defective. Thus, “(n-f)-f > f” is required at the very least if we want the number of non-faulty nodes to exceed that of the faulty ones. ‘n > 3f+1’ is therefore the ideal value.

In practice, the observed Byzantine fault tolerance in Ethereum is 46% and in bitcoin, it is 49.5% which goes down to 33% when the block time is equal to network latency and then reduced to zero.

Advantage of PBFT

Every node receives a reward because PBFT necessitates that all nodes participate and fulfill client requests. Low reward variance amongst nodes is the result.
Here, a block of transactions does not need to wait for several confirmations from every node. Therefore, it takes less time.
It’s a consensus model that uses less energy.
Unlike PoW, a PBFT does not require performing complex mathematical computations.

Disadvantage of PBFT

It is not stable over large networks.
The high communication overhead associated with PBFT will rise as the number of nodes in the network increases.
Sybil attacks, in which a single node assumes the role of several network nodes, can affect PBFT.

Conclusion

Security is crucial when it comes to cryptocurrency, whether it be at the network level or the individual user level. Furthermore, self-custody is the only practical option for anyone concerned about their security in cryptocurrency. Since byzantine fault tolerance is the foundation of a blockchain that is impervious to corruption, it is essential for public blockchains. You wouldn’t be able to determine whether a block is legitimate without these kinds of mechanisms. Due to the increased possibility of double-spending, the network’s security is compromised. To put it briefly, byzantine fault tolerance is critical to every public blockchain.

Unlocking Byzantine Fault Tolerance for Unstoppable Distributed Systems

Table of Contents