Lighthouse Fuzzing Update
Lighthouse is being developed with a security-first mindset. We perform extensive reviews of our Rust code and closely monitor the security posture of the software dependencies used within our Ethereum 2.0 client. Additionally, Sigma Prime has invested significant time and effort in fuzz testing (i.e., fuzzing).
A quick summary of the fuzzing toolset used in Lighthouse can be found here.
Current fuzzing efforts have been targeting three areas of Lighthouse:
- Data received from other nodes on the network
- Serialisation decoding (SSZ)
- State transitions and block processing
Fuzzing activities leverage the cargo-fuzz crate, a command-line wrapper for libFuzzer (part of the LLVM project). The fuzzing-state-processing branch of the sigp/lighthouse repository contains the fuzz targets developed by Sigma Prime for the state transition and serialisation functions.
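In practice, a cargo-fuzz target boils down to a function from a byte slice to unit. As an illustrative sketch (the decoder and names here are hypothetical stand-ins, not Lighthouse code), the decode-then-process shape of such a target looks like this; under cargo-fuzz, the harness body would sit inside libFuzzer's fuzz_target! macro:

```rust
// Hypothetical harness body: cargo-fuzz calls this with each mutated input.
// A target must never panic on malformed bytes -- it simply returns early.
fn harness(data: &[u8]) -> bool {
    // Attempt to decode the raw bytes; most random inputs fail here.
    let decoded = match decode_u64_le(data) {
        Some(v) => v,
        None => return false,
    };
    // "Process" the decoded value; in Lighthouse this step would be e.g.
    // per_block_processing on a decoded block.
    decoded % 2 == 0
}

// Stand-in decoder: requires exactly 8 bytes, like a fixed-size SSZ field.
fn decode_u64_le(data: &[u8]) -> Option<u64> {
    Some(u64::from_le_bytes(data.try_into().ok()?))
}
```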
The Lighthouse network stack lives in a separate repository (sigp/rust-libp2p). The discv5-fuzzing branch of this repository contains the fuzz targets used to assess our implementation of the network stack.
One of the core parts of any blockchain client is the ability to process new blocks. Fuzzing the block processing functions in Lighthouse raised numerous challenges. This section outlines the steps taken to develop a framework for fuzzing the block processing functions.
The initial target of our state transition fuzzer was the per_block_processing() function, which takes a single block and updates the state accordingly. Initial attempts consisted of creating a genesis state, decoding the fuzzer's input from SSZ bytes into a block object, and finally processing the block.
This naive approach was far too time-consuming, since it required a genesis state to be built for every fuzzing iteration. The solution was to create a specific state (a couple of epochs after genesis), store this state in memory, and reload it for each run.
Adopting this approach yielded a small speed improvement, with a per-block processing time of ~1 second; however, this is still far too slow to be effective for fuzzing purposes.
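The store-and-reload pattern can be sketched as follows (BeaconState and its contents are placeholders, not Lighthouse's actual types): the expensive state construction runs once per process, and each fuzzing iteration works on a cheap clone.

```rust
use std::sync::OnceLock;

// Placeholder for Lighthouse's real BeaconState; the field is illustrative.
#[derive(Clone)]
struct BeaconState {
    slot: u64,
}

// Build the expensive starting state exactly once and cache it for the
// lifetime of the fuzzing process.
fn cached_state() -> &'static BeaconState {
    static STATE: OnceLock<BeaconState> = OnceLock::new();
    STATE.get_or_init(|| {
        // In practice: build genesis, then advance a couple of epochs.
        BeaconState { slot: 64 }
    })
}

// Each fuzzing iteration clones the cached state instead of rebuilding it.
fn fuzz_iteration(_data: &[u8]) -> u64 {
    let state = cached_state().clone();
    state.slot
}
```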
We also observed that the fuzzer covered only a very low percentage of the code base, due to Lighthouse's cryptographic checks (i.e. ensuring that the block producers had signed the block correctly). An incorrect signature causes per_block_processing to return an error almost immediately, and the probability of the fuzzer guessing a valid signature by chance is negligible.
As signature processing accounts for a significant proportion of block processing time, we leveraged the --features fake_crypto build flag, which skips signature verification (not cryptographically secure, but very fast), yielding a time improvement of almost 15x. Besides significantly speeding up the fuzzer, this also increased code coverage, as the function no longer stops whenever a signature fails to verify.
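A minimal sketch of how such a feature gate behaves (the function name is hypothetical; Lighthouse's actual fake_crypto wiring lives in its crypto code): when compiled with --features fake_crypto, verification short-circuits to success.

```rust
// Hypothetical verification shim: real BLS verification is expensive, so a
// build-time feature flag can replace it with an unconditional success.
fn signature_is_valid(real_check: bool) -> bool {
    if cfg!(feature = "fake_crypto") {
        // Compiled with `--features fake_crypto`: skip verification entirely.
        true
    } else {
        // Normal build: fall through to the real (expensive) check result.
        real_check
    }
}
```

In a normal build the flag is absent, so the real check decides the outcome; fuzzing builds enable the feature to trade soundness for speed and coverage.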
Additionally, the check against the Merkle tree root in process_deposits was removed. The Merkle root check was a barrier to code coverage, as it required the fuzzer to correctly guess the hash of an object, which has a probability of about 1 / 2^256.
Block processing still had a range of different areas that needed to be assessed, such as attestations, attester_slashings, proposer_slashings, exits, and transfers. Because the fuzzer generates pseudo-random blocks, functions processed last, such as transfers, were rarely reached, while block_headers, which is processed first, was fuzzed far more. We therefore segmented the block processing fuzzing into each of these subsections.
Another reason for breaking block processing up into smaller segments was the concept of a corpus. A corpus is used by a fuzzer as a set of starting points which are built upon and mutated to speed up the exploration of new code paths. Using a specific set of corpora, we can provide libFuzzer with valid inputs that we know trigger the execution of certain code blocks. Our initial naive approach used, as its corpus, a valid block containing one of each of the following objects: attestation, slashings, transfers, etc. Providing libFuzzer with an entire block containing each piece of data meant the fuzzer had a much harder job mutating the data to identify the correct code paths. By splitting block processing into smaller functions, such as process_deposits, we were able to provide a single Deposit as the corpus for that specific fuzzing target, which significantly increased the efficiency and coverage of the fuzzer.
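Seeding a per-target corpus can be sketched as follows (the paths and byte contents here are illustrative only; in practice the seed file would contain an SSZ-encoded Deposit): libFuzzer treats each file in a target's corpus directory as one starting input.

```rust
use std::fs;
use std::io;

// Write one seed input into a per-target corpus directory. Keeping a
// separate directory per fuzz target means each target starts from inputs
// of exactly the type it processes (e.g. a single Deposit), rather than a
// whole block.
fn seed_corpus(dir: &str, name: &str, bytes: &[u8]) -> io::Result<()> {
    fs::create_dir_all(dir)?; // one directory per fuzz target
    fs::write(format!("{dir}/{name}"), bytes) // each file is one seed input
}
```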
A blockchain node is constantly sending and receiving network packets to/from other nodes. Received packets need to be decoded and processed (to potentially advance the state, update the peer list, etc.). This represents the largest attack surface of a client, as all other nodes are potentially malicious and may attempt resource exhaustion attacks, leading to denial-of-service conditions. Worse still are packets crafted by malicious actors to cause crashes (panics in Rust). For our fuzzing activities, we have therefore decided to prioritise the Lighthouse functions that decode data contained within packets transmitted over the wire.
Lighthouse uses libp2p for its peer-to-peer networking stack (Sigma Prime develops and maintains its own fork, rust-libp2p). Three major entry points are targeted by our fuzzers:
- discv5: an implementation of the new peer discovery protocol used in Ethereum 2.0;
- gossipsub: a publish/subscribe protocol specifically aimed at reducing networking overhead (nodes only receive packets related to specific shards);
- ENR: an Ethereum Node Record, a URL-safe Base64 representation of an Ethereum node identifier.
While fuzzing our ENR crate, a memory allocation bug was found within one of its external dependencies, asn1_der. The bug arose from the instantiation of an array whose size was derived from an input parameter: by requesting a sufficiently large array, the related memory allocation failed, causing a panic. Please refer to the following issue for further details. This bug was raised on the relevant repositories and was fixed by the maintainers in a timely fashion.
Additionally, our fuzzers identified a minor bounds-checking bug within our discv5 crate. By failing to adequately check the bounds of an incoming slice (array), an out-of-bounds index on the affected slice was accessed, resulting in a panic. Refer to this commit for further details.
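In Rust, this class of bug is commonly fixed by replacing direct slice indexing (which panics when the index is out of bounds) with the non-panicking get accessor. A sketch using a hypothetical packet layout:

```rust
// Hypothetical packet layout: a 32-byte identifier followed by a payload.
// Indexing with `&packet[..32]` would panic on a short packet; `get` returns
// None instead, which we surface as a decode error.
fn split_packet(packet: &[u8]) -> Result<(&[u8], &[u8]), &'static str> {
    let id = packet.get(..32).ok_or("packet too short")?;
    let payload = &packet[32..]; // safe: the check above guarantees len >= 32
    Ok((id, payload))
}
```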
Serialisation Decoding (SSZ) Findings
ssz_decode is a critical function, called whenever serialised data is decoded by beacon nodes. Our fuzzing efforts identified a bug in the decoding of a Bitfield: if a bitfield was passed an empty array with the number of required bits set to 0, a lower-level function would return an error which caused a panic when the result was interpreted. The fix was simply to replace the panic with error propagation, as seen in this commit.
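The shape of the fix can be sketched as follows (types heavily simplified; this is not Lighthouse's actual Bitfield implementation): the lower-level error is propagated with ? where the buggy version effectively turned it into a panic.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    EmptyInput,
}

// Lower-level helper that rejects empty input with an error.
fn bytes_to_bits(bytes: &[u8]) -> Result<Vec<bool>, DecodeError> {
    if bytes.is_empty() {
        return Err(DecodeError::EmptyInput);
    }
    Ok(bytes
        .iter()
        .flat_map(|&b| (0..8).map(move |i| b & (1u8 << i) != 0))
        .collect())
}

// Buggy shape: `bytes_to_bits(bytes).unwrap()` turns the error into a panic.
// Fixed shape: propagate the error to the caller with `?`.
fn decode_bitfield(bytes: &[u8]) -> Result<Vec<bool>, DecodeError> {
    let bits = bytes_to_bits(bytes)?;
    Ok(bits)
}
```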
State Transition Findings
Our state transition and block processing fuzzing efforts (described above) identified an integer underflow vulnerability within the process_transfers function, leading to a software crash (Rust panic). The bug occurred whenever transfer.fee + transfer.amount > balance: the balance check did not incorporate the fee, so a subtraction underflow occurred when debiting the balance.
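The underflow and its fix can be sketched with plain u64 arithmetic (signatures simplified from the actual process_transfers): the total debit must be computed with overflow checking and subtracted with underflow checking, so malformed transfers yield errors rather than panics.

```rust
// Buggy shape: checking only `balance >= amount` and then computing
// `balance - (amount + fee)` underflows (panicking in debug builds)
// whenever amount + fee > balance >= amount.
fn debit_balance(balance: u64, amount: u64, fee: u64) -> Result<u64, &'static str> {
    // Fixed shape: compute the total with overflow checking, then subtract
    // with underflow checking, propagating errors instead of panicking.
    let total = amount.checked_add(fee).ok_or("amount + fee overflows")?;
    balance
        .checked_sub(total)
        .ok_or("insufficient balance for amount + fee")
}
```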
As the fuzzing framework continues to evolve, we are looking at deploying the fuzzers to a dedicated infrastructure such as fuzzit.
There is a range of other fuzzers, such as AFL and Microsoft's struct-aware fuzzer, that we are looking to add to our toolset in order to augment our fuzzing capabilities. Finally, we look forward to integrating Lighthouse into the Eth2 differential fuzzer currently being developed by the Ethereum Foundation. Work in this area has already started, and a few functions (e.g. shuffling) are already supported.