You are affected if:
- You run zebrad up to and including v4.4.1.
- network.listen_addr is set (which is the default).

All default configurations are affected.
The mempool download pipeline's cancel_handles map retains entries for transactions whose verification times out at the outer RATE_LIMIT_DELAY (73-second) boundary. The tokio::time::error::Elapsed error carries no payload, so the transaction ID is unrecoverable and the corresponding cancel_handles entry (including the full Gossip::Tx(UnminedTx), up to ~2 MB) is never removed. Entries accumulate monotonically with no upper bound or garbage collection, leading to eventual out-of-memory process termination.
Downloads::poll_next() at zebrad/src/components/mempool/downloads.rs:215-228 handles three terminal states for a verification task:
- Ok(Ok(...)): success. Calls cancel_handles.remove(&tx.transaction.id). Correct.
- Ok(Err(...)): verification error. Calls cancel_handles.remove(&hash). Correct.
- Err(elapsed): outer timeout. Returns Err(elapsed) without removing anything. Bug.

tokio::time::error::Elapsed has no payload, so the timed-out transaction's UnminedTxId is unrecoverable from the error. The consumer at zebrad/src/components/mempool.rs:663-672 explicitly acknowledges this gap with a TODO comment.
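The three terminal states can be modelled with a minimal, std-only sketch. This is not the zebrad code: `TxId`, `Elapsed`, `VerifyError`, `queue` and `on_task_finished` are hypothetical stand-ins for `UnminedTxId`, `tokio::time::error::Elapsed`, the verification error, and the relevant part of `Downloads::poll_next()`. It only illustrates why a payload-free timeout error leaves the map entry behind.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for zebrad's `UnminedTxId`.
type TxId = u64;

/// Hypothetical stand-in for `tokio::time::error::Elapsed`: it carries no payload.
#[derive(Debug, PartialEq)]
struct Elapsed;

/// Simplified verification error that still knows which transaction failed.
#[derive(Debug, PartialEq)]
struct VerifyError {
    txid: TxId,
}

/// Outer `Result` models the timeout wrapper, inner `Result` the verification outcome.
type TaskResult = Result<Result<TxId, VerifyError>, Elapsed>;

struct Downloads {
    /// txid -> retained payload (stands in for the cancel handle plus `Gossip::Tx`).
    cancel_handles: HashMap<TxId, Vec<u8>>,
}

impl Downloads {
    /// Registers a transaction when its download/verify task is spawned.
    fn queue(&mut self, txid: TxId, payload: Vec<u8>) {
        self.cancel_handles.insert(txid, payload);
    }

    /// Models the terminal-state handling described above.
    fn on_task_finished(&mut self, result: TaskResult) {
        match result {
            // Success: the id is known, so the entry is removed.
            Ok(Ok(txid)) => {
                self.cancel_handles.remove(&txid);
            }
            // Verification error: the id is known, so the entry is removed.
            Ok(Err(err)) => {
                self.cancel_handles.remove(&err.txid);
            }
            // Timeout: there is no id to key the removal on, so the entry stays.
            Err(Elapsed) => {}
        }
    }
}
```

Feeding one success, one verification error and one timeout through `on_task_finished` leaves exactly one entry in `cancel_handles`: the one whose task timed out.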
The only cleanup paths for cancel_handles are cancel(mined_ids) (removes entries matching mined transaction IDs; attacker transactions are never mined) and cancel_all() (clears everything on shutdown or chain reset). No periodic GC, no time-based eviction, and no count cap exists.
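As a rough illustration of why these two paths do not help, the sketch below models them on a plain map. The method names mirror `cancel` and `cancel_all` from the text, but the bodies, the `TxId` alias and the `retained_bytes` helper are assumptions made for the example, not the zebrad implementation.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for zebrad's `UnminedTxId`.
type TxId = u64;

struct Downloads {
    /// txid -> retained payload (stands in for the cancel handle plus `Gossip::Tx`).
    cancel_handles: HashMap<TxId, Vec<u8>>,
}

impl Downloads {
    /// Removes only the entries whose ids were mined in a new block.
    fn cancel(&mut self, mined_ids: &HashSet<TxId>) {
        self.cancel_handles
            .retain(|txid, _| !mined_ids.contains(txid));
    }

    /// Clears everything, as on shutdown or chain reset.
    fn cancel_all(&mut self) {
        self.cancel_handles.clear();
    }

    /// Hypothetical helper: total payload bytes currently retained.
    fn retained_bytes(&self) -> usize {
        self.cancel_handles.values().map(Vec::len).sum()
    }
}
```

In this model, entries for transactions that never appear in `mined_ids` survive every call to `cancel`, so `retained_bytes` shrinks only when `cancel_all` runs.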
For direct tx pushes (Gossip::Tx), the retained entry holds the full deserialized transaction, which can be up to ~9 MB in memory for a transaction near the transparent-output extreme. Per-connection leak rate at worst case: ~685 KB/s (~2.4 GB/hour).
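The rate figures can be checked with a few lines of arithmetic. In the sketch below the 685 KB/s constant comes from the text; decimal units, the free-memory values, and the idea that several connections add up linearly are assumptions made purely for illustration.

```rust
/// Worst-case per-connection leak rate quoted above, in KB/s.
const LEAK_RATE_KB_PER_SEC: f64 = 685.0;

/// Converts a KB/s rate into decimal GB per hour.
fn gb_per_hour(rate_kb_per_sec: f64) -> f64 {
    rate_kb_per_sec * 3600.0 / 1_000_000.0
}

/// Hours until `free_memory_gb` is consumed, assuming (hypothetically) that
/// `connections` attacker connections each leak at the worst-case rate.
fn hours_to_exhaust(free_memory_gb: f64, connections: u32) -> f64 {
    free_memory_gb / (gb_per_hour(LEAK_RATE_KB_PER_SEC) * f64::from(connections))
}
```

`gb_per_hour(LEAK_RATE_KB_PER_SEC)` evaluates to about 2.47 GB per hour in decimal units, close to the quoted ~2.4 GB/hour. With an assumed 16 GB of free memory and a single connection, `hours_to_exhaust(16.0, 1)` comes out at roughly six and a half hours.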
The fix preserves the UnminedTxId through the timeout error path: wrap the timeout future so the spawned task's outer error carries the txid (e.g., Err((txid, elapsed))). In Downloads::poll_next(), on the timeout arm, call cancel_handles.remove(&txid).
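A minimal, std-only sketch of that idea follows. It is not the actual patch: `TxId`, `Elapsed`, `VerifyError`, `tag_timeout` and `on_task_finished` are hypothetical names standing in for `UnminedTxId`, `tokio::time::error::Elapsed`, the verification error, the error-mapping wrapper, and the relevant part of `Downloads::poll_next()`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for zebrad's `UnminedTxId`.
type TxId = u64;

/// Hypothetical stand-in for `tokio::time::error::Elapsed`: it carries no payload.
#[derive(Debug, PartialEq)]
struct Elapsed;

/// Simplified verification error that knows which transaction failed.
#[derive(Debug, PartialEq)]
struct VerifyError {
    txid: TxId,
}

/// After the fix, the outer error carries the txid next to the timeout marker.
type TaskResult = Result<Result<TxId, VerifyError>, (TxId, Elapsed)>;

/// Attaches the txid known at spawn time to a payload-free timeout error.
fn tag_timeout<T>(txid: TxId, result: Result<T, Elapsed>) -> Result<T, (TxId, Elapsed)> {
    result.map_err(|elapsed| (txid, elapsed))
}

struct Downloads {
    /// txid -> retained payload (stands in for the cancel handle plus `Gossip::Tx`).
    cancel_handles: HashMap<TxId, Vec<u8>>,
}

impl Downloads {
    /// Every terminal state now yields a txid, so the entry is always removed.
    fn on_task_finished(&mut self, result: TaskResult) -> Option<Vec<u8>> {
        let txid = match &result {
            Ok(Ok(txid)) => *txid,
            Ok(Err(err)) => err.txid,
            Err((txid, _elapsed)) => *txid,
        };
        self.cancel_handles.remove(&txid)
    }
}
```

In the real pipeline the same mapping would presumably be applied to the result of the timeout future inside the spawned task, where the txid is still in scope.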
There is no configuration-level workaround. Restarting the node clears the accumulated entries. Operators running in memory-constrained environments (containers with cgroup limits) may see the process killed by the OOM killer before a restart or chain reset clears the map.
Gradual, unbounded memory exhaustion of a Zebra node from unauthenticated P2P traffic. The leak is monotonic (entries are never freed under normal operation) but slow (~685 KB/s per connection worst case). An attacker must sustain traffic for hours to exhaust typical server memory. The node continues operating normally until memory pressure becomes critical, at which point the OS OOM killer terminates the process or the node degrades due to swap pressure. No consensus impact, no fund loss, no on-disk corruption.
Reported by @AnticsDecoded via a private GitHub Security Advisory submission. The report included a working end-to-end reproduction on a live regtest node with staged parent/child transaction dependencies.
| Software | From | Fixed in |
|---|---|---|
| zebrad | - | 4.5.0 |