CVE-2025-4517 (tarfile)

1. Vulnerability Overview

Vulnerability Type: Path Traversal leading to Arbitrary File Write.
CVSS Score: 9.4 (Critical) under the Common Vulnerability Scoring System (CVSS) v3.1.
Affected Component: The tarfile module in the Python standard library.
Core Impact: When a program extracts an untrusted .tar archive, an attacker can use a specially crafted archive to write files to arbitrary paths outside the target extraction directory. This can lead to overwritten sensitive files, unauthorized code execution, or even complete system compromise.

2. Technical Details

In Python’s tarfile module, developers typically extract files with the methods tarfile.TarFile.extractall() or tarfile.TarFile.extract().

To enhance extraction security, Python introduced a filter parameter mechanism (e.g., to filter out dangerous symbolic links). However, the core issue of CVE-2025-4517 lies precisely in a design flaw within this filtering mechanism: when developers set the filter parameter to "data" or "tar", the tarfile module fails to properly validate and normalize the target paths of symbolic links or hard links.

Underlying Logic Flaw: This is fundamentally a mismatch between path validation and path realization. The filter validates a path by resolving it with os.path.realpath(), but when that resolution runs into the PATH_MAX limit, its result can differ from the path the operating system actually follows at write time. A malicious .tar file can exploit this with crafted link members that ultimately lead to ../../../../etc/passwd or other critical system locations, and the "data" filter of tarfile fails to intercept them.

As a result, files that were supposed to be confined within a secure extraction directory breach the boundary and are written across directories to absolute system paths.
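
The kind of containment check involved can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual tarfile filter code, and the helper name resolves_inside is hypothetical. It resolves the candidate path with os.path.realpath() and compares it with the resolved destination. The check is only as reliable as the resolution step: if realpath() returns something other than what the operating system follows at write time, validation passes while the write escapes.

```python
import os
import tempfile


def resolves_inside(dest: str, member_name: str) -> bool:
    """Return True if dest/member_name resolves to a path inside dest.

    Hypothetical helper for illustration only; it mirrors the general idea
    of a containment check, not the real tarfile implementation.
    """
    dest_real = os.path.realpath(dest)
    target_real = os.path.realpath(os.path.join(dest_real, member_name))
    # commonpath() equals dest_real only when target_real is dest_real
    # itself or lies somewhere beneath it.
    return os.path.commonpath([dest_real, target_real]) == dest_real


with tempfile.TemporaryDirectory() as dest:
    print(resolves_inside(dest, "docs/readme.txt"))  # stays inside dest
    print(resolves_inside(dest, "../outside.txt"))   # climbs out of dest
```

A plain ../ name like the second one is easy to catch. The vulnerability concerns the harder case where the resolution step itself gives a misleading answer.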

3. Affected Versions and High-Risk Triggers

High-Risk Triggers: The code is vulnerable if it meets both of the following conditions:

- It extracts tar archives from untrusted external sources (e.g., user uploads or resource packages downloaded over the network).
- It calls an extraction method with one of the affected filters, "data" or "tar": for example, tarfile.TarFile.extractall(filter="data") or tarfile.TarFile.extract(filter="data").

Special Note: In Python 3.14 and later versions, the default value of filter has been changed from "no filtering" to "data". This means that even if developers do not explicitly specify a filter, code relying on the new default behavior will be automatically affected.
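
To locate call sites that match these triggers, a codebase can be scanned statically. The following is a rough sketch using the standard ast module; the function name find_tar_extract_calls and the placeholder strings "<default>" and "<dynamic>" are assumptions made for this example. It matches any method named extract or extractall, so it will also report unrelated objects (such as zipfile archives) and needs manual review.

```python
import ast


def find_tar_extract_calls(source: str):
    """Yield (line, method, filter_value) for .extract()/.extractall() calls.

    filter_value is the string literal passed as filter=, "<default>" when
    the keyword is absent, or "<dynamic>" when it is not a string literal.
    """
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not isinstance(func, ast.Attribute):
            continue
        if func.attr not in ("extract", "extractall"):
            continue
        value = "<default>"
        for kw in node.keywords:
            if kw.arg == "filter":
                is_str = isinstance(kw.value, ast.Constant) and isinstance(
                    kw.value.value, str
                )
                value = kw.value.value if is_str else "<dynamic>"
        yield node.lineno, func.attr, value


sample = '''
import tarfile
with tarfile.open("upload.tar") as tar:
    tar.extractall("out", filter="data")
    tar.extract("member", "out")
'''
for line, method, value in sorted(find_tar_extract_calls(sample)):
    print(line, method, value)
```

Calls reported with "data" or "tar" match the second trigger directly. Calls reported as "<default>" deserve the same attention on Python 3.14 and later, where the default is "data".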

Python Versions with Published Fixes: The Python Software Foundation has fixed this issue in the following updated versions:

Python 3.13.4
Python 3.12.11
Python 3.11.13
Python 3.10.18
Python 3.9.23
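
A quick way to compare a running interpreter against this list is sketched below. The function name has_published_fix is hypothetical, and the table contains only the versions listed above; for any release branch not in that list the function returns None rather than guessing.

```python
import sys

# First patched micro version per release branch, taken from the list above.
FIRST_FIXED = {(3, 9): 23, (3, 10): 18, (3, 11): 13, (3, 12): 11, (3, 13): 4}


def has_published_fix(version=None):
    """Return True/False for listed branches, None if the branch is not listed."""
    major, minor, micro = (version or sys.version_info)[:3]
    fixed_micro = FIRST_FIXED.get((major, minor))
    if fixed_micro is None:
        return None  # branch not covered by the list above
    return micro >= fixed_micro


print(has_published_fix())  # result for the interpreter running this script
```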

4. Exploitation Scenarios

From a Red Teaming or Penetration Testing perspective, due to its low attack barrier and lack of required user interaction, this vulnerability is highly suitable as a springboard for gaining Initial Access:

- Continuous Integration / Continuous Deployment Pipelines (CI/CD): CI/CD runners frequently pull and unpack build artifacts from image registries or caches. A red team can plant a malicious .tar archive that, upon extraction, directly overwrites the runner’s SSH keys or environment variable configuration files.
- Machine Learning Pipelines (ML): External pre-trained model weights are usually packaged as .tar or .tar.gz. If automated processing scripts implicitly trust these files, extraction can trigger the vulnerability and overwrite the code that the model loader executes.
- Automated Sandboxes / Plugin Ecosystems: When a system automatically extracts and analyzes unknown files or installs third-party plugins, an archive can write outside the sandbox directory and hand the attacker control of the host machine.

5. Mitigation and Workarounds

Upgrade: The most direct and thorough solution is to upgrade Python in every affected environment to the patched release for its branch, as listed above.
Code-level Workarounds:
- Zero Trust Principle: Until Python is updated, never blindly unpack .tar packages from untrusted sources.
- Strict Path Validation: If unpacking is unavoidable, do not rely solely on the protection of filter="data". Before calling the extraction method, apply your own strict path validation. For example, iterate over the objects returned by tarfile.TarFile.getmembers(), reject any tarinfo.name that is absolute or contains ../, check the target (tarinfo.linkname) of symbolic and hard links or refuse link members altogether, and verify that the resolved absolute path remains strictly within the expected extraction folder.
- Run with Reduced Privileges: Run the Python process executing the unpacking task under the principle of least privilege. Even if unauthorized writes occur, the impact can be contained within a specific low-privilege user scope.
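
The path validation described above might look like the following sketch. It is a conservative illustration, not an official fix: the helper names check_members and make_archive are hypothetical, link members are refused outright instead of being resolved, and the destination directory is assumed to be fresh, with no attacker-controlled links already inside it.

```python
import io
import os
import tarfile


def check_members(tar: tarfile.TarFile, dest: str) -> list:
    """Return the members of tar if all look safe for dest, else raise ValueError.

    Hypothetical helper: refuses every symlink and hard link rather than
    trying to resolve them, which is stricter than the built-in filters.
    """
    dest_real = os.path.realpath(dest)
    members = tar.getmembers()
    for member in members:
        if member.issym() or member.islnk():
            raise ValueError(f"link member refused: {member.name!r}")
        if not (member.isfile() or member.isdir()):
            raise ValueError(f"special member refused: {member.name!r}")
        target = os.path.realpath(os.path.join(dest_real, member.name))
        if os.path.commonpath([dest_real, target]) != dest_real:
            raise ValueError(f"member escapes destination: {member.name!r}")
    return members


def make_archive(names):
    """Build an in-memory tar of empty regular files (demo fixture only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name), io.BytesIO(b""))
    buf.seek(0)
    return buf


with tarfile.open(fileobj=make_archive(["ok.txt", "../evil.txt"])) as tar:
    try:
        check_members(tar, "unpack_dir")
    except ValueError as exc:
        print("rejected:", exc)
```

In real use, the returned list would be passed on, for example tar.extractall(dest, members=check_members(tar, dest)), so that nothing is written unless every member passed the check.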
