CVE-2025-4517 (tarfile)
1. Vulnerability Overview
- Vulnerability Type: Path traversal leading to arbitrary file write.
- CVSS Score: Common Vulnerability Scoring System (CVSS) v3.1 score of 9.4 (Critical).
- Affected Component: The tarfile module in the Python standard library.
- Core Impact: When a program extracts an untrusted .tar archive, an attacker can use a crafted archive to write files to arbitrary paths outside the target extraction directory. This can lead to sensitive files being overwritten, unauthorized code execution, or complete system compromise.
2. Technical Details
In the Python tarfile module, developers typically call the tarfile.TarFile.extractall() or tarfile.TarFile.extract() methods to extract archives.
To make extraction safer, Python introduced the filter parameter, which is meant to reject or sanitize dangerous members such as symbolic links that point outside the destination. However, CVE-2025-4517 stems from a flaw in this filtering mechanism itself: when filter is set to "data" or "tar", the tarfile module does not correctly validate and normalize the target paths of symbolic links and hard links.
Underlying logic flaw: This is essentially a mismatch between path validation and path realization. The filter validates a path computed in Python (for example with os.path.realpath()), and under edge conditions such as the PATH_MAX limit that computed path can differ from the location the operating system actually resolves at write time. A malicious .tar file whose entries point to locations such as ../../../../etc/passwd or other critical system paths can therefore slip past the "data" filter.
As a result, files that should be confined to the extraction directory escape it and are written to arbitrary paths elsewhere on the filesystem.
3. Affected Versions and Triggers
High-Risk Triggers: Code is at risk when both of the following conditions are met:
- It extracts tar archives from an external, untrusted source (such as user uploads or resource packages downloaded from the network).
- It calls an extraction method with one of the affected filters: tarfile.TarFile.extractall(filter="data") or tarfile.TarFile.extract(member, filter="data").
Special Note: In Python 3.14 and later, the default value of filter changed from no filtering to "data". Code that relies on the new default is therefore affected even when it does not specify a filter explicitly.
Fixed Python Versions: The Python Software Foundation has fixed this issue in the following releases:
- Python 3.13.4
- Python 3.12.11
- Python 3.11.13
- Python 3.10.18
- Python 3.9.23
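A quick way to audit an interpreter against this list is to compare sys.version_info with the first fixed micro release of its series. The sketch below is only that comparison: the table is copied from the list above, the function name has_listed_fix is hypothetical, and a series that is not listed yields None rather than a guess. Distribution packages sometimes backport fixes without changing the version number, so a False result is a prompt to check the vendor advisory, not proof of exposure.

```python
import sys

# First fixed micro release per release series, copied from the list above.
FIXED_MICRO = {(3, 9): 23, (3, 10): 18, (3, 11): 13, (3, 12): 11, (3, 13): 4}


def has_listed_fix(version_info=sys.version_info):
    """Return True or False for a listed series, None for an unlisted one."""
    series = (version_info[0], version_info[1])
    fixed_micro = FIXED_MICRO.get(series)
    if fixed_micro is None:
        # Series not covered by the list: the answer is unknown here.
        return None
    return version_info[2] >= fixed_micro


if __name__ == "__main__":
    print(sys.version.split()[0], has_listed_fix())
```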
4. Practical Exploitation Scenarios
From the perspective of red team operations or penetration testing, this vulnerability is well suited to gaining initial access because the barrier to exploitation is low and no user interaction is required:
- Continuous Integration/Continuous Deployment (CI/CD) Pipelines: CI/CD runners frequently pull and extract build artifacts from registries or caches. A red team can plant a poisoned .tar archive that, when extracted, overwrites the runner's SSH keys or environment variable configuration files.
- Machine Learning Pipelines: External pre-trained model weights are usually packaged as .tar or .tar.gz files. If automation scripts trust these files by default, extraction can be abused to overwrite the model loader's code and alter its execution logic.
- Automated Sandboxes and Plugin Ecosystems: When unknown files are automatically extracted and analyzed, or third-party plugins are installed, an attacker can break out of the sandbox and take control of the host machine.
5. Mitigation and Workarounds
- Upgrade: The most direct and thorough fix is to upgrade Python in every affected environment to the corresponding patched release listed above.
- Temporary Code-level Mitigations:
  - Zero Trust Principle: Until Python has been updated, never blindly extract .tar archives from unknown sources.
  - Strict Path Validation: If extraction is necessary, do not rely on the protection provided by filter="data" alone. Implement your own strict path validation before calling the extraction method. For example, iterate over the objects returned by tarfile.TarFile.getmembers() and check whether tarinfo.name contains ../, or verify that the resulting absolute path still lies strictly inside the expected destination folder.
  - Run with Reduced Permissions: Run the Python process that performs extraction under the principle of least privilege. Even if an out-of-bounds write occurs, its impact is limited to what that low-privilege user can modify.