Extracting files from a malicious tarball without validating that the destination file path is within the destination directory using shutil.unpack_archive() can cause files outside the destination directory to be overwritten, due to the possible presence of directory traversal elements (..) in archive path names.

Tarball contain archive entries representing each file in the archive. These entries include a file path for the entry, but these file paths are not restricted and may contain unexpected special elements such as the directory traversal element (..). If these file paths are used to determine an output file to write the contents of the archive item to, then the file may be written to an unexpected location. This can result in sensitive information being revealed or deleted, or an attacker being able to influence behavior by modifying unexpected files.

For example, if a tarball contains a file entry ../sneaky-file.txt, and the tarball is extracted to the directory /tmp/tmp123, then naively combining the paths would result in an output file path of /tmp/tmp123/../sneaky-file.txt, which would cause the file to be written to /tmp/.

Ensure that output paths constructed from tarball entries are validated to prevent writing files to unexpected locations.

Consider using a safer module, such as: zipfile

In this example an archive is extracted without validating file paths.

To fix this vulnerability, we need to call the function tarfile.extract() on each member after verifying that it does not contain either .. or startswith /.

  • Shutil official documentation shutil.unpack_archive() warning.