I always thought this should be a trivial task. There are even some Stack Overflow answers on the topic, but there is a catch that none of them mentions.
Originally, tar did not support paths longer than 100 characters. GNU tar added support for longer paths, but it did so through a hack called ././@LongLink. In short, if you come across an entry in a tar archive whose path equals ././@LongLink, the path of the following entry is longer than 100 characters and has been truncated in its header. The full path of that following entry is stored as the content of the current entry. So when extracting files from a tar archive, we must keep this possibility in mind.
require 'rubygems/package'
require 'zlib'
require 'fileutils'

# Name GNU tar gives to the pseudo-entry that carries a long path.
TAR_LONGLINK = '././@LongLink'
tar_gz_archive = '/path/to/archive.tar.gz'
destination = '/where/extract/to'

Gem::Package::TarReader.new(Zlib::GzipReader.open(tar_gz_archive)) do |tar|
  dest = nil
  tar.each do |entry|
    if entry.full_name == TAR_LONGLINK
      # The body of this entry is the full path of the next entry.
      dest = File.join destination, entry.read.strip
      next
    end
    # Use the long path if one was just read, otherwise the header name.
    dest ||= File.join destination, entry.full_name
    if entry.directory?
      FileUtils.rm_rf dest unless File.directory? dest
      FileUtils.mkdir_p dest, :mode => entry.header.mode, :verbose => false
    elsif entry.file?
      FileUtils.rm_rf dest unless File.file? dest
      File.open dest, "wb" do |f|
        f.print entry.read
      end
      FileUtils.chmod entry.header.mode, dest, :verbose => false
    elsif entry.header.typeflag == '2' # Symlink!
      File.symlink entry.header.linkname, dest
    end
    # Reset so a long path never leaks onto a later entry.
    dest = nil
  end
end
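
The long-path rule used in the loop above can be isolated and checked without a real archive. The following is only a minimal sketch: `FakeEntry`, `LONGLINK_MARKER` and `resolve_names` are hypothetical names introduced for illustration, and the entries are plain structs standing in for what a tar reader would yield.

```ruby
# Hypothetical stand-in for a tar entry: a header name plus a body.
FakeEntry = Struct.new(:full_name, :body)

# Same marker value as TAR_LONGLINK in the extraction code.
LONGLINK_MARKER = '././@LongLink'

# Returns the real path of every non-marker entry, in archive order.
def resolve_names(entries)
  pending = nil
  entries.each_with_object([]) do |entry, names|
    if entry.full_name == LONGLINK_MARKER
      # The marker's body is the full path of the NEXT entry.
      # It may be NUL-terminated, so drop NUL bytes.
      pending = entry.body.delete("\0")
      next
    end
    # Prefer the pending long path over the truncated header name.
    names << (pending || entry.full_name)
    pending = nil
  end
end

long_path = File.join('dir', 'a' * 120, 'file.txt')
entries = [
  FakeEntry.new('short.txt', 'hello'),
  FakeEntry.new(LONGLINK_MARKER, long_path + "\0"),
  FakeEntry.new(long_path[0, 100], 'payload') # truncated header name
]

puts resolve_names(entries)
```

The design point is the same as in the extraction code: the pending path is consumed by exactly one following entry and then reset, so later entries fall back to their own header names.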