Skip to content

Splits stdin onto chunk files and writes resulting TAR archive to stdout

License

Notifications You must be signed in to change notification settings

ihoro/tar-stream-chunker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

70 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

tar-stream-chunker

badge badge tar stream chunker.c tar stream chunker.c tar stream chunker.c FreeBSD port informational?logo=freebsd

Splits stdin onto chunk files and writes resulting TAR archive to stdout.

It can help to handle data stream of unknown size without hitting a disk. As long as creating TAR archive as a stream needs to know file size in advance, that is why the idea of the chunker is to split input stream onto chunks with known size.

Initial motivation was to use it for backup streaming to tarsnap.com.

Build from sources

Expecting you have a POSIX compliant system:

$ make

Test suite

End-to-end test suite requires node.js:

$ make e2e

FreeBSD port/package

  • pkg install tar-stream-chunker

  • or cd /usr/ports/archivers/tar-stream-chunker && make install clean

Usage

$ tar_stream_chunker { --file-name | -f } <file-name> { --chunk-size | -s } <bytes>

And redirect its stdout to your target and that’s it.

Usage examples

PostgreSQL dump what is streamed to tarsnap:

$ pg_dump | tar_stream_chunker -f dump.sql -s 500000000 | tarsnap -cvf dump.daily.$(date +%Y%m%d.%H%M%S) @-

Let’s see what’s inside a resulting TAR:

$ cat 3MB-file | tar_stream_chunker --file-name file --chunk-size 1000000 > result.tar
$ tar -tf result.tar
file.00001
file.00002
file.00003

And original input can be re-assembled as follows:

$ tar -xf result.tar; cat file.* > original-3MB-file
$ tar -xOf result.tar > original-3MB-file