Recording audio with on-the-fly compression

Audio editors such as Audacity usually allow recording sound and saving it into compressed formats such as MP3 and Ogg Vorbis. However, for very long recording times, using graphical applications may not be convenient. Command line tools make it possible to record audio and compress it on the fly, without storing large amounts of uncompressed sound data in temporary files.

Probably the most versatile command-line audio tool is sox, which can not only record and convert audio files but also apply transformations and sound effects to streams of sound samples. Depending on how it was compiled, sox may or may not be able to compress audio by itself. Either way, it is reasonable to use an external program for compression, since tools such as lame and oggenc offer more control over how the compression is performed than sox does.

Below is a sample script for recording an audio stream and compressing it on the fly to Ogg Vorbis. Of course, it is also possible to compress the same stream into multiple formats by splitting the pipe into two separate pipelines using tpipe. Since the compression must be performed in real time, the machine should have sufficient computing power to handle it.

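#!/bin/bash
if [ $# != 1 ]; then
	echo "Usage: $0 file_name.ogg"
	exit 1
fi
file_name="$1"
echo "Recording audio to OGG file $file_name"
echo "Press Ctrl+C to end recording"
cleanup() {
	echo -e "Done\n"
}
trap cleanup INT
(
	# Ignore SIGINT here; sox still reacts to it, assuming it installs its own handler.
	trap "" INT
	# -w is the older sox spelling of 16-bit samples; recent versions may need -b 16 instead.
	exec rec -t raw -c 2 -w -r 44100 - | \
		oggenc -r -B 16 -C 2 -R 44100 -q 5 -o "$file_name" - ;
)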
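
The script above writes a single Ogg Vorbis file. The multi-format variant mentioned earlier can be sketched in a similar way. The example below is only an illustration: it uses tee and a named pipe in place of tpipe, and split_stream is a hypothetical helper name, not part of any of the tools discussed here.

```shell
#!/bin/sh
# Hypothetical helper: send one input stream to two consumer commands at once.
# Usage: producer | split_stream 'consumer_a ...' 'consumer_b ...'
split_stream() {
	dir=$(mktemp -d) || return 1
	mkfifo "$dir/branch" || { rm -rf "$dir"; return 1; }
	# The first consumer reads its copy of the stream from the named pipe.
	sh -c "$1" <"$dir/branch" &
	branch_pid=$!
	# tee duplicates stdin: one copy goes into the named pipe,
	# the other goes down the ordinary pipe to the second consumer.
	tee "$dir/branch" | sh -c "$2"
	# Wait for the first consumer so that both outputs are complete on return.
	wait "$branch_pid"
	rm -rf "$dir"
}
```

For recording, the two consumer strings would be the encoder command lines, for instance the oggenc invocation from the script as one and a lame invocation reading raw samples from standard input as the other. The exact lame options for raw input should be checked against its manual.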

Trapping SIGINT in the script ensures that the signal (triggered when the user presses Ctrl+C) is handled by sox (to which rec is a frontend) rather than by oggenc. If oggenc, the reading end of the pipe, is terminated by SIGINT, sox, the writing end, receives SIGPIPE and terminates as well. The end-of-stream marker is then not written to the Ogg file. This is not a big problem, as most applications can still use files without the marker, but the length of the stream, for example, cannot be determined easily when it is missing. However, when only sox acts on the signal, it stops recording and closes the pipe, oggenc reaches the end of its input, and the file is written to disk with all the data intact.
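
The mechanism can be tried in isolation. The sketch below relies on the POSIX rule that a signal ignored at the time of exec stays ignored in the new program unless that program installs its own handler. It assumes this is what lets oggenc survive Ctrl+C while sox still reacts to it. run_ignoring_int is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command with SIGINT ignored.
# The function body is a subshell "( ... )", so the trap does not leak
# into the calling shell.
run_ignoring_int() (
	trap "" INT
	exec "$@"
)

# Demo: the inner shell sends SIGINT to itself and then tries to print.
# Because SIGINT was ignored before exec, the signal has no effect
# and the echo still runs.
run_ignoring_int sh -c 'kill -INT $$; echo survived'
```

Run without the helper, the same inner command would be killed by the signal before reaching the echo.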

The technique outlined above makes it possible to create multi-hour recordings and compress them on the fly. Opening such a big file in a graphical sound editing application can, however, be quite slow or even impossible. With sox's trim effect, a chosen part of the recording can be extracted to a separate file to allow for fast editing.
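
As a minimal sketch of such an extraction, the hypothetical helper below wraps a sox call that uses the trim effect with a start position and a length. It assumes the installed sox can read the recording's format, which for Ogg Vorbis depends on how sox was built. The SOX variable is an addition made here to allow a dry run and is not a sox feature.

```shell
#!/bin/sh
# Hypothetical helper: copy part of a long recording into a separate file.
# Usage: extract_part INPUT OUTPUT START LENGTH
# START and LENGTH are sox time values, e.g. 1:00:00 (hh:mm:ss) or 90 (seconds).
# Setting SOX=echo prints the arguments instead of running sox (dry run).
extract_part() {
	"${SOX:-sox}" "$1" "$2" trim "$3" "$4"
}
```

For example, extract_part recording.ogg excerpt.wav 1:00:00 5:00 would copy five minutes of audio starting one hour into the recording.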

From the homepage of Michał Kosmulski,
Unless indicated otherwise, all content © 2004-2016 Michał Kosmulski. All rights reserved.