alsa-capabilities shows digital audio formats and sample rates alsa supports with your USB DAC

Alsa-capabilities shows which digital audio formats your USB DA-converter supports

Have you ever wondered which digital audio format your Linux and alsa based music computer, for example one that runs Music Player Daemon in bit perfect / audiophile mode, is actually sending to your sound card or USB DAC when playing a high resolution digital audio file? Especially when your external DAC doesn’t indicate the sample rate or resolution of the incoming stream with LEDs or a display? Or maybe you are curious about the digital audio formats and sample rates your soundcard or USB DAC handles natively? This article and the accompanying script try to assist you with that daunting task.

The alsa-capabilities script –which can be executed on any computer running Linux and alsa– will show the available alsa interfaces for audio playback, and the digital audio formats each sound card or external USB DAC supports.

Instructions for running it straight from the web

Open a terminal screen (see “Opening a terminal screen” below) on the computer connected to your DAC and copy-and-paste or type the line below in the terminal screen, followed by pressing ENTER:

bash <(wget -q -O - "")

That’s it!

This will display a list of each alsa audio output interface with its details. When an interface is in use by another program, the script displays the name and process identifier (pid) of that program, so you may examine, stop or kill it, and run the script again.

The script also supports some options. For example, to display the sample rates for each encoding format supported by each USB Audio Class audio interface on your computer, run the script together with the `-l usb` (show only usb interfaces) and the `-s` (show sample rates) options:

bash <(wget -q -O - "") -l usb -s

This will output something like:

0) USB Audio Class Digital alsa audio output interface `hw:1,0'
 - device name       = Pink Faun USB 32/384 USB receiv                             
 - interface name    = USB Audio                                                   
 - usb audio class   = 2 - isochronous asynchronous                                
 - character device  = /dev/snd/pcmC1D0p                                           
 - rates per format  = S32_LE            : 32000Hz 44100Hz 48000Hz 88200Hz 96000Hz 176400Hz 192000Hz 352800Hz 384000Hz
 - monitor file      = /proc/asound/card1/stream0 

Downloading the script locally

Once the script is saved on your computer as `alsa-capabilities`, you can run it from the directory where you stored it:

bash alsa-capabilities

Options can simply be added to the command line like in the following example, which limits the output to only devices which support USB Audio Class 1 or 2 (using `-l usb`) while adding the listing of supported sample rates for each supported encoding format (the `-s` option):

bash alsa-capabilities -l usb -s

To display all options run the script with the `-h` option:

bash alsa-capabilities -h

Opening a terminal screen

For these tasks you need to start a terminal screen on your desktop computer. Linux users may press and hold CTRL+ALT while typing T from within their desktop environment. Both users of Linux and Mac desktops may search for the text "Terminal" in their applications menu.

Windows users can use putty to perform the step below, filling in the appropriate values for username and network address in the connection screen of putty.

When your music computer is remote, you should first make an ssh connection to that remote computer. For such a connection, you need to know the following:

  • the username and password of a user account on the remote computer, as configured by you or instructed by your manufacturer, and
  • the network address of the music computer, in the form of an ip address or a hostname (e.g. vortexbox).

To make an ssh connection to the Linux based audio computer, first open a terminal screen, and copy-and-paste or type the line below, followed by the ENTER key:

ssh ${username}@${networkaddress}

… and fill in the proper password for the "${username}"-user when asked for, for example:

ssh root@vortexbox.local

In this remote terminal screen, you may enter any command like you would do on your local computer, including the commands in the instructions above.

Watching an interface react to different audio formats

To see how an audio interface reacts when you play different digital audio formats, you can use the output of the script with the watch command. Replace `${monitor_file}` below with the name of the monitor file displayed by the script:

LANG=C watch -n 0.1 cat ${monitor_file}

Now try throwing audio files of different formats at your player and see what your digital interface makes of it.

Automated usage in scripts etc.

The script supports some features which could be handy in other scripts, like limiting the output to certain classes of audio interfaces, or interfaces with a certain name. To do this, you may use the limit option '-l' with an argument, either 'a' or 'analog', 'd' or 'digital', or 'u', 'usb' or 'uac', to only show interfaces fitting that limit. In addition, a custom regular expression filter may be specified as an argument for the '-c' option. To list only interfaces that support USB Audio Class you should execute:

bash alsa-capabilities -l usb

Furthermore the script can be sourced. That way one may automate and store certain properties for use in other scripts or config files, like mpd-configure does. Here’s a rough example.

# source the local script if it's available in the current directory,
# or else straight from the upstream url
source alsa-capabilities 2>/dev/null || \
   source <(wget -q -O - "")
# run the function return_alsa_interface to fill up the arrays
[[ $? -eq 0 ]] && return_alsa_interface -l usb
# store the first item of one of the arrays in your own variable
# (assumption: the array name used here; verify it with the grep command
# shown further on)
myvar="${ALSA_AIF_HWADDRESSES[0]}"
# do something with it
echo "${myvar}"
# display the corresponding character device: `/dev/snd/pcmCxDyp`:
echo "${ALSA_AIF_CHARDEVS[0]}"

To see all properties that can be accessed this way, see the script's source or grep for the following:

grep 'declare -a ALSA_AIF' alsa-capabilities  | awk '{ print $3}'

The working of the script explained

To detect which digital interfaces your computer has, the script filters the output of the command 'aplay -l' to list interfaces which have one of the words "usb", "digital", "hdmi", "i2s", "spdif", "toslink" or "adat" in them. However, before it does that, it temporarily pauses pulseaudio, which would otherwise block the interface exclusively. It then reformats the output of the "aplay" command to show a clear listing of each alsa interface, consisting of its "hw:X,Y" hardware address, its human readable name, the character device it uses, the digital formats it supports natively, and, in case of a USB Audio Class device, the class (1 or 2) and its stream file.
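The filtering step described above can be sketched roughly as follows. This is not the code from alsa-capabilities itself, only an illustration: the function name `list_digital_interfaces` is made up, and the pattern assumes the usual `card N: ..., device M: name [name]` lines that `aplay -l` prints.

```shell
#!/usr/bin/env bash
# Sketch: pick digital playback interfaces out of `aplay -l` style output.
# The keyword list is taken from the text; the function name is hypothetical.

list_digital_interfaces() {
    # Reads `aplay -l` output on stdin and prints one "hw:X,Y name" line
    # for every interface whose description contains one of the keywords.
    grep -iE 'usb|digital|hdmi|i2s|spdif|toslink|adat' |
        sed -nE 's/^card ([0-9]+): .*, device ([0-9]+): (.*) \[.*$/hw:\1,\2 \3/p'
}

# Usage on a real system (LANG=C keeps the words "card" and "device"
# untranslated so the pattern keeps matching):
#   LANG=C aplay -l | list_digital_interfaces
```

Because the function only reads from standard input, it can be tried on saved `aplay -l` output from another machine as well.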

After an interface is selected, either by the script (in case of single interface) or you (in the case of multiple interfaces), the script will play random noise to the interface, in order to force alsa to display the native digital formats the interface accepts. To keep this test silent, the sound output of the interface is redirected to `/dev/null`.
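As an illustration of that trick: when `aplay` is asked for a sample format the interface refuses, it prints an "Available formats:" list on its error output. The sketch below only shows how such a list could be parsed; the function name is hypothetical and the way the real script invokes `aplay` may well differ.

```shell
#!/usr/bin/env bash
# Sketch: extract the format names from aplay's "Available formats:" listing.

parse_available_formats() {
    # Reads aplay's error output on stdin, prints one format name per line.
    awk '/^Available formats:/ { listing = 1; next }
         listing && /^- /     { print $2 }'
}

# Possible usage on a real system. hw:1,0 is an example address and the
# requested format (MPEG) is only a guess at something the interface refuses:
#   aplay -D hw:1,0 -f MPEG /dev/urandom 2>&1 >/dev/null | parse_available_formats
```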

NOTE: for this to work one should temporarily stop or pause any program using that interface, like mpd. The script will detect and show you which processes/programs are accessing which device while performing this test, so you may abort the script and stop the listed program, using `pkill ${process_name}` or `kill -9 ${process_id}`, before re-running the script.
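One way to find the process that holds an interface, without relying on the script, is the `status` file that alsa keeps next to the `hw_params` file of each substream. The sketch below assumes that this file contains the single word `closed` when the interface is free and an `owner_pid` line when it is in use; the function name is made up and this is not necessarily how alsa-capabilities does it.

```shell
#!/usr/bin/env bash
# Sketch: print the pid of the process that has a playback substream open.

pcm_owner_pid() {
    # Reads the contents of a substream status file on stdin; prints the
    # pid of the owning process, or nothing when the substream is closed.
    awk -F': *' '/^owner_pid/ { print $2 }'
}

# Usage on a real system (card 0, device 0 as an example):
#   pcm_owner_pid < /proc/asound/card0/pcm0p/sub0/status
```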

The output below is that of my own system, with a sound card embedded on an Intel motherboard and two USB DACs connected:

sudo ./alsa-capabilities -s
 0) Analog alsa audio output interface `hw:0,0'
 - device name       = HDA Intel PCH                                               
 - interface name    = ALC887-VD Analog                                            
 - usb audio class   = (none)                                                      
 - character device  = /dev/snd/pcmC0D0p (in use by `mpd' with pid `37679')        
 - rates per format  = (error: can't determine)                                    
 - monitor file      = /proc/asound/card0/pcm0p/sub0/hw_params                     

 1) Digital alsa audio output interface `hw:0,1'
 - device name       = HDA Intel PCH                                               
 - interface name    = ALC887-VD Digital                                           
 - usb audio class   = (none)                                                      
 - character device  = /dev/snd/pcmC0D1p                                           
 - rates per format  = S16_LE            : 44100Hz 48000Hz 88200Hz 96000Hz 192000Hz 
                       S32_LE            : 44100Hz 48000Hz 88200Hz 96000Hz 192000Hz 
 - monitor file      = /proc/asound/card0/pcm1p/sub0/hw_params                     

 2) Digital alsa audio output interface `hw:0,3'
 - device name       = HDA Intel PCH                                               
 - interface name    = HDMI 0                                                      
 - usb audio class   = (none)                                                      
 - character device  = /dev/snd/pcmC0D3p                                           
 - rates per format  = S16_LE            : 44100Hz 48000Hz 88200Hz 96000Hz 176400Hz 192000Hz 
                       S32_LE            : 44100Hz 48000Hz 88200Hz 96000Hz 176400Hz 192000Hz 
                       IEC958_SUBFRAME_LE: 44100Hz 48000Hz 88200Hz 96000Hz 176400Hz 192000Hz 
 - monitor file      = /proc/asound/card0/pcm3p/sub0/hw_params                     

 3) USB Audio Class Digital alsa audio output interface `hw:1,0'
 - device name       = Peachtree 24/192 USB X                                      
 - interface name    = USB Audio                                                   
 - usb audio class   = 2 - isochronous asynchronous                                
 - character device  = /dev/snd/pcmC1D0p                                           
 - rates per format  = S32_LE            : 44100Hz 48000Hz 88200Hz 96000Hz 176400Hz 192000Hz 
 - monitor file      = /proc/asound/card1/stream0  

 4) USB Audio Class Digital alsa audio output interface `hw:2,0'
 - device name       = Pink Faun USB 32/384 USB receiv                                  
 - interface name    = USB Audio                                                   
 - usb audio class   = 2 - isochronous asynchronous                                
 - character device  = /dev/snd/pcmC2D0p                                           
 - rates per format  = S32_LE            : 32000Hz 44100Hz 48000Hz 88200Hz 96000Hz 176400Hz 192000Hz 352800Hz 384000Hz
 - monitor file      = /proc/asound/card2/stream0  

After this, the watch command may be used with the monitor file, which resides in the pseudo file system `/proc` and allows for inspecting the parameters of the snd_usb_audio kernel module and their values (see the kernel sources in sound/usb). The script does this by translating the selected interface address `hw:X,Y` to the associated filename `/proc/asound/cardX/streamY`, where `X` is the number of the sound card and `Y` that of the output interface. Because this stream file is created by the kernel module, it only exists when you have interfaces that support a USB Audio Class (see the source of sound/usb/proc.c). In such cases, the file contains the actual values of the snd_usb_audio kernel module associated with the specific interface.

The script displays the changing contents of this file with a 100ms refresh rate (0.1s) using the `watch` command.
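The translation from a hardware address to the file names seen in the listings above can be sketched like this. The two function names are hypothetical; the path patterns are the ones shown in the script output (`streamY` for USB Audio Class interfaces, `pcmYp/sub0/hw_params` for the others).

```shell
#!/usr/bin/env bash
# Sketch: derive the character device and the monitor file from hw:X,Y.

hw_to_chardev() {
    # hw:X,Y -> /dev/snd/pcmCXDYp (the trailing p stands for playback)
    addr=${1#hw:}
    printf '/dev/snd/pcmC%sD%sp\n' "${addr%%,*}" "${addr#*,}"
}

hw_to_monitor_file() {
    # $1 = hw:X,Y   $2 = "uac" for a USB Audio Class interface (optional)
    addr=${1#hw:}
    if [ "${2:-}" = "uac" ]; then
        printf '/proc/asound/card%s/stream%s\n' "${addr%%,*}" "${addr#*,}"
    else
        printf '/proc/asound/card%s/pcm%sp/sub0/hw_params\n' "${addr%%,*}" "${addr#*,}"
    fi
}

# Usage on a real system:
#   LANG=C watch -n 0.1 cat "$(hw_to_monitor_file hw:1,0 uac)"
```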

The script

The script is written in bash and part of my mpd-configure project hosted at gitlab:

Contents of the alsa stream files for USB Audio Class 1 & 2 DACs

Alsa stream file in adaptive UAC1 mode

I used to have a Pink Faun 3.24 USB DAC, fitted with a USB Audio Class (UAC) 1 transceiver chip from Tenor. With these UAC1 devices, the communication with the host computer runs in isochronous adaptive mode, meaning the data transfer type is isochronous and the audio synchronization type adaptive. See the official USB Audio Class 2 specification “A Device Class for Audio” from the USB consortium.

The contents of the stream file `/proc/asound/card0/stream0` look like this when playing a 16bit/44.1kHz CD-ripped file:

# contents of /proc/asound/card0/stream0
# while playing 16bit/44.1kHz audio 
 Status: Running
 Interface = 3
  Altset = 1
  URBs = 3 [ 8 8 8 ]
  Packet Size = 388
  Momentary freq = 44100 Hz (0x2c.199a)
 Interface 3
  Altset 1
  Format: S16_LE
  Channels: 2
  Endpoint: 3 OUT (ADAPTIVE)
  Rates: 44100, 48000, 96000
 Interface 3
  Altset 2
  Format: S24_3LE
  Channels: 2
  Endpoint: 3 OUT (ADAPTIVE)
  Rates: 44100, 48000, 96000

When playing a 24bit/96kHz file, the output of the file changes to the following.

Note that the `Altset` value has changed from `1` to `2` and `Momentary freq` from `44100` to `96000`, indicating that the second alternate setting (`Altset = 2`) is activated, with a 24bit format (`Format: S24_3LE`) and a 96kHz sample rate (`Momentary freq = 96000`).

# contents of /proc/asound/card0/stream0
# while playing 24bit/96kHz audio 
 Status: Running
 Interface = 3
  Altset = 2
  URBs = 3 [ 8 8 8 ]
  Packet Size = 582
  Momentary freq = 96000 Hz (0x60.0000)
 Interface 3
  Altset 1
  Format: S16_LE
  Channels: 2
  Endpoint: 3 OUT (ADAPTIVE)
  Rates: 44100, 48000, 96000
 Interface 3
  Altset 2
  Format: S24_3LE
  Channels: 2
  Endpoint: 3 OUT (ADAPTIVE)
  Rates: 44100, 48000, 96000

The story above is summed up in the following diff:

# difference in contents of /proc/asound/card0/stream0 
# while playing 16/44.1 and 24/96 audio
--- /tmp/16bit  2013-08-24 12:21:22.363570848 +0200
+++ /tmp/24bit  2013-08-24 12:21:33.659570623 +0200
@@ -1,10 +1,10 @@
  Status: Running
  Interface = 3
-  Altset = 1
+  Altset = 2
   URBs = 3 [ 8 8 8 ]
-  Packet Size = 388
-  Momentary freq = 44100 Hz (0x2c.199a)
+  Packet Size = 582
+  Momentary freq = 96000 Hz (0x60.0000)
  Interface 3
   Altset 1
   Format: S16_LE
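For use in your own scripts, the same information can be condensed into a single line. The following is a minimal sketch, assuming a stream file laid out like the listings above (one status block followed by one descriptor block per `Altset`); the function name `stream_summary` is made up.

```shell
#!/usr/bin/env bash
# Sketch: summarize the active altset, its format and the momentary
# frequency of a stream file in one line.

stream_summary() {
    # Reads a stream file on stdin; prints nothing when nothing is playing.
    awk '
        $1 == "Altset" && $2 == "="    { active = $3 }       # status block
        $1 == "Momentary" && $3 == "=" { freq = $4 }
        $1 == "Altset" && $2 != "="    { current = $2 }      # descriptor block
        $1 == "Format:"                { fmt[current] = $2 }
        END { if (active != "") print "altset=" active, "format=" fmt[active], "freq=" freq }
    '
}

# Usage on a real system:
#   stream_summary < /proc/asound/card0/stream0
```

Devices that expose several interfaces in one stream file (for instance playback and capture) would need a more careful parser than this.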

Alsa stream file in asynchronous UAC2 mode

With a UAC1 device in isochronous adaptive mode, the DAC and computer negotiate a shared sample rate, mostly using a PLL mechanism. The `Momentary freq` value shows the result of that negotiation, which should be equal to the sample rate of the file being played.

With USB Audio Class 2 in isochronous asynchronous mode, which the new Pink Faun DAC2 with an Amanero(?) interface supports, the DAC tells the computer every 125us how many SPDIF-packets it should send in one USB Request Block (URB).

In the output you can actually see that happening when playing high resolution files; the `Momentary freq` flips from 192000Hz to something like 191999Hz and back again.

Other differences with the UAC1 device output are:

                        UAC1 (adaptive)    UAC2 (asynchronous)
Packet Size             582                1024
Feedback Format         (non-existent)     16.16
Data packet interval    1ms (1000us)       125us

The output looks like this when playing a 16bit/44.1kHz file:

Pink Faun Pink Faun USB 32/384 USB receiv at usb-0000:00:1d.0-1.5, high speed : USB Audio

  Status: Running
    Interface = 2
    Altset = 1
    URBs = 3 [ 64 64 64 ]
    Packet Size = 1024
    Momentary freq = 44100 Hz (0x5.8332)
    Feedback Format = 16.16
  Interface 2
    Altset 1
    Format: S32_LE
    Channels: 2
    Endpoint: 5 OUT (ASYNC)
    Rates: 32000, 44100, 48000, 88200, 96000, 176400, 192000, 352800, 384000
    Data packet interval: 125 us

It changes to the following when playing a 24bit/96kHz file:

  Status: Running
    Interface = 2
    Altset = 1
    URBs = 3 [ 64 64 64 ]
    Packet Size = 1024
    Momentary freq = 96000 Hz (0xb.fffc)
    Feedback Format = 16.16
  Interface 2
    Altset 1
    Format: S32_LE
    Channels: 2
    Endpoint: 5 OUT (ASYNC)
    Rates: 32000, 44100, 48000, 88200, 96000, 176400, 192000, 352800, 384000
    Data packet interval: 125 us

And, finally, this is what it looks like when playing a 24bit/192kHz file:

  Status: Running
    Interface = 2
    Altset = 1
    URBs = 3 [ 64 64 64 ]
    Packet Size = 1024
    Momentary freq = 192000 Hz (0x18.0000)
    Feedback Format = 16.16
  Interface 2
    Altset 1
    Format: S32_LE
    Channels: 2
    Endpoint: 5 OUT (ASYNC)
    Rates: 32000, 44100, 48000, 88200, 96000, 176400, 192000, 352800, 384000
    Data packet interval: 125 us

As the diff below shows, this device pads each sample with zeroes until it fills up 32 bits, regardless of the resolution of the source file. Therefore, it needs only a single `Altset` and doesn’t change anything when changing from 16bit/44.1kHz to 24bit/192kHz, apart from the sampling frequency (`Momentary freq`):

--- /home/ronalde/16b-44.1k-bituac2  2014-03-20 16:38:40.475172541 +0100
+++ /home/ronalde/24b-192k-bituac2  2014-03-20 16:38:40.476172541 +0100
@@ -6,7 +6,7 @@
     Altset = 1
     URBs = 3 [ 64 64 64 ]
     Packet Size = 1024
-    Momentary freq = 44100 Hz (0x5.8330)
+    Momentary freq = 191998 Hz (0x17.fff0)
     Feedback Format = 16.16
   Interface 2
     Altset 1
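The hexadecimal value behind `Momentary freq` can be checked by hand. The sketch below assumes that it is a 16.16 fixed point number giving the number of samples per data packet interval, which is consistent with all listings above: with a 125us interval there are 8000 packets per second, so `0x5.8332` (about 5.5125 samples per packet) corresponds to 44100 Hz. The function name is made up.

```shell
#!/usr/bin/env bash
# Sketch: convert the hex value shown after "Momentary freq" back to Hz.

decode_momentary_freq() (
    # $1 = hex value as printed in the stream file, e.g. 0x5.8332
    # $2 = data packet interval in microseconds (1000 for the UAC1 device
    #      above, 125 for the UAC2 device)
    int_part=${1%%.*}      # e.g. 0x5
    frac_part=${1#*.}      # e.g. 8332
    fixed=$(( ($int_part << 16) | 0x$frac_part ))
    packets_per_second=$(( 1000000 / $2 ))
    # multiply, then round to the nearest whole Hz
    echo $(( (fixed * packets_per_second + 32768) >> 16 ))
)

decode_momentary_freq 0x5.8332 125     # UAC2 device playing 44.1kHz
decode_momentary_freq 0x17.fff0 125    # UAC2 device during a feedback correction
decode_momentary_freq 0x2c.199a 1000   # UAC1 device playing 44.1kHz
```

The second call reproduces the 191998 Hz from the diff above: the DAC momentarily asks for slightly fewer samples per packet than the nominal 24.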


Jan 26, 2016: enhanced handling of pulseaudio:

Jan 4, 2016: Enhanced program flow and optimized display of supported sample rates for UAC type interfaces:

Jan 3, 2016: Added support for accurate but (very) slow displaying of supported sample rates for each format an interface supports:

Dec 9, 2015: Fixed a (rather long running) error in the script:

May 13, 2015: Added a lot of extra error checking:

  • modified `alsa-capabilities` to make the script more robust

Jan 26, 2015: Added a temporary hack to address issue #8:

  • modified `alsa-capabilities` to skip checking for unset variables and break on errors

Apr 18, 2014: Major rewrite of the script:

  • moved `tests/` to `alsa-capabilities`
  • modified `alsa-capabilities` to make it suitable to be sourced or run by itself from the command line
  • added simple and regexp filtering to `alsa-capabilities`

Apr 8, 2014: Script updated:

  • added functionality to monitor non-UAC devices using its `hw_params` file and a few improvements in UI.

Apr 3, 2014: Small script changes and moved PCM information:

  • introduced some more bashisms to make the script faster and simpler.
  • Moved the background information on the PCM format to The PCM format explained.

Apr 2, 2014: full rewrite of the script

  • to minimize user interaction and making it a bit more robust.

Mar 21, 2014: Small changes to the article:

  • Reformatted the introduction and added some technical background about the script

Mar 20, 2014: Completely rewritten the article

  • the previous version of the article assumed you already had your (default) music player set up for using an alsa hardware playback interface, which is exactly what most readers are trying to figure out.
  • created a script to quickly list the available interfaces and supported audio formats

mpd-configure: automatically turn Linux into an audiophile music player

Music Player Daemon (mpd) is a great free and open source tool which, together with Linux, can be used to turn any computer into a highest quality bit perfect audio player. That way your PC will act as a transparent transport device for streaming your PCM files, like WAV, FLAC and AIFF, and DSD audio files to your DAC and audio equipment.

For best results, the PC should be connected to an external DA-converter with USB or HDMI/I2S and run as few applications as possible, thereby minimizing system load and switching of processors, memory and busses. This can be achieved by running a headless Linux installation and storing and accessing the music files on/from a network connected storage device like a NAS or NFS file server. The sound daemon should be controlled from a remote device, like a smartphone, tablet, laptop or desktop computer running an mpd client.

A little background on pulseaudio, mpd and alsa

Modern Linux distributions ship with a standard audio library (pulseaudio) which will resample and convert digital audio on the fly for the best plug-and-play user experience. Default mpd-installations will also use those features, which unfortunately makes it unsuitable for audiophile purposes out of the box. The average audiophile user is less concerned with plug-and-play and more concerned with discretion; he/she wants the computer to act as a high end –black-box like– transport device for delivering the original –non altered– digital audio to the DAC or sound card.

This is perfectly feasible with stock software, by performing a few modifications to the mpd configuration file. However, finding the right values can be daunting for non-computer-savvy audio enthusiasts. This page with its scripts is aimed at helping those users. It will automagically find the right values and put them in a valid configuration file.
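To give an idea of the kind of modifications involved, the sketch below prints an `audio_output` section that tells mpd to talk to an alsa hardware interface directly and to leave the audio untouched. The function name and the output name are made up, and the file generated by mpd-configure may well contain different or additional settings; the options themselves (`auto_resample`, `auto_format`, `auto_channels`, `mixer_type`) are regular settings of mpd's alsa output.

```shell
#!/usr/bin/env bash
# Sketch: print a bit perfect audio_output section for mpd.conf.

print_bitperfect_output() {
    # $1 = alsa hardware address, e.g. hw:1,0
    # $2 = display name for the output (optional)
    cat <<EOF
audio_output {
    type            "alsa"
    name            "${2:-bit perfect output}"
    device          "$1"
    auto_resample   "no"
    auto_format     "no"
    auto_channels   "no"
    mixer_type      "none"
}
EOF
}

# Usage: print_bitperfect_output hw:1,0 "My USB DAC"
```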

To determine which formats are natively accepted by your USB DAC, and how it actually behaves when feeding it a certain format, have a look at the article “Alsa-capabilities shows which digital audio formats your USB DA-converter supports”. Then, convert your audio files accordingly as explained in the article “Script to convert FLAC files using Shibatch SRC while preserving metadata”.


  1. Install Linux on a computer.
  2. Open a terminal by pressing and holding CTRL+ALT while typing T or starting it from the application menu
  3. Install the music player daemon on that computer. Users of Debian (and its derivatives like Ubuntu and Mint) and Arch users may use the following command:
       sudo apt-get install mpd git || sudo pacman -S mpd git
     Users of other distributions like Fedora or RHEL/CentOS may consult the installation Wiki page for mpd.
  4. Optional but recommended for sound quality: connect the computer to an external (USB-)DAC

That’s it for the preparation. Next we’ll download and use the script to create a working bit perfect mpd configuration file.

Basic usage of mpd-configure

## make the directory where you want to store the script
mkdir /tmp/mpd-configure
## change to that directory
cd /tmp/mpd-configure
## download and unpack the script and other files needed
wget -O - | tar --strip-components=1 -zxf -
## run the script
bash mpd-configure

That’s it.

Now with some explanation and clarification.

  1. Open a terminal by pressing and holding CTRL+ALT while typing T or starting it from the application menu
  2. Make a directory for the script, the location is not important.
       mkdir mpd-configure
  3. Change to that directory
       cd mpd-configure
  4. Download and unzip the mpd-configure script in the current directory
       wget -O - | tar --strip-components=1 -zxf -
  5. Stop mpd.
       sudo systemctl stop mpd
  6. Run the script to generate /etc/mpd.conf:
       sudo bash mpd-configure -o "/etc/mpd.conf"
  7. Start mpd using the new configuration file.
       sudo systemctl start mpd

What’s next

Now you’ve installed mpd, configured it for bit perfect playback and started it, you should grab a mpd-client, connect to the mpd-daemon and start enjoying your unaltered bit perfect music!

Advanced usage

More advanced usage: to back up an existing `/etc/mpd.conf` and overwrite it with the script generated mpd configuration file, which configures mpd without any prompts to use the first available USB DAC interface in your system, execute the following:

## become root if necessary
[[ $EUID -eq 0 ]] || sudo su
## stop mpd
systemctl stop mpd
## set the paths to the music and mpd data directories and run the script,
## saving the output to `/etc/mpd.conf`; the script creates a backup of
## that file in case it exists:
CONF_MPD_MUSICDIR="/srv/media/music" CONF_MPD_HOMEDIR="/var/lib/mpd" \
bash mpd-configure --limit usb --noprompts --output "/etc/mpd.conf"  && \
systemctl start mpd
## done (press ENTER)

The code for this script is maintained using the version control system git. With git it is even easier to get and update the script:

# install git
sudo apt-get install git || sudo pacman -S git
# download the latest source files for the script
git clone
# change to the directory where you've downloaded the sources
cd mpd-configure

In the future you can update the script to the latest version by changing to the directory created above and entering

git pull

For more advanced usage, please consult the README file, the mpd man page, the online mpd user manual and the article “How to setup a bit-perfect digital audio streaming client with free software (with LTSP and MPD)”.

One may browse, share, clone and fork the source code of the script at

Summary of changes

    Jan 8, 2016 Added command line parameters

    Jan 5, 2016 Bug fixes and enhancements

    Sep 11, 2014 Modified command line instructions:

    • installation instructions compatible with Arch Linux
    • other instructions compatible with Arch and other systems using systemctl.

    Apr 21, 2014 Changed configuration snippets:

    • from included in code to separate files in `./confs-available` which may be symlinked to `./confs-enabled` to activate, e.g.:
        cd confs-enabled ## change to the `confs-enabled` directory
        ln -s ../confs-available/plugin-playlist-lastfm.conf ## enable the plugin
        bash mpd-configure > mpd.conf ## regenerate the conf

    Apr 18, 2014 Major rewrite of the script:

    • moved `tests/` to `alsa-capabilities`
    • modified `alsa-capabilities` to make it suitable to be sourced or run by itself from the command line
    • added simple and regexp filtering to `alsa-capabilities`
    • removed alsa interface detection logic out of `mpd-configure`. This
      is now done sourcing `alsa-capabilities`.
    • modified `mpd-configure` to not write to a file by default (see updated `README`)

    Mar 20, 2014:

    Jan 6, 2014

    • fixed several typos and explained the usage a bit more

    Sep 18, 2013

    • enhanced the mpd-configure script and fixed several (severe) bugs, see changelog

    Script to download and host google web fonts and generate css: best-served-local

    Local Google web font hosting made easy, controllable and automated

    The bash script best-served-local makes downloading and self-hosting a Google web font as easy as running the following command in a terminal:

    bash <(wget -q -O - "") "Roboto:100,900"

    This will download the appropriate font files (like Roboto_Bold_v15_latin_700.woff) to a local temporary directory and display the following valid css3 in the terminal:

    @font-face {
        font-family: 'Roboto Bold';
        src: local('Roboto Bold'), local('Roboto-Bold'),
             url('Roboto_Bold_v15_latin_700.woff2') format('woff2'),
             url('Roboto_Bold_v15_latin_700.woff') format('woff');
        font-style:  normal;
        font-weight: 700;
    }

    Specifying a few extra options fully automates your desired usage scenario:

    bash <(wget -q -O - "") \
     --incss-fontpath /static/fonts \
     --outputfile /var/www/ \
     --fontdirectory /var/www/ --overwrite \
     --formats superprogressive \
     --subsets latin-ext \
     "Open Sans:300,400,700" "Roboto:100,100italic,regular,italic,900"

    Apart from bash version 4, this script only depends on curl (it’s not tested on OSX yet).

    Getting and running the script

    The script can be cloned or forked from its gitlab repository, downloaded, or started straight from the web as shown above (although some would advise against that).

    To display all commandline arguments, run the script with the --help (or -h) argument.

    See the README file for detailed usage information.

    Happy Google-free hosting!

    Adding pageNumber elements to an internet archive generated scandata.xml file


    After uploading a book to Internet Archive (IA), some tasks are started on IA’s servers to generate all the metadata files and derived formats. As explained in the blog Scandata.xml, on the wiki of the university of Columbia, the generated bookid_scandata.xml file lacks pageNumber elements. This makes it impossible to directly access (or link to) a page number in IA’s online reader, using a url like:

    <a href="">page 10</a>

    Instead one can only access page 10 of the book with ia id bookid using the following link (observe the extra n before the page number):

    <a href="">page 10</a>

    Apart from being counter intuitive, this results in other problems, like not being able to link items in the table of contents of a book listed in openlibrary.


    This is caused by missing pageNumber children inside page elements:

    --- original_scandata.xml       2015-06-06 13:52:31.885618573 +0200
    +++ modified_scandata.xml       2015-06-06 13:52:20.232021740 +0200
    @@ -1,4 +1,5 @@
     <page leafNum="10">
    +    <pageNumber>10</pageNumber>

    A modified page element looks like this:

    <page leafNum="10">
        <pageNumber>10</pageNumber>
        ...
    </page>


    The blog mentions the use of a simple xsl stylesheet which, together with the original scandata.xml file and an xslt processor, adds the appropriate pageNumber elements to each page parent.

    The bash script below does the same, and could be run from the web directly, specifying only IA’s book id:

    bash <(wget -q -O - "") some-ia-bookid

    So for a book with the id originofspecies00darwuoft, the following would add the pageNumber elements to its originofspecies00darwuoft_scandata.xml file:

    bash <(wget -q -O - "") originofspecies00darwuoft

    You can also use it with a file you downloaded:

    bash <(wget -q -O - "") ./bookid_scandata.xml

    Afterwards, the new script-generated xml file should be uploaded to your book resource page on internetarchive (with the name bookid_scandata.xml) replacing the original scandata xml file. After that, the Internet archive application triggers tasks which will make the desired urls available.
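For a quick experiment on a downloaded file, the transformation itself can be approximated with sed. This is only a sketch and not the script referred to above: it assumes that every page element starts on its own line as `<page leafNum="N">`, that no pageNumber children exist yet, and that the page number should simply equal the leaf number, as in the diff shown earlier. The function name is made up.

```shell
#!/usr/bin/env bash
# Sketch: add a pageNumber child to every page element of a scandata file.

add_page_numbers() {
    # Reads the original xml on stdin and writes the modified xml to stdout.
    # After each opening page tag a new line is appended with the same
    # indentation plus four spaces.
    sed -E 's|^([[:space:]]*)<page leafNum="([0-9]+)">.*$|&\
\1    <pageNumber>\2</pageNumber>|'
}

# Usage:
#   add_page_numbers < original_scandata.xml > modified_scandata.xml
```

An xslt processor remains the more robust choice for real files, because it does not depend on how the xml happens to be laid out in lines.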

    Latest mpd packages for Debian

    Building an installation package for Debian with the latest version of mpd is easy, right?

    Sure, as long as you’re working on a Debian (or ubuntu) host. Therefore I’ve created the make-debian-mpd-chroot bash script, which lets me do just that, but on a host running Arch (or Fedora and friends).

    The script uses debootstrap to create a virgin debian chroot, which is started as a virtual machine using systemd-nspawn, and automagically builds the latest mpd (music player daemon) package for Debian inside the virtual machine. When done building, the script copies the resulting package back from the virtual machine to the directory where the script was started, so it can be distributed and installed to a Debian host designed for high quality music playback.

    It is intended to be run on a non-Debian system, like Arch. Users of Debian and Ubuntu may be better off using git-buildpackage for this purpose.


    export scriptname="make-debian-mpd-chroot"
    export url="${scriptname}"
    wget -q "${url}" || curl -s -k -o "${scriptname}" "${url}"
    sudo bash ${scriptname}

    Of course both debootstrapping a system and compiling mpd from source take a while. On my 8GB Intel core i3 using SSD storage, the whole process takes about ten minutes. This results in mpd_0.19.9-1_amd64.deb (current version in Debian maintainers git repository) in the directory where the script is started from.

    Technical details are available in the code and in the README in the git repository on github.

    A comprehensive guide to bit perfect digital audio using Linux

    Among audio and music lovers who use digital audio, bit perfect audio playback is hot. Time to explore the term and its backgrounds and guide you in setting up your own system.

    What’s in a name?

    Often the notion of “bit perfect” is used to describe audio playback systems that don’t alter audio on purpose, like non-optimized convenience-oriented systems often do. They mostly include plug-n-play features like resampling, volume leveling, resolution changes and channel mixing, all on-the-fly.

    Bit perfect audio systems shouldn’t do that; a digitally encoded 1 which is retrieved from a file and sent from one system, like a playback computer, to another, like a DAC, should arrive as the original 1, not 0. In the abstract realm of computers, bit perfect is as simple as that; input == output. But, in reality, as with all things, the concept or definition of perfectness is a bit more complex. Likewise, the definition above does not quite cover things in our audio reality.

    Bit perfect transport and storage of digital audio files

    As long as the audio lives inside raw PCM file formats –like WAV and AIFF–, lossless compressed file containers –like ZIP and FLAC–, or DSD inside a DSF or DFF file, one may use off-the-shelf checksum algorithms and tools to easily verify that a transported, processed or stored audio file is exactly the same as its source.

    To see this in action, one may conduct the experiment below. It involves the downloading, storing and compressing of a file downloaded from the internet, and comparing the processed file to the original source file. We’ll be using md5 checksums to verify that the processed files are (bit) perfect, ie exactly the same as the original source file.

    First, make sure all needed programs are installed. Do this by opening a terminal screen by pressing T while holding CTRL+ALT or by searching for and starting a terminal from the menu. The following commands should work for Debian (and derivatives) and Arch:

    which flac || ( sudo apt-get install flac || sudo pacman -S flac)
    which wget || ( sudo apt-get install wget || sudo pacman -S wget)

    Now we’re ready for the real thing.

    1. Create a temporary working directory and change to it:
      cd $(mktemp -d "bitperfect-XXX")
    2. Download the upstream source flac file and the file containing md5 checksums as supplied by the uploader by copying and pasting the following in the terminal screen and pressing ENTER:
      wget "" \
    3. Check if the download and storage of the source flac file succeeded, ie. is bit perfect:
      LANG=C md5sum -c "tsp2007-10-02.spc4.flac16.md5" 2>/dev/null | grep "tsp2007-10-02.sp-c4.d2t02.flac"

      The output of the last command should be:
      tsp2007-10-02.sp-c4.d2t02.flac: OK
    4. While using the -c-switch, like we did in the former example, is a great way to check a file against predefined checksums and to ensure that your download succeeded, there are other ways to make sure the files on, and the ones downloaded to your computer and mine are exactly the same. This time we’ll run md5sum without the -c switch, directly against the file that should be checked:
      md5sum "tsp2007-10-02.sp-c4.d2t02.flac"
      # output should be

    This shows that file transport through TCP/IP connections and storage on any local or network storage device, can –and should– be bit perfect for digital audio files.
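
    The check from step 4 can be wrapped in a small helper. This is a minimal sketch; the function name md5_matches is my own, and the demo uses a throw-away file containing the single line "hello" rather than the flac file from the exercise.

```shell
#!/usr/bin/env bash
# md5_matches FILE EXPECTED: succeeds when the md5 checksum of FILE
# equals the EXPECTED hexadecimal string (hypothetical helper name).
md5_matches() {
    local file="$1" expected="$2" actual
    actual="$(md5sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Demo on a temporary file that contains the line "hello"
demo="$(mktemp)"
echo "hello" > "$demo"
if md5_matches "$demo" "b1946ac92492d2347c6235b4d2611184"; then
    echo "demo file: OK"
else
    echo "demo file: FAILED"
fi
rm -f "$demo"
```

    For the downloaded file you would pass its name and the checksum published by the uploader.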

    Still not convinced (and not faint of heart)? You might want to try the following, to see what happens when one changes a single byte (ie. 8 bits) in the source wav file.

    1. Install the hex editor dhex:
      which dhex || (sudo apt-get install dhex || sudo pacman -S dhex)
    2. Decode/Unpack the digital audio file tsp2007-10-02.sp-c4.d2t02.wav from the flac container file, which we downloaded and verified in the first exercise:
      flac -d "tsp2007-10-02.sp-c4.d2t02.flac" 
    3. Make a backup of the source wav file
      cp -a "tsp2007-10-02.sp-c4.d2t02.wav" "tsp2007-10-02.sp-c4.d2t02.wav.original"
    4. Open the source wav file using dhex (the first time you start dhex, it will ask you to confirm keyboards keys, just do what the program asks you to do)
      dhex "tsp2007-10-02.sp-c4.d2t02.wav"
    5. While in dhex, with the wav file loaded, press ENTER
    6. Next, change the first byte of the header, 'R' (for RIFF), to 'Q' (for QIFF), by typing the hexadecimal number 51 and confirming by pressing ENTER.
    7. Save the modified file by pressing [F10].
    8. Prove that such single byte changes can’t be detected by simply listing file sizes:
      ls -la *.wav*
      # -rw-r--r-- 1 user users 34936652  5 sep 14:42 tsp2007-10-02.sp-c4.d2t02.wav
      # -rw-r--r-- 1 user users 34936652  6 okt  2008 tsp2007-10-02.sp-c4.d2t02.wav.original
      #                         ^^^^^^^^ the file sizes are exactly the same
    9. And finally the proof that md5sum is invaluable in such cases:
      md5sum *.wav*
      # a1ea1462e2e09305dfb7df3dcca14d33  tsp2007-10-02.sp-c4.d2t02.wav
      # 073b6cc32c5c3b9276074c2e4a85dbfe  tsp2007-10-02.sp-c4.d2t02.wav.original
      # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ but the md5sums aren't the same

    While file sizes can be the same across different files, an accidental change to a file almost always results in a different md5 checksum.
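
    For those who would rather not use a hex editor, the same experiment can be sketched non-interactively. The dummy file below is made up for the occasion (it is not a real wav file); dd with conv=notrunc overwrites only the first byte and leaves the rest of the file untouched.

```shell
#!/usr/bin/env bash
# Overwrite the first byte of a copy with 'Q', then compare size and md5.
workdir="$(mktemp -d)"
printf 'RIFF-some-dummy-payload' > "$workdir/a.original"
cp "$workdir/a.original" "$workdir/a"
# replace exactly one byte at offset 0, without truncating the file
printf 'Q' | dd of="$workdir/a" bs=1 count=1 conv=notrunc 2>/dev/null

size_orig="$(wc -c < "$workdir/a.original")"
size_mod="$(wc -c < "$workdir/a")"
sum_orig="$(md5sum "$workdir/a.original" | cut -d ' ' -f 1)"
sum_mod="$(md5sum "$workdir/a" | cut -d ' ' -f 1)"

[ "$size_orig" -eq "$size_mod" ] && echo "file sizes are the same"
[ "$sum_orig" != "$sum_mod" ] && echo "md5sums are not the same"
rm -r "$workdir"
```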

    Is flac bit perfect?

    In the former exercises we established proof that md5 is a great way to verify the integrity of downloads (and subsequent storage and retrieval) of files. Next we’ll test how accurate or bit perfect our beloved Free Lossless Audio Codec (FLAC) is.

    First we’ll extract the source wav file from the flac container, then repack it with flac and extract it once again. This way, we can see if flac does something with the digital audio it packs so efficiently, or, in other words, if it is really lossless.

    1. Clean up from previous step:
      rm *.wav*
    2. Decode the downloaded flac file:
      flac -d "tsp2007-10-02.sp-c4.d2t02.flac"
    3. Create the md5sum of the resulting uncompressed wav file and store in a file:
      md5sum -b "tsp2007-10-02.sp-c4.d2t02.wav" > "mychecksums.md5"
    4. Backup the original downloaded flac file:
      mv "tsp2007-10-02.sp-c4.d2t02.flac" "tsp2007-10-02.sp-c4.d2t02.flac.original"
    5. Repack the wav file creating a new flac file:
      flac -e "tsp2007-10-02.sp-c4.d2t02.wav"
    6. Create a checksum of the resulting compressed flac file and append it to the self created checksums file:
      md5sum -b "tsp2007-10-02.sp-c4.d2t02.flac" >> "mychecksums.md5"
    7. Remove the source wav file:
      rm "tsp2007-10-02.sp-c4.d2t02.wav"
    8. Decode the self created flac file, thereby restoring the wav file once again:
      flac -d "tsp2007-10-02.sp-c4.d2t02.flac"
    9. Check the authenticity of both the wav and self generated flac files:
      LANG=C md5sum -c "mychecksums.md5"
    10. Check if the self generated flac is the same as the original downloaded flac:
      md5sum "tsp2007-10-02.sp-c4.d2t02.flac" "tsp2007-10-02.sp-c4.d2t02.flac.original"

    As you can see from the last command, the self generated flac created above, differs from the original downloaded flac. At the same time, the wav file extracted from the original downloaded flac file is exactly the same as the one extracted from the self generated flac file.

    When you regard flac for what it really is –a lossless compression container for audio files– that makes perfect sense. For instance, changing the compression level will lead to different flac files, while the contained audio file stays intact. When changing a letter inside a metadata field, although the file size stays the same, the md5 checksum will be altered.
    But flac is more than a mere zip file. For instance, in each flac file, a unique digital fingerprint of the audio it contains is stored inside the flac file itself. One may view (and compare) such fingerprints, thereby verifying the integrity of the audio (but not the metadata etc.):

    1. Extract and display the fingerprint of the wav file stored inside both the downloaded and regenerated flac files; they should be the same:
      metaflac --show-md5sum "tsp2007-10-02.sp-c4.d2t02.flac"
      metaflac --show-md5sum "tsp2007-10-02.sp-c4.d2t02.flac.original"
    2. The good folks at sometimes offer the original fingerprints in a downloadable file which you can use to visually establish the integrity of the audio within flac containers. Of course, with some grep magic, it is easy to avoid having to trust your eyes. First download the file containing the fingerprints:
      wget -q ""
    3. Next, use metaflac and grep to create your own fingerprint checker:
      grep "$(metaflac --show-md5sum "tsp2007-10-02.sp-c4.d2t02.flac.original")" "tsp2007-10-02.spc4.flac16.ffp"
    4. And the same for our self generated flac file:
      grep "$(metaflac --show-md5sum "tsp2007-10-02.sp-c4.d2t02.flac")" "tsp2007-10-02.spc4.flac16.ffp"
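
    The grep lookup from steps 3 and 4 can be turned into a reusable check. This is a sketch under assumptions: the helper name ffp_has_fingerprint is mine, and the demo fingerprint file is a made-up placeholder, not the real file from the exercise.

```shell
#!/usr/bin/env bash
# ffp_has_fingerprint FINGERPRINT FFP_FILE: succeeds when FINGERPRINT
# occurs literally in FFP_FILE (hypothetical helper name).
ffp_has_fingerprint() {
    local fingerprint="$1" ffp_file="$2"
    grep -q -F -- "$fingerprint" "$ffp_file"
}

# Demo with a placeholder fingerprint file (dummy values, not real data)
ffp="$(mktemp)"
printf '%s\n' "track01.flac:d41d8cd98f00b204e9800998ecf8427e" > "$ffp"
if ffp_has_fingerprint "d41d8cd98f00b204e9800998ecf8427e" "$ffp"; then
    echo "fingerprint found"
fi
rm -f "$ffp"

# With flac installed, the real check would look like:
#   ffp_has_fingerprint "$(metaflac --show-md5sum FILE.flac)" FILE.ffp
```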

    So flac is a bit perfect encoder/decoder for audio files containing PCM-audio. At the same time, two flac files containing exactly the same wav file can differ from each other, caused by differences in non-audio data and compression parameters. Does this mean that flac is suited for the purpose of this guide; creating a bit perfect audio chain? No, unfortunately it isn’t.

    While flac is a bit perfect encoder/decoder for digital audio, the extra decompression that takes place when playing back the audio within a flac file, adds to the total load of the playback computer, which is something we’re trying to avoid. Therefore flac should not be considered suitable for usage in a bit perfect audio playback chain, although it is a great tool for efficient and accurate archiving and transport.
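
    The principle that different containers can hold an identical payload is not specific to flac. The sketch below uses gzip as a stand-in for flac (an assumption, purely to avoid needing flac installed) and a generated list of numbers as a stand-in for the wav file.

```shell
#!/usr/bin/env bash
# gzip stands in for flac, and a generated number list for the wav file.
workdir="$(mktemp -d)"
seq 1 20000 > "$workdir/payload"
gzip -1 -c "$workdir/payload" > "$workdir/fast.gz"   # fastest compression
gzip -9 -c "$workdir/payload" > "$workdir/best.gz"   # best compression

payload_sum="$(md5sum < "$workdir/payload")"
fast_sum="$(gzip -d -c "$workdir/fast.gz" | md5sum)"
best_sum="$(gzip -d -c "$workdir/best.gz" | md5sum)"

# the compressed containers may differ byte for byte ...
cmp -s "$workdir/fast.gz" "$workdir/best.gz" || echo "containers differ"
# ... while the unpacked payload is identical to the source
[ "$payload_sum" = "$fast_sum" ] && [ "$payload_sum" = "$best_sum" ] \
    && echo "payload identical"
rm -r "$workdir"
```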

    The devil is in the details

    This is where the difficulties and subtleties come in. In current standards based audio, although the pristine source files are stored perfectly on your NAS, and retrieved by your computer through network interfaces, switches and cables in a perfect manner, the output of a digital playback system is always AES/EBU-format (or S/PDIF, like it was called before), as long as we’re talking about PCM, like with WAV or AIFF file formats.

    Even when using the highest quality standards, like USB Audio Class 2 (UAC2) or IEC 60958 Type I Balanced XLR, the clock signal is always multiplexed with the audio data by the playback computer.

    A computer is essentially a very complex device built upon billions of very simple on/off switches. This high speed switching influences all kinds of power related properties of the computer itself, as well as any connected device. This also applies when CPU, RAM and busses inside the computer are instructed to construct a stream of bits consisting of perfect audio data source packets and near perfect source clock signals. The resulting audio output signal is never perfect.

    Regardless whether the source clock signal comes from an external femto clock generator inside your $10.000 external DAC connected to your computer using UAC2, the $10 clock on your high quality audiophile sound card attached to the PCIe bus of your PC, or the $0.03 crystal on your computer’s mainboard, the resulting audio signal can and will suffer audibly from system load.

    How about USB Audio?

    In the “USB Audio Class 1” (UAC1) standard, both the computer and the DAC are allowed to drop USB-packets, while the clock signal can only be generated by the computer itself. The timing errors induced by the cheap clock generator inside the PC, and the fact that packets might get lost, mean that the output signal won’t be the same as the original signal, which results in audible artifacts.

    UAC1 should not be considered a bit perfect standard for digital audio.

    Schematic drawing of the processing of digital audio throughout various components.

    But surely UAC2 in isochronous asynchronous mode is bit perfect? In a proper UAC2 chain, consisting of a computer for audio playback and a UAC2-capable DAC, each USB packet –a so called “USB Request Block” (URB)– should arrive at the other end in perfect non-altered form. Another nice feature of UAC2 is that the clock signal can be fed from an external DAC, like an out-of-this-planet $20.000 DAC with a femto clock. Both the transport of PCM encoded audio data packets inside URBs to the DAC and the clock signal separating the USB packets to the computer should be perfect! Sadly, that isn’t the problem.

    Although UAC2 facilitates the separation of timing and audio data, UAC2 by definition needs to round the audio timing information to the nearest USB timing frame. While the URBs themselves and therefore the audio data frames are transported perfectly, their timing is a non-perfect fit.

    The bottom line is that all current standards for transferring digital audio streams from one system to another, have non-perfect synchronisation of timing information. Therefore, there is no such thing as standards based bit perfectness!

    But there must be a solution, right?

    All the (power) switching going on inside the device that generates the AES/EBU stream, whether it’s a computer or a dedicated audio device, does have an audible effect on the resulting audio output signal.

    Some manufacturers started to use their own implementations of i2s instead of AES/EBU. That makes perfect sense, in that i2s is designed to be a discrete transport mechanism for digital audio using three parallel streams, separating clock from audio data, so it doesn’t suffer from the muxing issues tied to AES/EBU. There is one rather big problem with these solutions. The i2s standard is only designed to be used inside gear, like inside a DAC, CD player or smartphone. There it performs the task of discrete transport for feeding digital audio from audio generating sources –like the USB receiver inside a DAC, the pickup of a CD player or the mic in a smartphone– to its internal DAC chip. The standard does not have provisions for transporting source digital audio signals to external equipment.

    That’s why now and again one sees proprietary interfaces and protocols emerging. Most of them (mis)use HDMI as the interface, while using non-HDMI-compliant internal wiring. Some try it with CAT5e/6 network cable and interfaces, while others use BNC connectors. Other manufacturers try to completely de- and reconstruct the incoming AES/EBU signal, replacing the incoming clock signal by their own high quality clock signal. I don’t like those solutions and I don’t believe them to be sustainable.

    We’re depending on the audio industry to agree upon a new standard, incorporating the insights bit perfect audio lovers gathered over recent years, for which I would suggest the inspiring name “USB Audio Class 3 (UAC3)”.

    Back to reality

    Unfortunately, there are not even traces of debate on USB Audio Class 3. In the meantime, we’re left with:

    1. UAC2 as the standard of choice,
    2. using meta tagged AIFF files (whenever Musicbrainz Picard supports that)
    3. stored on a proper –preferably dedicated– file server with a Gigabit network port, like a NAS (any proper NAS should suffice) using NFS as its high level protocol,
    4. coupled to a solid –preferably dedicated– gigabit TCP/IP network,
    5. a designed-for-audio dedicated playback computer,
    6. and last but not least, a designed-for-audio OS and software chain, running with minimal system load.

    A dedicated/isolated audio network

    For less than $100/€80, one can buy a professional managed ethernet layer-2/3 8-port gigabit switch, like the HP Procurve 1810-8G v2 (J9802A). Such a device enables one to implement the most complex and robust of networks. Using VLANs, virtual isolated networks can be created, for example a 900Mbit dedicated network for audio and a 100Mbit network for all other duties.

    A dedicated designed-for-audio computer (hardware part)

    On the hardware side we’re looking for a fanless, diskless and headless industrial grade PC with two CPU cores and two network interfaces.

    Fanless: We aim to minimize noise and vibration.
    Diskless: We don’t want spinning disks inside our box, because they cause noise, vibrations and power fluctuations. The only disk inside the PC will be a small mSATA solid state disk, to store the OS and music playing software on. Those will be loaded in memory at power up, after which everything is booted and executed from RAM. We will be storing user files, like audio files, settings and preferences, on a network storage device. This way, we’ll completely eliminate activity on the SATA-bus and controller.
    Headless: We only want the PCIe/USB-busses and controllers to be dealing with the handling of audio. So there will be no screens or input devices attached to our box and we have no need for resource hungry 2D/3D graphics systems and their complex and error prone drivers. Instead we’ll be controlling the OS and music playing software from native applications on other devices, like desktops, laptops, smartphones or tablets, or, if desired, with any webbrowser on the local network using a webserver on the music playing PC. Of course, that would require careful implementation and resource assignment, as we would want to minimize its influence on audio related processes.
    Industrial grade: We’re looking for a system that lasts at least ten years with extensive use, without active cooling and with minimum EMI/RF radiation and vibration. We want extended lifespan components from well established suppliers. We want a well built non-vented box with as few holes as possible. We only need holes for connectors, two for ethernet, two for usb and one for power.
    Two CPU cores: We want a dedicated core to which we will tie all processes related to audio with realtime priority. Here processes will run like audio file retrieval from the network including the network stack itself, the usb stack and the processes needed for the music playback software. All other processes, like logging, controlling and the optional webserver, will be tied to the second core.
    Two network interfaces: We want a single dedicated gigabit network adapter with hardware TCP/IP offloading for retrieval of audio files on the network. All other network traffic, like control sequences, will be redirected to the second (built-in) network interface.

    While the C.A.P.S. proposals are great, things can be simpler, cheaper and even better.

    When you’re on a tight budget (aren’t we all?) buy yourself a fanless industrial Intel dual core Atom based system, like the Logic Supply AG150, configured with 2GB RAM, an industrial 32GB mSATA drive and a best-in-class Seasonic switching power supply for around $390/€260, and you have a great starting point for this purpose.

    More speed means less switching, so if you can afford it, you might want to spend around double that money and buy a fanless industrial Intel dual core i5 Haswell based system, like the Logic Supply ML320, configured with 4GB RAM, a 32GB internal mSATA drive and a best-in-class Seasonic switching power supply for ~$750/€560. This system features the best-in-class NUC-design, coupling the CPU directly, so without heatsinks, to the upper side of the box. The upper part of the box is a folded sandwich construction of thick aluminium and thinner iron, which is great because it not only keeps the cores cool but minimizes RF/EMI radiation and vibration as well.

    You might improve on the rather good basics by replacing the switched mode adapter with a proper linear audio supply. I still haven’t got around to listening to the effect of such an upgrade, and I’m currently working with a local engineer to get such a beast built.

    The use of a dedicated USB PCIe card designed for audio in the PC, like the ones offered by Sotm (~$300/€350) or Paul Pang / PPA Studio (~$130) (who also offers great audio PC’s and other tweaks) did do some good in the Atom based system, but did not have any audible effect in my Core i5 system. This probably is due to the fact that I didn’t use an external linear power supply to feed the cards.

    Other tweaks, like dedicated audiophile SATA-controllers and cables, do not apply to our system, as we only use our solid state mSATA disk to boot the OS and music playing software. After that everything will be executed from RAM, thereby bypassing the SATA-bus and controllers completely.
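
    The idea of tying audio processes to one core with realtime priority can be sketched with the standard Linux tools taskset (CPU affinity) and chrt (scheduling policy). The helper below is hypothetical and only prints the command line, because actually applying a realtime policy needs the proper privileges; the core number and priority are example values, not tuned recommendations.

```shell
#!/usr/bin/env bash
# pinned_rt_cmd CORE PRIORITY COMMAND...: print the command line that would
# run COMMAND on CPU core CORE with the SCHED_FIFO realtime policy.
# (hypothetical helper; it prints the command, it does not execute it)
pinned_rt_cmd() {
    local core="$1" priority="$2"
    shift 2
    echo "taskset -c ${core} chrt -f ${priority} $*"
}

# Example values: first core (0), realtime priority 50
pinned_rt_cmd 0 50 mpd --no-daemon
```

    Non-audio processes, like logging or a webserver, would get the same treatment with the second core and without the chrt part.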

    A dedicated designed-for-audio computer (software part)
    As Microsoft has a long and bad track record of proprietary, hidden and non-sustainable “standards” and technology while frustrating open standards, they are not the supplier I want to attach myself to. But there are those who do and some of them have created some nice offerings, which can be divided in two categories.

    The first type consists of stuff that’s meant to be used like a desktop, connected to a TV and input devices or touchscreen, like JRiver Media Center (~$50/€40) and the free (as in free beer) closed source and proprietary Foobar. Of course that price is without a valid Windows (desktop) license, which those users –knowingly of course– bought as part of an OEM-installation for about $100/€100. For reasons described in this article, I don’t like all-in-one solutions like these and I’m not interested in them.

    The headless ones (based on Windows Server) are –as designs– more to my taste, like Audiophile Optimizer (~$100/€80) and JPlay (~$130/€100). Apart from that, you will of course need to buy a proper Windows Server license, which is an art in itself and will set you back more than $300/€300 (just an estimation).

    Apart from the price, the Windows based “products” all suffer from two intrinsic problems. The first one is that Windows stopped at USB Audio Class 1 and never added support for Class 2, which was defined back in 2006. As a result, there’s no native UAC2 support in Windows, which means you have to revert to third party (and closed source) drivers, which is something I’m surely not after. The other problem is that they can only go forward by going backwards, ie. by reverse engineering. Thereby they’re battling their supplier of choice, which seems silly in my opinion. Generally these “products” consist of registry tweaks and scripts that disable standard services or tweak the system in some way. Their developers bet they can get and keep the OS, drivers and software in control that way, and hopefully the 25.000 remaining settings, proprietary drivers and their updates don’t interfere with their plans. The same applies to Apple, although the underlying OS does offer more possibilities.

    On the other hand, using free and open source software one can design and build a custom dedicated OS with playback software for a single purpose; getting the AES/EBU signal from the files on the network to your external UAC2 DAC in the best possible way.

    Some of my fellow enthusiasts have created some great things based on free and open software. AudioPhile Linux is in active development and uses Arch, which is fitted with a custom realtime kernel and mpd. Voyage MPD, the first audio oriented system in a single compressed image, together with Vortexbox are geared towards small and cheap embedded DiY platforms, like Beaglebone and RaspberryPi.

    Mine consists of a fully automated silent installation of a heavily customized (not reverse engineered) Debian with a custom compiled kernel based on the stock backported realtime kernel. It uses stock mpd and alsa modules and libraries and achieves great results.

    Shibatch SSRC Packages for Debian and Ubuntu

    Now you can use packages for Shibatch SSRC, the best-in-breed open source sample rate converter for digital audio, for easy installation in Ubuntu and Debian.

    Using and installing the packages

    SSRC on Ubuntu

    Ubuntu users can use the corresponding PPA for easy package installation by opening a terminal window (by pressing CTRL+ALT+T on the keyboard) and copying/pasting the following text, followed by pressing [ENTER]:

    sudo add-apt-repository -y ppa:ssrc-packaging-group/ppa
    sudo apt-get update
    sudo apt-get install ssrc


    SSRC on Debian

    Users of Debian can install SSRC using the following actions from the command line, which involve importing my public gpg key from MIT’s keyserver into apt, and then adding my custom apt repository.

    # become root
    wget -O - ";search=0xFBF05DDFC04DF16B" | apt-key add -
    echo "deb lacocina-stable/" | tee  /etc/apt/sources.list.d/
    apt-get update
    apt-get install ssrc

    Alternatively, you can just download a single deb:
    # become root
    ARCH=$(uname -m | sed 's/x86_/amd/;s/i[3-6]86/i386/')
    DIST=$(grep ^VERSION= /etc/os-release | sed 's/^.*(\([[:alpha:]]*\))"$/\1/')
    wget "${DEB}"
    dpkg -i ${DEB}
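
    To see what the two sed expressions above produce, they can be tried on sample input. This sketch wraps them in two helper functions (the function names are mine); the sample VERSION line is the one Debian 8 ships in /etc/os-release.

```shell
#!/usr/bin/env bash
# normalize_arch MACHINE: map `uname -m` output to a Debian architecture name
normalize_arch() {
    printf '%s\n' "$1" | sed 's/x86_/amd/;s/i[3-6]86/i386/'
}
# codename_from_version_line LINE: extract the codename between parentheses
# from a VERSION= line as found in /etc/os-release
codename_from_version_line() {
    printf '%s\n' "$1" | sed 's/^.*(\([[:alpha:]]*\))"$/\1/'
}

normalize_arch x86_64                               # prints amd64
normalize_arch i686                                 # prints i386
codename_from_version_line 'VERSION="8 (jessie)"'   # prints jessie
```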

    SSRC packages are available for Debian old-stable (squeeze), stable (wheezy), testing (jessie) and unstable (sid).


    Looking for other distributions?
    Apart from these packages, an Arch package is available in the AUR.
    Notice a problem with the (upstream) software itself?
    Please contact the upstream developers.
    Have a bug regarding Debian and Ubuntu packaging?
    Please submit a new bug in the appropriate issue tracker on github
    How good is ssrc in interpolation (eg. converting from 96kHz to 44.1kHz)?
    Have a look at the extensive list of resampling software and hardware at the “Sample Rate Conversion Comparison Project” from mastering studio infinite wave, to compare ssrc to sox, adobe audition, cubase and lots of other samplerate converters.
    What’s going on under the hood when doing interpolation (eg. converting from 96kHz to 44.1kHz)?
    Have a look at the extensive white paper “Digital Audio Resampling Home Page” by Julius O. Smith III of the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford University.


    Packaging updates


    Current packages

    Shibatch SSRC  Distribution  Release
    1.4.0          Ubuntu        16.04, 15.10, 15.04, 14.10, 14.04 LTS, 13.10, 13.04, 12.10, 12.04
                   Debian        jessie/stable (8), stretch/testing, sid/unstable

    Workflow for packaging

    The workflow for dealing with upstream changes, changes to the packaging for Debian, and the automatic building on launchpad, is extensively documented in

    Quimup packages for Ubuntu and Debian

    In the quimup-packaging repository on github, I track releases of Quimup (formerly known as Guimup), another graphical MPD-client written in Qt5. One may use the repository to create the corresponding installation packages for Ubuntu and Debian, or use the packages I’ve created.


    Using and installing the packages

    Quimup on Ubuntu

    Ubuntu users can use the corresponding PPA for easy package installation by opening a terminal window (by pressing CTRL+ALT+T on the keyboard) and copying/pasting the following text, followed by pressing ENTER:

    sudo add-apt-repository -y ppa:quimup/quimup
    sudo apt-get update
    sudo apt-get install quimup


    Quimup on Debian

    Users of Debian can install quimup using the following actions from the command line, which involve importing my public gpg key from MIT’s keyserver into apt, and then adding my custom apt repository.

    # become root
    wget -O - "" | apt-key add -
    echo "deb lacocina-stable/" | tee  /etc/apt/sources.list.d/
    apt-get update
    apt-get install quimup

    Alternatively, you can just download a single deb:

    # become root
    ARCH=$(uname -m | sed 's/x86_/amd/;s/i[3-6]86/i386/')
    DIST=$(grep ^VERSION= /etc/os-release | sed 's/^.*(\([[:alpha:]]*\))"$/\1/')
    wget "${DEB}"
    dpkg -i ${DEB}

    Quimup packages are available for all current Debian versions (amd64 only) and can be downloaded directly from


    Looking for other distributions?
    The upstream developer keeps a list; an Arch package is available in the AUR.
    Notice a problem with the (upstream) software itself?
    Please contact the upstream developer.
    Have a bug regarding Debian and Ubuntu packaging?
    Please submit a new bug in the appropriate issue tracker on github

    Packaging updates


    Current packages

    Quimup  Distribution  Release
    1.4.0   Ubuntu        15.10, 15.04, 14.10, 14.04 LTS
            Debian        jessie/stable (8), stretch/testing, sid/unstable
    1.3.2   (n.a.)        (n.a.)
    1.3.1   Ubuntu        12.04 LTS, 12.10, 13.04.1 and 13.10.1
            Debian        squeeze (6), wheezy (7)

    Technical details

    From upstream tarball to local Debian packages and an Ubuntu ppa

    After creating a basic working structure for packaging, I built the packages using the workflow described in PackagingWithGit on the Debian Wiki. For it to work reasonably smoothly, I use the following branch layout:

    master         # development of files in the ./debian subdirectory
    debian/x.x.x-y # tagged branch for a debian package release
    upstream/x.x.x # tagged branch for an upstream release

    When the original tarball from Sourceforge is updated by Coon, you can update the master branch in the git repository using:
    ## 1. create an upstream/x.x.x branch,
    ##    download the latest upstream tarball from sourceforge,
    ##    extract its contents into the new branch, and switch
    ##    back to the `master' branch, where changes in the
    ##    `debian'-subdirectory are tracked:
    git-import-orig --uscan --verbose
    ## 2. manually update files in the ./debian subdirectory (debian/changelog,
    ##    debian/control, and if necessary debian/rules):
    ${EDITOR:-emacs} debian/control debian/rules # ...
    ## 3. increase the debian version in debian/changelog
    dch -i 
    ## 4. try to build the new version
    ## 5. repeat 3. and 4. until it works, possibly adding `--git-ignore-new' to
    ##    `git-buildpackage'
    ## 6. commit the changes (to master)
    git commit -a -m '(describe fix, reference github issue #x)'
    ## 7. tag the build; this creates the git branch/tag debian/x.x.x-y, 
    ##    where y is the debian version for the x.x.x. upstream version.
    git-buildpackage --git-tag --git-ignore-new
    ## 8. publish the tagged debian/x.x.x-y version as a new branch.
    ## when only the ./debian files are updated, specify y like former y + 1
    git push origin debian/x.x.x-y
    ## 9. publish the changes to the published master branch (necessary for bzr)
    git push
    ## 10. request a new bzr `import` at
    ## 11. force a new build of the daily recipe at


    This updates the (pristine) sources from the latest Sourceforge tarball in the upstream/x.y.z branch, while having the debian-version tagged branch in debian/x.y.z-a and leaving my packaging work intact in the master branch. Super!
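
    The “former y + 1” rule from step 8 is easy to get wrong by hand. A small helper, sketched here under the assumption that tags always follow the debian/x.x.x-y pattern (the function name is hypothetical), computes the next tag name:

```shell
#!/usr/bin/env bash
# next_debian_tag TAG: given debian/x.x.x-y, print debian/x.x.x-(y+1)
next_debian_tag() {
    local tag="$1"
    local base="${tag%-*}"      # everything before the last dash: debian/x.x.x
    local rev="${tag##*-}"      # everything after the last dash: y
    echo "${base}-$((rev + 1))"
}

next_debian_tag debian/1.4.0-1   # prints debian/1.4.0-2
```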

    The changes in the github branch are automatically pulled by the bzr branch in launchpad on a daily basis. From that bzr branch I’ve created a daily packaging recipe, which builds packages for recent Ubuntu releases as soon as a change is detected in the bzr branch.


    Obligatory screenshots

    Quimup 1.3.1 – Media browser (by Year)


    Script to convert FLAC files using Shibatch Sample Rate Converter (SSRC) while preserving meta data

    When using your PC as a high end audiophile transport device for delivering digital audio to an external DA-converter as explained in How to turn Music Player Daemon (mpd) into an audiophile music player, one has to make sure that the audio files are in a suitable audio format for the DAC. Consult the Background section below to determine which formats are suitable.

    The flac-src-shibatch script will take a directory containing digital audio files, and convert it to a chosen bitdepth and sample rate using the not-too-bad Shibatch sample rate converter in “twopass non-fast mode” while preserving the flac metadata stored in the original files. It will put the original files in a tar ball for future usage.


    1. Install SSRC: See Shibatch SSRC Packages for Debian and Ubuntu
    2. Install flac and sox*:
      sudo apt-get install flac sox
    3. Download and unpack the source for the script:
      wget -O


    Ad *: Although sox can be used as an audio backend and sample rate and format converter, in this scenario we only install it for obtaining the convenient soxi file analyzer.


    The script resamples and replaces the original files in the directory and makes a tar ball containing the original files:

    cd flac-src-shibatch-master
    ./convert-flac $sourcedir

    Where $sourcedir is a directory containing high resolution tagged flac files.


    The designer of my excellent Pink Faun 3.24 USB DAC fitted it with a Tenor USB chip and limited the maximum rate to that of USB Audio Class 1 (24bit / 96KHz), to make sure all available operating systems can use the DAC as an audio device without having to install additional drivers.

    However, when buying audio at HD Tracks or Channel Classics, I want to get the best format available; currently that would be 24bit/176KHz or 24bit/192KHz. After buying and downloading the files I use Musicbrainz Picard to tag and rename the original files. If necessary I create the Musicbrainz entries myself. I then use my script to downsample the files to 24bit/96KHz when they have a sample rate which is higher than 96KHz. Files with lower sample rates are left alone.

    To determine which formats are natively accepted by your USB DAC, and how it actually behaves when feeding it a certain format, have a look at What digital audio format (bit depth and sample rate) does your alsa sound card support and what does it actually use?

    The resulting quality

    The quality of the resulting file depends on a number of factors. The most important factor is the ratio between the original and the desired samplerate.

    For instance, when the sample rate of the source file is a multiple of fs = 48KHz, e.g. (fs*4 =) 192KHz or (fs*8 =) 384KHz, the conversion in my case is a matter of simple calculation: keep only every nth sample, where n is the multiplier divided by two (the target being fs*2 = 96KHz). This effectively makes the audio signal more coarse, because the time between samples grows by a factor n. Furthermore, the available frequency band is narrowed (divided by n), following the Nyquist-Shannon theorem (fs/2 – the width needed for the lowpass filter). For example, a signal sampled at 192KHz leaves room for a (theoretical) frequency band of <96KHz, while the downsampled 96KHz signal is limited to <48KHz. However, there will be no further interpretation of the audio like in the cases below.
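
    The arithmetic of this paragraph can be written down in a few lines of shell. The function names are invented for this illustration and only do the sums; they don't touch any audio.

```shell
#!/usr/bin/env bash
# Integer ratio n between source and target rate, e.g. 192000 -> 96000 gives 2.
# Fails when the source rate is not an integer multiple of the target rate.
decimation_factor() {
    local src="$1" dst="$2"
    [ "${dst}" -gt 0 ] || return 1
    [ $(( src % dst )) -eq 0 ] || return 1
    echo $(( src / dst ))
}

# Theoretical upper limit of the frequency band (fs/2), ignoring the
# transition width needed for the lowpass filter.
nyquist_hz() {
    echo $(( $1 / 2 ))
}

# decimation_factor 192000 96000   # prints 2
# decimation_factor 384000 96000   # prints 4
# nyquist_hz 96000                 # prints 48000
```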

    For formats based on fs = 44.1KHz, like (fs * 2 =) 88.2KHz and (fs * 4 =) 176.4KHz, I need to let shibatch interpret the source and calculate a new file based on its internal algorithms, a finite impulse response (FIR) filter applied through Fast Fourier Transforms (FFT). Furthermore, shibatch can be told to apply dithering when desired. This of course means that your mileage may vary, depending on the source material. It can be rewarding to try different parameters while listening to the resulting file(s).
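
    Which of the two cases applies to a given file can be decided from its sample rate alone. Below is a minimal sketch with a made-up function name, assuming 96KHz as the maximum rate of the DAC; the actual script may well decide differently.

```shell
#!/usr/bin/env bash
# Classify a source sample rate against the maximum rate of the DAC
# (assumed default: 96000 Hz):
#   keep      - rate is not higher than the maximum, leave the file alone
#   decimate  - rate is an integer multiple of the maximum (48KHz family)
#   resample  - no integer ratio (44.1KHz family), needs real interpolation
conversion_strategy() {
    local src="$1" max="${2:-96000}"
    if [ "${src}" -le "${max}" ]; then
        echo "keep"
    elif [ $(( src % max )) -eq 0 ]; then
        echo "decimate"
    else
        echo "resample"
    fi
}

# with sox installed, the rate of a file can be obtained with soxi:
#   conversion_strategy "$(soxi -r some-file.flac)"
```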

    The script is available for downloading, sharing and modification at

    How to setup a bit-perfect digital audio streaming client with free software (with LTSP and MPD)


    This article describes how to create a software environment on your desktop computer which will make it possible to boot a second computer from the network and use that as the interface between the desktop computer –from which the audio files and the boot environment are served– and the external audio equipment.

    The goal is to achieve maximum audio quality using low-cost off-the-shelf equipment and free software, while at the same time offering ease of configuration and use. Of course this comes with a price: reading and understanding this article.

    When you simply want your PC (with a locally installed mpd) to act as an audiophile transport device, please have a look at the article “How to turn Music Player Daemon (mpd) into an audiophile music player”.

    Background and goals

    I love both music and free software.

    After having used the Slimdevices Squeezebox as a streaming client connected to a MSB Link DACII (review) for seven years, I got tired of the fact that I couldn’t use any other client devices, software or formats: the squeezebox client device only outputs 16bit/44.1KHz and can only communicate with the squeezebox server software. Furthermore, my audio equipment didn’t fit my life any more (too big and THX/surround targeted).

    In September 2010 I therefore bought a new audio system consisting of small and relatively cheap Opera Mezza stereo speakers, the 24 bit Pink Faun USB DAC 3.24 and a Pink Faun D-Power 140 power amplifier in which the excellent engineers of Pink Faun fitted their newly developed remote controlled volume control. The speakers are connected to the amp using Pink Faun SC-4 cable, while Pink Faun IL-2SE cable is used between the amplifier and the DAC. The power and USB cables I use are all standard cheap ones and leave room for improvement.

    After buying this set I started using xbmc on a laptop which was connected to the DAC with a cheap USB cord and really loved the sound. I couldn’t wait to play native 24bit / 96KHz music. All my digital music files were flacced EAC rips of my own CDs. I bought some 24/96 albums from but the results were disappointing. After waiting for a while (to let my rather new equipment settle) I tried again with the same results. On the ergonomic side I really liked the fact that I could use my laptop both as a media controller and as sort of a black box by using whatever software client (like banshee and rhythmbox) on my desktop PC and the DLNA/upnp features of pulseaudio.

    On the Internet I learned that although the input consisted of 24bit/96KHz files, the actual audio stream received by the USB-DAC was in fact down sampled to 16bit/44.1KHz by some piece of software!

    My DAC doesn’t indicate which bit depth or sample rate it uses, but watch out. Even if your DAC does indicate 24bit/96KHz when playing a high resolution file, it could also do so with other formats. That is because the software is also able to upsample, using an arbitrary and bad format conversion algorithm like libsamplerate. To complicate things further, a lot of DACs try to do the same on the hardware side.

    The worst case scenario is that when you play a 24bit/96KHz file, it first gets downsampled to 16bit/44.1KHz by the software –again using a fast and sloppy algorithm– and then gets upsampled to 24bit/96KHz by the DAC. Instead of crispy tube-like sound and depth, you get an MP3 experience.

    It seems the price for convenience buys you a ticket to 1990’s ideas of sound perception.

    Default pulseaudio with alsa configurations perform on the fly bit depth and sample rate conversion for ensuring a pop and crackle free “plug-n-play” experience (see pulseaudio ticket #930: Media players report 96khz, proc reports 44khz and the article “Which digital audio formats does your USB DA-converter support and use?”). Most audio enthusiasts won’t be thrilled by these “qualities” and would like to leave decoding of the audio stream to (more suitable) external audio equipment like a DA-converter. Sample-rate conversion should be avoided altogether when possible. Currently, one has to bypass the default pulseaudio layer altogether and alter the default alsa configuration to get multi-format bit-perfect output to the external DAC.
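
    What alsa is actually sending to an interface can be read from its `hw_params` file under `/proc/asound` while something is playing. The small helper below is my own illustration; it only extracts the format and the rate from such a file, and the card, device and subdevice numbers in the usage line are assumptions that depend on your system.

```shell
#!/usr/bin/env bash
# Print "<format> <rate>" from the contents of an alsa hw_params file read
# from stdin, or "closed" when the interface is not in use.
parse_hw_params() {
    awk '
        $1 == "closed"  { print "closed"; found = 1; exit }
        $1 == "format:" { format = $2 }
        $1 == "rate:"   { rate = $2 }
        END { if (!found && format != "" && rate != "") print format, rate }
    '
}

# usage while a track is playing (card 1, device 0, subdevice 0 assumed):
#   parse_hw_params < /proc/asound/card1/pcm0p/sub0/hw_params
```

    When a 24bit/96KHz file is playing but this reports a rate of 44100, something in the chain is resampling.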

    In short, my wanted setup had the following requirements:

    • having bit-perfect audio for all digital formats accepted by my DAC
    • having the freedom of only using free software
    • having the freedom of only using open –non patent encumbered– file formats
    • having the freedom to use any client device as a streaming digital audio device
    • having the freedom to use any device for controlling the streaming digital audio device (acting like a “media controller”)

    To achieve these goals I’ve made the following choices:

    • create a LTSP environment using my desktop PC as the LTSP server
    • installing a MPD client on the desktop PC or Android client acting as a media controller
    • using an old HP t5725 thin client as the LTSP client
    • this thin client will also act as a (headless) MPD server

    The result is whopping! I can plug any PXE-capable PC, thin client, plug computer or laptop into my LAN and connect it to my DAC with USB. After 15 seconds of booting (serving the LTSP image from an old-school SATA disk) it is a bit-perfect audio streaming client. I can use any PC, laptop, tablet or smartphone to control the music player and browse through my playlists and music library.


    Step 1: Create a working LTSP-server environment

    Follow the instructions to set up a LTSP server and LTSP client image using

    This process also involves installing and/or configuring DHCP and TFTP to serve the generated LTSP image to LTSP clients.

    The result should be:

    • a working DHCP server which has `dhcp-boot` options filled in for PXE enabled clients
    • a working TFTP server with LTSP client boot scripts in `/var/lib/tftpboot/ltsp/i386`
    • a working LTSP fat client chroot environment in `/opt/ltsp/i386`

    Step 2: Install mpd in the LTSP client chroot

    Install `mpd` in the LTSP client chroot and disable the default startup scripts to prevent loading mpd at system start as a system daemon.

    Open a terminal on the server using CTRL+ALT+T, become root and leave this terminal open throughout this exercise.

    sudo su  # become root

    Determine where you want to place files.

    • The location of the music library, which in my case consists of a directory per album containing tracks in FLAC format, is presumed to be `/srv/media/music` in this example, change it to your preference.
    • The home directory of the user that will run mpd, will store the settings and database for mpd. In this example we’ll use `/var/lib/mpd` for that purpose. Again, change this to your preference.

    ## still in the root terminal window on the server
    export music_dir="/srv/media/music" # directory containing music files
    export mpd_dir="/var/lib/mpd"       # home directory for the mpd user

    Next, create a system user account on the server for mpd with the proper group id and assign a randomly generated password to it.

    ## still in the root terminal window on the server
    # generate a group for accessing music files and directories
    export group_music="music"
    getent group ${group_music} || addgroup --system ${group_music}
    # generate a group for accessing the mpd home directory, its settings and database
    export group_mpd="mpd"
    getent group ${group_mpd} || addgroup --system ${group_mpd}
    # extract the group id of the mpd group
    export group_mpd_gid=$(getent group ${group_mpd} | awk -F: '{print $3}') 
    # determine the name for the system account that will start mpd
    export user_mpd="mpd"
    # create it with the proper home directory, group id and username
    getent passwd ${user_mpd} || \
     adduser --system --home "${mpd_dir}" --shell /bin/bash --gid ${group_mpd_gid} ${user_mpd}
    adduser ${user_mpd} audio          # make sure it can access the audio hardware
    adduser ${user_mpd} ${group_music} # make sure it can access the music files 
    chown -R ${user_mpd}:${group_mpd} "${mpd_dir}"   # fix permissions on the home directory
    chgrp -R ${group_music} "${music_dir}"     # make the music group owner of the music dir
    # assign a random generated password to the user
    which pwgen || apt-get install pwgen # make sure pwgen is installed
    export password=$(pwgen 7 1)         # generate 1 random password of 7 characters
    echo "${user_mpd}:${password}" | chpasswd    # assign it the generated password
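
    Before moving on, it doesn't hurt to verify that the account ended up in the right groups. A small sketch; the helper names `list_contains` and `check_groups` are invented here, `id -nG` does the real work.

```shell
#!/usr/bin/env bash
# succeed when the first argument equals one of the remaining arguments
list_contains() {
    local wanted="$1" item
    shift
    for item in "$@"; do
        if [ "${item}" = "${wanted}" ]; then
            return 0
        fi
    done
    return 1
}

# usage: check_groups <user> <group>...
# reports every listed group the user is not a member of
check_groups() {
    local user="$1" group missing=0 member_of
    shift
    member_of="$(id -nG "${user}" 2>/dev/null)"
    for group in "$@"; do
        # word splitting of ${member_of} is intended here
        if ! list_contains "${group}" ${member_of}; then
            echo "${user} is not a member of ${group}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# with the variables of this article:
#   check_groups "${user_mpd}" audio "${group_music}" "${group_mpd}"
```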

    Finally we enter the ltsp chroot created earlier, install mpd and prevent it from being started from its init scripts when the ltsp client boots, because we’ll configure that later on.

    ## still in the root terminal window on the server
    ltsp-chroot -c -d -p      # enter the LTSP client chroot environment on the server
    apt-get install mpd       # install mpd in the chroot 
    # comment the line to make sure mpd will not be started by the init scripts
    sed -i 's/\(START_MPD\)/#\1/' /etc/default/mpd 
    exit                      # exit the ltsp chroot (but leave the terminal screen open)
    ltsp-update-image --arch i386 # update the initrd image
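
    The sed one-liner above adds another `#` each time it is run. A variant that can be repeated safely could look like the sketch below, assuming GNU sed and a defaults file with a line like `START_MPD=true`, which may differ per mpd package version.

```shell
#!/usr/bin/env bash
# Comment out the START_MPD line of a defaults file, unless it already is
# commented out. Assumes GNU sed (for the -i option).
disable_start_mpd() {
    local defaults_file="$1"
    sed -i 's/^[[:space:]]*\(START_MPD=.*\)$/#\1/' "${defaults_file}"
}

# inside the LTSP chroot this would be:
#   disable_start_mpd /etc/default/mpd
```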

    Step 3: disable pulseaudio for the LTSP client and make the music library accessible

    Dynamic configuration of LTSP clients can be achieved through the configuration file `/var/lib/tftpboot/ltsp/i386/lts.conf` on the LTSP server. Perform the following on the server to add the line which disables the auto configuration mechanism for redirecting pulseaudio, while leaving alsa intact.

    Furthermore, we’ll add the proper lines to `lts.conf` to make the music directory accessible through sshfs.

    ## still in the root terminal window on the server
    cat >> /var/lib/tftpboot/ltsp/i386/lts.conf <<EOF
    SOUND = False
    LOCAL_APPS = True
    LOCAL_APPS_EXTRAMOUNTS = ${mpd_dir},${music_dir}
    EOF
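
    Because `cat >>` appends, running such a command twice leaves duplicate lines in `lts.conf`. A minimal sketch of a helper which only appends a line when it isn't there yet; `ensure_line` is a name I made up and not part of LTSP.

```shell
#!/usr/bin/env bash
# Append a line to a file, unless the file already contains exactly that line.
ensure_line() {
    local file="$1" line="$2"
    touch "${file}"
    # -x matches whole lines only, -F takes the line as a fixed string
    grep -qxF -- "${line}" "${file}" || printf '%s\n' "${line}" >> "${file}"
}

# for example:
#   ensure_line /var/lib/tftpboot/ltsp/i386/lts.conf "SOUND = False"
```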



    Step 4: Prepare the user environment

    One has to configure a user account on the server which will automagically logon in the LTSP client after it is booted, determine and configure the preferred sound card as exposed by the external DAC and start serving mpd on the local network.

    Next, test the LTSP client environment by booting the LTSP client and logging in as the mpd user created above. If you can login and get a proper desktop, continue with the following steps.

    Step 5: Download and unpack the mpd-configure script to configure alsa and mpd

    Again on the server, download and unpack the mpd-configure script in `${mpd_dir}/mpd-configure` (`/var/lib/mpd/mpd-configure` in this example). The basic example below will work if you have a single USB DAC. See the `README` file in the `./mpd-configure` directory for additional and more advanced settings.

    ## still in the root terminal window on the server
    export dir_mpdconf="${mpd_dir}/mpd-configure" # target for the mpd-configure script
    mkdir -p "${dir_mpdconf}"   # create it
    cd "${dir_mpdconf}"         # change to it
    # download and unpack the latest version of the script
    wget -O - | tar --strip-components=1 -zxf - 
    # next store the settings in the file ${HOME}/mpd-configure/mpd-configure.conf 
    # for easy retrieval later on.
    cat >> "${dir_mpdconf}/mpd-configure.conf" <<EOF
    export CONF_MPD_MUSICDIR="${music_dir}" # directory holding the music files
    export CONF_MPD_HOMEDIR="${mpd_dir}"    # home directory for the user running mpd
    export LIMIT_INTERFACE_TYPE="usb"       # limit to usb audio devices
    EOF

    Test the script by logging on to the LTSP client as user mpd, starting a terminal and executing the following commands. Next, examine the output and modify the settings in `mpd-configure.conf` until the following conditions are met:

    1. it shouldn’t ask questions about which audio interface you want to use
    2. The paths should match those specified earlier

    ## logged on to the ltsp client as user mpd
    export dir_mpdconf="${HOME}/mpd-configure"
    bash ${dir_mpdconf}/mpd-configure

    If the test above succeeds, try starting the mpd application by executing the following command (still logged on the LTSP client as the mpd user).

    ## still in the mpd terminal window on the client
    bash ${dir_mpdconf}/mpd-configure > ${HOME}/mpd.conf # generate the mpd.conf file
    /usr/bin/mpd ${HOME}/mpd.conf   # start mpd using the generated .mpdconf
    ps auxwww | grep "/usr/bin/mpd" # check if the mpd process has been started
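
    Note that `ps | grep` also matches the grep process itself. A more telling check is whether mpd answers on its port: mpd greets every client with a line starting with `OK MPD`, and listens on port 6600 unless configured otherwise. The helper names in this sketch are my own.

```shell
#!/usr/bin/env bash
# succeed when the given line looks like the greeting mpd sends on connect
is_mpd_greeting() {
    case "$1" in
        "OK MPD "*) return 0 ;;
        *)          return 1 ;;
    esac
}

# print the first line received from host:port, using bash's /dev/tcp
# pseudo device; defaults (localhost, 6600) are the usual mpd settings
mpd_greeting() {
    bash -c 'exec 3<>"/dev/tcp/$1/$2" && IFS= read -r -t 5 line <&3 && printf "%s\n" "$line"' \
        _ "${1:-localhost}" "${2:-6600}" 2>/dev/null
}

# on the LTSP client:
#   is_mpd_greeting "$(mpd_greeting)" && echo "mpd is up"
```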

    Step 6: Install a mpd client on the LTSP server

    On the server you should now be able to connect to the mpd application running on the LTSP client with a mpd client like the Gnome Music Playing Client (gmpc). Execute the following on the server to install and run this application.

    ## back in the root terminal window on the server
    add-apt-repository ppa:gmpc-trunk/mpd-trunk # add the mpd ppa to the server
    apt-get update # refresh the list with available packages
    apt-get install gmpc # install gmpc on the server


    You may now start GMPC by looking for it in the starter menu on your normal desktop. In the configuration dialog which appears when you first start gmpc enter the IP address of the LTSP client in the Host section and check the option Automatically connect.

    Step 7: Automate the logon process

    If these tests succeed, modify `${mpd_dir}/.profile` to make sure the script above will start automagically after the mpd user logs on to the LTSP client.

    # still logged on as root on the server
    # modify the startup script of the user running mpd
    # (the quoted 'EOF' keeps ${HOME} and ${MPD_CONF} literal, so they are
    # expanded when the mpd user logs on, not while root writes the file)
    cat >> ${mpd_dir}/.profile <<'EOF'
    export MPD_CONF="${HOME}/mpd.conf"
    /usr/bin/mpd --kill               # stop mpd if it's running
    ${HOME}/mpd-configure/mpd-configure > "${MPD_CONF}" && \
     /usr/bin/mpd "${MPD_CONF}"    # start mpd with the script generated config
    EOF
    chown ${user_mpd}:${group_mpd} "${mpd_dir}/.profile" # fix ownership
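
    Whether the delimiter of a here-document is quoted decides who expands variables like `${HOME}`: the shell that writes the file, or the shell that later reads it. The two functions below, invented for this illustration, show the difference; `/root` stands in for the home directory of the user writing the file.

```shell
#!/usr/bin/env bash
# unquoted delimiter: the shell writing the text expands the variable
render_unquoted() {
    local HOME="/root"     # stand-in for the home directory of the writer
    cat <<EOF
export MPD_CONF="${HOME}/mpd.conf"
EOF
}

# quoted delimiter: the text is written literally, to be expanded later
render_quoted() {
    cat <<'EOF'
export MPD_CONF="${HOME}/mpd.conf"
EOF
}

# render_unquoted   # prints: export MPD_CONF="/root/mpd.conf"
# render_quoted     # prints: export MPD_CONF="${HOME}/mpd.conf"
```

    For a `.profile` that is written by root but read by the mpd user, the literal form is the one you want.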

    As a final test, reboot the LTSP client and logon as the `mpd` user ; now mpd should be automatically configured and running.

    Finally, modify the LTSP client configuration `/var/lib/tftpboot/ltsp/i386/lts.conf` on the LTSP server. The following command adds the proper lines to make sure the user running mpd logs on automatically after starting the LTSP client.

    ## still in the root terminal window on the server
    cat >> /var/lib/tftpboot/ltsp/i386/lts.conf <<EOF
    LDM_USERNAME = ${user_mpd}
    LDM_PASSWORD = ${password}
    EOF

    Restart the LTSP client; it now should automatically logon and start the properly configured mpd. Test it by using the gmpc on the server.

    References and thoughts

    Drawbacks of the setup

    In this setup, one has to make sure that each audio file is properly formatted according to the DAC’s native formats.

    The designer of my DAC, Matthijs de Vries, has some good reasons to limit the USB interface of the DAC to 24bit/96KHz as he explains in Dutch in his White Paper “Pink Faun DAC 2 (Oktober 2010)”.

    A higher data rate theoretically provides a better output. This appears still not completely true at this time. The disadvantages of 24/192 [USB DAC’s] are:

    1. [Support for 24/192 USB-DAC’s] is only possible in the newest operating software versions, and is asynchronous. As a result, one needs to fall back on self-developed drivers, which adds another layer in the software stack. 24/96 usually works with the native drivers which are already present [in current operating systems]. The user doesn’t have to install any additional software.
    2. The electronics are much more complex with 24/192; both DSP’s and numerous operations are necessary. The cure quickly becomes worse than the problem.
    3. The sampling frequency determines the resolution bandwidth, which is around 40KHz at 96kHz sampling rate. 192 kHz only gains so much in the audible range.
    4. [Galvanic] isolation does not currently work with 24/192, at the moment, 24/96 is the maximum achievable.

    The engineer is –like me– a child of the spirit of early 90’s Dutch universities. He therefore was introduced to a Microsoft dominated computer world, rather than the pre-90’s environments in which open source and Unix dominated universities. You can’t blame him that his perspective on software is Microsoft oriented. Although Microsoft tried to frustrate the further development of the USB Audio Class standard by bailing out and not developing a driver for USB Audio Class 2, fact of the matter is that asynchronous 192kHz USB Audio Class 2 is natively supported on both Linux and Mac since 2010. So his first statement is wrong.

    Unfortunately my DAC doesn’t support sample rates which are a multiple of 44.1KHz, like CD (1x = 44.1KHz), DVD-audio (2x = 88.2KHz) or PCM converted from DSD (64x = 2822.4KHz DSD converted to 4x = 176.4KHz PCM). This means that I have to resample those files using a good-as-it-gets software converter. In my case, I chose to upsample 88.2KHz to 96KHz and downsample 176.4KHz to 96KHz with the following script, which uses the Shibatch SRC in twopass non-fast mode.

    To see how your USB DAC behaves, have a look at the article “Which digital audio formats does your USB DA-converter support and use?”.

    Good luck

    Getting rid of pulseaudio without breaking your system

    When you (occasionally) want to use a normal linux desktop
    computer as a bit perfect music streamer, its
    underlying sound system (called alsa) needs exclusive access to the
    audio interface like an external USB DAC. However, in normal linux
    desktop installations this leads to errors because of pulseaudio
    sitting in the way.

    Pulseaudio is sound software which is installed by default in all current linux distributions and desktop installations and acts as a proxy for the “real” sound software, like alsa or OSS. Desktop environments and applications expect pulseaudio to be present and functional. So even when it would be possible to remove the pulseaudio software –which isn’t the case in many distributions– it is not a good option and could break the system configuration, preventing applications and software management from functioning properly.

    For dedicated headless audiophile audio playback computers, I would
    advise against installing a desktop environment and pulseaudio.

    However, even though pulseaudio can’t (and shouldn’t be) un-installed, it can easily be told to stay out of the way.

    Manually disabling pulseaudio

    This is done by setting the parameter autospawn to off in the pulseaudio client configuration file ~/.config/pulse/client.conf. Using your favorite text editor, like gedit or nano, add the following line to that file (if it exists, or create it when it doesn’t). Replace any other line that starts with autospawn=, and make sure the line is uncommented, ie. remove the pound # sign at the beginning of such a line:

    ## contents of ~/.config/pulse/client.conf
    autospawn=off

    Save the file, and log off and on again for the setting to take effect.

    From then on pulseaudio clients –like your desktop environment, mediaplayers and web browsers– will not be able to (re)start the real pulseaudio application (called pulseaudio daemon) when needed. This gives you control over pulseaudio, instead of the desktop environment. This of course also means that this setting prevents the pulseaudio daemon from starting at logon.

    If you want to play audio using a pulseaudio client, you’ll have to start the pulseaudio daemon by hand first:

    pulseaudio --start


    Afterwards, when you want to get it out of the way again, you can simply stop pulseaudio daemon with:

    pulseaudio --kill

    Alternative: disabling pulseaudio client and stopping pulseaudio daemon on the fly

    Alternatively, one may set this option and stop any running pulseaudio daemon by copying and pasting the following commands in a terminal, after having stopped any application that might use audio, like your webbrowser:

    ## make sure the directory for client.conf exists
    mkdir -p ~/.config/pulse
    ## make a backup of an existing client.conf (if any)
    [[ -f ~/.config/pulse/client.conf ]] && cp -f ~/.config/pulse/client.conf{,.backup}
    ## set the autospawn parameter (this overwrites the file)
    echo "autospawn=off" > ~/.config/pulse/client.conf
    ## kill pulseaudio daemon (if it was running)
    pulseaudio --kill

    The effect is the same as the previous step, without having to log off and on again.
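
    The commands above replace the whole `client.conf`. When that file holds other settings you want to keep, only the autospawn lines need to change. A minimal sketch of that idea; `set_autospawn_off` is a name made up for this example and not part of pulseaudio.

```shell
#!/usr/bin/env bash
# Set autospawn=off in a pulseaudio client.conf while keeping other settings.
# Existing autospawn lines, commented out (# or ;) or not, are dropped first.
set_autospawn_off() {
    local conf="${1:-${HOME}/.config/pulse/client.conf}"
    local tmp
    mkdir -p "$(dirname "${conf}")"
    touch "${conf}"
    tmp="$(mktemp)"
    # grep -v exits non-zero when nothing is left to print, which is fine
    grep -vE '^[[:space:]]*[#;]?[[:space:]]*autospawn[[:space:]]*=' "${conf}" > "${tmp}" || true
    echo "autospawn=off" >> "${tmp}"
    cat "${tmp}" > "${conf}"
    rm -f "${tmp}"
}
```

    Afterwards, `pulseaudio --kill` stops a running daemon as before.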

    Disabling pulseaudio for a single command

    Sometimes all you need is to run a single command, for instance aplay -l to get a list of alsa cards. For those scenarios you don’t need to create or modify the client.conf file or kill pulseaudio. You can just use pasuspender as a pseudo shell for your command:

    ## run aplay with full access to all alsa hardware devices, 
    ## by temporarily suspending pulseaudio
    pasuspender -- /usr/bin/aplay -l

    In fact, it’s the method I use in my alsa-capabilities script.
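
    On systems without pulseaudio there is no pasuspender, so a script that should run everywhere needs a fallback. A minimal sketch of such a wrapper; the function name is invented for this example and is not how alsa-capabilities names things.

```shell
#!/usr/bin/env bash
# Run a command through pasuspender when it is available, and directly
# when it is not (for instance on a system without pulseaudio).
run_without_pulse() {
    if command -v pasuspender >/dev/null 2>&1; then
        pasuspender -- "$@"
    else
        "$@"
    fi
}

# for example:
#   run_without_pulse aplay -l
```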

    Some explanation

    Although pulseaudio is distributed as a single package (simply called pulseaudio), it essentially consists of a server component (called “pulseaudio daemon”), a client interface which is used by modern desktops and desktop applications like webbrowsers (called “pulseaudio clients”), utilities (like pasuspender and pacmd), and lots of “modules” (84 on my system).

    Schematic of pulseaudio components
    The relationship between various pulseaudio components and alsa

    Whenever one of the pulseaudio clients needs to play (or record) audio, the request is targeted at the pulseaudio client interface, which communicates with the user’s pulseaudio daemon instance. If the daemon is not running, the client interface will automatically start it and then transfer the audio request to it. The daemon then locks the (automatically or manually) configured alsa interface, to prevent it from being used by other software. Also see David Henningsson’s blog post “Pulseaudio buffers and protocol” (Nov 2014).

    This process, called “(auto) spawning”, is managed by the XDG-session autostart script /usr/bin/start-pulseaudio-x11, which is executed by the file /etc/xdg/autostart/pulseaudio.desktop. Like all other XDG desktop starters in that directory, it is triggered when a user logs in (in the desktop environment). Both files, the executable and the desktop file, are provided and installed by the pulseaudio package. See the diagram at the right for an overview of the relationship between the various pulseaudio components and alsa. A detailed diagram is provided by Manuel Amador.

    Contrary to popular perception, the pulseaudio daemon in itself does not lock the alsa interface. It just waits for a client application to connect to the daemon, which will then lock the alsa device. This means that if one disables the autospawning, one can manually start and stop pulseaudio.

    LTSP-specific instructions

    I used to use LTSP for getting a remote booted mpd client. LTSP adds its own logic to the startup environment, and needs an additional setting. The LTSP startup scripts try very hard to redirect audio from the server (where audio applications normally run) to the client (in which the sound card is present and where the kernel including alsa is running) through a virtual pulseaudio interface.

    Simply adding the following line to /var/lib/tftpboot/ltsp/i386/lts.conf on the server disables this autoconfiguring mechanism of pulseaudio and leaves alsa intact:

    SOUND = False

    (thanks “alkisg” at #ltsp@freenode)