September 18, 2008
This article was contributed by Don Marti
Audio is a fitting topic for the first day of the Linux
Plumbers Conference. Users want sound to
Just Work, and there's lots of working code in
individual projects. But so far, it seems like
nobody has everything quite plumbed together in an
annoyance-free way.
Lennart
Poettering, a lead
developer of PulseAudio and Red Hat employee,
moderated the miniconference and started with a
summary of the state of Linux audio: "it's a mess."
The audio miniconference came up with two steps
toward cleaning up the mess, though. First, come up
with a coherent story for application developers on
what sound API to use, and how. Second, clean up the
often-confusing array of user-visible audio level controls.
PulseAudio first appeared to regular users
in
Fedora, starting with version 8, and for
up-to-date users it is now, as Lennart puts it,
"the software that currently breaks your audio."
PulseAudio is a sound server that mixes audio from
multiple applications and passes it along to the
sound hardware. It offers advanced features such
as network transparency: an application running on a
remote system can play a sound, and PulseAudio makes it
come out of the speakers on the machine where
the user is working. Supporting PulseAudio shouldn't
be a big change for most application developers.
It handles applications written to ALSA, the
kernel's maintained audio API, through the
PulseAudio backend for alsa-lib. So the
PulseAudio transition has been relatively painless
for the distributions.
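As a rough illustration of what "mixing" means here, the sketch below sums signed 16-bit samples from several streams and clamps the result to the valid range. It is not PulseAudio's code: the function name mix_s16 and the sample values are made up for the example, and a real sound server also deals with resampling, per-stream volume, and timing.

```python
from itertools import zip_longest

# Range of a signed 16-bit PCM sample.
S16_MIN, S16_MAX = -32768, 32767

def mix_s16(*streams):
    """Mix any number of signed 16-bit sample sequences into one list."""
    mixed = []
    # zip_longest pads shorter streams with silence (0).
    for frame in zip_longest(*streams, fillvalue=0):
        total = sum(frame)
        # Clamp so that loud passages saturate instead of wrapping around.
        mixed.append(max(S16_MIN, min(S16_MAX, total)))
    return mixed

music = [1000, 2000, 30000, -30000]
alert = [500, -500, 10000, -10000, 42]
print(mix_s16(music, alert))  # [1500, 1500, 32767, -32768, 42]
```

Clamping is the simplest way to handle overflow; it trades wraparound noise for clipping distortion.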
An earlier sound server project, the Enlightened
Sound Daemon (ESD), is falling out of favor, and the Media
Application Server (MAS) has never really caught
on. However, one of the competing sound servers looks likely
to remain. On the pro audio side, the low-latency
sound server JACK
is the recommended option. JACK, the "Jack Audio
Connection Kit," as Dave Phillips writes, "holds
the keys to the kingdom" for connecting
studio applications such as the Ardour
digital audio workstation and the Rosegarden
MIDI sequencer. "If you want all of the features,
no one audio system supports all of them," Lennart
said.
Apple and Microsoft each have a single sound server
that does both desktop and pro audio, but nobody at
the session seemed to have much interest in that
direction for Linux. PulseAudio is optimized for
general desktop use and power savings, and supports
scheduling features that should minimize wakeups but
still allow for reasonably low-latency playback of
streaming audio. It's also
network-transparent and supports features such as
placing desktop sound events based on mouse position.
Network audio and desktop effects don't tempt pro
audio users. JACK's uncompromising approach toward
latency means it's likely to hog too much power to
be acceptable to battery-life-watching desktop users,
but fine for a studio with a rack full of gear. So having two
sound servers, one for pro and one for the masses, seems
to be fine with both sets of users.
Abusing ALSA
PulseAudio, however, can't give applications direct
access to the hardware, and currently only about 70%
of ALSA applications use the API in a PulseAudio-safe
way, Lennart said. Some high-profile applications
are among those doing audio wrong. "Flash and
Skype are really really broken applications,
especially Flash," he said. Adobe split out the
parts of its code that talk to the audio subsystem,
and certain other plumbing, into an open-source
library, libflashsupport. But Flash remains broken.
The proprietary Flash library talks to libflashsupport
from multiple threads, and one thread calls a
destructor while another continues to send data.
"It works until you close the browser window and then
you get a race," Lennart said.
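The failure described here is a teardown race. A minimal sketch of the pattern, with one common way to guard against it, follows. It is not Flash or libflashsupport code: the AudioSink class and its methods are hypothetical names invented for the illustration.

```python
import threading

class AudioSink:
    """Hypothetical stand-in for an audio output object shared by threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._closed = False
        self.frames_written = 0
        self.writes_rejected = 0

    def write(self, frames):
        # Check the closed flag and do the write under the same lock,
        # so close() cannot run between the check and the write.
        with self._lock:
            if self._closed:
                self.writes_rejected += 1
                return False
            self.frames_written += len(frames)
            return True

    def close(self):
        # Teardown takes the same lock, so it waits for an in-flight write.
        with self._lock:
            self._closed = True

def producer(sink, stop):
    # Keeps sending data until told to stop, like a playback thread.
    while not stop.is_set():
        sink.write([0] * 64)

sink = AudioSink()
stop = threading.Event()
worker = threading.Thread(target=producer, args=(sink, stop))
worker.start()
sink.close()   # e.g. the browser window goes away
stop.set()
worker.join()

written_at_close = sink.frames_written
late_ok = sink.write([0] * 64)  # refused rather than reaching a torn-down sink
```

Without the shared lock and the closed flag, the producer could still be in the middle of a write when teardown runs, which is the kind of race Lennart describes.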
Developers who want to play audio have a
sometimes-confusing choice of tools, including PortAudio and
GStreamer.
(PortAudio is cross-platform, which is likely why
the popular cross-platform audio editing application
Audacity uses it.) GStreamer is relatively
feature-intense and heavyweight, also handling
video and transcoding. (Write a player with
GStreamer and you get the ability to play your
collection of C64 SID files for free.)
"If someone comes and says, 'I want to
write an audio application. Which API should
I use?' I don't have a good answer," Lennart
said. The current best answer seems to be to
write to the PulseAudio-safe subset of ALSA.
Jeff Licquia
of the Linux Standard Base (LSB), in the audience,
mentioned that ALSA is on track for inclusion
in LSB 4.0, and is a trial use module for 3.2.
LSB aims to define a compatibility standard for
Linux applications, and aims to do the kind of
application developer education that Linux audio
developers seem to need. Applications seeking LSB
certification must run all of the LSB tests, but can
fail anything tagged as trial use. "We're only keeping
the stuff that we hope will be around for the long
term," he said. If the LSB-safe subset of ALSA fits
into the PulseAudio-safe subset of ALSA, application
developers could write to ALSA and test with LSB.
"I would like to be able to tell people to use libsydney,"
Lennart said. Libsydney, still a work in
progress, is intended to be a networking-friendly
general-purpose audio API.
ALSA and the HD-Audio widget problem
In ALSA, the hardware/software interface is in
good shape, but the software-to-user interface needs some work. Takashi
Iwai, a core ALSA developer and Novell
employee, pointed out in a talk that the line
count for /sound code in the kernel is actually
shrinking, except for ASoC (system on a chip)
and HD-audio. "There will be no more sound cards,
especially PCI," he said. The one exception is the
Sound Blaster X-Fi for gamers, which is currently
not supported well in ALSA. Creative announced proprietary
drivers in 2006, but one ALSA developer recently
did get access to a data sheet under NDA.
The new audio standard, HD-Audio, is commonly
found on new systems, and it's well-supported at the
kernel level. However, it's based on "widgets" with
vendor-configurable I/O pins. A driver can't tell
how the HD-Audio part is connected, so some Linux
plumbing work is required to identify which of the
many exposed level controls is the right one to show
the user. An audience member pointed out the need
to tweak multiple level settings on his hardware
to get the right level without distortion.
Linux will need more information on how each
machine has its HD-Audio hardware hooked up in order
to reliably give the user a useful volume control.