Abstract
This paper studies the fine-grain scalable compression problem with emphasis on 1-D signals such as audio signals. As in the successful 2-D still image compression techniques embedded zerotree wavelet coder (EZW) and set partitioning in hierarchical trees (SPIHT), the desired fine-granular scalability and high coding efficiency are obtained from a tree-based significance mapping technique. A significance tree serves to quickly locate and efficiently encode the important coefficients in the transform domain. The aim of this paper is to find suitable significance trees for compressing dynamically varying 1-D signals. The proposed solution is a novel dynamic significance tree (DST) where, unlike in existing solutions with a single type of tree, a significance tree is chosen dynamically out of a set of trees by taking into account the actual coefficient distribution. We show how a set of possible DSTs can be derived that is optimized for a given (training) dataset. The method outperforms the existing scheme for lossy audio compression based on a single-type tree (SPIHT) and the scalable audio coding schemes MPEG-4 BSAC and MPEG-4 SLS. For bitrates below 32 kbps, it yields improved perceived audio quality compared to the fixed-bitrate MPEG-2/4 AAC audio coding scheme while providing progressive transmission and finer scalability.
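
The idea of choosing a significance tree from a set according to the coefficient distribution can be illustrated with a minimal sketch. This is not the paper's DST construction or its training procedure: the two candidate trees, the names `significance_bits` and `choose_tree`, and the bit-count cost (one bit per coefficient test plus one bit per descendant-set test at a single threshold) are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Tree = Dict[int, List[int]]  # parent coefficient index -> child indices


def subtree_max(tree: Tree, coeffs: Sequence[float], node: int) -> float:
    """Largest magnitude among the descendants of `node` (node itself excluded)."""
    best = 0.0
    for child in tree.get(node, []):
        best = max(best, abs(coeffs[child]), subtree_max(tree, coeffs, child))
    return best


def significance_bits(tree: Tree, roots: List[int],
                      coeffs: Sequence[float], threshold: float) -> int:
    """Simplified cost: bits spent on significance tests in one pass."""
    bits = 0
    stack = list(roots)
    while stack:
        node = stack.pop()
        bits += 1  # one bit: is this coefficient significant?
        children = tree.get(node, [])
        if children:
            bits += 1  # one bit: is any descendant significant?
            if subtree_max(tree, coeffs, node) >= threshold:
                stack.extend(children)  # descend only into significant sets
    return bits


def choose_tree(candidates: Dict[str, Tuple[Tree, List[int]]],
                coeffs: Sequence[float], threshold: float) -> str:
    """Pick the candidate tree with the lowest simplified cost."""
    return min(candidates,
               key=lambda name: significance_bits(*candidates[name], coeffs, threshold))


# Two hypothetical candidate trees over 8 transform coefficients.
CANDIDATES = {
    "dyadic": ({0: [1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}, [0]),
    "chain": ({i: [i + 1] for i in range(7)}, [0]),
}

# Assumed example frames: one with an isolated high-index peak,
# one with energy concentrated in the lowest indices.
PEAKY = [9, 0.1, 0.1, 0.2, 0.1, 0.1, 5, 0.1]
LOWPASS = [9, 7, 6, 0.1, 0.1, 0.1, 0.0, 0.1]

if __name__ == "__main__":
    for label, frame in (("peaky", PEAKY), ("lowpass", LOWPASS)):
        print(label, choose_tree(CANDIDATES, frame, threshold=4.0))
```

In this toy setting the cheaper tree differs between the two frames, which is the motivation for selecting the tree per signal segment rather than fixing a single tree type in advance.
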
| Original language | English |
|---|---|
| Article number | 5406138 |
| Journal | IEEE Transactions on Audio, Speech and Language Processing |
| Volume | 19 |
| Issue number | 1 |
| Pages (from-to) | 14-23 |
| Number of pages | 10 |
| ISSN | 1558-7916 |
| DOIs | |
| Publication status | Published - 01.01.2011 |
Funding
Manuscript received January 12, 2009; revised September 24, 2009; accepted January 04, 2010. Date of publication February 05, 2010; date of current version October 01, 2010. This work was supported in part by the SFB/TRR 31 and in part by the International Graduate School for Neurosensory Science and Systems, Carl von Ossietzky University, Oldenburg. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Patrick A. Naylor.

From 2001 to 2005, he was a Research Assistant at the Max-Planck-Institute for Meteorology, Hamburg, Germany. From 2005 to 2009, he was a Research Assistant at the Department of Physics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany, and member of the project SFB-TRR 31 “Das aktive Gehör,” funded by the German Research Foundation. His research interests are digital signal and audio processing, with special focus on optimal embedded audio coding.