Projects per year
Abstract
Given a string, the task of grammarbased compression is to find a small contextfree grammar that generates exactly that string. We investigate the relationship between grammarbased compression of strings over unbounded and bounded alphabets. Specifically, we show how to transform a grammar for a string over an unbounded alphabet into a grammar for a block coding of that string over a fixed bounded alphabet and vice versa. From these constructions, we obtain asymptotically tight relationships between the minimum grammar sizes for strings and their block codings. Furthermore, we exploit an improved bound of our construction for overlapfree block codings to show that a polynomial time algorithm for approximating the minimum grammar for binary strings within a factor of c yields a polynomial time algorithm for approximating the minimum grammar for strings over arbitrary alphabets within a factor of 24c + ∈ (for arbitrary ∈ > 0). Currently, the latter problem is known to be NPhard to approximate within a factor of 8569/8568. Since there is some hope to prove a nonconstant lower bound, our results may provide a first step towards solving the long standing open question whether minimum grammarbased compression of binary strings is NPcomplete.
Original language  English 

Title of host publication  Data Compression Conference (DCC'06) 
Number of pages  10 
Publisher  IEEE 
Publication date  01.12.2006 
Pages  173182 
Article number  1607252 
ISBN (Print)  0769525458 
DOIs  
Publication status  Published  01.12.2006 
Event  Data Compression Conference 2006  Snowbird, United States Duration: 28.03.2006 → 30.03.2006 Conference number: 77103 
Fingerprint
Dive into the research topics of 'On the Complexity of Optimal GrammarBased Compression'. Together they form a unique fingerprint.Projects
 1 Finished

Robust learning methods and data compression
Reischuk, R.
01.01.04 → 31.12.08
Project: DFG Projects › DFG Individual Projects