Using partial string matching for coding of hierarchical grammars

dc.creatorSethuraman, Radhakrishnan
dc.date.accessioned2016-11-14T23:11:16Z
dc.date.available2011-02-19T01:02:02Z
dc.date.available2016-11-14T23:11:16Z
dc.date.issued2002-12
dc.degree.departmentComputer Scienceen_US
dc.description.abstractText data can be compressed by extracting repeated strings in the input. Therefore, the input can be expressed using a context-free grammar, where each repeated string is denoted by a rule of the grammar. Sequitur [3] is one such method that derives a context-free grammar, which generates the given input precisely as its language. Sequitur is a two-pass method; in the first pass, it determines a symbolic representation of the grammar, and in the second pass, it encodes this representation using Arithmetic coder to realize a very efficient compressed representation of the input. This thesis presents a complete and robust implementation of Sequitur. First, it presents the design and then gives a detailed description of the algorithm. Next, it presents a context based coding of the grammar; the purpose of this coding method is to code the next symbol in the context of the input character, which is the last character generated by the preceding rule. This coding method is known as partial matching method, and hence this implementation of sequitur is termed here as sequitur-PM. The thesis presents a number of compression results; the first set of results is given to show that our implementation is able to attain as good compression ratio as presented in [1] for all benchmark test input. The next set of results is given to show further compression obtained by using PM method. These results where on par with the published results [3].
dc.format.mimetypeapplication/pdf
dc.identifier.urihttp://hdl.handle.net/2346/22511en_US
dc.language.isoeng
dc.publisherTexas Tech Universityen_US
dc.rights.availabilityUnrestricted.
dc.subjectData compressionen_US
dc.subjectComputer algorithmsen_US
dc.titleUsing partial string matching for coding of hierarchical grammars
dc.typeThesis

Files