Wojciech Szpankowski - Publications - Bridge of Knowledge

Search

Filters

total: 17

  • Category
  • Year
  • Options

clear Chosen catalog filters disabled

Catalog Publications

Year 2016
  • Asymmetric Renyi Problem and > PATRICIA Tries
    Publication

    - Year 2016

    In 1960 R´enyi asked for the number of random queries necessary to recover a hidden bijective labeling of n distinct objects. In each query one selects a random subset of labels and asks, what is the set of objects that have theselabels? Weconsider here anasymmetric version of the problem in which in every query an object is chosenwith probability p > 1/2 and we ignore “inconclusive” queries. We study the number of queries needed...

    Full text to download in external service

  • Average Size of a Suffix Tree for Markov Sources
    Publication

    - Year 2016

    We study a suffix tree built from a sequence generated by a Markovian source. Such sources are more realistic probabilistic models for text generation, data compression, molecular applications, and so forth. We prove that the average size of such a suffix tree is asymptotically equivalent to the average size of a trie built over n independentsequences from the same Markovian source. This equivalenceis only known for memoryless...

    Full text to download in external service

  • Lossless Compression of Binary Trees with Correlated Vertex Names
    Publication

    - Year 2016

    Compression schemes for advanced data structures have become the challenge of today. Information theory has traditionally dealt with conventional data such as text, image, or video. In contrast, most data available today is multitype and context-dependent. To meet this challenge, we have recently initiated a systematic study of advanced data structures such as unlabeled graphs [1]. In this paper, we continue this program by considering...

    Full text to download in external service

  • The Boltzmann sequence-structure channel
    Publication

    - Year 2016

    We rigorously study a channel that maps binary sequences to self-avoiding walks in the two-dimensional grid, inspired by a model of protein statistics. This channel, which we also call the Boltzmann sequence-structure channel, is characterized by a Boltzmann/Gibbs distribution with a free parameter corresponding to temperature. In our previous work, we verified experimentally that the channel capacity has a phase transition for...

    Full text to download in external service

  • Types of Markov Fields and Tilings
    Publication

    - IEEE TRANSACTIONS ON INFORMATION THEORY - Year 2016

    The method of types is one of the most popular techniques in information theory and combinatorics. However, thus far the method has been mostly applied to one-dimensional Markov processes, and it has not been thoroughly studied for general Markov fields. Markov fields over a finite alphabet of size m ≥ 2 can be viewed as models for multi-dimensional systems with local interactions. The locality of these interactions is represented...

    Full text available to download

Year 2015
  • On the Origin of Protein Superfamilies and Superfolds
    Publication

    - Scientific Reports - Year 2015

    Distributions of protein families and folds in genomes are highly skewed, having a small number of prevalent superfamiles/superfolds and a large number of families/folds of a small size. Why are the distributions of protein families and folds skewed? Why are there only a limited number of protein families? Here, we employ an information theoretic approach to investigate the protein sequence-structure relationship that leads to...

    Full text available to download

  • Phase Transition in a Sequence-Structure Channel
    Publication

    - Year 2015

    We study an interesting channel which maps binary sequences to self-avoiding walks in the two-dimensional grid, inspired by a model of protein folding from statistical physics. The channel is characterized by a Boltzmann/Gibbs distribution with a free parameter corresponding to temperature. We estimate the conditional entropy between the input sequence and the output fold, giving an upper bound which exhibits an unusual phase transition...

    Full text to download in external service

Year 2014
  • A Note on a Problem Posed by D. E. Knuth on a Satisfiability Recurrence
    Publication

    - COMBINATORICS PROBABILITY & COMPUTING - Year 2014

    We resolve a conjecture proposed by D.E. Knuth concerning a recurrence arising in the satisfiability problem. Knuth's recurrence resembles recurrences arising in the analysis of tries, in particular PATRICIA tries, and asymmetric leader election. We solve Knuth's recurrence exactly and asymptotically, using analytic techniques such as the Mellin transform and analytic depoissonization.

    Full text to download in external service

  • On Symmetry of Uniform and Preferential Attachment Graphs
    Publication

    - ELECTRONIC JOURNAL OF COMBINATORICS - Year 2014

    Motivated by the problem of graph structure compression under realistic source models, we study the symmetry behavior of preferential and uniform attachment graphs. These are two dynamic models of network growth in which new nodes attach to a constant number m of existing ones according to some attachment scheme. We prove symmetry results for m=1 and 2 , and we conjecture that for m≥3 , both models yield asymmetry with high...

    Full text available to download

  • On the Limiting distribution of Lempel Ziv'78 Redundancy for Memoryles Sources
    Publication

    - Year 2014

    We show that the Lempel Ziv'78 redundancy rate tends to a Gaussian distribution for memoryless sources. We accomplish it by extending findings from our 1995 paper [3]. We present a new simplified proof of the Central Limit Theorem for the number of phrases in the LZ'78 algorithm. As in our 1995 paper, here we first analyze the asymptotic behavior of the total path length in a digital search tree (a DST) built from independent sequences....

  • On the Limiting Distribution of Lempel-Ziv’78 Redundancy for Memoryless Sources
    Publication

    We study the Lempel-Ziv'78 algorithm and show that its (normalized) redundancy rate tends to a Gaussian distribution for memoryless sources. We accomplish it by extending findings from our 1995 paper, in particular, by presenting a new simplified proof of the central limit theorem (CLT) for the number of phrases in the LZ'78 algorithm. We first analyze the asymptotic behavior of the total path length in the associated digital search...

    Full text to download in external service

Year 2013
  • Average Redundancy of the Shannon Code for Markov Sources
    Publication

    It is known that for memoryless sources, the average and maximal redundancy of fixed–to–variable length codes, such as the Shannon and Huffman codes, exhibit two modes of behavior for long blocks. It either converges to a limit or it has an oscillatory pattern, depending on the irrationality or rationality, respectively, of certain parameters that depend on the source. In this paper, we extend these findings, concerning the Shannon...

    Full text to download in external service

  • Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model
    Publication

    - Year 2013

    Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP addresses lookup, from data compression (i.e., Lempel- Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election algorithms to distributed hashing...

Year 2012
  • On a Recurrence Arising in Graph Compression
    Publication

    - ELECTRONIC JOURNAL OF COMBINATORICS - Year 2012

    In a recently proposed graphical compression algorithm by Choi and Szpankowski (2012), the following tree arose in the course of the analysis. The root contains n balls that are consequently distributed between two subtrees according to a simple rule: In each step, all balls independently move down to the left subtree (say with probability p) or the right subtree (with probability 1􀀀p). A new node is created as long as...

    Full text available to download

Year 2011
  • Limiting distribution of Lempel Ziv'78 redundancy
    Publication

    - Year 2011

    We show that the Lempel Ziv'78 redundancy rate tends to a Gaussian distribution for memoryless sources. We accomplish it by extending findings from our 1995 paper [3]. We present a new simplified proof of the Central Limit Theorem for the number of phrases in the LZ'78 algorithm. As in our 1995 paper, here we first analyze the asymptotic behavior of the total path length in a digital search tree (a DST) built from independent sequences....

    Full text to download in external service

Year 2008
  • What is information?
    Publication

    Przedstawiono zarys metodologii konstrukcji miar informacyjnych opartej na sterowaniu zdarzeniowym. Podejście takie umożliwia jawne uwzględnianie wpływu kontekstu, kryterium użyteczności i protokołu postępowania na ilość informacji. Podano szereg przykładów obejmujących systemy rozproszone, transmisję danych, środowiska niekooperacyjne i in. wraz z oceną wynikowej przepustowości. Wskazano dalsze kierunki badań, m. in. w kierunku...

Year 2007

seen 920 times