Pseudorandom Hashing for Space-bounded Computation with Applications in Streaming

面向空间受限计算的伪随机哈希及其在流式传输中的应用

阅读:2

Abstract

We revisit Nisan's classical pseudorandom generator (PRG) for space-bounded computation (STOC 1990) and its applications in streaming algorithms. We describe a new generator, HashPRG, that can be thought of as a symmetric version of Nisan's generator over larger alphabets. Our generator allows a trade-off between seed length and the time needed to compute a given block of the generator's output. HashPRG can be used to obtain derandomizations with much better update time and without sacrificing space for a large number of data stream algorithms, for example: Andoni's Fp estimation algorithm for constant p > 2 (ICASSP, 2017) assumes a random oracle, but achieves optimal space and constant update time. Using HashPRG's time-space trade-off we eliminate the random oracle assumption while preserving the other properties. Previously no time-optimal derandomization was known. Using similar techniques, we give an algorithm for a relaxed version of ℓp sampling in a turnstile stream. Both of our algorithms use O˜(d1-2/p) bits of space and have O(1) update time.For 0 < p < 2 , the 1 ± ε approximate Fp estimation algorithm of Kane et al., (STOC, 2011) uses an optimal O(ε-2 log d) bits of space but has an update time of O(log2(1/ε)log log(1/ε)) . Using HashPRG, we show that if [Formula: see text] for an arbitrarily small constant c > 0 , then we can obtain a 1 ± ε approximate Fp estimation algorithm that uses the optimal O(ε-2 log d) bits of space and has an update time of O(log d) in the Word RAM model, which is more than a quadratic improvement in the update time. We obtain similar improvements for entropy estimation.CountSketch, with the fine-grained error analysis of Minton and Price (SODA, 2014). For derandomization, they suggested a direct application of Nisan's generator, yielding a logarithmic multiplicative space overhead. With HashPRG we obtain an efficient derandomization yielding the same asymptotic space as when assuming a random oracle. Our ability to obtain a time-efficient derandomization makes crucial use of HashPRG's symmetry. We also give the first derandomization of a recent private version of CountSketch. For a d -dimensional vector x being updated in a turnstile stream, we show that (x)∞ can be estimated up to an additive error of ε‖x‖2 using O(ε-2 log(1/ε)log d) bits of space. Additionally, the update time of this algorithm is O(log 1/ε) in the Word RAM model. We show that the space complexity of this algorithm is optimal up to constant factors. However, for vectors x with (x)∞ = Θ((x)2) , we show that the lower bound can be broken by giving an algorithm that uses O(ε-2 log d) bits of space which approximates ‖x‖∞ up to an additive error of ε‖x‖2 . We use our aforementioned derandomization of the CountSketch data structure to obtain this algorithm, and using the time-space trade off of HashPRG, we show that the update time of this algorithm is also O(log 1/ε) in the Word RAM model.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。