Aug 22, 2000 - which the oracle and the computer exchange several rounds of messages, each round consisting of O(log ... Unlike classical decision tre...

0 downloads 27 Views 102KB Size

Entropy lower bounds of quantum decision tree complexity ∗ Yaoyun Shi† August 22, 2000

Abstract We prove a general lower bound of quantum decision tree complexity in terms of some entropy notion. We regard decision tree computation as a communication process in which the oracle and the computer exchange several rounds of messages, each round consisting of O(log n) bits. Let E(f ) be the Shannon entropy of the random variable f (X), where X is taken uniformly random in f ’s domain. Our main result is that it takes Ω(E(f )) queries to compute any total function f . It is interesting to contrast this bound with the Ω(E(f )/ log n) bound, which is tight for partial functions. Our approach is the polynomial method.

keywords: Quantum computation; Decision tree; Lower bounds; Computational complexity; Entropy

1

Introduction

The decision tree model is probably the simplest model in the study of computational complexity. In this model, the input x is known only to an oracle, and the only way that the computer can access the input is to ask the oracle questions of the type ‘xi =?’. The computational cost is simply the number of such queries, and the complexity of a problem is the minimal worst case cost. For example, to find out whether or not there is a 1 in ∗

This work was supported in part by the National Science Foundation under grant CCR-9820855. Computer Science Department, Princeton University, Princeton, New Jersey 08544. E-mail: [email protected] †

1

x1 , x2 , · · · , xn , any deterministic decision tree algorithm needs to ask for all the xi ’s in the worst case. Therefore, its (deterministic) decision tree complexity is n. Unlike classical decision trees, a quantum decision tree algorithm can make queries in a quantum superposition, and therefore may be intrinsically faster than any classical algorithm. For example, Grover’s quantum √ algorithm[4] for finding the location of the only 1 in an n bit string makes only O( n) queries, while any classical algorithm needs Ω(n) queries. In recent years, the quantum decision tree model has been extensively studied by many authors from both upper bounds and lower bounds perspectives. Here we consider the latter aspect only. [3] is a recent survey of both classical and quantum decision tree complexity. Throughout this paper, f denotes a function: f : {0, 1}n ⊇ A → B = f (A) ⊆ {0, 1}m, for some integers n, m > 0. Let Q(f ) be the quantum decision tree complexity of f with error probability bounded by 1/3. Our goal is to derive a general lower bound for Q(f ) in terms of E(f ) defined as follows: Definition 1.1 For any f , define the entropy of f , E(f ), to be the Shannon entropy of f (X), where X is taken uniformly random from A. More explicitly, E(f ) =

X

py log2

y∈B

1 , py

where py = Prx∈R A [f (x) = y]. We first note the following general lower bound: Proposition 1.2 For any f , Q(f ) = Ω(E(f )/ log n). This fact can be proved by a standard information theoretical argument, which we sketch here. The computation can be viewed as a process of communication: to make a query, the algorithm sends the oracle ⌈log2 n⌉ + 1 bits, which are then returned by the oracle. The first ⌈log2 n⌉ bits specify the location of the input bit being queried and the remaining the oracle to write down the answer. Now we run the algorithm Pone bit allows → − on √1 |xi | 0 i , X Y where X and Y denote the qubits that hold the input and the x∈A |A|

(t)

intermediate results of the computer respectively. Now we consider SB , the von Neumann entropy of qubits in Y after the tth query. If the algorithm computes P f in T queries, at 1 √ the end of the computation, we expect to have a vector close to x∈A |xiX |f (x)iY . |A|

2

(0)

(T )

(t+1)

(t)

Clearly SB = 0, SB ≈ E(f ), and |SB − SB | = O(log n) for any t, 0 ≤ t ≤ T − 1. The latter two assertions can be proved by standard applications of Holevo’s theorem[5]. Therefore T = Ω(E(f )/ log n). We will provide an example later to show that indeed this bound is tight. This means one quantum query can get log n bits of information, while any classical query can only get no more than 1 bit of information. Surprisingly, this power of getting ω(1) bits of information in a query is not useful in computing total functions, i.e., functions that are defined on every string in {0, 1}n , in the sense that each quantum query can only get O(1) bits of information on average, as stated in our main theorem: Theorem 1.3 (Main Theorem) For any total function f , Q(f ) = Ω(E(f )). Now we sketch the proof idea. We take the polynomial approach initiated in [2]. Any correct algorithm that computes f will produce a set of polynomials {f˜y : {0, 1}n → R : y ∈ B}. Each f˜y is an approximation to the characteristic polynomial for f −1 (y). If f is a total function, on any Boolean inputs, f˜y is forced to be close to either 0 or 1, and this ‘takeit-or-leave-it’ nature makes it harder to approximate; in contrast, when f is not a total function, on inputs where f is not defined, f˜y has more freedom to take values that make the approximation easier. There are several previous papers that prove general lower bounds on quantum decision tree complexity in terms of different complexity notions: [2] by Boolean (block) sensitivity and by degree of approximating polynomials, [1] by a combinatorial property, and [6] by average Boolean sensitivity. In the next two sections we shall provide a rigorous definition of the quantum decision tree model and then prove the main theorem.

2

Quantum decision tree model

In the quantum decision tree model, the computer has three sets of qubits: P , Q, and R. P has n bits, which hold the input; Q has ⌈log2 n⌉ + 1 bits, which contain a pointer to the input bits (i.e., an integer between 1 and n), as well as one more bit; R has an unlimited number of bits which serve as the algorithm’s working space. A quantum decision tree computation with input x is the application (from the right to the left) of a sequence of unitary operators A := UT OUT −1 O · · · U1 OU0 3

on the initial state

→ − |xiP | 0 iQR ,

where O is the oracle gate: O|xiP |i, biQ |ciR = |xiP |i, b ⊕ xi iQ |ciR , ˜t a unitary and each Ut = I ⊗ U˜t , 0 ≤ t ≤ T , where I is the identity operator on l2 (P ) and U operator on l2 (Q ∪ R). We say that the algorithm computes f (with error bounded by 1/3) if there exists a measurement M on l2 (Q ∪ R), such that for any x ∈ A, with probability no less than 2/3 f (x) will be observed by applying M at the final state of the computation. The quantum decision tree complexity Q(f ) is defined to be the minimal T such that there is a quantum decision tree algorithm that computes f in T queries. The following example demonstrates that the lower bound in Proposition 1.2 is tight. Example 1 Assume n is a power of 2. For any z ∈ {0, 1}log2 n , e(z) ∈ {0, 1}n is defined as follows: e(z)i = i · z (parity of bitwise product). Consider f (x) := z if x = e(z), otherwise f is undefined. Then E(f ) = log2 n, while Q(f ) = 1. Let H be the Hadamard transformation on the log2 n index bits in Q, and M acts on the last bit in Q such that M|0i = √12 (|0i − |1i). It is easy to verify that for any x = e(z), → − M −1 HOHM|xiP | 0 iQ = |xiP |z, 0iQ .

3

Proof of the main theorem

For 0 ≤ t ≤ T , let φt (x) ∈ l2 (Q ∪ R) be the state such that

→ − Ut OUt−1 · · · U0 |xiP | 0 iQR = |xiP ⊗ φt (x).

Let Ψ be any orthonormal basis for l2 (Q ∪ R). Our proof will finally make use of the following fact observed in [2]: Fact 1 φt (x) =

X

pψ (x)|ψiQR ,

ψ∈Ψ

for some set of multi-linear polynomials pψ (x), each of which is of degree no more than t.

4

Therefore, proving lower bounds in quantum complexity can be reduced to proving lower bounds on the degree of approximating polynomials. We shall first prove some lemmas on the latter. For any g : {0, 1}n → R, define the average sensitivity of g, s¯g = Ex,i (g(x) − g(x + ei ))2 and, pg = Ex [g(x)] . All randomness is uniform. When g is a Boolean function, s¯g is just the probability that a random edge in the Boolean cube connects two vertices of different function values, and pg is the probability for a random input to have function value 1. Now let g be a Boolean function, and g˜ : {0, 1}n → [0, 1] approximate g, i.e., |˜ g (x) − g(x)| ≤ 1/3 for all x ∈ {0, 1}n . The following theorem says that a larger s¯g or a smaller pg˜ will force g˜ to have high degree. Lemma 3.1 deg(˜ g ) ≥ n¯ sg˜ /(4pg˜) ≥ n¯ sg /(36pg˜). Proof. Let d = deg(˜ g ), then the Fourier representation of g˜ is X gˆ˜r (−1)x·r , g˜(x) = r∈{0,1}n ,|r|≤d

where gˆ˜r = Ex [˜ g (x)(−1)x·r ]. By simple calculation, s¯g˜ =

X

r,|r|≤d

2 4|s| ≤( gˆ˜r n

X

r,|r|≤d

2 4d gˆ˜r ) = Ex g˜2 (x) 4d/n ≤ 4dpg˜/n. n

Since g˜ approximates g, s¯g˜ ≥ 91 s¯g . The following lemma about Boolean functions will be needed immediately: Lemma 3.2 Let k be the cardinality of X ⊆ {0, 1}n , tX the number of edges in the Boolean cube that connect two vertices in X. Then tX ≤ k log2 k/2. Proof. By induction. It is true for k = 1, 2. Assume the statement is true for all natural numbers smaller than k, and let’s examine the case k ≥ 3. Pick a coordinate i such that both the subcubes of xi = 1 and xi = 0 have nonempty subsets A and B of 5

X. Then tX ≤ tA + tB + min{|A|, |B|}. We can assume without loss of generality that 1 ≤ |A| = a ≤ k/2. Then by simple calculation, 1 1 tX ≤ a log2 a + (k − a) log2 (k − a) + a ≤ k log2 k/2. 2 2 1 . Let H(·) be the entropy function, i.e., for η ∈ [0, 1], H(η) := η log2 η1 + (1 − η) log2 1−η The following lemma says that if the number of true assignments is close to the number of false assignments, then the Boolean function should have high average sensitivity:

Lemma 3.3 For any Boolean function g, s¯g ≥ H(pg )/n. Proof. Let k = 2n pg be the number of true assignments. By Lemma 3.2, in the Boolean cube, the number of edges that connect two true assignments is less than k log2 k/2, and the number of edges that connect two false assignments is less than (2n −k) log2 (2n − k)/2. Therefore, s¯g = Pr [g(x) 6= g(x + ei )] ≥ (n2n − k log2 k − (n − k) log2 (n − k)) /n2n = H(pg )/n. x,i

We are now ready to prove our main theorem: Proof. [Main Theorem] For each y ∈ B, let fy be the characteristic function of f −1 (y), i.e., 1 if f (x) = y, fy (x) = 0 otherwise. Let f˜y (x) be the probability that y is observed as the output when the input is x. Then by ˜ Lemma 1, f˜y is a nonnegative polynomial P ˜ of degree no more than 2Q(f ), and fy approximates fy . Furthermore, for any x, y fy (x) ≤ 1. For simplicity of notation, we shall use py in place for pfy , p˜y for pf˜y , and s¯y for s¯fy . Note that X 1 E(f ) = py log2 , py y and, X y

p˜y ≤ 1.

Let d = maxy deg(f˜y ). We want to get a lower bound for d. 6

By Lemma 3.1

n s¯y ≤ dy p˜y ≤ d˜ py . 36 Summing over all i, and by Lemma 3.3, we get n X 1 X 1 d≥ s¯y ≥ H(py ) ≥ E(f ). 36 y 36 y 36

4

Acknowledgment

The author would like to thank Andy Yao for discussion, Jason Perry and Lane Hemaspaandra for going through the paper and giving useful comments.

References [1] A. Ambainis. Quantum lower bounds by quantum arguments. In Proceedings of the Thirty-second Annual ACM Symposium on the Theory of Computing, pages 636–643, Portland, Oregon, May 2000. [2] R. Beals, H. Buhrman, R. Cleve, M. Mosca, and R. de Wolf. Quantum lower bounds by polynomials. In 39th Annual Symposium on Foundations of Computer Science, pages 352–361, Los Alamitos, CA, November 1998. IEEE. [3] H. Buhrman and R. de Wolf. Complexity measures and decision tree complexity: A survey. Unpublished manuscript, 1999. [4] Lov K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, pages 212–219, Philadelphia, Pennsylvania, May 1996. [5] A. S. Holevo. Some estimates for the amount of information transmittable by a quantum communications channel. Problemy Peredaˇci Informacii, 9(3):3–11, 1973. English translation: Problems of Information Transmission, 9(3):177–183, 1973. [6] Y. Shi. Lower bounds of quantum black-box complexity and degree of approximating polynomials by influence of boolean variables. Information Processing Letters, 75(12):79–83, 31 July 2000. 7