Efficient Quantum Circuits for Diagonal Unitaries Without Ancillas Jonathan Welch,1 Daniel Greenbaum,2 Sarah Mostame,1 and Al´n Aspuru-Guzik1 a 1 Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138 2 MIT Lincoln Laboratory, 244 Wood Street, Lexington, MA 02420 (Dated: June 19, 2013) arXiv:1306.3991v1 [quant-ph] 17 Jun 2013 The accurate evaluation of diagonal unitary operators is often the most resource-intensive element of quantum algorithms such as real-space quantum simulation and Grover search. Efficient circuits have been demonstrated in some cases but generally require ancilla registers, which can dominate the qubit resources. In this paper, we point out a correspondence between Walsh functions and a basis for diagonal operators that gives a simple way to construct efficient circuits for diagonal unitaries without ancillas. This correspondence reduces the problem of constructing the minimalx depth circuit within a given error tolerance, for an arbitrary diagonal unitary eif (ˆ) in the |x basis, to that of finding the minimal-length Walsh-series approximation to the function f (x). We apply this approach to the quantum simulation of the classical Eckart barrier problem of quantum chemistry, demonstrating that high-fidelity quantum simulations can be achieved with few qubits and low depth. Quantum computation within the circuit model1 relies on the ability to construct efficient sequences of elementary quantum operations, or gates, that produce a faithful representation of the unitary operators appearing in quantum algorithms. We consider the situation where the unitary of interest is diagonal. Some important algorithms where this applies are quantum simulation of quantum dynamics [1–4], quantum optimization [5], and Grover search [6]. For example, optimization – finding the maximum of a function g(x) – can be reformulated as the problem of finding the ground state of the diagˆ onal Hamiltonian, H = − x g(x)|x x|, which requires ˆ −iHt x implementing e = eitg(ˆ) . To implement an n-qubit diagonal unitary exactly on a quantum computer generally requires 2n+1 − 3 one- and two-qubit gates [7]. However, one is usually interested in circuits that approximate the unitary to within some error tolerance, ǫ. In order to be of practical value, such a circuit must be efficient – the number of one- and twoqubit gates should scale no worse than O(poly(n, 1/ǫ)) [8]. Efficient circuits for diagonal unitaries have been demonstrated, but with the requirement of ancilla qubits. In the real-space quantum simulation algorithm [1, 2], for example, studies indicate that ancilla registers often dominate the qubit resources [3, 9]. Due to limitations in the coherence time and number of qubits in any future practical implementation of quantum computing, it is desirable to decrease these resources as much as possible. In this paper, we provide a constructive algorithm that significantly reduces the qubit resources by pointing out a correspondence between Walsh functions [10] and a basis for diagonal operators that enables efficient implementation of diagonal unitaries without ancillas. Fundamentally, this correspondence arises because Walsh functions are the representation functions [11] of the abelian group, Z⊗n , of n-bit strings under bit-wise addition modulo 2. 2 This is also the group formed by the basis for diagonal operators on n qubits. The elements of this group acting on the basis states of an n-qubit register result in multiplication by Walsh functions. As a result, the circuit x for an arbitrary diagonal unitary eif (ˆ) (in the |x basis) can be constructed based on the terms appearing in the Walsh-Fourier series for the function f (x). We show that the gate count is proportional to the number of terms in the series, and has the maximal value of 2n+1 − 3 eleˆ mentary gates only when f must be represented exactly2 on the n-qubit register. This representation corresponds to a Walsh transform with 2n terms. Below, we consider approximating f (x) with a partial Walsh-Fourier series containing 2k terms, with k ≤ n. Since the bound on the error in this approximation is inversely proportional to the number of terms [12], the resulting gate sequence is efficient. Although partial Walsh-Fourier series lead to efficient implementations, one can do better. In particular, we address the problem of finding the shortest possible gate sequence that approximates the diagonal x unitary eif (ˆ) with error ǫ. This problem reduces to finding the minimal-length Walsh series fs (x) satisfying |fs (x) − f (x)| ≤ ǫ. This is in general an integer programming problem [13], but its solution can be found to a good approximation by throwing away the coefficients of the Walsh-Fourier series for f that fall below a certain bound [12–14]. This can lead to a significant additional reduction in circuit depth.3 1 There are two major paradigms, analog and digital, currently being considered as models for quantum computation. In contrast to digital quantum computation, also known as the circuit model, the analog approach relies on mapping the quantum algorithm to the time-evolution of a physical system. We do not consider this approach here. 2 3 ignoring the sampling error if x is continuous. In this paper, we use the term circuit depth synonymously with the number of elementary gates. Generally the two terms differ, 2 As a simple yet practical demonstration of these ideas, we describe a 1D implementation of the real-space quantum simulation algorithm for a single particle tunneling through an Eckart barrier [15]. This problem is a benchmark in classical computational methods of quantum chemistry for simulating quantum dynamics. This example illustrates that high-fidelity quantum simulations without ancillas can be achieved with few qubits and low depth. j=0 1 0 −1 1 0 −1 j=1 1 0 −1 j=2 1 0 −1 j=3 0 0.5 j=4 1 0 0.5 j=5 1 0 0.5 j=6 1 0 0.5 j=7 1 1 0 −1 1 0 −1 1 0 −1 1 0 −1 0 0.5 1 0 0.5 1 0 0.5 1 0 0.5 1 I. WALSH FUNCTIONS AND OPERATORS FIG. 1. First 8 Walsh functions, in Paley order. Rademacher functions are in red. In this section we identify the mapping between Walsh functions and a basis for diagonal operators. We begin with some definitions. A. Walsh Functions Like trigonometric functions, Walsh functions can be used as a basis for orthonormal expansion. For discretely sampled functions this is accomplished by a discrete Walsh-Fourier transform. For arbitrary n, let us discretize the interval [0, 1) into N = 2n points, xk = k/N , k = 0, ..., N − 1. We define discrete Walsh functions as wjk = wj (xk ). In terms of the bits of j, k, and x, we have wjk = (−1) n i=1 ji ki = (−1) n i=1 ji xi , (2) The Paley-ordered Walsh functions are defined on the continuous interval 0 ≤ x < 1 as [12] wj (x) = (−1) n i=1 ji xi , (1) for integer j = 0, 1, 2, ..., ∞. They form a complete and 1 orthonormal set, 0 wj (x)wl (x)dx = δjl . This definition may be extended to the entire real line by periodic repetition. Here ji is the i-th bit in the binary expansion, j = n ji 2(i−1) , and xi is the i-th bit in the dyadic4 i=1 ∞ expansion, x = i=1 xi /2i .5 n is the index of the most significant non-zero bit of j. In standard binary notation, therefore, we have j = (jn jn−1 ...j1 ) and x = (x1 x2 ...xn ), where the most significant bit is on the left. The wj with indices that are powers of two, j = 2 , n = 1, ..., ∞ are square waves known as Rademacher functions. The Rademacher function of order n is denoted Rn (x) = (−1)xn . The first eight Walsh functions are plotted in Fig. 1. Rademacher functions are in red. n where ki is the i-th bit in the dyadic expansion, k = n (n−i) . The second equality shows that the funci=1 ki 2 tional form is the same whether x is continuous or discrete, the only difference being the number of bits in the expansion of x. This makes Walsh series useful for representing discretely sampled continuous functions. The orthonormality and completeness properties in N −1 1 the discrete case are N k=0 wjk wlk = δjl and N −1 1 j=0 wjk wjl = δkl , respectively. The discrete WalshN Fourier transform aj of a function fk = f (xk ) is aj = fk = j=0 1 N N −1 fk wjk , k=0 (3) N −1 aj wjk . (4) To complete the analogy with Fourier series, we recall that orthonormal functions arise as the irreducible representations of symmetry groups [11]. For trigonometric functions, the relevant group is that of translations. For Walsh functions up to order 2n , it is the group Z⊗n , 2 which is formed by a basis for diagonal operators on n qubits. These are the Walsh operators introduced below. B. Walsh Operators 4 5 circuit depth meaning the number of time steps. We do not consider the number of time steps here since it differs from the gate count by at most a factor of two. A dyadic expansion is a series of dyadic fractions, 2−i . Note that the index i runs from the least significant bit to the most significant bit in the binary expansion, and in the opposite direction in the dyadic expansion. For dyadic rationals x = a/2k , which have two dyadic expansions, we take the finite one. For example, the dyadic rational x = 1/2 has two expansions: a finite expansion, 1/2 = 1/2 + 0 + 0 + ... and an infinite one, 1/2 = (1/2)2 + (1/2)3 + .... The state of an n-qubit register in a quantum computer is typically expressed as a superposition, |ψ = N −1 n k=0 ck |k , of N = 2 states in the computational basis [6], defined as |k = |k1 , ..., kn . (5) 3 Here k = 0, 1, ..., N − 1 is represented as an n-bit dyadic expansion, as above, k = n ki 2n−i . The bits ki = 0 i=1 or 1 denote the state of the i-th qubit. A unitary opˆ ˆ erator U = eif that is diagonal in this basis is given in ˆ terms of its eigenvalues as f |k = fk |k . Functions f (x) of a continuous variable, x ∈ [0, L), may be represented in this way if they are discretely sampled. Here we will assume a constant sampling interval, and define the sampling (grid) points as xk = kL/N , so that fk ≡ f (xk ). To simplify the discussion, we use units such that L = 1 to restrict the variable xk to the interval [0, 1) where Walsh functions are defined. The results for general L are obtained by replacing w(x) by w(x/L). We will also use the notation |k , |xk , and |x interchangably, dropping the subscript k on x when there is no loss of clarity. ˆ Let Zi denote the Pauli Z operator acting on the i-th ˆ qubit, Zi |k1 , ..., kn = (−1)ki |k1 , ..., kn . We define the Walsh operator of order j on n qubits as n ponentials of Walsh operators, ˆ ˆ U = eif = N −1 ˆ eiaj wj . j=0 ˆ ˆ Each term in the product, Uj = eiaj wj , is of the form θ ˆ exp −i 2j i (Zi )ji , where θj = −2aj . Hence the cirˆ cuit for Uj is identical to that for wj , except the Z-gate ˆ is replaced by a Z-rotation, Rz (−2aj ), where Rz (θ) ≡ ˆ e−iZθ/2 . The circuit for U is given by successively applyˆ ing the circuits for Uj . Figure 3 shows two equivalent ways of implementing ˆ one such term, specifically U7 . As seen in this figure, the gate configuration is not unique. We adopt the convention in 3(b) where the CNOTs are always targeted on the highest order qubit possible. Then a precise rule for ˆ constructing the circuit for Uj can be given in terms of the binary expansion of j: A rotation gate, Rz (−2aj ), is placed on the qubit corresponding to the most significant non-zero bit (MSB) of j. Then CNOTs are placed on either side, targeted on the same qubit as the rotation gate, and controlled on the qubits corresponding to the 1’s other than the MSB in the binary expansion of j. This rule will be used in the next section to construct an ˆ optimal circuit for U . (7) wj = ˆ i=1 ˆ ˆ ˆ ˆ (Zi )ji = (Z1 )j1 ⊗ (Z2 )j2 ⊗ ... ⊗ (Zn )jn , (6) where j = 1, ..., 2n , and ji is the i − th bit in the binary n ˆ expansion, j = i=1 ji 2(i−1) . Powers of Zi are defined ˆi )1 ≡ Zi and (Zi )0 ≡ ˆ The set of all Walsh operaˆ ˆ as (Z 1. tors j = 1, ..., 2n forms a basis for diagonal operators on n qubits, given by all possible tensor products of singlequbit Pauli Z gates. Their eigenvalues in the computational basis |x , x ∈ [0, 1), are Walsh functions with index n ˆ j and independent variable x: wj |x = i=1 (Zi )ji |k = ˆ n ji ki (−1) |k1 , ..., kn = wjk |k = wj (x)|x .6 i=1 ˆ The locations of the Z operators in wj correspond to ˆ the positions of the 1’s in the bit-reversed binary string for j. For example, the Walsh operator with j = 6 on ˆ ˆ ˆ n = 3 qubits is w6 = 1 ⊗ Z ⊗ Z, since j = 6 in binary is ˆ (j3 j2 j1 ) = (110). The gate representation of w6 is shown in Fig. 2. The general Walsh operator requires O(n) gates for its implementation: a single Z gate and up to 2n controlled NOTs. |k1 |k2 |k3 • Z • • U7 = • R7 (a) • • = • • R7 (b) • • FIG. 3. For n = 3 qubits, equivalent circuits for implementing ˆ ˆ ˆ ˆ the operator U7 = exp ia7 Z ⊗ Z ⊗ Z in Eq. (7). We use the compact notation Rj ≡ Rz (−2aj ). We follow the convention in (b) where the CNOTs are always targeted on the highest order qubit possible. ˆ FIG. 2. w6 = ˆ ⊗ Z ⊗ Z ˆ 1 ˆ Using Eq. (4), any diagonal operator on n qubits may be expanded as a sum of N = 2n Walsh operators, N −1 ˆ f = j=0 aj wj . Walsh operators commute. Therefore ˆ any diagonal unitary may be written as a product of ex- 6 For the case of Rademacher functions, this relationship was pointed out by Sornborger [16], who observed that the eigenvalue of a single Pauli Z gate acting on the i-th qubit in Eq. (5) is a binary-valued function of x with period 1/2(i−1) . Equation (7) easily generalizes to more than one dimension. For a d-dimensional system represented by d registers of n qubits each, the single Walsh operators will be replaced by tensor products of up to d Walsh operators over the different registers. The exact number depends on the number of variables in the function f . For applications to quantum simulation, this does not significantly increase the gate complexity as interaction potentials are generally few-body. Since products of Walsh operators are also Walsh operators, the expansions have the same form as Eq. (7) with N = 2dn . The utility of Eq. (7) is that it relates the circuit depth ˆ of U to the number of coefficients in the Walsh series for f . If some of these coefficients are zero or may be neglected, this leads to a reduction in the circuit depth for ˆ implementing U . We will examine such cases below, but first we discuss methods to find optimal-depth circuits given a Walsh series for f , as well as to calculate the circuit depth based on the number of non-zero coefficients aj . 4 II. OPTIMAL CIRCUIT CONSTRUCTIONS Implementing all 2n − 1 Walsh functions7 for the unitary in Eq. (7) by concatenation of the individual circuits for each Paley-ordered Walsh operator gives an elementary gate count that scales as O (n2n ).8 For n = 3 qubits, the general circuit found in this way is shown in Fig. (4), with vertical dashed lines separating the different Walsh operators. However, as Bullock and Markov have shown [7], this circuit construction is not optimal. They find that it is possible to reduce the gate count to 2n+1 −3 and prove that this is optimal within a factor of two. In this section, we show that putting the Walsh operators in sequency order automatically produces the optimal circuit. In addition, we describe how to calculate the gate count for an arbitrary number of Walsh functions N ′ < N , where N = 2n . This gate count scales as O(N ′ ). We begin by recalling the relationship between the gate sequence for a Walsh operator wj and the binary exˆ pansion of its index, j. This gate sequence is given by placing a rotation gate, Rz (−2aj ), on the qubit corresponding to the most significant bit of j, and CNOTs on either side targeted on the same qubit and controlled on the qubits corresponding to the other non-zero bits of j. Since two identical CNOTs (CNOTs with the same targets and controls) cancel, it follows that the CNOTs between the rotation gates in adjacent Walsh operators are controlled on the non-zero bits of the bitwise XOR between their indices. For example, given a circuit with w6 followed by w7 , the bitwise XOR of their indices is ˆ ˆ 6 ⊕ 7 = (110) ⊕ (111) = 001. Thus there will be a single CNOT controlled on qubit 1, located between the Rz (−2a6 ) and Rz (−2a7 ) gates on qubit 3. The target of this CNOT is qubit 3, the common MSB of the two Walsh function indices. In order to minimize the number of CNOTs between rotation gates in a circuit containing all 2n − 1 Walsh operators, the operators must be ordered in such a way that adjacent indices have the minimal number of binary transitions between them. Such an ordering is given by the Gray code [17], where the number of binary transitions between adjacent indices is exactly one. Walsh functions sorted in this way are called sequency ordered [17]. This is also the order of increasing number of zero crossings.9 In addition, we partition the Walsh functions according to their common MSBs. For i > 1, the circuit for the i-th partition, corresponding to MSB ji , contains 2(i−1) rotation gates and 2(i−1) CNOTs, for a total of 2i gates. When i = 1, there is only a single rotation gate. For an n qubit system the total number of gates is then 1 + n 2i = 2n+1 − 3, which is the optimal gate count i=2 found by Bullock and Markov [7]. To illustrate the procedure, we give an example with n = 3 qubits. First we reorder the binary strings corresponding to the indices j (except j = 0) in Gray code. This is given by {j3 j2 j1 } = {001, 011, 010, 110, 111, 101, 100} = {1, 3, 2, 6, 7, 5, 4}. Next, this set is partitioned into subsets with a common MSB: G1 = {001} = {1}, G2 = {011, 010} = {3, 2}, G3 = {110, 111, 101, 100} = {6, 7, 5, 4}. Each partition Gi corresponds to a set of operators with rotation gates on, and CNOTs targeted on, qubit i. Finally, adjacent entries in each Gi are XOR’ed to give the qubits containing the controls of the CNOTs. Since the last entry of each partition is always a single 1 in the i–th place, this approach extends formally to the left-most CNOT in the corresponding circuit by taking an XOR between the first and last element of Gi . For example, the sub circuit corresponding to G3 is found by evaluating 100 ⊕ 110 = 010, 110 ⊕ 111 = 001, 111 ⊕ 101 = 010, 101 ⊕ 100 = 001. From top to bottom, this gives CNOTs controlled on qubits 2, 3, 2, and 3, respectively. These go to the left of each rotation gate. In this way, we reduce the initial non-optimal circuit in Fig. 4 to the optimal circuit in Fig. 5. While this method can be used to generate the optimal circuit containing all N − 1 = 2n − 1 Walsh functions, it is not optimal when applied directly to the case of N ′ < N − 1 Walsh functions, since adjacent elements in the sets Gi will now contain multiple binary transitions. In this case, we can use the following commutation relations between CNOTs to simplify the circuit further. Letting i Cj denote a CNOT with control i and target j, i i Cj Zi = Zi Cj , i k Cj Cj i i Ck Cj i j Cj Ck (8) (9) (10) (11) = = = 7 8 We ignore w0 = I ⊕n as it will only contribute a global phase, ˆ and hence will not affect the final result of any algorithm. Each operator for a given wj will require 2 (hj − 1) CNOTs where ˆ hj is the Hamming weight of index j, and one Rz (θ) gate. Therefore to implement all 2n − 1 Walsh operators in the expansion would require n k=2 k i Cj Cj , i i Cj Ck , j i i Ck Ck Cj . n k The first equation states that a Z gate commutes with the control of any CNOT. The second and third equations state that CNOTs with common targets but different controls, or common controls but different targets, 2 (k − 1) = 2 − 2(n+1) + n2n 9 CNOTs and 2n − 1 single qubit rotation gates, which gives 1 + 2n (n − 1) = O (n2n ). For consistency with the rest of the paper, we keep the indices of Walsh functions and operators in Paley order, and do not relabel them in sequency order. 5 |k1 |k2 |k3 R1 R2 • R3 R4 R5 • • • • R6 • • • R7 • • FIG. 4. Non-optimal circuit implementing the Paley-ordered Walsh operators w1 through w7 . Dashed lines separate the sub ˆ ˆ circuits for each of the individual Walsh functions. The rotation gates are Rj ≡ Rz (−2aj ). x ˆ ψ|ψ = 1. Letting Uǫ = eifǫ (ˆ) , wavefunctions ||ψ | = ˆǫ , U ) ≤ ǫ iff ˆ it follows from these definitions that E(U |fǫ (x) − f (x)| ≤ ǫ, ∀ x. ˆ As discussed in the introduction, the circuit Uǫ apˆ proximating the operator U with error ǫ is efficient if it can be implemented using O(poly(n, 1/ǫ)) one- and twoqubit gates. Our approach is to resample f at a rate 2k ≤ 2n , with k the smallest integer possible resulting in a sampling error ǫk ≤ ǫ. The resampled function may be written in terms of a 2k -term partial Walsh-Fourier series 2k −1 as fk (x) = i=0 ai wi (x). (x can be discrete or continuous.) For a continuously differentiable function f (x), the absolute error, ǫk = supx |fk (x) − f (x)|, satisfies11 ǫk ≤ supx |f ′ (x)|/2k [12, 13]. (This expression works for discrete x as well, by interpreting f ′ (x) as a finite difference.) Since the number of terms in the series fk is 2k , this implies that the number of Walsh functions necessary to approximate f (x) with absolute error ǫk ≤ ǫ is O(1/ǫ). The number of gates in the corresponding unitary operator Uǫ is 2k+1 − 3 [7], which is also O(1/ǫ) and is a constant independent of n as long as k ≤ n. x This proves that the operator eifk (ˆ) is an efficient gate if (ˆ) x sequence for e for any n ≥ k. Although efficient circuits for diagonal unitaries can be constructed using partial Walsh-Fourier series, the circuit depth can often be reduced further by minimizing the number of Walsh functions used in the series for f (x). To be precise, consider the problem of finding a Walsh series fs (x) that satisfies |fs (x) − f (x)| ≤ ǫ with the smallest possible number of Walsh coefficients. This is an integer programming problem, whose solution can be found numerically given f (x) and ǫ. However, Yuen has shown that simply throwing away the terms of the WalshFourier series for f (x) below an appropriate bound gives close to optimal results [13]. This is a much simpler procedure, and we apply it in the example in the next section. The solution gives the non-zero Walsh coefficients aj as well as the minimum number of grid points, 2n , needed to represent the resulting function. This information can then be combined with the circuit optimization methods described in the previous section to obtain ˆ a minimal-depth and minimal-width circuit for U . commute. The final equation describes the case when the target of one CNOT is the control of another. Then commuting the two introduces an additional CNOT that is controlled by same qubit as the first and targeted on the same qubit as the second: |i |j |k • • = • • • (12) Using these rules, we find that in most cases the gate count for a circuit with N ′ < N Walsh operators on n = log2 (N ) qubits can be reduced to O(N ′ ) gates.10 |k1 |k2 |k3 R1 • R3 • R2 • R6 R7 • • R5 R4 • FIG. 5. Optimal circuit implementing all 7 Walsh operators on 3 qubits. The Walsh operators are first reordered in sequency order (but keeping the Paley indices). Then all but one CNOT in between adjacent rotation gates cancels. This circuit is equivalent to the one in Fig. 4. The rotation gates are Rj ≡ Rz (−2aj ). III. EFFICIENT CIRCUITS FOR DIAGONAL UNITARIES In this section, we consider approximating f (x) with a partial Walsh-Fourier series. If n is fixed, this leads to an ˆ efficient circuit for U . Otherwise it gives the minimum n ˆ necessary to represent U within the given error, ǫ. Following Ref. [6], we define the error in implementing ˆ ˆ ˆ ˆ ˆ ˆ the operator V instead of U as E(U , V ) ≡ ||U − V ||, ˆ ˆ where ||A|| ≡ max|ψ |A|ψ | is the spectral norm of the ˆ operator A. The maximum is taken over all normalized 10 We verified this for example cases with up to 5 qubits. Starting with an optimal circuit with N = 2n Walsh operators, a single gate is removed for each Walsh coefficient corresponding to a Rademacher function (an index of the form j = 2k ) that is identically zero. Otherwise generally 2 gates are removed, with occasional exceptions that do not affect the final scaling in the examples we tested. An analysis of the general case with n qubits is in progress and will be published separately. 11 A more general expression for the error is ǫk ≤ ω(1/2k , f ), where ω is the modulus of continuity, defined as ω(δ, f ) = sup|t−x|≤δ |f (t)−f (x)| [12]. For continuously differentiable functions, ω(δ, f ) = δ sup |f ′ |, which gives the expression in the text. 6 IV. QUANTUM SIMULATION EXAMPLE: ECKART BARRIER As a practical example of the ideas above, we analyze the quantum simulation of tunneling through an Eckart barrier by numerically implementing the real-space algorithm of Wiesner and Zalka [1, 2]. The Eckart barrier problem is a benchmark in classical computational methods of quantum chemistry for simulating quantum dynamics and transition states of chemical reactions. The solution to the scattering problem can be used for calculating chemical reaction rates [3]. A. Real-space algorithm As we have seen, the gate error in evaluating a diagonal unitary is equal to the absolute error in the exponent. For the potential energy propagator, approximating V (x) with a function Vǫ (x) satisfying supx |Vǫ (x) − V (x)| ≤ ǫ, ˆ ˆ results in an error E(e−iVǫ δt , e−iV δt ) ≤ ǫδt. Letting ǫV be the error in V (x) and ǫK be the error in K(p), the total gate error for the algorithm satisfies EG ≤ ǫV t + ǫK t. ˆ The total error in evaluating U therefore satisfies ˆ E(U , e−iHt ) ≤ αtδt + ǫV t + ǫK t. ˆ (15) We evaluate the time evolution of a quantum system, |ψ(t) = e−iHt |ψ(0) , ˆ (13) using the real-space, or 1st quantized, representation of the wavefunction in terms of position eigenstates, |ψ(t) = |x x|ψ(t) dx. For a d-dimensional system (d = 3m for m particles), |x = |x1 ...|xd , and dx ≡ dd x. Each xi is discretized, and represented on the quantum computer in the computational basis as in Eq. (5). Using d registers of n qubits each, the basis states corresponding to a grid of 2dn points can be represented. Eq. (13) is evaluated using the first-order Trotter formula [1, 3, 18, 19]. Assuming a time-independent Hamilˆ ˆ ˆ tonian H = K(ˆ)+V (ˆ), where [K, V ] = 0, the quantum p x algorithm for evaluating Eq. (13) is ˆ ˆ ˆ ˆ |ψ(t) = F † e−iKδt F e−iV δt t/δt Since the diagonal unitaries can be implemented efficiently, and the QFT requires poly(n) gates, the entire algorithm is efficient, and requires O(poly(n), 1/ǫV , 1/ǫK , t/δt) gates. The parameters δt, ǫV , and ǫK may be varied to obtain the shortest gate sequence for the simulation given a combined total error tolerance. Here, we only consider the problem of finding the shortest gate sequence for a single Trotter step. This corresponds to finding the shortest Walsh series for the approximate potential and kinetic energies given ǫV and ǫK . C. Simulation |ψ(0) , (14) ˆ where t/δt is an integer called the Trotter number. The F operators are Quantum Fourier transforms (QFTs) and are inserted to diagonalize the kinetic energy operators ˆ ˆ K. The potential energy V is already diagonal in the position representation. B. Error analysis The Eckart barrier is defined as A sech(a x) [15], and is plotted in Fig. 6 for A = 1, a = 0.05. Also shown is a plot of a 19-term Walsh series for this potential that is accurate to 10%. This series was constructed by including a subset of coefficients from the full Walsh-Fourier series starting from the largest, then the next largest, etc. until the function was reproduced within the required 10% accuracy.12 This approach gives the minimal set of Walsh-Fourier coefficients, and is usually very close to the fully optimized solution found when the magnitudes of the coefficients are allowed to vary [13]. We find that only 7 qubits are necessary to represent the potential to 10% accuracy, with the given set of parameters. If n > 7, only the qubits corresponding to the 7 most significant digits in the register will be used. This illustrates the resource savings possible if n is large.13 It is generally not possible to evaluate the diagonal unitary kinetic and potential propagators in Eq. (14) exactly. At the very least, there will be sampling error in going from the continuous |x to the discrete |xk representation. This contribution to the total error is in addition to the Trotter error from splitting the propaˆ gator into non-commuting parts. Letting U denote the operator on the right hand side of Eq. (14), the total ˆ ˆ ˆ ˆ simulation error satisfies E(U , e−iHt ) ≡ ||U − e−iHt || ≤ αtδt+ EG , where EG denotes the gate error in evaluating the kinetic and potential propagators, αtδt is the 1-st orˆ ˆ der Trotter error, and α = ||[V , K]|| is a problem-specific constant. 12 13 We used a 2n -term Walsh-Fourier transform with n = 13 to approximate the infinite series. This introduces a discretization error of about 0.1%. The 19 largest coefficients included in the approximate Walsh series are those with Paley indices: 1, 2, 4, 7, 8, 11, 13, 14, 16, 19, 21, 22, 25, 32, 35, 37, 38, 64, and 67. The smallest power of 2 that is greater or equal to every index is 27 = 128, which means that 7 qubits are necessary to represent the series. Although useful for illustrating the approach, the classical algorithm we described for finding best subset of Walsh-Fourier coefficients to approximate a function is not efficient since it requires first calculating a high-dimensional Walsh-Fourier transform. In fact it is not necessary to do this. For a given k, efficient methods exist for finding the best k-term Walsh-Fourier series approximation to a given function without calculating the entire transform [20]. 7 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.2 0.4 0.6 0.8 1 −5 −5 0 0 5 0 0.2 (a) 0.4 0.6 5 0 0.2 (b) 0.4 0.6 −5 −5 0 0 5 5 0 0.2 (c) 0.4 0.6 0 0.2 (d) 0.4 0.6 FIG. 6. Eckart barrier, A sech(a x), with A = 1, a = 0.05. In blue is the exact function and in green is a 19-term partial Walsh-Fourier series, which is accurate to 10%. The largest Paley index of these terms is 67. Therefore at least 7 qubits are needed to implement this approximation. Had we opted to use a partial Walsh-Fourier series (keeping all 2n coefficients for some integer n) to approximate the Eckart barrier, we would also find that n ≥ 7 is required to obtain better than 10% accuracy. (The discretization error with n = 7 is 7.8%, and with n = 6 is 15.6%.) The efficient circuit produced in this way requires a total of 27 − 3 = 125 gates, of which 26 = 64 are rotation gates.14 In contrast, the truncation described in the previous paragraph gives a circuit with a total of approximately 50 gates, of which 19 are rotation gates. This is more than a factor of two improvement. We performed numerical simulations of Eq. (14) for the Eckart barrier with multiple error tolerances on the potential. The wavefunction was initialized to a Gaussian wavepacket traveling towards the barrier. Since there is a known polynomial-time algorithm for the kinetic enp2 ergy propagator, eiˆ /2 [21], we evaluated it with maximum resolution. The time-evolution of the wavefunction is shown in Fig. 7. One can see from these figures that relatively few Walsh functions are needed for an accurate simulation. Even the lowest fidelity simulation reproduces the important features of the quantum scattering problem, including interference fringes. (Although here the fringes are due to periodic boundary conditions and are not physical.) For the present example, Eq. (15) drastically overestimates the total error, since it is a bound over all wavefunctions. To quantify the error in the simulation for the particular initial states under consideration, we found it more convenient to use the fidelity, defined as FIG. 7. Plots of |ψ(x, t)|2 for the Eckart barrier simulation with different error tolerances on the potential. Time t is on the horizontal axis, consisting of a total evolution time of 0.6 divided into 1000 time steps (which gives a negligible Trotter error of less than 1% for the simulation parameters given below). The vertical axis contains 2n grid points, with n different for each figure and −5 ≤ x < 5. The initial state ψ(x, 0) is a Gaussian wave packet given by ψ(x, 0) ∝ exp{−(x − x0 )2 /2σ 2 + i[p0 (x − x0 )]} in units such that = m = 1. The parameter values are x0 = −3, p0 = 15, σ = 0.5. The Eckart barrier potential is V (x) = A sech(a x) with A = 100, a = 0.5. The number of qubits n, errors in the potential and kinetic energies, number of Walsh functions nW , and fidelity of the final state compared to a 10-qubit simulation with maximal resolution (1% discretization error in the potential energy and 0.4% in the kinetic energy) are (a) n = 10, ǫV = 1%, ǫK = 0.4%, nW = 512 (“exact”), F = 1, (b) n = 8, ǫV = 5%, ǫK = 1.6%, nW = 30, F = 0.9794, (c) n = 7, ǫV = 10%, ǫK = 3.1%, nW = 19, F = 0.9105, and (d) n = 6, ǫV = 15%, ǫK = 6.25%, nW = 14, F = 0.6507. F = | ψ(t)|ψ0 (t) |, where |ψ0 (t) is the exact final wavefunction.15 As a proxy for |ψ0 (t) , we used a 10-qubit simulation with maximum possible resolution (including all Walsh operators) and 1000 time steps. By numerically analyzing the scaling of the error with the number of time steps, we found this number of time steps gives a Trotter error of less than 1%. V. CONCLUSION We showed that Walsh functions correspond to a basis for diagonal operators, and used this Walsh operator basis to prove that efficient circuits can be constructed for diagonal unitaries. We also described how the truncated Walsh-Fourier series for a function f (x) leads to an approximately minimal-depth circuit for the diagonal 14 The Eckart barrier is an even function. Therefore half the Walsh coefficients are zero. 15 The fidelity is related to the simulation error defined previously as E = sup|ψ 2(1 − F ). 8 x unitary eif (ˆ) given an error tolerance on f . This circuit has a gate count that scales proportionally to the number of Walsh functions in the series for f (x). We applied this approach to the quantum simulation of tunneling through an Eckart barrier, demonstrating that high-fidelity quantum simulations without ancillas can be achieved with few qubits and low depth. VI. ACKNOWLEDGEMENTS We would like to acknowledge Michael Biercuk for his insightful discussions on the Walsh system, Venkat Chandar, for discussing sparse Walsh-Fourier approximation and pointing out Reference [20], Alexandre Cooper-Roy for the insightful technical discussions on the Walsh system and its applications, Michael Burns for his helpful discussions on digital function approximation, and Juan Bermejo Vega, for pointing out that Walsh functions are the characters of Z⊗n . We also thank John Chiaverini 2 and Jeremy Sage for helpful comments and suggestions. This work was sponsored by the Assistant Secretary of Defense for Research and Engineering under Air Force Contract number FA8721-05-C-0002, and by NSF CCI under award 1037992-CHE. JW and SM are supported by the Air Force Office of Scientific Research under award number FA9550-12-1-0046. Sponsored by United States Department of Defense. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressly or implied, of the U.S. Government. [1] C. Zalka, Proc. R. Soc. Lond. A 454, 313 (1998). [2] S. Wiesner, arXiv:quant-ph/9603028 (1996). [3] I. Kassal, S. P. Jordan, P. J. Love, M. Mohseni, and A. Aspuru-Guzik, Proc. Natl. Acad. Sci. 105, 18681 (2008). [4] S. P. Jordan, K. S. M. Lee, and J. Preskill, Science 336, 1130 (2012). [5] E. Farhi, J. Goldstone, S. Gutmann, J. Lapan, A. Lundgren, and D. Preda, Science 292, 472 (2001). [6] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, 2010). [7] S. S. Bullock and I. L. Markov, Quant. Info. and Comput. 4, 27 (2004). [8] A. M. Childs, Quantum information processing in continuous time, Ph.D. thesis, Massachusetts Institute of Technology (2004). [9] G. Strini, Fortschr. Phys. 50, 171 (2002). [10] J. Walsh, Am. J. Math. 45, 5 (1923). [11] W.-K. Tung, Group Theory in Physics (World Scientific Publishing Co. Pte. Ltd., 1985). [12] B. Golubov, A. Efimov, and V. Skvortsov, Walsh Series and Transforms: Theory and Applications (Kluwer Academic Publishers, 1991). [13] C.-K. Yuen, IEEE Trans. Computers C-24, 590 (1975). [14] C.-K. Yuen, IEEE Trans. Computers 21, 1273 (1972). [15] C. Eckart, Phys. Rev. 35, 1303 (1930). [16] A. Sornborger, Sci. Rep. 2, 597 (2012). [17] K. Beauchamp, Applications of Walsh and Related Functions, With an Introduction to Sequency Theory (Academic Press, 1984). [18] M. Suzuki, Phys. Lett. A 165, 387 (1992). [19] H. F. Trotter, Proc. Amer. Math. Soc. 10, 545 (1959). [20] E. Kushilevitz and Y. Mansour, STOC ’91 Proc. 23rd ACM Symposium on Theory of Computing , 455 (1991). [21] G. Benenti and G. Strini, Am. J. Phys. 76, 657 (2008).