Quantum topological data analysis with continuous variables

I introduce a continuous-variable quantum algorithm for topological data analysis. The goal of the algorithm is to calculate the Betti numbers in persistent homology, which are the dimensions of the kernels of the combinatorial Laplacians. I accomplish this task with the use of qRAM to create an oracle which organizes sets of data. I then perform a continuous-variable phase estimation on a Dirac operator to get a probability distribution with eigenvalue peaks. The results also leverage an implementation of a continuous-variable conditional swap gate.


I. INTRODUCTION
Extracting useful information from data sets is a difficult task, and for large data sets it can be impossible on a classical computer. It is an ongoing field of research to produce quantum algorithms which can analyze data at large scales [1][2][3][4][5]. Topological methods for data analysis allow general useful features of the data to be revealed, and these features do not depend on the representation of the data or on any additional noise. This makes topological techniques a powerful analytical tool [6][7][8][9][10][11][12][13][14][15][16][17][18][19]. Classically, these methods require computing time that scales exponentially, but they have been shown to be a great example of the power of quantum algorithms [1,4,5,20].
In this work, I follow and build upon the results in [1], focusing on finding the Betti numbers in persistent homology. Persistent homology is a topological method that revolves around representing a space in terms of a simplicial complex and examining the application of a scaled boundary operator. The Betti numbers represent features of the data, such as the number of connected components, holes, and voids. In order to determine the Betti numbers, I use a quantum algorithm which employs the tools of continuous-variable (CV) quantum computation. More specifically, I use quantum principal component analysis (QPCA) to resolve a spectrum of Betti numbers [2].
A CV quantum system is one that utilizes an infinite-dimensional Hilbert space, where the measurement of variables produces a continuous result. This substrate is being studied extensively, and has been shown to have applications in generating entanglement, quantum cryptography, quantum teleportation, and quantum computation [21][22][23][24][25][26][27][28][29][30][31]. The use of a continuous system provides advantages over a qubit, or discrete-variable (DV), system, such as low cost of optical components, less need for environmental control, and potentially better scaling to larger problems [2,21,23,28-30,32-34]. The CV substrate has also been demonstrated to be more useful in situations with a high rate of information transfer, such as computing on encrypted data [29,32,33,35]. These advantages can be very useful when analyzing and extracting information from large volumes of classical data, and as a result make CV quantum computing the natural choice in this setting.
* siopsis@tennessee.edu
The body of this work starts by discussing persistent homology in a general way, and by setting up the mathematical background of the algorithm, including some useful definitions such as the combinatorial Laplacian and the $k$th Betti number. I then define the use of quantum Random Access Memory (qRAM), which allows a mapping of classical data into a set of quantum states [36][37][38]. In addition, I outline the process of exponentiation of a Hermitian operator, and arrive at the construction of an oracle that returns the elements of the $k$th Vietoris-Rips complex. Finally, I provide the CV quantum algorithm which uses the process of QPCA, utilizing an implementation of a hybrid [39] exponential conditional swap, to determine the Betti numbers of the system.
The discussion is organized as follows. In Section II, I discuss in general the steps involved in persistent homology as well as some basics in topological data analysis. In Section III, I introduce pertinent mathematical background needed to set up the algorithm. Section IV describes the usage of qRAM, exponentiation, and the oracle. Section V outlines the algorithm. I offer a discussion and a conclusion in Section VI.

II. BACKGROUND
The final goal of topological data analysis, along with the algorithm introduced here, is to determine interesting features of a data set. In this case, the indicator of structure is the Betti numbers, which are a count of topological features. The Betti numbers distinguish between topological spaces based on their connectivity, and are grouped based on dimension. The common notation for Betti numbers is $\beta_k$, where for $k = 0, 1, 2$ one has the Betti numbers that correspond to connected components, one-dimensional holes, and two-dimensional voids, respectively.
As an example, see Figure 1, where I consider some simple topological surfaces in one to three dimensions, and list the values of the $k = 0, 1, 2$ Betti numbers. The algorithm introduced in this work allows one to find the Betti numbers after representing some given data in a space of vertices with connecting edges. Starting with the data, one can create this representation in the following steps.
(a) Start by allowing each data point to represent a position vector, and place one point at the end of each of these vectors. These points are referred to as the vertices.
(b) Next, for a given diameter $\epsilon$, draw a circle around each vertex in the space.
(c) Between every two vertices which have contacting circles (and as such are less than a distance $\epsilon$ apart), draw a connecting line. These connections are edges of $n$-dimensional shapes called simplices, and the space of simplices is called a simplicial complex.
This process is visualized in Figure 2. Now, in order to begin to analyze this representation of the data on a quantum computer, it must first be encoded in a quantum state. This is the subject of the next section.
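Steps (a) to (c) can be mimicked classically in a few lines. The following is only a minimal sketch, not part of the quantum algorithm: the function name `rips_edges` and the square data set are hypothetical, and two vertices are connected when their distance is at most the chosen scale.

```python
from itertools import combinations
from math import dist

def rips_edges(points, eps):
    """Return index pairs (i, j), i < j, whose points are at most eps apart."""
    return [(i, j) for i, j in combinations(range(len(points)), 2)
            if dist(points[i], points[j]) <= eps]

# Illustrative data (not from the paper): the four corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

print(rips_edges(square, 1.0))   # only the four sides are connected
print(rips_edges(square, 1.5))   # the two diagonals are connected as well
```

Changing the scale changes which vertices are joined, which is why the resulting topology depends on the chosen value.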

III. INITIALIZATION OF ALGORITHM
To start, we are given $n$ points in a $d$-dimensional space at position vectors $v_i$, $i = 1, 2, \dots, n$. For simplicity, assume that all points are on the unit sphere, $|v_i| = 1$. More general sets of points can be considered by extending the discussion in a straightforward, albeit somewhat tedious, manner. For each vector, construct the quantum state
$$ |v_i\rangle = \sum_{a=1}^{d} v_i^a \, |a\rangle \tag{1} $$
using $\log_2 d$ qubits, where $v_i^a$ are the components of $v_i$. This is analogous to the first part of Figure 2, where data are represented with a series of dots of variable distance from one another.
A $k$-simplex $s_k$ is defined as a simplex consisting of $k+1$ vertices at points $v_{i_0}, \dots, v_{i_k}$ connected with $k(k+1)/2$ edges. The first four $k$-simplices are shown in Figure 3. A simplex can be represented by a string of $n$ bits consisting of $k+1$ 1s at positions $i_0, \dots, i_k$, and 0s otherwise (for an example of this representation see Figure 4). Next, define the diameter $D(s_k)$ as the maximum distance between two vertices of the simplex,
$$ D(s_k) = \max_{a,b} |v_{i_a} - v_{i_b}| . \tag{2} $$
This allows one to define the Vietoris-Rips complex $S_k$ as the complex consisting of all $k$-simplices with diameter $D \le \epsilon$, for a given scale $\epsilon$. The construction of the Vietoris-Rips complex is equivalent to the latter pair of steps in Figure 2, where circles of diameter $\epsilon$ are drawn around each vertex, and then the vertices of contacting circles are connected with edges. The objective of persistent homology is to continuously vary the scale $\epsilon$ until one finds a value which gives the space an interesting structure, as determined by the Betti numbers. The word persistent comes from this varying of $\epsilon$, whereas homology is the algebraic tool that measures the structure of the complex.
For the algorithm in the Hilbert space of $n$ qubits, one can define the projection operator onto $S_k$,
$$ P_k = \sum_{s_k \in S_k} |s_k\rangle\langle s_k| . $$
Evidently, for $\epsilon \ge 2$, all $k$-simplices are included in $S_k$, so one need only consider $\epsilon \in (0, 2)$. The parameter $\epsilon$ can be encoded using $m$ qubits as $|x\rangle$, $x = 0, 1, \dots, 2^m - 1$, where $\epsilon = x/2^{m-1}$.
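The definitions of the diameter, the Vietoris-Rips complex, the bit-string encoding, and the encoding of the scale can be illustrated with a small classical sketch. All function names below are hypothetical, and the four points on the unit circle are illustrative data chosen to respect the unit-sphere assumption; nothing here is taken from the quantum implementation.

```python
from itertools import combinations
from math import dist

def diameter(simplex, points):
    """Largest pairwise distance among the vertices (0 for a single vertex)."""
    return max((dist(points[a], points[b]) for a, b in combinations(simplex, 2)),
               default=0.0)

def rips_complex(points, k, eps):
    """All k-simplices (tuples of k+1 vertex indices) with diameter <= eps."""
    return [s for s in combinations(range(len(points)), k + 1)
            if diameter(s, points) <= eps]

def to_bits(simplex, n):
    """n-bit string with 1s at the vertex positions of the simplex."""
    return "".join("1" if i in simplex else "0" for i in range(n))

def scale_from_register(x, m):
    """Scale encoded by the m-bit integer x, namely x / 2**(m - 1)."""
    return x / 2 ** (m - 1)

# Illustrative data (not from the paper): four points on the unit circle.
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
edges = rips_complex(points, 1, 1.5)       # neighbours are sqrt(2) apart
triangles = rips_complex(points, 2, 1.5)   # every triple contains an antipodal pair

print([to_bits(s, len(points)) for s in edges])
print(triangles)
print(scale_from_register(3, 3))
```

At this scale the complex is a closed loop of four edges with no filled triangles, so one expects a single one-dimensional hole.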
If one removes the $l$th vertex from the $k$-simplex $s_k$, one obtains the $(k-1)$-simplex $s_{k-1}(l)$, $l = 0, 1, \dots, k$. Evidently,
$$ |s_{k-1}(l)\rangle = X_{i_l} |s_k\rangle , $$
where $X_{i_l}$ is $X$ acting on the $i_l$th qubit. Let us now probe the space to determine whether or not any interesting features are present. This probing is done by the boundary map acting on the space. Define the boundary map $\partial_k$ by
$$ \partial_k |s_k\rangle = \sum_{l=0}^{k} (-1)^l \, |s_{k-1}(l)\rangle . \tag{5} $$
One easily deduces $\partial_k \partial_{k+1} = 0$. To restrict to Vietoris-Rips complexes, introduce
$$ \tilde\partial_k = \partial_k P_k , $$
where $P_k$ is the projection operator onto $S_k$. The action of the boundary operator on a simplex, as well as an example encoding into bits, is visualized in Figure 4.
The entire Hilbert space of $n$ qubits is split into $n + 1$ subspaces labeled by $k$. To keep track of this splitting, I introduce a register of $\log_2 n$ qubits to store the state $|k\rangle$ and map $|s_k\rangle \to |k\rangle |s_k\rangle$ in parallel. This can be done in $n$ steps as follows. Start with the state $|0\rangle$ for the register. Apply the permutation $P$, which shifts the register cyclically, $P|j\rangle = |j+1\rangle$, for each digit of $|s_k\rangle$ equal to 1 (using the qubit corresponding to each digit as control). The permutation $P$ is a 1-sparse matrix and can be implemented efficiently. Thus one applies $P^k$, so $|0\rangle \to P^k |0\rangle = |k\rangle$, as desired.
A general state can be written as
$$ |\Psi\rangle = \sum_{k} |k\rangle |\Psi_k\rangle , \tag{8} $$
where $|\Psi_k\rangle$ is in the span of $\{|s_k\rangle\}$.
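Define the Dirac operator as the Hermitian matrix
$$ \tilde B = \sum_{k} \left( |k-1\rangle\langle k| \otimes \tilde\partial_k + |k\rangle\langle k-1| \otimes \tilde\partial_k^\dagger \right) , \tag{9} $$
where $\tilde\partial_k = \partial_k P_k$ is the boundary map restricted to the Vietoris-Rips complex. One easily obtains
$$ \tilde B^2 = \sum_{k} |k\rangle\langle k| \otimes \Delta_k , $$
where
$$ \Delta_k = \tilde\partial_k^\dagger \tilde\partial_k + \tilde\partial_{k+1} \tilde\partial_{k+1}^\dagger \tag{11} $$
is the combinatorial Laplacian of the $k$th simplicial complex. The output of the $k$th combinatorial Laplacian being zero tells us exactly that we have found a space which is boundary-less and also not a boundary itself. The number of these features in our data is the Betti number. Therefore, one can say that the dimension of the kernel of $\Delta_k$ is the $k$th Betti number,
$$ \beta_k = \dim \ker \Delta_k . $$
An example of determining a void in a $k = 2$ complex is shown in Figure 5. The total number of voids is the Betti number $\beta_2$.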
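The statement that the Betti number is the dimension of the kernel of the combinatorial Laplacian can be checked classically on a small example. The sketch below is only illustrative: the helper names are hypothetical, the complex is supplied directly as lists of simplices (so the projection onto the Vietoris-Rips complex is already built in), and the rank is computed by exact Gaussian elimination rather than by any quantum routine.

```python
from fractions import Fraction

def boundary_matrix(faces, simplices):
    """Boundary map from k-simplices (columns) to (k-1)-simplices (rows)."""
    row = {f: r for r, f in enumerate(faces)}
    mat = [[0] * len(simplices) for _ in faces]
    for c, s in enumerate(simplices):
        for l in range(len(s)):
            face = s[:l] + s[l + 1:]
            if face in row:
                mat[row[face]][c] = (-1) ** l
    return mat

def laplacian(cx, k):
    """Combinatorial Laplacian on k-simplices; cx[k] lists the k-simplices."""
    n_k = len(cx[k])
    lap = [[0] * n_k for _ in range(n_k)]
    if k > 0:                      # "down" part: (d_k)^T d_k
        d = boundary_matrix(cx[k - 1], cx[k])
        for i in range(n_k):
            for j in range(n_k):
                lap[i][j] += sum(r[i] * r[j] for r in d)
    if k + 1 < len(cx):            # "up" part: d_{k+1} (d_{k+1})^T
        d = boundary_matrix(cx[k], cx[k + 1])
        for i in range(n_k):
            for j in range(n_k):
                lap[i][j] += sum(x * y for x, y in zip(d[i], d[j]))
    return lap

def rank(mat):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in mat]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def betti(cx, k):
    """Dimension of the kernel of the k-th combinatorial Laplacian."""
    return len(cx[k]) - rank(laplacian(cx, k))

# Illustrative complexes (not from the paper): a hollow and a filled triangle.
hollow = [[(0,), (1,), (2,)], [(0, 1), (0, 2), (1, 2)]]
filled = hollow + [[(0, 1, 2)]]
print(betti(hollow, 0), betti(hollow, 1))
print(betti(filled, 0), betti(filled, 1), betti(filled, 2))
```

The hollow triangle has one connected component and one hole, while filling in the triangle removes the hole.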
To summarize, when starting with a data set, the following steps are needed in order to perform persistent homology using the quantum algorithm presented in this paper.
(a) Start with $n$ points, called vertices, in a $d$-dimensional space defined by a set of position vectors. For each vector, the quantum state (1) is constructed.
(b) Using the diameter $D(s_k)$ defined in (2), the vertices are connected to form the Vietoris-Rips complex $S_k$ consisting of all $k$-simplices of diameter $\le \epsilon$.
(c) The space is then split into subspaces labeled by $k$, consisting of all $k$-simplices. A general state is then constructed which spans all of these subspaces, and is given in Eq. (8).
(d) In order to probe the space for interesting structures, use the boundary map (5). To determine if a region of the space is one of the features which are tallied up to become the Betti numbers (holes, voids, etc.), this region must be boundary-less and also not the boundary of any other part of the space, as shown in Fig. 5. These two properties are satisfied when the action of the combinatorial Laplacian (11) returns zero. The combinatorial Laplacians of $k$th order are the elements of the diagonal matrix which is the square of the Dirac operator (9).
(e) In order to apply this operator in the CV algorithm presented here, and to construct some of the states mentioned above, some additional mathematical tools are required, and are outlined in the following section.
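The relation between the Dirac operator and the combinatorial Laplacians in step (d) can be verified numerically on a small complex. The sketch below assembles a matrix with the boundary maps and their transposes as off-diagonal blocks, which is an assumed block layout, and checks that its square is block diagonal. The function names and the filled-triangle example are hypothetical, and the complex is assumed to be closed under taking faces.

```python
def boundary_matrix(faces, simplices):
    """Boundary map from k-simplices (columns) to (k-1)-simplices (rows).

    Assumes every face of every simplex is present in `faces`.
    """
    row = {f: r for r, f in enumerate(faces)}
    mat = [[0] * len(simplices) for _ in faces]
    for c, s in enumerate(simplices):
        for l in range(len(s)):
            mat[row[s[:l] + s[l + 1:]]][c] = (-1) ** l
    return mat

def dirac(cx):
    """Symmetric matrix with d_k in block (k-1, k) and its transpose in block (k, k-1)."""
    offsets = [0]
    for level in cx:
        offsets.append(offsets[-1] + len(level))
    size = offsets[-1]
    b = [[0] * size for _ in range(size)]
    for k in range(1, len(cx)):
        d = boundary_matrix(cx[k - 1], cx[k])
        for r, row in enumerate(d):
            for c, val in enumerate(row):
                b[offsets[k - 1] + r][offsets[k] + c] = val
                b[offsets[k] + c][offsets[k - 1] + r] = val
    return b, offsets

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Illustrative complex (not from the paper): a filled triangle.
cx = [[(0,), (1,), (2,)], [(0, 1), (0, 2), (1, 2)], [(0, 1, 2)]]
b, offsets = dirac(cx)
b2 = matmul(b, b)

def block(i):
    """Index k of the block that row or column i belongs to."""
    return max(k for k, off in enumerate(offsets[:-1]) if off <= i)

size = len(b)
off_block = [b2[i][j] for i in range(size) for j in range(size) if block(i) != block(j)]
print(all(v == 0 for v in off_block))      # the square is block diagonal
print(sum(b[i][i] for i in range(size)))   # the matrix itself is traceless
```

The diagonal blocks of the squared matrix are the Laplacians of each order, and the vanishing trace of the matrix itself is the reason a shift by a multiple of the identity is needed later, when the operator is exponentiated.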

IV. MATHEMATICAL TOOLS
In this section, the tools needed in the CV quantum algorithm are outlined, and the way they are used and implemented is discussed.
In order to complete the quantum algorithm, we assume that we are equipped with a qRAM [36][37][38] which, given an input state $|i\rangle|0\rangle$, produces the output state $|i\rangle|v_i\rangle$ in quantum parallel,
$$ \mathrm{qRAM}: \; |i\rangle|0\rangle \to |i\rangle|v_i\rangle . $$
Another useful tool which will be used in the algorithm is the exponentiation of an operator. Given a Hermitian operator $A$, and a resource qumode of quadratures $(q_R, p_R)$, it is necessary to apply the exponential operator (14) in parallel. To this end, I will use the exponential swap operator (15), where $S$ is the swap operator. Its implementation is discussed in Appendix A (and differs from the one given in [2]). While the body of this work uses a CV quantum algorithm, the method of implementing the exponential conditional swap also uses single-photon qubits in a hybrid approach [39]. The latter can also be implemented using CV systems in the dual-rail representation.
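Then we form the state $\rho_A = A/\mathrm{tr}A$ (assuming $\mathrm{tr}A \neq 0$), and apply (15) on the combined system of $|\Psi\rangle$ and $\rho_A$, for a short time $\delta t$. After tracing out the auxiliary mode $\rho_A$, we obtain the action of the exponential of $\rho_A$ on $|\Psi\rangle$ for the time $\delta t$, up to corrections of higher order in $\delta t$. By repeating this $\frac{t}{\delta t}(\mathrm{tr}A)$ times, an approximation to the desired operator (14) is obtained.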
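The mechanism of exponentiating a state with a swap can be checked numerically in a simplified setting. The sketch below is a qubit-only analogue and omits the resource qumode entirely, so it is not the CV operator (15). It applies $e^{-i\,\delta t\,S}$ to an auxiliary state $\rho$ and a target state $\sigma$, traces out the auxiliary system, and compares the result with the first-order expression $\sigma - i\,\delta t\,[\rho, \sigma]$. All names and the example states are hypothetical.

```python
from math import cos, sin

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def dagger(a):
    """Conjugate transpose."""
    return [[x.conjugate() for x in col] for col in zip(*a)]

def kron(a, b):
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def trace_out_first(mat, d):
    """Partial trace over the first factor of a (d*d)-dimensional matrix."""
    return [[sum(mat[a * d + i][a * d + j] for a in range(d)) for j in range(d)]
            for i in range(d)]

def exp_swap(dt, d):
    """exp(-i dt S) = cos(dt) I - i sin(dt) S, valid because S squared is the identity."""
    dim = d * d
    u = [[0j] * dim for _ in range(dim)]
    for a in range(d):
        for b in range(d):
            u[a * d + b][a * d + b] += cos(dt)
            u[b * d + a][a * d + b] += -1j * sin(dt)   # S|a,b> = |b,a>
    return u

def evolve(rho, sigma, dt):
    """Apply exp(-i dt S) to rho (x) sigma and trace out the auxiliary factor rho."""
    d = len(rho)
    u = exp_swap(dt, d)
    joint = matmul(matmul(u, kron(rho, sigma)), dagger(u))
    return trace_out_first(joint, d)

def first_order(rho, sigma, dt):
    """sigma - i dt [rho, sigma], the leading-order effect of exp(-i dt rho)."""
    rs, sr = matmul(rho, sigma), matmul(sigma, rho)
    d = len(rho)
    return [[sigma[i][j] - 1j * dt * (rs[i][j] - sr[i][j]) for j in range(d)]
            for i in range(d)]

rho = [[0.5, 0.5], [0.5, 0.5]]      # auxiliary state |+><+|
sigma = [[1.0, 0.0], [0.0, 0.0]]    # target state |0><0|
dt = 0.01
exact = evolve(rho, sigma, dt)
approx = first_order(rho, sigma, dt)
error = max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
print(error)   # of order dt**2
```

Because each application reproduces the exponential only to first order in the time step, many short steps are needed to build up a finite evolution time, which is the origin of the repetition count quoted above.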
We also assume we are in possession of a quantum oracle $O_k$ that acts on $|s_k\rangle|\psi\rangle$ in parallel, flipping the last qubit if $s_k \in S_k$, otherwise doing nothing,
$$ O_k |s_k\rangle|\psi\rangle = \begin{cases} |s_k\rangle X|\psi\rangle , & s_k \in S_k , \\ |s_k\rangle |\psi\rangle , & s_k \notin S_k . \end{cases} $$
This oracle can be implemented in $O(k^2)$ steps. If we choose $|\psi\rangle = |-\rangle$, where $X|\pm\rangle = \pm|\pm\rangle$, then the last qubit decouples and the oracle is a unitary acting on $|s_k\rangle$ as
$$ O_k |s_k\rangle = \begin{cases} -|s_k\rangle , & s_k \in S_k , \\ |s_k\rangle , & s_k \notin S_k . \end{cases} $$
To construct the oracle, first we construct the state $\frac{1}{n}\sum_{i,j} |i\rangle|j\rangle$ by making two copies of the state $\frac{1}{\sqrt{n}}\sum_i |i\rangle$. To this state we attach $|0\rangle \in \mathbb{C}^d$ as well as a qubit in the state $|+\rangle$. We then query qRAM, use the last qubit as control to apply the swap operator, and measure $X$ on the last qubit. If the outcome is $-1$, the state collapses to an (unnormalized) state to which we add ancillae, copying the labels $i$ and $j$ onto them, respectively (ignoring the last qubit, which has decoupled). Tracing out the ancillae and the last qubit, we obtain the (unnormalized) state $H$. It is a Hermitian operator that can be implemented as $e^{itH}$, as discussed above. Its eigenvalues are the distances between points. Also, any function of $H$ can be implemented; in particular, the step function $\theta(\epsilon^2 - H)$, which tests whether $s_k \in S_k$; hence the oracle.
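What the oracle decides can be stated classically: a simplex belongs to the Vietoris-Rips complex when all $k(k+1)/2$ pairs of its vertices are within the scale, which is consistent with the $O(k^2)$ step count. The following sketch is a classical stand-in with hypothetical names and illustrative data; it does not model the qRAM-based construction.

```python
from itertools import combinations
from math import dist

def in_rips(bits, points, eps):
    """True if every pair of vertices selected by the bit string is at most eps apart."""
    verts = [i for i, b in enumerate(bits) if b == "1"]
    return all(dist(points[a], points[b]) <= eps for a, b in combinations(verts, 2))

def oracle_phase(bits, points, eps):
    """Phase picked up by |s_k> once the last qubit is prepared in |->."""
    return -1 if in_rips(bits, points, eps) else 1

# Illustrative data (not from the paper): four points on the unit circle.
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
print(oracle_phase("1100", points, 1.5))   # neighbouring vertices: inside the complex
print(oracle_phase("1010", points, 1.5))   # antipodal vertices: outside the complex
```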

V. QUANTUM TOPOLOGICAL DATA ANALYSIS FOR CVS
The algorithm requires a register of $m$ qubits to record $\epsilon$ as $|x\rangle$ with $\epsilon = x/2^{m-1} \in (0, 2)$. Suppose $\epsilon$ is fixed (a condition that can be relaxed to include a filtration). Let us start with the initial state (25), which includes all $k$-simplices equally weighted. The initial state can be constructed from the state $|0\rangle|s\rangle$, where the register $|0\rangle$ consists of $\log_2 n$ qubits, and $|s\rangle$ consists of $n$ qubits. By using each qubit in the state $|s\rangle$ as control to apply the permutation $P$ on the register, we arrive at the desired initial state (25).
From the initial state (25), we construct an approximation to the state restricted to the simplices of the Vietoris-Rips complex, using Grover's search algorithm [40], with the aid of the oracle.
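The role of Grover's search in this step can be illustrated with a classical simulation of the amplitudes. This is the textbook iteration (oracle phase flip followed by inversion about the mean) applied to a uniform superposition; the function name, the number of items, and the marked set are hypothetical, and the number of marked items is assumed to be known and nonzero.

```python
from math import asin, floor, pi, sqrt

def grover(marked, n_items):
    """Simulate Grover iterations on a uniform superposition of n_items basis states.

    Assumes `marked` is a non-empty set of indices flagged by the oracle.
    """
    amps = [1 / sqrt(n_items)] * n_items
    n_iter = floor(pi / (4 * asin(sqrt(len(marked) / n_items))))
    for _ in range(n_iter):
        amps = [-a if i in marked else a for i, a in enumerate(amps)]   # oracle phase flip
        mean = sum(amps) / n_items
        amps = [2 * mean - a for a in amps]                             # inversion about the mean
    return amps

# Illustrative numbers (not from the paper): 4 marked simplices out of 16 bit strings.
marked = {3, 5, 10, 12}
amps = grover(marked, 16)
success = sum(a * a for i, a in enumerate(amps) if i in marked)
print(success)
```

After the iterations, essentially all of the weight sits on the marked bit strings, which is the sense in which the search prepares a state supported on the complex.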
Notice that the action of the Dirac operator (9) simplifies, because all projection operators $P_k$ act as the identity on $|\Psi_k\rangle$. Therefore, one could instead consider the simpler operator $B$ of Eq. (30), in which the projection operators are omitted. My goal is to compute the eigenvalues of $B$. For Betti numbers, I am interested in the frequency of occurrence of the zero eigenvalue, which yields the dimension of the kernel of the combinatorial Laplacian. I will compute the eigenvalues using QPCA, as discussed in [2], which is a more specific implementation than the original work in [1], which cites the use of general Hamiltonian simulation.
Let us attach a squeezed resource qumode, with squeezing parameter $s$, and apply the unitary (32), which contains a parameter $\gamma$ that can be adjusted at will. This unitary is of the form (14), except that $\mathrm{tr}B = 0$. We therefore need to regulate $B$ by adding $\alpha I$, where $\alpha$ is arbitrary. The eigenvalues are shifted by $\alpha$, and $\mathrm{tr}(B + \alpha I) \neq 0$. Expanding the state in the eigenstates of $B + \alpha I$, a measurement of the quadrature $q_R$ of the resource qumode with homodyne detection projects the state and yields a probability distribution consisting of peaks at the eigenvalues. If one is interested in distinguishing between eigenvalues, one ought to choose sufficiently large parameters $s$ and $\gamma$ so that the width of each peak, $1/(\gamma\sqrt{s})$, is narrow enough. From this probability distribution, one can deduce all Betti numbers.
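The effect of the regulator and the resolution condition can be summarized in a few lines. The notation below is introduced only for this illustration and is an assumption: $\lambda_j$ and $|\lambda_j\rangle$ denote the eigenvalues and eigenstates of $B$, $N$ is the dimension of the space on which $B$ acts, and $\alpha \neq 0$.

```latex
\begin{align}
B\,|\lambda_j\rangle = \lambda_j\,|\lambda_j\rangle
  \;&\Longrightarrow\;
  (B+\alpha I)\,|\lambda_j\rangle = (\lambda_j+\alpha)\,|\lambda_j\rangle , \\
\operatorname{tr}(B+\alpha I) &= \operatorname{tr}B + \alpha N = \alpha N \neq 0 , \\
|\lambda_i-\lambda_j| &\gg \frac{1}{\gamma\sqrt{s}}
  \quad \text{(the peaks of $\lambda_i$ and $\lambda_j$ are resolved).}
\end{align}
```

In particular, the zero eigenvalue of $B$ corresponds to the shifted eigenvalue $\alpha$, so the weight of that peak is what carries the information about the kernel.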

VI. DISCUSSION AND CONCLUSION
In this work, I discussed a quantum algorithm for topological data analysis using the method of persistent homology. The algorithm was designed using qRAM as well as a continuous-variable substrate to take advantage of a continuous output from which Betti numbers can be calculated. I also examined the use of a continuous-variable exponential conditional swap operation, which is outlined in more detail in the Appendix. As in the discrete-variable case [1], although the matrix (30) is exponentially large ($O(2^n)$), the size of the required qRAM is small. This provides an advantage over other algorithms that require a large qRAM [18,41,42].
In general, discrete-variable quantum algorithms for topological data analysis have been explored before [1,4,5,20]. The work presented here provides new tools for continuous-variable systems as well as a direct circuit implementation of one of those tools.
In the algorithm presented here, I used a subset of phase estimation called principal component analysis in order to determine the eigenvalues of the exponential operator. This method is a natural fit for the continuous-variable framework discussed here, but there are other methods which have been examined. For example, there is a hybrid approach which uses a qumode as well as a mixed state of qubits [43]. There is also the well-understood purely-qubit phase estimation [44], but this approach would require many copies of the unitary (32), greatly increasing any resource costs as a result. Note that the algorithm used in this work is largely an adaptation of the exponentiation and phase estimation of ref. [2]. The discussion of resource costs in that work can therefore be translated to the present algorithm.
FIG. 1. The Betti numbers $\beta_{0,1,2}$ for four example shapes. They are the number of connected components, one-dimensional holes (also called tunnels or handles), and two-dimensional voids, respectively.

FIG. 2. (a) Given data represented by points. (b) For a given distance $\epsilon$, a circle is drawn around each point. (c) Between every two points with contacting circles a line is drawn. These connections are edges of $n$-dimensional shapes (simplices), and the space of simplices in (c) is called a simplicial complex. For two different values of $\epsilon$, as in (b) i, ii, and (c) i, ii, one can get more or fewer connections between the data points, resulting in different topologies. Therefore the Betti numbers depend on the initial choice of $\epsilon$. It is useful to vary $\epsilon$ to find interesting structures.

FIG. 4. The action of the boundary operator is shown on a $k = 2$ simplex. A visual representation of a simplex being broken down into its boundary is depicted above. Its boundary consists of simplices with $k - 1 = 1$. Below is the encoded representation of the boundary operator acting on the 2-simplex. In this encoding, a 1 represents a vertex in the corresponding position in the string of bits. The boundary sum is represented by a clockwise rotation around the original simplex, and the negative sign in the result alternates as in Eq. (5).

FIG. 5. Consider the $k = 2$ complex on the left, for a given value of $\epsilon$. In order to show that the striped area is a void, it must itself be boundary-less, and not a boundary for any part of the complex. Fulfillment of these two properties is equivalent to the combinatorial Laplacian (11) applied to the striped area returning zero. Therefore this area would be part of the kernel of the combinatorial Laplacian for $k = 2$, contributing to the $\beta_2$ Betti number.