Least upper bound of the exact formula for optimal quantization of some uniform Cantor distributions

The quantization scheme in probability theory deals with finding a best approximation of a given probability distribution by a probability distribution that is supported on finitely many points. Let $P$ be a Borel probability measure on $\mathbb R$ such that $P=\frac 12 P\circ S_1^{-1}+\frac 12 P\circ S_2^{-1},$ where $S_1$ and $S_2$ are two contractive similarity mappings given by $S_1(x)=rx$ and $S_2(x)=rx+1-r$ for $0<r<\frac 12$ and $x\in \mathbb R$. Then, $P$ is supported on the Cantor set generated by $S_1$ and $S_2$. The case $r=\frac 13$ was treated by Graf and Luschgy, who gave an exact formula for the unique optimal quantization of the Cantor distribution $P$ (Math. Nachr., 183 (1997), 113-133). In this paper, we compute the precise range of $r$-values to which the Graf-Luschgy formula extends.

$M(a|\alpha)$ is called the Voronoi region generated by $a \in \alpha$. On the other hand, the set $\{M(a|\alpha) : a \in \alpha\}$ is called the Voronoi diagram or Voronoi tessellation of $\mathbb R^d$ with respect to the set $\alpha$.
Definition 1.1. A set $\alpha \subset \mathbb R^d$ is called a centroidal Voronoi tessellation (CVT) with respect to a probability distribution $P$ on $\mathbb R^d$ if it satisfies the following two conditions: (i) $P(M(a|\alpha) \cap M(b|\alpha)) = 0$ for $a, b \in \alpha$ with $a \neq b$; (ii) $E(X : X \in M(a|\alpha)) = a$ for all $a \in \alpha$, where $X$ is a random variable with distribution $P$, and $E(X : X \in M(a|\alpha))$ represents the conditional expectation of the random variable $X$ given that $X$ takes values in $M(a|\alpha)$.
A Borel measurable partition $\{A_a : a \in \alpha\}$ is called a Voronoi partition of $\mathbb R^d$ with respect to the probability distribution $P$ if $P$-almost surely $A_a \subset M(a|\alpha)$ for all $a \in \alpha$. Let us now state the following proposition (see [4,6]).
Proposition 1.2. Let $\alpha$ be an optimal set of $n$-means with respect to a probability distribution $P$, and for $a \in \alpha$ let $M(a|\alpha)$ be the Voronoi region generated by $a$. Then, for every $a \in \alpha$, (i) $P(M(a|\alpha)) > 0$, (ii) $P(\partial M(a|\alpha)) = 0$, (iii) $a = E(X : X \in M(a|\alpha))$, and (iv) $P$-almost surely the set $\{M(a|\alpha) : a \in \alpha\}$ forms a Voronoi partition of $\mathbb R^d$.
If $\alpha$ is an optimal set of $n$-means and $a \in \alpha$, then by Proposition 1.2, $a$ is the centroid of the Voronoi region $M(a|\alpha)$ associated with the probability measure $P$. Thus, for a Borel probability measure $P$ on $\mathbb R^d$, an optimal set of $n$-means forms a CVT of $\mathbb R^d$; however, the converse is not true in general (see [3,2,12,13]).
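As a rough numerical illustration of Definition 1.1 in dimension one, the sketch below replaces the Cantor distribution by an equal-mass discrete approximation and measures how far each point of a candidate set lies from the centroid of its Voronoi cell. The discretization (atoms at the images of $\frac 12$ under all length-$k$ compositions of $S_1(x)=rx$ and $S_2(x)=rx+1-r$), the helper names `cylinder_atoms` and `centroid_gaps`, and the sample value $r = 0.4$ are assumptions made for illustration only; they are not taken from the paper.

```python
from itertools import product

def cylinder_atoms(r, k):
    """Points S_sigma(1/2) for all words sigma of length k over {1, 2}.

    Each atom is treated as carrying mass 2**-k; this discretization is an
    assumption of the sketch, used as a stand-in for the Cantor distribution P.
    """
    atoms = []
    for word in product((1, 2), repeat=k):
        x = 0.5
        # S_sigma = S_{sigma_1} o ... o S_{sigma_k}: apply the last letter first.
        for letter in reversed(word):
            x = r * x if letter == 1 else r * x + (1 - r)
        atoms.append(x)
    return atoms

def centroid_gaps(alpha, atoms):
    """Return |a - centroid of the Voronoi cell of a| for each a in alpha."""
    cells = {a: [] for a in alpha}
    for x in atoms:
        nearest = min(alpha, key=lambda a: abs(x - a))
        cells[nearest].append(x)
    gaps = []
    for a, cell in cells.items():
        # An empty cell has zero mass, so a cannot be the centroid of its region.
        gaps.append(abs(a - sum(cell) / len(cell)) if cell else float("inf"))
    return gaps

r = 0.4                       # sample ratio, chosen only for illustration
atoms = cylinder_atoms(r, 8)  # 2**8 equal-mass atoms
good = centroid_gaps([r / 2, 1 - r / 2], atoms)  # images of 1/2 under S_1, S_2
bad = centroid_gaps([0.1, 0.9], atoms)           # same cells, wrong centroids
```

In this approximation the gaps for the two-point set $\{\frac r2, 1-\frac r2\}$ vanish up to rounding, so condition (ii) holds, whereas the set $\{0.1, 0.9\}$ induces the same two cells but misses their centroids.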
In [7], Graf and Luschgy showed that $\beta_n(I)$ forms an optimal set of $n$-means for the probability distribution $P$ when $r = \frac 13$, and gave an exact formula for the $n$th quantization error. Notice that $\beta_n(I)$ forms a CVT of the Cantor set generated by the two mappings $S_1(x) = rx$ and $S_2(x) = rx + (1-r)$ for $0 < r \le \frac{5-\sqrt{17}}{2}$, i.e., if $0 < r \le 0.4384471872$ (up to ten significant digits). In [13], we have shown that if $0.4371985206 < r \le 0.4384471872$ and $n$ is not of the form $2^{\ell(n)}$ for any positive integer $\ell(n)$, then there exists a CVT for the Cantor set for which the distortion error is smaller than that of the CVT given by $\beta_n(I)$, which implies that $\beta_n(I)$ does not form an optimal set of $n$-means for all $0 < r \le \frac{5-\sqrt{17}}{2}$. It was still not known what the least upper bound of $r$ is for which $\beta_n(I)$ forms an optimal set of $n$-means for all $n \ge 2$. The following theorem, which is the main theorem of the paper, answers this question.
Theorem 1.5. Let $\beta_n(I)$ be the set defined by Definition 1.3. Let $r \in (0, \frac 12)$ be the unique real number such that
Then, $r_0 = r \approx 0.4350411707$ (up to ten significant digits) gives the least upper bound of $r$ for which the set $\beta_n(I)$ forms an optimal set of $n$-means for the uniform Cantor distribution $P$.
In the sequel, instead of writing $r \approx 0.4350411707$, we will write $r = 0.4350411707$. The arrangement of the paper is as follows: In Definition 2.7, we construct a set $\gamma_n(I)$, and in Proposition 2.8, we show that $\gamma_n(I)$ forms a CVT for the Cantor distribution $P$ if $0.3613249509 \le r \le 0.4376259168$ (written up to ten decimal places). In Theorem 3.1, we prove that the set $\beta_n(I)$ forms an optimal set of $n$-means for $r = 0.4350411707$. In Proposition 4.1, we show that $V(P, \beta_n(I)) = V(P, \gamma_n(I))$ if $r = 0.4350411707$, and $V(P, \beta_n(I)) > V(P, \gamma_n(I))$ if $0.4350411707 < r \le 0.4376259168$. Hence, if $0.4350411707 < r \le 0.4376259168 < \frac{5-\sqrt{17}}{2}$, then the set $\beta_n(I)$ forms a CVT but does not form an optimal set of $n$-means, which implies that the least upper bound of $r$ for which $\beta_n(I)$ forms an optimal set of $n$-means is given by $r_0 = r \approx 0.4350411707$ (up to ten significant digits); this is Theorem 1.5. Notice that the optimal sets of $n$-means and the $n$th quantization errors are not known for all Cantor distributions $P$ given by $P := \frac 12 P \circ S_1^{-1} + \frac 12 P \circ S_2^{-1}$, where $S_1(x) = rx$ and $S_2(x) = rx + 1 - r$ for $0 < r < \frac 12$. Thus, it is worthwhile to investigate the least upper bound of $r$ for which the exact formula of Graf-Luschgy for the optimal quantization works.
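The numerical thresholds quoted in this section can be compared directly. The following minimal sketch only restates the constants from the text and checks their ordering; the variable names are hypothetical. It also uses the elementary observation that $\frac{5-\sqrt{17}}{2}$ is the smaller root of $r^2 - 5r + 2 = 0$, which is not a claim about how the bound is derived in the paper.

```python
import math

# Numerical thresholds quoted in the text (ten significant digits).
gamma_cvt_low = 0.3613249509   # lower end of the CVT range for gamma_n(I)
r0 = 0.4350411707              # least upper bound stated in Theorem 1.5
r_ref13 = 0.4371985206         # lower end of the range treated in [13]
gamma_cvt_high = 0.4376259168  # upper end of the CVT range for gamma_n(I)

# Upper end of the CVT range for beta_n(I), in closed form.
beta_cvt_high = (5 - math.sqrt(17)) / 2

thresholds = [gamma_cvt_low, r0, r_ref13, gamma_cvt_high, beta_cvt_high]
is_increasing = all(a < b for a, b in zip(thresholds, thresholds[1:]))

# (5 - sqrt(17)) / 2 is the smaller root of r^2 - 5 r + 2 = 0.
residual = beta_cvt_high**2 - 5 * beta_cvt_high + 2
```

The ordering places $r_0$ strictly inside the range where both $\beta_n(I)$ and $\gamma_n(I)$ are CVTs, which is the setting in which their distortion errors are compared.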

2. Preliminaries. As defined in the previous section, let $S_1$ and $S_2$ be the two similarity mappings on $\mathbb R$ given by $S_1(x) = rx$ and $S_2(x) = rx + 1 - r$, where $0 < r < \frac 12$, and let $P = \frac 12 P \circ S_1^{-1} + \frac 12 P \circ S_2^{-1}$ be the probability distribution on $\mathbb R$ supported on the Cantor set generated by $S_1$ and $S_2$. Write $p_1 = p_2 = \frac 12$ and $s_1 = s_2 = r$. By $I^*$ we denote the set of all words over the alphabet $I := \{1, 2\}$, including the empty word $\emptyset$. For $\omega \in I^*$, by $s_\omega$ we represent the similarity ratio of the composition mapping $S_\omega$. Notice that the identity mapping has similarity ratio one. Thus, if $\omega := \omega_1 \omega_2 \cdots \omega_k$, then we have $s_\omega = r^k$. Let $X$ be a random variable with probability distribution $P$. By $E(X)$ and $V := V(X)$ we mean the expectation and the variance of the random variable $X$. For words $\beta, \gamma, \cdots, \delta$ in $\{1, 2\}^*$, by $a(\beta, \gamma, \cdots, \delta)$ we mean the conditional expectation of the random variable $X$ given
We now give the following lemma.
The last equation is true, since for any $c \in \alpha$,
Hence, by Definition 1.1, the lemma follows.
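The word notation of this section can be made concrete with a few lines of code. The sketch below evaluates $S_\omega(x)$ and the similarity ratio $s_\omega = r^k$ for a word $\omega$ written as a string. It assumes the usual convention $S_\omega = S_{\omega_1} \circ \cdots \circ S_{\omega_k}$, which agrees with the values $S_{11}(1) = 0.189261$ and $S_{12122}(0) = 0.312534$ quoted in Section 3 for $r = 0.4350411707$; the function names are hypothetical.

```python
def apply_word(word, x, r):
    """Evaluate S_word(x) for a word over {1, 2} given as a string, e.g. "12122"."""
    # With S_word = S_{w1} o ... o S_{wk}, the last letter acts on x first.
    for letter in reversed(word):
        if letter == "1":
            x = r * x
        elif letter == "2":
            x = r * x + (1 - r)
        else:
            raise ValueError(f"letters must be '1' or '2', got {letter!r}")
    return x

def similarity_ratio(word, r):
    """Similarity ratio s_word = r**k for a word of length k (empty word: 1)."""
    return r ** len(word)

r = 0.4350411707
word = "12122"
# Image of [0, 1] under S_word; its length equals the similarity ratio.
interval = (apply_word(word, 0.0, r), apply_word(word, 1.0, r))
length = interval[1] - interval[0]
```

The empty word gives the identity mapping, so `apply_word("", x, r)` returns `x` unchanged and `similarity_ratio("", r)` is one, as noted above.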
From Lemma 2.1 the following corollary follows.
Corollary 2.2. Let $i = 1, 2$, and let $\beta$ form a CVT for the image measure $P \circ S_i^{-1}$. Then, $S_i^{-1}(\beta)$ forms a CVT for the probability measure $P$.
The following two lemmas are well-known and easy to prove (see [7,13]).
Lemma 2.3. Let $f : \mathbb R \to \mathbb R^+$ be Borel measurable and $k \in \mathbb N$, and $P$ be the probability measure on $\mathbb R$ given by
Lemma 2.4. Let $X$ be a random variable with the probability distribution $P$. Then,
We now give the following corollary.
Corollary 2.5. Let $\sigma \in \{1, 2\}^k$ for $k \ge 1$, and $x_0 \in \mathbb R$. Then,
Note 2.6. Corollary 2.5 is useful to obtain the distortion error. By Lemma 2.4, it follows that the optimal set of one-mean consists of the expected value, and the corresponding quantization error is the variance $V$ of the random variable $X$.
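For orientation, the mean and the variance referred to in Note 2.6 can be obtained directly from the self-similarity relation $P = \frac 12 P \circ S_1^{-1} + \frac 12 P \circ S_2^{-1}$, which gives $\int f\,dP = \frac 12 \int f \circ S_1\,dP + \frac 12 \int f \circ S_2\,dP$ for integrable $f$. The following derivation is a sketch based on that relation alone and is not quoted from the paper; the cross terms drop out because $E(X - \frac 12) = 0$.

```latex
\begin{align*}
E(X) &= \tfrac12 E(S_1(X)) + \tfrac12 E(S_2(X)) = r\,E(X) + \tfrac12 (1-r)
  \quad\Longrightarrow\quad E(X) = \tfrac12, \\
V &= \tfrac12 \int \bigl(r(x-\tfrac12) - \tfrac12(1-r)\bigr)^2 \, dP
   + \tfrac12 \int \bigl(r(x-\tfrac12) + \tfrac12(1-r)\bigr)^2 \, dP
   = r^2 V + \tfrac14 (1-r)^2, \\
V &= \frac{(1-r)^2}{4(1-r^2)} = \frac{1-r}{4(1+r)}.
\end{align*}
```

For $r = \frac 13$ this gives $V = \frac 18$.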
Since $S_1$ and $S_2$ are similarity mappings, it is easy to see that $E(S_j(X)) = S_j(E(X))$ for $j = 1, 2$, and so by induction,
Definition 2.7. For $n \in \mathbb N$ with $n \ge 2$ let $\ell(n)$ be the unique natural number with $2^{\ell(n)} \le n < 2^{\ell(n)+1}$. Write
For $n \ge 4$, define $\gamma_n := \gamma_n(I)$ as follows:
Proposition 2.8. Let $\gamma_n := \gamma_n(I)$ be the set defined by Definition 2.7. Then, $\gamma_n(I)$ forms a CVT for the Cantor distribution $P$ if $0.3613249509 \le r \le 0.4376259168$ (written up to ten decimal places).
Proof. For $n = 2^{\ell(n)}$, we have
The inequalities in (10) are true if $0 < r < 0.4850084548$, and the inequalities in (11) are true if $0 < r < 0.4847126592$. Combining these with (1), we see that $\gamma_n(I)$ forms a CVT for the Cantor distribution $P$ if $0.3613249509 \le r \le 0.4376259168$ (written up to ten decimal places). Thus, the proof of the proposition is complete.
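The index $\ell(n)$ of Definition 2.7, the unique natural number with $2^{\ell(n)} \le n < 2^{\ell(n)+1}$, is the position of the leading binary digit of $n$. A minimal sketch of how it might be computed follows; the function names `ell` and `is_power_of_two` are hypothetical.

```python
def ell(n: int) -> int:
    """Return the unique integer l with 2**l <= n < 2**(l + 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # bit_length() is the number of binary digits of n, i.e. l + 1.
    return n.bit_length() - 1

def is_power_of_two(n: int) -> bool:
    """True exactly when n is of the form 2**ell(n)."""
    return n == 2 ** ell(n)
```

The second helper separates the case $n = 2^{\ell(n)}$ from the case in which $n$ is not of that form, a distinction that the results quoted from [13] rely on.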
3. Optimal sets of $n$-means for $r = 0.4350411707$ and $n \ge 2$. Recall that $\beta_n(I)$ forms a CVT if $r = 0.4350411707$. In this section, we state and prove the following theorem.
Theorem 3.1. Let $n \ge 2$, and let $\beta_n(I)$ be the set given by Definition 1.3. Then, $\beta_n(I)$ forms an optimal set of $n$-means for $r = 0.4350411707$.
To prove the theorem, we need some basic lemmas and propositions. The following two lemmas are true; because their proofs are technical, they are not shown in the paper.
Since $V_n$ is the $n$th quantization error for $n \ge 4$, we have $V_n \le V_4 \le 0.00352544$. Let $\alpha_n$ be an optimal set of $n$-means. Write $\alpha_n := \{a_1, a_2, \cdots, a_n\}$, where $0 < a_1 < a_2 < \cdots < a_n < 1$. If $a_1 \ge r$, using Corollary 2.5, we have
which is a contradiction. Thus, we can assume that $a_1 < r$. Similarly, we can show that $a_n > 1 - r$. Thus, we see that if $\alpha_n$ is an optimal set of $n$-means with $n \ge 2$, then $\alpha_n \cap [0, r) \neq \emptyset$ and $\alpha_n \cap (1-r, 1] \neq \emptyset$. Thus, the lemma follows.
The following lemma is a modified version of Lemma 4.5 in [7], and the proof follows similarly.
Lemma 3.5. Let $n \ge 2$, and let $\alpha_n$ be an optimal set of $n$-means such that $\alpha_n \cap J_1 \neq \emptyset$, $\alpha_n \cap J_2 \neq \emptyset$, and $\alpha_n \cap (r, 1-r) = \emptyset$. Further assume that the Voronoi region of any point in $\alpha_n \cap J_1$ does not contain any point from $J_2$, and the Voronoi region of any point in $\alpha_n \cap J_2$ does not contain any point from $J_1$. Set $\alpha_1 := \alpha_n \cap J_1$ and $\alpha_2 := \alpha_n \cap J_2$, and $j := \text{card}(\alpha_1)$. Then, $S_1^{-1}(\alpha_1)$ is an optimal set of $j$-means and $S_2^{-1}(\alpha_2)$ is an optimal set of $(n-j)$-means. Moreover,
Remark 3.6. Lemma 4.5 in [7] does not work for all $0 < r < \frac 12$. For this reason, we have added an extra condition to Lemma 4.5 in [7] so that it works for all $0 < r < \frac 12$.
Lemma 3.7. Let $\alpha_4$ be an optimal set of four-means. Then,
Since $V_4$ is the quantization error for four-means, we have $V_4 \le 0.00352544$. Let $\alpha_4 := \{a_1, a_2, a_3, a_4\}$, where $0 < a_1 < a_2 < a_3 < a_4 < 1$, be an optimal set of four-means. If $a_1 > 0.20 > 0.189261 = S_{11}(1)$, using Corollary 2.5, we have
which is a contradiction. So, we can assume that $a_1 \le 0.20$. Similarly, $a_4 \ge 0.80$. We now show that $\alpha_4$ does not contain any point from $(r, 1-r)$. Suppose that $\alpha_4$ contains a point from $(r, 1-r)$. Then, due to Lemma 3.4, without any loss of generality, we can assume that $a_2 \in (r, 1-r)$ and $1 - r \le a_3 < a_4$. Two cases can arise:
Case 2. $a_{j+1} \in (r, \frac 12]$. Since this case is the reflection of Case 1 with respect to the point $\frac 12$, a contradiction arises. Hence, $\alpha_n$ does not contain any point from the open interval $(r, 1-r)$. Thus, the proof of the proposition is complete.
Proposition 3.9. Let $\alpha_n$ be an optimal set of $n$-means with $n \ge 2$. Then, the Voronoi region of any point in $\alpha_n \cap J_1$ does not contain any point from $J_2$, and the Voronoi region of any point in $\alpha_n \cap J_2$ does not contain any point from $J_1$.
Since $V_n$ is the $n$th quantization error for $n \ge 8$, we have $V_n \le V_8 \le 0.000667229$. Let $\alpha_n := \{a_1, a_2, \cdots, a_n\}$ be an optimal set of $n$-means for $n \ge 8$ with $0 \le a_1 < a_2 < \cdots < a_n \le 1$, and let $j$ be the greatest positive integer such that $a_j \in J_1$. Then, by Proposition 3.8, we have $a_j < r$ and $1 - r < a_{j+1}$. Suppose that the Voronoi region of $a_{j+1}$ contains points from $J_1$. Then, $\frac 12 (a_j + a_{j+1}) < r$, yielding $a_j < 2r - a_{j+1} \le 2r - (1-r) = 3r - 1 = 0.305124 < 0.312534 = S_{12122}(0)$. Hence, by Corollary 2.5,
which is a contradiction. Thus, we can assume that the Voronoi region of $a_{j+1}$ does not contain any point from $J_1$. Similarly, we can show that the Voronoi region of $a_j$ does not contain any point from $J_2$. Hence, the proposition is true for all $n \ge 8$. Thus, we complete the proof of the proposition.
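The numerical comparison $3r - 1 = 0.305124 < 0.312534 = S_{12122}(0)$ used in this proof can be reproduced in a few lines. The sketch assumes the convention $S_\omega = S_{\omega_1} \circ \cdots \circ S_{\omega_k}$; expanding the composition under that convention gives $S_{12122}(0) = (1-r)(r + r^3 + r^4)$, a closed form worked out here for illustration and not quoted from the paper. The variable names are hypothetical.

```python
r = 0.4350411707

# Upper bound for a_j obtained from (a_j + a_{j+1}) / 2 < r and a_{j+1} > 1 - r.
bound_on_a_j = 2 * r - (1 - r)  # equals 3 r - 1

# S_12122(0) under the assumed convention: each letter 2 in position i
# contributes r**(i - 1) * (1 - r); here the letter 2 sits in positions 2, 4, 5.
s_12122_at_0 = (1 - r) * (r + r**3 + r**4)

# Positive margin means a_j would lie strictly to the left of S_12122(0).
margin = s_12122_at_0 - bound_on_a_j
```

The positive margin is what places $a_j$ strictly to the left of $S_{12122}(0)$ before Corollary 2.5 is applied.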