Equality of Kolmogorov-Sinai and permutation entropy for one-dimensional maps consisting of countably many monotone parts

In this paper, we show that, under some technical assumptions, the Kolmogorov-Sinai entropy and the permutation entropy are equal for one-dimensional maps if there exists a countable partition of the domain of definition into intervals such that the considered map is monotone on each of those intervals. This is a generalization of a result by Bandt, Pompe and G. Keller, who showed that the above holds true under the additional assumptions that the number of intervals on which the map is monotone is finite and that the map is continuous on each of those intervals.


Introduction
Determining the Kolmogorov-Sinai entropy (K-S entropy) of a dynamical system is a key part of the analysis of a system's complexity. This entropy is a measure of the amount of information that is gained on average by observing the system's dynamics. While the K-S entropy has a precise mathematical definition and interesting properties, its computation can be difficult. Therefore, in 2002, Bandt and Pompe introduced the so-called permutation entropy as an alternative measure of the complexity of a one-dimensional dynamical system, which is easier to evaluate numerically than the K-S entropy [3]. The permutation entropy is a measure of the information contained in the ordinal structure of a dynamical system. One can extend the definition of the permutation entropy to multi-dimensional systems by introducing a number of real-valued random variables as observables, each of which projects the multi-dimensional dynamics into the real numbers, where the ordinal structure is then considered. The relatively easy calculation of the permutation entropy led to many practical applications and gave rise to a number of theoretical questions as well (e.g. [8], [10], [12]). A natural first question is whether the Kolmogorov-Sinai entropy and the permutation entropy are equal. Bandt, Pompe and G. Keller showed that those entropies are equal for one-dimensional interval maps if there exists a finite partition of the domain of definition into intervals such that the considered map is monotone and continuous on each of those intervals [2]. It was also shown that the permutation entropy is an upper bound for the K-S entropy for all one-dimensional systems (see [2] or [1]). This inequality can be generalized to multi-dimensional maps under sufficiently general conditions [7]. It is still an open question if and how the condition of piecewise monotony for the equality of the entropies can be generalized to a larger class of one-dimensional maps.
In this paper, we are able to show that the equality of K-S and permutation entropy still holds true if we omit the condition of continuity and if there exists a countable partition of the domain of definition into intervals such that the one-dimensional map is monotone on each of those intervals. Unlike in the paper of Bandt, Pompe and G. Keller [2], we do not require that this partition into intervals is finite.
The value of the Kolmogorov-Sinai entropy depends on the position of the elements of orbits $(\omega, T(\omega), T^2(\omega), \ldots)$ with respect to a finite or countable partition. The iterates $T^t(\omega)$ can be recursively defined as $T^t(\omega) = T(T^{t-1}(\omega))$ for $t \in \mathbb{N}$ and $\omega \in \Omega$ with $T^0(\omega) = \omega$. For the definition of the permutation entropy, we investigate for which $s, t \in \mathbb{N}_0$ the inequality $T^s(\omega) \leq T^t(\omega)$ holds true. To simplify our argumentation, we want to exclude the possibility that $T^s(\omega)$ is equal to $T^t(\omega)$ for $s, t \in \mathbb{N}$ with $s \neq t$, so that the inequalities $T^s(\omega) \leq T^t(\omega)$ and $T^t(\omega) \leq T^s(\omega)$ are mutually exclusive. To achieve this, we require that $T$ is aperiodic with regard to $\mu$, which means that $\mu\left(\bigcup_{n=1}^{\infty} \{\omega \in \Omega \mid T^n(\omega) = \omega\}\right) = 0$ holds true. For aperiodic maps, the probability of two different iterates of a single point being equal is zero. Being aperiodic is not a significant restriction though, as noted in Section 3.
When determining the complexity of a dynamical system, we consider the probabilities of $\omega, T(\omega), T^2(\omega), \ldots, T^{n-1}(\omega)$ lying within specific sequences of sets. This leads to the definition of the Kolmogorov-Sinai entropy:
Definition 1 (Kolmogorov-Sinai entropy). Let $(\Omega, \mathcal{A}, \mu, T)$ be a measure-preserving dynamical system and $\mathcal{P} = \{P_i\}_{i \in I}$ a partition of $\Omega$ with some finite or countable index set $I$. Define for $n \in \mathbb{N}$ and a multi index $i = (i_0, i_1, \ldots, i_{n-1}) \in I^n$ the set
$$P(i) := \bigcap_{t=0}^{n-1} T^{-t}(P_{i_t})$$
and the partition $\mathcal{P}^{(n)} := \{P(i) \mid i \in I^n\}$. The Kolmogorov-Sinai entropy (or entropy rate) of $T$ with regard to the partition $\mathcal{P}$ is defined as
$$h(T, \mathcal{P}) := \lim_{n \to \infty} \frac{1}{n} H(\mathcal{P}^{(n)}), \qquad (1)$$
where $H(\mathcal{P}^{(n)}) = -\sum_{i \in I^n} \mu(P(i)) \log(\mu(P(i)))$ is the Shannon entropy of the partition $\mathcal{P}^{(n)}$. By
$$h(T) := \sup_{\mathcal{P}} h(T, \mathcal{P})$$
the Kolmogorov-Sinai entropy of $T$ is defined, where the supremum is taken over all finite or over all countable partitions with finite entropy.
Remark 1. Originally, the Kolmogorov-Sinai entropy was defined as the supremum of the entropy rates over finite partitions, disregarding countable partitions. However, according to Abramov's Theorem, the supremum of the entropy rates over all countable partitions with finite entropy is not larger than the supremum of the entropy rates over all finite partitions [6].
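The entropy rate in Definition 1 can be made concrete in a case where the measures $\mu(P(i))$ are known in closed form. The following minimal sketch assumes, purely for illustration, that $T$ is a Bernoulli shift, so that the measure of a cylinder set is the product of the measures of its cells; the cell measures and the function names are hypothetical and not taken from the paper.

```python
import itertools
import math


def shannon_entropy(probabilities):
    """H = -sum p log p over all cells with positive measure (natural log)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def cylinder_measures(cell_measures, n):
    """Measures mu(P(i)) for all multi indices i in I^n.

    Assumption (not from the text): T is a Bernoulli shift, so the measure
    of a cylinder set is the product of the measures of its cells.
    """
    return [math.prod(word) for word in itertools.product(cell_measures, repeat=n)]


cell_measures = [0.5, 0.3, 0.2]  # hypothetical partition P with three cells
for n in range(1, 6):
    rate = shannon_entropy(cylinder_measures(cell_measures, n)) / n
    print(n, rate)  # (1/n) H(P^(n)) for increasing n
```

Under this assumption $\frac{1}{n} H(\mathcal{P}^{(n)})$ does not depend on $n$, so the limit in the definition is reached immediately; for general maps the quotient only converges as $n \to \infty$.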

Remarks
1. Like Amigó, Kennel and Kocarev [1], we use the limit inferior for the definition of the permutation entropy in (3) instead of the limit used by Bandt, Pompe and G. Keller [2]. This is because, unlike in (1), one does not know whether $\frac{1}{n} H(\mathcal{P}_n^*)$ converges for $n \to \infty$. Alternatively, one could use the limit superior in (3) like, for example, A. M. Unakafov and V. A. Unakafova in [7]. If the limit inferior is replaced with the limit superior in the argumentation of this paper, every statement remains valid, so we can conclude that the limit in (3) does exist for the class of maps $T$ considered here.
2. Technically speaking, the collection $\mathcal{P}_n^*$ of sets $P_\pi$, $\pi \in \Pi_n$, is not actually a partition. A point $\omega \in \Omega$ with $T^i(\omega) = T^j(\omega)$ for some $i, j \in \mathbb{N}_0$ with $0 \leq i < j \leq n - 1$ belongs to at least two sets $P_\pi \in \mathcal{P}_n^*$. However, such points belong to the set of (pre-)periodic points, which has measure zero for aperiodic maps $T$. So the sets of points $P_\pi$ with ordinal patterns $\pi \in \Pi_n$ are only disjoint $\mu$-almost surely. This is not a problem because the value of the entropy is not affected by sets of measure zero.
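The quantity $\frac{1}{n} H(\mathcal{P}_n^*)$ discussed in these remarks can be estimated from a single orbit by counting how often each ordinal pattern occurs. The sketch below is only a finite-sample illustration: the logistic map, the starting point and the orbit length are assumptions made here, and replacing $\mu(P_\pi)$ by relative frequencies along one orbit is itself an assumption of ergodic type.

```python
import math
from collections import Counter


def ordinal_pattern(window):
    """Time indices sorted by value; ties are broken by the smaller index."""
    return tuple(sorted(range(len(window)), key=lambda t: (window[t], t)))


def empirical_pattern_entropy_rate(orbit, n):
    """(1/n) * Shannon entropy of the observed ordinal patterns of length n."""
    counts = Counter(ordinal_pattern(orbit[k:k + n]) for k in range(len(orbit) - n + 1))
    total = sum(counts.values())
    entropy = -sum(c / total * math.log(c / total) for c in counts.values())
    return entropy / n


def logistic_orbit(x0, length):
    """Orbit of the example map x -> 4x(1-x) on [0, 1] (an assumption, not from the paper)."""
    orbit, x = [], x0
    for _ in range(length):
        orbit.append(x)
        x = 4.0 * x * (1.0 - x)
    return orbit


orbit = logistic_orbit(0.123456789, 20000)
rates = {n: empirical_pattern_entropy_rate(orbit, n) for n in range(2, 7)}
for n, rate in rates.items():
    print(n, rate)
```

Each estimate is bounded by $\frac{1}{n} \log(n!)$, since there are at most $n!$ ordinal patterns of length $n$; whether the values settle as $n$ grows is exactly the convergence question raised in the first remark.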

The Main result
Since taking the supremum over all finite partitions is necessary to calculate the Kolmogorov-Sinai entropy, determining it can be difficult. There are theoretical results that ensure, under some conditions, the existence of a partition such that the entropy rate with regard to this partition yields the K-S entropy. However, in practice, one does not know what such partitions look like. Additionally, such a partition depends on the dynamics $T$, whose precise description might be unknown as well in practical applications. The permutation entropy has the advantage that it can be calculated without having to find such partitions. The ordinal patterns necessary for the calculation of the permutation entropy automatically partition the space $\Omega$ in a way that can capture the information of a system, independently of the considered map $T$. We would like to know whether the complexity of the ordinal structure is equal to the complexity of partitions generated by iteration of $T$. That is, does $h_{PE}(T) = h(T)$ hold true? It is possible to show that
$$h(T) \leq h_{PE}(T) \qquad (4)$$
is fulfilled for a measure-preserving dynamical system $(\Omega, \mathcal{B}(\Omega), \mu, T)$ with $\Omega \subseteq \mathbb{R}$ (see [2] or [1]). For maps $T$ that are piecewise monotone and continuous on each monotony interval, Bandt, Pompe and G. Keller [2] showed (Theorem 1) that the equality
$$h_{PE}(T) = h(T) \qquad (5)$$
holds true.
Being piecewise monotone is defined in the following way: Definition 3 (Piecewise monotony). Let T : Ω → Ω be a map for Ω ⊆ R.
T is called monotone on a set M ⊆ Ω if We call M partition into monotony intervals of T .
Given a compact metric space Ω ⊆ R, the fact that T : Ω → Ω is (countable) piecewise monotone automatically implies that Ω can be represented as a finite (or countable) union of intervals. However, Ω itself can be a more general set than a single interval.
It was not known whether (5) is true for a more general case than piecewise monotony.
The map $T: [0, 1] \to [0, 1]$ with $T(\omega) = \frac{1}{\omega} \bmod 1$ for $\omega > 0$ and $T(0) = 0$ is called Gauss function (see Figure 1). This map is measure-preserving with regard to the measure $\mu$, which is defined by $\mu(A) = \frac{1}{\log 2} \int_A \frac{1}{1 + x} \, dx$. The collection $\mathcal{M} = \{\{0\}\} \cup \{(\frac{1}{n+1}, \frac{1}{n}] \mid n \in \mathbb{N}\}$ is a partition into monotony intervals of $T$. The map $T$ is countable piecewise monotone but not piecewise monotone. Therefore, we cannot use Theorem 1 to decide whether $h_{PE}(T)$ and $h(T)$ are equal. However, we can use our new theorem below to show the equality as explained in Section 3.
We will show here that (5) is true for countable piecewise monotone maps $T$ as well. Our main result can be formulated as follows:
Theorem 2 (Main result). Let $(\Omega, \mathcal{B}(\Omega), \mu, T)$ be a measure-preserving dynamical system with $\Omega \subseteq \mathbb{R}$ being a compact metric space. Let $T$ be aperiodic and countable piecewise monotone and $\mathcal{M}$ a countable partition into monotony intervals of $T$. If $H(\mathcal{M}) < \infty$ holds true, then
$$h_{PE}(T) = h(T).$$
As already mentioned, the above theorem is a generalization of the result of Bandt, Pompe and G. Keller (Theorem 1). Note that in the simpler case of piecewise monotony the restriction $H(\mathcal{M}) < \infty$ is not necessary because $H(\mathcal{M})$ is always finite for a finite partition $\mathcal{M}$ into monotony intervals. To prove our results, we begin with Lemma 1 by reducing our problem to a combinatorial one. This is analogous to the approach used in [2]. While Bandt et al. followed this by an examination of periodic points, utilizing the piecewise monotony and continuity, we use the piecewise monotony more directly and then apply measure theoretic arguments. Bandt et al. showed the equality of the topological entropy and a topological variant of the permutation entropy as well [2], which is generally not possible for maps with infinitely many monotony intervals, as Misiurewicz had shown [9].

Proofs
Rough outline of the proof
Given a finite or countable partition $\mathcal{P}$, we show in subsection 2.1 that the entropy difference $h_{PE}(T) - h(T, \mathcal{P})$ is bounded from above by a term depending on the number of intersections between sets of points with an ordinal pattern and the sets of the partition $\mathcal{P}^{(n)}$. This number of intersections depends on the chosen partition $\mathcal{P}$. Using the monotony of the map $T$, we choose the partition $\mathcal{P}$ as the partition into monotony intervals $\mathcal{M}$ in the next subsection. We show that for this choice of $\mathcal{P}$ the upper bound established in the previous subsection is finite. In particular, we show that the upper bound for the entropy difference depends on how frequently iterates of the map $T$ lie within the same subset of the partition $\mathcal{P}$. This fact allows us to create an arbitrarily small upper bound for the entropy difference by constructing partitions $\mathcal{P}$ into monotony intervals in such a way that iterates of $T$ cannot stay within the same subset of $\mathcal{P}$ too frequently. This is done in the more technical subsection 2.3 by using Rokhlin towers.

Upper bound for entropy difference
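We look at the difference between $h_{PE}(T)$ and $h(T)$ by considering the refinement $\mathcal{P}_n^* \vee \mathcal{P}^{(n)}$ of an ordinal partition $\mathcal{P}_n^*$ and the partition $\mathcal{P}^{(n)}$ used in the definition of the Kolmogorov-Sinai entropy. The refinement of two partitions $\mathcal{P} = \{P_i\}_{i \in I}$ and $\mathcal{Q} = \{Q_j\}_{j \in J}$ of $\Omega$ is defined by
$$\mathcal{P} \vee \mathcal{Q} := \{P_i \cap Q_j \mid i \in I, j \in J\}.$$
One can easily verify that $H(\mathcal{P}_n^*) \leq H(\mathcal{P}^{(n)} \vee \mathcal{P}_n^*)$ is true. In the following lemma, we show that $H(\mathcal{P}^{(n)} \vee \mathcal{P}_n^*)$ is bounded from above by $H(\mathcal{P}^{(n)})$ plus some term depending on $n$ and the given partition $\mathcal{P}$. To find this term, we consider sets
$$S_n^{\mathcal{P}}(i) := \{\pi \in \Pi_n \mid P_\pi \cap P(i) \neq \emptyset\} \qquad (6)$$
for $i \in I^n$, which contain all permutations whose ordinal patterns are intersecting the set $P(i)$. Roughly speaking, if the size of $S_n^{\mathcal{P}}(i)$ does not grow too fast on average for increasing $n$, then the partition $\mathcal{P}_n^*$ does not add a lot of new information to $\mathcal{P}^{(n)}$, so $H(\mathcal{P}^{(n)} \vee \mathcal{P}_n^*)$ and $H(\mathcal{P}^{(n)})$ will be similar in size. We give an upper bound on the difference between $h_{PE}(T)$ and $h(T)$ based on $\#S_n^{\mathcal{P}}(i)$, where $\#A$ denotes the number of elements in a set $A$:
Lemma 1. Let $(\Omega, \mathcal{B}(\Omega), \mu, T)$ be a measure-preserving dynamical system with $\Omega \subseteq \mathbb{R}$ and $\mathcal{P} = \{P_i\}_{i \in I}$ a finite or countable partition of $\Omega$ with $H(\mathcal{P}) < \infty$. Then
$$h_{PE}(T) \leq h(T, \mathcal{P}) + \liminf_{n \to \infty} \frac{1}{n} \sum_{i \in I^n} \mu(P(i)) \log(\#S_n^{\mathcal{P}}(i))$$
holds true with $S_n^{\mathcal{P}}(i)$ as defined in (6).
Proof. Let $\mathcal{P}_n^*$ be the partition into ordinal patterns of length $n$. Then
$$H(\mathcal{P}_n^*) \leq H(\mathcal{P}^{(n)} \vee \mathcal{P}_n^*) = -\sum_{i \in I^n} \sum_{\pi \in S_n^{\mathcal{P}}(i)} \mu(P(i) \cap P_\pi) \log(\mu(P(i) \cap P_\pi))$$
$$\leq -\sum_{i \in I^n} \mu(P(i)) \log(\mu(P(i))) + \sum_{i \in I^n} \mu(P(i)) \log(\#S_n^{\mathcal{P}}(i)) = H(\mathcal{P}^{(n)}) + \sum_{i \in I^n} \mu(P(i)) \log(\#S_n^{\mathcal{P}}(i)).$$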
Dividing both sides by n and taking the limit inferior n → ∞ finishes the proof.
The above lemma is useful because it allows us to work with the number of elements of S P n (i) and, therefore, to use combinatorial arguments. This is done in the following subsection.
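The inequality behind Lemma 1 can be checked numerically on a small example. The sketch below uses a hypothetical table of joint measures $\mu(P_i \cap Q_j)$ for two finite partitions, with the number of cells $Q_j$ that meet $P_i$ in positive measure playing the role of $\#S_n^{\mathcal{P}}(i)$; the numbers are invented for illustration only.

```python
import math


def entropy(probabilities):
    """Shannon entropy with natural logarithm; cells of measure zero are skipped."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


# Hypothetical joint measures: rows are cells P_i, columns are cells Q_j.
joint = [
    [0.20, 0.10, 0.00, 0.00],
    [0.00, 0.05, 0.25, 0.00],
    [0.00, 0.00, 0.10, 0.30],
]

h_p = entropy(sum(row) for row in joint)             # H(P)
h_q = entropy(sum(col) for col in zip(*joint))       # H(Q)
h_join = entropy(p for row in joint for p in row)    # H(P v Q)

# Number of cells Q_j meeting P_i in positive measure (analogue of #S(i)).
intersections = [sum(1 for p in row if p > 0) for row in joint]
bound = h_p + sum(sum(row) * math.log(k) for row, k in zip(joint, intersections))

print(h_q, h_join, bound)
```

In this toy table every cell $P_i$ meets exactly two cells $Q_j$, so the correction term equals $\log 2$, and the three printed values appear in increasing order.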

Using monotony
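Given a countable partition $\mathcal{P} = \{P_i\}_{i \in I}$ with finite entropy, the term
$$\liminf_{n \to \infty} \frac{1}{n} \sum_{i \in I^n} \mu(P(i)) \log(\#S_n^{\mathcal{P}}(i))$$
bounds the difference between $h_{PE}(T)$ and $h(T, \mathcal{P})$. Our goal is now to find a sequence of countable partitions $(\mathcal{P}_d)_{d \in \mathbb{N}}$ for which this term becomes arbitrarily small. To achieve this, we construct the partitions $\mathcal{P}_d$ as special refinements of a given partition $\mathcal{M}$ into monotony intervals. This allows us to give an upper bound on the size of $\#S_n^{\mathcal{P}_d}(i)$.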
In (2), an ordinal pattern of length $n$ was encoded by a permutation $\pi = (\pi_0, \pi_1, \ldots, \pi_{n-1}) \in \Pi_n$, where $T^{\pi_i}(\omega)$ is the $i$-th smallest element in the sequence $\omega, T(\omega), \ldots, T^{n-1}(\omega)$. To prove the above lemma, it helps to consider a different way of encoding ordinal patterns: If we know for all $s, t \in \{0, 1, \ldots, n - 1\}$ whether $T^s(\omega) \leq T^t(\omega)$ is true, we can determine to which ordinal pattern $P_\pi$ the point $\omega$ belongs. Therefore, we can encode an ordinal pattern by all pairwise comparisons of elements of the orbit of length $n$.
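The equivalence of the two encodings can be sketched in a few lines. The code below assumes an orbit segment without ties (which holds almost surely for aperiodic maps); the sample values and function names are hypothetical.

```python
from itertools import combinations


def pattern_from_orbit(values):
    """Permutation pi with values[pi[0]] <= values[pi[1]] <= ... (ties by index)."""
    return tuple(sorted(range(len(values)), key=lambda t: (values[t], t)))


def comparisons_from_orbit(values):
    """f(s, t) = 1 if values[s] <= values[t], else 0, for all pairs s < t."""
    return {(s, t): int(values[s] <= values[t])
            for s, t in combinations(range(len(values)), 2)}


def pattern_from_comparisons(f, n):
    """Recover the permutation from the pairwise comparisons alone."""
    def rank(t):
        # Earlier indices s < t whose value is not larger than the value at t.
        below = sum(1 for s in range(t) if f[(s, t)] == 1)
        # Later indices u > t whose value is strictly smaller than the value at t.
        below += sum(1 for u in range(t + 1, n) if f[(t, u)] == 0)
        return below
    return tuple(sorted(range(n), key=rank))


orbit = [0.42, 0.97, 0.11, 0.35, 0.78]  # hypothetical orbit segment without ties
f = comparisons_from_orbit(orbit)
print(pattern_from_orbit(orbit), pattern_from_comparisons(f, len(orbit)))
```

Both calls return the same permutation, which reflects that the $\binom{n}{2}$ comparison values carry the same information as the ordinal pattern.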
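where $R_{s,t}^c$ denotes the complement $\Omega \setminus R_{s,t}$ of $R_{s,t}$. Every permutation $\pi \in S_n^{\mathcal{P}}(i)$ is uniquely determined by a function $f \in F_n(i)$ and vice versa. This implies
$$\#S_n^{\mathcal{P}}(i) = \#F_n(i).$$
It is easy to see that $E_n = \bigcup_{d=1}^{n-1} E_n^d$ is true. Therefore, every function $f \in F_n$ is generated by a combination of functions $f^d \in F_n^d$ for $d \in \{1, \ldots, n - 1\}$ (but not every combination of functions $f^d \in F_n^d$ necessarily generates a function $f \in F_n$). This implies
$$\#F_n(i) \leq \prod_{d=1}^{n-1} \#F_n^d(i). \qquad (7)$$
Now fix $d \in \{1, \ldots, n - 1\}$. The problem of figuring out whether $f(s, t)$ can be 0 or 1 for $(s, t) \in E_n^d$ can be seen as the problem to deduce whether $(T^s(\omega), T^t(\omega))$ lies within $R := \{(\omega_1, \omega_2) \in \Omega^2 \mid \omega_1 \leq \omega_2\}$ or $R^c$ from the rectangle $M_{i_s} \times M_{i_t}$ in which $(T^s(\omega), T^t(\omega))$ is lying. The set $R$ corresponds to the striped triangle in Figure 2.
If $i_s \neq i_t$ holds true for $(s, t) \in E_n^d$, the points $T^s(\omega)$ and $T^t(\omega)$ lie in different intervals $M_{i_s}$ and $M_{i_t}$ for all $\omega \in M(i)$. In Figure 2, this corresponds to the fact that $M_{i_s} \times M_{i_t}$ either completely lies in the triangle $R$ or in the triangle $R^c$. Therefore,
$$f(s, t) = \begin{cases} 1 & \text{if } \omega_1 < \omega_2 \text{ for all } (\omega_1, \omega_2) \in M_{i_s} \times M_{i_t}, \\ 0 & \text{if } \omega_2 < \omega_1 \text{ for all } (\omega_1, \omega_2) \in M_{i_s} \times M_{i_t}. \end{cases} \qquad (8)$$
If $i_s = i_t$ holds true for $(s, t) \in E_n^d$, the points $T^s(\omega)$ and $T^t(\omega)$ lie within the same interval $M_{i_s} = M_{i_t}$ for all $\omega \in M(i)$. In Figure 2, this corresponds to the fact that $M_{i_s} \times M_{i_t}$ is a square intersecting the diagonal of $\Omega^2$. So we cannot establish a straightforward equation like (8) that determines whether $f(s, t)$ is equal to 0 or 1. Since $T$ acts monotonically on the interval $M_{i_s}$, applying the map $T$ to $T^s(\omega)$ and $T^t(\omega)$ preserves or reverses the order relation of $T^s(\omega)$ and $T^t(\omega)$, depending on whether $T$ is increasing or decreasing on $M_{i_s}$. Therefore,
$$f(s, t) = \begin{cases} f(s + 1, t + 1) & \text{if } T \text{ is increasing on } M_{i_s}, \\ 1 - f(s + 1, t + 1) & \text{if } T \text{ is decreasing on } M_{i_s}. \end{cases}$$
In terms of Figure 2, this means that in each square $M_{i_s} \times M_{i_s}$ the side of the diagonal on which $(T^s(\omega), T^t(\omega))$ lies is determined by the side on which $(T^{s+1}(\omega), T^{t+1}(\omega))$ lies. So every value $f(s, t)$ can be uniquely determined by the subsequent value $f(s + 1, t + 1)$ for all $t < n - 1$, which implies
$$\#F_n^d((i_0, \ldots, i_{n-1})) = \#F_{n-1}^d((i_1, \ldots, i_{n-1})) = \ldots = \#F_{d+1}^d((i_{n-d-1}, \ldots, i_{n-1})).$$
Therefore, the value of $\#F_n^d(i)$ does not depend on $n$ but on the number of possibilities for the last value $f(n - 1 - d, n - 1)$. Since $f(n - 1 - d, n - 1)$ is determined by (8) if $i_{n-1-d} \neq i_{n-1}$ and there are at most 2 possible outcomes for $f(n - d - 1, n - 1)$ otherwise, we have
$$\#F_n^d(i) \leq \begin{cases} 2 & \text{if } i_{n-1-d} = i_{n-1}, \\ 1 & \text{otherwise} \end{cases}$$
for all $d \in \{1, 2, \ldots, n - 1\}$, which, together with (7), finishes the proof.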
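The counting argument can be tried out on a simple example. Read together with (7), the argument above suggests that the number of ordinal patterns meeting $M(i)$ is at most $2^r$, where $r$ is the number of $s \in \{0, \ldots, n - 2\}$ with $i_s = i_{n-1}$; this form of the bound is an interpretation made here, not a quotation of the lemma. The sketch samples random points for the tent map (an example chosen here, with monotony intervals $[0, \frac{1}{2})$ and $[\frac{1}{2}, 1]$) and compares the observed number of patterns per itinerary with $2^r$. Sampling can only undercount the patterns, so this is a sanity check and not a proof.

```python
import random
from collections import defaultdict


def tent(x):
    """Tent map on [0, 1]: increasing on [0, 1/2), decreasing on [1/2, 1]."""
    return 2.0 * x if x < 0.5 else 2.0 - 2.0 * x


def itinerary_and_pattern(x, n):
    """Monotony-interval itinerary and ordinal pattern of the first n iterates."""
    orbit = []
    for _ in range(n):
        orbit.append(x)
        x = tent(x)
    itinerary = tuple(0 if v < 0.5 else 1 for v in orbit)
    pattern = tuple(sorted(range(n), key=lambda t: (orbit[t], t)))
    return itinerary, pattern


def check_bound(n, samples=20000, seed=0):
    """Map each observed itinerary to (observed pattern count, 2**r)."""
    rng = random.Random(seed)
    patterns = defaultdict(set)
    for _ in range(samples):
        itinerary, pattern = itinerary_and_pattern(rng.random(), n)
        patterns[itinerary].add(pattern)
    results = {}
    for itinerary, seen in patterns.items():
        repeats = sum(1 for s in range(n - 1) if itinerary[s] == itinerary[-1])
        results[itinerary] = (len(seen), 2 ** repeats)
    return results


results = check_bound(n=5)
for itinerary, (count, bound) in sorted(results.items()):
    print(itinerary, count, bound)
```

For short orbits the tent map is evaluated exactly in floating point arithmetic, so the computed itineraries and patterns are those of the true map for the sampled points.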
As an immediate consequence of the above lemma, we get that the difference between Kolmogorov-Sinai entropy and permutation entropy is not larger than $\log 2$. In order to give a smaller upper bound than $\log 2$, we have to put more effort into choosing our partition into monotony intervals $\mathcal{M}$. Notice that a partition into monotony intervals of $T$ is not unique. For every countable partition $\mathcal{M}$ into monotony intervals and every countable partition $\mathcal{Q}$ into intervals (but not necessarily into intervals of monotony), the partition $\mathcal{M} \vee \mathcal{Q}$ is a countable partition into monotony intervals as well.
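For finite partitions of an interval into subintervals, the refinement $\mathcal{M} \vee \mathcal{Q}$ amounts to merging the breakpoints of both partitions. The sketch below illustrates this with hypothetical breakpoints; representing cells as half-open intervals $[a, b)$ is a convention chosen here.

```python
def refine(breakpoints_a, breakpoints_b):
    """Common refinement of two interval partitions given by sorted breakpoints."""
    return sorted(set(breakpoints_a) | set(breakpoints_b))


def cells(breakpoints):
    """Cells [a, b) between consecutive breakpoints (a convention chosen here)."""
    return list(zip(breakpoints[:-1], breakpoints[1:]))


monotony = [0.0, 0.5, 1.0]        # hypothetical M: two monotony intervals
other = [0.0, 0.25, 0.75, 1.0]    # hypothetical Q: intervals unrelated to T
print(cells(refine(monotony, other)))
```

Every cell of the refinement is contained in a cell of the hypothetical partition `monotony`, so a map that is monotone on the cells of $\mathcal{M}$ stays monotone on the cells of $\mathcal{M} \vee \mathcal{Q}$.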

Rokhlin towers
We want to construct partitions $\mathcal{Q} = \{Q_j\}_{j \in J}$ so that the expected value of $\#\{s \in \{0, 1, \ldots, n - 2\} \mid j_s = j_{n-1}\}/n$ can be made small if $n$ goes to infinity. For $\#\{s \in \{0, 1, \ldots, n - 2\} \mid j_s = j_{n-1}\}/n$ to be small, we try to construct the sets $Q_j$ in such a way that the iterates $T^s(\omega)$ cannot stay in the same set $Q_j$ too frequently. In the case of ergodic maps $T$, Birkhoff's ergodic theorem implies that the fraction of iterates $T^s(\omega)$ that are elements of the set $Q_j$ converges to the measure of $Q_j$. So, by making the measure of $Q_j$ smaller for all $j \in J$ (which, consequently, increases the number of sets in $\mathcal{Q}$), we decrease the expected value of $\#\{s \in \{0, 1, \ldots, n - 2\} \mid j_s = j_{n-1}\}/n$. Finally, one can use an approach similar to the one used in the proof of Theorem 2 to show the equality of permutation and K-S entropy. Since the ergodic case is contained in the general case, we do not show this here explicitly. If $T$ is not ergodic, just making the measure of $Q \in \mathcal{Q}$ small is not enough any more; the sets $Q \in \mathcal{Q}$ need to be constructed in a specific way. This is done in Theorem 3 by using Rokhlin towers.
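The role of Birkhoff's ergodic theorem in the ergodic case can be illustrated with a map for which ergodicity is classical. The sketch below assumes the rotation $x \mapsto x + \alpha \bmod 1$ with irrational $\alpha$ and Lebesgue measure, which is an example chosen here and not a map from the paper; the starting point and the interval widths are arbitrary.

```python
import math


def visit_frequency(alpha, x0, interval, steps):
    """Fraction of the iterates x0 + t*alpha mod 1, t < steps, lying in [a, b)."""
    a, b = interval
    hits = sum(1 for t in range(steps) if a <= (x0 + t * alpha) % 1.0 < b)
    return hits / steps


alpha = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation number (assumption)
for width in (0.4, 0.2, 0.1, 0.05):
    print(width, visit_frequency(alpha, 0.3, (0.0, width), 100000))
```

The observed frequencies are close to the interval widths, so shrinking the cells of a partition lowers how often an orbit returns to the same cell, which is the effect used in the ergodic case.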
Lemma 3 (Rokhlin Lemma [5]). Let $\Omega$ be a separable metric space, $(\Omega, \mathcal{B}(\Omega), \mu, T)$ a measure-preserving dynamical system and $T$ aperiodic. Then for all $d \in \mathbb{N}$ and $\varepsilon > 0$ there exists a set $B \in \mathcal{B}(\Omega)$ such that the sets $B, T^{-1}(B), \ldots, T^{-d+1}(B)$ are pairwise disjoint and $\mu\left(\bigcup_{k=0}^{d-1} T^{-k}(B)\right) \geq 1 - \varepsilon$ holds true. Such sets form a Rokhlin tower of height $d$ with base $B$.
Such towers will turn out to be very useful because inequality (9) is true for all $\omega \in \Omega$ if $B$ is a base of a Rokhlin tower of height $d$. We will show in the proof of Theorem 2 by using Lemma 1 and 2 that, after dividing by $n$ and taking $n \to \infty$, inequality (9) can be used to find an upper bound on the difference between Kolmogorov-Sinai and permutation entropy. Since we can construct Rokhlin towers of arbitrary height $d \in \mathbb{N}$, we can make this upper bound arbitrarily small by increasing $d$. However, we cannot use $B, T^{-1}(B), \ldots, T^{-d+1}(B)$ directly to construct a countable partition $\mathcal{Q}$ of $\Omega$ into intervals because the sets $B, T^{-1}(B), \ldots, T^{-d+1}(B)$ are generally not intervals. We need our partition $\mathcal{Q}$ to consist of intervals if we want to relate this partition to the ordinal partition $\mathcal{P}_n^*$ because two disjoint intervals are characterized by the fact that all elements in one interval are smaller than every element in the other interval. This does not need to be true any more if the sets $Q \in \mathcal{Q}$ are disconnected. So the sets $B, T^{-1}(B), \ldots, T^{-d+1}(B)$ have to be approximated by sets of disjoint intervals, which is done with the help of the following two lemmas.
Since O is open, there exists a countable collection of pairwise disjoint open is true. This implies with the triangle inequality So we can approximate the base $B$ of a Rokhlin tower by a set of disjoint intervals $A_i$, $i = 1, \ldots, n$, with arbitrary precision. Since $T$ is measure-preserving, we can also approximate $T^{-k}(B)$ by the sets $T^{-k}(A_i)$, but these sets do not need to be intervals any more. However, one can use the piecewise monotony of $T$ to show that $T^{-k}(A_i)$ can be written as a finite or countable union of intervals: For a given (countable) piecewise monotone map $T: \Omega \to \Omega$, one can easily see that $T^{-1}(A)$ is a finite (or countable) union of intervals for every interval $A \subseteq \Omega$. In fact, if $A$ is an interval and $\mathcal{M}$ the partition into monotony intervals of $T$, the set $T^{-1}(A) \cap M$ is an interval for every $M \in \mathcal{M}$. This implies that for every partition $\mathcal{P}$ into intervals the partition $\mathcal{P}' := T^{-1}(\mathcal{P}) \vee \mathcal{M}$ will be a partition into intervals. Analogously, $\mathcal{P}'' := T^{-1}(\mathcal{P}') \vee \mathcal{M}$ will be a partition into intervals as well. Repeating this argument shows that the partition obtained after $k$ such steps is a partition into intervals for all $k \in \mathbb{N}$. Since the intersection of two intervals is an interval again, the common refinement of the first $d$ of these partitions is a partition into intervals for all $d \in \mathbb{N}$.
Applying Lemma 4 and using the piecewise monotony of $T$ as explained above provides us with many sets of disjoint intervals. In the proof of the following lemma, we combine those intervals into a countable partition $\mathcal{P} = \{P_i\}_{i \in I}$ of disjoint intervals. We are then interested in specific elements $A \in \sigma(\mathcal{P})$, where $\sigma(\mathcal{P})$ is the smallest $\sigma$-algebra containing all sets $P \in \mathcal{P}$. Since $\mathcal{P}$, as well as the index set $I$, is countable and all sets $P \in \mathcal{P}$ are pairwise disjoint, we can explicitly state $\sigma(\mathcal{P})$ as
$$\sigma(\mathcal{P}) = \left\{ \bigcup_{i \in I'} P_i \;\middle|\; I' \subseteq I \right\}.$$
In particular, this implies that every set $A \in \sigma(\mathcal{P})$ can be expressed as
$$A = \bigcup_{i \in I_A} P_i$$
with a unique countable index set $I_A \subseteq I$. We denote by $\mathrm{split}(A \mid \mathcal{P}) := \{P_i\}_{i \in I_A}$ the countable collection of sets in $\mathcal{P}$ into which $A$ is split.
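The operation $\mathrm{split}(A \mid \mathcal{P})$ can be mimicked on a finite toy example. In the sketch below, a finite set with a finite partition stands in for $\Omega$ and the countable partition into intervals; all names and data are hypothetical.

```python
def union_of_cells(partition, index_subset):
    """Union of the cells P_i with i in index_subset: an element of sigma(P)."""
    return frozenset().union(*(partition[i] for i in index_subset))


def split(a, partition):
    """Index set I_A of the cells of P that make up A.

    Raises ValueError if A is not a union of cells, i.e. not in sigma(P).
    """
    indices = {i for i, cell in partition.items() if cell <= a}
    if union_of_cells(partition, indices) != a:
        raise ValueError("A is not a union of cells of P")
    return indices


# Toy partition of {0, ..., 5} into three cells (hypothetical data).
partition = {0: frozenset({0, 1}), 1: frozenset({2}), 2: frozenset({3, 4, 5})}
a = union_of_cells(partition, {0, 2})
print(split(a, partition))
```

Because the cells are pairwise disjoint and non-empty, the index set returned for a set in $\sigma(\mathcal{P})$ is the only one that reproduces it, which mirrors the uniqueness of $I_A$.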
Theorem 3. Let (Ω, B(Ω), µ, T ) be a measure-preserving dynamical system with Ω ⊆ R being a compact metric space. Let T be aperiodic, countable piecewise monotone and M a countable partition into monotony intervals of T . Then for all ε > 0 and d ∈ N there exists a partition Q = {Q j } j∈J of Ω and an index set J ⊆ J, such that (i) Q consist of countably many intervals, Proof. Take ε > 0 and d ∈ N. According to Lemma 3, there exists a set for all k ∈ {1, 2, . . . , d − 1} and µ(B) ≥ 1−ε/2 d . Since any Borel probability measure on a compact metric space is regular [11], we can apply Lemma 4 to B, which provides the existence of a finite number of disjoint intervals A i , i ∈ {1, . . . , n} such that Consider Ω := [inf Ω, sup Ω] and R := Ω \ n i=1 A i . Because Ω and A i are intervals for all i ∈ {1, . . . , n}, there exists m ∈ N and intervals R i ⊆ Ω with Consider the partition It follows from (10) that P is a countable partition into intervals. Now define For all u, v ∈ {1, 2, . . . , n} and k, l ∈ {0, 1, . . . , d − 1} with k < l we have is fulfilled for all u, v ∈ {1, 2, . . . , n} and k, l ∈ {0, 1, for all i ∈ {1, 2, . . . n} and l ∈ {0, 1, . . . , d − 1}. This implies Since P is a countable partition into intervals, T −l ( A i ) and R can be expressed as the union of countably many intervals P ∈ P. The partition Q that consists of those intervals is defined as The collection of sets Q is indeed a partition of Ω because Notice that, because P is a countable partition into intervals and d is finite, Q is a countable partition into intervals as well. So (i) is fulfilled. Choose and take j ∈ J 0 . Then there exists a set A i and l ∈ {0, 1, So (iii) holds true. It remains to show (iv): Using A i ⊆ A i and the fact that the sets A i are pairwise disjoint provides This implies which is equivalent to So (iv) is fulfilled.
We now combine a partition $\mathcal{Q}$ as described in Theorem 3 and a partition $\mathcal{M}$ into monotony intervals of $T$ into the partition $\mathcal{P} = \mathcal{M} \vee \mathcal{Q}$. This partition combines the properties of $\mathcal{Q}$ and $\mathcal{M}$, i.e. we can apply Lemma 2 to $\mathcal{P}$ as we could to $\mathcal{M}$, and the properties given in Theorem 3 are true for $\mathcal{P}$ as they were for $\mathcal{Q}$.
Proof of Theorem 2. The inequality h P E (T ) ≥ h(T ) follows from (4). We now have to show h P E (T ) ≤ h(T ). Let M = {M i } i∈I be a finite or countable partition into monotony intervals of T with H(M) < ∞. For any d ∈ N choose a countable partition of Ω into intervals and an index set J d ⊆ J d with • H(Q) < ∞, • and µ( j∈ J d Q d j ) ≥ 1 − 1 d . According to Theorem 3, this is always possible. Consider the partition Applying Lemma 1 to P d yields where we consider (i, j) itself as one multi index and I × J d as one index set. Note that P d is a countable partition into monotony intervals of T for all d ∈ N. Therefore, we can apply Lemma 2 to #S P d n ((i, j)), which yields We have H(P d ) ≤ H(M) + H(Q d ) < ∞ for all d ∈ N and, consequently, Combining (14), (15) and (16)  is true.
We have for any n, d ∈ N. This implies equality (17) and finishes the proof.

Discussion
Remark 2. The requirement of $T$ being aperiodic with regard to $\mu$ is not a significant restriction of the main result: Consider the set of periodic points
$$Per := \bigcup_{n=1}^{\infty} \{\omega \in \Omega \mid T^n(\omega) = \omega\}.$$
We can divide the measure $\mu$ into a periodic part $\mu_p$ with $\mu_p(A) := \frac{\mu(A \cap Per)}{\mu(Per)}$ and an aperiodic part $\mu_a$ with $\mu_a(A) := \frac{\mu(A \setminus Per)}{1 - \mu(Per)}$ for all $A \in \mathcal{A}$. Since $Per$ is $\mu$-almost surely $T$-invariant, i.e. $\mu(Per \triangle T^{-1}(Per)) = 0$, both $\mu_p$ and $\mu_a$ are $T$-invariant measures. Therefore, $\mu = \mu(Per)\mu_p + (1 - \mu(Per))\mu_a$ implies (see e.g. [13])
$$h(T, \mu) = \mu(Per) h(T, \mu_p) + (1 - \mu(Per)) h(T, \mu_a). \qquad (18)$$
By using the same arguments as in [13] for the proof of (18), one can verify that
$$h_{PE}(T, \mu) = \mu(Per) h_{PE}(T, \mu_p) + (1 - \mu(Per)) h_{PE}(T, \mu_a)$$
holds true, where $h(T, \mu)$ and $h_{PE}(T, \mu)$ denote the corresponding entropies of the dynamical system $(\Omega, \mathcal{B}(\Omega), \mu, T)$. We can apply our main theorem to $h(T, \mu_a)$ and $h_{PE}(T, \mu_a)$ because $T$ is aperiodic with regard to $\mu_a$. Since the dynamics of periodic points are determined by a finite number of iterations, one can show that $h(T, \mu_p) = h_{PE}(T, \mu_p) = 0$ is true. Combining (18) $\frac{2 \log n}{(\log 2) n^2} < \infty$, which allows us to apply Theorem 2 and get $h_{PE}(T) = h(T)$.
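The finiteness of $H(\mathcal{M})$ for the Gauss function can also be checked numerically. The sketch below assumes the standard Gauss measure $\mu(A) = \frac{1}{\log 2} \int_A \frac{1}{1 + x} \, dx$ and the monotony intervals $(\frac{1}{k+1}, \frac{1}{k}]$, for which $\mu((\frac{1}{k+1}, \frac{1}{k}]) = \log_2\left(1 + \frac{1}{k(k+2)}\right)$; the truncation levels are arbitrary choices made here.

```python
import math


def gauss_measure_of_cell(k):
    """Gauss measure of (1/(k+1), 1/k], which equals log2(1 + 1/(k(k+2)))."""
    return math.log1p(1.0 / (k * (k + 2))) / math.log(2.0)


def partial_sums(n_cells):
    """Total mass and Shannon entropy (natural log) of the first n_cells cells."""
    mass = 0.0
    entropy = 0.0
    for k in range(1, n_cells + 1):
        p = gauss_measure_of_cell(k)
        mass += p
        entropy -= p * math.log(p)
    return mass, entropy


for n_cells in (10, 1000, 100000):
    print(n_cells, partial_sums(n_cells))
```

The partial masses approach 1 and the partial entropies level off, which is consistent with a convergent series whose terms behave like $\frac{2 \log n}{(\log 2) n^2}$.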

Generalizing interval notion
The main reason why the partition M = {M i } i∈I into sets of monotony for the map T was required to be a partition into intervals is the fact that a collection of disjoint intervals can be ordered in such a way that the order relation between two different intervals corresponds to the order relation of the points within those intervals. This information about the order relation was utilized in (8) as one part of determining the number of sets with ordinal patterns of length n intersecting a set M(i) for i ∈ I n . To describe this specific ordering on the set of those intervals, we write M i < M j if ω i < ω j holds true for all (ω i , ω j ) ∈ M i × M j .
We can generalize this ordering of intervals in a way that allows us to ignore sets of points with measure zero and that preserves the correspondence between the order of the different sets $M_i$ and points within those sets. To achieve this, we write $A <_\mu B$ for $A, B \in \mathcal{B}(\Omega)$ if
$$(\mu \times \mu)(\{(\omega_1, \omega_2) \in A \times B \mid \omega_1 \geq \omega_2\}) = 0$$
holds true, where $(\mu \times \mu)$ is the product measure on the $\sigma$-algebra $\mathcal{B}(\Omega^2)$. So $A <_\mu B$ means that almost every point in $A$ is smaller than almost every point in $B$, which can be interpreted as a probabilistic formulation of (21). We say that $\mathcal{M} = \{M_i\}_{i \in I}$ is an ordered partition of $\Omega$ if $M_i <_\mu M_j$ or $M_j <_\mu M_i$ is true for all $i, j \in I$ with $i \neq j$. Since elements of an ordered partition can be ordered the same way basic intervals could, up to sets with measure zero, our main theorem remains valid if we consider such a generalized partition as a partition into monotony sets.
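The relation $A <_\mu B$ can be illustrated with a toy measure. The sketch below assumes a finitely supported measure, given as a dictionary of point masses, and reads "almost every point in $A$ is smaller than almost every point in $B$" as the product measure of the pairs $(x, y) \in A \times B$ with $x \geq y$ being zero; both the discrete setting and this exact formula are assumptions made for illustration.

```python
def product_mass_of_violations(mu, a, b):
    """(mu x mu)-mass of the pairs (x, y) in A x B with x >= y."""
    return sum(mu[x] * mu[y] for x in a for y in b if x >= y)


def precedes(mu, a, b):
    """A <_mu B in the sense described above, for a finitely supported mu."""
    return product_mass_of_violations(mu, a, b) == 0.0


# Hypothetical point masses; the point 0.9 carries no mass.
mu = {0.1: 0.3, 0.2: 0.2, 0.9: 0.0, 0.5: 0.25, 0.7: 0.25}
set_a = {0.1, 0.2, 0.9}
set_b = {0.5, 0.7}
print(precedes(mu, set_a, set_b), precedes(mu, set_b, set_a))
```

Here the first set precedes the second in the generalized sense although it contains the point 0.9, because that point has measure zero; in the pointwise ordering of intervals the two sets would not be comparable.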