Entropy conjugacy for Markov multi-maps of the interval

We consider a class $\mathcal{F}$ of Markov multi-maps on the unit interval. Any multi-map gives rise to a space of trajectories, which is a closed, shift-invariant subset of $[0,1]^{\mathbb{Z}_+}$. For a multi-map in $\mathcal{F}$, we show that the space of trajectories is (Borel) entropy conjugate to an associated shift of finite type. Additionally, we characterize the set of numbers that can be obtained as the topological entropy of a multi-map in $\mathcal{F}$.


Introduction
Multi-maps, also called set-valued maps, have been studied in the topological dynamics literature for some time, with such notable examples as [1,21,22]. In the past decade multi-maps have been studied extensively, with a particular focus on the topological structure of the associated space of trajectories or a related inverse limit space; see [15]. This development has also led to a renewed interest in the dynamics of multi-maps [11,13,17,18]. Additionally, multi-maps are the topological analogues of random maps of the interval, which have received substantial attention, e.g., [4,10,14,23].
In the study of single-valued maps of the interval, Markov maps [7] are particularly well-understood. These maps have a finite invariant set such that the map is strictly monotone on the intervals between elements of that set. This structure allows one to associate to each Markov interval map a corresponding shift of finite type that preserves many aspects of the dynamics.
Recent work [3,5,6,12] has generalized the notion of Markov interval maps to the setting of multi-maps and established some of their basic properties. In particular, [2] proves that under some conditions on the Markov multi-map, one may find upper and lower bounds for its entropy using associated shifts of finite type.
Our main results substantially sharpen this previous work. Under mild conditions on the Markov multi-map, we associate to it a single shift of finite type, and then we establish a close connection (in the form of a Borel entropy conjugacy) between the dynamics of the multi-map and its associated shift of finite type. In particular, for a Markov multi-map in the class $\mathcal{F}$ considered here, the topological entropies of the Markov multi-map and of the associated shift of finite type must be equal. Furthermore, we demonstrate the richness of the class $\mathcal{F}$ by showing that any number that appears as the entropy of a shift of finite type also appears as the entropy of a Markov multi-map in $\mathcal{F}$.
Further, let $\sigma_X : X \to X$ denote the left-shift map $(x_n)_{n=0}^{\infty} \mapsto (x_{n+1})_{n=0}^{\infty}$ on $X$. We seek to understand the multi-map $F$ by studying the dynamics of the system $(X, \sigma_X)$. In Section 3 we introduce a class of multi-maps that we call Markov multi-maps, and in Definition 3.4 we state what it means for a Markov multi-map to be properly parametrized. In this work we focus on a specific class $\mathcal{F}$ of Markov multi-maps (see Definition 3.8): properly parametrized Markov multi-maps with complete sets of coding and avoiding words and positive entropy. For any multi-map $F$ in this class, one may associate to $F$ a square matrix $M = M(F)$ with entries in $\{0, 1\}$ (see Section 3.2). The matrix $M$ encodes the combinatorial structure of $F$. Let $\Sigma_M$ be the shift of finite type defined by $M$, with left-shift map $\sigma_M$. The following theorem provides a precise correspondence between a "large" subset of the trajectory space $X$ and a "large" subset of the SFT $\Sigma_M$, where "large" here refers to a notion of entropy. A precise definition of Borel entropy conjugacy, originally defined by Buzzi [9] under the term "entropy conjugacy," appears in Definition 2.2.

Theorem 1.1. Let $\mathcal{F}$ be the class of Markov multi-maps specified in Definition 3.8. Let $F$ be in $\mathcal{F}$ with trajectory space $X$ and associated SFT $\Sigma_M$. Then $(X, \sigma_X)$ is Borel entropy conjugate to $(\Sigma_M, \sigma_M)$.
Since Borel entropy conjugacy is known to preserve topological entropy, we immediately obtain the following corollary.

Corollary 1.2. Let $F$ be in $\mathcal{F}$ with trajectory space $X$ and associated SFT $\Sigma_M$. Then $h_{\mathrm{top}}(X, \sigma_X) = h_{\mathrm{top}}(\Sigma_M, \sigma_M)$.

In fact, since Borel entropy conjugacy provides a correspondence between all ergodic measures with large enough entropy, the following corollary is also immediate.

Corollary 1.3. Let $F$ be in $\mathcal{F}$ with trajectory space $X$ and associated SFT $\Sigma_M$. Then $(X, \sigma_X)$ has the same number of measures of maximal entropy as $(\Sigma_M, \sigma_M)$. In particular, if $(\Sigma_M, \sigma_M)$ is irreducible, then $(X, \sigma_X)$ is intrinsically ergodic (i.e., has a unique measure of maximal entropy).

Remark 1.4. Random maps of the interval have been studied primarily with an eye towards the existence and properties of absolutely continuous invariant measures, e.g., see [4,7]. While the entropy conjugacy guaranteed by Theorem 1.1 provides a correspondence between ergodic measures of large entropy on $X$ and on $\Sigma_M$, it does not address questions about whether any of these measures is absolutely continuous on $X$.
Let us now answer a question of Karl Petersen (personal communication). Let $H(\mathcal{F})$ denote the set of real numbers $r > 0$ such that there exists a multi-map $F \in \mathcal{F}$ having trajectory space $X$ with $h_{\mathrm{top}}(X, \sigma_X) = r$. Recall that Lind has characterized the set of positive real numbers that arise as the entropy of an SFT as the set of all positive rational multiples of logarithms of Perron numbers [19].

1.2. Organization of the paper. In Section 2, we provide background information and notation concerning shifts of finite type, ergodic theory, and Borel entropy conjugacy. Section 3 introduces Markov multi-maps and the class $\mathcal{F}$ of interest. Taken together, Sections 4-7 contain the proof of our main result, Theorem 1.1. In Section 8 we establish some sufficient conditions for a Markov multi-map to be in $\mathcal{F}$, and then in Section 9 we prove the realization result, Theorem 1.5. Finally, Section 10 contains some examples of Markov multi-maps.
A trajectory for $F$ is a sequence $(x_0, x_1, \dots) \in [0, 1]^{\mathbb{Z}_+}$ such that for all $n \geq 1$, we have $x_n \in F(x_{n-1})$, or equivalently $(x_{n-1}, x_n) \in G(F)$. We denote by $X = X(F)$ the set of trajectories for $F$, and we give $X$ the topology it inherits as a subspace of $[0, 1]^{\mathbb{Z}_+}$ with the product topology. We also define the left-shift on $X$, denoted $\sigma_X$, by setting $\sigma_X(x_0, x_1, \dots) = (x_1, x_2, \dots)$. Observe that $\sigma_X$ is a continuous mapping on $X$, and if $G(F)$ is closed in $[0, 1]^2$, then $X$ is closed in $[0, 1]^{\mathbb{Z}_+}$.
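The trajectory condition can be checked mechanically on finite prefixes. The following is a minimal sketch, assuming a hypothetical multi-map $F(x) = \{x/2,\ 1 - x/2\}$ chosen only for illustration (it is not an example from this paper); the function names are likewise illustrative.

```python
from fractions import Fraction


def in_graph(x, y):
    """Membership test for G(F), where F(x) = {x/2, 1 - x/2} is a
    hypothetical multi-map used only for illustration."""
    return y == x / 2 or y == 1 - x / 2


def is_trajectory_prefix(xs):
    """Check that every consecutive pair (x_{n-1}, x_n) lies in G(F)."""
    return all(in_graph(xs[n - 1], xs[n]) for n in range(1, len(xs)))


def shift(xs):
    """Left shift on a finite prefix: drop the first coordinate."""
    return xs[1:]


good = [Fraction(1), Fraction(1, 2), Fraction(3, 4), Fraction(3, 8)]
bad = [Fraction(1), Fraction(1, 3)]
print(is_trajectory_prefix(good))         # each x_n lies in F(x_{n-1})
print(is_trajectory_prefix(shift(good)))  # the shifted prefix is still valid
print(is_trajectory_prefix(bad))          # 1/3 is not in F(1) = {1/2}
```

Exact rational arithmetic is used so that membership in the graph is tested without floating-point error.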
2.1. Shifts of finite type. Let $A$ be a finite set, which we call the alphabet. An element $b \in A^n$ is called a word of length $n$. The full shift on $A$ is $\Sigma = A^{\mathbb{Z}_+}$, endowed with the product topology induced by the discrete topology on $A$. Given a set of words $F$, we may define $\Sigma_F \subseteq \Sigma$ to be the set of points that do not contain any word in $F$. We refer to words in $F$ as forbidden words. Then $\Sigma_F$ is closed and invariant under the left-shift on $\Sigma$. If $F$ is finite, then we refer to $\Sigma_F$ as a shift of finite type (SFT). In this work we restrict attention to SFTs for which all the forbidden words have length two, called nearest neighbor SFTs. For more on SFTs, we refer the reader to the book [20].
Any nearest neighbor SFT may be expressed in terms of a directed graph $(V, E)$, where the set of vertices $V$ is equal to $A$, and given $a, b \in A$, there is an edge from $a$ to $b$ in the edge set $E$ if and only if $ab \notin F$. Furthermore, we associate to any such graph its adjacency matrix $M$, defined as the square matrix indexed by $A$ such that for $a, b \in A$, if $ab \notin F$ then $M(a, b) = 1$, and otherwise $M(a, b) = 0$. Note that any zero-one matrix indexed by $A$ also defines an associated nearest neighbor SFT (by letting $ab$ be a forbidden word whenever $M(a, b) = 0$). The nearest neighbor SFT defined by a zero-one matrix $M$ is denoted by $\Sigma_M$, and the left-shift restricted to $\Sigma_M$ is denoted by $\sigma_M$.

In what follows it is convenient to have some notation for words of arbitrary length that do not contain any forbidden word. For $n \geq 1$, we let $L_n$ denote the set of words $a_0 \dots a_n \in A^{n+1}$ such that $M(a_i, a_{i+1}) = 1$ for each $i = 0, \dots, n-1$. Then let $L = \bigcup_n L_n$.

Consider an arbitrary nearest neighbor SFT $\Sigma_M$ on alphabet $A$. It has an associated finite directed graph $\Gamma$, with vertex set $A$ and an edge from $a$ to $b$ whenever $M(a, b) = 1$. Let $C_1, \dots, C_K \subset A$ be the vertex sets of the maximal strongly connected components of $\Gamma$, which we call the irreducible components of $\Gamma$. For each $C_k$, the set of points in $\Sigma_M$ containing only symbols from $C_k$ forms an irreducible SFT, which we denote by $\Sigma_M(C_k)$. We refer to $\Sigma_M(C_k)$ as an irreducible component of $\Sigma_M$. Note that the irreducible components $\Sigma_M(C_1), \dots, \Sigma_M(C_K)$ are pairwise disjoint, and the set $\Sigma_M \setminus \bigcup_k \Sigma_M(C_k)$ contains only wandering points. See [20, Chapter 4] for more details on this decomposition. We also denote by $L(C_k)$ the set of words of arbitrary length on $C_k$ that do not contain a forbidden word.
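To make the combinatorial objects above concrete, the following sketch counts allowed words and extracts irreducible components from a zero-one matrix. The example matrix and all function names are illustrative assumptions, and the choice to discard a single vertex without a self-loop (which supports no point of the SFT) is a convention of this sketch rather than of the text.

```python
def count_words(M, n):
    """Number of allowed words a_0 ... a_n (n transitions) for a 0-1 matrix M."""
    k = len(M)
    counts = [1] * k  # counts[b] = number of allowed words ending in symbol b
    for _ in range(n):
        counts = [sum(counts[a] * M[a][b] for a in range(k)) for b in range(k)]
    return sum(counts)


def reachable(M, start):
    """Vertices reachable from `start` along directed paths of length >= 0."""
    seen, stack = {start}, [start]
    while stack:
        a = stack.pop()
        for b in range(len(M)):
            if M[a][b] and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def irreducible_components(M):
    """Strongly connected components of the graph of M that contain a cycle."""
    k = len(M)
    reach = [reachable(M, a) for a in range(k)]
    comps, assigned = [], set()
    for a in range(k):
        if a in assigned:
            continue
        comp = {b for b in reach[a] if a in reach[b]}
        assigned |= comp
        if len(comp) > 1 or M[a][a]:  # skip a lone vertex with no self-loop
            comps.append(sorted(comp))
    return comps


# Illustrative matrix: symbols 0 and 1 form one component, symbol 2 another.
M = [[1, 1, 0],
     [1, 0, 1],
     [0, 0, 1]]
print(count_words(M, 1))          # number of allowed two-letter words
print(irreducible_components(M))
```

The quadratic reachability computation is chosen for readability; for large alphabets a linear-time strongly connected components algorithm would be preferable.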

2.2. Invariant measures and entropy.
In this work a topological dynamical system consists of a pair $(X, T)$, where $T : X \to X$ is a continuous self-map of a compact metrizable space. For any such system, we let $M(X, T)$ denote the set of Borel probability measures $\mu$ on $X$ such that $\mu(E) = \mu(T^{-1}E)$ for all Borel sets $E \subset X$. Note that $M(X, T)$ is a nonempty, convex set that is compact in the weak* topology. A measure $\mu \in M(X, T)$ is called ergodic if $\mu(E) \in \{0, 1\}$ for all Borel sets $E$ such that $T^{-1}(E) \subset E$. The set of ergodic measures is denoted by $M_e(X, T)$. Note that a measure $\mu \in M(X, T)$ is an extreme point in $M(X, T)$ if and only if $\mu$ is ergodic.
The following notation is used in subsequent sections. For any Borel set $E \subset X$, the union of all of its pre-images is denoted $\mathrm{Pre}(E) = \bigcup_{n \geq 0} T^{-n}(E)$. Note that $T^{-1}(\mathrm{Pre}(E)) \subset \mathrm{Pre}(E)$, and therefore if $\mu \in M_e(X, T)$ then $\mu(\mathrm{Pre}(E)) \in \{0, 1\}$.
We also require some elementary facts regarding the entropy theory of dynamical systems. Complete definitions and proofs can be found in [24]. Let $h_{\mathrm{top}}(T)$ denote the topological entropy of the topological system $(X, T)$. Furthermore, when the system $(X, T)$ is understood and $\mu \in M(X, T)$, we denote the measure-theoretic entropy of $\mu$ by $h(\mu)$. The standard variational principle for entropy states that $h_{\mathrm{top}}(T) = \sup\{h(\mu) : \mu \in M(X, T)\}$, and the supremum may be taken over only the ergodic measures. Furthermore, for SFTs it is known that the supremum is achieved, and if the SFT is irreducible, then it has a unique measure of maximal entropy. We also note for future use that an irreducible SFT is entropy minimal, i.e., if $X$ is an irreducible SFT of positive entropy and $Y$ is a proper subshift of $X$ (a closed, shift-invariant strict subset), then $h_{\mathrm{top}}(Y, \sigma|_Y) < h_{\mathrm{top}}(X, \sigma|_X)$ (see [20] for a proof).
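For a nearest neighbor SFT, the topological entropy equals the logarithm of the spectral radius of the zero-one matrix, which is also the exponential growth rate of the number of allowed words. The sketch below estimates this growth rate from exact word counts and illustrates entropy minimality on a standard pair of examples (the full 2-shift and the subshift obtained by forbidding the word 11). The function names and the default parameter are illustrative assumptions, and the estimate assumes the SFT is nonempty.

```python
import math


def count_words(M, n):
    """Number of allowed words with n transitions for a 0-1 matrix M."""
    k = len(M)
    counts = [1] * k
    for _ in range(n):
        counts = [sum(counts[a] * M[a][b] for a in range(k)) for b in range(k)]
    return sum(counts)


def entropy_estimate(M, n=200):
    """Estimate the entropy of the SFT defined by M from word-count growth.

    Uses (log N(2n) - log N(n)) / n, where N(m) counts allowed words with
    m transitions; the counts are exact Python integers.
    """
    return (math.log(count_words(M, 2 * n)) - math.log(count_words(M, n))) / n


full_shift = [[1, 1], [1, 1]]    # no forbidden words
golden_mean = [[1, 1], [1, 0]]   # the word "11" is forbidden

h_full = entropy_estimate(full_shift)
h_golden = entropy_estimate(golden_mean)
print(h_full, math.log(2))
print(h_golden, math.log((1 + math.sqrt(5)) / 2))
print(h_golden < h_full)  # a proper subshift of an irreducible SFT has smaller entropy
```

Taking the difference of two logarithms cancels the constant prefactor in the word count, so the estimate converges much faster than $\frac{1}{n}\log N(n)$ alone.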

2.3. Entropy conjugacy.
We adopt the following definition of entropy for Borel sets (following Buzzi [9]).

Definition 2.1. Let $T : X \to X$ be a topological dynamical system. For a Borel set $E \subset X$, let $h_{\mathrm{prob}}(E) = \sup\{h(\mu) : \mu \in M_e(X, T),\ \mu(E) = 1\}$.

Now we define a notion of entropy conjugacy, which was previously introduced by Buzzi [9].

Definition 2.2. Suppose that $T_0 : X_0 \to X_0$ and $T_1 : X_1 \to X_1$ are topological dynamical systems. We say that they are Borel entropy conjugate if there exist Borel sets $E_0 \subset X_0$ and $E_1 \subset X_1$ and an invertible Borel bi-measurable map $\psi : E_0 \to E_1$ such that $\psi \circ T_0 = T_1 \circ \psi$ on $E_0$ and $h_{\mathrm{prob}}(X_i \setminus E_i) < h_{\mathrm{top}}(X_i, T_i)$ for $i = 0, 1$.

It is an easy corollary of the variational principle for topological dynamical systems that if $(X_0, T_0)$ and $(X_1, T_1)$ are Borel entropy conjugate, then $h_{\mathrm{top}}(X_0, T_0) = h_{\mathrm{top}}(X_1, T_1)$.

Remark 2.3. In his work on topological entropy for non-compact sets, Bowen introduced a notion that he called entropy conjugacy [8]. Bowen's definition of entropy conjugacy requires that the sets $E_0$ and $E_1$ have smaller topological entropy (in the dimension-theoretic sense defined in his paper) than the full system and that the conjugating map $\psi$ is continuous. As such, Bowen's notion of entropy conjugacy is stronger than the notion of Borel entropy conjugacy defined above.

Markov multi-maps
We now give a precise definition of Markov multi-maps on the interval $[0, 1]$. This definition is based on the one given in [2], though our definition is slightly less general.

(1) $P = \{p_0, \dots, p_r\}$ is a partition of the interval $[0, 1]$ with $0 = p_0 < \dots < p_r = 1$; 1] , and for each a ∈ A, there exists p i ∈ P such that 1] , and for each a ∈ A, there exists

Note that each $G(a)$ is closed in $[0, 1] \times [0, 1]$, and so is $G(F)$. Some examples of Markov multi-maps and their graphs are given in Section 10. Now we make some additional graph-related definitions that are used repeatedly throughout this work. Our results require that $F$ has some additional structure, which we now begin to define.

Definition 3.3. We say that $F$ satisfies the no crossing property if the following holds: for all $a, b \in A$ with $a \neq b$, we have $G(a) \cap G(b) \subset P \times P$.

The following property strictly implies the no crossing property.

Definition 3.4. We say that $F$ is properly parametrized if the collection $\{G_0(a) : a \in A\}$ forms a partition of $G(F)$.
We think of the no crossing property as a property of the graph G(F ) (and the partition P ), whereas being properly parametrized depends on the particular parametrization of the Markov multi-map F . However, these properties are related by Lemma 4.1: if F 0 is a Markov multi-map with the no crossing property, then there exists a properly parametrized Markov multi-map F 1 such that G(F 0 ) = G(F 1 ).
Remark 3.5. If $F$ is a Markov multi-map, then it possesses the following graph Markov property: for all $a, b \in A$, if $D_0(b) \cap R_0(a) \neq \emptyset$, then $D_0(b) \subset R_0(a)$. This property is used to define the SFT associated with $F$, which appears in the next section.

3.2. The SFT associated to a Markov multi-map. We associate to any Markov multi-map $F$ an SFT as follows. Let $M$ be the square matrix indexed by $A$ such that for $a, b \in A$, we have $M(a, b) = 1$ if $D_0(b) \subset R_0(a)$ and $M(a, b) = 0$ otherwise. Let $\Sigma_M \subset A^{\mathbb{Z}_+}$ be the nearest neighbor SFT with alphabet $A$ and transition matrix $M$.
In our main results, we relate the SFT Σ M to the trajectory space X. In particular, Theorem 1.1 establishes sufficient conditions for these systems to be Borel entropy conjugate.
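As an illustration of how a transition matrix can be read off from the domains and ranges of the branches, the sketch below assumes the rule $M(a, b) = 1$ exactly when $D_0(b) \subset R_0(a)$, which is the implication used in the proof of Proposition 5.9. The two-branch example (a branch `a` from $(0, 1/2)$ onto $(0, 1)$ and a branch `b` from $(1/2, 1)$ onto $(0, 1/2)$), the representation of open intervals as endpoint pairs, and the function names are all hypothetical.

```python
from fractions import Fraction as Fr


def contained(inner, outer):
    """True when the nonempty open interval `inner` is a subset of `outer`.

    Intervals are represented by their pairs of endpoints.
    """
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def transition_matrix(domains, ranges):
    """Assumed rule: M(a, b) = 1 exactly when the domain of b lies in the range of a."""
    symbols = sorted(domains)
    matrix = [[1 if contained(domains[b], ranges[a]) else 0 for b in symbols]
              for a in symbols]
    return symbols, matrix


# Hypothetical example with partition P = {0, 1/2, 1}.
domains = {"a": (Fr(0), Fr(1, 2)), "b": (Fr(1, 2), Fr(1))}
ranges = {"a": (Fr(0), Fr(1)), "b": (Fr(0), Fr(1, 2))}

symbols, M = transition_matrix(domains, ranges)
print(symbols)
print(M)  # the transition from b to b is forbidden in this example
```

In this example only the transition from `b` to `b` is excluded, so the resulting SFT is the one obtained from the full shift on two symbols by forbidding a single word of length two.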

3.3. Nested intervals.
Let F be a properly parametrized Markov multi-map with associated matrix M. Here we associate to each sequence a ∈ Σ M a nonempty closed (possibly degenerate) interval in [0, 1]. To begin, for each a ∈ A 0 , we let f −1 a be the standard inverse function (which exists since f a is assumed to be a homeomorphism). For a ∈ A 1 ∪ A 2 , we let f −1 a be the unique map such that f −1 a : R(a) → D(a) (which exists since R(a) is non-empty and D(a) is a singleton in this case). Let u = a 0 . . . a n ∈ L n . Define the set • f −1 a n−1 (D(a n )) We make the following elementary observations. • Then I a is a non-empty, closed interval. Additionally, we note that ). These intervals appear in the next section in the definitions that characterize the class of Markov multi-maps in our main results.
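The interval attached to a finite word can be computed by pulling the domain of the last symbol back through the inverse branches of the earlier symbols. The sketch below assumes the reading $I_u = f_{a_0}^{-1} \circ \dots \circ f_{a_{n-1}}^{-1}(D(a_n))$ for $u = a_0 \dots a_n$, and uses the two branches of the doubling map ($x \mapsto 2x$ on $[0, 1/2]$ and $x \mapsto 2x - 1$ on $[1/2, 1]$) as a hypothetical example in which every symbol lies in $A_0$.

```python
from fractions import Fraction as Fr

# Hypothetical branches: symbol 0 is x -> 2x on [0, 1/2],
# symbol 1 is x -> 2x - 1 on [1/2, 1].
DOMAIN = {0: (Fr(0), Fr(1, 2)), 1: (Fr(1, 2), Fr(1))}
INVERSE = {0: lambda y: y / 2, 1: lambda y: (y + 1) / 2}


def interval_of_word(word):
    """Endpoints of the closed interval attached to the word a_0 ... a_n.

    Starts from D(a_n) and applies the inverse branches of
    a_{n-1}, ..., a_0 in that order.
    """
    lo, hi = DOMAIN[word[-1]]
    for a in reversed(word[:-1]):
        g = INVERSE[a]
        # Sorting keeps the endpoints ordered even for a decreasing branch.
        lo, hi = sorted((g(lo), g(hi)))
    return lo, hi


print(interval_of_word([0, 1, 1]))   # points whose itinerary begins 0, 1, 1
lo, hi = interval_of_word([0] * 10)
print(hi - lo)                       # the intervals shrink as the word grows
```

In this example the lengths halve with each additional symbol, so the nested intervals along any infinite sequence shrink to a single point; this is the behavior that the coding words of Definition 3.6 are meant to guarantee.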
3.4. Definition of the class F . In this section we define the class F of Markov multi-maps that appears in our main results. Let F be a properly parametrized Markov multi-map with associated matrix M.
Definition 3.6. Suppose C ⊂ A is an irreducible component of the graph with adjacency matrix M. We say that C has a coding word if there exists u ∈ L(C) such that if a ∈ Σ M (C) and {n ≥ 0 : σ n (a) ∈ [u]} is infinite, then I a is a singleton. Furthermore, we say that F has a complete set of coding words if each irreducible component with positive entropy has a coding word.
Definition 3.7. Suppose C ⊂ A is an irreducible component of the graph with adjacency matrix M. We say that C has an avoiding word if there exists u ∈ L(C) such that if a ∈ [u], then I a ∩ P = ∅. Furthermore, we say that F has a complete set of avoiding words if the following condition holds: if C is an irreducible component with positive entropy that is entirely contained in A 0 , then C has an avoiding word. Now we are prepared to give a precise definition of the class of Markov multi-maps that appears in our main results.
Definition 3.8. The class F consists of all properly parametrized Markov multi-maps F such that F has a complete set of coding words, F has a complete set of avoiding words, and the associated SFT Σ M has positive entropy.

3.5. Finite labeled trajectories.
Here we define some additional terminology that is useful in the following sections.
Let T m be the set of finite labeled trajectories of length m+1. We endow T m with the subspace topology inherited from [0, 1] m+1 × L m−1 (which has the product of the usual topology on [0, 1] m+1 and the discrete topology on L m−1 ).
Finally, we say that (x, b) ∈ T m is a special finite labeled trajectory Let S m denote the set of special finite labeled trajectories of length m + 1, and we let S m inherit the subspace topology inherited from T m .

Preliminary results
4.1. Parametrization lemma. The following simple result states that any Markov multi-map with the no-crossing property can be properly parametrized without changing its graph. Since the space of trajectories of a Markov multi-map depends only on its graph, this reparametrization also preserves the space of trajectories. Proof. Let F be a Markov multi-map with the no-crossing property.
Then let F 1 be the Markov multi-map defined by B 0 , B 1 , and B 2 .

4.2. Graph lemmas. In this section we prove a few facts about graphs of Markov multi-maps. Throughout the remainder of this section, we consider $F \in \mathcal{F}$. Since $F$ is properly parametrized, we know that if $a, b \in A$ are distinct elements, then $G_0(a) \cap G_0(b) = \emptyset$. However, it is possible that $a \neq b$ and yet $G(a)$ has nontrivial intersection with $G(b)$. The following lemma shows that any such intersection must be contained in $P \times P$.
Then there is a unique a ∈ A such that (x, y) ∈ G(a), and furthermore (x, y) ∈ G 0 (a).
Proof. Since $(x, y) \in G(F) = \bigcup_a G(a)$, there must exist some $a \in A$ such that $(x, y) \in G(a)$. For uniqueness, suppose that $(x, y) \in G(a) \cap G(b)$. By the no-crossing property, for any $a \neq b$, we have $G(a) \cap G(b) \subset P \times P$. Since $(x, y) \notin P \times P$, we conclude that $a = b$. Since $(x, y) \notin P \times P$ and $(x, y) \in G(a)$, we see that $a \in A_0 \cup A_1$, and we must have $(x, y) \in G_0(a)$.
The next lemma asserts that G(F ) cannot accumulate along a horizontal line to any point of G(F ) ∩ (P × P ).
and if (y, q) ∈ G 0 (a), then y = p and a must be the unique element of A 2 such that G(a) = {(p, q)}.
The next two lemmas address the convergence of sequences in the space of finite labeled trajectories.
First, note that since A m+1 has the discrete topology, for all large enough k, we have b k = b. Then for all large enough k, we have (y k n , y k n+1 ) ∈ G(b n ) for all n = 0, . . . , m. Since G(b n ) is closed and {(y k n , y k n+1 )} ∞ k=1 converges to (x n , x n+1 ), we see that (x n , x n+1 ) ∈ G(b n ) for each n = 0, . . . , m.
Since x m+1 / ∈ P , Lemma 4.2 gives that there is a unique a ∈ A such that (x m , x m+1 ) ∈ G(a), and therefore we must have b m = a = w m .
Furthermore, since x m ∈ P and x m+1 / ∈ P , we see that x m for all large enough k. We claim by backwards induction that for each j = 0, . . . , m, we have b j = w j and y k j = x j for all large enough k. We have established the base case (j = m) in the preceding paragraph. Now suppose it holds for some j + 1. Let U be given by Lemma 4.3 for the point (x j , x j+1 ). By the inductive hypothesis, for all large enough k, we have y k j+1 = x j+1 ∈ P . Also, for all large enough n, we must have . By our choice of U, we must have that b j = w j and y k j = x j for all large enough k, which completes the induction.
Proof. As A m+1 has the discrete topology, we must have that b k = b for all large enough k. Then for all large enough k, we have (y k n , y k n+1 ) ∈ G(b n ), which is closed, and therefore (x n , x n+1 ) ∈ G(b n ).
Observe that if (x n , x n+1 ) ∈ G(F )\(P ×P ), then b n = w n by Lemma 4.2. Now suppose that we have some n such that (x n , x n+1 ) ∈ P × P . Then there exists N ∈ [n + 1, m] such that x j ∈ P for all j = n, . . . , N and x N +1 / ∈ P . Thus x n , . . . , x N +1 satisfies the conditions of Lemma 4.4, and we conclude that b n = w n .

Construction of the joint system and factor maps
Let F be a Markov multi-map in F with associated trajectory space X and SFT Σ M . In the following section, we introduce a topological dynamical system by taking limits of special finite labeled trajectories. We call this system the joint system. Then in Sections 5.2 and 5.3, we show that the joint system is in fact a common extension of X and Σ M . The joint system and its factor maps onto X and Σ M are central to the construction of the Borel entropy conjugacy in our proof of Theorem 1.1. We establish their key properties in this section.
5.1. The joint system. Let F be a Markov multi-map in F with associated SFT Σ M . Here we define a subset V of the product space [0, 1] Z + × Σ M , which will serve as a common extension of the trajectory space X and the SFT Σ M .
of natural numbers tending to infinity and a sequence {(y k , a k )} ∞ k=1 of special finite labeled trajectories, with (y k , a k ) ∈ S ℓ k , such that for each n ≥ 0, the sequence {(y k n , a k n )} ∞ k=1 converges to (x n , a n ) in [0, 1] × A. To complete the proof, we exhibit a sequence {ℓ j } ∞ j=1 of natural numbers and a sequence {(z j , c j )} ∞ j=1 of special finite labeled trajectories to demonstrate that (x, a) ∈ V .
Let $j \geq 1$. First choose $m_j$ such that for all $n = 0, \dots, j$, we have $a^{m_j}_n = a_n$ and $|x^{m_j}_n - x_n| < \frac{1}{2j}$.
Next choose k j (depending on m j ) such that ℓ(m j , k j ) ≥ j and for all n = 0, . . . , j, we have b Finally, let ℓ j = j, and define Then {ℓ j } ∞ j=1 tends to infinity and {(z j , c j )} ∞ j=1 is a sequence of special finite labeled trajectories. Furthermore, for each n, we have that {(z j n , c j n )} ∞ j=1 converges to (x n , a n ) in [0, 1] × A. We have thus exhibited the necessary sequences to establish that (x, a) ∈ V .
As V is invariant under the left shift, we define σ V : V → V by letting σ V (x, a) = (σ(x), σ(a)).
In the proof of our main results, we use the joint space V as an intermediary between the spaces X and Σ M . To make this connection precise, we define factor maps from V onto each of X and Σ M . It is clear that φ is continuous and commutes with the left shift.
The following proposition asserts that φ preserves the entropy of ergodic measures. Its proof is an adaptation of the proof of [2, Theorem 4.1], and we provide it in Appendix A for completeness.

5.3. Factoring onto X.
Here we show that the joint space V from Definition 5.1 factors onto X.
Definition 5.7. Let F be in F with associated trajectory space X, SFT Σ M , and joint space V . Define the map π : V → [0, 1] Z + by the rule π(x, a) = x.
It is clear that π is continuous and commutes with the left shift. The following result shows that the image of π is contained in X.
Proof. Let n ≥ 0. Since (x, a) ∈ V , there exists y k n and y k n+1 such that lim k y k n = x n , lim k y k n+1 = x n+1 , and (y k n , y k n+1 ) ∈ G(a n ). Since G(a n ) is closed, we see that (x n , x n+1 ) ∈ G(a n ) for each n ≥ 0. Then by the definition of X, we have x ∈ X.
By Proposition 5.8, we have π : V → X. Next we establish that π in fact maps onto X. First, let V 0 ⊂ V be the set of points (x, a) ∈ V such that for each n ≥ 0, we have (x n , x n+1 ) ∈ G 0 (a n ). Proposition 5.9. For each x ∈ X, there exists a unique a ∈ Σ M such that (x, a) ∈ V 0 . In particular, π : V → X is surjective.
Proof. Let $x = (x_n)_{n=0}^{\infty} \in X$. Let $n \geq 0$. Since $(x_n, x_{n+1}) \in G(F)$ and $\{G_0(a) : a \in A\}$ is a partition of $G(F)$, there is a unique element $a_n \in A$ such that $(x_n, x_{n+1}) \in G_0(a_n)$. This uniquely defines a sequence $a = (a_n)_{n=0}^{\infty}$. Let us show that $a \in \Sigma_M$. Since $\Sigma_M$ is an SFT defined by the matrix $M$, it suffices to show that for each $n \geq 1$, we have $M(a_{n-1}, a_n) = 1$. Let $n \geq 1$. By construction, we have that $(x_{n-1}, x_n) \in G_0(a_{n-1})$ and $(x_n, x_{n+1}) \in G_0(a_n)$. Then $x_n \in R_0(a_{n-1})$ and $x_n \in D_0(a_n)$, and therefore $D_0(a_n) \cap R_0(a_{n-1}) \neq \emptyset$. By the Markov property, we conclude that $D_0(a_n) \subset R_0(a_{n-1})$, and therefore $M(a_{n-1}, a_n) = 1$, as desired.
Finally, note that (x, a) ∈ V . Indeed, for each k ≥ 1, let ℓ k = k, y k n = x n and b k n = a n . Then we have exhibited the necessary sequences to establish that (x, a) ∈ V .
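The coding step in this proof can be imitated on a finite trajectory prefix. The sketch below restricts attention to trajectories that avoid the partition points, for which the label of each step is unique, and then checks that every consecutive pair of labels is an allowed transition. The two-branch map (branch `a`: $x \mapsto 2x$ on $(0, 1/2)$, branch `b`: $x \mapsto 1 - x$ on $(1/2, 1)$), its transition table, and the function names are hypothetical illustrations, not objects defined in this paper.

```python
from fractions import Fraction as Fr

# Hypothetical branches: (left endpoint, right endpoint, map on the open interval).
BRANCHES = {
    "a": (Fr(0), Fr(1, 2), lambda x: 2 * x),
    "b": (Fr(1, 2), Fr(1), lambda x: 1 - x),
}
# Transition table for this example: only "b" followed by "b" is forbidden.
M = {("a", "a"): 1, ("a", "b"): 1, ("b", "a"): 1, ("b", "b"): 0}


def code(xs):
    """Label each step (x_n, x_{n+1}) of a trajectory prefix by its branch."""
    word = []
    for x, y in zip(xs, xs[1:]):
        labels = [s for s, (lo, hi, f) in BRANCHES.items()
                  if lo < x < hi and f(x) == y]
        if len(labels) != 1:
            raise ValueError(f"step ({x}, {y}) has no unique label")
        word.append(labels[0])
    return word


def is_allowed(word):
    """Check that every consecutive pair of labels is an allowed transition."""
    return all(M[(s, t)] == 1 for s, t in zip(word, word[1:]))


xs = [Fr(1, 5), Fr(2, 5), Fr(4, 5), Fr(1, 5), Fr(2, 5)]
word = code(xs)
print(word)
print(is_allowed(word))
```

Trajectories that visit the partition points are deliberately excluded here, since those are exactly the points at which a trajectory may admit more than one symbolic coding.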

Constructing the bad sets
Our aim is to show that under certain conditions, we can construct a Borel entropy conjugacy between X and Σ M . In this construction, we identify "bad sets", on which the Borel entropy conjugacy map will not be defined. The main source of difficulty in constructing our Borel entropy conjugacy arises from the fact that points in the trajectory space that stay in the critical set P can have multiple symbolic codings. In order to deal with this difficulty, we group such symbolic codings into the "bad sets" and show that we have only removed sets of strictly smaller entropy than the full system. In fact, we carry out this process for each irreducible component of Σ M separately. In the following section, we define the critical set of points in X that cause us difficulty. Then in the subsequent sections we analyze the irreducible components in detail and construct their bad sets.
6.1. The critical system. Let $F$ be in $\mathcal{F}$ with trajectory space $X$, SFT $\Sigma_M$, and joint system $V$. Consider the set of trajectories contained in the critical set $P$: $X_P = \{x \in X : x_n \in P \text{ for all } n \geq 0\}$. Note that $X_P$ is closed and invariant under $\sigma_X$. We refer to $X_P$ as the critical system. Now let $Z = \pi^{-1}(X_P) \subset V$, and note that $Z$ is closed and invariant under $\sigma_V$. As we mentioned above, one of the main difficulties in relating $X$ and $\Sigma_M$ lies in the fact that $\pi$ may not be injective on $Z$ (or its pre-images under the shift).
6.2. Irreducible components. We find it useful to distinguish between the following types of irreducible components for Markov multi-maps.
Definition 6.1. Let F be in F with associated SFT Σ M . Let C ⊂ A be an irreducible component of the M-graph. We say that • C is of Type III if it is not Type I or Type II.
Remark 6.2. Suppose $C$ is of Type III. Then for each $i \in \{0, 1, 2\}$, we must have $C \cap A_i \neq \emptyset$. In fact, there must exist allowable transitions in $C$ from $A_0$ to $A_2$ (cross-over), from $A_2$ to $A_1$ (into a vertical line), and from $A_1$ to $A_0$ (out of a vertical line).
Let $\Sigma_M(C)$ denote the irreducible component of $\Sigma_M$ corresponding to $C$. Note that $\Sigma_M(C)$ is an SFT contained in $\Sigma_M$. Also, for distinct irreducible components $C_1$ and $C_2$, we have that $\Sigma_M(C_1)$ and $\Sigma_M(C_2)$ are disjoint.

6.3. Constructing the bad sets: Types I and III. We now show the existence of our "bad sets" off of which $\phi$ and $\pi$ are injective. In the proof of the following proposition, we use the following immediate consequence of Lemma 4.5: if $(x, a), (x, b) \in V$ and $x_m \notin P$, then for each $n < m$, we have $a_n = b_n$. Also, for notation, for any word $w \in L$ and any $a \in \Sigma_M$, let $N_w(a) = \#\{n \geq 0 : \sigma^n(a) \in [w]\}$ denote the number of occurrences of $w$ in $a$.

Proof. First, suppose that $C$ is Type I. Since $F$ is in $\mathcal{F}$, it has a complete set of coding words, and we may select a coding word $u_c$ for $C$. Similarly, since $F$ is in $\mathcal{F}$, it has a complete set of avoiding words, and then since $C$ is of Type I, we may select an avoiding word $u_a$ for $C$. Now suppose that $C$ is Type III. Since $C$ is of Type III, it contains a word $u_a = w_0 w_1$ such that $w_0 \in A_0$ and $w_1 \in A_1 \cup A_2$. Note that $u_a$ is an avoiding word. Furthermore, since $C$ is of Type III, it contains a symbol $u_c \in A_1$. Note that $u_c$ is a coding word for $C$.
For the remainder of the proof, we do not distinguish between whether C is Type I or Type III.
Then let and let B = φ −1 (B 0 ). (Note that B 0 and B are invariant.) To establish (1), let (x, a) ∈ Pre(Z(C)). Then there exists N such that for each n ≥ N, we have x ∈ I σ n (a) ∩ P . Since u a is an avoiding word, we see that N u a (a) < ∞. It follows that a ∈ B 0 , and therefore (x, a) ∈ B. Now we establish (2). Let Y ⊂ Σ M be the SFT obtained by forbidding u c and u a . Since Σ M (C) is irreducible, it is entropy minimal.
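Hence $h_{\mathrm{top}}(Y, \sigma|_Y) < h_{\mathrm{top}}(\Sigma_M(C), \sigma|_{\Sigma_M(C)})$, as $Y$ is a strict subsystem of $\Sigma_M(C)$. Suppose $\mu$ is an ergodic measure on $\Sigma_M(C)$ such that $\mu(B_0) = 1$. As $\mu$ is ergodic and the words $u_c$ and $u_a$ appear only finitely often for points in $B_0$, we must have $\mu([u_c]) = \mu([u_a]) = 0$, and therefore $\mu(Y) = 1$. Then by the variational principle, we see that $h(\mu) \leq h_{\mathrm{top}}(Y, \sigma|_Y)$. Taking the supremum over all such $\mu$, we obtain $h_{\mathrm{prob}}(B_0) \leq h_{\mathrm{top}}(Y, \sigma|_Y) < h_{\mathrm{top}}(\Sigma_M(C), \sigma|_{\Sigma_M(C)})$. Furthermore, since $\phi$ preserves the entropy of ergodic measures (by Proposition 5.6), we obtain that $h_{\mathrm{prob}}(B) < h_{\mathrm{top}}(V(C), \sigma|_{V(C)})$.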
To show that φ is injective on V (C) \ B, let (x, a) ∈ V (C) \ B, and suppose (y, a) ∈ V (C) \ B. Then a / ∈ B 0 , and in fact σ n (a) / ∈ B 0 for all n ≥ 0. Let n ≥ 0. Then σ n (a) contains the word u c infinitely many times, and therefore I σ n (a) is a singleton (since u c is a coding word). Since we must have both x n ∈ I σ n (a) and y n ∈ I σ n (a) , we conclude that x n = y n . As n ≥ 0 was arbitrary, we have shown that φ is injective on For each m ∈ T , we have that x m ∈ I σ m (a) ⊂ [0, 1] \ P (since u a is an avoiding word). Since (x, a) ∈ V (C) \ B, the set T must be infinite. Let n ≥ 0. Since T is infinite, there exists m > n such that m ∈ T . Then x m / ∈ P . By Lemma 4.5, we see that a n = b n . As n ≥ 0 was arbitrary, we conclude that π is injective on V (C) \ B.
6.4. Constructing the bad sets: Type II. We do not need to remove any bad sets from Type II components. Indeed, the following proposition establishes that π and φ are injective on the union of all Type II components. Then π and φ are injective on V P .
Proof. Suppose (x, a), (x, b) ∈ V P . Then a n , b n ∈ A 2 for all n ≥ 0, and we must have G(a n ) = {(x n , x n+1 )} = G(b n ) for all n. Therefore a n = b n for all n ≥ 0, and π is injective on V P .
Suppose that (x, a), (y, a) ∈ V P . Then D(a n ) is a singleton for each n, and we must have {x n } = D(a n ) = {y n } for all n. Therefore x n = y n for all n ≥ 0, and φ is injective on V P .

Proof of the main result
Now that we have constructed the bad sets for each type of irreducible component, we are ready to prove our main result on Borel entropy conjugacy.
Proof of Theorem 1.1. Let F be in F with associated trajectory space X and SFT Σ M . Furthermore, let V be the associated joint space, as in Definition 5.1, and let φ : V → Σ M and π : V → X be the maps defined in Definitions 5.3 and 5.7, respectively.
Since F ∈ F , we have that Σ M has positive entropy. Enumerate the irreducible components with positive entropy: C 1 , . . . , C J . For each C j of Type I or Type III, let B j ⊂ V (C j ) be the bad set given by Proposition 6.3. For each C j of Type II, let B j = ∅. Furthermore, let .
Then let Proof. Suppose that (x, a), (x, b) ∈ V \ B. By (7.1), there exist i, j such that (x, a) ∈ A i and (x, b) ∈ A j . If x n / ∈ P for infinitely many n, then a = b by Lemma 4.5. Now suppose that x n ∈ P for all but finitely many n. Then there exists N such that σ N (x) ∈ X P . Hence (x, a), (x, b) ∈ Pre(Z), and therefore C i and C j must be of Type II (since A k ∩ Z = ∅ whenever C k is of Type I or Type III). Since π is injective on V P , we conclude that a = b.
Proof. Suppose that (x, a), (y, a) ∈ V \ B. By (7.1), there exists i, j such that (x, a) ∈ A i and (y, a) ∈ A j . Then a ∈ Σ M (C i ) ∩ Σ M (C j ). Since distinct irreducible components are disjoint, we see that i = j. Since φ is injective on A i , we conclude that x = y.
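Proof. First, by Proposition 5.8, we have that $\pi$ is a factor map. Since entropy cannot increase under a factor map, $h_{\mathrm{top}}(X, \sigma_X) \leq h_{\mathrm{top}}(V, \sigma_V)$. Choose $i$ such that $h_{\mathrm{top}}(V(C_i), \sigma|_{V(C_i)}) = h_{\mathrm{top}}(V, \sigma_V)$, and let $\mu$ be an ergodic measure on $V(C_i)$ such that $h(\mu) = h_{\mathrm{top}}(V(C_i), \sigma|_{V(C_i)})$ (which exists since $\Sigma_M(C_i)$ has a measure of maximal entropy and $\phi$ preserves entropy). Since $h_{\mathrm{prob}}(B_i) < h_{\mathrm{top}}(V(C_i), \sigma|_{V(C_i)}) = h(\mu)$, we must have that $\mu(B_i) = 0$. Then $\pi$ is injective on a set of full $\mu$-measure, and therefore $\pi$ is an isomorphism from $\mu$ to $\pi\mu = \mu \circ \pi^{-1}$. In particular, $h_{\mathrm{top}}(V, \sigma_V) = h(\mu) = h(\pi\mu) \leq h_{\mathrm{top}}(X, \sigma_X)$, where the last inequality follows from the Variational Principle. We have now shown that $h_{\mathrm{top}}(X, \sigma_X) = h_{\mathrm{top}}(V, \sigma_V)$.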
Proof. Consider A i . First suppose that C i is of Type II. Then A i = V (C i ), which is compact. Thus π(A i ) is also compact. In particular, π(A i ) is closed and hence Borel.
Now suppose that C i is of Type I or Type III. Then there exist words u and v in L(C i ) (in particular, a coding word and any avoiding word) such that Note that for each n, the sets [u] and [v] are compact. Hence π[u] and π[v] are compact and in particular closed. Then which shows that π(A i ) is Borel. Finally, since A X = ∪ i π(A i ), we conclude that A X is Borel. For each A i , we have σ(A i ) = A i , and therefore σ(π(A i )) = π(σ(A i )) = π(A i ). As A X = ⊔ i π(A i ), we see that σ(A X ) = A X .
Let ν be an ergodic invariant measure on X such that ν(X \ A X ) > 0. Since A X is invariant and ν is ergodic, we have that ν(X \ A X ) = 1, and therefore ν(A X ) = 0. Also, ν is supported on some set of the form π(B i ), with 1 ≤ i ≤ J. Let µ be an ergodic measure on V such that πµ = ν. Then µ(A) ≤ µ(π −1 π(A)) = ν(A X ) = 0, and µ(V (C i )) = 1. Therefore µ(B i ) = 1. Finally, we observe that h(ν) ≤ h(µ) ≤ h prob (B i ) ≤ max i h prob (B i ). Since the right-hand side is strictly less than h top (V, σ V ), which equals h top (X, σ X ) by Lemma 7.3, we conclude that h prob (X \ A X ) < h top (X, σ X ).
The following proposition may be quite easily deduced from the definitions, and we omit its proof.
Define ψ = π ∘ (φ| V \B ) −1 : A Σ → A X , which will serve as our Borel entropy conjugacy map.
Proposition 7.6. ψ is bijective, bi-measurable, and commutes with the left shift.
Proof. Taken together, Propositions 7.1 and 7.2 yield that ψ is bijective. Let E ⊂ A X be Borel. Then π −1 (E) ∩ (V \ B) is Borel. Also, since φ| V \B is an injective continuous map on the Borel set V \ B, it maps Borel sets to Borel sets. Therefore ψ −1 (E) = φ(π −1 (E) ∩ (V \ B)) is Borel, and so ψ is Borel measurable. An analogous argument shows that ψ −1 is also Borel measurable. Finally, since π and φ commute with the left shift, ψ also commutes with the left shift.
By the previous propositions, we conclude that ψ is the desired Borel entropy conjugacy between X and Σ M .

Sufficient conditions for F to be in F
Now that we have proved Theorem 1.1, we wish to highlight its utility by establishing some straightforward conditions that are sufficient for a Markov multi-map F to be in the family F . We focus on the case where there is an irreducible component C that contains all of A 0 . For single-valued functions, this condition amounts to the topological transitivity of the system.
Proof. Since F codes for points on C, by definition, we must have that I a is a singleton for all a ∈ Σ M (C). Therefore every word in L(C) is a coding word. To see that C also has an avoiding word, we consider two cases.
Case 1: Suppose A 0 is a strict subset of C. Then C must be a Type III component, and we showed in the proof of Proposition 6.3 that every Type III component has an avoiding word.
Case 2: Suppose A 0 = C. We have shown that C has a coding word, so there must exist a, b ∈ A 0 such that ab ∈ L(A 0 ) and I ab is a strict subset of I a = D(a). This would imply that D(b) is a strict subset of R(a).
By the definition of a Markov multi-map, D(b) is an interval between adjacent elements of the partition P . It follows that P partitions R(a) into at least two intervals, so there exist distinct elements p i , p j ∈ P such that [p i , p i+1 ] ∪ [p j , p j+1 ] ⊆ R(a). Then there must be b The interval I ab 1 is a strict subset of I a , so it contains at most one endpoint of D(a). Since A 0 is irreducible, there exists u ∈ L(A 0 ) such that ab 1 ua ∈ L(A 0 ). The interval I ab 1 ua is contained in I ab 1 , so it contains at most one endpoint of I a . Then I ab 1 uab 1 and I ab 1 uab 2 are non-overlapping, so at least one of them is disjoint from P . Therefore A 0 has an avoiding word.
Next we define what it means for F to be uniformly expanding on C, and we show that if that is the case, then F codes for points on C. Recall that for each a ∈ A, we have a well-defined function f −1 a : R(a) → D(a), and if u = a 0 · · · a n ∈ L, then we define $f_u^{-1} = f_{a_0}^{-1} \circ \cdots \circ f_{a_n}^{-1}$ on R(a n ).
Proof. Let 0 < λ < 1 be such that |(f −1 u ) ′ (x)| < λ for all u = a 0 · · · a N ∈ L N (C) and x ∈ R(a N ). Then ℓ(I u ) < λℓ(R(a N )) ≤ λ. It follows that for all k ≥ 1 and u ∈ L kN (C), we have ℓ(I u ) < λ k . Therefore F codes for points on C.
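The contraction estimate in the preceding proof can be checked numerically on a toy example. The following is a minimal sketch only: the two affine inverse branches below are hypothetical (they are the inverse branches of x ↦ 2x mod 1 on [0, 1], not a multi-map from this paper), every word is assumed to be allowed, and the names `BRANCHES`, `interval_for_word`, and `length` are ours.

```python
from fractions import Fraction

# Hypothetical affine inverse branches f_a^{-1} : R(a) -> D(a), stored as
# (slope, intercept). Both have R(a) = [0, 1]; D(0) = [0, 1/2], D(1) = [1/2, 1].
BRANCHES = {
    0: (Fraction(1, 2), Fraction(0)),
    1: (Fraction(1, 2), Fraction(1, 2)),
}

def interval_for_word(word):
    """Return I_u as (left, right) by applying the inverse branches of
    u = a_0 ... a_n to R(a_n), innermost branch (a_n) first."""
    lo, hi = Fraction(0), Fraction(1)  # R(a_n) = [0, 1] in this toy example
    for a in reversed(word):
        slope, intercept = BRANCHES[a]
        lo, hi = slope * lo + intercept, slope * hi + intercept
        if lo > hi:  # a decreasing branch would flip the endpoints
            lo, hi = hi, lo
    return lo, hi

def length(word):
    """Length of the interval I_u."""
    lo, hi = interval_for_word(word)
    return hi - lo
```

In this toy case each symbol halves the interval, so `length(u)` equals 2 to the power minus the number of symbols in u, which tends to 0 as the word grows; this is the behavior that "F codes for points on C" requires.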
By combining these results, we arrive at the following sufficient condition for F to be in F .
Corollary 8.5. Let F be a properly parametrized Markov multi-map with associated SFT Σ M . Suppose that h top (Σ M , σ M ) > 0, and furthermore there is an irreducible component C with A 0 ⊂ C. If F is uniformly expanding on C, then F ∈ F , and hence (X, σ X ) is entropy conjugate to (Σ M , σ M ).
Proof. By Lemma 8.4 and Lemma 8.2, the component C has a coding word and an avoiding word. Since A 0 ⊂ C, no irreducible component (except possibly C) could be contained in A 0 , so F has a complete set of coding words and a complete set of avoiding words. Thus F ∈ F , so by Theorem 1.1, (X, σ X ) is entropy conjugate to (Σ M , σ M ).

Realization of entropies
We now prove Theorem 1.5, which we restate here. Proof. It suffices to show that for any irreducible SFT with positive entropy, there is a Markov multi-map in F with the same entropy.
Let Σ M be an irreducible SFT with positive entropy associated with the n × n matrix M. Since Σ M has positive entropy, at least one row of M contains at least two ones. After possibly permuting the alphabet, suppose the first row has ones in columns k and k + 1.
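For readers who wish to experiment, the entropy of an irreducible SFT can be approximated from its matrix, using the standard fact that it equals the logarithm of the spectral radius of M. The sketch below is illustrative only: the function name `sft_entropy`, the iteration count, and the choice of power iteration on M + I are our assumptions, and the routine is intended only for irreducible 0-1 matrices with at least one nonzero entry.

```python
import math

def sft_entropy(M, iterations=2000):
    """Approximate log(spectral radius of M) for an irreducible 0-1 matrix M.

    Power iteration is run on M + I: for irreducible M this matrix is
    primitive, and its Perron eigenvalue is (spectral radius of M) + 1.
    """
    n = len(M)
    v = [1.0] * n
    growth = 1.0
    for _ in range(iterations):
        # w = (M + I) v
        w = [v[i] + sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        growth = max(w)                # converges to the Perron eigenvalue of M + I
        v = [x / growth for x in w]    # renormalize so that max(v) == 1
    return math.log(growth - 1.0)
```

For instance, the full shift on two symbols, `[[1, 1], [1, 1]]`, gives a value close to log 2, and the golden mean shift, `[[1, 1], [1, 0]]`, gives a value close to the logarithm of the golden ratio. A cyclic permutation matrix, in which every row has a single one, gives entropy 0, in line with the observation that positive entropy forces some row to contain two ones.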
Now we define a Markov multi-map F on the interval [1, n + 2] in terms of its graph. (To illustrate our construction, we give a specific matrix M in Example 9.1, and we show the graph of the corresponding multi-map in Figure 1.)
We have described the graph of F , but in order to show that it is a Markov multi-map in F we should specify the indexing set A and identify a coding word and an avoiding word. Let C 0 be a labeling of all of the straight lines that correspond to ones in the matrix M. (Recall that the cardinality of C 0 will be one less than the number of ones in M, because the ones in the (1, k) and (1, k + 1) entries correspond to just one straight line in the graph.) Then let B 0 be the additional straight lines whose ranges are all [n + 3/2, n + 2], and define A 0 = C 0 ∪ B 0 .
Each of the straight lines we considered has a bottom left endpoint and a top right endpoint. Let C 2 and B 2 be the collections of these left and right endpoints, respectively, and let A 2 = C 2 ∪ B 2 . Finally let
Then C 0 is a Type I irreducible component whose corresponding SFT has the same entropy as Σ M . To complete the proof, we show that if Σ(A) and Σ(C 0 ) are the SFTs associated with A and C 0 respectively, then h top (Σ(A)) = h top (Σ(C 0 )). Towards this end, we show that the symbols in B 0 , C 2 , and B 2 do not increase the entropy.
Let b 0 ∈ B 0 represent the straight line in [n + 3/2, n + 2] × [n + 3/2, n + 2]; then any b ∈ B 0 can only be followed by b 0 . This means h prob (Σ(A 0 )) = h prob (Σ(C 0 )). Now we consider C 2 and B 2 . Each of these individually follows nearly the same pattern as A 0 with only one difference. Let a * ∈ A 0 correspond to the straight line in [1, 1 + 1/2] × [k, k + 3/2], and let c * ∈ C 2 and b * ∈ B 2 correspond to the respective endpoints of this line. The symbol a * can be followed by any symbol whose domain is [k, k + 1/2] or [k + 1, k + 3/2]. On the other hand, c * can only be followed by points whose first coordinate is k, and b * can only be followed by points whose first coordinate is k + 3/2. The SFTs corresponding to C 2 and B 2 are disjoint from one another and are invariant. It follows that the entropy contributed by these sets is less than or equal to the entropy from C 0 .
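The claim that symbols which can only be followed by a single absorbing symbol do not increase entropy can be illustrated by counting words. This is a sketch under stated assumptions, not the construction from the proof: the matrices `CORE` and `EXTENDED` are hypothetical, with `CORE` playing the role of C 0 (here the full shift on two symbols), symbol 3 playing the role of b 0 , and symbol 2 playing the role of another element of B 0 ; the function names are ours.

```python
import math

def word_count(M, n):
    """Number of allowed words of length n (n >= 1) for the 0-1 matrix M."""
    size = len(M)
    counts = [1] * size  # counts[i] = number of words of length 1 starting at i
    for _ in range(n - 1):
        counts = [sum(M[i][j] * counts[j] for j in range(size))
                  for i in range(size)]
    return sum(counts)

def growth_rate(M, n):
    """(1/n) log of the number of words of length n."""
    return math.log(word_count(M, n)) / n

# Hypothetical core component: the full shift on symbols 0 and 1.
CORE = [[1, 1],
        [1, 1]]

# Same core, plus symbol 2 (reachable from 0, followed only by 3)
# and symbol 3 (followed only by itself).
EXTENDED = [[1, 1, 1, 0],
            [1, 1, 0, 0],
            [0, 0, 0, 1],
            [0, 0, 0, 1]]
```

Comparing `growth_rate(CORE, n)` and `growth_rate(EXTENDED, n)` for increasing n shows both approaching log 2: the extra symbols add words, but not enough of them to change the exponential growth rate.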
All of this shows that F is a Markov multi-map whose associated SFT has the same entropy as (Σ M , σ M ). It only remains to show that F ∈ F . The only irreducible component in A 0 with positive entropy is C 0 . We must show it has a coding word and an avoiding word. Once again let a * ∈ A 0 correspond to the straight line in [1, 1 + 1/2] × [k, k + 3/2]. The range R(a * ) is partitioned by P into three non-overlapping intervals, so every occurrence of a * in a word u decreases the length of I u by a factor of 3. It follows that a * is a coding word.
We can also use a * to construct an avoiding word. Let u = u 1 · · · u m ∈ L(C 0 ) be any word such that a * ua * ∈ L(C 0 ). The interval I a * u is a strict subset of [1, 1 + 1/2], so it contains at most one point of P . There is then an element b ∈ C 0 such that I a * ua * b is disjoint from P , and hence a * ua * b is an avoiding word. Therefore F ∈ F .
Using the method outlined in the proof of Theorem 1.5, this matrix would yield the graph pictured in Figure 1. Recall that we stipulated that there must be at least two adjacent 1s in the first row of the matrix. For the rest of the graph, note that if we rotate the matrix counterclockwise ninety degrees, then the pattern of 1s in the matrix matches the pattern of lines in the lower portion of the graph.

Examples
We show various examples demonstrating the utility of our results. We begin by showing that Theorem 1.1 generalizes the well-known result for the case that F is single-valued.
Example 10.1. Suppose F is any uniformly expanding (single-valued) Markov map. Then Corollary 8.5 recovers the well-known fact that F is entropy conjugate to its combinatorial SFT.
Next we give an example of a Markov multi-map that is not uniformly expanding but still satisfies the hypotheses of Theorem 1.1.  4)). This defines a Markov multi-map whose graph is pictured in Figure 2.
Then A 0 and A 2 are both irreducible components with A 0 (a Type I component) having greater entropy. The graph G(3) has slope 2, but the rest of the graphs G(1), G(2), and G(4) have slope 1, so F is not uniformly expanding on A 0 . However, 3 ∈ L 0 is a coding word, because each time the symbol 3 appears in a word u ∈ L, the length of the interval I u is divided in half. Also 331 ∈ L 2 is an avoiding word, because I 331 = [3/6, 7/12]. Thus F ∈ F , so by Theorem 1.1 (X, σ X ) is entropy conjugate to (Σ M , σ M ).
Next we show an example with a Type III irreducible component.  5), . . . , G(9) so that they are the endpoints of the graphs of G(1), . . . , G(4). The graph of this Markov multi-map is pictured in Figure 2.
In this case, two symbols from A 2 represent the points {(0, 0)} and {(1/2, 0)}. For simplicity, say these are G(8) and G(9). Then C = {1, . . . , 7} is a Type III irreducible component, which means it must have a coding word and an avoiding word. In this case, we can use u = 13 ∈ L 1 as both a coding and an avoiding word, because I 13 = {1/4}. Since there is no Type I component, we automatically have F ∈ F .
Finally we give an example that does not satisfy our hypotheses, and for which h top (Σ M , σ M ) is strictly greater than h top (X, σ X ).
Then the only non-trivial irreducible component is A 0 = {1, 2}, and Σ M (A 0 ) is the full shift on two symbols, which has entropy log 2. However, the only non-wandering points of (X, σ X ) are the fixed points (0, 0, . . .) and (1, 1, . . .), so h top (X, σ X ) = 0.
Note that I u = [0, 1] for all u ∈ A 0 , so this multi-map has neither coding words nor avoiding words. Thus F ∉ F , and Theorem 1.1 does not apply.
Appendix A. Proof of Proposition 5.6
Here we aim to prove Proposition 5.6. First, we recall the result of Katok [16] relating the measure-theoretic entropy of an ergodic measure to Bowen balls. Consider a compact metric space (X , d) and a continuous transformation T : X → X . For n ≥ 1, define the metric d n on X by setting
$$d_n(x, y) = \max\{\, d(T^k(x), T^k(y)) : k = 0, \dots, n - 1 \,\}.$$
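To make the Bowen metric concrete, the following minimal sketch evaluates d n for a sample system. The tent map on [0, 1] with the usual distance is our illustrative assumption (in the proof below the relevant system is the shift on V ), and the names `tent` and `bowen_distance` are hypothetical.

```python
from fractions import Fraction

def tent(x):
    """Tent map on [0, 1], used here only as an example transformation T."""
    return 1 - abs(2 * x - 1)

def bowen_distance(T, x, y, n):
    """d_n(x, y) = max over k = 0, ..., n - 1 of |T^k(x) - T^k(y)|."""
    best = abs(x - y)
    for _ in range(n - 1):
        x, y = T(x), T(y)
        best = max(best, abs(x - y))
    return best
```

For example, the points 0 and 1/8 are at distance 1/8 in d 1 , but their orbits separate under iteration, so the distance in d n grows with n. An (n, ε) ball centered at x is the set of points y with d n (x, y) < ε.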
Let us now prove that the factor map φ : V → Σ M preserves the entropy of all ergodic measures.
Proof of Proposition 5.6. As entropy cannot increase under factor maps, we have h(ν) ≤ h(µ). To complete the proof, we establish the reverse inequality. Fix α ∈ (0, 1). For n ≥ 1, let r(n, α) denote the minimal cardinality of a set of words W ⊂ L n such that
Let ε > 0. For n ≥ 1, let s(n, ε, α) denote the minimal cardinality of a collection U of (n, ε) balls in V such that $\mu\bigl(\bigcup_{U \in \mathcal{U}} U\bigr) \ge \alpha$. Now let n ≥ 1. Select a set W = {w 1 , . . . , w K } ⊂ L n with cardinality K = r(n, α) and satisfying (A.1). By the construction given in the proof of [2, Theorem 4.1], for each k, there exists a collection U k of (n, ε) balls in V such that
and $|\mathcal{U}_k| \le (n + 1)\left(\frac{1}{\epsilon} + 1\right)$.
Taking the limit superior of this inequality as n tends to infinity yields
$$\limsup_{n \to \infty} \frac{1}{n} \log s(n, \epsilon, \alpha) \le \limsup_{n \to \infty} \frac{1}{n} \log r(n, \alpha).$$