On the non-equivalence of the Bernoulli and K properties in dimension four

We study skew products where the base is a hyperbolic automorphism of $\mathbb{T}^2$, the fiber is a smooth area-preserving flow on $\mathbb{T}^2$ with one fixed point (of high degeneracy) and the skewing function is a smooth non-coboundary with non-zero integral. The fiber dynamics can be represented as a special flow over an irrational rotation and under a roof function with one power singularity. We show that for a full measure set of rotations the corresponding skew product is $K$ and not Bernoulli. As a consequence we get a natural class of volume-preserving diffeomorphisms of $\mathbb{T}^4$ which are $K$ and not Bernoulli.


Introduction
Bernoulli shifts provide the simplest model for chaotic behaviour of dynamical systems in the measurable category. Ornstein showed in [18] that the entropy of a Bernoulli system is the only invariant up to measurable conjugacy. Another natural property describing chaotic systems is the Kolmogorov property, or K-property for short. It follows from [24] that the K-property is equivalent to completely positive entropy: each non-trivial factor of the system has positive entropy. It is an easy observation that Bernoulli shifts are K. Kolmogorov conjectured that the converse is also true. This was shown to be false by Ornstein, who constructed a K-automorphism which is not Bernoulli [19]. Ornstein's construction, however, was done through a certain specialized combinatorial procedure. In particular, it did not indicate whether one should expect equivalence of the properties in any category. Since the introduction of these properties, much effort has been put into understanding how often to expect equivalence of these properties, and into finding more natural examples: both in the sense of finding families of examples in the measurable category, and finding examples in the smooth category.
A natural place to search for such transformations is skew products over Bernoulli shifts. Recall that for $T : (X, \mathcal{B}, \mu) \to (X, \mathcal{B}, \mu)$, $S : (Y, \mathcal{C}, \nu) \to (Y, \mathcal{C}, \nu)$ and $\phi : X \to \mathbb{Z}$, the skew product over the base $T$ with fiber $S$ and with cocycle $\phi$ is given by $T^{S}_{\phi}(x, y) = (Tx, S^{\phi(x)}y)$.
Notice that one can take the fiber to be a flow $(S_t)$ and the cocycle $\phi$ to take values in $\mathbb{R}$, and obtain more general skew products of the form
$$T^{(S_t)}_{\phi}(x, y) = (Tx, S_{\phi(x)}(y)). \tag{1}$$
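The skew product over a base map with a flow in the fiber can be sketched numerically. The following is a minimal illustration only: the base `cat_map` (the automorphism given by the matrix [[2, 1], [1, 1]]), the fiber `rotation_flow` on the circle and the cocycle `phi` are hypothetical choices made for the example, not the objects studied in this paper.

```python
import math

def cat_map(x):
    """Hyperbolic automorphism of the 2-torus given by the matrix [[2, 1], [1, 1]]."""
    a, b = x
    return ((2 * a + b) % 1.0, (a + b) % 1.0)

def rotation_flow(t, y):
    """Illustrative fiber flow on the circle: y -> y + t (mod 1)."""
    return (y + t) % 1.0

def phi(x):
    """Illustrative smooth cocycle on the 2-torus (an assumption for this sketch)."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * x[0])

def skew_product(base, flow, cocycle):
    """Return the map (x, y) -> (base(x), flow(cocycle(x), y))."""
    def step(point):
        x, y = point
        # The fiber point moves along the flow for time cocycle(x),
        # where x is the base point BEFORE the base map is applied.
        return base(x), flow(cocycle(x), y)
    return step

step = skew_product(cat_map, rotation_flow, phi)
point = ((0.1, 0.2), 0.3)
for _ in range(3):
    point = step(point)
```

With a rotation flow in the fiber, the fiber coordinate after n steps is simply the initial coordinate shifted by the Birkhoff sum of the cocycle along the base orbit, which gives an easy sanity check for an implementation.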
Since every positive entropy system has a Bernoulli factor of the same entropy [28], the skew-product construction is in some sense universal in the measurable category. However, to include all possible transformations, one in general must take cocycles $\phi : X \to \mathrm{Aut}(Y)$ for some Lebesgue space $Y$. In the skew-product setup, one may analyze the entropy by looking at the contributions from the base and the fiber. Indeed, the Abramov-Rokhlin entropy formula [1] shows that $h(T^{(S_t)}_{\phi}) = h(T) + h(S)\int \phi$. This observation also shows that if the Bernoulli base is also a Bernoulli factor with maximal entropy, the fiber entropy must be 0. This can occur for transformations of type (1) in two ways: by having 0 average for the cocycle $\phi$ (as in the $(T, T^{-1})$ examples), or by having a 0 entropy fiber $S$. One may, of course, take positive entropy fibers with $\int \phi > 0$, but the Bernoulli base will fail to be a candidate for isomorphism. In [10], J. Feldman introduced the notion of a loosely Bernoulli process to build skew-product examples of K but not Bernoulli automorphisms.
His key criterion is that if T is loosely Bernoulli then S should be loosely Bernoulli. In this case he used as base dynamics the shift on two symbols with equal mass and φ is the 0, 1 function, so the integral is positive. The fiber map S is what is now called a standard map: a 0 entropy, loosely Bernoulli system. Later, Kalikow used the loosely Bernoulli property to show that the (T, T −1 )-transformation is K but not Bernoulli in [11].
If one wishes to exhibit examples in the smooth category along the same construction, one can easily replace the Bernoulli base with a smooth realization: toral automorphisms. However, since the cocycle $\phi$ must also be smooth, one must take the fiber system to be a smooth flow (and not a transformation). Such examples were first obtained by Katok in [12] using a skew product of the form (1). There, $T$ is an Anosov diffeomorphism, $(S_t)$ is a smooth ergodic flow, and $\phi$ is a positive cocycle which is not cohomologous to a constant. The principal achievement of [12] was showing that the skew product is K with a general cocycle $\phi$. The Feldman criterion was then used to produce the first smooth examples of K systems which are not Bernoulli: if $(S_t)$ is not loosely Bernoulli, the skew product is not Bernoulli. One may take, for example, $S_t = h_t \times h_t$ where $h_t$ is the horocycle flow on the unit tangent bundle to a surface of constant negative curvature. This example has the advantage of having very explicit formulas, but also the disadvantage of being 8-dimensional. While not written formally at the time, it was remarked that one could obtain an example in dimension 5. This example can be constructed in the following way. The approximation-by-conjugation method developed by Anosov and Katok can produce smooth realizations of certain models in the measurable category [3]. This yields a non-loosely Bernoulli transformation $S : \mathbb{T}^2 \to \mathbb{T}^2$, following the combinatorial construction of Feldman [10] (see [14] or [6] for a modern treatment of this construction). Taking the constant-time suspension of this transformation then yields a non-loosely Bernoulli flow in the fiber. The skew product and the direct product of the special flow with base $T$ and roof $\phi$ and $S$ are sections of the same smooth flow, and hence Kakutani equivalent. As a result, $T^{(S_t)}_{\phi}$ is not loosely Bernoulli, and is K by [12].
Later Rudolph, using methods similar to the Kalikow $(T, T^{-1})$ example, obtained other smooth examples in dimension 5 [25]. In the class considered in [25], one can take $(S_t)$ to be the geodesic flow and $\phi$ some smooth cocycle which is not a coboundary such that $\int \phi = 0$. On the other hand, by Pesin theory, smooth K automorphisms in dimension 2 are Bernoulli [20].
It is therefore natural to ask about the equivalence of the K and Bernoulli properties in dimensions 3 and 4. The case of dimension 3 is more difficult, as skew-product constructions respecting the smooth structure would require a circle fiber. Since we assume the system is measure-preserving, the action on fibers must be isometric. Such skew-product extensions are always Bernoulli [27]. On the other hand, in the context of volume-preserving partially hyperbolic automorphisms on 3-dimensional manifolds with central foliation by circles, the accessibility property implies the K-property (see [23] for definitions). Among these systems, if the central exponent is non-zero, by Y. Pesin [20] the system is Bernoulli. If the central exponent is zero but the central foliation is absolutely continuous, then the system is again Bernoulli by an application of the results in [5] and the aforementioned result of [27]. The remaining case is therefore that of 3-dimensional partially hyperbolic, volume-preserving systems with the accessibility property, a non-absolutely continuous compact central foliation and zero central exponent. G. Ponce, A. Tahzibi and R. Varão later conjectured that not all such systems are Bernoulli [21].
We summarize the main developments for examples of K, non-Bernoulli systems here, in chronological order:

Example          LB Fiber    Fiber Entropy    Smooth    ∫ϕ
Ornstein [19]    N/A         N/A              No        N/A
Feldman [10]     No          0                No        ≠ 0
Katok [12]       No          0                Yes       ≠ 0
Burton [8]       Yes

Another interesting property to investigate is whether the examples are loosely Bernoulli. Following [10], most proofs actually prove that the skew product is not loosely Bernoulli. In fact, the only example above which does not show that the skew product is not loosely Bernoulli is [8].
In those examples and ours, the fiber is loosely Bernoulli. We do not have reasonable evidence either way to conjecture whether the total map is loosely Bernoulli.

Statement of Main Results
We show that there is a class of flows on surfaces $(S_t)$ which, when used in transformations of the form (1) over a Bernoulli base, yield K, non-Bernoulli examples. In particular, we produce the first such examples in dimension 4. For the fiber, we take a smooth flow $(K_t)$ on $(\mathbb{T}^2, \lambda)$ with a single singularity with a high order of degeneracy (we will also call these flows Kochergin flows). More precisely, we take a time change of a linear flow with a Hamiltonian which at the point of degeneracy is locally given by $(ax + by)(x^2 + y^2)^l$ for some sufficiently large $l$. Such a flow can be represented as a special flow over the irrational rotation by $\alpha$ and under a roof function $f \in C^2(\mathbb{T} \setminus \{0\})$, $f > 0$, which satisfies conditions (2), (3) and (4), where $\eta$ is a small number (since the order of degeneracy of the saddle is high), $\eta \in (0, 1/100)$, and $0 < M_1, N_1, R_1 < +\infty$. For more on the connection between smooth flows on surfaces and special flows over rotations (or more generally, interval exchange transformations), see [16, 9]. We will denote the flow given by $f, \alpha$ by $(K^{f,\alpha}_t)$ (or, if $f$ and $\alpha$ are fixed and understood, we will often simply write $(K_t)$). Finally, the cocycle $\phi : \mathbb{T}^2 \to \mathbb{R}$ is a smooth function such that $\phi_0 = \int_{\mathbb{T}^2} \phi \, d\lambda \neq 0$ and such that the cohomological equation (5) has no continuous solutions $\psi$. We say that $\alpha \in D$ if and only if there exists a constant $C = C(\alpha)$ such that for any $p, q \in \mathbb{Z}$, $q > 0$,
$$\left| \alpha - \frac{p}{q} \right| \geq \frac{C}{q^2 \log^{11/10} q}.$$
It is classical that the set $D$ has full Lebesgue measure on $\mathbb{T}$ (see, e.g., [15]). With the above notation, our main result is the following:

Theorem 1. If $\alpha \in D$, then the skew product $A^{(K_t)}_{\phi}$ over a hyperbolic automorphism $A$ of $\mathbb{T}^2$ is K and is not Bernoulli.
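To get a feeling for the diophantine condition defining $D$, one can evaluate the quantity $|\alpha - p/q| \, q^2 \log^{11/10} q$ along continued fraction convergents. The sketch below is only a heuristic check under stated assumptions: it tests convergents rather than all pairs $(p, q)$, it uses the golden-mean rotation number $[0; 1, 1, 1, \ldots]$ as an example, and it approximates $\alpha$ by a deep convergent. The helper names `convergents` and `diophantine_constant` are hypothetical.

```python
import math
from fractions import Fraction

def convergents(partial_quotients):
    """Yield the convergents p/q of the continued fraction [a0; a1, a2, ...]."""
    p_prev, q_prev = 1, 0
    p, q = partial_quotients[0], 1
    yield Fraction(p, q)
    for a in partial_quotients[1:]:
        # Standard recurrence: p_n = a_n * p_{n-1} + p_{n-2}, same for q_n.
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        yield Fraction(p, q)

def diophantine_constant(alpha, approximants, exponent=1.1):
    """Minimum of |alpha - p/q| * q^2 * (log q)^exponent over the given p/q with q >= 2."""
    best = math.inf
    for frac in approximants:
        q = frac.denominator
        if q < 2:
            continue  # log q must be positive
        value = float(abs(alpha - frac)) * q * q * math.log(q) ** exponent
        best = min(best, value)
    return best

# Golden-mean rotation number, approximated by a deep convergent (an assumption
# of this sketch; the error is far below the scales tested afterwards).
ALPHA = list(convergents([0] + [1] * 60))[-1]
LOW_ORDER = list(convergents([0] + [1] * 25))
constant = diophantine_constant(ALPHA, LOW_ORDER)
```

For this rotation number the tested quantity stays bounded away from zero along the convergents, which is consistent with membership in $D$; the computation is an illustration and not a proof.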
By the results in [12], since (5) has no continuous solution, $A^{(K_t)}_{\phi}$ is K. Here, since the base dynamics is a toral automorphism, there is no need to assume that the flow $(K_t)$ is weakly mixing (though this is the case for Kochergin flows, [16]). Therefore, to prove Theorem 1 one has to show that $A^{(K_t)}_{\phi}$ is not Bernoulli. We will in fact show that $A^{(K_t)}_{\phi}$ is not VWB (very weak Bernoulli, see Definition 3.1) with respect to some convenient partition.

Some Observations on Fiber Complexity
Let us make a few remarks highlighting the differences between our approach and those of [11], [25] or [12]. First, thanks to Pesin formula [20], smooth surface flows always have 0 entropy. In fact, the orbit growth is polynomial and therefore the methods of [11], [25] showing the non-Bernoulli property will fail for us. On the other hand every smooth surface flow is standard and therefore has a section on which the first return map is an interval exchange transformation (which is loosely Bernoulli, see [13]). In summary, our example is the first which uses a smooth standard fiber transformation. Such a fiber transformation was constructed by [8] in the measurable category. However, like the original example of Ornstein, it was created exclusively for the purpose of having a loosely Bernoulli fiber, and did not arise naturally.
Let us also emphasize another important consequence of Theorem 1. It was suggested in [29] to consider skew products over Bernoulli transformations, and to study which fiber transformations result in a Bernoulli skew product. With elliptic fibers (very slow or no orbit growth), one expects the skew product to remain Bernoulli. This is shown to be true in [2], where the fiber is an irrational rotation, and in [7], where the fiber is a mixing rank one system with slow orbit growth. With slightly higher complexity, namely weakly mixing systems which admit good cyclic approximations, one often finds non-Bernoulli extensions. This was the original method of Feldman and Katok. Benhenda was able to use these methods to find an uncountable class of non-isomorphic extensions [6]. From [11], [25] one can deduce that if the fiber is hyperbolic (i.e., has exponential orbit growth or, more explicitly, positive entropy) and $\int \phi = 0$, then the corresponding skew product does not remain Bernoulli. This is also confirmed in a recent work of Austin [4], where another uncountable family of non-isomorphic skew products with Bernoulli base is constructed.
However, nothing was known for fibers with polynomial or intermediate growth, i.e. parabolic fibers. Kochergin flows are parabolic in this sense, so one may consider Theorem 1 as a first step towards the study of skew products with parabolic fibers. It also suggests that such skew products do not remain Bernoulli. The methods used to prove Theorem 1 use tools with properties that generalize readily. The authors plan to continue the investigation of skew products with parabolic fibers in a subsequent paper, in particular the case of horocycle flows, their time changes and time changes of nilflows.

Plan of the paper.
In Section 2 we give definitions of very weak Bernoulli (VWB), Kochergin special flows and give some Denjoy-Koksma estimates for a Kochergin roof function. In Section 3 we first give a characterization of (VWB) for zero entropy extensions of Bernoulli systems (see Proposition 3.2) then we state Theorem 2 which contains the main combinatorial properties. Finally in Subsection 3.3 we use Proposition 3.2 to show that Theorem 2 implies Theorem 1.
A proof of Theorem 2 is then given in Section 4 conditionally on Proposition 4.2. The rest of the paper is devoted to proving 1.-6. in Proposition 4.2. Section 5 is devoted to the construction of $B$ and $\{C_y\}_{y \in M}$ from Proposition 4.2. The set $B$ is constructed by an Egorov's theorem type of reasoning, as we want good control on $\phi$ (given by the CLT) and on $f$ (not coming too close to the singularity). For $y \in M$, $C_y$ is the set of points whose orbits do not come close to $y$ in a short time. Properties 1. and 2. will then follow automatically from the construction. Property 5. is a straightforward consequence of Lemma 5.3 (which is a consequence of the diophantine assumptions). The construction of $C_y$ then gives Properties 3. and 6. (see Subsection 5.4).
The most difficult is 4., which is handled separately in Section 6. It requires vertical stretch for nearby points. This is guaranteed by $f'_n$ being of order $n^{2-\eta}$. This does not happen for all times, since we have cancellations (the roof is symmetric). Proposition 5.2 however shows that $f'_n$ is of the correct order for most times (with a polynomial gain). Finally, in Section 7, using probabilistic tools, we prove Proposition 5.2.

Special flows
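Let $R_\alpha$ denote the rotation of $\mathbb{T}$ by an irrational $\alpha$ with sequence of denominators $(q_n)_{n=1}^{\infty}$, and let $\psi \in L^1(\mathbb{T}, \mathcal{B}, \lambda)$ be a strictly positive function. We recall that the special flow $T_t := T^{\alpha,\psi}_t$ constructed above $R_\alpha$ and under $\psi$ is given by
$$(\mathbb{T} \times \mathbb{R})/\!\sim \; \to \; (\mathbb{T} \times \mathbb{R})/\!\sim, \qquad (y, s) \mapsto (y, s + t),$$
where $\sim$ is the identification
$$(y, s + \psi(y)) \sim (R_\alpha y, s). \tag{6}$$
Equivalently, this special flow is defined for $t + s \geq 0$ (with a similar definition for negative times) by
$$T_t(y, s) = \bigl(y + N(y, s, t)\alpha, \; t + s - \psi_{N(y,s,t)}(y)\bigr), \tag{7}$$
where $N(y, s, t)$ is the unique integer such that
$$0 \leq t + s - \psi_{N(y,s,t)}(y) < \psi\bigl(y + N(y, s, t)\alpha\bigr)$$
and $\psi_n(y) = \psi(y) + \ldots + \psi(R_\alpha^{n-1} y)$ for $n > 0$, with $\psi_0 = 0$.

Kochergin flows under consideration
The flows which we will consider are special flows over an irrational rotation by $\alpha$ and under a roof function $f$ satisfying (2), (3), (4) with $\eta \in (0, 1/100)$. To simplify notation we assume that $M_1 = N_1 = R_1 = 1$ and that $\int_{\mathbb{T}} f \, d\lambda = 1$. We moreover assume that $\alpha \in D$, i.e. $\alpha$ is not too well approximated by rationals. We will denote the space on which the Kochergin flow $(K_t)_{t \in \mathbb{R}}$ acts by $(M, \mu)$. Notice that $M$ carries a natural metric $d$, in which $d_H(y, y') = \|y - y'\|$ is the horizontal distance. For simplicity we will often denote $(y, s)$ and $N(y, s, t)$ respectively by $y$ and $N(y, t)$. For a set $W \subset \mathbb{T}$ we denote $W^f := \{(y, s) \in M : y \in W\}$.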
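The definition of a special flow over a rotation translates directly into a short procedure: add the time to the height, then cross the roof (forwards or backwards) until the height lies in the fundamental domain again. The sketch below is illustrative only. The function name `special_flow` and the bounded example roof `smooth_roof` are assumptions made for the example; a Kochergin roof has a power singularity and would need more careful numerics near the singular point.

```python
import math

def special_flow(alpha, roof, y, s, t):
    """Move the point (y, s), with 0 <= s < roof(y), for time t.

    Returns (y_new, s_new, n), where n is the signed number of roof crossings.
    The roof function is assumed to be strictly positive.
    """
    height = s + t
    n = 0
    # Forward crossings: leave through the roof, re-enter at the base over y + alpha.
    while height >= roof(y):
        height -= roof(y)
        y = (y + alpha) % 1.0
        n += 1
    # Backward crossings: leave through the base, re-enter under the roof over y - alpha.
    while height < 0:
        y = (y - alpha) % 1.0
        height += roof(y)
        n -= 1
    return y, height, n

def smooth_roof(y):
    """Bounded illustrative roof function (not a Kochergin roof)."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * y)

GOLDEN = (math.sqrt(5) - 1) / 2
y1, s1, n1 = special_flow(GOLDEN, smooth_roof, 0.2, 0.1, 3.0)
```

A useful consistency check for such an implementation is the group property: flowing for time 3 and then for time 2 should agree with flowing for time 5, and flowing back for time -3 should return to the starting point, up to rounding.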

Very weak Bernoulli for skew products and a theorem which implies Theorem 1
Our strategy for showing that a transformation $T$ is not Bernoulli is to disprove the very weak Bernoulli property with respect to a convenient partition $R$. The definition has the advantage that if $T$ is Bernoulli, then it has the very weak Bernoulli property for every partition. The results of this section follow [27] and [26]. Let $T : (X, \mu) \to (X, \mu)$ be a measurable transformation preserving a probability measure $\mu$, and let $R$ be a finite partition of $X$. The $R$-name of $x$ is the sequence of atoms of $R$ which the orbit of $x$ determines; we identify cylinder sets with the corresponding elements of the joins of the partitions $T^i R$. Let us define the very weak Bernoulli property following [26, p. 134].

Definition 3.1. The transformation $T$ is very weak Bernoulli with respect to $R$ if for every $\varepsilon > 0$ there exists some $N \in \mathbb{N}$ and a measurable set $G \subset \bigvee_{i=0}^{\infty} T^i R$ (meaning it is measurable with respect to this partition) such that $\mu(G) > 1 - \varepsilon$ and for every pair of atoms $r, \bar{r} \subset G$ of $\bigvee_{i=0}^{\infty} T^i R$, there is a measure-preserving map $\Phi_{r,\bar{r}} : r \to \bar{r}$ and a set $L \subset r$ satisfying the matching conditions of [26, p. 134]. Here $\mu_r$ stands for the conditional measure of $\mu$ along the atom $r$ of the partition $\bigvee_{i=0}^{\infty} T^i R$.

To help the reader unwrap the definitions, we point out the structures above in the case of a Bernoulli transformation with the partition into length 1 cylinders. Points lie in the same atom of $\bigvee_{i=0}^{\infty} T^i R$ if and only if they share the same present and past. We may take $G$ to be the full shift. Given two atoms $r, \bar{r}$ (i.e., two pasts), there is an obvious map which simply re-assigns the pasts and presents of points and matches not only most codes of the orbit, but all of them. So we interpret the VWB condition as the future having arbitrarily small dependence on the past and present.
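Conditions of very weak Bernoulli type compare the names of two points coordinate by coordinate. As a rough illustration, the sketch below computes the name of a point with respect to a finite partition and the fraction of times at which two names disagree. This normalized Hamming distance is an assumption made for the example and is not quoted from the definition above; the helper names `name` and `mismatch_fraction`, the rotation by 1/4 and the two-atom partition are likewise hypothetical.

```python
from fractions import Fraction

def name(x, transformation, atom_of, length):
    """First `length` symbols of the partition name of x along its forward orbit."""
    symbols = []
    for _ in range(length):
        symbols.append(atom_of(x))
        x = transformation(x)
    return symbols

def mismatch_fraction(name_a, name_b):
    """Fraction of coordinates at which two names of equal positive length differ."""
    if len(name_a) != len(name_b) or not name_a:
        raise ValueError("names must have the same positive length")
    mismatches = sum(1 for a, b in zip(name_a, name_b) if a != b)
    return Fraction(mismatches, len(name_a))

def quarter_rotation(x):
    """Illustrative transformation: rotation of the circle by 1/4 (exact arithmetic)."""
    return (x + Fraction(1, 4)) % 1

def half_circle_atom(x):
    """Illustrative partition of the circle into [0, 1/2) and [1/2, 1)."""
    return 0 if x < Fraction(1, 2) else 1

first = name(Fraction(0), quarter_rotation, half_circle_atom, 8)
second = name(Fraction(1, 4), quarter_rotation, half_circle_atom, 8)
distance = mismatch_fraction(first, second)
```

In this toy example the two orbits are a fixed distance apart forever, so their names disagree on a fixed proportion of times; for a Bernoulli system one instead looks for a measure-preserving matching of futures that makes this proportion small.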

Zero Entropy Extensions of Bernoulli Systems
Definition 3.1 can be difficult to work with in general, but in the case of skew products it can be simplified by making a few assumptions. Suppose that $S$ is a Bernoulli transformation, so that $S$ is the shift on $\Sigma_d$ with probabilities $(p_1, \ldots, p_d)$, $\phi : \Sigma_d \to \mathbb{R}$ is a measurable function, and $(K_t) : M \to M$ is a zero entropy flow preserving a measure $\nu$ on $M$. We now assume that $X = \Sigma_d \times M$ and $T : X \to X$ takes the following form:
$$T(x, y) = (Sx, K_{\phi(x)} y). \tag{11}$$
Let $P = P_d$ be the partition of $\Sigma_d$ into cylinders $[i]_0$, $i = 1, \ldots, d$, and let $Q$ be a finite partition of $M$ such that $P_d \times Q$ is generating for $T$ (see Lemma 3.4). Here we denote by $P_d \times Q$ the partition into products of elements of $P_d$ and $Q$.

The following proposition is adapted from [27], where it was phrased only for the case $d = 2$ and the $(1/2, 1/2)$ measure on $\Sigma_2$ (although our deduction of this equivalence is virtually identical):

Proposition 3.2. Assume $T$ takes the form (11) with $K$ of zero entropy and that $Q$ is a partition of $M$ such that $P_d \times Q$ is generating for $T$. Then $T$ is very weak Bernoulli with respect to $P_d \times Q$ if and only if for every

The following is analogous to [27, Lemma 1] and will be used to prove Proposition 3.2.
Lemma 3.3. Assume T takes the form (11), where K t is of zero entropy and let Q be a partition such that P d × Q is generating for T . Then there is a set of full measureX ⊂ X such that if r is an atom of the past partition To be more precise for a.e. x ∈ Σ d we have that for almost every x. Since P d × Q is generating we have that the right hand side is a single point and the proof follows. The next lemma shows that a finite partition Q such that P d × Q is generating for T always exists.
Lemma 3.4. Assume T is ergodic, K has finite entropy and ϕ ∈ L ∞ . Then there is a finite partition Q of M and a subsetX of X of full measure such that (P d × Q) ∩X is a generating partition for T .
Observe that in our situation $K$ is of zero entropy. The proof of this lemma is contained in the Appendix.

Smooth Dynamics and Bernoulli Shifts
In this section, we relate the combinatorial structures above with smooth dynamical systems. Let $A : \mathbb{T}^2 \to \mathbb{T}^2$ be a hyperbolic toral automorphism. Every such automorphism has a Markov partition, which induces an almost one-to-one semiconjugacy $h : \Sigma_A \to \mathbb{T}^2$ taking a Markov measure on the subshift of finite type $\Sigma_A$ to Lebesgue measure on $\mathbb{T}^2$.
Since it is almost one-to-one, it is also a measurable conjugacy (not just semiconjugacy, as it is in the topological world). Furthermore, this map is Hölder continuous, so the pullback of any Hölder (in particular, smooth) function on T 2 will remain Hölder in Σ A .
Since any subshift of finite type is also measurably isomorphic to a Bernoulli shift, we may further get a measurable isomorphism $h' : \Sigma_d \to \Sigma_A$ taking a Bernoulli measure with weights $(p_1, \ldots, p_d)$ to the desired Markov measure on $\Sigma_A$.

Remark 3.5. The map $h'$ is only measurable, so the pullback of a Hölder continuous function $\phi$ on $\Sigma_A$ or $\mathbb{T}^2$ may (and most likely will) fail to be Hölder. However, any statistical asymptotics established for a Hölder continuous function on $\Sigma_A$ or $\mathbb{T}^2$ also hold for its pullback by $h'$, since the dynamics and measures are both intertwined.

A combinatorial result which implies Theorem 1
For a partition Q of M , (x, y), (x , y ) ∈ T 2 × M and N ∈ N let (12) We will show the following: There exist a set B ⊂ T 2 × M , µ × ν(B) > 9/10, a measurable family of sets (C y ) y∈M , C y ⊂ M , ν(C y ) 9/10 and a partition Q of M such that for every (x, y) ∈ B, (x , y ) ∈ B ∩ T 2 × C y and for every N ∈ N D Q N (x, y), (x , y ) < 9/10.
Theorem 2 is the most important combinatorial result of the paper. Before we prove Theorem 2, let us show how it implies Theorem 1.
Proof of Theorem 1. We begin by using the observations of Section 3.2 to write the system $A^{(K_t)}_{\phi}$ as the skew product $T$ of a zero-entropy dynamical system over a Bernoulli shift on $\Sigma_d$. We may then pull back the structures of Theorem 2 to these more convenient combinatorial structures. To simplify the notation we denote the pulled-back structures by the same symbols as the original ones.
Observe that if a partitionQ Q then DQ N D Q N , so we may refine the partition in Theorem 2 with the one in Lemma 3.4 and assume without loss of generality that P d × Q is generating for T . In particular, given ε > 0, we may find the partition Q of M guaranteed by Theorem 2 such that P d × Q is generating for T . We will show that T fails to be very weak Bernoulli with respect to Q and ε = 1/100 using the criteria of Proposition 3.2. We assume otherwise, and let G ⊂ Σ − d × M be the set in Proposition 3.2, µ − d × ν(G) 99/100. We now work in the set of quadruples (x − , y, , and let Z denote such a quadruple. Then the conclusion of Proposition 3.2 is the existence of a family of measure-preserving functions Φ Z : . Then using Fubini's Theorem again 8 10 , Then, on one hand since Z ∈ G × G and x + ∈ L but on the other hand (x − , x + , y) ∈ B, (x − ,x + ,ȳ) ∈ B and y ∈ C y (since Z ∈Ĝ), so by Theorem 2 D Q N ((x − , x + , y), (x − ,x + ,ȳ)) 9/10, a contradiction.
Therefore it remains to prove Theorem 2.
Discussion and proof of Theorem 2

A Summary of Technical Content
This section contains the main technical tools for establishing the criteria of Theorem 2. The convenient partition $Q$ is a partition into boxes of small diameter, with the neighbourhood of the cusp being one atom. The first tool is Proposition 4.1. It shows that for choices of $(x, y), (x', y')$ in $B$ their orbits stay away from the cusp for most of the times (properties a., b.). Property c. says that for these good times they cannot stay at a similar level of closeness for long (with the level of closeness measured by the sets $A^{N,\xi_0}_j$). In fact we have an exponential estimate (given by $\eta_0$). Summing up over all $j \in \mathbb{N}$ and using c. gives the assertion of Theorem 2. So it remains to prove Proposition 4.1. This is done by the use of Proposition 4.2, which is also the most technical and involved part of the paper. Properties 1. and 2. guarantee that we stay out of the cusp (this gives a. and b.). Properties 3., 4., 5. and 6. serve to prove c.: property 3. is a way to deal with low values of $N$, which are covered by choosing our parameters sufficiently small. Property 6. guarantees that the orbits of $y$ and $y'$ do not come too close together, so that they will split at time $N$. The main content then lies in 4. and 5. Recall that we wish to show that the amount of time spent in the set $A^{N,\xi_0}_j$ (for fixed $j$) is not long. Property 5. deals with the case when horizontal separation occurs. The danger is the recurrence of circle rotations: if there is enough separation, the horizontal positions of the orbits could become close after some time. Property 5. shows that for a time comparable to the inverse of the distance (up to a log factor) either there is no horizontal divergence, or the divergence is very large (see the constant 100) compared to the original closeness (see Figure 2). In view of 5., the only way points can stay long in $A^{N,\xi_0}_j$ is when they move isometrically with respect to the horizontal direction. Property 4. then ensures that at a scale comparable to the inverse of the distance (up to a small power), the number of such occurrences is small (in fact we have a $1 - \eta_0$ gain). This follows from the study of Birkhoff sums of $f'$, as we show later that for most of the times the growth of $f'_n$ is of order $n^{2-\eta}$, so points in 4. will split in the vertical direction (see Figure 1).
So it remains to prove Proposition 4.1.

The main technical result; proof of Proposition 4.1
The following proposition implies Proposition 4.1. Recall that $d$ is a metric on $M$ and we denote by $d_H$ and $d_V$ the distances on the first and second coordinate respectively. Recall that $\eta_0 \ll \eta$, $y_m := K_{\phi_m(x)} y$, $y'_m := K_{\phi_m(x')} y'$ and $R_m := d_H(y_m, y'_m)^{-1}$.
Proposition 4.2. There exist a set B ⊂ T 2 × M , λ(B) > 9/10, a measurable family of sets (C y ) y∈M , µ(C y ) 9/10 and there exists N such that for every N 3 N there exists ξ 3 > 0 (small) such that for every (x, y) ∈ B, (x , y ) ∈ B ∩ T 2 × C y and every N ∈ N there exists a set U N = U N (x, y, x , y ) ⊂ [0, N ] such that 3. for every j ∈ N and N N 3 The proof of Proposition 4.2 is the most involved part of the paper, it will be given in separate subsection. Fix j ∈ N and assume that U N ∩ A N,ξ 3 j (x, y, x , y ) = ∅. Divide the interval [0, N ] into intervals I 1 , ..., I k of length 2 j(1−10η) . By 6. it follows that 2 j < N log 6 N and therefore k > 1 assuming N is large enough, i.e. N 3 large enough. Consider only those I i , for which U N ∩ A N,ξ 3 j (x, y, x , y ) ∩ I i = ∅. In Figure 3, we see the scheme which we are about to pursue formally. We have already broken up the long interval [0, N ] into smaller (blue) intervals I i . By dividing j (x, y, x , y ). So we obtained that: Moreover, by 4. we get that |{n ∈ I i : d(y n , y n ) < ξ 3 and d H (y n , y n ) = d H (y r , y r )}| 2 j(1−10η)(1−η 0 )+2 .
The rest of the paper will be devoted to proving Proposition 4.2. Properties 1. and 2. will follow in a straightforward way once we define the sets $B$ and $\{C_y\}_{y \in M}$. Properties 3., 5. and 6. are of an intermediate level of difficulty, and we will devote separate subsections to them. The heart of the problem is property 4., to which we devote a separate section.

Construction of B and {C y } y∈M
In this section we construct the sets $B$ and $\{C_y\}_{y \in M}$ for Proposition 4.2. The construction of $B$ is more involved: it uses precise estimates on Birkhoff sums for $\phi$, $f$, $f'$, and we will conduct it in several subsections below. The construction of $C_y$ is much more straightforward, and we give it now.

Construction of $\{C_y\}_{y \in M}$
where n 0 1 is such that µ(C 0 y ) 99/100. Define Then µ(C y ) 9/10. This way we defined a family of sets {C y } y∈M . We will later show that this family satisfies the assumptions of Proposition 4.2. We will now construct the set B.

Construction of B
Scheme of the construction
The construction of $B$ breaks down into several steps. First we construct sets $E_0$ and $F_0$ for which the desired asymptotics for the hyperbolic part hold (see (17) and (18)). Then we define the set $V$ (see Proposition 5.2) which gives the correct asymptotics for $f'$, i.e. for $(x, y) \in V$ the derivative is large for most of the times (see the definition of $W_n$). This corresponds to splitting in the vertical direction. We also have the set $S$, which allows us to control the distance to the singularity at scales $(q_n)$. This gives the set $G$ (see (19)). The set $B$ is the set of points whose orbits visit $G$ with the correct proportion.
For simplicity we assume that ϕ 0 = 1. Then, by Egorov's theorem, there exists a set F 0 and N such that for every x ∈ F 0 and |n| N ϕ n (x) |n| 2 .
We have the following proposition: We will give the proof of Proposition 5.2 in a separate subsection. We have now defined all sets, which are needed in the definition of B in Proposition 4.2.
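Consider the set $G$ built from $E_0$ (see (17)), $S$ (see (19)) and $V$ (see Proposition 5.2). Then it follows that $\lambda \times \mu(G) \geq 19/20$. Therefore there exists a set $B_{erg} \subset \mathbb{T}^2 \times M$, $\lambda \times \mu(B_{erg}) \geq 19/20$, and there exists $N_2 \in \mathbb{N}$ such that for every $(x, y) \in B_{erg}$ and every $N \geq N_2$, the orbit of $(x, y)$ visits $G$ with the correct proportion up to time $N$. With $B$ defined accordingly, we will show that the set $B$, the sets $\{C_y\}_{y \in M}$ (see (15)) and $U_N$ satisfy the assumptions of Proposition 4.2.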
Properties 1. and 2. in Proposition 4.2. Notice that 1. and 2. follow by the definition of U N (see (23)) and the set B (if ξ 0 is chosen sufficiently small).

Proof of 5. in Proposition 4.2
Property 5. is a straightforward consequence of the following general lemma. Recall that $y_n := K_{\phi_n(x)} y$, $y'_n := K_{\phi_n(x')} y'$ and $R_n := d_H(y_n, y'_n)^{-1}$, $R_0 := d_H(y, y')^{-1}$.
Take N 0 such that log N 0 C(α) −1 . For n ∈ [0, R 0 log 5 R 0 ] ∩ Z the first coordinates of y n and y n are respectively y 0 + m n α, y 0 + r n α for some m n , r n ∈ N. Since |ϕ| < C and f > c, it follows that m n , r n < C 0 n for some constant C 0 . Therefore we have either m n = r n in which case d H (y n , y n ) = |y 0 − y 0 | = d H (y, y ) or by (26) and This finishes the proof.

Proof of 3. and 6. in Proposition 4.2
We have the following easy lemma: Lemma 5.4. There exists N 4 ∈ N such that for every y ∈ M , every y ∈ C y (see (15)) and every N ∈ Z, |N | N 4 we have Proof. Notice that by (15), for sufficiently large n, we have min −qn log qn s qn log qn d H (y, K s y ) 1 q n log 3 q n .
Proof. Let y + m n α and y + r n α denote first cooridnates of respectively y n , y n . Since |ϕ| < C, f > c it follows that m n , r n C 0 n for some C 0 > 0. Therefore for n N 4 , by (25) and y ∈ C y , we have d H (y n , y n ) = y + y + (m n − r n )α inf Therefore 3. and 6. in Proposition 4.2 follow by (28) if only we take N 3 N 4 and ξ 3 N −1 4 .

Proof of 4. in Proposition 4.2
In this section we assume that Proposition 5.2 holds (we prove Proposition 5.2 in Section 7). Recall that $y_n := K_{\phi_n(x)} y$, $y'_n := K_{\phi_n(x')} y'$ and $R_n := d_H(y_n, y'_n)^{-1}$, $R_0 := d_H(y, y')^{-1}$.
Proof. Assume that $n > 0$; the proof in the case $n < 0$ is analogous. Since $(x, y) \in G$, we know that $x \in F_0$. Hence for $n \geq N_3$, by (18) we know that $\phi_n(x) \geq n/2$. Therefore, by the definition of the special flow, we have $n/2 \leq \phi_n(x) < f_{N(y, \phi_n(x)) + 1}$.
Proof. We will conduct the proof in the case t > 0, the oposite case is analogous. By Lemma 2.1 (using the estimates in (8) and (10)) we have Therefore it is enough to show that Let k ∈ N be unique such that q k N (y 0 , t) < q k+1 . Notice that since y ∈ S and θ − y 0 < 1 t log 3 t we know that Therefore, by Lemma 2.1 and diophantine assumptions on α, and N (y This finishes the proof.
7 Growth of the derivative, proof of Proposition 5.2 Recall that for y ∈ M , y 0 ∈ T denote the first coodinate of y and that W n = {y : |f n (y 0 )| |n| 2−4η }. Proposition 5.2 will follow from the proposition and lemma below: Proof of Proposition 5.2. By Proposition 7.1 and Egorov theorem, there exists a set V 0 ∈ M , µ(V 0 ) 99/100 and N 0 ∈ N such that for every y ∈ V 0 and |M | N 0 (31) is satisfied. (17) and (19)) and N 1 sufficiently large. Then since x ∈ E 0 , |ϕ N (x) − N | N 2/3 . Moreover, let s be unique such that q s N < q s+1 since y ∈ S, for s n 0 ].
Hence, by (8) So for any (x, y) ∈ V and |N | N 1 sufficiently large, the assumptions of Lemma 7.2 are satisfied. Therefore (32) holds for y and M . By taking M = N, −N we get that also (20) holds . The proof of Proposition 5.2 is thus finished.
Therefore it remains to prove Proposition 7.1 and Lemma 7.2.
Proof of Proposition 7.1
We will first prove the following lemma:

Lemma 7.3. There exists a constant $C > 0$ such that for every $n \in \mathbb{Z}$, $\mu(W_n^c) < C|n|^{-3\eta/2}$.

Proof. Let us conduct the proof in the case $n > 0$; the case $n < 0$ is analogous. Let $s \in \mathbb{N}$ be unique such that $q_s \leq n < q_{s+1}$. Let $I = (a, b]$ be any interval in the partition $\mathcal{I}$ of $\mathbb{T}$ given by $\{-i\alpha\}_{i=0}^{n-1}$. It follows by (2), (3), (4) that $f_n$ is $C^2$ on $I$. Moreover $\lim_{x \to b^-} f'_n(x) = +\infty$ and $\lim_{x \to a^+} f'_n(x) = -\infty$. Hence there exists $x_I \in I$ such that $f'_n(x_I) = 0$. Then, by the mean value theorem, for $x \in I$ there exists $\theta_x \in I$ such that $f'_n(x) = f''_n(\theta_x)(x - x_I)$, and therefore $\mu(W_n^c) \leq C n^{-3\eta/2}$.

Now we can give the
Proof of Proposition 7.1. The proof uses some ideas from the proof of Theorem 1 in [17]. We will use the following simple lemma:

Lemma 7.4. If $Y_n$ are random variables such that $\sum_{n \geq 1} \|Y_n\|_2^2 < \infty$, then $Y_n \to 0$ a.s.
Let us denote X i := χ W c i . Notice that by Lemma 7.3, we have Therefore there exists a sequence (N k ), By Lemma 7.4 we get Proof of Lemma 7.2 Proof. Without loss of generality, we may assume ϕ is strictly positive (we may replace ϕ with a cohomologous function which is strictly positive, if necessary). Denote δ = η/10. Let V M := {t ∈ [−M, M ] : y ∈ W N (y,t) }. We will show the following: Notice that (32) follows by (34). Indeed, we have For any such j let T j be such that N (y, T j ) = N j . Then by (35), By definition N j+1 − N j = q k . Therefore and by (8), we have |T j+1 − T j | f N (y,T j+1 −T j ) (y) = f N j+1 −N j (y) = f q k (y) q k − 2q 1−η k .

So finally
which finishes the proof.
Appendix: Proof of Lemma 3.4
By the Krieger generator theorem, and inducing as necessary, we may assume that $K$ is the special flow over a subsystem (with respect to some invariant measure) of a full shift $S$ on 2 symbols, under a roof function $f$ such that $|f - \int f| < \epsilon$ for some small enough $\epsilon$. The phase space $M$ is defined using the identification as in formula (6). As in formula (7) we get for $(w, s) \in M$,
$$K_t(w, s) = \bigl(S^{N(w,s,t)}(w), \; t + s - f_{N(w,s,t)}(w)\bigr),$$
where $N(w, s, t)$ is the unique integer such that
$$0 \leq t + s - f_{N(w,s,t)}(w) < f\bigl(S^{N(w,s,t)}(w)\bigr),$$
and
$$f_n(w) = \begin{cases} f(w) + \ldots + f(S^{n-1}(w)) & \text{if } n > 0, \\ 0 & \text{if } n = 0, \\ -\bigl(f(S^{n}(w)) + \ldots + f(S^{-1}(w))\bigr) & \text{if } n < 0. \end{cases}$$
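The three-case formula for the Birkhoff sums over an invertible map can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it uses a circle rotation in place of the shift, and the names `birkhoff_sum`, `rotate`, `rotate_back` and `roof` are hypothetical. The sign convention for negative times is exactly what makes the cocycle identity (the sum for time m + n at w equals the sum for time m at w plus the sum for time n at the m-th image of w) hold for all integers m and n.

```python
import math

def birkhoff_sum(f, forward, backward, w, n):
    """Birkhoff sum of f over the invertible map `forward`, for any integer n.

    n > 0: f(w) + ... + f(forward^(n-1)(w))
    n = 0: 0
    n < 0: -(f(forward^n(w)) + ... + f(forward^(-1)(w)))
    """
    total = 0.0
    if n > 0:
        for _ in range(n):
            total += f(w)
            w = forward(w)
    elif n < 0:
        for _ in range(-n):
            w = backward(w)
            total -= f(w)
    return total

ALPHA = 0.37  # illustrative rotation number (an assumption of this sketch)

def rotate(w):
    """Rotation of the circle by ALPHA."""
    return (w + ALPHA) % 1.0

def rotate_back(w):
    """Inverse rotation."""
    return (w - ALPHA) % 1.0

def roof(w):
    """Bounded illustrative roof function."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * w)

forward_sum = birkhoff_sum(roof, rotate, rotate_back, 0.2, 5)
backward_sum = birkhoff_sum(roof, rotate, rotate_back, 0.2, -3)
```

Since the roof is positive, the forward sums are positive and the backward sums are negative, matching the interpretation of the sum as the signed time needed to cross that many copies of the roof.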