A prequential test for exchangeable theories

We construct a prequential test of probabilistic forecasts that does not reject correct forecasts when the data-generating process is exchangeable and that is not manipulable by a false forecaster.

results should be examined. First, at the outset, Alice knows nothing about how the data will be produced. Second, the data available to her is finite.
The first condition (Alice's complete ignorance) imposes no restrictions on the types of stochastic processes that may generate the data. So, the results hold for tests that are unlikely to reject any data generating process. In contrast, consider the case in which Bob and Alice assume that the data will be produced within a class of data generating processes, such as exchangeable processes. Then Alice may look for tests that do not reject any exchangeable process and she may require Bob to forecast consistently with some exchangeable process.
This allows for empirical testing that relies on a priori assumptions about how the data might be produced.
The second condition (finite data sets) is typically justified based on the practical observation that the data cannot feasibly be infinite. However, the impossibility results in this literature are conceptual: they show the existence of forecasting schemes that can manipulate Alice's test, but they do not determine how to construct such manipulating schemes in practice. The assumption of finite data sets may be significant because the forecasting schemes that manipulate the test may involve complex processes that require large, potentially infinite, streams of data to be effectively tested.
It is tempting to speculate that Alice's ignorance, combined with finite data sets, is central to the impossibility of screening informed and uninformed experts. An ignorant but strategic expert may be able to exploit the tester's ignorance, combined with her limited data sets, to manipulate her test. However, the impossibility theorems still hold even if either one of these two conditions is dropped. Assume that it is known to both Alice and Bob that the data generating process is an exchangeable process. Alice does not know which exchangeable process generates the data, and Bob claims that he knows the exact process (before any data is observed). Then Alice has, a priori, a significant understanding of how the data will be produced: she knows that the data is produced by some exchangeable process, and her empirical test may rely on this condition. Yet with finite data the impossibility results still hold: any prequential test that is likely to pass an informed expert can be manipulated by an uninformed expert, and so is equally likely to pass an uninformed expert, even if he is restricted to produce forecasts based on an exchangeable process (Proposition 1 below). Now consider the case of infinite data sets, but with no restrictions on the types of stochastic processes that may generate the data. Shmaya (2008) showed that the impossibility results still hold: any prequential test that is likely to pass an informed expert is also likely to pass an uninformed, but strategic, expert (Proposition 2 below).
Thus, neither a restriction to exchangeable processes nor infinite sequences of forecasts and outcomes can, by itself, rule out the impossibility theorems. The question addressed in this paper is whether the impossibility theorems still hold if it is assumed simultaneously that the data generating process is exchangeable and that the tester can obtain large, perhaps infinite, sequences of forecasts and outcomes. Under these conditions, we show that the impossibility theorems do not hold. We construct a prequential test that is likely to pass an informed expert and cannot be manipulated by an uninformed expert: no matter which forecasting scheme he uses, he cannot be assured that he is likely to pass the test.
Here is the basic idea of our test. By de Finetti's theorem, every exchangeable process has a representation as a sequence of i.i.d. coin tosses with an unknown parameter q. By the law of large numbers, the infinite realization pins down the parameter q. In addition, we show that after observing Bob's forecasts along the realization, Alice can deduce Bob's prior distribution on the parameter. So the sequential framework reduces to a one-shot framework in which Bob delivers a prior that (he claims) generates a parameter q that is observed by Alice. The existence of a non-manipulable test can now be obtained by adapting a result in Olszewski and Sandroni (2009b) to this one-shot framework.
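The first two steps of this idea can be illustrated numerically. The following is a minimal sketch, not part of the construction itself: the Beta(2, 5) prior, the sample size and the helper name `sample_exchangeable` are assumptions chosen only for illustration. It draws q from a prior, tosses i.i.d. coins with bias q, and compares the empirical frequency with q.

```python
import random

def sample_exchangeable(prior_sampler, n, rng):
    """Draw q from the prior, then toss n i.i.d. coins with bias q.

    This is the de Finetti representation of a binary exchangeable process;
    prior_sampler is a hypothetical callable returning a value in [0, 1].
    """
    q = prior_sampler(rng)
    tosses = [1 if rng.random() < q else 0 for _ in range(n)]
    return q, tosses

rng = random.Random(0)
# Illustrative prior: Beta(2, 5). Any prior on [0, 1] would do.
q, tosses = sample_exchangeable(lambda r: r.betavariate(2.0, 5.0), 20000, rng)

# By the law of large numbers the frequency of ones approaches q.
freq = sum(tosses) / len(tosses)
print(f"q = {q:.4f}, empirical frequency = {freq:.4f}")
```

With a finite sample the frequency only approximates q; it is the infinite realization that determines q exactly.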
Our argument reveals a non-intuitive distinction between the infinite and finite horizon setups. With infinite streams of forecasts and outcomes, it is possible to determine q and, hence, to screen informed and uninformed experts. With finite (perhaps arbitrarily long) streams of forecasts and outcomes, it is possible to make arbitrarily accurate statistical inferences about q and yet this is insufficient to screen informed and uninformed experts.

Setup
Let S be a finite set of outcomes. In every period n = 0, 1, ... an outcome is realized.
Let $\Omega = S^{\mathbb{N}}$ be the set of realizations, equipped with the product topology. A theory is a probability measure over Ω, representing the distribution of some stochastic process. The set ∆(Ω) of theories is a convex and compact metrizable subspace of the topological vector space of all signed measures equipped with the weak-* topology. A paradigm is a set of theories.
Let $S^{<\mathbb{N}} = \bigcup_{n \ge 0} S^n$ be the set of partial realizations, including the empty sequence e.
For every partial realization $\sigma = (s_0, \dots, s_{n-1})$ let $N(\sigma) \subseteq S^{\mathbb{N}}$ be the set of realizations $x \in S^{\mathbb{N}}$ that extend σ, i.e. such that σ is the initial segment of x of length n. In particular $N(e) = S^{\mathbb{N}}$.
Definition 1. Let µ be a theory and $x = (s_0, s_1, \dots) \in S^{\mathbb{N}}$ a realization. The forecast of µ over x at period n is the element $p_n \in \Delta(S)$ that is given by
Dawid's (1984) prequential principle states that the assessment of a theory µ should depend on µ only through the sequence of forecasts that µ made over the realized sequence of outcomes, and not on forecasts conditional on data that has not been realized. More general tests allow the tester (Alice) to use forecasts of Bob's theory after any finite history, including the histories that did not occur. This paper focuses on prequential tests.
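To make the notion of a forecast along a realization concrete, here is a small sketch. It assumes that the forecast at period n is the conditional probability of the next outcome given the observed history, i.e. the ratio of the probabilities of two nested cylinder sets. The theory used is the uniform mixture of i.i.d. coins on S = {0, 1}, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def cylinder_prob(sigma):
    """Probability of the cylinder N(sigma) under the uniform mixture of i.i.d. coins.

    For a binary history with k ones among n outcomes this is the Beta integral
    of q^k (1-q)^(n-k) over [0, 1], which equals k! (n-k)! / (n+1)!.
    """
    n, k = len(sigma), sum(sigma)
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))

def forecasts(x):
    """Forecast probability of outcome 1 at each period along the realization x.

    Assumption: the forecast is the conditional probability of the next
    outcome, mu(N(sigma, 1)) / mu(N(sigma)), with sigma the history so far.
    """
    out = []
    for n in range(len(x)):
        sigma = tuple(x[:n])
        out.append(cylinder_prob(sigma + (1,)) / cylinder_prob(sigma))
    return out

print(forecasts([1, 1, 0]))  # forecasts at periods 0, 1, 2
```

Only the histories that are prefixes of x are ever queried, which is what makes a test built on these forecasts prequential.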
The test is finite if, for some n ∈ N, T depends only on the forecasts and outcomes of the first n periods.
Definition 2. Fix a prequential test T and let µ be a theory.
is the set of all realizations x ∈ S N such that µ passes the test over x.
Definition 3. The prequential test T accepts the data generating process in a paradigm Γ
If the test accepts the data generating process with high probability and Bob's theory µ is the data generating process, then Bob is assured that he will pass the test with high probability.
If a test is ε-manipulable then Bob can randomize a theory according to ζ and be sure that, with high probability, he passes the test regardless of the data generating process µ.
If a test is not manipulable then every strategy to fake theories fails on at least one data-generating process. Following Dekel and Feinberg (2006) and Olszewski and Sandroni (2009b), we use in this paper a stronger condition, which entails that every strategy employed by an ignorant expert will fail on a topologically large set of data generating processes. Recall that a subset M of a separable metric space Z is co-meager if it contains a countable intersection of dense open subsets of Z. We say that a property is satisfied by co-meager many elements in Z if the set of elements that satisfy this property is co-meager.
Definition 5. Let Γ be a paradigm. A test T is strongly non-manipulable w.r.t Γ if, for every distribution ζ over Γ, for co-meager many µ-s in Γ.
Note that in Definition 5 we require that the set of µ on which the expert fails be co-meager relative to the topological space Γ. We do not assume that Γ itself is a co-meager subset of ∆(Ω). Indeed, most paradigms used in modeling, i.e., most interesting classes of stochastic processes (stationary processes, Markov processes, mixing processes), are meager as subsets of ∆(Ω).
T is (ε + δ)-manipulable w.r.t. Γ for every δ > 0.
The following result was proved by Shmaya (2008).
For the rest of the paper we fix the set of outcomes S = {0, 1}. Theorem 1 below and its proof apply with minor changes to the case of an arbitrary finite set of outcomes; however, since we are mainly interested in a counter-example, the restriction to a two-element outcome set is worth the notational convenience it provides. We identify the set forecast p ∈ ∆(S) with
Before stating our result, we give two simple examples of paradigms for which Proposition 2 does not hold. In Example 1 the paradigm is compact but not convex, and in Example 2 the paradigm is convex but not compact.
Theorem 1. Let Γ be the paradigm of exchangeable processes. Then there exists a prequential test T such that T accepts the data-generating process in Γ with probability 1 and T is strongly non-manipulable w.r.t. Γ.

Additional comments.
Non-prequential tests. construct a (non-prequential) non-manipulable test for the paradigm of theories which are learnable and sufficient for predictions, which is convex but not compact.

Non-Borel tests. Another assumption of Proposition 2 is that the test is a Borel function (this assumption is part of our definition of a test in Section 2). Shmaya (2008) shows that, under the axiom of choice, there exist non-manipulable prequential tests which are not Borel.
Future independent tests. Olszewski and Sandroni (2008) introduce the class of future independent tests. For prequential tests, the future independence property can be seen as a weakening of the finite horizon property: instead of requiring that the test terminates at a finite time, it only requires that if the expert fails the test then this failure will be demonstrated in finite (possibly unbounded) time. Olszewski and Sandroni (2009a) show that many of their manipulability results for future independent tests can also be proved for convex and compact paradigms. The test we construct is not future independent.
Strong non-manipulability. The definition of strong non-manipulability appeals to a topological notion of a 'large' set of distributions as a co-meager set. The goal of the definition is to capture the idea that, regardless of the strategy he uses to create forecasts, the uninformed expert will fail on a large set of stochastic processes. This definition has been used in the expert testing literature, but we recognize that such a topological notion has well-known drawbacks. In particular, it does not indicate the odds of discrediting the uninformed expert.
We view the main contribution of our theorem to be the very fact that there exists a test which is not manipulable. The strong non-manipulability property is a bonus.
This proposition has the following nice interpretation. Assume that an expert claims to know the distribution from which an element q ∈ Q is drawn. Say that the expert passes the test if t(µ, q) = PASS, where µ is the distribution proclaimed by the expert and q is the actual realization. The first property says that if the expert provides the correct distribution of q then he passes with probability 1. The second property ensures that, if the expert randomizes his theory according to some distribution ζ, then he will fail the test with probability 1 for a large (co-meager) set of truths.

4.2.
Proof. We first prove that when forecasts are made using an exchangeable theory, the forecasts over a single realization already determine the entire theory.
Proof. Under the assumption of the lemma, it follows by induction from (1) that
(6) $\mu'(N(s_0, \dots, s_{n-1})) = \mu''(N(s_0, \dots, s_{n-1}))$
for every n ∈ N. From (6) and (4) it follows that
$\int \prod_{i=0}^{n-1} q^{s_i} (1 - q)^{1 - s_i} \, \lambda'(dq) = \int \prod_{i=0}^{n-1} q^{s_i} (1 - q)^{1 - s_i} \, \lambda''(dq)$
for every n. Since $\prod_{i=0}^{n-1} q^{s_i} (1 - q)^{1 - s_i}$ is a polynomial in q of degree n, it follows again by induction that $\int q^n \, \lambda'(dq) = \int q^n \, \lambda''(dq)$ for every n. Since a probability measure with bounded support is determined by its moments, it follows that λ′ = λ′′.
where the first equality follows from (7), the second from the definition of µ and the third from Proposition 3. Thus T accepts exchangeable data generating processes with probability 1.
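The induction in the moment argument of the lemma can be sketched in code. The example below is only an illustration under stated assumptions: the two-point prior is hypothetical, and the cylinder probabilities are computed directly from that prior as a stand-in for the values that, in the lemma, are obtained from the forecasts. It expands q^k (1-q)^m as a polynomial whose top-degree term is (-1)^m q^(k+m) and solves for one new moment per period.

```python
from fractions import Fraction
from math import comb

def cylinder_prob(prior, k, m):
    """Integral of q^k (1-q)^m against a finitely supported prior {q: weight}."""
    return sum(w * q**k * (1 - q)**m for q, w in prior.items())

def moments_from_realization(prior, x):
    """Recover the moments of order 0..len(x) of the prior along the realization x."""
    moments = [Fraction(1)]   # zeroth moment of a probability measure
    k = m = 0                 # number of ones and zeros observed so far
    for s in x:
        k, m = k + s, m + (1 - s)
        c = cylinder_prob(prior, k, m)  # probability of the observed history
        # q^k (1-q)^m = sum_j C(m, j) (-1)^j q^(k+j); the terms with j < m
        # only involve moments that have already been recovered.
        lower = sum(comb(m, j) * (-1)**j * moments[k + j] for j in range(m))
        moments.append((c - lower) * (-1)**m)
    return moments

# Hypothetical prior: q is 1/4 or 3/4 with equal probability.
prior = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
x = [1, 0, 0, 1]
recovered = moments_from_realization(prior, x)
direct = [sum(w * q**n for q, w in prior.items()) for n in range(len(x) + 1)]
print(recovered == direct)
```

Each period contributes a polynomial of one degree higher, so the system is triangular and a single realization suffices, whichever outcomes it contains.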

Testing and merging
The expert testing literature asks whether an uninformed expert can produce forecasts that appear to be as good as the forecasts of the informed expert given the observed outcomes.
A related question is whether an uninformed expert can produce forecasts that are close to the forecasts of the informed expert. In this section we show that the answer to these two questions might be different.
Recall that a distribution $\nu \in \Delta(S^{\mathbb{N}})$ merges with a distribution $\mu \in \Delta(S^{\mathbb{N}})$ if
Say that a paradigm Γ is learnable if there exists ν ∈ Γ such that ν merges with every µ ∈ Γ. If Γ is learnable then an uninformed expert can learn to predict, i.e., to provide forecasts that become arbitrarily close to the forecasts of the informed expert.
The paradigm of all exchangeable theories is learnable: if S = {0, 1} then the theory $\nu = \varepsilon_\lambda$ given in (4), where λ is the uniform distribution on [0, 1], merges with all exchangeable theories. However, by our Theorem 1, there exists a test that screens an uninformed expert from the informed expert based on the outcomes of the process.
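As a worked illustration of why the uniform mixture learns, consider i.i.d. tosses with a fixed parameter q. The symbols $k_n$ (the number of ones among the first n outcomes) and $p_n[1]$ (the forecast probability of outcome 1 at period n) are notation introduced here for the example. For the uniform prior the forecast is the classical Laplace rule of succession, and for n ≥ 1 it is within 1/(n+2) of the empirical frequency:

```latex
p_n[1] = \frac{k_n + 1}{n + 2},
\qquad
\left| \frac{k_n + 1}{n + 2} - \frac{k_n}{n} \right|
  = \frac{|n - 2 k_n|}{n (n + 2)} \le \frac{1}{n + 2},
\qquad\text{so}\qquad
\bigl| p_n[1] - q \bigr| \le \left| \frac{k_n}{n} - q \right| + \frac{1}{n + 2}.
```

By the law of large numbers the right-hand side tends to 0 almost surely, so the forecasts of the uniform mixture approach q, which is the forecast of an expert who knows the parameter.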
The paradigm of all theories is not learnable, so an uninformed expert cannot produce forecasts that are close to the informed expert's forecasts for every process; but, by Proposition 2, he can do as well as the informed expert in every prequential test.
where the suprema are over all ζ ∈ ∆(Γ) and the minima are over all µ ∈ Γ.