A New Proof of the Likelihood Principle

Greg Gandenberger

ABSTRACT

I present a new proof of the likelihood principle that avoids two responses to a well-known proof due to Birnbaum ([1962]). I also respond to arguments that Birnbaum’s proof is fallacious, which if correct could be adapted to this new proof. On the other hand, I urge caution in interpreting proofs of the likelihood principle as arguments against the use of frequentist statistical methods.

1 Introduction
2 The New Proof
3 How the New Proof Addresses Proposals to Restrict Birnbaum’s Premises
4 A Response to Arguments that the Proofs Are Fallacious
5 Conclusion

1 Introduction

Allan Birnbaum showed in his ([1962]) that the likelihood principle follows from the conjunction of the sufficiency principle and the weak conditionality principle.1,2 The sufficiency principle and the weak conditionality principle are intuitively appealing, and some common frequentist practices appear to presuppose them. Yet frequentist methods violate the likelihood principle, whereas likelihoodist methods and Bayesian conditioning do not.3 As a result, many statisticians and philosophers have regarded Birnbaum’s proof as a serious challenge to frequentist methods and a promising ‘sales tactic’ for Bayesian methods.4

1 Birnbaum calls the principles he uses simply the sufficiency principle and the conditionality principle. Dawid ([1977]) distinguishes between weak and strong versions of the sufficiency principle, but this distinction is of little interest: to my knowledge, no one accepts the weak sufficiency principle but not the strong sufficiency principle. The distinction between weak and strong versions of the conditionality principle (due to Basu [1975]) is of much greater interest: the weak conditionality principle is much more intuitively obvious, and Kalbfleisch’s influential response to Birnbaum’s proof (discussed in Section 3) involves rejecting the strong conditionality principle but not the weak conditionality principle.

2 The conditionality principle Birnbaum states in his ([1962]) is actually the strong conditionality principle, but the proof he gives requires only the weak version. Birnbaum strengthens his proof in a later article ([1972]) by showing that the logically weaker mathematical equivalence principle can take the place of the weak sufficiency principle. I present Birnbaum’s proof using the weak sufficiency principle rather than the mathematical equivalence principle because the former is easier to understand and replacing it with the latter does not address any important objections to the proof.

Brit. J. Phil. Sci. 0 (2014), 1–29. The Author 2014. Published by Oxford University Press on behalf of British Society for the Philosophy of Science. All rights reserved. For Permissions, please email: journals.permissions@oup.com. doi:10.1093/bjps/axt039. Advance Access published March 26, 2014.

Most frequentist responses to Birnbaum’s proof fall into one of three categories: (i) proposed restrictions on its premises; (ii) allegations that it is fallacious; and (iii) objections to the framework within which it is expressed. In this article, I respond to objections in categories (i) and (ii). In Section 2, I give a new proof of the likelihood principle that avoids responses in category (i). In Section 3, I explain that analogues of the minimal restrictions on Birnbaum’s premises that are adequate to block his proof are not adequate to block the new proof. Suitably modified versions of the responses in category (ii) apply to this new proof as well, but I argue in Section 4 that those responses are mistaken. Arguments in category (iii) have been less influential because standard frequentist theories presuppose the same framework that Birnbaum uses. For objections to theories that use a different framework, see Berger and Wolpert ([1988], pp. 47–64).

I see little hope for a frequentist in trying to show that Birnbaum’s proof and the new proof given here are unsound, but one can question the use of these proofs as objections to frequentist methods. The likelihood principle as Birnbaum (for example, [1962], p. 271) and I formulate it says roughly that two experimental outcomes are evidentially equivalent if they have proportional likelihood functions—that is, if the probabilities that the set of hypotheses under consideration assign to those outcomes are proportional as

3 Throughout this article, unless otherwise specified, ‘frequentist’ and ‘frequentism’ refer to ‘statistical frequentist’ views about statistical inference that emphasize the importance of using methods with good long-run operating characteristics, rather than to ‘probability frequentist’ views according to which probability statements should be understood as statements about some kind of long-run frequency. The earliest statistical frequentists were also probability frequentists, but the historical and logical connections between statistical frequentism and probability frequentism are complex. I use the word ‘frequentist’ despite its ambiguity because it is the most widely recognized label for statistical frequentism and because recognized alternatives also have problems (Grossman [unpublished], pp. 68–70).

4 See for instance (Birnbaum [1962], p. 272; Savage [1962], p. 307; Berger and Wolpert [1988], pp. 65–6; Mayo [1996], p. 391, Footnote 17; Grossman [unpublished], p. 8), among many others.

Birnbaum himself continued to favour frequentist methods even as he refined his proof of the likelihood principle ([1970b]). He claims that the fact that frequentist principles conflict with the likelihood principle indicates that our concept of evidence is ‘anomalous’ ([1964]). I regard the frequentist principles that conflict with the likelihood principle (such as the weak repeated sampling principle from (Cox and Hinkley [1974], pp. 45–6)) not as plausible constraints on the notion of evidence, but rather as articulating potential reasons to use frequentist methods, despite the fact that they fail to track evidential meaning in accordance with the intuitions that lead to the likelihood principle. This view differs from Birnbaum’s only in how it treats words like ‘evidence’ and ‘evidential meaning’, but this verbal change seems to me to clarify matters that Birnbaum obscures.
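The rough statement of the likelihood principle given earlier in this section, and the claim that frequentist methods violate it, can be illustrated with a standard textbook example that is not taken from this article. The sketch below assumes a sequence of independent trials with unknown success probability theta in which 3 successes are observed in 12 trials, under two hypothetical stopping rules: stop after 12 trials (binomial), or stop at the third success (negative binomial). The function names, the data, and the null value theta = 0.5 are all illustrative assumptions.

```python
from math import comb

# Assumed data for illustration only: 3 successes observed in 12 trials.
N_TRIALS, N_SUCCESSES = 12, 3


def binomial_likelihood(theta, n=N_TRIALS, k=N_SUCCESSES):
    """P(exactly k successes in n trials | theta), with n fixed in advance."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def negative_binomial_likelihood(theta, n=N_TRIALS, k=N_SUCCESSES):
    """P(the k-th success occurs on trial n | theta), with k fixed in advance."""
    return comb(n - 1, k - 1) * theta**k * (1 - theta) ** (n - k)


def binomial_p_value(theta0=0.5, n=N_TRIALS, k=N_SUCCESSES):
    """One-sided p-value: P(at most k successes in n trials | theta0)."""
    return sum(
        comb(n, j) * theta0**j * (1 - theta0) ** (n - j) for j in range(k + 1)
    )


def negative_binomial_p_value(theta0=0.5, n=N_TRIALS, k=N_SUCCESSES):
    """One-sided p-value: P(the k-th success occurs on trial n or later | theta0).

    This equals P(at most k - 1 successes in the first n - 1 trials | theta0).
    """
    return sum(
        comb(n - 1, j) * theta0**j * (1 - theta0) ** (n - 1 - j) for j in range(k)
    )


# The two likelihood functions differ only by a constant factor in theta.
thetas = [0.1, 0.25, 0.5, 0.75, 0.9]
ratios = [binomial_likelihood(t) / negative_binomial_likelihood(t) for t in thetas]

if __name__ == "__main__":
    print("likelihood ratios:", ratios)
    print("binomial p-value:", binomial_p_value())
    print("negative binomial p-value:", negative_binomial_p_value())
```

Under these assumptions the ratio of the two likelihood functions is the constant comb(12, 3) / comb(11, 2) = 4 for every theta, so the likelihood principle counts the two outcomes as evidentially equivalent. The one-sided p-values against theta = 0.5 nevertheless differ (299/4096, about 0.073, versus 67/2048, about 0.033), which is the sense in which a frequentist procedure can treat outcomes with proportional likelihood functions differently.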