An Analysis of the Falsification Criterion of Karl Popper: A Critical Review

Karl Popper identified ‘falsifiability’ as the criterion in demarcating science from non-science. The method of induction, which uses the (debated) principle of uniformity of nature, was rejected by Popper. He instead suggested that a scientific theory cannot be ‘verifiable’ but only ‘falsifiable’; one counter-example to the claims made by the theory would falsify it. The paper conducts a survey of the extant literature to understand the concept, the methodology as suggested by Popper to operationalize the concept, and possible limitations, both conceptual and methodological. The extant literature points out inherent ambiguities in the Popperian concept of falsifiability. One recurring theme is that Popper, the deductivist, uses the much-critiqued inductivist method among his methodological suite.


Introduction
Science can be defined as a systematic endeavor to organize knowledge as a set of falsifiable or testable explanations and predictions about the universe. One keyword here is testability or falsifiability. Popper surmised that a key difference between two theories (Freud's psychoanalytic theory and Einstein's theory of relativity) was the intrinsic 'risk' in Einstein's theory, which could lead to its potential falsification. In contrast, the psychoanalytic theory was, even in principle, not falsifiable. The component of risk in Einsteinian theory emanated from the fact that it entailed consequences that were highly improbable or impossible under the Newtonian paradigm (such as light bending towards massive objects, a fact confirmed by Eddington in 1919) and that would, if they were shown to be false, falsify the theory. Popper was critical of the Marxian theory as well, although he admitted that it was initially propounded as a truly predictive theory; when facts showed that it was inadequate, it was supplemented with ad hoc hypotheses to accommodate these facts. Thus, Marxism, a scientific theory, was reduced to a "pseudo-scientific dogma" (Thornton, 2019). Hence Popper concluded that these "theories" (psychoanalytic theory, reworked Marxism) were similar to primitive myths and not to modern science (Mitra, 2016).
These experiences led Popper to use falsifiability as the benchmark to demarcate (distinguish) science from non-science. A theory would be deemed to be scientific if it has the potential to be incompatible with at least one possible empirical observation, whereas a theory that is compatible with all possible empirical observations, either because it has been modified on an ex-post basis to accommodate these observations (such as revised Marxism) or because it has been developed to be compatible with all possible observations (such as psychoanalytic theories), is unscientific. A theory that is unscientific, being unfalsifiable, might, however, become scientific with the development of technology and/or with further refinement of the theory.
Popper authored three books between 1935 and 1957. The first book was named Logik der Forschung (1935), and was translated later to English as The Logic of Scientific Discovery (Popper, 2002) [hereinafter L. Sc. D.]. This book provides an overview of his notions on science and its philosophy. His other books include The Poverty of Historicism (1957) that critiques the idea of historical laws and The Open Society and its Enemies (1945), a treatise on philosophy of society, history and politics.
This article intends to introduce Popper's ideas of falsification in the context of philosophy of sciences, conduct a review of the literature that critiques Popper's conceptualization and operationalization of falsification, and then discuss and comment on the findings. In doing so, it collates research that has critiqued his ideas and comments on the possible limitations and merits of Popper's notions of falsificationism in the light of the existing critique.
The rest of this paper is structured as follows: this introduction is followed by a brief exposition of Popper's idea of demarcation between sciences and non-sciences in section 2. This is followed by an analytic understanding of Popper's criterion of falsifiability in the extant literature in section 3. Standard literature has been used from a wide array of sources. Results and discussion of findings constitute section 4. A conclusion follows in section 5.

Demarcation and Falsifiability
According to Popper, the principal issue in philosophy of sciences is that of demarcation or distinguishing science from non-science, such as metaphysics or Freudian psychoanalysis. While accepting Hume's critique of induction as valid, he opines that induction should not generally be used by a scientist. He contends that all observations are "selective and theory-laden". Or, in other words, there can be no observation without theory. Thus, he challenges the hitherto dominant viewpoint that the inductive method demarcates science from non-science.
Popper thus suggests falsification as a valid method for scientific investigation after rejecting induction as a methodology. According to Popper a theory can be corroborated as scientific only if it endures truly 'risky' forecasts which have the potential to turn out false. A test of a scientific theory is an attempt to falsify it, with only a single counter-instance rendering the whole theory untrue. Popper's idea of demarcation is rooted in the fact that there exists a logical asymmetry between verification and falsification: it is impossible to conclusively verify a universal proposition by induction, whereas one counter-example proves the universal law to be false.
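The asymmetry can be written out in first-order notation. The following is only a sketch: the predicate symbols $S$ ("is a swan") and $W$ ("is white") and the individual constants $a_i$ and $b$ are illustrative assumptions, not Popper's own notation.

```latex
% Requires amssymb for \nvdash.
% No finite set of confirming instances entails the universal law:
\[
  \{\, S(a_1) \wedge W(a_1),\ \dots,\ S(a_n) \wedge W(a_n) \,\}
  \;\nvdash\; \forall x\,\bigl(S(x) \rightarrow W(x)\bigr)
\]
% A single counter-instance entails the negation of the law:
\[
  S(b) \wedge \neg W(b) \;\vdash\; \neg\,\forall x\,\bigl(S(x) \rightarrow W(x)\bigr)
\]
```

The second inference is deductively valid, which is why falsification, unlike verification, does not need to appeal to induction.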
Thus, Popper says, a scientific theory is prohibitive in that it prohibits certain events. Hence, whereas testing and falsification of such a theory is possible, logical verification is not possible. A theory should therefore not be assumed to be verified even after very rigorous testing for years. The most that can be said is that it has been highly corroborated and is a good candidate to be rated as the best available theory till it is falsified.
However, according to Popper there is a distinction between the logic of falsifiability and the relevant methodology. As a matter of logic, if a single ferrous metal is shown not to be influenced by magnetic fields, then it cannot be said that all ferrous metals are affected by magnetic fields. Thus the Popperian paradigm says that a scientific law is falsifiable but not conclusively verifiable. However, methodological errors import a dimension of uncertainty: could an experimental error have influenced the outcome of the experiment? Thus in actual practice, one single counter-example is not sufficient to falsify a theory. This is the reason for retaining scientific theories in many cases despite anomalous evidence. For example, the OPERA experiment, a collaborative scientific effort between CERN, Geneva and LNGS, Italy, for detecting neutrinos (a type of subatomic particle), reported in September 2011 that neutrinos had been found to travel faster than light. The scientific community nevertheless retained its belief in Einstein's theory of relativity, which specifies an upper limit to the velocity of any particle, namely the velocity of light. The scientists concerned later admitted two errors in their experimental set-up (Mitra, 2016).
According to Popper, based on the criterion of demarcation using falsifiability, physics, chemistry and non-introspective psychology, among others, can be classified as sciences, psychoanalysis as pre-science, and astrology and phrenology as pseudosciences.
Unlike many social scientists, Popper holds that the more improbable a theory is, the more competent it is scientifically, as the probability of a theory being true and its information content are inversely related. Thus the statements that most closely approach the truth are those with high information content, although they are improbable.
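The inverse relation between probability and content can be made explicit for the conjunction of two statements $a$ and $b$. The notation below is a sketch: $P$ stands for logical probability and $Ct$ for content, and the measure $Ct(a) = 1 - P(a)$ is one simple choice assumed here for illustration.

```latex
% A conjunction says more than either conjunct, and is therefore less probable:
\[
  P(a \wedge b) \;\le\; \min\{P(a),\, P(b)\},
  \qquad
  Ct(a \wedge b) \;\ge\; \max\{Ct(a),\, Ct(b)\}
\]
% Under the assumed measure Ct(a) = 1 - P(a), the second inequality follows from the first:
\[
  Ct(a \wedge b) \;=\; 1 - P(a \wedge b) \;\ge\; 1 - P(a) \;=\; Ct(a)
\]
```

On this reading, asking for high content is the same as asking for low prior probability, which is the sense in which a bolder theory is the more improbable one.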
Popper was initially skeptical about the concept of truth, since he considered a theory to be an open-ended hypothesis and hence potentially false. Later, in his Conjectures and Refutations (1963), he integrated the concepts of truth and content (a new theory has more empirical content than an old one) to frame the concept of verisimilitude or truth-likeness. The content of a theory is the sum total of its logical consequences, divisible into two classes: the 'truth content', the class of true propositions derivable from it, and the 'falsity content', the class of false consequences; the latter may be an empty set.

Review of Literature
Derksen (1985) critiques Popper's concept of falsifiability and calls it 'fake cement' used to achieve methodological unity in the philosophy of science. He examines Popper's epistemology and suggests that the concept of falsifiability apparently leads to a 'great chain' of concepts, all linked to falsifiability. On this view, a wide variety of desiderata, such as the most falsifiable, the most testable, the most informative and the best corroborated, are all achievable simultaneously. However, Derksen (1985) claims that there are inherent ambiguities in the Popperian concept of falsifiability. As a result, the great chain disintegrates along with the methodological unanimity that goes with it. Derksen (1985) examines the great chain in detail. According to Popper, learning is possible only through falsifying our guesses. Hence the first claim in this chain is: "only from our mistakes can we learn". This is actually falsifiability, or criticizability, a broader term. Now, if learning is possible only through mistakes, scientific theories should be open to empirical falsification. Generalizing this, a more falsifiable theory has a higher probability of being falsified, and hence is more scientific. In other words, the scientific character of a theory is measurable and falsifiability is its metric.
Popper goes on to argue that a highly falsifiable theory has "little chance to escape falsification". A bolder theory not only presents more risks of falsification, it also offers more opportunity to learn something new, thus offering scientific knowledge a chance to grow. Hence Popper's second claim is: a more falsifiable theory offers a better opportunity of scientific growth.
Another link of falsifiability with Popper's deductivist thought is that a highly falsifiable theory contains more information. The converse is true as well; a theory with no falsifiers has no empirical content. Hence the initial links of the 'great chain' are: falsifiability is linked to informative content, which in turn is linked to a (larger) class of potential falsifiers.
Popper further contends that the most falsifiable theory is also the one with the highest explanatory capacity and the one with the maximum simplicity. A theory can be made more falsifiable either by making the theory more general, or by using a more specific predicate. An example of the first way is moving from "all crows are black" to "all birds are black"; an example of the second way is moving from "all crows are black or brown" to "all crows are black". Thus it follows that explanatory power increases as falsifiability increases. As regards simplicity, Popper quotes the example of the hypothesis stating that planets revolve around the sun in circles as compared to the less simple elliptical orbits. The circle hypothesis can be falsified by four observations, whereas the elliptical hypothesis can be falsified only through at least six observations. Also, the circle hypothesis is more precise as all circles constitute a subset of ellipses. Hence it can be seen that simplicity and falsifiability are positively correlated.
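The two ways of increasing falsifiability can be illustrated by counting potential falsifiers in a toy universe of observations. This is a minimal sketch: the species and colour lists and the helper name `falsifiers` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy universe: every (species, colour) pair is a possible observation.
SPECIES = ["crow", "sparrow", "swan"]
COLOURS = ["black", "brown", "white"]
OBSERVATIONS = set(product(SPECIES, COLOURS))


def falsifiers(species, allowed_colours):
    """Observations that contradict 'all <species> are <allowed_colours>'."""
    return {
        (s, c)
        for s, c in OBSERVATIONS
        if s in species and c not in allowed_colours
    }


crows_black_or_brown = falsifiers({"crow"}, {"black", "brown"})
crows_black = falsifiers({"crow"}, {"black"})          # more specific predicate
birds_black = falsifiers(set(SPECIES), {"black"})      # more general subject

# Each step enlarges the class of potential falsifiers.
print(len(crows_black_or_brown), len(crows_black), len(birds_black))
```

In this toy setting each strengthening of the hypothesis yields a strict superset of potential falsifiers, which is the sense in which the stronger statement forbids more.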
Hence the great chain is reached: falsifiability, potential falsifiers, testability, information content, explanatory power and simplicity are all linked to one another.
Popper later realized that falsification alone is not enough. A scientist has to be certain that his more falsifiable, and consequently falsified, theories are leading him in the 'right direction'. Popper (1960) says that when all new attempts at theory building are refuted, the scientist "would feel that we were producing a sequence of theories which … were ad-hoc and … that we were not getting any nearer to the truth", and consequently "science would lose its empirical character". Hence Popper amended the first claim to say: only through falsifications, occasionally interspersed with corroborations, can we learn (this may be called the amended claim).
Popper claims in his L. Sc. D. that the theory which has resisted the most severe testing is also the most falsifiable theory which has not been falsified; hence this is the most corroborated theory. There are, however, questions regarding whether scientists actually test the most falsifiable theory and corroborate it, and hence the most corroborated theory is not necessarily the most falsifiable one.
However, if one follows Popper's advice, that is, testing the most falsifiable theory, then the last link in the great chain is added: the most corroborated theory is the most falsifiable theory yet to be falsified. However, Popper's 'Great Chain' experiences tension, as illustrated below. It is possible to argue that the most corroborated theory at one point of time, which has been tested most thoroughly, is no longer the most falsifiable one. This is because the "most risky predictions" have been tested, and the comparatively less risky ones are yet to be tested. Hence, with a smaller probability of refutation, these less risky predictions do not make the theory the most falsifiable one currently. Also, corroborated experiments, which are now part of scientific knowledge, leave much less scope for counterexamples to occur, thus decreasing the severity of the tests. This implies that falsifiability, as indicated by the testability of a theory, decreases with time. Popper admits as much: "empirical character of a very successful theory grows stale after a time" (Popper, 1972).
Falsifiability under the Popperian paradigm shows up both as falsifiability as information content and as falsifiability as testability. While a theory is successful against falsification, its information content and its explanatory power remain constant, whereas its testability decreases. 'Corroboration', a complement to testability, hence changes. The 'chain' linked by a single concept of falsifiability comes under strain: is falsifiability to be reckoned as information content or as testability?
So the question boils down to: which theory do we test? The most testable theory (best chance to learn) or the most corroborated theory (to be closer to the truth)? Suppose that we settle for the most testable theory, as this offers the greatest chance to learn. Science, after all, is an endeavor to learn from our mistakes; this maintains the empirical character and rationality inherent in sciences. However, going by our amended claim, we also need occasional corroborations. How can the riskiest and most testable theory guarantee that? Since Popper says that corroboration should emerge as a result of the most severe test, it is clear that we need the most testable theory, and hope for occasional corroboration and finally falsification. Here, Popper makes his third claim: "corroboration gives us a reason for believing that science has come closer to the truth." Popper propounded methodological directives as to which theories should be chosen to test:
• The most falsifiable, informative, testable theories, for obvious reasons
• The most corroborated theory; this is the most severely tested theory, "appears to be the best so far", and hence is 'rational'
To offer a solution to the issue narrated in the earlier paragraph, Popper propounded two more directives:
• A new theory must cover successes of the old theory
• An old theory should be an approximation of the new theory
These directives are relevant when an old theory has been tested so much that its "empirical character has grown stale". Popper offers an elegant way out of the issue raised in the preceding paragraph.
We need to know that we are moving in the "right direction"; hence the old theory has to be approximately true. Since the new theory has to preserve all past successes, it can be seen that the directives by Popper imply choosing the most testable theory among the most corroborated, convergent ones. Derksen (1985) shows that Popper's third claim carries the weight of the amended first claim and the second claim. Popper extends two arguments in support of the third claim, namely, the verisimilitude argument and the highly unlikely accident argument. Popper puts the latter thus: for a theory which has withstood a series of different and risky tests, it is "highly improbable this is due to an accident, highly improbable therefore that the theory is miles away from the truth." Although this is apparently a deductivist argument, Derksen concludes that since the future is involved it is an inductivist argument. Hence claim three cannot be explained by Popper's deductivism, and the first two claims, which depend on it, are not so meaningful. Also, testability and information content, as shown earlier, are dissociated. Hence, Derksen concludes that falsifiability is 'fake cement'.
Gillies (2003) comments on a challenge to falsifiability as the demarcation criterion, known as the Duhem-Quine thesis, which was brought to the fore by Neurath in 1935 and was based on the work of Duhem. The following presents the gist of the thesis.
It is agreed by common consent that Newton's first law of motion is a scientific law. As it happens, it is not falsifiable. This law states that a body continues in its state of rest or of uniform motion in a straight line, unless acted upon by an external impressed force. Let it be supposed that a body is found neither at rest nor in uniform motion in a straight line, and that it seemingly is not acted upon by any external force. This observation seemingly refutes Newton's law, but in reality this does not necessarily hold true. Newton himself on observing the elliptical orbits of planets came to the conclusion that they were acted on by gravitational forces from celestial bodies other than the sun. This issue is discussed by Duhem (1962) as cited in the "Underdetermination of Scientific Theory" (2019): "…the physicist can never subject an isolated hypothesis to experimental test, but only a whole group of hypotheses; when the experiment is in disagreement with his predictions, what he learns is that at least one of the hypotheses constituting this group is unacceptable and ought to be modified; but the experiment does not designate which one should be changed." Going by the above, Newton's first law cannot be tested on its own as a standalone hypothesis, but only as part of a group of hypotheses. For meaningful results, the law should be used in conjunction with: one, further assumptions, such as Newton's second and third laws and the law of universal gravitation; and two, auxiliary assumptions, such as that the mass of the sun is much greater than that of the planets.
Since the first law needs to be used in conjunction with many assumptions, it would not be possible to refute the law in case forecasts from the law are not realized, as any further assumptions and/or auxiliary assumptions might not hold. Hence, by the Duhem-Quine thesis Newton's first law is not falsifiable.
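The logical core of the thesis can be stated schematically. In the sketch below the symbols are illustrative assumptions: $H$ is the hypothesis under test (here Newton's first law), $A_1, \dots, A_n$ are the further and auxiliary assumptions, and $O$ is the predicted observation.

```latex
% The prediction follows only from the whole conjunction, so a failed
% prediction refutes the conjunction, not H in particular:
\[
  (H \wedge A_1 \wedge \dots \wedge A_n) \rightarrow O,
  \qquad \neg O
  \;\;\vdash\;\;
  \neg H \,\vee\, \neg A_1 \,\vee\, \dots \,\vee\, \neg A_n
\]
```

The conclusion is a disjunction, so logic alone does not say which member of the group should be given up.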
Popper replied to the issue. He used a four-level model of types of statements divided on the basis of their falsifiability and confirmability. Gillies (2003) extends this in Table 1.
Gillies (2003) points out where the ideas of Kuhn and Popper converge. Level-two theories, such as Newton's first law, cannot be falsified through observation, as shown in Table 1. According to Thomas Kuhn, the Newtonian paradigm was replaced by the Einsteinian paradigm not through one single observation, but through a process of "scientific revolution". This is expected in the case of level-two theories, which cannot be falsified. However, the Popperian scheme of falsification is applicable to level-one theories.
Moreover, a level-one hypothesis such as Kepler's first law (which states that planets orbit around a star in ellipses with the star at one focus) can be tested by observing the positions of the planet and ascertaining that these points lie on the circumference of the ellipse with defined parameters. This may be called direct confirmation. Newton's laws, along with the few additional assumptions mentioned earlier, can be used to deduce an approximate form of Kepler's law. Newton's theory can, however, be confirmed by observation of planets, motions of projectiles and so on. The confirmation of Newtonian theory, along with the fact that Kepler's first law in an approximate form is obtained from Newtonian theory, is a pointer to an indirect confirmation of Kepler's law. Popper (1972) wrote something similar: "Thus I assert that with the corroboration of Newton's theory, and the description of the earth as a rotating planet, the degree of corroboration of the statement s 'The sun rises in Rome once in every twenty-four hours' has greatly increased. For, on its own, s is not very well testable; but Newton's theory, and the theory of the rotation of the earth are well testable. And if these are true, s will be true also."
The Duhem-Quine thesis says that it would be impossible to falsify an individual theory experimentally. Can the Popperian paradigm guide us in detecting errors in individual theories? Maxwell (1972) describes the following thought experiment in this regard. Let us consider two rival research programs based on two competing theories T1 and T2. The program based on T1 has long been stagnant; a number of well-corroborated hypotheses are at odds with it. The program has also not been able to come up with new predictions.
The research program based on T2 is surging ahead. Its empirical content and predictive power are far greater. Popperian rules indicate that T2 should be accepted and T1 rejected.
But if T1 is true (which is perfectly possible), then on what basis should T1 be rejected? The logic advanced by Maxwell is that perhaps the universe is built in such a way that theories which plunge us into deeper errors are precisely the programs which forge ahead. The Popperian paradigm has no clear answer to this dilemma. The argument that T2, having been more corroborated than T1, is closer to the truth cannot be invoked by followers of Popper, as that would mean resorting to the unreliable inductive method!
Turney (1991), in an interesting paper, points out certain 'errors' in Popper (1959), where Popper says that simplicity can be equated to falsifiability. Popper uses a geometrical example to develop his argument. He proposes two definitions and a theorem.
Definition 1: The theoretical dimension of a class of geometrical figures is one less than the cardinality of the smallest set of points such that there is no figure in the class on which all the points lie.
Definition 2: The geometrical dimension of a class of geometrical figures is the number of free parameters in the equations that define the class.
As an example, the equation for the class of circles is $Ax^2 + Ay^2 + Bx + Cy + D = 0$. This has four parameters, A, B, C and D, but since the equation determines the value of one of them (for a circle A is non-zero, so the equation can be divided through by A), the number of free parameters is three; the geometrical dimension of the class of circles is therefore three. A lower geometrical dimension translates to a simpler class.
Theorem: The geometrical dimension of a class of geometrical figures equals the theoretical dimension of that class.
Turney (1991) claims that the theorem is false. Let us consider the conic $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$, where at least one of A, B, C, D, E is not equal to 0. For a circle, $A = C$, $A \neq 0$ and $B = 0$. The requirement that at least one of A, B, C, D, E be non-zero arises from the fact that if $A = B = C = D = E = F = 0$, then the solution is the entire x-y plane with infinite dimensions.
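Consider the class of circles: since it has three free parameters, it has a geometrical dimension of three. Turney shows that it has a theoretical dimension of two. Let us consider the set of three points $P = \{(0, 0), (1, 1), (2, 2)\}$. Substituting these points into the conic equation with the circle conditions $A = C$ and $B = 0$, we get $F = 0$, $2A + D + E + F = 0$ and $8A + 2D + 2E + F = 0$. Subtracting twice the second equation from the third and using $F = 0$ gives $4A = 0$. This implies $A = 0$, which is a contradiction, as A is not equal to 0 for a circle. On dropping this condition, certain undesirable results are obtained: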
• Circles are not special cases of ellipses
• Lines are special cases of ellipses
Hence we do not drop this condition, thereby implying that no circle contains all the points in P. Since three is then the size of the smallest set of points that lie on no common circle, the class of circles has a theoretical dimension of two, not three, whereas its geometrical dimension is three, contrary to the theorem.
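Turney's counterexample can be checked with exact arithmetic. The following is a minimal sketch under stated assumptions: the helper name `circle_through` is hypothetical, the circle is written as $A(x^2 + y^2) + Dx + Ey + F = 0$ (the conic with $A = C$ and $B = 0$), and $A$ is normalised to 1.

```python
from fractions import Fraction


def circle_through(p1, p2, p3):
    """Coefficients (A, D, E, F) of A(x^2 + y^2) + Dx + Ey + F = 0 through three points.

    A is normalised to 1. Returns None when the points are collinear,
    because the three equations can then be satisfied only with A = 0.
    """
    (x1, y1), (x2, y2), (x3, y3) = [
        (Fraction(x), Fraction(y)) for x, y in (p1, p2, p3)
    ]
    # Determinant of the 2x2 system left after subtracting the first equation;
    # it vanishes exactly when the three points are collinear.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        return None
    # Right-hand sides of D*x + E*y + F = -(x^2 + y^2), i.e. with A = 1.
    r1, r2, r3 = (-(x * x + y * y) for x, y in ((x1, y1), (x2, y2), (x3, y3)))
    # Cramer's rule for D and E, then back-substitution for F.
    d = ((r2 - r1) * (y3 - y1) - (r3 - r1) * (y2 - y1)) / det
    e = ((x2 - x1) * (r3 - r1) - (x3 - x1) * (r2 - r1)) / det
    f = r1 - d * x1 - e * y1
    return (Fraction(1), d, e, f)


# The collinear set P from the text: no circle passes through all three points.
collinear = circle_through((0, 0), (1, 1), (2, 2))
# Three non-collinear points: exactly one circle (here the unit circle) fits.
unit_circle = circle_through((1, 0), (0, 1), (-1, 0))
print(collinear, unit_circle)
```

A set of three points can therefore already fail to lie on any circle, which is what Definition 1 needs for a theoretical dimension of two, while the number of free parameters in Definition 2 remains three.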
Such results can be illustrated with other conics as well.
Goodman (1972), as cited in Turney (1991), has also shown an inconsistency in Popper's equating of falsifiability with simplicity. Let us suppose a number of maple trees have been examined from a wide area and all of them have been found to be deciduous. Further suppose we have not visited a particular location, Eagleville. We might be considering a choice among the following hypotheses:
• All maples are deciduous, except those in Eagleville
• All maples are deciduous
• All maples elsewhere and all sassafras trees in Eagleville are deciduous.
Clearly the third statement is the most specific and the easiest to falsify, whereas the second is the simplest and the best. Thus Goodman concludes that there is no reason to necessarily equate simplicity to falsifiability.
Keita (1989) raises three important issues regarding falsification of universal statements:
• If induction does not work, then a scientific law L derived from a finite set of universal statements is false. Then the veracity of a theory which is constituted by L is questionable. The implication is that Popper attempts to falsify 'false' theories.
• In actual science, researchers attach a predictive judgment before testing a hypothesis which has already been subject to experiments. This is really resorting to the inductive method. Moreover, in the case of statistical inferences (which are not strictly scientific laws, as they do not predict events with a probability of one and are not explanatory but correlational), scientific laws are "established by enumeration of particular events".
• Actual scientific laws are not universal statements but are restricted in temporal and spatial scope. For instance, Henry's law, a law in the theory of phase equilibria, states: "At a fixed temperature, the amount of gas dissolved in a given quantity of solvent is proportional to the partial pressure of the gas above the solution or $P = XK$ where K is Henry's constant." But Henry's law is applicable to a finite class of gases subjected to experiments which define a set of conditions such as temperature and pressure. If this law has been satisfied under the experimental conditions at times $T_1$ to $T_{n-1}$, then it would be prudent to infer that it would be satisfied at time $T_n$. This takes the sting out of the criticism that induction takes a leap out of a finite set to an unrestricted class.

Results and Discussion
The following Table 2 records, in a structured manner, the insights from the review of literature conducted in section 3 and discusses them.

Table 2: Insights from the literature review and discussion
Source | Discussion
(Derksen, 1985) | The highly unlikely accident argument of Popper can be stated as "in case a theory withstands a set of risky and varied tests, it is highly improbable that this is due to an accident…". It has been correctly pointed out by Derksen (1985) that with respect to the past and present the argument is non-inductivist but is inductivist with respect to the future. The rationale that can justify Popper's reasoning is that the future is similar to the past. But this is an inductivist argument. Hence this contradicts Popper's deductivism. Hence Derksen's argument is valid.
(Gillies, 2003) | Popper, it seems, broadly agreed with the critique, and he answered the critique with a four-level model of statements based on their falsifiability and confirmability. The critique is a valid one. The critique and the consequent four-level model of Popper led to an understanding (Gillies, 2003) that level-two 'statements' such as Newton's first law had to be falsified through a "scientific revolution" as proposed by Thomas Kuhn.
(Maxwell, 1972) | The paper strikes at the basis of Popper's philosophy, namely, the ability of Popper's methodology and logic to demarcate science from non-science. The issue of two competing research programs T1 and T2 is well dealt with. The weakness of some of Popper's arguments, seemingly deductivist but actually shown to be inductivist, is witnessed in this article as well.
(Turney, 1991) | Popper claims that the geometrical dimension of a class of geometrical figures is equal to the theoretical dimension of that class. He, however, does not prove it. Popper's claim is not founded on reasoning. Popper's general claim that simplicity is the same as falsifiability is dented.
(Goodman, 1972) | In the limited context, Goodman's argument holds. However, whether the second statement, which is linguistically the simplest, is actually the simplest or not may be open to argument.
(Keita, 1989) | Induction is a recurring theme in almost all critique of Popper's methodology. It is also a fact that induction does play a part in science, but that is not a problem with Popper's argument. With regard to Henry's law, the inductive approach cannot test for all possible situations in the restricted "temporal and spatial scope". Hence Popper's criterion of falsifiability as a criterion of demarcating science from non-science remains valid.

Conclusion
A major critique of Popper is that his methodology of falsification is based on the inductive method that he himself despised (Derksen, 1985; Keita, 1989; Maxwell, 1972). This leaves Popper's methodologies and assertions on falsifiability open to argument. In any case, actual empirical science generally uses induction, and actual scientific laws are situated in specific spatio-temporal contexts, thus countering Popper's critique that inductivists take a wild leap of faith when extrapolating their findings to a universal context. The second critique, which Popper agreed with and replied to by strengthening his framework, is the Duhem-Quine thesis: actual science considers a group of hypotheses in tandem, and in such cases, whereas individual hypotheses are falsifiable, the group is only confirmable and not falsifiable. Third, while Popper equated simplicity with falsifiability, researchers such as Goodman (1972) and Turney (1991) have picked up flaws in this argument as well.
The paper takes a look at the various dimensions of the falsification criterion using a review of the extant literature. Various researchers have identified deficiencies in Popper's epistemology; moreover, actual science does not necessarily work in the way Popper visualizes it. Nonetheless, Popper's falsifiability remains a very strong criterion, especially where research can be founded on value-laden assumptions, such as in the social sciences including management, and scientists' willingness to subject their research to more difficult tests for falsifiability would constitute sound research practice.
As a student of philosophy of sciences, one would contend that the idea of probabilistic verification of Kuhn (1970) is in many cases a good guide to philosophy of sciences. Normal science, according to Kuhn, advances by probabilistic verification of competing theories, wherein the better theory becomes the most viable one through a process similar to natural selection. There is always an imperfect data-theory fit and in case of severe inconsistency, testing the theory through falsification would require a degree of falsification or level of improbability which leads to probabilistic verification. In this regard, Popper and Kuhn share a degree of unanimity.