Howland will forgery trial

From Wikipedia, the free encyclopedia


The Howland will forgery trial was a U.S. court case in 1868 to decide Henrietta Howland Robinson's contest of the will of Sylvia Ann Howland. It is famous for the forensic use of mathematics by Benjamin Peirce as an expert witness.

Robinson v. Mandell

Sylvia Ann Howland died in 1865, leaving roughly half her fortune, of some USD 2 million, to various legatees with the residue to be held in trust for the benefit of Robinson, Howland's niece. The principal was to be distributed to various beneficiaries on Robinson's death.

Robinson produced an earlier will, leaving her the whole estate outright. To the will was attached a second and separate page, putatively seeking to invalidate any subsequent wills. Howland's executor, Thomas Mandell, rejected Robinson's claim, insisting that the second page was a forgery, and Robinson sued.

In the ensuing case of Robinson v. Mandell, Charles Sanders Peirce testified that he had made pairwise comparisons of 42 examples of Howland's signature, overlaying them and counting the number of downstrokes that overlapped. Each signature featured 30 downstrokes and he concluded that, on average, 6 of the 30 overlapped, or 1 in 5. When the admittedly genuine signature on the first page of the contested will was compared with that on the second, all 30 downstrokes coincided, suggesting that the second signature was a tracing of the first.

Benjamin Peirce, Charles' father, then took the stand and asserted that the probability that all 30 downstrokes should coincide in two genuine signatures was 1 divided by 2,666,000,000,000,000,000,000. He went on to observe:
"So vast an improbability is practically an impossibility. Such evanescent shadows of probability cannot belong to actual life. They are unimaginably less than those least things which the law cares not for. ... The coincidence which has occurred here must have had its origin in an intention to produce it. It is utterly repugnant to sound reason to attribute this coincidence to any cause but design."

The court ruled that Robinson's testimony in support of Howland's signature was inadmissible as she was a party to the will. The statistical evidence was not called upon in judgement.

Statistical analysis

The case is one of a series of attempts to introduce mathematical reasoning into the courts. People v. Collins is a more recent example.

Testing hypotheses suggested by the data

One potential issue with Peirce's line of argument is why the particular metric of overlapping downstrokes was chosen, rather than one of the many other ways of quantifying the similarity of two signatures. Did he look at the data and decide that downstroke matching would be a fruitful line of attack, or did he decide on downstroke matching before seeing the data? In such situations, there is always a fear that analysts are, perhaps inadvertently, indulging in testing hypotheses suggested by the data, which can generate very impressive but utterly spurious p-values.
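The size of this effect can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that an analyst could choose among k candidate similarity metrics whose p-values are independent and uniformly distributed when both signatures are genuine. Neither the number of metrics nor the independence assumption comes from the case record, and the function name is hypothetical.

```python
# Illustrative sketch: k candidate metrics, ASSUMED independent under the
# hypothesis that both signatures are genuine (not taken from the case).

def chance_of_spurious_hit(k, threshold=0.05):
    """Probability that at least one of k null p-values falls below threshold."""
    return 1.0 - (1.0 - threshold) ** k

for k in (1, 5, 20, 100):
    print(f"{k:>3} candidate metrics: {chance_of_spurious_hit(k):.3f}")
```

Under these assumptions, the chance of finding some "significant" metric grows quickly with the number of metrics that could have been tried, which is why it matters whether the metric was fixed before the data were examined.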

Posterior match rate distribution based on 42 signatures and a Beta(0.5,0.5) prior

A modern Bayesian analysis

In attempting to understand Peirce's argument more deeply it is helpful to try to replicate it using a modern statistical analysis. It is interesting to consider how his figure of "1 divided by 2,666,000,000,000,000,000,000", or <math>3.75 \times 10^{-22}</math>, was arrived at. One possibility is that it is the joint probability of 30 independent events (the downstroke matches), each of which has probability 1 in 5 (taken from the proportion found in the sample). However, 1/5 to the 30th power is not <math>3.75 \times 10^{-22}</math> but <math>1.07 \times 10^{-21}</math>, or 1 in <math>9.31 \times 10^{20}</math>, which seems to be a substantial error. If Peirce simply used the argument above, raising 1/5 to the 30th power (which seems unlikely), then it is an approximate calculation of a Bayes factor, with the approximation being made that the proportion of downstroke matches in a collection of true signatures is exactly 1 in 5. However, this proportion was found from a sample of 42 signatures and is thus subject to some sampling error. A modern Bayesian analysis will take this uncertainty into account, yielding a slightly different answer.
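The arithmetic in this comparison can be checked directly. The following is a minimal sketch using exact rational arithmetic; the variable names are illustrative and not part of any source.

```python
from fractions import Fraction

# Peirce's stated figure: 1 divided by 2,666,000,000,000,000,000,000.
peirce = Fraction(1, 2_666_000_000_000_000_000_000)

# Naive figure: 30 independent downstroke matches, each with probability 1/5.
naive = Fraction(1, 5) ** 30

print(f"Peirce's figure      : {float(peirce):.3e}")
print(f"(1/5)^30             : {float(naive):.3e}")
print(f"(1/5)^30 as odds     : 1 in {float(1 / naive):.3e}")
print(f"ratio naive / Peirce : {float(naive / peirce):.2f}")
```

The last line reports how many times larger the naive figure is than the one Peirce gave in testimony.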

We start by considering what we know about the downstroke match proportion before we see the data, i.e. we first capture any relevant contextual information that is available. This contextual information has to be represented as a probability distribution, which is called the prior distribution. In this case we have no such contextual information, so we assign a vague prior distribution to the downstroke match proportion. This means that everything we know about the downstroke match proportion will come from the sample of 42 signatures.

A suitable prior distribution would be a beta distribution, which is a distribution that sits on the unit interval [0,1] and thus is useful for representing proportions. The beta distribution is defined by two parameters, alpha and beta. For setting up a vague prior there are three choices of alpha and beta that are widely used. We will set alpha = beta = 0.5, the Jeffreys prior. Other possibilities would be the uniform prior alpha = beta = 1, under which all possible values of the match proportion from 0 to 1 are considered equally likely before we see the data from the 42 signatures, or the improper prior alpha = beta = 0. (The result does not depend greatly on which is chosen.) The evidence is equivalent to saying that, of the 30 times 42 = 1260 downstroke events, 1 in 5 of them are matches, i.e. there are 252 matches and 1008 non-matches.

The next stage is to multiply the prior by the likelihood, then normalise the result so that it integrates to 1 over the interval [0,1]. The result is another probability distribution, called the posterior distribution. This is the distribution which tells us everything we know so far about the match proportion. The beta distribution is the natural conjugate prior to the binomial, which means that the posterior is another beta distribution. In this case, for a binomial likelihood with 252 matches and 1008 non-matches, the posterior will be a beta with parameters (252 plus alpha) and (1008 plus beta). See the figure for a plot of this posterior distribution. It has a fairly sharp peak near 0.2, but it is not of zero width. Assuming that the match proportion was exactly 1 in 5 would be to approximate the peak by a spike of zero width at x = 0.2.
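The conjugate update and the width of the resulting peak can be sketched in a few lines. The snippet below uses the standard closed-form mean and variance of a beta distribution and compares the three vague priors mentioned above; the function names are illustrative.

```python
import math

MATCHES, NON_MATCHES = 252, 1008  # 1 in 5 of the 30 * 42 = 1260 downstroke events

def posterior_params(alpha, beta):
    """Beta prior + binomial data -> parameters of the beta posterior."""
    return MATCHES + alpha, NON_MATCHES + beta

def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)

for name, prior in (("uniform", 1.0), ("Jeffreys", 0.5), ("improper", 0.0)):
    a, b = posterior_params(prior, prior)
    mean, sd = beta_mean_sd(a, b)
    print(f"{name:>8} prior: Beta({a}, {b}), mean = {mean:.4f}, sd = {sd:.4f}")
```

The three posteriors have almost the same centre and spread, and the spread is small but not zero, which is the uncertainty that the fixed "exactly 1 in 5" assumption ignores.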

Having obtained this posterior distribution, the second stage of the calculation is to compute the probability of observing r = 30 matches, assuming a binomial distribution with N = 30 and a success probability which is unknown, but which follows the previously calculated beta posterior distribution. This is given by averaging over all possible values of the match proportion, weighting each value by its posterior probability. This is

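<math>
p(r = 30 \mid \text{data}) = \frac{1}{B(252.5, 1008.5)} \int_0^1 \theta^{30} \, \theta^{251.5} (1 - \theta)^{1007.5} \, d\theta = 4.153092037700561 \times 10^{-21},
</math>

where B(a,b) is the beta function and "data" denotes the 252 matches and 1008 non-matches in the sample. The match proportion <math>\theta</math> is integrated out, so the result depends only on the data and the prior.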

This gives the probability that 30 matches would be observed, given that the signature on the codicil is genuine. It can be expressed in the form of odds, as Peirce did, as

<math>

\frac{1}{4.153092037700561 \times 10^{-21}} = 2.40784454311 \times 10^{20} .</math>
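The integral has a closed form as a ratio of beta functions, B(a + 30, b) / B(a, b), where a and b are the posterior parameters. The sketch below evaluates it with the standard library's log-gamma function and cross-checks it with an equivalent running product; the function names are illustrative, and the loop over priors is added only to show the sensitivity to that choice.

```python
import math

def prob_all_match(a, b, n=30):
    """P(n matches out of n) when the match proportion follows Beta(a, b).

    Equals B(a + n, b) / B(a, b), evaluated via log-gamma for stability.
    """
    log_p = (math.lgamma(a + n) + math.lgamma(a + b)
             - math.lgamma(a) - math.lgamma(a + b + n))
    return math.exp(log_p)

def prob_all_match_product(a, b, n=30):
    """Same quantity written as a product of n ratios, as a cross-check."""
    p = 1.0
    for i in range(n):
        p *= (a + i) / (a + b + i)
    return p

for name, prior in (("uniform", 1.0), ("Jeffreys", 0.5), ("improper", 0.0)):
    p = prob_all_match(252 + prior, 1008 + prior)
    print(f"{name:>8} prior: p = {p:.4e}, i.e. 1 in {1 / p:.4e}")
```

The Jeffreys row corresponds to the parameters 252.5 and 1008.5 used in the integral above.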

In view of the similarity of this result to Peirce's reported result, it is likely that he did a Bayesian calculation similar to this one. He may have used a prior other than those listed above. Alternatively, given the computing tools available in 1868, he may have approximated the integral, or he may simply have made an error in its calculation.

As it stands, however, this argument misses the point. It is an example of what is often regarded as the prosecutor's fallacy. The Peirces' analysis attempts to calculate the probability that two signatures would display such a degree of similarity given that they were genuine:

<math>P_\text{Peirce} = P(\text{30 coincident downstrokes} \mid \text{genuine signature}).</math>

However, what is relevant to the court is the probability that the signatures are genuine given their similarity:

<math>P_\text{Genuine signature} = P(\text{genuine signature} \mid \text{30 coincident downstrokes}).</math>

To relate the two probabilities requires the use of Bayes' theorem:

<math>P_\text{Genuine signature} \propto P_\text{Peirce} \times P(\text{genuine signature}),</math>

where P(genuine signature) is the probability that the signature is genuine given all the other evidence in the case.

Under the alternative hypothesis that the signature was a traced copy of the signature on the first page, the number of downstroke matches would be 30, i.e. the probability of 30 matches is 1.

Bayesian statistics uses a measure of evidence called the Bayes factor, which is the probability of seeing the observed data if the hypothesis of interest is true, divided by the probability of seeing the observed data if the hypothesis is false. Thus both the numerator and the denominator of the Bayes factor are known: the probability of 30 matches is 1 under tracing and is the value calculated above under a genuine signature. The posterior odds are obtained by multiplying the Bayes factor by the prior odds.

If we assume that the two hypotheses are equally likely a priori (odds of 1:1), the posterior odds against the hypothesis that the signature is genuine are

<math>1 \times 2.40784454311 \times 10^{20}</math>

to one.

The factor of 1:1 is our prior estimate (before seeing the data) about the provenance of the document, which can never be entirely removed from the problem. In this case the evidence is very powerful indeed. If our prior belief was that the odds against the codicil being fake were one million to one, we would still arrive at the posterior conclusion, after factoring in the downstroke evidence, that the odds were <math>2.40784454311 \times 10^{14}</math> to one against the codicil being genuine.
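The update from prior odds to posterior odds is a single multiplication, sketched below. The Bayes factor is the figure quoted in this section; the function names and the conversion from odds to probability are illustrative additions.

```python
# Bayes factor in favour of tracing, as quoted in this section:
# P(30 matches | traced) / P(30 matches | genuine) = 1 / 4.153...e-21.
BAYES_FACTOR = 2.40784454311e20

def posterior_odds_of_tracing(prior_odds_of_tracing, bayes_factor=BAYES_FACTOR):
    """Posterior odds = Bayes factor * prior odds (odds form of Bayes' theorem)."""
    return bayes_factor * prior_odds_of_tracing

def odds_to_probability(odds):
    """Convert odds of x to 1 into a probability."""
    return odds / (1.0 + odds)

for label, prior_odds in (("even prior (1:1)", 1.0),
                          ("a million to one against tracing", 1e-6)):
    post = posterior_odds_of_tracing(prior_odds)
    print(f"{label}: posterior odds of tracing = {post:.4e} to 1")
```

Even a prior that strongly favours the codicil being genuine is overwhelmed by a Bayes factor of this size, which is the point made in the paragraph above.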

Bibliography

  • Robinson v. Mandell, 20 F. Cas. 1027 (C.C.D. Mass. 1868) (No. 11,959)
  • Menand, L. (2002) The Metaphysical Club: A Story of Ideas in America ISBN 0-00-714737-6, pp163-176
  • Meier, P. & Zabell, S. (1980) "Benjamin Peirce and the Howland Will", Journal of the American Statistical Association vol. 75 p497
  • "The Howland Will Case", American Law Review vol. 4 p625 (1870)
  • Eggleston, Richard (1983) Evidence, Truth and Probability ISBN 0-297-78263-0