Scraps And Crumbs A Review of the A.I.R. Report - 3 of 4
Paul H. Smith
"AN EVALUATION OF
REMOTE VIEWING: RESEARCH
AND APPLICATIONS"
This document is the third of a four-part review of the CIA-sponsored report by the American Institutes
for Research (AIR) on its evaluation of the U.S. government's twenty-four-year-long
remote viewing program. Part One, Bologna
on Wry Bread, covers the operational intelligence portion of the
program. Part Two, A Second Helping,
points out that the research reviewed by the AIR was inadequate as a basis
for a fair assessment of remote viewing. Part Three, Scraps
and Crumbs, examines the AIR's faulty evaluation of that research.
Part Four has additional notes and corrections.
If one is limited only to the information
contained in the AIR report, one forms the impression that the evaluators
did a reasonably thorough job in assessing the SAIC/SRI experiments and
analyzing the results. The ambiguous conclusions (that there is an anomaly,
but after 20+ years of research it is still a tentative one, and no cause
and effect has yet been demonstrated) lead surely to the AIR conclusion-of-choice
that it really doesn't make sense for the government to waste further money
on it. But we would be misled. The AIR examination was neither in-depth
nor conclusive.
AIR employees themselves focused mostly
on their rather cursory evaluation of the intelligence operations part
of the STAR GATE program. Though some of them were involved as well with
evaluating the remote viewing research program, they contributed little
but a brief concluding summation to the final AIR report. Drs. Utts and
Hyman, specially engaged by AIR to review the research program, produced
by far the bulk of that assessment. Utts' review comes first in the
report. She starts with a general discussion of the statistical theory
used to gauge experimental success in parapsychology research. She follows
this with an instructive discussion about RV experimental design, some
history of RV research, and an exploration of the SAIC experiments, augmented
by more detailed information in an appendix. She also discusses briefly
how the results correlate with earlier work done at SRI (they are consistent
with these earlier statistically-significant experiments), and also lists
the results of a number of related remote viewing and ganzfeld (a form
of remote viewing) experiments conducted at various labs around the world.
According to Utts, the effects of these strongly correlate with those achieved
in the SAIC remote viewing experiments.
In the course of her remarks she anticipates
and answers many of the objections Hyman later brings up in his portion
of the review. Even allowing for my own personal bias in favor of her conclusions,
I find her assessment to be more rational, well-reasoned, and soundly
supported than is that of Hyman.
On the other hand, so general are Hyman's
comments that he could handily have written most of his evaluation without
ever once having to refer to the remote viewing experiments themselves.
Ultimately, he acknowledges that there are significant effects demonstrated,
but then spends a good deal of time discussing why in principle he rejects
these effects. He admits that he can find no flaws in the experiments,
yet says we must wait indefinitely to decide whether they have or have
not proved a psi effect so as to allow a lengthy interval for thus-far
unidentified flaws to be ferreted out. He warns that given enough time,
methodological flaws might turn up that had not yet dawned on anyone. He
then cites as his only examples of such methodological flaws two cases
that are decades-old and unrelated to remote viewing, where the only "flaws"
uncovered were instances of fraud. Meanwhile, Utts has already pointed
out that fraud as an explanation is untenable because of the numbers of
institutions in diverse locations around the globe that have produced results
as significant as those of the SAIC experiments.
Utts later addresses and disposes of a
number of Hyman's other arguments and errors in her rebuttal that follows
Hyman's comments in the report. However, there are several other "literary
offenses" that Hyman or AIR or both commit that she does not discuss. Since
Hyman's evaluation is at the heart of the AIR case against the remote viewing
research program, I will focus my attention there. In the interests of
space—which I consume ever more of as this review progresses—I will only
consider a few of the more egregious errors and misjudgments the good doctor
makes.
THE BABY OUT WITH THE BATH
To begin with, Hyman and AIR ignored twenty
years of research conducted prior to the SAIC experiments. Despite the
AIR's express assignment to thoroughly review "all laboratory experiments
and meta-analytic reviews conducted as part of the research program," ultimately
only ten experiments were actually reviewed—all of them performed at SAIC
in just the last three or four years of the government's program. One reason
for this was likely, as Hyman says, the "limited time frame [that
was] allotted for this evaluation" [p. 3-43, 3-44]. The AIR reviewers were
given only a month and a half—from mid-July to the end of August—to conduct
a supposedly "exhaustive" review.
Ed May asserts in his own rebuttal to the
AIR report (Journal of Scientific Exploration, vol. 10, no. 1, Spring 1996)
that in recognition of this unrealistically short time allotment, someone
at AIR requested May provide only the reports from his ten best experiments
for evaluation. Quite properly he demurred, since for sound scientific
reasons this would skew the results—in so doing, only successful results
would be considered, when to form a fair picture required that poor results
should be evaluated as well (reporting only the experiments that show
positive effects, while the failures stay filed away, is known as the "file drawer" effect). As an alternative,
May proposed a different procedure that would have allowed examination
of all the materials within the time constraints, resulting in a much more
thorough and reliable assessment. His suggestion was ignored.
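To see why a "ten best" sample would have skewed the picture, consider a minimal sketch in Python. Everything in it is a hypothetical assumption chosen for illustration, not a detail from May's rebuttal or the AIR report: 100 imaginary experiments of 20 trials each, a pure-chance hit rate of 0.2, and no real effect at all. The function name `simulate_selection` and its parameters are likewise made up for this example.

```python
import random


def simulate_selection(n_experiments=100, trials=20, p_chance=0.2, keep=10, seed=0):
    """Compare the mean hit rate of all experiments with that of the 'best' few.

    Every experiment here is pure chance (hypothetical numbers), so any gap
    between the two means comes from the selection step alone.
    """
    rng = random.Random(seed)
    hit_rates = []
    for _ in range(n_experiments):
        # Count trials that "hit" purely by luck.
        hits = sum(rng.random() < p_chance for _ in range(trials))
        hit_rates.append(hits / trials)

    # Keep only the highest-scoring experiments, as a "ten best" request would.
    best = sorted(hit_rates, reverse=True)[:keep]

    mean_all = sum(hit_rates) / len(hit_rates)
    mean_best = sum(best) / len(best)
    return mean_all, mean_best


if __name__ == "__main__":
    mean_all, mean_best = simulate_selection()
    print(f"mean hit rate, all experiments: {mean_all:.3f}")
    print(f"mean hit rate, ten best only:   {mean_best:.3f}")
```

Even with nothing but luck at work, the hand-picked subset scores at least as well as the full set, and in practice noticeably better. That is the distortion May was right to refuse.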
Instead, in a conference call between the
AIR evaluators, Hyman got agreement that only the ten latest experiments
would be evaluated. It was tacitly recognized that there were both relevant
and irrelevant experiments among these ten, but it made for a more manageable
evaluation pool, and it avoided the "file drawer" problem.
This is where it gets interesting. As earlier
noted, Hyman explains that a limited number of experiments were selected
because of lack of time to consider all of those available, and these ten
were the most recent. But he also cavalierly dismisses the need to examine
the other two decades' worth of experiments by alleging that the handful
of SAIC experiments selected were "the only ones for which we have adequate
documentation" (p. 3-43). Earlier research was discounted as suffering
"from methodological inadequacies" upon which he chooses not to elaborate
further in his report. Hyman makes this amazing assertion despite the fact
that he had never even looked at the documents of which he is being so
dismissive. Sometime back in the mid-1980s, he reportedly saw some of the
results from the first few years of SRI experiments when he participated
in another flawed "scientific" evaluation of enhanced human performance
programs [i.e., the National Research Council's somewhat infamous "Enhancing
Human Performance" report].
Still, there remained perhaps ten years'
worth of subsequent remote viewing research conducted at SRI and elsewhere
to which Hyman had never previously had access. It, along with the ten
SAIC experiments, had been classified Secret or higher until the CIA decided
to make it all available in support of the AIR study.
Because of the CIA's declassification action,
Hyman finally WAS authorized access to the majority of the research, had
he chosen to examine it. However, he himself admits he never bothered,
since most of the experiments prior to the SAIC era were in the "three
large cartons of documents" he was given at the outset of the study but
which he freely admits in a recent article he "didn't have time" to look
into (Skeptical Inquirer, March/April 1996, p. 22). In short, he couldn't
possibly have known whether those experiments really did suffer from "methodological
inadequacies."
Still, Dr. Hyman couches his remarks in
such a way as to make an unsuspecting reader suppose that the ten experiments
reviewed were the best examples available. Though he clearly knew better,
he nevertheless claims in the Skeptical Inquirer article that the ten experiments
he and Dr. Utts evaluated were the "ten best studies," and "the best [RV]
laboratory studies" (p. 22), implying by assumption that they must therefore
be sufficient on which to base an adequate assessment of remote viewing.
This despite the fact, previously explored in Part Two of this review, that
a number of the SAIC experiments had little or nothing to do with remote
viewing, and that the remainder were generally not fully state-of-the-art
RV experiments.
Nonetheless, a mere two pages after telling
us that he and his AIR fellows themselves arbitrarily decided that only
ten experiments would be reviewed, he proceeds to deplore the entire two-and-a-half
decades of research for producing "only ten adequate experiments for consideration."
Hyman writes:
"Unfortunately, ten experiments . . . is far too few to establish reliable relationships in almost any area
of inquiry. In the traditionally elusive quest for psi, ten experiments
from one laboratory promise very little in the way of useful conclusions."
(3-46)
He is, of course, absolutely right in the
process of being altogether wrong.
PRIMA FACIE EVIDENCE
The arbitrarily limited database is not
the only difficulty with AIR's study. Perhaps more problematic is Hyman's
arbitrary exclusion of so-called "prima facie" evidence (3-71). This is
introduced in the section where Hyman (without, I might add, any qualifications
whatsoever in the field of intelligence) considers whether RV has potential
for use in operational intelligence settings. Though in this part of his
discussion he is concerned with practical applications, he seems to have
carried over this bias against prima facie evidence from his treatment
of the research program itself.
Hyman says that he relies on a definition
of prima facie evidence that originated with May and Utts. In her remarks
(3-11), Utts describes prima facie RV evidence as a remote viewing result
that is so spectacularly accurate that it virtually proves the existence
of the phenomenon, though it is beyond the ability of statistics to describe.
This meaning is derived from jurisprudence definitions of prima facie evidence
as that evidence which clearly proves a fact, if there can be no other
explanations for what has occurred.
Prima facie evidence of remote viewing
would be unambiguous information produced by a viewer about a target that
could not have been obtained in any other way (e.g., fraud, leaky methodology,
etc.). This might be in the form of sketches or verbal responses or both.
If the target were, for example, the Eiffel Tower, the sketches and/or
verbal descriptions would strikingly match the Eiffel Tower.
There was apparently no specific "prima
facie" proof in the ten SAIC experiments (though a couple of the RV sessions
appear to have come close), so Hyman's embargo of such evidence would seem
not to matter much. However, despite his remarks to the contrary, he doesn't
seem to be working from the same definition of prima facie evidence to
which Utts and May subscribe. Hyman doesn't elaborate further as to what
his personal understanding of the term is, but from the context it seems
apparent that he means to exclude all evidence that cannot be statistically
evaluated. If someone designated as judge must look at an RV result, compare
it to a target, then come to a conclusion based on his/her own opinion
as to whether or not it matches, that evidence is unacceptable because
it is based on a subjective judgement.
One of the most time-honored evaluation
methods in remote viewing research is to provide the judge with the same
set of targets used to task the remote viewers, then allow the judge to
"blind match" the remote viewer's results against all the possible targets
in that pool. Since the judge thus has no idea what the original target
was except that it had been selected from the available target pool, the
belief is that the better the RV session, the more likely is the judge
to correctly match the viewer's results to the actual target. How many
times the judge successfully matches a session to its correct target is
then quantified with statistics. It's obvious that this is only one step
removed from subjective judgement. But it allows the RV data to be turned
into numbers, which can then be more easily manipulated.
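As a rough illustration of how blind matching turns into a number, here is a minimal sketch in Python. It assumes the simplest possible scoring, which is my simplification and not necessarily what SRI or SAIC used: each session is judged against a pool of `pool_size` candidate targets, only a correct first-place match counts as a hit, and a judge with no real information hits with probability 1/`pool_size`. The function name `chance_probability` and the figures in the usage lines are hypothetical.

```python
from math import comb


def chance_probability(hits: int, sessions: int, pool_size: int) -> float:
    """Probability of at least `hits` correct matches in `sessions` by luck alone.

    Assumes independent sessions and a judge who picks one of `pool_size`
    candidate targets at random (hit probability 1 / pool_size per session).
    """
    p = 1.0 / pool_size
    # Upper tail of the binomial distribution: sum P(X = k) for k >= hits.
    return sum(
        comb(sessions, k) * p**k * (1 - p) ** (sessions - k)
        for k in range(hits, sessions + 1)
    )


if __name__ == "__main__":
    # Hypothetical example: 20 sessions, 5 candidate targets per session.
    for hits in (4, 8, 12):
        print(hits, "hits or more by chance:", chance_probability(hits, 20, 5))
```

The smaller the result, the harder it is to attribute the judge's matches to luck. Note that the calculation says nothing about how striking any single session was, only how often the judge picked the right target.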
This procedure works so long as there is
a reasonably limited target pool. However, if the target pool is infinite—i.e.,
could be any site, person, object, or event in the entire world (as is
the case in intelligence operations)—it is virtually impossible for a judge
to be able to match an RV session transcript to a given target based only
on internal information. If the viewer says the site is the Eiffel Tower,
the judge must evaluate the session data, and if it matches the Eiffel
Tower, he/she must go with that conclusion. Success or failure cannot be
statistically determined in such a situation. Either the viewer accurately
and unmistakably describes the site, or he/she doesn't.
Let's say in the case of the "Eiffel Tower"
session that the site was actually a missile launch gantry at Vandenberg
AFB. Let's say further that the viewer's data was all extremely accurate
in describing the gantry, but that the girder latticework, the strong
vertical orientation, and the metallic construction caused the viewer to
subjectively interpret the site as the Eiffel Tower. In a blind-judging
situation with an infinite target pool, this session would be judged as
a miss.
Obviously, it was not a miss. The data
was accurate, but the viewer's subjective interpretation was wrong. It
is clear that another option for judging the accuracy of such a session
is necessary. The only alternative that I know of would allow the judge
to concurrently compare the actual target information with the session
data the remote viewer produced to see how closely the RV data matches the
actual site. Of course, the judge is no longer "blind," so this becomes
an exercise in subjective judgement, and would therefore be rejected out
of hand according to Hyman's criteria.
Certainly, there are potential problems
with subjective evaluations of this nature. If the data is somewhat ambiguous—that
is, the elements contained in the feedback potentially match several targets—then
the human tendency might be for the judge to think he/she sees the target
in the data, even though the data itself isn't accurate enough for a truly
objective match.
But with "prima facie" evidence, we are
not talking about these ambiguous cases, but rather a target and transcript
that match unambiguously. Any competent person would recognize that the
target folder and the remote viewing data describe the same target. Ray
Hyman would, unfortunately, exclude this as evidence.
As justification for this rejection Hyman
cites a study done by David Marks and Richard Kammann in 1981 that purports
to prove that a psychological phenomenon they call "subjective validation"
was responsible for good results shown by early SRI remote viewing experiments.
Essentially, Marks and Kammann maintain that a judge may see what s/he wants
to see in evaluating any given remote viewing session, since viewers often
describe a variety of elements that might be found in more than one target.
However, this study centered around blind judging of targets from a limited
target pool, some targets of which shared characteristics with other targets
in the series.
This does not hold water in relation to
the definition that Utts and May had in mind when referring to prima facie
evidence. A true "prima facie" session is not ambiguous. There is NO DOUBT
that the correct target has been addressed and described, and any reasonable
person would be able to make that same judgement.
In effect, Hyman rejects the use of any
sketches or other visual data that must be subjectively compared to the
target to determine whether there is correspondence or not. If the viewer
is targeted (in the blind, of course) against the Eiffel Tower, and during
the course of the session draws unmistakably the Eiffel Tower, it is by
Hyman's standards still inadmissible as evidence of remote viewing. What
Hyman and his colleagues seem to be saying is that even if it looks like
a duck, walks like a duck, quacks like a duck, and floats like a duck,
we must assume that it's NOT a duck until we have something more convincing.
The irony is that if Hyman's strictures
were applied to conventional science, numerous branches of study that rely
on subjective comparisons between one thing and another would dry up and
blow away—among these, plant and animal taxonomy, paleontology, and comparative
biology.
LOST IN THE NUMBERS, OR "STATISTICS AIN'T EVERYTHING!"
Early in his remarks Hyman alleges that
"Parapsycholo[gy] is unique among the sciences in relying solely on significant
departures from a chance baseline to establish the presence of its alleged
phenomenon" (p. 3-51). In other words, parapsychology is the only science
that has to prove itself by showing that something consistently happens
more often than you would expect by accident.
Hyman is generally right in saying this
about statistical proof as far as psychokinesis (PK) research is concerned—no
one has yet demonstrated under scientific conditions the moving of lamps
or pianos through the air using "mental" power alone. Indeed, most PK research
involves microeffects that only manifest themselves as statistical deviations
from the chance baseline to which Hyman refers. One of SAIC's experiments—the
computer-driven binary-choice experiment—falls into this "deviation from
chance" category.
Hyman is wrong, however, in claiming that
remote viewing (obviously a parapsychological effect) is provable only
by a statistical deviation from chance. Valid remote viewing produces true
"macro" effects in the form of word descriptions, drawings, sketches, etc.,
that provide information directly applicable to the real world. The statistics
involved in evaluating RV research are really only an imperfect, after-the-fact
attempt to measure how well remote viewing works in a given experiment.
The statistical analysis also serves the goal of limiting the subjective
judging mistakes to which humans are vulnerable in ambiguous situations.
But the statistical evaluations are not
the proof. The proof is the information provided during the session that
could not possibly have been obtained through any other known means of
communication. Statistics can be extremely useful as an evaluative tool,
but relying too much on them can also be dangerous. It is too easy to get
lost in the numbers and lose sight of what they represent.
In theoretical terms, it only takes a single
successful remote viewing session to prove once and for all the existence
of the phenomenon. If a viewer in isolation provides accurate data about
a target, and if ALL other means by which the information could have been
obtained can be ruled out—to include both fraud and chance, no matter how
unlikely—the only possible conclusion left must be something beyond our
current understanding of the physical universe: in other words "paranormal."
We do not, however, live in a perfect world.
First, there is always a possibility that through some incredible hiccup
of fate the viewer might by accident hit on the correct target. Second,
in the real world theoretical perfection in experimental design is approachable
but ultimately unreachable; we often cannot conclusively rule out every
explanation besides psi for the effects of a given experiment, the first
(or even second or third) time around. Therefore, science insists on replication
of successful experiments before the phenomenon the experiments were meant
to confirm may be accepted as being real.
Let us assume, now, that after much thought,
trial, and error, a proposed set of remote viewing experiments has been
"hermetically sealed" against external contamination, mistaken analysis,
erroneous conclusions, etc. Let us further suppose that the experimental
design is excellent, with a virtually unlimited target pool, and constructed
such that clear distinctions between accurate and inaccurate data can be
made when it comes time to judge results. Let us finally suppose that there
is adequate oversight to guarantee against fraud.
Now, what if after one or two experimental
sessions, a remote viewer produces an excellent match with the chosen target?
This could of course be just wild, hole-in-one luck. Let's say further
that after two or three more sessions there is another unmistakable, if
uncanny match. Still chance? Yes, but considerably less likely. But what
if the viewer continues to have these explicit matches every few sessions—indeed
has runs where maybe two or three sessions in a row match significantly—or
even precisely—with the respective targets? At what point do we give up
on chance and acknowledge that something is going on that can't be explained
in standard physical terms?
These results could not be evaluated statistically—at
best one could say 50% of the time the viewer was accurate, or 30% or 72%,
or whatever. But these statistics would be completely meaningless. According
to Hyman's interpretation of the rules of empirical science, barring a
very rare accident of probability the viewer should not be able to describe
the target accurately even ONCE. If the viewer is successful in describing
the target not just once but a number of times on an ongoing basis, the
fact is that it doesn't matter if he or she fails most of the rest of the
time. In the paradigm of the physical universe under which Hyman and his
AIR friends operate, the viewer should ALWAYS be wrong. This is not proof
obtained by statistical "deviation from a chance baseline." Those terms
make no sense here. Yet this is indeed proof, though proof that is unacceptable
to the skeptics.
Ironically, the requirement for statistical
proof that Hyman deplores was imposed on RV research by the skeptics themselves
when they rejected evidence that required subjective evaluation of any
sort, no matter how obvious. Now, based on Jessica Utts' thorough discussion
in the AIR report, it seems clear that the statistical evidence Hyman and
his fellows demanded has now been provided. Yet Hyman states that it is
premature to accept these figures as proof. We must wait to see if anyone
can come up with some way of showing that the data does not say what it
obviously does say. In other words, now that we can no longer dispute that
it looks, walks, and quacks like a duck, we must now carry out exhaustive
genetic tests to prove its ducky heritage. When THOSE tests confirm that
it is a duck, then we must wait through a few more generations of technical
development in genetic testing to see if we can create a test that WILL
prove that it is not a duck.
But this attitude is no surprise. Skeptical
evaluation of psi research has often resembled an archery match where during
the contest the judges keep moving the target of one competitor while leaving
those of all other contestants in place. By refusing to acknowledge that
there is now adequate proof that psi exists; by insisting that we cannot
make any judgement about the existence of psi based on SAIC's experiments
(as well as the others mentioned by Utts); by declining to examine ALL
the newly available experimental evidence; and by failing altogether to
consider the historical track record of the intelligence operations portion
of STAR GATE's predecessors, Hyman and his cohorts
have effectively "moved the target" once more. In so doing, Hyman has not
preserved the purity of science. He has only demonstrated his apparent
intention never to accept ANY proof, no matter how compelling, for the
effectiveness of remote viewing or the existence of psi.
SUMMATION
Since at the conclusion of all three parts
of this review the discussion is now quite long and convoluted, I shall
summarize the general points below:
- AIR narrowed the scope of its evaluation to
focus on only a few years and a few experiments out of more than two decades
of RV research and many experiments. As a result, the AIR assessment is
useless as a comprehensive and meaningful evaluation of remote viewing
and its practical applications.
- The SAIC experiments that AIR reviewed were
not themselves a fair test of the remote viewing phenomenon. Yet despite
their shortcomings, the experiments still demonstrated a persistent positive
result that it seems can only be ascribed to a paranormal cause.
- Though Hyman admits the data shows an effect,
he wants to keep the door open indefinitely—never admitting that psi may
be involved—in hopes that eventually an alternative explanation to psi
can be discovered to account for these effects (by inference, he seems
to imply fraud).
- Ultimately, though Utts makes a far stronger
case for the existence of some sort of psi phenomenon being evidenced by
SAIC results, AIR throws the debate to Hyman, without satisfactorily explaining
why his case was deemed more compelling. Based on his flawed evaluation
Hyman decides that he has sufficient data and personal expertise to extend
his evaluation into the operational arena—and concludes that remote viewing
is of no use in intelligence collection.
Of course, the purported motivation for the
AIR evaluation that produced the flawed report for the CIA was to determine
whether remote viewing was useful as an intelligence collection tool. By
the manner in which the study was conducted and in the way the negative
conclusions were reached in the report, it should be clear by now that
the evaluation not only failed to honestly determine whether remote viewing
was of any intelligence use: It also showed conclusively that there was
an unacknowledged, predetermined agenda to produce negative findings as
the conclusion to the report.
Presumably, the AIR itself had no particular
prior bias against remote viewing. This leaves the contracting agency as
the culprit. It would seem that the Central Intelligence Agency gave the
AIR its marching orders: to find no merit in the program no matter what
the evidence itself showed. In Part One I suggested reasons for this, but
at this point all of that remains speculative. Nonetheless, there
does appear to be a smoking gun here; and, as has so often been the case
recently, it seems to be lying at the feet of the CIA.
This is part 3 in a series of 4.
© 1996 Leonard Buchanan on behalf of Paul Smith aka "Mr. X"