If one is limited only to the information contained in the AIR report, one forms the impression that the evaluators did a reasonably thorough job in assessing the SAIC/SRI experiments and analyzing the results. The ambiguous conclusions (that there is an anomaly, but after 20+ years of research it is still a tentative one, and no cause and effect has yet been demonstrated) lead surely to the AIR conclusion-of-choice that it really doesn’t make sense for the government to waste further money on it. But we would be misled. The AIR examination was neither in-depth nor conclusive.
AIR employees themselves focused mostly on their rather cursory evaluation of the intelligence operations part of the STAR GATE program. Though some of them were involved as well with evaluating the remote viewing research program, they contributed little but a brief concluding summation to the final AIR report. Drs. Utts and Hyman, specially engaged by AIR to review the research program, produced by far the bulk of that assessment. Utts’ is first sequentially in the report. She starts with a general discussion of the statistical theory used to gauge experimental success in parapsychology research. She follows this with an instructive discussion about RV experimental design, some history of RV research, and an exploration of the SAIC experiments, augmented by more detailed information in an appendix. She also discusses briefly how the results correlate with earlier work done at SRI (they are consistent with these earlier statistically-significant experiments), and also lists the results of a number of related remote viewing and ganzfeld (a form of remote viewing) experiments conducted at various labs around the world. According to Utts, the effects of these strongly correlate with those achieved in the SAIC remote viewing experiments.
In the course of her remarks she anticipates and answers many of the objections Hyman later brings up in his portion of the review. Even allowing for my own personal bias in favor of her conclusions, I find her assessment to be more rational, well-reasoned, and soundly supported than is that of Hyman.
On the other hand, so general are Hyman’s comments that he could handily have written most of his evaluation without ever once having to refer to the remote viewing experiments themselves. Ultimately, he acknowledges that there are significant effects demonstrated, but then spends a good deal of time discussing why in principle he rejects these effects. He admits that he can find no flaws in the experiments, yet says we must wait indefinitely to decide whether they have or have not proved a psi effect so as to allow a lengthy interval for thus-far unidentified flaws to be ferreted out. He warns that given enough time, methodological flaws might turn up that had not yet dawned on anyone. He then cites as his only examples of such methodological flaws two cases that are decades old and unrelated to remote viewing, where the only “flaws” uncovered were instances of fraud. Meanwhile, Utts has already pointed out that fraud as an explanation is untenable because of the number of institutions in diverse locations around the globe that have produced results as significant as those of the SAIC experiments.
Utts later addresses and disposes of a number of Hyman’s other arguments and errors in her rebuttal that follows Hyman’s comments in the report. However, there were several other “literary offenses” (to borrow from Mark Twain) that Hyman or AIR or both commit that are not discussed. Since Hyman’s evaluation is at the heart of the AIR case against the remote viewing research program, I will focus my attention there. In the interests of space—which I consume ever more of as this review progresses—I will only consider a few of the more egregious errors and misjudgments the good doctor makes.
THE BABY OUT WITH THE BATH
To begin with, Hyman and AIR ignored twenty years of research conducted prior to the SAIC experiments. Despite the AIR’s express assignment to thoroughly review “all laboratory experiments and meta-analytic reviews conducted as part of the research program,” ultimately only ten experiments were actually reviewed, all of them performed at SAIC in just the last three or four years of the government’s program. One likely reason for this, as Hyman says, was the “limited time frame [that was] allotted for this evaluation” [p. 3-43, 3-44]. The AIR reviewers were given only a month and a half, from mid-July to the end of August, to conduct a supposedly “exhaustive” review.
Ed May asserts in his own rebuttal to the AIR report (Journal of Scientific Exploration, vol. 10, no. 1, Spring 1996) that in recognition of this unrealistically short time allotment, someone at AIR requested May provide only the reports from his ten best experiments for evaluation. Quite properly he demurred, since for sound scientific reasons this would skew the results: in so doing, only successful results would be considered, when to form a fair picture required that poor results should be evaluated as well (selecting only experimental results that show positive effects is known as the “file drawer” effect). As an alternative, May proposed a different procedure that would have allowed examination of all the materials within the time constraints, resulting in a much more thorough and reliable assessment. His suggestion was ignored.
Instead, in a conference call between the AIR evaluators, Hyman got agreement that only the ten latest experiments would be evaluated. It was tacitly recognized that there were both relevant and irrelevant experiments among these ten, but it made for a more manageable evaluation pool, and it avoided the “file drawer” problem.
This is where it gets interesting. As earlier noted, Hyman explains that a limited number of experiments were selected because of lack of time to consider all of those available, and these ten were the most recent. But he also cavalierly dismisses the need to examine the other two decades’ worth of experiments by alleging that the handful of SAIC experiments selected were “the only ones for which we have adequate documentation” (p. 3-43). Earlier research was discounted as suffering “from methodological inadequacies” upon which he chooses not to elaborate further in his report. Hyman makes this amazing assertion despite the fact that he had never even looked at the documents of which he is being so dismissive. Sometime back in the mid-1980s, he reportedly saw some of the results from the first few years of SRI experiments when he participated in another flawed “scientific” evaluation of enhanced human performance programs [i.e., the National Research Council’s somewhat infamous “Enhancing Human Performance” report].
Still, there remained perhaps ten years’ worth of subsequent remote viewing research conducted at SRI and elsewhere to which Hyman had never previously had access. It, along with the ten SAIC experiments, had been classified Secret or higher until the CIA decided to make it all available in support of the AIR study.
Because of the CIA’s declassification action, Hyman finally WAS authorized access to the majority of the research, had he chosen to examine it. However, he himself admits he never bothered, since most of the experiments prior to the SAIC era were in the “three large cartons of documents” he was given at the outset of the study but which he freely admits in a recent article he “didn’t have time” to look into (Skeptical Inquirer, March/April 1996, p. 22). In short, he couldn’t possibly have known whether those experiments really did suffer from “methodological inadequacies.”
Still, Dr. Hyman couches his remarks in such a way as to make an unsuspecting reader suppose that the ten experiments reviewed were the best examples available. Though he clearly knew better, he nevertheless claims in the Skeptical Inquirer article that the ten experiments he and Dr. Utts evaluated were the “ten best studies,” and “the best [RV] laboratory studies” (p. 22), implying by assumption that they must therefore be sufficient on which to base an adequate assessment of remote viewing. This despite the fact previously explored in Part II of this review that a number of the SAIC experiments had little or nothing to do with remote viewing, and that the remainder were generally not fully state-of-the-art RV experiments.
Nonetheless, a mere two pages after telling us that he and his AIR fellows themselves arbitrarily decided that only ten experiments would be reviewed, he proceeds to deplore the entire two-and-a-half decades of research for producing “only ten adequate experiments for consideration.” Hyman writes:
Unfortunately, ten experiments. . .is far too few to establish reliable relationships in almost any area of inquiry. In the traditionally elusive quest for psi, ten experiments from one laboratory promise very little in the way of useful conclusions. (3-46)
He is, of course, absolutely right in the process of being altogether wrong.
PRIMA FACIE EVIDENCE
The arbitrarily limited data base is not the only difficulty with AIR’s study. Perhaps more problematic is Hyman’s arbitrary exclusion of so-called “prima facie” evidence (3-71). This is introduced in the section where Hyman (without, I might add, any qualifications whatsoever in the field of intelligence) considers whether RV has potential for use in operational intelligence settings. Though in this part of his discussion he is concerned with practical applications, he seems to have carried over this bias against prima facie evidence from his treatment of the research program itself.
Hyman says that he relies on a definition of prima facie evidence that originated with May and Utts. In her remarks (3-11), Utts describes prima facie RV evidence as a remote viewing result that is so spectacularly accurate that it virtually proves the existence of the phenomenon, though it is beyond the ability of statistics to describe. This meaning is derived from jurisprudence definitions of prima facie evidence as that evidence which clearly proves a fact, if there can be no other explanations for what has occurred.
Prima facie evidence of remote viewing would be unambiguous information produced by a viewer about a target that could not have been obtained in any other way (i.e., fraud, leaky methodology, etc.). This might be in the form of sketches or verbal responses or both. If the target were, for example, the Eiffel Tower, the sketches and/or verbal descriptions would strikingly match the Eiffel Tower.
There was apparently no specific “prima facie” proof in the ten SAIC experiments (though a couple of the RV sessions appear to have come close), so Hyman’s embargo of such evidence would seem not to matter much. However, despite his remarks to the contrary, he doesn’t seem to be working from the same definition of prima facie evidence to which Utts and May subscribe. Hyman doesn’t elaborate further as to what his personal understanding of the term is, but from the context it seems apparent that he means to exclude all evidence that cannot be statistically evaluated. If someone designated as judge must look at an RV result, compare it to a target, then come to a conclusion based on his/her own opinion as to whether or not it matches, that evidence is unacceptable because it is based on a subjective judgement.
One of the most time-honored evaluation methods in remote viewing research is to provide the judge with the same set of targets used to task the remote viewers, then allow the judge to “blind match” the remote viewer’s results against all the possible targets in that pool. Since the judge thus has no idea what the original target was except that it had been selected from the available target pool, the belief is that the better the RV session, the more likely is the judge to correctly match the viewer’s results to the actual target. How many times the judge successfully matches a session to its correct target is then quantified with statistics. It’s obvious that this is only one step removed from subjective judgement. But it allows the RV data to be turned into numbers, which can then be more easily manipulated.
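To make the “quantified with statistics” step concrete, here is a minimal sketch of the arithmetic involved. It is my own illustration, not the scoring method actually used at SRI or SAIC. It assumes, hypothetically, that a judge who is merely guessing picks the correct target from a pool of a given size with probability one over the pool size, and that each trial is independent of the others. The function name and the numbers in the example are invented for illustration only.

```python
from math import comb


def chance_probability(hits: int, trials: int, pool_size: int) -> float:
    """Probability of getting at least `hits` correct blind matches in
    `trials` independent trials if the judge is only guessing.

    Assumption (hypothetical, for illustration): a guessing judge picks
    the right target with probability 1 / pool_size on every trial.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    if not 0 <= hits <= trials:
        raise ValueError("hits must lie between 0 and trials")

    p = 1.0 / pool_size
    # Exact upper tail of the binomial distribution: P(X >= hits).
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


if __name__ == "__main__":
    # Invented example: 6 correct matches in 10 trials, 5 targets per pool.
    print(f"{chance_probability(6, 10, 5):.4f}")
```

The smaller the resulting figure, the harder it is to attribute the judge’s matches to lucky guessing. Note that the independence assumption is a simplification; a real protocol in which targets are drawn from a shared pool without replacement would call for a different calculation.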
This procedure works so long as there is a reasonably limited target pool. However, if the target pool is infinite—i.e., could be any site, person, object, or event in the entire world (as is the case in intelligence operations)—it is virtually impossible for a judge to be able to match an RV session transcript to a given target based only on internal information. If the viewer says the site is the Eiffel Tower, the judge must evaluate the session data, and if it matches the Eiffel Tower, he/she must go with that conclusion. Success or failure cannot be statistically determined in such a situation. Either the viewer accurately and unmistakably describes the site, or he/she doesn’t.
Let’s say in the case of the “Eiffel Tower” session that the site was actually a missile launch gantry at Vandenberg AFB. Let’s say further that the viewer’s data was all extremely accurate in describing the gantry, but that the girder latticework, the strong vertical orientation, and the metallic construction caused the viewer to subjectively interpret the site as the Eiffel Tower. In a blind-judging situation with an infinite target pool, this session would be judged as a miss.
Obviously, it was not a miss. The data was accurate, but the viewer’s subjective interpretation was wrong. It is clear that another option for judging the accuracy of such a session is necessary. The only alternative that I know of would allow the judge to concurrently compare the actual target information with the session data the remote viewer produced to see how close the RV data matches the actual site. Of course, the judge is no longer “blind,” so this becomes an exercise in subjective judgement, and would therefore be rejected out of hand according to Hyman’s criteria.
Certainly, there are potential problems with subjective evaluations of this nature. If the data is somewhat ambiguous—that is, the elements contained in the feedback potentially match several targets—then the human tendency might be for the judge to think he/she sees the target in the data, even though the data itself isn’t accurate enough for a truly objective match.
But with “prima facie” evidence, we are not talking about these ambiguous cases, but rather a target and transcript that match unambiguously. Any competent person would recognize that the target folder and the remote viewing data describe the same target. Ray Hyman would, unfortunately, exclude this as evidence.
As justification for this rejection Hyman cites a study done by David Marks and Richard Kammann in 1981 that purports to prove that a psychological phenomenon they call “subjective validation” was responsible for good results shown by early SRI remote viewing experiments. Essentially, Marks and Kammann maintain that a judge may see what s/he wants to see in evaluating any given remote viewing session, since viewers often describe a variety of elements that might be found in more than one target. However, this study centered around blind judging of targets from a limited target pool, some targets of which shared characteristics with other targets in the series.
This does not hold water in relation to the definition that Utts and May had in mind when referring to prima facie evidence. A true “prima facie” session is not ambiguous. There is NO DOUBT that the correct target has been addressed and described, and any reasonable person would be able to make that same judgement.
In effect, Hyman rejects the use of any sketches or other visual data that must be subjectively compared to the target to determine whether there is correspondence or not. If the viewer is targeted (in the blind, of course) against the Eiffel Tower, and during the course of the session draws unmistakably the Eiffel Tower, it is by Hyman’s standards still inadmissible as evidence of remote viewing. What Hyman and his colleagues seem to be saying is that even if it looks like a duck, walks like a duck, quacks like a duck, and floats like a duck, we must assume that it’s NOT a duck until we have something more convincing.
The irony is that if Hyman’s strictures were applied to conventional science, numerous branches of study that rely on subjective comparisons between one thing and another would dry up and blow away; among these, plant and animal taxonomy, paleontology, and comparative biology.
LOST IN THE NUMBERS, OR “STATISTICS AIN’T EVERYTHING!”
Early in his remarks Hyman alleges that “Parapsycholo[gy] is unique among the sciences in relying solely on significant departures from a chance baseline to establish the presence of its alleged phenomenon” (p. 3-51). In other words, parapsychology is the only science that has to prove itself by showing that something consistently happens more often than you would expect by accident.
Hyman is generally right in saying this about statistical proof as far as psychokinesis (PK) research is concerned—no one has yet demonstrated under scientific conditions the moving of lamps or pianos through the air using “mental” power alone. Indeed, most PK research involves micro effects that only manifest themselves as statistical deviations from the chance baseline to which Hyman refers. One of SAIC’s experiments—the computer-driven binary-choice experiment—falls into this “deviation from chance” category.
Hyman is wrong, however, in claiming that remote viewing (obviously a parapsychological effect) is provable only by a statistical deviation from chance. Valid remote viewing produces true “macro” effects in the form of word descriptions, drawings, sketches, etc., that provide information directly applicable to the real world. The statistics involved in evaluating RV research are really only an imperfect, after-the-fact attempt to measure how well remote viewing works in a given experiment. The statistical analysis also serves the goal of limiting the subjective judging mistakes to which humans are vulnerable in ambiguous situations.
But the statistical evaluations are not the proof. The proof is the information provided during the session that could not possibly have been obtained through any other known means of communication. Statistics can be extremely useful as an evaluative tool, but relying too much on them can also be dangerous. It is too easy to get lost in the numbers and lose sight of what they represent.
In theoretical terms, it only takes a single successful remote viewing session to prove once and for all the existence of the phenomenon. If a viewer in isolation provides accurate data about a target, and if ALL other means by which the information could have been obtained can be ruled out—to include both fraud and chance, no matter how unlikely—the only possible conclusion left must be something beyond our current understanding of the physical universe: in other words “paranormal.”
We do not, however, live in a perfect world. First, there is always a possibility that through some incredible hiccup of fate the viewer might by accident hit on the correct target. Second, in the real world theoretical perfection in experimental design is approachable but ultimately unreachable; we often cannot conclusively rule out every explanation besides psi for the effects of a given experiment, the first (or even second or third) time around. Therefore, science insists on replication of successful experiments before the phenomenon the experiments were meant to confirm may be accepted as being real.
Let us assume, now, that after much thought, trial, and error, a proposed set of remote viewing experiments have been “hermetically sealed” against external contamination, mistaken analysis, erroneous conclusions, etc. Let us further suppose that the experimental design is excellent, with a virtually unlimited target pool, and constructed such that clear distinctions between accurate and inaccurate data can be made when it comes time to judge results. Let us finally suppose that there is adequate oversight to guarantee against fraud.
Now, what if after one or two experimental sessions, an RV researcher produces an excellent match with the chosen target? This could of course be just wild, hole-in-one luck. Let’s say further that after two or three more sessions there is another unmistakable, if uncanny, match. Still chance? Yes, but considerably less likely. But what if the viewer continues to have these explicit matches every few sessions, and indeed has runs where maybe two or three sessions in a row match significantly (or even precisely) with the respective targets? At what point do we give up on chance and acknowledge that something is going on that can’t be explained in standard physical terms?
These results could not be evaluated statistically. At best one could say that 50% of the time the viewer was accurate, or 30% or 72%, or whatever. But these statistics would be completely meaningless. According to Hyman’s interpretation of the rules of empirical science, barring a very rare accident of probability the viewer should not be able to describe the target accurately even ONCE. If the viewer is successful in describing the target not just once but a number of times on an ongoing basis, the fact is that it doesn’t matter if he or she fails most of the rest of the time. In the paradigm of the physical universe under which Hyman and his AIR friends operate, the viewer should ALWAYS be wrong. This is not proof obtained by statistical “deviation from a chance baseline.” Those terms make no sense here. Yet this is indeed proof, though proof that is unacceptable to the skeptics.
Ironically, the requirement for statistical proof that Hyman deplores was imposed on RV research by the skeptics themselves when they rejected evidence that required subjective evaluation of any sort, no matter how obvious. Now, based on Jessica Utts’ thorough discussion in the AIR report, it seems clear that the statistical evidence Hyman and his fellows demanded has now been provided. Yet Hyman states that it is premature to accept these figures as proof. We must wait to see if anyone can come up with some way of showing that the data does not say what it obviously does say. In other words, now that we can no longer dispute that it looks, walks, and quacks like a duck, we must now carry out exhaustive genetic tests to prove its ducky heritage. When THOSE tests confirm that it is a duck, then we must wait through a few more generations of technical development in genetic testing to see if we can create a test that WILL prove that it is not a duck.
But this attitude is no surprise. Skeptical evaluation of psi research has often resembled an archery match where during the contest the judges keep moving the target of one competitor while leaving those of all other contestants in place. By refusing to acknowledge that there is now adequate proof that psi exists; by insisting that we cannot make any judgement about the existence of psi based on SAIC’s experiments (as well as the others mentioned by Utts); by declining to examine ALL the newly available experimental evidence; and by failing altogether to consider the historical track record of the intelligence operations portion of STAR GATE’s predecessors, Hyman and his cohorts have effectively “moved the target” once more. In so doing, Hyman has not preserved the purity of science. He has only demonstrated his apparent intention never to accept ANY proof, no matter how compelling, for the effectiveness of remote viewing or the existence of psi.
Since at the conclusion of all three parts of this review the discussion is now quite long and convoluted, I shall summarize the general points below:
AIR narrowed the scope of its evaluation to focus on only a few years and a few experiments out of more than two decades of RV research and many experiments. As a result, the AIR assessment is useless as a comprehensive and meaningful evaluation of remote viewing and its practical applications.
The SAIC experiments that AIR reviewed were not themselves a fair test of the remote viewing phenomenon. Yet despite their shortcomings, the experiments still demonstrated a persistent positive result that it seems can only be ascribed to a paranormal cause.
Though Hyman admits the data shows an effect, he wants to keep the door open indefinitely—never admitting that psi may be involved—in hopes that eventually an alternative explanation to psi can be discovered to account for these effects (by inference, he seems to imply fraud).
Ultimately, though Utts makes a far stronger case for the existence of some sort of psi phenomenon being evidenced by SAIC results, AIR throws the debate to Hyman, without satisfactorily explaining why his case was deemed more compelling. Based on his flawed evaluation Hyman decides that he has sufficient data and personal expertise to extend his evaluation into the operational arena—and concludes that remote viewing is of no use in intelligence collection.
Of course, the purported motivation for the AIR evaluation that resulted in the flawed report for the CIA was to determine whether remote viewing was useful as an intelligence collection tool. By the manner in which the study was conducted and in the way the negative conclusions were reached in the report, it should be clear by now that the evaluation not only failed to honestly determine whether remote viewing was of any intelligence use; it also showed conclusively that there was an unacknowledged, predetermined agenda to produce negative findings as the conclusion to the report.
Presumably, the AIR itself had no particular prior bias against remote viewing. This leaves the contracting agency as the culprit. It would seem that the Central Intelligence Agency gave the AIR its marching orders: To find no merit in the program no matter what the evidence itself showed. In Part One I suggested reasons for this, but at this point that all still remains only speculative. Nonetheless, there does appear to be a smoking gun here; and, as has so often been the case recently, it seems to be lying at the feet of the CIA.