"Don't Ask, Don't Tell"
Two fatal statistical defects in the DoD surveys
16 Dec 2010 in Regulatory Policy, Information Quality
The Report of the Comprehensive Review of the Issues Associated with a Repeal of 'Don't Ask, Don't Tell' was released by the Department of Defense on November 30, 2010. The Report summarizes and interprets a pair of large-scale surveys of Service members and spouses that were sponsored by DoD and conducted by WESTAT, a major consulting firm. Immediately thereafter, Congress took up the question whether to repeal the law on which the Department's "Don't Ask, Don't Tell" policy is based.
This post is about the Report's statistical back story, and why the conclusions given in the Report are unreliable as a guide for policy-making -- regardless of whether one prefers to repeal the law, retain the law, or replace it with a more stringent one.
"DON'T ASK, DON'T TELL" IMPLEMENTS FEDERAL LAW
It is widely but incorrectly reported that DoD established the policy that prohibits gays and lesbians from openly serving in the military. In fact, DoD's "DADT" guidelines implement a federal law enacted in 1993, which states in part:
(1) That the member has engaged in, attempted to engage in, or solicited another to engage in a homosexual act or acts unless there are further findings, made and approved in accordance with procedures set forth in such regulations, that the member has demonstrated that--
(B) such conduct, under all the circumstances, is unlikely to recur;
(C) such conduct was not accomplished by use of force, coercion, or intimidation;
(D) under the particular circumstances of the case, the member's continued presence in the armed forces is consistent with the interests of the armed forces in proper discipline, good order, and morale; and
(E) the member does not have a propensity or intent to engage in homosexual acts.
(2) That the member has stated that he or she is a homosexual or bisexual, or words to that effect, unless there is a further finding, made and approved in accordance with procedures set forth in the regulations, that the member has demonstrated that he or she is not a person who engages in, attempts to engage in, has a propensity to engage in, or intends to engage in homosexual acts.
(3) That the member has married or attempted to marry a person known to be of the same biological sex.
10 USC § 654(b), Pub. L. 103-160, div. A, title V, Sec. 571(a)(1), Nov. 30, 1993, 107 Stat. 1670.
Arguably, Kagan was trying to thread a needle between the clear requirements of federal law, which mandates the discrimination she said she abhorred, and an important Law School interest group, which strongly opposed that law. Nonetheless, Kagan clearly implied that DoD had the authority to rescind its DADT policy implementing 10 USC § 654. When the text of the law is compared with DoD's policy, it is difficult to imagine a less stringent statutory interpretation.
In any case, supporters of repeal were unable to muster the votes for cloture in the Senate, and repeal seems less likely by the 112th Congress, which convenes on January 5, 2011.
THE REPORT
On March 2, 2010, Defense Secretary Robert Gates appointed a working group of "49 military and 19 civilian personnel from across the Department of Defense and the Military Services" led by two co-chairmen to "undertake a comprehensive review of the impacts of repeal." Specifically, the team was directed to:
- "assess the impact of repeal of Don't Ask, Don't Tell on military readiness, military effectiveness, unit cohesion, recruiting, retention, and family readiness; and
- "recommend appropriate changes, if necessary, to existing regulations, policies, and guidance in the event of repeal."
Both inferences are unremarkable; agency heads often order up studies and reports only after they have decided what they want to do, and the studies and reports are designed and produced for the purpose of eliciting public support and Congressional endorsement. Congress knows this, of course, and if it wanted genuinely independent information and advice it would, at a minimum, direct that it be produced by a source other than the agency. Moreover, these inferences are consistent with Secretary Gates' decision on March 25, 2010 -- three weeks after creating the working group charged with studying the issue -- to significantly weaken the Department's enforcement of DADT.
THE SURVEYS
A key element of the Report is a pair of surveys including one of "nearly 400,000 active duty and reserve component Service members with an extensive and professionally-developed survey." This Service member Survey (the "Main Survey") became the focal point of almost all press accounts, with much less attention devoted to the "Spouse Survey". However, the Surveys' methodology -- and the inferential constraints imposed by this methodology -- have attracted hardly any attention at all.
When the Surveys' methodology is examined, it is clear that they are unfit for the purpose of informing Congress with respect to whether gays and lesbians ought, or ought not, be allowed to serve openly in the military. First, the Surveys cannot predict the effects of repeal on military readiness, military effectiveness, unit cohesion, recruiting, retention, and family readiness. Surveys can be useful for capturing opinions and values, but it is inappropriate to ask respondents, as these Surveys do, to speculate about hypothetical future events over which they have no perceptible influence, much less control.
Second, even if a representative sample of Service members and their spouses could accurately predict these future events, these Surveys cannot provide such predictions. The response rates are far too low to obtain valid and reliable results.
These defects are surely known to Secretary Gates (who holds a doctorate from Georgetown University) and the statisticians who shepherded the Surveys from development through implementation. And this includes the statisticians employed by WESTAT, the contractor DoD hired to conduct the Surveys.
These statistical defects are best explained in reverse order. First, however, it is useful to explain the procedural reason why these Surveys are defective.
OMB DID NOT CONDUCT A CREDIBLE REVIEW OF EITHER SURVEY
Surveys conducted or sponsored by a federal agency must be approved in advance by the Office of Management and Budget, pursuant to its authorities set forth in the Paperwork Reduction Act (44 USC 3501 et seq.; 5 CFR 1320). It is unlawful for any agency to administer a survey without prior OMB clearance. One of the purposes of requiring OMB review and clearance is to ensure that surveys are designed such that their results have utility for their intended purpose -- in this case, informing Congressional debate about whether to retain, modify, or repeal 10 USC 654.
There is an exception to these requirements that is rarely important, but in this case it turned out to have been critical. The Paperwork Reduction Act exempts surveys directed at federal employees from the definition of "collection of information". Because Service members are federal employees, the Main Survey therefore was exempt from OMB review and required no OMB approval.
DoD also initiated a survey of Service members' spouses, and this survey was covered by the Paperwork Reduction Act because spouses are not, as a class, federal employees. The Spouse Survey thus was required to undergo OMB review.
DoD submitted the draft spouse survey on July 23, 2010. DoD sought, and obtained from OMB, "emergency" processing of this submission. OMB's Information Collection Rule (5 CFR 1320.13) provides for emergency processing based on the written determination by an agency's designated Senior Official that the information collection is "essential to the mission of the agency" and at least one of the following three conditions applied:
(ii) An unanticipated event has occurred; or
(iii) The use of normal clearance procedures is reasonably likely to prevent or disrupt the collection of information or is reasonably likely to cause a statutory or court ordered deadline to be missed.
Moreover, emergency processing does not exempt an information collection from the substantive standards of the Paperwork Reduction Act. It is clear, however, that OMB did in fact waive the usual substantive requirements that apply to government-sponsored surveys. OMB approved the Spouse Survey on August 6, 2010 -- 14 days after it was submitted. A credible review of almost any survey, much less a complicated survey containing highly sensitive questions, cannot be performed in 14 days. The speed of OMB's review alone is sufficient evidence to conclude that OMB did not perform a credible review.
THE SURVEY RESPONSE RATE IS TOO LOW TO INFORM DECISION MAKING
OMB's Standards and Guidelines for Statistical Surveys apply to federal surveys whether they are performed by the agency (such as DoD) or a government contractor (such as WESTAT):
Standard 1.3 states the government's responsibility to ensure that surveys yield data that "can be used with confidence to inform decisions" (p. 8, emphasis added):
A survey that cannot be used with confidence to inform decisions lacks practical utility. If a survey lacks practical utility, OMB cannot approve it. Had OMB applied the law and its own rules and guidelines to the Spouse Survey, it would have had to disapprove it.
It is well known, and expected by survey researchers, that those who respond to a survey tend to be different from those who do not. The lower the response rate, the greater the likelihood that nonrespondents will differ from respondents.
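The link between response rate and potential bias can be made concrete with a standard identity: the bias of the respondent average equals the nonrespondent share multiplied by the gap between respondents and nonrespondents. The sketch below is illustrative only. The function name and the 20-point gap are assumptions chosen for the example, not figures from the Report or from WESTAT.

```python
def nonresponse_bias(response_rate, mean_respondents, mean_nonrespondents):
    """Bias of the respondent mean relative to the full-sample mean.

    full mean = r * mean_resp + (1 - r) * mean_nonresp, so
    bias = mean_resp - full mean = (1 - r) * (mean_resp - mean_nonresp).
    """
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    return (1.0 - response_rate) * (mean_respondents - mean_nonrespondents)


# Hypothetical gap: 60% of respondents but only 40% of nonrespondents hold a view.
for rate in (0.98, 0.50, 0.30):
    bias = nonresponse_bias(rate, 0.60, 0.40)
    print(f"response rate {rate:.0%}: bias = {bias:+.1%}")
```

With the same assumed gap, the bias is under half a percentage point at a 98% response rate but 14 points at a 30% response rate. The gap itself is unknown without a nonresponse bias analysis, which is why the response rate matters so much.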
DoD's supporting statement (Word file) asserts, without documentation, that the Spouse Survey would achieve a 35% response rate (pp. 1, 8). The supporting statement merely says "statistical weighting will be used to decrease non-response bias and a nonresponse bias study will be conducted" (p. 11). There are two fatal problems here. First, nonresponse bias can be estimated, and survey results can be adjusted to take it into account, but it will not go away with "statistical weighting". Second, the description of this purported nonresponse bias "study" (pp. 12-13) contains no information suggesting that a genuine study of nonresponse bias would actually be performed.
With the exception of the Coast Guard, a majority of Service members in each service declined to respond. Seven of 10 spouses declined to respond. Therefore, a nonresponse bias analysis was an essential prerequisite to determine whether either Survey had practical utility to inform decision making.
There is no public evidence that a nonresponse bias analysis was conducted.
For further information on response rates, the Report directs readers to Appendix A to Volume 1 (25 MB download) and Volume 3. Appendix A to Volume 1 does not contain a nonresponse bias analysis, and Volume 3 has not been disclosed. The Report itself does not even discuss nonresponse bias.
Whether this is true cannot be verified. DoD's web page for the Report does not disclose Volume 3, and a search of the DoD website yields no hits. The absence of any discussion of nonresponse bias in the Report, combined with the nondisclosure of Volume 3, where the nonresponse bias analysis is supposed to have been published, suggests that the nonresponse bias analysis yielded information that does not support the conclusions of the Report.
No. Making unbiased inferences from a survey requires the respondents to be representative of the population of interest. A larger sample frame will lower the margin of error of a representative sample, but it will not make a biased sample representative.
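The difference between sampling error and bias can be sketched numerically. The snippet below uses the textbook 95% margin of error for a proportion under simple random sampling, which is an assumption made for illustration and not the Surveys' actual design. The 10-point bias is likewise hypothetical.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sampling)."""
    return z * math.sqrt(p * (1.0 - p) / n)


HYPOTHETICAL_BIAS = 0.10  # assumed fixed gap between respondents and the population

for n in (1_000, 10_000, 100_000):
    moe = margin_of_error(0.5, n)
    print(f"n={n:>7,}: margin of error = +/-{moe:.2%}, bias = {HYPOTHETICAL_BIAS:.0%}")
```

The margin of error shrinks as the number of respondents grows, but the assumed bias does not move. Adding respondents only makes the estimate more precisely wrong.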
Nonetheless, the DoD Report touts the large number of respondents (115,052 Service members) as if this gives the survey greater validity. The Report describes the Main Survey as "one of the largest surveys in the history of the U.S. military" (p. 1). The response rate is described as "average for the U.S. military" (p. 3), which the Report elsewhere says is "29–32% for Active Duty Service members, and 25–29% for Reservists" (p. 37). If these response rates are accurate, it does not mean the Main Survey meets a high quality standard. It means that DoD surveys as a whole do not.
DoD's emphasis on getting a large number of responses, presumably to create the appearance of greater validity, came from Secretary Gates himself. The Report says the sample was doubled, from 200,000 to 400,000, "[a]t the direction of the Secretary of Defense in May [2010]" (p. 36). A survey with about 115,000 respondents certainly appears more impressive than a survey with about 67,000 responses. However, doubling the number of responses does not improve representativeness. Had Secretary Gates been interested in improving the quality of the survey, he would have directed the work group and WESTAT to find a way to double (or treble) the response rate.
Without a rigorous nonresponse bias analysis, the margin of error cannot be calculated. We can be certain, however, that it is much greater than ±1%. If respondents materially differ from nonrespondents, then the error bounds also will not be the same on both sides.
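One way to see why the error bounds are wide and lopsided is to compute worst-case bounds that assume nothing about nonrespondents: either all of them would have answered "no" or all of them would have answered "yes". The sketch below assumes a yes/no question. The 29% response rate approximates the rates quoted from the Report, and the 30% "yes" share among respondents is purely hypothetical.

```python
def worst_case_bounds(response_rate, share_yes_among_respondents):
    """Bounds on the full-sample 'yes' share with no assumptions about nonrespondents."""
    if not 0.0 <= response_rate <= 1.0:
        raise ValueError("response_rate must be in [0, 1]")
    if not 0.0 <= share_yes_among_respondents <= 1.0:
        raise ValueError("share_yes_among_respondents must be in [0, 1]")
    known_yes = response_rate * share_yes_among_respondents
    lower = known_yes                          # every nonrespondent answers "no"
    upper = known_yes + (1.0 - response_rate)  # every nonrespondent answers "yes"
    return lower, upper


observed = 0.30  # hypothetical share of respondents answering "yes"
lower, upper = worst_case_bounds(0.29, observed)
print(f"observed {observed:.1%}, worst-case range {lower:.1%} to {upper:.1%}")
print(f"room below: {observed - lower:.1%}, room above: {upper - observed:.1%}")
```

These bounds are deliberately extreme, and a credible nonresponse bias analysis would narrow them. Absent such an analysis, they show how little a 29% response rate pins down and why the uncertainty need not be symmetric around the reported figure.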
Yes. The Report tries to compare the Main Survey favorably with surveys performed by others, focusing improperly on the number of respondents rather than the representativeness of the respective samples:
These comparisons are specious, particularly the comparison to the U.S. Census Bureau's 2008 American Community Survey, the one with 2,240 respondents representing 227 million people. The Census Bureau survey has a response rate of 98%. A small but representative sample yields higher quality information than a large and unrepresentative one.
The Main Survey, but not the Spouse Survey, was exempt from OMB review. Neither Survey is exempt, however, from the Information Quality Act. OMB's government-wide implementing guidelines are triggered when an agency disseminates information, and the standards are most stringent when the information meets OMB's test of being "influential," which these surveys surely do. Both Surveys are subject to OMB's information quality standards and to DoD's own. Thus, DoD has a presumptive obligation to adhere to OMB's Standards and Guidelines for Statistical Surveys.
On response rate alone, both Surveys are easily shown to violate applicable information quality guidelines for objectivity and utility. Low response rates render the results presumptively biased, and biased results cannot have utility for educating the public or informing Congress.
DoD's SURVEYS WERE NOT DESIGNED TO ELICIT THE KIND OF INFORMATION NECESSARY TO INFORM CONGRESSIONAL POLICY MAKING
Now we turn to the question whether these Surveys ask appropriate questions. For the reasons set forth below, it is reasonable to conclude that they do not. This conclusion is not based on the sensitivity of the issues or the potential for respondent strategic behavior, both of which are likely to be serious problems. Sensitivity often results in nonresponse, and strategic behavior changes the character of responses in material ways. Neither the Report nor WESTAT's supporting documents address these problems.
The first fundamental error is that the Surveys deny the underlying values conflict that is a core element of the controversy. Indeed, the extent to which the inclusion of openly gay and lesbian Service members would create or exacerbate values conflicts that undermine military objectives is the key question the Survey was supposed to help determine. But the Surveys do not inform this debate because they do not even attempt to measure values or discern policy preferences.
It would have been a simple matter to ask respondents to disclose which they prefer from a menu of policy options. Alternatively, respondents could have given ordinal or metric rankings of various policy options. The Surveys did none of these things, and this appears to have been intended:
Sen. John McCain complained about this in his questioning of Secretary Gates and Joint Chiefs chairman Admiral Mullen during the December 2, 2010, oversight hearing: DoD made a policy decision not to ask Service members for advice.
DoD's intentional refusal to ask a direct question means it expected that a direct question would have yielded limited support for repeal. Had such a "referendum" gone badly there would have been no practical way to proceed toward repeal. More revealing is the suggestion that respondents had multiple opportunities within the questionnaire to express support or opposition to current law and policy. This can only be true if DoD expected (encouraged?) respondents to reply strategically -- that is, give untruthful answers in hopes of affecting Secretary Gates and Congress. Strategic behavior renders survey responses inherently unreliable, so it is peculiar to note a survey sponsor touting how respondents could have gamed the survey.
The second fundamental error is that the Surveys ask respondents to speculate about facts -- most notably, the effect repeal would have on military readiness, military effectiveness, unit cohesion, recruiting, retention, and family readiness. These are future facts, not mere opinions. When respondents are asked to speculate about facts they do not (and in this case cannot) know, they systematically overstate their confidence even when they respond honestly. And, as noted above, there is no way to discern whether respondents' speculations were honest or strategic.
Some questions that really are about facts were converted into questions about beliefs, but this does not necessarily improve reliability. For example, when asked "In your career, have you ever worked in a unit with a leader you believed to be homosexual?", 38.5% responded "Yes". This percentage exceeds by perhaps tenfold the percentage of Service members who are gay or lesbian that has been promoted by advocates of repeal. In the Report, DoD does not endeavor to validate such figures or even question their validity. (A recent report by the Congressional Research Service notes that "a 'gay rights policy center' (the Williams Institute at the UCLA School of Law) suggested that homosexuals and bisexuals constitute 66,000 people in uniform" (p. 9), a figure they could not validate. If it were true, the proportion of gays and lesbians in the military would be about 2.2% of the 3 million Active Duty and Reserve Service members and the proportion discharged under DADT would be just 0.6% of the total. It is inexplicable how 38.5% of Service members could have worked in a unit with a homosexual or bisexual leader.)
DoD does not take all Survey responses at face value, however. In his Senate testimony, Secretary Gates said that the Survey overstates the proportion of Service members who would leave if DADT were repealed. The reason is that they would not be allowed to leave:
CONCLUSION
The Defense Department's Surveys do not satisfy minimum information quality standards necessary to inform Congressional policy making. Both Surveys, and especially the Main Survey of Service members, avoid eliciting information about respondents' values and opinions relevant to the policy question -- information that well-designed surveys are capable of obtaining with a high degree of accuracy. Instead, both Surveys ask respondents to speculate about facts, most notably, the effects of repeal on military readiness, military effectiveness, unit cohesion, recruiting, retention, and family readiness. This does not mean that every question posed by the survey is flawed. Unfortunately, it is the most important questions in the Surveys that are the most problematic, and the most important questions that could have been asked were left out.
Yet the Surveys could not be relied upon even if the sample were, on average, omniscient. This is because the response rates obtained by WESTAT are seriously deficient. When seven or eight out of 10 persons in the sample decline to participate, nonresponse bias is likely to be overwhelming. Yet the Report treats nonresponse bias as if it does not exist.
Most disturbingly, the Report misleads readers (including Congress) to believe that the Surveys generated valid and reliable information because over 100,000 responses were obtained. The authors of the Report, either being knowledgeable themselves about survey research or having been duly informed by WESTAT, knew these statements were false but made them anyway.
NOTES
Obtaining Information from Federal Employees is Exempt from the Paperwork Reduction Act
DoD's Supporting Statement Makes No Evidentiary Claim Justifying Emergency Processing by OMB
This collection is being processed under emergency clearance procedures. Public comments have been solicited in the Federal Register under a shortened timeframe. A notice published in the Federal Register on June 9, 2010 (75 FR 32749) and June 18, 2010 (75 FR 34706). Two sets of comments were received that are no relevant to the actual instrument(s) being reviewed and go to the policy under consideration. The policy is currently being studied by the CRWG and issues relevant to our policy decision will be considered as part of the CRWG's final report and recommendations. This, however, is not the appropriate time or place to address these comments and issues. See Word file, p. 4.
"Practical Utility"
Excerpts from the Transcript of the December 2, 2010, Hearing on the Report (pp. 54-55)
Certainly an issue of this magnitude deserves that leaders take into consideration the views of their subordinates. It doesn't mean that they are dictated by the views of their subordinates. But I never made a major decision in the military without going around and talking to the enlisted people, the ones that would be tasked to carry out whatever the mission it is.
So I'm almost incredulous to see that on an issue of this magnitude we wouldn't at least solicit the views of the military about whether it should be changed or not. Now, those views may be rejected. Those opinions for the sake of the security of the country may be discounted. But to somehow say, well, we're not going to have a referendum—it's not a referendum. That's not what leadership is. Leadership is soliciting the views of your subordinates and thereby you're able to carry out your mission, because you have to rely on them to do so.
So to say, well, we didn’t need to ask their opinion on whether it should be repealed or not, violates in my view one of the fundamental principles of leadership.
Admiral MULLEN. Sir, I’ve grown up on the deckplates my whole life, and certainly one of the things that I pay attention to, have paid attention to in every leadership position I’ve been in, are my people, what motivates them, how they think, what they think. Clearly, they are the reason any of us is able to accomplish any mission, small or big. That's a fundamental principle with me.
I think the report has spoken to in great part their views of whether this can be successfully done or not and from my perspective, very much by implication, where they are on this.
Senator MCCAIN. Then why wouldn't we just ask the question?
Admiral MULLEN. Because I fundamentally, sir, think it's an incredibly bad precedent to ask them about—to essentially vote on a policy.
Senator MCCAIN. It's not voting, sir. It's asking their views. It's asking their views and whether they would agree or disagree with the change, the same way you would ask whenever any policy or any course of action were contemplated. You would ask the views of others. You wouldn't necessarily accept them.
But for you to sit there and say, well, we wouldn't want to ask them their views, that to me is—it makes this whole exercise here that took so much time and effort and money a bit of an unrealistic situation here.
Admiral MULLEN. Sir, I just—I guess I disagree with you——
Senator MCCAIN. You disagree with asking them whether——
Admiral MULLEN. No, sir. I just disagree with the approach, that we would go out and ask them for their views on this specifically, although I think we've gotten them.
Senator MCCAIN. And I understand——
Admiral MULLEN. We've gotten them——
Senator MCCAIN. —your answer is we would not ask them their views on whether this policy should be changed or not as the first question.
Admiral MULLEN. We've gotten in great part their views as a result of this survey.
Senator MCCAIN. Well, obviously we’ll go around and around. But why we didn't just simply ask them how they felt about it, just as you would about any other course of action. I go around—well, again, every great leader I've known has said, what are your views on this issue.
Finally, I guess it would be important to include for the record this survey: Those who served in combat with a servicemember believed to be homosexual, effect on unit's combat performance. Army, combat—mostly negative. Army, combat arms, 58 percent. Marines, combat arms, 57 percent.


