Checking the COMPAS: An Open Records Analysis

In 2004 Tim Brennan, David Wells, and Jack Alexander authored a report for the National Institute of Corrections titled Enhancing Prison Classification Systems: The Emerging Role of Management Information Systems. The report was commissioned through a contract with Northpointe Institute for Public Management of Traverse City, Michigan, founded by Wells and Brennan in 1989. 

The report’s goal was to explore how then-recent advances in networked computing technology could improve the efficiency of classifying those in prison by risk level. “Basically,” they write in their executive summary, “current methods of prison classification are underutilizing this information technology infrastructure. The vast memory and analytical power of today’s hardware and software offer great potential for improving classification decisionmaking” (page xix).

Brennan et al. describe the work of classifying prisoners as “knowledge work”, as it involves prison staff compiling data from various sources and analyzing it using “implicit mental models and explicit algorithms”. Networked computers could improve the productivity of those classifying prisoners by automating portions of the data collection process. They could also allow for more rapid classification of prisoners by prison staff, identifying potential trends or factors that might predict a person’s likelihood to commit new criminal offenses more quickly and accurately than human evaluators, or so they claimed.

These technologies have advanced much further in the intervening years, and Northpointe has offered its services under contract to multiple states, including the Wisconsin Department of Corrections (WIDOC). The company entered into a contract to provide its Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) tool to the WIDOC in 2015 and has remained under contract until at least FY2022. I requested records related to this contract and, after a little under a year, I received them. I will provide these for download in full at the end of this post, but because the records were quite extensive and this is an often overlooked aspect of the US prison system (imo), I thought it would be useful to provide a preliminary summary and analysis.

This issue is also of interest to me in part because it resembles at least in broad strokes the kind of work that I do for libraries, namely sorting, classifying, and describing all manner of published material from zines published in the 1980s to monographic series written by subject matter experts for a specialized, technical audience. This is done according to a dizzying array of description standards and metadata schema (each with their own acronym, of course), all of which must be processed and manipulated by the vendor-provided software where much of this work takes place. Because of this, I appreciate the vast improvements in efficiency that networked computers represent when it comes to classifying, storing, and retrieving information. However, as my e-mail inbox can attest, these technologies are not without their problems.

In my work the potential downside of misclassifying or misdescribing something is often minor. Perhaps a student does not find all the relevant books held by the library on a subject for their class assignment or a researcher is unable to find a specific work they are looking for even though it is held by our institution. There are more serious issues when it comes to the description of marginalized groups in libraries, such as the classification of literature from various parts of Africa alongside the literature of those countries that colonized them (See Classifying African Literary Authors). There are other examples related to other groups, but these hardly invalidate the inherent convenience of computerized library catalogs compared to their printed predecessors. In the case of prisoner classification systems, the risks are much greater: people may be kept in prison longer or required to take intensive psychological treatment programs that aren’t appropriate. 

Northpointe acknowledges these risks themselves in their original bid for the contract. One of the risks of classifying prisoners, according to Northpointe, is the risk of “prisoner ‘deterioration’ and ‘prisonization’”. Though they do not define these terms specifically, the text which follows gives a telling suggestion: 

These risks have serious consequences for both the institution (idleness, discipline problems) and the community (high recidivism, more alienated offenders). These risks are more likely if prisoners are simply “warehoused” and if the prison fails to match the inmates to needed programs to prepare for reentry to the community.

(Attachment B, page 133)

There is also the risk of litigation stemming from “custody classification errors” or the public relations problems stemming from “high profile crimes” being linked to early release or parole decisions. Conversely, “if agency policies and procedures adopt overly restrictive classifications styles,” they write, “more systematic ‘over-classification’ errors occur … escalating overcrowding”. Combine enough of these errors and it can lead to a “loss of public trust in agencies ability to distinguish high risk from low risk offenders” as well as a “failure to rehabilitate”, which produces “on-going cost escalations” and, again, the potential for costly litigation (ibid., page 133). 

So how are these decisions made? Northpointe is rather unambiguous in its assessment of its competition: most systems in prisons and jails “were not developed statistically and have minimal or unknown levels of predictive accuracy”. Their COMPAS system, by contrast, was developed to improve on these efforts so that more information could be gleaned about a particular prisoner as soon as they enter prison custody. Because they market the service both to courts, as a way to assess risk after arrest but before trial, and to prison officials assessing whether someone can be released prior to the completion of their sentence (e.g. on parole), accurate predictions made using existing or minimal additional data are of the utmost importance. “The aim,” they write, “was to use data that was currently available at the earliest stage for new incoming prisoners. These data include the offender’s criminal history and selected demographic and other criminogenic factors” (ibid., page 135).

Through an analysis of “a large [Michigan Department of Corrections] database”, Northpointe claims to have found a number of these “criminogenic” factors that were statistically correlated to “the commission of new infractions”:

  • Age at first arrest
  • Age at assessment
  • Number of probation revocations
  • High school graduation
  • Criminal thinking 
  • Educational vocational resources
  • Number of mental health commitments (ibid., page 135)

Unfortunately they do not provide a definition of “criminal thinking” in this document, though we will return to the subject later. “Age at first arrest” is also problematic, as being arrested is not the same as being found guilty and yet there seems to be no distinction made here. Furthermore, whether someone is arrested is often the decision of police officers rather than any kind of automatic process.

Nevertheless, “using both training and validation samples,” they write, “and two separate statistical methods (Logistic regression and Random Forests), the above six [sic] factors formed an ‘optimal sub-set’ of factors for predicting new infractions.” That they misstate the number of factors when describing their own model in their own document makes me hesitant to take their numbers at face value. In any event, according to Northpointe’s submission their analysis found that both the logistic regression and random forest models had an accuracy of around 70%, meaning each correctly predicted, in roughly 70% of cases, whether additional disciplinary infractions occurred based on these factors (ibid., page 136).
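An accuracy figure like 70% is hard to judge without knowing how common new infractions are in the sample being scored. The minimal sketch below uses invented numbers, not figures from the bid, to show how a "model" that ignores every input can reach the same headline accuracy when most people in a sample have no new infraction. The function and variable names are hypothetical.

```python
def accuracy(predictions, outcomes):
    """Share of cases where the prediction matches the observed outcome."""
    matches = sum(p == o for p, o in zip(predictions, outcomes))
    return matches / len(outcomes)


# Invented outcomes for 10 people: 1 = new infraction, 0 = none.
# These values are assumptions for illustration only.
outcomes = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

# A "model" that looks at no factors and always predicts no infraction.
always_no = [0] * len(outcomes)

baseline_accuracy = accuracy(always_no, outcomes)
print(f"Always predicting 'no infraction': {baseline_accuracy:.0%} accurate")
```

In this invented sample the do-nothing baseline is right 7 times out of 10, which is why an accuracy number is usually read alongside the base rate and the kinds of errors made, not on its own.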

In studying to be a librarian I took a class on data mining. One of the most important lessons our instructor hammered into us throughout the course is that creating the dataset will often comprise over 90% of the work of any given machine learning project. Computers asked to develop a model to predict the likelihood of an outcome will always give you an answer. They will never tell you that there is not enough data or that you are analyzing the wrong kind of problem with a particular method. A computer cannot tell the difference between data representing weather patterns and crop yields or that conveying the results of a psychological questionnaire. It is up to the people both creating the data and evaluating the results to determine how reliable a given prediction is for a particular context.
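The point that a computer will always return an answer can be sketched with a toy example. The code below fits a simple cut-off rule to data where the single "feature" is random noise with no relationship to the label. The fitting step still returns a rule and an accuracy figure without complaint. The data, the `fit_threshold` helper, and the cut-off approach are all assumptions made for illustration; they are not the methods described in Northpointe's bid.

```python
import random

random.seed(0)

# Invented data: the feature is pure noise, unrelated to the label.
features = [random.random() for _ in range(200)]
labels = [random.randint(0, 1) for _ in range(200)]


def fit_threshold(xs, ys):
    """Pick the cut-off on x that best matches the labels in this sample."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(xs):
        # Predict 1 when x is at or above the cut-off, 0 otherwise.
        acc = sum((x >= cut) == bool(y) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut, best_acc


cut, train_acc = fit_threshold(features, labels)
print(f"Chosen cut-off: {cut:.3f}, accuracy on the same sample: {train_acc:.1%}")
```

Nothing in the output signals that the rule is meaningless. Only someone who knows where the data came from, and who checks the rule against data it has not seen, can tell.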

One of the projects we did in that class involved analyzing genetic sequencing data in an effort to identify potential connections between specific genes and specific attributes in a given animal. For this assignment we were paired with another person in the class and asked to prepare and then analyze a genetic dataset and provide a short impromptu presentation of our findings for the class, mainly to demonstrate we’d chosen an appropriate analysis type and prepared the dataset correctly. I happened to be paired with a woman pursuing her PhD in Animal Sciences. When I said that I found these datasets a little confusing because I lacked a background in genetics or biology, she informed me that when researchers in her field investigated some of the predictions made by models like the one we were practicing with, they often found that the predicted connections were faint or nonexistent.

As interesting as this class was, I am humble enough to recognize the limitations of my knowledge in this area. Luckily, because Northpointe has successfully implemented versions of its COMPAS software in other states, their work has attracted the attention of experts in this area. In 2007 Skeem and Louden of the University of California, Davis conducted an independent assessment of the COMPAS tool then in place and found it significantly lacking.

“The strengths of the COMPAS”, they write in their conclusion, “are that it appears relatively easy for professionals to apply, looks like it assesses criminogenic needs, possesses mostly homogenous scales, and generates reports that describe how high an offender’s score is on those scales relative to other offenders in that jurisdiction. In short, we can reliably assess something that looks like criminogenic needs and recidivism risk with the COMPAS. The problem is that there is little evidence that this is what the COMPAS actually assesses” [emphasis in original] (Skeem & Louden, page 29). 

In addition to critiquing specific factors added to COMPAS, Skeem and Louden also cast doubt on whether COMPAS does actually predict recidivism. “In our view, the reader must wonder why the COMPAS produces no single “risk” score that can be evaluated by independent investigators. Instead, the authors create various ‘Risk Scales’ that change from evaluation to evaluation, and often combine parts of the COMPAS with other variables” (ibid., page 29). In addition to this lack of data, the authors also highlight the fact that there is no evidence COMPAS actually adjusts to changes in criminogenic factors over time. This is particularly important for use in prisons if, for example, completion of treatment programs is considered a factor for someone being granted parole. Because of these issues, Skeem and Louden state that they cannot recommend the COMPAS for application to individual offenders within the California Department of Corrections (ibid., page 6).

From what I can tell, that is precisely how it is being used by WIDOC. 

In 2009 Northpointe staff published a response paper addressing these critiques and defending their product and its application. It is no longer available on their website, but thanks to the Wayback Machine I was able to download a copy. It opens with a telling acknowledgement: “most of the evidence for the reliability and validity of COMPAS is found in the results of in-house research studies conducted by Northpointe across a variety of jurisdictions and states” (page 2). That is to say, the evidence purporting to show the efficacy of their tools is based on internal data not shared with anyone other than perhaps the agency with which they are under contract. Later on in their response the authors highlight the fact that peer-reviewed papers on COMPAS have subsequently been published, but both of the papers they cite were authored by the same people who authored this response paper. Neither shares the underlying data used in its analysis.

They claim that because agency personnel do have access to this data, their analyses “are often subjected to a more thorough vetting than that provided by the editors or peer-reviewed journals” (ibid., page 2). However, it’s important to remember that these agencies are contracting with companies like Northpointe precisely because they do not have the ability or desire to develop their own tools for this kind of analysis. For example, I found no independent analysis by WIDOC staff in the records responsive to my request, which did include specific mention of any meeting minutes or deliberations related to the bidding process.

Herein lie the limits of technological “efficiencies” in addressing inherently social problems like crime, punishment, and justice. Vague variables like “criminal thinking” provide ample room for clinical and correctional professionals to conclude that, for example, someone describing the effect of larger social forces on the circumstances of their crime is demonstrating a lack of remorse or unwillingness to accept responsibility for their actions. This is not hypothetical, as we will see later.

Consider one of the features COMPAS claims to aid in analyzing: an “inmate’s behavioral adaptation to prison” to determine if, for example, they could be moved to a less restrictive prison or become eligible for things like work release. Northpointe lists the factors their system uses to determine these “behavioral adaptation ratings”:

  • Cooperation with staff
  • Respect vs. Disrespect to staff
  • Completion of work tasks
  • Program successes vs. failures
  • Defiant
  • Aggressive to staff
  • Tries to Con staff [sic]
  • Troublemaker with other inmates
  • Victimizes weaker inmates
  • Quick Temper, etc.

Using this scale, they found “floor officers can reliably assess an inmate on several key behavior dimensions within minutes using this short checklist (e.g. less than 4 minutes)”. They caution that whoever is performing this analysis should know the inmate well enough to provide “a reasonably fair assessment” of their adjustment. These criteria are further refined in order to classify inmates into a number of “behavioral classes”. “The results are very encouraging and we found that the ‘inmate classes’ were validly linked both to prospective disciplinary levels, criminal history patterns, and also to several main criminogenic factors (e.g. criminal personality, criminal attitudes)” (Attachment B, page 139).

I think it’s important to focus on a couple of aspects of these adaptation criteria. Firstly, there is the emphasis on how quickly they can be completed by staff. Little attention is paid to establishing any guidance for how long prison staff should know a particular prisoner before making these assessments. This is apparently left to the institution or staff themselves to decide. There is precious little discussion of how prison staff supervising those completing these analyses can check the work of their subordinates, though naturally Northpointe does include costs for “training the trainers” in their bid (approximately $11,000 in the most recent contract renewal).

Secondly, the criteria are almost exclusively focused on interactions with prison staff rather than the thoughts, emotions, behaviors, or actions of the prisoner themselves. When I would visit a WIDOC prison, I witnessed numerous staff members get visibly angry to the point of shouting at other prison visitors, including the elderly and small children, for very minor issues including moving beyond a taped line on the floor while waiting to be processed or failing to notify the prison in writing in advance that they would be using a wheelchair. Anecdotes are anecdotes, but personally these are not the kind of people I would want making snap judgments about my behavior (“less than 4 minutes”) that could determine whether I spend another year or more in prison.

One might argue that these are implementation problems as opposed to methodological flaws, but the emphasis on staff interaction shows that these criteria have little to do with characteristics of the prisoner and more to do with the attitudes of staff towards prisoners. Of course characteristics of the prisoner are not completely absent, but as these bidding materials show, even when they are included it is not without issue. In their bid Northpointe included some sample COMPAS reentry narratives and bar charts to demonstrate how the tool can be used to evaluate an individual’s risk. Here is a sample:

Bar chart showing sample from COMPAS system showing the re-entry risk factors for a sample prisoner. Provided by Northpointe as part of its bid materials.

These are accompanied by a narrative assessment that is meant to elaborate on what some of these factors mean, though confusingly they do not map exactly onto what is being shown in the bar chart. For example, while criminal history (both personal and familial), mental health, substance abuse, and ReEntry Vocation/education are present in both the narrative and the bar chart, all of the factors shown in the chart under Personality/Attitudes are reduced to a single section in the re-entry narrative as “Cognitive Behavioral/Psychological Score”, which in this example shows a score of 10 or “highly probable”. The section of the sample narrative assessment where a “Cognitive Behavioral/Psychological Statement” could be is literally left blank.

There are training materials for how these COMPAS scales should be completed by prison staff, but one of the supposed benefits of the COMPAS software is that it can be customized to fit a variety of criminal justice settings, from pre-trial release to probation and parole decisions. They suggest that a subset of criteria be used to “triage” all offenders within a probation agent’s purview, with the full scale used for only “higher risk offenders” (“Meaning and Treatment Implications of COMPAS Score”, page 4).

This slide deck also sheds some light on what is meant by “criminal thinking”, which is apparently determined using the questionnaire shown below:

Sample questionnaire used by COMPAS to assess a person's criminal thinking. It includes three sections describing how the scale is measured, notes and treatment implications, and sample scale items

Consider statements like “A hungry person has a right to steal” or “The law doesn’t help average people”. If I were asked for my reaction to these statements I would almost certainly strongly agree. Apparently this means I may be in need of “cognitive restructuring”. If you want a definition of what that means you will have to file an open records request of your own.

Here we should return to where I opened, with Brennan et al. expressing a desire to use networked computers to improve the efficiency and effectiveness of classifying those in criminal custody. The desire to reduce such an inherently complex question as “will someone convicted or accused of a crime commit another crime in the future?” to a set of numbers is implicitly linked to a desire to outsource more and more of this work to computers. After all, computers are indeed better able to evaluate a set of numerical variables to predict a given outcome than a human would be if they attempted to do the same calculations by hand. Appeals for more data by policymakers are usually a request for more numbers to be analyzed, as opposed to non-numerical kinds of data such as oral or written personal histories or the notes from a psychological evaluation conducted by a licensed therapist. Furthermore, those analyzing the data often have more incentives to keep people incarcerated (or at least disincentives for release) than they do to move in the opposite direction, and this will invariably color decisions at either the institutional or systemic level.

In spite of these problems, the contract with the WIDOC has been lucrative. In their request to extend their current contract, Northpointe, now called equivant, puts the cost of the licenses and hosting fees for COMPAS at over $930,000. The total cost, which includes project management, technical support, and training, comes in at over $1 million (“Exhibit_A_-_WI_DOC_Contract_Renewal_Price_Proposal_FY22”). As is common with government contracting, this is far above the cost submitted with the initial bid. In their original cost proposal, Northpointe estimated the cost over the life of the contract (up to 7 years) to be somewhere between 2 and 3 million dollars in total (“Northpointe_Cost_Proposal-Options_1_2_with_Notes.pdf”).

This brings me to my final point regarding what ultimately led me to request these documents in the first place. Services like COMPAS purport to improve the efficiency and effectiveness of prison operations, but in reality they often reinforce existing systemic issues while also providing plausible deniability in the form of a seemingly objective numerical rubric by which incarcerated people are evaluated. It is much easier for a DOC official to justify keeping someone in prison for any reason if they can point to a score on a chart instead of defending the decision solely on their own words and judgment. I’ve heard from others with loved ones in the WIDOC that they have faced many hurdles trying to request copies of COMPAS evaluations regardless of whether the person incarcerated has given their personal permission.

These systems have a clear impact on the persistence of mass incarceration because of their use in determining when and if someone is released. While the war on drugs and overzealous prosecution have been correctly highlighted as leading to mass incarceration, an often overlooked factor is the length of sentences and the difficulty of being released on parole or other forms of supervised release. Interrogating how systems like COMPAS are used by prison and jail administrators can hopefully address this issue and I hope that by making these records available I can aid in that effort.

There is much more to be found by going through these documents, certainly more than I could hope to cover in one post. There are materials from other vendors who also sought this contract, additional training and sample materials from Northpointe, and specific documents related to how these services operate in women’s prisons or those for juveniles. They can be browsed or downloaded using the link below.

“All experiences are real”: A Deep History of Havana Syndrome (Part I)

Every Tuesday in my fifth grade science class our teacher asked us to share any recent animal sightings. If you claimed to have seen something that you didn’t recognize, she’d invite you to look through a small library of wildlife guides to see if you might be able to identify it. 

For someone like me, who did not particularly like science class but did like books, this was a great arrangement. I could spend 15-20 minutes of a 40 minute class period looking through books with color illustrations of wildlife instead of filling out worksheets. All I needed to do was come up with an animal that matched the made-up description I had given her at the beginning of class.

Over time, my reports of increasingly exotic birds in and around Minnehaha Creek did start to raise suspicions, and eventually she determined that my reported findings were not of scientific use to the class.

Though my teacher eventually did get wise to this arrangement, medical doctors examining US diplomats do not seem as confident expressing skepticism. One person familiar with how the incidents are now being treated said in a December 2021 Washington Post article that “the fundamental statement we use is that all symptoms are real, all experiences are real”. 

As Bartholomew and Baloh lay out in their book on the subject, the most plausible explanation for the Havana Syndrome phenomenon is mass hysteria. Their book, however, focuses on the chronology of Havana Syndrome events and the history of mass hysteria. Informative as it was, I could not shake the feeling that something deeper was informing how and why this was unfolding in the way that it did. 

For starters, there was the assumption that old Cold War enemies must be somehow involved, even if it defied the logic of diplomacy and the laws of physics. Though easy to dismiss as just another invocation of one of America’s Official Enemies, it’s an interesting detail given that this new mass psychogenic illness seems to primarily affect US government personnel and their families working overseas. 

There were also the repeated complaints from those afflicted that their illness was not being taken more seriously. This has won them many champions in Congress, who in the midst of a pandemic passed the HAVANA Act, which was designed to make sure that those being attacked by these mysterious symptoms could receive treatment for their injuries. The bill was passed unanimously in both the House and Senate and signed into law on October 8, 2021.

Mass hysteria is both an individual psychological ailment and a contagion that can only exist in groups. It often occurs during a time of heightened stress, manifesting itself in individual cases but depending on certain group dynamics to spread. To understand Havana Syndrome then requires understanding what was going on among US officials stationed in embassies in late 2016 and early 2017 that would cause such a reaction. 

Trouble in Foggy Bottom

On February 11, 2017, the New York Times ran an article with the headline ‘A Sense of Dread’ for Civil Servants Shaken by Trump Transition based on interviews with employees of many different agencies.

In a grim preview of his rehabilitation into a #Resistance hero, one EPA official said that while some had “bristled” at the industry-friendly regulations enacted by George W. Bush, “at no point did they feel the alarm they do now.” 

On May 17, 2017 Acting Attorney General Rod Rosenstein appointed Robert Mueller as Special Counsel to investigate possible coordination between the Trump campaign and Russia in the 2016 election. This was the culmination of speculation that had begun before Trump’s victory, but its immediate impetus was his firing of FBI director James Comey on May 9.

It’s not necessary for me to recount the story of his investigation because it was everywhere. In the end, as often happens, most of the people who followed this investigation closely saw more or less what they wanted to see. At no point was this clearer than upon the release of Mueller’s much-anticipated report. To those waiting for a very Mueller Christmas, the report, along with the various indictments that came out of the investigation, vindicated their suspicions at the precise moment that Trump himself was lauding Mueller’s findings as a total acquittal.

On November 24, 2017, the New York Times ran an article headlined Diplomats Sound the Alarm as They Are Pushed Out in Droves. It described a simmering conflict between then-Secretary of State Rex Tillerson and senior staff in the State Department. Even before he had been confirmed, Tillerson’s staff fired a number of senior foreign service officers and froze most new hiring. Tillerson made no secret of his belief that the department was too large and hoped to eliminate almost 2,000 positions. The article closes with a quote from Dana Shell Smith, former US ambassador to Qatar: 

“These people either do not believe the U.S. should be a world leader, or they’re utterly incompetent,” she said. “Either way, having so many vacancies in essential places is a disaster waiting to happen.”

The head of the American Foreign Service Association (AFSA), Barbara Stephenson, offered a similar assessment. In the December 2017 issue of the Foreign Service Journal, the official publication of the AFSA, she titled her statement Time to Ask Why:

[T]he need to make the case for the Foreign Service with fellow Americans and our elected representatives has taken on a new urgency. The cover of the Time magazine that arrived as I was writing this column jarred me with its graphic of wrecking balls and warning of “dismantling government as we know it.”

While I do my best, as principal advocate for our institution and as a seasoned American diplomat, to model responsible, civil discourse, there is simply no denying the warning signs that point to mounting threats to our institution—and to the global leadership that depends on us.

By firing long-time employees and leaving many high level positions unfilled, the new administration seemed to be cutting off the head of an institution that operates on what Stephenson referred to in her 2017 testimony as the “up-or-out principle”. This means, roughly speaking, that employees must continue to advance in the organization or face dismissal.

That this is the tone of public comments among State Department personnel suggests that there was a lot of tension internally. Conflict between the State Department staff and those heading the agency had crossed from a political disagreement into an existential struggle.

Havana Syndrome Emerges 

On May 23, less than a week after Mueller was first appointed, two Cuban officials stationed at the recently reopened Cuban embassy were asked to leave the US, an act of diplomatic retaliation. Unlike the Mueller investigation, this act was not widely publicized. In fact, the State Department only revealed it a few months later at a briefing on August 9. 

That same day, the Associated Press published an article where anonymous officials suggested that the diplomats had been “exposed to an advanced device that operated outside the range of audible sound and had been deployed either inside or outside their residences”. This is the first mention of any kind of device being responsible for Havana Syndrome.

In addition to these mysterious events, US-Russia relations were marked by a more overt form of diplomatic tit-for-tat. On August 2, 2017 Trump signed additional sanctions against Russia, Iran, and North Korea into law. Not long after, Putin announced that the US would have to reduce the size of its diplomatic mission in Russia. When asked about this at his New Jersey golf club, Trump told reporters:

“I want to thank him because we’re trying to cut down our payroll, and as far as I’m concerned, I’m very thankful that he let go of a large number of people because now we have a smaller payroll.”

The remark drew swift and stern condemnation from members of the foreign service. Ironically, most of the positions eliminated were likely held by Russian nationals, but to former ambassadors like Nicholas Burns, the remark “justified mistreatment of US diplomats by Putin”.

The State Department held its first dedicated press briefing on the subject of Havana Syndrome on September 29, 2017. At this briefing they announced that all non-emergency personnel assigned to the US embassy in Havana would be returning to the United States with their families. The State Department said that this was in response to “attacks” instead of “incidents” as they had been described before. 

Among the effects of these mysterious attacks were “ear complaints, hearing loss, dizziness, tinnitus, balance problems, visual complaints, headache, fatigue, cognitive issues, and difficulty sleeping.” When asked if anybody else at the hotel where some of these attacks supposedly occurred had experienced similar symptoms, one of the unnamed State Department officials gave an interesting answer: 

[W]e’re not aware of any hotel staff or other individuals who have been attacked or suffered these systems [sic] beyond the U.S. Government personnel at the hotel. And in terms of our Cuban staff at the embassy, we’re not aware of any incidents involving them or attacks involving them. The victims that we’re aware of are the 21 U.S. Government personnel.

The State Department used the fact that these incidents only seemed to affect US government personnel to conclude that they must be the result of some kind of attack. However, this could also be explained by mass hysteria, since symptoms often travel within particular social groups, in this case US officials.

It’s worth mentioning now that “Havana Syndrome” is a misnomer. A syndrome refers to a group of symptoms that all occur together which suggest the presence of a disease. This is not what we see among Havana Syndrome patients. Not only do the potential exposures vary in location and duration, but the symptoms themselves are not consistent from person to person. This also suggests mass hysteria. 

In December 2017 Marc Polymeropoulos took a trip to Moscow. According to an interview with The World, he was there to meet the ambassador and embassy, routine for a longtime CIA intelligence officer. Then, it struck:

it was on the night of Dec. 5, I woke up to a start. I had vertigo. I had a terrible headache, tinnitus, which is ringing in my ears — something really, really traumatic had happened to me. I had been in Afghanistan, and Iraq, and other places. I served over three years after 9/11 in war zones. I’ve been shot at. I put myself in harm’s way. But this was the scariest moment of my life. And so I knew something terrible had happened. I made it through about 10 days with the symptoms on and off. I came back to the United States and then the symptoms got particularly awful. And about March, April of 2018, to the point where I couldn’t work anymore. And after really seeing numerous doctors and undergoing just this incredible journey of trying to find out what happened, I, you know, I couldn’t drive for a while, I lost my long-distance vision. And so, ultimately, I had to retire from the CIA in July of 2019.

Later in the interview, Marc speculates that Russia must have been involved. Unfortunately he does not explain how whatever attacked him in Moscow followed him back and in fact got worse after he returned to the United States.

Marc Polymeropoulos in Moscow 2017 (NPR)

The Havana Syndrome phenomenon then spread to other geopolitical hotbeds. In June 2018, the New York Times reported that diplomatic staff in Guangzhou, China, including Mark Lenzi, were being sent home following reports similar to those from the diplomats in Cuba. From their article:

Mr. Lenzi said that over the past year he and his wife had experienced similar physical symptoms, including headaches, sleeplessness and nausea, and on three or four occasions they heard odd noises, though they did not put them together until the disclosures last month.

The footnotes of the Mueller report show that Lenzi was actually interviewed as part of the investigation on January 30, 2018, not long after he first started showing symptoms (p.133). The subject of their interview was Konstantin Kilimnik, whom Lenzi worked with at the International Republican Institute’s (IRI) Moscow office in the early 2000s. At the time, according to Lenzi, Kilimnik was fired from the IRI because of his close ties to Russian intelligence, though another IRI official seems to have remembered it differently.

In addition to his foreign service positions, Lenzi was a deputy spokesman for John McCain’s 2008 presidential campaign and then worked for the New Hampshire Republican Party. In 2016, he told a local New Hampshire news station that despite his party affiliation he would be voting for Hillary Clinton:

Lenzi, 42, said that after working on NATO-related issues, he was dismayed to read Trump’s criticisms of NATO. Trump told the New York Times this week that as president, he would condition U.S. support of NATO allies on whether “they fulfill their obligation to us.”

“It’s not just that Trump is pro-authoritarian. He goes against what I was trying to with the International Republican Institute. There is a palpable fear in these countries about him becoming president.”

“As Trump went after mentors of mine personally – John McCain and Lindsey Graham — and opposes the principle I’ve worked for overseas.”

More recently, Lenzi has filed a lawsuit against the State Department over workplace discrimination after seeking accommodations under the Americans with Disabilities Act. The accommodations, which include a reduced workload and the ability to use tinted sunglasses, were granted, but he alleges that the State Department began reassigning him to jobs below his qualifications after he was afflicted by Havana Syndrome.

The filing provides a more detailed glimpse into the internal process of evaluating these claims and demonstrates the power of social networks in spreading psychogenic illness. Beginning in November 2017, while working as part of diplomatic security, Lenzi says he and his family started experiencing “headaches, lightheadedness, nausea, nosebleeds, sleeplessness, and memory loss,” which he reported to his superiors. In April, unbeknownst to Lenzi, another employee, whom Lenzi described as his “closest American neighbor” (it’s unclear if they were literally neighbors), was sent home from Guangzhou to be evaluated at the University of Pennsylvania. 

Not long after, the State Department sent out a technician to evaluate the living quarters of the employee who had just been evacuated. As Lenzi was working in diplomatic security, this technician needed to go through Lenzi to obtain the necessary equipment. According to Lenzi, the person evaluating the room was not doing a thorough job, so he reported his concerns to his supervisor. His supervisor apparently did not share Lenzi’s concerns and told Lenzi that he was “being too emotional about an equipment issue.”

Mark Lenzi maintains emotional control when discussing equipment issue with 60 Minutes CBS News

In May 2018 the Consul General held a town hall for employees and informed them that the technician had found nothing out of the ordinary. Lenzi was upset. A few days after this meeting, he got in touch with his former “closest American neighbor”. From the filing:

On or about May 26, 2018, Mr. Lenzi contacted his former neighbor, who had been medevac’d to University Pennsylvania Hospital in Philadelphia. Mr. Lenzi informed her of the numerous symptoms he and his family had been experiencing over the past six months. After Mr. Lenzi described their short-term memory loss, Mr. Lenzi’s former neighbor stopped him and said that Mr. Lenzi needed to get himself and his family out of their apartment “right now.” She went on to say that she had pleaded with the State Department on three different occasions to inform and get her American consulate neighbors out of the Tower 7 Apartment Complex in Guangzhou, but that each time the State Department did nothing.

One day later he sent an email warning his diplomatic colleagues of these incidents. Eventually he and his family also left to be examined at the University of Pennsylvania’s Center for Brain Injury and Repair, which has examined multiple State Department employees claiming to suffer from Havana Syndrome. 

About a year after he left China, doctors at the University of Pennsylvania published a study titled “Neuroimaging Findings in US Government Personnel With Possible Exposure to Directional Phenomena in Havana, Cuba” in the Journal of the American Medical Association (JAMA). The article does not mention any of the symptoms experienced by the cohort. Instead, the authors compared brain images to find “potential differences in brain tissue volume, microstructure, and functional connectivity in government personnel compared with individuals not exposed to directional phenomena.”

This study was widely cited as being proof that employees had suffered “brain damage” and therefore could not be suffering from any kind of mass hysteria. However, the study had a number of important limitations: 

the analysis involved a small sample with high heterogeneity, compounded by clinical (as opposed to research) neuroimaging acquisition. In the absence of a common clinical severity score due to varied symptomatology, the cohort cannot be subdivided. … Additionally, it cannot be determined whether the differences among the patients are due to individual differences between patients or differences in level and degree of exposure to an uncharacterized directional phenomenon.

Not only did the symptoms from patients not match, but the samples involved were both small and highly varied. Therefore this analysis is only capable of showing differences between the group of diplomats studied and the reference cases as two separate groups. Nothing detected in the analysis could explain specific differences in brain matter between individual diplomats, to say nothing of whether or how an “uncharacterized directional phenomenon” might have caused them. 

Though it contains lots of descriptions of his experiences with Havana Syndrome, the substance of Lenzi’s suit is employment-related. One of his chief complaints is being denied job placements overseas after reporting his symptoms. Recall that his symptoms began at the end of November 2017, about two months after the US evacuated many of its employees from the embassy in Havana and at the same time the New York Times was reporting on conflict between career foreign service employees and Secretary of State Rex Tillerson over unfilled positions. Not long after, the head of the American Foreign Service Association warned in the association’s official publication of the dire threats facing the country and its members in particular.

Early in their book, Baloh and Bartholomew dismiss a common misconception with respect to mass hysteria or mass psychogenic illness. Those suffering from these ailments are likely not self-consciously faking their symptoms or experiences. Instead the illness is itself a reaction to someone’s surroundings, both social and physical. In the late 19th century there were reports of vague illnesses associated with telegraph lines and steam engines as the devices came to play a larger and larger role in people’s lives. Often how the disease is described and spread can reveal deeper issues that might not be clear even to those reporting the symptoms. 

In this two-part essay I argue that Havana Syndrome is itself symptomatic of a crisis of strategy and identity that most acutely affects people at the highest levels of US foreign policy. Though in some ways precipitated by the chaos of the Trump years, it has its roots in the inconsistencies and convenient fictions that formed the foundation of US policy and informed the structure of the US state itself during the Cold War.

Building on the massive mobilization during World War II, the US underwent a significant reorganization of its defense and intelligence apparatus based on the belief that it was engaged in an ideological death struggle. Assumptions made about the USSR and Russia in these early days would develop into institutional reflexes for US diplomats and intelligence operatives. These would form the basis of not just US policy but the creation of organizations like NATO.

When their opponents in this existential struggle voted themselves out of existence without firing a single ICBM (with lots of American help to be sure), both the US and NATO faced a strategic vacuum that was ultimately filled by the threat of terrorism. Though this was sufficient grounds for another massive reorganization of the national security state and a dramatic expansion of the US military’s presence overseas, these foundational fictions remained at the heart of US policy and persist unconfronted up until the present. Though these national security types may be the ones showing symptoms, as these fictions run aground against the changing geopolitical reality of the 21st century, nearly all Americans will have to confront them one way or another.

World War II and the National Security State

Though the US did have various informal and often ad-hoc intelligence operations during the 19th and early part of the 20th century, the modern national security state as we know it today is largely a product of World War II. In many cases, changes made in the immediate aftermath of the war were codifying changes that had either been discussed or implemented in a preliminary way during the war. 

Prior to the signing of the National Security Act in 1947, the US had a Department of War and a Department of the Navy, both having been established in the first decades after the American Revolution. The State Department had also existed since the country’s founding, and had been considered the lead department for all peacetime foreign relations matters.

Now, the Department of War would be split into separate Air Force and Army departments, joining the Department of the Navy under a new structure called the National Military Establishment. This new group was to be headed by a single cabinet-level Secretary of Defense, replacing the Secretary of War. In 1949 an amendment to the Act was signed which officially created the Department of Defense from this National Military Establishment. Many activities which the US military had begun as part of the wartime effort were also slotted into this newly reorganized military structure.

Truman signing the 1949 Amendment to the National Security Act

The US did not have a governmental agency primarily responsible for foreign intelligence work until President Roosevelt established the Office of Strategic Services in 1942. At war’s end, Truman established the National Intelligence Authority via executive order to oversee the Central Intelligence Group. This National Intelligence Authority consisted of the Secretaries of War, Navy, and State, and the president’s chief of staff. With the signing of the National Security Act this group was renamed the National Security Council (NSC). This group would, among other things, be nominally tasked with overseeing the operations of the newly created CIA.

Though the exact membership of the NSC has changed as new departments are created and presidential priorities change, this structure still forms the foundation of presidential authority over US military and intelligence capabilities. It should also be noted that in addition to these bureaucratic links between diplomacy and intelligence, for much of the 1950s the State Department and CIA were run by two brothers, John Foster and Allen Dulles.

In addition to reorganizing the institutions of government themselves, the National Security Act realigned US politics around the need for permanent defense mobilization against any existing or potential threats. While there was no shortage of bureaucratic jostling for position within this structure, the foreign intelligence, diplomacy, and military capabilities of the US were now firmly and irrevocably fused together under the banner of US national security.

These organizational changes mirrored or perhaps precipitated ideological changes in America. As Douglas T. Stuart, a historian who teaches at the Army War College, wrote in a study of the National Security Act: 

Over time, the concept of national security displaced national interest as the leitmotif of American foreign policy, and it became increasingly difficult for U.S. policymakers to calculate American interests unless they were framed, and justified, by reference to national security … In accordance with the logic of national security, policymakers were predisposed toward worst case scenarios and tended to favor military instruments of power and influence. (p. 303)

Stuart, Douglas T. Ministry of Fear: The 1947 National Security Act in Historical and Institutional Context
International Studies Perspectives, Volume 4, Issue 3, August 2003, Pages 293–313, https://doi.org/10.1111/1528-3577.403006

Though Stuart frames this result as an unintentional consequence, considering it alongside the ideological origins of the containment doctrine suggests this may as well have been the purpose of this reorganization all along. By laying the groundwork for an extensive system of intelligence collection and covert activity, and by creating procedures for consultation between the civilian and military agencies involved in national security planning, the National Security Act gave the United States a framework designed for anti-Soviet containment.

Birth of Containment

George Kennan

George Kennan was working as deputy chief of mission at the US embassy in Moscow. It was from this post that he wrote his famous Long Telegram in 1946. The telegram, addressed to the Secretary of State but widely circulated throughout the government, lays out Kennan’s view of the Soviet state at the end of World War II. Following his assessment of the Soviet ideology and state structure, he writes: 

It should not be thought from above that Soviet party line is necessarily disingenuous and insincere on part of all those who put it forward. Many of them are too ignorant of outside world and mentally too dependent to question self-hypnotism, and who have no difficulty making themselves believe what they find it comforting and convenient to believe. … The very disrespect of Russians for objective truth–indeed, their disbelief in its existence–leads them to view all stated facts as instruments for furtherance of one ulterior purpose or another. There is good reason to suspect that this Government is actually a conspiracy within a conspiracy; and I for one am reluctant to believe that Stalin himself receives anything like an objective picture of outside world. Here there is ample scope for the type of subtle intrigue at which Russians are past masters.

The picture Kennan paints of the Soviet Union is of an intensely paranoid and unpredictable adversary run by men so blinded by ideology that they may not even realize the depths of their self-deception. Officials operating within the Soviet government are encouraged to ignore any information which would contradict the official ideology. Therefore it is impossible to expect that the average Soviet citizen has anything approaching an understanding of the world around them. It is within this hall of mirrors that, according to Kennan, the Soviets have ample room to advance their own anti-American agenda as only they know how. Elsewhere in the telegram Kennan says that his assessment of the Soviet state should not be taken as an assessment of the Russian people, but I find this difficult to reconcile with his allusions to Russian mastery of “subtle intrigue” and Russian “disrespect for objective truth.”

According to an article written by his biographer C. Ben Wright, earlier that year Kennan told the audience at the Air War College that “with probably ten good hits with atomic bombs you could, without any great loss of life or loss of the prestige or reputation of the United States as a well-meaning and humane people, practically cripple Russia’s war-making potential.”

Even for the heady days of total US nuclear superiority, the idea of dropping ten nuclear bombs on a country “without any great loss of life” strikes me as basically insane. Nevertheless, Kennan would often contend that people citing his own words while arguing for aggressive action were misinterpreting his true intentions.

If Kennan’s stance was paradoxical, he would argue that with Russia this simply comes with the territory. Writing in a letter, he declared “in the consideration of Russian matters, [whenever] there is a question as to whether this or that, the answer is usually ‘both.’”

In the same article, Wright cites an unpublished Kennan essay titled “The United States and Russia” that gives some concrete suggestions as to what Kennan may have meant, in his own terms:

  • Don’t act chummy with them.
  • Don’t assume a community of aims with them which does not really exist.
  • Don’t make fatuous gestures of good will.
  • Make no requests of the Russians unless we are prepared to make them feel our displeasure in a practical way in case the request is not granted.
  • Take up matters on a normal level and insist that Russians take full responsibility for their actions on that level.
  • Do not encourage high-level exchanges of views with the Russians unless the initiative comes at least 50 percent from their side.
  • Do not be afraid to use heavy weapons for what seem to us to be minor matters.
  • Do not be afraid of unpleasantness and public airing of differences.
  • Coordinate . . . all activities of our government relating to Russia and all private American activities of this sort which the government can influence.
  • Strengthen and support our representation in Russia.

Just as the US was fusing together its new intelligence and military agencies, Kennan was creating an image of the USSR and Russia that would inform how the inhabitants of this new national security apparatus saw their Cold War adversary. Attempts at normal diplomatic relations were futile, and instead the US policy should be to contain and isolate the Soviets (and indeed any communist or socialist government) lest the contagion spread.

While the US was molding its own bureaucratic and political ideology into its Cold War form, the transnational arm of the anti-Soviet coalition was also taking shape: NATO. 

Diplomatic Origins of NATO

It’s easy to see the breakdown of the Allied powers as inevitable given the ideological differences between the parties, but the events themselves are interesting because the legal and political structures which emerged as a result of this fallout did much to shape the events of the ensuing decades.

The North Atlantic Treaty Organization (NATO) officially came into existence with the ratification of the North Atlantic Treaty in 1949. The basis of the alliance took shape during secret negotiations between the US, UK, and Canada held at the Pentagon in March 1948 (see Wiebes and Zeeman 1983).

On April 1, 1948 the State Department produced minutes of these meetings which represented a preliminary agreement for what would become NATO. The most important aspect from a geostrategic point of view is the concept of mutual defense. This principle would form the basis of NATO’s Article 5, also known as the mutual assistance pledge. Then as now, which countries should be party to this agreement was the subject of intense debate, though establishing it as a regional alliance was important legally because such arrangements are allowed under Chapter VIII of the UN Charter.

Secretary of State Dean Acheson signs North Atlantic Treaty with President Harry Truman and Vice President Alben Barkley.   

On April 1, 1954, almost five years after the North Atlantic Treaty was ratified by the United States, the Soviet Union presented a proposal to join the alliance itself:

Plainly enough, given the proper conditions, the North Atlantic Treaty Organization could lose its aggressive character, that is, if all the big powers which belonged to the anti-Hitler coalition became its participants. In view of this the Soviet Government, guided by the unchanged principles of its foreign policy of peace and desirous of relaxing the tension in international relations, states its readiness to join with the interested governments in examining the matter of having the Soviet Union participate in the North Atlantic Treaty.

The Soviets already perceived NATO as a de facto anti-Soviet alliance, but the immediate impetus for this request was the establishment of the European Defense Community via a treaty that was signed on May 27, 1952 but never ratified. This proposed arrangement was to include a newly rearmed West Germany, which had become a highly contentious issue in the years following the war.

The Bundeswehr would not come into existence until 1955, but its formation was the product of many years of effort among officials in West Germany, including former members of the Nazi Wehrmacht like Hasso von Manteuffel. I will admit to being a partisan of the Soviet view on these matters, but even setting that aside I find it difficult to see in this proposal the kind of irrationality or paradoxical thinking that Kennan presents in his writings. Is there really no community of aims among the former anti-Nazi alliance? Nevertheless, this request was rejected and the Cold War counterpoint to NATO, the Warsaw Pact, was established the following year.

That NATO is an international alliance first conceived through negotiations at the Pentagon between Anglo countries is significant for understanding its real purpose. One of NATO’s most important functions during the Cold War was as a mouthpiece which could claim to speak for an alliance of many countries. NATO would prove essential for US rhetoric during the Cold War as a way of contrasting liberal capitalist democracies with authoritarian Soviet communism. Over time, speaking for “the international community” or later the “rules-based international order” would become common even in cases where it was clear that the US was in the driver’s seat.

Detail from 1956 travel map of Western Europe published by ESSO

As fascinating as all this diplomatic and covert action was, the ultimate backbone of US strategy post-World War II was the nuclear bomb. Though the term “mutually assured destruction” has become a shorthand for Cold War nuclear policy, it is notable that US military strategists were aware of the thorny political issues raised by nuclear weapons even before the Soviets had successfully tested a weapon of their own in 1949. 

Nuclear Strategy and the American Cold War Ethos

A 1946 paper titled The Absolute Weapon: Atomic Power and World Order was one of the earliest attempts at formulating US nuclear policy. Nuclear weapons caused destruction on such an epic scale that they challenged many of the underlying assumptions of military strategy. Even from this early date, though, it was clear that the biggest challenges presented by nuclear weapons would be political, not technological.

The paper identified two political dilemmas posed by nuclear weapons. The first was that the primary political tool for arriving at any agreement between sovereign states is a voluntary treaty. Even if a treaty limiting atomic weapons could be negotiated and signed, the immense power wielded by nuclear-armed states meant that the incentive to break the agreement would only grow as more states signed on.

The second dilemma was that the growth in military air power meant attacks could happen more quickly, so the prospect of any kind of negotiated solution to the threat of atomic attack among any international bodies seemed slim to none. 

The response among political leaders to these twin dilemmas was ably summarized in the report:

After a few early flights of fancy, most of the political analysts lapsed into a discreet silence on the subject. It was quickly apparent that they had been handed one of the toughest problems which the members of their guild had ever had to face. … Each sortie into some promising opening either ended up against a solid wall or led to another tangle of seemingly insoluble problems. No clue could be found to a simple formula which would offer repose to men’s minds while opening up new vistas of unruffled prosperity. In fact there was reason to believe that nothing of the sort would ever be found and that the job was one of arduous and patient examination of a whole mosaic of related problems extending indefinitely into the future. 


Indeed, no straightforward political solution to the problem of nuclear weapons has yet been found. Though some at the time believed no other country could summon the scientific knowledge and technical resources required to create such a sophisticated device, the political benefits of having one all but guaranteed that total US nuclear superiority would not last for long. The work of nuclear strategists and the national security state became developing sufficient countermeasures such that any use of nuclear weapons by one side would all but guarantee that the attacker would also be annihilated by a nuclear counterattack.

In order to reap the geopolitical benefits of this power, the holders of nuclear weapons must convince their opponents that they would indeed use them when faced with an existential threat. In a word: credibility.

It was against the backdrop of this uneasy standoff that the complex politics of the Cold War played out. The main concern of US intelligence, military planning, and diplomacy would be to maintain constant vigilance against any technological or strategic development that might give the USSR enough of an advantage to encourage them to actually fire their weapons.

The authors of the report cited above believed that once other countries reached nuclear parity, it was far more likely that nuclear weapons would be used either at the very beginning of a war or after a few years of sustained fighting, probably by whichever side was losing. They declared it would be “foreign to human nature” for a war to erupt between nuclear powers without involving nuclear weapons, because a conflict great enough to lead to the outbreak of war would be unlikely to leave either side willing to hold its most powerful weapons in reserve.

Bernard Brodie, one of the authors of the 1946 report

Though the US and USSR never did fight a head-on conventional war, looking back on the 20th century we can see that almost the exact opposite of what Brodie predicted happened. Instead of one single moment of annihilation, the Cold War was fought all over the world, with those in the decolonizing world doing the majority of the fighting and dying. Governments were overthrown covertly out of US embassies and weapons of the non-nuclear variety circulated around the world, all while the threat of nuclear weapons waited in the offing in case things really got serious. These wars were fought with covert arming and training of militias and aggressive propaganda efforts, often more intensely concerned with domestic rather than foreign opinion.

Intelligence work has been murky since the days of the Venetian Council of Ten, but now with weapons which could destroy the world many times over with just a few sorties or launches from submarines, the stakes of spook paranoia would reach dizzying heights. Over time, it became clear both to those already at the top of US national security and to those who sought to rise up the ranks that there was rarely much to gain from deflating potential threats.

As the national security state solidified, this constant need for newer threats lurking around every corner became a matter of bureaucratic survival. Disagreements over strategy and tactics between different factions within it would reverberate through US politics, though largely unacknowledged by and often unknown to the general public.

On all sides, though, national security became the prime directive of the US state. Wars overseas were justified in the name of warding off a seemingly omnipresent enemy that, no matter how much the US spent, was always about to pull ahead if it wasn’t ten steps ahead already. The imagery and rhetoric used to convince the American people became more sophisticated alongside similar refinements in the tools for effecting US policy overseas in both covert and overt ways.

What had once been fixed distinctions between wartime and peacetime were increasingly becoming blurred. Whether we were at war or peace became less dependent on formal declarations and more a matter of assessing the levels of violence in a given place and time.

The US may have “won” the Cold War, but the strategies and tactics used to achieve that victory would have a profound effect on both domestic and international politics up to the present. These effects and their connection to this string of bizarre incidents known as Havana Syndrome will be the subject of part 2.