The changing complexity of congressional speech


(This post was prepared in collaboration with Dan Drinkard)

Congress now speaks at almost a full grade level lower than it did just seven years ago, with the most conservative members of Congress speaking on average at the lowest grade level, according to a new Sunlight Foundation analysis of the Congressional Record using Capitol Words.

Of course, what some might interpret as a dumbing down of Congress, others will see as more effective communication. And lawmakers of both parties still speak above the heads of the average American, who reads at between an 8th and 9th grade level.

Today’s Congress speaks at about a 10.6 grade level, down from 11.5 in 2005. By comparison, the U.S. Constitution is written at a 17.8 grade level, the Federalist Papers at a 17.1 grade level, and the Declaration of Independence at a 15.1 grade level. The Gettysburg Address comes in at an 11.2 grade level and Martin Luther King’s “I Have a Dream” speech is at a 9.4 grade level. Most major newspapers are written at between an 11th and 14th grade level. (You can find more comparisons here)

All these analyses use the Flesch-Kincaid test, which produces the "reads at an nth-grade level" terminology that is likely familiar to many readers. At its core, Flesch-Kincaid equates higher grade levels with longer words and longer sentences. It is important to understand the limitations of this metric: it tells us nothing about the clarity or correctness of a passage of text. But although it is admittedly a crude tool, Flesch-Kincaid can nonetheless provide insights into how different legislators speak, and how Congressional speech has been changing.

To see how different legislators rank, click here for a full database of all current members of Congress.

To see how many top SAT words lawmakers speak, click here.

Historical trends

Overall, the complexity of speech in the Congressional Record has declined steadily since 2005, with the drop among Republicans slightly outpacing that for Democrats (see Figure 1). Through April 25, 2012, this year's Congressional Record clocks in at a 10.6 grade level, down from 11.5 in 2005.

Between 1996 and 2005, Republicans overall consistently spoke about two-tenths of a grade level higher than Democrats. The exception was 2001, when a rare moment of national unity also seems to have extended to speaking at the same grade level. But after 2005, something changed, and Congressional speech has been on the decline since. For Republicans as a whole, the decline was from an 11.6 grade level to a 10.3 grade level in 2011 (up slightly to 10.4 in 2012 so far). For Democrats, it was a decline from 11.4 to 10.6 in 2011 (also up slightly, to 10.8 in 2012 so far).

Figure 1. Congressional speech grade level by year




Ideology and speech complexity

To analyze the relationship between ideology and speech level, we took the first dimension DW-Nominate scores (DW1) for the current Congress, as of April 25, 2012. For the non-political scientists in the audience, DW1 scores take roll call voting data to place members of Congress on a liberal-conservative scale. On this scale, -1 is most liberal and 1 is most conservative. A negative value on the scale implies that the member votes most often with Democrats; a positive value implies that the member votes most often with Republicans.

Turning to Figure 2, we can immediately see that the grade level of Congressional Record speeches declines among Republicans as the voting record becomes more conservative. Among Republicans, the drop from the most moderate to the most conservative is, on average, almost three whole grade levels, from 13th to 10th grade.

Among Democrats, the scatterplot does not reveal any relationship between grade level and ideology. However, when we hold all other factors constant in the regression analysis (see further below), we find that being on the far left is associated with lower speech grade levels. There is also a clearer correlation between further left voting score and lower grade level among more junior members.

Figure 2. The relationship of ideology to speech grade level




Changing members and members’ changes

It’s hard to pinpoint the exact cause of the decline. Perhaps it reflects lawmakers speaking more in talking points, and increasingly packaging their floor speeches for YouTube. Gone, perhaps, are the golden days when legislators spoke to persuade each other, thoughtfully wrestled with complex policy trade-offs, and regularly quoted Shakespeare.

The data indicate that part of the decline has to do with new junior members speaking at a lower grade level than more senior members, and some of it has to do with individual senior members simplifying their speech over time.

Figure 3 (below) breaks Congress into four seniority cohorts and details the relationship between ideology and grade level for speeches in the 112th Congress.

Here, a telling pattern emerges. Among the newest members (those with 1-3 years in their seat), there is a drop-off in speech level as we move from the center out to either extreme of the political spectrum, though the pattern is more pronounced on the far right. For the next cohort (4-10 years of experience), the same pattern continues on both the political right and left, though the relationship is much stronger among Republicans.

For the next cohort (11-20 years in their seat), the pattern on the right (more conservative, simpler speech) remains, but the pattern on the left reverses (there is a slight correlation between more liberal voting and higher speech grade level). In the most senior cohort (more than 20 years in their seat), Republicans speak, on average, at a higher level than Democrats, with only the slightest relationship between conservatism and simpler speech.

Figure 3. Ideology and Seniority



At the individual level, prior to the 109th Congress (2005-2006), both individual Democrats and Republicans on average grew more sophisticated in their speech with each passing session of Congress. Individual Democrats gained on average 0.06 grade levels per session, and Republicans gained on average 0.12 grade levels per session. Then, starting with the 109th Congress, the trends reversed. Individual Democrats began dropping 0.07 grade levels of speech per session and individual Republicans began dropping 0.12 grade levels per session.


Table 1. Average estimated effect of each passing Congress on individual member grade level (results from regression analysis estimating annual member change with member fixed effects)


The top and bottom lawmakers by grade level

Table 2 (below) shows the 20 members of Congress with the lowest grade level score for their Congressional Record corpus dating back to 1996. Of them, 85% (17 of 20) are Republicans; 65% (13 of 20) are freshmen, and another 15% (3 of 20) are sophomores. Additionally, 90% (18 of 20) are House members. The two Senators to make the bottom 20 are Rand Paul (R-KY) and Ron Johnson (R-WI), both Tea Party-supported freshmen.

Table 2. Bottom 20 speakers by grade level (all speeches since 1996)

Republicans also outnumber Democrats among the members who speak at the highest grade levels. Among the top 20, 12 are Republicans, 7 are Democrats, and one (Joe Lieberman) is an Independent. And eight of the top ten are Republicans. There are also 14 House members and six Senators. And perhaps most notably, there are only two freshmen and three sophomores. More than half of the members have been in their seat for at least 15 years, which is well above the median of nine years across all members of the 112th Congress.

Table 3. Top 20 speakers by grade level (all speeches since 1996)

Regression analysis

To estimate the effect of each factor while holding the other factors constant, we estimated two ordinary least squares (OLS) regression models. Model 1 uses the different factors to explain the variation in the grade level of individual members’ combined speeches since 1996. Model 2 uses the same factors to explain the grade level of member speeches just in the 112th Congress. (The correlation between all speeches since 1996 and just 112th Congress speeches for non-freshmen members is 0.74. For freshmen, these two measures will obviously be the same.)

For Democrats, moving from most moderate (DW1 score of 0) to most liberal (DW1 score of -1) is associated with a decrease of 1.59 grade levels for all speeches since 1996 combined, and a decrease of 1.35 grade levels for speeches from just the 112th Congress, all else being equal. This estimate is statistically significant.

For Republicans, moving from most moderate (0) to most conservative (1) is associated with a decrease of 2.07 grade levels in speech for all speeches since 1996 combined, and 2.06 grade levels for just the 112th Congress, all else being equal. Both are statistically significant. That the estimates for the relationship between ideology and grade level are consistent across the two models shows that this is both a current and a historic phenomenon.

Another takeaway from the regression analysis is that the more a member speaks overall, the more simply that member is likely to speak, all else being equal. For just the 112th Congress, going from the least to the most talkative member is associated with a decrease of almost a grade level and a half. For the historic corpus, going from the least talkative to the most talkative member is associated with a decrease of a full grade level.

Socioeconomic status of member district does not play much of a role, so there is no story to tell of members speaking to their constituents. If anything, the reverse is true. Having a higher percentage of high school graduates in the district or state is associated with members speaking at a slightly lower grade level (though since half of the districts have high school graduation levels between 82% and 90%, this doesn’t add up to all that much). District median income (which is closely correlated with education generally) has no relationship to speech grade level. There is also no statistically significant difference between chambers. Members of Congress from the Northeast speak at a slightly higher grade level than their colleagues from the rest of the country.

Of course, a fair amount of variation remains unexplained. There are many reasons why members speak at different levels, and these explanations only tell part of the story.


Table 4. OLS regression explaining member speech level (standard errors in parentheses, significant variables bolded)

Does it matter?

Earlier this year, the University of Minnesota’s Smart Politics noted that Obama’s 2012 State of the Union address clocked in at an eighth-grade level for the third year in a row, and that Obama’s average grade level of 8.4 was well below the average of 10.7 for the previous 67 addresses. Fox News ran the story alongside the image of a child in a dunce cap, and right-wing blogs mocked the President’s intelligence.

Others pointed out that maybe speaking clearly was a good thing. After all, the SOTU speech was pretty much right at the level of the average American’s reading level. And writing gurus like George Orwell (“If it is possible to cut a word out, always cut it out”) and Strunk & White (“omit needless words”) famously advise simplicity.

But whether you see it as plain speaking or as a dumbing down, the data are clear: The overall complexity of speech in the Congressional Record has dropped almost a full grade level since 2005. And those on the political extremes, especially those on the far right, tend to be associated with the simplest speech patterns.

Methodology for generating grade level scores

(by Dan Drinkard)

Grade levels were calculated using Flesch-Kincaid readability tests applied to various facets of text queries against the Capitol Words API. For example, Barbara Lee's entire corpus of words spoken can be retrieved by paging through the following url:

Flesch-Kincaid scores can be determined as: 0.39 * (Words/Sentences) + 11.8 * (Syllables/Words) - 15.59.
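As a rough illustration of the arithmetic, the formula above can be written as a small Python function. This is a minimal sketch rather than the code used for the analysis; the function name and the sample counts are made up for the example.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Return the Flesch-Kincaid grade level for the given raw counts.

    Implements 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must both be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts: 100 words in 5 sentences, with 150 syllables in total.
# 0.39 * 20 + 11.8 * 1.5 - 15.59 comes to roughly a 9.9 grade level.
example_grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(example_grade, 2))
```

Longer sentences raise the first term and longer words raise the second, which is why the score tracks word and sentence length and nothing else.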

To derive counts: The Python Natural Language Toolkit (NLTK)'s sentence tokenizer was used to count sentences, the Capitol Words ngram tokenizer was used to count words, and the Carnegie Mellon pronouncing dictionary was used to count syllables. For fallback syllable counting when a word wasn't present in the dictionary, three different methods were tried: discarding unknown words, treating unknown words as the average word of 1.66 syllables, and using a trained fallback syllable counter from NLTK_Contrib. We found the results of each method to be nearly indistinguishable from the others. An example F-K calculator (this one using the aforementioned 'padding with averages' method) can be found at
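
Separately from that calculator, the 'padding with averages' idea can be sketched in a few lines of Python. This is an illustration under loud assumptions, not the code used for the analysis: a regular expression stands in for the NLTK sentence tokenizer and the Capitol Words tokenizer, and the tiny `TOY_SYLLABLES` table is a hypothetical stand-in for the Carnegie Mellon pronouncing dictionary. Only the 1.66 average and the F-K formula come from the text above.

```python
import re

# Average syllables per word, used for words missing from the dictionary.
AVERAGE_SYLLABLES = 1.66

# Hypothetical stand-in for the Carnegie Mellon pronouncing dictionary.
TOY_SYLLABLES = {"the": 1, "senate": 2, "will": 1, "come": 1, "to": 1, "order": 2}


def count_sentences(text):
    """Crude sentence count: split on runs of '.', '!' or '?' (assumption)."""
    return len([part for part in re.split(r"[.!?]+", text) if part.strip()])


def tokenize_words(text):
    """Crude word tokenizer: lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def count_syllables(words, dictionary, fallback=AVERAGE_SYLLABLES):
    """Sum syllables, padding unknown words with the average value."""
    return sum(dictionary.get(word, fallback) for word in words)


def grade_level(text, dictionary):
    """Flesch-Kincaid grade level of text using the counts above."""
    words = tokenize_words(text)
    sentences = count_sentences(text)
    if not words or sentences == 0:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = count_syllables(words, dictionary)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59


sample = "The Senate will come to order. The gentleman yields."
print(grade_level(sample, TOY_SYLLABLES))
```

In the sample, "gentleman" and "yields" are not in the toy table, so each is counted as 1.66 syllables. Swapping the `fallback` value or dropping unknown words before counting would give the other fallback strategies described above.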