Grade Level

 

Colbert brings the sunlight down on our congressional speech study

Dear Mr. Stephen Colbert:

We here at Sunlight all quite enjoyed your mock outrage at the declining level of congressional discourse, as featured here in our original analysis.  So glad to have you in on the joke.

You are sooooo right, Stephen: “If we're paying these clowns, I say we should get something for our money. They should talk in a fancy way that shows more respect for the sacred institutions….”

Amen, Stephen. And while they’re at it, we just don’t see why Congress can’t get its act together and make every day “National Corvette Day.” Why does it only have to come on June 30? And while they’re at it, can’t they get on with making July 30 “National Dance Day” already? And making July 28 “National Day of the American Cowboy”? Talk about gridlock! A real dancing cowboy congress would get on its Corvette and make this biznatch happen.

And yes, Stephen, you are also soooo right about how sad the Founding Fathers must be, since they wrote the Constitution at a 17.8th grade level. As you say, “They used soaring poetic language about freedom so that no one would notice that they had slaves.”

How times have changed! Today, the only slavery we have is to our addictions to constant stimulus. Sorry, were you saying something? There was a new tweet I had to check. Something about Congress getting dumber.  Hehehe. Somebody said dumber.

Oh, and Mr. Colbert, one more thing. That fancy sentence of yours at the end of your report: "From now on I want to hear naught but the most refined oratory from the distinguished exemplars of democratic dicta ensconced within our sacrosanct legislative chambers, supercilious lupine pachyderm, for twixt the profligate libertinism of the latitudinarian swarms and the scrupulous helms of the forthright conserveatocracy resides concurrence upon one veritable axiom -- if you use big words, no one will know you aren't doing jack squat."

For those of you keeping score at home, that was 66 words, 125 syllables, 1 sentence. Grade level: 32.498.

Congratulations, Mr. Colbert. You are the smartest man who EVER lived.

 

 

 

 

 

Dear Rep. Quigley: right back at ya, re: Sunlight's congressional speech study

Dear Rep. Mike Quigley (D-IL):

With this post, I offer my humble appreciation that you should deign to take to the House floor to, as you so eloquently say, “decry an ignominy perpetuated on this Body by the captious Sunlight Foundation.”

The ignominy you refer to, the findings that you deem “fatuous,” sir, are those from our recent study, the one in which we found that Congressional speech had dropped a full grade level since 2005, prompting much discussion as to whether Congress is indeed, as we say in the popular parlance, “dumbing it down.”

I must admit, sir, your clever references do sparkle and shine:

So if the Sunlight Foundation must lampoon our verbal buffoonery, reducing us to linguistic lummoxes, remember Cecil Terwilliger's immortal retort to his brother Sideshow Bob's comment about spending four years in clown college: "I'll thank you not to refer to Princeton that way."

Consider me besotted, bemused, and bewitched by your rapier wit (not to mention your fulsome GRE-worthy lexicon).

But I will not go on long. Rather, to borrow from Polonius, “since brevity is the soul of wit, and tediousness the limbs and outward flourishes, I will be brief.”

Sir, we ran your speech through the Flesch-Kincaid calculator. Grade level: 9.858. 334 words, 560 syllables, 23 sentences.

Your speech is a mix of high and low discourse, but it does expose some of the brute force of the F-K test. The test makes no accounting for all your fancy words (zeitgeist, badinage, schadenfreude, priapistic, salubrious). They are all just words with syllables. As we warned in the original post: “It is important to understand the limitations of this metric: it tells us nothing about the clarity or correctness of a passage of text.”

Or rather, in the eternal words of our apparently shared favorite philosopher, Homer Simpson, “I am so smart, I am so smart, s-m-r-t... I mean s-m-A-r-t.”

 

 

Is Congress getting dumber, or just more plainspoken?

(This post was prepared in collaboration with Dan Drinkard)

Congress now speaks at almost a full grade level lower than it did just seven years ago, with the most conservative members of Congress speaking on average at the lowest grade level, according to a new Sunlight Foundation analysis of the Congressional Record using Capitol Words.

Of course, what some might interpret as a dumbing down of Congress, others will see as more effective communications. And lawmakers of both parties still speak over the heads of the average American, who reads at between an 8th and 9th grade level.

Today’s Congress collectively speaks at a 10.6 grade level, down from 11.5 in 2005. By comparison, the U.S. Constitution is written at a 17.8 grade level and the Declaration of Independence at a 15.1 grade level. The Gettysburg Address comes in at an 11.2 grade level and Martin Luther King’s “I Have a Dream” speech is at a 9.4 grade level. All these analyses use the Flesch-Kincaid test, which equates higher grade levels with longer words and longer sentences.

In looking at the data more closely, a few other important patterns jump out:

  1. Controlling for other factors, it is generally the most moderate members of both parties who speak at the highest grade levels, and the most extreme members who speak at the lowest grade levels. This pattern is most pronounced among freshman and sophomore members.
  2. Prior to 2005, Republicans on average spoke at a slightly higher grade level than Democrats. Since then, Democrats have spoken on average at a slightly higher grade level than Republicans.
  3. Some of the decline in grade level since 2005 is because junior members speak at a lower grade level than senior members, and some of it is because senior members have simplified their speech patterns over time.
  4. On average, the more words individual members speak on the floors of Congress, the simpler their speech tends to be.

For more detail on these and other findings, as well as a more complete methodology, you can find the full analysis here.

For a complete list of lawmakers and the grade level of their speech, click here.

To see which top SAT words come up most often in the 112th Congress, and who speaks them most often, click here.

So what does it all add up to?

Earlier this year, when the University of Minnesota’s Smart Politics noted that Obama’s 2012 State of the Union address clocked in at an eighth-grade level for the third year in a row, and that Obama’s average grade-level of 8.4 was well below the average of 10.7 for the previous 67 addresses, Fox News ran the story alongside the image of a child in a dunce cap, and right-wing blogs mocked the President’s intelligence.

Others pointed out that maybe speaking clearly was a good thing. After all, the SOTU speech was pretty much right at the level of the average American’s reading level. And writing gurus like George Orwell (“If it is possible to cut a word out, always cut it out”) and Strunk & White (“omit needless words”) famously advise simplicity.

But whether you see it as plainspeak or you see it as a dumbing down, the data are clear: The overall complexity of speech in the Congressional Record has dropped almost a full grade level since 2005. And those on the political extremes, especially those on the far right, tend to have the most simple speech patterns.

Updated May 22, 2012: The graphic above originally misidentified Mississippi as Arkansas.

 

Congress far from exemplary in SAT word proficiency

(This post was prepared in collaboration with Dan Drinkard)

If you do well on your SAT test, then you will ____ your chance of becoming a member of the U.S. Congress some day.

A. Vindicate

B. Scrutinize

C. Compromise

D. Discredit

E. Enhance

While the correct answer should probably be E (Enhance), the reality is that it might be closer to C (Compromise) or D (Discredit). At least, when it comes to the 112th Congress, top SAT words are few and far between.

We find that only 10 members of Congress have used at least 20 of the Kaplan 100 Most Common SAT Words so far in the 112th Congress, and that only 92 members of Congress have used at least 10 of these words. More than half of the members of Congress have used five or fewer. And 32 members did not use a single Kaplan 100 word, while 52 members only said one. In total, 0.046% of all words spoken in the Congressional Record were Kaplan 100 words.
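For readers who want to try this kind of tally themselves, here is a minimal sketch of how one member's SAT-word usage might be counted. It is not the code behind the numbers above. The function name `sat_word_usage`, the four-word subset of the list, and the exact-match rule (so "compromises" would not count as "compromise") are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical four-word subset of the Kaplan 100 (all four are named in this post).
SAT_WORDS = {"compromise", "prosperity", "integrity", "exemplary"}


def sat_word_usage(text, sat_words=SAT_WORDS):
    """Return (distinct SAT words used, total SAT-word uses, percent of all words).

    Assumption: exact, case-insensitive matching with no stemming.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = Counter(t for t in tokens if t in sat_words)
    total_hits = sum(hits.values())
    share = 100.0 * total_hits / len(tokens) if tokens else 0.0
    return len(hits), total_hits, share


# Made-up sentence, not taken from the Congressional Record.
speech = "We need compromise. Compromise brings prosperity to all."
distinct, total, share = sat_word_usage(speech)
print(distinct, total, share)
```

The first value corresponds to the "unique words" counts, the second to the total number of utterances, and the third to a percentage of all words spoken.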

For an analysis of how Congressional speech has dropped by almost a full grade level since 2005, click here.

Among the Kaplan 100, the word spoken most frequently in Congress is “compromise.” It had been uttered 1,820 times this Congress as of the end of April, far more as an aspiration than a description. Majority Leader Sen. Harry Reid (D-NV) has uttered the word 142 times, more than anyone else. Unfortunately, speaking it does not make it so.

Likewise, the other top words – prosperity (923 times), integrity (883 times), and exemplary (582) – also seem far more hopeful than reality-based. Table 1 (below) shows the Kaplan 100 words spoken most frequently in the 112th Congress.

Table 1. Top 20 most-spoken Kaplan 100 words, 112th Congress

Of the Kaplan 100, 14 words are missing entirely from the Congressional Record for the 112th Congress so far. They are: abbreviate; conformist; enervating; evanescent; florid; hackneyed; haughty; hedonist; ostentatious; perfidious; pretentious; querulous; sagacity; submissive.

For the full list of the top 100 words and how much they’ve been spoken and by whom, click here.

Who’s used the most unique SAT words in the 112th Congress? That distinction belongs to Senator Patrick Leahy (D-VT), who, as of April 2012, had used 27 of the Kaplan 100, putting him just ahead of fellow Senator Dick Durbin (D-IL), who has verbalized 26 of the 100 words so far, and Sen. Orrin Hatch (R-UT), who has uttered 25. Leahy has also used Kaplan 100 words a total of 127 times, also just edging out Durbin, who used the words 122 times.

Rounding out the top ten list for most unique Kaplan 100 words spoken are Sen. Mitch McConnell (R-KY), Sen. Benjamin Cardin (D-MD), Rep. Dennis Kucinich (D-OH), Rep. Steve King (R-IA), Sen. Dianne Feinstein (D-CA), Sen. John McCain (R-AZ), and Sen. Olympia Snowe (R-ME). All have got there by speaking at least 100,000 words so far in the 112th Congress. Of the top ten list, Snowe has both the highest grade level for her speech (14th grade), and the highest number of Kaplan 100 words per 100,000 words spoken: 76.5.

 

Table 2. Members who speak the most unique Kaplan 100 words

For a full list of how all members compare, click here.

The changing complexity of congressional speech

(This post was prepared in collaboration with Dan Drinkard)

Congress now speaks at almost a full grade level lower than it did just seven years ago, with the most conservative members of Congress speaking on average at the lowest grade level, according to a new Sunlight Foundation analysis of the Congressional Record using Capitol Words.

Of course, what some might interpret as a dumbing down of Congress, others will see as more effective communications. And lawmakers of both parties still speak above the heads of the average American, who reads at between an 8th and 9th grade level.

Today’s Congress speaks at about a 10.6 grade level, down from 11.5 in 2005. By comparison, the U.S. Constitution is written at a 17.8 grade level, the Federalist Papers at a 17.1 grade level, and the Declaration of Independence at a 15.1 grade level. The Gettysburg Address comes in at an 11.2 grade level and Martin Luther King’s “I Have a Dream” speech is at a 9.4 grade level. Most major newspapers are written at between an 11th and 14th grade level. (You can find more comparisons here)

All these analyses use the Flesch-Kincaid test, which produces the 'reads at an n-th grade level' terminology that is likely familiar to many readers. At its core, Flesch-Kincaid equates higher grade levels with longer words and longer sentences. It is important to understand the limitations of this metric: it tells us nothing about the clarity or correctness of a passage of text. Although admittedly a crude tool, Flesch-Kincaid can nonetheless provide insights into how different legislators speak, and how Congressional speech has been changing.

To see how different legislators rank, click here for a full database of all current members of Congress.

To see how many top SAT words lawmakers speak, click here.

Historical trends

Overall, the complexity of speech in the Congressional Record has declined steadily since 2005, with the drop among Republicans slightly outpacing that for Democrats (see Figure 1). Through April 25, 2012, this year's Congressional Record clocks in at a 10.6 grade level, down from 11.5 in 2005.

Between 1996 and 2005, Republicans overall consistently spoke two-tenths of a grade level higher than Democrats, except in 2001, when a rare moment of national unity also seems to have extended to speaking at the same grade level. But after 2005, something happened, and Congressional speech has been on the decline since. For Republicans as a whole, the decline was from an 11.6 grade level to a 10.3 grade level in 2011 (up slightly to 10.4 in 2012 so far). For Democrats, it was a decline from 11.4 to 10.6 in 2011 (also up slightly to 10.8 in 2012 so far).

Figure 1. Congressional speech grade level by year

 

 

 

Ideology and speech complexity

To analyze the relationship between ideology and speech level, we took the first dimension DW-Nominate scores (DW1) for the current Congress, as of April 25, 2012. For the non-political scientists in the audience, DW1 scores take roll call voting data to place members of Congress on a liberal-conservative scale. On this scale, -1 is most liberal and 1 is most conservative. A negative value on the scale implies that the member votes most often with Democrats; a positive value implies that the member votes most often with Republicans.

Turning to Figure 2, we can immediately notice that the grade level of Congressional Record speeches declines among Republicans as the voting record becomes more conservative. Among Republicans, the drop from the most moderate to most conservative is, on average, almost three whole grade levels, from 13th to 10th grade.

Among Democrats, the scatterplot does not reveal any relationship between grade level and ideology. However, when we hold all other factors constant in the regression analysis (see further below), we find that being on the far left is associated with lower speech grade levels. There is also a clearer correlation between further left voting score and lower grade level among more junior members.

Figure 2. The relationship of ideology to speech grade level

 

 

 

Changing members and members’ changes

It’s hard to pinpoint the exact cause of the decline. Perhaps it reflects lawmakers speaking more in talking points, and increasingly packaging their floor speeches for YouTube. Gone, perhaps, are the golden days when legislators spoke to persuade each other, thoughtfully wrestled with complex policy trade-offs, and regularly quoted Shakespeare.

The data indicate that part of the decline has to do with new junior members speaking at a lower grade level than more senior members, and some of it has to do with individual senior members simplifying their speech over time.

Figure 3 (below) breaks Congress into four seniority cohorts and details the relationship between ideology and grade level for speeches in the 112th Congress.

Here, a telling pattern emerges. Among the newest members (those with 1-3 years in their seat), there is a drop-off in speech level as we move from the center out to either extreme of the political spectrum, though the pattern is more pronounced on the far right. For the next cohort (4-10 years of experience), the same pattern continues on both the political right and left, though the relationship is much stronger among Republicans.

For the next cohort (11-20 years in their seat), the pattern on the right (more conservative, simpler speech) remains, but the pattern on the left reverses (there is a slight correlation between more liberalism and higher speech grade level). In the most senior cohort (more than 20 years in their seat), Republicans speak, on average, at a higher level than Democrats, with only the slightest relationship between conservatism and more simple speech.

Figure 3. Ideology and Seniority

 

 

At the individual level, prior to the 109th Congress (2005-2006), both individual Democrats and Republicans on average grew more sophisticated in their speech with each passing session of Congress. Individual Democrats gained on average 0.06 grade levels per session, and Republicans gained on average 0.12 grade levels per session. Then, starting with the 109th Congress, the trends reversed. Individual Democrats began dropping 0.07 grade levels of speech per session and individual Republicans began dropping 0.12 grade levels per session.

 

Table 1. Average estimated effect of each passing Congress on individual member grade level

(results from regression analysis estimating annual member change with member fixed effects)

 

The top and bottom lawmakers by grade level

Table 2 (below) shows the 20 members of Congress with the lowest grade level score for their Congressional Record corpus dating back to 1996. Of them, 85% (17 of 20) are Republicans; 65% (13 of 20) are freshmen, and another 15% (3 of 20) are sophomores. Additionally, 90% (18 of 20) are House members. The two Senators to make the bottom 20 are Rand Paul (R-KY) and Ron Johnson (R-WI), both Tea Party-supported freshmen.

Table 2. Bottom 20 speakers by grade level (all speeches since 1996)

Republicans also outnumber Democrats among the members who speak at the highest grade levels. Among the top 20, 12 are Republicans, 7 are Democrats, and one (Joe Lieberman) is an Independent. And eight of the top ten are Republicans. There are also 14 House members and six Senators. And perhaps most notably, there are only two freshmen and three sophomores. More than half of the members have been in their seat for at least 15 years, which is well above the median of nine years across all members of the 112th Congress.

Table 3. Top 20 speakers by grade level (all speeches since 1996)

Regression analysis

To estimate the effects of all the different factors (holding all the other factors constant), we estimated two ordinary least squares regression models. Model 1 uses the different factors to explain the variation in the grade level of individual members’ combined speeches since 1996. Model 2 uses the same factors to explain the grade level of member speeches just in the 112th Congress. (The correlation between all speeches since 1996 and just 112th Congress speeches for non-freshman members is 0.74. For freshmen, these two measures will obviously be the same.)

For Democrats, moving from most moderate (DW1 score of 0) to most liberal (DW1 score of -1) is associated with a decrease of 1.59 grade levels for all speeches since 1996 combined, and a decrease of 1.35 grade levels for speeches from just the 112th Congress, all else being equal. This estimate is statistically significant.

For Republicans, moving from most moderate (0) to most conservative (1) is associated with a decrease of 2.07 grade levels in speech for all speeches since 1996 combined, and 2.06 grade levels for just the 112th Congress, all else being equal. Both are statistically significant. That the estimates for the relationship between ideology and grade level are consistent across the two models shows that this is both a current and a historic phenomenon.
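To make the reading of these coefficients concrete, here is a minimal sketch of a least-squares fit with a single explanatory variable. It is a deliberate simplification: the models in this post hold many other factors constant, while this toy version has only one predictor. The function name `ols_fit` and the data points are invented for illustration and are not the study's data.

```python
def ols_fit(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented toy points, NOT the study's data:
# ideology score (0 = most moderate, 1 = most conservative) vs. grade level.
toy_dw1 = [0.0, 0.25, 0.5, 0.75, 1.0]
toy_grade = [13.0, 12.5, 12.0, 11.5, 11.0]

intercept, slope = ols_fit(toy_dw1, toy_grade)
print(intercept, slope)
```

In a fit like this, the slope is the estimated change in grade level associated with moving one full unit along the ideology scale, which is how the "moving from 0 to 1" statements above should be read.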

Another takeaway point from the regression analysis is that the more a member speaks overall, the more simply that member is likely to speak, all else being equal. For just the 112th Congress, going from least to most talkative is associated with a decrease of almost a grade level and a half. For the historic corpus, going from the least talkative to most talkative member is associated with a decrease of a full grade level.

Socioeconomic status of member district does not play much of a role, so there is no story to tell of members speaking to their constituents. If anything, the reverse is true. Having a higher percentage of high school graduates in the district or state is associated with members speaking at a slightly lower grade level (though since half of the districts have high school graduation levels between 82% and 90%, this doesn’t add up to all that much). District median income (which is closely correlated with education generally) has no relationship to speech grade level. There is also no statistically significant difference between chambers. Members of Congress from the Northeast speak at a slightly higher grade level than their colleagues from the rest of the country.

Of course, a fair amount of variation remains unexplained. There are many reasons why members speak at different levels, and these explanations only tell part of the story.

 

Table 4. OLS Regression explaining member speech level (standard errors in parentheses, significant variables bolded)

Does it matter?

Earlier this year, the University of Minnesota’s Smart Politics noted that Obama’s 2012 State of the Union address clocked in at an eighth-grade level for the third year in a row, and that Obama’s average grade level of 8.4 was well below the average of 10.7 for the previous 67 addresses. Fox News ran the story alongside the image of a child in a dunce cap, and right-wing blogs mocked the President’s intelligence.

Others pointed out that maybe speaking clearly was a good thing. After all, the SOTU speech was pretty much right at the level of the average American’s reading level. And writing gurus like George Orwell (“If it is possible to cut a word out, always cut it out”) and Strunk & White (“omit needless words”) famously advise simplicity.

But whether you see it as plain speak or you see it as a dumbing down, the data are clear: The overall complexity of speech in the Congressional Record has dropped almost a full grade level since 2005. And those on the political extremes, especially those on the far right, tend to be associated with the most simple speech patterns.

Methodology for generating grade level scores

(by Dan Drinkard)

Grade levels were calculated using Flesch-Kincaid readability tests applied to various facets of text queries against the Capitol Words API. For example, Barbara Lee's entire corpus of words spoken can be retrieved by paging through the following URL: http://capitolwords.org/api/text.json?bioguide_id=L000551&apikey=####.

Flesch-Kincaid scores can be determined as: 0.39 * (Words/Sentences) + 11.8 * (Syllables/Words) - 15.59.
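As a quick check, the formula can be written as a small function and run on the word, syllable, and sentence counts quoted in the Colbert and Quigley posts above. This is a minimal sketch, not the calculator used for the study. The function name `flesch_kincaid_grade` is an assumption made for illustration.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level computed from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Counts quoted earlier on this page.
colbert = flesch_kincaid_grade(words=66, sentences=1, syllables=125)
quigley = flesch_kincaid_grade(words=334, sentences=23, syllables=560)

print(f"{colbert:.3f}")  # 32.498
print(f"{quigley:.3f}")  # 9.858
```

Both results match the grade levels reported in those posts. The Colbert example also shows how strongly a single very long sentence drives the score: with one sentence, the words-per-sentence term alone contributes 0.39 * 66 = 25.74 grade levels.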

To derive counts: The Python Natural Language Toolkit (NLTK)'s sentence tokenizer was used to count sentences, the Capitol Words ngram tokenizer was used to count words, and the Carnegie Mellon pronouncing dictionary was used to count syllables. For fallback syllable counting when a word wasn't present in the dictionary, three different sets of calculations employing different methods were tried: discarding unknown words, treating unknown words as the average word of 1.66 syllables, and using a trained fallback syllable counter from NLTK_Contrib. We found the results of each method to be nearly indistinguishable from the others. An example F-K calculator (this one using the aforementioned 'padding with averages' method) can be found at https://gist.github.com/2483508.
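The 'padding with averages' idea can be sketched in a few lines. This is not the gist linked above. To stay self-contained it swaps in simple regular expressions for the NLTK sentence tokenizer and the Capitol Words tokenizer, and a three-word hypothetical dictionary for the full Carnegie Mellon one. The names `TOY_PRONUNCIATIONS`, `syllables_in`, and `count_text` are assumptions made for illustration.

```python
import re

# Tiny hypothetical excerpt in the style of the CMU Pronouncing Dictionary.
# In that format, vowel phonemes end in a stress digit (0, 1, or 2),
# so counting digit-suffixed phonemes gives the syllable count.
TOY_PRONUNCIATIONS = {
    "congress": ["K", "AA1", "NG", "G", "R", "AH0", "S"],
    "speaks": ["S", "P", "IY1", "K", "S"],
    "plainly": ["P", "L", "EY1", "N", "L", "IY0"],
}

AVERAGE_SYLLABLES = 1.66  # fallback value quoted in the methodology


def syllables_in(word: str) -> float:
    """Syllables for one word, padding unknown words with the average."""
    phones = TOY_PRONUNCIATIONS.get(word.lower())
    if phones is None:
        return AVERAGE_SYLLABLES
    return sum(1 for p in phones if p[-1].isdigit())


def count_text(text: str):
    """Return (words, sentences, syllables) using naive regex tokenizing.

    Assumption: splitting on ., ! and ? stands in for a real sentence tokenizer.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(syllables_in(w) for w in words)
    return len(words), len(sentences), syllables


# "zeitgeist" is missing from the toy dictionary, so it counts as 1.66 syllables.
print(count_text("Congress speaks plainly. Congress speaks zeitgeist."))
```

The three counts are exactly the inputs the Flesch-Kincaid formula needs. A naive sentence splitter like this one will miscount around abbreviations such as "Rep." or "D-IL", which is one reason to prefer a trained tokenizer on real Congressional Record text.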