Here's the viz, click to enlarge:
The code I used to download and process it is nicely formatted with NBViewer as an IPython notebook on my GitHub. The data comes not from Billboard itself, but from www.bullfrogspond.com; I don't know much about the data source, but it certainly looks thorough and painstaking, and up to date.
Here are some observations about the methodology, written with both the non-data-expert and the cognoscenti in mind.
What's "keyness" all about? Keyness is a common approach when comparing word frequencies between two sets; it's particularly useful when you're comparing two sets of unequal size (in this case, I was comparing, for example, all of the words from the 2010s to every word in the entire dataset). I used the log-likelihood method, which returns a measure of the statistical significance of finding that word in that decade. A keyness of about 11 means there's only a 0.1% chance that you would get the same result or higher by picking words from the entire collection at random instead of restricting yourself to the words in that subset (decade), and about 15 is a 0.01% chance. There's a good, intermediate-technical description of log-likelihood here.
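For the technically inclined, here is a minimal sketch of the log-likelihood calculation. It is not the code from the notebook: the function names and the example counts are made up for illustration, and it assumes the reference counts cover the whole collection, as described above. The p-value relies on the statistic being compared against a chi-squared distribution with one degree of freedom.

```python
import math


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness of one word.

    a: occurrences of the word in the subset (e.g. one decade)
    b: occurrences of the word in the reference collection
    c: total words in the subset
    d: total words in the reference collection
    """
    # Expected counts if the word were spread evenly across both sets.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def p_value(ll):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(ll / 2))


# Hypothetical counts: a word seen 30 times in a 2,000-word decade
# and 150 times in a 30,000-word collection.
keyness = log_likelihood(30, 150, 2000, 30000)
print(keyness, p_value(keyness))

# The usual thresholds: about 10.83 for 0.1% and about 15.13 for 0.01%.
print(round(p_value(10.83), 4))  # 0.001
```

A word that is exactly as frequent in the subset as in the reference gets a keyness of zero, which is a handy sanity check.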
The contour charts: I made these with Excel sparklines since it was so easy. The absolute scale of the y-axis is different for each one; otherwise the less popular words would be invisible. If you look at some of the contours, you'll see that there are actually higher bars in other decades than the highlighted one; that's because "keyness" considers other factors than just the heights of the bars (such as the relative number of words in the decade).
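The per-word scaling can be sketched in a few lines of Python. This is only an illustration of the idea, not how Excel draws sparklines; the `sparkline` helper and the sample frequencies are made up.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Draw one text sparkline, scaled to the series' own maximum."""
    top = max(values)
    if top == 0:
        return BARS[0] * len(values)
    # Each series is divided by its own peak, so a rare word and a
    # common word with the same shape produce the same picture.
    return "".join(BARS[round(v / top * (len(BARS) - 1))] for v in values)


# Hypothetical per-decade frequencies (percent of titles) for two words.
common_word = [10, 30, 40]
rare_word = [0.1, 0.3, 0.4]
print(sparkline(common_word))
print(sparkline(rare_word))
```

Both lines come out identical even though one word is a hundred times more frequent, which is why the heights can't be compared from one chart to the next.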
Why is "f**k" censored? It was like that in the dataset; I'm no prude. Fuck fuck fuck fuck fuck.
The binning effect: It's natural to bin decades together, but it's completely arbitrary in a statistical sense. The popular songs in 1960 far, far more resembled those of the 1950s than they did what one thinks of as "sixties music". What this means for this dataset is that words that tend to respect this artificial boundary will be overrepresented. For example, if a word was particularly popular between 1972 and 1979, it will have a higher keyness than one that was popular across a decade break from 1976 to 1983. That's just a choice that has to be made to make for an easy-to-grasp analysis; if I were analyzing things more important than song titles I'd be more rigorous in this regard.
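To make the 1972-1979 versus 1976-1983 example concrete, here is a toy simulation. Everything in it is invented for illustration: 1,000 title words per year from 1960 to 1999, a word that appears 10 times a year while it is popular and never otherwise, and the whole collection used as the reference set.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness: a/c are subset counts, b/d reference counts."""
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


YEARS = range(1960, 2000)   # made-up span of the toy dataset
WORDS_PER_YEAR = 1000       # made-up, constant for simplicity


def peak_keyness(popular_years, hits_per_year=10):
    """Highest per-decade keyness for a word popular only in the given years."""
    by_decade = Counter()
    for year in popular_years:
        by_decade[year // 10 * 10] += hits_per_year
    total_hits = sum(by_decade.values())
    total_words = WORDS_PER_YEAR * len(YEARS)
    words_per_decade = WORDS_PER_YEAR * 10
    return max(
        log_likelihood(hits, total_hits, words_per_decade, total_words)
        for hits in by_decade.values()
    )


inside = peak_keyness(range(1972, 1980))      # stays within the 1970s
straddling = peak_keyness(range(1976, 1984))  # split across 1970s and 1980s
print(inside > straddling)  # True
```

Both words are equally popular for eight years; the only difference is where the decade boundary happens to fall.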
Why the top five from each decade instead of the top keyness overall? Basically because it was more interesting this way. The database goes back to 1890, and there are fewer songs overall back then with more uncommon words, which means they have higher keyness and any such list would be full of songs nobody's heard of. The top word in the 2010s, "we", is number 63 overall, and over half of the words above it are from the first three decades.
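The difference between the two rankings is easy to see in code. This is a sketch with hypothetical helper names, using a handful of (decade, word, keyness) rows taken from the table at the end of this post rather than the full dataset.

```python
import heapq
from collections import defaultdict

# A few rows from the table in this post: (decade, word, keyness).
rows = [
    ("2010s", "We", 22),
    ("2010s", "Yeah", 18),
    ("1930s", "Moon", 79),
    ("1930s", "In", 38),
    ("1920s", "Blues", 153),
    ("1920s", "Pal", 42),
]


def top_per_decade(rows, n):
    """The n highest-keyness words within each decade."""
    groups = defaultdict(list)
    for decade, word, keyness in rows:
        groups[decade].append((keyness, word))
    return {
        decade: [word for _, word in heapq.nlargest(n, items)]
        for decade, items in groups.items()
    }


def top_overall(rows, n):
    """The n highest-keyness (decade, word) pairs regardless of decade."""
    ranked = sorted(rows, key=lambda row: row[2], reverse=True)
    return [(decade, word) for decade, word, _ in ranked[:n]]


print(top_overall(rows, 3))     # early decades crowd out the 2010s
print(top_per_decade(rows, 1))  # every decade gets a word
```

Even in this tiny sample, the overall ranking never reaches the 2010s, while the per-decade ranking always shows something for each one.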
I considered leaving out entire decades, but the number of songs in them, while small, was not negligible, as you can see from the following chart (that I made in about 14 seconds with Chartbuilder):

In the end, I went with the more interesting approach. Data visualization is narrative by nature; don't let anyone tell you otherwise.
Finally, here's a table with the most popular song for each decade containing the top-five word in question. "Most popular" is decided by a metric particular to the dataset source, but it seems thorough and defensible.
| Decade | Word | Keyness | Max. | Most popular song |
|---|---|---|---|---|
| 2010s | We | 22 | 1.4% | Rihanna, "We Found Love" (2011) |
| | Yeah | 18 | 0.2% | Austin Mahone, "Mmm Yeah" (2014) |
| | Hell | 18 | 0.3% | Avril Lavigne, "What The Hell" (2011) |
| | F**k | 15 | 0.1% | Cee Lo Green, "F**K You (Forget You)" (2011) |
| | Die | 14 | 0.2% | Ke$ha, "Die Young" (2012) |
| 2000s | U | 71 | 1.1% | Usher, "U Got It Bad" (2001) |
| | Like | 28 | 1.1% | T.I., "Whatever You Like" (2008) |
| | Breathe | 25 | 0.2% | Faith Hill, "Breathe" (2000) |
| | It | 24 | 2.4% | Usher, "U Got It Bad" (2001) |
| | Ya | 19 | 0.7% | OutKast, "Hey Ya!" (2003) |
| 1990s | U | 49 | 1.1% | Sinead O'Connor, "Nothing Compares 2 U" (1990) |
| | You | 28 | 5.1% | Stevie B, "Because I Love You (The Postman Song)" (1990) |
| | Up | 21 | 1.0% | Brandy, "Sittin' Up In My Room" (1996) |
| | Get | 20 | 1.0% | En Vogue, "My Lovin' (You're Never Gonna Get It)" (1992) |
| | Thang | 18 | 0.2% | Dr. Dre, "Nuthin' But A "G" Thang" (1993) |
| 1980s | Love | 48 | 3.8% | Joan Jett & The Blackhearts, "I Love Rock 'N Roll" (1982) |
| | Fire | 24 | 0.5% | Billy Joel, "We Didn't Start The Fire" (1989) |
| | Don't | 20 | 1.6% | Human League, The, "Don't You Want Me" (1982) |
| | Rock | 14 | 0.7% | Joan Jett & The Blackhearts, "I Love Rock 'N Roll" (1982) |
| | On | 14 | 3.2% | Bon Jovi, "Livin' On A Prayer" (1987) |
| 1970s | Woman | 33 | 0.6% | The Guess Who, "American Woman" (1970) |
| | Disco | 31 | 0.4% | Johnnie Taylor, "Disco Lady" (1976) |
| | Rock | 24 | 0.7% | Elton John, "Crocodile Rock" (1973) |
| | Music | 24 | 0.6% | Wild Cherry, "Play That Funky Music" (1976) |
| | Dancin' | 20 | 0.5% | Leif Garrett, "I Was Made For Dancin'" (1979) |
| 1960s | Baby | 51 | 1.9% | Supremes, The, "Baby Love" (1964) |
| | Twist | 24 | 0.7% | Joey Dee & the Starliters, "Peppermint Twist - Part 1" (1962) |
| | Little | 16 | 4.0% | Steve Lawrence, "Go Away Little Girl" (1963) |
| | Twistin' | 15 | 0.4% | Chubby Checker, "Slow Twistin'" (1962) |
| | Lonely | 14 | 0.5% | Bobby Vinton, "Mr. Lonely" (1964) |
| 1950s | Christmas | 31 | 0.8% | Art Mooney & Orch., "(I'm Getting) Nuttin' For Christmas" (1955) |
| | Penny | 18 | 0.4% | Dinah Shore & Tony Martin, "A Penny A Kiss" (1951) |
| | Mambo | 15 | 0.5% | Perry Como, "Papa Loves Mambo" (1954) |
| | Rednosed | 15 | 0.3% | Gene Autry, "Rudolph, the Red-Nosed Reindeer" (1950) |
| | Three | 15 | 0.5% | Browns, The, "The Three Bells" (1959) |
| 1940s | Polka | 50 | 0.4% | Kay Kyser & Orch., "Strip Polka" (1942) |
| | Serenade | 35 | 0.7% | Andrews Sisters, "Ferry Boat Serenade" (1940) |
| | Boogie | 28 | 0.6% | Will Bradley & Orch., "Scrub Me, Mama, With a Boogie Beat" (1941) |
| | Blue | 26 | 1.6% | Tommy Dorsey & Frank Sinatra, "In The Blue Of Evening" (1943) |
| | Christmas | 22 | 0.8% | Bing Crosby, "White Christmas" (1942) |
| 1930s | Moon | 79 | 1.4% | Glenn Miller & Orch., "Moon Love" (1939) |
| | In | 38 | 6.5% | Ted Lewis & His Band, "In A Shanty In Old Shanty Town" (1932) |
| | Swing | 34 | 0.5% | Ray Noble & Orch., "Let's Swing It" (1935) |
| | Sing | 34 | 1.4% | Benny Goodman & Martha Tilton, "And the Angels Sing" (1939) |
| | A | 30 | 5.8% | Ted Lewis & His Band, "In A Shanty In Old Shanty Town" (1932) |
| 1920s | Blues | 153 | 3.1% | Paul Whiteman & Orch., "Wang Wang Blues" (1921) |
| | Pal | 42 | 0.9% | Al Jolson, "Little Pal" (1929) |
| | Sweetheart | 27 | 0.9% | Isham Jones & Orch., "Nobody's Sweetheart" (1924) |
| | Rose | 25 | 1.4% | Ted Lewis & His Band, "Second Hand Rose" (1921) |
| | Mammy | 23 | 1.0% | Paul Whiteman & Orch., "My Mammy" (1921) |
| 1910s | Gems | 70 | 1.1% | Victor Light Opera Co., "Gems from 'Naughty Marietta'" (1912) |
| | Rag | 52 | 1.2% | Original Dixieland Jazz Band, "Tiger Rag" (1918) |
| | Home | 43 | 2.9% | Henry Burr, "When You're a Long, Long Way from Home" (1914) |
| | Land | 41 | 0.6% | Al Jolson, "Hello Central, Give Me No Man's Land" (1918) |
| | Old | 38 | 3.7% | Harry Macdonough, "Down by the Old Mill Stream" (1912) |
| 1900s | Uncle | 58 | 4.5% | Cal Stewart, "Uncle Josh's Huskin' Bee Dance" (1901) |
| | Old | 58 | 3.7% | Haydn Quartet, "In the Good Old Summer Time" (1903) |
| | Josh | 44 | 3.7% | Cal Stewart, "Uncle Josh On an Automobile" (1903) |
| | Reuben | 38 | 1.4% | S. H. Dudley, "When Reuben Comes to Town" (1901) |
| | When | 33 | 3.8% | George J. Gaskin, "When You Were Sweet Sixteen" (1900) |
| 1890s | Uncle | 59 | 4.5% | Cal Stewart, "Uncle Josh's Arrival in New York" (1898) |
| | Casey | 54 | 3.3% | Russell Hunting, "Michael Casey Taking the Census" (1892) |
| | Josh | 53 | 3.7% | Cal Stewart, "Uncle Josh at the Opera" (1898) |
| | Old | 26 | 3.7% | Dan Quinn, "A Hot Time in the Old Town" (1896) |
| | Michael | 24 | 2.7% | Russell Hunting, "Michael Casey Taking the Census" (1892) |
I was fascinated by Uncle Josh's popularity - it turns out he was a vaudeville character created by Cal Stewart. It's not quite accurate to call all his Billboard hits "songs," though - all except "Uncle Josh's Huskin' Bee Dance" are monologues.
Uncle Josh On an Automobile: https://www.youtube.com/watch?v=71BrerwyxBo
David, what was the reference corpus you used for keyness? Also, did you use a minimum frequency threshold?
The site Bullfrog Pond seems to be offline. Do you happen to know where I can get a copy of the source data?
I still had the file I downloaded from there in December 2014; I've put it on my server for download at http://dtdata.io/prm/charts.rar