Wednesday, September 17, 2014

Python scripts to shorten column names, or to fetch Google Ngrams data

I've made a couple of new GitHub repos:

google_ngram_py, which lets you look up one- to five-word phrases in the Google Ngram Viewer (which shows each phrase's frequency by year) from Python and returns the data as pandas DataFrames. For case-insensitive searches, the results are separated into parent and child (e.g. the parent is 'the (All)' and the children are 'the', 'The', 'THE').

shorten_column_names, which finds the most common words in a list of phrases and abbreviates them. I used it to shorten the sometimes 100+-character column names from World Bank data (e.g. population -> pop), but you could use it on any list of strings.
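
To give a rough idea of the shorten_column_names approach, here is a minimal sketch. It is not the code from the repo: the function names (`most_common_words`, `shorten`), their parameters, and the sample column names are all made up for illustration, and it only assumes the two steps described above (count the most frequent words, then swap them for abbreviations you choose).

```python
import re
from collections import Counter


def most_common_words(phrases, top_n=10, min_length=5):
    """Return the top_n most frequent words of at least min_length letters.

    Hypothetical helper for illustration; not the repo's actual API.
    """
    counts = Counter()
    for phrase in phrases:
        for word in re.findall(r"[A-Za-z]+", phrase.lower()):
            if len(word) >= min_length:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]


def shorten(phrases, abbreviations):
    """Replace whole words with their abbreviations, ignoring case.

    `abbreviations` maps lowercase words to their short forms.
    """
    if not abbreviations:
        return list(phrases)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in abbreviations) + r")\b",
        re.IGNORECASE,
    )
    return [
        pattern.sub(lambda m: abbreviations[m.group(0).lower()], phrase)
        for phrase in phrases
    ]


# Made-up column names in the style of World Bank indicators.
columns = [
    "Population, total",
    "Population growth (annual %)",
    "Urban population (% of total)",
]

candidates = most_common_words(columns, top_n=2)   # words worth abbreviating
shortened = shorten(columns, {"population": "pop"})
print(candidates)
print(shortened)
```

The counting step only suggests candidates; the abbreviations themselves still come from a mapping you write by hand, which keeps the shortened names readable.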