Google’s a pretty neat tool, and it’s fascinating to see how it’s evolved over the past ten years. When I first came to Thailand in 2001, I did all my searches using Lycos, HotBot, or Excite, but now I (and indeed most of the world) simply use Google. It’s pretty amazing how far its shadow stretches, touching pretty much every corner of the internet, no matter how insignificant. For those of us who aren’t insanely smart computer geeks, it’s a bit hard to grasp exactly how all this seemingly minor information can congeal into a useful whole, but a new Google tool called Ngram consolidates an incredible amount of information (200 years’ worth of printed material, to be precise) and lets you search it. Just for fun, I ran some searches on popular Thai terms.
A recent Wall Street Journal article that describes just how Google Ngram Viewer can be used says:
Scientists at Harvard University, Massachusetts Institute of Technology, Google and the Encyclopedia Britannica unveiled a database of two billion words and phrases drawn from 5.2 million books in Google’s digital library published during the past 200 years. With this tool, researchers can measure trends through the language authors used and the names of people they mentioned.
Analyzing the computerized text, the researchers reported that they could measure the hardening rhetoric of nations facing off for war, by tracking increasing use of the word “enemy.” They also could track changing tastes in food, noting the waning appetite for sausage, which peaks in the 1940s, and the advent of sushi, the mentions of which start to soar in the 1980s. They documented the decline of the word “God” in the modern era, which falls sharply from its peak in the 1840s.
The Los Angeles Times also has a pretty good article. There’s a lot of information to be had if you know how to tweak the tool correctly, so I ran a few searches to see what I could find out. Here’s what I saw:
Now, as far as actually knowing what any of this means, well… I’m not that smart, I’m afraid. It probably has some deep meaning that can be pulled from the raw data, but truth be told, I just like the pretty colors. At any rate, it’s a pretty fascinating toy. Are there any other Thai-specific terms you can think of comparing?
Interesting. Also note the English-language context with regard to US/UK-centric (?) sources. What is and what isn't represented in Google's digital library?
Another interesting graph from Google is here:
http://www.google.com/transparencyreport/traffic/?r=TH&l=WEBSEARCH&csd=1231406281773&ced=1292659200000
It shows that internet usage really drops off in Thailand on Songkran and New Year, which is a different profile to most countries. Maybe it's because Thais actually spend time with their families on the holidays, or maybe everyone is just too drunk to google 🙂
Another interesting graph is the usage of Google Translate from Thailand:
http://www.google.com/transparencyreport/traffic/?r=TH&l=TRANSLATE&csd=1230796800000&ced=1292659200000
It really peaks around the Red Shirt protests, I presume because people were suddenly more curious/worried about the image of the country …
Serene, you're in a better position than I to analyze this sort of thing, seeing as you're way smarter than me. 😛
Justin, thanks, that's an interesting graph. If nothing else, Google has somehow figured out how to take raw, boring statistical data and make it un-boring.
The Ngram tool is case sensitive. "Bangkok" or "Thailand" gives proper results, while "thailand" or "bangkok" only returns how often the careless lowercase misspellings appeared in print.
Also, in the English corpus, it's more fair to compare Bangkok to Asian cities (e.g. Tokyo, Seoul or Peking/Beijing) than to Paris or London.
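The case-sensitivity point above is easy to see with a toy example. This is only a rough Python sketch of exact-match word counting on a made-up sample text; it is not how Google's tool is actually implemented, and the `count_tokens` helper and the sample sentences are invented for illustration.

```python
from collections import Counter
import re


def count_tokens(text):
    """Count word tokens exactly as written, preserving case (hypothetical helper)."""
    return Counter(re.findall(r"[A-Za-z]+", text))


# Made-up sample text, purely for illustration; not real Ngram data.
sample = (
    "Bangkok is the capital of Thailand. "
    "Many visitors to Thailand start in Bangkok. "
    "One careless writer typed bangkok in lowercase."
)

counts = count_tokens(sample)

# Capitalized and lowercase spellings are tallied as different terms.
for term in ("Bangkok", "bangkok", "Thailand", "thailand"):
    print(term, counts[term])
```

Because the counter treats "Bangkok" and "bangkok" as separate keys, a lowercase query only ever sees the lowercase spellings, which is the behaviour described in the comment above.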