In the old days, the only way to find answers to questions was to check with some authoritative source, such as an encyclopedia or some expert in the field. But with the abundant availability of data of various kinds and some really good tools to view the data, we now have some alternative ways to answer many questions in a data-driven manner. What I love about this approach is not just that it is based on real evidence rather than having to take someone’s word for it, but that it usually leads to more interesting and complex answers than what a typical authoritative source might give you.
Here is an example. Suppose you wanted to find out when the first day of summer is. But instead of consulting your favorite oracle, let us try a different approach.
Google Trends is a tool that lets you see how often people all over the world have searched for a given term during a given time period. The chart shows relative search interest, scaled against the busiest point in the chart, rather than raw query counts. Let us give it a try and see if we can find something interesting.
Go to Google Trends and enter “first day of summer” as the term that you want to see search traffic for. You will get a chart like the one below:
Here is the link if you want to see it better: http://www.google.com/tre
It is pretty clear that there is a repeating pattern here. The queries seem to be peaking somewhere in the middle of the year every year. Let us drill down a bit. Zoom into the year 2012.
Here is the link to that one: http://www.google.com/tre
It is pretty clear that the peak is in the week of June 17 to 23. Right on target, right? So, now you have a kind of “social proof” that the first day of summer has something to do with the time period June 17 to 23. A lot more reassuring than taking someone’s word for it, right?
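If you would rather not eyeball the chart, the same "find the peak week" step can be sketched in a few lines of Python. This is only a rough illustration: the function name `peak_week` is my own, and the numbers in `sample` are made-up placeholders, not real Google Trends figures. It assumes you have one (week start date, relative interest) pair per week, for example typed in by hand from the chart.

```python
from datetime import date, timedelta


def peak_week(weekly_interest):
    """Return (week_start, week_end, value) for the week with the highest interest.

    weekly_interest: list of (week_start_date, interest) pairs, one per week.
    """
    week_start, value = max(weekly_interest, key=lambda row: row[1])
    return week_start, week_start + timedelta(days=6), value


# Made-up placeholder values for illustration only, NOT real Google Trends data.
sample = [
    (date(2012, 6, 3), 21),
    (date(2012, 6, 10), 48),
    (date(2012, 6, 17), 100),
    (date(2012, 6, 24), 30),
]

start, end, value = peak_week(sample)
print(f"Peak week: {start:%B %d} to {end:%B %d} (relative interest {value})")
```

With real weekly values in place of the placeholders, the printed range is the same kind of answer you get by zooming into the chart.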
Ok, let us try an even more interesting one. Try “father’s day”.
Here is that link: http://www.google.com/tre
Here again, a repeating pattern. Let us zoom in to just the year 2012:
Here’s that link: http://www.google.com/tre
And bingo! Father’s Day must be somewhere in or right around the week of June 10 to 16! Cool?
But there is something more intriguing here. There is a secondary peak in the chart:
Let us see whether this is just an aberration or whether there is a pattern to it, too. Zoom back to the “all years” view:
Well, there’s that secondary peak again. Repeating like clockwork :-), every year. A bit later in the year than the primary peak. What is that all about?
Well, there is more than one way to slice and dice the data. Let us see if we can find something by looking at “Regional Interest”, which shows the search traffic for each region of the world. As you move from region to region, you will see the order of the peaks reverse when you click on Australia: the primary peak becomes secondary and the secondary peak becomes primary. (I have pasted the worldwide chart first and then the one for Australia to help you compare the two better.)
Interesting, isn’t it? What is going on?
Well, it turns out that this is because, in Australia, Father’s Day is celebrated a little later in the year, on the first Sunday of September (September 2 in 2012)!
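For the skeptics, here is a small Python sketch that cross-checks the two peaks against the calendar. The rules it uses are not something the charts tell you; they are the usual conventions as I understand them: the third Sunday of June in the US (and many other countries) and the first Sunday of September in Australia. The helper name `nth_weekday` is just something I made up for this example.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def nth_weekday(year, month, weekday, n):
    """Return the date of the n-th given weekday in a month (n starts at 1)."""
    first = date(year, month, 1)
    # Days to step forward from the 1st to reach the first matching weekday.
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


# Assumed rules (not taken from the charts):
us_fathers_day = nth_weekday(2012, 6, SUNDAY, 3)  # third Sunday of June
au_fathers_day = nth_weekday(2012, 9, SUNDAY, 1)  # first Sunday of September

print("US:", us_fathers_day)
print("Australia:", au_fathers_day)
```

For 2012 this gives June 17 and September 2. Notice that June 17 falls just after the June 10 to 16 peak week, which would fit people doing most of their searching in the days leading up to the day itself.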
See? Not only do you have a more reassuring answer, but now you know that the answer is more complex than what you would probably have gotten from an expert. Not to mention that you got to look at some pretty pictures rather than reading some bland text 🙂
Want to try something on your own? Try “Mother’s day” and see whether you can get some similar insights. Feel free to comment below with what you find. Maybe try some other terms, too!