Google Search Analysis with Python
Google handles billions of searches every day and trillions each year, which makes search data a valuable window into what people want to know. In this article, we'll use Python to analyze Google search data, focusing on search queries.
Understanding Pytrends
Pytrends is an unofficial Python library for accessing Google Trends data. With it you can find the most popular search topics on Google, track how interest in a term changes over time and compare search interest across regions.
Installing Pytrends
To use Pytrends you first need to install it on your system. You can install it with the following command:
pip install pytrends
Python Implementation of Google Search Analysis
1. Import Necessary Libraries and Connect to Google
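We will use the pandas, pytrends, matplotlib and time libraries for this. TrendReq creates the connection to Google Trends; hl sets the host language and tz the timezone offset in minutes.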
import pandas as pd
from pytrends.request import TrendReq
import matplotlib.pyplot as plt
import time
Trending_topics = TrendReq(hl='en-US', tz=360)
2. Build Payload
Next we build the payload for the search term "Cloud Computing". The build_payload method stores the list of keywords you want to query; you can also specify the timeframe and the category to query the data from. Here 'today 12-m' requests the past 12 months and cat=0 covers all categories. The short pause afterwards spaces out requests to Google.
kw_list=["Cloud Computing"]
Trending_topics.build_payload(kw_list,cat=0, timeframe='today 12-m')
time.sleep(5)
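The fixed time.sleep(5) pause above is one way to space out requests. As a hedged alternative, the sketch below retries a call with increasing waits. The helper name call_with_backoff and its parameters are hypothetical and not part of Pytrends. It assumes that a throttled request surfaces as an exception, and the stand-in function flaky_request replaces a real Trends call so that the example runs offline.

```python
import time


def call_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt and try again.

    Hypothetical helper, not part of pytrends. The `sleep` argument is
    injectable so the logic can be exercised without real waiting.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Give up after the final attempt and re-raise the last error.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Offline stand-in for a Trends request that fails twice, then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated throttling")
    return "ok"


waits = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky_request, sleep=waits.append)
print(result, waits)
```

In real use you would pass something like lambda: Trending_topics.interest_over_time() and keep the default sleep, ideally narrowing the except clause to the specific error your Pytrends version raises.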
3. Interest Over Time
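The interest_over_time() method returns historical, indexed data showing how search interest in the keyword changed over the timeframe given to build_payload. Sorting the result in descending order and keeping the first 10 rows shows the dates on which the keyword was searched the most.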
data = Trending_topics.interest_over_time()
data = data.sort_values(by="Cloud Computing", ascending = False)
data = data.head(10)
print(data)
Output:

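To make the sort-and-head step concrete without calling Google, here is a small sketch on a synthetic frame shaped like an interest_over_time() result: a date index, one column per keyword and an isPartial flag. The values are invented for illustration, and the flag is assumed to be boolean.

```python
import pandas as pd

# Synthetic weekly data shaped like an interest_over_time() result.
# The numbers are invented for illustration only.
dates = pd.date_range("2024-01-07", periods=6, freq="7D")
data = pd.DataFrame(
    {
        "Cloud Computing": [62, 100, 75, 88, 70, 91],
        "isPartial": [False, False, False, False, False, True],
    },
    index=dates,
)
data.index.name = "date"

# Drop rows flagged as partial (incomplete periods), then rank by interest.
complete = data[~data["isPartial"]].drop(columns="isPartial")
top3 = complete.sort_values(by="Cloud Computing", ascending=False).head(3)
print(top3)
```

Filtering out the partial row first keeps an incomplete period from being ranked against complete ones.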
4. Historical Hour Interest
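The get_historical_interest() method allows us to specify periods such as year_start, month_start, day_start, hour_start, year_end, month_end, day_end and hour_end. The example below takes a simpler route: it passes an explicit date range to build_payload through the timeframe parameter and then calls interest_over_time() again.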
kw_list = ["Cloud Computing"]
Trending_topics.build_payload(kw_list, cat=0, timeframe='2024-01-01 2024-02-01', geo='', gprop='')
data = Trending_topics.interest_over_time()
data = data.sort_values(by="Cloud Computing", ascending = False)
data = data.head(10)
print(data)
Output:

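The timeframe string in the example above is two ISO dates separated by a space. A small helper can build it from date objects and catch reversed ranges early. The function name make_timeframe is hypothetical and not part of Pytrends.

```python
from datetime import date


def make_timeframe(start: date, end: date) -> str:
    """Return a 'YYYY-MM-DD YYYY-MM-DD' string for build_payload's timeframe."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return f"{start.isoformat()} {end.isoformat()}"


timeframe = make_timeframe(date(2024, 1, 1), date(2024, 2, 1))
print(timeframe)  # 2024-01-01 2024-02-01
```

The resulting string can be passed directly as the timeframe argument of build_payload.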
5. Interest By Region
Next is the interest_by_region() method, which shows how the keyword performs per region. Results are on a scale of 0-100, where 100 indicates the region with the most search interest and 0 indicates the least interest or not enough data.
data = Trending_topics.interest_by_region()
data = data.sort_values(by="Cloud Computing", ascending=False)
data = data.head(10)
print(data)
Output:

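The 0-100 scale is relative: the strongest region is set to 100 and the others are expressed against it. The sketch below illustrates only that rescaling idea on made-up numbers. It is not Google's actual computation, and the function name to_index_scale is hypothetical.

```python
def to_index_scale(raw):
    """Rescale a {region: value} mapping so that the largest value becomes 100."""
    peak = max(raw.values())
    if peak == 0:
        return {region: 0 for region in raw}
    return {region: round(100 * value / peak) for region, value in raw.items()}


# Invented raw interest figures per region, for illustration only.
raw_interest = {"Region A": 800, "Region B": 400, "Region C": 200, "Region D": 0}
indexed = to_index_scale(raw_interest)

# Rank regions from highest to lowest indexed interest.
ranked = sorted(indexed.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Because the values are relative, a score of 50 means half the interest of the top region, not a specific number of searches.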
6. Visualizing Interest By Region
# Apply the style before plotting so that it takes effect on this figure
plt.style.use('fivethirtyeight')
data.reset_index().plot(x='geoName', y='Cloud Computing',
                        figsize=(10, 5), kind="bar")
plt.show()
Output:

7. Searching for Related Queries
Whenever a user searches for something about a particular topic on Google there is a high probability that the user will search for more queries related to the same topic. These are known as related queries. Let us find a list of related queries for "Cloud Computing".
try:
    Trending_topics.build_payload(kw_list=['Cloud Computing'])
    related_queries = Trending_topics.related_queries()
    print(related_queries.values())
except (KeyError, IndexError):
    print("No related queries found for 'Cloud Computing'")
Below is the output when we searched for queries related to Cloud Computing. In this run the lookup raised one of the handled exceptions, so the fallback message was printed.
Output:
No related queries found for 'Cloud Computing'
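When the request does succeed, related_queries() returns a dictionary keyed by keyword, where each entry holds a 'top' and a 'rising' table, either of which may be missing. The sketch below unpacks such a structure defensively. The dictionary is synthetic, the column names query and value are assumed, and the helper name top_related is hypothetical.

```python
import pandas as pd


def top_related(related, keyword, kind="top", n=5):
    """Return up to n query strings from a related_queries()-style dict."""
    table = (related.get(keyword) or {}).get(kind)
    if table is None or table.empty:
        return []
    return table.sort_values("value", ascending=False)["query"].head(n).tolist()


# Synthetic stand-in for a related_queries() result; values are invented.
related = {
    "Cloud Computing": {
        "top": pd.DataFrame(
            {"query": ["what is cloud computing", "aws cloud computing"],
             "value": [100, 64]}
        ),
        "rising": None,  # assumed here to be None when there is no data
    }
}

print(top_related(related, "Cloud Computing"))            # top queries
print(top_related(related, "Cloud Computing", "rising"))  # empty list
```

Returning an empty list for missing data lets calling code loop over the result without extra checks.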
8. Keyword Suggestions
The suggestions() method helps you to explore what the world is searching for. It returns a list of additional suggested keywords that can be used to filter a trending search on Google.
keywords = Trending_topics.suggestions(keyword='Cloud Computing')
df = pd.DataFrame(keywords)
# Drop the internal topic ID column and display the remaining columns
df = df.drop(columns='mid')
print(df)
Output:

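suggestions() returns a list of dictionaries, which is why it converts cleanly into a DataFrame. The sketch below shows the same conversion on a synthetic list, assuming each entry carries mid, title and type keys. The mid values are placeholders rather than real identifiers.

```python
import pandas as pd

# Synthetic stand-in for a suggestions() result. The keys mirror the
# assumed 'mid', 'title' and 'type' fields; the values are placeholders.
keywords = [
    {"mid": "/m/placeholder1", "title": "Cloud computing", "type": "Topic"},
    {"mid": "/m/placeholder2", "title": "Cloud computing security", "type": "Topic"},
    {"mid": "/m/placeholder3", "title": "Example Cloud Co.", "type": "Company"},
]

df = pd.DataFrame(keywords)
# drop() returns a new DataFrame, so the result must be assigned.
df = df.drop(columns="mid")

# Keep only the suggestions classified as topics.
topics = df[df["type"] == "Topic"]["title"].tolist()
print(df)
print(topics)
```

Filtering on the type column is one way to narrow suggestions down before feeding them back into build_payload.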
With these methods we can find trends in Google search data and use them for various purposes.