Exploring Relationships in CTR Data: Insights for Marketing Strategies

In this analysis, I used R (ggplot2) to explore a dataset of click-through rates (CTR) and examine the relationships between its variables. My goal was to uncover insights that could inform marketing strategies.

In this report, I present my findings by examining variables such as ad format, product type, and time of day, and offer recommendations for optimizing CTR on digital advertising campaigns.

As this is a programming assignment in R, I include the code in this report and show the process of optimizing visualizations.

Category: Dataviz Assignment

Keywords: R(ggplot2); Marketing Strategies

Background: What is CTR?

In the world of digital advertising, click-through rate (CTR) is a crucial metric used to measure the effectiveness of online ad campaigns. CTR is defined as the ratio of clicks on an ad to the total number of impressions, expressed as a percentage. For example, if an ad receives 100 clicks for every 1000 impressions, the CTR would be 10%.

Source: as shown in the image

CTR is essential in advertising analysis because it provides valuable insights into the performance of an ad campaign. A high CTR indicates that the ad is engaging and relevant to the audience. Marketers and advertisers use CTR to optimize their ad campaigns by identifying underperforming ads and making changes to improve their CTR.
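The definition above can be checked with a few lines of R. This is only a minimal sketch: the helper name ctr_percent is made up for illustration, and the second half assumes that each row of the data is one impression carrying a 0/1 is_click flag.

```r
# CTR as a percentage: clicks divided by impressions, times 100.
# (ctr_percent is a hypothetical helper name, not part of any package.)
ctr_percent <- function(clicks, impressions) {
  stopifnot(all(impressions > 0))
  100 * clicks / impressions
}

ctr_percent(100, 1000)  # the example from the text: 10

# Assumption: one row per impression with a 0/1 click flag.
# In that layout the CTR is simply the mean of the flag.
is_click <- c(1, 0, 0, 0)
100 * mean(is_click)    # 25
```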

Data Source & Menu

The data source for this report is an open dataset from the Kaggle website: https://www.kaggle.com/datasets/arashnic/ctr-in-advertisement

In this CSV file, the valid data columns contain the following information: DateTime, webpage_id, product_category_1, gender, age_level, user_depth, city_development_index, and is_click.

You can also download the CSV file here.

Explore wherever you want! →

Age Group

Firstly, we can pose a question: how does CTR differ among users of different age groups? The file provides seven age levels, 0 through 6, which represent users aged 0-10, 10-20, 20-30, 30-40, and so on up to 60-70 years old.

We can try using a bar chart first to see if it can display visually appealing and promising results.
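A first attempt might look like the sketch below. It is not the original assignment code: the data frame is a small made-up stand-in for the Kaggle CSV (only the column names age_level and is_click come from the file), and the 7% click probability is an arbitrary assumption.

```r
library(ggplot2)

# Toy stand-in for the Kaggle CSV: one row per impression (values are made up).
set.seed(1)
ads <- data.frame(
  age_level = sample(0:6, 2000, replace = TRUE),
  is_click  = rbinom(2000, 1, 0.07)
)
ads$clicked <- factor(ads$is_click, levels = c(0, 1),
                      labels = c("Not Clicked", "Clicked"))

# Default geom_bar() stacks raw counts, so every age level
# ends at a different total height.
p_age <- ggplot(ads, aes(x = factor(age_level), fill = clicked)) +
  geom_bar() +
  labs(x = "Age level", y = "Impressions", fill = NULL)
p_age
```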

Interestingly, because the "ceiling" (the total bar height) is different for each age group, we cannot compare the percentages of "Clicked" and "Not Clicked" across age groups in this graph. Therefore, we can try a stacked bar chart that normalizes every bar to the same "ceiling".

To make the differences in the Clicked share between age groups easier to compare, we can zoom in on the pink (Clicked) bars by adjusting the y-axis. This can be achieved with the "scale_y_continuous" function.
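The two steps above, a common "ceiling" and then a magnified y-axis, could be sketched as follows. The toy data and the 0-15% zoom window are assumptions. I also swap in coord_cartesian for the zoom and keep scale_y_continuous for the percent labels only, because limits set inside scale_y_continuous censor out-of-range data by default and can silently drop bar segments.

```r
library(ggplot2)

# Toy stand-in for the Kaggle CSV (values are made up).
set.seed(1)
ads <- data.frame(
  age_level = sample(0:6, 2000, replace = TRUE),
  is_click  = rbinom(2000, 1, 0.07)
)
# "Clicked" is the last level, so it is stacked at the bottom of each bar.
ads$clicked <- factor(ads$is_click, levels = c(0, 1),
                      labels = c("Not Clicked", "Clicked"))

# position = "fill" rescales every bar to the same height (100%).
p_fill <- ggplot(ads, aes(x = factor(age_level), fill = clicked)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Age level", y = "Share of impressions", fill = NULL)

# Zoom in on the bottom of the bars without discarding any data.
p_zoom <- p_fill + coord_cartesian(ylim = c(0, 0.15))
p_zoom

# The numbers behind the plot: CTR per age level.
ctr_by_age <- tapply(ads$is_click, ads$age_level, mean)
```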

The differences are now magnified. Across the 0-70 age range, the average CTR gradually decreases, reaches its minimum at 40-50 years old, and then gradually increases. However, the differences are very small, so the benefit they can bring to advertising is also very limited.

City Development Level

After examining the CTR differences across age groups, let's take a look at how CTR varies across different levels of urban development. Do residents in Toronto click on ads more than those in Prince Edward Island?

In view of the almost insignificant differences in CTR across different levels of urban development, it is necessary to explore other variables that may have an impact on ad click rates. As such, we propose to investigate the impact of gender on CTR in various tiers of cities. Specifically, we seek to determine which gender, male or female, exhibits higher ad click rates across different tiers of cities, and whether there are any significant differences in overall ad click rates between the two genders.
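One way to set up this comparison is a grouped bar chart of CTR by gender within each city tier. The sketch below uses made-up data; the column names follow the CSV, but the gender labels, the tier values 1 to 4, and the click probability are assumptions for illustration.

```r
library(ggplot2)

# Toy stand-in for the Kaggle CSV (values are made up).
set.seed(2)
ads <- data.frame(
  gender = sample(c("Female", "Male"), 3000, replace = TRUE),
  city_development_index = sample(1:4, 3000, replace = TRUE),
  is_click = rbinom(3000, 1, 0.07)
)

# The mean of the 0/1 is_click flag is the CTR of each gender x tier cell.
ctr_by_group <- aggregate(is_click ~ gender + city_development_index,
                          data = ads, FUN = mean)
names(ctr_by_group)[3] <- "ctr"

# Side-by-side bars make the gender gap within each tier visible.
p_city <- ggplot(ctr_by_group,
                 aes(x = factor(city_development_index), y = ctr,
                     fill = gender)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "City development index", y = "CTR", fill = "Gender")
p_city
```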

The graph provides valuable insights into the impact of gender and city development index on ad click rates. As a general trend, it appears that women are less likely to click on ads compared to men, irrespective of the level of urban development.

However, the graph also reveals interesting differences in ad click rates between men and women across different tiers of cities. Specifically, in level 3 cities (i.e., higher levels of development), men exhibit significantly higher ad click rates compared to women. On the other hand, in level 2 cities, while the overall ad click rates are relatively high for both genders, the difference between men and women is smaller.

Time Period

This will be one of the more challenging tasks: how can we visualize changes in the data over time? The data in our table consists of scattered time points, so organizing it into a regular line or time interval/period will require some effort.

To begin with, let's examine changes in total impressions during different time periods. Since seven days is too long to be effectively visualized in R, we will focus on one day (24 hours) at a time. We will use Python to transform the data from counting every minute to counting every 15 minutes, which will allow us to more easily visualize the changes over time.

The specific code will not be expanded here. Here we use ggplot to plot the line graph.
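For readers who want to reproduce the idea, here is a rough R equivalent of the binning step followed by the line graph. It is not the Python code used for this report: the timestamps are randomly generated for one hypothetical day, and flooring to multiples of 900 seconds is just one possible way to build 15-minute windows.

```r
library(ggplot2)

# Toy timestamps for a single day (made up; the date is arbitrary).
set.seed(3)
day_start <- as.POSIXct("2017-07-02 00:00:00", tz = "UTC")
ads <- data.frame(
  DateTime = day_start + sample(0:86399, 5000, replace = TRUE)
)

# Floor each timestamp to the start of its 15-minute (900-second) window.
ads$bin <- as.POSIXct(floor(as.numeric(ads$DateTime) / 900) * 900,
                      origin = "1970-01-01", tz = "UTC")

# Count impressions per window: every row is one impression.
ads$n <- 1
impressions <- aggregate(ads["n"], by = list(bin = ads$bin), FUN = sum)

p_imp <- ggplot(impressions, aes(x = bin, y = n)) +
  geom_line() +
  scale_x_datetime(date_labels = "%H:%M") +
  labs(x = "Time of day (15-minute bins)", y = "Impressions")
p_imp
```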

The graph shows a gradual increase in impressions in the morning, with a peak around noon, followed by a decrease in the afternoon. Impressions then increase rapidly in the evening, reaching a peak around 9pm before dropping sharply.

To further analyze the data, we will add the total clicks to the table using a similar process.

This is a very interesting result! We can see that although impressions show a significant increase in the morning period, rising from almost zero to around 1300, the growth of clicks is slow, climbing only slightly. Similarly, during the evening peak of impressions, clicks do not show a significant increase.

However, it is important to note that these findings are based on absolute figures rather than rates. To further analyze the performance, we can calculate click-through rates (CTRs) and add them to our ggplot visualization.
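Moving from absolute figures to rates only needs one extra column per time window. The sketch below again runs on made-up data (random timestamps and an assumed 7% click probability) and shows the calculation rather than the exact plot used in this report.

```r
library(ggplot2)

# Toy data for a single day (values are made up).
set.seed(4)
day_start <- as.POSIXct("2017-07-02 00:00:00", tz = "UTC")
ads <- data.frame(
  DateTime = day_start + sample(0:86399, 5000, replace = TRUE),
  is_click = rbinom(5000, 1, 0.07)
)
ads$bin <- as.POSIXct(floor(as.numeric(ads$DateTime) / 900) * 900,
                      origin = "1970-01-01", tz = "UTC")
ads$impression <- 1

# Sum impressions and clicks per 15-minute window, then divide.
per_bin <- aggregate(ads[c("impression", "is_click")],
                     by = list(bin = ads$bin), FUN = sum)
names(per_bin) <- c("bin", "impressions", "clicks")
per_bin$ctr <- per_bin$clicks / per_bin$impressions

p_ctr <- ggplot(per_bin, aes(x = bin, y = ctr)) +
  geom_line() +
  scale_x_datetime(date_labels = "%H:%M") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Time of day (15-minute bins)", y = "CTR")
p_ctr
```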

We observe that the CTR fluctuates throughout the day, but generally stays within the range of 5% to 10%. Interestingly, the CTR shows the least variability around noon. We could speculate that this might be due to the larger size of the sample during these hours.
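The sample-size speculation can be made concrete with the standard error of a proportion under a simple binomial model. The numbers here are assumptions chosen for illustration: a true CTR of 7%, a quiet window with 100 impressions, and a busy window with 1300 impressions.

```r
# Standard error of an observed CTR under a simple binomial model:
# sqrt(p * (1 - p) / n), where n is the number of impressions.
ctr_se <- function(p, n) sqrt(p * (1 - p) / n)

p <- 0.07                       # assumed true CTR
se_small <- ctr_se(p, 100)      # hypothetical quiet window
se_large <- ctr_se(p, 1300)     # hypothetical busy window

round(100 * c(se_small, se_large), 1)  # in percentage points: 2.6 0.7
```

Under these assumptions, the noise in a quiet window is several times larger than in a busy one, which would be consistent with a steadier CTR curve when impressions are high.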

Product Type & Webpage

As we have progressed in our analysis, we should not limit ourselves to exploring a single factor. Let's examine the intersectional relationship between product type and webpage to determine which product types are most popular on certain webpages. This analysis could provide valuable marketing insights for ad placements.

This graph is quite cluttered, so we introduce a heatmap to display the relationship between product categories and webpage click rates. Heatmaps are particularly useful for visualizing large datasets and identifying patterns or trends in the data. In this context, a heatmap allows easy comparison of click rates across multiple product categories and webpages. Additionally, its color coding makes it easy to identify which combinations of webpage and product category have the highest and lowest click rates, thereby providing valuable insights for marketing purposes.
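A heatmap of this kind can be built with geom_tile. The sketch below is not the original code: the page labels, the five category codes, and the click probability are made up, and only the column names webpage_id, product_category_1, and is_click come from the CSV.

```r
library(ggplot2)

# Toy stand-in for the Kaggle CSV (values and page labels are made up).
set.seed(5)
ads <- data.frame(
  webpage_id = sample(c("page_A", "page_B", "page_C"), 4000, replace = TRUE),
  product_category_1 = sample(1:5, 4000, replace = TRUE),
  is_click = rbinom(4000, 1, 0.07)
)

# CTR for every webpage x product category combination.
ctr_grid <- aggregate(is_click ~ webpage_id + product_category_1,
                      data = ads, FUN = mean)
names(ctr_grid)[3] <- "ctr"

# One tile per combination; darker fill means a higher click rate.
p_heat <- ggplot(ctr_grid,
                 aes(x = factor(webpage_id),
                     y = factor(product_category_1), fill = ctr)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "white", high = "darkred",
                      labels = scales::percent) +
  labs(x = "Webpage", y = "Product category", fill = "CTR")
p_heat
```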

The heatmap color-codes the click rates, with darker colors indicating higher click rates, making it easy to identify the patterns and trends in the data. By comparing the click rates of multiple product categories and webpages, the heatmap helps to visualize the relationship between them. Reading from left to right, the click rates for a given product category across different webpages are easily discernible, while reading from bottom to top, the click rates for different product categories on a specific webpage can be determined with ease.

The analysis of the heatmap reveals that displaying clothing advertisements on landing pages results in significantly higher click rates than any other combination. On the other hand, displaying electronic and beauty products on the homepage yields the lowest click rates. These findings have important implications for online marketing strategies.

Based on this finding, we can suggest that marketers at companies such as Apple and L'Oreal stop putting ads on homepages in their online marketing. People just don't buy it.

Insights

AGE

According to the CTR data, there is not a significant correlation between age group and click-through rate. This suggests that marketers should focus more on which user groups are attracted to the product itself and target their advertising efforts accordingly. For example, they should not target products for infants and young children to users in the 60-70 age group just because they have a slightly higher CTR than the 0-10 age group.

CITY

While the click-through rates across cities with varying levels of economic development exhibit minimal variation, it is important to note that such rates may not necessarily correspond to equal conversion rates. As such, to optimize ad placement strategies, more extensive data on conversion rates is required to identify cities with a higher likelihood of purchasing. This would enable marketers to tailor their advertising efforts towards cities that have a greater propensity to convert, ultimately leading to more efficient and effective targeting strategies.

GENDER

Based on the CTR data, it appears that women tend to have a lower click-through rate than men. However, it is important to note that this does not necessarily imply that targeting men would result in a higher CTR. If a product has a significant gender preference, it is crucial to collect more detailed gender-related data to inform more precise targeting strategies. For example, when targeting feminine hygiene products, it is more important to focus on gender-based targeting rather than just observing the click-through rates of male and female users. For everyday household items without gender-specific features, gender data of the primary purchaser within a household can be collected and combined with CTR data for consideration.

TIME

The range of CTR values throughout the day remains relatively consistent, fluctuating around 6% with no significant increase during peak impression times in the morning and evening. This is a significant insight: if marketers prioritize impressions, such as in a viral campaign focused on leaving a strong psychological impression on potential users, then it is meaningful to invest more in expensive ads during peak hours. However, if clicks and even purchases are the focus of marketers, such as in a best-selling but ordinary e-commerce product, then it may not be necessary to heavily advertise during peak hours.

PRODUCT

Finally, marketers need to closely monitor the performance of different product categories in different contexts, such as on different webpages. This is because data shows that their differences can be significant. Further analysis can be done, such as whether there are significant differences in the CTR of different gender or geographic groups for different product categories. As a result, marketers can target high-performing combinations for more intense advertising.
