bluella/keywords-distribution-analysis

Keywords Distribution Analysis

Overview

Analysis of Keywords distributions for Account X and Account Y. The performance of every Keyword (Clicks, Impressions, ROAS, etc.) is a random process.

  • Keywords can serve very different purposes. For example, Brand Keywords ("Puma sneakers") target an audience that is already looking for specific products, while Generic ones ("sneakers") catch customers who are just browsing. In theory, these two groups are very different and should be treated separately, so it is important to understand which groups of Keywords share the same performance distribution and which do not. P.S. Mathematically speaking, we are just comparing distributions here.
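One way to make "comparing distributions" concrete is to measure the largest gap between the empirical CDFs of two Keyword groups (the two-sample Kolmogorov-Smirnov statistic). The sketch below is only an illustration: the click counts are synthetic, the names `brand_clicks` and `generic_clicks` are hypothetical, and this is not necessarily how the repository compares groups.

```python
import bisect
import random


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        return bisect.bisect_right(ordered, x) / n

    return cdf


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical CDFs (0 to 1)."""
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    # The gap can only change at observed values, so checking those is enough.
    points = set(sample_a) | set(sample_b)
    return max(abs(cdf_a(x) - cdf_b(x)) for x in points)


# Synthetic, made-up click counts; NOT data from Account X or Account Y.
rng = random.Random(0)
brand_clicks = [rng.lognormvariate(3.0, 0.5) for _ in range(200)]
generic_clicks = [rng.lognormvariate(2.0, 1.0) for _ in range(350)]

print(f"brand vs generic: {ks_statistic(brand_clicks, generic_clicks):.3f}")
print(f"brand vs brand:   {ks_statistic(brand_clicks, brand_clicks):.3f}")
```

A value near 0 means the two groups look alike; a value near 1 means they barely overlap. The samples do not need to be the same size.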

  • It is also a good idea to look at the dynamics of Keywords distributions to see whether changes made in the account lead to healthier performance over time. This gives better insight than just looking at a graph of some mean statistic vs. time.
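To see why a mean-vs-time graph can hide what is happening, here is a small sketch with two hypothetical months whose mean ROAS is identical while the shape of the distribution is not. The numbers are invented for illustration, and `summarize` is a helper name introduced here, not part of the repository.

```python
import statistics

# Made-up ROAS values per Keyword; NOT taken from Account X or Account Y.
monthly_roas = {
    "month_a": [1, 1, 1, 1, 16],  # one Keyword carries the whole account
    "month_b": [4, 4, 4, 4, 4],   # every Keyword performs the same
}


def summarize(values):
    """Mean plus quartiles, so the shape is visible and not only the center."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.fmean(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


for month, values in monthly_roas.items():
    print(month, summarize(values))
```

Both months have the same mean, but the medians differ, which is exactly the kind of change a plot of the mean alone would miss.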

Installation

Make sure you have Python 3.7, git, and virtualenv installed.

  • For Linux:
git clone git@github.com:bluella/keywords-distribution-analysis.git
cd keywords-distribution-analysis
virtualenv -p /usr/bin/python3.7 ds_env
source ds_env/bin/activate
pip install -r requirements.txt
  • For Mac:
git clone git@github.com:bluella/keywords-distribution-analysis.git
cd keywords-distribution-analysis
virtualenv -p /usr/local/bin/python3.7 ds_env
source ds_env/bin/activate
pip install -r requirements.txt

Workflow

A few things were done:

  • Datasets were created with Google Ads Reports
  • Visual comparison of monthly distributions
  • Visual comparison of distributions of a few keywords groups
  • Hypothesis testing of whether distributions differ from each other

Testing was done with four tests for three timeframes (August, September, October).

These tests were picked because:

  • distributions are clearly not Gaussian, so we need nonparametric tests
  • sample sizes are different
  • tests look at different properties of distributions
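As an illustration of a test that meets these criteria (no Gaussian assumption, unequal sample sizes allowed), below is a minimal sketch of a permutation test on the difference of medians. It is a stand-in chosen for this example and not necessarily one of the four tests used here; the data is synthetic and `permutation_test` is a hypothetical helper name.

```python
import random
import statistics


def permutation_test(sample_a, sample_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference of medians.

    Null hypothesis: both samples come from the same distribution, so
    relabeling the pooled values should produce gaps as large as the
    observed one reasonably often.
    """
    rng = random.Random(seed)
    observed = abs(statistics.median(sample_a) - statistics.median(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(
            statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:])
        )
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (extreme + 1) / (n_permutations + 1)


# Synthetic, made-up values; NOT data from Account X or Account Y.
rng = random.Random(42)
group_1 = [rng.lognormvariate(3.0, 0.5) for _ in range(200)]
group_2 = [rng.lognormvariate(2.0, 1.0) for _ in range(350)]

p_value = permutation_test(group_1, group_2)
print(f"p-value: {p_value:.4f}")
```

Swapping the median for another statistic (for example a quantile or the spread) makes the same procedure sensitive to a different property of the distributions.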

Results

  • By running several tests over multiple timeframes, we make our results more robust.
  • Distributions from Account X send mixed signals, so we fail to reject the null hypothesis that the random variables come from the same distribution.
  • Distributions from Account Y are clearly different. The null hypothesis is rejected by every test, and the graphs suggest the same.

Releases

See CHANGELOG.

License

This project is licensed under the MIT License - see the LICENSE for details.