Keywords Distribution Analysis

Overview

This repository analyzes the distributions of Google Ads keyword performance for Account X and Account Y. The performance of every keyword (Clicks, Impressions, ROAS, etc.) is a random process.

Keywords can serve very different purposes. For example, brand keywords ("Puma sneakers") target an audience that is already looking for specific products, while generic keywords ("sneakers") catch customers who are just browsing. In theory, these two groups are very different and should be treated separately, so it is important to understand which groups of keywords share the same performance distribution and which do not. P.S. Mathematically speaking, we are just comparing distributions here.
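As a rough illustration of the grouping idea, the sketch below splits keywords into brand and generic groups and prints a few quantiles of one metric per group. The brand list, the keyword rows, and the click counts are made-up placeholders, not data from Account X or Account Y, and `keyword_group` is a hypothetical helper.

```python
from statistics import quantiles

# Hypothetical brand terms and (keyword, clicks) rows; illustration only.
BRAND_TERMS = {"puma"}
rows = [
    ("puma sneakers", 120), ("puma running shoes", 95),
    ("puma suede", 80), ("puma shoes sale", 60),
    ("sneakers", 15), ("running shoes", 22),
    ("cheap sneakers", 9), ("white sneakers", 30),
]


def keyword_group(keyword: str) -> str:
    """Label a keyword 'brand' if it contains any brand term, else 'generic'."""
    words = set(keyword.lower().split())
    return "brand" if words & BRAND_TERMS else "generic"


# Collect the metric values of each group.
groups = {}
for keyword, clicks in rows:
    groups.setdefault(keyword_group(keyword), []).append(clicks)

# Compare the groups by quartiles rather than by a single mean.
for name, clicks in groups.items():
    q1, median, q3 = quantiles(clicks, n=4)
    print(f"{name}: n={len(clicks)} q1={q1} median={median} q3={q3}")
```

A real analysis would compare the full empirical distributions of each group, not just three quantiles.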

It is also a good idea to look at how keyword distributions change over time, to see whether changes made in the account lead to healthier performance. This gives better insight than just looking at a graph of some mean statistic vs. time.
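To see why a mean-vs-time graph can hide what is happening, consider the minimal sketch below. The monthly click values are invented for illustration and `summarize` is a hypothetical helper. Both months have the same mean, yet the quartiles show very different distributions.

```python
from statistics import mean, quantiles

# Hypothetical per-keyword clicks for two months; illustration only.
monthly_clicks = {
    "August": [1, 2, 2, 3, 50],       # one keyword dominates
    "September": [8, 10, 12, 14, 14],  # clicks are spread evenly
}


def summarize(values):
    """Return the mean and the quartiles of one month's values."""
    q1, median, q3 = quantiles(values, n=4)
    return {"mean": mean(values), "q1": q1, "median": median, "q3": q3}


summary = {month: summarize(values) for month, values in monthly_clicks.items()}
for month, s in summary.items():
    print(month, s)
```

The means are identical, so a mean-only chart would look flat, while the medians differ by a wide margin.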

Installation

Make sure you have Python 3.7, git, and virtualenv installed.

For Linux:

git clone git@github.com:bluella/keywords-distribution-analysis.git
cd keywords-distribution-analysis
virtualenv -p /usr/bin/python3.7 ds_env
source ds_env/bin/activate
pip install -r requirements.txt

For Mac:

git clone git@github.com:bluella/keywords-distribution-analysis.git
cd keywords-distribution-analysis
virtualenv -p /usr/local/bin/python3.7 ds_env
source ds_env/bin/activate
pip install -r requirements.txt

Workflow

The following steps were carried out:

Datasets were created with Google Ads Reports.
Monthly distributions were compared visually.
Distributions of a few keyword groups were compared visually.
Hypothesis tests were run to check whether the distributions differ from each other.

Four tests were run for three timeframes (August, September, October):

Brown–Forsythe test (SciPy docs)

Wilcoxon rank-sum test (SciPy docs)

Kolmogorov–Smirnov test (SciPy docs)

Kruskal–Wallis test (SciPy docs)

These tests were picked because:

the distributions are clearly not Gaussian, so nonparametric tests are needed

the sample sizes are different

the tests look at different properties of the distributions
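The four tests listed above can be sketched with SciPy as follows. This is not the repository's own code: the mapping of each test to a `scipy.stats` function is an assumption (Brown–Forsythe is taken to be Levene's test centered on the median), `compare_distributions` is a hypothetical helper, and the samples are synthetic placeholders rather than account data.

```python
import numpy as np
from scipy import stats


def compare_distributions(a, b):
    """Run the four tests on two samples and return their p-values."""
    return {
        # Brown-Forsythe: Levene's test with the median as center (spread).
        "brown_forsythe": stats.levene(a, b, center="median").pvalue,
        # Wilcoxon rank-sum: location shift between two samples.
        "wilcoxon_rank_sum": stats.ranksums(a, b).pvalue,
        # Two-sample Kolmogorov-Smirnov: any difference in the CDFs.
        "kolmogorov_smirnov": stats.ks_2samp(a, b).pvalue,
        # Kruskal-Wallis: rank-based test that also works for 3+ groups.
        "kruskal_wallis": stats.kruskal(a, b).pvalue,
    }


# Synthetic, skewed samples of different sizes; illustration only.
rng = np.random.default_rng(0)
group_a = rng.lognormal(mean=0.0, sigma=1.0, size=300)
group_b = rng.lognormal(mean=1.0, sigma=1.0, size=500)

p_values = compare_distributions(group_a, group_b)
for test_name, p in p_values.items():
    print(f"{test_name}: p = {p:.3g}")
```

None of these calls requires equal sample sizes, which matches the second reason given above.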

Results

Running several tests over multiple timeframes makes the results more robust.
Distributions from Account X send mixed signals, so we fail to reject the null hypothesis that the random variables come from the same distribution.
Distributions from Account Y are clearly different: the null hypothesis is rejected by every test, and the graphs suggest the same.
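One way to turn a table of p-values into the kind of verdict described above is sketched below. The significance level of 0.05 is an assumption (the README does not state one), `summarize_tests` is a hypothetical helper, and the p-values are invented for illustration; they are not the repository's results.

```python
ALPHA = 0.05  # assumed significance level; not stated in the README


def summarize_tests(p_values, alpha=ALPHA):
    """Classify a {month: {test: p_value}} table by which tests reject the null."""
    rejected = [p < alpha for tests in p_values.values() for p in tests.values()]
    if all(rejected):
        return "rejected by every test"
    if not any(rejected):
        return "not rejected by any test"
    return "mixed signals"


# Hypothetical p-values, invented for illustration only.
example_mixed = {
    "August": {"ks": 0.03, "kruskal": 0.20},
    "September": {"ks": 0.40, "kruskal": 0.01},
}
example_clear = {
    "August": {"ks": 0.001, "kruskal": 0.002},
    "September": {"ks": 0.004, "kruskal": 0.001},
}

print(summarize_tests(example_mixed))
print(summarize_tests(example_clear))
```

Requiring agreement across all tests and months is a conservative rule: a single non-rejection is enough to report mixed signals.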

Releases

See CHANGELOG.

License

This project is licensed under the MIT License - see the LICENSE for details.
