alisto92/dataengineering_project-3


Project 3: Understanding User Behavior

Alissa Stover

Due April 14th, 2020

Our assignment in this project was as follows:

  • You're a data scientist at a game development company

  • Your latest mobile game has two events you're interested in tracking: buy a sword & join guild

  • Each has metadata characteristic of such events (e.g., sword type, guild name)

Tasks

The commands used to execute the following are recorded in the commands.txt file in this repository:

  • Instrument your API server to log events to Kafka

    • The API server is instrumented in the game_api.py file in this repository
  • Use Apache Bench to generate test data for your pipeline
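As a rough illustration of what "logging events to Kafka" involves, here is a minimal sketch using only the Python standard library. It is not the contents of game_api.py: the helper names `make_event` and `log_to_kafka`, the topic name `events`, the `sword_type` value, and the `fake_send` stand-in for a real Kafka producer are all assumptions made for this example.

```python
import json
from typing import Callable, Dict, List, Tuple


def make_event(event_type: str, headers: Dict[str, str], **metadata: str) -> Dict[str, str]:
    """Build one event record: its type, event-specific metadata, and request headers."""
    event = {"event_type": event_type}
    event.update(metadata)   # e.g. sword type or guild name
    event.update(headers)    # e.g. the Host header sent by the test client
    return event


def log_to_kafka(send: Callable[[str, bytes], None], topic: str, event: Dict[str, str]) -> None:
    """Serialize the event as JSON and hand it to a producer-like callable."""
    send(topic, json.dumps(event).encode("utf-8"))


# Hypothetical stand-in for a Kafka producer so the sketch runs without a broker.
sent: List[Tuple[str, bytes]] = []


def fake_send(topic: str, value: bytes) -> None:
    sent.append((topic, value))


log_to_kafka(
    fake_send,
    "events",  # assumed topic name
    make_event("purchase_sword", {"Host": "localhost:5000"}, sword_type="longsword"),
)
print(sent[0][1].decode("utf-8"))
```

In a real API server, `fake_send` would be replaced by the send method of an actual Kafka client, and each route handler would call `log_to_kafka` with its own event type.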

The following is executed in the Project_3 Jupyter Notebook in this repository:

  • Assemble a data pipeline to catch these events: use Spark streaming to filter select event types from Kafka, land them into HDFS/parquet to make them available for analysis using Presto

  • Produce an analytics report where you provide a description of your pipeline and some basic analysis of the events
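The filtering step in the pipeline above comes down to a predicate applied to each message read from Kafka. The sketch below shows that predicate in plain Python rather than Spark, so it runs without a cluster. The `event_type` field name, the `WANTED` set, and the sample messages are assumptions for illustration, not taken from the notebook.

```python
import json
from typing import Iterable, List, Set

# Assumed set of event types to keep; the notebook may select different ones.
WANTED: Set[str] = {"purchase_sword", "join_guild"}


def is_wanted(raw: str, wanted: Set[str] = WANTED) -> bool:
    """Return True when a raw JSON message is an event of a selected type."""
    try:
        event = json.loads(raw)
    except ValueError:
        return False  # skip messages that are not valid JSON
    return isinstance(event, dict) and event.get("event_type") in wanted


def filter_events(raw_events: Iterable[str]) -> List[dict]:
    """Parse and keep only the selected events, preserving their order."""
    return [json.loads(raw) for raw in raw_events if is_wanted(raw)]


# Made-up sample messages standing in for values read from a Kafka topic.
raw_events = [
    '{"event_type": "default"}',
    '{"event_type": "purchase_sword", "sword_type": "longsword"}',
    'not json',
    '{"event_type": "join_guild", "guild_name": "example"}',
]
kept = filter_events(raw_events)
print([event["event_type"] for event in kept])
```

In a Spark streaming job, the same check would typically be applied as a filter on the message value column before the selected rows are written out as parquet.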

We were given the following guidelines for this notebook:

Use a notebook to present your queries and findings. Remember that this notebook should be appropriate for presentation to someone else in your business who needs to act on your recommendations.

It's understood that events in this pipeline are generated events which make them hard to connect to actual business decisions. However, we'd like students to demonstrate an ability to plumb this pipeline end-to-end, which includes initially generating test data as well as submitting a notebook-based report of at least simple event analytics.

Options

We were given the following guidelines for additional options.

I did not attempt the italicized options, as these are outside my skill set.

There are plenty of advanced options for this project. Here are some ways to take your project further than just the basics we'll cover in class:

  • Generate and filter more types of events. There are plenty of other things you might capture events for during gameplay

    • I created additional events. The final list of event types is as follows:
      • Given in assignment:
        • default
        • purchase_sword
        • purchase_knife
        • join_guild
      • Optional additions:
        • purchase_shield
        • declare_fealty
        • declare_war
  • Enhance the API to use additional http verbs such as POST or DELETE as well as additionally accept parameters for events (e.g., purchase events might accept sword or item type)

  • Connect a user-keyed storage engine such as Redis or Cassandra up to Spark so you can track user state during gameplay (e.g., user's inventory or health)
