Big Data, MapReduce, Spark, PySpark @ Santa Clara University, Winter 2019

Leo8216/big-data-mapreduce-course

 
 


Spring 2019

Course Information:

  • Graduate School, Leavey School of Business
  • Department of Information Systems & Analytics
  • Course MSIS 2627: Big Data Modeling & Analytics
  • Big-Data-MapReduce Course @ Santa Clara University
  • Class Meeting dates: April 2, 2019 - June 12, 2019
  • Class hours:
    • Tuesday 5:45PM - 7:00PM PT
    • Thursday 5:45PM - 7:00PM PT
  • Class room: Lucas Hall 306
  • Office: 221 W, Lucas Hall
  • Office Hours: by appointment

Required Books, Papers, and Documentation


Midterm Exam

  • Midterm Exam Date: May TBD, 2019
  • Midterm Exam Time: 5:45PM - 7:00PM PT

Final Exam

  • Final Exam Date: June TBD, 2019
  • Final Exam Time: 5:45PM - 7:45PM PT

Course Description

This class covers the following topics:

  • Concepts of Big Data
  • Distributed File Systems
  • Distributed Computing
  • Distributed and Parallel Algorithms
  • MapReduce Paradigm
  • MapReduce Algorithms
  • Scale-out Architectures (using Hadoop, Spark, PySpark)
  • Apache Spark: http://spark.apache.org/
  • MapReduce and distributed computing in practice, using Spark, PySpark, Hadoop, and Java
  • How to query NoSQL data with SQL
  • Amazon Athena
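
To give a feel for the MapReduce paradigm listed above, here is a minimal single-machine sketch of a word count in plain Python. It is not taken from the course materials: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and a real Hadoop or Spark job would distribute each phase across a cluster rather than run it in one process.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key (done by the framework in Hadoop/Spark)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data big ideas", "spark and hadoop process big data"]
    print(word_count(sample))
```

The map and reduce functions each see only one line or one key at a time, which is what would allow a framework to run many copies of them in parallel on different machines.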

My latest book:

Data Algorithms: Recipes for Scaling up with Hadoop and Spark

