A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Updated Nov 3, 2017 - Jupyter Notebook
Command-line interface for a Spark cluster management app
This project provides end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The Spark clusters are set up within a Docker container on Azure.
This project creates a Hadoop and Spark cluster on Amazon AWS with Terraform
Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP), and Apache Kafka clusters.
📓 Repository/tutorial for initializing a Jupyter Notebook and Spark cluster on Amazon EMR
Spark on Kubernetes PoCs
Performing various product review analyses on an Amazon dataset using Apache Spark and MongoDB
A collection of scripts to easily start HDFS and Spark clusters
Template for Spark Data Science Projects
Research on setting up and using a Spark Standalone multi-node cluster.
Local Kubernetes-based ML setup
Spark cluster management with Docker
A Python library to submit Spark jobs to a YARN cluster across different distributions (currently CDH, HDP)
Notes on AWS, including the steps for creating a Spark cluster on EC2.
Docker image to deploy a Spark cluster in containers