Spark cluster management with Docker
Updated May 11, 2021 - Dockerfile
Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP) and Apache Kafka clusters.
Research on setting up and using a Spark Standalone multi-node cluster.
Spark on Kubernetes PoCs
Work done on AWS: the steps for creating a Spark cluster on EC2.
Docker image to deploy a Spark cluster in containers
Template for Spark Data Science Projects
Various product review analyses on an Amazon dataset using Apache Spark and MongoDB
A collection of scripts to easily start HDFS and Spark clusters
A Python library to submit Spark jobs to YARN clusters on different distributions (currently CDH and HDP)
This project creates a Hadoop and Spark cluster on Amazon AWS with Terraform
📓 Repository/Tutorial for initializing Jupyter Notebook and a Spark cluster on Amazon EMR
Local Kubernetes-based ML setup
This project provides end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The Spark clusters are set up within a Docker container on Azure.
Command-line interface for a Spark cluster management app
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support