🧑‍🏫 Practical guide to big data analysis, with Python

nebari-dev/big-data-tutorial

Data of an Unusual Size: A practical guide to analysis and interactive visualization of massive datasets

In this hands-on tutorial, you will learn the fundamentals of analyzing massive datasets through real-world examples run on powerful machines in a public cloud, from how the data is stored and read to how it is processed and visualized. You will understand how large-scale analysis differs from local workflows, the unique challenges that come with scale, and some best practices for working productively with your data.

Setup for SciPy 2024

You can use Nebari (JupyterHub) hosted at scipy.quansight.dev to follow along with this tutorial.

Follow this participant's guide to register and sign in (re-register if you used it for a different tutorial), select the Medium Instance in the Server Options, and click on the "Data of an Unusual Size" card in the JupyterLab launcher to clone the materials.

In the tutorials/big-data-tutorial folder, which is created with all the materials, open 00-introduction.ipynb.

The environment for this tutorial is scipy-scipy-data-of-unusual-size, and it is automatically selected for you. :)

Live presentations

You can check out the tags for previous versions of this tutorial.


This repository is covered by the Nebari Code of Conduct and is released under the BSD 3-Clause license.
