From 6551a42666ed9243da4aaaee86d81826d92eba29 Mon Sep 17 00:00:00 2001
From: Myles Braithwaite
Date: Mon, 12 Nov 2018 11:38:43 -0500
Subject: [PATCH] :memo: Add docs for accessing files from S3.

Add some documentation about accessing files from a remote S3 bucket in pandas.

#12206
---
 doc/source/cookbook.rst | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/doc/source/cookbook.rst b/doc/source/cookbook.rst
index 53468e755a722..2b6f86d4f4d0e 100644
--- a/doc/source/cookbook.rst
+++ b/doc/source/cookbook.rst
@@ -1370,3 +1370,28 @@ of the data values:
                        'weight': [100, 140, 180],
                        'sex': ['Male', 'Female']})
     df
+
+
+Load a file from S3
+-------------------
+
+pandas can read files directly from an S3 bucket, which requires the S3Fs_
+library to be installed.
+
+.. code-block:: python
+
+    df = pd.read_csv('s3://baseballdatabank/core/Parks.csv')
+    df.head(1)
+
+If your S3 bucket requires credentials, set them as environment variables or
+in the ``~/.aws/credentials`` config file. For details, refer to the `S3Fs
+documentation on credentials
+`_.
+
+.. code-block:: shell
+
+    export AWS_ACCESS_KEY_ID=''
+    export AWS_SECRET_ACCESS_KEY=''
+    export AWS_SESSION_TOKEN=''
+
+.. _S3Fs: https://pypi.org/project/s3fs/
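
As a rough illustration of the workflow this patch documents, the sketch below wraps the ``pd.read_csv`` call in a small helper and adds a check for the credential environment variables. The helper names (``s3_url``, ``missing_aws_credentials``, ``read_s3_csv``) are hypothetical and not part of pandas or S3Fs. The check only inspects environment variables, so it may report names as missing even when credentials are supplied through ``~/.aws/credentials``. ``AWS_SESSION_TOKEN`` is left out of the check on the assumption that it is only needed for temporary credentials.

```python
import os


def s3_url(bucket, key):
    """Build an ``s3://`` URL from a bucket name and an object key."""
    return "s3://{}/{}".format(bucket, key.lstrip("/"))


def missing_aws_credentials(environ=None):
    """Return the required AWS variable names that are unset or empty.

    Hypothetical helper: it looks only at environment variables and does
    not consult ``~/.aws/credentials``.
    """
    environ = os.environ if environ is None else environ
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    return [name for name in required if not environ.get(name)]


def read_s3_csv(bucket, key, **kwargs):
    """Read a CSV file from S3 into a DataFrame.

    Needs pandas and s3fs installed; extra keyword arguments are passed
    through to ``pd.read_csv`` unchanged.
    """
    import pandas as pd  # imported lazily so the helpers above work without it

    return pd.read_csv(s3_url(bucket, key), **kwargs)
```

With pandas and s3fs installed, ``read_s3_csv('baseballdatabank', 'core/Parks.csv').head(1)`` would be equivalent to the example in the patch. Calling ``missing_aws_credentials()`` first gives an early hint when a private bucket is likely to reject the request.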