Step-by-step guide to retrieving S3 data into Athina.
Amazon S3 (Simple Storage Service) is widely used for storing both structured and unstructured data. If you have datasets stored in an S3 bucket and want to use them in Athina IDE for evaluation or experimentation, this guide will walk you through the step-by-step process of fetching data from S3 and adding it to Athina IDE datasets using Python.
Step 3: Retrieve Data from S3 and Load into Pandas
Now, let’s fetch the file from S3, read its content, and convert it into a Pandas DataFrame:
    try:
        # Fetch the file from S3
        obj = s3.get_object(Bucket=BUCKET_NAME, Key=FILE_KEY)
        data = obj['Body'].read().decode('utf-8')

        # Convert JSON data to Pandas DataFrame
        df = pd.read_json(StringIO(data))
        print("S3 Data Successfully Loaded!")
    except s3.exceptions.NoSuchKey:
        print("The specified object does not exist in the bucket.")
    except Exception as e:
        print(f"Error retrieving S3 data: {e}")
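The fetch-and-decode step can also be wrapped in a small helper so it is easy to reuse and to try out without AWS access. The sketch below is only an illustration: `fetch_s3_text` and `FakeS3Client` are hypothetical names, not part of boto3 or Athina. It assumes the object is UTF-8 text and that `s3_client` behaves like a boto3 S3 client, that is, `get_object` returns a dict whose `Body` has a `read()` method.

```python
from io import BytesIO


def fetch_s3_text(s3_client, bucket, key, encoding="utf-8"):
    """Return the contents of an S3 object as decoded text.

    `s3_client` is expected to behave like boto3.client("s3"):
    get_object() returns a dict whose "Body" supports read().
    """
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    return obj["Body"].read().decode(encoding)


class FakeS3Client:
    """Hypothetical in-memory stand-in for a boto3 client (local testing only)."""

    def __init__(self, objects):
        self._objects = objects  # maps (bucket, key) -> bytes

    def get_object(self, Bucket, Key):
        return {"Body": BytesIO(self._objects[(Bucket, Key)])}


# Demo against the in-memory stand-in; no AWS credentials are needed.
fake = FakeS3Client({("my-bucket", "data/sample.json"): b'[{"query": "hi"}]'})
text = fetch_s3_text(fake, "my-bucket", "data/sample.json")
print(text)
```

With a real client, the call would be `fetch_s3_text(s3, BUCKET_NAME, FILE_KEY)`, and the returned text can be passed to `pd.read_json(StringIO(...))` as in the snippet above.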
💡 If your file is in CSV format, replace pd.read_json() with pd.read_csv(StringIO(data)).
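If a bucket holds a mix of formats, the choice of reader can be made from the object key's extension instead of editing the code by hand. This is a minimal sketch under stated assumptions: `load_dataframe` is a hypothetical helper name, the extension is assumed to reflect the real file format, and only `.json`, `.jsonl`, and `.csv` are handled.

```python
import os
from io import StringIO

import pandas as pd


def load_dataframe(text, file_key):
    """Parse decoded S3 text into a DataFrame based on the key's extension."""
    ext = os.path.splitext(file_key)[1].lower()
    if ext == ".json":
        return pd.read_json(StringIO(text))
    if ext == ".jsonl":
        # JSON Lines: one JSON object per line
        return pd.read_json(StringIO(text), lines=True)
    if ext == ".csv":
        return pd.read_csv(StringIO(text))
    raise ValueError(f"Unsupported file extension: {ext!r}")


# Demo with inline text standing in for the decoded S3 object body.
df_csv = load_dataframe("query,response\nhi,hello\n", "data/sample.csv")
df_json = load_dataframe('[{"query": "hi", "response": "hello"}]', "data/sample.json")
print(df_csv.to_dict(orient="records"))
```

Raising `ValueError` for unknown extensions is a design choice here; it keeps an unexpected file type from being silently parsed with the wrong reader.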
To upload the retrieved data into Athina IDE, follow these steps:
Set up the Athina API key
Convert the DataFrame into a format suitable for Athina IDE
Upload the dataset using Dataset.add_rows()
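For the conversion step in the list above, one detail worth checking is missing values. Pandas represents them as `NaN`, which is not valid in standard JSON. The sketch below replaces float `NaN` values with `None` after calling `to_dict(orient="records")`. The helper name `to_json_safe_rows` is hypothetical, and whether Athina accepts null values for a given column is an assumption to verify against your dataset.

```python
import math

import pandas as pd


def to_json_safe_rows(df):
    """Convert a DataFrame to a list of dicts, replacing float NaN with None."""
    rows = df.to_dict(orient="records")
    return [
        {
            col: None if isinstance(val, float) and math.isnan(val) else val
            for col, val in row.items()
        }
        for row in rows
    ]


# Demo with a small DataFrame that contains one missing score.
sample = pd.DataFrame({"query": ["a", "b"], "score": [0.9, float("nan")]})
rows = to_json_safe_rows(sample)
print(rows)
```

The resulting list can be passed as the `rows` argument in place of `df.to_dict(orient="records")`.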
    # Import Athina client
    from athina_client.datasets import Dataset
    from athina_client.keys import AthinaApiKey

    # Set your Athina API Key
    AthinaApiKey.set_key('your-athina-api-key')

    # Upload DataFrame to Athina Dataset
    try:
        Dataset.add_rows(
            dataset_id='your-dataset-id',  # Replace with the correct dataset ID from Athina IDE
            rows=df.to_dict(orient="records")  # Convert DataFrame to a list of dictionaries
        )
        print("Data successfully uploaded to Athina!")
    except Exception as e:
        print(f"Failed to add rows to Athina IDE: {e}")
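For larger files, it may help to send rows in batches so that a failure affects only one batch rather than the whole upload. The following is a sketch, not Athina's documented approach: `chunked` and `upload_in_batches` are hypothetical helpers, and the batch size of 100 is an arbitrary assumption rather than a known Athina limit. The upload call is passed in as a function so the example runs without an API key.

```python
def chunked(rows, batch_size):
    """Yield successive slices of `rows` with at most `batch_size` items each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def upload_in_batches(rows, upload_fn, batch_size=100):
    """Call `upload_fn` once per batch and return the number of rows sent."""
    sent = 0
    for batch in chunked(rows, batch_size):
        upload_fn(batch)
        sent += len(batch)
    return sent


# Demo with a stand-in upload function that only records batch sizes.
batch_sizes = []
total = upload_in_batches(
    [{"id": i} for i in range(250)],
    lambda batch: batch_sizes.append(len(batch)),
)
print(total, batch_sizes)  # prints: 250 [100, 100, 50]
```

With the Athina client from the snippet above, `upload_fn` could be `lambda batch: Dataset.add_rows(dataset_id='your-dataset-id', rows=batch)`.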
After the upload completes, go to the Datasets section in Athina IDE to verify that the data has been uploaded successfully.
By following this guide, you can retrieve data from an S3 bucket and upload it to Athina IDE for analysis, evaluation, and experimentation. This lets you work with datasets stored in Amazon S3 directly in Athina IDE.