10/13/2023

Amazon Athena vs Redshift

In this article, I am going to introduce AWS Athena, a service offered by Amazon that allows users to query data from S3 using standard SQL syntax. AWS is considered a leader in the cloud computing world. Amazon offers more than a hundred services, which provide competitive performance and cost-effective ways to run workloads compared to on-premises architectures. These services range widely across compute, storage, databases, analytics, IoT, security, and a lot more. One of the popular areas among these services is the analytics domain, which allows customers to build architectures that answer key questions behind their business decisions.

This article talks about AWS Athena, a service from the analytics domain of Amazon that focuses on the retrieval of static data stored in S3 buckets using standard SQL statements. AWS Athena is a serverless interactive analytics service that can be readily used to gain insights from data stored on S3. It is a robust tool that helps customers gain those insights quickly, because it is serverless and there is no infrastructure to manage.

Under the hood, Athena uses a distributed SQL engine called Presto to run queries. It also builds on the popular open-source technology Apache Hive for table definitions over structured and semi-structured data. By the official definition of Apache Hive: "The Apache Hive ™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL". You can read more about Apache Hive and Presto in their respective documentation.

Figure 2 – Amazon Athena use case (Source)

The figure above represents a simple data pipeline in which data from multiple sources is fetched and dumped into S3 buckets. This is raw data, which means no transformations have been applied to it yet. At this stage, you can use Amazon Athena to connect to the data in S3 and start analyzing it. This is a very simple process, as you do not need to set up any database or external tools to query the raw data. Once you are done with your analysis and have found the results you wanted, you can use an EMR cluster to run complex analytical data transformations, clean and process the raw data, and then store it back to S3. At this stage, you can again use Amazon Athena to query the processed data for further analysis.

An important point worth mentioning here is that Amazon QuickSight can connect directly to Athena and create stunning visuals of your data residing on S3. Alternatively, you can move your data to Redshift, which is an MPP data warehouse for fast data analysis, and then visualize your data from Redshift using QuickSight.

Amazon Athena is a serverless data query tool, which means it is scalable and cost-effective at the same time. Customers are usually charged on a pay-per-query basis, so the bill depends on the number of queries executed in a given time period and on how much data each query scans. The normal charge for scanning 1 TB of data from S3 is 5 USD. Although this looks like quite a small amount at first glance, when you have multiple queries running on hundreds or thousands of GB of data, the price can add up quickly. In the later part of this article, I am going to talk about some optimization techniques that will save us some costs while using Amazon Athena.

Figure 3 – AWS Athena Pricing Calculator (Source)

You can understand the pricing model of AWS Athena by using the calculator available on the official website. Let us assume that a customer has approximately 100 GB of data stored on S3 in plain CSV files. We have configured AWS Athena to query this data daily and run around 10 queries a day on average to produce the analyses. This comes to approximately 304 queries executed each month. Since the default pricing is per TB of data, we need to convert the volume scanned from GB to TB. So, to query 100 GB of data daily with 10 queries a day on average (assuming every query scans the full 100 GB), the final calculation comes out to approximately 150 USD per month.

When working with cloud services, we need to make sure that the services we use consume the fewest possible resources and yield the best results in a cost-effective manner. There are several measures we can take to optimize our queries within AWS Athena so that we can boost performance as well as keep the cost in check.
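The monthly estimate from the pricing example above can be reproduced with a few lines of arithmetic. The sketch below is only an approximation under stated assumptions: every query scans the full 100 GB, a month has 30.4 days on average (which is what gives roughly 304 queries), and 1 TB is taken as 1024 GB. The function name `athena_monthly_cost` and its parameters are hypothetical, chosen for illustration.

```python
def athena_monthly_cost(data_gb, queries_per_day, days_per_month=30.4,
                        usd_per_tb=5.0, gb_per_tb=1024):
    """Rough Athena cost estimate, assuming each query scans all of data_gb."""
    # Number of queries run in an average month (10 * 30.4 is about 304).
    queries_per_month = queries_per_day * days_per_month
    # Total volume scanned in a month, converted from GB to TB
    # because the price is quoted per TB scanned.
    scanned_tb = data_gb * queries_per_month / gb_per_tb
    return queries_per_month, scanned_tb * usd_per_tb


# The scenario from the article: 100 GB of CSV data, 10 queries per day.
queries, cost = athena_monthly_cost(data_gb=100, queries_per_day=10)
print(f"{queries:.0f} queries/month, about {cost:.2f} USD")
```

With these assumptions the result lands a little under 150 USD per month, in line with the figure quoted from the pricing calculator. Treating 1 TB as 1000 GB instead (`gb_per_tb=1000`) gives a slightly higher number, and any query that scans less than the full dataset lowers it.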