Welcome! Let's practice AWS Glue together.
AWS Glue provides powerful features, including building ETL pipelines and cataloging data according to a user-defined schema.
We can also use it to build a data lake. One of its most important advantages is that Glue is serverless: there are no servers to manage, which can lower cost while getting more done.
In this blog we are going to use AWS Glue to connect to data sources in Amazon S3, and then query that data using Athena with AWS Glue.
meijuan-long-glue
├── athena_results
├── data
│   └── customers_database
│       └── customers_csv
│           └── dataload=20221124
│               └── customer.csv
├── scripts
└── temp-dir
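To make the layout concrete, here is a minimal sketch of how the object key for customer.csv could be assembled. The helper name `partition_key` is my own invention for illustration. The `dataload=20221124` folder follows the Hive-style `name=value` convention, which Glue can pick up as a partition column.

```python
from datetime import date

BUCKET = "meijuan-long-glue"  # bucket name taken from the layout above


def partition_key(database: str, table: str, load_date: date, filename: str) -> str:
    """Build the S3 object key for one file inside a Hive-style partition folder.

    Hypothetical helper: it only mirrors the folder layout shown above.
    """
    partition = f"dataload={load_date:%Y%m%d}"  # e.g. dataload=20221124
    return "/".join(["data", database, table, partition, filename])


key = partition_key("customers_database", "customers_csv", date(2022, 11, 24), "customer.csv")
print(f"s3://{BUCKET}/{key}")
```

Keeping each load in its own `dataload=` folder means a later load only adds a new folder next to the old one.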
Glue needs this IAM role to access data in S3 and other services, so here I gave it AdministratorAccess.
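For Glue to use the role at all, the role also needs a trust relationship that lets the Glue service assume it. The console normally sets this up when you pick Glue as the trusted service, so the sketch below only shows roughly what that trust policy looks like. The function name `glue_trust_policy` is hypothetical.

```python
import json

# ARN of the AWS managed policy attached to the role in this walkthrough.
ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def glue_trust_policy() -> dict:
    """Return a trust policy document that lets the Glue service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


print(json.dumps(glue_trust_policy(), indent=2))
```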
After this long process, we have successfully brought the data in S3 under Glue's management. We defined the schema ourselves, and we can now use Athena to query the data. The schema is stored in Glue, while the data itself stays in S3. This is much like the Hive framework: in Hive, the data is stored in HDFS and the metadata is stored in MySQL tables managed by Hive.
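As a rough sketch of the Athena side, the same kind of query can be started from Python with boto3. I am assuming here that the Glue database and table are named `customers_database` and `customers_csv` after the folders, which may not match the names you chose in the catalog. The helper names `build_athena_request` and `run_query` are hypothetical. Results are written to the athena_results folder of the bucket.

```python
def build_athena_request(sql: str, database: str, output_location: str) -> dict:
    """Assemble the keyword arguments for Athena's start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Database and table names are assumptions based on the folder names.
request = build_athena_request(
    "SELECT * FROM customers_csv LIMIT 10",
    "customers_database",
    "s3://meijuan-long-glue/athena_results/",
)
print(request)


def run_query(request: dict) -> str:
    """Submit the query and return its execution id (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the sketch above runs without it

    athena = boto3.client("athena")
    return athena.start_query_execution(**request)["QueryExecutionId"]
```

Calling `run_query(request)` only starts the query. Athena runs it asynchronously and saves the result file under the output location.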