Saturday, April 27, 2019

AWS Glue python ApplyMapping / apply_mapping example


The ApplyMapping class renames fields and converts their types in your data. To apply a mapping, you need two things:
  1. A DynamicFrame (Glue's counterpart to a Spark DataFrame)
  2. The mapping list
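
To make the shape of the mapping list concrete, here is a minimal sketch in plain Python. Each mapping is a tuple of (source field, source type, target field, target type). The helper `apply_mapping_rows`, the `CASTS` table and the sample field names are hypothetical stand-ins that only imitate the rename-and-cast behaviour on a list of dicts, so the example runs without a Glue environment. The real Glue call is shown in the closing comment.

```python
# Each mapping tuple: (source field, source type, target field, target type)
mappings = [
    ("id", "string", "customer_id", "long"),
    ("name", "string", "full_name", "string"),
    ("balance", "string", "balance", "double"),
]

# Casts for the few type names used in this sketch (assumption: not Glue's full type list)
CASTS = {"string": str, "int": int, "long": int, "double": float}


def apply_mapping_rows(rows, mappings):
    """Rename and cast the fields of each dict in rows; unmapped fields are dropped."""
    result = []
    for row in rows:
        out = {}
        for source, _source_type, target, target_type in mappings:
            if source in row and row[source] is not None:
                out[target] = CASTS[target_type](row[source])
        result.append(out)
    return result


rows = [{"id": "42", "name": "Ada", "balance": "10.5", "unused": "x"}]
converted = apply_mapping_rows(rows, mappings)
print(converted)

# On Glue itself, the same mapping list would be passed roughly like this:
#   from awsglue.transforms import ApplyMapping
#   mapped = ApplyMapping.apply(frame=dynamic_frame, mappings=mappings)
```

Note that the `unused` field disappears from the result: only the fields named in the mapping list are carried over.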

Friday, April 5, 2019

The Glue code that runs on AWS Glue and on Dev Endpoint


When you develop code for Glue with the Dev Endpoint, you soon get annoyed by the fact that the code differs between a Glue job and the Dev Endpoint:
  • glueContext is created in a different manner
  • there's no concept of a 'job' on the Dev Endpoint, and therefore
  • there are no arguments for the job, either
So Mike from The MIS Theorist asked if there was a simpler way. And sure there is!
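
One possible way to bridge the differences listed above is sketched below. It is not necessarily the approach from the full post. The helper names `get_job_args` and `create_glue_context`, the `target` argument and its default are hypothetical. The sketch assumes job arguments arrive as `--NAME value` pairs and falls back to defaults when they are missing, as they would be on a Dev Endpoint.

```python
import sys


def get_job_args(names, defaults, argv=None):
    """Read '--NAME value' pairs from argv, falling back to defaults when absent."""
    argv = sys.argv if argv is None else argv
    args = dict(defaults)
    for name in names:
        flag = "--" + name
        if flag in argv:
            idx = argv.index(flag)
            if idx + 1 < len(argv):
                args[name] = argv[idx + 1]
    return args


def create_glue_context(args):
    """Create a GlueContext, and a Job only when a JOB_NAME was passed."""
    # Imported lazily so the argument handling above works without awsglue installed
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())
    job = None
    if "JOB_NAME" in args:
        # Running as a real Glue job: initialise the job as usual
        job = Job(glue_context)
        job.init(args["JOB_NAME"], args)
    return glue_context, job


# Simulated command lines: a Glue job run and a Dev Endpoint session
as_job = get_job_args(["JOB_NAME", "target"], {"target": "dev"},
                      ["script.py", "--JOB_NAME", "nightly", "--target", "prod"])
on_endpoint = get_job_args(["JOB_NAME", "target"], {"target": "dev"}, ["script.py"])
print(as_job)
print(on_endpoint)
```

With this split, the rest of the script can use `args` and the returned context without caring where it runs, and can call `job.commit()` only when `job` is not `None`.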

Friday, March 22, 2019

AWS Glue, Dev Endpoint and Zeppelin Notebook



AWS Glue is quite a powerful tool. What I like about it is that it's managed: you don't need to take care of infrastructure yourself, but instead AWS hosts it for you. You can schedule scripts to run in the morning and your data will be in its right place by the time you get to work.

The downside is that developing scripts for AWS Glue is cumbersome, a real pain in the butt. I first tried to code the scripts through the console, but you end up waiting a long time only to realize you had a syntax error in your code.