Amazon has announced a new tool called ‘Amazon Glue’ at its re:Invent conference, which is currently taking place in Las Vegas. Glue is aimed at helping developers process data regardless of where it is stored, whether in the cloud or on-premises.
ETL (extract, transform, and load) is considered by developers to be the toughest part of analytics. Amazon CTO Werner Vogels stated that developers spend around 80% of their time collecting data and just 20% getting information out of it. Before data can be used for analytics, it has to be in a form that the analytics tool can consume.
Amazon aims to change the game with this new tool, which is meant to let developers spend more time getting information from the data rather than collecting it. Analytics tools usually accept only certain types of data, and transforming the source data into the required form is time-consuming and hard. This is where Glue comes in: it does that work so that developers can focus on the information.
To use the tool, developers point it to their data sources, which can be stored either in the cloud or on-premises. The tool then builds a data catalog, with access controls that the user can set. Next, it transforms the data into the format required by the analytics package. Finally, jobs can be run against the data sets, and the data sets are updated whenever the source data changes.
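The four steps above can be illustrated with a minimal sketch in plain Python. This is not Glue's actual API, which the announcement does not describe; the names SOURCES, read_rows, build_catalog, transform and run_job, as well as the sample data, are hypothetical and exist only to show the general shape of an ETL flow. Access control is left out for brevity.

```python
import csv
import io
import json

# Step 1 (assumption): two in-memory "sources" stand in for data that a real
# tool would read from cloud or on-premises storage.
SOURCES = {
    "orders_csv": ("csv", "id,amount\n1,9.50\n2,12.00\n"),
    "orders_json": ("json", '[{"id": 3, "amount": "7.25"}]'),
}


def read_rows(fmt, text):
    """Parse raw text into a list of dict rows according to its format."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        return json.loads(text)
    raise ValueError(f"unsupported format: {fmt}")


def build_catalog(sources):
    """Step 2: record the format and column names of every source."""
    catalog = {}
    for name, (fmt, text) in sources.items():
        rows = read_rows(fmt, text)
        columns = sorted(rows[0].keys()) if rows else []
        catalog[name] = {"format": fmt, "columns": columns}
    return catalog


def transform(sources):
    """Step 3: normalise every row to one target schema (int id, float amount)."""
    normalised = []
    for fmt, text in sources.values():
        for row in read_rows(fmt, text):
            normalised.append({"id": int(row["id"]), "amount": float(row["amount"])})
    return normalised


def run_job(sources):
    """Step 4: a job over the transformed data; rerun it when a source changes."""
    return sum(row["amount"] for row in transform(sources))


catalog = build_catalog(SOURCES)
total = run_job(SOURCES)
```

Because run_job always starts from the raw sources, rerunning it after a source changes picks up the new data, which mirrors the last step described above in the simplest possible way.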