Amazon S3 Access Points
Storage is a fundamental component of cloud computing. AWS released a new feature, Amazon S3 Access Points, to simplify managing data access at scale for applications that use shared data sets on S3. It makes it easier to customize access permission rules for each of your applications.
This is one of my favorite releases. In the past, it was sometimes complicated to keep track of the access policies assigned to specific applications. This release gives you the ability to assign different access permissions to each application, which scales much better.
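As a hedged sketch of what such a per-application access point policy could look like (the account ID, access point name, role name, and region below are all made-up examples, not values from the release):

```python
import json

# Hypothetical example: one access point per application, each with its own
# policy. All identifiers here are illustrative assumptions.
def access_point_policy(account_id, access_point_name, app_role):
    """Build an access point policy that grants one application role
    read access to objects reached through this access point only."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{app_role}"},
            "Action": "s3:GetObject",
            "Resource": (
                f"arn:aws:s3:us-east-1:{account_id}:"
                f"accesspoint/{access_point_name}/object/*"
            ),
        }],
    }

policy = access_point_policy("111122223333", "finance-app-ap", "FinanceAppRole")
print(json.dumps(policy, indent=2))
```

Each application then gets its own access point and its own policy, instead of one ever-growing bucket policy.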
Amazon SageMaker
One of the biggest announcements covered new releases for Amazon SageMaker, Amazon's machine learning service:
- Amazon SageMaker Studio: a fully integrated development environment (IDE) for machine learning
- Amazon SageMaker Notebooks: a managed machine learning compute instance running the Jupyter Notebook App
- Amazon SageMaker Experiments: lets you organize, track, compare and evaluate machine learning experiments and model versions
- Amazon SageMaker Debugger: lets you debug and analyze complex training issues, and receive alerts
- Amazon SageMaker Model Monitor: detects quality deviations in deployed models and alerts you
- Amazon SageMaker Autopilot: helps you to build models automatically with full control and visibility
EC2 Image Builder
A very nice release that will save a lot of time during the creation of OS images. This service makes it faster and easier to build and maintain secure images for Windows Server and Amazon Linux 2, using automated build pipelines. You start with a source image and define a recipe of specific activities to be applied to the image. Once the build has completed, the AMI is ready to launch from the EC2 console or to use in CloudFormation templates.
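As a rough sketch of the recipe idea, the Python dicts below mirror the shape of an Image Builder build component and recipe. The names, commands, and fields are illustrative assumptions, not the exact API schema:

```python
# Hypothetical build component: a list of phases, each with steps that run
# commands on the source image. Names and commands are made up.
component = {
    "name": "HardenAmazonLinux2",
    "schemaVersion": 1.0,
    "phases": [{
        "name": "build",
        "steps": [{
            "name": "ApplyUpdates",
            "action": "ExecuteBash",
            "inputs": {"commands": [
                "yum update -y",
                "yum install -y amazon-cloudwatch-agent",
            ]},
        }],
    }],
}

# A recipe then pairs a parent (source) image with an ordered list of
# components; the pipeline builds and tests the resulting AMI.
recipe = {
    "name": "hardened-al2",
    "parentImage": "Amazon Linux 2 x86",  # the source image you start from
    "components": [component["name"]],
}
```

The point is that the image definition becomes data you can keep in version control, rather than a manually maintained golden image.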
The announcement of this release brought a big smile to my face. At Itility we manage multiple customers, each with a specific application landscape. Traditionally, we had to use a third-party tool like Packer to standardize images for our different customers. The newly released service makes this a lot easier. Now, we can create a standardized image directly in the AWS portal, without worrying about the implementation of a third-party tool and test cases.
AWS Outposts
AWS Outposts was already discussed during last year's re:Invent. As of now, the service is generally available for ordering and installing Outposts racks in your local data centers. Outposts offers a comprehensive, single-vendor compute and storage solution designed for customers who need local processing and very low latency. Outposts is available in the following regions:
- North America
- Asia Pacific
I was a bit sad that this service was not immediately available at last year's re:Invent. Since then, we have been looking for a low-latency platform near the customer's branch office, without having to worry about the underlying components. AWS Outposts would have saved us a lot of hassle; instead, we had to implement a more complicated solution.
Amazon CodeGuru
A new machine learning service to automate code reviews and receive intelligent recommendations. Here is a summary of what Amazon CodeGuru can do to help you:
- Automatically detect code issues during code reviews before they reach production
- Improve overall application performance and quality
- Link to relevant documentation
A major frustration for every DevOps engineer is having to rewrite code over and over until the pipeline stops complaining about syntax errors. This service helps by linking to the relevant documentation, so you understand why your pipeline fails. It can also automatically detect issues that a colleague missed during code review.
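To illustrate the kind of issue an automated reviewer catches, here is a small, hypothetical Python example of a resource leak and its fix (an illustration of the category of finding, not an actual CodeGuru recommendation):

```python
# A file handle that leaks if an exception is raised before close() runs --
# the classic issue a human reviewer can easily miss.
def read_config_leaky(path):
    f = open(path)
    data = f.read()   # an exception here would leak the handle
    f.close()
    return data

# The typical recommended fix: a context manager guarantees the handle is
# closed even when an exception occurs.
def read_config(path):
    with open(path) as f:
        return f.read()
```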
What’s to come?
The last days were very informative, with many new services and innovative technologies. It is also nice to see that AWS is developing its own chips, named 'Graviton2', which enable great price/performance for cloud workloads.
By Bowie Bartels
Re:Invent 2019 was an amazing experience with more than 60,000 people visiting from all over the world. I attended sessions related to DevOps, Hybrid Cloud, IoT, Data lakes, Machine Learning and Security. In this post, I share some of the lessons learned.
We are currently running different IoT projects, often based on Azure reference architecture. With this in mind, I visited several sessions to get a better understanding of AWS IoT capabilities.
You can probably guess that AWS is continuously innovating. Just like Azure, AWS offers a reference architecture and several managed services that can be used to manage a large fleet of devices or kickstart an IoT project.
Success factors for IoT
I first attended a session that explained the success factors that can impact the outcome of your IoT project. While some of the factors seemed to be obvious, others were not. For example, the number of personas that can be involved to make an IoT project successful is something to pay close attention to. Furthermore, I extracted the following factors from the session that help to make an IoT project succeed:
- Have a long term strategy and a clear outcome
- Design to scale (using managed services)
- Maximize success with personas (identify and engage)
- Understand your overall system architecture and interfaces (e.g. APIs, dashboards)
- Understand how co-workers are going to use the data you will produce
(Involved personas in an IoT project lifecycle)
IoT smart product solutions
Next, I got a grasp of Amazon's reference architecture. AWS offers a smart product solution deployment, which is basically a solution to help developers set up their IoT projects in the cloud. It is similar to the Azure IoT solution accelerators, and it comes with an SDK for the device side. The smart product solution can be completely customized to your requirements and comes with a CI/CD pipeline by default. I was not immediately able to identify all similarities or differences with Azure, but that is something we will investigate further during a deep dive at Itility.
(Smart product solutions architecture)
iRobot customer case
There was a nice case study about the IoT journey of iRobot, a company that sells connected vacuum cleaners. They run fully on AWS managed (PaaS) IoT and analytics services and, because of that, their operations team can focus completely on new feature releases instead of traditional operations. The only thing they had to do was scale some units for their streaming processes! Even during busy times (like Christmas), they checked the logs just because it was fun to see the increased number of requests. To me, this shows how the cloud can work to your convenience and help you scale at any time. But you must design it right.
(number of device registrations during Christmas)
Overall, I think AWS is doing a great job from a technical perspective. They really target developers, and this shows in their IoT services. When I compare this with Azure, I think Microsoft spends a bit more time on productization and serves every type of user. But that doesn't mean that one or the other is better. We will find out in one of the Itility winter school sessions how this works when we get our hands on the buttons.
Data lakes and DevOps
At Itility we are experienced in data engineering and data science. We run several data factories for different customers. For this reason, I decided to also look into building a data lake on AWS. I attended a session that covered some best practices and another on how to run all of this in DevOps mode.
Building the data lake
This session covered some best practices for when you start implementing a data lake on AWS:
- Just like GCP (Cloud Storage) and Azure (Storage Accounts), if you run a data lake on AWS, S3 is your core service for storing structured and unstructured data
- S3 has to become your single source of truth; you innovate around it
- When designing a data lake, one has to think in functional building blocks
- Use (de-coupled) managed services, it enables you to optimize independently
- Plan for massive growth by using security, lifecycle and classification policies
- Use a pipeline workflow (ingest, stage, serve, etc.); each stage can act as a separate governance domain
(Lift and shift vs Managed Services for data lakes)
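To make the lifecycle-policy point above concrete, here is a hedged sketch of an S3 lifecycle configuration, in the dict shape that boto3's `put_bucket_lifecycle_configuration` accepts. The prefixes, day counts, and rule IDs are made-up examples:

```python
# Hypothetical lifecycle rules for a data lake bucket: staged data moves to
# cheaper storage classes as it ages, and raw ingest data eventually expires.
lifecycle = {
    "Rules": [
        {
            "ID": "tier-staged-data",
            "Filter": {"Prefix": "stage/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        },
        {
            "ID": "expire-raw-ingest",
            "Filter": {"Prefix": "ingest/"},
            "Status": "Enabled",
            "Expiration": {"Days": 365},
        },
    ]
}
```

Planning for massive growth then becomes a matter of reviewing a few rules, rather than cleaning up petabytes by hand later.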
Running the data lake
Once a data lake is implemented, one also has to run it. There are a number of reasons to apply DevOps principles to a data lake. Below is a list of typical examples that would be versioned in Git and deployed through pipelines:
- Amazon Redshift: Schema changes
- AWS Glue: Update crawler settings
- Amazon EMR: Spark script changes
- Amazon Athena: SQL view updates
- Amazon Kinesis: Producer/consumer code
- Amazon S3: Partition key changes
(CI/CD on a data lake)
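As a small example of the last item in the list, partition-key logic can live in Git as code. The helper below is hypothetical; the zone names and Hive-style layout are assumptions, not a prescribed AWS scheme:

```python
from datetime import date

# Hypothetical helper kept in the repository: changing the partition scheme
# becomes a reviewed code change deployed through the pipeline, rather than
# an ad-hoc manual edit.
def partition_key(dataset: str, day: date, zone: str = "stage") -> str:
    """Build a Hive-style S3 key prefix for one dataset and day."""
    return (
        f"{zone}/{dataset}/"
        f"year={day.year}/month={day.month:02d}/day={day.day:02d}/"
    )

print(partition_key("clicks", date(2019, 12, 5)))
# stage/clicks/year=2019/month=12/day=05/
```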
A few best practices to use in your CI/CD pipeline:
- Use a structure in your repository that represents service components in the data lake
- Include test scripts
- Consider immutable deployment if possible
- Use branching for parallel feature development
- Segregate accounts
If I compare this with Azure or GCP I immediately notice that AWS includes the CI/CD way of working in all of their resource types. However, I think the biggest challenge of running your data lake in CI/CD mode is moving data between test and production. Definitely something to look into.
Machine Learning (workshops)
As a cloud consultant, I nowadays come into contact with data science topics more often than before. Understanding this workflow is becoming very important to be more effective in helping our customers and to better understand new business opportunities. Therefore, I joined workshops that covered AWS Lake Formation, ML transforms and SageMaker; for the latter, Amazon just released multiple major features.
I went through various steps to provision a data lake, catalog data in AWS Glue Data Catalog and use AWS Lake Formation ML Transforms to cleanse data in a data lake. Also, I got to work with the latest features related to creating, deploying, monitoring, and debugging machine learning models in SageMaker. These managed services help in speeding up the Machine Learning development process. While training the models, one can save and monitor the internal state in S3 for analysis and look for unwanted conditions.
AWS is doing a great job by jumping into data engineering and data science workflows. They have rich integration and SDKs and everything can be controlled by using APIs. Both workshops provided very good learnings that help to determine which platform offers the best fit for purpose.
(ML transform implementation with AWS Glue)
(Amazon SageMaker Debugger)
AWS RoboMaker
To me, this is the next big thing in going digital! I believe AI and robotics are the future of many business cases. A small booth at the Expo caught my attention: an end-to-end solution for warehousing, with robotized material movement and delivery using different AWS products and services. It was built with AWS RoboMaker, an open-source robotics software framework with software libraries. AWS RoboMaker helps in building robotic applications and provides a development environment. This development environment is based on AWS Cloud9 (a web-based IDE) and comes with a nice simulation environment, which helps you understand how a robotic application behaves without the need for large investments.
(AWS RoboMaker for Material Movement and Delivery)
DevOps @ AWS
One thing we can all learn from is how the big cloud service providers run resilient systems which, in all cases, are based on DevOps principles. Therefore, I visited a session where senior principal engineer Marc Brooker explained the cultural and technical approach for how AWS builds, monitors, and operates services that handle the unexpected.
(DevOps feedback loop)
Learnings - cultural:
- Create a culture to learn from failure (see picture)
- Allow teams to fix things that are not broken
- Assure you have leaders that understand the details – this is required for greater success
- Teams run what they build, and operators are never blamed in a postmortem (blame is not a solution)
- Principal engineers are builders (avoid non-practitioner architects)
- Everyone, including the management layer, has an operational role to understand every layer of the business
- Create small teams with strong ownership
- Set up a friendly learning environment that matches the production environment. Operators cannot know everything of systems they did not create
- Operators should be set up for success
Learnings - technical:
To be stable at scale for distributed systems, one should follow some basic design principles:
- Limit queue sizes (large queues are a bad idea)
- Limit retries (clients cannot retry forever)
- Implement a back-off mechanism for all services (to avoid a layer cake of retries)
- Use randomness (jitter) to add resilience
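The principles above can be sketched in a few lines of Python. The retry cap, timings, and the transient-error type here are illustrative assumptions, not AWS's implementation:

```python
import random
import time

# Sketch of the design principles: capped retries, exponential back-off,
# and "full jitter" (a random sleep up to the exponential bound).
def call_with_backoff(fn, max_retries=5, base=0.1, cap=5.0):
    for attempt in range(max_retries):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries - 1:
                raise                      # retries are limited, never infinite
            # jitter spreads retries out, so clients do not retry in lockstep
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Without the jitter, every client that failed at the same moment would retry at the same moment too, producing exactly the layered retry storm the session warned about.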
Overall an interesting session, especially the cultural side. I see that we follow some of these principles at Itility already. However, making this more explicit helps to build a stronger focus on these cultural aspects, so it is definitely something to reflect on.
Security fundamentals
We see more and more security-related questions being raised by our customers. With Itility Cloud Control (ICC), we already designed a framework to make sure we deploy secure and controlled foundations. However, we also encounter enterprise customers who keep their services on-premises to protect their IP, choosing a custom solution over PaaS or SaaS. I attended two sessions that covered security topics, and it all came down to the following fundamentals:
Control your cloud infrastructure: AWS IAM
- Protect by utilizing authentication and authorization, with least privilege, for both human and application API callers
- Write advanced policies to define the actions that are (not) allowed for a specified resource (e.g. using tags or dynamic properties in conditions)
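As an illustration of such an advanced policy, here is a hedged sketch of an IAM policy document that uses a resource-tag condition; the tag key, tag value, and actions are made-up examples:

```python
# Hypothetical least-privilege policy: the holder may start and stop only
# EC2 instances carrying a specific team tag. All values are assumptions.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:StartInstances", "ec2:StopInstances"],
        "Resource": "arn:aws:ec2:*:*:instance/*",
        "Condition": {
            "StringEquals": {"aws:ResourceTag/team": "data-platform"}
        },
    }],
}
```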
Control your data: AWS KMS
- Use this simple key management service to encrypt data in AWS services
- Access to encryption keys is also managed using the aforementioned IAM capabilities
Control your network: AWS VPC
- Of all the advanced topics in a VPC, make sure you at least understand security groups and apply them to your workloads
- Get your routing right to control traffic inside your VPC
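As a minimal sketch of the security-group point, the rule below (in the dict shape that boto3's `authorize_security_group_ingress` accepts) allows HTTPS only from an example VPC CIDR; the CIDR, port, and description are assumptions:

```python
# Hypothetical ingress rule: HTTPS traffic is accepted only from addresses
# inside the (made-up) VPC CIDR range.
ingress_rule = {
    "IpProtocol": "tcp",
    "FromPort": 443,
    "ToPort": 443,
    "IpRanges": [{
        "CidrIp": "10.0.0.0/16",
        "Description": "HTTPS from inside the VPC",
    }],
}
```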
Manage your accounts: AWS Organizations
- Organize accounts in organizational units to enforce governance and security controls
Without even looking at other advanced security features, these controls lay a foundation that lets you know you are deploying in a secure way.
With all this, what are we doing next?
The event gave me plenty of food for thought and learnings that we will follow up on back at Itility. With Azure being great for creating managed PaaS or SaaS solutions, I see AWS closing the gap at a fast pace. I loved the focus on IoT and machine learning topics, especially with the major releases in SageMaker. With all of the topics in this blog in mind, I already identified a few personal follow-ups:
- Compare Azure IoT reference architecture with AWS IoT Product Solutions
- Apply IoT success factors to our IoT projects
- Elaborate on the data engineering and data science workflows in AWS and find out how they differ from Azure or GCP
- Deep dive into AWS RoboMaker during our Itility winter school to get ready for the next end-to-end digital solution
- Reflect on the AWS cultural DevOps principles in our way of working and identify possible gaps