Rating: 8.0/10.
An overview of the AWS services used by machine learning engineers to preprocess training data, train models, and deploy them to production. The book covers the most relevant AWS services well and presents multiple options for doing each task, such as code vs no-code tools, and server vs serverless deployment options.
Most of the book gives step-by-step instructions for setting up and running each AWS service, so it can be skimmed quickly if you’re not trying to actually run them, while still giving a sense of what’s involved in using each service. Although all of the instructions are toy examples, the reader still gets a good sense of the design tradeoffs between the different options.
Notably, the book covers only ways to use AWS to train and deploy your own models, and does not cover the AWS AI services (eg: to analyze text / audio / images), which are assumed to be out of scope for the MLE role.
Ch1. Some EDA and auto ML using a tabular dataset. Cloud9 is a cloud IDE running on AWS; AutoGluon is a library for auto ML. Sagemaker Canvas is a no-code tool for training models (its data is stored in a Sagemaker Domain); Sagemaker Autopilot lets you do auto ML in Sagemaker.
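To give a sense of the AutoGluon part, here is a minimal sketch (not the book's exact example); the CSV file and the "target" label column are hypothetical placeholders.

```python
from autogluon.tabular import TabularDataset, TabularPredictor

# Load a tabular dataset (path and label column name are hypothetical placeholders).
train_data = TabularDataset("train.csv")

# Fit a set of candidate models and ensemble them automatically.
predictor = TabularPredictor(label="target").fit(train_data)

# Compare the trained models.
print(predictor.leaderboard(train_data))
```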
Ch2. Start an EC2 instance via the web UI using a deep learning AMI (an image preloaded with deep learning packages), train a model using Keras, and shut the instance down.
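The kind of training script you would run on such an instance looks roughly like this (a toy stand-in, not the book's exact model):

```python
import numpy as np
from tensorflow import keras

# Toy data standing in for the chapter's dataset.
x = np.random.rand(1000, 20).astype("float32")
y = (x.sum(axis=1) > 10).astype("float32")

# Small feed-forward binary classifier.
model = keras.Sequential([
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=5, batch_size=32)
model.save("model.h5")
```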
Ch3. Serverless ML model deployment: package a model using Docker and push it to ECR. Then create a Lambda function from the ECR container, and an API Gateway endpoint to trigger the Lambda function over HTTP. This lets you pay only for what you use instead of paying for idle containers.
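The handler inside such a container follows the standard Lambda + API Gateway shape; a hedged sketch (the model file name, JSON schema, and use of joblib are assumptions, not the book's code):

```python
# lambda_app.py: handler for a model-serving Lambda packaged as a container image
# (eg: built on the public.ecr.aws/lambda/python base image).
import json
import joblib

# Load the model once per container so warm invocations skip the load.
model = joblib.load("model.joblib")

def handler(event, context):
    # With API Gateway proxy integration, the request body arrives as a JSON string.
    payload = json.loads(event.get("body") or "{}")
    features = payload["features"]
    prediction = model.predict([features]).tolist()
    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": prediction}),
    }
```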
Ch4. Serverless data processing is good for paying only for the data you query / process instead of provisioning resources that sit idle. Walks through configuring IAM and a VPC so that each service gets only the access it needs. Load CSV data from S3 into AWS Redshift (which stores the data in its own columnar storage), query it with SQL, and unload it back to S3. Then use AWS Glue to transform the data into a data lake format, and query it with AWS Athena (which also uses a SQL-like query language, but operates on data directly in S3).
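As an illustration of the Athena side, a query over catalogued S3 data can be kicked off with boto3 (the database, table, and output bucket names are hypothetical):

```python
import boto3

athena = boto3.client("athena")

# Run SQL against a table defined in the Glue Data Catalog; results land in S3.
response = athena.start_query_execution(
    QueryString="SELECT category, COUNT(*) AS n FROM sales GROUP BY category",
    QueryExecutionContext={"Database": "my_datalake_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print(response["QueryExecutionId"])  # poll get_query_execution with this id for status
```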
Ch5. Data cleaning and wrangling with AWS Glue DataBrew, where you upload your data (eg: in parquet format), do some transformations in a no-code UI, and export to S3. Another tool that does something similar is Sagemaker Data Wrangler, which is more integrated with the other Sagemaker tools and where you write code in notebooks to do the data transformations.
Ch6. Sagemaker ecosystem for model training: Sagemaker Studio is a managed Jupyter notebook environment in the cloud. The Sagemaker Python SDK can programmatically control Sagemaker instances to train models using the Estimator class: you point it to your data on S3 and it automatically starts a training run and shuts down the instance when training is done.
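The Estimator pattern looks roughly like this (the image URI, IAM role, and S3 paths are placeholders):

```python
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()

# Image URI, IAM role, and S3 paths are hypothetical placeholders.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="<execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/model-artifacts/",
    sagemaker_session=session,
)

# Launches a training instance, runs the job, uploads the model artifact to S3,
# and shuts the instance down when training finishes.
estimator.fit({"train": "s3://my-bucket/train/"})
```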
Ch7. Model deployment with Sagemaker can be done independently of training: you can deploy any model by providing a Python script with specified entry points for loading the model, processing the input, making the prediction, and processing the output; pass the script to the Sagemaker Python SDK to turn it into an inference endpoint. Endpoints can be real-time (always running), serverless (requires a cold start), or asynchronous (good for large jobs; takes input from and writes output to S3), all using the same deployment method.
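For framework containers (eg: scikit-learn or PyTorch) the entry points are conventionally named model_fn / input_fn / predict_fn / output_fn; a hedged sketch of such a script (the file names and JSON format are assumptions):

```python
# inference.py: entry points picked up by the Sagemaker framework serving containers.
import json
import os
import joblib

def model_fn(model_dir):
    # Load the model artifact that Sagemaker extracts into model_dir.
    return joblib.load(os.path.join(model_dir, "model.joblib"))

def input_fn(request_body, content_type):
    # Deserialize the request into model features.
    return json.loads(request_body)["features"]

def predict_fn(features, model):
    # Run the actual prediction.
    return model.predict([features])

def output_fn(prediction, accept):
    # Serialize the prediction for the response.
    return json.dumps({"prediction": prediction.tolist()})
```

The same script can then back a real-time, serverless, or asynchronous endpoint by passing the corresponding config (eg: ServerlessInferenceConfig or AsyncInferenceConfig) to the model's deploy() call.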
Ch8. Deploy a model by first pushing it to the Sagemaker Model Registry (with metadata on the expected format of the input), then attaching it to an endpoint. With Sagemaker Model Monitor you first capture baseline statistics for model inputs / outputs from a reference file, attach a data capture config to upload some percentage of production traffic to S3, then set up a periodic job that checks whether the statistical assumptions are violated.
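A hedged sketch of the Model Monitor pieces using the Sagemaker Python SDK (the role and bucket names are placeholders):

```python
from sagemaker.model_monitor import DataCaptureConfig, DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

# Capture 20% of endpoint traffic to S3; pass this to model.deploy(...)
# via the data_capture_config argument.
capture_config = DataCaptureConfig(
    enable_capture=True,
    sampling_percentage=20,
    destination_s3_uri="s3://my-bucket/data-capture/",
)

# Compute baseline statistics and constraints from a reference dataset.
monitor = DefaultModelMonitor(
    role="<execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)
monitor.suggest_baseline(
    baseline_dataset="s3://my-bucket/baseline/train.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/baseline-results/",
)
# monitor.create_monitoring_schedule(...) then runs the periodic violation checks.
```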
Ch9. Ways to secure your AWS resources using IAM permissions when you need users to run arbitrary code in notebooks or upload arbitrary models. Also a variety of tools for model interpretability, logging, privacy, etc.
Ch10. Set up a managed Kubernetes cluster using Amazon EKS, configuring the cluster with YAML files, then use Kubeflow to run an ML pipeline on the cluster.
Ch11. Use Sagemaker Pipelines to create an execution graph consisting of several steps (process data, train a model, and conditionally register and deploy the model to an endpoint), all defined in Python code.
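A skeleton of what that looks like with the Sagemaker Python SDK (the role, image, script, and condition are all placeholders, and the register / deploy steps are omitted for brevity):

```python
from sagemaker.estimator import Estimator
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.conditions import ConditionGreaterThanOrEqualTo
from sagemaker.workflow.parameters import ParameterFloat
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

role = "<execution-role-arn>"  # hypothetical

# Step 1: preprocess the raw data with a script (script path is hypothetical).
processor = SKLearnProcessor(framework_version="0.23-1", role=role,
                             instance_type="ml.m5.xlarge", instance_count=1)
process_step = ProcessingStep(name="ProcessData", processor=processor,
                              code="preprocess.py")

# Step 2: train a model.
estimator = Estimator(image_uri="<training-image-uri>", role=role,
                      instance_count=1, instance_type="ml.m5.xlarge",
                      output_path="s3://my-bucket/artifacts/")
train_step = TrainingStep(name="TrainModel", estimator=estimator)

# Step 3: proceed only if a quality threshold is met (compared against a pipeline
# parameter here to keep the sketch short; register / deploy steps would go in if_steps).
accuracy = ParameterFloat(name="ModelAccuracy", default_value=0.0)
cond_step = ConditionStep(
    name="CheckAccuracy",
    conditions=[ConditionGreaterThanOrEqualTo(left=accuracy, right=0.8)],
    if_steps=[], else_steps=[],
)

pipeline = Pipeline(name="demo-pipeline", parameters=[accuracy],
                    steps=[process_step, train_step, cond_step])
# pipeline.upsert(role_arn=role); pipeline.start()
```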