Amazon MLA-C01 Exam Syllabus

Start Free MLA-C01 Exam Practice After Reviewing the Topics

Before starting your MLA-C01 exam preparation, review the complete Amazon AWS Certified Machine Learning Engineer - Associate exam syllabus and go through the exam objectives listed below. Once you understand the exam structure and objectives, practice with our free MLA-C01 questions. We also provide a premium MLA-C01 practice test, updated to the latest exam objectives, to help you assess your readiness for the actual exam.

Amazon MLA-C01 Exam Objectives

Domain 1: Data Preparation for Machine Learning (ML)

Task Statement 1.1: Ingest and store data.
Knowledge of:
Data formats and ingestion mechanisms (for example, validated and
non-validated formats, Apache Parquet, JSON, CSV, Apache ORC, Apache
Avro, RecordIO)
How to use the core AWS data sources (for example, Amazon S3, Amazon
Elastic File System [Amazon EFS], Amazon FSx for NetApp ONTAP)
How to use AWS streaming data sources to ingest data (for example,
Amazon Kinesis, Apache Flink, Apache Kafka)
AWS storage options, including use cases and tradeoffs


Task Statement 1.2: Transform data and perform feature engineering.
Knowledge of:
Data cleaning and transformation techniques (for example, detecting and
treating outliers, imputing missing data, combining, deduplication)
Feature engineering techniques (for example, data scaling and
standardization, feature splitting, binning, log transformation,
normalization)
Encoding techniques (for example, one-hot encoding, binary encoding, label
encoding, tokenization)
Tools to explore, visualize, or transform data and features (for example,
SageMaker Data Wrangler, AWS Glue, AWS Glue DataBrew)
Services that transform streaming data (for example, AWS Lambda, Spark)
Data annotation and labeling services that create high-quality labeled
datasets
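
Several of the techniques named under Task Statement 1.2 can be illustrated in a few lines. The following is a minimal sketch in plain Python, not a SageMaker Data Wrangler or AWS Glue recipe; all function names are hypothetical and chosen only for illustration, and the helpers assume non-empty, non-constant numeric input.

```python
import math


def standardize(values):
    """Z-score standardization: subtract the mean, divide by the population std."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Apply log(1 + v), which compresses large values and is defined at v = 0."""
    return [math.log1p(v) for v in values]


def bin_values(values, edges):
    """Binning: return, for each value, how many bin edges it meets or exceeds."""
    return [sum(v >= e for e in edges) for v in values]


def label_encode(labels):
    """Label encoding: map each category to an integer (sorted order)."""
    mapping = {c: i for i, c in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels]


def one_hot(labels):
    """One-hot encoding: one 0/1 column per category (sorted order)."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]
```

For example, `one_hot(["red", "blue", "red"])` yields one column for `blue` and one for `red`, whereas `label_encode` collapses the same input to a single integer column. Label encoding implies an ordering that one-hot encoding avoids, at the cost of more columns.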
 
Task Statement 1.3: Ensure data integrity and prepare data for modeling.
Knowledge of:
Pre-training bias metrics for numeric, text, and image data (for example,
class imbalance [CI], difference in proportions of labels [DPL])
Strategies to address CI in numeric, text, and image datasets (for example,
synthetic data generation, resampling)
Techniques to encrypt data
Data classification, anonymization, and masking
Implications of compliance requirements (for example, personally
identifiable information [PII], protected health information [PHI], data
residency)
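
The two pre-training bias metrics named above can be computed directly from counts. The sketch below assumes the commonly documented definitions: CI = (n_a - n_d) / (n_a + n_d) for two facets a and d, and DPL = q_a - q_d, where q is the proportion of positive labels in each facet. The function names and the simple random oversampling helper are hypothetical illustrations, not a SageMaker API.

```python
import random


def class_imbalance(n_a, n_d):
    """CI in [-1, 1]; 0 means both facets have the same number of members."""
    return (n_a - n_d) / (n_a + n_d)


def difference_in_proportions_of_labels(pos_a, n_a, pos_d, n_d):
    """DPL: positive-label rate of facet a minus positive-label rate of facet d."""
    return pos_a / n_a - pos_d / n_d


def oversample_minority(majority, minority, seed=0):
    """Resampling sketch: draw minority rows with replacement until the
    minority set is as large as the majority set."""
    rng = random.Random(seed)
    shortfall = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(shortfall)]
    return minority + extra


# 80 rows in facet a and 20 in facet d give CI = 0.6.
print(class_imbalance(80, 20))
# 40 of 80 positive in facet a, 5 of 20 in facet d give DPL = 0.25.
print(difference_in_proportions_of_labels(40, 80, 5, 20))
```

Oversampling only duplicates existing rows; synthetic data generation, the other strategy listed, creates new rows instead and is not shown here.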
 
Domain 2: ML Model Development

Task Statement 2.1: Choose a modeling approach.
Knowledge of:
Capabilities and appropriate uses of ML algorithms to solve business
problems
How to use AWS artificial intelligence (AI) services (for example, Amazon
Translate, Amazon Transcribe, Amazon Rekognition, Amazon Bedrock) to
solve specific business problems
How to consider interpretability during model selection or algorithm
selection
SageMaker built-in algorithms and when to apply them

Task Statement 2.2: Train and refine models.
Knowledge of:
Elements in the training process (for example, epoch, steps, batch size)
Methods to reduce model training time (for example, early stopping,
distributed training)
Factors that influence model size
Methods to improve model performance
Benefits of regularization techniques (for example, dropout, weight decay,
L1 and L2)
Hyperparameter tuning techniques (for example, random search, Bayesian
optimization)
Model hyperparameters and their effects on model performance (for
example, number of trees in a tree-based model, number of layers in a
neural network)
Methods to integrate models that were built outside SageMaker into
SageMaker
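
Two of the ideas listed under Task Statement 2.2, random search and early stopping, are simple enough to sketch without any training framework. The code below is an illustration under stated assumptions: `toy_validation_loss` is a hypothetical stand-in for a real train-and-validate run, and the function names and the `patience` parameter are illustrative rather than taken from SageMaker.

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Sample each hyperparameter uniformly from its (low, high) range and
    keep the configuration with the lowest objective value."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch index at which training would stop: the first epoch
    where validation loss has failed to improve `patience` times in a row."""
    best, since_best = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return len(val_losses) - 1


def toy_validation_loss(params):
    """Hypothetical objective with its minimum at learning_rate=0.1, weight_decay=0.01."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["weight_decay"] - 0.01) ** 2


space = {"learning_rate": (0.0, 1.0), "weight_decay": (0.0, 0.1)}
best_params, best_score = random_search(toy_validation_loss, space)
```

Random search treats every trial independently, so trials can run in parallel. Bayesian optimization, the other technique listed, uses earlier results to choose the next trial and is not shown here.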


Task Statement 2.3: Analyze model performance.
Knowledge of:
Model evaluation techniques and metrics (for example, confusion matrix, heat
maps, F1 score, accuracy, precision, recall, Root Mean Square Error [RMSE],
receiver operating characteristic [ROC], Area Under the ROC Curve [AUC])
Methods to create performance baselines
Methods to identify model overfitting and underfitting
Metrics available in SageMaker Clarify to gain insights into ML training data
and models
Convergence issues
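
Most of the evaluation metrics listed under Task Statement 2.3 derive from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python with hypothetical function names; it assumes binary labels and at least one predicted and one actual positive, so that precision and recall are defined.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision = tp/(tp+fp), recall = tp/(tp+fn), F1 = their harmonic mean."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def rmse(y_true, y_pred):
    """Root Mean Square Error for regression targets."""
    squared = [(a - p) ** 2 for a, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
```

On an imbalanced dataset, accuracy can stay high while recall on the rare class collapses, which is why precision, recall, and F1 are listed alongside it.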

 
Domain 4: ML Solution Monitoring, Maintenance, and Security

Task Statement 4.1: Monitor model inference.
Knowledge of:
Drift in ML models
Techniques to monitor data quality and model performance
Design principles for ML lenses relevant to monitoring

Task Statement 4.2: Monitor and optimize infrastructure and costs.
Knowledge of:
Key performance metrics for ML infrastructure (for example, utilization,
throughput, availability, scalability, fault tolerance)
Monitoring and observability tools to troubleshoot latency and
performance issues (for example, AWS X-Ray, Amazon CloudWatch Lambda
Insights, Amazon CloudWatch Logs Insights)
How to use AWS CloudTrail to log, monitor, and invoke re-training activities
Differences between instance types and how they affect performance (for
example, memory optimized, compute optimized, general purpose,
inference optimized)
Capabilities of cost analysis tools (for example, AWS Cost Explorer, AWS
Billing and Cost Management, AWS Trusted Advisor)
Cost tracking and allocation techniques (for example, resource tagging)

Task Statement 4.3: Secure AWS resources.
Knowledge of:
IAM roles, policies, and groups that control access to AWS services (for
example, AWS Identity and Access Management [IAM], bucket policies,
SageMaker Role Manager)
SageMaker security and compliance features
Controls for network access to ML resources
Security best practices for CI/CD pipelines

 
Domain 3: Deployment and Orchestration of ML Workflows

Task Statement 3.1: Select deployment infrastructure based on existing architecture and requirements.
Knowledge of:
Deployment best practices (for example, versioning, rollback strategies)
AWS deployment services (for example, SageMaker)
Methods to serve ML models in real time and in batches
How to provision compute resources in production environments and test
environments (for example, CPU, GPU)
Model and endpoint requirements for deployment endpoints (for example,
serverless endpoints, real-time endpoints, asynchronous endpoints, batch
inference)
How to choose appropriate containers (for example, provided or customized)
Methods to optimize models on edge devices (for example, SageMaker Neo)

Task Statement 3.2: Create and script infrastructure based on existing architecture
and requirements.
Knowledge of:
Difference between on-demand and provisioned resources
How to compare scaling policies
Tradeoffs and use cases of infrastructure as code (IaC) options (for example,
AWS CloudFormation, AWS Cloud Development Kit [AWS CDK])
Containerization concepts and AWS container services
How to use SageMaker endpoint auto scaling policies to meet scalability
requirements (for example, based on demand, time)

Task Statement 3.3: Use automated orchestration tools to set up continuous
integration and continuous delivery (CI/CD) pipelines.
Knowledge of:
Capabilities and quotas for AWS CodePipeline, AWS CodeBuild, and AWS
CodeDeploy
Automation and integration of data ingestion with orchestration services
Version control systems and basic usage (for example, Git)
CI/CD principles and how they fit into ML workflows
Deployment strategies and rollback actions (for example, blue/green,
canary, linear)
How code repositories and pipelines work together
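
The canary and linear strategies named above differ only in how traffic moves from the old version to the new one. The sketch below models each as a list of cumulative traffic percentages sent to the new version. It is a conceptual illustration with hypothetical function names, not an AWS CodeDeploy or SageMaker configuration, and it assumes whole-number percentages between 1 and 99.

```python
def canary_schedule(canary_percent):
    """Canary: send a small share of traffic first, then all of it."""
    return [canary_percent, 100]


def linear_schedule(step_percent):
    """Linear: increase traffic in equal increments until reaching 100%."""
    steps = list(range(step_percent, 100, step_percent))
    return steps + [100]


def all_at_once_schedule():
    """Blue/green with an immediate cutover: all traffic moves in one step."""
    return [100]


print(canary_schedule(10))   # [10, 100]
print(linear_schedule(25))   # [25, 50, 75, 100]
```

In each case a rollback means returning traffic to the old version, which is only possible while that version is still running; this is the main reason to keep the old environment alive until the final step succeeds.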


 
Official Information: https://d1.awsstatic.com/training-and-certification/docs-machine-learning-engineer-associate/AWS-Certified-Machine-Learning-Engineer-Associate_Exam-Guide.pdf