AWS Fundamentals
L1 Construct: AWS::EMR::Cluster

CfnCluster

The `AWS::EMR::Cluster` resource specifies an Amazon EMR cluster. This cluster is a collection of Amazon EC2 instances that run open source big data frameworks and applications to process and analyze vast amounts of data. For more information, see the [Amazon EMR Management Guide](https://docs.aws.amazon.com/emr/latest/ManagementGuide/).

Amazon EMR supports launching task instance groups and task instance fleets as part of the `AWS::EMR::Cluster` resource, through the `TaskInstanceGroups` and `TaskInstanceFleets` subproperties of the `JobFlowInstancesConfig` property type. Using these subproperties reduces delays in provisioning task nodes compared to specifying task nodes with the separate `AWS::EMR::InstanceGroupConfig` and `AWS::EMR::InstanceFleetConfig` resources. For examples of these subproperties, see the CloudFormation documentation for `AWS::EMR::Cluster`.
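As an illustration of declaring task nodes inline, here is a minimal sketch of a `CfnCluster` whose `instances` property includes a `taskInstanceGroups` entry. The stack and construct IDs, the cluster name, the instance types and counts, the release label, and the `EMR_DefaultRole` service role name are assumptions chosen for the example, not values taken from this page.

```typescript
import { App, Stack } from 'aws-cdk-lib';
import * as emr from 'aws-cdk-lib/aws-emr';

const app = new App();
const stack = new Stack(app, 'EmrDemoStack'); // hypothetical stack ID

new emr.CfnCluster(stack, 'DemoCluster', {
  name: 'demo-cluster',            // assumed name; avoid <, >, $, | and backtick
  releaseLabel: 'emr-6.15.0',      // assumed release label
  jobFlowRole: 'EMR_EC2_DefaultRole', // default instance profile; must already exist
  serviceRole: 'EMR_DefaultRole',     // assumed service role name; must already exist
  applications: [{ name: 'Spark' }],
  instances: {
    masterInstanceGroup: { instanceCount: 1, instanceType: 'm5.xlarge' },
    coreInstanceGroup: { instanceCount: 2, instanceType: 'm5.xlarge' },
    // Task nodes declared on the cluster itself rather than through a
    // separate AWS::EMR::InstanceGroupConfig resource.
    taskInstanceGroups: [
      { name: 'task-spot', instanceCount: 2, instanceType: 'm5.xlarge', market: 'SPOT' },
    ],
  },
});

app.synth();
```

Because this is an L1 construct, the property names mirror the CloudFormation properties in camelCase, and no defaults or validation are added beyond what CloudFormation itself applies.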

Import

import { CfnCluster } from 'aws-cdk-lib/aws-emr';

Or use the module namespace:

import * as emr from 'aws-cdk-lib/aws-emr';
// emr.CfnCluster

Properties

Configuration passed to the constructor as CfnClusterProps.

instances (Required)
IResolvable | JobFlowInstancesConfigProperty

A specification of the number and type of Amazon EC2 instances.

jobFlowRole (Required)
string

Also called instance profile and Amazon EC2 role. An IAM role for an Amazon EMR cluster. The Amazon EC2 instances of the cluster assume this role. The default role is `EMR_EC2_DefaultRole`. In order to use the default role, you must have already created it using the AWS CLI or console.

name (Required)
string

The name of the cluster. This parameter can't contain the characters <, >, $, |, or ` (backtick).
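The character restriction above can be checked before synthesis. The following is a small sketch; the helper names `findForbiddenNameChars` and `isValidClusterName` are hypothetical and not part of `aws-cdk-lib`.

```typescript
// Characters the description above says a cluster name can't contain.
const FORBIDDEN_NAME_CHARS = ['<', '>', '$', '|', '`'];

// Returns the forbidden characters that occur in `name` (empty if none).
function findForbiddenNameChars(name: string): string[] {
  return FORBIDDEN_NAME_CHARS.filter((ch) => name.includes(ch));
}

// True when `name` contains none of the forbidden characters.
function isValidClusterName(name: string): boolean {
  return findForbiddenNameChars(name).length === 0;
}
```

Note that a name supplied as an unresolved CDK token is rendered as a placeholder string containing `$`, so a check like this is only meaningful for literal names.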

serviceRole (Required)
string

The IAM role that Amazon EMR assumes in order to access AWS resources on your behalf.

additionalInfo (Optional)
any

A JSON string for selecting additional features.

applications (Optional)
IResolvable | (IResolvable | ApplicationProperty)[]

The applications to install on this cluster, for example, Spark, Flink, Oozie, Zeppelin, and so on.

autoScalingRole (Optional)
string

An IAM role for automatic scaling policies. The default role is `EMR_AutoScaling_DefaultRole`. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate Amazon EC2 instances in an instance group.

autoTerminationPolicy (Optional)
IResolvable | AutoTerminationPolicyProperty

An auto-termination policy for an Amazon EMR cluster. An auto-termination policy defines the amount of idle time in seconds after which a cluster automatically terminates. For alternative cluster termination options, see [Control cluster termination](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-termination.html).
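Since the idle time is expressed in seconds, a small conversion helper can make the intent easier to read. This is a sketch: the local `AutoTerminationPolicy` interface is an assumption that mirrors the `idleTimeout` field of `AutoTerminationPolicyProperty`, and `autoTerminateAfterMinutes` is a hypothetical helper.

```typescript
// Local stand-in for the policy shape; `idleTimeout` is in seconds.
interface AutoTerminationPolicy {
  idleTimeout: number;
}

// Builds a policy from an idle time given in minutes.
function autoTerminateAfterMinutes(minutes: number): AutoTerminationPolicy {
  if (!Number.isFinite(minutes) || minutes <= 0) {
    throw new RangeError('idle time must be a positive number of minutes');
  }
  return { idleTimeout: Math.round(minutes * 60) };
}
```

The returned object can be passed as the `autoTerminationPolicy` value, for example `autoTerminationPolicy: autoTerminateAfterMinutes(30)`.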

bootstrapActions (Optional)
IResolvable | (IResolvable | BootstrapActionConfigProperty)[]

A list of bootstrap actions to run before Hadoop starts on the cluster nodes.

configurations (Optional)
IResolvable | (IResolvable | ConfigurationProperty)[]

Applies only to Amazon EMR releases 4.x and later. The list of configurations that are supplied to the Amazon EMR cluster.

customAmiId (Optional)
string

Available only in Amazon EMR releases 5.7.0 and later. The ID of a custom Amazon EBS-backed Linux AMI if the cluster uses a custom AMI.

ebsRootVolumeIops (Optional)
number

The IOPS of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.

ebsRootVolumeSize (Optional)
number

The size, in GiB, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 4.x and later.

ebsRootVolumeThroughput (Optional)
number

The throughput, in MiB/s, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.

kerberosAttributes (Optional)
IResolvable | KerberosAttributesProperty

Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information, see [Use Kerberos Authentication](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-kerberos.html) in the *Amazon EMR Management Guide*.

logEncryptionKmsKeyId (Optional)
string

The AWS KMS key used for encrypting log files. This attribute is only available with Amazon EMR 5.30.0 and later, excluding Amazon EMR 6.0.0.

logUri (Optional)
string

The path to the Amazon S3 location where logs for this cluster are stored.

managedScalingPolicy (Optional)
IResolvable | ManagedScalingPolicyProperty

Creates or updates a managed scaling policy for an Amazon EMR cluster. The managed scaling policy defines the limits for resources, such as Amazon EC2 instances that can be added or terminated from a cluster. The policy only applies to the core and task nodes. The master node cannot be scaled after initial configuration.
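To make the idea of resource limits concrete, the sketch below builds a policy object with a minimum and maximum capacity. The local interfaces are assumptions that mirror the `computeLimits` shape of `ManagedScalingPolicyProperty`, the unit type values are assumed from the EMR API, and the range check is the sketch's own, not a rule stated on this page.

```typescript
// Assumed unit types for managed scaling limits.
type UnitType = 'Instances' | 'InstanceFleetUnits' | 'VCPU';

// Local stand-ins for the policy shape.
interface ComputeLimits {
  unitType: UnitType;
  minimumCapacityUnits: number;
  maximumCapacityUnits: number;
}

interface ManagedScalingPolicy {
  computeLimits: ComputeLimits;
}

// Builds a policy after checking that the limits form a sensible range.
function managedScalingPolicy(unitType: UnitType, min: number, max: number): ManagedScalingPolicy {
  if (!Number.isInteger(min) || !Number.isInteger(max) || min < 0 || max < min) {
    throw new RangeError('limits must be non-negative integers with min <= max');
  }
  return {
    computeLimits: { unitType, minimumCapacityUnits: min, maximumCapacityUnits: max },
  };
}
```

For example, `managedScalingPolicy('Instances', 2, 10)` describes a cluster whose core and task capacity may vary between 2 and 10 instances.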

osReleaseLabel (Optional)
string

The Amazon Linux release specified in a cluster launch RunJobFlow request. If no Amazon Linux release was specified, the default Amazon Linux release is shown in the response.

placementGroupConfigs (Optional)
IResolvable | (IResolvable | PlacementGroupConfigProperty)[]

releaseLabel (Optional)
string

The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster. Release labels are in the form `emr-x.x.x`, where x.x.x is an Amazon EMR release version, for example `emr-5.14.0`. For more information about Amazon EMR release versions and included application versions and features, see the [Amazon EMR Release Guide](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/). The release label applies only to Amazon EMR releases version 4.0 and later. Earlier versions use `AmiVersion`.
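The `emr-x.x.x` form described above can be parsed with a short regular expression. This is a sketch; `EmrRelease` and `parseReleaseLabel` are hypothetical names, and the pattern only covers the three-part numeric form mentioned in the description.

```typescript
// Numeric components of a release label such as `emr-5.14.0`.
interface EmrRelease {
  major: number;
  minor: number;
  patch: number;
}

// Parses a label of the form `emr-x.x.x`; returns undefined for any other shape.
function parseReleaseLabel(label: string): EmrRelease | undefined {
  const match = /^emr-(\d+)\.(\d+)\.(\d+)$/.exec(label);
  if (!match) {
    return undefined;
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
  };
}
```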

scaleDownBehavior (Optional)
string

The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. `TERMINATE_AT_INSTANCE_HOUR` indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. `TERMINATE_AT_TASK_COMPLETION` indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. `TERMINATE_AT_TASK_COMPLETION` is available only in Amazon EMR releases 4.1.0 and later, and is the default for versions of Amazon EMR earlier than 5.1.0.
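The version rules in this description can be summarized as a lookup from release version to default behavior. The sketch below encodes only what the text states; `defaultScaleDownBehavior` is a hypothetical helper, and it returns `undefined` for releases earlier than 4.1.0 because the description names no option for them.

```typescript
type ScaleDownBehavior = 'TERMINATE_AT_INSTANCE_HOUR' | 'TERMINATE_AT_TASK_COMPLETION';

// Compares two [major, minor, patch] triples; negative if a < b, positive if a > b.
function compareVersions(a: number[], b: number[]): number {
  for (let i = 0; i < 3; i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    if (diff !== 0) {
      return diff;
    }
  }
  return 0;
}

// Default behavior for an EMR release version such as '5.14.0', per the text above.
function defaultScaleDownBehavior(version: string): ScaleDownBehavior | undefined {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`not an x.x.x version: ${version}`);
  }
  const parts = [Number(match[1]), Number(match[2]), Number(match[3])];
  if (compareVersions(parts, [5, 1, 0]) >= 0) {
    return 'TERMINATE_AT_INSTANCE_HOUR';
  }
  if (compareVersions(parts, [4, 1, 0]) >= 0) {
    return 'TERMINATE_AT_TASK_COMPLETION';
  }
  return undefined; // neither option is described as available before 4.1.0
}
```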

securityConfiguration (Optional)
string

The name of the security configuration applied to the cluster.

stepConcurrencyLevel (Optional)
number

Specifies the number of steps that can be executed concurrently. The default value is `1`. The maximum value is `256`.

steps (Optional)
IResolvable | (IResolvable | StepConfigProperty)[]

A list of steps to run.
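As a sketch of what one entry in this list might look like, the helper below builds a step that runs `spark-submit` through EMR's `command-runner.jar`. The local interfaces are assumptions mirroring the `StepConfigProperty` shape, and the helper name, the `CONTINUE` failure action, and the `--deploy-mode cluster` argument are choices made for the example.

```typescript
// Local stand-ins for the step shape.
interface HadoopJarStep {
  jar: string;
  args?: string[];
}

interface StepConfig {
  name: string;
  hadoopJarStep: HadoopJarStep;
  actionOnFailure?: string;
}

// Builds a step that submits a Spark script stored at `scriptS3Uri`.
function sparkSubmitStep(name: string, scriptS3Uri: string, scriptArgs: string[] = []): StepConfig {
  return {
    name,
    actionOnFailure: 'CONTINUE', // assumed choice: keep the cluster running if the step fails
    hadoopJarStep: {
      jar: 'command-runner.jar',
      args: ['spark-submit', '--deploy-mode', 'cluster', scriptS3Uri, ...scriptArgs],
    },
  };
}
```

A call such as `sparkSubmitStep('daily-report', 's3://example-bucket/jobs/report.py')` (a hypothetical bucket and script) yields an object that can be placed in the `steps` array.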

tags (Optional)
CfnTag[]

A list of tags associated with a cluster.

visibleToAllUsers (Optional)
boolean | IResolvable

Indicates whether the cluster is visible to all IAM users of the AWS account associated with the cluster. If this value is set to `true`, all IAM users of that AWS account can view and manage the cluster if they have the proper policy permissions set. If this value is `false`, only the IAM user that created the cluster can view and manage it. This value can be changed using the SetVisibleToAllUsers action.

> When you create clusters directly through the EMR console or API, this value is set to `true` by default. However, for `AWS::EMR::Cluster` resources in CloudFormation, the default is `false`.

CloudFormation Resource

This L1 construct maps directly to the CloudFormation resource type `AWS::EMR::Cluster`.
