AWS::EMR::Cluster

The `AWS::EMR::Cluster` resource specifies an Amazon EMR cluster. This cluster is a collection of Amazon EC2 instances that run open source big data frameworks and applications to process and analyze vast amounts of data. For more information, see the [Amazon EMR Management Guide](https://docs.aws.amazon.com//emr/latest/ManagementGuide/).

Amazon EMR now supports launching task instance groups and task instance fleets as part of the `AWS::EMR::Cluster` resource. This can be done by using the `JobFlowInstancesConfig` property type's `TaskInstanceGroups` and `TaskInstanceFleets` subproperties. Using these subproperties reduces delays in provisioning task nodes compared to specifying task nodes with the `AWS::EMR::InstanceGroupConfig` and `AWS::EMR::InstanceFleetConfig` resources. Please refer to the examples at the bottom of this page to learn how to use these subproperties.
import { CfnCluster } from 'aws-cdk-lib/aws-emr';
Or use the module namespace:
import * as emr from 'aws-cdk-lib/aws-emr';
// emr.CfnCluster

Configuration passed to the constructor as CfnClusterProps.
`instances` (Required; type: `IResolvable | JobFlowInstancesConfigProperty`): A specification of the number and type of Amazon EC2 instances.
`jobFlowRole` (Required; type: `string`): Also called instance profile and Amazon EC2 role. An IAM role for an Amazon EMR cluster. The Amazon EC2 instances of the cluster assume this role. The default role is `EMR_EC2_DefaultRole`. In order to use the default role, you must have already created it using the AWS CLI or console.
`name` (Required; type: `string`): The name of the cluster. This parameter can't contain the characters <, >, $, |, or ` (backtick).
`serviceRole` (Required; type: `string`): The IAM role that Amazon EMR assumes in order to access AWS resources on your behalf.
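
As a rough illustration of the four required properties, the sketch below builds a props object that includes a task instance group inline under `instances`. To stay self-contained it declares local stand-in interfaces (`InstanceGroupSketch`, `InstancesSketch`, `RequiredClusterProps`) that mirror only a small subset of the CDK types; these names are hypothetical. The cluster name, instance types, counts, and the `EMR_DefaultRole` service role name are illustrative assumptions, not values taken from this page.

```typescript
// Local stand-ins for a small subset of the CDK types (assumption: real code
// would use CfnClusterProps from 'aws-cdk-lib/aws-emr' instead of these).
interface InstanceGroupSketch {
  instanceCount: number;
  instanceType: string;
  market?: "ON_DEMAND" | "SPOT";
  name?: string;
}

interface InstancesSketch {
  masterInstanceGroup?: InstanceGroupSketch;
  coreInstanceGroup?: InstanceGroupSketch;
  taskInstanceGroups?: InstanceGroupSketch[];
}

interface RequiredClusterProps {
  instances: InstancesSketch;
  jobFlowRole: string;
  name: string;
  serviceRole: string;
}

const requiredProps: RequiredClusterProps = {
  name: "analytics-cluster", // hypothetical cluster name
  jobFlowRole: "EMR_EC2_DefaultRole", // default instance profile named above
  serviceRole: "EMR_DefaultRole", // assumed service role name
  instances: {
    masterInstanceGroup: { instanceCount: 1, instanceType: "m5.xlarge" },
    coreInstanceGroup: { instanceCount: 2, instanceType: "m5.xlarge" },
    // Task nodes declared inline rather than through a separate
    // AWS::EMR::InstanceGroupConfig resource.
    taskInstanceGroups: [
      { name: "task-spot", instanceCount: 2, instanceType: "m5.xlarge", market: "SPOT" },
    ],
  },
};

console.log(JSON.stringify(requiredProps, null, 2));
```

In a CDK app, an object of this shape would be passed as the third argument to `new emr.CfnCluster(scope, id, props)`.
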
`additionalInfo` (Optional; type: `any`): A JSON string for selecting additional features.
`applications` (Optional; type: `IResolvable | (IResolvable | ApplicationProperty)[]`): The applications to install on this cluster, for example, Spark, Flink, Oozie, Zeppelin, and so on.
`autoScalingRole` (Optional; type: `string`): An IAM role for automatic scaling policies. The default role is `EMR_AutoScaling_DefaultRole`. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate Amazon EC2 instances in an instance group.
`autoTerminationPolicy` (Optional; type: `IResolvable | AutoTerminationPolicyProperty`): An auto-termination policy for an Amazon EMR cluster. An auto-termination policy defines the amount of idle time in seconds after which a cluster automatically terminates. For alternative cluster termination options, see [Control cluster termination](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-termination.html).
`bootstrapActions` (Optional; type: `IResolvable | (IResolvable | BootstrapActionConfigProperty)[]`): A list of bootstrap actions to run before Hadoop starts on the cluster nodes.
`configurations` (Optional; type: `IResolvable | (IResolvable | ConfigurationProperty)[]`): Applies only to Amazon EMR releases 4.x and later. The list of configurations that are supplied to the Amazon EMR cluster.
`customAmiId` (Optional; type: `string`): Available only in Amazon EMR releases 5.7.0 and later. The ID of a custom Amazon EBS-backed Linux AMI if the cluster uses a custom AMI.
`ebsRootVolumeIops` (Optional; type: `number`): The IOPS of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.
`ebsRootVolumeSize` (Optional; type: `number`): The size, in GiB, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 4.x and later.
`ebsRootVolumeThroughput` (Optional; type: `number`): The throughput, in MiB/s, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.
`kerberosAttributes` (Optional; type: `IResolvable | KerberosAttributesProperty`): Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information, see [Use Kerberos Authentication](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-kerberos.html) in the *Amazon EMR Management Guide*.
`logEncryptionKmsKeyId` (Optional; type: `string`): The AWS KMS key used for encrypting log files. This attribute is only available with Amazon EMR 5.30.0 and later, excluding Amazon EMR 6.0.0.
`logUri` (Optional; type: `string`): The path to the Amazon S3 location where logs for this cluster are stored.
`managedScalingPolicy` (Optional; type: `IResolvable | ManagedScalingPolicyProperty`): Creates or updates a managed scaling policy for an Amazon EMR cluster. The managed scaling policy defines the limits for resources, such as the Amazon EC2 instances that can be added to or terminated from a cluster. The policy applies only to the core and task nodes. The master node cannot be scaled after initial configuration.
`osReleaseLabel` (Optional; type: `string`): The Amazon Linux release specified in a cluster launch `RunJobFlow` request. If no Amazon Linux release was specified, the default Amazon Linux release is shown in the response.
`placementGroupConfigs` (Optional; type: `IResolvable | (IResolvable | PlacementGroupConfigProperty)[]`)
`releaseLabel` (Optional; type: `string`): The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster. Release labels are in the form `emr-x.x.x`, where x.x.x is an Amazon EMR release version, for example `emr-5.14.0`. For more information about Amazon EMR release versions and included application versions and features, see the [Amazon EMR Release Guide](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/). The release label applies only to Amazon EMR releases version 4.0 and later. Earlier versions use `AmiVersion`.
`scaleDownBehavior` (Optional; type: `string`): The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. `TERMINATE_AT_INSTANCE_HOUR` indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. `TERMINATE_AT_TASK_COMPLETION` indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. `TERMINATE_AT_TASK_COMPLETION` is available only in Amazon EMR releases 4.1.0 and later, and is the default for versions of Amazon EMR earlier than 5.1.0.
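
The version rules for `scaleDownBehavior` can be sketched as a small lookup keyed on the release label. This is only one reading of the description above: it assumes labels follow the `emr-x.x.x` form exactly, treats 5.1.0 and later as defaulting to `TERMINATE_AT_INSTANCE_HOUR`, and treats 4.1.0 up to (but not including) 5.1.0 as defaulting to `TERMINATE_AT_TASK_COMPLETION`. The function names are hypothetical helpers, not part of the CDK.

```typescript
type ScaleDownBehavior = "TERMINATE_AT_INSTANCE_HOUR" | "TERMINATE_AT_TASK_COMPLETION";
type Version = [number, number, number];

// Parse "emr-5.14.0" into [5, 14, 0]; returns undefined for any other shape.
function parseReleaseLabel(label: string): Version | undefined {
  const match = /^emr-(\d+)\.(\d+)\.(\d+)$/.exec(label);
  if (!match) return undefined;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: Version, b: Version): number {
  const [aMajor, aMinor, aPatch] = a;
  const [bMajor, bMinor, bPatch] = b;
  return aMajor - bMajor || aMinor - bMinor || aPatch - bPatch;
}

// Default behavior implied by the description above; undefined when the label
// is unparseable or predates 4.1.0, where neither option is documented.
function defaultScaleDownBehavior(label: string): ScaleDownBehavior | undefined {
  const version = parseReleaseLabel(label);
  if (!version) return undefined;
  if (compareVersions(version, [5, 1, 0]) >= 0) return "TERMINATE_AT_INSTANCE_HOUR";
  if (compareVersions(version, [4, 1, 0]) >= 0) return "TERMINATE_AT_TASK_COMPLETION";
  return undefined;
}

console.log(defaultScaleDownBehavior("emr-5.14.0"));
console.log(defaultScaleDownBehavior("emr-5.0.3"));
```

Setting `scaleDownBehavior` explicitly avoids depending on these version-specific defaults.
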
`securityConfiguration` (Optional; type: `string`): The name of the security configuration applied to the cluster.
`stepConcurrencyLevel` (Optional; type: `number`): Specifies the number of steps that can be executed concurrently. The default value is `1`. The maximum value is `256`.
`steps` (Optional; type: `IResolvable | (IResolvable | StepConfigProperty)[]`): A list of steps to run.
`tags` (Optional; type: `CfnTag[]`): A list of tags associated with a cluster.
`visibleToAllUsers` (Optional; type: `boolean | IResolvable`): Indicates whether the cluster is visible to all IAM users of the AWS account associated with the cluster. If this value is set to `true`, all IAM users of that AWS account can view and manage the cluster if they have the proper policy permissions set. If this value is `false`, only the IAM user that created the cluster can view and manage it. This value can be changed using the `SetVisibleToAllUsers` action. Note: When you create clusters directly through the EMR console or API, this value is set to `true` by default. However, for `AWS::EMR::Cluster` resources in CloudFormation, the default is `false`.
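
A few of the optional properties above can be combined as in the following sketch. It uses a local stand-in interface (`OptionalClusterPropsSketch`, a hypothetical name) covering only a subset of `CfnClusterProps`, and a hypothetical `findIssues` helper that checks the two constraints stated on this page: the `emr-x.x.x` label form and the `stepConcurrencyLevel` maximum of 256. The bucket name, tag, and numeric values are illustrative assumptions. The `idleTimeout` field name and the `name` field on applications come from the CDK property types rather than from this page.

```typescript
// Stand-in for a subset of CfnClusterProps (assumption: real code would import
// the type from 'aws-cdk-lib/aws-emr').
interface OptionalClusterPropsSketch {
  releaseLabel?: string;
  logUri?: string;
  applications?: { name: string }[];
  autoTerminationPolicy?: { idleTimeout: number }; // idle time in seconds
  stepConcurrencyLevel?: number;
  visibleToAllUsers?: boolean;
  tags?: { key: string; value: string }[];
}

const optionalProps: OptionalClusterPropsSketch = {
  releaseLabel: "emr-6.15.0",
  logUri: "s3://amzn-s3-demo-bucket/emr-logs/", // hypothetical bucket
  applications: [{ name: "Spark" }, { name: "Flink" }],
  autoTerminationPolicy: { idleTimeout: 3600 }, // terminate after one idle hour
  stepConcurrencyLevel: 4,
  // Set explicitly: the CloudFormation default is false, unlike the console/API.
  visibleToAllUsers: true,
  tags: [{ key: "team", value: "data-platform" }],
};

// Hypothetical helper: reports violations of the constraints documented above.
function findIssues(props: OptionalClusterPropsSketch): string[] {
  const issues: string[] = [];
  if (props.releaseLabel !== undefined && !/^emr-\d+\.\d+\.\d+$/.test(props.releaseLabel)) {
    issues.push("releaseLabel is not in the form emr-x.x.x");
  }
  if (props.stepConcurrencyLevel !== undefined && props.stepConcurrencyLevel > 256) {
    issues.push("stepConcurrencyLevel exceeds the maximum of 256");
  }
  return issues;
}

console.log(findIssues(optionalProps));
```

Checks like these only catch problems early in local code; CloudFormation and Amazon EMR still perform their own validation at deploy time.
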
This L1 construct maps directly to the following CloudFormation resource type.
`AWS::EMR::Cluster` (`aws-emr`)