AWS Fundamentals
L2 Construct

EmrCreateCluster

A Step Functions task that creates an EMR cluster. The cluster configuration is defined as Parameters in the state machine definition. The task outputs the ClusterId.

Import

import { EmrCreateCluster } from 'aws-cdk-lib/aws-stepfunctions-tasks';

Or use the module namespace:

import * as stepfunctions_tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';
// stepfunctions_tasks.EmrCreateCluster
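
As a minimal sketch of how the construct might be wired into a state machine, the example below creates a task with only the two required properties plus a couple of optional ones. The stack name, construct IDs, cluster name, release label, and result path are illustrative assumptions, and `instances: {}` leaves all instance settings unspecified.

```typescript
import { App, Stack } from 'aws-cdk-lib';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

// Hypothetical app and stack names, used only for this sketch.
const app = new App();
const stack = new Stack(app, 'EmrDemoStack');

// Only `name` and `instances` are required; roles are created by default.
const createCluster = new tasks.EmrCreateCluster(stack, 'CreateCluster', {
  name: 'demo-cluster',         // assumed cluster name
  instances: {},                // instance settings left unspecified
  releaseLabel: 'emr-6.15.0',   // assumed release label
  resultPath: '$.cluster',      // place the task result under $.cluster
});

// Use the task as the whole state machine definition.
new sfn.StateMachine(stack, 'StateMachine', {
  definitionBody: sfn.DefinitionBody.fromChainable(createCluster),
});
```

A real cluster would normally also set instance types and counts through `instances`, and often supplies its own IAM roles instead of relying on the defaults.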

Properties

Configuration passed to the constructor as EmrCreateClusterProps.

instances (Required)
InstancesConfigProperty

A specification of the number and type of Amazon EC2 instances.

name (Required)
string

The Name of the Cluster.

additionalInfo (Optional)
string

A JSON string for selecting additional features.

Default: - None

applications (Optional)
ApplicationConfigProperty[]

A case-insensitive list of applications for Amazon EMR to install and configure when launching the cluster.

Default: - EMR selected default

autoScalingRole (Optional)
IRole

An IAM role for automatic scaling policies.

Default: - A role will be created.

autoTerminationPolicyIdleTimeout (Optional)
Duration

The amount of idle time after which the cluster automatically terminates. You can specify a minimum of 60 seconds and a maximum of 604800 seconds (seven days).

Default: - No timeout
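
The bounds above can be checked before synthesis with a small helper. This is a sketch only: `isValidIdleTimeout` and the two constants are hypothetical names, not part of aws-cdk-lib, and the check works on plain seconds.

```typescript
// Bounds quoted in the property description above.
const MIN_IDLE_TIMEOUT_SECONDS = 60;
const MAX_IDLE_TIMEOUT_SECONDS = 7 * 24 * 60 * 60; // seven days = 604800 seconds

// Hypothetical helper: true when the idle timeout lies within the documented range.
function isValidIdleTimeout(seconds: number): boolean {
  return seconds >= MIN_IDLE_TIMEOUT_SECONDS && seconds <= MAX_IDLE_TIMEOUT_SECONDS;
}
```

In CDK code the property itself takes a Duration, for example `Duration.hours(1)`.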

bootstrapActions (Optional)
BootstrapActionConfigProperty[]

A list of bootstrap actions to run before Hadoop starts on the cluster nodes.

Default: - None

clusterRole (Optional)
IRole

Also called the instance profile or EC2 role. An IAM role for the EMR cluster; the cluster's EC2 instances assume this role. This attribute was renamed from jobFlowRole to clusterRole to align with other EMR/Step Functions integration parameters.

Default: - A role will be created.

configurations (Optional)
ConfigurationProperty[]

The list of configurations supplied for the EMR cluster you are creating.

Default: - None

customAmiId (Optional)
string

The ID of a custom Amazon EBS-backed Linux AMI.

Default: - None

ebsRootVolumeIops (Optional)
number

The IOPS of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Requires EMR release label 6.15.0 or above. Must be in range [3000, 16000].

Default: - EMR selected default

ebsRootVolumeSize (Optional)
Size

The size of the EBS root device volume of the Linux AMI that is used for each EC2 instance.

Default: - EMR selected default

ebsRootVolumeThroughput (Optional)
number

The throughput, in MiB/s, of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Requires EMR release label 6.15.0 or above. Must be in range [125, 1000].

Default: - EMR selected default
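
The IOPS and throughput constraints for the EBS root volume combine a release-label requirement with a numeric range. The sketch below shows one way to check both together. All function names are hypothetical, and it assumes release labels of the form `emr-<major>.<minor>.<patch>`.

```typescript
type Version = [number, number, number];

// Parse a label such as "emr-6.15.0"; returns undefined for any other shape.
function parseReleaseLabel(label: string): Version | undefined {
  const match = /^emr-(\d+)\.(\d+)\.(\d+)$/.exec(label);
  if (!match) return undefined;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when `label` parses and is greater than or equal to `min`.
function isAtLeast(label: string, min: Version): boolean {
  const parsed = parseReleaseLabel(label);
  if (!parsed) return false;
  for (let i = 0; i < 3; i++) {
    if (parsed[i] !== min[i]) return parsed[i] > min[i];
  }
  return true;
}

// Collect human-readable problems for the documented constraints.
function checkEbsRootVolume(releaseLabel: string, iops?: number, throughput?: number): string[] {
  const errors: string[] = [];
  const tuned = iops !== undefined || throughput !== undefined;
  if (tuned && !isAtLeast(releaseLabel, [6, 15, 0])) {
    errors.push('IOPS and throughput require release label 6.15.0 or above');
  }
  if (iops !== undefined && (iops < 3000 || iops > 16000)) {
    errors.push('IOPS must be in range [3000, 16000]');
  }
  if (throughput !== undefined && (throughput < 125 || throughput > 1000)) {
    errors.push('throughput must be in range [125, 1000] MiB/s');
  }
  return errors;
}
```

An empty array means the values satisfy the ranges and the release requirement described above.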

kerberosAttributes (Optional)
KerberosAttributesProperty

Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration.

Default: - None

logUri (Optional)
string

The location in Amazon S3 to write the log files of the job flow.

Default: - None

managedScalingPolicy (Optional)
ManagedScalingPolicyProperty

The specified managed scaling policy for an Amazon EMR cluster.

Default: - None

releaseLabel (Optional)
string

The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster.

Default: - EMR selected default

scaleDownBehavior (Optional)
EmrClusterScaleDownBehavior

Specifies the way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized.

Default: - EMR selected default

securityConfiguration (Optional)
string

The name of a security configuration to apply to the cluster.

Default: - None

serviceRole (Optional)
IRole

The IAM role that will be assumed by the Amazon EMR service to access AWS resources on your behalf.

Default: - A role will be created that Amazon EMR service can assume.

stepConcurrencyLevel (Optional)
number

Specifies the step concurrency level to allow multiple steps to run in parallel. Requires EMR release label 5.28.0 or above. Must be in range [1, 256].

Default: 1 - no step concurrency allowed

tags (Optional)
{ [key: string]: string }

A list of tags to associate with a cluster and propagate to Amazon EC2 instances.

Default: - None
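
The property is typed as a key/value map, while the description speaks of a list of tags. As an illustration of the relationship, the sketch below converts such a map into a list of Key/Value pairs, the shape used by the EMR RunJobFlow API. How the construct itself renders tags is not shown on this page, so `toEmrTags` is a hypothetical helper and the conversion is an assumption.

```typescript
type TagMap = { [key: string]: string };

interface EmrTag {
  Key: string;
  Value: string;
}

// Hypothetical helper: turn a { key: value } map into a [{ Key, Value }] list.
function toEmrTags(tags: TagMap): EmrTag[] {
  return Object.keys(tags).map((key) => ({ Key: key, Value: tags[key] }));
}

// Example map with assumed tag names.
const exampleTags = toEmrTags({ team: 'data', env: 'dev' });
```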

visibleToAllUsers (Optional)
boolean

A value of true indicates that all IAM users in the AWS account can perform cluster actions if they have the proper IAM policy permissions.

Default: true

2 properties inherited from TaskStateBaseProps
resultPath (Optional, inherited from TaskStateBaseProps)
string

JSONPath expression to indicate where to inject the state's output. May also be the special value JsonPath.DISCARD, which will cause the state's input to become its output.

Default: $

resultSelector (Optional, inherited from TaskStateBaseProps)
{ [key: string]: any }

The JSON that will replace the state's raw result and become the effective result before ResultPath is applied. You can use ResultSelector to create a payload with values that are static or selected from the state's raw result.

Default: - None
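
To make the order of operations concrete, the sketch below first applies a ResultSelector to a raw result and then injects the effective result into the state input at a ResultPath. It is a simplified model, not the Step Functions implementation: it supports only `$` and dotted paths such as `$.a.b`, uses `null` as a stand-in for JsonPath.DISCARD, and the sample input and cluster ID are made up.

```typescript
type Json = { [key: string]: unknown };

// Stand-in for JsonPath.DISCARD in this sketch.
const DISCARD = null;

// Read a value at "$" or a dotted path like "$.a.b".
function getPath(value: unknown, path: string): unknown {
  if (path === '$') return value;
  let current: unknown = value;
  for (const key of path.replace(/^\$\./, '').split('.')) {
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as Json)[key];
  }
  return current;
}

// Keys ending in ".$" select from the raw result; other keys are static values.
function applyResultSelector(rawResult: unknown, selector: Json): Json {
  const out: Json = {};
  for (const key of Object.keys(selector)) {
    const val = selector[key];
    if (key.slice(-2) === '.$' && typeof val === 'string') {
      out[key.slice(0, -2)] = getPath(rawResult, val);
    } else {
      out[key] = val;
    }
  }
  return out;
}

// Inject `result` into a copy of `input` at `resultPath`.
function applyResultPath(input: Json, result: unknown, resultPath: string | null): unknown {
  if (resultPath === DISCARD) return input; // the state's input becomes its output
  if (resultPath === '$') return result;    // default: the result replaces the input
  const keys = resultPath.replace(/^\$\./, '').split('.');
  const output: Json = JSON.parse(JSON.stringify(input));
  let cursor: Json = output;
  for (const key of keys.slice(0, -1)) {
    if (cursor[key] === null || typeof cursor[key] !== 'object') cursor[key] = {};
    cursor = cursor[key] as Json;
  }
  cursor[keys[keys.length - 1]] = result;
  return output;
}

// Made-up raw result and state input.
const effective = applyResultSelector({ ClusterId: 'j-EXAMPLE' }, { 'id.$': '$.ClusterId', source: 'emr' });
const stateOutput = applyResultPath({ job: 'nightly' }, effective, '$.cluster');
```

Here `stateOutput` keeps the original `job` field and gains a `cluster` object holding the selected `id` and the static `source`.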

3 properties inherited from StateBaseProps
comment (Optional, inherited from StateBaseProps)
string

A comment describing this state.

Default: No comment

queryLanguage (Optional, inherited from StateBaseProps)
QueryLanguage

The name of the query language used by the state. If the state does not contain a `queryLanguage` field, then it will use the query language specified in the top-level `queryLanguage` field.

Default: - JSONPath

stateName (Optional, inherited from StateBaseProps)
string

Optional name for this state.

Default: - The construct ID will be used as state name

6 properties inherited from TaskStateBaseOptions
credentials (Optional, inherited from TaskStateBaseOptions)
Credentials

Credentials for an IAM Role that the State Machine assumes for executing the task. This enables cross-account resource invocations.

Default: - None (Task is executed using the State Machine's execution role)

heartbeat (Optional, deprecated, inherited from TaskStateBaseOptions)
Duration

Timeout for the heartbeat.

Default: - None

Deprecated: use `heartbeatTimeout`

heartbeatTimeout (Optional, inherited from TaskStateBaseOptions)
Timeout

Timeout for the heartbeat.

Default: - None

integrationPattern (Optional, inherited from TaskStateBaseOptions)
IntegrationPattern

AWS Step Functions integrates with services directly in the Amazon States Language. You can control these AWS services using service integration patterns. Depending on the AWS Service, the Service Integration Pattern availability will vary.

Default: - `IntegrationPattern.REQUEST_RESPONSE` for most tasks. `IntegrationPattern.RUN_JOB` for the following exceptions: `BatchSubmitJob`, `EmrAddStep`, `EmrCreateCluster`, `EmrTerminateCluster`, and `EmrContainersStartJobRun`.
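
In the Amazon States Language, the integration pattern shows up as a suffix on the task's Resource ARN. The sketch below illustrates that mapping. `integrationResourceArn` is a hypothetical helper, and the `aws` partition is hard-coded as a simplifying assumption.

```typescript
type Pattern = 'REQUEST_RESPONSE' | 'RUN_JOB' | 'WAIT_FOR_TASK_TOKEN';

// Suffix appended to the Resource ARN for each service integration pattern.
const RESOURCE_SUFFIX: Record<Pattern, string> = {
  REQUEST_RESPONSE: '',
  RUN_JOB: '.sync',
  WAIT_FOR_TASK_TOKEN: '.waitForTaskToken',
};

// Hypothetical helper: build the Resource ARN for a service integration.
function integrationResourceArn(service: string, api: string, pattern: Pattern): string {
  return `arn:aws:states:::${service}:${api}${RESOURCE_SUFFIX[pattern]}`;
}

// With the RUN_JOB default, the state machine waits for the cluster to be created.
const createClusterArn = integrationResourceArn('elasticmapreduce', 'createCluster', 'RUN_JOB');
```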

taskTimeout (Optional, inherited from TaskStateBaseOptions)
Timeout

Timeout for the task.

Default: - None

timeout (Optional, deprecated, inherited from TaskStateBaseOptions)
Duration

Timeout for the task.

Default: - None

Deprecated: use `taskTimeout`

1 property inherited from AssignableStateOptions
assign (Optional, inherited from AssignableStateOptions)
{ [key: string]: any }

Workflow variables to store in this step. Using workflow variables, you can store data in a step and retrieve that data in future steps.

Default: - No variables are assigned

2 properties inherited from JsonPathCommonOptions
inputPath (Optional, inherited from JsonPathCommonOptions)
string

JSONPath expression to select part of the state to be the input to this state. May also be the special value JsonPath.DISCARD, which will cause the effective input to be the empty object {}.

Default: $

outputPath (Optional, inherited from JsonPathCommonOptions)
string

JSONPath expression to select part of the state to be the output to this state. May also be the special value JsonPath.DISCARD, which will cause the effective output to be the empty object {}.

Default: $
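
Both inputPath and outputPath select part of a JSON document, and both treat JsonPath.DISCARD as "use the empty object". A minimal model of that selection is sketched below. It handles only `$` and dotted paths, uses `null` as a stand-in for JsonPath.DISCARD, and `selectPath` is a hypothetical name.

```typescript
// Stand-in for JsonPath.DISCARD in this sketch.
const DISCARD_PATH = null;

// Select the part of `doc` addressed by "$" or a dotted path like "$.a.b".
function selectPath(doc: unknown, path: string | null): unknown {
  if (path === DISCARD_PATH) return {}; // effective input or output is {}
  if (path === '$') return doc;         // default: the whole document
  let current: unknown = doc;
  for (const key of path.replace(/^\$\./, '').split('.')) {
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as { [key: string]: unknown })[key];
  }
  return current;
}
```

For example, `selectPath({ request: { name: 'demo' } }, '$.request')` returns the inner `{ name: 'demo' }` object.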

1 property inherited from JsonataCommonOptions
outputs (Optional, inherited from JsonataCommonOptions)
any

Used to specify and transform output from the state. When specified, the value overrides the state output default. The output field accepts any JSON value (object, array, string, number, boolean, null). Any string value, including those inside objects or arrays, will be evaluated as JSONata if surrounded by {% %} characters. Output also accepts a JSONata expression directly.

Default: - $states.result or $states.errorOutput
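
The rule that any string wrapped in {% %} is treated as JSONata applies at every nesting level. The sketch below walks an outputs value and lists the expressions it finds. It only locates them and does not evaluate JSONata. `collectJsonata` and the sample object are hypothetical.

```typescript
// Recursively collect the bodies of strings of the form "{% ... %}".
function collectJsonata(value: unknown, found: string[] = []): string[] {
  if (typeof value === 'string') {
    const match = /^\{%([\s\S]*)%\}$/.exec(value);
    if (match) found.push(match[1].trim());
  } else if (Array.isArray(value)) {
    for (const item of value) collectJsonata(item, found);
  } else if (value !== null && typeof value === 'object') {
    const obj = value as { [key: string]: unknown };
    for (const key of Object.keys(obj)) collectJsonata(obj[key], found);
  }
  return found;
}

// Sample outputs value mixing static strings and JSONata expressions.
const sampleOutputs = {
  clusterId: '{% $states.result.ClusterId %}',
  source: 'emr',
  notes: ['static', '{% $states.input.job %}'],
};
const expressions = collectJsonata(sampleOutputs);
```

In this sample, `source` and the first entry of `notes` stay static, while the other two strings are picked up as expressions.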
