[ aws . databrew ]

create-profile-job

Description

Creates a new job to analyze a dataset and create its data profile.

See also: AWS API Documentation

See ‘aws help’ for descriptions of global parameters.

Synopsis

  create-profile-job
--dataset-name <value>
[--encryption-key-arn <value>]
[--encryption-mode <value>]
--name <value>
[--log-subscription <value>]
[--max-capacity <value>]
[--max-retries <value>]
--output-location <value>
--role-arn <value>
[--tags <value>]
[--timeout <value>]
[--job-sample <value>]
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
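
As a minimal sketch of the synopsis above, the following shell snippet assembles a call that passes only the four required options. Every value (job name, dataset name, account ID, role name, bucket, key) is a hypothetical placeholder, and the run helper is an illustration-only wrapper, not part of the AWS CLI: it prints the command by default and executes it only when DRY_RUN=0.

```shell
# All values below are hypothetical placeholders; substitute your own resources.
JOB_NAME="example-profile-job"
DATASET_NAME="example-dataset"
ROLE_ARN="arn:aws:iam::111122223333:role/ExampleDataBrewRole"
OUTPUT_LOCATION="Bucket=example-bucket,Key=profile-output/"

# Illustration-only helper (not an AWS CLI feature):
# prints the command unless DRY_RUN=0, in which case it executes it.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Only the required options are passed here.
run aws databrew create-profile-job \
  --name "$JOB_NAME" \
  --dataset-name "$DATASET_NAME" \
  --role-arn "$ROLE_ARN" \
  --output-location "$OUTPUT_LOCATION"
```

Running the snippet with DRY_RUN=0 sends the request and therefore needs configured credentials and existing resources.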

Options

--dataset-name (string)

The name of the dataset that this job is to act upon.

--encryption-key-arn (string)

The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.

--encryption-mode (string)

The encryption mode for the job, which can be one of the following:

  • SSE-KMS - Server-side encryption with AWS KMS-managed keys.

  • SSE-S3 - Server-side encryption with keys managed by Amazon S3.

Possible values:

  • SSE-KMS

  • SSE-S3

--name (string)

The name of the job to be created. Valid characters are alphanumeric (A-Z, a-z, 0-9), hyphen (-), period (.), and space.

--log-subscription (string)

Enables or disables Amazon CloudWatch logging for the job. If logging is enabled, CloudWatch writes one log stream for each job run.

Possible values:

  • ENABLE

  • DISABLE

--max-capacity (integer)

The maximum number of nodes that DataBrew can use when the job processes data.

--max-retries (integer)

The maximum number of times to retry the job after a job run fails.

--output-location (structure)

An Amazon S3 location (bucket name and object key) where DataBrew can read input data or write output from a job.

Bucket -> (string)

The S3 bucket name.

Key -> (string)

The unique name of the object in the bucket.

Shorthand Syntax:

Bucket=string,Key=string

JSON Syntax:

{
  "Bucket": "string",
  "Key": "string"
}
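
The two syntaxes above describe the same structure. As a rough sketch, the snippet below builds both forms from the same pair of values so they can be compared side by side; the bucket name and key are hypothetical placeholders.

```shell
# Hypothetical placeholder values.
BUCKET="example-bucket"
KEY="profile-output/"

# Shorthand form: Bucket=string,Key=string
OUTPUT_SHORTHAND="Bucket=$BUCKET,Key=$KEY"

# JSON form: quote it with single quotes when passing it on the command line.
OUTPUT_JSON=$(printf '{"Bucket": "%s", "Key": "%s"}' "$BUCKET" "$KEY")

# Either line is a valid way to pass the option.
echo "--output-location $OUTPUT_SHORTHAND"
echo "--output-location '$OUTPUT_JSON'"
```

The shorthand form is usually easier to type; the JSON form avoids ambiguity when a value contains a comma or an equals sign.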

--role-arn (string)

The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role to be assumed when DataBrew runs the job.

--tags (map)

Metadata tags to apply to this job.

key -> (string)

value -> (string)

Shorthand Syntax:

KeyName1=string,KeyName2=string

JSON Syntax:

{
  "string": "string"
  ...
}

--timeout (integer)

The job’s timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.

--job-sample (structure)

Sample configuration for profile jobs only. Determines the number of rows on which the profile job runs. If no JobSample value is provided, the defaults are used: CUSTOM_ROWS for the Mode parameter and 20000 for the Size parameter.

Mode -> (string)

Determines whether the profile job will be executed on the entire dataset or on a specified number of rows. Must be one of the following:

  • FULL_DATASET: Profile job will be executed on the entire dataset.

  • CUSTOM_ROWS: Profile job will be executed on the number of rows specified in the Size parameter.

Size -> (long)

The Size parameter is required only when Mode is CUSTOM_ROWS. The profile job runs on the specified number of rows. The maximum value for Size is Long.MAX_VALUE.

Long.MAX_VALUE = 9223372036854775807

Shorthand Syntax:

Mode=string,Size=long

JSON Syntax:

{
  "Mode": "FULL_DATASET"|"CUSTOM_ROWS",
  "Size": long
}
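
The relationship between Mode and Size can be sketched as a small shell helper that produces the shorthand value for --job-sample. The function name job_sample_arg and the row count 5000 are assumptions made for illustration; the helper only encodes the rule stated above, that Size is required when Mode is CUSTOM_ROWS.

```shell
# Hypothetical helper: prints the shorthand value for --job-sample.
# Usage: job_sample_arg MODE [SIZE]
job_sample_arg() {
  mode="${1:-}"
  size="${2:-}"
  case "$mode" in
    FULL_DATASET)
      # Size is not needed when the whole dataset is profiled.
      echo "Mode=FULL_DATASET"
      ;;
    CUSTOM_ROWS)
      # Size is required in this mode.
      if [ -z "$size" ]; then
        echo "Size is required when Mode is CUSTOM_ROWS" >&2
        return 1
      fi
      echo "Mode=CUSTOM_ROWS,Size=$size"
      ;;
    *)
      echo "Unknown mode: $mode" >&2
      return 1
      ;;
  esac
}

# Prints: --job-sample Mode=CUSTOM_ROWS,Size=5000
echo "--job-sample $(job_sample_arg CUSTOM_ROWS 5000)"
```

Omitting --job-sample entirely falls back to the documented defaults (CUSTOM_ROWS with a size of 20000).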

--cli-input-json | --cli-input-yaml (string) Reads arguments from the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, those values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. This may not be specified along with --cli-input-yaml.

--generate-cli-skeleton (string) Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. Similarly, if provided yaml-input it will print a sample input YAML that can be used with --cli-input-yaml. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command.
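
As a sketch of the --cli-input-json workflow, the snippet below writes a request file and prints the command that would consume it. The file path and all values are hypothetical placeholders, and the key names are assumed to match what --generate-cli-skeleton prints for this command; treat the generated skeleton as the authoritative list of keys.

```shell
# Hypothetical file location; any writable path works.
INPUT_FILE="${TMPDIR:-/tmp}/create-profile-job-input.json"

# Request body with placeholder values. Key names are assumed to follow
# the skeleton printed by: aws databrew create-profile-job --generate-cli-skeleton
cat > "$INPUT_FILE" <<'EOF'
{
  "Name": "example-profile-job",
  "DatasetName": "example-dataset",
  "RoleArn": "arn:aws:iam::111122223333:role/ExampleDataBrewRole",
  "OutputLocation": {
    "Bucket": "example-bucket",
    "Key": "profile-output/"
  },
  "JobSample": {
    "Mode": "CUSTOM_ROWS",
    "Size": 5000
  }
}
EOF

# Print the command to run once credentials are configured.
echo "aws databrew create-profile-job --cli-input-json file://$INPUT_FILE"
```

Any option given on the command line alongside --cli-input-json overrides the matching value in the file.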

Output

Name -> (string)

The name of the job that was created.