CREATE AND RUN JOB

Description

Create a new Job, then immediately trigger a Job run.

Note that the SQL statement does not end with a semicolon (;).

Syntax

CREATE AND RUN JOB <job_name>
TYPE = { 'JAR' | 'PYTHON' }
PARAMETERS = <array>
CLUSTER = <string>

Example

Example command for creating and running a Job.

CREATE AND RUN JOB count_transactions
TYPE = 'JAR'
PARAMETERS = ( '--class', 'com.example.MySparkApp', '/path/to/my-spark-app.jar', 'arg1', 'arg2' )
CLUSTER = 'onehouse_cluster_spark'

Required parameters

  • <job_name>: Unique name to identify the Job.
  • TYPE: Specify the type of Job: either 'JAR' (for Java or Scala code) or 'PYTHON' (for a Python script).
  • PARAMETERS: Specify an array of strings to pass as parameters to the Job; these are passed to a spark-submit command (see the Apache Spark docs). The array should include the following:
    • [Required] For JAR Jobs, you must include the --class parameter.
    • [Required] Include the cloud storage bucket path containing the code for your Job. The Onehouse agent must have access to read this path.
    • [Optional] Include any other Spark properties you'd like the Job to use.
    • [Optional] Include any arguments you'd like to pass to the Job.
  • CLUSTER: Specify the name of an existing Onehouse Cluster with type Spark to run the Job.
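Combining the parameters above, a PYTHON Job might look like the following sketch. The script path, Spark property, and arguments are hypothetical placeholders, not values from your environment:

CREATE AND RUN JOB aggregate_daily_events
TYPE = 'PYTHON'
PARAMETERS = ( '--conf', 'spark.executor.memory=4g', '/path/to/my_script.py', '--date', '2024-01-01' )
CLUSTER = 'onehouse_cluster_spark'

Because TYPE is 'PYTHON', no --class parameter is needed; the cloud storage path points at the script itself, and anything after the script path is passed to it as arguments.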

Status API

Status API response

  • API_OPERATION_STATUS_SUCCESS from the Status API confirms that the Job has been created and submitted to the cluster, but does not reflect the status of the submitted Job. To monitor the Job's status, send a request to the DESCRIBE JOB_RUN API and refer to the sparkJobRun.status field in its Status API response.
  • API_OPERATION_STATUS_FAILED from the Status API indicates either that the Job was not created, or that the Job was created but its submission to the cluster failed. To check whether the Job was created, send a request to the DESCRIBE JOB API and refer to the sparkJobRun.latestJobStatus and sparkJobRun.latestJobRunSubmittedAt fields in its Status API response. You can also get the Job run id from sparkJobRun.latestJobRunId. To monitor the Job's status, send a request to the DESCRIBE JOB_RUN API and refer to the sparkJobRun.status field in its Status API response.
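As a sketch, the monitoring step above might look like the following, assuming DESCRIBE JOB_RUN takes the run id returned by this API (see the DESCRIBE JOB_RUN page for the exact syntax):

DESCRIBE JOB_RUN <job_run_id>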

Example Status API response

The Status API response for a successful creation and submission of a Job.

{
  "apiStatus": "API_OPERATION_STATUS_SUCCESS",
  "apiResponse": {
    "runJobApiResponse": {
      "jobRunId": "<job_run_id>"
    }
  }
}
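A minimal sketch of handling this response in client code, assuming you already have the Status API response as a JSON string (the variable names and the "run-12345" run id are illustrative):

```python
import json

# Example Status API response, shaped like the one shown above.
raw_response = """
{
  "apiStatus": "API_OPERATION_STATUS_SUCCESS",
  "apiResponse": {
    "runJobApiResponse": {
      "jobRunId": "run-12345"
    }
  }
}
"""

response = json.loads(raw_response)

# A successful status means the Job was created and submitted,
# not that the Job run itself has finished.
if response["apiStatus"] == "API_OPERATION_STATUS_SUCCESS":
    # Keep the run id so you can poll DESCRIBE JOB_RUN for the run's status.
    job_run_id = response["apiResponse"]["runJobApiResponse"]["jobRunId"]
    print(job_run_id)
```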