FalconCLI is a interface between user and Falcon. It is a command line utility provided by Falcon. FalconCLI supports Entity Management, Instance Management and Admin operations.There is a set of web services that are used by FalconCLI to interact with Falcon.
Submit option is used to set up entity definition.
Example: $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
Note: The url option in the above and all subsequent commands is optional. If not mentioned it will be picked from client.properties file. If the option is not provided and also not set in client.properties, Falcon CLI will fail.
Once submitted, an entity can be scheduled using schedule option. Process and feed can only be scheduled.
Usage: $FALCON_HOME/bin/falcon entity -type [process|feed] -name <<name>> -schedule
Example: $FALCON_HOME/bin/falcon entity -type process -name sampleProcess -schedule
Suspend on an entity results in suspension of the oozie bundle that was scheduled earlier through the schedule function. No further instances are executed on a suspended entity. Only schedule-able entities(process/feed) can be suspended.
Usage: $FALCON_HOME/bin/falcon entity -type [feed|process] -name <<name>> -suspend
Puts a suspended process/feed back to active, which in turn resumes applicable oozie bundle.
Usage: $FALCON_HOME/bin/falcon entity -type [feed|process] -name <<name>> -resume
Delete removes the submitted entity definition for the specified entity and put it into the archive.
Usage: $FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -name <<name>> -delete
Entities of a particular type can be listed with list sub-command.
Usage: $FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -list
Optional Args : -fields <<field1,field2>> -filterBy <<field1:value1,field2:value2>> -tags <<tagkey=tagvalue,tagkey=tagvalue>> -nameseq <<namesubsequence>> -orderBy <<field>> -sortOrder <<sortOrder>> -offset 0 -numResults 10
Summary of entities of a particular type and a cluster will be listed. Entity summary has N most recent instances of entity.
Usage: $FALCON_HOME/bin/falcon entity -type [feed|process] -summary
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -fields <<field1,field2>> -filterBy <<field1:value1,field2:value2>> -tags <<tagkey=tagvalue,tagkey=tagvalue>> -orderBy <<field>> -sortOrder <<sortOrder>> -offset 0 -numResults 10 -numInstances 7
Update operation allows an already submitted/scheduled entity to be updated. Cluster update is currently not allowed.
Usage: $FALCON_HOME/bin/falcon entity -type [feed|process] -name <<name>> -update -file <<path_to_file>>
Example: $FALCON_HOME/bin/falcon entity -type process -name HourlyReportsGenerator -update -file /process/definition.xml
Force Update operation allows an already submitted/scheduled entity to be updated.
Usage: $FALCON_HOME/bin/falcon entity -type [feed|process] -name <<name>> -touch
Status returns the current status of the entity.
Usage: $FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -name <<name>> -status
With the use of dependency option, we can list all the entities on which the specified entity is dependent. For example for a feed, dependency return the cluster name and for process it returns all the input feeds, output feeds and cluster names.
Usage: $FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -name <<name>> -dependency
Definition option returns the entity definition submitted earlier during submit step.
Usage: $FALCON_HOME/bin/falcon entity -type [cluster|feed|process] -name <<name>> -definition
Lookup option tells you which feed does a given path belong to. This can be useful in several scenarios e.g. generally you would want to have a single definition for common feeds like metadata with same location otherwise it can result in a problem (different retention durations can result in surprises for one team) If you want to check if there are multiple definitions of same metadata then you can pick an instance of that and run through the lookup command like below.
Usage: $FALCON_HOME/bin/falcon entity -type feed -lookup -path /data/projects/my-hourly/2014/10/10/23/
If you have multiple feeds with location as /data/projects/my-hourly/${YEAR}/${MONTH}/${DAY}/${HOUR} then this command will return all of them.
Kill sub-command is used to kill all the instances of the specified process whose nominal time is between the given start time and end time.
Note: 1. The start time and end time needs to be specified in TZ format. Example: 01 Jan 2012 01:00 => 2012-01-01T01:00Z
3. Process name is compulsory parameter for each instance management command.
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -kill -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
Suspend is used to suspend a instance or instances for the given process. This option pauses the parent workflow at the state, which it was in at the time of execution of this command.
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -suspend -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
Continue option is used to continue the failed workflow instance. This option is valid only for process instances in terminal state, i.e. KILLED or FAILED.
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -continue -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
Rerun option is used to rerun instances of a given process. On issuing a rerun, by default the execution resumes from the last failed node in the workflow. This option is valid only for process instances in terminal state, i.e. SUCCEEDED, KILLED or FAILED. If one wants to forcefully rerun the entire workflow, -force should be passed along with -rerun Additionally, you can also specify properties to override via a properties file.
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -rerun -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" [-force] [-file <<properties file>>]
Resume option is used to resume any instance that is in suspended state.
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -resume -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
Status option via CLI can be used to get the status of a single or multiple instances. If the instance is not yet materialized but is within the process validity range, WAITING is returned as the state. Along with the status of the instance time is also returned. Log location gives the oozie workflow url If the instance is in WAITING state, missing dependencies are listed. The job urls are populated for all actions of user workflow and non-succeeded actions of the main-workflow. The user then need not go to the underlying scheduler to get the job urls when needed to debug an issue in the job.
Example : Suppose a process has 3 instance, one has succeeded,one is in running state and other one is waiting, the expected output is:
{"status":"SUCCEEDED","message":"getStatus is successful","instances":[{"instance":"2012-05-07T05:02Z","status":"SUCCEEDED","logFile":"http://oozie-dashboard-url"},{"instance":"2012-05-07T05:07Z","status":"RUNNING","logFile":"http://oozie-dashboard-url"}, {"instance":"2010-01-02T11:05Z","status":"WAITING"}]
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -status
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -colo <<colo>> -filterBy <<field1:value1,field2:value2>> -lifecycle <<lifecycles>> -orderBy field -sortOrder <<sortOrder>> -offset 0 -numResults 10
List option via CLI can be used to get single or multiple instances. If the instance is not yet materialized but is within the process validity range, WAITING is returned as the state. Instance time is also returned. Log location gives the oozie workflow url If the instance is in WAITING state, missing dependencies are listed
Example : Suppose a process has 3 instance, one has succeeded,one is in running state and other one is waiting, the expected output is:
{"status":"SUCCEEDED","message":"getStatus is successful","instances":[{"instance":"2012-05-07T05:02Z","status":"SUCCEEDED","logFile":"http://oozie-dashboard-url"},{"instance":"2012-05-07T05:07Z","status":"RUNNING","logFile":"http://oozie-dashboard-url"}, {"instance":"2010-01-02T11:05Z","status":"WAITING"}]
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -list
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -colo <<colo>> -lifecycle <<lifecycles>> -filterBy <<field1:value1,field2:value2>> -orderBy field -sortOrder <<sortOrder>> -offset 0 -numResults 10
Summary option via CLI can be used to get the consolidated status of the instances between the specified time period. Each status along with the corresponding instance count are listed for each of the applicable colos. The unscheduled instances between the specified time period are included as UNSCHEDULED in the output to provide more clarity.
Example : Suppose a process has 3 instance, one has succeeded,one is in running state and other one is waiting, the expected output is:
{"status":"SUCCEEDED","message":"getSummary is successful", "cluster": <<name>> [{"SUCCEEDED":"1"}, {"WAITING":"1"}, {"RUNNING":"1"}]}
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -summary
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -colo <<colo>> -lifecycle <<lifecycles>>
Running option provides all the running instances of the mentioned process.
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -running
Optional Args : -colo <<colo>> -lifecycle <<lifecycles>> -filterBy <<field1:value1,field2:value2>> -orderBy <<field>> -sortOrder <<sortOrder>> -offset 0 -numResults 10
Get falcon feed instance availability.
Usage: $FALCON_HOME/bin/falcon instance -type feed -name <<name>> -listing
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -colo <<colo>>
Get logs for instance actions
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -logs
Optional Args : -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'" -runid <<runid>> -colo <<colo>> -lifecycle <<lifecycles>> -filterBy <<field1:value1,field2:value2>> -orderBy field -sortOrder <<sortOrder>> -offset 0 -numResults 10
Describes list of life cycles of a entity , for feed it can be replication/retention and for process it can be execution. This can be used with instance management options. Default values are replication for feed and execution for process.
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -status -lifecycle <<lifecycletype>> -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
Displays the workflow params of a given instance. Where start time is considered as nominal time of that instance and end time won't be considered.
Usage: $FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -params -start "yyyy-MM-dd'T'HH:mm'Z'"
dot format. You can use the output and view a graphical representation of DAG using an online graphviz viewer like this.
Usage:
$FALCON_HOME/bin/falcon metadata -lineage -pipeline my-pipeline
pipeline is a mandatory option.
Get the vertex with the specified id.
Usage: $FALCON_HOME/bin/falcon metadata -vertex -id <<id>>
Example: $FALCON_HOME/bin/falcon metadata -vertex -id 4
Get all vertices for a key index given the specified value.
Usage: $FALCON_HOME/bin/falcon metadata -vertices -key <<key>> -value <<value>>
Example: $FALCON_HOME/bin/falcon metadata -vertices -key type -value feed-instance
Get the adjacent vertices or edges of the vertex with the specified direction.
Usage: $FALCON_HOME/bin/falcon metadata -edges -id <<vertex-id>> -direction <<direction>>
Example: $FALCON_HOME/bin/falcon metadata -edges -id 4 -direction both $FALCON_HOME/bin/falcon metadata -edges -id 4 -direction inE
Get the edge with the specified id.
Usage: $FALCON_HOME/bin/falcon metadata -edge -id <<id>>
Example: $FALCON_HOME/bin/falcon metadata -edge -id Q9n-Q-5g
Lists of all dimensions of given type. If the user provides optional param cluster, only the dimensions related to the cluster are listed. Usage: $FALCON_HOME/bin/falcon metadata -list -type [cluster_entity|feed_entity|process_entity|user|colo|tags|groups|pipelines]
Optional Args : -cluster <<cluster name>>
Example: $FALCON_HOME/bin/falcon metadata -list -type process_entity -cluster primary-cluster $FALCON_HOME/bin/falcon metadata -list -type tags
List all dimensions related to specified Dimension identified by dimension-type and dimension-name. Usage: $FALCON_HOME/bin/falcon metadata -relations -type [cluster_entity|feed_entity|process_entity|user|colo|tags|groups|pipelines] -name <<Dimension Name>>
Example: $FALCON_HOME/bin/falcon metadata -relations -type process_entity -name sample-process
Version returns the current version of Falcon installed. Usage: $FALCON_HOME/bin/falcon admin -version
Status returns the current state of Falcon (running or stopped). Usage: $FALCON_HOME/bin/falcon admin -status
Submit the specified recipe.
Usage: $FALCON_HOME/bin/falcon recipe -name <name> Name of the recipe. User should have defined <name>-template.xml and <name>.properties in the path specified by falcon.recipe.path in client.properties file. falcon.home path is used if its not specified in client.properties file. If its not specified in client.properties file and also if files cannot be found at falcon.home, Falcon CLI will fail.
Optional Args : -tool <recipeToolClassName> Falcon provides a base tool that recipes can override. If this option is not specified the default Recipe Tool RecipeTool defined is used. This option is required if user defines his own recipe tool class.
Example: $FALCON_HOME/bin/falcon recipe -name hdfs-replication