How to use the REST API to Resume a Failed Workflow

One of the relatively new 5.x features of vCenter Orchestrator (vCO) is the ability to enable a workflow to resume on failure. Essentially, this means that a workflow could fail halfway or three-quarters of the way through, and you could tell vCO to resume that workflow, perhaps after fixing whatever issue caused it to fail in the first place, rather than starting a fresh instance of the workflow.

Introduction

introduction.png

As noted in the intro snippet, vCO now has the ability to let you resume a failed workflow. See the vCO Documentation on Resuming a Failed Workflow to learn more about this feature and get it set up. (I recommend enabling it on a workflow-by-workflow basis only.) This feature can be quite helpful, as it automatically generates a User Interaction prompt when your workflow fails, allowing you to resume the workflow from where it left off. For instance, your target environment may lack resources for a deployment after the workflow has already progressed through several steps of external integration (e.g. generated a Helpdesk request for tracking, reserved an IP address, etc.). Resuming lets you fix the problem and continue, rather than rolling everything back and starting all over each time a workflow fails.

Failed workflow appearance when Enabled

failed_workflow_appearance_when_enabled.png
  • When that option is enabled, the workflow does not end up in a permanently failed state. Instead, upon failure it enters a “Waiting” state for an interaction, as depicted above by the icon next to the workflow execution.
  • The Schema shows you where the workflow had failed by highlighting the failed element in Red.
  • The Variable tab will show the Exception details in the “Exception” window at the bottom in RED TEXT.

Using the vCO Client to Answer

using_the_vco_client_to_answer.png

The process to resume a failed workflow using the vCO Client is the same as answering a User Interaction: right-click the workflow execution, then select the “Answer” link.

The Workflow interaction window will come up, allowing you to choose to either “Resume” the workflow or “Cancel” the workflow.

If you chose Cancel and hit Submit, the workflow would cancel out and would no longer be a viable execution to resume.

media_1398455105328.png

However, if you chose to “Resume”, the “Parameters” section of input gets loaded with all the Input Parameters for your workflow, allowing you to modify as needed before submitting the workflow to complete from where it had failed.

Okay, great but the title said REST API…

Right, so I wanted to lay a little groundwork to make sure you understood the general flow of a failed workflow and what the UI process was before we go off to XML land for the REST API.

Before you continue on, be sure you have:

  1. Set the “Resume from failed behavior” to “Enabled” on your test workflow
  2. Executed the workflow and gotten it to fail before completing (feel free to use the attached test workflow at the bottom of this article)

Retrieve the Workflow Executions list

retrieve_the_workflow_executions_list.png

Reminder: vCO API Documentation can be found on your vCO Server – https://your-vco-server:8281/vco/api/docs

To retrieve the details of our failed execution, we need the following information:

  • vCO API URL format –> https://your-vco-server:8281/vco/api/workflows/<workflow-ID>/executions/<workflow-execution-id>
  • Workflow ID –> See item 1 in Screenshot above –> The workflow ID will remain the same across vCO instances. So, if you import the workflow attached to this post, your id will be the same.
  • Workflow Execution ID –> See item 2 in Screenshot above –> this is your workflow execution ID, it is unique for every run of the workflow.
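The URL format above can be assembled in a few lines. This is a minimal sketch in Python, assuming the server accepts HTTP Basic authentication and listens on port 8281 as in the examples; the helper names execution_url and build_get_request are hypothetical and not part of any vCO client library.

```python
import base64
import urllib.request


def execution_url(host, workflow_id, execution_id, port=8281):
    """Build the URL of a single workflow execution."""
    return (
        f"https://{host}:{port}/vco/api/workflows/"
        f"{workflow_id}/executions/{execution_id}/"
    )


def build_get_request(url, username, password):
    """Prepare (but do not send) a GET request that asks for XML.

    Assumption: the server accepts HTTP Basic authentication.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={"Accept": "application/xml", "Authorization": f"Basic {token}"},
        method="GET",
    )


url = execution_url(
    "your-vco-server",
    "883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb",
    "ff808081458f848b01459a60144f0723",
)
request = build_get_request(url, "user", "password")
# urllib.request.urlopen(request) would send it; omitted here because it needs a live server.
```

Keeping the URL construction separate from the request makes it easy to reuse the same helper for the interaction and presentation URLs later on.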

Review results

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workflow-execution
    xmlns="http://www.vmware.com/vco" href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/">
    <relations>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/" rel="up"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/" rel="remove"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/logs/" rel="logs"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/state/" rel="state"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/interaction/" rel="interaction"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/state/" rel="cancel"/>
    </relations>
    <id>ff808081458f848b01459a60144f0723</id>
    <state>waiting</state>
    <input-parameters>
        <parameter type="boolean" name="isFailWorkflow" scope="local">
            <boolean>true</boolean>
        </parameter>
    </input-parameters>
    <output-parameters/>
    <start-date>2014-04-25T15:32:39.118-04:00</start-date>
    <business-state>Default System Error Handling for item: item2</business-state>
    <started-by>bazbill@VCOTEAM.LAB</started-by>
    <name>3A) Resume tester</name>
    <content-exception>Workflow failed because user decided so</content-exception>
    <current-item-display-name>Workflow Error System Handler</current-item-display-name>
</workflow-execution>

Based on the above information, the URL I need to use is: https://my-vco-server:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/ (Be sure to adjust YOUR request to reflect YOUR workflow ID and Execution ID).

Upon submitting a GET request to that URL, the XML above is returned.

We can see in this execution that there is an “interaction” link (the link with rel="interaction"), the state is “waiting”, and a “content-exception” tag is present. Together, these three indicate that the workflow has failed, the Resume feature is enabled, and the execution is waiting for a user interaction.
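That three-part check can be sketched in Python with the standard library XML parser. The function name resumable_interaction_href is hypothetical, and the sample is a trimmed copy of the execution XML above rather than a full server response.

```python
import xml.etree.ElementTree as ET

VCO_NS = {"vco": "http://www.vmware.com/vco"}


def resumable_interaction_href(execution_xml):
    """Return the interaction link if the execution failed and awaits an answer, else None."""
    root = ET.fromstring(execution_xml)
    link = root.find("vco:relations/vco:link[@rel='interaction']", VCO_NS)
    state = root.findtext("vco:state", default="", namespaces=VCO_NS)
    exception = root.find("vco:content-exception", VCO_NS)
    # All three indicators must be present, as described in the text above.
    if link is not None and state == "waiting" and exception is not None:
        return link.get("href")
    return None


BASE = (
    "https://vco55.vcoteam.lab:8281/vco/api/workflows/"
    "883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/"
)
sample = f"""<workflow-execution xmlns="http://www.vmware.com/vco">
  <relations>
    <link href="{BASE}interaction/" rel="interaction"/>
  </relations>
  <state>waiting</state>
  <content-exception>Workflow failed because user decided so</content-exception>
</workflow-execution>"""

interaction_href = resumable_interaction_href(sample)
```

Returning the link itself, rather than a plain true/false, gives you the URL needed for the next GET.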

We now have the link to the interaction, so we can learn more about it by performing a GET on that URL…

GET the interaction

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<user-interaction 
    xmlns="http://www.vmware.com/vco" href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/interaction/">
    <relations>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/" rel="up"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/interaction/presentation/" rel="down"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/interaction/" rel="add"/>
    </relations>
    <input-parameters>
        <parameter type="boolean" name="isFailWorkflow"/>
        <parameter type="Date" name="resume.fail.timeout.date"/>
        <parameter type="string" name="__System_Action"/>
    </input-parameters>
    <name>3A) Resume tester : Workflow Error System Handler</name>
    <state>waiting</state>
</user-interaction>

The interaction XML lists the inputs needed for the interaction (see the input-parameters element), but it doesn’t provide any decorators, default values, etc.

If you wish to see the current values and other presentation options, you can drill down to the /interaction/presentation URL (the link with rel="down").
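As a rough illustration, the input names and the presentation link can be pulled out of the interaction XML like this. The function name describe_interaction is hypothetical, and the sample is a trimmed copy of the response shown above.

```python
import xml.etree.ElementTree as ET

VCO_NS = {"vco": "http://www.vmware.com/vco"}


def describe_interaction(interaction_xml):
    """Return ({input name: type}, presentation href or None) for an interaction."""
    root = ET.fromstring(interaction_xml)
    inputs = {
        parameter.get("name"): parameter.get("type")
        for parameter in root.findall("vco:input-parameters/vco:parameter", VCO_NS)
    }
    down = root.find("vco:relations/vco:link[@rel='down']", VCO_NS)
    return inputs, (down.get("href") if down is not None else None)


BASE = (
    "https://vco55.vcoteam.lab:8281/vco/api/workflows/"
    "883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/"
)
sample = f"""<user-interaction xmlns="http://www.vmware.com/vco">
  <relations>
    <link href="{BASE}interaction/presentation/" rel="down"/>
  </relations>
  <input-parameters>
    <parameter type="boolean" name="isFailWorkflow"/>
    <parameter type="Date" name="resume.fail.timeout.date"/>
    <parameter type="string" name="__System_Action"/>
  </input-parameters>
</user-interaction>"""

inputs, presentation_href = describe_interaction(sample)
```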

Presentation XML

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<presentation
    xmlns="http://www.vmware.com/vco" id="883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb:ff808081458f848b01459a60144f0723" name="3A) Resume tester : Workflow Error System Handler" href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/interaction/presentation/">
    <relations>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/interaction/" rel="up"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/interaction/presentation/instances/" rel="down"/>
        <link href="https://vco55.vcoteam.lab:8281/vco/api/workflows/883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/ff808081458f848b01459a60144f0723/interaction/presentation/instances/" rel="add"/>
    </relations>
    <steps>
        <step hidden="false">
            <display-name>Error in workflow</display-name>
            <description>Workflow execution has stopped on error</description>
            <messages/>
            <group hidden="false">
                <messages/>
                <fields>
                    <field type="string" id="__System_Action" hidden="false">
                        <display-name>Choose action to continue</display-name>
                        <description>Choose action to continue</description>
                        <messages/>
                        <constraints/>
                        <decorators>
                            <refresh-on-change/>
                            <drop-down>
                                <array>
                                    <string>Cancel</string>
                                    <string>Resume</string>
                                </array>
                            </drop-down>
                        </decorators>
                        <fields/>
                    </field>
                </fields>
            </group>
        </step>
        <step hidden="false">
            <display-name>Parameters</display-name>
            <description>Modify the parameters for resume</description>
            <messages/>
            <group hidden="false">
                <messages/>
                <fields>
                    <field type="boolean" id="isFailWorkflow" hidden="false">
                        <display-name>isFailWorkflow</display-name>
                        <description>isFailWorkflow</description>
                        <messages/>
                        <constraints/>
                        <decorators/>
                        <fields/>
                        <boolean>true</boolean>
                    </field>
                    <field type="Date" id="resume.fail.timeout.date" hidden="true">
                        <display-name>resume.fail.timeout.date</display-name>
                        <description>resume.fail.timeout.date</description>
                        <messages/>
                        <constraints/>
                        <decorators/>
                        <fields/>
                        <date>2014-04-26T15:32:40-04:00</date>
                    </field>
                </fields>
            </group>
        </step>
    </steps>
    <input-parameters>
        <parameter description="Choose action to continue" type="string" name="__System_Action"/>
        <parameter description="isFailWorkflow" type="boolean" name="isFailWorkflow"/>
        <parameter description="resume.fail.timeout.date" type="Date" name="resume.fail.timeout.date"/>
    </input-parameters>
    <output-parameters>
        <parameter type="string" name="__System_Action"/>
        <parameter type="Date" name="resume.fail.timeout.date"/>
        <parameter type="boolean" name="isFailWorkflow"/>
    </output-parameters>
</presentation>

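In the XML above, the second step (“Parameters”) carries the extra information about the running workflow: the current value of each input, such as isFailWorkflow set to true. The first step holds the “Choose action to continue” drop-down with its “Cancel” and “Resume” options.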
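If you would rather read those values programmatically, the sketch below collects each field's type, current value, and drop-down options. It assumes the layout seen in the presentation XML above, where a field's value (when present) is the one child element that is not display-name, description, messages, constraints, decorators, or fields; the function name read_fields is hypothetical.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.vmware.com/vco"
NS = {"vco": NS_URI}
# Child elements of <field> that describe the field rather than hold its value.
STRUCTURAL = {"display-name", "description", "messages", "constraints", "decorators", "fields"}


def read_fields(presentation_xml):
    """Map each field id to its type, current value and drop-down options."""
    root = ET.fromstring(presentation_xml)
    result = {}
    for field in root.iter(f"{{{NS_URI}}}field"):
        # Assumption: the value is the first child whose local name is not structural.
        value = next(
            (child.text for child in field if child.tag.split("}", 1)[-1] not in STRUCTURAL),
            None,
        )
        options = [
            option.text
            for option in field.findall("vco:decorators/vco:drop-down/vco:array/vco:string", NS)
        ]
        result[field.get("id")] = {"type": field.get("type"), "value": value, "options": options}
    return result


sample = """<presentation xmlns="http://www.vmware.com/vco">
  <steps>
    <step hidden="false">
      <group hidden="false">
        <fields>
          <field type="string" id="__System_Action" hidden="false">
            <decorators>
              <drop-down><array><string>Cancel</string><string>Resume</string></array></drop-down>
            </decorators>
            <fields/>
          </field>
          <field type="boolean" id="isFailWorkflow" hidden="false">
            <decorators/>
            <fields/>
            <boolean>true</boolean>
          </field>
        </fields>
      </group>
    </step>
  </steps>
</presentation>"""

fields = read_fields(sample)
```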

Prepare POST BODY

As noted earlier, resuming a failed workflow is similar to answering a User Interaction. The vCO Develop Web Service Documentation provides details on this.

After a little testing, I found that the necessary params for my workflow to resume were the “isFailWorkflow” (this was my Input Parameter for the workflow - if your workflow has additional inputs, you should populate them as well) and the “__System_Action” Parameter. The “__System_Action” Parameter is what we saw at the beginning of the article - the drop-down with “Resume” and “Cancel”. The third available parameter (resume.fail.timeout.date) is not needed when Submitting the body to resume or cancel the workflow.

Here’s the body required to answer the attached test workflow to get it to Resume using the Resume Failed Workflow feature:

<execution-context xmlns="http://www.vmware.com/vco">
   <parameters>
     <parameter name="__System_Action" type="string">
       <string>Resume</string>
     </parameter>
     <parameter name="isFailWorkflow" type="boolean">
       <boolean>false</boolean>
     </parameter>
   </parameters>
</execution-context>
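The same body can be generated and wrapped in a POST request from Python. Treat this as a sketch under a few assumptions: the POST target is the interaction URL (the link with rel="add" in the interaction XML), each value is wrapped in an element named after its type as in the GET responses earlier (boolean, string, date), and authentication headers are left out and must be added as your server requires. The helper names build_answer_body and build_post_request are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

NS_URI = "http://www.vmware.com/vco"
# Value element names seen in this article's responses for each parameter type.
VALUE_TAGS = {"string": "string", "boolean": "boolean", "Date": "date"}


def build_answer_body(action, inputs):
    """Build the execution-context XML that answers the resume interaction.

    action is "Resume" or "Cancel"; inputs is an iterable of (name, type, value) tuples.
    """
    root = ET.Element("execution-context", {"xmlns": NS_URI})
    parameters = ET.SubElement(root, "parameters")
    for name, type_, value in [("__System_Action", "string", action), *inputs]:
        parameter = ET.SubElement(parameters, "parameter", {"name": name, "type": type_})
        ET.SubElement(parameter, VALUE_TAGS[type_]).text = value
    return ET.tostring(root, encoding="utf-8")


def build_post_request(interaction_url, body):
    """Prepare (but do not send) the POST that submits the answer."""
    return urllib.request.Request(
        interaction_url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


body = build_answer_body("Resume", [("isFailWorkflow", "boolean", "false")])
request = build_post_request(
    "https://your-vco-server:8281/vco/api/workflows/"
    "883af9aa-7b98-4c6a-8cf5-6ec54f28c3cb/executions/"
    "ff808081458f848b01459a60144f0723/interaction/",
    body,
)
# urllib.request.urlopen(request) would submit it; omitted here because it needs a live server.
```

Passing "Cancel" instead of "Resume" as the action would cancel the execution, just like the drop-down in the vCO Client.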

Summary

This article has given a quick intro to a cool vCO feature, with a light walk-through of using it from the vCO Client and the steps needed to take advantage of it over the REST API :)