Multi-Step Workflow¶
This tutorial extends the previously created “Hello World” workflow by adding a second step. The second step executes the wc command and returns a text file with the word count, line count, and byte count of the file created by the “Hello World” app.
You can start from either your previous “Hello World” workflow or the GeneFlow workflow template, using one of the following commands:
git clone https://github.com/[USER]/hello-world-workflow-gf2.git hello-world-2step-workflow-gf2
Or
git clone https://gitlab.com/geneflow/workflows/workflow-template-gf2.git hello-world-2step-workflow-gf2
The “wc” app¶
The wc app has been created already. The app essentially executes the following command: wc input.file > output.file
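To get a feel for what the app produces, the command can be tried by hand. The sketch below is only an illustration: the file names input.file and output.file come from the command above, while the sample text written to input.file is an assumption and not necessarily what the “Hello World” app writes.

```shell
# Create a small sample input (the text is an assumption for illustration).
printf 'Hello World!\n' > input.file

# Same command the wc app is described as running.
wc input.file > output.file

# wc prints the line, word, and byte counts followed by the file name.
cat output.file
```

The exact column spacing of the wc output varies between platforms, but the order of the three counts is the same.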
You can clone the git repository, look over the app, and run the test script to get a better understanding of how it works.
Hello World output is the wc input¶
The most important parts of a multi-step workflow are the inputs and outputs of its apps, because the output of one app is usually the input of another.
wc app input and output¶
Let’s look at the input and output section of the “wc” app at https://gitlab.com/geneflow/apps/wc-gf2/blob/master/app.yaml.
inputs:
  file:
    label: Input File
    description: Input file
    type: File
    required: true
    test_value: ${SCRIPT_DIR}/data/file.txt
parameters:
  output:
    label: Output File
    description: Output file
    type: File
    required: true
    test_value: output.txt
We see that the “wc” app takes a file as the input in the “file” field. In our workflow, we will use the output file of the “Hello World” app as the input to the “wc” app.
Update the workflow.yaml file¶
Update the appropriate sections of the workflow.yaml file as follows:
vi ./hello-world-2step-workflow-gf2/workflow.yaml
Metadata¶
Update the metadata section with the new information for the workflow. Add - wc to final_output so that the output of the wc step is included in the final output.
# metadata
name: hello-world-2step-workflow-gf
description: Hello World two-step workflow
documentation_uri:
repo_uri: 'https://github.com/jiangweiyao/hello-world-2step-workflow-gf.git'
version: '0.1'
username: USER
final_output:
- hello
- wc
Apps¶
Update the entries in the “apps” section to include both the “hello-world” and “wc” apps. We will use the first version (0.1) of the “Hello World” app. You can use the app provided or substitute the name and version of your “Hello World” app.
apps:
  hello-world:
    git: https://github.com/[USER]/hello-world-gf2.git
    version: '0.1'
  word-count:
    git: https://gitlab.com/geneflow/apps/wc-gf2.git
    version: '0.1'
Steps¶
Add the wc app as the second step. Set the app: value to the name of the app as defined in the “apps” section (word-count in this example). The depend: value lists the steps that must complete before the current step runs. Set the “wc” step to depend on the “hello” step, since the output of the “hello-world” app is the input to the “wc” app. Set the file: option of “wc” to ${hello->output}/helloworld.txt, which specifies the “helloworld.txt” file produced in the “hello” step as the input to “wc”. Finally, set the output: option under the “wc” step to the name of the output file.
# steps
steps:
  hello:
    app: hello-world
    depend: []
    template:
      file: ${workflow->file}
      output: helloworld.txt
  wc:
    app: word-count
    depend: [ "hello" ]
    template:
      file: ${hello->output}/helloworld.txt
      output: wc.txt
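The data flow that these two steps describe can be imitated by hand with plain shell commands. This is only a rough sketch of what ${hello->output}/helloworld.txt resolves to, not how GeneFlow runs the steps internally; the job directory and the text written by the first step are assumptions made for illustration.

```shell
# Hypothetical job directory standing in for GeneFlow's per-job output folder.
job_dir="$(mktemp -d)"
mkdir -p "$job_dir/hello" "$job_dir/wc"

# Step "hello": writes helloworld.txt into its own output folder.
# (The file content is an assumption.)
printf 'Hello World!\n' > "$job_dir/hello/helloworld.txt"

# Step "wc": its input is <hello output folder>/helloworld.txt,
# mirroring "file: ${hello->output}/helloworld.txt" in the template.
wc "$job_dir/hello/helloworld.txt" > "$job_dir/wc/wc.txt"

cat "$job_dir/wc/wc.txt"
```

Because the second command reads a file that only exists after the first one has finished, the “wc” step has to list “hello” in its depend: value.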
Update Workflow README¶
Update the README.rst to include the relevant information.
Commit and Tag the New Workflow¶
We’ll use GitHub as an example, but the commands are similar for other repositories. If you cloned the workflow from an existing repository, delete the .git folder to create a new repository.
cd hello-world-2step-workflow-gf2
rm -rf .git
Create a new repository on GitHub named “hello-world-2step-workflow-gf2”. Push the code to GitHub using the following commands:
git init
git add .
git commit -m "1st commit"
git tag 0.1
git remote add origin https://github.com/[name]/hello-world-2step-workflow-gf2.git
git push -u origin master
git push origin 0.1
Be sure to replace [name] with your GitHub username.
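Depending on the Git version and configuration, git init may name the initial branch something other than master (for example, main), in which case the push command above needs the actual branch name. The following self-contained sketch shows one way to push whatever branch is current and then confirm that the 0.1 tag reached the remote. It uses a throwaway local bare repository as a stand-in for GitHub, and the file content, user name, and e-mail address are placeholders, so treat it as an illustration only.

```shell
# Throwaway directories; the bare repository plays the role of the GitHub remote.
workdir="$(mktemp -d)"
git init --quiet --bare "$workdir/remote.git"
git init --quiet "$workdir/workflow"
cd "$workdir/workflow"

# Placeholder content standing in for the real workflow files.
echo "name: demo" > workflow.yaml
git add .
git -c user.name="Demo" -c user.email="demo@example.com" commit --quiet -m "1st commit"
git tag 0.1
git remote add origin "$workdir/remote.git"

# Push the branch that git init actually created, whatever its name is.
branch="$(git branch --show-current)"
git push --quiet -u origin "$branch"
git push --quiet origin 0.1

# List the tags known to the remote; refs/tags/0.1 should appear.
remote_tags="$(git ls-remote --tags origin)"
echo "$remote_tags"
```

In a real repository, only the last few commands are needed: git branch --show-current (available in Git 2.22 and later) to find the branch name, and git ls-remote --tags origin to check the tag.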
Install and Test the Workflow¶
Now that the workflow has been committed to a Git repo, it can be installed anywhere:
geneflow install-workflow -g https://github.com/[name]/hello-world-2step-workflow-gf2.git -c --make_apps ./hello-world-2step
Make a dummy file named “test.txt”:
touch test.txt
Finally, test the workflow to validate its functionality:
geneflow run ./hello-world-2step -o output --in.file=test.txt
This command runs the workflow in the “hello-world-2step” directory using the test data and copies the output to the “output” directory. The outputs of the two steps are placed in separate folders.
tree ./output/geneflow-job-[JOB ID]
You should see the following file structure:
geneflow-job-50dd420d
├── hello
│   └── helloworld.txt
└── wc
    └── wc.txt
Summary¶
Congratulations! You created a two-step workflow that uses the output of one app as the input of the second app.