User functions#

Writing a DAG directly in the YAML configuration file works well for simple functions with one or two operations, but more complicated processing is easier to keep in Python code. Suppose we want to call a function weighted_average from the package my_analysis. This takes three steps:

  1. Install the Python package in the same environment as canesm-processor.

  2. Add the function call to the YAML file.

  3. Register the package (or a specific module within it) with canesm-processor.

Install the Package#

This is usually done with pip install my_analysis or conda install my_analysis after activating the appropriate conda environment. If the code is not installable, it must at least be importable from the script that registers it (see Register the Package).

Add to YAML#

my-pipeline.yaml#
monthly:
  variables:
    - OLR
    - FSR
    - FSO
    - BALT:
        dag:
          - function: myanalysis.weighted_average
            args: [OLR, FSR, FSO]
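
For illustration, the function called above could look something like the sketch below. This is a hypothetical implementation, not code from canesm-processor or from any real my_analysis package. It assumes that the entries in args are passed to the function positionally as already-loaded variables, and the optional weights parameter is invented for the example. It uses only basic arithmetic, so it should work on plain floats as well as on array types that overload +, * and /.

```python
# Hypothetical contents of my_analysis/__init__.py (illustrative only).
from typing import Optional, Sequence


def weighted_average(*fields, weights: Optional[Sequence[float]] = None):
    """Return the weighted average of the input fields.

    Assumption: each field arrives as a positional argument, in the
    order listed under `args` in the YAML file.
    """
    if not fields:
        raise ValueError("at least one field is required")
    if weights is None:
        weights = [1.0] * len(fields)  # equal weights by default
    if len(weights) != len(fields):
        raise ValueError("need exactly one weight per field")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    # Accumulate field * weight using only arithmetic operators, so
    # array-like inputs are combined element by element.
    weighted_sum = fields[0] * weights[0]
    for field, weight in zip(fields[1:], weights[1:]):
        weighted_sum = weighted_sum + field * weight
    return weighted_sum / total
```

With a signature like this, the three variables listed in args would be averaged with equal weight unless weights is supplied.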

Register the Package#

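Before running canesm_pipeline, register your code with register_module. The prefix must match the one used in your pipeline file, in this example "myanalysis". The prefix does not have to be the same as the import name: here the package is imported as my_analysis but registered as myanalysis.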

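from canproc import register_module
from canproc.pipelines import canesm_pipeline
from canproc.runners import DaskRunner
import my_analysis

# Make the functions in my_analysis available to the pipeline as "myanalysis.<name>".
register_module(my_analysis, prefix='myanalysis')

# Build the pipeline from the YAML file and run it with Dask.
pipeline = canesm_pipeline('my-pipeline.yaml')
runner = DaskRunner()
runner.run(pipeline)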