Migrating from Prefect 1.x to Prefect 2.x #369
annshress started this conversation in Show and tell
You can tell a flow to return the state like this (see the sketch below; from: https://docs.prefect.io/2.14.3/concepts/states/#return-prefect-state). This should work the same if the return value is [...]. What complicates matters is that many Hedwig tasks are Mapped, so the return value is a List of States (or Futures).
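A minimal sketch of that pattern, assuming a trivial flow (names are illustrative):

```python
from prefect import flow, task

@task
def add_one(x: int) -> int:
    return x + 1

@flow
def my_flow() -> int:
    return add_one(1)

# Calling with return_state=True yields the final State instead of the value;
# .is_completed() checks for success and .result() unwraps the return value.
state = my_flow(return_state=True)
assert state.is_completed()
assert state.result() == 2
```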
(This is a work in progress. Please comment below if there is anything relevant that should be added to this topic.)
Prefect Recipes
Changes in workflow code
Flow registration now occurs through a decorator: @flow.

upstream_tasks changes to wait_for.
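A minimal sketch of both changes, with illustrative task names:

```python
from prefect import flow, task

@task
def extract():
    return [1, 2, 3]

@task
def report():
    print("extraction finished")

@flow  # Prefect 2: decorating the function is the "registration"; no explicit register step
def pipeline():
    data = extract.submit()
    # Prefect 1's upstream_tasks keyword becomes wait_for in Prefect 2:
    # report runs only after extract has finished, even without a data dependency.
    report.submit(wait_for=[data])
```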
❗️NOTE❗️ Upstream task failure does not trigger failure in downstream tasks. In Prefect 1.x there is a state called TriggerFailed, created when an upstream task ends in a Failed state. Read More. In Prefect 2.x, TriggerFailed does not exist; the downstream task stays in a NotReady state because the upstream task did not Complete. In order to allow TriggerFailed-like behavior, we should use the allow_failure(previous_task_run) annotation. [Annotations Docs] ... More information below.
There are state change hooks such as on_completion and on_failure (check the docs).
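A hedged sketch of such hooks on a flow; the hook names are illustrative, and each hook receives the flow, the flow run, and the final state:

```python
from prefect import flow

def notify_success(flow, flow_run, state):
    # Called when the flow run reaches a Completed state
    print(f"{flow_run.name} finished: {state.type}")

def notify_failure(flow, flow_run, state):
    # Called when the flow run reaches a Failed state
    print(f"{flow_run.name} failed: {state.message}")

@flow(on_completion=[notify_success], on_failure=[notify_failure])
def my_flow():
    return "done"
```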
We might need some tasks to always run no matter what happens to the previous task (for example, cleaning up temporary directories). For that we can use the allow_failure annotation. Check here.
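A minimal sketch of that cleanup pattern, assuming hypothetical task names:

```python
from prefect import allow_failure, flow, task

@task
def process(path: str):
    raise ValueError(f"could not process {path}")

@task
def cleanup_tmp_dirs():
    print("removing temporary directories")

@flow
def pipeline():
    result = process.submit("/tmp/data")
    # Without allow_failure, cleanup_tmp_dirs would stay NotReady because its
    # upstream dependency failed; wrapping the future lets it run regardless.
    cleanup_tmp_dirs.submit(wait_for=[allow_failure(result)])
```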
If you are using Dask, the DaskTaskRunner is now its own package. Link
Prefect captures all exceptions and re-raises them. In Prefect 1, we normally raised signals.Fail to make the process smoother. You can still retain that behavior by using return_state=True.

Subflows. One big takeaway of Prefect 2 is the ease of using subflows. With subflows, you can have different task runners acting on different tasks under the same flow. Docs
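A minimal sketch, with illustrative tasks, of a subflow running on a different task runner than its parent:

```python
from prefect import flow, task
from prefect.task_runners import ConcurrentTaskRunner, SequentialTaskRunner

@task
def convert(item: int) -> int:
    return item * 2

@flow(task_runner=ConcurrentTaskRunner())
def convert_all(items: list[int]):
    # The subflow fans its tasks out concurrently on its own task runner...
    return [convert.submit(i) for i in items]

@flow(task_runner=SequentialTaskRunner())
def parent():
    # ...while the parent flow keeps its own, sequential task runner.
    return convert_all([1, 2, 3])
```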
Upstream tasks failed
This test in Prefect is a good example of how Prefect handles upstream failures. The downstream tasks are moved to a Pending state named NotReady, which effectively behaves like a terminal state even though it is not one. This impacts the execution of downstream tasks, especially if you are using the wait_for argument and the upstream task you are waiting for has failed: the given task is never executed. To handle this, try to minimize the use of wait_for and instead inject the upstream dependencies as arguments to the task function call (as shown in the example). In the end, you could manually get the state result as sketched below.
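A hedged sketch of doing that manually with mapped futures (the task name is illustrative):

```python
from prefect import flow, task

@task
def convert(path: str) -> str:
    if path.endswith("bad"):
        raise ValueError(f"cannot convert {path}")
    return path + ".png"

@flow
def collect_results(paths: list[str]) -> list[str]:
    futures = convert.map(paths)
    # Resolve each mapped future to its final State, then keep only the
    # results of the children that actually completed.
    states = [future.wait() for future in futures]
    return [state.result() for state in states if state.is_completed()]
```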
Importance of having good task dependencies
Aim to return a single important entity that is used in the downstream tasks. Assume we have the following tasks:
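A hypothetical sketch of the two approaches; only task_jpg_to_png comes from this write-up, the other names are illustrative:

```python
from prefect import flow, task

@task
def task_jpg_to_png(jpg: str) -> str:
    return jpg.replace(".jpg", ".png")

@task
def task_make_montage(png: str) -> str:
    return png.replace(".png", "_montage.png")

# Approach 1: wait_for only expresses ordering. Every downstream child waits
# on the entire list of png futures, so a single failure blocks all of them.
@flow
def pipeline_with_wait_for(jpgs: list[str]):
    pngs = task_jpg_to_png.map(jpgs)
    expected_pngs = [j.replace(".jpg", ".png") for j in jpgs]
    task_make_montage.map(expected_pngs, wait_for=pngs)

# Approach 2: pass the png futures directly as inputs, so each downstream
# child depends only on its own upstream element.
@flow
def pipeline_with_futures(jpgs: list[str]):
    pngs = task_jpg_to_png.map(jpgs)
    task_make_montage.map(pngs)
```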
In the second approach we avoided the wait_for argument because we passed a Prefect future (pngs) directly as an input to the next task.

To understand the subtle difference between the two pieces of code, try the following: in the task task_jpg_to_png, raise an exception for only one of the inputs. You will see that in the first approach the downstream task won't run for any of the mapped inputs, while in the second approach it runs for the other, non-failed inputs.

Getting Upstream Task Dependency
Scenario: Assume one of our downstream tasks (let's call it DF, for 'Downstream Failed') did not run, which means it is still in the Pending (NotReady) state. To find the culprit upstream task, we need to know which tasks DF depended on.

So we need to get the task_run_id of DF and walk up the stream. Since task run ids are generated by the Prefect server rather than by Prefect workers, we need to make API calls to the server to get the input task_run ids that DF depends on (the API is GET /api/task_runs/{id}). This is a cumbersome approach with a lot of API calls, which can be slow, so we are looking into how we can get this information while staying on the worker side.
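A rough sketch of that traversal using the Prefect 2 Python client; it assumes the stuck task run's id is known and relies on the task_inputs field of the task-run schema:

```python
import asyncio
from uuid import UUID

from prefect.client.orchestration import get_client

async def upstream_task_run_ids(task_run_id: str) -> list[str]:
    async with get_client() as client:
        # Equivalent to GET /api/task_runs/{id}
        task_run = await client.read_task_run(UUID(task_run_id))
        ids = []
        for refs in task_run.task_inputs.values():
            for ref in refs:
                # TaskRunResult references carry the upstream task run's id;
                # Parameter/Constant inputs do not, so they are skipped.
                run_id = getattr(ref, "id", None)
                if run_id is not None:
                    ids.append(str(run_id))
        return ids

# asyncio.run(upstream_task_run_ids("<task-run-uuid>"))
```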
Logging
You cannot access prefect.context.logger outside a flow or a task. In Prefect 2 you will get a MissingContextError if you call get_run_logger outside a flow/task. The way to access the logger is explained here. [This needs elaboration]
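A minimal sketch of the Prefect 2 pattern with get_run_logger:

```python
from prefect import flow, get_run_logger, task

@task
def do_work():
    logger = get_run_logger()   # raises MissingContextError outside a run context
    logger.info("working")

@flow
def my_flow():
    logger = get_run_logger()
    logger.info("starting")
    do_work()
```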
Testing

Testing flows: Without prefect_test_harness(), the flow calls in your tests will be registered with your real Prefect server.

Flows that return None are tricky to test. In Prefect 1, we could simply do [...]. In Prefect 2, the final state is a list of states, as mentioned here. Although this is already logged in the final state result as follows, [...] tests still get a list of states. To fix this, you can add return_state=True to your flow calls, as shown here (credit: @mbopfNIH)...
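A sketch of the pattern, assuming pytest; prefect_test_harness gives the test a temporary local database, and return_state=True lets us assert on the flow's final State rather than its (possibly None) return value:

```python
from prefect import flow
from prefect.testing.utilities import prefect_test_harness

@flow
def etl_flow() -> None:
    return None

def test_etl_flow():
    # Runs against an isolated, temporary Prefect database,
    # not the real Prefect server.
    with prefect_test_harness():
        state = etl_flow(return_state=True)
        assert state.is_completed()
```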
Deploying Prefect Server
The Prefect server acts as a storage/queue for the world to submit their jobs to. The current deployment process used by Hedwig, built on Monarch Solutions and Terraform, should be extendable to other projects as well.
Current concerns:

The server is available to anyone on the NIAID VPN. Anyone can access the website and register a random flow, or constantly send flow run requests to the Prefect server, which are bound to be picked up by Prefect workers.

To tackle this, we followed a basic guide to add minimal authentication to the Prefect server (Guide). The secret values, such as the bearer token and username/password pairs, can be added into AWS secrets. The secrets can be rotated as well, which will require some extra work for retrieval.
Workers are preferred over Agents
Prefect 1 only had agents to retrieve and execute jobs from the Prefect server. Prefect 2 recommends using workers and work pools instead of agents.
Possible Upgrades
For applications with jobs/tasks that compete for resources, take advantage of dask-annotations.
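A hedged sketch based on the prefect-dask docs; the priority and resource values are illustrative and must match what your Dask workers advertise:

```python
import dask
from prefect import flow, task
from prefect_dask import DaskTaskRunner

@task
def crunch(x: int) -> int:
    return x ** 2

@flow(task_runner=DaskTaskRunner())
def annotated_flow():
    with dask.annotate(priority=10):
        urgent = crunch.submit(1)          # scheduled ahead of lower-priority work
    with dask.annotate(resources={"MEMORY": 4e9}):
        heavy = crunch.submit(2)           # only runs on workers advertising that resource
    return urgent, heavy
```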