Open
Labels: Hacktoberfest; Issue: Feature Request
Description
Originated from kedro-org/vscode-kedro#140
`%load_node` works great, but it doesn't work on `MemoryDataset`. I don't save every node, and it's not easy to figure out which nodes I need to run again to produce the data.
Context
- This is much more powerful because not every node has persisted data, which currently limits the usage of the feature.
- This enables a much more powerful slicing feature in `kedro-viz`, which is currently limited because we do not want slicing to generate a command that Kedro does not know how to run.
- Same as above: if we are able to support more flexible slicing, we may end up expanding the API of `kedro run` as well, since we can support different combinations of slicing.
Possible Implementation
Check out `_suggest_resume_scenario` in `SequentialRunner`:
Lines 200 to 210 in 91765e3

    remaining_nodes = set(pipeline.nodes) - set(done_nodes)
    postfix = ""
    if done_nodes:
        start_node_names = _find_nodes_to_resume_from(
            pipeline=pipeline,
            unfinished_nodes=remaining_nodes,
            catalog=catalog,
        )
        start_nodes_str = ",".join(sorted(start_node_names))
        postfix += f' --from-nodes "{start_nodes_str}"'

This roughly has the logic to resume a pipeline, but it's hidden in a private API. We need to surface it for more generic usage.
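To make the idea concrete, here is a rough, framework-free sketch of the kind of lookup a public API could offer: given a target node, walk upstream through every input that is not persisted and collect the nodes that would have to run again. This is not Kedro's implementation. The `Node` dataclass, `find_nodes_to_rerun`, and the example pipeline are hypothetical names used only for illustration, and the sketch assumes that any dataset with no producing node (for example raw data or parameters) is always loadable.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a pipeline node (not kedro's Node class)."""

    name: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def find_nodes_to_rerun(nodes: list[Node], persisted: set[str], target: str) -> set[str]:
    """Return names of upstream nodes that must run before `target` can load its inputs."""
    producer = {out: n for n in nodes for out in n.outputs}
    by_name = {n.name: n for n in nodes}
    required: set[str] = set()
    stack = [by_name[target]]
    while stack:
        node = stack.pop()
        for dataset in node.inputs:
            # Persisted datasets can be loaded directly. Datasets without a
            # producer are assumed to be free inputs that are always available.
            if dataset in persisted or dataset not in producer:
                continue
            upstream = producer[dataset]
            if upstream.name not in required:
                required.add(upstream.name)
                stack.append(upstream)
    return required


# Hypothetical pipeline: only "raw" and "model" are saved to disk,
# everything else would live in a MemoryDataset.
NODES = [
    Node("load", ("raw",), ("cleaned",)),
    Node("features", ("cleaned",), ("features",)),
    Node("train", ("features", "params"), ("model",)),
    Node("evaluate", ("model", "features"), ("metrics",)),
]
PERSISTED = {"raw", "model"}

if __name__ == "__main__":
    print(sorted(find_nodes_to_rerun(NODES, PERSISTED, "evaluate")))
```

In this toy pipeline, loading `evaluate` would need `load` and `features` to run again, but not `train`, because `model` is persisted. A real implementation would have to consult the catalog to decide what counts as persisted, as `_find_nodes_to_resume_from` does with its `catalog` argument.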