@rsareddy0329
Contributor

Issue #, if available:

Description of changes:

  • Improved error handling for MLflow app creation.
  • S3 output path validation: create the prefix if it does not exist instead of raising an error.
  • Update notebooks to reflect recent code updates.

Testing:

  • Unit tests are successful
  • Submitted jobs through all fine-tuning technique flows with these changes

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

error_msg = f"MLflow app creation failed. Current status: {new_app.status}"
if hasattr(new_app, 'failure_reason') and new_app.failure_reason:
    error_msg += f". Reason: {new_app.failure_reason}"
raise RuntimeError(error_msg)
Collaborator

Do we need to raise an error on a timeout exception, or should we just display an error message? If the MLflow app is a required parameter for the CTJ API, then it's fine to throw an error.
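
To make the two options in this question concrete, here is a minimal sketch, not the code in this PR. The helper name `report_mlflow_app_failure` and the `required` flag are hypothetical and exist only for illustration; the message format follows the snippet under review.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def report_mlflow_app_failure(app, required=True):
    """Build the failure message, then raise or warn.

    `required` is a hypothetical flag: True if the caller cannot
    proceed without the MLflow app, False if a warning is enough.
    """
    error_msg = f"MLflow app creation failed. Current status: {app.status}"
    # getattr with a default covers both a missing and an empty attribute
    failure_reason = getattr(app, "failure_reason", None)
    if failure_reason:
        error_msg += f". Reason: {failure_reason}"
    if required:
        raise RuntimeError(error_msg)
    logger.warning(error_msg)
    return error_msg


# Usage with a stand-in object (not a real SageMaker resource):
fake_app = SimpleNamespace(status="Failed", failure_reason="timeout")
message = report_mlflow_app_failure(fake_app, required=False)
```

If the app turns out to be mandatory for the downstream API, the `required=True` path matches what the PR does today.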

if "NoSuchBucket" in str(e) or "Not Found" in str(e):
    # Create bucket
    region = sagemaker_session.boto_region_name
    if region == 'us-east-1':
Collaborator

@jam-jee jam-jee Dec 10, 2025

Why is there a region-specific check here?

Member

This is expected for S3.
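
For context on why the check is needed: S3's CreateBucket call rejects an explicit `LocationConstraint` of `us-east-1`, so that region has to omit `CreateBucketConfiguration`, while every other region must pass it. A minimal sketch of the branching follows; the helper name `build_create_bucket_kwargs` is hypothetical and not part of this PR.

```python
def build_create_bucket_kwargs(bucket_name, region):
    """Return keyword arguments suitable for boto3's S3 create_bucket.

    Hypothetical helper for illustration only.
    """
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed as a
    # LocationConstraint; all other regions need it spelled out.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


# Intended use (assumes an existing boto3 S3 client named s3_client):
#   s3_client.create_bucket(**build_create_bucket_kwargs(name, region))
east_kwargs = build_create_bucket_kwargs("example-bucket", "us-east-1")
west_kwargs = build_create_bucket_kwargs("example-bucket", "us-west-2")
```

Building the arguments in one place keeps the region branch out of the error-handling path.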


@rsareddy0329 rsareddy0329 merged commit cd406fa into aws:master Dec 10, 2025
12 of 22 checks passed