Clarify trace archival payload location. by HumairAK · Pull Request #7 · mlflow/rfcs

HumairAK · 2026-04-07T16:43:21Z

This updates the trace archival RFC to keep mlflow.artifactLocation as the existing trace artifact root and to introduce a separate tag for archived traces.pb payloads.

The main reason for the change is that mlflow.artifactLocation already does more than point to trace JSON. It is also the location used for trace attachments. If we repoint that tag during archival, archived traces would start looking for attachments in the archive repo, even though those files were never moved there. That would make attachment reads fail unless archival also copied attachment files over.
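The failure mode can be reproduced with a toy model. The sketch below uses plain dictionaries in place of real artifact repositories; the repository URIs, the file layout, and the `read_attachment` helper are assumptions made for illustration and are not MLflow APIs.

```python
# Toy model: each "repository" is a dict mapping relative path -> bytes.
# URIs, paths, and helper names are hypothetical, not MLflow APIs.
ARTIFACT_LOCATION_TAG = "mlflow.artifactLocation"

repos = {
    "file:///artifacts/traces/tr-1": {
        "traces.json": b"{...}",
        "attachments/att-1": b"fake-png-content",
    },
    # The archive repository only ever receives the span payload.
    "s3://archive/traces/tr-1": {"traces.pb": b"archived-spans"},
}


def read_attachment(tags, attachment_id):
    """Resolve an attachment through the trace's artifact-location tag."""
    repo = repos[tags[ARTIFACT_LOCATION_TAG]]
    return repo[f"attachments/{attachment_id}"]


tags = {ARTIFACT_LOCATION_TAG: "file:///artifacts/traces/tr-1"}
before = read_attachment(tags, "att-1")  # found in the original repository

# Archival repoints the existing tag without copying attachments.
tags[ARTIFACT_LOCATION_TAG] = "s3://archive/traces/tr-1"
try:
    read_attachment(tags, "att-1")
    repointed_read_ok = True
except KeyError:
    # The archive repository has no attachments/ entries, so the lookup fails.
    repointed_read_ok = False
```

In this model the attachment is readable before archival and unreadable afterwards, even though the attachment bytes were never deleted.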

I considered that alternative, but it adds extra work and more failure cases to the archival flow. Archival would need to discover, copy, and validate attachments before updating the trace location, and it would need to stay safe and retryable if part of that process failed. That is a lot of extra complexity for something we do not actually need.
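A rough sketch of what the copy-based alternative would involve, again with dictionaries standing in for artifact repositories. The function name, the `attachments/` path prefix, and the checksum-based validation are assumptions chosen for illustration; a real implementation would also have to handle partial uploads, remote errors, and user-provided repositories.

```python
import hashlib

ARTIFACT_LOCATION_TAG = "mlflow.artifactLocation"


def archive_with_attachments(tags, source_repo, archive_repo, archive_uri):
    """Copy attachments, verify them, and only then repoint the tag.

    Repositories are plain dicts (path -> bytes). The function is safe to
    re-run: copies overwrite identical data and the tag is updated last.
    """
    # 1. Discover every attachment stored for the trace.
    attachment_paths = [p for p in source_repo if p.startswith("attachments/")]

    # 2. Copy each attachment into the archive repository.
    for path in attachment_paths:
        archive_repo[path] = source_repo[path]

    # 3. Validate the copies before touching the tag.
    for path in attachment_paths:
        src_digest = hashlib.sha256(source_repo[path]).hexdigest()
        dst_digest = hashlib.sha256(archive_repo[path]).hexdigest()
        if src_digest != dst_digest:
            raise RuntimeError(f"attachment copy mismatch: {path}")

    # 4. Repoint the tag only after every copy has been verified, so a
    #    failure in steps 1-3 leaves the trace readable from the source.
    tags[ARTIFACT_LOCATION_TAG] = archive_uri
    return attachment_paths


# Hypothetical usage with in-memory repositories.
source = {"attachments/att-1": b"png", "traces.json": b"{}"}
archive = {}
tags = {ARTIFACT_LOCATION_TAG: "file:///artifacts/tr-1"}
copied = archive_with_attachments(tags, source, archive, "s3://archive/tr-1")
```

Even in this simplified form, archival gains three extra steps that must all succeed before the tag can change.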

Using a separate archive-location tag is simpler and safer. It preserves the current meaning of mlflow.artifactLocation, keeps existing attachment behavior intact, and makes the archived span payload location explicit. It also keeps the retrieval logic easier to understand: regular trace artifacts and attachments continue to use the existing tag, while archived span payloads use the new one.
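The resulting retrieval rule can be sketched as a small lookup over the trace tags. The archive tag name used here (`mlflow.trace.archiveLocation`) and the `resolve_locations` helper are hypothetical placeholders, since this description does not state the final tag name.

```python
ARTIFACT_LOCATION_TAG = "mlflow.artifactLocation"
# Hypothetical tag name, used only for illustration.
ARCHIVE_LOCATION_TAG = "mlflow.trace.archiveLocation"


def resolve_locations(tags):
    """Return where attachments and span payloads should be read from."""
    artifact_root = tags[ARTIFACT_LOCATION_TAG]
    return {
        # Attachments always resolve through the existing tag.
        "attachments": artifact_root,
        # Span payloads use the archive tag when present, else the root.
        "span_payload": tags.get(ARCHIVE_LOCATION_TAG, artifact_root),
    }


live = resolve_locations({ARTIFACT_LOCATION_TAG: "file:///artifacts/tr-1"})
archived = resolve_locations(
    {
        ARTIFACT_LOCATION_TAG: "file:///artifacts/tr-1",
        ARCHIVE_LOCATION_TAG: "s3://archive/tr-1",
    }
)
```

For a trace that has not been archived, both lookups point at the artifact root; after archival only the span payload location changes.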

Here's an example script showing that trace attachments are read through the artifact repository that mlflow.artifactLocation points to:

attachments_example.py

from pathlib import Path

import mlflow
from mlflow.tracing.attachments import Attachment
from mlflow.tracing.client import TracingClient


# Use a local SQLite tracking store and a local artifact directory for the demo.
base_dir = Path("./mlflow-sqlite-demo").resolve()
base_dir.mkdir(exist_ok=True)

db_path = base_dir / "mlflow.db"
artifacts_path = base_dir / "artifacts"
artifacts_path.mkdir(exist_ok=True)

mlflow.set_tracking_uri(f"sqlite:///{db_path}")
experiment_name = "attachment-demo-sqlite"

client = mlflow.MlflowClient()
experiment = client.get_experiment_by_name(experiment_name)
if experiment is None:
    experiment_id = client.create_experiment(
        experiment_name,
        artifact_location=artifacts_path.as_uri(),
    )
else:
    experiment_id = experiment.experiment_id

mlflow.set_experiment(experiment_name)

# Placeholder image: a PNG signature followed by dummy content.
image_bytes = b"\x89PNG\r\n\x1a\nfake-png-content"

with mlflow.start_span(name="demo-span") as span:
    span.set_inputs(
        {
            "prompt": "describe this image",
            "image": Attachment(content_type="image/png", content_bytes=image_bytes),
        }
    )
    trace_id = span.trace_id

# Reload the trace and read back the attachment reference stored in the span inputs.
trace = mlflow.get_trace(trace_id)
root_span = trace.data.spans[0]
image_ref = root_span.inputs["image"]
parsed = Attachment.parse_ref(image_ref)

print(f"trace_id: {trace_id}")
print(f"stored image field in trace: {image_ref}")
print(f"parsed attachment id: {parsed['attachment_id']}")

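# Resolve the artifact repository for this trace and download the attachment from it.
# Note: _get_artifact_repo_for_trace is a private helper and may change between releases.
tracing_client = TracingClient(mlflow.get_tracking_uri())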
trace_info = tracing_client.get_trace_info(trace_id)
artifact_repo = tracing_client._get_artifact_repo_for_trace(trace_info)
stored_bytes = artifact_repo.download_trace_attachment(parsed['attachment_id'])

print(f"trace artifact location: {trace_info.tags['mlflow.artifactLocation']}")
print(f"downloaded attachment bytes: {stored_bytes!r}")

if hasattr(artifact_repo, 'artifact_dir'):
    print(f"local artifact directory: {artifact_repo.artifact_dir}")

Keep mlflow.artifactLocation reserved for trace artifacts and attachments, and document a separate archive-location tag for archived traces so archival does not break attachment reads.

mprahl

I think this makes sense. I also don't see archiving the attachments as part of trace archival as a workable alternative, because the artifact repository can be user-provided.

mprahl · 2026-04-07T17:38:34Z

@B-Step62 could you please take a look?

Clarify trace archival payload location.

6a0cdbf

mprahl approved these changes Apr 7, 2026

HumairAK mentioned this pull request Apr 9, 2026

Implement SQLAlchemy trace archival pass mlflow/mlflow#22469

Closed

31 tasks

HumairAK wants to merge 1 commit into mlflow:main from HumairAK:update_archival_rfc
