fix: allow numeric resource IDs in _VALID_RESOURCE_NAME_REGEX #6440

Open
urmzd wants to merge 5 commits into googleapis:main from urmzd:fix/valid-resource-name-regex

@urmzd urmzd commented Mar 17, 2026

Description

_VALID_RESOURCE_NAME_REGEX in the RAG SDK requires the first character to be a lowercase letter ([a-z]), which rejects bare numeric IDs (e.g. 1234567890) that the Vertex AI API assigns to resources like RAG corpora and files.

Calling get_corpus() or get_file() with a valid numeric resource ID raises ValueError:

from vertexai.preview import rag

corpus = rag.get_corpus("1234567890")
# ValueError: name must be of the format
#   `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}` or `{rag_corpus}`

The ID is valid. parse_rag_corpus_path returns {} because the input is not a full resource path, and the regex then rejects it because "1" does not match [a-z]. As a result, the code never reaches the branch that calls rag_corpus_path() to expand the short ID.
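The control flow described above can be sketched roughly as follows. This is a simplified, self-contained approximation, not the SDK's code: `resolve_corpus_name`, the stand-in path helpers, and the default project and location values are all hypothetical, and using `re.fullmatch` for the ID check is an assumption.

```python
import re

OLD_ID_REGEX = r"[a-z][a-zA-Z0-9._-]{0,127}"        # pattern before this PR
NEW_ID_REGEX = r"[a-zA-Z0-9][a-zA-Z0-9._-]{0,127}"  # pattern after this PR


def parse_rag_corpus_path(path):
    """Stand-in path parser: returns {} unless `path` is a full resource path."""
    m = re.match(
        r"^projects/(?P<project>.+?)/locations/(?P<location>.+?)"
        r"/ragCorpora/(?P<rag_corpus>.+?)$",
        path,
    )
    return m.groupdict() if m else {}


def rag_corpus_path(project, location, rag_corpus):
    """Stand-in path builder: expands a short ID into a full resource path."""
    return f"projects/{project}/locations/{location}/ragCorpora/{rag_corpus}"


def resolve_corpus_name(name, id_regex, project="my-project", location="us-central1"):
    """Hypothetical helper mirroring the flow described in the PR."""
    if parse_rag_corpus_path(name):
        return name  # already a full path
    if re.fullmatch(id_regex, name):
        # Short ID: expand it. A numeric ID only gets here with NEW_ID_REGEX.
        return rag_corpus_path(project, location, name)
    raise ValueError(
        "name must be of the format "
        "`projects/{project}/locations/{location}/ragCorpora/{rag_corpus}` "
        "or `{rag_corpus}`"
    )
```

With `OLD_ID_REGEX`, `resolve_corpus_name("1234567890", OLD_ID_REGEX)` falls through to the `ValueError`; with `NEW_ID_REGEX` the same call expands the ID into a full path.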

Fixes #6442

Changes

Updated the regex from [a-z][a-zA-Z0-9._-]{0,127} to [a-zA-Z0-9][a-zA-Z0-9._-]{0,127} in both definitions:

  • vertexai/rag/utils/_gapic_utils.py
  • vertexai/preview/rag/utils/_gapic_utils.py
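The effect of the pattern change can be checked in isolation. The snippet below is a minimal sketch using only the two patterns quoted above; the names `OLD_PATTERN`, `NEW_PATTERN`, and `accepts` are illustrative, and anchoring with `fullmatch` is an assumption about how the SDK applies the regex.

```python
import re

# Patterns as quoted in the PR description.
OLD_PATTERN = re.compile(r"[a-z][a-zA-Z0-9._-]{0,127}")
NEW_PATTERN = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9._-]{0,127}")


def accepts(pattern, name):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(name) is not None


# A bare numeric ID is rejected by the old pattern and accepted by the new one.
numeric_old = accepts(OLD_PATTERN, "1234567890")   # False
numeric_new = accepts(NEW_PATTERN, "1234567890")   # True

# Names that start with a lowercase letter are accepted by both.
lettered_old = accepts(OLD_PATTERN, "my-corpus")   # True
lettered_new = accepts(NEW_PATTERN, "my-corpus")   # True

# The new pattern still rejects a leading separator and names over 128 chars.
leading_dash = accepts(NEW_PATTERN, "-corpus")     # False
too_long = accepts(NEW_PATTERN, "a" * 129)         # False
```

Note that the new first-character class also admits uppercase letters, which the old pattern rejected in the first position.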

Testing

  • Added test_get_corpus_numeric_id_success and test_get_file_numeric_id_success to both test_rag_data.py and test_rag_data_preview.py
  • Tests use isolated mocks with real side_effect on path helpers so bare numeric IDs exercise the regex code path (the shared fixture's Mock() always returns truthy for parse_rag_corpus_path, bypassing the regex entirely)
  • All 201 RAG tests pass (4 new + 197 existing)
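The mocking point in the second bullet can be illustrated with `unittest.mock` alone. This is a sketch rather than the PR's test code: `real_parse_rag_corpus_path` is a hypothetical stand-in for the real path helper, and the client objects are plain mocks.

```python
import re
from unittest import mock


def real_parse_rag_corpus_path(path):
    """Hypothetical stand-in for the real parser: {} unless `path` is a full path."""
    m = re.match(
        r"^projects/(?P<project>.+?)/locations/(?P<location>.+?)"
        r"/ragCorpora/(?P<rag_corpus>.+?)$",
        path,
    )
    return m.groupdict() if m else {}


# Shared-fixture style: an unconfigured Mock method returns another Mock,
# which is truthy, so an `if client.parse_rag_corpus_path(name)` check always
# takes the "full path" branch and the regex is never consulted.
shared_client = mock.Mock()
shared_result = shared_client.parse_rag_corpus_path("1234567890")

# Isolated style: side_effect delegates to real parsing logic, so a bare
# numeric ID yields {} and the caller falls through to the regex check.
isolated_client = mock.Mock()
isolated_client.parse_rag_corpus_path.side_effect = real_parse_rag_corpus_path
isolated_result = isolated_client.parse_rag_corpus_path("1234567890")
```

Here `bool(shared_result)` is `True` while `isolated_result` is `{}`, which is why only the isolated setup exercises the regex path.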

urmzd added 2 commits March 17, 2026 00:03
The regex required the first character to be a lowercase letter [a-z],
which rejected bare numeric IDs (e.g. "1234567890") that the API
assigns to resources like RAG corpora and files. Updated to accept
any alphanumeric first character [a-zA-Z0-9].

Fixes all three definitions of _VALID_RESOURCE_NAME_REGEX:
- vertexai/preview/rag/utils/_gapic_utils.py
- vertexai/rag/utils/_gapic_utils.py
- google/cloud/aiplatform/vertex_ray/util/_validation_utils.py

Adds tests for get_corpus and get_file with numeric IDs to verify
the regex fix accepts API-assigned numeric resource identifiers.
@urmzd urmzd requested a review from a team as a code owner March 17, 2026 05:17
@product-auto-label product-auto-label bot added size: s Pull request size is small. api: vertex-ai Issues related to the googleapis/python-aiplatform API. labels Mar 17, 2026
urmzd added 3 commits March 17, 2026 00:45
The vertex_ray _VALID_RESOURCE_NAME_REGEX intentionally requires a
lowercase letter first for persistent resource names, which is a
different context from RAG resource IDs.
@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: s Pull request size is small. labels Mar 17, 2026

Development

Successfully merging this pull request may close these issues.

RAG SDK rejects bare numeric resource IDs assigned by the Vertex AI API
