Skip to content

SQL injection in SnowflakeSearchTool via database and schema parameters #4993

@jmanico

Description

@jmanico

Hey folks, I was reviewing the SnowflakeSearchTool and noticed that the database and snowflake_schema parameters in the _run method get dropped straight into SQL strings without any validation or quoting.

Here's the relevant code in snowflake_search_tool.py around line 261:

if database:
    await self._execute_query(f"USE DATABASE {database}")
if snowflake_schema:
    await self._execute_query(f"USE SCHEMA {snowflake_schema}")

These values come from SnowflakeSearchToolInput, which defines both fields as plain str | None with no pattern constraints. Compare that with SnowflakeConfig.account which already has pattern=r"^[a-zA-Z0-9\-_]+$" on it. The input schema fields have nothing similar.

Since _execute_query just calls cursor.execute(query) directly (line 218), a malicious value like test_db; DROP TABLE users; -- would get executed as SQL.

In a CrewAI context, these parameters are filled by the LLM agent based on task instructions, which means prompt injection in upstream task context could influence these values.

Suggested fix

The simplest approach would be to add an identifier validation regex to both fields in the input schema, and also double quote the identifiers in the SQL:

# On SnowflakeSearchToolInput
database: str | None = Field(None, pattern=r"^[A-Za-z_][A-Za-z0-9_$]*$")
snowflake_schema: str | None = Field(None, pattern=r"^[A-Za-z_][A-Za-z0-9_$]*$")
# In _run()
if database:
    await self._execute_query(f'USE DATABASE "{database}"')
if snowflake_schema:
    await self._execute_query(f'USE SCHEMA "{snowflake_schema}"')

Snowflake's USE DATABASE doesn't support bind parameters, so identifier validation is the way to go here. The regex above matches Snowflake's unquoted identifier rules.

Worth noting that the query parameter itself is also passed through unsanitized, but that's somewhat by design since the tool is meant to execute SQL. The database and snowflake_schema fields are the ones that look like safe metadata but actually aren't.

Also worth mentioning that the NL2SQL tool has a similar pattern at nl2sql_tool.py:57 where table_name from a prior query result gets f-string interpolated into an information_schema query (second order injection). That one already has a # noqa: S608 on it acknowledging the issue, and there's documentation about it from PR #4465, but no code fix yet.

Happy to put together a PR if you'd like.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions