FIX: Auto-detect chat intent server-side; remove user-facing selector #79
tahierhussain wants to merge 3 commits into main from
Conversation
…ected The AI service now classifies chat intent server-side (via the gatekeeper) and emits a dedicated CHAT_INTENT (event_type=5) stream message before any prompt_response chunks. Visitran-cloud no longer reads chat_intent_id from the POST /chat request body and no longer forwards it in the outbound LLM payload. Instead, the backend extracts chat_intent from the inbound stream, persists it on ChatMessage on first sight, and pushes chat_intent_name to the frontend via socket so the UI can branch (Apply / Run SQL / etc.) in real time. Pre-charge token-balance check now defaults to TRANSFORM (worst-case cost) since intent isn't yet known at request time. Both ChatSerializer and ChatMessageSerializer surface a new chat_intent_name field so the frontend can render historical messages without an id->name lookup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… intent Removes the user-facing chat intent Segmented selector from PromptActions, along with the chat-intent state, getChatIntents fetch, onboarding auto-select effect, and the chat_intent_id field in postChatPrompt. Strips the intent props that were drilled through Body, NewChat, ExistingChat, InputPrompt, ResponseFooter, and ActionButtons. Conversation now derives the intent object directly from message.chat_intent_name (added to the chat-message serializer), and PastConversations renders its badge from chat.chat_intent_name. ChatAI merges chat_intent_name into the message's local state when the new CHAT_INTENT socket event arrives, so ResponseFooter renders the correct branch (Apply / Run SQL / Build Models) in real time without a refresh. Body.jsx falls llm_model_developer back to the architect model so that removing the (already gated-off) Coder selector doesn't leave the field null on Chat creation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| Filename | Overview |
|---|---|
| backend/backend/application/context/llm_context.py | Adds _resolve_chat_intent helper that extracts and persists AI-detected intent; removes chat_intent from stream listener signatures. Two P2 issues: intent_name cached after a failed DB write (post-refresh buttons disappear), and _chat_intent_by_msg never evicted. |
| backend/backend/application/context/chat_ai_context.py | Adds _process_chat_intent dispatcher that pushes chat_intent_name over the socket; registers chat_intent as a supported event type. Looks correct. |
| backend/backend/application/context/chat_message_context.py | Removes chat_intent_id parameter from persist_prompt; Chat no longer gets intent at creation time — now deferred to AI response. Clean removal. |
| backend/backend/core/routers/chat/views.py | Removes chat_intent_id from request handling; hardcodes worst-case TRANSFORM intent for pre-charge balance check. Intentional trade-off documented in PR. |
| backend/backend/core/routers/chat/serializers.py | Adds chat_intent_name field using display_name source (e.g. Transform), which differs from ChatMessageSerializer's use of name (e.g. TRANSFORM) — serializer inconsistency flagged in prior review. |
| backend/backend/core/routers/chat_message/serializers/chat_message_serializer.py | Adds chat_intent_name using chat_intent.name (raw, e.g. TRANSFORM) — inconsistent with ChatSerializer which uses display_name. The raw name is correct for Conversation.jsx button logic, but the cross-serializer mismatch is confusing. |
| backend/backend/core/web_socket.py | Adds chat_intent_name to the send_socket_message allowed_args whitelist. None values are stripped before emit, so a null intent won't pollute the socket payload (see the sketch after this table). |
| frontend/src/ide/chat-ai/ChatAI.jsx | Merges incoming chat_intent_name socket event into local message state for live rendering. Guarded with a truthy check so null intent events are safely ignored. |
| frontend/src/ide/chat-ai/Conversation.jsx | Derives intent object directly from message.chat_intent_name; removes chatIntents array dependency. Clean simplification. |
| frontend/src/ide/chat-ai/ExistingChat.jsx | Simplifies lastTransformIndex lookup to use chat_intent_name directly instead of building an id-to-name map from chatIntents. Props fully cleaned up. |
| frontend/src/ide/chat-ai/Body.jsx | Removes chatIntents state and getChatIntents call; adds a selectedCoderLlmModel fallback to the architect model so llm_model_developer isn't left null at Chat creation. |
| frontend/src/ide/chat-ai/PromptActions.jsx | Removes the Segmented intent selector and all related props, icon maps, and auto-select effects. UI simplification is consistent with server-side intent detection. |
| frontend/src/ide/chat-ai/PastConversations.jsx | Removes chatIntents/setSelectedChatIntent props; reads conversation.chat_intent_name directly from ChatSerializer for the intent badge. Cleaner data flow. |
| frontend/src/ide/chat-ai/services.js | Removes chatIntentId parameter from postChatPrompt and stops sending chat_intent_id in the POST /chat request body. |
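Regarding the web_socket.py row above, here is a minimal, self-contained sketch of the whitelist-plus-None-stripping behavior it describes. Only the `chat_intent_name` entry is confirmed by the diff; the other tuple members and the helper name `build_socket_payload` are illustrative:

```python
# Illustrative only — mirrors the behavior described for send_socket_message,
# not its actual implementation.
ALLOWED_ARGS = ("message", "event", "chat_intent_name")  # only chat_intent_name is confirmed

def build_socket_payload(**kwargs):
    # Keep whitelisted keys and strip None values, so a null intent
    # never reaches the emitted socket payload.
    return {k: v for k, v in kwargs.items() if k in ALLOWED_ARGS and v is not None}

assert build_socket_payload(chat_intent_name=None, message="hi") == {"message": "hi"}
```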
Sequence Diagram
```mermaid
sequenceDiagram
    participant FE as Frontend
    participant BE as Backend (views.py)
    participant CMC as ChatMessageContext
    participant LLM as LLMServerContext
    participant AI as visitran-ai service
    participant Redis as Redis Stream
    FE->>BE: POST /chat (no chat_intent_id)
    BE->>BE: fetch_token_balance(intent=TRANSFORM)
    BE->>CMC: persist_prompt() — chat_intent=null
    CMC-->>BE: ChatMessage created
    BE->>LLM: stream_prompt_response()
    LLM->>Redis: xreadgroup (listen)
    BE->>AI: send_event_to_llm_server(payload)
    AI->>Redis: publish CHAT_INTENT event (event_type=5)
    Redis-->>LLM: CHAT_INTENT message
    LLM->>LLM: _resolve_chat_intent() → DB write ChatMessage+Chat
    LLM->>FE: socket push chat_intent_name
    AI->>Redis: publish prompt_response chunks
    Redis-->>LLM: prompt_response messages
    LLM->>LLM: _resolve_chat_intent() → return cached intent
    LLM->>FE: socket push response chunks
    AI->>Redis: publish completed (event_type=4)
    Redis-->>LLM: completed
    LLM->>FE: socket push completed + token_usage_data
```
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
backend/backend/application/context/llm_context.py:112-127
**Intent cached despite DB write failure**
`self._chat_intent_by_msg[chat_message_id] = intent_name` and the corresponding `return intent_name` both execute unconditionally after the `except` block. If `ChatIntent.objects.get(name=intent_name)` throws `DoesNotExist` (e.g. AI returns a name not yet seeded), the in-session socket push still delivers the intent (buttons render during the session), but `chat_message.chat_intent` is never written. After a page refresh, `ChatMessageSerializer.chat_intent_name` resolves to `null`, and the Apply / Run SQL footer disappears permanently for that message.
### Issue 2 of 2
backend/backend/application/context/llm_context.py:99-104
**`_chat_intent_by_msg` never evicted**
The dict is initialized lazily on `self` and is never cleared. If `LLMServerContext` is a long-lived object (or is reused across streaming sessions in the same process), every processed `chat_message_id` accumulates an entry indefinitely. For high-traffic deployments this is a gradual memory leak. Even for per-session instances, consider returning early (or setting a sentinel) on the first call to avoid the `hasattr` guard on every subsequent event.
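If eviction is wanted, one option is a small hook on the terminal event. A sketch, assuming the listener already dispatches the completed (event_type=4) event and that `_on_stream_completed` is a hypothetical method name:

```python
def _on_stream_completed(self, chat_message_id):
    # Hypothetical completed-event hook: once the stream for this message
    # finishes, no later event needs the cached intent, so drop the entry
    # to keep _chat_intent_by_msg bounded.
    self._chat_intent_by_msg.pop(chat_message_id, None)
```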
Reviews (2): Last reviewed commit: "fix(backend): address PR review — drop d..."
…o Chat Removes a leftover debug print in process_message that flooded logs with every Redis stream payload (potentially exposing user prompts). Also restores the pre-PR behavior where Chat.chat_intent tracks the latest message's intent. _resolve_chat_intent now mirrors the persisted intent onto chat_message.chat as well, so the PastConversations badge (which reads chat.chat_intent_name from ChatSerializer) populates correctly for chats created after this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Most of the diff is a clean unwind of frontend prop-drilling and a sensible move of intent classification to the AI service — that direction is right. Two things before this can ship:

- Cloud billing has a real race for TRANSFORM (see inline on llm_context.py). The cloud override of `process_prompt` reads `chat_message.chat_intent` synchronously after `super().process_prompt(...)` returns, but in non-WS mode the Redis listener (which is what writes the intent) doesn't even start until after `send_event_to_llm_server` returns. Result: TRANSFORM messages get charged once at generation (because intent is `None` → defaults to INFO → `!= TRANSFORM`) and again at Apply success → double-charge. The PR's test plan likely only covered WS mode + happy-path eventlet ordering.
- `_resolve_chat_intent` swallows + caches failure (inline). Bare `except Exception` plus an unconditional cache write means a single failed persist (DB blip, unseeded intent name, etc.) silently and permanently breaks the response footer for that message — no Apply / Run SQL button, no actionable signal.
Already addressed since first read — kudos

- `Chat.chat_intent` is now mirrored alongside `ChatMessage.chat_intent` inside `_resolve_chat_intent` ✅
- The `print(...)` debug is gone ✅
Test-plan gaps worth filling before merge

- HTTP / non-WS billing path (item 1)
- Transform retry path — with `TRANSFORM_RETRY` no longer signaled to the AI side, the gatekeeper now classifies `"Faulty yaml: ... Error: ..."` from scratch, and there's no test coverage for that
- AI publishing an unknown / typo'd intent name → user must still see a usable response
- Race where `prompt_response` chunks arrive before `CHAT_INTENT` (the `if chat_intent == "INFO" and event_type == "prompt_response"` branch silently no-ops in that window because `chat_intent` is `None`)

Not blocking but worth folding in: the inline notes on the `chat_intent_name` serializer asymmetry, the `hasattr`-based lazy init, and the per-event local imports.
```diff
      discussion_status=chat_message.discussion_type,
  )
- logging.info(f"process_prompt: chat_intent={chat_intent}, sid={sid}, channel_id={channel_id}")
+ logging.info(f"process_prompt: sid={sid}, channel_id={channel_id}")
```
Likely billing bug — TRANSFORM may be double-charged in non-WS mode.
In the cloud override of process_prompt (pluggable_apps/subscriptions/context.py), right after super().process_prompt(...) returns we do:
```python
chat_intent = chat_message.chat_intent.name if chat_message.chat_intent else "INFO"
if chat_intent != "TRANSFORM":
    self.consume_tokens(...)
```

In this (non-WS) branch the Redis stream listener is started after `send_event_to_llm_server(...)` returns, so when `super().process_prompt` returns the listener has just begun and `chat_message.chat_intent` is still `None`. That falls through to `"INFO" != "TRANSFORM"` → `consume_tokens()` runs at generation. Then `transformation_save` charges again on Apply success. Net: every TRANSFORM in non-WS mode is charged twice.
WS mode probably happens to work because eventlet yields on Redis I/O and the listener drains in time, but relying on yield ordering for billing correctness is brittle.
Fix options, in order of preference (a sketch of option 3 follows below):

1. Have the AI side guarantee `CHAT_INTENT` (event_type=5) is the first event, and drain it synchronously inside `process_prompt` before returning → no race in either mode.
2. Move the cloud-side `consume_tokens` decision to a hook fired from `_resolve_chat_intent` (i.e. when intent is known), not from `process_prompt`'s return path.
3. At minimum, make the cloud override re-fetch and skip charging when `chat_intent` is still null (defer to Apply or to a later event), rather than defaulting to `"INFO"`.
The test plan claims billing was verified — please add an explicit non-WS test before merge.
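A sketch of option 3's shape — names mirror the snippet above, but treat the surrounding class and the `consume_tokens` call as assumptions, not the actual subscriptions code:

```python
def process_prompt(self, chat_message, *args, **kwargs):
    result = super().process_prompt(chat_message, *args, **kwargs)

    # Re-fetch: in non-WS mode the listener may not have written the
    # intent yet when super() returns.
    chat_message.refresh_from_db(fields=["chat_intent"])
    if chat_message.chat_intent is None:
        # Intent unresolved — defer charging (to Apply or a later event)
        # instead of defaulting to "INFO" and double-charging TRANSFORM.
        return result

    if chat_message.chat_intent.name != "TRANSFORM":
        self.consume_tokens(...)  # unchanged immediate-charge path
    return result
```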
```python
except Exception as e:
    logging.error(f"Failed to persist chat_intent={intent_name}: {e}")

self._chat_intent_by_msg[chat_message_id] = intent_name
```
Cache write outside the try — silently locks in failure.
If the `ChatIntent.objects.get` / `chat_message.save` block above raised, the bare `except Exception` swallowed it and we still reach this line and record `intent_name` in the cache. Subsequent events on the same stream then short-circuit on cached, never retrying the persist. Meanwhile the socket push at `chat_ai_context._process_chat_intent` does still send the name to the frontend (it reads from `kwargs["chat_intent"]`, which is the cached return value), so the UI shows the intent during streaming but the DB row stays null forever — historical reload is then inconsistent with what the user just saw.
Move the cache write into the try block (after the save), and narrow the except so `ChatIntent.DoesNotExist` (typo / unseeded value) doesn't get conflated with transient `OperationalError`s.
Suggested shape:
```python
try:
    chat_intent_obj = ChatIntent.objects.get(name=intent_name)
except ChatIntent.DoesNotExist:
    logging.error(f"Unknown chat_intent name from AI: {intent_name!r}")
    return None  # don't cache; let a later event try again

try:
    with transaction.atomic():
        ChatMessage.objects.filter(chat_message_id=chat_message_id) \
            .update(chat_intent=chat_intent_obj)
        Chat.objects.filter(chat_messages__chat_message_id=chat_message_id) \
            .update(chat_intent=chat_intent_obj)
except DatabaseError:
    logging.exception("Failed to persist chat_intent")
    return None

self._chat_intent_by_msg[chat_message_id] = intent_name
return intent_name
```

```python
    chat_message.chat.chat_intent = chat_intent
    chat_message.chat.save(update_fields=["chat_intent"])
except Exception as e:
    logging.error(f"Failed to persist chat_intent={intent_name}: {e}")
```
Bare `except Exception` masks distinct failure modes.
This catches `ChatIntent.DoesNotExist` (typo / unseeded value), `OperationalError` (transient DB), `IntegrityError`, etc. all the same. The frontend then renders a response with no Apply / Run SQL footer (because Conversation.jsx returns null when `chat_intent_name` is missing), and the only signal is one `logging.error` line. For something that controls billing routing and the primary CTA on every AI response, that's too quiet — narrow the except, and at minimum surface a sentry/metric so we notice when the AI starts emitting an unknown intent name.
See companion comment on the line below for the cache-write-outside-try issue, which compounds this.
```python
chat_message_id = data["chat_message_id"]
content = data["content"]

chat_intent = self._resolve_chat_intent(chat_message_id, data, content)
```
Retry no longer signals "this is a retry" to the AI gatekeeper.
Pre-PR, the `is_retry` branch in `process_prompt` set `chat_intent = ChatMessageStatus.TRANSFORM_RETRY` and forwarded it in `llm_payload`. Post-PR the assignment is gone and `chat_intent` is no longer in the outbound payload at all (see `process_prompt` below). The AI gatekeeper now classifies the retry prompt — which is literally `"Faulty yaml: <yaml> \n Error: <error>"` — from scratch. There's a real chance it routes that to SQL or INFO instead of TRANSFORM, breaking retry.
Second, related concern: when the gatekeeper does classify the retry, the new `CHAT_INTENT` event will hit this `_resolve_chat_intent` and overwrite `ChatMessage.chat_intent`. If the new classification differs from the original message's intent, the row's intent flips on retry — surprising for anyone reading historical data, and breaks the ResponseFooter rendering for the original message.
Mitigations (pick one; a sketch of the second option follows below):

- Keep an explicit `is_retry: True` field in `llm_payload` and have the AI side short-circuit the gatekeeper when set
- Or have `_resolve_chat_intent` not overwrite `ChatMessage.chat_intent` once it's already set
- And add retry coverage to the test plan — there's none right now
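A sketch of the second mitigation, slotting into the persist step of the "suggested shape" from the earlier comment — first write wins, so a retry's re-classification can't flip the stored intent:

```python
# Inside _resolve_chat_intent, replacing the unconditional update:
updated = ChatMessage.objects.filter(
    chat_message_id=chat_message_id,
    chat_intent__isnull=True,  # only write if intent isn't already set
).update(chat_intent=chat_intent_obj)

if not updated:
    # Already persisted (e.g. a retry re-classification arrived) —
    # keep and return the original value instead of overwriting.
    return (
        ChatMessage.objects
        .values_list("chat_intent__name", flat=True)
        .get(chat_message_id=chat_message_id)
    )
```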
```python
class ChatSerializer(serializers.ModelSerializer):
    user = UserMinimalSerializer(read_only=True)
    chat_intent_name = serializers.CharField(
        source='chat_intent.display_name', read_only=True, default=None
    )
```
Same field name (`chat_intent_name`), different source from the chat-message serializer:

- here: `source='chat_intent.display_name'` → e.g. `"Transform"`
- in chat_message_serializer.py:14: `source='chat_intent.name'` → e.g. `"TRANSFORM"`
The rationale (chat list shows the human label, message-level needs the canonical token for branching) is fine, but the asymmetry under one shared field name is undocumented and a future bug magnet — anyone wiring up a new component will assume both return the same thing. Either rename the chat-list one to chat_intent_label (or similar, sketched below) or pick a single source and have the frontend map for display.
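A sketch of the rename option (field declarations only; `Meta` and the other fields are omitted):

```python
from rest_framework import serializers

class ChatSerializer(serializers.ModelSerializer):
    # Human-readable label for the chat-list badge, e.g. "Transform"
    chat_intent_label = serializers.CharField(
        source="chat_intent.display_name", read_only=True, default=None
    )

class ChatMessageSerializer(serializers.ModelSerializer):
    # Canonical token for frontend branching, e.g. "TRANSFORM"
    chat_intent_name = serializers.CharField(
        source="chat_intent.name", read_only=True, default=None
    )
```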
```python
class ChatMessageSerializer(serializers.ModelSerializer):
    user = UserMinimalSerializer(read_only=True)
    chat_intent_name = serializers.CharField(source='chat_intent.name', read_only=True, default=None)
```
See companion comment on chat/serializers.py:9. This one returns chat_intent.name ("TRANSFORM") while the chat-list serializer returns chat_intent.display_name ("Transform") under the same field name. Worth disambiguating.
```python
    Subsequent events re-use the cached value.
    """
    if not hasattr(self, "_chat_intent_by_msg"):
        self._chat_intent_by_msg = {}
```
Cleaner to initialize `self._chat_intent_by_msg = {}` in `__init__` rather than lazy-init via `hasattr`. If `LLMServerContext` is ever pooled or reused across requests (the rest of the class is stateful enough that this seems plausible), this dict carries entries across requests because there's no reset point. An explicit init in `__init__` makes the lifecycle obvious.
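For example (the constructor signature is assumed, since it isn't in this diff):

```python
class LLMServerContext:
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # One cache per context instance; the lifecycle is explicit and the
        # per-event hasattr guard in _resolve_chat_intent goes away.
        self._chat_intent_by_msg = {}
```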
```python
try:
    from backend.core.models.chat_intent import ChatIntent
    from backend.core.models.chat_message import ChatMessage
```
These imports run on every stream event (the helper is invoked from process_message for all event types, not just chat_intent). Python's import cache makes the cost small but non-zero, and there's no circular-import excuse here — both modules are already top-level imported elsewhere. Hoist to the top of the file.
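I.e. a module-top import block (paths taken from the diff above):

```python
from backend.core.models.chat_intent import ChatIntent
from backend.core.models.chat_message import ChatMessage
```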
What
Migrates chat-intent classification from the client to the AI service. Removes the user-facing intent Segmented selector, stops sending `chat_intent_id` in `POST /chat`, and instead persists the AI-detected intent on `ChatMessage` by reading a dedicated `CHAT_INTENT` stream event from the AI service. Pushes `chat_intent_name` to the frontend over the channel socket so `ResponseFooter` can render the correct branch (Apply / Run SQL / Build Models) without a refresh.

Why
The visitran-ai PR (Zipstack/visitran-ai#270) added a server-side gatekeeper that classifies prompts into TRANSFORM or SQL automatically. With that in place, the client-side intent toggle becomes redundant — the user picking "SQL" vs "TRANSFORM" up-front no longer affects routing on the AI side. Keeping the toggle would be both confusing UX (a control that doesn't do anything) and a maintenance burden (state plumbed through eight components for nothing). This PR brings the visitran client in line with the AI service's new contract.
How
The backend stops reading `chat_intent_id` from the request and stops forwarding `chat_intent` in the outbound LLM payload. A new `_resolve_chat_intent` helper extracts the intent from the inbound `CHAT_INTENT` (event_type=5) Redis stream message — fired by the AI service immediately after gatekeeper classification, before any `prompt_response` chunks — and persists it on `ChatMessage` on first sight. A new `_process_chat_intent` dispatcher pushes `chat_intent_name` to the frontend via the existing channel socket (added to `send_socket_message`'s `allowed_args`). Both `ChatSerializer` and `ChatMessageSerializer` now expose `chat_intent_name` so the UI can render historical messages without a separate id→name lookup.
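The dispatcher's exact signature isn't shown in this diff; a minimal sketch of the shape described above (argument names assumed, and grounded in the review note that it reads `kwargs["chat_intent"]`):

```python
def _process_chat_intent(self, sid, channel_id, **kwargs):
    # kwargs["chat_intent"] carries the value returned by _resolve_chat_intent.
    intent_name = kwargs.get("chat_intent")
    if intent_name:  # null intents are dropped, not emitted
        send_socket_message(sid, channel_id, chat_intent_name=intent_name)
```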
The frontend removes the Segmented intent selector and all the prop drilling for `chatIntents` / `selectedChatIntent` / `setSelectedChatIntent` / `selectedChatIntentName` across PromptActions, Body, NewChat, ExistingChat, InputPrompt, Conversation, PastConversations, ResponseFooter, and ActionButtons. Conversation.jsx derives its intent object directly from `message.chat_intent_name`; ChatAI.jsx merges `chat_intent_name` into local message state when the new socket event arrives; and Body.jsx falls `llm_model_developer` back to the architect model so the now-removed Coder selector doesn't leave the column null at Chat creation.

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
Yes — four points worth calling out:

- The visitran-ai side's gatekeeper and `CHAT_INTENT` stream-event publish MUST be deployed first. Without them, visitran-cloud will receive no `CHAT_INTENT` event, `ChatMessage.chat_intent` will stay null, and the response footer (Apply / Run SQL buttons) will not render.
- `pluggable_apps/subscriptions/context.py` still reads `chat_message.chat_intent.name` to decide TRANSFORM-deferred vs immediate charging. The logic is unchanged — it just now sees an AI-detected intent instead of a user-selected one. Worth verifying the first TRANSFORM and SQL prompts in production charge correctly.
- Existing messages already have `chat_intent` populated from the user's prior selection. They continue to render correctly because the new `chat_intent_name` serializer field resolves the same FK on the model side.

Database Migrations
None. The `ChatIntent` model, FK columns, and seed data are all preserved unchanged — the migration is purely in where the value comes from (AI response instead of user input).

Env Config
None.
Relevant Docs
TODO
Related Issues or PRs
None.
Dependencies Versions
None.
Notes on Testing
- A TRANSFORM prompt persists `ChatMessage.chat_intent="TRANSFORM"` in the DB, and renders the Build Models / Apply button without a page refresh
- A SQL prompt persists `ChatMessage.chat_intent="SQL"`, and renders the Run SQL button
- The `POST /chat` request body contains no `chat_intent_id` (DevTools Network tab)
- `ChatTokenCost` row written for each …
- Intent badge renders from `chat.chat_intent_name` on the chat-list serializer
- `chat_intent` / `chat_intent_name` populated on each message (chat-message serializer's new field)

Screenshots
Chat input – Selector removed
TRANSFORM response – Build Models button rendered after streaming
SQL response – Run button rendered after streaming
Past conversations panel – Intent badge
Checklist
I have read and understood the Contribution Guidelines.