
feat: add get_app_runtime_logs tool to global chat #9502

Draft
centdix wants to merge 11 commits into main from claude-runtime-logs-chat-tool

Conversation

centdix commented Jun 10, 2026
Collaborator

Summary

Adds session-only global-chat diagnostics for raw app previews. The assistant can inspect the live preview's browser console with get_app_runtime_logs, list backend runnable jobs triggered by that preview with list_app_runs, and then use get_job_logs on a returned job_id for server-side debugging.

The tool outputs are split deliberately: the model receives explicit, instructional next steps, while the UI only shows a short minimal status such as Read 3 runtime logs or Listed 2 app runs.
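
A minimal sketch of that split, assuming a result object with the two fields named in this PR (aiResult and uiMessage). The SplitToolResult type, the RuntimeLogLine shape, the buildRuntimeLogsResult helper, and the exact wording of the next-step hints are hypothetical and are not taken from the diff.

```typescript
// Hypothetical result shape: only the field names aiResult and uiMessage come from the PR.
interface SplitToolResult {
  aiResult: string // instructional text handed to the model
  uiMessage: string // terse status shown in the chat UI
}

// Assumed minimal log shape, for illustration only.
interface RuntimeLogLine {
  level: string
  message: string
}

function buildRuntimeLogsResult(logs: RuntimeLogLine[]): SplitToolResult {
  const count = logs.length
  const noun = count === 1 ? 'runtime log' : 'runtime logs'
  const body = logs.map((l) => `${l.level.toUpperCase()}: ${l.message}`).join('\n')
  // The model gets an explicit next step; the wording here is illustrative.
  const nextStep =
    count === 0
      ? 'No runtime logs were captured. Ask the user to interact with the preview, then retry.'
      : 'If these logs point at a backend failure, call list_app_runs, then get_job_logs with a returned job_id.'
  return {
    aiResult: [body, nextStep].filter((part) => part.length > 0).join('\n\n'),
    uiMessage: `Read ${count} ${noun}`
  }
}
```

Keeping the two strings in one object lets the handler decide both audiences in one place, so the UI status cannot drift from what the model was told.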

The companion windmill-code-ui-builder preview-shell change has merged, and this PR now pins the frontend UI builder artifact to that merged build (ad76918) so the preview can answer runtime-log requests.

Enregistrement.de.l.ecran.2026-06-10.a.17.52.54.mov

Changes

  • Adds get_app_runtime_logs as a session-only global chat tool, including schema, prompt guidance, handler registration, and tests.
  • Adds list_app_runs as a session-only global chat tool so the assistant can find raw app backend runnable job IDs and chain to get_job_logs.
  • Gates both tools to AI sessions via SESSION_PREVIEW_TOOL_NAMES, keeping them out of the regular global side-panel chat.
  • Introduces explicit AI-facing vs UI-facing tool results: instructional aiResult for the model, terse uiMessage for the visible tool status.
  • Wires raw app runtime log requests from RawAppEditor to the preview iframe with request IDs, timeout cleanup, and teardown cleanup.
  • Tracks raw app backend runs from the editor runner's jobs / jobsById state and exposes them through the session runtime.
  • Registers raw app log and run providers through RawAppEditorView and dispatches handlers by the calling session ID.
  • Formats runtime logs and app runs in raw_apps/utils.ts with explicit next-step instructions for the assistant.
  • Keeps raw app backend message handling scoped to the runner's own iframe to avoid duplicate runnable executions from multiple mounted previews.
  • Bumps frontend/scripts/ui_builder_artifact.json to the merged preview-shell artifact ad76918.
  • Extends global chat tests for the new tools, session-only gating, default limits, explicit limits, and minimal UI statuses.
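
The session-only gating described in the list above can be sketched as a filter over tool definitions. SESSION_PREVIEW_TOOL_NAMES and the two gated tool names come from this PR; the ToolDef shape and the toolsForChat helper are hypothetical stand-ins (the real filter is globalToolsFor, whose signature is not shown here), and treating get_job_logs as an ungated tool is an assumption.

```typescript
// The two tools this PR gates to AI sessions.
const SESSION_PREVIEW_TOOL_NAMES: ReadonlySet<string> = new Set([
  'get_app_runtime_logs',
  'list_app_runs'
])

// Hypothetical minimal tool definition.
interface ToolDef {
  name: string
}

// Hypothetical helper: session chats see every tool, regular chats lose the gated ones.
function toolsForChat<T extends ToolDef>(tools: T[], inSession: boolean): T[] {
  return inSession ? tools : tools.filter((t) => !SESSION_PREVIEW_TOOL_NAMES.has(t.name))
}

// Assumption: get_job_logs is an ordinary, ungated tool.
const exampleTools: ToolDef[] = [
  { name: 'get_job_logs' },
  { name: 'get_app_runtime_logs' },
  { name: 'list_app_runs' }
]
```

Filtering by name at a single choke point means a handler fallback is only a second line of defense, not the primary gate.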

Test plan

  • npm run test:unit -- --run src/lib/components/copilot/chat/global/core.test.ts — 67 tests passed
  • Svelte autofixer on modified Svelte surfaces / focused raw app editor bridge
  • git diff --check
  • node ./scripts/untar_ui_builder.js from frontend/ — downloaded and checksum-verified ui_builder-ad76918.tar.gz
  • Browser smoke test on localhost:3080: log in, open workspace test, open raw app editor u/admin/two_plus_two, verify editor and preview iframe mount
  • npm run check:fast currently blocked by unrelated existing errors in ChatContextPicker.svelte:62 and EditTokenScopesModal.svelte:73
  • npm run check currently blocked by existing EditTokenScopesModal.svelte:73 error plus existing Svelte warnings
  • Open a raw app preview in an AI session, emit console logs/errors, call get_app_runtime_logs, and confirm the assistant receives recent logs oldest-first
  • Trigger a raw app backend runnable, call list_app_runs, then call get_job_logs with the returned job_id

centdix and others added 2 commits June 10, 2026 02:15
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages Bot commented Jun 10, 2026

Deploying windmill with Cloudflare Pages

Latest commit: ccbb995
Status: ✅  Deploy successful!
Preview URL: https://e05f78a5.windmill.pages.dev
Branch Preview URL: https://claude-runtime-logs-chat-too.windmill.pages.dev

centdix and others added 7 commits June 10, 2026 15:30
…hat-tool

# Conflicts:
#	frontend/src/lib/components/sessions/RawAppEditorView.svelte
#	frontend/src/lib/components/sessions/sessionRuntime.svelte.ts
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
centdix marked this pull request as ready for review June 10, 2026 15:54

claude Bot commented Jun 10, 2026
Contributor

Claude finished @centdix's task in 9m 53s


cc @centdix

Mergeable, but ideally address three nits first: the iframe-supplied requestId is used as a plain-object key in RawAppEditor.svelte; the runtime-log entry shape feeding formatRuntimeLogsForChat is unvalidated; and sessionRuntime.svelte.ts/utils.ts carry unrelated formatting churn that diverges from the repo prettier config.

Findings

Found 3 issues (all posted as inline comments):

  1. [P2] pendingRuntimeLogReqs is a plain Record indexed by the iframe-supplied e.data.requestId. A lookup with an inherited key ('__proto__', 'constructor', 'toString') returns a truthy non-pending value, passes the if (!pending) return guard, and then throws pending.resolve is not a function inside the window message listener. The preview iframe runs arbitrary user app code and the surrounding listener explicitly treats it as a spoofable source. Use a Map<string, PendingRuntimeLogRequest> (or guard with Object.hasOwn) so only locally-generated UUID keys ever resolve.
    frontend/src/lib/components/raw_apps/RawAppEditor.svelte:1107

  2. [P2] runtimeLogsResponse payloads are forwarded with only an Array.isArray(e.data.logs) check (RawAppEditor.svelte:1005), but formatRuntimeLogsForChat then calls e.level.toUpperCase() and new Date(e.ts).toISOString() — the latter throws RangeError for finite-but-out-of-range ts, so the Number.isFinite guard is insufficient. A malformed entry posted by preview app code makes the get_app_runtime_logs tool throw instead of returning a result. Coerce/validate each entry's shape (string level/message, in-range ts) when resolving the request or inside the formatter.
    frontend/src/lib/components/raw_apps/utils.ts:31

  3. [P2] ~60 lines of pure re-indentation churn unrelated to the feature: the rawApp/savedRawApp union types and the baseline/structuredClone block in sessionRuntime.svelte.ts, and the genWmillTs template interpolations in utils.ts. The new indentation contradicts frontend/.prettierrc — running npm run format would revert these hunks — and it pollutes blame/merge history. Revert the non-functional hunks (or run the repo prettier over the touched files).
    frontend/src/lib/components/sessions/sessionRuntime.svelte.ts:87

Focus areas checked — no findings

  • postMessage bridge: inbound messages are gated by e.source === previewIframe?.contentWindow identity checks in both RawAppEditor.svelte and the new guard in RawAppBackgroundRunner.svelte (which also fixes the runner previously accepting messages from any window). Outbound '*' targetOrigin matches the established pattern for the same-origin preview iframe and carries no sensitive data.
  • Request-ID/timeout lifecycle: delete-before-resolve prevents double resolution, the timeout path and late-response path are mutually exclusive, and the onMount cleanup (e12e786) resolves all pending requests on teardown so awaiting tool calls can't hang.
  • Provider registration: single registration site per session runtime (RawAppEditorView → sessionRuntime); Svelte's destroy-before-create ordering means the unconditional onRuntimeLogRequester?.(undefined) cleanup can't clobber a newer editor's registration in practice.
  • Session-only gating: both new tools are in SESSION_PREVIEW_TOOL_NAMES, globalToolsFor filters them out of non-session chats, core.ts has a no-handler fallback, and the handler returns a polite error when no runtime resolves for the session id. Tests cover all three layers.
  • jobs/jobsById tracking for list_app_runs: the status heuristic holds in both completion paths (timing-fetch success sets duration_ms on the replacing object; timing-fetch failure leaves the original object with the job.result write visible).
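
The request-ID/timeout lifecycle checked above can be sketched as a small registry. This is an illustration under stated assumptions, not the code in RawAppEditor.svelte: the PendingRequests class and its method names are hypothetical, IDs come from a counter instead of randomUUID to keep the sketch dependency-free, and the 2000 ms default mirrors the 2s timeout mentioned in this review.

```typescript
type LogEntries = unknown[]

interface PendingRequest {
  resolve: (entries: LogEntries | undefined) => void
  timer: ReturnType<typeof setTimeout>
}

// Hypothetical registry; onDone fires exactly once per request.
class PendingRequests {
  private pending = new Map<string, PendingRequest>()
  private nextId = 0

  // Registers a request and arms its timeout; returns the locally generated ID.
  register(onDone: (entries: LogEntries | undefined) => void, timeoutMs = 2000): string {
    const id = `req-${this.nextId++}`
    const timer = setTimeout(() => this.settle(id, undefined), timeoutMs)
    this.pending.set(id, { resolve: onDone, timer })
    return id
  }

  // Returns false for unknown or already-settled IDs, so late responses are ignored.
  settle(id: string, entries: LogEntries | undefined): boolean {
    const entry = this.pending.get(id)
    if (!entry) return false
    this.pending.delete(id) // delete before resolve: a re-entrant call cannot resolve twice
    clearTimeout(entry.timer)
    entry.resolve(entries)
    return true
  }

  // Teardown resolves everything with undefined so awaiting callers cannot hang.
  teardown(): void {
    for (const id of [...this.pending.keys()]) this.settle(id, undefined)
  }

  get size(): number {
    return this.pending.size
  }
}
```

Because the timeout path and the response path both go through settle, whichever runs first removes the entry and the other becomes a no-op.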

Test coverage

Frontend-only diff. The pure-logic tool plumbing in copilot/chat/global/core.ts is covered by new tests in its sibling core.test.ts: handler dispatch with session id and default/explicit limits, no-handler error paths, SESSION_PREVIEW_TOOL_NAMES gating, and system-prompt mention gating. The new helpers in raw_apps/utils.ts (formatRuntimeLogsForChat, formatAppRunsForChat) have no sibling *.test.ts, and the shared policy does not expect tests for that file family. If finding 2 is fixed with validation logic, though, a small test would be cheap. No component tests are expected for the Svelte files.

Manual verification still needed before merge: in an AI session, open a raw app preview, trigger console output and an uncaught error in the app, then ask the chat to read runtime logs — get_app_runtime_logs should return the formatted lines and the tool status should collapse to the terse UI message. Trigger a backend.<id>() call from the preview, then ask for app runs — list_app_runs should list the job newest-first with a job_id that works when chained into get_job_logs. Confirm the bumped ui_builder artifact (ad76918) actually answers getRuntimeLogs with runtimeLogsResponse — that contract lives outside this repo, so it can only be verified live; with a stale/unresponsive preview the tool should fall back to the "no preview running" message after the 2s timeout rather than hanging. Finally, confirm the regular (non-session) global chat does not offer either tool.


  • Gather PR context (diff, changed files, prior discussion)
  • Run fresh-context review per REVIEW.md (severity triage, public-surface checklist, AGENTS.md compliance)
  • Post inline comments for line-specific findings
  • Post review summary with verdict and test-coverage assessment

@github-actions

Contributor

Pi Review

cc @centdix

Should address issues before merging: broken randomUUID import in RawAppEditor.svelte


P1 — Broken import: randomUUID not exported from FlowChatManager.svelte

File: frontend/src/lib/components/raw_apps/RawAppEditor.svelte, line 53

import { randomUUID } from '$lib/components/flows/conversations/FlowChatManager.svelte'

randomUUID is not exported from FlowChatManager.svelte.ts (the actual module at that path). The module only exports:

  • ChatMessage
  • ConversationWithDraft
  • FlowChatManager (class)
  • createFlowChatManager

Calling randomUUID() at line 1117 inside requestRuntimeLogs will throw a runtime error. The get_app_runtime_logs tool will fail whenever a raw app preview is open in a session.

Per AGENTS.md, randomUUID should come from $lib/utils/uuid:

do not call crypto.randomUUID() in frontend code. Import randomUUID from $lib/utils/uuid instead.

Fix: Replace the import with import { randomUUID } from '$lib/utils/uuid'.


Test coverage

Frontend (all changes are frontend-only):

The PR adds tests in core.test.ts covering:

  • Session-only gating (tools missing outside a session, present inside)
  • Handler dispatch with default and explicit limits
  • System-prompt inclusion/exclusion gating
  • Tool-status callback flow

What's not covered: The integration path from the RawAppEditor iframe → requestRuntimeLogs → postMessage → response → resolvePendingRuntimeLogRequest → sessionRuntime.svelte.ts handler → formatRuntimeLogsForChat → AI result. This chain is exercised only by manual browser testing, which is appropriate for Svelte component-level messaging. No additional automated tests expected.

The broken import (P1 above) would be caught by TypeScript if npm run check passed, but the PR notes it's blocked by pre-existing errors.

Manual verification still needed (per the PR's own test plan, items still unchecked):

  • Open a raw app preview in an AI session, emit console logs/errors, call get_app_runtime_logs, and confirm the assistant receives recent logs oldest-first.
  • Trigger a raw app backend runnable, call list_app_runs, then call get_job_logs with the returned job_id.

These are particularly important because the preview-shell artifact (ad76918) is a separate external component — the postMessage contract between the Windmill frontend and the ui-builder preview must be verified end-to-end.

@github-actions

Contributor

Codex Review

cc @centdix
Should address issues before merging: broken randomUUID import in RawAppEditor.svelte

Found 1 issue:

  1. [P1] frontend/src/lib/components/raw_apps/RawAppEditor.svelte:53 imports randomUUID from FlowChatManager.svelte, but that module only imports randomUUID for its own use and does not export it. This makes the new runtime-log request path fail at typecheck/bundle time when RawAppEditor imports a non-exported symbol. It also violates the root AGENTS.md rule: “Frontend UUIDs: do not call crypto.randomUUID() in frontend code. Import randomUUID from $lib/utils/uuid instead.” Import randomUUID directly from $lib/utils/uuid.

Test coverage

Frontend unit coverage was added for the global chat tool registration, session-only gating, defaults, and UI status plumbing. No Svelte component tests are expected here.

Manual verification still needed before merge: in an AI session, open a raw app preview, emit console logs/errors from the live preview, call get_app_runtime_logs, and confirm the assistant receives the recent logs in the expected order. Also trigger a backend runnable from the raw app, call list_app_runs, then use the returned job_id with get_job_logs and confirm the server-side logs are reachable.

centdix marked this pull request as draft June 10, 2026 16:01

    requestId: string,
    entries: RawAppRuntimeLogEntry[] | undefined
) {
    const pending = pendingRuntimeLogReqs[requestId]

Contributor

[P2] pendingRuntimeLogReqs is a plain Record indexed by the iframe-supplied e.data.requestId. A lookup with an inherited key ('__proto__', 'constructor', 'toString') returns a truthy non-pending value, passes the if (!pending) return guard, and then throws pending.resolve is not a function inside the window message listener. The preview iframe runs arbitrary user app code and the listener explicitly treats it as spoofable. Use a Map<string, PendingRuntimeLogRequest> (or guard with Object.hasOwn) so only locally-generated UUID keys ever resolve.
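
A small sketch of why the plain-object lookup misbehaves and how the two suggested fixes differ. The Pending type and the three lookup helpers are hypothetical names for illustration; the guarded variant uses Object.prototype.hasOwnProperty.call, which serves the same purpose here as the Object.hasOwn guard suggested above and also compiles on older TypeScript targets.

```typescript
// Hypothetical pending-request shape.
type Pending = { resolve: (value: unknown) => void }

const pendingAsRecord: Record<string, Pending> = {}
const pendingAsMap = new Map<string, Pending>()

// Unsafe: inherited keys such as 'constructor' resolve through Object.prototype.
function lookupRecord(id: string): Pending | undefined {
  return pendingAsRecord[id]
}

// Safe: only own properties count.
function lookupRecordGuarded(id: string): Pending | undefined {
  return Object.prototype.hasOwnProperty.call(pendingAsRecord, id) ? pendingAsRecord[id] : undefined
}

// Safe: a Map has no inherited string keys.
function lookupMap(id: string): Pending | undefined {
  return pendingAsMap.get(id)
}

// With both containers empty, an attacker-chosen ID behaves differently per lookup.
const spoofedId = 'constructor'
const unsafeHit = lookupRecord(spoofedId) // truthy: the built-in Object function
const guardedHit = lookupRecordGuarded(spoofedId) // undefined
const mapHit = lookupMap(spoofedId) // undefined
```

The unsafe hit is truthy but has no resolve method, which is what turns into the "pending.resolve is not a function" failure described in the comment.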

export function formatRuntimeLogsForChat(entries: RawAppRuntimeLogEntry[]): string {
    const lines = entries.map((e) => {
        const time = Number.isFinite(e.ts) ? new Date(e.ts).toISOString().slice(11, 23) : '--:--:--'
        return `[${time}] ${e.level.toUpperCase()}: ${e.message}`

Contributor

[P2] runtimeLogsResponse payloads are forwarded with only an Array.isArray check (RawAppEditor.svelte:1005), but this formatter calls e.level.toUpperCase() and new Date(e.ts).toISOString() — the latter throws RangeError for finite-but-out-of-range ts, so the Number.isFinite guard is insufficient. A malformed entry posted by preview app code makes get_app_runtime_logs throw instead of returning a result. Coerce/validate each entry's shape (string level/message, in-range ts) before formatting.
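
One way the suggested validation could look, as a sketch rather than the PR's fix. The RawAppRuntimeLogEntry fields are inferred from the formatter excerpt above (level, message, ts); sanitizeEntry, formatTime, and formatRuntimeLogsSafely are hypothetical names, and the fallback values ('log', the placeholder message) are assumptions. The 8.64e15 bound is the largest absolute time value a JavaScript Date can hold.

```typescript
// Assumed entry shape, inferred from the fields the formatter reads.
interface RawAppRuntimeLogEntry {
  level: string
  message: string
  ts: number
}

// Time values beyond this magnitude make toISOString throw RangeError.
const MAX_DATE_MS = 8.64e15

// Coerces an untrusted value into a well-formed entry, or drops it.
function sanitizeEntry(raw: unknown): RawAppRuntimeLogEntry | undefined {
  if (typeof raw !== 'object' || raw === null) return undefined
  const r = raw as Record<string, unknown>
  return {
    level: typeof r.level === 'string' ? r.level : 'log',
    message: typeof r.message === 'string' ? r.message : '[non-string message]',
    ts: typeof r.ts === 'number' ? r.ts : Number.NaN
  }
}

// Formats the time of day, falling back for non-finite or out-of-range timestamps.
function formatTime(ts: number): string {
  if (!Number.isFinite(ts) || Math.abs(ts) > MAX_DATE_MS) return '--:--:--'
  const iso = new Date(ts).toISOString()
  // Slice relative to 'T' because expanded years change the prefix length.
  return iso.slice(iso.indexOf('T') + 1, -1)
}

function formatRuntimeLogsSafely(rawEntries: unknown[]): string {
  const lines: string[] = []
  for (const raw of rawEntries) {
    const e = sanitizeEntry(raw)
    if (e) lines.push(`[${formatTime(e.ts)}] ${e.level.toUpperCase()}: ${e.message}`)
  }
  return lines.join('\n')
}
```

Validating inside the formatter keeps the tool returning a result even when preview app code posts a malformed payload; validating when the request resolves would work equally well.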

custom_path?: string
}
| undefined
| {

Contributor

[P2] ~60 lines of pure re-indentation churn unrelated to the feature: the rawApp/savedRawApp union types and the baseline/structuredClone block here, plus the genWmillTs template interpolations in raw_apps/utils.ts. The new indentation contradicts frontend/.prettierrc — running npm run format would revert these hunks — and it pollutes blame. Revert the non-functional hunks or run the repo prettier over the touched files.
