
fix: add memory pressure guard to prevent OOM during heavy OCR load #5819

Open

kodjima33 wants to merge 1 commit into main from fix/issue-5791

Conversation

@kodjima33
Collaborator

Fixes #5791

Problem: During video calls, OCR processing can't keep up with capture rate, causing memory to spike to 2-3GB and potentially OOM crash.

Fix: Added a memory pressure safety valve that skips frame capture entirely when the app's resident memory exceeds 1.5GB. This complements the existing per-frame backpressure mechanism with a global memory-based circuit breaker.

  • Checks mach_task_basic_info.resident_size before each capture
  • Skips capture when memory exceeds 1.5GB threshold
  • Logs dropped frame counts and recovery
  • Minimal overhead: single syscall per capture cycle

Scope: Minimal bug fix — no new features or refactoring.
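For reference, a minimal sketch of the check described above, assuming the member names mentioned in the review below (isMemoryPressureHigh(), memoryPressureThresholdBytes); the exact integration into the capture cycle may differ from the actual diff:

```swift
import Darwin

// Assumed 1.5 GB threshold, per the PR description.
private let memoryPressureThresholdBytes: UInt64 = 1_500 * 1_024 * 1_024

/// Returns true when the process's resident memory exceeds the threshold.
/// Single task_info() syscall per call, matching the "minimal overhead" claim.
private func isMemoryPressureHigh() -> Bool {
    var info = mach_task_basic_info()
    var count = mach_msg_type_number_t(MemoryLayout<mach_task_basic_info>.size) / 4
    let result = withUnsafeMutablePointer(to: &info) {
        $0.withMemoryRebound(to: integer_t.self, capacity: Int(count)) {
            task_info(mach_task_self_, task_flavor_t(MACH_TASK_BASIC_INFO), $0, &count)
        }
    }
    // Fail open: if the syscall fails, don't suppress capture.
    guard result == KERN_SUCCESS else { return false }
    return info.resident_size > memoryPressureThresholdBytes
}
```

The capture timer would then call this before each cycle and return early (incrementing a drop counter) when it reports true.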

@greptile-apps
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR adds a memory-pressure circuit breaker to ProactiveAssistantsPlugin that skips frame capture when the app's resident memory exceeds 1.5 GB, complementing the existing per-frame backpressure mechanism to prevent OOM crashes during sustained heavy OCR load (e.g., video calls). The implementation is minimal and surgical, but contains one real bug and one best-practice concern.

Changes:

  • Adds isMemoryPressureHigh() using mach_task_basic_info.resident_size to check memory before each capture cycle
  • Skips capture and increments memoryPressureDropCount when threshold is exceeded
  • Logs dropped-frame counts on first occurrence and every 30th frame, then logs recovery when memory drops back below threshold

Issues found:

  • memoryPressureDropCount is never reset in stopMonitoring(), unlike the analogous droppedFrameCount which is explicitly logged and zeroed there. This will produce a misleading "Recovered after N dropped frames" log at the start of a new monitoring session if the previous one ended under high memory.
  • resident_size (RSS) includes shared system frameworks charged to all processes, which can cause the guard to fire earlier than warranted. Apple's recommended metric for true per-process memory pressure is phys_footprint from task_vm_info_data_t.

Confidence Score: 3/5

  • Safe to merge with the counter-reset bug fixed; the worst-case impact is a misleading log entry and a slightly over-eager guard threshold.
  • The core guard logic is correct and the syscall usage is idiomatic Swift. The missing reset of memoryPressureDropCount in stopMonitoring() is a real (if low-severity) bug that produces misleading diagnostics. Using resident_size instead of phys_footprint means the threshold can fire earlier than necessary but won't cause correctness issues — it's a conservative over-guard rather than an under-guard.
  • desktop/Desktop/Sources/ProactiveAssistants/ProactiveAssistantsPlugin.swift — specifically the stopMonitoring() method and the isMemoryPressureHigh() implementation.

Important Files Changed

Filename: desktop/Desktop/Sources/ProactiveAssistants/ProactiveAssistantsPlugin.swift
Overview: Adds a mach_task_basic_info-based memory pressure circuit breaker before each frame capture; one real bug (counter not reset in stopMonitoring()) and one best-practice concern (resident_size vs phys_footprint).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[captureTimer fires] --> B{isMonitoring?}
    B -- No --> Z[return]
    B -- Yes --> C{isInSpecialSystemMode?}
    C -- Yes --> Z
    C -- No --> D{isMemoryPressureHigh?\nresident_size > 1.5 GB}
    D -- Yes --> E[memoryPressureDropCount++]
    E --> F{count == 1 OR\ncount % 30 == 0?}
    F -- Yes --> G[log: Skipping capture\nN frames dropped]
    F -- No --> Z
    G --> Z
    D -- No --> H{memoryPressureDropCount > 0?}
    H -- Yes --> I[log: Recovered after N frames\nmemoryPressureDropCount = 0]
    H -- No --> J[continue capture pipeline]
    I --> J
    J --> K{isVideoCallApp?}
    K -- Yes, throttled --> Z
    K -- No / allowed frame --> L[captureFrame & OCR]

Comments Outside Diff (1)

  1. desktop/Desktop/Sources/ProactiveAssistants/ProactiveAssistantsPlugin.swift, lines 471–475

    P1 memoryPressureDropCount not reset in stopMonitoring()

    memoryPressureDropCount is never reset when monitoring is stopped. Compare this to droppedFrameCount on lines 472–475, which is explicitly logged and reset to 0 inside stopMonitoring().

    If the user stops monitoring while memory is still above the threshold (i.e., memoryPressureDropCount > 0), then when monitoring restarts and the first frame captures successfully (after memory has recovered), the "Recovered" log on line 599 will fire with the stale counter from the previous session, producing a misleading message like "Recovered, resuming capture after 47 dropped frames" even though those drops occurred in a prior session.
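    A hypothetical sketch of the suggested fix, mirroring how droppedFrameCount is handled; the surrounding members (captureTimer, isMonitoring, logger) are assumptions about the plugin's internals:

    ```swift
    func stopMonitoring() {
        captureTimer?.invalidate()
        captureTimer = nil
        isMonitoring = false
        // Log and zero the counter so a later session can't emit a stale
        // "Recovered after N dropped frames" message.
        if memoryPressureDropCount > 0 {
            logger.info("Stopped with \(memoryPressureDropCount) frames dropped under memory pressure")
            memoryPressureDropCount = 0
        }
    }
    ```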

Last reviewed commit: "fix: add memory pres..."

    // …tail of the current resident_size-based check (quoted diff context):
        }
    }
    guard result == KERN_SUCCESS else { return false }
    return info.resident_size > memoryPressureThresholdBytes
Contributor


P2 resident_size overstates actual app memory footprint

resident_size (RSS) includes all physical pages currently mapped into the process address space — including shared system frameworks (AppKit, CoreFoundation, etc.) that are shared across every macOS app. This means the 1.5 GB guard will trip earlier than necessary, potentially suspending capture even when the app's true memory usage is well within limits.

Apple's recommended metric for actual process memory pressure is phys_footprint from task_vm_info_data_t, which only counts memory pages actually charged to this process. Consider using it for a more accurate threshold:

private func isMemoryPressureHigh() -> Bool {
    var info = task_vm_info_data_t()
    var count = mach_msg_type_number_t(MemoryLayout<task_vm_info_data_t>.size) / 4
    let result = withUnsafeMutablePointer(to: &info) {
        $0.withMemoryRebound(to: integer_t.self, capacity: Int(count)) {
            task_info(mach_task_self_, task_flavor_t(TASK_VM_INFO), $0, &count)
        }
    }
    guard result == KERN_SUCCESS else { return false }
    return info.phys_footprint > memoryPressureThresholdBytes
}

@kodjima33 force-pushed the fix/issue-5791 branch 4 times, most recently from 160bde0 to 0cbaf63 on March 24, 2026 at 06:11
…spikes (#5791)

- Add 50MB hard cap on pending frame data bytes
- Track pendingFrameDataBytes across both capture paths
- Drop frames when buffer exceeds cap (prevents 2-3GB memory spikes)
- Reset counter when monitoring stops

Fixes #5791
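The byte-cap approach in this commit message can be sketched roughly as follows. pendingFrameDataBytes and the 50 MB cap come from the commit text; enqueue(frame:), ocrQueue, and process(_:) are hypothetical names, and the counter is assumed to be mutated only on a single serial queue:

```swift
import Dispatch
import Foundation

// Hard cap from the commit message: 50 MB of buffered frame data.
private let pendingFrameDataCapBytes = 50 * 1_024 * 1_024
private var pendingFrameDataBytes = 0

/// Drops the frame outright when buffered bytes would exceed the cap,
/// instead of letting the OCR backlog grow to multiple gigabytes.
func enqueue(frame: Data) {
    guard pendingFrameDataBytes + frame.count <= pendingFrameDataCapBytes else {
        droppedFrameCount += 1
        return
    }
    pendingFrameDataBytes += frame.count
    ocrQueue.async { [weak self] in
        self?.process(frame)
        // Release the byte budget once OCR has consumed the frame.
        self?.pendingFrameDataBytes -= frame.count
    }
}
```

Per the commit bullets, the same counter would be tracked on both capture paths and reset to 0 when monitoring stops.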
@beastoin
Collaborator

Hey @kodjima33 👋 Friendly reminder — this PR has been quiet for a bit. If you don't need further assistance from the CTO, feel free to go ahead and merge it. Let's close the loop and not let it go stale. Thanks!



Development

Successfully merging this pull request may close these issues.

Desktop: ProactiveAssistants frame backpressure can cause 2-3GB memory spikes

2 participants