fix: cap payload depth and improve serialized arg readability for Hawk#74
fix: cap payload depth and improve serialized arg readability for Hawk#74pavelzotikov wants to merge 10 commits intomasterfrom
Conversation
Harden stack-trace and event payload building when frames include deeply nested or self-referential data (e.g. $GLOBALS): cap recursion depth in Serializer::prepare and EventPayloadBuilder::transformForJson, limit arguments per frame and per-line size, and keep only safe frame keys. Improve readability of serialized argument values for the Hawk UI by using JSON_PRETTY_PRINT and inserting zero-width break opportunities in long string scalars (valid JSON, smaller horizontal overflow). Tests updated to compare structured JSON and to cover long-string soft breaks.
Co-authored-by: Peter <specc.dev@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens Hawk event/stacktrace payload building against deeply nested and self-referential data while improving how serialized argument values display in the Hawk UI.
Changes:
- Add max-depth caps to
Serializer::prepare()andEventPayloadBuilder::transformForJson()to prevent runaway recursion. - Limit stacktrace argument count/line length and prevent raw
argsfrom being copied intoadditionalData. - Switch serializer output to
JSON_PRETTY_PRINTand insert soft wrap opportunities into long scalar strings; update/add unit tests accordingly.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/Serializer.php |
Adds max-depth protection and soft-break insertion for long strings; uses JSON_PRETTY_PRINT. |
src/EventPayloadBuilder.php |
Normalizes frames more defensively, formats/limits arguments, and depth-limits additional data transformation. |
src/StacktraceFrameBuilder.php |
Adds argument count limiting and truncates long “name = value” lines. |
tests/Unit/SerializerTest.php |
Updates assertions to compare decoded JSON structures and adds coverage for soft-break insertion + depth truncation. |
tests/Unit/EventPayloadBuilderTest.php |
Adds targeted tests for backtrace normalization behavior (args handling, limits, truncation, deep additional data). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return implode($zwsp, $parts); | ||
| } | ||
|
|
||
| return implode($zwsp, str_split($value, $chunk)); |
| return substr($line, 0, self::MAX_ARGUMENT_LINE_BYTES - 3) . '...'; | ||
| } | ||
|
|
| if (strlen($line) <= $maxBytes) { | ||
| return $line; | ||
| } | ||
|
|
||
| return substr($line, 0, $maxBytes - 3) . '...'; |
There was a problem hiding this comment.
Trade-off: Serialized argument values are intentionally not byte-capped so lines stay full valid JSON for tooling; size is mitigated by argument count, depth, and circular markers; stricter limits belong at the API/transport layer or an optional SDK setting.
| if (strlen($line) <= $maxBytes) { | ||
| return $line; | ||
| } | ||
|
|
||
| return substr($line, 0, $maxBytes - 3) . '...'; |
https://github.com/codex-team/hawk.php into fix/stack-trace-serialization-globals-and-readability
- Stop truncating the value side of `name = serializedValue`; only cap parameter names (MAX_ARGUMENT_NAME_BYTES). - Prebuilt argument lines without ` = ` pass through untouched. - Remove MAX_ARGUMENT_LINE_BYTES / MAX_ARGUMENT_VALUE_BYTES; update tests.
There was a problem hiding this comment.
Pull request overview
This PR hardens Hawk event/stacktrace payload generation against deeply nested and cyclic data structures, while reshaping stack argument formatting to better match Hawk’s expected arguments string-list contract.
Changes:
- Add depth/cycle protection to
Serializer::prepare()and switch serializer output to pretty-printed JSON. - Update
EventPayloadBuilder::normalizeBacktrace()to map rawargsinto Hawkargumentsstrings and keep unknown frame fields inadditionalDatawith depth limiting. - Extend
StacktraceFrameBuilderwith argument count limits and name-side truncation helpers; add tests for these behaviors.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/Serializer.php |
Adds max-depth and circular detection in prepare(), and uses JSON_PRETTY_PRINT output. |
src/EventPayloadBuilder.php |
Normalizes backtraces by mapping args→arguments, skipping raw args in additionalData, and depth-limiting JSON transforms. |
src/StacktraceFrameBuilder.php |
Caps argument count per frame and introduces helpers to truncate argument names / normalize prebuilt lines. |
tests/Unit/SerializerTest.php |
Updates assertions to compare decoded JSON and adds tests for depth, circular arrays, and long scalar handling. |
tests/Unit/EventPayloadBuilderTest.php |
Adds coverage for normalizeBacktrace() argument mapping, count limit, prebuilt-line handling, and deep additional data. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (is_array($value)) { | ||
| $result = []; | ||
| foreach ($value as $k => $v) { | ||
| $result[$k] = $this->transformForJson($v); | ||
| $result[$k] = $this->transformForJson($v, $depth + 1); | ||
| } |
There was a problem hiding this comment.
transformForJson() only enforces a max depth; it does not detect circular/self-referential arrays. For values like $GLOBALS (or any array with a self-reference), this will repeatedly re-walk the same structure until the depth limit and can still produce a very large nested payload and high CPU. Consider adding circular reference detection (similar to Serializer::prepare()), returning a sentinel like [circular] when an ancestor is encountered.
| /** | ||
| * Build Hawk `arguments`: string list like "name = serializedValue" (from raw `args` via Serializer). | ||
| * Limits the number of lines ({@see StacktraceFrameBuilder::MAX_FRAME_ARGUMENTS}). Serialized values are not | ||
| * length-truncated; only param names are capped ({@see StacktraceFrameBuilder::formatTruncatedArgumentLine}); | ||
| * prebuilt strings are split on the first `" = "` with {@see StacktraceFrameBuilder::truncatePrebuiltArgumentLine}. |
There was a problem hiding this comment.
The PR description mentions capping/truncating each stack argument line by byte length, but buildArgumentsList() explicitly keeps serialized values unbounded and the tests preserve very long prebuilt argument lines. This can still produce oversized events. Either implement a per-line byte cap (for both prebuilt arguments and formatted args) or update the PR description/contract expectations to match the current behavior.
| public static function formatTruncatedArgumentLine(string $name, string $serializedValue): string | ||
| { | ||
| $namePart = self::truncateUtf8StringToMaxBytes($name, self::MAX_ARGUMENT_NAME_BYTES); | ||
|
|
||
| return $namePart . ' = ' . $serializedValue; | ||
| } |
There was a problem hiding this comment.
formatTruncatedArgumentLine() only truncates the argument name; the serialized value (and therefore the full "name = value" line) can still be arbitrarily large (e.g., very long strings/arrays), which can defeat the goal of keeping payloads small. Consider enforcing an explicit max byte length for the entire line and applying it consistently to both formatted and prebuilt argument lines.
| * Lines without `" = "` are returned as-is (no length limit). | ||
| */ | ||
| public static function truncatePrebuiltArgumentLine(string $line): string | ||
| { | ||
| $separator = ' = '; | ||
| $position = strpos($line, $separator); | ||
| if ($position === false) { | ||
| return $line; |
There was a problem hiding this comment.
truncatePrebuiltArgumentLine() returns lines without the " = " delimiter unchanged (no length cap). That allows a single prebuilt argument entry to be extremely large and bypass any payload-size protections. Consider applying the same max-byte truncation to the whole line even when the delimiter is missing.
| * Lines without `" = "` are returned as-is (no length limit). | |
| */ | |
| public static function truncatePrebuiltArgumentLine(string $line): string | |
| { | |
| $separator = ' = '; | |
| $position = strpos($line, $separator); | |
| if ($position === false) { | |
| return $line; | |
| * Lines without `" = "` are truncated as whole lines to the configured byte cap. | |
| */ | |
| public static function truncatePrebuiltArgumentLine(string $line): string | |
| { | |
| $separator = ' = '; | |
| $position = strpos($line, $separator); | |
| if ($position === false) { | |
| return self::truncateUtf8StringToMaxBytes($line, self::MAX_ARGUMENT_NAME_BYTES); |
| try { | ||
| $newArguments[] = sprintf('%s = %s', $name, $value); | ||
| $newArguments[] = self::formatTruncatedArgumentLine((string) $name, $value); | ||
| } catch (\Exception $e) { | ||
| // Ignore unknown types | ||
| } |
There was a problem hiding this comment.
The try/catch around formatTruncatedArgumentLine() appears unnecessary: neither serializeValue() nor formatTruncatedArgumentLine() throws, so the catch block should be unreachable. Removing it would simplify the control flow (or, if an exception is expected, catch a specific exception from the actual throwing call).
…ry/catch in stack args - Mirror Serializer-style reference stack for additionalData arrays ([circular]). - Remove unreachable try/catch around formatTruncatedArgumentLine. - Add test for self-referential custom frame data.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return substr($string, 0, $cutLength) . '...'; | ||
| } | ||
|
|
||
| /** |
Replace raw substr fallback with a valid-UTF-8 byte prefix so json_encode cannot fail on split codepoints; add StacktraceFrameBuilder tests.
Summary
Hardens Hawk event and stacktrace payloads when frames carry deeply nested, self-referential, or bulky data (e.g. $GLOBALS-shaped arrays in arguments or extra frame fields). Stack arguments stay a list of strings name = serializedValue: serialized values are full JSON from Serializer (pretty-printed, circular keys become [circular]) so they stay json_decode-friendly in the Hawk UI; only parameter names are shortened. Argument count per frame is capped; there is no byte cap on the value side by design (trade-off vs. transport/API limits elsewhere).
What changed