feat: deep source provenance, inline query rules, and public env API #4
Open
theaspirational wants to merge 3 commits into RayforceDB:master
Replaces the two stub implementations of dl_get_provenance_src_offsets
and dl_get_provenance_src_data with a full CSR-format source tracking
pass that runs after the existing rule-attribution provenance.
For each derived (IDB) row, the engine now records which rows in
which body relations contributed to the derivation, stored as two
parallel vectors on dl_rel_t:
prov_src_offsets — I64[nrows+1] in CSR format: offsets[i] is the
start index in prov_src_data for derived row i.
prov_src_data — flat I64 vector of packed source references,
each entry = (relation_index << 32) | row_index.
This gives callers a complete derivation trace without materialising
intermediate proof trees. The encoding is self-contained: relation
indices refer to prog->rels[], making the result portable alongside
the program.
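As an illustration of the layout described above, here is a minimal C sketch of how a caller might decode the two vectors. The helper names (`src_pack`, `src_rel`, `src_row`, `src_count`, `src_print_row`) are hypothetical and not part of this PR. The sketch works on plain `int64_t` arrays rather than the engine's own vector type, and it assumes the usual CSR convention that `offsets[i + 1]` marks the end of row `i`'s slice.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: pack one source reference using the
 * (relation_index << 32) | row_index layout from the commit message. */
static inline int64_t src_pack(uint32_t rel, uint32_t row) {
    /* Assumption: relation indices stay below 2^31 so the packed
     * value remains non-negative when stored in a signed I64. */
    assert(rel <= INT32_MAX);
    return (int64_t)(((uint64_t)rel << 32) | (uint64_t)row);
}

/* Upper 32 bits: index into prog->rels[]. */
static inline uint32_t src_rel(int64_t packed) {
    return (uint32_t)((uint64_t)packed >> 32);
}

/* Lower 32 bits: row index inside that relation. */
static inline uint32_t src_row(int64_t packed) {
    return (uint32_t)((uint64_t)packed & 0xFFFFFFFFu);
}

/* Number of source entries recorded for derived row i. */
static inline int64_t src_count(const int64_t *offsets, int64_t i) {
    return offsets[i + 1] - offsets[i];
}

/* Print every source entry of derived row i by walking its CSR slice. */
static inline void src_print_row(const int64_t *offsets, const int64_t *data,
                                 int64_t i) {
    for (int64_t k = offsets[i]; k < offsets[i + 1]; k++) {
        printf("derived row %" PRId64 " <- rels[%" PRIu32 "] row %" PRIu32 "\n",
               i, src_rel(data[k]), src_row(data[k]));
    }
}
```

With `offsets = {0, 2, 2, 3}`, row 0 has two sources, row 1 has none, and row 2 has one. Finding a row's slice takes two array reads, so no search is needed.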
Caveats:
- Body-only variables (variables that appear in body atoms but not in
the head) are unconstrained during source lookup. Entries may be a
superset of the true proof for such rules.
- Only populated when DL_FLAG_PROVENANCE is set.
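To make the first caveat concrete, the sketch below models a source lookup that constrains only a head-visible column. It is an illustration under stated assumptions, not the engine's actual lookup code: `pair_t` and `match_first_col` are hypothetical names, and the rule in mind is a transitive step of the form path(x, z) derived from edge(x, y) and path(y, z), where y appears only in the body.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row type for a binary relation such as edge(x, y). */
typedef struct {
    int64_t a; /* first column, bound by the head variable x */
    int64_t b; /* second column, bound only by the body variable y */
} pair_t;

/* Collect the indices of rows whose first column equals x.
 * The second column is deliberately not checked, which models a
 * body-only variable left unconstrained during source lookup.
 * Returns the number of indices written to out (at most cap). */
static inline size_t match_first_col(const pair_t *rows, size_t n, int64_t x,
                                     size_t *out, size_t cap) {
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        if (rows[i].a == x && m < cap) {
            out[m++] = i;
        }
    }
    return m;
}
```

Given edge rows (1,2), (1,3), (2,4) and the derived row path(1,4), a lookup with x = 1 returns rows 0 and 1. Only row 0 belongs to the proof through path(2,4); row 1 is the kind of extra entry the caveat describes.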
Both vectors are released in dl_program_free.
Tests: two new cases in test/test_datalog.c verify the CSR structure
for a base-case path rule and confirm the flag-guard behaviour.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Add optional (rules ...) clause to (query ...) that passes rules inline at query time, following the Datomic/DataScript pattern.
- Extract dl_parse_inline_rule() and dl_parse_rule_from_head_and_body()
- Parse (rules ((head ...) body1 body2 ...) ...) in ray_query_fn
- When (rules ...) present, use only inline rules (ignore globals)
- When absent, use globals (backward compatible)
- Add tests for inline rules, globals fallback, override semantics
Made-with: Cursor
ray_env_get and ray_env_set exist in src/lang/env.c and are stable across upstream refactors. Promoting them to the public header enables embedding use cases where host code needs to bind named values into the evaluator's environment.
Made-with: Cursor
Summary
Three related additions to the Datalog engine, each independently useful and backward compatible.
1. Deep source provenance (dl_get_provenance_src_offsets / dl_get_provenance_src_data)
Replaces the two stub implementations with a full CSR-format source tracking pass that runs after the existing rule-attribution provenance (prov_col).
When DL_FLAG_PROVENANCE is set, for each derived row the engine records which rows in which body relations contributed to the derivation. Fields on dl_rel_t:
- prov_src_offsets, I64[nrows+1]: offsets[i] = start index in prov_src_data for derived row i
- prov_src_data, I64[total]: (relation_index << 32) | row_index
CSR format keeps the result in two flat allocations per relation, lets callers slice out sources for any row in O(1), and mirrors the encoding already used by the engine's CSR edge indices.
Caveat: for rules where a variable appears only in body atoms, that variable is unconstrained during source lookup. Entries may be a superset of the true derivation (all body rows whose head-visible columns match). This is documented in the header. Over-reporting was preferred because false negatives (missing true sources) would be worse.
Both vectors are released in dl_program_free.
2. Inline query rules via (rules ...) clause
Adds an optional fourth argument to (query ...) that supplies rules inline, Datomic-style, instead of using the global rule set.
Each entry in (rules ...) follows the same head+body syntax as (rule ...). When the clause is present, only those rules are loaded into the temporary dl_program_t; when absent, the existing global-copy path runs unchanged.
This enables multi-database sessions where different queries carry different rule sets without global state pollution, and makes rule sets composable as plain data (lists of lists).
Internally, dl_parse_rule_from_head_and_body is extracted as a shared helper used by both ray_rule_fn and the new inline parse path, removing the duplication.
3. Promote ray_env_get / ray_env_set to public API
Adds two declarations, ray_env_get and ray_env_set, to include/rayforce.h.
Both functions already exist and are widely used internally (src/lang/env.c). Exposing them allows embedders to bind named values into the evaluation environment and retrieve them after eval, which is the natural way to pass tables into Rayfall queries by name and read results back. Without this, embedders have to include the internal env.h header directly.
Changes
- src/ops/datalog.h: dl_rel_t; stub comments replaced with full docs for both getters
- src/ops/datalog.c: dl_build_source_prov() (new); dl_parse_rule_from_head_and_body() (extracted helper); inline rules parsing in ray_query_fn; real getter implementations; dl_program_free cleanup
- include/rayforce.h: ray_env_get / ray_env_set declarations
- test/test_datalog.c
- test/test_main.c: /datalog suite
All 575 tests pass.