diff --git a/README.md b/README.md index f515061..b31cd68 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,10 @@ # SparseDiffPy -Python bindings for [SparseDiffEngine](https://github.com/SparseDifferentiation/SparseDiffEngine), a C library for computing sparse Jacobians and Hessians. +Python library for computing **sparse Jacobians** and **Hessians** of nonlinear expressions via automatic differentiation. + +SparseDiffPy wraps [SparseDiffEngine](https://github.com/SparseDifferentiation/SparseDiffEngine), a C library that exploits the sparsity structure of expression graphs to compute derivatives efficiently. Instead of building dense Jacobian and Hessian matrices, SparseDiffPy analyzes the expression graph at compile time to determine which entries are structurally nonzero, then computes only those entries. The results are returned as `scipy.sparse.csr_matrix` objects, ready for use in optimization solvers. + +This matters for large-scale nonlinear optimization, where the Jacobian and Hessian are typically very sparse — a variable in one constraint rarely affects all other constraints. Computing full dense matrices wastes both time and memory. ## Installation @@ -8,10 +12,221 @@ Python bindings for [SparseDiffEngine](https://github.com/SparseDifferentiation/ pip install sparsediffpy ``` -## Usage +Requires Python >= 3.11 and NumPy >= 2.0.0. + +### Building from source + +```bash +git clone https://github.com/SparseDifferentiation/SparseDiffPy.git +cd SparseDiffPy +git submodule update --init +pip install -e . +``` + +## Quick start + +```python +import sparsediffpy as sp +import numpy as np + +# 1. Create a scope (owns the variable space) +scope = sp.Scope() + +# 2. Declare variables (always 2D: rows, cols) +x = scope.Variable(3, 1) # column vector with 3 elements + +# 3. Build an expression using operators and functions +f = sp.exp(x) + 2.0 * x + +# 4. Compile (analyzes sparsity — do this once) +fn = sp.compile(f) + +# 5. 
Set variable values and evaluate +x.value = np.array([1.0, 2.0, 3.0]) + +fn.forward() # array([4.71828183, 11.3890561, 26.08553692]) +fn.jacobian() # 3x3 sparse CSR matrix +fn.hessian(weights=np.ones(3)) # 3x3 sparse CSR matrix +``` + +## Core concepts + +### Scope, Variables, and Parameters + +A **Scope** owns the flat variable buffer. All variables must belong to a scope. + +```python +scope = sp.Scope() +x = scope.Variable(3, 1) # 3x1 column vector +Y = scope.Variable(2, 3) # 2x3 matrix +a = scope.Variable(1, 1) # scalar +``` + +**Parameters** are fixed data that can be updated without recompiling: + +```python +A = scope.Parameter(3, 3, value=np.eye(3)) + +f = A @ x +fn = sp.compile(f) + +x.value = np.array([1.0, 2.0, 3.0]) +fn.jacobian() # uses A = eye(3) + +A.value = 2 * np.eye(3) +fn.jacobian() # uses A = 2*eye(3), no recompile needed +``` + +### Building expressions + +Use Python operators and named functions. NumPy arrays and scalars auto-convert to constants: + +```python +x = scope.Variable(3, 1) +A = np.array([[1, 0, 0], [0, 2, 0], [0, 0, 3]], dtype=float) +b = np.array([1.0, 2.0, 3.0]) + +f = A @ x + b # linear +g = sp.sin(x) + sp.exp(x) # nonlinear elementwise +h = x.T @ x # quadratic (scalar output) +``` + +Broadcasting follows NumPy/CVXPY conventions: + +```python +a = scope.Variable(1, 1) # scalar +X = scope.Variable(3, 2) # matrix +r = scope.Variable(1, 2) # row vector +c = scope.Variable(3, 1) # column vector + +f = a * X + r + c # scalar, row, and column all broadcast to (3, 2) +``` + +### Compiling and evaluating + +`sp.compile(f)` analyzes the sparsity pattern of the expression (this is the expensive step — do it once). 
The returned object is cheap to evaluate repeatedly: + +```python +fn = sp.compile(f) + +x.value = some_values +fn.forward() # evaluate f(x) +fn.jacobian() # sparse Jacobian df/dx as csr_matrix +fn.hessian(weights=w) # sparse Hessian of w^T f(x) as csr_matrix +``` + +For a vector-valued function `f: R^n -> R^m`, the **Jacobian** is the `m x n` matrix of partial derivatives. The **Hessian** requires a weight vector `w` of length `m` — it computes the `n x n` Hessian of the scalar function `w^T f(x)`. + +### Solver integration + +In a solver loop, use `scope.set_values()` to write the entire flat variable vector at once: + +```python +scope = sp.Scope() +x = scope.Variable(3, 1) +f = sp.sin(x) +fn = sp.compile(f) + +def eval_f(x_flat): + scope.set_values(x_flat) + return fn.forward() + +def eval_jac(x_flat): + scope.set_values(x_flat) + return fn.jacobian() + +def eval_hess(x_flat, weights): + scope.set_values(x_flat) + return fn.hessian(weights) + +# These functions can be passed directly to scipy.optimize, IPOPT, etc. 
+``` + +With multiple variables, the flat vector concatenates them in declaration order: + +```python +scope = sp.Scope() +x = scope.Variable(3, 1) # occupies flat positions [0, 1, 2] +y = scope.Variable(2, 1) # occupies flat positions [3, 4] + +# In a solver callback: +def eval_jac(z_flat): + scope.set_values(z_flat) # z_flat has length 5 + return fn.jacobian() # returns a sparse matrix with 5 columns +``` + +## Supported operations + +### Arithmetic operators + +| Operator | Description | +|---|---| +| `x + y` | Addition (with broadcasting) | +| `x - y` | Subtraction (with broadcasting) | +| `-x` | Negation | +| `x * y` | Elementwise multiplication (with broadcasting) | +| `x @ y` | Matrix multiplication | +| `x ** p` | Power (constant exponent) | +| `x[i]`, `x[0:3]`, `x[[0, 2]]` | Indexing and slicing | +| `x.T` | Transpose | + +### Elementwise functions + +| Function | Description | Domain | +|---|---|---| +| `sp.exp(x)` | Exponential | all reals | +| `sp.sin(x)` | Sine | all reals | +| `sp.cos(x)` | Cosine | all reals | +| `sp.tan(x)` | Tangent | x != pi/2 + k*pi | +| `sp.sinh(x)` | Hyperbolic sine | all reals | +| `sp.tanh(x)` | Hyperbolic tangent | all reals | +| `sp.asinh(x)` | Inverse hyperbolic sine | all reals | +| `sp.atanh(x)` | Inverse hyperbolic tangent | \|x\| < 1 | +| `sp.log(x)` | Natural logarithm | x > 0 | +| `sp.logistic(x)` | Softplus: log(1 + exp(x)) | all reals | +| `sp.normal_cdf(x)` | Standard normal CDF | all reals | +| `sp.entr(x)` | Entropy: -x log(x) | x > 0 | +| `sp.xexp(x)` | x exp(x) | all reals | +| `sp.power(x, p)` | Power with float exponent | depends on p | + +### Reductions + +| Function | Description | +|---|---| +| `sp.sum(x)` | Sum all elements | +| `sp.sum(x, axis=0)` | Sum along rows | +| `sp.sum(x, axis=1)` | Sum along columns | +| `sp.prod(x)` | Product of all elements | +| `sp.prod(x, axis=0)` | Product along rows | +| `sp.prod(x, axis=1)` | Product along columns | +| `sp.trace(X)` | Matrix trace (square matrices) | + 
+### Structural operations + +| Function | Description | +|---|---| +| `sp.hstack([a, b, c])` | Horizontal concatenation (same row count) | +| `sp.vstack([a, b, c])` | Vertical concatenation (same column count) | +| `sp.reshape(x, d1, d2)` | Reshape (preserves total size) | +| `sp.diag_vec(x)` | Diagonal matrix from column vector | + +### Special functions + +| Function | Description | +|---|---| +| `sp.quad_form(x, Q)` | Quadratic form: x^T Q x | +| `sp.quad_over_lin(x, z)` | sum(x^2) / z | +| `sp.rel_entr(x, y)` | Relative entropy: x log(x/y) | + +## Shapes + +All shapes are 2D tuples `(rows, cols)`, matching the underlying C engine. There is no 1D shorthand — use `Variable(3, 1)` for a column vector, `Variable(1, 3)` for a row vector: ```python -from sparsediffpy import _sparsediffengine +x = scope.Variable(3, 1) # column vector +r = scope.Variable(1, 3) # row vector +M = scope.Variable(3, 3) # matrix +a = scope.Variable(1, 1) # scalar ``` ## License diff --git a/SparseDiffEngine b/SparseDiffEngine deleted file mode 160000 index bcdb0f0..0000000 --- a/SparseDiffEngine +++ /dev/null @@ -1 +0,0 @@ -Subproject commit bcdb0f0e74670b80b0f60b7ff02338dfa325fdf0 diff --git a/SparseDiffEngine b/SparseDiffEngine new file mode 120000 index 0000000..684f0b6 --- /dev/null +++ b/SparseDiffEngine @@ -0,0 +1 @@ +/Users/daniel/Documents/software/NLP/SparseDiff/SparseDiffEngine \ No newline at end of file diff --git a/sparsediffpy/__init__.py b/sparsediffpy/__init__.py index 9b73c36..29480ac 100644 --- a/sparsediffpy/__init__.py +++ b/sparsediffpy/__init__.py @@ -14,4 +14,28 @@ limitations under the License. 
""" +# C extension (low-level, for advanced users) from sparsediffpy import _sparsediffengine # noqa: F401 + +# Core classes +from sparsediffpy._core._scope import Scope, Variable, Parameter, DimensionError # noqa: F401 +from sparsediffpy._core._expression import Expression # noqa: F401 + +# Compile +from sparsediffpy._core._compile import compile # noqa: F401 + +# Elementwise functions +from sparsediffpy._core._fn_elementwise import ( # noqa: F401 + sin, cos, exp, log, tan, sinh, tanh, asinh, atanh, + logistic, normal_cdf, entr, xexp, power, +) + +# Affine / structural functions +from sparsediffpy._core._fn_affine import ( # noqa: F401 + diag_vec, trace, reshape, sum, prod, hstack, vstack, +) + +# Bivariate / special functions +from sparsediffpy._core._fn_bivariate import ( # noqa: F401 + quad_form, quad_over_lin, rel_entr, +) diff --git a/sparsediffpy/_bindings/bindings.c b/sparsediffpy/_bindings/bindings.c index f9b1ac9..77ca0db 100644 --- a/sparsediffpy/_bindings/bindings.c +++ b/sparsediffpy/_bindings/bindings.c @@ -44,6 +44,13 @@ #include "atoms/vector_mult.h" #include "atoms/xexp.h" +/* Include expression-level bindings */ +#include "expression/forward.h" +#include "expression/hessian.h" +#include "expression/init_derivatives.h" +#include "expression/jacobian.h" +#include "expression/update_params.h" + /* Include problem bindings */ #include "problem/constraint_forward.h" #include "problem/gradient.h" @@ -125,6 +132,18 @@ static PyMethodDef DNLPMethods[] = { {"get_expr_size", py_get_expr_size, METH_VARARGS, "Get the total size of an expression"}, {"make_reshape", py_make_reshape, METH_VARARGS, "Create reshape atom"}, + {"expr_forward", py_expr_forward, METH_VARARGS, + "Evaluate expression forward pass"}, + {"expr_init_jacobian", py_expr_init_jacobian, METH_VARARGS, + "Initialize Jacobian sparsity for expression"}, + {"expr_init_hessian", py_expr_init_hessian, METH_VARARGS, + "Initialize Hessian sparsity for expression"}, + {"expr_jacobian", py_expr_jacobian, 
METH_VARARGS, + "Evaluate expression Jacobian"}, + {"expr_hessian", py_expr_hessian, METH_VARARGS, + "Evaluate expression weighted-sum Hessian"}, + {"expr_update_params", py_expr_update_params, METH_VARARGS, + "Update parameter values and propagate refresh flag"}, {"make_problem", py_make_problem, METH_VARARGS, "Create problem from objective and constraints"}, {"problem_init_derivatives", py_problem_init_derivatives, METH_VARARGS, diff --git a/sparsediffpy/_bindings/expression/forward.h b/sparsediffpy/_bindings/expression/forward.h new file mode 100644 index 0000000..b897ca2 --- /dev/null +++ b/sparsediffpy/_bindings/expression/forward.h @@ -0,0 +1,45 @@ +#ifndef EXPR_FORWARD_H +#define EXPR_FORWARD_H + +#include "../atoms/common.h" + +static PyObject *py_expr_forward(PyObject *self, PyObject *args) +{ + PyObject *expr_capsule; + PyObject *u_obj; + + if (!PyArg_ParseTuple(args, "OO", &expr_capsule, &u_obj)) + { + return NULL; + } + + expr *node = (expr *) PyCapsule_GetPointer(expr_capsule, EXPR_CAPSULE_NAME); + if (!node) + { + PyErr_SetString(PyExc_ValueError, "invalid expression capsule"); + return NULL; + } + + PyArrayObject *u_array = + (PyArrayObject *) PyArray_FROM_OTF(u_obj, NPY_DOUBLE, NPY_ARRAY_IN_ARRAY); + if (!u_array) + { + return NULL; + } + + node->forward(node, (const double *) PyArray_DATA(u_array)); + Py_DECREF(u_array); + + npy_intp size = node->size; + PyObject *out = PyArray_SimpleNew(1, &size, NPY_DOUBLE); + if (!out) + { + return NULL; + } + memcpy(PyArray_DATA((PyArrayObject *) out), node->value, + size * sizeof(double)); + + return out; +} + +#endif /* EXPR_FORWARD_H */ diff --git a/sparsediffpy/_bindings/expression/hessian.h b/sparsediffpy/_bindings/expression/hessian.h new file mode 100644 index 0000000..7b12c0a --- /dev/null +++ b/sparsediffpy/_bindings/expression/hessian.h @@ -0,0 +1,56 @@ +#ifndef EXPR_HESSIAN_H +#define EXPR_HESSIAN_H + +#include "../atoms/common.h" + +static PyObject *py_expr_hessian(PyObject *self, PyObject *args) +{ 
+ PyObject *expr_capsule; + PyObject *weights_obj; + + if (!PyArg_ParseTuple(args, "OO", &expr_capsule, &weights_obj)) + { + return NULL; + } + + expr *node = (expr *) PyCapsule_GetPointer(expr_capsule, EXPR_CAPSULE_NAME); + if (!node) + { + PyErr_SetString(PyExc_ValueError, "invalid expression capsule"); + return NULL; + } + + PyArrayObject *weights_arr = (PyArrayObject *) PyArray_FROM_OTF( + weights_obj, NPY_DOUBLE, NPY_ARRAY_IN_ARRAY); + if (!weights_arr) + { + return NULL; + } + + node->eval_wsum_hess(node, (const double *) PyArray_DATA(weights_arr)); + Py_DECREF(weights_arr); + + CSR_Matrix *H = node->wsum_hess; + npy_intp nnz = H->nnz; + npy_intp n_plus_1 = H->n + 1; + + PyObject *data = PyArray_SimpleNew(1, &nnz, NPY_DOUBLE); + PyObject *indices = PyArray_SimpleNew(1, &nnz, NPY_INT32); + PyObject *indptr = PyArray_SimpleNew(1, &n_plus_1, NPY_INT32); + + if (!data || !indices || !indptr) + { + Py_XDECREF(data); + Py_XDECREF(indices); + Py_XDECREF(indptr); + return NULL; + } + + memcpy(PyArray_DATA((PyArrayObject *) data), H->x, nnz * sizeof(double)); + memcpy(PyArray_DATA((PyArrayObject *) indices), H->i, nnz * sizeof(int)); + memcpy(PyArray_DATA((PyArrayObject *) indptr), H->p, n_plus_1 * sizeof(int)); + + return Py_BuildValue("(OOO(ii))", data, indices, indptr, H->m, H->n); +} + +#endif /* EXPR_HESSIAN_H */ diff --git a/sparsediffpy/_bindings/expression/init_derivatives.h b/sparsediffpy/_bindings/expression/init_derivatives.h new file mode 100644 index 0000000..57b7dda --- /dev/null +++ b/sparsediffpy/_bindings/expression/init_derivatives.h @@ -0,0 +1,46 @@ +#ifndef EXPR_INIT_DERIVATIVES_H +#define EXPR_INIT_DERIVATIVES_H + +#include "../atoms/common.h" + +static PyObject *py_expr_init_jacobian(PyObject *self, PyObject *args) +{ + PyObject *expr_capsule; + + if (!PyArg_ParseTuple(args, "O", &expr_capsule)) + { + return NULL; + } + + expr *node = (expr *) PyCapsule_GetPointer(expr_capsule, EXPR_CAPSULE_NAME); + if (!node) + { + 
PyErr_SetString(PyExc_ValueError, "invalid expression capsule"); + return NULL; + } + + jacobian_init(node); + Py_RETURN_NONE; +} + +static PyObject *py_expr_init_hessian(PyObject *self, PyObject *args) +{ + PyObject *expr_capsule; + + if (!PyArg_ParseTuple(args, "O", &expr_capsule)) + { + return NULL; + } + + expr *node = (expr *) PyCapsule_GetPointer(expr_capsule, EXPR_CAPSULE_NAME); + if (!node) + { + PyErr_SetString(PyExc_ValueError, "invalid expression capsule"); + return NULL; + } + + wsum_hess_init(node); + Py_RETURN_NONE; +} + +#endif /* EXPR_INIT_DERIVATIVES_H */ diff --git a/sparsediffpy/_bindings/expression/jacobian.h b/sparsediffpy/_bindings/expression/jacobian.h new file mode 100644 index 0000000..e01f914 --- /dev/null +++ b/sparsediffpy/_bindings/expression/jacobian.h @@ -0,0 +1,47 @@ +#ifndef EXPR_JACOBIAN_H +#define EXPR_JACOBIAN_H + +#include "../atoms/common.h" + +static PyObject *py_expr_jacobian(PyObject *self, PyObject *args) +{ + PyObject *expr_capsule; + + if (!PyArg_ParseTuple(args, "O", &expr_capsule)) + { + return NULL; + } + + expr *node = (expr *) PyCapsule_GetPointer(expr_capsule, EXPR_CAPSULE_NAME); + if (!node) + { + PyErr_SetString(PyExc_ValueError, "invalid expression capsule"); + return NULL; + } + + node->eval_jacobian(node); + + CSR_Matrix *jac = node->jacobian; + npy_intp nnz = jac->nnz; + npy_intp m_plus_1 = jac->m + 1; + + PyObject *data = PyArray_SimpleNew(1, &nnz, NPY_DOUBLE); + PyObject *indices = PyArray_SimpleNew(1, &nnz, NPY_INT32); + PyObject *indptr = PyArray_SimpleNew(1, &m_plus_1, NPY_INT32); + + if (!data || !indices || !indptr) + { + Py_XDECREF(data); + Py_XDECREF(indices); + Py_XDECREF(indptr); + return NULL; + } + + memcpy(PyArray_DATA((PyArrayObject *) data), jac->x, nnz * sizeof(double)); + memcpy(PyArray_DATA((PyArrayObject *) indices), jac->i, nnz * sizeof(int)); + memcpy(PyArray_DATA((PyArrayObject *) indptr), jac->p, m_plus_1 * sizeof(int)); + + return Py_BuildValue("(OOO(ii))", data, indices, indptr, 
jac->m, jac->n); +} + +#endif /* EXPR_JACOBIAN_H */ diff --git a/sparsediffpy/_bindings/expression/update_params.h b/sparsediffpy/_bindings/expression/update_params.h new file mode 100644 index 0000000..17f5d9e --- /dev/null +++ b/sparsediffpy/_bindings/expression/update_params.h @@ -0,0 +1,72 @@ +#ifndef EXPR_UPDATE_PARAMS_H +#define EXPR_UPDATE_PARAMS_H + +#include "../atoms/common.h" +#include "subexpr.h" + +/* + * py_expr_update_params(root_capsule, param_capsule_list, theta_array) + * + * Updates parameter values from theta and propagates the refresh flag + * down the expression tree rooted at root_capsule. + */ +static PyObject *py_expr_update_params(PyObject *self, PyObject *args) +{ + PyObject *root_capsule; + PyObject *param_list; + PyObject *theta_obj; + + if (!PyArg_ParseTuple(args, "OOO", &root_capsule, ¶m_list, &theta_obj)) + { + return NULL; + } + + expr *root = (expr *) PyCapsule_GetPointer(root_capsule, EXPR_CAPSULE_NAME); + if (!root) + { + PyErr_SetString(PyExc_ValueError, "invalid root expression capsule"); + return NULL; + } + + if (!PyList_Check(param_list)) + { + PyErr_SetString(PyExc_TypeError, + "second argument must be a list of parameter capsules"); + return NULL; + } + + PyArrayObject *theta_arr = (PyArrayObject *) PyArray_FROM_OTF( + theta_obj, NPY_DOUBLE, NPY_ARRAY_IN_ARRAY); + if (!theta_arr) + { + return NULL; + } + + const double *theta = (const double *) PyArray_DATA(theta_arr); + Py_ssize_t n = PyList_Size(param_list); + + for (Py_ssize_t i = 0; i < n; i++) + { + PyObject *capsule = PyList_GetItem(param_list, i); + expr *pnode = (expr *) PyCapsule_GetPointer(capsule, EXPR_CAPSULE_NAME); + if (!pnode) + { + Py_DECREF(theta_arr); + PyErr_SetString(PyExc_ValueError, + "invalid parameter capsule in list"); + return NULL; + } + + parameter_expr *param = (parameter_expr *) pnode; + int offset = param->param_id; + memcpy(pnode->value, theta + offset, pnode->size * sizeof(double)); + } + + Py_DECREF(theta_arr); + + 
expr_set_needs_refresh(root); + + Py_RETURN_NONE; +} + +#endif /* EXPR_UPDATE_PARAMS_H */ diff --git a/sparsediffpy/_core/__init__.py b/sparsediffpy/_core/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/sparsediffpy/_core/_compile.py b/sparsediffpy/_core/_compile.py new file mode 100644 index 0000000..dca986a --- /dev/null +++ b/sparsediffpy/_core/_compile.py @@ -0,0 +1,339 @@ +"""compile() and CompiledExpression. + +The recursive tree walker that converts Python expression nodes to C capsules. +Node-type-to-C-call mapping lives in _registry.py. +""" + +import numpy as np +import scipy.sparse + +from sparsediffpy import _sparsediffengine as _C +from sparsediffpy._core._constants import Constant, SparseConstant +from sparsediffpy._core._nodes_affine import ( + LeftMatMul, + ParamScalarMult, + ParamVectorMult, + RightMatMul, +) +from sparsediffpy._core._registry import ( + ATOM_CONVERTERS, + _to_dense_row_major, + make_dense_left_matmul, + make_dense_right_matmul, + make_sparse_left_matmul, + make_sparse_right_matmul, +) +from sparsediffpy._core._scope import Parameter, Variable + + +def compile(expr): + """Compile an expression tree into a CompiledExpression. + + Walks the Python expression tree, discovers all Variables and Parameters, + builds C capsules bottom-up, and initializes sparsity patterns for + Jacobian and Hessian computation. + """ + # Collect all Variable and Parameter leaves. Raise an error + # if the expression does not contain any variable. 
+ variables = [] + parameters = [] + _collect_leaves(expr, variables, parameters, set()) + + if not variables: + raise ValueError("Expression must contain at least one Variable") + + # Check that all variables in the expression have the same scope + scope = variables[0]._scope + for v in variables[1:]: + if v._scope is not scope: + raise ValueError("All variables must belong to the same Scope") + + n_vars = scope._next_var_offset + + # Build C capsules bottom-up + capsule_cache = {} + param_capsules_ordered = [] + param_objects_ordered = [] + root_capsule = _build_capsule( + expr, n_vars, capsule_cache, param_capsules_ordered, param_objects_ordered + ) + + if param_capsules_ordered: + scope._params_dirty = True + + return CompiledExpression( + expr_capsule=root_capsule, + scope=scope, + param_capsules=param_capsules_ordered, + param_objects=param_objects_ordered, + expr_shape=expr.shape, + ) + + +# --------------------------------------------------------------------------- +# Tree walking +# --------------------------------------------------------------------------- + +def _collect_leaves(node, variables, parameters, visited): + """Walk the expression tree to collect Variable and Parameter leaves.""" + node_id = id(node) + if node_id in visited: + return + visited.add(node_id) + + if isinstance(node, Variable): + variables.append(node) + return + if isinstance(node, Parameter): + parameters.append(node) + return + if isinstance(node, (Constant, SparseConstant)): + return + + # Walk children. Nodes use one of three conventions: + # .child — unary ops (Neg, Sin, Exp, Reshape, ...) + # .left / .right — binary ops (Add, Multiply, MatMul, ParamScalarMult, ...) + # .matrix_expr — LeftMatMul / RightMatMul (the constant/parameter matrix) + # HStack uses .children. Some nodes combine these (e.g. LeftMatMul has both + # .child and .matrix_expr). 
+ if hasattr(node, "child"): + _collect_leaves(node.child, variables, parameters, visited) + if hasattr(node, "left"): + _collect_leaves(node.left, variables, parameters, visited) + if hasattr(node, "right"): + _collect_leaves(node.right, variables, parameters, visited) + if hasattr(node, "matrix_expr"): + _collect_leaves(node.matrix_expr, variables, parameters, visited) + if hasattr(node, "children"): + for c in node.children: + _collect_leaves(c, variables, parameters, visited) + + +# --------------------------------------------------------------------------- +# Capsule building +# --------------------------------------------------------------------------- + +def _build_capsule(node, n_vars, cache, param_caps, param_objs): + """Recursively build C capsules for the expression tree.""" + node_id = id(node) + + # catch common subexpressions + if node_id in cache: + return cache[node_id] + + cap = _convert_node(node, n_vars, cache, param_caps, param_objs) + + # Post-conversion dimension check + d1_c, d2_c = _C.get_expr_dimensions(cap) + d1_py, d2_py = node.shape + if d1_c != d1_py or d2_c != d2_py: + raise ValueError( + f"Dimension mismatch for {type(node).__name__}: " + f"C dimensions ({d1_c}, {d2_c}) vs Python dimensions ({d1_py}, {d2_py})" + ) + + cache[node_id] = cap + return cap + + +def _convert_node(node, n_vars, cache, param_caps, param_objs): + """Convert a single Python expression node to a C capsule.""" + + d1, d2 = node.shape + + # --- Leaves --- + if isinstance(node, Variable): + return _C.make_variable(d1, d2, node._var_id, n_vars) + + if isinstance(node, Parameter): + # Use current values if set, otherwise zeros as placeholder. + # Real values are synced via problem_update_params before evaluation. 
+ size = d1 * d2 + values = node._value_flat if node._value_flat is not None else np.zeros(size) + cap = _C.make_parameter(d1, d2, node._param_id, n_vars, values) + param_caps.append(cap) + param_objs.append(node) + return cap + + if isinstance(node, Constant): + return _C.make_parameter(d1, d2, -1, n_vars, node._value_flat) + + if isinstance(node, SparseConstant): + # right now we don't support sparse parameters in the C engine + return _C.make_parameter(d1, d2, -1, n_vars, node._to_dense_flat()) + + # --- Matmul and multiply with parameter dispatch --- + # These need special handling because they access matrix_expr / param_expr + # directly rather than going through a uniform children list. + if isinstance(node, LeftMatMul): + return _convert_left_matmul(node, n_vars, cache, param_caps, param_objs) + + if isinstance(node, RightMatMul): + return _convert_right_matmul(node, n_vars, cache, param_caps, param_objs) + + if isinstance(node, ParamScalarMult): + param_cap = _build_capsule(node.left, n_vars, cache, param_caps, param_objs) + child_cap = _build_capsule(node.right, n_vars, cache, param_caps, param_objs) + return _C.make_param_scalar_mult(param_cap, child_cap) + + if isinstance(node, ParamVectorMult): + param_cap = _build_capsule(node.left, n_vars, cache, param_caps, param_objs) + child_cap = _build_capsule(node.right, n_vars, cache, param_caps, param_objs) + return _C.make_param_vector_mult(param_cap, child_cap) + + # --- Registry lookup --- + node_type = type(node) + if node_type in ATOM_CONVERTERS: + child_caps = _build_children(node, n_vars, cache, param_caps, param_objs) + return ATOM_CONVERTERS[node_type](node, child_caps) + + raise TypeError(f"Unknown expression node type: {node_type.__name__}") + + +def _build_children(node, n_vars, cache, param_caps, param_objs): + """Build C capsules for all children of a node, returned as a list.""" + caps = [] + # Unary: .child + if hasattr(node, "child"): + caps.append(_build_capsule(node.child, n_vars, cache, 
param_caps, param_objs)) + # Binary: .left, .right + if hasattr(node, "left"): + caps.append(_build_capsule(node.left, n_vars, cache, param_caps, param_objs)) + if hasattr(node, "right"): + caps.append(_build_capsule(node.right, n_vars, cache, param_caps, param_objs)) + # HStack: .children + if hasattr(node, "children"): + for c in node.children: + caps.append(_build_capsule(c, n_vars, cache, param_caps, param_objs)) + return caps + + +# --------------------------------------------------------------------------- +# Left/right matmul converters +# These live here rather than in _registry.py because the Parameter case +# needs _build_capsule, which would create a circular dependency. +# --------------------------------------------------------------------------- + +def _convert_left_matmul(node, n_vars, cache, param_caps, param_objs): + """Convert A @ f(x).""" + child_cap = _build_capsule(node.child, n_vars, cache, param_caps, param_objs) + matrix = node.matrix_expr + m, n = matrix.shape + + if isinstance(matrix, SparseConstant): + return make_sparse_left_matmul(None, child_cap, matrix) + + if isinstance(matrix, Parameter): + param_cap = _build_capsule(matrix, n_vars, cache, param_caps, param_objs) + vals = _to_dense_row_major(matrix) + return make_dense_left_matmul(param_cap, child_cap, vals, m, n) + + if isinstance(matrix, Constant): + vals = _to_dense_row_major(matrix) + return make_dense_left_matmul(None, child_cap, vals, m, n) + + raise TypeError(f"LeftMatMul matrix must be Constant, SparseConstant, or Parameter") + + +def _convert_right_matmul(node, n_vars, cache, param_caps, param_objs): + """Convert f(x) @ A.""" + child_cap = _build_capsule(node.child, n_vars, cache, param_caps, param_objs) + matrix = node.matrix_expr + m, n = matrix.shape + + if isinstance(matrix, SparseConstant): + return make_sparse_right_matmul(None, child_cap, matrix) + + if isinstance(matrix, Parameter): + param_cap = _build_capsule(matrix, n_vars, cache, param_caps, param_objs) + vals = 
_to_dense_row_major(matrix) + return make_dense_right_matmul(param_cap, child_cap, vals, m, n) + + if isinstance(matrix, Constant): + vals = _to_dense_row_major(matrix) + return make_dense_right_matmul(None, child_cap, vals, m, n) + + raise TypeError(f"RightMatMul matrix must be Constant, SparseConstant, or Parameter") + + +# --------------------------------------------------------------------------- +# CompiledExpression +# --------------------------------------------------------------------------- + +class CompiledExpression: + """A compiled expression ready for evaluation. + + Reads variable values from the scope's flat buffer. + Reads parameter values from the Parameter objects. + """ + + def __init__(self, expr_capsule, scope, param_capsules, param_objects, + expr_shape): + self._expr = expr_capsule + self._scope = scope + self._param_capsules = param_capsules + self._param_objects = param_objects + self._expr_shape = expr_shape + self._jacobian_initialized = False + self._hessian_initialized = False + + def _sync_params(self): + """Push current parameter values to the C expression if any changed.""" + if not self._scope._params_dirty: + return + for p in self._param_objects: + if p._value_flat is None: + raise ValueError( + f"Parameter with shape {p.shape} has no value set. " + f"Assign a value via parameter.value = ... before evaluating." + ) + theta_parts = [p._value_flat for p in self._param_objects] + theta = np.concatenate(theta_parts) + _C.expr_update_params(self._expr, self._param_capsules, theta) + self._scope._params_dirty = False + + def _ensure_jacobian_initialized(self): + if not self._jacobian_initialized: + _C.expr_init_jacobian(self._expr) + self._jacobian_initialized = True + + def forward(self): + """Evaluate the expression at the current variable values. + + Must be called before jacobian() or hessian(). 
+ """ + self._sync_params() + return _C.expr_forward(self._expr, self._scope._flat_values) + + def jacobian(self): + """Compute the sparse Jacobian at the current variable values. + + Requires forward() to have been called first. + + Returns scipy.sparse.csr_matrix of shape (expr_size, n_vars). + """ + if not self._jacobian_initialized: + _C.expr_init_jacobian(self._expr) + self._jacobian_initialized = True + data, indices, indptr, (m, n) = _C.expr_jacobian(self._expr) + return scipy.sparse.csr_matrix((data, indices, indptr), shape=(m, n)) + + def hessian(self, weights): + """Compute the sparse Hessian of the weighted expression. + + Requires forward() and jacobian() to have been called first. + + The Hessian is of the scalar function w^T f(x), where w is the + weights vector and f is the compiled expression. + + Args: + weights: array of length expr_size + + Returns scipy.sparse.csr_matrix of shape (n_vars, n_vars). + """ + if not self._hessian_initialized: + _C.expr_init_hessian(self._expr) + self._hessian_initialized = True + weights = np.asarray(weights, dtype=np.float64).ravel(order='F') + data, indices, indptr, (m, n) = _C.expr_hessian(self._expr, weights) + return scipy.sparse.csr_matrix((data, indices, indptr), shape=(m, n)) diff --git a/sparsediffpy/_core/_constants.py b/sparsediffpy/_core/_constants.py new file mode 100644 index 0000000..5eab192 --- /dev/null +++ b/sparsediffpy/_core/_constants.py @@ -0,0 +1,49 @@ +"""Constant and SparseConstant expression nodes.""" + +import numpy as np +import scipy.sparse + + +class Constant: + """A fixed dense constant in the expression tree. + + Stores values in column-major (Fortran) flat order to match the C layer. 
+ """ + + __array_ufunc__ = None + __array_priority__ = 20 + + def __init__(self, value, shape): + self.shape = shape + self._value_flat = np.asarray(value, dtype=np.float64).ravel(order="F") + expected_size = shape[0] * shape[1] + if self._value_flat.size != expected_size: + raise ValueError( + f"Constant value has {self._value_flat.size} elements, " + f"expected {expected_size} for shape {shape}" + ) + + +class SparseConstant: + """A fixed sparse constant in the expression tree. + + Stores CSR arrays for use with make_left_matmul / make_right_matmul. + """ + + __array_ufunc__ = None + __array_priority__ = 20 + + def __init__(self, csr_matrix): + csr = scipy.sparse.csr_matrix(csr_matrix) + self.shape = (csr.shape[0], csr.shape[1]) + self._csr_data = np.asarray(csr.data, dtype=np.float64) + self._csr_indices = np.asarray(csr.indices, dtype=np.int32) + self._csr_indptr = np.asarray(csr.indptr, dtype=np.int32) + + def _to_dense_flat(self): + """Convert to dense column-major flat array (for standalone use).""" + dense = scipy.sparse.csr_matrix( + (self._csr_data, self._csr_indices, self._csr_indptr), + shape=self.shape, + ).toarray() + return dense.ravel(order="F").astype(np.float64) diff --git a/sparsediffpy/_core/_dispatch.py b/sparsediffpy/_core/_dispatch.py new file mode 100644 index 0000000..8267803 --- /dev/null +++ b/sparsediffpy/_core/_dispatch.py @@ -0,0 +1,142 @@ +"""Operator dispatch: routes +, -, *, @, [] to the correct expression nodes. + +Separated from _expression.py to avoid circular imports — this module +imports from both _expression and _nodes_*, but neither imports this module. +_expression.py calls these functions via late binding (module-level reference). 
+""" + +import numpy as np + +from sparsediffpy._core._constants import Constant, SparseConstant +from sparsediffpy._core._expression import _wrap_constant +from sparsediffpy._core._nodes_affine import ( + Add, Broadcast, Index, LeftMatMul, Neg, ParamScalarMult, + ParamVectorMult, RightMatMul, Transpose, +) +from sparsediffpy._core._nodes_bivariate import MatMul, Multiply +from sparsediffpy._core._nodes_elementwise import Power +from sparsediffpy._core._shapes import broadcast_shape, check_matmul_shapes, is_scalar + + +def _is_param_like(node): + # Lazy import to avoid circular: _scope -> _expression -> _dispatch -> _scope + from sparsediffpy._core._scope import Parameter + return isinstance(node, (Constant, SparseConstant, Parameter)) + + +def make_add(left, right): + result_shape, left_bc, right_bc = broadcast_shape(left.shape, right.shape) + if left_bc: + left = Broadcast(left, result_shape) + if right_bc: + right = Broadcast(right, result_shape) + return Add(left, right) + + +def make_sub(left, right): + return make_add(left, Neg(right)) + + +def make_neg(node): + return Neg(node) + + +def make_mul(left, right): + if _is_param_like(left) and is_scalar(left.shape): + return ParamScalarMult(left, right) + if _is_param_like(right) and is_scalar(right.shape): + return ParamScalarMult(right, left) + + if _is_param_like(left) and left.shape == right.shape: + return ParamVectorMult(left, right) + if _is_param_like(right) and right.shape == left.shape: + return ParamVectorMult(right, left) + + result_shape, left_bc, right_bc = broadcast_shape(left.shape, right.shape) + if left_bc: + left = Broadcast(left, result_shape) + if right_bc: + right = Broadcast(right, result_shape) + + if _is_param_like(left): + return ParamVectorMult(left, right) + if _is_param_like(right): + return ParamVectorMult(right, left) + + return Multiply(left, right) + + +def make_matmul(left, right): + result_shape = check_matmul_shapes(left.shape, right.shape) + + if _is_param_like(left): + return 
LeftMatMul(left, right, result_shape) + if _is_param_like(right): + return RightMatMul(right, left, result_shape) + return MatMul(left, right, result_shape) + + +def make_pow(node, exponent): + if not isinstance(exponent, (int, float)): + raise TypeError("Exponent must be a constant number") + return Power(node, float(exponent)) + + +def make_transpose(node): + return Transpose(node) + + +def make_index(node, key): + d1, d2 = node.shape + + if isinstance(key, tuple): + if len(key) != 2: + raise IndexError("Only 1D or 2D indexing supported") + row_key, col_key = key + row_indices = _resolve_axis_index(row_key, d1) + col_indices = _resolve_axis_index(col_key, d2) + flat_indices = [] + for c in col_indices: + for r in row_indices: + flat_indices.append(r + c * d1) + out_d1 = len(row_indices) + out_d2 = len(col_indices) + else: + if d2 == 1: + flat_indices = _resolve_axis_index(key, d1) + out_d1 = len(flat_indices) + out_d2 = 1 + elif d1 == 1: + flat_indices = _resolve_axis_index(key, d2) + out_d1 = 1 + out_d2 = len(flat_indices) + else: + flat_indices = _resolve_axis_index(key, d1 * d2) + out_d1 = len(flat_indices) + out_d2 = 1 + + flat_arr = np.array(flat_indices, dtype=np.int32) + return Index(node, flat_arr, (out_d1, out_d2)) + + +def _resolve_axis_index(key, length): + if isinstance(key, (int, np.integer)): + idx = int(key) + if idx < 0: + idx += length + if idx < 0 or idx >= length: + raise IndexError(f"Index {key} out of range for axis of length {length}") + return [idx] + if isinstance(key, slice): + return list(range(*key.indices(length))) + if isinstance(key, (list, np.ndarray)): + out = [] + for i in key: + idx = int(i) + if idx < 0: + idx += length + if idx < 0 or idx >= length: + raise IndexError(f"Index {i} out of range for axis of length {length}") + out.append(idx) + return out + raise IndexError(f"Unsupported index type: {type(key).__name__}") diff --git a/sparsediffpy/_core/_expression.py b/sparsediffpy/_core/_expression.py new file mode 100644 index 
0000000..69ea79e --- /dev/null +++ b/sparsediffpy/_core/_expression.py @@ -0,0 +1,136 @@
+"""Expression base class and _wrap_constant.
+
+Operator dispatch lives in _dispatch.py (avoids circular imports).
+Node types are defined in _nodes_affine.py, _nodes_elementwise.py,
+_nodes_bivariate.py, and _nodes_other.py.
+"""
+
+import numpy as np
+import scipy.sparse
+
+from sparsediffpy._core._constants import Constant, SparseConstant
+
+
+# ---------------------------------------------------------------------------
+# _wrap_constant: converts raw values into expression nodes
+# ---------------------------------------------------------------------------
+
+def _wrap_constant(value):
+    """Wrap a raw Python/NumPy/SciPy value into an expression node.
+
+    Called by operators and node constructors so users can write
+    ``x + 1.0`` or ``A @ x`` with raw scalars/arrays.
+
+    - Expression subclass -> return as-is
+    - int / float / NumPy scalar -> Constant with shape (1, 1)
+    - np.ndarray 1D (n,) -> Constant with shape (n, 1) (column vector)
+    - np.ndarray 2D (m, n) -> Constant with shape (m, n)
+    - scipy.sparse -> SparseConstant
+    """
+    if hasattr(value, "_is_sparsediff_expr"):
+        return value
+
+    # np.integer / np.floating admit NumPy scalars such as np.int64(3),
+    # which are not instances of the builtin int/float (np.float64 happens
+    # to subclass float, but the integer scalars do not subclass int).
+    if isinstance(value, (int, float, np.integer, np.floating)):
+        return Constant(np.array([float(value)]), (1, 1))
+
+    if isinstance(value, np.ndarray):
+        if value.ndim == 0:
+            return Constant(np.array([value.item()]), (1, 1))
+        if value.ndim == 1:
+            return Constant(value, (value.shape[0], 1))
+        if value.ndim == 2:
+            return Constant(value, (value.shape[0], value.shape[1]))
+        raise ValueError(f"Cannot wrap {value.ndim}D array as constant")
+
+    if scipy.sparse.issparse(value):
+        return SparseConstant(value)
+
+    raise TypeError(f"Cannot convert {type(value).__name__} to expression")
+
+
+# ---------------------------------------------------------------------------
+# Base class
+# ---------------------------------------------------------------------------
+
+class Expression:
+    """Base class for all expression tree nodes."""
+
+ 
_is_sparsediff_expr = True + shape = None # (d1, d2), set by subclasses + + # Tell NumPy to defer to our operators instead of trying its own + __array_ufunc__ = None + __array_priority__ = 20 + + def __add__(self, other): + return _dispatch.make_add(self, _wrap_constant(other)) + + def __radd__(self, other): + return _dispatch.make_add(_wrap_constant(other), self) + + def __sub__(self, other): + return _dispatch.make_sub(self, _wrap_constant(other)) + + def __rsub__(self, other): + return _dispatch.make_sub(_wrap_constant(other), self) + + def __neg__(self): + return _dispatch.make_neg(self) + + def __mul__(self, other): + return _dispatch.make_mul(self, _wrap_constant(other)) + + def __rmul__(self, other): + return _dispatch.make_mul(_wrap_constant(other), self) + + def __matmul__(self, other): + return _dispatch.make_matmul(self, _wrap_constant(other)) + + def __rmatmul__(self, other): + return _dispatch.make_matmul(_wrap_constant(other), self) + + def __pow__(self, exponent): + return _dispatch.make_pow(self, exponent) + + def __getitem__(self, key): + return _dispatch.make_index(self, key) + + @property + def T(self): + return _dispatch.make_transpose(self) + + @property + def size(self): + return self.shape[0] * self.shape[1] + + +# --------------------------------------------------------------------------- +# Make Constant and SparseConstant behave as expressions +# --------------------------------------------------------------------------- + +for _cls in (Constant, SparseConstant): + _cls._is_sparsediff_expr = True + _cls.__array_ufunc__ = None + _cls.__array_priority__ = 20 + _cls.__add__ = Expression.__add__ + _cls.__radd__ = Expression.__radd__ + _cls.__sub__ = Expression.__sub__ + _cls.__rsub__ = Expression.__rsub__ + _cls.__neg__ = Expression.__neg__ + _cls.__mul__ = Expression.__mul__ + _cls.__rmul__ = Expression.__rmul__ + _cls.__matmul__ = Expression.__matmul__ + _cls.__rmatmul__ = Expression.__rmatmul__ + _cls.__pow__ = Expression.__pow__ + 
_cls.__getitem__ = Expression.__getitem__ + _cls.T = Expression.T + _cls.size = Expression.size + + +# --------------------------------------------------------------------------- +# Import _dispatch at the bottom to avoid circular imports. +# By this point Expression is fully defined, so _dispatch.py (which imports +# from _nodes_*.py which inherit from Expression) can resolve everything. +# --------------------------------------------------------------------------- + +from sparsediffpy._core import _dispatch # noqa: E402 diff --git a/sparsediffpy/_core/_fn_affine.py b/sparsediffpy/_core/_fn_affine.py new file mode 100644 index 0000000..901dbfe --- /dev/null +++ b/sparsediffpy/_core/_fn_affine.py @@ -0,0 +1,95 @@ +"""Affine named functions: sp.sum, sp.prod, sp.reshape, sp.hstack, etc.""" + +import builtins as _builtins + +from sparsediffpy._core._expression import _wrap_constant +from sparsediffpy._core._nodes_affine import ( + DiagVec, HStack, Reshape, Sum, Trace, Transpose, +) +from sparsediffpy._core._nodes_other import Prod, ProdAxisOne, ProdAxisZero +from sparsediffpy._core._shapes import validate_shape + +_builtin_sum = _builtins.sum + + +def diag_vec(x): + return DiagVec(x) + +def trace(x): + return Trace(x) + +def reshape(x, d1, d2): + validate_shape(d1, d2) + return Reshape(x, (d1, d2)) + + +def sum(x, axis=None): + """Sum reduction. + + axis=None: sum all elements -> (1,1) + axis=0: sum along rows (collapse d1) -> (1, d2) + axis=1: sum along columns (collapse d2) -> (1, d1) + """ + if axis is None: + return Sum(x, -1) + elif axis == 0: + return Sum(x, 0) + elif axis == 1: + return Sum(x, 1) + else: + raise ValueError(f"Invalid axis {axis}, must be None, 0, or 1") + + +def prod(x, axis=None): + """Product reduction. 
+ + axis=None: product of all elements -> (1,1) + axis=0: product along rows -> (1, d2) + axis=1: product along columns -> (1, d1) + """ + if axis is None: + return Prod(x) + elif axis == 0: + return ProdAxisZero(x) + elif axis == 1: + return ProdAxisOne(x) + else: + raise ValueError(f"Invalid axis {axis}, must be None, 0, or 1") + + +def hstack(expressions): + """Horizontally stack expressions. All must have the same d1 (rows). + + Result shape: (d1, sum of all d2). + """ + exprs = [_wrap_constant(e) for e in expressions] + if not exprs: + raise ValueError("hstack: empty argument") + + d1 = exprs[0].shape[0] + for e in exprs[1:]: + if e.shape[0] != d1: + raise ValueError(f"hstack: row mismatch, {d1} vs {e.shape[0]}") + + total_d2 = _builtin_sum(e.shape[1] for e in exprs) + return HStack(exprs, (d1, total_d2)) + + +def vstack(expressions): + """Vertically stack expressions. All must have the same d2 (columns). + + Implemented as transpose(hstack(transpose(each))). + """ + exprs = [_wrap_constant(e) for e in expressions] + if not exprs: + raise ValueError("vstack: empty argument") + + d2 = exprs[0].shape[1] + for e in exprs[1:]: + if e.shape[1] != d2: + raise ValueError(f"vstack: column mismatch, {d2} vs {e.shape[1]}") + + transposed = [Transpose(e) for e in exprs] + total_d1 = _builtin_sum(e.shape[0] for e in exprs) + h = HStack(transposed, (d2, total_d1)) + return Transpose(h) diff --git a/sparsediffpy/_core/_fn_bivariate.py b/sparsediffpy/_core/_fn_bivariate.py new file mode 100644 index 0000000..480d839 --- /dev/null +++ b/sparsediffpy/_core/_fn_bivariate.py @@ -0,0 +1,37 @@ +"""Bivariate named functions: sp.quad_form, sp.quad_over_lin, sp.rel_entr.""" + +import scipy.sparse + +from sparsediffpy._core._nodes_bivariate import QuadOverLin, RelEntr +from sparsediffpy._core._nodes_other import QuadForm + + +def quad_form(x, Q): + """Quadratic form xT Q x with x (n, 1) and Q (n, n)""" + if not isinstance(Q, scipy.sparse.csr_matrix): + Q = scipy.sparse.csr_matrix(Q) + 
+    n = x.shape[0]
+    if x.shape[1] != 1 or Q.shape != (n, n):
+        # Both halves of the message must be f-strings, otherwise the
+        # shape placeholders are emitted literally instead of interpolated.
+        raise ValueError(f"quad_form: need x (n, 1) and Q (n, n)"
+                         f", got x {x.shape} and Q {Q.shape}")
+
+    return QuadForm(x, Q)
+
+
+def quad_over_lin(x, z):
+    """sum(x^2) / z where z is a scalar variable.
+
+    Both arguments must be variable-dependent expressions.
+    z must be a plain Variable (not a composition).
+    """
+    return QuadOverLin(x, z)
+
+
+def rel_entr(x, y):
+    """x * log(x / y) elementwise.
+
+    Both arguments must be variable-dependent expressions.
+    The C engine does not support constant arguments.
+    """
+    return RelEntr(x, y)
diff --git a/sparsediffpy/_core/_fn_elementwise.py b/sparsediffpy/_core/_fn_elementwise.py
new file mode 100644
index 0000000..3227d7e
--- /dev/null
+++ b/sparsediffpy/_core/_fn_elementwise.py
@@ -0,0 +1,52 @@
+"""Elementwise named functions: sp.sin, sp.exp, sp.log, etc.
+
+_UnaryOp.__init__ handles _wrap_constant, so these are one-liners.
+"""
+
+from sparsediffpy._core._nodes_elementwise import (
+    Asinh, Atanh, Cos, Entr, Exp, Log, Logistic, NormalCdf, Power,
+    Sin, Sinh, Tan, Tanh, Xexp,
+)
+
+
+def sin(x):
+    return Sin(x)
+
+def cos(x):
+    return Cos(x)
+
+def exp(x):
+    return Exp(x)
+
+def log(x):
+    return Log(x)
+
+def tan(x):
+    return Tan(x)
+
+def sinh(x):
+    return Sinh(x)
+
+def tanh(x):
+    return Tanh(x)
+
+def asinh(x):
+    return Asinh(x)
+
+def atanh(x):
+    return Atanh(x)
+
+def logistic(x):
+    return Logistic(x)
+
+def normal_cdf(x):
+    return NormalCdf(x)
+
+def entr(x):
+    return Entr(x)
+
+def xexp(x):
+    return Xexp(x)
+
+def power(x, p):
+    return Power(x, float(p))
diff --git a/sparsediffpy/_core/_nodes_affine.py b/sparsediffpy/_core/_nodes_affine.py
new file mode 100644
index 0000000..f2e2490
--- /dev/null
+++ b/sparsediffpy/_core/_nodes_affine.py
@@ -0,0 +1,136 @@
+"""Affine expression nodes: linear/affine operations on expressions."""
+
+import numpy as np
+
+from sparsediffpy._core._expression import Expression, _wrap_constant
+
+
+class
_UnaryOp(Expression): + def __init__(self, child): + child = _wrap_constant(child) + self.child = child + self.shape = child.shape + + +class Neg(_UnaryOp): + pass + + +class Transpose(Expression): + def __init__(self, child): + self.child = child + self.shape = (child.shape[1], child.shape[0]) + + +class DiagVec(Expression): + """Create a diagonal matrix from a column vector (n,1) -> (n,n).""" + def __init__(self, child): + if child.shape[1] != 1: + raise ValueError(f"diag_vec requires a column vector, got shape {child.shape}") + self.child = child + self.shape = (child.shape[0], child.shape[0]) + + +class Trace(Expression): + def __init__(self, child): + if child.shape[0] != child.shape[1]: + raise ValueError(f"trace requires a square matrix, got shape {child.shape}") + self.child = child + self.shape = (1, 1) + + +class Reshape(Expression): + def __init__(self, child, new_shape): + old_size = child.shape[0] * child.shape[1] + new_size = new_shape[0] * new_shape[1] + if old_size != new_size: + raise ValueError( + f"Cannot reshape {child.shape} (size {old_size}) to " + f"{new_shape} (size {new_size})" + ) + self.child = child + self.shape = new_shape + + +class Broadcast(Expression): + """Broadcast scalar/row/column to a target shape.""" + def __init__(self, child, target_shape): + if child.shape == target_shape: + raise ValueError( + f"Broadcast is unnecessary: child shape {child.shape} " + f"already matches target {target_shape}" + ) + self.child = child + self.shape = target_shape + + +class Sum(Expression): + """Sum reduction. 
C layer always returns row vectors: + axis=-1: (1,1), axis=0: (1,d2), axis=1: (1,d1).""" + def __init__(self, child, axis): + self.child = child + self.axis = axis + d1, d2 = child.shape + if axis == -1: + self.shape = (1, 1) + elif axis == 0: + self.shape = (1, d2) + elif axis == 1: + self.shape = (1, d1) + else: + raise ValueError(f"Invalid axis {axis}, must be -1, 0, or 1") + + +class Add(Expression): + def __init__(self, left, right): + assert left.shape == right.shape, f"Add shape mismatch: {left.shape} vs {right.shape}" + self.left = left + self.right = right + self.shape = left.shape + + +class HStack(Expression): + """Horizontal concatenation.""" + def __init__(self, children, result_shape): + self.children = children + self.shape = result_shape + + +class Index(Expression): + """Indexing with flat column-major indices.""" + def __init__(self, child, flat_indices, result_shape): + self.child = child + self.flat_indices = flat_indices + self.shape = result_shape + + +class ParamScalarMult(Expression): + """a * f(x) where a is a scalar constant/parameter.""" + def __init__(self, left, right): + self.left = left + self.right = right + self.shape = right.shape + + +class ParamVectorMult(Expression): + """a . 
f(x) elementwise where a is a constant/parameter of matching shape.""" + def __init__(self, left, right): + self.left = left + self.right = right + self.shape = right.shape + + +class LeftMatMul(Expression): + """A @ f(x) where A is a constant/sparse constant/parameter matrix.""" + def __init__(self, matrix_expr, child, result_shape): + self.matrix_expr = matrix_expr + self.child = child + self.shape = result_shape + + +class RightMatMul(Expression): + """f(x) @ A where A is a constant/sparse constant/parameter matrix.""" + def __init__(self, matrix_expr, child, result_shape): + self.matrix_expr = matrix_expr + self.child = child + self.shape = result_shape diff --git a/sparsediffpy/_core/_nodes_bivariate.py b/sparsediffpy/_core/_nodes_bivariate.py new file mode 100644 index 0000000..4366a74 --- /dev/null +++ b/sparsediffpy/_core/_nodes_bivariate.py @@ -0,0 +1,85 @@ +"""Bivariate expression nodes: operations on two variable-dependent expressions.""" + +from sparsediffpy._core._expression import Expression +from sparsediffpy._core._shapes import is_scalar + + +def _expr_contains_variable(expr, var): + """Check if a specific Variable object appears anywhere in an expression tree.""" + if expr is var: + return True + for attr in ("child", "left", "right", "x", "y", "z", "param_expr", "matrix_expr"): + child = getattr(expr, attr, None) + if child is not None and _expr_contains_variable(child, var): + return True + for child in getattr(expr, "children", []): + if _expr_contains_variable(child, var): + return True + return False + + +class Multiply(Expression): + """Elementwise multiply (both operands are variable-dependent).""" + def __init__(self, left, right): + assert left.shape == right.shape, f"Multiply shape mismatch: {left.shape} vs {right.shape}" + self.left = left + self.right = right + self.shape = left.shape + + +class MatMul(Expression): + """Matrix multiply where both operands are variable-dependent.""" + def __init__(self, left, right, result_shape): + 
self.left = left + self.right = right + self.shape = result_shape + + +class QuadOverLin(Expression): + """sum(x^2) / z where both x and z are plain Variables. + + z must be scalar and must not appear in x. + """ + def __init__(self, x, z): + from sparsediffpy._core._scope import Variable + if not isinstance(x, Variable): + raise ValueError( + "quad_over_lin: x (first argument) must be a plain Variable." + ) + if not isinstance(z, Variable): + raise ValueError( + "quad_over_lin: z (second argument) must be a plain scalar Variable." + ) + if not is_scalar(z.shape): + raise ValueError(f"quad_over_lin: z must be scalar, got {z.shape}") + if _expr_contains_variable(x, z): + raise ValueError( + "quad_over_lin: z must not appear in x." + ) + self.left = x + self.right = z + self.shape = (1, 1) + + +class RelEntr(Expression): + """x * log(x / y) elementwise. + + Supports three variants (auto-dispatched by the C layer): + - Both same shape: elementwise + - Scalar x, vector y: x * log(x / y_i) for each i + - Vector x, scalar y: x_i * log(x_i / y) for each i + """ + def __init__(self, x, y): + if x.shape == y.shape: + self.shape = x.shape + elif is_scalar(x.shape): + self.shape = y.shape + elif is_scalar(y.shape): + self.shape = x.shape + else: + raise ValueError( + f"rel_entr: shapes must match or one must be scalar, " + f"got {x.shape} and {y.shape}" + ) + self.left = x + self.right = y diff --git a/sparsediffpy/_core/_nodes_elementwise.py b/sparsediffpy/_core/_nodes_elementwise.py new file mode 100644 index 0000000..74d89e7 --- /dev/null +++ b/sparsediffpy/_core/_nodes_elementwise.py @@ -0,0 +1,95 @@ +"""Elementwise expression nodes: unary operations applied element-by-element.""" + +from sparsediffpy._core._expression import Expression +from sparsediffpy._core._nodes_affine import _UnaryOp, Index + + +# --- Full domain --- + +class Exp(_UnaryOp): + pass + + +class Sin(_UnaryOp): + pass + + +class Cos(_UnaryOp): + pass + + +class Sinh(_UnaryOp): + pass + + +class 
Tanh(_UnaryOp): + pass + + +class Asinh(_UnaryOp): + pass + + +class Logistic(_UnaryOp): + pass + + +class NormalCdf(_UnaryOp): + pass + + +class Xexp(_UnaryOp): + pass + + +class Power(Expression): + def __init__(self, child, exponent): + self.child = child + self.exponent = exponent + self.shape = child.shape + + +# --- Restricted domain --- +# The C engine's restricted-domain Jacobian code does not correctly handle +# children with non-trivial Jacobian structure (e.g., index nodes with +# nonzero offset). These ops require the child to be a plain variable or +# a full-domain composition. + +def _check_no_index_child(child, op_name): + """Raise if the immediate child is an Index node. + + The C engine's restricted-domain atoms assume the child's Jacobian has + columns starting at offset 0. Applying them directly to an Index node + with nonzero offset produces wrong Jacobian column positions. + """ + if isinstance(child, Index): + raise ValueError( + f"{op_name} cannot be applied directly to an indexed expression. " + f"This is a known limitation of the C engine's restricted-domain " + f"Jacobian computation. As a workaround, use a separate variable " + f"for the indexed slice." 
+ ) + + +class Log(_UnaryOp): + def __init__(self, child): + _check_no_index_child(child, "log") + super().__init__(child) + + +class Tan(_UnaryOp): + def __init__(self, child): + _check_no_index_child(child, "tan") + super().__init__(child) + + +class Atanh(_UnaryOp): + def __init__(self, child): + _check_no_index_child(child, "atanh") + super().__init__(child) + + +class Entr(_UnaryOp): + def __init__(self, child): + _check_no_index_child(child, "entr") + super().__init__(child) diff --git a/sparsediffpy/_core/_nodes_other.py b/sparsediffpy/_core/_nodes_other.py new file mode 100644 index 0000000..2365b03 --- /dev/null +++ b/sparsediffpy/_core/_nodes_other.py @@ -0,0 +1,52 @@ +"""Other expression nodes: quad_form, prod variants.""" + +from sparsediffpy._core._expression import Expression + + +class QuadForm(Expression): + """x' Q x where Q is a constant CSR sparse matrix.""" + def __init__(self, child, Q): + self.child = child + self.Q = Q + self.shape = (1, 1) + + +def _check_prod_child_is_variable(child): + """Require prod's argument to be a plain Variable. + + Temporary limitation: the C engine does not implement the chain rule + for prod, so it only works correctly when the argument is a variable + (Jacobian is identity-like). Compositions like prod(f(x)) will give + wrong derivatives. + """ + from sparsediffpy._core._scope import Variable + if not isinstance(child, Variable): + raise ValueError( + "prod requires its argument to be a plain Variable. " + "The C engine does not currently implement the chain rule for prod, " + "so compositions like prod(f(x)) are not supported." 
+ ) + + +class Prod(Expression): + """Product of all elements -> (1, 1).""" + def __init__(self, child): + _check_prod_child_is_variable(child) + self.child = child + self.shape = (1, 1) + + +class ProdAxisZero(Expression): + """Product along axis 0 -> (1, d2).""" + def __init__(self, child): + _check_prod_child_is_variable(child) + self.child = child + self.shape = (1, child.shape[1]) + + +class ProdAxisOne(Expression): + """Product along axis 1 -> (1, d1). C layer returns row vector.""" + def __init__(self, child): + _check_prod_child_is_variable(child) + self.child = child + self.shape = (1, child.shape[0]) diff --git a/sparsediffpy/_core/_registry.py b/sparsediffpy/_core/_registry.py new file mode 100644 index 0000000..59a7c70 --- /dev/null +++ b/sparsediffpy/_core/_registry.py @@ -0,0 +1,176 @@ +"""Node converter registry: maps expression node types to C diff engine constructors. + +Each converter receives (node, children_caps) where node is the Python expression +node and children_caps are already-converted C capsules. matmul and multiply are +handled separately in _compile.py (they need param_dict for parameter support). + +Modelled after DNLP's registry.py. 
+""" + +import numpy as np + +from sparsediffpy import _sparsediffengine as _C +from sparsediffpy._core._nodes_affine import ( + Add, Broadcast, DiagVec, HStack, Index, Neg, Reshape, Sum, Trace, + Transpose, +) +from sparsediffpy._core._nodes_bivariate import ( + MatMul, Multiply, QuadOverLin, RelEntr, +) +from sparsediffpy._core._nodes_elementwise import ( + Asinh, Atanh, Cos, Entr, Exp, Log, Logistic, NormalCdf, Power, + Sin, Sinh, Tan, Tanh, Xexp, +) +from sparsediffpy._core._nodes_other import ( + Prod, ProdAxisOne, ProdAxisZero, QuadForm, +) + + +# --------------------------------------------------------------------------- +# Matmul helpers (matching DNLP's helpers.py) +# --------------------------------------------------------------------------- + +def make_sparse_left_matmul(param_node, child_cap, matrix): + """A @ f(x) with sparse constant A.""" + return _C.make_left_matmul( + param_node, child_cap, "sparse", + matrix._csr_data, matrix._csr_indices, matrix._csr_indptr, + matrix.shape[0], matrix.shape[1]) + + +def make_dense_left_matmul(param_node, child_cap, A_flat, m, n): + """A @ f(x) with dense constant A.""" + return _C.make_left_matmul(param_node, child_cap, "dense", A_flat, m, n) + + +def make_sparse_right_matmul(param_node, child_cap, matrix): + """f(x) @ A with sparse constant A.""" + return _C.make_right_matmul( + param_node, child_cap, "sparse", + matrix._csr_data, matrix._csr_indices, matrix._csr_indptr, + matrix.shape[0], matrix.shape[1], + ) + + +def make_dense_right_matmul(param_node, child_cap, A_flat, m, n): + """f(x) @ A with dense constant A.""" + return _C.make_right_matmul(param_node, child_cap, "dense", A_flat, m, n) + + +def _to_dense_row_major(matrix): + """Convert a Constant or Parameter to row-major flat data for dense matmul. + + For Parameters with no value set yet, returns zeros as a placeholder — + the real values are pushed via problem_update_params before evaluation. 
+ """ + m, n = matrix.shape + if matrix._value_flat is None: + return np.zeros(m * n, dtype=np.float64) + return matrix._value_flat.reshape((m, n), order="F").flatten(order="C") + + +# --------------------------------------------------------------------------- +# Individual converters for nodes needing special handling +# --------------------------------------------------------------------------- + +def convert_hstack(node, child_caps): + return _C.make_hstack(child_caps) + + +def convert_index(node, child_caps): + return _C.make_index( + child_caps[0], node.shape[0], node.shape[1], node.flat_indices + ) + + +def convert_sum(node, child_caps): + return _C.make_sum(child_caps[0], node.axis) + + +def convert_power(node, child_caps): + return _C.make_power(child_caps[0], node.exponent) + + +def convert_reshape(node, child_caps): + return _C.make_reshape(child_caps[0], node.shape[0], node.shape[1]) + + +def convert_broadcast(node, child_caps): + return _C.make_broadcast(child_caps[0], node.shape[0], node.shape[1]) + + +def convert_quad_over_lin(node, child_caps): + return _C.make_quad_over_lin(child_caps[0], child_caps[1]) + + +def convert_rel_entr(node, child_caps): + return _C.make_rel_entr(child_caps[0], child_caps[1]) + + +def convert_quad_form(node, child_caps): + Q = node.Q + return _C.make_quad_form( + child_caps[0], + np.asarray(Q.data, dtype=np.float64), + np.asarray(Q.indices, dtype=np.int32), + np.asarray(Q.indptr, dtype=np.int32), + Q.shape[0], Q.shape[1], + ) + + + +# --------------------------------------------------------------------------- +# Registry dict +# --------------------------------------------------------------------------- + +ATOM_CONVERTERS = { + # Elementwise unary (full domain) + Neg: lambda _node, caps: _C.make_neg(caps[0]), + Exp: lambda _node, caps: _C.make_exp(caps[0]), + Sin: lambda _node, caps: _C.make_sin(caps[0]), + Cos: lambda _node, caps: _C.make_cos(caps[0]), + Sinh: lambda _node, caps: _C.make_sinh(caps[0]), + Tanh: lambda 
_node, caps: _C.make_tanh(caps[0]), + Asinh: lambda _node, caps: _C.make_asinh(caps[0]), + Logistic: lambda _node, caps: _C.make_logistic(caps[0]), + NormalCdf: lambda _node, caps: _C.make_normal_cdf(caps[0]), + Xexp: lambda _node, caps: _C.make_xexp(caps[0]), + + # Elementwise unary (restricted domain) + Log: lambda _node, caps: _C.make_log(caps[0]), + Tan: lambda _node, caps: _C.make_tan(caps[0]), + Atanh: lambda _node, caps: _C.make_atanh(caps[0]), + Entr: lambda _node, caps: _C.make_entr(caps[0]), + + # Elementwise unary with extra args + Power: convert_power, + + # Affine unary + Transpose: lambda _node, caps: _C.make_transpose(caps[0]), + DiagVec: lambda _node, caps: _C.make_diag_vec(caps[0]), + Trace: lambda _node, caps: _C.make_trace(caps[0]), + + # Reductions + Sum: convert_sum, + Prod: lambda _node, caps: _C.make_prod(caps[0]), + ProdAxisZero: lambda _node, caps: _C.make_prod_axis_zero(caps[0]), + ProdAxisOne: lambda _node, caps: _C.make_prod_axis_one(caps[0]), + + # Shape operations + Reshape: convert_reshape, + Broadcast: convert_broadcast, + + # Binary (both variable-dependent) + Add: lambda _node, caps: _C.make_add(caps[0], caps[1]), + Multiply: lambda _node, caps: _C.make_multiply(caps[0], caps[1]), + MatMul: lambda _node, caps: _C.make_matmul(caps[0], caps[1]), + + # Bivariate + QuadOverLin: convert_quad_over_lin, + RelEntr: convert_rel_entr, + QuadForm: convert_quad_form, + + # Structural + HStack: convert_hstack, + Index: convert_index, +} diff --git a/sparsediffpy/_core/_scope.py b/sparsediffpy/_core/_scope.py new file mode 100644 index 0000000..a63a626 --- /dev/null +++ b/sparsediffpy/_core/_scope.py @@ -0,0 +1,121 @@ +"""Scope, Variable, and Parameter.""" + +import numpy as np + +from sparsediffpy._core._expression import Expression +from sparsediffpy._core._shapes import validate_shape + + +class DimensionError(ValueError): + """Raised when a value has the wrong number of elements.""" + pass + + +class Variable(Expression): + """A decision 
variable in the expression tree. + + Created by Scope.Variable(). Has a .value property that reads/writes + into the scope's flat value buffer. + """ + + def __init__(self, scope, var_id, shape): + self._scope = scope + self._var_id = var_id + self.shape = shape + + @property + def value(self): + return self._scope._flat_values[self._var_id:self._var_id + self.size].copy() + + @value.setter + def value(self, val): + val = np.asarray(val, dtype=np.float64).ravel() + if val.size != self.size: + raise DimensionError(f"expected {self.size} elements, got {val.size}") + self._scope._flat_values[self._var_id:self._var_id + self.size] = val + + +class Parameter(Expression): + """An updatable parameter in the expression tree. + + Created by Scope.Parameter(). Values must be set via .value before + evaluating any expression that uses this parameter. + """ + + def __init__(self, scope, param_id, shape): + self._scope = scope + self._param_id = param_id + self.shape = shape + self._value_flat = None + + @property + def value(self): + if self._value_flat is None: + return None + return self._value_flat.copy() + + @value.setter + def value(self, val): + val = np.asarray(val, dtype=np.float64).ravel(order="F") + if val.size != self.size: + raise DimensionError(f"expected {self.size} elements, got {val.size}") + self._value_flat = val.copy() + self._scope._params_dirty = True + + +class Scope: + """Owns the variable/parameter space and flat value buffer.""" + + def __init__(self): + self._variables = [] + self._parameters = [] + self._flat_values = np.zeros(0, dtype=np.float64) + self._next_var_offset = 0 + self._next_param_offset = 0 + self._params_dirty = False + + def Variable(self, d1, d2): + """Create a new variable in this scope.""" + validate_shape(d1, d2) + size = d1 * d2 + var_id = self._next_var_offset + + new_flat = np.zeros(self._next_var_offset + size, dtype=np.float64) + if self._next_var_offset > 0: + new_flat[:self._next_var_offset] = self._flat_values + 
self._flat_values = new_flat + self._next_var_offset += size + + var = Variable(self, var_id, (d1, d2)) + self._variables.append(var) + return var + + def Parameter(self, d1, d2): + """Create a new updatable parameter in this scope. + + Set its value via .value = ... before evaluating. + """ + validate_shape(d1, d2) + size = d1 * d2 + param_id = self._next_param_offset + self._next_param_offset += size + + param = Parameter(self, param_id, (d1, d2)) + self._parameters.append(param) + return param + + def set_values(self, array): + """Set all variable values at once from a flat array.""" + array = np.asarray(array, dtype=np.float64) + in_size = self._flat_values.size + if array.size != in_size: + raise DimensionError(f"expected {in_size} elements, got {array.size}") + self._flat_values[:] = array + + def get_values(self): + """Return a copy of the flat value buffer.""" + return self._flat_values.copy() + + @property + def total_var_size(self): + return self._next_var_offset diff --git a/sparsediffpy/_core/_shapes.py b/sparsediffpy/_core/_shapes.py new file mode 100644 index 0000000..8554ddd --- /dev/null +++ b/sparsediffpy/_core/_shapes.py @@ -0,0 +1,79 @@ +"""Shape validation, broadcasting, and matmul checks. + +All shapes are 2-tuples (d1, d2) matching the C layer's convention. +Column-major flat storage: flat_index = row + col * d1. +""" + + +def validate_shape(d1, d2): + """Check that d1 and d2 are positive integers.""" + if not isinstance(d1, int) or not isinstance(d2, int): + raise TypeError(f"Shape dimensions must be integers, got ({type(d1).__name__}, {type(d2).__name__})") + if d1 <= 0 or d2 <= 0: + raise ValueError(f"Shape dimensions must be positive, got ({d1}, {d2})") + + +def is_scalar(shape): + return shape == (1, 1) + + + +def broadcast_shape(left_shape, right_shape): + """Compute broadcast result shape for elementwise operations. + + Returns (result_shape, left_needs_broadcast, right_needs_broadcast). + Raises ValueError if shapes are incompatible. 
+ + Rules (CVXPY/NumPy convention): + (1,1) + (m,n) -> (m,n) broadcast scalar + (m,1) + (m,n) -> (m,n) broadcast column + (1,n) + (m,n) -> (m,n) broadcast row + (m,n) + (m,n) -> (m,n) no broadcast + """ + ld1, ld2 = left_shape + rd1, rd2 = right_shape + + if left_shape == right_shape: + return left_shape, False, False + + # Broadcast each dimension independently: 1 matches anything + if ld1 == rd1: + out_d1 = ld1 + elif ld1 == 1: + out_d1 = rd1 + elif rd1 == 1: + out_d1 = ld1 + else: + raise ValueError( + f"Cannot broadcast shapes {left_shape} and {right_shape}: " + f"d1 mismatch ({ld1} vs {rd1})" + ) + + if ld2 == rd2: + out_d2 = ld2 + elif ld2 == 1: + out_d2 = rd2 + elif rd2 == 1: + out_d2 = ld2 + else: + raise ValueError( + f"Cannot broadcast shapes {left_shape} and {right_shape}: " + f"d2 mismatch ({ld2} vs {rd2})" + ) + + result = (out_d1, out_d2) + return result, left_shape != result, right_shape != result + + +def check_matmul_shapes(left_shape, right_shape): + """Validate matmul dimensions and return result shape. + + Requires left_shape[1] == right_shape[0]. + Returns (left_shape[0], right_shape[1]). 
+ """ + if left_shape[1] != right_shape[0]: + raise ValueError( + f"Matmul shape mismatch: ({left_shape[0]}, {left_shape[1]}) @ " + f"({right_shape[0]}, {right_shape[1]})" + ) + return (left_shape[0], right_shape[1]) diff --git a/tests/affine/__init__.py b/tests/affine/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/tests/affine/test_add.py b/tests/affine/test_add.py new file mode 100644 index 0000000..bf9aa93 --- /dev/null +++ b/tests/affine/test_add.py @@ -0,0 +1,128 @@ +"""Tests for addition with all broadcast combinations.""" + +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +# --- Scalar + matrix broadcast --- + +def test_add_scalar_plus_matrix_jacobian(scope, rng): + a = scope.Variable(1, 1) + B = scope.Variable(3, 2) + f = a + B + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_add_scalar_plus_matrix_hessian(scope, rng): + a = scope.Variable(1, 1) + B = scope.Variable(3, 2) + f = a + B + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(6)) + + +# --- Column + matrix broadcast --- + +def test_add_column_plus_matrix_jacobian(scope, rng): + x = scope.Variable(3, 1) + Y = scope.Variable(3, 2) + f = x + Y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_add_column_plus_matrix_hessian(scope, rng): + x = scope.Variable(3, 1) + Y = scope.Variable(3, 2) + f = x + Y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(6)) + + +# --- Row + matrix broadcast --- + +def test_add_row_plus_matrix_jacobian(scope, rng): + r = scope.Variable(1, 2) + Y = scope.Variable(3, 2) + f = r + Y + fn = sp.compile(f) + checker = 
NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_add_row_plus_matrix_hessian(scope, rng): + r = scope.Variable(1, 2) + Y = scope.Variable(3, 2) + f = r + Y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(6)) + + +# --- Same shape (no broadcast) --- + +def test_add_same_shape_jacobian(scope, rng): + X = scope.Variable(3, 2) + Y = scope.Variable(3, 2) + f = X + Y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +# --- Scalar constant broadcast --- + +def test_add_constant_scalar_jacobian(scope, rng): + x = scope.Variable(3, 1) + f = x + 1.0 + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_add_constant_scalar_forward(scope, rng): + x = scope.Variable(3, 1) + f = x + 2.5 + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), x0 + 2.5) + + +# --- NumPy array constant broadcast --- + +def test_add_numpy_array_jacobian(scope, rng): + x = scope.Variable(3, 1) + b = np.array([1.0, 2.0, 3.0]) + f = b + x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_add_numpy_array_forward(scope, rng): + x = scope.Variable(3, 1) + b = np.array([10.0, 20.0, 30.0]) + f = x + b + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), x0 + b) + + +# --- Column vectors --- + +def test_add_column_vectors_jacobian(scope, rng): + x = scope.Variable(4, 1) + y = scope.Variable(4, 1) + f = x + y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_diag_vec.py b/tests/affine/test_diag_vec.py new file mode 100644 index 
0000000..db44701 --- /dev/null +++ b/tests/affine/test_diag_vec.py @@ -0,0 +1,22 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_diag_vec_jacobian(scope, rng): + x = scope.Variable(3, 1) + f = sp.diag_vec(x) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_diag_vec_forward(scope, rng): + x = scope.Variable(3, 1) + f = sp.diag_vec(x) + fn = sp.compile(f) + x0 = random_point(scope, rng) + result = fn.forward() + # diag_vec produces a 3x3 matrix, flattened column-major + expected = np.diag(x0).ravel(order="F") + np.testing.assert_allclose(result, expected) diff --git a/tests/affine/test_hstack.py b/tests/affine/test_hstack.py new file mode 100644 index 0000000..0c4a37f --- /dev/null +++ b/tests/affine/test_hstack.py @@ -0,0 +1,33 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_hstack_vectors_jacobian(scope, rng): + x = scope.Variable(3, 1) + y = scope.Variable(3, 1) + f = sp.hstack([x, y]) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_hstack_vectors_forward(scope, rng): + x = scope.Variable(3, 1) + y = scope.Variable(3, 1) + f = sp.hstack([x, y]) + fn = sp.compile(f) + x0 = random_point(scope, rng) + x_val = x.value + y_val = y.value + np.testing.assert_allclose(fn.forward(), np.concatenate([x_val, y_val])) + + +def test_hstack_three_jacobian(scope, rng): + a = scope.Variable(2, 1) + b = scope.Variable(2, 1) + c = scope.Variable(2, 1) + f = sp.hstack([a, b, c]) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_index.py b/tests/affine/test_index.py new file mode 100644 index 0000000..e1e4dcc --- /dev/null +++ 
b/tests/affine/test_index.py @@ -0,0 +1,67 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_index_scalar(scope, rng): + x = scope.Variable(4, 1) + f = x[0] + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_index_scalar_forward(scope, rng): + x = scope.Variable(4, 1) + f = x[2] + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), [x0[2]]) + + +def test_index_slice(scope, rng): + x = scope.Variable(4, 1) + f = x[1:3] + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_index_slice_forward(scope, rng): + x = scope.Variable(4, 1) + f = x[1:3] + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), x0[1:3]) + + +def test_index_fancy(scope, rng): + x = scope.Variable(4, 1) + f = x[[0, 3]] + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_index_matrix_element(scope, rng): + X = scope.Variable(3, 2) + f = X[1, 0] + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_index_matrix_row_slice(scope, rng): + X = scope.Variable(3, 2) + f = X[0:2, :] + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_index_matrix_column(scope, rng): + X = scope.Variable(3, 2) + f = X[:, 1] + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_left_matmul.py b/tests/affine/test_left_matmul.py new file mode 100644 index 0000000..4c2b9ee --- /dev/null +++ b/tests/affine/test_left_matmul.py @@ -0,0 +1,62 @@ +import numpy as np 
+import scipy.sparse +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_left_matmul_dense_jacobian(scope, rng): + x = scope.Variable(3, 1) + A = rng.standard_normal((4, 3)) + f = A @ x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_left_matmul_dense_forward(scope, rng): + x = scope.Variable(3, 1) + A = rng.standard_normal((4, 3)) + f = A @ x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), A @ x0, rtol=1e-10) + + +def test_left_matmul_sparse_jacobian(scope, rng): + x = scope.Variable(3, 1) + A = scipy.sparse.eye(3, format="csr") * 2.0 + f = A @ x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_left_matmul_sparse_forward(scope, rng): + x = scope.Variable(3, 1) + A = scipy.sparse.eye(3, format="csr") * 3.0 + f = A @ x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), 3.0 * x0) + + +def test_left_matmul_parameter_jacobian(scope, rng): + x = scope.Variable(3, 1) + A = scope.Parameter(4, 3) + A.value = rng.standard_normal((4, 3)) + f = A @ x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_left_matmul_parameter_update(scope, rng): + x = scope.Variable(3, 1) + A = scope.Parameter(3, 3) + A.value = np.eye(3) + f = A @ x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), x0) + A.value = 2 * np.eye(3) + np.testing.assert_allclose(fn.forward(), 2 * x0) diff --git a/tests/affine/test_neg.py b/tests/affine/test_neg.py new file mode 100644 index 0000000..77ab620 --- /dev/null +++ b/tests/affine/test_neg.py @@ -0,0 +1,27 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, 
random_point + + +def test_neg_vector_forward(scope, rng): + x = scope.Variable(4, 1) + f = -x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), -x0) + + +def test_neg_vector_jacobian(scope, rng): + x = scope.Variable(4, 1) + f = -x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_neg_matrix_jacobian(scope, rng): + X = scope.Variable(3, 2) + f = -X + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_reshape.py b/tests/affine/test_reshape.py new file mode 100644 index 0000000..068ed0b --- /dev/null +++ b/tests/affine/test_reshape.py @@ -0,0 +1,19 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_reshape_jacobian(scope, rng): + x = scope.Variable(6, 1) + f = sp.reshape(x, 2, 3) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_reshape_matrix_jacobian(scope, rng): + X = scope.Variable(3, 2) + f = sp.reshape(X, 6, 1) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_right_matmul.py b/tests/affine/test_right_matmul.py new file mode 100644 index 0000000..dff4d14 --- /dev/null +++ b/tests/affine/test_right_matmul.py @@ -0,0 +1,33 @@ +import numpy as np +import scipy.sparse +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_right_matmul_dense_jacobian(scope, rng): + x = scope.Variable(1, 3) + A = rng.standard_normal((3, 4)) + f = x @ A + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_right_matmul_dense_forward(scope, rng): + x = 
scope.Variable(1, 3) + A = rng.standard_normal((3, 4)) + f = x @ A + fn = sp.compile(f) + x0 = random_point(scope, rng) + x_mat = x0.reshape(1, 3) + expected = (x_mat @ A).ravel(order="F") + np.testing.assert_allclose(fn.forward(), expected, rtol=1e-10) + + +def test_right_matmul_sparse_jacobian(scope, rng): + x = scope.Variable(1, 3) + A = scipy.sparse.eye(3, format="csr") * 2.0 + f = x @ A + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_scalar_mult.py b/tests/affine/test_scalar_mult.py new file mode 100644 index 0000000..a17ac97 --- /dev/null +++ b/tests/affine/test_scalar_mult.py @@ -0,0 +1,57 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_scalar_mult_constant_jacobian(scope, rng): + x = scope.Variable(4, 1) + f = 3.0 * x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_scalar_mult_constant_forward(scope, rng): + x = scope.Variable(3, 1) + f = 2.5 * x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), 2.5 * x0) + + +def test_scalar_mult_parameter_jacobian(scope, rng): + x = scope.Variable(4, 1) + a = scope.Parameter(1, 1) + a.value = np.array([[3.0]]) + f = a * x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_scalar_mult_parameter_update(scope, rng): + x = scope.Variable(3, 1) + a = scope.Parameter(1, 1) + a.value = np.array([[2.0]]) + f = a * x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), 2.0 * x0) + a.value = np.array([[5.0]]) + np.testing.assert_allclose(fn.forward(), 5.0 * x0) + + +def test_scalar_mult_matrix_jacobian(scope, rng): + X = scope.Variable(3, 2) + f = 2.0 * X + fn = sp.compile(f) + checker = 
NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_right_scalar_mult_jacobian(scope, rng): + x = scope.Variable(4, 1) + f = x * 3.0 + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_sub.py b/tests/affine/test_sub.py new file mode 100644 index 0000000..c53013d --- /dev/null +++ b/tests/affine/test_sub.py @@ -0,0 +1,39 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_sub_vectors_forward(scope, rng): + x = scope.Variable(3, 1) + y = scope.Variable(3, 1) + f = x - y + fn = sp.compile(f) + x0 = random_point(scope, rng) + x_val = x.value + y_val = y.value + np.testing.assert_allclose(fn.forward(), x_val - y_val) + + +def test_sub_vectors_jacobian(scope, rng): + x = scope.Variable(3, 1) + y = scope.Variable(3, 1) + f = x - y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_sub_scalar_constant(scope, rng): + x = scope.Variable(3, 1) + f = x - 1.0 + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), x0 - 1.0) + + +def test_rsub_constant(scope, rng): + x = scope.Variable(3, 1) + f = 1.0 - x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), 1.0 - x0) diff --git a/tests/affine/test_sum.py b/tests/affine/test_sum.py new file mode 100644 index 0000000..a471f96 --- /dev/null +++ b/tests/affine/test_sum.py @@ -0,0 +1,35 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_sum_all_jacobian(scope, rng): + x = scope.Variable(3, 2) + f = sp.sum(x) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def 
test_sum_axis0_jacobian(scope, rng): + x = scope.Variable(3, 2) + f = sp.sum(x, axis=0) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_sum_axis1_jacobian(scope, rng): + x = scope.Variable(3, 2) + f = sp.sum(x, axis=1) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_sum_vector_jacobian(scope, rng): + x = scope.Variable(4, 1) + f = sp.sum(x) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_trace.py b/tests/affine/test_trace.py new file mode 100644 index 0000000..0f723b5 --- /dev/null +++ b/tests/affine/test_trace.py @@ -0,0 +1,20 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_trace_jacobian(scope, rng): + X = scope.Variable(3, 3) + f = sp.trace(X) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_trace_hessian(scope, rng): + X = scope.Variable(3, 3) + f = sp.trace(X) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, np.array([1.0])) diff --git a/tests/affine/test_transpose.py b/tests/affine/test_transpose.py new file mode 100644 index 0000000..e7d4700 --- /dev/null +++ b/tests/affine/test_transpose.py @@ -0,0 +1,19 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_transpose_matrix_jacobian(scope, rng): + X = scope.Variable(3, 2) + f = X.T + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_transpose_vector_jacobian(scope, rng): + x = scope.Variable(4, 1) + f = x.T + fn = sp.compile(f) 
+ checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/affine/test_vector_mult.py b/tests/affine/test_vector_mult.py new file mode 100644 index 0000000..8a57278 --- /dev/null +++ b/tests/affine/test_vector_mult.py @@ -0,0 +1,43 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_vector_mult_constant_jacobian(scope, rng): + x = scope.Variable(3, 1) + c = np.array([1.0, 2.0, 3.0]) + f = c * x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_vector_mult_constant_forward(scope, rng): + x = scope.Variable(3, 1) + c = np.array([1.0, 2.0, 3.0]) + f = c * x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), c * x0) + + +def test_vector_mult_parameter_jacobian(scope, rng): + x = scope.Variable(3, 1) + a = scope.Parameter(3, 1) + a.value = np.array([1.0, 2.0, 3.0]) + f = a * x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_vector_mult_parameter_update(scope, rng): + x = scope.Variable(3, 1) + a = scope.Parameter(3, 1) + a.value = np.array([1.0, 1.0, 1.0]) + f = a * x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), x0) + a.value = np.array([2.0, 3.0, 4.0]) + np.testing.assert_allclose(fn.forward(), np.array([2.0, 3.0, 4.0]) * x0) diff --git a/tests/affine/test_vstack.py b/tests/affine/test_vstack.py new file mode 100644 index 0000000..d75bdb9 --- /dev/null +++ b/tests/affine/test_vstack.py @@ -0,0 +1,32 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_vstack_vectors_jacobian(scope, rng): + x = scope.Variable(3, 1) + y = scope.Variable(2, 1) + f = sp.vstack([x, y]) + fn = sp.compile(f) + 
checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_vstack_vectors_forward(scope, rng): + x = scope.Variable(3, 1) + y = scope.Variable(2, 1) + f = sp.vstack([x, y]) + fn = sp.compile(f) + x0 = random_point(scope, rng) + x_val = x.value + y_val = y.value + np.testing.assert_allclose(fn.forward(), np.concatenate([x_val, y_val])) + + +def test_vstack_matrices_jacobian(scope, rng): + X = scope.Variable(2, 3) + Y = scope.Variable(2, 3) + f = sp.vstack([X, Y]) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/bivariate_full_dom/__init__.py b/tests/bivariate_full_dom/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/tests/bivariate_full_dom/test_matmul.py b/tests/bivariate_full_dom/test_matmul.py new file mode 100644 index 0000000..4d6e62f --- /dev/null +++ b/tests/bivariate_full_dom/test_matmul.py @@ -0,0 +1,32 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_matmul_jacobian(scope, rng): + X = scope.Variable(2, 3) + Y = scope.Variable(3, 2) + f = X @ Y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_matmul_hessian(scope, rng): + X = scope.Variable(2, 3) + Y = scope.Variable(3, 2) + f = X @ Y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(4)) + + +def test_matmul_vec_jacobian(scope, rng): + """Row vector @ column vector = scalar.""" + x = scope.Variable(1, 3) + y = scope.Variable(3, 1) + f = x @ y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/bivariate_full_dom/test_multiply.py b/tests/bivariate_full_dom/test_multiply.py new file mode 100644 
index 0000000..b9cd0a9 --- /dev/null +++ b/tests/bivariate_full_dom/test_multiply.py @@ -0,0 +1,42 @@ +import numpy as np +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point + + +def test_multiply_vectors_jacobian(scope, rng): + x = scope.Variable(4, 1) + y = scope.Variable(4, 1) + f = x * y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +def test_multiply_vectors_hessian(scope, rng): + x = scope.Variable(4, 1) + y = scope.Variable(4, 1) + f = x * y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(4)) + + +def test_multiply_vectors_forward(scope, rng): + x = scope.Variable(3, 1) + y = scope.Variable(3, 1) + f = x * y + fn = sp.compile(f) + x0 = random_point(scope, rng) + x_val = x.value + y_val = y.value + np.testing.assert_allclose(fn.forward(), x_val * y_val) + + +def test_multiply_matrices_jacobian(scope, rng): + X = scope.Variable(3, 2) + Y = scope.Variable(3, 2) + f = X * Y + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) diff --git a/tests/complicated/__init__.py b/tests/complicated/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/tests/complicated/test_compositions.py b/tests/complicated/test_compositions.py new file mode 100644 index 0000000..a6bfb39 --- /dev/null +++ b/tests/complicated/test_compositions.py @@ -0,0 +1,349 @@ +"""Complicated composition tests. + +Each test builds a deep or wide expression involving many atoms, then +verifies forward (against manual NumPy), Jacobian, and Hessian. +""" + +import numpy as np +import pytest +import sparsediffpy as sp +from tests.utils import NumericalDerivativeChecker, random_point, random_positive_point + + +# ----------------------------------------------------------------------- +# 1. 
Affine chain: A @ x + b +# ----------------------------------------------------------------------- + +class TestAffineChain: + def _build(self, scope, rng): + x = scope.Variable(3, 1) + A = rng.standard_normal((3, 3)) + b = rng.standard_normal(3) + f = A @ x + b + fn = sp.compile(f) + return fn, x, A, b + + def test_forward(self, scope, rng): + fn, x, A, b = self._build(scope, rng) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), A @ x0 + b, rtol=1e-10) + + def test_jacobian(self, scope, rng): + fn, x, A, b = self._build(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_hessian(self, scope, rng): + fn, x, A, b = self._build(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(3)) + + +# ----------------------------------------------------------------------- +# 2. Nonlinear composition: exp(A @ x) + sin(x) +# ----------------------------------------------------------------------- + +class TestNonlinearComposition: + def _build(self, scope, rng): + x = scope.Variable(3, 1) + A = rng.standard_normal((3, 3)) + f = sp.exp(A @ x) + sp.sin(x) + fn = sp.compile(f) + return fn, x, A + + def test_forward(self, scope, rng): + fn, x, A = self._build(scope, rng) + x0 = random_point(scope, rng, low=-0.5, high=0.5) + np.testing.assert_allclose( + fn.forward(), np.exp(A @ x0) + np.sin(x0), rtol=1e-10 + ) + + def test_jacobian(self, scope, rng): + fn, x, A = self._build(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng, low=-0.5, high=0.5)) + + def test_hessian(self, scope, rng): + fn, x, A = self._build(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng, low=-0.5, high=0.5) + checker.check_hessian(x0, rng.standard_normal(3)) + + +# 
----------------------------------------------------------------------- +# 3. Matrix expression: sin(X) * Y and X.T @ Y (tested separately) +# ----------------------------------------------------------------------- + +class TestMatrixExpression: + def _build_elementwise(self, scope, rng): + X = scope.Variable(3, 2) + Y = scope.Variable(3, 2) + f = sp.sin(X) * Y + fn = sp.compile(f) + return fn, X, Y + + def test_elementwise_forward(self, scope, rng): + fn, X, Y = self._build_elementwise(scope, rng) + x0 = random_point(scope, rng) + X_val = X.value.reshape(3, 2, order="F") + Y_val = Y.value.reshape(3, 2, order="F") + expected = (np.sin(X_val) * Y_val).ravel(order="F") + np.testing.assert_allclose(fn.forward(), expected, rtol=1e-10) + + def test_elementwise_jacobian(self, scope, rng): + fn, X, Y = self._build_elementwise(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_elementwise_hessian(self, scope, rng): + fn, X, Y = self._build_elementwise(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(6)) + + def _build_matmul(self, scope, rng): + X = scope.Variable(3, 2) + Y = scope.Variable(3, 2) + f = X.T @ Y # (2,3) @ (3,2) = (2,2) + fn = sp.compile(f) + return fn, X, Y + + def test_matmul_jacobian(self, scope, rng): + fn, X, Y = self._build_matmul(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_matmul_hessian(self, scope, rng): + fn, X, Y = self._build_matmul(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(4)) + + +# ----------------------------------------------------------------------- +# 4. 
Broadcast heavy: a * X + r + c +# a: scalar (1,1), X: matrix (3,2), r: row (1,2), c: column (3,1) +# ----------------------------------------------------------------------- + +class TestBroadcastHeavy: + def _build(self, scope, rng): + a = scope.Variable(1, 1) + X = scope.Variable(3, 2) + r = scope.Variable(1, 2) + c = scope.Variable(3, 1) + f = a * X + r + c + fn = sp.compile(f) + return fn, a, X, r, c + + def test_forward(self, scope, rng): + fn, a, X, r, c = self._build(scope, rng) + x0 = random_point(scope, rng) + a_val = a.value[0] + X_val = X.value.reshape(3, 2, order="F") + r_val = r.value.reshape(1, 2, order="F") + c_val = c.value.reshape(3, 1, order="F") + expected = (a_val * X_val + r_val + c_val).ravel(order="F") + np.testing.assert_allclose(fn.forward(), expected, rtol=1e-10) + + def test_jacobian(self, scope, rng): + fn, *_ = self._build(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_hessian(self, scope, rng): + fn, *_ = self._build(scope, rng) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, rng.standard_normal(6)) + + +# ----------------------------------------------------------------------- +# 5. Index into composition: sum(exp(x[0:3]) + log(x[3:6])) +# ----------------------------------------------------------------------- + +class TestIndexIntoComposition: + """Index + elementwise composition tests. + + Full-domain ops (exp, sin, etc.) work on indexed expressions. + Restricted-domain ops (log, tan, atanh, entr) raise an error when + applied directly to an index node (C engine limitation). 
+ """ + def test_full_domain_on_index(self, scope, rng): + """exp on indexed variable works correctly.""" + x = scope.Variable(6, 1) + f = sp.exp(x[3:6]) + fn = sp.compile(f) + x0 = random_positive_point(scope, rng) + np.testing.assert_allclose(fn.forward(), np.exp(x0[3:6])) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(x0) + + def test_restricted_domain_on_index_raises(self, scope, rng): + """log on indexed variable raises ValueError.""" + x = scope.Variable(6, 1) + with pytest.raises(ValueError, match="log cannot be applied directly"): + sp.log(x[3:6]) + + def test_workaround_separate_variables(self, scope, rng): + """Use separate variables as workaround for restricted-domain + index.""" + a = scope.Variable(3, 1) + b = scope.Variable(3, 1) + f = sp.sum(sp.exp(a) + sp.log(b)) + fn = sp.compile(f) + x0 = random_positive_point(scope, rng) + a_val = a.value + b_val = b.value + expected = np.sum(np.exp(a_val) + np.log(b_val)) + np.testing.assert_allclose(fn.forward(), [expected], rtol=1e-10) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(x0) + checker.check_hessian(x0, np.array([1.0])) + + +# ----------------------------------------------------------------------- +# 6. 
class TestHstackMixed:
    """hstack([sin(x), A @ x, y]) mixes nonlinear, affine, and identity columns."""

    @staticmethod
    def _compile(scope, rng):
        x = scope.Variable(3, 1)
        y = scope.Variable(3, 1)
        A = rng.standard_normal((3, 3))
        return sp.compile(sp.hstack([sp.sin(x), A @ x, y])), x, y, A

    def test_forward(self, scope, rng):
        fn, x, y, A = self._compile(scope, rng)
        random_point(scope, rng)  # populates the scope's variable buffer
        want = np.concatenate([np.sin(x.value), A @ x.value, y.value])
        np.testing.assert_allclose(fn.forward(), want, rtol=1e-10)

    def test_jacobian(self, scope, rng):
        fn = self._compile(scope, rng)[0]
        NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))

    def test_hessian(self, scope, rng):
        # 3 stacked 3-vectors -> 9 outputs, so 9 Hessian weights.
        fn = self._compile(scope, rng)[0]
        point = random_point(scope, rng)
        NumericalDerivativeChecker(fn, scope).check_hessian(
            point, rng.standard_normal(9)
        )
class TestMultiCompileSharedScope:
    """Several compiled functions can share one scope without interfering."""

    def test_two_expressions_same_scope(self, scope, rng):
        x = scope.Variable(3, 1)
        fn_sin = sp.compile(sp.sin(x))
        fn_exp = sp.compile(sp.exp(x))

        point = random_point(scope, rng)
        np.testing.assert_allclose(fn_sin.forward(), np.sin(point), rtol=1e-10)
        np.testing.assert_allclose(fn_exp.forward(), np.exp(point), rtol=1e-10)

        # Each compiled function keeps an independent Jacobian.
        np.testing.assert_allclose(
            fn_sin.jacobian().toarray(), np.diag(np.cos(point)), rtol=1e-10
        )
        np.testing.assert_allclose(
            fn_exp.jacobian().toarray(), np.diag(np.exp(point)), rtol=1e-10
        )

    def test_shared_scope_different_variables(self, scope, rng):
        x = scope.Variable(3, 1)
        y = scope.Variable(2, 1)
        fn_sin = sp.compile(sp.sin(x))
        fn_exp = sp.compile(sp.exp(y))

        random_point(scope, rng)  # populates the scope's variable buffer
        np.testing.assert_allclose(fn_sin.forward(), np.sin(x.value), rtol=1e-10)
        np.testing.assert_allclose(fn_exp.forward(), np.exp(y.value), rtol=1e-10)

        # Jacobians span the full 5-variable scope: 3x5 and 2x5.
        J_sin = fn_sin.jacobian().toarray()
        J_exp = fn_exp.jacobian().toarray()
        assert J_sin.shape == (3, 5)
        assert J_exp.shape == (2, 5)
        # Columns for the other expression's variable are exactly zero.
        np.testing.assert_allclose(J_sin[:, 3:], 0.0)
        np.testing.assert_allclose(J_exp[:, 0:3], 0.0)
class TestDeepChain:
    """exp(sin(tanh(A @ x + b))) — a four-deep nonlinear composition."""

    @staticmethod
    def _compile(scope, rng):
        x = scope.Variable(3, 1)
        A = rng.standard_normal((3, 3)) * 0.5  # scaled to keep exp() tame
        b = rng.standard_normal(3) * 0.1
        return sp.compile(sp.exp(sp.sin(sp.tanh(A @ x + b)))), x, A, b

    def test_forward(self, scope, rng):
        fn, x, A, b = self._compile(scope, rng)
        point = random_point(scope, rng, low=-0.5, high=0.5)
        np.testing.assert_allclose(
            fn.forward(), np.exp(np.sin(np.tanh(A @ point + b))), rtol=1e-10
        )

    def test_jacobian(self, scope, rng):
        fn = self._compile(scope, rng)[0]
        NumericalDerivativeChecker(fn, scope).check_jacobian(
            random_point(scope, rng, low=-0.5, high=0.5)
        )

    def test_hessian(self, scope, rng):
        fn = self._compile(scope, rng)[0]
        point = random_point(scope, rng, low=-0.5, high=0.5)
        NumericalDerivativeChecker(fn, scope).check_hessian(
            point, rng.standard_normal(3)
        )
class TestMatrixHessian:
    """sin(A @ X) with a matrix variable; Hessian weights flat and 2D."""

    @staticmethod
    def _compile(scope, rng):
        X = scope.Variable(3, 3)
        A = rng.standard_normal((3, 3))
        return sp.compile(sp.sin(A @ X))

    @staticmethod
    def _weights():
        # 1..9 laid out row-major, matching the original fixture matrix.
        return np.arange(1.0, 10.0).reshape(3, 3)

    def test_sin_AX_hessian(self, scope, rng):
        fn = self._compile(scope, rng)
        point = random_point(scope, rng, low=-0.5, high=0.5)
        NumericalDerivativeChecker(fn, scope).check_hessian(
            point, self._weights().ravel(order='F')
        )

    def test_sin_AX_hessian_2d_weights(self, scope, rng):
        """hessian() should accept 2D weights and flatten them column-major."""
        fn = self._compile(scope, rng)
        random_point(scope, rng, low=-0.5, high=0.5)  # populates the scope
        weights = self._weights()

        fn.forward()
        fn.jacobian()
        from_2d = fn.hessian(weights)

        # Re-evaluate, then pass the explicitly F-flattened weights.
        fn.forward()
        fn.jacobian()
        from_flat = fn.hessian(weights.ravel(order='F'))

        np.testing.assert_allclose(from_2d.toarray(), from_flat.toarray())
def test_asinh_vector_jacobian(scope, rng):
    """Jacobian of elementwise asinh on a 4-vector."""
    fn = sp.compile(sp.asinh(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_asinh_vector_hessian(scope, rng):
    fn = sp.compile(sp.asinh(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_cos_vector_forward(scope, rng):
    fn = sp.compile(sp.cos(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    np.testing.assert_allclose(fn.forward(), np.cos(point))


def test_cos_vector_jacobian(scope, rng):
    fn = sp.compile(sp.cos(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_cos_vector_hessian(scope, rng):
    fn = sp.compile(sp.cos(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))
def test_exp_vector_jacobian(scope, rng):
    fn = sp.compile(sp.exp(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_exp_vector_hessian(scope, rng):
    fn = sp.compile(sp.exp(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_exp_matrix_jacobian(scope, rng):
    """exp over a 3x2 matrix variable."""
    fn = sp.compile(sp.exp(scope.Variable(3, 2)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_exp_scalar_jacobian(scope, rng):
    """Degenerate 1x1 case."""
    fn = sp.compile(sp.exp(scope.Variable(1, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_logistic_vector_jacobian(scope, rng):
    fn = sp.compile(sp.logistic(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_logistic_vector_hessian(scope, rng):
    fn = sp.compile(sp.logistic(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))
def test_normal_cdf_vector_jacobian(scope, rng):
    fn = sp.compile(sp.normal_cdf(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_normal_cdf_vector_hessian(scope, rng):
    fn = sp.compile(sp.normal_cdf(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_normal_cdf_forward(scope, rng):
    # Engine's CDF vs scipy's — agreement only to ~1e-6 expected.
    fn = sp.compile(sp.normal_cdf(scope.Variable(3, 1)))
    point = random_point(scope, rng)
    np.testing.assert_allclose(fn.forward(), norm.cdf(point), rtol=1e-6)


def test_power_squared_jacobian(scope, rng):
    fn = sp.compile(scope.Variable(4, 1) ** 2)
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_power_squared_hessian(scope, rng):
    fn = sp.compile(scope.Variable(4, 1) ** 2)
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))
def test_power_cubed_jacobian(scope, rng):
    fn = sp.compile(scope.Variable(3, 1) ** 3)
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_power_half_jacobian(scope, rng):
    # Fractional exponent: keep the point strictly positive.
    fn = sp.compile(sp.power(scope.Variable(3, 1), 0.5))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_power_half_hessian(scope, rng):
    fn = sp.compile(sp.power(scope.Variable(3, 1), 0.5))
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(3))


def test_sin_vector_forward(scope, rng):
    fn = sp.compile(sp.sin(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    np.testing.assert_allclose(fn.forward(), np.sin(point))


def test_sin_vector_jacobian(scope, rng):
    fn = sp.compile(sp.sin(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_sin_vector_hessian(scope, rng):
    fn = sp.compile(sp.sin(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_sin_matrix_jacobian(scope, rng):
    fn = sp.compile(sp.sin(scope.Variable(3, 2)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))
def test_sinh_vector_jacobian(scope, rng):
    fn = sp.compile(sp.sinh(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_sinh_vector_hessian(scope, rng):
    fn = sp.compile(sp.sinh(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_tanh_vector_jacobian(scope, rng):
    fn = sp.compile(sp.tanh(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_tanh_vector_hessian(scope, rng):
    fn = sp.compile(sp.tanh(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_xexp_vector_jacobian(scope, rng):
    fn = sp.compile(sp.xexp(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))
def test_xexp_vector_hessian(scope, rng):
    fn = sp.compile(sp.xexp(scope.Variable(4, 1)))
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_xexp_forward(scope, rng):
    # xexp(x) = x * exp(x)
    fn = sp.compile(sp.xexp(scope.Variable(3, 1)))
    point = random_point(scope, rng)
    np.testing.assert_allclose(fn.forward(), point * np.exp(point))


def test_atanh_vector_jacobian(scope, rng):
    # atanh domain is (-1, 1); sample well inside it.
    fn = sp.compile(sp.atanh(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_point(scope, rng, low=-0.8, high=0.8)
    )


def test_atanh_vector_hessian(scope, rng):
    fn = sp.compile(sp.atanh(scope.Variable(4, 1)))
    point = random_point(scope, rng, low=-0.8, high=0.8)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))
def test_entr_vector_jacobian(scope, rng):
    fn = sp.compile(sp.entr(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_entr_vector_hessian(scope, rng):
    fn = sp.compile(sp.entr(scope.Variable(4, 1)))
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_log_vector_forward(scope, rng):
    fn = sp.compile(sp.log(scope.Variable(4, 1)))
    point = random_positive_point(scope, rng)
    np.testing.assert_allclose(fn.forward(), np.log(point))


def test_log_vector_jacobian(scope, rng):
    fn = sp.compile(sp.log(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_log_vector_hessian(scope, rng):
    fn = sp.compile(sp.log(scope.Variable(4, 1)))
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_log_matrix_jacobian(scope, rng):
    fn = sp.compile(sp.log(scope.Variable(3, 2)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )
def test_tan_vector_jacobian(scope, rng):
    # Sample small values to stay away from the poles at pi/2.
    fn = sp.compile(sp.tan(scope.Variable(4, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_point(scope, rng, low=-0.5, high=0.5)
    )


def test_tan_vector_hessian(scope, rng):
    fn = sp.compile(sp.tan(scope.Variable(4, 1)))
    point = random_point(scope, rng, low=-0.5, high=0.5)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(4))


def test_prod_all_jacobian(scope, rng):
    fn = sp.compile(sp.prod(scope.Variable(3, 1)))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_prod_all_hessian(scope, rng):
    # prod over all entries -> scalar output, one Hessian weight.
    fn = sp.compile(sp.prod(scope.Variable(3, 1)))
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, np.array([1.0]))


def test_prod_all_forward(scope, rng):
    fn = sp.compile(sp.prod(scope.Variable(3, 1)))
    point = random_positive_point(scope, rng)
    np.testing.assert_allclose(fn.forward(), [np.prod(point)])
def test_prod_axis_one_jacobian(scope, rng):
    """prod along rows of a 3x2 matrix variable."""
    fn = sp.compile(sp.prod(scope.Variable(3, 2), axis=1))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_quad_form_identity_jacobian(scope, rng):
    fn = sp.compile(
        sp.quad_form(scope.Variable(3, 1), scipy.sparse.eye(3, format="csr"))
    )
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_quad_form_identity_hessian(scope, rng):
    fn = sp.compile(
        sp.quad_form(scope.Variable(3, 1), scipy.sparse.eye(3, format="csr"))
    )
    point = random_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, np.array([1.0]))


def test_quad_form_dense_jacobian(scope, rng):
    x = scope.Variable(3, 1)
    Q = rng.standard_normal((3, 3))
    fn = sp.compile(sp.quad_form(x, Q.T @ Q))  # Q'Q is positive definite
    NumericalDerivativeChecker(fn, scope).check_jacobian(random_point(scope, rng))


def test_quad_form_forward(scope, rng):
    x = scope.Variable(3, 1)
    fn = sp.compile(sp.quad_form(x, np.eye(3) * 2.0))
    point = random_point(scope, rng)
    np.testing.assert_allclose(fn.forward(), [point @ (2.0 * point)], rtol=1e-10)
def test_quad_over_lin_jacobian(scope, rng):
    x = scope.Variable(3, 1)
    z = scope.Variable(1, 1)
    fn = sp.compile(sp.quad_over_lin(x, z))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_quad_over_lin_hessian(scope, rng):
    x = scope.Variable(3, 1)
    z = scope.Variable(1, 1)
    fn = sp.compile(sp.quad_over_lin(x, z))
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, np.array([1.0]))


def test_quad_over_lin_forward(scope, rng):
    # quad_over_lin(x, z) = sum(x**2) / z
    x = scope.Variable(3, 1)
    z = scope.Variable(1, 1)
    fn = sp.compile(sp.quad_over_lin(x, z))
    random_positive_point(scope, rng)  # populates the scope
    np.testing.assert_allclose(
        fn.forward(), [np.sum(x.value ** 2) / z.value[0]], rtol=1e-10
    )


def test_rel_entr_vector_jacobian(scope, rng):
    x = scope.Variable(3, 1)
    y = scope.Variable(3, 1)
    fn = sp.compile(sp.rel_entr(x, y))
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_rel_entr_vector_hessian(scope, rng):
    x = scope.Variable(3, 1)
    y = scope.Variable(3, 1)
    fn = sp.compile(sp.rel_entr(x, y))
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(3))
def test_rel_entr_scalar_vector_jacobian(scope, rng):
    x = scope.Variable(1, 1)
    y = scope.Variable(3, 1)
    f = sp.rel_entr(x, y)
    assert f.shape == (3, 1)  # scalar x broadcasts over y
    fn = sp.compile(f)
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_rel_entr_scalar_vector_hessian(scope, rng):
    x = scope.Variable(1, 1)
    y = scope.Variable(3, 1)
    fn = sp.compile(sp.rel_entr(x, y))
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(3))


def test_rel_entr_scalar_vector_forward(scope, rng):
    # rel_entr(x, y) = x * log(x / y) with scalar x broadcast
    x = scope.Variable(1, 1)
    y = scope.Variable(3, 1)
    fn = sp.compile(sp.rel_entr(x, y))
    random_positive_point(scope, rng)  # populates the scope
    xv = x.value[0]
    np.testing.assert_allclose(fn.forward(), xv * np.log(xv / y.value))


def test_rel_entr_vector_scalar_jacobian(scope, rng):
    x = scope.Variable(3, 1)
    y = scope.Variable(1, 1)
    f = sp.rel_entr(x, y)
    assert f.shape == (3, 1)  # scalar y broadcasts over x
    fn = sp.compile(f)
    NumericalDerivativeChecker(fn, scope).check_jacobian(
        random_positive_point(scope, rng)
    )


def test_rel_entr_vector_scalar_hessian(scope, rng):
    x = scope.Variable(3, 1)
    y = scope.Variable(1, 1)
    fn = sp.compile(sp.rel_entr(x, y))
    point = random_positive_point(scope, rng)
    NumericalDerivativeChecker(fn, scope).check_hessian(point, rng.standard_normal(3))
def test_rel_entr_incompatible_shapes(scope):
    """Mismatched non-scalar shapes must be rejected at expression-build time."""
    x = scope.Variable(3, 1)
    y = scope.Variable(2, 1)
    with pytest.raises(ValueError, match="shapes must match or one must be scalar"):
        sp.rel_entr(x, y)
fn.hessian(rng.standard_normal(6)) + np.testing.assert_allclose(H.toarray(), np.zeros((6, 6)), atol=1e-14) + + def test_index_hessian_is_zero(self, scope, rng): + x = scope.Variable(4, 1) + f = x[1:3] + fn = sp.compile(f) + x0 = random_point(scope, rng) + fn.forward() + fn.jacobian() + H = fn.hessian(rng.standard_normal(2)) + np.testing.assert_allclose(H.toarray(), np.zeros((4, 4)), atol=1e-14) + + def test_sum_hessian_is_zero(self, scope, rng): + x = scope.Variable(3, 2) + f = sp.sum(x) + fn = sp.compile(f) + x0 = random_point(scope, rng) + fn.forward() + fn.jacobian() + H = fn.hessian(np.array([1.0])) + np.testing.assert_allclose(H.toarray(), np.zeros((6, 6)), atol=1e-14) + + def test_sin_of_sum_hessian_nonzero(self, scope, rng): + """sin(sum(x)) — affine feeding into nonlinear produces nonzero Hessian.""" + x = scope.Variable(3, 1) + f = sp.sin(sp.sum(x)) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, np.array([1.0])) + + +# --------------------------------------------------------------------------- +# Re-evaluation after value change +# --------------------------------------------------------------------------- + +class TestReEvaluation: + def test_forward_updates_with_new_values(self, scope, rng): + x = scope.Variable(3, 1) + f = sp.sin(x) + fn = sp.compile(f) + + x.value = np.array([0.0, 0.0, 0.0]) + np.testing.assert_allclose(fn.forward(), np.sin([0, 0, 0])) + + x.value = np.array([1.0, 2.0, 3.0]) + np.testing.assert_allclose(fn.forward(), np.sin([1, 2, 3])) + + def test_jacobian_updates_with_new_values(self, scope, rng): + x = scope.Variable(3, 1) + f = sp.sin(x) + fn = sp.compile(f) + + x.value = np.array([0.0, 0.0, 0.0]) + fn.forward() + J1 = fn.jacobian().toarray() + np.testing.assert_allclose(np.diag(J1), np.cos([0, 0, 0])) + + x.value = np.array([1.0, 2.0, 3.0]) + fn.forward() + J2 = fn.jacobian().toarray() + np.testing.assert_allclose(np.diag(J2), np.cos([1, 2, 3])) + 
+ def test_hessian_updates_with_new_values(self, scope, rng): + x = scope.Variable(3, 1) + f = sp.sin(x) + fn = sp.compile(f) + w = np.ones(3) + + x.value = np.array([0.0, 0.0, 0.0]) + fn.forward() + fn.jacobian() + H1 = fn.hessian(w).toarray() + np.testing.assert_allclose(np.diag(H1), -np.sin([0, 0, 0]), atol=1e-14) + + x.value = np.array([1.0, 2.0, 3.0]) + fn.forward() + fn.jacobian() + H2 = fn.hessian(w).toarray() + np.testing.assert_allclose(np.diag(H2), -np.sin([1, 2, 3])) + + +# --------------------------------------------------------------------------- +# Negative indexing +# --------------------------------------------------------------------------- + +class TestNegativeIndexing: + def test_negative_scalar_index(self, scope, rng): + x = scope.Variable(4, 1) + f = x[-1] + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), [x0[-1]]) + + def test_negative_scalar_index_jacobian(self, scope, rng): + x = scope.Variable(4, 1) + f = x[-1] + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_negative_slice(self, scope, rng): + x = scope.Variable(4, 1) + f = x[-2:] + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), x0[-2:]) + + def test_negative_fancy(self, scope, rng): + x = scope.Variable(4, 1) + f = x[[-1, -3]] + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), x0[[-1, -3]]) + + +# --------------------------------------------------------------------------- +# Parameter Jacobian after update +# --------------------------------------------------------------------------- + +class TestParameterJacobianAfterUpdate: + def test_left_matmul_jacobian_after_update(self, scope, rng): + x = scope.Variable(3, 1) + A = scope.Parameter(3, 3) + A.value = np.eye(3) + f = A @ x + fn = sp.compile(f) + + x0 = random_point(scope, rng) + fn.forward() + J1 = 
fn.jacobian().toarray() + np.testing.assert_allclose(J1, np.eye(3), atol=1e-14) + + A.value = 2 * np.eye(3) + fn.forward() + J2 = fn.jacobian().toarray() + np.testing.assert_allclose(J2, 2 * np.eye(3), atol=1e-14) + + def test_scalar_mult_jacobian_after_update(self, scope, rng): + x = scope.Variable(3, 1) + a = scope.Parameter(1, 1) + a.value = np.array([[3.0]]) + f = a * x + fn = sp.compile(f) + + x0 = random_point(scope, rng) + fn.forward() + J1 = fn.jacobian().toarray() + np.testing.assert_allclose(J1, 3.0 * np.eye(3), atol=1e-14) + + a.value = np.array([[7.0]]) + fn.forward() + J2 = fn.jacobian().toarray() + np.testing.assert_allclose(J2, 7.0 * np.eye(3), atol=1e-14) + + +# --------------------------------------------------------------------------- +# x.T @ x pattern +# --------------------------------------------------------------------------- + +class TestTransposeMatmul: + def test_xT_x_forward(self, scope, rng): + x = scope.Variable(3, 1) + f = x.T @ x + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), [x0 @ x0], rtol=1e-10) + + def test_xT_x_jacobian(self, scope, rng): + x = scope.Variable(3, 1) + f = x.T @ x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_xT_x_hessian(self, scope, rng): + x = scope.Variable(3, 1) + f = x.T @ x + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_hessian(x0, np.array([1.0])) + + def test_xT_A_x(self, scope, rng): + """x.T @ A @ x where A is a constant matrix.""" + x = scope.Variable(3, 1) + A = rng.standard_normal((3, 3)) + A = A + A.T # symmetric + f = x.T @ (A @ x) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_jacobian(x0) + checker.check_hessian(x0, np.array([1.0])) + + +# --------------------------------------------------------------------------- +# 
# Subtraction with broadcasting
# ---------------------------------------------------------------------------

class TestSubBroadcast:
    """Elementwise subtraction with scalar/column broadcasting."""

    def test_sub_scalar_broadcast(self, scope, rng):
        X = scope.Variable(3, 2)
        a = scope.Variable(1, 1)
        f = X - a
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        checker.check_jacobian(random_point(scope, rng))

    def test_sub_column_broadcast(self, scope, rng):
        X = scope.Variable(3, 2)
        c = scope.Variable(3, 1)
        f = X - c
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        checker.check_jacobian(random_point(scope, rng))

    def test_rsub_broadcast(self, scope, rng):
        x = scope.Variable(3, 1)
        f = np.array([10.0, 20.0, 30.0]) - x
        fn = sp.compile(f)
        x0 = random_point(scope, rng)
        np.testing.assert_allclose(
            fn.forward(), np.array([10.0, 20.0, 30.0]) - x0
        )


# ---------------------------------------------------------------------------
# Scope set_values / get_values roundtrip
# ---------------------------------------------------------------------------

class TestScopeRoundtrip:
    """Writing through the scope and through variables must agree."""

    def test_set_get_roundtrip(self, scope, rng):
        x = scope.Variable(3, 1)
        y = scope.Variable(2, 1)
        vals = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
        scope.set_values(vals)
        np.testing.assert_allclose(scope.get_values(), vals)
        np.testing.assert_allclose(x.value, [1.0, 2.0, 3.0])
        np.testing.assert_allclose(y.value, [4.0, 5.0])

    def test_variable_value_writes_to_scope(self, scope, rng):
        x = scope.Variable(3, 1)
        y = scope.Variable(2, 1)
        x.value = np.array([10.0, 20.0, 30.0])
        y.value = np.array([40.0, 50.0])
        np.testing.assert_allclose(
            scope.get_values(), [10.0, 20.0, 30.0, 40.0, 50.0]
        )


# ---------------------------------------------------------------------------
# Compile same expression twice
# ---------------------------------------------------------------------------

class TestCompileTwice:
    def test_two_compiles_independent(self, scope, rng):
        """Two compilations of one expression evaluate identically."""
        x = scope.Variable(3, 1)
        f = sp.sin(x)
        fn1 = sp.compile(f)
        fn2 = sp.compile(f)

        x.value = np.array([1.0, 2.0, 3.0])
        f1 = fn1.forward()
        f2 = fn2.forward()
        np.testing.assert_allclose(f1, f2)
        np.testing.assert_allclose(
            fn1.jacobian().toarray(), fn2.jacobian().toarray()
        )


# ---------------------------------------------------------------------------
# Degenerate / edge cases
# ---------------------------------------------------------------------------

class TestDegenerateCases:
    """Single-element stacks, scalars, identity/constant expressions, trivial powers."""

    def test_hstack_single(self, scope, rng):
        x = scope.Variable(3, 1)
        f = sp.hstack([x])
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        checker.check_jacobian(random_point(scope, rng))

    def test_vstack_single(self, scope, rng):
        x = scope.Variable(3, 1)
        f = sp.vstack([x])
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        checker.check_jacobian(random_point(scope, rng))

    def test_sum_scalar(self, scope, rng):
        x = scope.Variable(1, 1)
        f = sp.sum(x)
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        checker.check_jacobian(random_point(scope, rng))

    def test_scalar_variable(self, scope, rng):
        x = scope.Variable(1, 1)
        f = sp.sin(x)
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        x0 = random_point(scope, rng)
        checker.check_jacobian(x0)
        checker.check_hessian(x0, np.array([1.0]))

    def test_identity_expression(self, scope, rng):
        """Compiling just a variable."""
        x = scope.Variable(3, 1)
        fn = sp.compile(x)
        x0 = random_point(scope, rng)
        np.testing.assert_allclose(fn.forward(), x0)
        J = fn.jacobian().toarray()  # forward() was just called above
        np.testing.assert_allclose(J, np.eye(3))

    def test_constant_expression_raises(self, scope, rng):
        """Compiling a constant (no variables) should raise."""
        from sparsediffpy._core._constants import Constant
        c = Constant(np.array([1.0, 2.0, 3.0]), (3, 1))
        with pytest.raises(ValueError, match="at least one Variable"):
            sp.compile(c)

    def test_nested_transpose(self, scope, rng):
        """x.T.T should be x."""
        x = scope.Variable(3, 1)
        f = x.T.T
        assert f.shape == (3, 1)
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        checker.check_jacobian(random_point(scope, rng))

    def test_power_one(self, scope, rng):
        """x**1 should be identity."""
        x = scope.Variable(3, 1)
        f = x ** 1
        fn = sp.compile(f)
        x0 = random_point(scope, rng)
        np.testing.assert_allclose(fn.forward(), x0)

    def test_power_zero(self, scope, rng):
        """x**0 should be ones."""
        x = scope.Variable(3, 1)
        f = x ** 0
        fn = sp.compile(f)
        x0 = random_positive_point(scope, rng)
        np.testing.assert_allclose(fn.forward(), np.ones(3))


# === new file in the patch: tests/test_row_vectors.py ===
"""Tests with row vectors (1, n) to exercise different code paths."""

import numpy as np
import sparsediffpy as sp
from tests.utils import NumericalDerivativeChecker, random_point


class TestRowVectorBasics:
    """Forward and derivative checks on (1, n) variables."""

    def test_row_variable_forward(self, scope, rng):
        r = scope.Variable(1, 4)
        fn = sp.compile(r)
        x0 = random_point(scope, rng)
        np.testing.assert_allclose(fn.forward(), x0)

    def test_row_sin_jacobian(self, scope, rng):
        r = scope.Variable(1, 4)
        f = sp.sin(r)
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        checker.check_jacobian(random_point(scope, rng))

    def test_row_sin_hessian(self, scope, rng):
        r = scope.Variable(1, 4)
        f = sp.sin(r)
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        x0 = random_point(scope, rng)
        checker.check_hessian(x0, rng.standard_normal(4))


class TestRowVectorIndexing:
    def test_row_scalar_index(self, scope, rng):
        r = scope.Variable(1, 4)
        f = r[2]
        fn = sp.compile(f)
        checker = NumericalDerivativeChecker(fn, scope)
        checker.check_jacobian(random_point(scope, rng))
checker.check_jacobian(random_point(scope, rng)) + + def test_row_scalar_index_forward(self, scope, rng): + r = scope.Variable(1, 4) + f = r[2] + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), [x0[2]]) + + def test_row_slice_index(self, scope, rng): + r = scope.Variable(1, 4) + f = r[1:3] + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +class TestRowVectorBroadcast: + def test_row_plus_scalar(self, scope, rng): + r = scope.Variable(1, 3) + f = r + 1.0 + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_row_plus_matrix(self, scope, rng): + r = scope.Variable(1, 3) + M = scope.Variable(4, 3) + f = r + M + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_scalar_times_row(self, scope, rng): + r = scope.Variable(1, 4) + f = 2.5 * r + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), 2.5 * x0) + + +class TestRowVectorMatmul: + def test_row_times_matrix(self, scope, rng): + """(1,3) @ (3,2) = (1,2)""" + r = scope.Variable(1, 3) + A = rng.standard_normal((3, 2)) + f = r @ A + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_row_times_column(self, scope, rng): + """(1,3) @ (3,1) = (1,1) — dot product.""" + r = scope.Variable(1, 3) + c = scope.Variable(3, 1) + f = r @ c + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + x0 = random_point(scope, rng) + checker.check_jacobian(x0) + checker.check_hessian(x0, np.array([1.0])) + + def test_matrix_times_row_transpose(self, scope, rng): + """A @ r.T where r is (1,3) -> r.T is (3,1).""" + r = scope.Variable(1, 3) + A = rng.standard_normal((4, 3)) + f = A @ r.T + fn = sp.compile(f) + checker = 
NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +class TestRowVectorTranspose: + def test_row_transpose_is_column(self, scope, rng): + r = scope.Variable(1, 4) + f = r.T + assert f.shape == (4, 1) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_column_transpose_is_row(self, scope, rng): + c = scope.Variable(4, 1) + f = c.T + assert f.shape == (1, 4) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + +class TestRowVectorReductions: + def test_sum_row(self, scope, rng): + r = scope.Variable(1, 4) + f = sp.sum(r) + fn = sp.compile(f) + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(random_point(scope, rng)) + + def test_sum_row_forward(self, scope, rng): + r = scope.Variable(1, 4) + f = sp.sum(r) + fn = sp.compile(f) + x0 = random_point(scope, rng) + np.testing.assert_allclose(fn.forward(), [np.sum(x0)]) diff --git a/tests/test_validation.py b/tests/test_validation.py new file mode 100644 index 0000000..77c680b --- /dev/null +++ b/tests/test_validation.py @@ -0,0 +1,272 @@ +"""Tests for error/validation: shape mismatches, wrong assignments, mixed scopes.""" + +import numpy as np +import pytest +import scipy.sparse +import sparsediffpy as sp + + +# --------------------------------------------------------------------------- +# Shape mismatch in binary operators +# --------------------------------------------------------------------------- + +class TestShapeMismatch: + def test_add_incompatible(self, scope): + x = scope.Variable(3, 1) + y = scope.Variable(2, 1) + with pytest.raises(ValueError, match="Cannot broadcast"): + x + y + + def test_add_incompatible_matrix(self, scope): + X = scope.Variable(3, 2) + Y = scope.Variable(2, 3) + with pytest.raises(ValueError, match="Cannot broadcast"): + X + Y + + def test_sub_incompatible(self, scope): + 
x = scope.Variable(3, 1) + y = scope.Variable(2, 1) + with pytest.raises(ValueError, match="Cannot broadcast"): + x - y + + def test_matmul_incompatible(self, scope): + x = scope.Variable(3, 1) + y = scope.Variable(2, 1) + with pytest.raises(ValueError, match="Matmul shape mismatch"): + x @ y + + def test_matmul_inner_dim_mismatch(self, scope): + X = scope.Variable(3, 4) + Y = scope.Variable(2, 3) + with pytest.raises(ValueError, match="Matmul shape mismatch"): + X @ Y + + def test_rel_entr_shape_mismatch(self, scope): + x = scope.Variable(3, 1) + y = scope.Variable(2, 1) + with pytest.raises(ValueError, match="shapes must match or one must be scalar"): + sp.rel_entr(x, y) + + def test_quad_over_lin_non_scalar_z(self, scope): + x = scope.Variable(3, 1) + z = scope.Variable(2, 1) + with pytest.raises(ValueError, match="must be scalar"): + sp.quad_over_lin(x, z) + + +# --------------------------------------------------------------------------- +# Atom-specific shape validation +# --------------------------------------------------------------------------- + +class TestAtomValidation: + def test_trace_non_square(self, scope): + X = scope.Variable(3, 2) + with pytest.raises(ValueError, match="square matrix"): + sp.trace(X) + + def test_diag_vec_non_column(self, scope): + X = scope.Variable(3, 2) + with pytest.raises(ValueError, match="column vector"): + sp.diag_vec(X) + + def test_diag_vec_row_vector(self, scope): + r = scope.Variable(1, 3) + with pytest.raises(ValueError, match="column vector"): + sp.diag_vec(r) + + def test_reshape_size_mismatch(self, scope): + x = scope.Variable(3, 1) + with pytest.raises(ValueError, match="Cannot reshape"): + sp.reshape(x, 2, 2) + + def test_quad_form_wrong_Q_size(self, scope): + x = scope.Variable(3, 1) + Q = np.eye(4) + with pytest.raises(ValueError, match="need x"): + sp.quad_form(x, Q) + + def test_quad_form_non_column(self, scope): + x = scope.Variable(1, 3) + Q = np.eye(3) + with pytest.raises(ValueError, match="need x"): + 
sp.quad_form(x, Q) + + def test_pow_non_numeric_exponent(self, scope): + x = scope.Variable(3, 1) + with pytest.raises(TypeError, match="constant number"): + x ** "two" + + def test_hstack_empty(self): + with pytest.raises(ValueError, match="empty argument"): + sp.hstack([]) + + def test_vstack_empty(self): + with pytest.raises(ValueError, match="empty argument"): + sp.vstack([]) + + def test_hstack_mismatched_rows(self, scope): + x = scope.Variable(3, 1) + y = scope.Variable(2, 1) + with pytest.raises(ValueError, match="row mismatch"): + sp.hstack([x, y]) + + def test_vstack_mismatched_cols(self, scope): + X = scope.Variable(3, 2) + Y = scope.Variable(3, 3) + with pytest.raises(ValueError, match="column mismatch"): + sp.vstack([X, Y]) + + def test_restricted_domain_on_index(self, scope): + x = scope.Variable(4, 1) + with pytest.raises(ValueError, match="cannot be applied directly"): + sp.log(x[1:3]) + + with pytest.raises(ValueError, match="cannot be applied directly"): + sp.tan(x[1:3]) + + with pytest.raises(ValueError, match="cannot be applied directly"): + sp.atanh(x[1:3]) + + with pytest.raises(ValueError, match="cannot be applied directly"): + sp.entr(x[1:3]) + + def test_quad_over_lin_args_must_be_variables(self, scope): + x = scope.Variable(3, 1) + z = scope.Variable(1, 1) + # This should work — both are plain variables + sp.quad_over_lin(x, z) + + # x is a composition — fails + with pytest.raises(ValueError, match="x.*must be a plain Variable"): + sp.quad_over_lin(sp.sin(x), z) + + # z is a composition — fails + with pytest.raises(ValueError, match="z.*must be a plain"): + sp.quad_over_lin(x, sp.exp(z)) + + def test_quad_over_lin_z_not_in_x(self, scope): + z = scope.Variable(1, 1) + # z used as both args + with pytest.raises(ValueError, match="z must not appear in x"): + sp.quad_over_lin(z, z) + + def test_prod_must_be_variable(self, scope): + x = scope.Variable(3, 1) + # This should work — x is a plain variable + sp.prod(x) + + # This should fail — 
argument is a composition + with pytest.raises(ValueError, match="plain Variable"): + sp.prod(sp.sin(x)) + + def test_prod_axis_must_be_variable(self, scope): + X = scope.Variable(3, 2) + sp.prod(X, axis=0) + sp.prod(X, axis=1) + + with pytest.raises(ValueError, match="plain Variable"): + sp.prod(sp.sin(X), axis=0) + + +# --------------------------------------------------------------------------- +# Wrong-size value assignment +# --------------------------------------------------------------------------- + +class TestValueAssignment: + def test_variable_wrong_size(self, scope): + x = scope.Variable(3, 1) + with pytest.raises(ValueError, match="expected 3 elements"): + x.value = np.array([1.0, 2.0]) + + def test_variable_too_many(self, scope): + x = scope.Variable(3, 1) + with pytest.raises(ValueError, match="expected 3 elements"): + x.value = np.array([1.0, 2.0, 3.0, 4.0]) + + def test_parameter_wrong_size(self, scope): + p = scope.Parameter(2, 2) + p.value = np.eye(2) + with pytest.raises(ValueError, match="expected 4 elements"): + p.value = np.array([1.0, 2.0]) + + def test_scope_set_values_wrong_size(self, scope): + x = scope.Variable(3, 1) + with pytest.raises(ValueError, match="expected 3 elements"): + scope.set_values(np.array([1.0, 2.0])) + + def test_parameter_unset_value_is_none(self, scope): + p = scope.Parameter(2, 2) + assert p.value is None + + def test_parameter_unset_raises_on_eval(self, scope): + x = scope.Variable(3, 1) + A = scope.Parameter(3, 3) + f = A @ x + fn = sp.compile(f) + x.value = np.array([1.0, 2.0, 3.0]) + with pytest.raises(ValueError, match="has no value set"): + fn.forward() + + +# --------------------------------------------------------------------------- +# Mixed scopes +# --------------------------------------------------------------------------- + +class TestMixedScopes: + def test_mixed_scopes_raises(self): + scope1 = sp.Scope() + scope2 = sp.Scope() + x = scope1.Variable(3, 1) + y = scope2.Variable(3, 1) + f = x + y + with 
pytest.raises(ValueError, match="same Scope"): + sp.compile(f) + + +# --------------------------------------------------------------------------- +# Invalid shape dimensions +# --------------------------------------------------------------------------- + +class TestInvalidShapes: + def test_variable_zero_dim(self, scope): + with pytest.raises(ValueError, match="positive"): + scope.Variable(0, 1) + + def test_variable_negative_dim(self, scope): + with pytest.raises(ValueError, match="positive"): + scope.Variable(-1, 3) + + def test_parameter_zero_dim(self, scope): + with pytest.raises(ValueError, match="positive"): + scope.Parameter(3, 0) + + +# --------------------------------------------------------------------------- +# Index out of bounds +# --------------------------------------------------------------------------- + +class TestIndexOutOfBounds: + def test_scalar_index_out_of_range(self, scope): + x = scope.Variable(3, 1) + with pytest.raises(IndexError, match="out of range"): + x[5] + + def test_negative_index_out_of_range(self, scope): + x = scope.Variable(3, 1) + with pytest.raises(IndexError, match="out of range"): + x[-4] + + def test_fancy_index_out_of_range(self, scope): + x = scope.Variable(3, 1) + with pytest.raises(IndexError, match="out of range"): + x[[0, 5]] + + def test_matrix_index_out_of_range(self, scope): + X = scope.Variable(3, 2) + with pytest.raises(IndexError, match="out of range"): + X[5, 0] + + def test_matrix_col_out_of_range(self, scope): + X = scope.Variable(3, 2) + with pytest.raises(IndexError, match="out of range"): + X[0, 3] diff --git a/tests/utils.py b/tests/utils.py new file mode 100644 index 0000000..d6cc85a --- /dev/null +++ b/tests/utils.py @@ -0,0 +1,124 @@ +"""Test utilities: numerical derivative checker and helpers.""" + +import numpy as np +import sparsediffpy as sp + + +class NumericalDerivativeChecker: + """Check Jacobian and Hessian of a compiled expression against + central finite differences. 
+ + Usage: + checker = NumericalDerivativeChecker(fn, scope) + checker.check_jacobian(x0) + checker.check_hessian(x0, weights) + """ + + def __init__(self, compiled_expr, scope, h=1e-5, rtol=1e-5, atol=1e-8): + self._fn = compiled_expr + self._scope = scope + self._h = h + self._rtol = rtol + self._atol = atol + + def check_jacobian(self, x0): + """Compare analytical Jacobian against central finite differences. + + J_approx[:, j] = (f(x + h*e_j) - f(x - h*e_j)) / (2h) + """ + x0 = np.asarray(x0, dtype=np.float64).ravel() + n = x0.size + self._scope.set_values(x0) + + # Analytical Jacobian + self._fn.forward() + J_analytical = self._fn.jacobian().toarray() + m = J_analytical.shape[0] + + # Numerical Jacobian via central differences + J_numerical = np.zeros((m, n)) + for j in range(n): + x_plus = x0.copy() + x_minus = x0.copy() + x_plus[j] += self._h + x_minus[j] -= self._h + + self._scope.set_values(x_plus) + f_plus = self._fn.forward().copy() + + self._scope.set_values(x_minus) + f_minus = self._fn.forward().copy() + + J_numerical[:, j] = (f_plus - f_minus) / (2 * self._h) + + # Restore original point + self._scope.set_values(x0) + + np.testing.assert_allclose( + J_analytical, J_numerical, + rtol=self._rtol, atol=self._atol, + err_msg="Jacobian mismatch between analytical and numerical", + ) + + def check_hessian(self, x0, weights): + """Compare analytical Hessian against numerical Hessian. 
+ + For phi(x) = w^T f(x), the Hessian is computed by perturbing x_j + and recomputing the gradient grad_phi = J^T w: + + H_approx[:, j] = (J(x+h*e_j)^T w - J(x-h*e_j)^T w) / (2h) + """ + x0 = np.asarray(x0, dtype=np.float64).ravel() + weights = np.asarray(weights, dtype=np.float64).ravel() + n = x0.size + + # Analytical Hessian + self._scope.set_values(x0) + self._fn.forward() + self._fn.jacobian() + H_analytical = self._fn.hessian(weights).toarray() + + # Numerical Hessian via central differences on the gradient + H_numerical = np.zeros((n, n)) + for j in range(n): + x_plus = x0.copy() + x_minus = x0.copy() + x_plus[j] += self._h + x_minus[j] -= self._h + + self._scope.set_values(x_plus) + self._fn.forward() + J_plus = self._fn.jacobian().toarray() + grad_plus = J_plus.T @ weights + + self._scope.set_values(x_minus) + self._fn.forward() + J_minus = self._fn.jacobian().toarray() + grad_minus = J_minus.T @ weights + + H_numerical[:, j] = (grad_plus - grad_minus) / (2 * self._h) + + # Restore original point + self._scope.set_values(x0) + + np.testing.assert_allclose( + H_analytical, H_numerical, + rtol=self._rtol, atol=self._atol, + err_msg="Hessian mismatch between analytical and numerical", + ) + + +def random_point(scope, rng, low=-1.0, high=1.0): + """Set all variables to random values and return the flat vector.""" + n = scope.total_var_size + x0 = rng.uniform(low, high, size=n) + scope.set_values(x0) + return x0 + + +def random_positive_point(scope, rng, low=0.1, high=2.0): + """Set all variables to positive random values (for restricted domains).""" + n = scope.total_var_size + x0 = rng.uniform(low, high, size=n) + scope.set_values(x0) + return x0