Standardize TFLM Reference Kernels to Single Rounding Requantization #3252

@veblush

Issue

TFLM's reference kernels (and TFLite's) use a legacy "Double Rounding" requantization algorithm, which produces off-by-one results at a rate of approximately 1.8%. This diverges from the mathematically correct "Single Rounding" used by TFLite's optimized kernels, primarily XNNPack and ruy, which serve as the "Golden Reference." Context: Single Rounding was introduced in TFLite in 2021 by tensorflow/tensorflow#50290 to address tensorflow/tensorflow#25087, but it wasn't enabled by default due to concerns about regressions.

This inconsistency creates a critical "Validation Trap":

  • Validation: Developers validate models using TFLite in Python (correct Single Rounding).
  • Deployment: Deploying to TFLM (incorrect Double Rounding) causes output mismatches, leading to unexpected accuracy drops and engineering churn.
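To make the difference between the two schemes concrete, here is a minimal sketch of both requantization paths. It is modeled on the usual fixed-point formulation (a Q31 multiplier plus a power-of-two shift) and is not the TFLM source: all function names are hypothetical, and only right shifts (`shift` in [-31, 0]) are handled.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper names; a simplified sketch, not the TFLM implementation.
// Both paths approximate x * multiplier * 2^shift / 2^31, shift in [-31, 0].

// Legacy step 1: rounded high 32 bits of 2*a*b (first rounding).
constexpr int32_t RoundingDoublingHighMul(int32_t a, int32_t b) {
  if (a == b && a == std::numeric_limits<int32_t>::min()) {
    return std::numeric_limits<int32_t>::max();  // the one overflow case
  }
  const int64_t ab = static_cast<int64_t>(a) * b;
  const int64_t nudge = ab >= 0 ? (1LL << 30) : (1 - (1LL << 30));
  return static_cast<int32_t>((ab + nudge) / (1LL << 31));
}

// Legacy step 2: rounding right shift (second rounding).
// Assumes >> on negative values is an arithmetic shift.
constexpr int32_t RoundingRightShift(int32_t x, int exponent) {
  const int32_t mask = static_cast<int32_t>((1LL << exponent) - 1);
  const int32_t remainder = x & mask;
  const int32_t threshold = (mask >> 1) + (x < 0 ? 1 : 0);
  return (x >> exponent) + (remainder > threshold ? 1 : 0);
}

// Double rounding: rounds after the multiply and again after the shift.
constexpr int32_t RequantizeDoubleRounding(int32_t x, int32_t multiplier,
                                           int shift) {
  return RoundingRightShift(RoundingDoublingHighMul(x, multiplier), -shift);
}

// Single rounding: keep the full 64-bit product and round exactly once.
constexpr int32_t RequantizeSingleRounding(int32_t x, int32_t multiplier,
                                           int shift) {
  const int total_shift = 31 - shift;
  const int64_t round = 1LL << (total_shift - 1);
  const int64_t result = static_cast<int64_t>(x) * multiplier + round;
  return static_cast<int32_t>(result >> total_shift);
}
```

With `multiplier = 1 << 30` (0.5 in Q31) and `shift = -1`, the effective scale is 0.25. For `x = 5` the exact product is 1.25: the single-rounding path returns 1, while the double-rounding path first rounds 2.5 up to 3 and then rounds 1.5 up to 2, which is the off-by-one this issue describes.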

Proposal: Standardize Reference Kernels to Single Rounding

I propose standardizing the Reference Kernel behavior to Single Rounding, ensuring TFLM matches the TFLite behavior.

Action Plan:

  1. Enable TFLITE_SINGLE_ROUNDING by default in the Bazel and Makefile builds. (Optionally, we can consider introducing a new switch to opt back into double rounding.)
  2. Vendor Alignment and Coordination: Propagate this standard to optimized TFLM vendor kernels (e.g., ARM CMSIS-NN, Cadence Xtensa). Coordination is essential to ensure these paths align with the new mathematical standard, maintaining ecosystem consistency.
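Item 1 amounts to flipping a compile-time default. A rough sketch of how that could look in a shared header, assuming the kernels keep selecting the path through the `TFLITE_SINGLE_ROUNDING` macro. The opt-out macro `TFLM_LEGACY_DOUBLE_ROUNDING` and the helper `UsesSingleRounding` are hypothetical names made up for illustration, not existing flags or APIs:

```cpp
// Hypothetical opt-out switch (not an existing flag): a build that still
// needs the legacy behavior would pass -DTFLM_LEGACY_DOUBLE_ROUNDING.
#if defined(TFLM_LEGACY_DOUBLE_ROUNDING)
#undef TFLITE_SINGLE_ROUNDING
#define TFLITE_SINGLE_ROUNDING 0
#elif !defined(TFLITE_SINGLE_ROUNDING)
// Proposed default: single rounding unless the build says otherwise.
#define TFLITE_SINGLE_ROUNDING 1
#endif

// Hypothetical helper so a test can report which path was compiled in.
constexpr bool UsesSingleRounding() { return TFLITE_SINGLE_ROUNDING != 0; }
```

An explicit `-DTFLITE_SINGLE_ROUNDING=0` on the command line would still be honored in this sketch, so existing builds that pin the legacy path would keep working.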
