py/map: Make dicts preserve insertion order. by andrewleech · Pull Request #34 · andrewleech/micropython

andrewleech · 2026-03-26T11:38:14Z

Summary

Some years back @dpgeorge and I had a number of discussions about making dicts ordered (micropython#6170, micropython#6173), with a couple of my earlier attempts (micropython#5514, micropython#5517) being too simple to land. That led to @dpgeorge's micropython#6173 which took the proper CPython/PyPy approach (dense key/value array with separate sparse hash index) but was left as WIP with a few open items. With an aim to close out this earlier work I've tried to build on that PR with fixes for the remaining issues, compaction, OrderedDict simplification, performance testing and flash size reduction (as much as I can).

Dicts now preserve insertion order matching CPython 3.7+. Hash indices are uint8 for small dicts (<256 entries), uint16 above that, and uint32 via MICROPY_PY_MAP_LARGE for dicts exceeding 65535 elements. Without LARGE the alloc is capped at 65535; at 8+ bytes per entry that's over 0.5MB of table, well past what most targets have.
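As a rough illustration of the index widths described above, the following Python sketch picks the bytes per hash-index entry from the allocation of the dense array. The function name `index_entry_bytes`, the exact boundary handling and the `MemoryError` raised past the cap are assumptions made for illustration, not the C implementation.

```python
def index_entry_bytes(alloc, large=False):
    """Return the assumed width in bytes of one hash-index entry.

    `alloc` is the number of slots in the dense key/value array; `large`
    stands in for a build with MICROPY_PY_MAP_LARGE enabled.
    """
    if alloc < 256:
        return 1  # uint8 indices for small dicts
    if alloc <= 65535:
        return 2  # uint16 indices
    if not large:
        # Assumption: going past the cap is reported as a memory error.
        raise MemoryError("alloc is capped at 65535 without MICROPY_PY_MAP_LARGE")
    return 4  # uint32 indices


# Dense table size at the cap, using the 8-byte lower bound per entry.
print(65535 * 8)  # 524280 bytes
```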

Deleted entries become tombstones in the dense array. Compaction runs in place when tombstones exceed 50% of live entries or when the array is full. It needs no allocation, so dict operations work safely under heap lock (exception handling, GC).
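The tombstone and compaction scheme can be modelled in a few lines of Python. This is a minimal sketch, not the C implementation: the class name `OrderedMapSketch`, linear probing, an index twice the size of the dense array, and raising `MemoryError` instead of growing are all assumptions. Python itself allocates while running this, so the model only mirrors the data movement, which reuses the existing arrays.

```python
_TOMB = object()  # marks a deleted slot in the dense array
_EMPTY = -1       # marks an unused slot in the sparse index


class OrderedMapSketch:
    def __init__(self, alloc=8):
        self.alloc = alloc
        self.entries = [None] * alloc        # dense (key, value) array, insertion order
        self.index = [_EMPTY] * (alloc * 2)  # sparse hash index -> position in entries
        self.used = 0    # high-water mark into entries
        self.filled = 0  # number of live entries

    def _probe(self, key):
        size = len(self.index)
        i = hash(key) % size
        while True:  # linear probing (assumption)
            yield i
            i = (i + 1) % size

    def _find(self, key):
        """Return (index slot, dense position or None if the key is absent)."""
        for slot in self._probe(key):
            pos = self.index[slot]
            if pos == _EMPTY:
                return slot, None
            entry = self.entries[pos]
            if entry is not _TOMB and entry[0] == key:
                return slot, pos

    def __getitem__(self, key):
        _, pos = self._find(key)
        if pos is None:
            raise KeyError(key)
        return self.entries[pos][1]

    def __setitem__(self, key, value):
        slot, pos = self._find(key)
        if pos is not None:
            self.entries[pos] = (key, value)  # overwrite keeps the original position
            return
        if self.used == self.alloc:
            self._compact()  # array full: reclaim tombstones in place
            if self.used == self.alloc:
                raise MemoryError("full; a real map would grow here")
            slot, _ = self._find(key)  # the index was rebuilt
        self.entries[self.used] = (key, value)
        self.index[slot] = self.used
        self.used += 1
        self.filled += 1

    def __delitem__(self, key):
        _, pos = self._find(key)
        if pos is None:
            raise KeyError(key)
        self.entries[pos] = _TOMB
        self.filled -= 1
        tombstones = self.used - self.filled
        if tombstones * 2 > self.filled:  # tombstones exceed 50% of live entries
            self._compact()

    def _compact(self):
        write = 0
        for read in range(self.used):  # slide live entries down, keeping order
            if self.entries[read] is not _TOMB:
                self.entries[write] = self.entries[read]
                write += 1
        for i in range(write, self.used):
            self.entries[i] = None
        self.used = write
        for i in range(len(self.index)):  # rebuild the index in the same list
            self.index[i] = _EMPTY
        for pos in range(self.used):
            for slot in self._probe(self.entries[pos][0]):
                if self.index[slot] == _EMPTY:
                    self.index[slot] = pos
                    break

    def keys(self):
        return [e[0] for e in self.entries[:self.used] if e is not _TOMB]
```

In this model a single delete from a full four-slot map leaves a tombstone behind, and the next insert triggers the "array is full" compaction before the new entry is appended at the end.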

dict.popitem() returns items in LIFO order, matching CPython. OrderedDict now shares the same hash table backing as regular dicts, and the mutable is_ordered linear-scan code path is removed. ROM/const maps keep linear scan via is_fixed.
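The LIFO behaviour with mixed delete and add can be checked with a short script. The helper name `drain` is made up for this example; the expected order is what CPython 3.7+ produces, which this branch aims to match.

```python
def drain(d):
    """Pop items until the dict is empty, returning them in pop order."""
    out = []
    while d:
        out.append(d.popitem())
    return out


d = {"a": 1, "b": 2, "c": 3}
del d["b"]
d["b"] = 20  # re-inserting moves "b" to the end of the insertion order
order = drain(d)
print(order)  # [('b', 20), ('c', 3), ('a', 1)]
```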

The filled count for O(1) len() is packed into the same bitfield word as used so mp_obj_dict_t stays within one GC block. On 32-bit that gives 15 bits each (max 32767 entries), on 64-bit 31 bits.
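The packing arithmetic can be sketched as follows. It assumes that two bits of the word stay reserved for flags, which is consistent with the 15-bit and 31-bit figures above but not confirmed by them, and the field order inside the word is hypothetical.

```python
def field_bits(word_bits):
    """Bits available to each of `used` and `filled` in one machine word."""
    flag_bits = 2  # assumption: two flag bits share the word
    return (word_bits - flag_bits) // 2


def pack(used, filled, bits):
    """Pack both counters into one integer (hypothetical field order)."""
    mask = (1 << bits) - 1
    if used > mask or filled > mask:
        raise OverflowError("counter does not fit in the bitfield")
    return (filled << bits) | used


def unpack(word, bits):
    """Return (used, filled) from a packed word."""
    mask = (1 << bits) - 1
    return word & mask, (word >> bits) & mask


print(field_bits(32), field_bits(64))  # 15 31
print((1 << field_bits(32)) - 1)       # 32767
```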

The feature sits behind MICROPY_PY_MAP_ORDERED, which defaults on at the EXTRA_FEATURES ROM level. With it off, the original flat open-addressing table is used with zero overhead.

map->used changes meaning from live-entry count to high-water mark when ordered maps are enabled, so external code reading dict length should use the mp_map_len() macro. The core code and port pin files are converted; there are ~28 kwargs->used instances across extmod/ and ports/ left unconverted since kwargs maps are add-only (used == filled always holds). It is an open question whether those should be converted for consistency.

Testing

Tested on unix (standard, coverage, minimal, nanbox, longlong), STM32 PYBV10 (cross-compiled), and PYBD-SF6W (on-device). Both MICROPY_PY_MAP_ORDERED=1 and =0 build and pass. Tests cover compaction thresholds, ordering preservation, empty-dict edge cases, non-qstr keys through hash rebuild, LIFO popitem with mixed del/add, dict operations under micropython.heap_lock(), and the alloc overflow boundary (stress/dict_create_max).

Flash measured by building branch and merge-base with identical toolchain (arm-none-eabi-gcc 14.3.1, -Os), comparing size on the firmware ELF:

Target	Branch	Merge base	Delta
STM32 PYBV10 (Thumb-2)	367,676	367,356	+320 B (+0.09%)
Unix x86-64	783,774	783,062	+712 B (+0.09%)

RAM via gc.collect(); gc.mem_alloc() before/after dict creation on unix x86-64:

Dict size	Branch	Master	Delta
0 (empty)	32	32	0
1	96	64	+32
10	224	192	+32
50	1,056	992	+64
100	2,208	2,080	+128
500	9,440	8,384	+1,056

The per-slot overhead comes from the hash index (one byte per slot for dicts under 256 entries, two above that). An empty dict is identical to master.
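A measurement along these lines could look like the sketch below. The helper names `heap_in_use` and `dict_cost` and the int-keyed test dict are assumptions; `gc.mem_alloc()` exists on MicroPython only, so on CPython the helper returns `None` rather than a number.

```python
import gc


def heap_in_use():
    """Return bytes allocated on the MicroPython heap, or None elsewhere."""
    gc.collect()
    mem_alloc = getattr(gc, "mem_alloc", None)  # MicroPython-only function
    if mem_alloc is None:
        return None
    return mem_alloc()


def dict_cost(n):
    """Heap bytes consumed by creating a dict with n small-int entries."""
    before = heap_in_use()
    d = {i: i for i in range(n)}  # small ints need no heap objects on MicroPython
    after = heap_in_use()         # d is still referenced here, so it is not collected
    if before is None or after is None:
        return None
    return after - before


print(dict_cost(100))
```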

Speed on unix x86-64, time.ticks_us() inside functions, 1000 iterations on 100-entry dicts, median of 3 runs. Measured before the latest minor fixes (alloc guard, assert); these don't affect hot paths so numbers should still be representative:

Operation	Branch (us)	Master (us)	Change
create 100	52,630	48,825	+7.8%
insert 100	61,199	55,885	+9.5%
lookup 100	48,389	49,209	-1.7%
iterate 100	30,972	30,813	+0.5%
del+add 100	275,700	835,207	-67.0%
globals rw 1M	146,271	137,767	+6.2%

Lookup and iteration are within noise. Insertion is ~10% slower from hash index writes, while delete+add cycles are ~3x faster (in-place compaction vs full rehash). Globals access has ~6% overhead.
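For reference, a timing harness in the spirit of the description (timer inside a function, median of 3 runs) might look like this. The exact workload of the del+add benchmark is not given here, so `del_add_cycle` is an assumed stand-in, and the fallback to `time.perf_counter_ns()` is only there so the sketch also runs on CPython.

```python
import time

if hasattr(time, "ticks_us"):
    # MicroPython: ticks wrap around, so differences go through ticks_diff().
    def elapsed_us(fn):
        t0 = time.ticks_us()
        fn()
        return time.ticks_diff(time.ticks_us(), t0)
else:
    # CPython fallback so the sketch can be tried on a desktop interpreter.
    def elapsed_us(fn):
        t0 = time.perf_counter_ns()
        fn()
        return (time.perf_counter_ns() - t0) // 1000


def del_add_cycle(n=100, iterations=1000):
    """Assumed workload: delete and re-add every key of an n-entry dict."""
    d = {i: i for i in range(n)}
    for _ in range(iterations):
        for i in range(n):
            del d[i]
            d[i] = i


def median_of(fn, runs=3):
    """Run fn `runs` times and return the median elapsed time in microseconds."""
    samples = sorted(elapsed_us(fn) for _ in range(runs))
    return samples[len(samples) // 2]


print(median_of(lambda: del_add_cycle(100, 10)))
```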

On-device speed on PYBD-SF6W (STM32F767, Cortex-M7), time.ticks_us(), 500 iterations on 50-entry dicts, compared branch firmware against master on same board:

Operation	Branch (us)	Master (us)	Change
create 50	423,988	431,240	-1.7%
lookup 50	1,527,215	1,582,656	-3.5%
iterate 50	432,025	417,258	+3.5%
del+add 50	1,126,618	1,869,892	-39.7%
globals 100K	413,621	426,807	-3.1%

On the MCU everything is within noise except delete+add cycles, which are ~40% faster, and iteration, which is ~3.5% slower.
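The Change column in both tables is the relative difference of the branch time against master. A small helper (the name `change_pct` is made up here) reproduces the on-device figures from the raw timings:

```python
def change_pct(branch_us, master_us):
    """Relative change of branch vs master in percent; negative means faster."""
    return (branch_us - master_us) / master_us * 100


# On-device rows from the PYBD-SF6W table above.
print(round(change_pct(1126618, 1869892), 1))  # -39.7 (del+add 50)
print(round(change_pct(432025, 417258), 1))    # 3.5 (iterate 50)
```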

Trade-offs and Alternatives

+320 bytes flash on STM32 Thumb-2, mostly the compaction path needed for heap-locked safety. Ports at CORE/BASIC ROM level can disable MICROPY_PY_MAP_ORDERED for zero cost.

OrderedDict no longer accepts unhashable keys (e.g. slices) since it now uses the hash table; this matches CPython behavior. MicroPython's old linear-scan OrderedDict accepted them as a side effect of not hashing. The slice_optimise.py test is updated to handle both paths.
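A quick probe for this behaviour change is sketched below. The helper name `accepts_key` is made up for the example, and it uses a list rather than a slice as the unhashable key because CPython 3.12 and later make slice objects hashable, so a slice would give different results depending on the interpreter version.

```python
from collections import OrderedDict


def accepts_key(key):
    """Return True if an OrderedDict can store the given key."""
    try:
        OrderedDict()[key] = None
    except TypeError:
        return False  # unhashable keys are rejected by the hash table
    return True


print(accepts_key("a"), accepts_key([1, 2]))  # True False
```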

Generative AI

I used generative AI tools when creating this PR, but a human has checked the code and is responsible for the description above.

github-actions · 2026-03-26T13:56:40Z

Code size report:

Reference:  esp32/boards/SEEED_XIAO_ESP32C6: Add new XIAO board definition. [2dc2e30]
Comparison: tests: Add ordered dict tests and benchmarks. [merge of c1310aa]
  mpy-cross:   +64 +0.017% 
   bare-arm:   +36 +0.064% 
minimal x86:  +128 +0.068% 
   unix x64:  +680 +0.079% standard
      stm32:  +336 +0.084% PYBV10
      esp32:  +376 +0.022% ESP32_GENERIC
     mimxrt:  +328 +0.085% TEENSY40
        rp2:  +496 +0.054% RPI_PICO_W
       samd:  +328 +0.119% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:  +612 +0.134% VIRT_RV32

andrewleech force-pushed the py-map-ordered branch 8 times, most recently from 740bcf8 to fb2071d Compare March 29, 2026 11:40

dpgeorge and others added 3 commits March 31, 2026 11:35

py/map: Convert map implementation to preserve insertion order.

fc9b550

Signed-off-by: Damien George <damien@micropython.org>

py/map: Add compaction, LIFO popitem, and OrderedDict simplification.

355e1d9

Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>

tests: Add ordered dict tests and benchmarks.

2197cfe

Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>

andrewleech force-pushed the py-map-ordered branch from fb2071d to 2197cfe Compare March 31, 2026 00:36

andrewleech wants to merge 3 commits into review/py-map-ordered from py-map-ordered
