fix: correct WAV decoder normalization for 16-bit and 24-bit PCM #177

Open: Yanhu007 wants to merge 1 commit into faiface:master from Yanhu007:fix/wav-decode-amplitude

Conversation

@Yanhu007

Fixes #176

Problem

The WAV decoder normalizes signed PCM samples using unsigned maximum values, producing audio at ~50% amplitude:

| Bit depth | Division | Range | Expected |
| --- | --- | --- | --- |
| 16-bit | `/ 65535` (`1<<16-1`) | [-0.5, 0.5] | [-1.0, 1.0] |
| 24-bit | `/ 256 / 16777215` | [-0.5, 0.5] | [-1.0, 1.0] |
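The half-amplitude effect can be reproduced with a short sketch. The function names `oldNormalize16` and `oldNormalize24` are hypothetical and only mirror the divisors listed in the table; they are not copied from the decoder source, and the 24-bit version assumes the sample sits in the upper 24 bits of an int32.

```go
package main

import "fmt"

// oldNormalize16 mirrors the 16-bit divisor from the table above (1<<16 - 1).
func oldNormalize16(s int16) float64 {
	return float64(s) / (1<<16 - 1)
}

// oldNormalize24 mirrors the 24-bit row, assuming the sample occupies the
// upper 24 bits of an int32 before the division by 256.
func oldNormalize24(packed int32) float64 {
	return float64(packed) / (1 << 8) / (1<<24 - 1)
}

func main() {
	// Full-scale inputs come out at roughly +/-0.5 instead of +/-1.0.
	fmt.Printf("16-bit max: %+.6f\n", oldNormalize16(32767))
	fmt.Printf("16-bit min: %+.6f\n", oldNormalize16(-32768))
	fmt.Printf("24-bit max: %+.6f\n", oldNormalize24(0x7fffff00))
	fmt.Printf("24-bit min: %+.6f\n", oldNormalize24(-0x80000000))
}
```

Every full-scale input lands within a tiny margin of ±0.5, which is the ~50% amplitude reported in #176.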

Fix

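Divide by the signed full-scale magnitude instead. For 24-bit, the divisor applies to the packed int32 value directly, replacing the `/ 256` step: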

| Bit depth | Division | Range |
| --- | --- | --- |
| 16-bit | `/ 32768` (`1<<15`) | [-1.0, ~1.0] |
| 24-bit | `/ 2147483648` (`1<<31`) | [-1.0, ~1.0] |

This matches the standard PCM normalization convention used by other audio libraries.
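As a minimal sketch of the corrected normalization, assuming little-endian sample bytes and a 24-bit sample packed into the upper 24 bits of an int32: the helpers `decode16` and `decode24` below are hypothetical names for illustration and are not the decoder's actual functions.

```go
package main

import "fmt"

// decode16 assembles one little-endian signed 16-bit sample and divides by
// 1<<15, so -32768 maps to exactly -1.0 and 32767 to just under +1.0.
func decode16(lo, hi byte) float64 {
	v := int16(uint16(lo) | uint16(hi)<<8)
	return float64(v) / (1 << 15)
}

// decode24 places the three little-endian bytes in the upper 24 bits of an
// int32, so the sample's sign bit becomes the int32 sign bit, then divides
// by 1<<31.
func decode24(b0, b1, b2 byte) float64 {
	v := int32(uint32(b0)<<8 | uint32(b1)<<16 | uint32(b2)<<24)
	return float64(v) / (1 << 31)
}

func main() {
	fmt.Println(decode16(0x00, 0x80))       // most negative 16-bit sample
	fmt.Println(decode16(0xff, 0x7f))       // most positive 16-bit sample
	fmt.Println(decode24(0x00, 0x00, 0x80)) // most negative 24-bit sample
	fmt.Println(decode24(0xff, 0xff, 0x7f)) // most positive 24-bit sample
}
```

Dividing the packed 24-bit value by `1<<31` is equivalent to shifting it down by 8 bits and dividing by `1<<23`, which is why the table lists a single divisor for that row.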

16-bit PCM samples (range [-32768, 32767]) were divided by 65535
(unsigned max) instead of 32768 (signed full-scale magnitude, 1<<15),
resulting in an output range of about [-0.5, 0.5] instead of [-1.0, 1.0].

24-bit PCM samples had the same issue: after packing into int32
and dividing by 256, the result was divided by 16777215 instead
of 8388608, again producing half amplitude.

Fix: divide 16-bit samples by (1<<15) and the packed 24-bit int32
value by (1<<31).

Fixes faiface#176
@neurlang

Concept ACK

However, at a minimum the edge cases must be checked, e.g. whether +32767 maps to +1.0 and whether -32767 or -32768 maps to -1.0.
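For reference, a quick sketch of how those edge cases behave under two candidate conventions. Dividing by 32768 is what the PR description proposes; the divide-by-32767-and-clamp variant is shown only for comparison and is not part of this PR. Both function names are hypothetical.

```go
package main

import "fmt"

// div32768 is the convention proposed in the PR description (1<<15).
func div32768(s int16) float64 {
	return float64(s) / 32768
}

// div32767Clamped is an alternative convention, not part of this PR:
// it scales by the positive maximum and clamps the one value below -1.0.
func div32767Clamped(s int16) float64 {
	v := float64(s) / 32767
	if v < -1 {
		v = -1
	}
	return v
}

func main() {
	for _, s := range []int16{32767, -32767, -32768} {
		fmt.Printf("%6d  /32768: %+.6f  /32767 clamped: %+.6f\n",
			s, div32768(s), div32767Clamped(s))
	}
}
```

With the divisor 32768, -32768 maps to exactly -1.0 while +32767 and -32767 land one step short of full scale (32767/32768 in magnitude). The clamped variant reaches exactly ±1.0 at ±32767, at the cost of folding -32768 onto -1.0 as well.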

Although we want this fix in neurlang/gomel ASAP, it may break other consumers that already expect the incorrect 50% volume and compensate for the bug with a 2x volume boost. It is highly recommended to release this only in a breaking-change version. Please also let us know before tagging it so we can coordinate the rollout on our side (if merged).


Development

Successfully merging this pull request may close these issues.

WAV Decoder Normalizes 16-bit and 24-bit Audio to Half Amplitude
