Skip to content

Fix truncating large workspace values#1195

Merged
langou merged 2 commits intoReference-LAPACK:masterfrom
jberg5:fix-f32-issue
Mar 25, 2026
Merged

Fix truncating large workspace values#1195
langou merged 2 commits intoReference-LAPACK:masterfrom
jberg5:fix-f32-issue

Conversation

@jberg5
Copy link
Copy Markdown
Contributor

@jberg5 jberg5 commented Feb 27, 2026

Description
Addresses #1194.

Quick summary: using REAL() workspace values truncated unrepresentable values to the nearest representable value for a f32, meaning most integers over 2^24 would be wrong. Downstream users of these values would underallocate, leading to segfaults.

Have tested by building and running the MRE in the linked issue above.

I also asked Claude (with Opus 4.6) to create a test suite to prevent these kinds of regressions going forward. That said, I do not know fortran!! The REAL/DBLE issue was easy enough for me to understand, but TESTING/EIG/test_wq_rwork.f is not something I've personally been able to review with any level of confidence. I'm happy to remove it if it's valueless slop.

Checklist

  • The documentation has been updated.
  • If the PR solves a specific issue, it is set to be closed on merge.

@langou
Copy link
Copy Markdown
Contributor

langou commented Mar 24, 2026

I also asked Claude (with Opus 4.6) to create a test suite to prevent these kinds of regressions going forward. That said, I do not know fortran!! The REAL/DBLE issue was easy enough for me to understand, but TESTING/EIG/test_wq_rwork.f is not something I've personally been able to review with any level of confidence. I'm happy to remove it if it's valueless slop.

I think this is a good idea but we are not ready for it. Can you remove please? Maybe break in two pull requests.

The REAL/DBLE is good and we are thankful.

TESTING/EIG/test_wq_rwork.f is a little much as of now.

@langou
Copy link
Copy Markdown
Contributor

langou commented Mar 24, 2026

With respect to the DBLE(), it might actually be even better to use DROUNDUP_LWORK().

See example of use in SRC/zgesdd.f for example.

And then instead of REAL(), (for the 32-bit subroutines,) we should probably use SROUNDUP_LWORK() wherever we use DROUNDUP_LWORK() (for their 64-bit counterparts).

In any case, DBLE() is certainly better than REAL().

Copy link
Copy Markdown
Collaborator

@mgates3 mgates3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks.

EXTERNAL ZHPGVD, ZHBGVD, ZSTEDC
*
* Test N values: 1000 (below 2^24 threshold), 3000 and 5000 (above)
DATA NVALS / 1000, 3000, 5000 /
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for n = 3000 or n = 5000, the LRWMIN = 1 + 5n + 2n^2 = 18 015 001 or 50 025 001, which are > 2^24 = 16 777 216. In both cases, rounding to float32 will truncate the lrwork:

>>> int( numpy.float32( 18015001 ) )
18015000
>>> int( numpy.float32( 50025001. ) )
50025000

@langou langou merged commit 2f098d0 into Reference-LAPACK:master Mar 25, 2026
12 checks passed
@jberg5
Copy link
Copy Markdown
Contributor Author

jberg5 commented Mar 25, 2026

Thanks @langou and @mgates3 ! Appreciate the reviews!

@langou for your comments in particular, are they worth a followup PR? Happy to remove the tests if needed, or try out DROUNDUP_LWORK.

@langou
Copy link
Copy Markdown
Contributor

langou commented Mar 25, 2026

I think it's good enough for now. The xROUNDUP_LWORK is not too critical and we need to think on whether we want to do a pass on the whole LAPACK and applies it consistently. It has been applied mostly in an ad hoc way to these routines that mostly need it. (E.g. 32-bit and n^2 workspace.) The check of the workspace calculation seems to work and it is good to have this in as something we should have. Everything can always be improved, but good enough on my end for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants