Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] In new FPU, division by NAN incorrectly sets NX and DZ fflags #1192

Open
1 task done
flaviens opened this issue Apr 17, 2023 · 11 comments
Open
1 task done

[BUG] In new FPU, division by NAN incorrectly sets NX and DZ fflags #1192

flaviens opened this issue Apr 17, 2023 · 11 comments
Assignees
Labels
Component:RTL For issues in the RTL (e.g. for files in the rtl directory) notCV32A65X It is not an CV32A65X issue PARAM:FPU Issue depends on the FPU parameter Status:New Newly created issue, nobody has looked at it yet. Type:Bug For bugs in the RTL, Documentation, Verification environment or Tool and Build system

Comments

@flaviens
Copy link
Contributor

flaviens commented Apr 17, 2023

Is there an existing CVA6 bug for this?

  • I have searched the existing bug issues

Bug Description

I create a new issue, similar to #1071, but this affects the new FPU and changed the fflags signature.

I discovered that with the new FPU, Ariane unexpectedly sets the NV (invalid operation) and the DZ flags when we call fdiv.s on floating-point operands that have the 32 lower bits as zeros but are NaNs.

Example snippet to reproduce:

  .section ".text.init","ax",@progbits
  .globl _start
  .align 2
_start:

  # Enable the FPU
  li t0, 0x2000
  csrs mstatus, t0
  csrw	fcsr,x0

  la t0, .fdata0
  la t1, .fdata1
  fld ft0, (t0)
  fld ft1, (t1)

  fdiv.s ft2, ft0, ft1
  csrr t2, fflags

infinite_loop:
  j infinite_loop

.section ".fdata0","ax",@progbits
  .8byte 0x49a3e80e00000000
.section ".fdata1","ax",@progbits
  .8byte 0xa81ad47700000000

From spike, fflags should be clear, but Ariane sets fflags=0x18, which corresponds to NV | DZ.
Let me know if I can help debug that.

Thanks!
Flavien

@flaviens flaviens added the Type:Bug For bugs in the RTL, Documentation, Verification environment or Tool and Build system label Apr 17, 2023
@MikeOpenHWGroup MikeOpenHWGroup added Component:RTL For issues in the RTL (e.g. for files in the rtl directory) Status:New Newly created issue, nobody has looked at it yet. labels Apr 19, 2023
@JeanRochCoulon JeanRochCoulon added the PARAM:FPU Issue depends on the FPU parameter label Nov 9, 2023
@JeanRochCoulon JeanRochCoulon added the notCV32A65X It is not an CV32A65X issue label Apr 6, 2024
@fly-1011
Copy link

Hello.

I have a question. It seems that the values of ft0 and ft1 are not NaNs. For a value to be considered NaN, all exponent bits must be set to 1, but it appears that ft0 and ft1 do not meet this requirement.

Moreover, if the value of fflags is 0x18, its corresponding binary representation should be 0001 1000, which should correspond to NV and DZ .

image

I look forward to your response. Thanks!

@flaviens
Copy link
Contributor Author

flaviens commented May 13, 2024

Hi @fly-1011 thank you for your comment! It's been more than 1 year so I don't remember details unfortunately. Can you observe a mismatch between CVA6 and Spike on this snippet in the first place? This will let us know if the reporting issue is in the description or in the snippet.

EDIT:
Regarding NaNs, here is what spike says:

core   0: 0x0000000000001000 (0x00000297) auipc   t0, 0x0
core   0: 0x0000000000001004 (0x02028593) addi    a1, t0, 32
core   0: 0x0000000000001008 (0xf1402573) csrr    a0, mhartid
core   0: 0x000000000000100c (0x0182b283) ld      t0, 24(t0)
core   0: 0x0000000000001010 (0x00028067) jr      t0
core   0: >>>>  _start
core   0: 0x0000000080000000 (0x000022b7) lui     t0, 0x2
core   0: 0x0000000080000004 (0x3002a073) csrs    mstatus, t0
core   0: 0x0000000080000008 (0x00301073) csrw    fcsr, zero
core   0: 0x000000008000000c (0x00004297) auipc   t0, 0x4
core   0: 0x0000000080000010 (0xff428293) addi    t0, t0, -12
core   0: 0x0000000080000014 (0x00004317) auipc   t1, 0x4
core   0: 0x0000000080000018 (0xff430313) addi    t1, t1, -12
core   0: 0x000000008000001c (0x0002b007) fld     ft0, 0(t0)
core   0: 0x0000000080000020 (0x00033087) fld     ft1, 0(t1)
(spike) freg 0 ft0
0xffffffffffffffff49a3e80e00000000
(spike) freg 0 ft1
0xffffffffffffffffa81ad47700000000
(spike) fregs 0 ft0
nan
(spike) fregs 0 ft1
nan

@fly-1011
Copy link

Thank you for your prompt reply. I apologize for my late response.

After verifying with the above program, the fflags is still 0x18, and it seems that this issue has not been fixed. However, the diagram above shows that 0x18 (0001 1000) should correspond to NV and DZ, not NV and NX.

Regarding the nan issue, my understanding is: In Spike debugging, freg shows the specific contents of the register, while fregs or fregd interprets the register's content in floating-point format. fregs corresponds to single precision, but ft0 and ft1 are double precision, which cannot be interpreted as valid floating-point numbers, resulting in nan. I have encountered the same situation when using Spike debugging before.
image

@flaviens
Copy link
Contributor Author

Hi @fly-1011 ,
Thanks for your response! Maybe I wrote NX instead of DZ above, I'm happy to edit this once this gets confirmed. Do you confirm you get 0x18 in CVA6?
What's your take-away regarding your last paragraph?

@fly-1011
Copy link

Hello,Thank you very much for your response! I have confirmed that the value of fflags in CVA is 0x18.
Below is the log information:

core   0: 0x0000000080002018 (0x186270d3) fdiv.s  ft1, ft4, ft6
3 0x0000000080002018 (0x186270d3) f 1 0xffffffff7fc00000
core   0: 0x000000008000201c (0x001023f3) csrrs   t2, fflags, zero
3 0x000000008000201c (0x001023f3) x 7 0x0000000000000018

@flaviens flaviens changed the title [BUG] In new FPU, division by NAN incorrectly sets NX and NV fflags [BUG] In new FPU, division by NAN incorrectly sets NX and DZ fflags May 14, 2024
@flaviens
Copy link
Contributor Author

Ok! I updated the issue accordingly, thank you for your comment.

@fly-1011
Copy link

Hi @flaviens ,
Thank you very much for the update on this issue; I have a few questions that I would like to discuss further.

  1. Neither ft0 nor ft1 is NaN.

According to the IEEE 754 standard regarding NaN, the exponent bits of ft0 and ft1 are not all ones, which does not meet the criteria for NaN. In the spike debugger, using fregs prints a NaN value because ft0 is a double-precision floating point number, while fregs is intended for single precision.
image

  1. This issue seems to be an incorrect setting of the DZ flag for zero divided by zero.

In cva6, single-precision instructions processing double-precision floating-point numbers will only handle the lower 32 bits. For more details, see: openhwgroup/cvfpu#118 (comment). The lower 32 bits of both ft0 and ft1 are zeros, which should indicate a zero divided by zero case.

Note: #1071 seems to be a similar situation,so I think it's not a bug.

@flaviens
Copy link
Contributor Author

flaviens commented May 14, 2024

Hard to say, it's been a while I haven't touched floating-point numbers. But then why do Spike and the other designs like Rocket not raise the same flags in this situation? Is there some permittivity in the RISC-V spec?

@fly-1011
Copy link

Based on my understanding, when double-precision floating-point numbers ft0 and ft1 are processed by single-precision instructions, they can either be treated as NaN or only the lower 32 bits can be considered. It appears that Spike treats them as NaN, whereas cva6 uses the lower 32 bits.

As explained by RISC-V spec, when D extension is used, f registers are widened to 64 bits and can hold either 32-bit or 64-bit floating point values.
When f registers are used with single-precision instructions (.s), only 32 LSBs are used.
With double-precision instructions (
.d), all 64 bits are used.

@flaviens
Copy link
Contributor Author

According to the RISC-V unprivileged spec "NaN Boxing of Narrower Values":

Floating-point compute and sign-injection operations calculate results based on the FLEN-bit values
held in the f registers. A narrow n-bit operation, where n < FLEN, checks that input operands
are correctly NaN-boxed, i.e., all upper FLEN−n bits are 1. If so, the n least-significant bits of the
input are used as the input value, otherwise the input value is treated as an n-bit canonical NaN.
An n-bit floating-point result is written to the n least-significant bits of the destination f register,
with all 1s written to the uppermost FLEN−n bits to yield a legal NaN-boxed value.

After a quick look at this I understand that ft0 and ft1 should be treated as 32-bit NaNs.

@fly-1011
Copy link

Thank you very much for your patient reply!👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component:RTL For issues in the RTL (e.g. for files in the rtl directory) notCV32A65X It is not an CV32A65X issue PARAM:FPU Issue depends on the FPU parameter Status:New Newly created issue, nobody has looked at it yet. Type:Bug For bugs in the RTL, Documentation, Verification environment or Tool and Build system
Projects
None yet
Development

No branches or pull requests

4 participants