Bug in FPU Coprocessor float division?

michal1
Posts: 3
Joined: Tue May 14, 2019 7:26 pm

Postby michal1 » Tue May 14, 2019 8:03 pm

Hello,

I have noticed that using the native FPU division instructions leads to unexpected results when dividing two floating-point numbers with significantly different exponents.

It seems that when the division operator is used in C code, gcc inserts a call to __divsf3(), which does not use the native FPU division instructions but relies on a software implementation, and gives the correct result. As I am working on computationally intensive DSP code, performance is really important to me and I would like to use the native instructions instead. However, using the FPU instruction sequence for division as documented in the ISA Reference Manual and implemented in divsf() (https://github.com/espressif/esp-idf/bl ... /test_fp.c) leads to wrong results (typically 0.0) when the two exponents differ a lot. According to the documentation, the FPU division should be IEEE compliant.

Example code:

float x = 24e9;
float y = 4;

printf("C: %.2f/%.2f = %.2f\n", x,y, x/y);
printf("ASM: %.2f/%.2f = %.2f\n", x,y, divsf(x,y));

x = 17;
y = 4;

printf("C: %.2f/%.2f = %.2f\n", x,y, x/y);
printf("ASM: %.2f/%.2f = %.2f\n", x,y, divsf(x,y));
Output:

C: 24000000000.00/4.00 = 6000000000.00
ASM: 24000000000.00/4.00 = 0.00
C: 17.00/4.00 = 4.25
ASM: 17.00/4.00 = 4.25

Has anybody come across this issue? Is this indeed a hardware bug in the FPU, or are there known limitations to the native FPU division?
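
To narrow down which exponent combinations fail, a small sweep against the C division operator can help. The following is only a minimal sketch that builds on any host or target with a C library: `count_div_mismatches`, `ref_div` and `always_zero` are hypothetical names introduced for illustration, and `always_zero` merely imitates the 0.0 symptom reported above. On the ESP32, the `divsf()` routine under test would be passed in place of the stand-ins.

```c
#include <math.h>
#include <stdio.h>

/* Signature shared by the routine under test and the reference. */
typedef float (*div_fn)(float, float);

/* Hypothetical stand-in: plain C division (the reference behaviour). */
static float ref_div(float a, float b) { return a / b; }

/* Hypothetical stand-in that imitates the reported symptom (always 0.0). */
static float always_zero(float a, float b) { (void)a; (void)b; return 0.0f; }

/*
 * Sweep numerators 1.5 * 2^e for e in [-60, 60] against denominators
 * 2^k for k in [-2, 2], and count results whose relative error against
 * the C operator exceeds rel_tol. Set verbose to print each mismatch.
 */
static int count_div_mismatches(div_fn fn, float rel_tol, int verbose)
{
    int bad = 0;
    for (int e = -60; e <= 60; e++) {
        float a = ldexpf(1.5f, e);
        for (int k = -2; k <= 2; k++) {
            float b = ldexpf(1.0f, k);
            float expect = a / b;
            float got = fn(a, b);
            if (fabsf(got - expect) > rel_tol * fabsf(expect)) {
                if (verbose) {
                    printf("%g / %g: got %g, expected %g\n",
                           (double)a, (double)b, (double)got, (double)expect);
                }
                bad++;
            }
        }
    }
    return bad;
}
```

Calling `count_div_mismatches(divsf, 1e-6f, 1)` would then list the operand pairs for which the assembly routine disagrees with `x / y`, which makes it easier to see whether the failures depend on the numerator exponent, the denominator exponent, or their difference.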

Thank you very much!

michal1
Posts: 3
Joined: Tue May 14, 2019 7:26 pm

Re: Bug in FPU Coprocessor float division?

Postby michal1 » Wed May 15, 2019 4:33 pm

It turns out there is a bug in the divsf() implementation in https://github.com/espressif/esp-idf/bl ... /test_fp.c

I believe the correct implementation should be:

float divsf(float a, float b)
{
    float result;
    asm volatile (
        "wfr f0, %1\n"
        "wfr f1, %2\n"
        "div0.s f3, f1 \n"
        "nexp01.s f4, f1 \n"
        "const.s f5, 1 \n"
        "maddn.s f5, f4, f3 \n"
        "mov.s f6, f3 \n"
        "mov.s f7, f1 \n"
        "nexp01.s f8, f0 \n"
        "maddn.s f6, f5, f3 \n"
        "const.s f5, 1 \n"
        "const.s f2, 0 \n"
        "neg.s f9, f8 \n"
        "maddn.s f5,f4,f6 \n"
        "maddn.s f2, f9, f3 \n" /* Original was "maddn.s f2, f0, f3 \n" */
        "mkdadj.s f7, f0 \n"
        "maddn.s f6,f5,f6 \n"
        "maddn.s f9,f4,f2 \n"
        "const.s f5, 1 \n"
        "maddn.s f5,f4,f6 \n"
        "maddn.s f2,f9,f6 \n"
        "neg.s f9, f8 \n"
        "maddn.s f6,f5,f6 \n"
        "maddn.s f9,f4,f2 \n"
        "addexpm.s f2, f7 \n"
        "addexp.s f6, f7 \n"
        "divn.s f2,f9,f6\n"
        "rfr %0, f2\n"
        :"=r"(result):"r"(a), "r"(b)
    );
    return result;
}

Another question: why does gcc use __divsf3() by default when the performance of the native FPU instructions is significantly better?
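
For readers unfamiliar with this style of division, the instruction sequence above appears to follow a common pattern: take an approximate reciprocal, refine it with multiply-add steps, and handle the exponents separately. The sketch below shows that general pattern in portable C using a Newton-Raphson reciprocal. It is an assumption-laden analogue, not a description of what the Xtensa instructions actually compute: `nr_div_sketch` is a hypothetical name, the seed constants are the usual textbook linear estimate, and the exact instruction semantics are in the ISA reference.

```c
#include <math.h>

/*
 * Host-side sketch only. Not correctly rounded, and it does not handle
 * b == 0, infinities or NaN. It is meant to show why the exponents are
 * split off before refinement and restored afterwards.
 */
static float nr_div_sketch(float a, float b)
{
    int ea, eb;
    float ma = frexpf(a, &ea);         /* a   = ma * 2^ea, 0.5 <= |ma| < 1 */
    float mb = frexpf(fabsf(b), &eb);  /* |b| = mb * 2^eb, 0.5 <= mb  < 1 */

    /* Linear seed for 1/mb on [0.5, 1). */
    float r = 48.0f / 17.0f - (32.0f / 17.0f) * mb;

    /* Newton-Raphson refinement: r <- r + r * (1 - mb * r). */
    for (int i = 0; i < 3; i++) {
        float err = 1.0f - mb * r;
        r = r + r * err;
    }

    /* Quotient of the mantissas, with the sign of b applied. */
    float q = ma * r;
    if (b < 0.0f) {
        q = -q;
    }

    /* Restore the exponent difference that was split off at the start. */
    return ldexpf(q, ea - eb);
}
```

In a scheme like this, a mistake in which operand is normalized shows up only when the exponents are far apart, which is consistent with 17/4 working while 24e9/4 did not.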

ESP_Angus
Posts: 2176
Joined: Sun May 08, 2016 4:11 am

Re: Bug in FPU Coprocessor float division?

Postby ESP_Angus » Thu May 16, 2019 2:15 am

Thanks for pointing this out, michal.

EDIT: The libgcc __divsf3() implementation in the toolchain uses FPU registers. IDF is accidentally linking the version in ROM which does not use FPU registers. We'll fix this so that you get the FPU version when building a project that uses floating point division.

EDIT 2: Fix is in internal review now.

simap2000
Posts: 1
Joined: Sun Jan 10, 2016 8:59 pm

Re: Bug in FPU Coprocessor float division?

Postby simap2000 » Sat May 09, 2020 4:13 pm

Hello!
This post has been most helpful! I haven't tested it thoroughly, but this division implementation is much faster than what I get from the default division.

I poked around the esp-idf GitHub repository, and it looks like it's still using the rom version that is slow. What happened to the fix? Should I file a GitHub issue?

ESP_Angus
Posts: 2176
Joined: Sun May 08, 2016 4:11 am

Re: Bug in FPU Coprocessor float division?

Postby ESP_Angus » Sun May 10, 2020 11:22 pm

Hi simap2000,

Sorry, this post should have been updated when the fix was merged. The fix was merged in this commit and should be included in ESP-IDF v4.0 and newer.

Can you please give some more details about what you're seeing? Specifically: Which ESP-IDF version are you using, and what is the reason for saying "it looks like it's still using the rom version that is slow"?

Thanks,

Angus
