floating point issue in nbfloatprint.cpp

ephogy
Posts: 30
Joined: Fri Aug 29, 2008 12:53 pm

floating point issue in nbfloatprint.cpp

Post by ephogy »

Occasionally, one of the tasks that generates a string gets stuck. I finally had a development version of the software running with TaskMonitor running.

The code that seems to break it is a relatively simple call to snprintf(buffer, sizeof(buffer), "%f", some_float).
Since this is hard to reproduce, I have no idea what the value of some_float is, but it *should* be within the range (-0.1, 0.1), and can be extremely close to 0.

Multiple requests using TaskScan seem to indicate that the code is stuck in nbfloatprint.cpp at lines 83-86:

Code: Select all

while ( fi > 0xFFFFFFFFFFFFFFFFULL )
{
    l++;
    fi /= 10.0;
}
It appears as though the operation fi /= 10.0; is causing the gcc-compiled code to throw an internal exception; the while loop doesn't handle the exception gracefully, and the task gets stuck in an infinite loop.

The software is running on a MOD5272 with NNDK 2.7. Internally we use our own implementation of floats, as it is 2-3x faster than gcc's implementation of IEEE754; when something like printf is required, the internal floating point representation is converted to the IEEE754 representation.

I'm guessing that I'll need to modify this loop somehow, but I'd like to defer to someone with perhaps a little more experience than me.

Cheers,
Keith
ephogy
Posts: 30
Joined: Fri Aug 29, 2008 12:53 pm

Re: floating point issue in nbfloatprint.cpp

Post by ephogy »

I believe I've solved my problem, or at least I see what is going on now...

In nbfloatprint.cpp, the function TheFloatPrintf simply doesn't handle Infinity (properly?).

In my main code,

Code: Select all

    uint64_double infTest;
    printf("Infinity (bit pattern, float value):\r\n");

    infTest.d = nan("0");
    printf ("\tnan: %llX, %f\r\n", infTest.u, infTest.d);

    infTest.d = infinity();
    printf ("\tpos: %llX, %f\r\n", infTest.u, infTest.d);

    infTest.d = -infinity();
    printf ("\tneg: %llX, %f\r\n", infTest.u, infTest.d);
Serial port response:

    EPWaiting 2sec to start 'A' to abort
    Configured IP = 192.168.1.230
    Configured Mask = 255.255.255.0
    MAC Address= 00:03:f4:09:39:51

    Application started
    Infinity (bit pattern, float value):
    nan: 7FF8000000000000, NAN
    pos: 7FF0000000000000, I?Waiting 2sec to start 'A' to abort
Is this a known problem? I don't think I'm doing anything strange in my code here or afterwards; with these lines removed, the application runs fine.
TomNB
Posts: 538
Joined: Tue May 10, 2016 8:22 am

Re: floating point issue in nbfloatprint.cpp

Post by TomNB »

Exactly which 2.7.x release are you using?
Can you attach a minimal main.cpp using only standard data types that exhibits the problem that we can build and test?
ephogy
Posts: 30
Joined: Fri Aug 29, 2008 12:53 pm

Re: floating point issue in nbfloatprint.cpp

Post by ephogy »

Hi Tom,

My mistake. It's actually NNDK 2.8.7 that I'm using.

I have the install for NNDK 2.6, but it looks like this section of the kernel has gone through some heavy modification since my 2.6.0 code base; e.g., none of the files I've listed above exist in it.

In the above code, uint64_double is just a union of unsigned long long and double so I could print out the hex value. It's not needed;

Code: Select all

printf("%f\r\n", infinity());
should produce the same problem. Occasionally it works, but instead of printing Inf to the screen, you get some crazy number.
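For anyone who wants to inspect the bit patterns without going through the float formatter at all, here is a minimal sketch. The helper names double_bits and print_special_values are made up for illustration and are not part of NNDK; the sketch uses memcpy rather than the union, on the assumption that double is a 64-bit IEEE754 value.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <limits>

// Hypothetical helper (not an NNDK function): return the raw bit pattern
// of a double. memcpy avoids relying on union type punning.
std::uint64_t double_bits(double d)
{
    static_assert(sizeof(std::uint64_t) == sizeof(double), "expects a 64-bit double");
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

// Print only the hex patterns, so no "%f" conversion is ever performed.
void print_special_values()
{
    const double inf = std::numeric_limits<double>::infinity();
    const double qnan = std::numeric_limits<double>::quiet_NaN();
    std::printf("\tnan: %016llX\r\n", static_cast<unsigned long long>(double_bits(qnan)));
    std::printf("\tpos: %016llX\r\n", static_cast<unsigned long long>(double_bits(inf)));
    std::printf("\tneg: %016llX\r\n", static_cast<unsigned long long>(double_bits(-inf)));
}
```

Calling print_special_values() from the application prints the three patterns using integer formatting only, which should help separate "the value is wrong" from "the float formatter hangs".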

I added:

Code: Select all

    if ( isinf( d ) )
    {
        char buf[4] = {'-', 'I', 'n', 'f'};
        char *bufp = buf;
        if (sign)
        {
            pfs.len = 4;
        }
        else
        {
            pfs.len = 3;
            bufp++;
        }

        prespace( pfs.width, pfs.flags, pfs.len, pf, data, pfs.nsent );
        pf( data, bufp, pfs.len );
        pfs.nsent += pfs.len;
        ret = postspace( pfs.width, pfs.flags, pfs.len, pf, data, pfs.nsent );
        return ret;
    }
at line 512 of nbfloatprint.cpp (just below the isnan conditional), and this resolved my issue.

Keith
TomNB
Posts: 538
Joined: Tue May 10, 2016 8:22 am

Re: floating point issue in nbfloatprint.cpp

Post by TomNB »

Hello,

If I follow what you have described with the program below, nothing prints out at all. To be honest, I have not found anyone here, or anything by searching Google, that explains what I would use the infinity() function for. So you are saying you can run a program like the one below in 2.8.7 and get varying results? Important: with no modification to how floats are handled.

void UserMain(void * pd) {
    InitializeStack();
    GetDHCPAddressIfNecessary();
    OSChangePrio(MAIN_PRIO);
    EnableAutoUpdate();
    StartHTTP();
    EnableTaskMonitor();

    #ifndef _DEBUG
    EnableSmartTraps();
    #endif

    iprintf("Application started\n");
    while (1)
    {
        printf("Infinity: %f\r\n", infinity());
        OSTimeDly(TICKS_PER_SECOND);
    }
}
ephogy
Posts: 30
Joined: Fri Aug 29, 2008 12:53 pm

Re: floating point issue in nbfloatprint.cpp

Post by ephogy »

Hi Tom,

I recompiled the code you've listed above with just a very minor change (no web server):

Code: Select all

#include <stdio.h>
#include <autoupdate.h>
#include <ip.h>
//#include <http.h>
#include <dhcpclient.h>
#include <taskmon.h>
#include <smarttrap.h>
#include <math.h>

void UserMain(void * pd)
{
    InitializeStack();
    GetDHCPAddressIfNecessary();
    OSChangePrio(MAIN_PRIO);
    EnableAutoUpdate();
    //StartHTTP();
    EnableTaskMonitor();

    #ifndef _DEBUG
    EnableSmartTraps();
    #endif

    iprintf("Application started\n");
    while (1)
    {
        printf("Infinity: %g\r\n", infinity());
        OSTimeDly(TICKS_PER_SECOND);
    }
}
This is the output:

Waiting 2sec to start 'A' to abort
Configured IP = 192.168.1.230
Configured Mask = 255.255.255.0
MAC Address= 00:03:f4:09:39:51
Application started

No modifications to the kernel.
The function infinity() is not the important part; the point is that the infinity case is NOT handled in nbfloatprint.cpp. The NaN case, however, IS handled.

When a float is printed to the screen, there are several functions that have while loops with no error checks:

Code: Select all

int Get_f_len(double d, pfstate & pfs)
...
    while ( fi > 0xFFFFFFFFFFFFFFFFULL )
    {
        l++;
        fi /= 10.0;
    }
...

int OutputFFloat(PutCharsFunction * pf, void * data, double d, pfstate & pfs)
...
        while ( fi > 0xFFFFFFFFFFFFFFFFULL )
        {
            fi /= 10.0;
            exp++;
        }
...

int OutputEFloat(PutCharsFunction * pf, void * data, double d, pfstate & pfs, char e_or_E)
...
    while ( d >= 10.0 )
    {
        d /= 10.0;
        exp++;
    }
...
Since no check for infinity is done before these calls are made, gcc throws an internal exception when fi (or d) is divided by 10.0, and fi and d stay at infinity, hence the loops run forever.

Handling the Inf case in the same location where the NaN case is currently handled solves the problem. This missing handling of infinities exists in NNDK 2.8.1, 2.8.5 and 2.8.7 (though I assume it has existed since these files were introduced).

In my case, the software is performing logarithm calculations and at some point tries to take logf(0.0), which results in -Infinity, and when printing this back to a web page, the HTTP task gets stuck in this infinite loop.
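On an unpatched library, one application-side workaround is to check the result of the logarithm before it ever reaches a "%f" conversion. A minimal sketch, assuming IEEE754 behaviour (the log of 0.0 is negative infinity, the log of a negative number is NaN); checked_logf is a hypothetical helper, not an NNDK or libm function:

```cpp
#include <cmath>

// Hypothetical helper (not an NNDK or libm function): take the natural log
// and report whether the result is finite, i.e. safe to format with "%f".
bool checked_logf(float x, float &result)
{
    result = std::log(x);           // log(0.0f) is -infinity, log(-1.0f) is NaN
    return std::isfinite(result);   // false for +/-infinity and NaN
}
```

The caller can then print a placeholder such as "n/a" when checked_logf returns false, instead of passing the value to printf.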
TomNB
Posts: 538
Joined: Tue May 10, 2016 8:22 am

Re: floating point issue in nbfloatprint.cpp

Post by TomNB »

Hello ephogy,

I ran all this by engineering, and they said it was a really good find on your part. Thank you very much for sharing all the details. The fix they came up with and tested is a change in nbfloatprint.cpp: in the function TheFloatPrintf(), add another check for INF under the one for NAN. In the 2.8.7 file I am looking at, it is at line 512.

So before there was only:

if ( isnan( d ) )
{
    char buf[3] = {'N', 'A', 'N'};
    pfs.len = 3;
    prespace( pfs.width, pfs.flags, pfs.len, pf, data, pfs.nsent );
    pf( data, buf, 3 );
    pfs.nsent += 3;
    ret = postspace( pfs.width, pfs.flags, pfs.len, pf, data, pfs.nsent );
    return ret;
}

Copy that block of code and paste it underneath. Then change isnan to isinf and the N A N characters to I N F:

if ( isinf( d ) )
{
    char buf[3] = {'I', 'N', 'F'};
    pfs.len = 3;
    prespace( pfs.width, pfs.flags, pfs.len, pf, data, pfs.nsent );
    pf( data, buf, 3 );
    pfs.nsent += 3;
    ret = postspace( pfs.width, pfs.flags, pfs.len, pf, data, pfs.nsent );
    return ret;
}

So now there are two checks, one for NAN and one for INF. I have run this also on my release and the previous code that hung on INF now works properly.
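Note that this INF check prints the same three characters for positive and negative infinity. If the sign matters, the selection of the text can be factored out as in the sketch below. This is a standalone illustration, not the NNDK code: special_float_text is a hypothetical helper, and the real function would still pass the returned pointer and length to prespace(), pf() and postspace() as shown above.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper (not NNDK code): choose the text for a non-finite
// value. Returns nullptr for finite values; otherwise returns the text and
// sets len to the number of characters to emit.
const char *special_float_text(double d, std::size_t &len)
{
    if (std::isnan(d))
    {
        len = 3;
        return "NAN";
    }
    if (std::isinf(d))
    {
        static const char text[] = "-INF";
        if (std::signbit(d))
        {
            len = 4;
            return text;        // "-INF"
        }
        len = 3;
        return text + 1;        // skip the sign: "INF"
    }
    len = 0;
    return nullptr;             // finite: use the normal formatting path
}
```

The NaN test comes first, as in the existing code, so the infinity branch only ever sees a true infinity.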
ephogy
Posts: 30
Joined: Fri Aug 29, 2008 12:53 pm

Re: floating point issue in nbfloatprint.cpp

Post by ephogy »

Thanks Tom,

Glad I could help get to the bottom of this.

Cheers,
Keith
pbreed
Posts: 1080
Joined: Thu Apr 24, 2008 3:58 pm

Re: floating point issue in nbfloatprint.cpp

Post by pbreed »

That was a nice bug find.
Thanks for digging to the bottom of this...