Release 2.8.5

Discussion to talk about software related topics only.
sulliwk06
Posts: 118
Joined: Tue Sep 17, 2013 7:14 am

Re: Release 2.8.5

Post by sulliwk06 »

Actually it looks like your bytes are shifted left by 2, not an endian problem

If i was an endian problem, you would see this:
300 (0x0000012C) -> 738263040 (0x2C010000)

This is a 2 byte left shift:
300 (0x0000012C) -> 19660800 (0x012C0000)


Maybe something in your structure got sized differently in the 2 builds that led to a 2 byte offset?
SeeCwriter
Posts: 608
Joined: Mon May 12, 2008 10:55 am

Re: Release 2.8.5

Post by SeeCwriter »

Ah, good point. It's same source code. The only difference is the declaration of IPADDR. I've done a sizeof on IPADDR in both builds and they are the same size. I'll do more investigating.
SeeCwriter
Posts: 608
Joined: Mon May 12, 2008 10:55 am

Re: Release 2.8.5

Post by SeeCwriter »

I found the problem with my structure between the 2 versions of the compiler.

Below is a section of the structure from both compilers. Everything is fine up to multi_band. Then 2.7.6 inserts a pad byte after multi_band so that ntp_server falls on an even boundary. But 2.8.5 does not. So the structures are off by one byte at this point. It continues until use_multicast, where 2.7.6 inserts another pad byte, and 2.8.5 does not. Now the structures are out of sync by 2-bytes. And that’s the way it continues to the end of the structure.

Code: Select all

v2.7.6
           Member      Size  Offset
       rf_subband_frq   16   894
           multi_band    1   910
           ntp_server    4   912
           futureuse1   12   916
           ip_address    4   928
           futureuse2   12   932
              ip_mask    4   944
           futureuse3   12   948
           ip_gateway    4   960
           futureuse4   12   964
         ip_multicast    4   976
           futureuse5   12   980
        use_multicast    1   992
         amp_cfg_mask    2   994
          vtune_level    2   996

Last offset = 1190

Structure size = 1192

======================================

v2.8.5
            Member     Size  Offset
       rf_subband_frq   16   894
           multi_band    1   910
           ntp_server    4   911
           futureuse1   12   915
           ip_address    4   927
           futureuse2   12   931
              ip_mask    4   943
           futureuse3   12   947
           ip_gateway    4   959
           futureuse4   12   963
         ip_multicast    4   975
           futureuse5   12   979
        use_multicast    1   991
         amp_cfg_mask    2   992
          vtune_level    2   994

Last offset = 1188

Structure size = 1192
It looks to me like 2.7.6 is correct, and something happened with 2.8.x.
User avatar
pbreed
Posts: 1081
Joined: Thu Apr 24, 2008 3:58 pm

Re: Release 2.8.5

Post by pbreed »

If you are using structures between different software, I always include the packed directive...

If you DON'T use packed there is no guarantee what the structure layout/padding will be...

typedef struct
{
.
.
} __attribute__( ( packed ) ) MyStructType;
SeeCwriter
Posts: 608
Joined: Mon May 12, 2008 10:55 am

Re: Release 2.8.5

Post by SeeCwriter »

Who doesn't upgrade their compiler version over time? I have at least 4 different compilers installed, and I get updates all the time, and I've never had to pack a structure to guarantee a structure layout would stay the same from version to version.

IMO, the way GNU 5.2 changed how the structure is laid out is a bug. The layout isn't even consistent within the same structure with the same compiler version. Up until the point that it deviated, every structure member of 1-byte in size that was followed by an integer had a pad byte inserted so that the integer fell on an even address. And then all of a sudden it breaks the pattern and decides to not do that, plus it adds 2-byes at the end of the structure so that the structure size stays the same as that produced by the previous version? That is a bug.
User avatar
dciliske
Posts: 623
Joined: Mon Feb 06, 2012 9:37 am
Location: San Diego, CA
Contact:

Re: Release 2.8.5

Post by dciliske »

Unfortunately, no. Trust me when I say this, it feels like a betrayal of the compiler for it to do this, but the C *and* C++ language specs do not place any restrictions on how a compiler packs elements within a structure. Therefore, the compiler is completely free to place the members where it chooses. Obviously, the 5.2 compiler is a little more aggressive with packing the data than 4.2.1.

However, as Paul mentioned, GCC (and pretty much every other C/C++ compiler) provide a way to ensure that no silent padding is added into the structure (the 'packed' attribute, or #pragma pack). As a rule, any structure that may be passed in binary form between two separate applications should have an '__attribute__( ( packed ) )' attached to the definition.
Dan Ciliske
Project Engineer
Netburner, Inc
SeeCwriter
Posts: 608
Joined: Mon May 12, 2008 10:55 am

Re: Release 2.8.5

Post by SeeCwriter »

I may be splitting hairs, but we aren't passing the structure between applications. We have an application who's current release was compiled with 2.7.6. We then wanted to start implementing IPv6, so we compiled the application with 2.8.5. The structure holds operating parameters of the unit that are stored in EEPROM. When the application boots it copies the structure from EEPROM to sram. But now, a portion of the data is out of alignment and filled with bad data. This isn't a problem with new units that have only had firmware built with 2.8.5. But it makes field upgrades very difficult.

While this is annoying, it is not the cause of the random reboots/crashes I am experiencing. I am no closer today than I was nearly 4-months ago when this was first noticed.

I've had the debugger running on one unit for just over two weeks now with no crashes. Other units continue to crash at random times and intervals (days to weeks), and none provide any useful information from the debug port.
ephogy
Posts: 30
Joined: Fri Aug 29, 2008 12:53 pm

Re: Release 2.8.5

Post by ephogy »

SeeCwriter wrote:I may be splitting hairs, but we aren't passing the structure between applications. We have an application who's current release was compiled with 2.7.6. We then wanted to start implementing IPv6, so we compiled the application with 2.8.5. The structure holds operating parameters of the unit that are stored in EEPROM. When the application boots it copies the structure from EEPROM to sram. But now, a portion of the data is out of alignment and filled with bad data. This isn't a problem with new units that have only had firmware built with 2.8.5. But it makes field upgrades very difficult.
In the 2.8.5 builds, have you considered simply adding __attribute__((aligned(2))) to your structure definition?
It looks as though the alignment is set to 1 for the structure in 2.8.5, whereas it was 2 for the 2.7.6 branch. This should solve loading the values from EEPROM when updating the 2.7.6 branch to the 2.8.5 branch.

If you have any other structures that have (unsigned) chars in them, they may also now be back to 1 byte offsets. Is this potentially causing you grief elsewhere?
User avatar
dciliske
Posts: 623
Joined: Mon Feb 06, 2012 9:37 am
Location: San Diego, CA
Contact:

Re: Release 2.8.5

Post by dciliske »

While indeed, we are splitting hairs, from the context of guarantees, two different compilations of the same source with *any* change in the build tree (which includes the compiler and libraries), must be considered two different applications.

Uh... How many physical network interfaces exist for the device that crashes and are you running multicast?
Dan Ciliske
Project Engineer
Netburner, Inc
SeeCwriter
Posts: 608
Joined: Mon May 12, 2008 10:55 am

Re: Release 2.8.5

Post by SeeCwriter »

The first member of the structure has __attribute__((aligned(2))) as part of it's declaration. Is that what you mean?

We have one network interface. We aren't using multicast at this time. We use broadcasts, twice a second.
Post Reply