I see code like "__attribute__((packed))" and "__attribute__((aligned(4)))" occasionally. Can someone explain exactly what these things do and when they are required or not required? Is there any documentation? Are there speed advantages to keeping things aligned? In the case of 32-bit objects (e.g. int), is there a reason to align them so they are read or written atomically?
For example, why the aligned in:
DWORD MyTaskStack[MY_TASK_STACK_SIZE]__attribute__((aligned( 4 )));
I have structs that combine objects of type byte, char, word, int, dword, double, time_t, etc. I access the objects by name, not by pointers offset from the first object (except for the elements of arrays of a given type). Do I need to worry about alignment? I am usually not trying to write the structs as binary objects and therefore generally don't care how big they are as a whole. If I needed to figure out an offset to a member in a struct, or the size of an entire struct, why wouldn't I use offsetof() and sizeof() instead of worrying about (and forcing) alignment?
Thanks in advance.
__attribute__((packed)) and __attribute__((aligned(4)))
-
- Posts: 513
- Joined: Sat Apr 26, 2008 7:14 am
Re: __attribute__((packed)) and __attribute__((aligned(4)))
See details at "GCC online documentation" http://gcc.gnu.org/onlinedocs
The compiler is smart enough to align each variable (or structure field) reasonably by default.
Sometimes you wish to explain to the compiler what you would rather save: RAM for data, or CPU cycles.
CPU cycles are saved when the alignment assigned by the compiler matches the alignment the hardware prefers.
For example, a 32-bit variable that isn't aligned on a 4-byte boundary may need two memory accesses (or extra instructions) instead of one, depending on the CPU.
You can arrange alignment by trusting the compiler default, with attributes or pragmas, or by carefully chosen offsets. It's a matter of convenience.
Fix: "you wish to explain to the compiler" instead of the previously mistyped "you can explain to the compiler".
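As a rough illustration of that RAM-versus-cycles trade-off, here is a sketch in C using the GCC attributes from the question. The struct names, the array and the two helper functions are made up for the example, and the default sizes depend on the target and compiler settings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example types, not taken from the thread. */
struct DefaultLayout {
    uint8_t  flag;    /* 1 byte */
    uint32_t value;   /* the compiler normally inserts padding before this */
};

struct PackedLayout {
    uint8_t  flag;
    uint32_t value;   /* no padding: may sit at an odd address */
} __attribute__((packed));

/* With GCC, packing removes all padding: 1 + 4 bytes. */
static_assert(sizeof(struct PackedLayout) == 5, "packed struct should be 5 bytes");

/* Array forced onto a 4-byte boundary, like the task stack in the question. */
static uint32_t ExampleStack[64] __attribute__((aligned(4)));

/* Number of padding bytes the compiler put before 'value' by default. */
size_t default_padding(void)
{
    return offsetof(struct DefaultLayout, value) - sizeof(uint8_t);
}

/* 1 if the array starts on a 4-byte boundary, 0 otherwise. */
int stack_is_aligned(void)
{
    return ((uintptr_t)ExampleStack % 4u) == 0;
}
```

The packed struct saves RAM (5 bytes instead of, typically, 8 on targets that align a 32-bit integer to 4 bytes), but access to its value member may be slower because it is no longer aligned.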
Last edited by yevgenit on Fri Jun 04, 2010 8:48 pm, edited 2 times in total.
Yevgeni Tunik
Embedded/RealTime software engineer
https://www.linkedin.com/in/yevgenitunik/
________________________
-
- Posts: 513
- Joined: Sat Apr 26, 2008 7:14 am
Re: __attribute__((packed)) and __attribute__((aligned(4)))
Yevgenit: thanks for the GNU link. I still have some specific questions to be sure I get it. Let's say I have a struct:
struct TemperatureData_t {
BYTE TempSensorStatus;
BYTE TempSensorError;
double Temperature;
};
//An array of these sensor structs
TemperatureData_t TemperatureSensors[ NUM_SENSORS ];
How do I ensure that accesses to "TemperatureSensors[ n ].Temperature" read or write the "Temperature" double in one atomic operation? Would I say:
struct TemperatureData_t {
BYTE TempSensorStatus;
BYTE TempSensorError;
double Temperature __attribute__(( aligned(4) ));
};
where the aligned(4) puts the double "Temperature" on a 32 bit boundary by adding two bytes of padding after the first two declared BYTEs?
Re: __attribute__((packed)) and __attribute__((aligned(4)))
If you don't say packed, your example structure will likely look like:
struct TemperatureData_t {
BYTE TempSensorStatus;
BYTE TempSensorError;
BYTE pad[2];
double Temperature;
};
You can always tell whether there is padding: what does sizeof(TemperatureData_t) return?
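One way to answer that question without guessing is to let the compiler report the layout. The sketch below assumes BYTE is an unsigned 8-bit type (the typedef is a stand-in for the platform's own) and adds two hypothetical helper functions. The amount of padding it reports depends on the target's alignment rules for double.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char BYTE;   /* assumption: stand-in for the platform's BYTE */

struct TemperatureData_t {
    BYTE   TempSensorStatus;
    BYTE   TempSensorError;
    double Temperature;       /* the compiler pads before this as needed */
};

/* A struct is never smaller than the sum of its members. */
static_assert(sizeof(struct TemperatureData_t) >= 2 * sizeof(BYTE) + sizeof(double),
              "unexpected struct size");

/* Bytes of padding between the two BYTEs and the double. */
size_t padding_before_temperature(void)
{
    return offsetof(struct TemperatureData_t, Temperature) - 2 * sizeof(BYTE);
}

/* Bytes of padding anywhere in the struct (interior plus trailing). */
size_t total_padding(void)
{
    return sizeof(struct TemperatureData_t) - (2 * sizeof(BYTE) + sizeof(double));
}
```

On a target that aligns double to 4 bytes, padding_before_temperature() would return 2, matching the pad[2] shown above; on one that aligns double to 8 bytes it would return 6.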
Unless you are using a dual-port RAM where something else can read the data, writes of 32 bits or less should be monolithic.
The double is slightly problematic because it is 64 bits in size, so it is not necessarily written or read in a monolithic way.
It's possible that an interrupt or other task switch (actually caused by an interrupt) could split a write to a
double into two parts. This is not as bad as it seems, because the sign, the exponent and the top 20 bits of the mantissa
are all in one monolithic part. The worst possible error in a double from having the write split and read by something else is as follows:
There are the following 4 possibilities:
1) You read before the update.
Both parts are the old value, so no error: result == old value.
2) You read in the middle of the update (really bad coincidental timing).
Top part of the old value with bottom part of the new value.
This differs from the old value by a relative error of less than 1/2^20, or roughly 0.000095 %.
or
3) You read in the middle of the update the other way round. In reality you would only get error 2 or error 3, not both; I just don't know enough to tell you which one you would get.
Top part of the new value with bottom part of the old value.
This differs from the new value by a relative error of less than 1/2^20, or roughly 0.000095 %.
4) Finally, you read both parts as the new value; again no error.
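The size of that error can be checked with a small experiment. The sketch below is not from the thread and its helper names are hypothetical. It assumes an IEEE 754 64-bit double whose top 32 bits hold the sign, the 11-bit exponent and the top 20 of the 52 mantissa bits. The bottom 32 bits then carry less than 2^-20 of the value, so a torn read stays within that relative distance of whichever value supplied the top half (for normal, non-zero values).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "sketch assumes a 64-bit double");

/* Build the value a reader would see if it got the top 32 bits of
   'top_src' and the bottom 32 bits of 'bottom_src' (a torn access). */
double torn_double(double top_src, double bottom_src)
{
    uint64_t top_bits, bottom_bits, mixed;
    double result;

    memcpy(&top_bits, &top_src, sizeof top_bits);
    memcpy(&bottom_bits, &bottom_src, sizeof bottom_bits);
    mixed = (top_bits & 0xFFFFFFFF00000000ULL) |
            (bottom_bits & 0x00000000FFFFFFFFULL);
    memcpy(&result, &mixed, sizeof result);
    return result;
}

/* Relative difference |a - b| / |b|, written without <math.h>;
   'b' must be non-zero. */
double relative_error(double a, double b)
{
    double diff = a - b;
    if (diff < 0) diff = -diff;
    if (b < 0) b = -b;
    return diff / b;
}
```

For an old value of 25.0 and a new value of 25.1, torn_double(25.0, 25.1) is equal to neither value, but relative_error() against 25.0 stays below 1.0 / 1048576.0 (that is, 2^-20).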
-
- Posts: 513
- Joined: Sat Apr 26, 2008 7:14 am
Re: __attribute__((packed)) and __attribute__((aligned(4)))
Thanks for taking the time to explain this, Paul and Yevgenit.