All tasks suspending

khoney
Posts: 125
Joined: Fri Sep 11, 2009 12:43 pm

Post by khoney »

I'm having a weird problem that manifests itself in two different ways. In one case, the application simply locks up: all of my tasks are suspended, and I do not get a trap. In the other case, I get a trap and the NetBurner resets. Here is the task info for the lockup situation:

Code: Select all

NetBurner Standard Debugger (4/21/10 7:34 AM) (Suspended)	
	Thread [1  Prio: 63 Name: Idle] (Suspended)	
	Thread [2  Prio: 50 Name: Main] (Suspended: Signal 'SIGSEGV' received. Description: Segmentation fault.)	
		5 OSSched() C:\nburn\system\ucos.c:333 0x0201b13a	
		4 OSTimeDly() C:\nburn\system\ucos.c:404 0x0201b334	
		3 UserMain() C:\Project\Div14\NetBurner\PhoenixII-LP\main.cpp:402 0x02002ed4	
		2 TopOfStackKillfunction() C:\nburn\system\ucosmcfc.c:40 0x0201aa70	
		1 <symbol is not available> 0x00000000	
	Thread [3  Prio: 40 Name: TCPD] (Suspended)	
	Thread [4  Prio: 39 Name: IP] (Suspended)	
	Thread [5  Prio: 38 Name: Esnd] (Suspended)	
	Thread [6  Prio: 48 Name: FTPD] (Suspended)	
	Thread [7  Prio: 45 Name: HTTP] (Suspended)	
	Thread [8  Prio: 47 Name: User] (Suspended)	
	Thread [9  Prio: 46 Name: User] (Suspended)	
	Thread [10  Prio: 49 Name: User] (Suspended)	
In the other case, the trap occurs, but I've been unable to determine where the problem lies. WinAddr2Line reports _vfprintf_r at ??:0. Here's the trap info for that...

Code: Select all

-------------------Trap information-----------------------------
Exception Frame/A7 =02063EA0
Trap Vector        =Access Error (2)
Format             =04
Status register SR =2000
Fault Status       =0C
Faulted PC         =0203A126

-------------------Register information-------------------------
A0=36353436 A1=0204BD94 A2=0200D7D4 A3=02006384
A4=020645F8 A5=02004EFC A6=02064570 A7=02063EA0
D0=02048140 D1=020645F8 D2=FFFFFFFF D3=0206465E
D4=02064660 D5=02064662 D6=02064664 D7=02064666 
SR=2000 PC=0203A126
-------------------RTOS information-----------------------------
The OSTCBCur current task control block = 200006DC
This looks like a valid TCB
The current running task is: User,#2F
-------------------Task information-----------------------------
Task    | State    |Wait| Call Stack
Idle#3F|Ready     |    |020136C0,02011AC0,0
Main#32|Timer     |0020|0201369E,02002A3A,02011AC0,0
TCPD#28|Semaphore |0001|0201331A,0201FF3E,02011AC0,0
IP#27|Fifo      |0012|020127F8,02016B30,02011AC0,0
Enet#26|Fifo      |0047|020127F8,02011070,02011AC0,0
FTPD#30|Semaphore |0087|0201331A,02023588,0201C77C,02011AC0,0
HTTP#2D|Semaphore |0000|0201331A,0201EEDC,02021960,02011AC0,0
User,#2F|Running   |    |0203A126,0
User,#2E|Timer     |0002|0201369E,0200732A,02011AC0,0
User,#31|Ready     |    |0201369E,0

-------------------End of Trap Diagnostics----------------------
A couple of questions:
1) Is the WinAddr2Line program valid when running a release build? (I think yes, but want to make sure.)
2) Can anyone enlighten me on _vfprintf_r? Who calls it (assuming the addr2line info is valid)?
3) In the first case, where the segmentation fault is indicated, shouldn't that cause a trap as well?

Any clarification/advice would be appreciated.
Chris Ruff
Posts: 222
Joined: Thu Apr 24, 2008 4:09 pm
Location: topsail island, nc

Re: All tasks suspending

Post by Chris Ruff »

In my experience, Addr2Line is rarely useful. The fact that the fault happened in a clean part of the code (vfprintf is an internal piece of functions such as printf, sprintf, etc.) doesn't *ever* mean that the problem is with that code.

The most likely scenario is that your code damaged the stack earlier, and the fault only shows up later, when some other code (in this case vfprintf in the library) returns through the corrupted stack.

Watch those automatic variables, especially the char[] ones. To be absolutely paranoid (always good), use strncpy() and terminate the string yourself, so you will *know* that an automatic buffer cannot overflow and corrupt your stack.
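
A minimal sketch of the strncpy() advice above, assuming a plain C++ toolchain. The helper name SafeCopy is hypothetical and is not part of the NetBurner libraries; it only illustrates bounding the copy and terminating the buffer by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a NetBurner API): copies src into dst without ever
// writing more than dst_size bytes, and always NUL-terminates the result.
// Returns true if the whole string fit, false if it had to be truncated.
bool SafeCopy(char *dst, std::size_t dst_size, const char *src)
{
    assert(dst && src && dst_size > 0);

    // strncpy writes at most dst_size - 1 bytes here, so the last byte of
    // the buffer is left for the terminator.
    std::strncpy(dst, src, dst_size - 1);

    // strncpy does not add a terminator when src is too long, so add it
    // explicitly.
    dst[dst_size - 1] = '\0';

    return std::strlen(src) < dst_size;
}
```

Called as SafeCopy(buf, sizeof(buf), input) on an automatic char buf[N], the copy cannot run past the end of buf; the return value tells the caller whether the input was cut short.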

My 2c

Chris
Real Programmers don't comment their code. If it was hard to write, it should be hard to understand
khoney
Posts: 125
Joined: Fri Sep 11, 2009 12:43 pm

Re: All tasks suspending

Post by khoney »

I've verified that in both the lockup case and the trap case, I'm making a call to OSTimeDly in one of my user threads and am never returning from it.
tod
Posts: 587
Joined: Sat Apr 26, 2008 8:27 am
Location: Southern California

Re: All tasks suspending

Post by tod »

I sometimes get useful info from Addr2Line, and I'm very suspicious when it points to anything related to printf. Make sure you are using the latest NNDK release. There were at least two releases that had a reentrancy bug with printf when using floats (the same bug also applied to using cout with a float). Also, if you're actually using printf, sprintf and the related C functions, they are not type safe, and I think you're much better off using the type-safe iostream versions of these functions. It eliminates one possible source of traps, so why not? If the problem just started happening, your source code control system is your friend: start doing some compares to previous versions. If you don't use a source code control system (now you know why you should), Eclipse will help you out anyway, because it keeps a local history of changed files you can use. (How long they are kept is a preference setting.)
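
A rough sketch of the type-safe iostream route suggested above, using only the standard library. FormatReading and its parameters are made up for illustration and do not come from the original project. It covers the type-safety point only, not the float reentrancy bug mentioned for older releases.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds a labelled reading without a printf-style
// format string, so an argument of the wrong type cannot be misread from
// the stack the way a mismatched "%s" or "%d" can.
std::string FormatReading(const std::string &label, double value, int count)
{
    std::ostringstream out;
    out << label << ": "
        << std::fixed << std::setprecision(2) << value // two decimal places
        << " (n=" << count << ")";
    return out.str();
}
```

With sprintf, the equivalent call needs a caller-sized buffer and a format string that must match the argument types exactly; here the compiler picks the right operator<< for each argument and the string grows as needed.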