Crash implementing dual-stack

SeeCwriter
Posts: 606
Joined: Mon May 12, 2008 10:55 am

Crash implementing dual-stack

Post by SeeCwriter »

I'm trying to incorporate the dual ip stack and of course my program crashes.
The utility WinAddr2Line is useless. It rarely provides any output. So I'm using
m68k-elf-objdump to create a dump file and searching for the address where the
crash occurred. But I need help deciphering the dump file identifiers.

I have a class named protocol_link. And in the dump file a section of code has
this label: <_ZN13protocol_link5ResetEv>:

I see the class name in the label, and there are two class functions with the
word Reset in their name, Reset() and ResetLink(), but none with "Ev" in their
name. Is this label for function "Reset()", and what does "Ev" mean?

Within the code block labeled <_ZN13protocol_link5ResetEv> is a call to another
function in a different class. That call is to label <_ZN10c_interval5ResetEm>.
It's in this function that the crash occurs.

Again, the class name is c_interval and there is only one class function with Reset
in the name, and it's Reset(). So I assume that's the function, but what does "Em" mean?

I'm using v2.9.2 of the tools.
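
For reference, these labels are mangled C++ names in the Itanium ABI scheme that GCC uses. `_ZN` opens a qualified name, each component is prefixed with its length (`13protocol_link`, `5Reset`), `E` closes the qualified name, and what follows encodes the parameter types: `v` means no parameters and `m` means `unsigned long`. So `Ev` is `Reset()` and `Em` is `Reset(unsigned long)`, which suggests (as an assumption about this toolchain) that DWORD is a typedef for `unsigned long`. Default arguments do not appear in the mangled name. A minimal sketch for decoding such labels on a desktop GCC or Clang, where `demangle` is a hypothetical helper name:

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang C++ runtime)
#include <cstdlib>    // std::free
#include <string>

// Returns the readable form of an Itanium-ABI mangled symbol,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char *mangled)
{
    int status = 0;
    char *out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return mangled;
    std::string result(out);
    std::free(out);   // the demangler allocates the buffer with malloc
    return result;
}
```

Passing `-C` (`--demangle`) to objdump should produce the same readable names directly in the dump file.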
SeeCwriter
Posts: 606
Joined: Mon May 12, 2008 10:55 am

Re: Crash implementing dual-stack

Post by SeeCwriter »

My crash occurs when I reset a timer when a new tcp connection is accepted.

The timer is created with this class:

Code: Select all

class c_interval    // interval timer based on TimeTick
	{
	public:
	volatile DWORD start_time, timeout, interval;
	explicit c_interval(DWORD msec=1000);	// interval in msec.
	void  Reset(DWORD msec=0);  // If parameter not supplied, interval is left alone.
	bool  Cycle(DWORD msec=0); 	// If time has expired, returns true and increments by one interval.
	bool  Expired();            // If time has expired, returns true.
	DWORD ElapsedSec();
	DWORD ElapsedMsec();
	DWORD RemainingSec();
	void  ForceExpiration();    // sets start_time such that Cycle() or Expired() will return true.
	};
There is a tcp connection class where this timer is used to close tcp connections
when there is no activity for a certain amount of time. See "idle" parameter.
Note that the only change to this class is changing ip_addr from IPADDR4 to IPADDR.

Code: Select all

class protocol_link
{
public:
  int  port;
  char rbuf[PROTOCOL_BUF_SIZE];	// holds incoming packet.
  char tbuf[PROTOCOL_BUF_SIZE];	// holds outgoing packet.
  int  count;						// size of current packet.
  char address;
  bool printable, check_address, efd_protocol;
  bool udp_redirect;				// TRUE = this is a redirect of UDP-compatible messaging.
  int efd_addr;
  enum_packet_state step;
  #ifdef IPv6_DEVELOPMENT
  IPADDR  ip_addr;  // The ip address of the remote host, if applicable.
  #else
  IPADDR4 ip_addr;  // The ip address of the remote host, if applicable.
  #endif
  bool running;	    // prevent recursion, ex: block routines triggered by current packet from checking for another packet...
                    // ... and upon seeing the same packet (having not called ResetLink()), go into infinite loop.
  c_interval *idle;	// idle timer, restarted by ResetLink()
  protocol_link():port(-1),count(0),address('A'),printable(false),check_address(false),efd_protocol(false),
                  udp_redirect(false),efd_addr(1),step(PACKET_DEFAULT_STATE),running(false)
    { 
      idle = new c_interval(300000); // 300,000 msec = 5 minutes
      rbuf[0] = 0;
      tbuf[0] = 0;
    }
  ~protocol_link() { delete idle; }  // idle comes from scalar new, so scalar delete (not delete [])
  void Init(int fd, char device_address, bool check_address, bool as_udp=false);
  int  Send(char start_char, const char *string, char new_address=0, int send_bytes=-1);
  int  Printf(char start_char, const char *format_str, ...);
  protocol_status_enum Receive();
  protocol_status_enum ReceiveUdp();
  char CheckSum(char *buf, int size);
  void Reset(), ResetLink();
  void KillCheckSum();	// zeroes rbuf[] checksum field, typically called AFTER confirming the checksum (if applicable).
};
When a new tcp connection is established, the function protocol_link::ResetLink()
is called, and it in turn calls c_interval::Reset() to restart the timer. It's resetting
the timer where it crashes.
This is the reset function:

Code: Select all

void c_interval::Reset(DWORD msec) 
  { 
  if (msec) interval = (DWORD)(msec * TICKS_PER_SECOND / 1000); //lint !e790 Example: 50 msec interval x 20 ticks/sec / 1000 = 1 tick.
  timeout = TimeTick + interval;  //lint !e644
  start_time = TimeTick;
  }
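
The conversion in Reset() is plain integer arithmetic, so it may help to see what it produces. The sketch below assumes DWORD is a 32-bit unsigned integer and TICKS_PER_SECOND is 20 (the value used in the lint comment's example); `MsecToTicks` and `MAX_SAFE_MSEC` are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>

// Assumptions for this sketch: DWORD is a 32-bit unsigned integer and
// TICKS_PER_SECOND is 20, as in the example in the source comment.
typedef std::uint32_t DWORD;
const DWORD TICKS_PER_SECOND = 20;

// Same expression as in c_interval::Reset(): multiply first, then divide.
DWORD MsecToTicks(DWORD msec)
{
    return (DWORD)(msec * TICKS_PER_SECOND / 1000);
}

// Largest msec value whose product with TICKS_PER_SECOND still fits in 32 bits.
const DWORD MAX_SAFE_MSEC = UINT32_MAX / TICKS_PER_SECOND;
```

Under those assumptions the 300,000 msec idle timer becomes 6000 ticks, any request below 50 msec truncates to 0 ticks, and the multiplication only wraps for values far larger than anything used here, so the arithmetic itself is an unlikely source of the crash.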
This is the assembly code of the Reset() function. The crash always happens at address 40041738, which
appears to be where the code assigns a value to timeout.

Code: Select all

40041708 <_ZN10c_interval5ResetEm>:
40041708:	4e56 0000      	linkw %fp,#0
4004170c:	206e 0008      	moveal %fp@(8),%a0
40041710:	2f02           	movel %d2,%sp@-
40041712:	202e 000c      	movel %fp@(12),%d0
40041716:	6714           	beqs 4004172c <_ZN10c_interval5ResetEm+0x24>
40041718:	7214           	moveq #20,%d1
4004171a:	4c01 0800      	mulsl %d1,%d0
4004171e:	243c 0000 03e8 	movel #1000,%d2
40041724:	4c42 0000      	remul %d2,%d0,%d0
40041728:	2140 0008      	movel %d0,%a0@(8)
4004172c:	2028 0008      	movel %a0@(8),%d0
40041730:	2239 8000 09e8 	movel 800009e8 <TimeTick>,%d1
40041736:	d081           	addl %d1,%d0
40041738:	2140 0004      	movel %d0,%a0@(4)
4004173c:	2039 8000 09e8 	movel 800009e8 <TimeTick>,%d0
40041742:	241f           	movel %sp@+,%d2
40041744:	2080           	movel %d0,%a0@
40041746:	4e5e           	unlk %fp
40041748:	4e75           	rts
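
Reading the listing: `moveal %fp@(8),%a0` loads the `this` pointer, and the faulting instruction `movel %d0,%a0@(4)` stores to `this + 4`. When Reset() is called with the default msec of 0, the branch at 40041716 is taken, so this is the first write through `this`. That would be consistent with the `idle` pointer itself being bad rather than with a problem in the arithmetic. The sketch below mirrors the data members to show which member sits at which offset; `c_interval_layout` and `FAULT_OFFSET` are hypothetical names, and DWORD is assumed to be 32 bits.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>

// Hypothetical mirror of c_interval's data members, assuming DWORD is a
// 32-bit unsigned integer. Member functions do not affect the layout
// because the class has no virtual functions.
struct c_interval_layout
{
    volatile std::uint32_t start_time;  // written by "movel %d0,%a0@"
    volatile std::uint32_t timeout;     // written by "movel %d0,%a0@(4)"
    volatile std::uint32_t interval;    // read by    "movel %a0@(8),%d0"
};

static_assert(offsetof(c_interval_layout, start_time) == 0, "start_time at this+0");
static_assert(offsetof(c_interval_layout, timeout)    == 4, "timeout at this+4");
static_assert(offsetof(c_interval_layout, interval)   == 8, "interval at this+8");

// Distance of the faulting instruction from the start of the function.
const std::uint32_t FAULT_OFFSET = 0x40041738u - 0x40041708u;
```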
This is how the connection is accepted.

Code: Select all

protocol_link mandc[TOTAL_PROTOCOL_LINKS];  // total links = 12

int listener = listen( INADDR_ANY, (WORD) port, 1 );
...
void ManageTcp()
{
  IPADDR addr;

  if ( !IsSocketReadable( listener ) ) return;  // Readability on a listening socket = connection available.

  int fd = accept( listener, &addr, NULL, 0 ); // blocking call but allegedly a connection is available.
  if ( fd <= 0 )  return;   // if not valid file descriptor, bail...

  // look for available socket...
  for ( int i = 0; i < TOTAL_PROTOCOL_LINKS; i++ )
    {
    if ( mandc[i].port < 0 )          // available link.
      {
      mandc[i].port = fd;          // aka the socket.
      mandc[i].ip_addr = addr;     // <---- Is this corrupting memory?
      mandc[i].ResetLink();        // <---- CRASH
      setsockoption( mandc[i].port, SO_NONAGLE );
      break;
      }
    }

}

static bool IsSocketReadable( int fd )
{
  static fd_set fds_list;
  FD_ZERO( &fds_list );                 // reset file descriptor list.
  FD_SET( fd, &fds_list );              // add us to list, to be checked for readability.
  return ZeroWaitSelect( FD_SETSIZE, &fds_list, NULL, NULL ) != 0;   // check for readability and return immediately.
}
Is memory being corrupted when the IP address is saved?
sulliwk06
Posts: 118
Joined: Tue Sep 17, 2013 7:14 am

Re: Crash implementing dual-stack

Post by sulliwk06 »

What exactly is the crash error that you're getting? Is it an Access Error, or maybe Divide by Zero?
SeeCwriter
Posts: 606
Joined: Mon May 12, 2008 10:55 am

Re: Crash implementing dual-stack

Post by SeeCwriter »

The error is "debug interrupt (12)".

Code: Select all

-------------------Trap information-----------------------------
Exception Frame/A7 =80002b70
Trap Vector        =Debug interupt (12)
Format             =04
Status register SR =2000
Fault Status       =00
Faulted PC         =40041738
SeeCwriter
Posts: 606
Joined: Mon May 12, 2008 10:55 am

Re: Crash implementing dual-stack

Post by SeeCwriter »

My explanation of how my program crashes was far more detailed than necessary. In fact, it crashes during bootup initialization; there's no need to attempt a tcp connection.
The code I posted above is unchanged. When the array of protocol_link is created, the constructor initializes all the objects. The protocol_link class also has an Init() function that does everything the constructor does, except that it resets the existing timer instead of creating a new one. I had commented that initialization out to see how far the code would run before crashing, and it ran until a tcp connection was attempted. With the initialization added back in, it crashes during bootup again, at the same place as noted above: when resetting the timer.

Code: Select all

protocol_link mandc[TOTAL_PROTOCOL_LINKS];  // total links = 12

void InitSerial()
{
  int i;

  for ( i = 0; i < (int) TOTAL_PROTOCOL_LINKS; i++ ) // mark all file descriptors/sockets as available, define rs485 address.
    mandc[i].Init( -1, SS->rs485_address, false, (i >= (int) FIRST_UDP_LINK && i <= (int) LAST_UDP_LINK) );    // <===== CRASHES ON FIRST CALL.

  puts( "M&C Init Done" );      // <==== NEVER GETS HERE.
  OSTimeDly( TICKS_PER_SECOND / 4 ); // wait for string to clock out.

  // more initialization follows...
}

void protocol_link::Init(int fd, char rs485_address, bool enable_address, bool as_udp)
{
  port          = fd;
  address       = rs485_address;
  check_address = enable_address;
  tbuf[0]       = 0;            
  udp_redirect  = as_udp; 
  Reset();
}
void protocol_link::Reset()
{
  char ch;
  ResetLink();
  if (port<0) return;
  while (dataavail(port)) read(port,&ch,1);    //lint !e534 empty receive buffer.
}
void protocol_link::ResetLink()
{
  count        = 0;
  rbuf[0]      = 0;
  printable    = false;
  efd_protocol = false;
  step         = PACKET_DEFAULT_STATE;
  running      = false;
  idle->Reset();
}
void c_interval::Reset(DWORD msec) 
{ 
  if (msec) interval = (DWORD)(msec * TICKS_PER_SECOND / 1000); //lint !e790 Example: 50 msec interval x 20 ticks/sec / 1000 = 1 tick.
  timeout = TimeTick + interval;  //lint !e644
  start_time = TimeTick;
}
The crash only happens when parameter ip_addr in the protocol_link class is changed from IPADDR4 to IPADDR.
SeeCwriter
Posts: 606
Joined: Mon May 12, 2008 10:55 am

Re: Crash implementing dual-stack

Post by SeeCwriter »

In my protocol_link class I modified the declaration of ip_addr to be a pointer to IPADDR instead, and I added the initialization of the pointer to the constructor.
So far this seems to work. The program boots, and I am communicating tcp, udp, and serially.

Code: Select all

class protocol_link
{
  ...
  IPADDR *ip_addr;

  protocol_link() { ip_addr = new IPADDR(); ... }
  ...
};
What's up with that? I have other classes that also declare an IP address. Can I expect those to fail as well?
TomNB
Posts: 538
Joined: Tue May 10, 2016 8:22 am

Re: Crash implementing dual-stack

Post by TomNB »

For dual stack releases, IPADDR is an object, not a 32-bit value like IPADDR4. A very common issue is that while assigning a value to IPADDR4, such as:
IPADDR4 ipAddress;
ipAddress = 0;

will work, it won't work on the IPADDR object that can be either IPv4 or v6. Instead, you would use ipAddress.SetNull().

I would first go through and make sure you aren't treating an IPADDR object like a 32-bit value.
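
The distinction can be sketched with stand-in types. Nothing below is the NetBurner definition: `Addr4`, `DualAddr`, `IsNull` and `ClearBoth` are hypothetical names, the 16-byte storage is an assumption, and only the `SetNull()` name comes from the post above.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in for IPADDR4: a plain 32-bit value, so "= 0" is ordinary assignment.
typedef std::uint32_t Addr4;

// Hypothetical stand-in for a dual-stack address object (NOT the NetBurner
// IPADDR definition): 128 bits of storage plus member functions.
struct DualAddr
{
    std::uint8_t bytes[16];

    void SetNull() { std::memset(bytes, 0, sizeof(bytes)); }

    bool IsNull() const
    {
        for (unsigned i = 0; i < sizeof(bytes); ++i)
            if (bytes[i] != 0) return false;
        return true;
    }
};

// Clears both kinds of address the way each type expects.
void ClearBoth(Addr4 &v4, DualAddr &dual)
{
    v4 = 0;          // fine for the 32-bit typedef
    dual.SetNull();  // the object is cleared through its member function
}
```

Code that casts, compares, or copies the address as if it were a single 32-bit word is what to look for when moving from the first type to the second.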
SeeCwriter
Posts: 606
Joined: Mon May 12, 2008 10:55 am

Re: Crash implementing dual-stack

Post by SeeCwriter »

I understand that. I've been using IPADDR4 for a couple of years now, and the associated '4' functions (listen4(), accept4(), GetSourceAddress4(), etc.). The only change now was to use IPADDR, which is really IPADDR6, and the '6' functions. In my mind, I should have been able to replace IPADDR4 with IPADDR6 and be done. But apparently not.

But you'll note that the crash I was experiencing had nothing to do with setting an IPADDR6 address. It appears that the mere declaration of IPADDR6 corrupted memory in my class. I don't think that should happen. I could be wrong, but this smells like a compiler issue.
TomNB
Posts: 538
Joined: Tue May 10, 2016 8:22 am

Re: Crash implementing dual-stack

Post by TomNB »

It still seems like a 32-bit issue to me. IPADDR4 is nothing like IPADDR under the hood: a 32-bit value vs. an object. So declaring IPADDR in a class will certainly be different. When you changed IPADDR ip_addr; to a pointer, you went back from an object to a 32-bit value (pointers are 32 bits).

The trap message: Trap Vector =Debug interupt (12), means that you have a null pointer exception.


I'm wondering if your ifdef is working correctly. Try putting in a #error to verify:

#ifdef IPv6_DEVELOPMENT
#error In IPv6 mode
IPADDR ip_addr; // The ip address of the remote host, if applicable.
#else
IPADDR4 ip_addr; // The ip address of the remote host, if applicable.
#endif
SeeCwriter
Posts: 606
Joined: Mon May 12, 2008 10:55 am

Re: Crash implementing dual-stack

Post by SeeCwriter »

I agree that the problem is related to one declaration being 32 bits and the other 128 bits. I verified that my macro is working correctly.

The next thing I tried is that I moved the IPADDR declaration in the class after all the other declarations and switched back to the original declaration of not using a pointer.

Code: Select all

class protocol_link
{
public:
  int  port;
  char rbuf[PROTOCOL_BUF_SIZE];	// holds incoming packet.
  char tbuf[PROTOCOL_BUF_SIZE];	// holds outgoing packet.
  int  count;			// size of current packet.
  char address;
  bool printable, check_address, efd_protocol;
  bool udp_redirect;		// TRUE = this is a redirect of UDP-compatible messaging.
  int efd_addr;
  enum_packet_state step;
  c_interval *idle; // idle timer, restarted by ResetLink()
  bool running;	    // prevent recursion, ex: block routines triggered by current packet from checking for another packet...
                    // ... and upon seeing the same packet (having not called ResetLink()), go into infinite loop.
  #ifdef IPv6_DEVELOPMENT
      IPADDR   ip_addr; // The ip address of the remote host, if applicable.
  #else
      IPADDR4  ip_addr; // The ip address of the remote host, if applicable.
  #endif
  protocol_link():port(-1),count(0),address('A'),printable(false),check_address(false),efd_protocol(false),
                  udp_redirect(false),efd_addr(1),step(PACKET_DEFAULT_STATE),running(false)
    { 
      idle = new c_interval(300000); // 300,000 msec = 5 minutes
      rbuf[0] = 0;
      tbuf[0] = 0;
    }
  ~protocol_link() { delete idle; }  // idle comes from scalar new, so scalar delete (not delete [])
  void Init(int fd, char device_address, bool check_address, bool as_udp=false);
  int  Send(char start_char, const char *string, char new_address=0, int send_bytes=-1);
  int  Printf(char start_char, const char *format_str, ...);
  protocol_status_enum Receive();
  protocol_status_enum ReceiveUdp();
  char CheckSum(char *buf, int size);
  void Reset(), ResetLink();
  void KillCheckSum();	// zeroes rbuf[] checksum field, called AFTER confirming the checksum (if applicable).
};
With this class configuration, the program runs without crashing. Again, it seems to point to a layout issue with the compiler. It would be nice to be able to confirm that hypothesis, but I'm not sure how.
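
One way to test the layout hypothesis is to compare where `idle` lives as seen by different source files. If the header were compiled with IPv6_DEVELOPMENT defined in one file and undefined in another, the inline constructor and ResetLink() could disagree on the offset of `idle`, which would also explain why moving ip_addr after `idle`, or making it pointer-sized, hides the problem. That is only a hypothesis. The sketch below uses stand-in types and a trimmed member list: `Addr4`, `Addr16`, `Timer`, `LinkAddrFirst` and `LinkAddrLast` are hypothetical, and the 16-byte size of the dual-stack address is an assumption.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>

typedef std::uint32_t Addr4;                  // stand-in for IPADDR4 (32 bits)
struct Addr16 { std::uint8_t bytes[16]; };    // stand-in for IPADDR (assumed 128 bits)
struct Timer;                                 // stand-in for c_interval

// Original member order: the address comes before the timer pointer.
template <typename AddrT>
struct LinkAddrFirst
{
    int    port;
    AddrT  ip_addr;
    bool   running;
    Timer *idle;
};

// Reordered as in the last listing: the address is the final member.
template <typename AddrT>
struct LinkAddrLast
{
    int    port;
    Timer *idle;
    bool   running;
    AddrT  ip_addr;
};

// Offsets of idle as two differently configured source files would see them.
const std::size_t FIRST_V4   = offsetof(LinkAddrFirst<Addr4>,  idle);
const std::size_t FIRST_DUAL = offsetof(LinkAddrFirst<Addr16>, idle);
const std::size_t LAST_V4    = offsetof(LinkAddrLast<Addr4>,   idle);
const std::size_t LAST_DUAL  = offsetof(LinkAddrLast<Addr16>,  idle);
```

With the address first, the offset of `idle` depends on the address type; with the address last, it does not. In the real project the equivalent check would be to print `sizeof(protocol_link)` and `offsetof(protocol_link, idle)` from the file that defines the `mandc` array and from the file that defines `ResetLink()`. Differing values would point to the macro not being defined identically everywhere rather than to the compiler.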