chipKIT® Development Platform

Inspired by Arduino™

[Solved!] 5Mbps serial connection, Uno32 goes into lala land

Created Sun, 27 May 2012 18:08:14 +0000 by Demolishun


Demolishun

Sun, 27 May 2012 18:08:14 +0000

Okay, here is what I am doing. I have 3 Uno32s I am testing with. Thankfully they are all showing the same symptoms. At least that gives me hope it is a software only issue.

I am connecting 2 of the Uno32s together. Right now I am measuring transmission latency of sending 2 bytes from one Uno to another. The first Uno sends the bytes, the second Uno just looks for data on the USART using normal Serial1 calls and sends the data back. The first Uno measures TOF for the round trip for the bytes. Right now I am getting a consistent 16uS TOF round trip.

In order to set the serial baud rate to 5Mbps I had to add a second "begin" call that takes an additional parameter called brgh. This parameter determines whether to set the BRGH bit for high-speed transmission:

void HardwareSerial::begin(unsigned long baudRate, int brgh)
...
if(!brgh){
		uart->uxBrg.reg	 = ((__PIC32_pbClk / 16 / baudRate) - 1);	// calculate actual BAUD generate value.
		uart->uxMode.reg = (1 << _UARTMODE_ON);				//enable UART module
	}else{
		uart->uxBrg.reg	 = ((__PIC32_pbClk / 4 / baudRate) - 1);	// calculate actual BAUD generate value.
		#define _UARTMODE_BRGH 3
		uart->uxMode.reg = (1 << _UARTMODE_ON)|((1 << _UARTMODE_BRGH));	//enable UART module and brgh
	}
...

The code shows the relevant changes to get this functionality. With this change I get consistent data transfers at up to 6Mbps on a loopback of pins 39 and 40. Any higher and it starts throwing data errors, so I am targeting 5Mbps. This will eventually go over fiber transmitters and receivers rated for up to 5Mbps data rate, so it won't be copper in the future.
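For anyone wanting to check what rate a given request really produces, here is a minimal host-side sketch of the same integer arithmetic used in the modified begin(). It assumes a peripheral bus clock of 80 MHz (kAssumedPbClk is a hypothetical constant, so compare it with the __PIC32_pbClk value the sketch prints), and the function names brgValue and actualBaud are made up for illustration.

```cpp
#include <cstdint>

// Assumption: 80 MHz peripheral bus clock. Verify against the value of
// __PIC32_pbClk printed by your own sketch.
const uint32_t kAssumedPbClk = 80000000UL;

// Same integer arithmetic as the modified begin(): the divisor is 4 when
// BRGH is set and 16 otherwise. Integer division truncates.
uint32_t brgValue(uint32_t pbClk, uint32_t baud, bool brgh) {
    const uint32_t divisor = brgh ? 4 : 16;
    return (pbClk / divisor / baud) - 1;
}

// Baud rate the hardware would generate for a given register value.
uint32_t actualBaud(uint32_t pbClk, uint32_t brg, bool brgh) {
    const uint32_t divisor = brgh ? 4 : 16;
    return pbClk / (divisor * (brg + 1));
}
```

Under that 80 MHz assumption, a request for 5000000 gives a register value of 3 and exactly 5 Mbps. A request for 6000000 truncates to 2, which is about 6.67 Mbps, and 7000000, 7500000 and 10000000 all truncate to 1, which is 10 Mbps. So the requested rates may not be the rates on the wire, and it is worth checking the printed register value before drawing conclusions about where the errors start.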

What is the problem? After running in this mode, with the first Uno32 acting as a transmitter and calculating TOF and the second Uno32 acting as a loopback, the second Uno32 walks into lala land. The code for both systems is identical, as I load the same firmware onto both. The difference is that I read analog pin A0 as a digital input to determine whether the device should run in transmitter or loopback mode. Here is the code:

// loopback round data:
// safest bet is 5Mbit
// total time: 8uS, receive time: 4uS @ 5000000
// total time: 7uS, receive time: 2uS @ 6000000
// errors @  7000000
// errors @  7500000
// errors @ 10000000
// roundtrip through other uno32:
// total time: 16uS, receive time: 12uS @ 5000000
// that is not a one (1) at the end of baud, it is the letter L
#define SERIAL1_BAUD_RATE 5000000L
#define SERIAL1_DATA_LEN 2

byte serial_data[SERIAL1_DATA_LEN];
byte chan_count;
byte max_chan;

#define COM_MUX_CHAN_SEL 0xC0
#define COM_MUX_RESET 0x42
#define COM_TIMING_FUNC() micros()

#define MODE_PIN 14

#define TEST_DELAY 25

byte mode = 0;
boolean toggle = LOW;

void setup() {
  chan_count = 0;
  max_chan = 8;
  
  pinMode(MODE_PIN, INPUT); // A0 as digital input
  pinMode(43, OUTPUT); // Led5
  
  Serial.begin(38400);
  Serial.write("__PIC32_pbClk: ");
  Serial.print(__PIC32_pbClk);
  Serial.write("\r\n");
  Serial.write("Baud Rate Reg Value: ");
  unsigned long tempLong = ((__PIC32_pbClk / 4 / SERIAL1_BAUD_RATE) - 1);
  Serial.print(tempLong,16);
  Serial.write("\r\n");
  Serial1.begin(SERIAL1_BAUD_RATE,1); // enable brgh by setting second param to 1
  //Serial1.setTimeout(20); // mS timeout
}



void loop() {
  mode = digitalRead(MODE_PIN);
  if(mode){
    Serial.write("starting transmission\r\n");
    
    unsigned long pre_start = COM_TIMING_FUNC();
    serial_data[0] = COM_MUX_CHAN_SEL | chan_count;
    serial_data[1] = serial_data[0] ^ 0xFF;
    Serial1.write(serial_data, SERIAL1_DATA_LEN);
    //Serial1.write(serial_data[0]);
    //Serial1.write(serial_data[1]);
    unsigned long start_time = COM_TIMING_FUNC();
    unsigned long pre_time = start_time - pre_start;
    unsigned long max_time = 200000;
    byte got_data = 0;
    while((COM_TIMING_FUNC()-start_time) < max_time){
      if(!got_data && Serial1.available()){
        serial_data[0] = Serial1.read();
        if((serial_data[0] & 0xF0) != COM_MUX_CHAN_SEL)
          continue;
        got_data += 1;
      }
      if(got_data && Serial1.available()){
        serial_data[1] = Serial1.read();
        got_data += 1;
        break;
      }
    }
    unsigned long recv_time = COM_TIMING_FUNC() - start_time;
    unsigned long total_time = recv_time + pre_time;
    
    if(got_data == SERIAL1_DATA_LEN){
      if(serial_data[0] == (serial_data[1] ^ 0xFF)){
        Serial.write("recv_time ");
        Serial.print(recv_time);
        Serial.write("\r\n");
        Serial.write("total_time ");
        Serial.print(total_time);
        Serial.write("\r\n");
        Serial.write("data ");
        Serial.print(serial_data[0],16);
        Serial.write(":");
        Serial.print(serial_data[1],16);
        Serial.write("\r\n");
        Serial.write("successful reception\r\n");
      }else{
        Serial.write("recv_time ");
        Serial.print(recv_time);
        Serial.write("\r\n");
        Serial.write("total_time ");
        Serial.print(total_time);
        Serial.write("\r\n");
        Serial.write("data ");
        Serial.print(serial_data[0],16);
        Serial.write(":");
        Serial.print(serial_data[1],16);
        Serial.write("\r\n");
        Serial.write("got 2 bytes that don't match\r\n");
      }
    }else{
      Serial.write("did not get enough data\r\n");
    }
    
    delay(TEST_DELAY);
    Serial1.flush();
    delay(TEST_DELAY);
    
    chan_count += 1;
    if(chan_count >= max_chan){
      chan_count = 0;
    }
  }else{
    byte data;
    if(Serial1.available()){
      toggle = !toggle;
      while(Serial1.available()){
        data = Serial1.read();
        Serial1.write(&data,1);
      }
    }
  }
  
  digitalWrite(43,toggle);
}

At first I thought it was just one of the Uno32s that was having issues. But I have been able to get 2 of the Uno32s to demonstrate this behavior when running in loopback mode. The funny thing is that if the Uno32 runs in transmitter mode it never has this issue. The only difference I can see is the analog pin being used to switch modes, but if it switched modes it should show strange data for the timing, not walk into lala land. I guess I need to get a debugger or something.

Thanks for any insight you can provide. :)

Edit: Wow, just wow. I just got it into that strange mode where it acts like it is in lala land. If I pull the mode pin (A0, pin 14 in the code above) off ground to get it back into transmission mode, the transmission mode acts the way it should. That means the processor is working, but for some reason the other mode is stuck somewhere. Which is strange because it uses the same Serial1 object to control the port in both modes! Now I am really confused.


Demolishun

Mon, 28 May 2012 01:16:22 +0000

It all points to receiver error and shutdown.

http://www.microchip.com/forums/m11533.aspx

I will post a fix when I figure it out. The HardwareSerial.h and .cpp have no error checking of the hardware anywhere. That is why this is happening. It has no way to know or recover from a USART shutdown.

The transmit continues to work however.
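As I understand the PIC32 UART reference manual, once the overrun bit is set the receiver stops accepting new bytes until software clears it, while the transmitter is unaffected. That would match a loopback that goes quiet while transmit still works. Here is a minimal sketch of how the low status bits could be classified. The masks are the same ones used in the handler code in this thread (URXDA 0x01, OERR 0x02, FERR 0x04); the RxState enum and classifyRx function are hypothetical names, not part of HardwareSerial.

```cpp
#include <cstdint>

// Low UxSTA receive bits, matching the masks used in this thread.
const uint32_t kUrxda = 0x01;  // receive buffer holds at least one byte
const uint32_t kOerr  = 0x02;  // receive buffer overrun
const uint32_t kFerr  = 0x04;  // framing error on the byte at the top of the buffer

// Hypothetical classification of the receiver state, for illustration only.
enum class RxState { Empty, DataReady, FramingError, Overrun };

RxState classifyRx(uint32_t sta) {
    // Overrun is checked first because it blocks reception until cleared.
    if (sta & kOerr) return RxState::Overrun;
    if (!(sta & kUrxda)) return RxState::Empty;
    if (sta & kFerr) return RxState::FramingError;
    return RxState::DataReady;
}
```

Reading the status register once and classifying it like this makes it easy to log which condition was seen when the receiver stalls.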


Demolishun

Mon, 28 May 2012 05:22:05 +0000

Corrected Interrupt Handler:

void HardwareSerial::doSerialInt(void)
{
	int		bufIndex;
	uint8_t	ch;
	uint32_t tmp_mode;

	/* If it's a receive interrupt, get the character and store
	** it in the receive buffer.
	*/
	if ((ifs->reg & bit_rx) != 0)
	{
		// while bytes available
		while(uart->uxSta.reg & 0x01)
		{ 
			// error check
			if((uart->uxSta.reg & 0x04)||(uart->uxSta.reg & 0x02)) // checking FERR 0x04 and OERR 0x02
			{
				// save old reg value to maintain brgh if set
				tmp_mode = ((1 << _UARTMODE_ON)|((1 << _UARTMODE_BRGH))) & uart->uxMode.reg;
				// disable UART to clear errors
				uart->uxMode.reg = 0;
				// grab next 4 values out of UART
				// 4 level deep stack
				// need data to be current so let the old bytes die
				/*
				ch = uart->uxRx.reg;
				ch = uart->uxRx.reg;
				ch = uart->uxRx.reg;
				ch = uart->uxRx.reg;
				*/
				// reenable uart
				uart->uxMode.reg = tmp_mode;
			}
			else
			{
				ch = uart->uxRx.reg;
				bufIndex	= (rx_buffer.head + 1) % RX_BUFFER_SIZE;
			
				/* If we should be storing the received character into the location
				** just before the tail (meaning that the head would advance to the
				** current location of the tail), we're about to overflow the buffer
				** and so we don't write the character or advance the head.
				*/
				if (bufIndex != rx_buffer.tail)
				{
					rx_buffer.buffer[rx_buffer.head] = ch;
					rx_buffer.head = bufIndex;
				}
			} // if error check
		} // while

		/* Clear the interrupt flag.
		*/
		ifs->clr = bit_rx;
	}
	/* If it's a transmit interrupt, ignore it, as we don't currently
	** have interrupt driven i/o on the transmit side.
	*/
	if ((ifs->reg & bit_tx) != 0)
	{
		/* Clear the interrupt flag.
		*/
		ifs->clr = bit_tx;
	}
}

The most significant change is using a while loop to read bytes until the rx buffer (4 bytes max) is empty. Otherwise you risk overflow (OERR) at high speeds like 5Mbps. It seems to have fixed my issue. I left notes to help people understand how this works. Also refer to the Microchip UART reference manual for the PIC32MX and the PIC32MX3xx/4xx manual as well. Hopefully this can get incorporated into the main codebase, along with the ability to set the brgh value so you can set baud rates up to 20Mbps.
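To put rough numbers on the overflow risk, here is a small host-side sketch of the timing budget. It assumes 8N1 framing (10 bits on the wire per byte) and the 4-byte receive buffer mentioned above; byteTimeNs and fifoFillTimeNs are hypothetical helper names.

```cpp
#include <cstdint>

// Bits on the wire per byte for 8N1 framing: start + 8 data + stop.
const uint32_t kBitsPerFrame = 10;

// Time one byte occupies the wire, in nanoseconds (integer division truncates).
uint64_t byteTimeNs(uint32_t baud) {
    return (uint64_t)kBitsPerFrame * 1000000000ULL / baud;
}

// Approximate time for back-to-back bytes to fill a receive buffer of the
// given depth if nothing reads it.
uint64_t fifoFillTimeNs(uint32_t baud, uint32_t depth) {
    return byteTimeNs(baud) * depth;
}
```

At 5000000 baud this gives 2000 ns per byte and about 8000 ns to fill a 4-byte buffer. If the core runs at 80 MHz (an assumption here), 2000 ns is only about 160 clock cycles per byte, so a handler that takes one byte per interrupt has very little slack. Draining everything available on each entry is the safer pattern.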

This interrupt handler comes from HardwareSerial.cpp in the "hardware\pic32\cores\pic32" directory under the main mpide directory.

Please, by all means test this out and report back. That way if there are other issues they can be addressed. Thanks

Edit: Note, this did not affect speed at all. I am measuring 16 uS round trip of 2 bytes at 5Mbps. It is consistent with the code I posted above. Obviously as I do more processing around this it will increase the code latency, but this code change did not increase that noticeably. It just works and no longer walks into lala land. <crosses fingers> :?


Demolishun

Fri, 01 Jun 2012 00:28:55 +0000

This change works fine with the 20120429 release.


Demolishun

Tue, 26 Jun 2012 21:25:55 +0000

There is still a problem with the error detection and correction using the above code. Here is how it should be done and is working reliably:

void HardwareSerial::doSerialInt(void)
{
	int		bufIndex;
	uint8_t	ch;
	uint32_t tmp_mode;

	/* If it's a receive interrupt, get the character and store
	** it in the receive buffer.
	*/
	if ((ifs->reg & bit_rx) != 0)
	{
		// while bytes available
		while(uart->uxSta.reg & 0x01)
		{ 
			// error check
			if(uart->uxSta.reg & 0x04) // checking FERR 0x04
			{
				ch = uart->uxRx.reg; // dispose of byte
			}
			else if(uart->uxSta.reg & 0x02) // checking OERR 0x02
			{
				// clear OERR bit
				uart->uxSta.clr = 0x02;
			}
			else
			{
				ch = uart->uxRx.reg;
				bufIndex	= (rx_buffer.head + 1) % RX_BUFFER_SIZE;
			
				/* If we should be storing the received character into the location
				** just before the tail (meaning that the head would advance to the
				** current location of the tail), we're about to overflow the buffer
				** and so we don't write the character or advance the head.
				*/
				if (bufIndex != rx_buffer.tail)
				{
					rx_buffer.buffer[rx_buffer.head] = ch;
					rx_buffer.head = bufIndex;
				}
			} // if error check
		} // while

		/* Clear the interrupt flag.
		*/
		ifs->clr = bit_rx;
	}

	/* If it's a transmit interrupt, ignore it, as we don't currently
	** have interrupt driven i/o on the transmit side.
	*/
	if ((ifs->reg & bit_tx) != 0)
	{
		/* Clear the interrupt flag.
		*/
		ifs->clr = bit_tx;
	}

}

The while loop is part of the solution and is why I saw improvement in my transmissions. However, when testing with a fiber connection I was locking up the RX unit every time I disconnected a fiber. So I took a much closer look at the PIC32 UART datasheet and found the error of my ways. The handler is now tolerant of disconnecting or connecting the fiber, which causes spurious partial and bogus data to appear at the RX unit. So now it finally works the way it is supposed to.
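For anyone who wants to reason about the handler without hardware, here is a host-side sketch of the same decision order (FERR first, then OERR, then store). MockUart is a hypothetical stand-in, not the real register layout, and drain is a made-up name. The mock assumes that clearing OERR also empties the receive buffer, which is my reading of the PIC32 UART reference manual; treat that as an assumption.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for the UART receive side (not the real registers).
struct MockUart {
    struct Entry { uint8_t data; bool framingError; };
    std::deque<Entry> fifo;   // pending bytes, oldest first
    bool overrun = false;     // models the OERR bit

    bool dataAvailable() const { return !fifo.empty(); }  // models URXDA
    bool framingError() const {                            // models FERR
        return !fifo.empty() && fifo.front().framingError;
    }
    uint8_t read() {                                       // models reading UxRXREG
        uint8_t d = fifo.front().data;
        fifo.pop_front();
        return d;
    }
    // Assumption: clearing OERR also resets the receive buffer to empty.
    void clearOverrun() { overrun = false; fifo.clear(); }
};

// Same decision order as the final handler in this thread.
std::vector<uint8_t> drain(MockUart& uart) {
    std::vector<uint8_t> stored;
    while (uart.dataAvailable()) {
        if (uart.framingError()) {
            uart.read();                    // discard the badly framed byte
        } else if (uart.overrun) {
            uart.clearOverrun();            // re-arm the receiver
        } else {
            stored.push_back(uart.read());  // good byte goes to the ring buffer
        }
    }
    return stored;
}
```

Feeding the mock a good byte, a framing-error byte and another good byte should store only the two good ones. Setting overrun before calling drain should leave the flag cleared and the buffer empty, which is the recovery the earlier version of the handler was missing.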