I'm a friend of unexpectedly's. I'm working on a high-speed CAN datalogger and analyzer. However, I've had to clean house at every step on the way to that goal. Most of the code out there is terribly inefficient. I first touched an Arduino only a few months ago, so I'm taking it slow and hammering away at the libraries before I integrate the API to add touchscreen LCDs, accelerometers, GPS and whatnot.
I only have a Uno32 and Mega1280 to work with so I have a towering pillar of hats.
3 devices are sharing the SPI bus: 2 MCP2515 CAN shields and 1 SD card.
- ChipKit Uno32
- Seeeduino MCP2515 CANbus shield
- Sparkfun MCP2515 CANbus shield with SD card, joystick, and Serial LCD
Latency is critical in my application. (Probably more suited to an RTOS to be honest.) I'm dealing with a 500kbit/s serial bus that runs at maybe 40% utilization. I am trying to understand a live vehicle so my first goal was to understand the interrelationship between MCUs. Unfortunately CAN is a common-line bus so it isn't possible to simply snip the RX and TX lines and probe them. For that reason I decided to actively intercept and retransmit CAN packets -- requiring two separate CAN interfaces. Also I am dumping any logged data out of serial -- so I required a Serial port speed >500,000 baud plus overhead. (115200 isn't going to cut it.)
ABSOLUTELY NO PACKET LOSS CAN BE PERMITTED.
The first challenge (actually wiring the hardware) was solved by unexpectedly doing it for me. First the CS pins had to be moved around so that all devices could communicate. Then SPI had to be sorted out -- we use an Uno32 and a Mega, and the Mega doesn't put SPI on pins 10-13; it uses the 6-pin ICSP header (also broken out on pins 50-53).
Since the Seeeduino shield has a dedicated SPI header, but the Sparkfun does not, unexpectedly did the "Mega pin hack" on the Seeeduino shield and put it on the bottom, allowing it to be used as an "R3 shield to Mega" adapter board in addition to serving its normal function. Clever! It also had the happy side effect of putting the DB9 CAN ports on opposite sides.
Since the only hardware donated was an Uno32 and a Mega 1280 (aka. CHEAP), I am compensating for the lack of multiple SPI ports with hyper-efficient coding. All elements share a 10MHz SPI bus (8MHz on the Mega) which is rate-limited by the 10MHz cap on the MCP2515. This means latency issues and inefficient coding become painfully apparent.
The first problem was the serial port: as mentioned above, dumping a 500kbps bus requires at least twice that rate to account for overhead. (The Bosch CAN 2.0B spec can theoretically go to 1Mbps.) When attempting to log high bus loads I noticed I was frequently missing packets. When I increased my Serial rate, I learned fairly quickly that the max COM speed on a 16MHz Mega1280 was not 3Mbaud (the limit according to the FTDI spec), but 2Mbaud, because of a quirk with the 16MHz oscillator frequency garbling data. Luckily 2Mbaud is portable to ChipKit and was high enough.
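One likely explanation for the 2Mbaud ceiling is the USART divisor arithmetic. The sketch below is only an illustration, assuming the AVR datasheet formula for double-speed (U2X) mode; `avrBaudU2X` and `BaudResult` are made-up names, not part of any Arduino library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Result of asking an AVR USART for a given baud rate in double-speed mode.
struct BaudResult {
    uint32_t ubrr;      // divisor register value that would be programmed
    double   actual;    // baud rate the hardware really generates
    double   errorPct;  // deviation from the requested rate, in percent
};

// AVR datasheet formula (U2X = 1): baud = F_CPU / (8 * (UBRR + 1)),
// where UBRR is a non-negative integer.
BaudResult avrBaudU2X(uint32_t fCpu, uint32_t requested) {
    assert(fCpu > 0 && requested > 0);
    // Round to the nearest divisor, then clamp at zero (the fastest setting).
    long div = std::lround(static_cast<double>(fCpu) / (8.0 * requested)) - 1;
    if (div < 0) div = 0;
    BaudResult r;
    r.ubrr     = static_cast<uint32_t>(div);
    r.actual   = static_cast<double>(fCpu) / (8.0 * (r.ubrr + 1));
    r.errorPct = 100.0 * (r.actual - requested) / requested;
    return r;
}
```

With a 16MHz clock, a request for 2,000,000 baud lands on divisor 0 with zero error, while a request for 3,000,000 baud also lands on divisor 0 and therefore still produces 2,000,000 baud (about 33% slow), which the far end would see as garbage.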
Unfortunately, Serial output is blocking -- so I need to be quick and succinct with what I transmit. A CAN packet is [11 or 29 bit standard/extended ID] [1 bit standard/extended flag] [1 bit remote request] [4 bit data length] [Up to 8 bytes data]. I use a somewhat-odd-but-uniform API that includes checksums, which brings it up to 20 bytes per packet. As long as I transmit binary-encoded serial [20 bytes] as opposed to human-readable ASCII [120 chars/bytes], the Serial COM link @ 2Mbaud is sufficient.
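To make the 20-byte versus 120-byte difference concrete, here is a rough sketch of a fixed-size binary record plus the time it spends on the wire. The field layout (sync byte, flags, ID, timestamp, padded data, 16-bit additive checksum) is purely hypothetical and is not the actual format used in this project; only the 20-byte total comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kRecordSize = 20;  // total size taken from the post

struct CanFrame {
    uint32_t id;        // 11-bit or 29-bit identifier
    bool     extended;  // true for a 29-bit ID
    bool     remote;    // remote transmission request
    uint8_t  dlc;       // data length, 0..8
    uint8_t  data[8];
};

// Hypothetical layout: [0] sync, [1] flags+DLC, [2..5] ID, [6..9] timestamp,
// [10..17] data padded with zeros, [18..19] sum of bytes 0..17 (big-endian).
void packRecord(const CanFrame& f, uint32_t timestampUs, uint8_t out[kRecordSize]) {
    assert(f.dlc <= 8);
    out[0] = 0xA5;
    out[1] = static_cast<uint8_t>((f.extended ? 0x80 : 0) | (f.remote ? 0x40 : 0) | f.dlc);
    for (int i = 0; i < 4; ++i) {
        out[2 + i] = static_cast<uint8_t>(f.id >> (24 - 8 * i));
        out[6 + i] = static_cast<uint8_t>(timestampUs >> (24 - 8 * i));
    }
    std::memset(out + 10, 0, 8);
    std::memcpy(out + 10, f.data, f.dlc);
    uint16_t sum = 0;
    for (std::size_t i = 0; i < 18; ++i) sum = static_cast<uint16_t>(sum + out[i]);
    out[18] = static_cast<uint8_t>(sum >> 8);
    out[19] = static_cast<uint8_t>(sum & 0xFF);
}

// Microseconds on the wire at 8N1 framing (10 bit times per byte).
constexpr uint32_t wireTimeUs(uint32_t bytes, uint32_t baud) {
    return static_cast<uint32_t>((static_cast<uint64_t>(bytes) * 10u * 1000000u) / baud);
}
```

At 2Mbaud, `wireTimeUs(20, 2000000)` works out to 100 microseconds per record, while a 120-character ASCII line takes 600. A full-length standard CAN frame occupies at least 111 bit times, or 222 microseconds at 500kbit/s, so the binary record keeps up with a saturated bus and the ASCII line does not.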
Interfacing was also difficult since many libraries pollute the namespace and cannot be instantiated (i.e. aren't objects, can't have multiple copies). So I had to integrate my code into modular classes that still allowed for individual per-pin optimization.
The MCP2515 shields and code:
I found the stock MCP2515 library was unable to keep up with CAN traffic at only 125kbps. Again, I was dropping packets, or packets were arriving out of order. Upon investigation, I found the stock library used up to 5 SPI transactions just to acquire a single packet. Transmitting a packet required as many as 14.
Furthermore there are major silicon flaws in the MCP2515 controller -- for example, if one of the three transmit buffers is loaded during the same T_Osc in which another lower-priority buffer is being sent, both buffers are corrupted. (Roughly a 1 in 16,000,000 chance.) Worse still, to account for this the stock transmit library resorts to blocking -- rendering two of the three transmit buffers absolutely worthless, in addition to stalling the code loop.
I re-coded and optimized the MCP2515 libs from scratch and reduced the number of SPI requests for a CAN packet "Read" down to 1 and "Transmit" down to 2. The length of these transactions was also cut down by 15% to 20% by using optimized per-buffer SPI byte commands. To correct for the "priority" silicon bug I incorporated buffer cycling -- buffers are assigned constantly decrementing priorities in a round-robin format. Only one or two extra SPI commands are needed intermittently and are slipstreamed into the packet loading command. The worst-case scenario -- requiring a total realignment of Transmit buffers -- only requires some CPU calculations and two or fewer SPI commands.
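The "per-buffer SPI byte commands" are presumably the MCP2515's dedicated READ RX BUFFER, LOAD TX BUFFER and REQUEST-TO-SEND instructions, which skip the separate address byte of a generic register read or write. The sketch below is not the rewritten library; it only shows how the byte stream for a 2-transaction transmit of a standard-ID frame could be built. The instruction values are from the Microchip datasheet; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// MCP2515 SPI instruction bytes (Microchip datasheet, SPI instruction set).
constexpr uint8_t kReadRxBase = 0x90;  // READ RX BUFFER, starting at RXB0SIDH
constexpr uint8_t kLoadTxBase = 0x40;  // LOAD TX BUFFER, starting at TXB0SIDH
constexpr uint8_t kRtsBase    = 0x80;  // REQUEST-TO-SEND, one bit per buffer

// Command byte that streams out a whole receive buffer (0 or 1) in one burst.
// Per the datasheet, raising CS after this instruction also clears the RXnIF flag.
constexpr uint8_t readRxCommand(uint8_t rxBuf) {
    return static_cast<uint8_t>(kReadRxBase | (rxBuf << 2));
}

// Command byte that starts transmission of buffer 0, 1 or 2.
constexpr uint8_t rtsCommand(uint8_t txBuf) {
    return static_cast<uint8_t>(kRtsBase | (1u << txBuf));
}

// Transaction 1 of a transmit: command + ID + DLC + data in a single burst.
// Handles 11-bit standard IDs only. Returns the number of bytes to clock out.
std::size_t buildLoadTx(uint8_t txBuf, uint16_t id, uint8_t dlc,
                        const uint8_t* data, uint8_t out[14]) {
    assert(txBuf < 3 && id <= 0x7FF && dlc <= 8);
    std::size_t n = 0;
    out[n++] = static_cast<uint8_t>(kLoadTxBase | (txBuf << 1));
    out[n++] = static_cast<uint8_t>(id >> 3);           // TXBnSIDH: SID10..SID3
    out[n++] = static_cast<uint8_t>((id & 0x07) << 5);  // TXBnSIDL: SID2..SID0, EXIDE = 0
    out[n++] = 0;                                       // TXBnEID8 (unused)
    out[n++] = 0;                                       // TXBnEID0 (unused)
    out[n++] = dlc;                                     // TXBnDLC, RTR = 0
    for (uint8_t i = 0; i < dlc; ++i) out[n++] = data[i];
    return n;
}
```

Transaction 2 is then the single byte from `rtsCommand(txBuf)`. On the receive side, one `readRxCommand` byte followed by 13 clocked-in bytes fetches the ID, DLC and data without a separate flag-clearing write.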
ChipSelect issues were also present on the Arduino Mega. (And partly on the ChipKit.) After educating myself on the "FastDigitalWrite" problem I solved the issue by adding macros to the MCP2515 class: direct register writes on the Mega and LATxSET/CLR on the Uno32. I was able to keep track of efficiency by the "pseudo-PWM" intensity of the "message ready" interrupt PIN2 LED on the Seeeduino CAN shield. (Brighter LED = more locking, messages queued up). By this seat-of-the-pants metric, fixing the ChipSelect issues reduced SPI CAN read/write overhead by 20-30%. The LED is only a dull red glow now.
By switching the INT function of the MCP2515 to "RX buffer overflow", I am also able to tell whether even a single overflow occurs. If the LED ever lights up, "I dun goofed".
I can now SPI read a packet from one shield, process it with higher level logic in the ChipKit, and SPI write to transmit on a fully loaded bus. The time delay between packet receipt, interdiction, replay, serial output, is less than 1000 microseconds with no out-of-ordering. (According to a much, much more expensive external CAN analyzer).
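For anyone unfamiliar with the "FastDigitalWrite" problem: digitalWrite() looks up the port and bit for a pin number on every call, whereas a macro can touch the register directly. Below is a minimal sketch of the idea, not the actual macros from this project. So that it runs on a PC, a plain variable named `fakePortB` stands in for the real port register, and the bit number is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a memory-mapped output register so the sketch runs on a PC.
// On the Mega this would be the real PORTx register of the CS pin.
volatile uint8_t fakePortB = 0xFF;  // all CS lines idle high (nothing selected)

// Hypothetical pin assignment: CAN chip select on bit 4 of the port above.
#define CAN_CS_PORT fakePortB
#define CAN_CS_BIT  4

// One read-modify-write on the register per edge. On the Uno32 (PIC32) the
// equivalent is a single store of the bit mask to LATxCLR or LATxSET.
#define CAN_SELECT()   (CAN_CS_PORT = static_cast<uint8_t>(CAN_CS_PORT & ~(1u << CAN_CS_BIT)))
#define CAN_DESELECT() (CAN_CS_PORT = static_cast<uint8_t>(CAN_CS_PORT | (1u << CAN_CS_BIT)))

// True while the chip is selected (CS is active low).
inline bool canSelected() { return (CAN_CS_PORT & (1u << CAN_CS_BIT)) == 0; }

// Scope guard: selects on entry, deselects on exit, and catches nested
// selects, which would corrupt a transaction on a shared SPI bus.
struct CanCsGuard {
    CanCsGuard()  { assert(!canSelected()); CAN_SELECT(); }
    ~CanCsGuard() { CAN_DESELECT(); }
};
```

Wrapping each SPI transaction in a `CanCsGuard` scope keeps the select and deselect paired even if the function returns early.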
Progress on that front is documented here: [TO BE COMPLETED LATER]
The SD card:
I have had good success logging CANbus over Serial, but I also want to log to SD card when using the logger in standalone format. However I came to learn of the ugly truth that is the SD card's proprietary SDIO protocol. Stock SD SPI libraries are terribly inefficient, and more so in the case of a limited ecosystem like ChipKit.
The ChipKit MPIDE default <SD.h> is bitbanged software SPI. So firstly, it was flat-out incompatible with my hardware SPI solution for the CAN shields. Second, performance and latency were atrocious: 120KB/s max speeds and blocking latencies in the hundreds of milliseconds. With no second, separate SPI bus AND no threading in the PIC32 code, this was almost a lost cause.
However, I was able to refactor the code and optimize it enough to pull 600 KB/s write / 900 KB/s read. By further optimizing SPI transactions to make use of the PIC32's 32-bit SPI mode I was able to push it to 960 KB/s write / 1175 KB/s read. I was also able to reduce optimized sequential write latency to 410 microseconds. Not quite in Teensy 2MB/s write / 4000Hz latency territory, but good enough.
It is still a work in progress, so further gains could be made. Improving the performance by a factor of roughly 10 is fine by me.
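The gain from 32-bit SPI mode comes from moving four bytes per register access instead of one. The following is a rough host-side sketch of the packing step only, assuming the peripheral shifts the most significant bit out first; `packWord` and `sendBlock32` are illustrative names, and the transfer function is supplied by the caller rather than being a real PIC32 register access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack four consecutive bytes into one word with the first byte in the top
// bits, so an MSB-first shifter puts the bytes on the wire in buffer order.
inline uint32_t packWord(const uint8_t* p) {
    return (static_cast<uint32_t>(p[0]) << 24) |
           (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8)  |
            static_cast<uint32_t>(p[3]);
}

// Send a buffer as 32-bit words through a caller-supplied transfer function.
// Returns the number of words sent.
template <typename Xfer>
std::size_t sendBlock32(const uint8_t* buf, std::size_t len, Xfer xfer32) {
    assert(len % 4 == 0);  // a 512-byte SD sector is exactly 128 words
    std::size_t words = 0;
    for (std::size_t i = 0; i < len; i += 4, ++words) {
        xfer32(packWord(buf + i));
    }
    return words;
}
```

A 512-byte sector then costs 128 transfer calls instead of 512, which is where the reduced per-byte overhead would come from.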
Progress on that front is documented here: viewtopic.php?f=7&t=3262
What remains to be done:
I already have added rudimentary CAN processing logic, but plan to add logic to decode CAN data in realtime. That is, to "sniff" RPM, TPS, etc data off the bus directly and display it on a LCD/TFT. I have already begun decoding the PIDs used on this vehicle and extracting useful data.
Hardware 3.3v vs 5.0v issues: PIC32 can't initialize the CANbus shields if it is plugged into a USB powered hub (VCC 4.9v), but works just fine if it is in a passive hub (VCC 4.5v). This is really annoying but I don't know how to fix it.
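As a rough idea of what the realtime decoding step could look like: most of these signals are a byte or two pulled out of the data field and scaled. The vehicle-specific IDs and scalings are not given here, so the sketch below borrows the standard OBD-II scalings for engine speed (PID 0x0C) and throttle position (PID 0x11) purely as stand-ins; the function names and byte offsets are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Engine speed from a 16-bit big-endian field, a quarter rpm per bit.
// This is the OBD-II PID 0x0C scaling; a proprietary frame may differ.
inline float decodeRpm(const uint8_t* data, uint8_t len, uint8_t offset) {
    assert(offset + 1 < len);  // both bytes must lie inside the payload
    uint16_t raw = static_cast<uint16_t>((data[offset] << 8) | data[offset + 1]);
    return raw / 4.0f;
}

// Throttle position in percent from a single byte (OBD-II PID 0x11 scaling).
inline float decodeThrottlePct(uint8_t raw) {
    return raw * 100.0f / 255.0f;
}
```

For example, the bytes 0x1A 0xF8 decode to 6904 / 4 = 1726 rpm, and a throttle byte of 255 decodes to 100%.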
Progress on that front is documented here: [TO BE COMPLETED LATER]
Further still, I hope to add diagnostic functionality to the device. A task like reading/resetting error codes requires a single specially crafted CAN packet, or possibly a session negotiation protocol. However, this might burn through the limited flash quota available on a ChipKit, and at that point I may migrate to another embedded system. The TI line of triple-core, dual-CAN Launchpad boards seems attractive. Another option is to port the code to a real-time OS where I could make better use of the peripherals. But I intend for this project to be accessible to end owners, so I might not get too carried away with proprietary equipment.
Well, that's a quick summary of what I've been up to. Pictures will be uploaded later. If there's interest I may make these to sell. It's actually intended for... *cough* ahem performance vehicle use. Back to the code mines...