chipKIT® Development Platform

Inspired by Arduino™

WF32 microSD Bit-Bang

Created Sat, 28 Jun 2014 17:37:31 +0000 by Max44


Max44

Sat, 28 Jun 2014 17:37:31 +0000

I have to say I was disappointed to find my WF32 board had the microSD card wired to GPIO using a bit-bang routine versus a hardware SPI port, which would run much faster than a "softSPI" implementation. I see that the new version of the board with the PIC32MZ will allow pin re-configuration to get around this. I don't want to purchase a new board just now, however.

I've found that a C/C++ softSPI implementation leaves much to be desired, even running at 80 MHz. I'm looking into an assembly routine, but thought I'd ask if anyone knows if/where one already exists for PIC32.


majenko

Sat, 28 Jun 2014 19:22:51 +0000

You and me both. The chip has 4 SPI ports on it - why they chose to use software SPI is beyond me. Not that it makes much difference at the moment in MPIDE as it uses software SPI even on hardware ports. I have a better hardware implementation in UECIDE, but for the WF32 it still uses the old software SPI method.

If you do come up with a good fast implementation I'd certainly be interested in incorporating it...


Max44

Sat, 28 Jun 2014 22:57:32 +0000

Well .....................

I don't know about "good", since I haven't done much assembly language programming since the ancient days of working with a Z80. I did put together a prototype PIC32 assembly function that transmits and receives one SPI byte. There is no loop .... it just bangs out each bit inline. With the CPU running at 80MHz, the SPI clock looks to be a little more than 5MHz during the byte transmit/receive. The generated SPI clock is not always 50/50, because some branches on whether a bit is 0 or 1 alter the timing. I've just finished simulating it in MPLAB X and testing CMD0 initialization with a SanDisk 2GB microSD card on the WF32 board. It returned a 0x01 response, which is a good sign. My next task is to integrate it into the Microchip SPI-SD program and verify it can be used to read/write files.

I can send you a copy for further testing, but I'd like assurance that it will end up open source.

Max


majenko

Sat, 28 Jun 2014 23:06:21 +0000

Everything I do is BSD 3-clause ;)


Max44

Sun, 29 Jun 2014 01:40:45 +0000

I don't see an email address, so I'll just post the routine here.

Call the routine as:

//
// Assembler SPI transmit/receive function
//
extern unsigned char SPIxByte(unsigned char BIn);

and:

SPIByteIn = SPIxByte(SPIByteOut);


This routine was replaced by separate send and receive routines. See below.
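For anyone reading along without the attachment, here is a minimal C sketch of what a one-byte exchange does. It is not the assembly routine from this post: it uses a loop where the assembly is unrolled, and `sim_port` and `sim_loopback` are made-up stand-ins so it can run on a PC. The pin masks are the PORTG ones used on the WF32, and SPI mode 0, MSB first, is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Pin masks as used in this thread (PORTG on the WF32). */
#define MOSI_MASK 0x2000u  /* RG13 */
#define SCK_MASK  0x4000u  /* RG14 */
#define MISO_MASK 0x8000u  /* RG15 */

/* Hypothetical stand-in for the port so the sketch runs on a PC.
 * On hardware this would be the real port and latch registers. */
static volatile uint32_t sim_port;

/* Hypothetical loopback "card": MISO simply mirrors MOSI. */
static void sim_loopback(void)
{
    if (sim_port & MOSI_MASK) sim_port |= MISO_MASK;
    else                      sim_port &= ~MISO_MASK;
}

/* Exchange one byte, MSB first, assuming SPI mode 0: data is set up
 * while the clock is low and sampled after the rising edge. */
uint8_t spi_xfer_byte(uint8_t out)
{
    uint8_t in = 0;

    assert((sim_port & SCK_MASK) == 0);   /* clock must idle low */

    for (int bit = 7; bit >= 0; bit--) {
        if (out & (1u << bit)) sim_port |= MOSI_MASK;   /* present bit */
        else                   sim_port &= ~MOSI_MASK;
        sim_loopback();                                 /* card reacts */

        sim_port |= SCK_MASK;                           /* rising edge */
        in = (uint8_t)((in << 1) | ((sim_port & MISO_MASK) ? 1u : 0u));
        sim_port &= ~SCK_MASK;                          /* falling edge */
    }
    return in;
}
```

With the loopback stand-in, whatever byte goes out comes straight back, which makes it an easy sanity check before touching real pins.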



majenko

Sun, 29 Jun 2014 14:37:24 +0000

Eee, by 'eck, that's a bit wordy. It's crying out for a loop.

Also, it's going to need to be made generic, so you can pass a set of pins to it.

Another thing - if you want to increase throughput, it doesn't really matter about the duty cycle of the clock, so you can lose some of the NOPs and it'll still work fine.


Max44

Sun, 29 Jun 2014 17:08:07 +0000

Hah-hah! Glad you're impressed! Yeah, it's brutal .... but isn't that, as the name implies, what bit banging is? Although seemingly "wordy", I bet the instruction count isn't much worse than what the C compiler spits out for a SoftSPI routine. I didn't want to burden the code with managing loop counters and generic setup. All that tends to slow it down, and I was orienting it specifically towards speeding up the WF32 microSD interface. I'll try your advice and remove some of the nops for further testing.

I started out debugging this by using pins in PORTE on the WF32 connectors so I could hook up a scope and check timing. It's easy enough to change the pins you want to use in the first few lines:

la $t6, PORTG;   # Setup PORTG address (0xBF886190)
li $t4, 0x2000;  # MOSI bit mask
li $t5, 0x4000;  # SCLK bit mask
li $t7, 0x8000;  # MISO bit mask

Just change the port address to the one you're using and set the mask bit for the pin.

I suppose this could be done with some input values .... or even better, some settings from a config file if that's possible. I'm new to the MIPS32 assembler, but I'll take a look ... or someone with relevant experience can contribute.

As we discussed already, you should really be using hardware SPI for a SD card. Especially for a webserver that's pulling web pages off the SD card! None of this nonsense should be necessary.


majenko

Sun, 29 Jun 2014 17:18:57 +0000

My UECIDE SD library allows you to specify the pins to use in the constructor for software SPI, so passing the pins (or obtaining them from variables in some way) would be necessary.


Max44

Sat, 05 Jul 2014 16:31:22 +0000

I updated the assembly bit-bang routine by removing extra nop cycles as discussed above. And I edited the previous post with it.

I tested the WF32 assembly bit-bang routine using FatFs and the benchmark port to PIC32 published by Riccardo Leonardi. I replaced the PIC32 SPI interface in mmcPIC32.c with the assembler function. It looks to work fairly well. Writing 24MB took 111.643 seconds, ~220 KB/sec. While not nearly as good as a hardware SPI can do running at 20 MHz, I think it's a big improvement over a C/C++ softSPI routine.


Max44

Sat, 05 Jul 2014 16:54:16 +0000

Matt,

Despite garnering an "Eee, by 'eck", I can look further into implementing passing pin parameters into the WF32 assembly SPI bit-bang if you think it's worthy. Point me to where the constructors you mentioned are in the UECIDE libraries and I'll experiment with it.

Regards, Max


majenko

Sat, 05 Jul 2014 20:48:11 +0000

There are 2 functions that deal with software SPI - spiRec and spiSend. They are in Sd2Card.cpp: https://github.com/UECIDE/pic32/blob/master/cores/chipKIT/libraries/SD/utility/Sd2Card.cpp#L56

The great thing is it's always unidirectional, so you don't need a full SPI implementation (I use shiftIn and shiftOut at the moment) which would make your code much simpler.
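To illustrate the unidirectional point, here is a rough C sketch of a send-only and a receive-only routine. It is not the Sd2Card code or the assembly from this thread. `sim_port`, `sim_card_rx`, `sim_card_tx` and `sim_card_load` are hypothetical stand-ins for the port and the card so it runs on a PC, and holding MOSI high while receiving is an assumption based on the usual SD-over-SPI practice of sending 0xFF during reads.

```c
#include <assert.h>
#include <stdint.h>

#define MOSI_MASK 0x2000u
#define SCK_MASK  0x4000u
#define MISO_MASK 0x8000u

static volatile uint32_t sim_port;  /* hypothetical simulated port */
static uint8_t sim_card_rx;         /* bits the simulated card clocked in */
static uint8_t sim_card_tx;         /* byte the simulated card shifts out */

/* Simulated card drives MISO with the top bit of its output byte. */
static void sim_present_miso(void)
{
    if (sim_card_tx & 0x80u) sim_port |= MISO_MASK;
    else                     sim_port &= ~MISO_MASK;
}

/* Test helper: tell the simulated card what to send next. */
void sim_card_load(uint8_t b)
{
    sim_card_tx = b;
    sim_present_miso();
}

static void sck_high(void)
{
    sim_port |= SCK_MASK;
    /* Card samples MOSI on the rising edge. */
    sim_card_rx = (uint8_t)((sim_card_rx << 1) |
                            ((sim_port & MOSI_MASK) ? 1u : 0u));
}

static void sck_low(void)
{
    sim_port &= ~SCK_MASK;
    /* Card shifts to its next output bit after the falling edge. */
    sim_card_tx = (uint8_t)(sim_card_tx << 1);
    sim_present_miso();
}

/* Send only: MISO is never read. */
void spi_send(uint8_t out)
{
    assert((sim_port & SCK_MASK) == 0);   /* clock must idle low */
    for (int bit = 7; bit >= 0; bit--) {
        if (out & (1u << bit)) sim_port |= MOSI_MASK;
        else                   sim_port &= ~MOSI_MASK;
        sck_high();
        sck_low();
    }
}

/* Receive only: MOSI is set high once and then left alone. */
uint8_t spi_rec(void)
{
    uint8_t in = 0;
    sim_port |= MOSI_MASK;
    for (int i = 0; i < 8; i++) {
        sck_high();
        in = (uint8_t)((in << 1) | ((sim_port & MISO_MASK) ? 1u : 0u));
        sck_low();
    }
    return in;
}
```

Each routine touches one data line only, so the per-bit work is smaller than in a full-duplex exchange.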

I need to re-write the block read/write routines (which just loop calling spiSend or spiRec at the moment) to use bulk hardware SPI operations. I did it once, but lost it :(


Max44

Mon, 07 Jul 2014 22:32:58 +0000

I'm floundering around trying to find a simple example that accesses the WF32 microSD card which builds in UECIDE and runs on the board. I saw on the UECIDE forum from about a year ago you had a SD benchmark test that would be ideal if it runs on the WF32. So far I haven't been able to figure out what settings I need to provide for the WF32 and my attempts just get ye olde "Initialization failed!".

I could create look-alike spiRec and spiSend functions in my existing MPLAB X test .... but they'd end up being more Microchip-like versus Arduino-like. I could get a good measure of existing SD speeds this way, though, and look at what the XC32 compiler is doing with the shiftIn and shiftOut functions.

I'd like to assess if the shiftIn and shiftOut functions are already fairly efficient. It seems like they should be and perhaps there's not much to be gained in an assembly language function. It's worth finding out, though. Any idea what the current SD card speeds are for the WF32?


Max44

Tue, 08 Jul 2014 17:31:31 +0000

More floundering, but some progress.

I got the test that writes 10 MB to the WF32 microSD card running in MPIDE. A power on reset to the board seemed to have helped get through the initialization.

It reported 210.837 sec to write 10MB (10,485,760 Bytes) and 128.466 sec to read. This is a write speed of 49.7KB/sec ..... so definitely room for improvement.
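As a quick check on the arithmetic, here is a small C helper (the name `rate_kB_per_sec` is made up for illustration). It assumes 1 KB = 1000 bytes, which is the convention that gives the 49.7 figure above. The same formula applied to the read time gives roughly 81.6 KB/sec.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in units of 1000 bytes per second. */
double rate_kB_per_sec(uint64_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / seconds / 1000.0;
}
```

For example, `rate_kB_per_sec(10485760, 210.837)` is the write figure and `rate_kB_per_sec(10485760, 128.466)` is the read figure.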

I'll start constructing some shiftIn and shiftOut routines in PIC32 asm. I could use some guidance as to how to modify the Sd2Card source and include the modified version in a build versus the library file(s).

Here's a screen copy of the COM window:


Max44

Thu, 10 Jul 2014 01:02:29 +0000

I finally got the SD 10MB write/read test running in UECIDE (v0.8.6i) on my WF32 board. I needed to add:

// Create an instance of Sd2Card with the WF32 pins
Sd2Card WF32SD(52,49,50,51);
SDClass wSD(WF32SD);

Initializing SD card .....initialization done.
Deleting existing file
Writing 10MB to test.txt...done.
Time taken: 364435ms
Reading data from test.txt: done.
Time taken: 316439ms


Max44

Sun, 13 Jul 2014 16:22:34 +0000

After some work in MPIDE and UECIDE, it looked like improvements could be made with separate and simpler send and receive routines. This is also the way all the file libraries work. This proved to be true, as my FatFs test increased to 270 KB/sec. This uses the PIC32 test posted on the Microchip Forum:

http://www.microchip.com/forums/m563218.aspx

Following the directions in the XC32 Compiler manual for integrating assembly language routines, I also figured out how to make the assembly routines somewhat generic by setting variables for the pin position and PIC32 port.

Here's a sample of the C setup:

//
// Assembler SPI transmit function
// Assembler SPI receive function
//
//extern unsigned char SPIxByte(unsigned char BIn );
extern void SPIsendL(unsigned char BOut );
extern unsigned char SPIrecL(void);
//
// Variables for the assembler functions
//
#define _WF32_BOARD_

#if defined(_WF32_BOARD_)
#define SPI_PORT 0xBF886190 // PORTG base address
#define SCKmask 0x00004000  // Set a 1 in the SCK port bit position (RG14)
#define MISOmask 0x00008000 // Set a 1 in the MISO port bit position (RG15)
#define MOSImask 0x00002000 // Set a 1 in the MOSI port bit position (RG13)
#endif

volatile unsigned int SPIp = SPI_PORT;
volatile unsigned int MISOm = MISOmask;
volatile unsigned int SCKm = SCKmask;
volatile unsigned int MOSIm = MOSImask;

I attached the assembler code in the next post.

This code assumes the SPI pins are in the same PIC32 port. It should be possible to feed in a different port for each pin ..... if there are enough registers left! Note I used the temporary registers, of which there are 10. This allows the assembly routine not to have to save existing data on the stack. So far, I haven't seen any problems with data being corrupted.
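On the idea of a different port for each pin, here is one way it might look in plain C. This is only a sketch and not the attached assembly. The `spi_pin` and `soft_spi_tx` types and the function names are hypothetical, and the read-modify-write in `pin_write` is used just to keep it portable; on the PIC32 the SET and CLR registers would presumably be the better choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pin = a pointer to its port register plus a single-bit mask. */
typedef struct {
    volatile uint32_t *port;
    uint32_t mask;
} spi_pin;

/* Transmit side of a software SPI bus; each pin may be on its own port. */
typedef struct {
    spi_pin mosi;
    spi_pin sck;
} soft_spi_tx;

static void pin_write(spi_pin p, int level)
{
    if (level) *p.port |= p.mask;
    else       *p.port &= ~p.mask;
}

/* Send one byte, MSB first, data set up before the rising clock edge. */
void soft_spi_send(const soft_spi_tx *bus, uint8_t out)
{
    assert(bus != NULL && bus->mosi.port != NULL && bus->sck.port != NULL);
    for (int bit = 7; bit >= 0; bit--) {
        pin_write(bus->mosi, (out >> bit) & 1);
        pin_write(bus->sck, 1);
        pin_write(bus->sck, 0);
    }
}
```

The extra pointer per pin is the cost: in assembly that means one more register (or a reload) for each port, which is the register pressure mentioned above.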

I found by trial and error I had to add a couple cycles setting and clearing the clock for my microSD card to operate (an old 2GB SanDisk). Keep in mind a PIC32 running at 80MHz can toggle outputs at 40MHz.


Max44

Sun, 13 Jul 2014 19:15:10 +0000

Here's a zip file of the WF32 Bit-Bang assembly routines


Max44

Sun, 13 Jul 2014 19:47:29 +0000

And a copy of the MPLAB X working directory.


Max44

Sun, 02 Nov 2014 20:35:38 +0000

I had occasion to hook up an external microSD breakout board to my WF32 board on SPI2. Testing with my old 2GB SanDisk card didn't result in much better performance than the bit-bang routines .... even with the SPI running at 20 MHz. So I bought a 16GB SanDisk Ultra microSDHC UHS-1 card. This resulted in much better performance, both from the SPI port and the bit-bang routines. Here's a summary of the test results from the FatFs 24MB write test above:

SPI2:
20 MHz: 740 KB/sec
13.3 MHz: 645 KB/sec
10 MHz: 549 KB/sec

WF32 Assembly Bit-Bang: 575 KB/sec

So, the conclusion is the assembler bit-bang routines run just slightly faster than a 10 MHz SPI connection. IMHO, this is considerably better than a C soft-SPI routine. Now if I could just find a MRF24WG0MA compatible TCP/IP stack that uses FatFs!

I tried removing some of the nop cycles in the assembly routines I used to widen the bit-bang SPI clock pulse. This failed, so I left them in for the above test.


Max44

Wed, 05 Nov 2014 21:37:29 +0000

I thought I should also test the FatFs file read, and added a test to read the 24MB file created. I discarded the data to just time the file read. I found the bit-bang read routine was slower than the write, coming in at around 511 KB/sec with the SanDisk Ultra card. I tweaked the read routine to move the SPI clock clear and eliminate extra nop cycles added for clock pulse timing. It improved the read data rate to around 585 KB/sec. I've included the assembler routines with this message.

BTW, I did look at the file(s) created to verify a 24 MB file was created and read the file to observe that the "IMTOTALLYGOINGSOFAST" string was repeated in the file. For the read test, I did verify the last 4K block read was equal to the 4K block written.

MPLAB X stopwatch indicated the assembly routines take around 100 clock cycles to move a byte, so the maximum burst data rate for an 80 MHz PIC32 would be 800KB/sec. Getting well over 500 KB/sec data flow is pretty efficient for FatFs + assembler SPI bit-bang + SanDisk Ultra card.
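The burst-rate estimate is just CPU clock divided by cycles per byte. A tiny C sketch of that, with made-up helper names, plus how close the 585 KB/sec read figure gets to it (taking 1 KB as 1000 bytes):

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on bytes/sec if every byte costs cycles_per_byte cycles. */
uint32_t burst_bytes_per_sec(uint32_t cpu_hz, uint32_t cycles_per_byte)
{
    assert(cycles_per_byte > 0);
    return cpu_hz / cycles_per_byte;
}

/* Measured rate as a whole-number percentage of the burst rate. */
uint32_t percent_of_burst(uint32_t measured, uint32_t burst)
{
    assert(burst > 0);
    return (uint32_t)(((uint64_t)measured * 100u) / burst);
}
```

With 80 MHz and 100 cycles per byte this gives 800,000 bytes/sec, and 585,000 bytes/sec works out to about 73% of that.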

I hope those who have a WF32 and don't want to add an additional microSD connector have found these routines useful. Of course a hardware SPI peripheral will do much better, with the added advantage of being able to use DMA. I've read on the Microchip forum SPI + DMA getting up towards 2 MB/sec data rates. My SPI microSD card running at 20 MHz will read at 1 MB/sec.


Next-TU

Mon, 22 Dec 2014 06:44:48 +0000

Hi Max44,

I have been struggling with the WF32 SDcard for a few days now. Happened to see your posts and code. Helped me a lot. Thanks for posting.


Max44

Wed, 14 Jan 2015 21:24:55 +0000

I wanted to look at what the bit bang routines were doing with a scope. I finally got around to setting up a test fixture. Since the WF32 has no convenient test points for the microSD signals, I used a Sparkfun microSD sniffer board.

It turns out this was a good idea! In doing this, I noticed there were transitions on the data (MOSI) line at unexpected times. I attributed this to an error in the offset to the Port G latches when clearing SCK ..... it was 0x04 and should be 0x14. I corrected this and the data looked much cleaner. While it worked before, the extra transitions were undesirable. I found I had also propagated this error to the read routine. I've attached the updated assembly routines.
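To make the two offsets concrete, here is a small C sketch of where they land relative to the PORTG base address. The layout is an assumption from my reading of the PIC32MX register map (each register followed by its CLR, SET and INV aliases at +4, +8 and +12, with LATx 0x10 above PORTx), so check it against the datasheet. The names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PORTG_BASE 0xBF886190u  /* PORTG address used earlier in this thread */

/* Offsets from PORTG_BASE, assuming the usual PIC32MX layout. */
enum {
    OFF_PORT     = 0x00,  /* PORTG    */
    OFF_PORT_CLR = 0x04,  /* PORTGCLR (the offset used by mistake) */
    OFF_LAT      = 0x10,  /* LATG     */
    OFF_LAT_CLR  = 0x14,  /* LATGCLR  (the corrected offset) */
    OFF_LAT_SET  = 0x18   /* LATGSET  */
};

/* Address of a register at a word-aligned offset from a base. */
uint32_t reg_addr(uint32_t base, uint32_t offset)
{
    assert((offset & 3u) == 0);
    return base + offset;
}
```

Under that assumption, 0x04 points at the clear alias of the port register and 0x14 at the clear alias of the latch.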

The attached zip file has some screen shots of clock and data during the write test using the updated assembly routines.

Things to note:

Clock pulse width varies - I measured 250 nsec or 500 nsec

Data out to clock time is sometimes only one CPU clock time, 12.5 nsec (as pictured). I think this might be pushing it ..... but it works.

Byte to byte times vary slightly, but overall burst data rate looks good >750 KB/sec.


Max44

Mon, 16 Feb 2015 21:06:17 +0000

I thought as a more thorough test I would write an incrementing byte pattern (0x00 to 0xFF) through the entire 24 MB FatFs test. I also modified the test to initialize the WF32 RTCC and use it for the file timestamp as well as to check timing. Finally, I added a read and verify of the entire 24 MB. All seems to be working well with my SanDisk Ultra microSD card and the assembler BitBang routines.
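For anyone wanting to try the same kind of check, here is a minimal C sketch of an incrementing-pattern fill and verify. It is not the code in the attached zip: the function names are hypothetical, and basing each byte on its absolute file offset is my assumption about how to keep the pattern continuous across blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill buf with 0x00..0xFF repeating, based on the byte's file offset. */
void fill_pattern(uint8_t *buf, size_t len, size_t file_offset)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)((file_offset + i) & 0xFFu);
}

/* Returns the index of the first mismatch, or len if the block is good. */
size_t verify_pattern(const uint8_t *buf, size_t len, size_t file_offset)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != (uint8_t)((file_offset + i) & 0xFFu))
            return i;
    }
    return len;
}
```

The idea is to fill each block before the write, then run the verify on each block after reading it back, passing the same file offset both times.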

I attached a zip file of the MPLAB X directory with the revised test, which includes the most recent assembler BitBang files. Also included is a screen shot of the test pattern with a binary file reader (TestData.pdf).