I have been looking at the new ARM Cortex-M3 from ST for a while, and I now have my latest dev board up and running.
----------------
Benchmark test 1: I have a nice 3rd order lowpass filter with floating point calculations:
float filter(float s)
{
    Z[3] = Z[2]; Z[2] = Z[1]; Z[1] = Z[0];   // the rolling buffer
    Z[0] = s - (D[1]*Z[1] + D[2]*Z[2] + D[3]*Z[3]);
    return Z[0]*N[0] + Z[1]*N[1] + Z[2]*N[2] + Z[3]*N[3];
}
Here are the filter constants used:

float N[]  = {1.67e-02, 5.01e-02, 5.01e-02, 1.67e-02};
float D[]  = {1, -1.798, 1.221, -.2898};
float Z[4] = {200.001, 200.001, 200.001, 200.001};
This code was written and tested for the AVR, but exactly the same C code compiles and runs just fine on the STM32!
AVR at 8 MHz:    time = 2200 µs (244 µs at 72 MHz, calculated), GCC compiler for AVR
STM32 at 72 MHz: time = 12-14 µs (126 µs at 8 MHz, calculated), Keil compiler
STM32 at 72 MHz: time = 16-41 µs, GCC compiler
STM32 at 72 MHz: time = 12-15 µs, IAR compiler
So the STM32 is 157-183 times faster, but the clock is also 9 times faster, which means at the SAME clock the STM32 is roughly 17-20 times faster than the AVR.
Optimizer changed from none to max speed: 14 µs to 12 µs.
If you call the filter routine with zero it is known to be a bit faster, but I called it with a varying number and saw almost the same result every time.
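For anyone wanting to reproduce the measurement, here is a minimal sketch of one way to time the filter call on the STM32 using the SysTick counter. This is only my illustration based on standard CMSIS names, not the original test setup; filter() is the function from the post above.

#include "stm32f10x.h"           // CMSIS device header (assumed project setup)

float filter(float s);           // the filter from the post above

volatile float result;           // volatile so the call is not optimized away

uint32_t time_filter_cycles(void)
{
    SysTick->LOAD = 0x00FFFFFF;                 // maximum 24-bit reload value
    SysTick->VAL  = 0;                          // writing VAL clears the counter
    SysTick->CTRL = SysTick_CTRL_CLKSOURCE_Msk  // clock SysTick from the core clock
                  | SysTick_CTRL_ENABLE_Msk;    // start counting down

    uint32_t start = SysTick->VAL;
    result = filter(200.001f);                  // the call being measured
    uint32_t stop  = SysTick->VAL;

    return start - stop;                        // elapsed core cycles (divide by 72 for µs at 72 MHz)
}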
Benchmark test 2: Pin toggle. With the port B output driver configured for 50 MHz and a 72 MHz core:
Keil compiler: the STM32 can toggle a pin at 20.8 MHz, so if you set a bit high and then low the pulse is only 24 ns wide!!
GCC compiler: the toggle rate is 7.3 MHz, so the pulse is 68 ns wide.
IAR compiler: the toggle rate is 19.2 MHz, so the pulse is 26-109 ns wide, depending on the optimization level.
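As a reference for how such a toggle test can look, here is a small sketch using CMSIS-style register access on the STM32F10x; this is only my illustration, not the code that produced the numbers above.

#include "stm32f10x.h"   // CMSIS device header (assumed project setup)

// Assumes the GPIOB clock is already enabled and PB0 is configured
// as a 50 MHz push-pull output.
void toggle_pb0_forever(void)
{
    for (;;)
    {
        GPIOB->BSRR = (1u << 0);   // atomic set of PB0 (bit set register)
        GPIOB->BRR  = (1u << 0);   // atomic clear of PB0 (bit reset register)
    }
}

The narrow pulse is the high period between the two register writes; the low period also includes the loop branch.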
An AVR mega can toggle a pin at half its core frequency (pin at 10 MHz with a 20 MHz core), so if you set a bit high and then low the pulse is 125 ns wide at 8 MHz (50 ns wide at a 20 MHz clock). So a 20 MHz AVR will outperform the STM32 in bit-banging applications only when the STM32 is built with the GCC compiler.
-----------------------------------------------------
code size test floats:
STM32, this filter plus init of the floats (Keil uVision compiler):
O0: 1108 bytes
O3: 1068 bytes
--
STM32, this filter plus init of the floats (GCC compiler):
No optimization: 896 bytes
O3: 896 bytes
Max: 896 bytes
Funny that the optimizer levels don't change a thing in either size or speed of this filter. GCC wins on smallest code, loses on speed.
--
STM32, this filter plus init of the floats (IAR compiler):
None: 1000 bytes
Medium: 926 bytes
High: 918 bytes
Several optimizer options exist, rather complex.
--
AVR, this filter plus init of the floats (GCC compiler here):
O0: 2046 bytes
O3: 2248 bytes
-Os: 1838 bytes
(All optimized results were tested and still function; the filter call takes the same time to execute.)
It is a known fact that the IAR compiler can make the code size much smaller for normal AVR programs; on another project I have seen 50% of the code size with IAR, and the program still worked.
--------------------------------------------------
Code size test, chars and integers: still needs to be done.
--------------------------------------------------
Price: OK, it is crazy cheap with the STM32!! You get so many internal features and so much code space for so much less compared to the AVR, 128 kB vs 128 kB!!
-----------------
ADC: The AVR has a 10-bit internal ADC, and normally we see 1 LSB of jitter with no extra software filtering added. On the STM32 dev board from Keil I have here I see 4-5 LSB of jitter, but OK, it is 12-bit resolution, so I would say it is about the same internal digital noise. The STM32 ADC can run at 1 MSPS!! So it can handle plenty of oversampling and digital noise filtering = much better resolution and still a much faster ADC result.

I made a normal running average like this on the STM32:

adcfiltered = ((adcfiltered*0.9) + (newadc*0.1)); // this gives 12 bits usable resolution, 0-4000

This is a 10-cycle filter; now I have the full 12 bits of resolution with under 1 LSB of jitter.

I have proved by lab test that if the running average is changed a bit, so that it accumulates the noise, the output becomes a calculated higher value = more bits of resolution thanks to the noise. This gives a much lower speed, but OK, this converter can handle 1 megasample per second at 12 bits!

adc13bits = ((adc13bits*0.99) + ((float)ADC_ConvertedValue*0.02)); // this gives 13 bits usable resolution, 0-8000
adc14bits = ((adc14bits*0.99) + ((float)ADC_ConvertedValue*0.03)); // this gives 14 bits usable resolution, 0-12000
-------------------
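For reference, the steady-state behaviour of a filter of the form y = y*a + x*b with a constant input x is y = x*b/(1-a); this is standard filter math, not something measured on the board. It explains why the 0.99/0.02 pair roughly doubles the scale (about 0-8000), the 0.99/0.03 pair roughly triples it (about 0-12000), and the 0.999/0.016 pair used further down lands near 16x (about 0-65500).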
Upgradeability: The STM32 exists in 48, 64 and 100-pin LQFP packages, easy to hand solder, and it works fine even on 2-layer boards. They also come with 32, 64 or 128 kB of flash; the cheapest is 1.8 $ and the biggest 3.6 $. Pin-compatible slower and cheaper versions exist as well, and they will release 256 kB and 512 kB parts soon.
Latest news: this gives you 16-bit resolution!!!

adc16bits = ((adc16bits*0.999) + ((float)ADC_ConvertedValue*0.016)); // 16 bit

It is still nice and linear, no shit, try it! OK, I had to add one more 10k potentiometer and a 1 Meg resistor to the first pot to really prove I have a usable 16-bit resolution.
No no, Stewee, the whole idea of oversampling only works IF the input is not stable. As I said, in this case the ADC actually generates 4-5 LSB of internal noise, and that is actually perfect: with oversampling and additive filtering you will in fact be able to see input signals that are 16 times below one LSB of the original 12-bit ADC!! That equals 16-bit resolution, with a full-scale analog voltage producing a result that goes from 0 to 65535, and you can actually use every count as a final result. I have tried it today, so I know this works!
I believe that it works only if the noise is white, i.e. purely random. In the case of a microcontroller the noise comes from Vcc, the CPU's own consumption (interrupts, ...), and the other peripherals (IOs, USB, etc.). Therefore the result would depend a lot on the activity of the close environment. Anyway, it is certain that filtering will improve the result.
;-) Nobody is saying oversampling is homeopathy, or that you are wrong, indeed it is 'VERY' helpful. Let me clarify:
My concern is that we 'ONLY' need to AVERAGE MORE noisy samples to get MORE resolution, NOT use coefficients that add up to more than 1, e.g. 0.99 and 0.02.
The more samples you average, the more bits of resolution you can assume, if the noise is white. Every 64 samples averaged can give one more bit of resolution, so the expression for 16-bit resolution would be: average = average * 255/256 + sample * 1/256; (pretty much what you had, thanks).
If the coefficients do not add up to 1, your tests will still give better resolution, as you found, if their RATIO is about right, but an offset will grow in the final measurement over a VERY long time.
I am concerned readers under pressure might cut and paste your excellent benchmark expressions into long running applications that could suffer from an offset problem. That is all.
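To make the normalized form concrete, here is a small sketch of how it might look in C; the scaling to the 16-bit range is my addition, and the variable names are only illustrative.

#include <stdint.h>

float    average;   // smoothed 12-bit value, 0..4095
uint16_t adc16;     // scaled 16-bit result, 0..65520

void update(uint16_t sample)              // sample = raw 12-bit ADC reading
{
    // coefficients sum to exactly 1, so no gain or offset creeps in
    average = average * (255.0f / 256.0f) + (float)sample * (1.0f / 256.0f);

    // scale the smoothed value up to the 16-bit range afterwards
    adc16 = (uint16_t)(average * 16.0f);  // 4095 * 16 = 65520 max
}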
The step response of the pseudo average is different from that of a true rolling average. It takes longer for the pseudo average to fully arrive at the new reading after a step change in the measured signal. The step response of a true rolling average is a straight incline over the buffer length; the step response of the pseudo rolling average is an exponential curve that is steepest right after the step and then flattens out, approaching the new value only asymptotically.
You are right! That is why I have combined averaging first and then the 3rd order lowpass, to give me a fast-reacting and nicely noise-free result :-)
Going from 12 to 16 bits of ADC resolution I see only 16 times oversampling needed, but if you also want to settle the noise, 64 to 128 times is more likely to be used.
It is also right that I have a few calculation errors. The

((adc16bits*0.999)+((float)ADC_ConvertedValue*0.016)); // 16 bit

does not give exactly 16 times the ADC value, but more like 15.782. It is a constant error though, so I am still happy.
When going from 12 to 16 bits you will need 256x oversampling, like Stewee said above. Also, when making a program it is much faster to use integer shifts instead of floats, and the 256x oversampling is exactly a shift of one byte. Better to try something like this:

Sum = Sum - (Sum >> 8) + AdcValue;   // parentheses needed: >> binds weaker than - in C
Value16bit = Sum >> 4;
A small problem (depending on your application) is that this will creep only slowly toward the right value after a change. For a fast response you will need to use a 256-value buffer instead of an IIR filter like the one above.
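A minimal sketch of that 256-sample buffer (a true moving average), assuming a 12-bit result arrives in AdcValue; the names and structure are only illustrative.

#include <stdint.h>

static uint16_t buf[256];   // the last 256 raw 12-bit samples
static uint32_t sum;        // running sum of everything in buf (max 256*4095)
static uint8_t  idx;        // buffer index, wraps from 255 back to 0 by itself

uint16_t average16(uint16_t AdcValue)
{
    sum     -= buf[idx];    // remove the oldest sample from the running sum
    buf[idx] = AdcValue;    // overwrite it with the newest sample
    sum     += AdcValue;    // and add the newest sample to the sum
    idx++;                  // uint8_t overflow gives the 256-entry wrap for free

    return (uint16_t)(sum >> 4);   // 256*4095/16 = 65520, a 16-bit-range result
}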
Is it possible that you post your source code for benchmark test 2, the one that shows how to access a port? Because at the moment I have no idea how to write a program that toggles a pin, especially with the Primer.