Squidly’s Progress

I am glad to say that I have made some major progress on one of my MPLAB-X errata issues: specifically, the debugger being unable to resolve the scope of local variables and parameters. After concluding that the support forums had no answers, I bit the bullet and created a support ticket:

In MPLAB-X debugger, function parameters display as “Out of Scope” when they are most clearly NOT out of scope. This forces all sorts of copying non-sense that is very multi-threading risky.

To make a very long story mercifully short, 10 Microchip support responses and 7 customer updates later we finally reached the following point:

Hi Peter,

After further investigation, it is observed that it is a MPLABX issue and not a[n] optimization issue. This will be corrected in the future version of the MPLABX IDE. The concerned team is working on it.

I hope it helps. If you need further support/clarification on the same issue, please update this ticket.

Regards, Microchip Technical Support

Take it from me, I’ve worked in code maintenance and I know that sometimes recreating a problem seen by the customer can be the most difficult part of the process. Getting a firm grip on the nature of the issue is essential to making any progress in resolving it. We are finally over that enormous hurdle and I am confident that progress will be made on this matter!

Finally, on a somewhat less serious note: I am very pleased to announce that I have passed the milestone of 100 postings in the Microchip Technical Support forums. Hooray for me 😉 Hopefully, my postings have been of value or helpful or, at the very least, not harmful or incorrect!

As always, your comments and observations are welcomed!

Peter Camilleri (aka Squidly Jones)

Magnus Errata Part 1

Most of the time, microcontroller errata detail some minor problem with an obscure corner of the chip that only affects a few people. Sometimes though it’s much more serious. Recently, Microchip released errata documentation, covering pretty much the entire PIC32 line, with not one but two major CPU issues. The scope of these issues affects the vast majority of PIC32 projects planned, under development and in the field. This is serious stuff. This article covers the first major issue: Reading data from flash while interrupts are in use.

Flash Data Read Errata

In a nutshell this issue states that when reading data from flash memory (as opposed to fetching instructions from flash) there is a chance of a serious error (data bus exception) if interrupts occur at inconvenient times. Let’s see if this is serious. To have a problem, an application would have to read data from flash and it would have to use interrupts. Without any data to back this up, but based on my years of experience, I would estimate that this issue could affect almost all PIC32 applications!

Given how serious this problem seemed to me, I was perplexed that there was not a lot more being said about it on the discussion forums. So I raised the issue in a posting. This generated a fair bit of interest. For starters, I was NOT the first on the block for this issue. In my defense, the original thread was called “PIC32 SPI fail only at PBclk= 40 Silicon Problem??” which is not exactly shouting “memory errata”. I was correct that many others were concerned:

kalpak wrote:

“For a new project the scary and vague errata drove me into the arms of NXP”

bsder wrote:

“…

Errata 44 (for the 5/6/7 series) lacks a *LOT* of information. Why can’t we just shut the interrupts off? Is the handler the issue? Is the specific mask bit the issue? Is the overall interrupt an issue? The fact that we *don’t* see these every single time says that it’s not *just* an interrupt issue going on here.

In addition, why should their workaround be any more reliable? Maybe it is, but there is a whole bunch of arbitration issues that come in, and I’ve sometimes seen quirky DMA behavior. The fact that they couldn’t interface SPI to the MIPS core certainly doesn’t give me confidence in their DMA engine.

This one is bad. 🙁

crosland wrote:

“Does anyone form Microchip monitor these forums? What’s the plan to fix the most serious CPU errata? It’s simply not acceptable to have to disable interrupts when reading flash nor use DMA to access peripherals. Imagine the task to go through and sanitise the application libraries…”

Finally after nearly a week, the moderators took notice and made the following statement:

“One explanation why most people do not see DBE errata in their applications is that it can only appear if both cache and prefetch are enabled. By default (on chip reset), both features are disabled, and out of the box start-up code from Microchip doesn’t enable them. So, one has to understand these features and initialize them to enable them (or use provided PLib functions).

When both features are enabled, application performance will increase, as much as double (average execution time will be half, vs the same app without cache and prefetch turned on). However, if only one of them is enabled (as workaround suggests), performance increase will be slightly less (on average 90% instead of 100%).

Very minimum impact.

Microchip understands importance of having these issues fixed and is actively working on it. Please check with your local Microchip representative on availability dates and obtaining early samples (if desired).

Well, I guess that helps, except I have always fully enabled all the speed-ups available, so why have I not had any problems? Still, it got me thinking; there are two possible courses of action: A) Disable the Prefetch or B) Disable the Cache. Which choice would do less damage to system performance? To help understand this I wrote a test program that drew 100,000 circles on a graphical display. I then tried four sets of options:

Test            Execution Time   % Slower
None Disabled   79.1
Cache Off       84.3             6.5%
Prefetch Off    82.3             4.0%
Both Off        120.2            52.0%

My data seems to support the official line. It looks like performance is slightly less affected by turning off the prefetch than by turning off the cache, but not by much. Just don’t turn (or leave) both off. The only remaining sticking point (in this case, a point-shaped 800-pound gorilla) is my inability to make this issue happen at all! If it is so serious, why do I never see it? I am still no closer to an answer. I don’t know what to say. Clearly there are other conditions, settings, or configurations that play a role in this issue that are not called out in the errata. Without more complete information, informed choices are hard to come by. However, safe ones are not. Given the slight impact, it’s safe to say that the instruction prefetch can be turned off with only a minor ding in performance, and for now, that is what I shall do.
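For anyone who wants to repeat the comparison with their own timings, the “% Slower” column is just the relative increase over the baseline run. Here is a minimal sketch; the function name `percent_slower` is made up for illustration and is not part of any Microchip library.

```c
/* Percentage by which a measured run is slower than the baseline run.
 * Both arguments are execution times in the same unit.
 * (Hypothetical helper, not taken from the original test program.) */
double percent_slower(double baseline, double measured)
{
    return (measured - baseline) / baseline * 100.0;
}
```

Feeding in the figures from the table, `percent_slower(79.1, 82.3)` comes out at roughly 4% and `percent_slower(79.1, 120.2)` at roughly 52%. The cache-off case computes to a shade over 6.5%, which the table lists as 6.5%.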

Due to the dynamic nature of the situation, I have NOT included links to errata files. For up-to-date information, please go to the Microchip website and look up the PIC32 of interest.

As always, comments and observations are welcomed.

Peter Camilleri (aka Squidly Jones)

C32 Impressions (3/23/2012)

In my PIC32 programming I have three primary development tools: the MPLAB-X IDE, the MPLAB ICD-3 programmer/debugger, and the C32 V2.02 “C” compiler. In this post we are finally going to take a look at the compiler. Like many compiler projects, C32 is a subset of the open source GCC 4.5.1 compiler, and the non-proprietary bits are available in source format. Some major features absent from C32 include: C++ support, 128-bit integers, decimal floating types, 16-bit float types, and fixed point (DSP oriented) types. Still, many useful extensions to “C” are included, and the user would do well to download the GCC 4.5.3 documentation (the closest to 4.5.1 that I could find). I especially like the binary constants feature; for example: i = 0b101010;
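To show the binary constants extension in a slightly larger setting, here is a small sketch. The names `LED_MASK`, `MODE_BITS` and `low_nibble` are invented for the example; the `0b` prefix is a GCC extension to C of this era, so other compilers may reject it.

```c
#include <stdint.h>

/* Binary constants (0b prefix) are a GCC extension, not standard C
 * of this vintage. All names here are hypothetical. */
enum {
    LED_MASK  = 0b00001111,   /* same value as 0x0F */
    MODE_BITS = 0b101010      /* same value as decimal 42 */
};

/* Keep only the low four bits of a byte. */
uint8_t low_nibble(uint8_t value)
{
    return (uint8_t)(value & LED_MASK);
}
```

Bit masks are where this notation earns its keep, since the bit pattern is visible directly in the source.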

C32 is available in essentially two major versions: Free (with a 60 day trial of the full feature set) and Full. The Free version is a free download from the web site and has the following restrictions specified in the compiler release notes:

“Microchip provides a free Standard Evaluation edition of the MPLAB C Compiler for PIC32 MCUs. The standard evaluation edition of the compiler provides additional functionality for 60 days. After the evaluation period has expired, the compiler becomes reverts to the lite edition, and optimization features associated with levels -O2, -O3, and -Os are disabled. In addition, MIPS16 code generation is disabled. The compiler continues to accept the -O1 and -O0 optimization levels indefinitely.”

The full version is available for purchase from Microchip Direct for a list price of $895. This price is routinely discounted for attendees of the annual Master’s Conference, but even so it is still a rather large chunk of change. To gain a better understanding of compiler performance, I undertook a series of benchmarks using C32 V1.11, V1.12, and V2.02 at the four levels of optimization, measuring performance in space (size in bytes) and time (execution in seconds). To test with, I created four programs. Here are my findings for each program:

Test #1 Circles

Draw 100,000 random circles on a graphical display (no H/W acceleration)

Test #2 Glyphs

Draw 200,000 (8 by 13) Character Glyphs on a graphical display (no H/W acceleration)

      


Test #3 Prime Number Sieves

Find all prime numbers less than 20,000 and do this 5000 times.
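The actual benchmark source is not shown here, so the following is only a sketch of what a test of this shape could look like, assuming a classic Sieve of Eratosthenes. The names `count_primes_below` and `run_sieve_benchmark` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SIEVE_LIMIT 20000u

/* One flag per candidate number; non-zero means "known composite". */
static uint8_t composite[SIEVE_LIMIT];

/* Count the primes below 'limit' (clamped to SIEVE_LIMIT)
 * using a Sieve of Eratosthenes. */
unsigned count_primes_below(unsigned limit)
{
    unsigned count = 0;

    if (limit > SIEVE_LIMIT)
        limit = SIEVE_LIMIT;

    memset(composite, 0, sizeof composite);

    for (unsigned n = 2; n < limit; n++) {
        if (composite[n])
            continue;               /* already crossed out */
        count++;                    /* n is prime */
        for (unsigned m = n * n; m < limit; m += n)
            composite[m] = 1;       /* cross out multiples of n */
    }
    return count;
}

/* Repeat the whole sieve, as the benchmark description does,
 * and return the count from the final pass. */
unsigned run_sieve_benchmark(unsigned repetitions)
{
    unsigned last = 0;

    for (unsigned i = 0; i < repetitions; i++)
        last = count_primes_below(SIEVE_LIMIT);
    return last;
}
```

Calling `run_sieve_benchmark(5000)` would match the repetition count quoted above. This kind of test is pure integer work on a small array, so it mostly exercises the compiler's loop optimizations rather than any peripheral.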

Test #4 Spirographic Patterns

Draw 200 repetitions of a complex polar plot diagram reminiscent of the old Spirograph toy.

Benchmarks Summary

For almost all debugging, it is advisable to use -O0, that is, non-optimized code. The code rearranging done by the optimizer makes debugging very difficult. At the very least, optimization should be turned off for the source file being debugged. As for production code, there does not seem to be a great deal of difference between optimization settings 1, 2, and 3. Setting “s” produces consistently smaller and slower code. Thus the main advantage of the full compiler over the free one seems to be that it can produce more compact code.

MIPS16 Mode

There are options to force the compiler to use MIPS16 mode throughout a project; however, given that most projects contain various routines that are not allowed to be 16 bit, this approach will seldom work well. I prefer to use the attribute facility to mark individual functions as 16 bit. This gives me fairly fine-grained control over where I favor space over time. For brevity I define:

#define mCode16 __attribute__((mips16)) __attribute__((noinline))
#define mCode32 __attribute__((nomips16))

Note that the mCode16 macro selects mips16 mode and the noinline option. The noinline is required as a workaround for a minor compiler bug, but the avoidance of inline code makes sense anyway if you are trying to save space. A simple example where space savings may be desired is:

void mCode16 emQryTouchRaw(emPTOUCHRAW rd)
{
    rd->x = xf;
    rd->y = yf;
}

Even in this trivial case, MIPS16 mode saved 16 bytes of program space. This option is NOT available in the free compiler after the 60 day trial period expires.
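The snippet above depends on types and variables from the rest of the project. For experimenting, here is a self-contained sketch of the same idea. The type `touch_raw_t`, the function `query_touch_raw` and the initial values of `xf` and `yf` are stand-ins invented for illustration, and the `#else` fallback (empty macros on non-MIPS compilers) is an addition so the file also builds on a desktop machine.

```c
#include <stdint.h>

/* On a MIPS target, select 16 bit or 32 bit code per function.
 * Elsewhere the attributes do not exist, so the macros expand to nothing
 * (this fallback is an assumption added for host-side testing). */
#if defined(__mips__)
#define mCode16 __attribute__((mips16)) __attribute__((noinline))
#define mCode32 __attribute__((nomips16))
#else
#define mCode16
#define mCode32
#endif

/* Hypothetical stand-in for the project's raw touch data type. */
typedef struct {
    int16_t x;
    int16_t y;
} touch_raw_t;

/* Hypothetical filtered readings, normally updated elsewhere. */
static volatile int16_t xf = 120;
static volatile int16_t yf = 45;

/* Small, rarely called accessor: a reasonable candidate for 16 bit code. */
void mCode16 query_touch_raw(touch_raw_t *rd)
{
    rd->x = xf;
    rd->y = yf;
}
```

Keeping the macros in one shared header means a function can be moved between 16 bit and 32 bit code by changing a single word in its definition.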

Conclusions

For the most part, the C32 compiler is like a good car engine. It does a good job and you don’t have to waste time fussing over it. Like any good tool, it lets you focus on the job at hand rather than constantly tweaking things. I run the FULL version and I am glad to have the flexibility it offers. Nevertheless, a great many projects can be accommodated by the powerful free version of C32, and tool cost should not be an impediment to trying out new ideas.

Command Line Options

The compiler command line includes a great many options, most of which are not documented. They are listed in this PIC32 GCC Usage file for your convenience. Use at your own risk!

Peter Camilleri (aka Squidly Jones)

PS: While Microchip is proceeding with caution, there are indications that they may yet unleash the C++ capabilities inherited from GCC! And that too would be a good thing!

UPDATE (3/23/2012) On further reflection, I just remembered that I left out a major method of reducing code bloat that should work in all versions of C32. When creating code modules, it is not uncommon for some functions to never be called. Now, these could just be deleted, but that often means maintaining multiple versions of the code, which can be messy. Two options work together to automatically remove un-called code. In the pic32-gcc settings there is “Isolate each function in a section” and in the pic32-ld settings there is “Remove unused sections”. Together these options can result in substantial space savings with no loss of speed or debugging capability.
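As a rough illustration of what those two settings do, consider a module with one function that is used and one that is not. On a stock GNU toolchain the corresponding flags are `-ffunction-sections` for the compiler and `--gc-sections` for the linker; the assumption here is that the two MPLAB-X check boxes map onto these flags. The function names are invented for the example.

```c
/* Build sketch for a generic GNU toolchain (assumed equivalent to the
 * two MPLAB-X options named above):
 *
 *   gcc -ffunction-sections -c module.c
 *   gcc -Wl,--gc-sections module.o main.o -o app
 *
 * With -ffunction-sections each function below lands in its own section,
 * for example .text.scale_reading and .text.legacy_scale_reading.
 * With --gc-sections the linker may then drop any section that nothing
 * references.
 */

/* Called by the application: its section is kept. */
int scale_reading(int raw)
{
    return raw * 3 / 2;
}

/* Never called in this hypothetical project: its section can be
 * discarded at link time without touching the source file. */
int legacy_scale_reading(int raw)
{
    return raw * 2;
}
```

The appeal is that the unused routine stays in the shared source, so there is only one version of the module to maintain.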

On another unrelated note, the image to the left is a screen shot of my spiro-graph test. The subtle color changes are on purpose. I added them so I could see the colors shift as I drew 200 copies of the pattern for my testing. Many many years ago, I was tasked with writing a CALCOMP pen plotter driver and scientific graphing package in FORTRAN. For my test data, I used spiro-graphic patterns back then too and I have been intrigued by them ever since.

MPLAB-X Update (3/15/2012)

Stop the presses! I may have spoken too soon. In my previous article MPLAB-X Impressions I wrote:

Earlier betas of MPLAB-X had severe problems with watch windows being unable to locate local variables, but those issues seem to have been sorted out in V1.00.

Well, take a look at the following from a recent debugging session:

The green highlight indicates that the CPU is halted at the specified line of code. The message “Out of Scope” is NOT what you want to see when you hover over a local variable. It basically means that the debugger cannot locate the desired data and you are on your own.

The workaround for this very annoying bug is to copy the affected variables to statically scoped variables that CAN be resolved by the brain-dead code. The little snippet in the picture above shows this strategy in action.
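Since the screenshot may not be visible to everyone, here is a minimal sketch of the copy-to-static trick. The function `rect_area` and the `dbg_` variables are made up for illustration and are not the code from the picture.

```c
#include <stdint.h>

/* File-scope mirrors of the parameters. These have fixed addresses,
 * so a watch window can resolve them even when the parameters
 * themselves show as "Out of Scope". (Hypothetical names.) */
static volatile int32_t dbg_width;
static volatile int32_t dbg_height;

int32_t rect_area(int32_t width, int32_t height)
{
    /* Debug aid only: copy the parameters where the debugger can see them. */
    dbg_width  = width;
    dbg_height = height;

    return width * height;
}
```

The catch is that the mirrors are shared by every caller. If the function can run from more than one thread, or from both an interrupt handler and the main loop, the values seen in the watch window may belong to a different call than the one being stepped through.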

I must say that I am disappointed. I really thought this annoyance was behind us! That said, V1.00 was still better than the beta, so progress is being made.

Peter Camilleri (aka Squidly Jones)

(3/15/2012) Further update: I have raised two tickets with Microchip concerning the troubles I have encountered. I can’t help but notice a lot of loud complaining on the support forums but not much action. If you see a problem, create a ticket to complain to the folks responsible for fixing it! If you can’t be bothered to take the effort to do that, you’ve got nobody to blame but yourself.

Touch Screen Calibration ][

This is the third posting on the subject of Touch Screens. When we last looked at Touch Screen Calibration, a workable approach had been found. However, there was a serious problem: the approach was iterative in nature. The user had to repeat the calibration steps until the data was accurate enough, and it was difficult to know when enough was enough. The reason was that Plan B processed an inset touch point and the current guess at the opposite limit. Since each calculation used approximate data, there was always an error. Over several iterations the error could be reduced, but never truly eliminated. To resolve this issue, a new plan was needed. A plan called Plan C!

Plan C

The new approach is based on the idea of processing the input two points at a time, with no iteration! Let’s take a fresh look at the data:

Unlike previous charts, we now see two inset points. One is set 15 pixels from the left and the other 15 pixels from the right.

OK, time to bust out the algebra again. As before, we know that the slopes of the lines (0,X_L) to (319,X_U) and (15,X_A) to (304,X_B) are the same. The slope is the rise over the run and can be expressed as:

\frac{(X_U-X_L)}{320}=\frac{(X_B-X_A)}{290}

(X_U-X_L)=\frac{320}{290}(X_B-X_A)

At this point we must pause. Both X_L and X_U are unknowns. We need a way to express things in terms that lets us reduce the number of unknowns to one per equation. To do this we need more information about our system. Consider for a moment, the center point of the two lines. It’s added to the illustration below:

Now the point (160, X_C) is the center point for both the shorter line from 15 to 304 and the longer one from 0 to 319. This means that the value X_C is midway between X_A and X_B, and also midway between X_L and X_U. Since these midpoints are equal, we are able to state:

X_C=\frac{(X_U+X_L)}{2}=\frac{(X_A+X_B)}{2}

Now X_C itself is of no interest, it’s the other parts of the equality that are useful.

\frac{(X_U+X_L)}{2}=\frac{(X_A+X_B)}{2}

Multiply both sides by 2:

X_U+X_L=X_A+X_B

Now let’s isolate X_L in the above:

X_L=X_A+X_B-X_U

We now have X_L in terms of X_U, X_A, and X_B. We can now return to our earlier equation:

(X_U-X_L)=\frac{320}{290}(X_B-X_A)

Finally, substitute X_A+X_B-X_U for X_L and solve for X_U:

(X_U-(X_A+X_B-X_U))=\frac{320}{290}(X_B-X_A)

(2X_U-(X_A+X_B))=\frac{320}{290}(X_B-X_A)

2X_U=(X_A+X_B)+\frac{320}{290}(X_B-X_A)

X_U=\frac{(X_A+X_B)+\frac{320}{290}(X_B-X_A)}{2}

By the same token, the solution for X_L is shown below. I shall leave the derivation of this as an exercise for the reader 😉 .

X_L=\frac{(X_A+X_B)-\frac{320}{290}(X_B-X_A)}{2}
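The two closing formulas translate directly into code. Below is a minimal sketch, assuming the raw touch readings are handled as doubles; the type `cal_limits_t` and the function `plan_c_limits` are names invented for this example. It uses the same 320/290 ratio as the equations above. Strictly speaking the pixel runs are 319 and 289, which would give a very slightly different ratio.

```c
/* Result of a Plan C calibration along one axis (hypothetical type). */
typedef struct {
    double lower;   /* X_L: raw reading expected at pixel 0   */
    double upper;   /* X_U: raw reading expected at pixel 319 */
} cal_limits_t;

/* Compute X_L and X_U from the two inset readings:
 *   xa = raw reading at the point 15 pixels from the left  (X_A)
 *   xb = raw reading at the point 15 pixels from the right (X_B)
 * Follows the article's formulas, including its 320/290 ratio. */
cal_limits_t plan_c_limits(double xa, double xb)
{
    const double sum  = xa + xb;                   /* X_A + X_B           */
    const double span = 320.0 * (xb - xa) / 290.0; /* (320/290)(X_B - X_A) */
    cal_limits_t out;

    out.lower = (sum - span) / 2.0;
    out.upper = (sum + span) / 2.0;
    return out;
}
```

For example, readings of X_A = 300 and X_B = 3200 give a span of 3200 and a sum of 3500, so X_L = 150 and X_U = 3350. One call per axis is enough; nothing needs to be repeated or refined.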

So what are the results of all this math? Plan C produces accurate, reliable calibration data without having to iterate through a refining process. The math, while a little more complex, is still quite manageable. And with that, I think we can call this one closed for now. As always, your thoughts, comments and suggestions are invited.

Peter Camilleri (aka Squidly Jones)