Interrupts are ... tricky

by Leonard Tramiel

Some early PET users reported that their machines would "hang". After doing ... something the machine would become unresponsive. Nothing short of a power cycle would get the machine back into operation. The hardware folk on the project said it was obviously a software problem and, not surprisingly, the software folk said it was a hardware problem. As mentioned in another entry here, there were some ... idiosyncrasies in the MOS memory being used, in particular the 6550 pseudo-static RAM. This was an interesting(?) device: it had a combination of the characteristics of a static and a dynamic RAM but none of the advantages of either. Dynamic RAM, as opposed to static, has fewer pins, which greatly reduces the number of traces on the board, but it needs external logic to generate the address signals and a regular pattern of signals to keep it refreshed. Static RAM uses simpler addressing, so it has more pins, but it doesn't need to be refreshed. The 6550 had the simple addressing of static RAM but it still needed a "refresh" signal. The timings were ... strange.

I had left Commodore to pursue an advanced degree but I had a PET and some spare time so I thought I'd try to narrow this down. One of the best features of Commodore's 8-bit computers was the on-screen editor. It allowed many unusual and interesting uses of the machine, and one of those was that it made it possible for a BASIC program to modify itself. I thought that using this feature made the machine do the most work so I wrote that kind of program. It took a while but eventually I wrote a, thankfully short, program that would put the machine into this state. After a bit more tweaking it would enter that state reliably, in just a few seconds.

I called John Feagans and told him that I had made this problem repeatable. I figured that if he ran it on another PET and it behaved like mine did, it was likely a software problem; otherwise hardware was likely the issue. In either case having a reliable way to cause the problem would make it far easier to diagnose and eliminate. I read him the program. That was the fastest way to get the program to him at the time, at least with the equipment, or lack thereof, I had available. Once he had typed in the program, we each typed RUN and hit return at the same time. I said "Three, two, one, die". John responded with one of the few times I can remember him swearing. His machine died at the same time mine did. That made it likely that it was a software problem.

In a very short time he found the error and fixed it. Here's John's description:
I went over and over the code and had the MDT attached but the timing was not the same as your program hammering. If I had a logic analyzer monitoring all the transactions up to the crash it would have been quicker, but acting on the hunch after your program crashed I went directly to the source.
The code that parsed a line of BASIC into tokens directly manipulated the processor stack in a way that was not safe to interrupt. If the hardware timer interrupt that updated the internal clock occurred at just the wrong time, the stack would be corrupted and the machine would crash. John replaced it with code that had exactly the same effect but didn't cause a problem and did it in less space than the original. Later John found a note in an Ohio Scientific manual that interrupts should be disabled when running Microsoft BASIC. Sure would have been nice if Microsoft had told us that.
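
For anyone who hasn't been bitten by this kind of bug, here is a minimal sketch in Python of how an interrupt can corrupt a stack that is being manipulated directly. It is a toy model, not the actual BASIC code, which isn't reproduced here. It assumes a 6502-style stack (page 1, growing downward, with an interrupt pushing three bytes and the return pulling them back), and it assumes the routine briefly raises the stack pointer above data it still needs. The names `ToyCPU` and `unsafe_routine` are made up for illustration.

```python
class ToyCPU:
    """Toy model of a 6502-style stack: page 1 (0x0100-0x01FF), growing downward."""

    def __init__(self):
        self.mem = [0] * 0x200
        self.sp = 0xFF  # stack pointer: offset of the next free slot in page 1

    def push(self, value):
        self.mem[0x100 + self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF

    def pull(self):
        self.sp = (self.sp + 1) & 0xFF
        return self.mem[0x100 + self.sp]

    def interrupt(self):
        # An interrupt pushes the return address (2 bytes) and the status
        # register (1 byte) at the CURRENT stack pointer. The byte values
        # here are arbitrary placeholders.
        for byte in (0xC0, 0xDE, 0x20):
            self.push(byte)
        # ... the handler would run here (e.g. updating the clock) ...
        # Returning pulls the three bytes again, so SP ends up where it
        # started, but the memory below SP has been overwritten.
        for _ in range(3):
            self.pull()


def unsafe_routine(cpu, irq_in_window):
    """Hypothetical routine that moves SP by hand and is not interrupt-safe."""
    for byte in (0x11, 0x22, 0x33):
        cpu.push(byte)              # data the routine will need later

    saved_sp = cpu.sp
    cpu.sp = (cpu.sp + 3) & 0xFF    # SP raised by hand: the data now sits
                                    # BELOW SP, where nothing protects it
    if irq_in_window:
        cpu.interrupt()             # the timer fires at just the wrong time
    cpu.sp = saved_sp               # SP put back, assuming the data survived

    return [cpu.pull() for _ in range(3)]


print([hex(b) for b in unsafe_routine(ToyCPU(), irq_in_window=False)])
print([hex(b) for b in unsafe_routine(ToyCPU(), irq_in_window=True)])
```

Without the interrupt the routine gets back the three bytes it pushed. With the interrupt landing inside the window it gets back the interrupt's bytes instead, and on a real machine one of those bytes might have been a return address. Either masking interrupts around the window or never leaving live data below the stack pointer would close the hole in this toy model.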
