Suppose you want to run the original 1981 vintage IBM Pascal 1.0 (supplied by Microsoft) on a PC that is less than 30 years old. Upon execution, PAS1.EXE may well fail with the following error:
Error: Compiler Out Of Memory
Given that the compiler was designed to run on the original IBM PC and required only 128K of memory, why is it failing on a system with a lot more? The real reason is of course not that there isn’t enough memory; the problem is that there’s too much. Let’s see how that works (or rather doesn’t work) exactly.
IBM Pascal 1.0 suffers from a problem that is common to a number of products built with the Pascal compiler, specifically programs using the Pascal run-time startup code. That notably includes early versions of MASM as well as the Pascal compiler itself.
There are several variants of the run-time startup with slightly varying behavior, but the core problem is the same. The startup code calculates memory sizes and addresses as 16-bit quantities in units of paragraphs (16 bytes). That way, a 16-bit value covers the entire 1MB address space of an 8086, and in fact mirrors how the 8086 segment registers work.
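To make the units concrete, here is a trivial C illustration (mine, not taken from the run-time) of why a 16-bit paragraph count is enough to describe the whole address space:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* One paragraph = 16 bytes; FFFFh paragraphs = FFFF0h bytes, i.e.
         * just 16 bytes short of the 8086's full 1MB address space. */
        uint16_t paras = 0xFFFF;
        uint32_t bytes = (uint32_t)paras << 4;
        printf("%04Xh paragraphs = %05lXh bytes\n",
               (unsigned)paras, (unsigned long)bytes);
        return 0;
    }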
The Pascal run-time relocates part of the loaded executable image towards higher addresses to make room for the heap and stack. It attempts to make the data segment (heap, stack, static data, constants) up to 64K in size, but will settle for however much is actually available.
The problem with the Pascal run-time is that it uses a signed comparison. That is logically wrong because a PC can’t have a negative amount of memory, and the signed comparison may therefore produce incorrect results. Let’s consider some of the variants.
Broken One Way
Variant A, found in programs produced with IBM Pascal 1.0, first calculates the paragraph address of the bottom of the data segment in the executable (load address plus size of memory preceding the data segment). 64K (or 4096 paragraphs) is added to that value, and the result is then compared against the highest available paragraph address (read from offset 2 in the PSP). If there is more than 512K of conventional memory, the highest available paragraph will be 8000h or higher and will be interpreted as a negative signed value, which compares as less than the desired top of the data segment.
If the signed comparison produces the wrong result, the run-time thinks there is less memory available than the desired maximum, and tries to use as much as it believes there is. That may not be enough for the application, which may then fail with an out of memory error.
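In C terms, the variant A logic behaves roughly like the sketch below. This is a reconstruction for illustration only; the real code is hand-written assembly and all of the names here are made up. All values are paragraph counts:

    #include <stdint.h>

    /* data_seg_base: load address plus memory preceding the data segment.
     * top_of_mem:    highest available paragraph, read from offset 2 of the PSP. */
    uint16_t data_segment_top(uint16_t data_seg_base, uint16_t top_of_mem)
    {
        uint16_t desired_top = data_seg_base + 0x1000;  /* +64K (4096 paragraphs) */

        /* The run-time performs this comparison as a *signed* one. With more
         * than 512K of memory, top_of_mem is 8000h or higher, is treated as
         * a negative number, and compares as less than desired_top even
         * though plenty of memory is available. */
        if ((int16_t)top_of_mem < (int16_t)desired_top)
            return top_of_mem;   /* believes less than 64K is free */

        return desired_top;      /* the intended full 64K data segment */
    }

Treating both operands as unsigned, which is what the JLE to JBE patch described further below does at the instruction level, makes the comparison come out right for any amount of memory.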
Broken Another Way
Variant B is slightly different. It affects for example IBM MASM 1.0 and has been analyzed here. In this case, the startup code takes the highest available paragraph address from offset 2 in the PSP, subtracts the base address of the data segment from it, and then does a signed comparison against 64K (4096 paragraphs) to see if the maximum is available for the data segment.
The failure mode of this variant does not depend on the highest available paragraph address but rather on the amount of free memory.
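Again purely for illustration, variant B corresponds roughly to this C sketch (same caveats as above; the names are invented):

    #include <stdint.h>

    /* Compute the room above the data segment base and check (signed!)
     * whether the desired 64K (1000h paragraphs) is available. */
    uint16_t data_segment_size(uint16_t data_seg_base, uint16_t top_of_mem)
    {
        uint16_t avail = top_of_mem - data_seg_base;   /* free paragraphs */

        /* If more than 512K is free above the data segment base, avail is
         * 8000h or higher, looks negative in a signed comparison, and the
         * run-time wrongly concludes that less than 64K is available. */
        if ((int16_t)avail >= 0x1000)
            return 0x1000;       /* full 64K data segment */

        return avail;            /* settle for what "seems" to be free */
    }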
Symptoms and Workarounds
If the user is lucky (sort of), the affected program will report an out of memory error and not cause any harm beyond that. If the user is less lucky, because the run-time is not careful enough, it is possible to end up with a dynamically allocated stack smaller than the one built into the executable, in which case the executable will hang (loop endlessly) while trying to print the “out of memory” error message.
Further analysis of problems with the Pascal run-time may be found here.
The most complete fix for the problem is to patch the executable and replace the problematic signed comparison with unsigned (for example replace JLE instruction with JBE). That is also the most difficult fix because it requires analysis of the start-up code.
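For what it’s worth, the short forms of JLE and JBE differ in a single opcode byte (7Eh versus 76h), so once the offending instruction has been located by disassembly, the patch itself is a one-byte change. A hypothetical C sketch; PATCH_OFFSET is a placeholder that must be determined for each affected executable:

    #include <stdio.h>

    #define PATCH_OFFSET 0x1234L   /* hypothetical; find by disassembly */

    int main(void)
    {
        FILE *f = fopen("PAS1.EXE", "r+b");
        int c;

        if (!f) { perror("PAS1.EXE"); return 1; }
        fseek(f, PATCH_OFFSET, SEEK_SET);
        c = fgetc(f);
        if (c == 0x7E) {                    /* make sure it really is a JLE */
            fseek(f, PATCH_OFFSET, SEEK_SET);
            fputc(0x76, f);                 /* replace with JBE */
        }
        fclose(f);
        return 0;
    }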
A less intrusive, less complete, but much simpler and usually sufficient fix is to change the EXE header to reduce the maximum allocation. That way, instead of trying to grab all available memory, the executable will only get (for example) 64K, which will almost certainly prevent the overflow.
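The maximum allocation is the word at offset 0Ch of the EXE header, counted in paragraphs beyond the end of the load image. A minimal sketch of such a patch, with the new value of 1000h paragraphs (64K) chosen as a plausible example:

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("PAS1.EXE", "r+b");
        /* 1000h paragraphs = 64K, stored as a little-endian word. */
        unsigned char maxalloc[2] = { 0x00, 0x10 };

        if (!f) { perror("PAS1.EXE"); return 1; }
        fseek(f, 0x0C, SEEK_SET);           /* maximum allocation field */
        fwrite(maxalloc, 1, 2, f);
        fclose(f);
        return 0;
    }

With the smaller allocation, the top-of-memory paragraph that ends up at offset 2 of the PSP stays well below 8000h and the signed comparison no longer misfires.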
In some cases, the LOADFIX utility may also change the behavior by loading the executable higher in memory. This does not require modification of the executable but also may not help at all.
Minor Correction
The previously referenced article claims the following: The “IBM Personal Computer MACRO Assembler”, also known as MASM, published by Microsoft and IBM since 1981, was one the firsts[sic] Assembler programs to run under MS-DOS / PC DOS on IBM PC or compatible computers. Lot’s[sic] of code was written for MASM, notably the MS-DOS kernel itself and ROM-BIOS code. MASM is therefore of historical importance in the field of personal computing.
While the historical importance of MASM is indisputable, the quoted text is slightly misleading. The BIOS of the early IBM PCs was written on Intel development systems (not on PCs for some funny reason) and built using Intel’s development tools, notably the Intel ASM86 V1.0 assembler. DOS 1.x was built using Tim Paterson’s SCP assembler, named simply ASM. Microsoft’s MASM was clearly an important product, but it played no role in the initial development of the PC’s ROM BIOS and DOS. It was used for DOS 2.0 and later, as well as the IBM PC ROM BIOS since circa 1983.
It is almost certainly true that (IBM) MASM was the first assembler commercially available specifically for the IBM PC. Intel’s ASM86 was only ported to DOS in the mid-1980s. SCP’s ASM was not sold by Microsoft or IBM, although it was almost certainly the first assembler which ran on the IBM PC by virtue of the extremely close relationship between SCP’s 86-DOS and PC DOS.
Trivia: The IBM Personal Computer Pascal Compiler V1.00 executables (PAS1.EXE and PAS2.EXE) do not have the typical ‘MZ’ signature in the EXE header at the very beginning of the file. Instead, they have a ‘ZM’ signature. That is considered equivalent to ‘MZ’ by all “true” DOS implementations (not counting DOS 1.x, where COMMAND.COM loads EXE files and does not check the signature). The second word in the PAS1 and PAS2 EXE header also does not indicate the number of bytes in the last page of the executable, but rather the version of the linker used to create it.
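A loader or tool that wants to handle such files merely has to accept the signature in either byte order; a trivial C sketch:

    #include <stdio.h>

    /* Accept the usual 'MZ' signature as well as the older 'ZM' variant. */
    int is_exe_header(FILE *f)
    {
        unsigned char sig[2];

        if (fseek(f, 0, SEEK_SET) != 0 || fread(sig, 1, 2, f) != 2)
            return 0;
        return (sig[0] == 'M' && sig[1] == 'Z') ||
               (sig[0] == 'Z' && sig[1] == 'M');
    }

    int main(void)
    {
        FILE *f = fopen("PAS1.EXE", "rb");
        if (f) {
            printf("EXE signature: %s\n", is_exe_header(f) ? "yes" : "no");
            fclose(f);
        }
        return 0;
    }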
The fun thing is that 64K DRAM did exist in 1981, though it was expensive, and I think even the original PC BIOS supported up to 576K.
Why should there be a 576k limit and not 640k?
What didn’t exist at that time were memory expansion boards giving a total of more than 512k.
It was a limit of the first IBM PC BIOS that was eventually increased.
Looks like it is 544K actually:
http://www.minuszerodegrees.net/5150/bios/5150_bios_revisions.htm
Actually, the 1981 BIOSes had the very unusual limit of 544 kB. See http://www.minuszerodegrees.net/5150/ram/5150_ram_16_64.htm
IBM tested the pre-release PC with 256 kB using multiple 64 kB cards.
They really didn’t have 64K DRAM samples available?
They did, but they were expensive. The IBM PC Prototype was built on wire wrap, and had only 16k planar, and 3×9 expansion DIPs.
“The first IBM PC used 16K by 1 bit RAM chips. Soon 64K by 1 bit chips were available and in regular use.”
http://philipstorr.id.au/pcbook/book2/memchips.htm
http://nersp.nerdc.ufl.edu/~esi4161/files/dosman/chap2
This model: 04/24/81 FF = PC-0 (16k)
Has this bug: 544k 88000-8BFFF | original IBM PC-1 BIOS limited memory to 544k
Even if they did have 64K chips… I don’t believe there was any concrete expectation users would ever equip the IBM PC with more than 512K RAM. The base model had only 16K after all. As far as I know, IBM did not offer any options allowing such memory expansion. They certainly didn’t plan for 3rd parties to do so, either.
The other factor is that the offending Pascal code was finalized sometime in Summer 1981, before the IBM PC was even announced.
My experience is that these sorts of problems are common. For example IBM XENIX 1.0 (1984) crashes with more than 4MB RAM. OS/2 1.0 (1987) crashes with full 16 MB RAM or more. Microsoft’s typical development systems were pretty beefy for their time, but they did not go to the maximum. RAM was always expensive and there was not much point in stuffing 8MB into a PC/AT in 1984, if it was even possible.
“I don’t believe there was any concrete expectation users would ever equip the IBM PC with more than 512K RAM.”
And the point was that no one would do so immediately, but it should have been tested.
“RAM was always expensive and there was not much point in stuffing 8MB into a PC/AT in 1984, if it was even possible.”
And thinking about it, the falling RAM prices in 1985 probably took everyone by surprise. For example, multitasking DOS 4.0 was designed for real mode, and even NetWare didn’t release its 286 version until 1986.
The SHARE report of 1980 indicated the average IBM mainframe had 512 kB of RAM. A PC design was expected to last about 5 years. I doubt any analyst would have predicted that PCs would need more memory than 6 year old mainframes and certainly no one would have paid the extra $100 to $200 needed to add paging hardware equivalent to that on the System 23 to prepare the PC for big memory configurations that might be relevant 5 years later.
There was one other major change to RAM that was needed to break the megabyte barrier at an affordable cost: switching from NMOS to CMOS. The 64 kbit NMOS chips needed twice the power (per chip) of 16 kbit NMOS. The IBM PC’s design had enough margin to handle such an increased power draw, but that was nearing the limits. A 2 MB card using 72 256 kbit NMOS chips would have needed more power than the ISA bus could supply. The 2 MB card using CMOS chips needs about the same power as the 64 kB NMOS expansion card.
I am not talking about adding extra hardware though.
I agree that it was hard to realize back in the day that many PCs would last more than those 5 years and be upgraded to a full 640k.
The question is why they didn’t make one prototype with the fullest specs possible at the time and use it to test various software before shipping.
The Xenix and OS/2 problems could also have been avoided if IBM back in 1984 had made at least one prototype expansion card filling the machine with 640k+15M.
Yeah, that is the point of my mentioning 64K DRAM samples. This also reminds me that the IBM Memory Expansion Adapter was able to do 3MB per card in 1986, before OS/2 1.0 was even released. It relied on SIMMs, which would not have existed in 1984, though 256K DRAM samples did.
Based on the prices in 1980, 512 kB of RAM using 64 kbit chips would have cost over $7,000. That is a lot to spend for a test that wouldn’t matter for a few years and might not determine anything if the chip designs changed again. Prices dropped, IBM included new parts and tested full setups but didn’t test all older software to ensure it worked on the newest hardware in all possible configurations.
By 1982 64K DRAM was already close to reaching crossover, which is not that long after the IBM PC was released in 1981.
And yet utterly irrelevant for software finalized in mid-1981. So far I have not seen any evidence that this caused problems during the expected lifetime of the IBM PC.
I wonder how many people these days make sure their software works properly with 16TB RAM…
This reminds me of the recent x86 57-bit virtual address extensions too.
The issue was corrected in MASM 1.27 released in 1984 so it showed up during the expected life span of the IBM PC. Right after the 256 kB motherboards for XT and PC plus the 256 kB expansion cards arrived, testing localized the problem with the fix needing a number of months to be shipped. Seems reasonable to me.
I can’t tell you which MS Pascal version was the first to make it past 512 kB, but the 1985 release was advertised as being able to handle up to 1 MB of RAM.
Making sure code works on the currently available hardware and being willing to do a new version seems like the only way to handle an issue. Half the time a developer tries to future proof for larger memory, the memory design changes and all the work has to be tossed out.
I’m not sure counting the PC/XT within the expected lifetime of the IBM PC is really fair 🙂
Actually… I see the fix (JBE instruction instead of JLE) in MASM 1.25, probably October 1983. It’s still broken in MASM 1.12, probably April 1983. This despite the fact that the MASM 1.27 README.DOC (May 1984?) explicitly claims that it “recognizes memory greater than 512K” and 1.25 did not. Either the readme is wrong or someone patched my MASM 1.25 binary, either seems equally likely to me.
$7000 for memory expansion for a single system that could be used for testing every product shipped isn’t much money for a company the size of IBM or Microsoft.
I think it wasn’t a matter of the cost of buying RAM chips, more that they didn’t think it was worth the time to design a prototype memory expansion card.
The same probably applies today. Even though 16TB of RAM has a high price, the real problem is finding a way to install 16TB in any system that you can buy today.
The difference is that back in 1981 one engineer could make a card containing 576KB of RAM using 16-kbit DRAM chips. Today much more work would be involved in making a motherboard capable of using 16TB.
Current x64 CPUs have a 2^48 address space, which is 256T; not sure where 16T came from. Anyway, it’s not impossible to assemble a server with that much RAM, or close to it, if you are willing to spend money. Dell PowerEdge models with 3T are quite common and higher models can be maxed out at 12T.
@Vlad Gnatov
“Current x64 cpus have 2^48 address space which is 256T, not sure where 16T has come from.”
From the chipset and memory controllers, of course, as it has always been.
16TB was just a random limit that will be crossed in not-too-distant future. 12TB systems are definitely available, but fully populating them is Not Cheap.
$7k is enough that everyone involved will stop and think, “do we really need this?”. Especially if we’re talking 1981 dollars. And yes, today it’s definitely not something one engineer can do.
On the other hand, DRAM chips get sampled all the time without the user expecting the final price to be the cost of the samples. In fact, I might say that a card based on 64K DRAM is easier to design than one based on 16K DRAM because it needs only a single 5V supply voltage.
I naively wonder if signed vs. unsigned was because Pascal proper had no unsigned types, unlike Modula-2. Then again, who knows what code that compiler would’ve generated for uses of “type word = 0..65535” subrange (or if it could even handle that).
It’s hard to think about development on such underpowered machines. Slow cpu, lack of disk space (and slow floppies), and lack of RAM … sounds very difficult. I can only imagine that small RAM disks and smart tools (Turbo Pascal, Forth, assembly) mitigated some of that pain.
Pascal was a teaching language originally run on mainframes. Novice programmers are more likely to miss differences in values during subtraction leading to problems with unsigned integers. The iron is big and the easy solution is to go with the next larger step of signed integer. Conversely, the memory limits on early micros made larger programs (i.e. not class work) using 32 bit signed integers everywhere impractical.
Some systems return the actual allocated memory or -1 for failure. With signed integers and a system that can only use half the possible addresses for program allocation, that works perfectly. In practice, that just leads to rediscovering Bell’s axiom:
“There is only one mistake that can be made in a computer design that is difficult to recover from–not providing enough address bits for memory addressing and memory management.”
Slow floppies, such luxury. Want to know painful development, try a paper tape compiler. Fortran II on a PDP-8 equipped with paper tape attached to a teletype as the only storage is an interesting experience but not one anyone wishes to try more than once.
A problem with stopping and thinking, “do we really need this?” is that this kind of bug is probably more embarrassing for the programmer who never got to test the program than for the manager who said no to a by-then fringe test case.
Although I assume some managers also got in trouble when this rocket explosion happened:
http://www-users.math.umn.edu/~arnold/disasters/ariane.html
“It turned out that the cause of the failure was a software error in the inertial reference system. Specifically a 64 bit floating point number relating to the horizontal velocity of the rocket with respect to the platform was converted to a 16 bit signed integer. The number was larger than 32,767, the largest integer storeable in a 16 bit signed integer, and thus the conversion failed. ”
Back in the day I heard rumours that the software was written and tested for an earlier rocket with less acceleration, making the value fit in a 16-bit integer. (This might not be a signed/unsigned problem as the parameter might itself actually be signed.)
This signed/unsigned problem was in hand-written assembler code, not in anything Pascal-generated. Given the logic in that piece of code, it’s even possible that some earlier version required a signed comparison because it might end up comparing negative numbers. The buggy released code only compares quantities which are guaranteed to be positive.
Some of the development was done on minicomputers. Microsoft used DEC minis for example. A lot of smaller-scale development was also done in BASIC in the early days — always in ROM, fast to load.
We’re not talking about a fringe test case here. We’re talking about testing something that simply does not (yet) exist. Yes, in mid-1981 one could imagine a PC with 576 KB RAM. One could also imagine a PC with 16GB RAM and four processors that weighs a fraction of the original IBM PC. In 1981 one could also imagine that by 2010, there would be colonies on the Moon and flying cars would be standard. But if you wanted to test every scenario you could imagine, you’d never get the product out the door.
Not just imagine, the 64K DRAM samples already existed.
AFAIK, IBM did not build PC prototypes with double sided 5.25″ drives let alone the brand new 96-track double sided drives. Those were available in much larger numbers than 64 kbit RAM, 125 thousand DSDD drives versus 36 thousand 64 kBit chips*. IBM had a planned set of options and only those were tested. IBM was not planning for future versions before release.
* Those are 1980 numbers since I expect any decision on prototypes has to be made at least 6 months before release. Trying a shorter development cycle tends to result in badly broken products. Remember, it isn’t just the physical card; the BIOS would also need to be redesigned.
Regarding Ariane 5, it’s well-known in Ada circles (although I’m not very familiar with Ada).
Wikipedia says this:
”
The software was originally written for the Ariane 4 where efficiency considerations (the computer running the software had an 80% maximum workload requirement) led to four variables being protected with a handler while three others, including the horizontal bias variable, were left unprotected because it was thought that they were ‘physically limited or that there was a large margin of safety’. The software, written in Ada, was included in the Ariane 5 through the reuse of an entire Ariane 4 subsystem despite the fact that the particular software containing the bug, which was just a part of the subsystem, was not required by the Ariane 5 because it has a different preparation sequence than the Ariane 4.
”
I don’t fully understand the implications of all of it. Although Ada is still heavily-used in critical cases like this, due to its strictness, there are still some who insist that Ada isn’t ideal (esp. *nix diehards). I feel like this is a bad example. Certainly Ada has changed a lot since 1983 although most people still seem more familiar with Ada95 than anything else (e.g. ’05 or ’12). I don’t think signed/unsigned came into play, but AFAIK, unsigned was only truly supported starting with ’95.
From the brief description of the problem it sounds like the issue with Ariane was not the software per se, but the fact that existing code was made to run in an environment it wasn’t designed for. That is something which comes up time and again and has nothing to do with the implementation language, operating system, or hardware platform.
I’m just saying, Ada still gets negativity for that rocket accident. Probably because they brag so much about being ultra typesafe. To C users (e.g. Jargon File), such strict languages are “bad”, even though there’s many differences between classic Pascal and Modula-2, much less Ada and whatever. No one’s ever happy. (In fairness, Eric Raymond doesn’t really use C anymore either, at least not for new projects.)
FWIW, even methinks that most C programmers could use a lesson or two on the role of types — and, also, on not including n copies of the same code, each slightly different, in p places ’cause it’s “too much of a bother” to write a dedicated routine.
C programmers tend to be really stuck in the mud. But limiting what people can do just tends to propagate the “if I’m allowed to do it, it’s acceptable, right?!” attitude.
Medigresses.
As for the rocket failure, it sure sounds more like a standard script kiddie-style copy-n-squirt maneuver, as opposed to anything language-specific.
Memight be wrong, though — meknows just about nothing of Ada, and mecertainly wasn’t there.
It’s quite unclear how this relates to the LOADFIX command, which was introduced in DOS 5. The IBM PC Pascal Compiler Version 1 product was limited to use with PC DOS versions 1.0 and 1.1. Pascal Compiler Version 2 was introduced in 1984 to fully replace the earlier product. A short list of changes can be found in IBM’s announcement letter 284-158 (“IBM PERSONAL COMPUTER PROGRAMMING ENHANCEMENTS”).
As you know, LOADFIX has little to do with DOS 5 per se. It works around a problem that DOS 5 made much more common but didn’t introduce. And there is nothing about the original Pascal compiler that would make it fundamentally incompatible with newer DOS versions.