The Strange MS OS/2 1.21 Disk Driver

Attempting to install Microsoft OS/2 1.21 will fail on many systems with the following scary-looking error:

FDISK failed, which makes installation difficult.

Pressing Enter as directed shows the following more detailed error message:

There is no hard disk; that is a problem!

The error occurs after the initial boot stage, and the stage immediately before the error takes much longer than it normally should.

Some, but not all, OEM versions of MS OS/2 1.21 are also affected. Tandon and Compaq have the same problem, but Intel MS OS/2 1.21 does not. IBM’s OS/2 1.2 SE or EE likewise does not have this trouble.

The problem is usually caused by having a CD-ROM on a secondary disk controller (IDE or otherwise). But it’s rather more complicated than that, and any device on a secondary disk controller is more likely than not going to cause problems. The failure is triggered by a subtle bug in the MS OS/2 1.21 disk driver.

In OS/2 1.2, the disk driver is no longer in a separate DISK01.SYS module but rather built into the “mega driver” BASEDD01.SYS. The previously separate drivers, CLOCK01.SYS, DISK01.SYS, KBD01.SYS, PRINT01.SYS, and SCREEN01.SYS were merged into a single BASEDD01.SYS module. This no doubt saved precious memory, although it made problem analysis and fixing a little more complicated.

BASEDD01.SYS includes a standard PC/AT disk driver, but in MS OS/2 1.21 it also includes a driver for disks on the secondary controller. The standard driver works with I/O port base 1F0h, while the secondary driver uses port base 170h. These I/O port ranges were defined by IBM for the PC/AT; the IBM fixed disk controller worked with base 1F0h by default, but could be jumpered to respond at 170h instead.

The trigger for the MS OS/2 1.21 failure is that a secondary disk controller simply exists. Specifically, if I/O port 172h can be written and read (that is, a value written to port 172h can be read back unchanged), MS OS/2 1.21 will almost certainly be unable to operate the hard disk.
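The presence test can be modelled in a few lines. This is a sketch in Python, not driver code: the `PortSpace` class, the test pattern 55h, and the FFh value returned for an unpopulated port are all assumptions made for illustration. The only thing taken from the driver is that port 172h is written and then read back.

```python
class PortSpace:
    """Toy I/O port space: only ports backed by a register latch writes."""

    FLOATING_BUS = 0xFF  # assumed value read from a port nobody decodes

    def __init__(self, backed_ports):
        self._regs = {port: 0 for port in backed_ports}

    def outb(self, port, value):
        if port in self._regs:  # writes to absent ports are simply lost
            self._regs[port] = value & 0xFF

    def inb(self, port):
        return self._regs.get(port, self.FLOATING_BUS)


def secondary_controller_present(ports, probe_port=0x172, pattern=0x55):
    """Write a pattern to the probe port and check that it reads back unchanged."""
    ports.outb(probe_port, pattern)
    return ports.inb(probe_port) == pattern


if __name__ == "__main__":
    print(secondary_controller_present(PortSpace({0x172})))  # secondary present
    print(secondary_controller_present(PortSpace({0x1F2})))  # primary only
```

Note that such a test says nothing about what kind of controller responded, or whether any drive is attached to it.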

If that happens when booting an already installed system, this familiar and rather misleading error message will appear:

No COUNTRY.SYS means no hard disk…

Here is an outline of what MS OS/2 1.21 tries to do:

  • Check if port 172h can be written and read; if so, go ahead
  • Install interrupt handler
  • Check CMOS bytes 1Bh and 1Ch; if 1Bh is zero, fail
  • Attempt to initialize disk controller
  • If anything failed, uninstall interrupt handler again

The logic where things go wrong is installing and uninstalling the interrupt handler. To install the handler, BASEDD01.SYS reads from port 1171h. That is not a typo–it is 1171h, not 171h. As far as I can ascertain, port 1171h is associated with the Compaq 32-Bit Intelligent Drive Array Controller, or IDA.

On typical systems, there is no port 1171h. Depending on the chipset, the I/O port address may be decoded as 171h or 1171h. In any case, BASEDD01.SYS uses port 1171h to determine the IRQ level.

On a typical PC/AT compatible, it will decide that the IRQ level is 14 (which is the same as the primary disk controller!) and install a handler for it. It also sets an internal variable, let’s call it DSK2_IRQ, to 14. This variable is used by the interrupt handler and passed to DevHlp_EOI (OS/2 device helper API) to clear the interrupt.

The CMOS byte 1Bh is quite possibly zero and the driver initialization will fail there. If not, it will very likely fail in the next step.

When things fail, BASEDD01.SYS sets DSK2_IRQ to zero and uses that value to call DevHlp_UnSetIRQ to uninstall the interrupt handler.

Readers have no doubt noted that the driver installed an interrupt handler for IRQ 14 and uninstalled a handler for IRQ 0, which was never installed! That is one part of the problem.
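The install/uninstall mismatch is easy to reproduce in a toy model. The sketch below is Python, with `IrqTable` and `SecondaryDiskInit` as hypothetical stand-ins for the kernel's handler table and the driver's init code; only the CMOS check is modelled as a failure point, and removing a handler that was never installed is assumed to do nothing.

```python
class IrqTable:
    """Toy stand-in for the kernel's table of installed IRQ handlers."""

    def __init__(self):
        self.handlers = {}

    def set_irq(self, irq, handler):
        # Plays the role of DevHlp_SetIRQ in this model.
        self.handlers[irq] = handler

    def unset_irq(self, irq):
        # Plays the role of DevHlp_UnSetIRQ; assumed to ignore an IRQ
        # that has no handler installed.
        self.handlers.pop(irq, None)


class SecondaryDiskInit:
    """Models the flawed init sequence for the secondary controller."""

    def __init__(self, irq_table, cmos_1b):
        self.irq_table = irq_table
        self.cmos_1b = cmos_1b
        self.dsk2_irq = 0

    def interrupt_handler(self):
        # The real handler would pass this value to DevHlp_EOI.
        return self.dsk2_irq

    def run(self, irq_level=14):
        self.dsk2_irq = irq_level
        self.irq_table.set_irq(self.dsk2_irq, self.interrupt_handler)
        if self.cmos_1b == 0:
            self.dsk2_irq = 0                        # IRQ number destroyed first...
            self.irq_table.unset_irq(self.dsk2_irq)  # ...so IRQ 0 gets "uninstalled"
            return False
        return True


if __name__ == "__main__":
    table = IrqTable()
    init = SecondaryDiskInit(table, cmos_1b=0)
    print(init.run(), sorted(table.handlers))
```

After the failed init, the table still holds the handler for IRQ 14, and that handler now believes it is servicing IRQ 0.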

The other part of the problem is the interrupt handler that’s left behind. When the primary disk controller is used, it will trigger IRQ 14, yet the wrong interrupt handler is executed. But it’s worse than that. The first time the wrong interrupt handler is called, it will use DevHlp_EOI to clear IRQ 0 instead of IRQ 14. OS/2 will send an EOI to the master interrupt controller (PIC), but not to the slave PIC.

As a consequence, the slave PIC will think that IRQ 14 was never handled and the next IRQ 14 will be blocked at the interrupt controller level; the CPU will never see it.
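The effect of the missing EOI can be sketched with a minimal model of the two cascaded interrupt controllers. This is Python and tracks only the in-service register of each PIC. Two things here are assumptions of mine rather than facts about OS/2: that the kernel issues non-specific EOIs, and that nothing else is in service at the time. The class and method names are hypothetical.

```python
CASCADE_LINE = 2  # the slave PIC is wired to input 2 of the master


class Pic:
    """Minimal 8259A model: only the in-service register (ISR) is tracked."""

    def __init__(self):
        self.isr = 0  # bit n set = input n is still being serviced

    def blocked(self, line):
        # An in-service input of equal or higher priority (same or lower
        # number) holds off a new request on this line.
        return (self.isr & ((1 << (line + 1)) - 1)) != 0

    def acknowledge(self, line):
        self.isr |= 1 << line

    def nonspecific_eoi(self):
        # Clears the highest-priority (lowest-numbered) in-service bit.
        self.isr &= self.isr - 1


class PcAtInterrupts:
    """Master and slave PIC cascaded as on a PC/AT."""

    def __init__(self):
        self.master = Pic()
        self.slave = Pic()

    def raise_irq(self, irq):
        """Return True if the CPU gets to see the interrupt."""
        if irq < 8:
            if self.master.blocked(irq):
                return False
            self.master.acknowledge(irq)
            return True
        if self.slave.blocked(irq - 8) or self.master.blocked(CASCADE_LINE):
            return False
        self.slave.acknowledge(irq - 8)
        self.master.acknowledge(CASCADE_LINE)
        return True

    def eoi(self, irq):
        # IRQ 8-15 needs an EOI on both PICs; IRQ 0-7 only on the master.
        if irq >= 8:
            self.slave.nonspecific_eoi()
        self.master.nonspecific_eoi()


if __name__ == "__main__":
    pics = PcAtInterrupts()
    pics.raise_irq(14)          # first disk interrupt is delivered
    pics.eoi(0)                 # stale handler acknowledges IRQ 0 instead of 14
    print(pics.raise_irq(14))   # is the next disk interrupt delivered?
```

In this model, the EOI for IRQ 0 clears the cascade bit on the master but leaves bit 6 set in the slave's in-service register, so every later IRQ 14 is held off.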

After this unfortunate series of events, disk interrupts are completely out of action and the primary disk driver cannot function. After a few retries and timeouts, the driver gives up.

Interrupt Setup

After locating the Compaq IDA Technical Reference (not an easy feat), it turned out that I/O port 1171h is in fact the IDA Interrupt Configuration Register.

An excerpt from the Compaq IDA Technical Reference

The fundamental flaw in the MS OS/2 1.21 disk driver is that it sets up an interrupt service before verifying that a controller with drives is in fact present. Once port 172h responds, the driver will set up an interrupt handler. And for reasons that are not at all clear, it defaults to using IRQ 14, which is the same as the primary disk controller.

After things go sideways, the driver tries to uninstall the interrupt handler, but it first destroys the IRQ value for which the handler was installed, so the handler for IRQ 14 stays in place.

How Was This Missed?

The driver presumably worked on machines with an actual Compaq IDA controller on the secondary I/O address range. The driver also clearly works on systems with only the primary disk controller present.

The question is what happened on systems that had a secondary disk controller present, but not of the Compaq IDA kind. And the answer is that I’m not sure.

Such systems were probably not widespread, but they did exist. How well MS OS/2 1.21 was tested on such machines is questionable. It is also likely that the support for Compaq IDA was added relatively shortly before MS OS/2 1.21 was released and therefore not necessarily very widely tested at all.

It is almost certain that on an ISA system (not EISA as anything with the Compaq IDA would be, and certainly not PCI which didn’t exist yet), accessing port 1171h in fact accessed port 171h, which is the disk controller Features/Error Register. However, this register is written as Features register and read as Error register, which means that a value written to it is highly unlikely to be read back.
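Both halves of that argument can be illustrated with a short sketch. It is Python, and it assumes the common ISA practice of decoding only the low ten address lines (A0 through A9). The `ErrorFeaturesRegister` class and the error value 01h are hypothetical, chosen only to show that a write and a read at this address reach different registers.

```python
ISA_DECODE_MASK = 0x3FF  # address lines A0-A9, assumed to be all a card decodes


def isa_alias(port):
    """Port address as seen by a card that ignores address bits above A9."""
    return port & ISA_DECODE_MASK


class ErrorFeaturesRegister:
    """Task file offset 1: writes go to Features, reads come from Error."""

    def __init__(self, error=0x01):
        self.features = 0
        self.error = error  # whatever the controller last reported

    def write(self, value):
        self.features = value & 0xFF

    def read(self):
        return self.error


if __name__ == "__main__":
    print(hex(isa_alias(0x1171)))  # port the driver reads on an ISA machine
    reg = ErrorFeaturesRegister()
    reg.write(0x55)
    print(hex(reg.read()))         # not the value that was written
```

A driver that interprets the Error register contents as an interrupt configuration value is reading data that has nothing to do with IRQ routing.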

On such a system, the BASEDD01.SYS driver should default to IRQ 14 and should have the problem described above. I will also note that any strict IBM PC/AT compatible will fail the BASEDD01.SYS CMOS check because there will be no information about the 3rd/4th drive stored in CMOS bytes 1Bh/1Ch.

I strongly suspect that PC/AT compatibles with a secondary controller would fail to boot MS OS/2 1.21, but it is possible that Compaq machines might succeed.

At any rate, the problem was clearly incorrect error handling in an error path that was probably poorly tested before the release, if at all. The workaround is simple: make sure there is no secondary disk controller, such as a secondary IDE channel.

Addendum

The Compaq Deskpro 386/25 Technical Reference Guide (TRG) from 1988 reveals additional information. The Deskpro 386/25 may have been the first PC/AT compatible system with at least some BIOS support for a secondary disk controller. Two additional hard disks could be installed in an expansion unit (designed explicitly for ESDI drives) and used from DOS through an additional driver called EXTDISK.SYS.

The TRG documents CMOS locations 1Bh/1Ch as holding the additional drive types. It also notes that the secondary disk controller can use IRQ 14 (same as the primary!), and if it does, the driver software takes care to ensure that only one controller at a time has interrupts enabled.

The TRG also says: “The hardware interrupt for the fixed disk drive controller shipped with an expansion unit can be selected with a switch located on the adapter. In addition, the interrupt can be changed via software through an I/O port.” The configuration register is documented to be located at I/O port 11F1h; however, it is entirely possible that it was located at 11X1h where X was Fh for the primary and 7h for the secondary I/O address.
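The 11X1h guess amounts to a simple address calculation, sketched below in Python. The function name is hypothetical, and the formula encodes only the speculation above (the task file base plus 1000h, at register offset 1). It is not taken from any Compaq documentation.

```python
def interrupt_config_port(task_file_base):
    """Speculative location of the interrupt configuration register."""
    return 0x1000 | task_file_base | 0x01


if __name__ == "__main__":
    for base in (0x1F0, 0x170):
        print(hex(base), "->", hex(interrupt_config_port(base)))
```

With base 1F0h this yields the documented 11F1h, and with base 170h it yields 1171h, the port that BASEDD01.SYS reads.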

In light of the above, it is likely that the Compaq IDA was in fact designed to be backward compatible with the Deskpro 386/25, and the MS OS/2 1.21 BASEDD01.SYS driver was likely written to support the Deskpro 386/25. That however does not change the fact that as it was written, the driver was essentially guaranteed to fail on non-Compaq systems with a secondary disk controller.

It may however explain one thing–any machines with 3rd/4th hard disk tested by Microsoft with OS/2 1.21 would have likely been Compaq systems, and the trouble the driver caused on non-Compaq machines would have been missed.

5 Responses to The Strange MS OS/2 1.21 Disk Driver

  1. MiaM says:

    Interesting!

    What other OS:es, or third party drivers for OS:es, did support a secondary hard disk controller at the time?

    Asking because I’ve always found it curious that it’s possible to set the controller to primary/secondary (and IIRC the same goes for the floppy part) but I’ve never even seen any way to use it with things that were available in the 1980’s, until this blog post (which shows at least an attempt at making it usable, although it didn’t work).

    In particular the ways to use a secondary hard disk controller I know of is to either use a way newer motherboard, like a 486 or possibly late 386 with BIOS support for both controllers, or use a 1990’s or newer OS like Linux, Windows NT or whatnot, that lets you use both controllers even if BIOS ignores the secondary controller.

    Were there drivers/software similar to the “large disk support” drivers, but for using two controllers?

    Bonus annoying thing 1: I vaguely remember that an RLL hard disk controller I had back in the days still relied on the motherboard BIOS for certain things even though it had its own BIOS, and thus couldn’t be set as secondary controller with a motherboard BIOS that only supports one controller.

    Bonus annoying thing 2: My impression from back in the days was also that some non-MFM/IDE controllers used the same port range and/or something else that would collide, and if you got a hold of used hardware without the instruction manual you were stuck unable to reconfigure it to not conflict. That meant that in practice you had to put say a SCSI controller in one computer and an MFM/IDE controller in another computer, and transfer your files using Laplink or similar. (This was back in the days before the internet, before home networking and at a time “no one” knew of any manufacturer BBS:es or so here in Europe, and many didn’t even have a modem anyways).

    Bonus annoying thing 3: I actually think it’s genuinely crap+++ of Microsoft and IBM to create MS/PC-DOS and the IBM AT and not include a driver that would allow you to use the secondary floppy controller. This is especially bad since IBM themselves created the need for three disk drives for a PC to be compatible with all disks (360k, 1.2M, 1.44M)

  2. Michal Necasek says:

    I think widespread BIOS support for secondary disk controllers only came circa 1994-1995, when motherboards with two IDE channels became common.

    That does raise the question what the secondary address setting on the PC/AT disk controller was good for, since the AT BIOS had no idea what to do with it.

    In the early AT days, I cannot imagine that it would have been physically possible to stuff three hard disks into the case. Too big, almost certainly not enough power. The Compaq Deskpro 386 Tech Ref (1986) documents the address jumper for the ESDI controller, and says: “This [secondary] address selection is available only for special circumstances and under normal circumstances should never be changed.” It clearly was not supported for normal operation.

    The Compaq Deskpro 386/25 Tech Ref (1988) is the oldest I can find which clearly talks about some support for a secondary disk controller. It documents CMOS bytes 1Bh/1Ch for 3rd/4th hard disk drive type. It also mentions an EXTDISK.SYS driver that was apparently needed to use the secondary disk controller.

    I don’t know about your controller, but: My WD1007 and WD1007V controllers do have a BIOS, but it can only be used to format a drive. It has no support for INT 13h. The motherboard BIOS must supply the actual disk driver.

    Are you sure there was not a DOS driver to use a secondary floppy controller?

  3. Fernando says:

    Also:
    COMPAQ DESKPRO 386 Personal Computer MAINTENANCE AND SERVICE GUIDE at this link: http://typewritten.org/Articles/Compaq/108033-003.pdf
    Has “1.17 300-/600-MEGABYTE FIXED DISK DRIVE EXPANSION UNIT”
    I think that if it works by mirroring disks, that explains why it uses IRQ 14.

  4. Josh Rodd says:

    Intel OEM MS OS/2… now there’s a release I’ve never heard of before. What drivers did it come with? (For example, Compaq’s OEM release came with a PM display driver for their 640×400 plasma display in the Compaq Portable III, which was similar to Olivetti but not identical, and not identical enough Compaq’s driver won’t work on an Olivetti implementation in an emulator.)

    We had a machine (a PS/2 Model 25, an 8086-based machine that had two 8-bit ISA slots in it) that had two MFM controllers in it. The built in BIOS seemed to drive the MFM just fine. We needed more storage and got an RLL controller, complete with an external enclosure with the ribbon cables snaked into it. It worked, but it took a long time to boot. I’m guessing the secondary’s BIOS ended up “taking over” the primary controller from the system’s BIOS. The first MFM drive used the built-in BIOS; the add-on card had its own BIOS and somehow was able to find it was on the “secondary” range.

    I have never spotted a PC/AT or similar with two original disk controllers in the “wild”. The BIOS certainly didn’t support this, and I’m not sure of any contemporary operating systems or DOS disk drivers that did, although it seems like a pretty trivial change to the BIOS to support it.

  5. Richard Wells says:

    It was not completely uncommon to have a second hard disk controller. Most likely, the second controller would be a non-booting SCSI for use with an optical drive. Microsolutions Compaticard had all the required support to be a second floppy controller and a driver to match. I am not sure how 9 track tape controllers fit into the whole assemblage; whatever it was, it had to be fast at 9 mbps. I know 9 track did not have OS/2 drivers forcing a reboot to DOS every time a new tape arrived for updating the database.

    IIRC, the Compaticard IV could work under OS/2 but only as the primary floppy controller. I know some SCSI cards would work as a secondary controller under OS/2 but I think those drivers started appearing around the time of OS/2 2.0.
