The other day I spent a while trying to understand the purpose of a rather strange-looking piece of code inside Borland’s THELP.COM utility shipped with Turbo Pascal 6.0 (THELP.COM was misbehaving under emulated DOS).
The THELP utility performs the following actions:
- Use INT 21h/34h to get the address of the InDOS flag
- Starting from the beginning of the InDOS segment, search for word 3E80h using the SCASW instruction
- If found, check if the location in memory six bytes past the 3E80h word holds the value BCh
- If so, store the word just past 3E80h for later use
This logic is applied to DOS version 4 and below (effectively 2.0 to 4.x), not newer versions. But what could it possibly be good for?
To find out, I applied the THELP.COM search logic to a running copy of DOS 3.3. I quickly discovered that it’s looking for a specific piece of code within DOS.
The 3E80h word (or rather the byte sequence 80h 3Eh) is the opcode of ‘cmp byte ptr [mem], imm’. And BCh is the opcode of ‘mov sp, imm’. There is almost certainly only one instance of this particular instruction sequence within any DOS 2.0-4.x.
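The search described above can be approximated in a few lines of Python. This is only a rough sketch over a byte buffer, not THELP.COM's actual code: the name find_cmp_operand is made up, and the position of the BCh byte (seven bytes after the 80h byte, i.e. the sixth byte following the two-byte word) is my assumption based on a 5-byte CMP followed by a 2-byte short JNZ. Unlike a SCASW loop, which advances two bytes at a time, the sketch simply tries every byte offset.

```python
import struct

CMP_MEM_IMM8 = b"\x80\x3E"   # cmp byte ptr [disp16], imm8
MOV_SP_IMM16 = 0xBC          # mov sp, imm16


def find_cmp_operand(segment: bytes):
    """Return the 16-bit address operand of the first matching CMP, or None.

    A match is 80h 3Eh at index i with BCh at index i + 7 (assumed layout:
    5-byte CMP, 2-byte short JNZ, then MOV SP).
    """
    for i in range(len(segment) - 7):
        if segment[i:i + 2] == CMP_MEM_IMM8 and segment[i + 7] == MOV_SP_IMM16:
            # The word just past 80h 3Eh is the little-endian address operand.
            return struct.unpack_from("<H", segment, i + 2)[0]
    return None


# Made-up bytes: filler, then cmp byte ptr [1234h],0 / jnz +5 / mov sp,0800h
fake_segment = bytes([0x90] * 10) + bytes(
    [0x80, 0x3E, 0x34, 0x12, 0x00, 0x75, 0x05, 0xBC, 0x00, 0x08]
)
print(hex(find_cmp_operand(fake_segment)))  # prints 0x1234
```

The returned word is what the utility would keep for later use; the offset 1234h here is invented for the example and does not come from any real DOS version.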
Cross-referencing the instructions with the DOS 4.0 source code, I soon found the following sequence inside the INT 21h function dispatcher:
CMP [ERRORMODE],0
JNZ DISPCALL
MOV SP,OFFSET DOSGROUP:IOSTACK
What THELP.COM is after is the offset of the ERRORMODE variable in the DOS data segment (which will be the same segment the InDOS variable is in).
OK, but why is this necessary? Trying to find documentation about the long-undocumented INT 21h/34h function provided most of the explanation.
The InDOS flag is normally set when DOS INT 21h is entered and cleared when leaving. It is apparent that a TSR like THELP.COM must check the InDOS flag before trying to do things like load something from a file.
The catch is that if DOS is within the critical error handler, then the InDOS flag is clear yet it’s not in fact safe to call into DOS.
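To make the catch concrete, here is a small sketch of the decision a TSR has to make. The function names and the three flag states are illustrative assumptions that follow the description above; they are not taken from THELP.COM or from DOS itself.

```python
def indos_only_check(indos: int) -> bool:
    """Naive test: treat DOS as callable whenever InDOS is zero."""
    return indos == 0


def both_flags_check(indos: int, errormode: int) -> bool:
    """Safer test: require both InDOS and the critical error flag to be zero."""
    return indos == 0 and errormode == 0


# Illustrative (InDOS, ErrorMode) values for three situations.
states = {
    "idle": (0, 0),
    "inside INT 21h": (1, 0),
    "inside critical error handler": (0, 1),
}

for name, (indos, errormode) in states.items():
    print(name, indos_only_check(indos), both_flags_check(indos, errormode))
```

Only the third state separates the two tests: the InDOS-only check reports that DOS is callable, while the two-flag check does not.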
The PRINT.COM utility has exactly the same problem, but because it comes with DOS, it solves it in a vastly more elegant fashion than THELP.COM. This is the relevant section of the PRINT.COM source code:
lds si,[INDOS] ;Check for making DOS calls
;---------------------------------------
;
; WARNING!!! Due to INT 24 clearing the
; INDOS flag, we must test both INDOS
; and ERRORMODE at once!
;
; These must be contiguous in MSDATA.
;
;---------------------------------------
cmp WORD PTR [SI-1],0
PRINT.COM fully agrees that checking INDOS is not enough. But rather than fishing for the critical error flag (ERRORMODE) in memory and then checking both separately, PRINT.COM knows that the ERRORMODE flag is stored right below the INDOS flag in memory, and therefore both flags can be checked with a single word comparison.
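The word comparison can be mimicked on a byte buffer standing in for the DOS data segment. This is a sketch under the layout the PRINT.COM comment describes (ERRORMODE in the byte directly below INDOS); the buffer, the offset 8 and the function name are made up for illustration.

```python
def print_style_check(dos_data: bytes, indos_off: int) -> bool:
    """Mimic `cmp WORD PTR [SI-1],0`: one little-endian word whose low byte
    is ERRORMODE (at indos_off - 1) and whose high byte is INDOS."""
    word = int.from_bytes(dos_data[indos_off - 1:indos_off + 1], "little")
    return word == 0


# Made-up data segment with INDOS at offset 8 and ERRORMODE at offset 7.
data = bytearray(16)
INDOS_OFF = 8

print(print_style_check(data, INDOS_OFF))   # both flags clear
data[INDOS_OFF - 1] = 1                     # set ERRORMODE only
print(print_style_check(data, INDOS_OFF))
data[INDOS_OFF - 1] = 0
data[INDOS_OFF] = 1                         # set INDOS only
print(print_style_check(data, INDOS_OFF))
```

The check succeeds only when both bytes are zero, which is why a single comparison is enough as long as the two flags stay contiguous.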
In DOS 2.x, the ERRORMODE flag was stored right after INDOS, again making it possible to test both in one go. The deal breaker was apparently Compaq DOS 3.0 which stored the ERRORMODE flag in a completely different location. The code in THELP.COM should be able to find it and work with such a “strange” OEM DOS version. Chances are that Compaq DOS 3.0 was not the only odd one.
It is worth noting that the THELP.COM version I looked at was released in 1990. Being conservative, it only searches known DOS versions, up to 4.x; the search logic is not applied to DOS 5.0 and later. Which is just as well because it might not work when the DOS code is separate from DOS data.
For a utility like THELP.COM, not being able to check the critical error flag is unlikely to be a fatal problem in practice. Users who try to invoke the TSR in the middle of a critical error handler only have themselves to blame.
PRINT.COM does not have that luxury because it is invoked from the timer interrupt, and might try to run at any time when interrupts are not disabled, critical errors or not. It is a given that for PRINT.COM, attempting to run during a critical error handler was a real problem that needed to be solved.
So first they keep secret the information that is crucial for writing TSRs and then Raymond Chen whines about developers poking inside the OS internals.
Something like that. Programmers tend to be very solution oriented — you give them a problem, they find a solution. If the OS doesn’t provide keys to the front door, they will find the back door or a bathroom window or a chimney or whatever it takes.
>InDOS Is Not Enough
It’s described, among other TSR gotchas, in Undocumented DOS, and better in the 2nd edition (just disregard that ‘swapping TSR’ nonsense). Also, check the TesSeRact library by Chip Rabinowitz if you want to see some crazy TSR hacks, especially for the DOS 2.x case.
I wonder how the version of PRINT supplied with the outlier version of Compaq DOS coped with the error mode not being adjacent to the InDOS flag. The Interrupt List says that DOS 3.0+ added a call (INT 21h/AX=5D06h) to get the address of the error flag, so maybe it did that.
(I’m also wondering why an OEM version of MSDOS would have such a different internal memory layout rather than just using the stock build from Microsoft. The Compaq MS-DOS 3.00 found at archive.org appears to have ErrorMode adjacent to InDOS as expected).
I don’t think there was just one release of Compaq DOS 3.0. The Compaq DOS 3.0 I have at hand indeed has the flags next to each other, but it’s also dated May 1985… newer than PC DOS 3.1.
As to why the layout would be different — why not? Nobody was supposed to be poking around the internals. Until networking came, the DOS internal data layout was somewhat fluid.
It’s also always possible that the information in RBIL is wrong.
I suppose it depends who built the outlier IBMDOS.COM. If it was Microsoft then they would indeed be at liberty to change the internals how they liked, and they would know to update their own utilities to match. But if it was Compaq, I was wondering what changes they wanted to make that necessitated building their own IBMDOS.COM rather than using the stock one. (I don’t think it can have been 32-bit sector numbers – those were famously added in Compaq DOS 3.31).
The earliest version of the Interrupt List I could rapidly lay my hands on is version 26 (15 June 1991) and that already has the wording about “Compaq DOS 3.0”. It also has the same difference of opinion with itself about whether INT 21h/AX=5D06h was introduced in “DOS 3.1+” or “DOS 3.0+”.
Looks like it got into the Interrupt List between 1987-11-03 [ http://discmaster.textfiles.com/view/1969/RBBSIABOX31.cdr/kba1/intrpt.zip/INTERRUP.LST ] and 1988-01-30 [ http://discmaster.textfiles.com/view/17675/MASTER_TECHNICIAN.ISO/mtech/library/offline/dos/interup2.zip/INTERRUP.LST ]. The 1987 version only describes DOS 2; the 1988 version adds info for 3.0, 3.1 and the Compaq DOS 3.0 outlier.
OEMs always partially built their own IBMDOS.COM/MSDOS.SYS, albeit largely from object files supplied by Microsoft, which normally included the DOS data area. PRINT.COM was supplied to OEMs in source form, so that would not be an insurmountable problem.
However… I can put this discussion to rest. My guess (without seeing that mythical Compaq DOS 3.0 it’s only a very well informed guess) is that RBIL is not wrong so much as inexcusably incomplete. Can you guess which other DOS 3.0 has the ERRORMODE flag 0x1AA bytes below the InDOS flag? That’s right, IBM’s own PC DOS 3.0!
That fact of course turns the entire mystery 180 degrees around — if PC DOS 3.0 had the flags 0x1AA bytes apart, why wouldn’t Compaq DOS 3.0 do the same thing? Rather than Compaq doing something special, it was Compaq not doing anything special at all! The only problem is that RBIL gives the impression that Compaq was the outlier, when in reality they did the exact same thing as IBM.
PC DOS 3.0 PRINT.COM only checks the InDOS flag. I did not bother figuring out exactly how critical error handling deals with InDOS. FWIW, DOS 2.x also only checks InDOS in PRINT.COM.
Oh and PC DOS 3.0 does have INT 21h/5D06h (it’s called GET_DOS_DATA in my disassembly). Of course the SDA had a different layout in DOS 3.0 compared to the later DOS 3.x versions.
Indeed, the method from the MS-DOS Encyclopedia (1988), pg 356, works well on PC DOS 3.0 and returns DOSDATA:0x167 for the critical error flag, which is 0x1AA bytes below InDOS (0x311). But INT 21h/5D06h returns DOSDATA:0x30D, so it can’t be used to locate the critical error flag in a more convenient way.
Sounds as if the sequence of events might be: PRINT originally checked only InDOS, and no-one realised it needed to check the error mode until some time after PCDOS 3.0 had been released with the two variables separated by 0x1AA bytes. At which point Microsoft reorganised the data segment to move the two together, changed PRINT to do a word check, and pushed that version out to OEMs still numbered 3.0.
The difference in memory layout would also explain why the Interrupt List says in different places that INT 21h/5D06h is applicable to “DOS 3.0+” and “DOS 3.1+” — the call is present in DOS 3.0+ but can’t be used to find the address of the error mode until DOS 3.1.
Conversations like this make me miss the discussions that were on BIX when they were figuring out things like InDOS.
I wonder if anyone archived those, or CompuServe, or some of the other services back then. Most likely it’s all just completely gone.