There’s an interesting (and quite annoying) bug specific to Windows 3.x running in 386 enhanced mode (also known as Windows/386) and using a PS/2 mouse. In some situations, the mouse may jump to the top or the bottom of the screen, especially when the system is somewhat unresponsive. The cause of the jumpiness turns out to be a subtle bug in VKD (Virtual Keyboard Device), a virtual device (VxD) which is part of Windows/386.
The way PS/2 mouse input is handled in Windows/386 is extraordinarily complex. The mouse driver (MOUSE.DRV) simply uses the BIOS INT 15h, sub-function C2h interface to register a mouse callback routine. The BIOS uses IRQ 12 and I/O ports 60h/64h to read data from the mouse. However, the VKD VxD inserts a whole new layer between the BIOS and the mouse hardware.
In Windows/386, VKD emulates a PS/2 style keyboard controller, including the mouse (auxiliary device) support. Microsoft implemented that in order to better separate VMs from hardware (and prevent VMs from globally controlling the A20 gate etc.) and provide hotkey support in a way which is transparent to the BIOS and DOS. Another benefit of emulating the keyboard controller is the ability to paste input into VMs in a way which is effectively indistinguishable from keyboard input. Emulation of the PS/2 auxiliary interface does not appear to bring any real benefit to the user, but was probably inevitable since it shares the same controller as keyboard input.
Now a small digression about the PS/2 mouse protocol. Whenever the mouse is moved or a button is pressed/released, the mouse sends a “packet” of data consisting of three bytes (there are extensions to that protocol, irrelevant for a discussion of standard Windows 3.x). A status byte is followed by two bytes with X and Y direction movement data. The movement information is 9 bits for each direction, with the top bit of each stored in the status byte. There is no guaranteed way to recognize where the packet starts and ends, therefore driver code must take great care to not lose any byte of data; if it does, the PS/2 mouse protocol will lose synchronization and the data will be misinterpreted.
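As a rough illustration of the packet layout described above, here is a minimal Python sketch of decoding one standard three-byte packet. The bit positions (buttons in bits 0-2, X sign in bit 4, Y sign in bit 5 of the status byte) follow the commonly documented standard PS/2 layout rather than anything stated in this article, and the function name is made up for the example.

```python
def decode_packet(status, x_low, y_low):
    """Decode one standard 3-byte PS/2 mouse packet into (dx, dy, buttons).

    Assumes the usual status byte layout: bits 0-2 are the left, right and
    middle buttons, bit 4 is the X sign bit and bit 5 is the Y sign bit.
    """
    # Each movement value is 9 bits wide: the sign bit lives in the status
    # byte and the low 8 bits arrive in a separate byte.
    dx = x_low - 256 if status & 0x10 else x_low
    dy = y_low - 256 if status & 0x20 else y_low
    buttons = status & 0x07
    return dx, dy, buttons
```

Note that nothing in the three bytes marks a byte unambiguously as the first one, which is why a single lost byte shifts every following packet.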
For the mouse support, VKD (more specifically the VAD, or Virtual Auxiliary Device component of VKD) implements a 32-byte queue (circular buffer). The VAD reads incoming bytes from the real keyboard/mouse controller and stores them internally, until at some later point the BIOS fetches them from the emulated controller. The queue size is a power of two for fast access. If the queue is full, the VAD takes care to always throw away three bytes of incoming data so that the PS/2 protocol stays synchronized. Unfortunately, there’s a bug in the queue overflow handling logic.
Because the VAD queue is 32 bytes in size, there’s room for 10 complete PS/2 packets and one incomplete, containing only the status byte and X direction movement data. And that’s where the bug is. The VAD will realize that the queue is full only after it has stored two out of three bytes of an incoming PS/2 packet. It will then discard the next three bytes, i.e. the last byte of the current packet plus the first two bytes of the next.
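The overflow sequence can be reproduced with a small model. This is only a sketch of the behavior as described here, not a translation of VAD.ASM; the class and method names are invented, and each byte is represented by a (packet number, field) label so that the mix-up is visible.

```python
from collections import deque

QUEUE_SIZE = 32   # size of the VAD queue in bytes
PACKET_SIZE = 3   # standard PS/2 mouse packet

class VadQueueModel:
    """Simplified model of the overflow handling described in the text."""

    def __init__(self):
        self.queue = deque()
        self.to_discard = 0

    def receive(self, byte):
        """Handle one incoming byte; return True if it was stored."""
        if self.to_discard > 0:
            self.to_discard -= 1
            return False
        if len(self.queue) >= QUEUE_SIZE:
            # Queue full: drop this byte and the two after it (three total),
            # without regard to where the packet boundary actually is.
            self.to_discard = PACKET_SIZE - 1
            return False
        self.queue.append(byte)
        return True

    def fetch_packet(self):
        """Remove the three oldest bytes, as the consumer of the queue would."""
        return [self.queue.popleft() for _ in range(PACKET_SIZE)]

def simulate():
    vad = VadQueueModel()
    stream = [(p, f) for p in range(12) for f in ("status", "x", "y")]
    # Everything up to the X byte of packet 11 arrives while nothing is read.
    for byte in stream[:35]:
        vad.receive(byte)
    # The consumer finally catches up and reads ten packets.
    packets = [vad.fetch_packet() for _ in range(10)]
    # Only now does the Y byte of packet 11 arrive.
    vad.receive(stream[35])
    packets.append(vad.fetch_packet())
    return packets
```

In this model, the last packet returned by simulate() combines the status and X bytes of packet 10 with the Y byte of packet 11, while the first ten packets come out intact.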
If the queue has emptied somewhat by the time the last byte (Y movement information) of the next mouse packet arrives, that byte will be stored together with the first two bytes of the earlier packet. The mouse will jump if the Y movement direction has changed between the two packets; it should be kept in mind that multiple packets may be discarded if the queue is not emptied quickly enough. If the direction changes, the top bit of a 9-bit quantity in the status byte won’t match the low 8 bits in the Y movement byte. Typically either a small positive number (e.g. 1) will be transformed to a large negative number (e.g. -255) or vice versa.
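The arithmetic behind the jump can be checked with a few lines of Python. The helper names below are made up for this example; the sketch simply pairs the Y sign bit from one packet with the low 8 bits from another.

```python
def encode_dy(dy):
    """Split a Y movement in the range -256..255 into (sign_bit, low_byte)."""
    return (1 if dy < 0 else 0, dy & 0xFF)

def decode_dy(sign_bit, low_byte):
    """Rebuild the signed 9-bit movement from its two parts."""
    return low_byte - 256 if sign_bit else low_byte

def mixed_dy(old_dy, new_dy):
    """Movement seen when the old packet's status byte is combined with
    the new packet's Y byte, as happens after the faulty discard."""
    old_sign, _ = encode_dy(old_dy)
    _, new_low = encode_dy(new_dy)
    return decode_dy(old_sign, new_low)
```

With these helpers, mixed_dy(-1, 1) evaluates to -255 and mixed_dy(1, -1) to 255, whereas two movements in the same direction, such as mixed_dy(3, 5), simply yield the newer value 5.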
That explains why the mouse only ever jumps in the Y direction, either to the top or the bottom of the screen. Because the status byte is always consistent with the X direction movement byte, no jumps can occur in the horizontal direction. Under normal circumstances, the system should be emptying the VAD queue quickly enough that it will never overflow. But if it for any reason doesn’t and the queue fills up, the mouse jumps are quite likely to occur.
The above information was obtained by reading the VKD sources in the Windows 3.1 DDK (especially DDK\386\VKD\VAD.ASM) and by using Microsoft’s WDEB386 debugger in conjunction with a debug build of WIN386.EXE (also supplied with the DDK). The WDEB386 debugger needed patching to run on a post-486 CPU. The .VKD command in WDEB386 was used to enable VAD trace-outs, later read with the .LQ command.
Wow, that makes perfect sense! … I bet this bug is in other MS products, as sometimes the mouse goes insane on NT 4.0 (probably others too)… and it would make sense, since everyone loves 32-bit alignments, although with 3-byte data…… oops.
Luckily the protocol was fixed to take 4 bytes with the introduction of the wheel mouse.
The 4 vs 3 bytes thing is irrelevant (so extending the protocol to 4-byte packets does nothing for reliability). The problem is simply that the mouse sends a stream of bytes which must be correctly grouped into sets of three (or four). But there’s no reliable way to tell where the sequence starts or ends, so the protocol is susceptible to “noise” if a byte is somehow lost. The keyboard controller and driver software should normally ensure that that doesn’t occur, but bugs happen.
The Register’s writer ran into exactly this problem in the article http://www.theregister.co.uk/2012/04/06/windows_3_1_anniversary/ regarding the 20th anniversary of Windows 3.1.
> Emulation of the PS/2 auxiliary interface does not appear to bring any real benefit to the user
Emulation of the PS/2 interface takes care of multiple DOS boxes each running their own mouse driver with different settings (sampling rate, scaling, etc.). You need to preserve these settings when switching DOS boxes, or the mouse will behave differently (slower, faster, etc.) each time you reenter the same box.
Unfortunately the approach taken by VKD/VAD is rather dumb in the sense that rather than actually keeping these separate settings for each DOS box, it just ignores them, forcing everyone to use the same boring settings (such as a packet size of 3).
With MSMouse 9.x driver, MS toyed with a new PS/2 synchronization method: assume that if enough time has passed between receiving any two bytes, it means that the second byte is the start of a new packet. To do this, rather than requesting full 3-byte packets from the PS/2 BIOS, they request 1-byte packets and do synchronization on their own.
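The time-gap idea can be sketched as follows. This is not the MSMouse 9.x code; the function name, the (timestamp, byte) input format and the 50 ms threshold are all assumptions chosen for illustration.

```python
def group_packets(timed_bytes, gap_threshold=0.05, packet_size=3):
    """Group (timestamp_seconds, byte) pairs into packets.

    If the pause before a byte exceeds gap_threshold while a packet is
    only partly assembled, the partial packet is thrown away and the byte
    is treated as the start of a new packet. The threshold is a made-up
    value, not one taken from any real driver.
    """
    packets = []
    current = []
    last_time = None
    for timestamp, byte in timed_bytes:
        if current and timestamp - last_time > gap_threshold:
            current = []  # stale fragment: resynchronize on this byte
        current.append(byte)
        last_time = timestamp
        if len(current) == packet_size:
            packets.append(tuple(current))
            current = []
    return packets
```

For example, two stray bytes followed by a long pause and then three bytes in quick succession produce a single packet built from the last three bytes only.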
However, this is problematic when VKD is going to ignore this setting and keep using a hardcoded 3-byte packet size. MS’s solution was to ship a new VKD called MOUSEVKD.386, which allows a VM to use a 1-byte packet size, thereby again allowing the DOS driver to do the synchronization by itself, and accidentally working around this queuing bug.
(See https://www.betaarchive.com/wiki/index.php/Microsoft_KB_Archive/97883 )
I suspect the VKD logic is basically too old. The PS/2 mouse BIOS service was never hardcoded for 3-byte packets, but in practice PS/2 pointing devices used 3-byte packets for many years. I’m also not sure just how widespread PS/2 mice were (on AT compatibles) before ATX took over. I know my old PCs all had serial mice and no PS/2 port.
The problem described in the post was just not thinking clearly enough… the queue size should have been either a multiple of the packet size, or they should have made sure to queue or drop full packets. It’s a somewhat common bug because the overflow conditions tend to be rarely encountered.
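One way to "queue or drop full packets" is to make the keep-or-drop decision once, at the first byte of each packet. The following Python sketch shows the idea with invented names; it assumes the incoming stream is in sync to begin with, and it is not how any actual VKD works.

```python
from collections import deque

class PacketSafeQueue:
    """Byte queue that only ever stores or drops whole packets."""

    def __init__(self, capacity=32, packet_size=3):
        self.capacity = capacity
        self.packet_size = packet_size
        self.queue = deque()
        self.position = 0      # index of the next byte within its packet
        self.dropping = False  # whether the current packet is being dropped

    def receive(self, byte):
        """Handle one incoming byte; return True if it was stored."""
        if self.position == 0:
            # Decide at the packet boundary: keep the packet only if all
            # of its bytes are guaranteed to fit.
            free = self.capacity - len(self.queue)
            self.dropping = free < self.packet_size
        if not self.dropping:
            self.queue.append(byte)
        self.position = (self.position + 1) % self.packet_size
        return not self.dropping

    def fetch_packet(self):
        """Remove and return the oldest complete packet."""
        return [self.queue.popleft() for _ in range(self.packet_size)]
```

With a 32-byte capacity this queue never holds more than 30 bytes, so the two leftover bytes that trigger the bug are simply never used.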
You’re right that the VKD also forces all VDMs to use the same settings. I can only assume that it did not cause significant problems in practice.
The MOUSEVKD.386 approach is interesting, and I guess it just exposes the fact that the built-in VKD/VAD wasn’t all that great. Looks like MS Mouse 9.0 came out in 1993, too late to put the better driver into Windows 3.1 but well before Windows 3.1 could be ignored.
I think PS/2 mice tended to be more common in OEM setups (using LPX or custom form factors), rather than on whitebox AT hardware. With many exceptions, of course!