In another useless project, I decided to find out why even trivial programs created with the Open Watcom compiler refuse to run on Windows NT 3.1. Attempting to start an executable failed with the message “foo.exe is not a valid Windows NT application.”
But this time, the surgery was successful.
There turned out to be several problems, some related and some not. The biggest issue was the problem previously discussed in an earlier post. To recap, the Open Watcom runtime calls the GetEnvironmentStringsA API, which does not exist on NT 3.1. This also causes trouble on Win32s; although newer Win32s versions do implement GetEnvironmentStringsA, it just always fails.
The solution is to use the original GetEnvironmentStrings API (which is equivalent to GetEnvironmentStringsA). But that’s not enough.
There is another related problem, which is that the runtime calls the FreeEnvironmentStringsA API during termination. This API does not exist on NT 3.1; it was only added in NT 3.5 together with GetEnvironmentStringsA/W. On NT 3.1 there was nothing to free, so the API wasn’t necessary.
The solution is easy enough: query the (already loaded) KERNEL32.DLL handle and then obtain the address of FreeEnvironmentStringsA. If this fails (as it will on NT 3.1), just don’t call the function because there is nothing to free anyway.
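A minimal sketch of that lookup in C follows. The helper names find_free_env and release_env_block are hypothetical and are not taken from the Open Watcom runtime sources; the stand-in types in the non-Windows branch are likewise assumptions that exist only so the fragment compiles elsewhere. The Win32 calls used (GetModuleHandleA and GetProcAddress) are the documented ones.

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the sketch compiles off Windows (illustration only). */
#define WINAPI
typedef int BOOL;
typedef char *LPCH;
#endif

/* Pointer type matching FreeEnvironmentStringsA (present on NT 3.5 and later). */
typedef BOOL (WINAPI *free_env_fn)(LPCH);

/* Hypothetical helper: look the API up at run time instead of importing it.
   Returns NULL when KERNEL32.DLL does not export it, as on NT 3.1. */
static free_env_fn find_free_env(void)
{
#ifdef _WIN32
    HMODULE k32 = GetModuleHandleA("KERNEL32.DLL"); /* already loaded */
    if (k32 == NULL)
        return NULL;
    return (free_env_fn)GetProcAddress(k32, "FreeEnvironmentStringsA");
#else
    return NULL; /* no such API here; the same path NT 3.1 would take */
#endif
}

/* Hypothetical helper: returns 1 if the block was released through the API,
   0 if the call was skipped because there is nothing to free. */
static int release_env_block(LPCH env, free_env_fn fn)
{
    if (env == NULL || fn == NULL)
        return 0;
    return fn(env) ? 1 : 0;
}
```

Because the function is never named in the import table, the loader on NT 3.1 has nothing to reject; the only cost is the extra lookup code at termination.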
This additional logic adds size to the runtime, so I decided to build a separate variant of the relevant startup module and call it maint31r.obj (for the default register calling convention). Users must link this module in the unlikely case that they wish to build NT 3.1 compatible applications.
But that’s still not enough. Windows NT 3.1 (but not NT 3.5 and later) refuses to run applications marked as needing the Windows subsystem version 4.0. That’s what causes the “foo.exe is not a valid Windows NT application” error.
However, that is easy to avoid during linking. The linker (wlink) needs to be passed runtime win=3.10 when linking, which sets the subsystem to Windows and version to 3.10, as required by NT 3.1.
When using the compile and link utility, this can be done as follows:
wcl386 winhello.c maint31r.obj -bt=nt -l=nt_win -"runtime win=3.10"
Note that the full path to maint31r.obj needs to be supplied (it will be eventually delivered as %WATCOM%\lib386\nt\maint31r.obj). With everything in place, it is possible to create executables that run on Windows NT 3.1 from 1993, as the first screenshot demonstrates.
The same executable happily runs on later NT versions, including Windows 10, and it also runs under Win32s.
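To double-check what the linker actually recorded, the subsystem version can be read straight out of the PE header. The following is only a sketch: pe_subsystem_version, rd16 and rd32 are hypothetical helper names, and the code assumes the whole header is already in a memory buffer. The offsets follow the published PE format, where MajorSubsystemVersion and MinorSubsystemVersion sit at offsets 48 and 50 of the optional header for both PE32 and PE32+.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read little-endian 16-bit and 32-bit values from a byte buffer. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Hypothetical helper: extract Major/MinorSubsystemVersion from a PE image.
   Returns 0 on success, -1 if the buffer does not look like a PE file. */
static int pe_subsystem_version(const unsigned char *img, size_t len,
                                unsigned *major, unsigned *minor)
{
    unsigned long pe_off, opt_off;

    assert(major != NULL && minor != NULL);
    if (img == NULL || len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return -1;                         /* no DOS stub header */

    pe_off = rd32(img + 0x3C);             /* e_lfanew: offset of "PE\0\0" */
    if (pe_off > len || len - pe_off < 24 + 52)
        return -1;                         /* header would run past the buffer */
    if (memcmp(img + pe_off, "PE\0\0", 4) != 0)
        return -1;

    opt_off = pe_off + 4 + 20;             /* skip signature and COFF header */
    *major = rd16(img + opt_off + 48);     /* MajorSubsystemVersion */
    *minor = rd16(img + opt_off + 50);     /* MinorSubsystemVersion */
    return 0;
}
```

For an executable linked with runtime win=3.10, one would expect this to report major 3 and minor 10, whereas an image marked as needing subsystem version 4.0 would report 4 and 0.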
Are there any compatibility shims activated in 95 when the subsystem is below 4.0? I don’t have the undocumented 95 book handy but perhaps something like GetProcessDword?
There have been various unfortunate compatibility checks over the years making it impossible to please everyone. 2000 wants one field I don’t remember to be exactly 5.0 and image version should be exactly 6.0 to prevent UAC issues ( https://github.com/jrsoftware/issrc/blob/c0778bb57d028ffe04dc4492f3b44e9dd932d274/Projects/CompExeUpdate.pas#L92 )
I would not be surprised if they even looked at the linker version in places.
WS, the subsystem version controls how UI is displayed. A 3.x application, including an NT 3.x application, gets bold, black-on-white; a 4.0 application gets nonbold, black-on-grey. See http://www.malsmith.net/blog/pe-subsystem-version/ for an example. Although officially Visual C++ 2.0 requires NT 3.5, its subsystem is an NT 3.1 application and it launches successfully on NT 3.1.
I vaguely remember there was some ctl3d32 hackery that could get a decent appearance on both NT 3.1 and Windows 95/NT 4.0, but have forgotten what it was.
Is there really any need to call FreeEnvironmentStringsA during program termination? Surely you’re just freeing memory in the application’s own address space that’s about to disappear entirely? If it is needed, surely that means any application that crashes leaks memory?
@Stu If you change a variable or replace the whole block using SetEnvironmentStrings it can allocate a new block on the heap. The old pointer given out by GetEnvironmentStrings must stay valid even if it is not the current environment.
For an EXE file, it’s not really necessary, no. But for a DLL, yes, because it can be loaded and unloaded any number of times, and if every load/unload leaks some memory, eventually you (i.e. the process loading and unloading the DLL) will run out.
I’d also been retro-porting to NT 3.1 and hit some odd bugs that I haven’t seen before in GetTempPath and User32. These were both fixed fairly quickly, so presumably somebody else knew about them. Both took the form of accessing strings beyond the supplied buffer length, which “typically” works because the next memory will be valid, but I was still surprised to see them in a released product.
Write-up, if you’re interested: http://www.malsmith.net/blog/nt31-buffer-overflows/
Replying to my earlier comment about how to get a decent appearance for a GUI process on NT 3.1 and NT 4.0 simultaneously. I couldn’t figure out the Ctl3d32 hackery, so did something much worse, but it works, and could be extended to other graphical programs given enough effort.
http://www.malsmith.net/blog/self-modifying-subsystem/
That’s very cool, and very cumbersome 🙂
Would it be possible to create a patch DLL that will somehow be loaded before USER32.dll, and in its DLL_PROCESS_ATTACH it would fix up the subsystem version in the main executable image? I think that might be doable, and it should work for more complex apps as well.
I agree with your instinct that there must be a better way, although it’s not clear to me how to force a DLL to load before user32 via the import table. Note any such scheme needs to work on all newer versions of Windows, so it can’t assume too much about loader behavior.
Offhand the best thing I can think of is to make User32 a DelayLoad, so the code still has to do the version check before calling any User32 function but doesn’t have to use GetProcAddress. The documentation is pretty sparse on the requirements for DelayLoad though – for this to work, it needs to really delay loading on NT 4 and up (if NT 3.1 loads User32 before the version check it doesn’t change anything.) It looks like all of the brains of DelayLoad are a statically linked function to resolve imports though, so it shouldn’t depend on OS version.
Up until now I’ve been avoiding DelayLoad since it seems to be an automated GetProcAddress with all the error handling removed. With a normal import, the program won’t run if the export is missing; with GetProcAddress the program gets to handle the condition; with DelayLoad you end up forced into exceptions or a hook function to recover. It’s a classic example of making code look simple by having hidden codepaths that a casual inspection of the code would not notice.
Based on my research DLLs are loaded in the order they appear in the import table, and it didn’t sound like something that would be safe to change. Precisely because the DLL load order does matter.
I agree that DelayLoad looks like one of those things that hide way too much, with unpredictable side effects.
Somewhat related, and it’s probably not as good as doing it with Open Watcom, and I haven’t tried it lately, but:
ld -pie --subsystem windows:3.10 -e_entry_point [your .o files here]
…used to work for me, with MinGW in place of Open Watcom. The -pie is for the sake of Win32s. And the -e_entry_point is there to bypass all the startup code that depends on MSVCRT.DLL, so it’s “void entry_point()” instead of “int main()”.
I guess -lcrtdll basically works for a C runtime, but I mostly just restricted myself to Win32 function calls whenever I did this.
…And I think adding DS_3DLOOK to your dialog boxes does most (but not all!) of the work to get 3D look and feel on Windows 95 when you’ve also got the subsystem set to 3.1.
The reason for CTL3D32-using NT 3.1/3.5x applications looking bad when running on later Windows versions is because an older version of CTL3D32 is being used. As documented in Microsoft’s CTL3D.EXE package from 1994 (http://ftpmirror.your.org/pub/misc/ftp.microsoft.com/Softlib/MSLFILES/CTL3D.EXE), the library would disable itself if the detected Windows version was 4.0 or later, under the assumption that 4.0 would render all older applications’ user interfaces with its own native 3D appearance. As it turned out, of course, later Windows instead emulates the plain 2D 3.x style for Win32 programs with subsystem < 4.0, so that's what you get as a result.
Solutions include (a) replacing CTL3D32.DLL with a newer version updated to better handle Windows 4.0 (only applicable for applications linked to the DLL and not the static library), or (b) using the IMAGECFG utility with its -w option to override (in the PE header of the application EXE) the DWORD returned by GetVersion(), thus fooling the older CTL3D32 library into not disabling itself. IMAGECFG is included among the debugging tools on the NT 4.0 CD, along with SETNT351.CMD and SETWIN95.CMD batch files for convenience when either is the desired faked Windows version. Regarding (a), although NT 4.0 doesn't install CTL3D32.DLL by default, the CD includes a version 2.29 of the DLL (link timestamp from 4/1995) in \[platform]\INETSRV\ODBC, which has the updated behavior. There are also later 2.31 versions distributed with Visual C++ and various software. NT does install a "ntctl3d.dll" v. 2.31.1371.1 by default, which is more or less equivalent (except for using ANSI APIs instead of Unicode) and can also be copied and used. The updated DLL doesn't result in a native 4.0 appearance (bold fonts and 2D combo boxes remain), but it's a good compromise between older programs' intended 3D appearance and the native UI. Fooling any version of CTL3D32 using IMAGECFG results in the same appearance as Win16 apps using the 16-bit library.
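For anyone wanting to see what a faked GetVersion() value actually encodes: the documented layout of the returned DWORD is major version in the lowest byte, minor version in the next byte, and (on NT) the build number in the high word, with the top bit set on non-NT platforms such as Win32s and Windows 95. A small decoding sketch in C, where the struct and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the decoded fields. */
struct win_version {
    unsigned major;
    unsigned minor;
    unsigned build;   /* meaningful on NT only */
    int      is_nt;   /* 1 when the top bit of the DWORD is clear */
};

/* Split a GetVersion()-style DWORD into its documented parts. */
static void decode_getversion(unsigned long v, struct win_version *out)
{
    assert(out != NULL);
    out->major = (unsigned)(v & 0xFF);
    out->minor = (unsigned)((v >> 8) & 0xFF);
    out->build = (unsigned)((v >> 16) & 0x7FFF);
    out->is_nt = (v & 0x80000000UL) == 0;
}
```

With this layout, a value of 0x04213303 decodes as NT 3.51 build 1057.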