Which is the fastest DPMI extender for freebasic?

lassar
Posts: 306
Joined: Jan 17, 2006 1:35

Post by lassar »

I am trying to get the most speed out of my program when it runs in DOSBox.

Which DOS extender is the fastest?
MichaelW
Posts: 3500
Joined: May 16, 2006 22:34
Location: USA

Re: Which is the fastest DPMI extender for freebasic?

Post by MichaelW »

Instead of looping a large number of times to get reasonably precise timing from a more or less low-resolution timer, I adapted my most recent cycle-count macros for use under DOS. Under multitasking Windows you must raise the priority to minimize interruptions and get a reasonably repeatable cycle count; under DOS, at least normally, the test runs pretty much uninterrupted.

Counter.bas:

Code:

''=============================================================================
dim shared as longint counter_cycles
dim shared as integer _counter_loopcount_, _counter_loopcounter_

#macro COUNTER_BEGIN( loop_count )
    _counter_loopcount_ = loop_count
    _counter_loopcounter_ = _counter_loopcount_
    asm
        xor eax, eax
        cpuid               '' serialize
        rdtsc               '' get reference loop start count
        push edx            '' preserve msd (most significant dword)
        push eax            '' preserve lsd
        xor eax, eax
        cpuid               '' serialize
        .balign 16
      0:                    '' start of reference loop
        sub DWORD PTR _counter_loopcounter_, 1
        jnz 0b              '' end of reference loop
        xor eax, eax
        cpuid               '' serialize
        rdtsc               '' get reference loop end count
        pop ecx             '' recover lsd of start count
        sub eax, ecx        '' calc lsd of reference loop count
        pop ecx             '' recover msd of start count
        sbb edx, ecx        '' calc msd of reference loop count
        push edx            '' preserve msd of reference loop count
        push eax            '' preserve lsd of reference loop count
        xor eax, eax
        cpuid               '' serialize
        rdtsc               '' get test loop start count
        push edx            '' preserve msd
        push eax            '' preserve lsd
    end asm
    _counter_loopcounter_ = _counter_loopcount_
    asm
        xor eax, eax
        cpuid               '' serialize
        .balign 16
      1:                    '' start of test loop
    end asm
#endmacro

''=============================================================================

#macro COUNTER_END()
    asm
        sub DWORD PTR _counter_loopcounter_, 1
        jnz 1b              '' end of test loop
        xor eax, eax
        cpuid               '' serialize
        rdtsc
        pop ecx             '' recover lsd of start count
        sub eax, ecx        '' calc lsd of test loop count
        pop ecx             '' recover msd of start count
        sbb edx, ecx        '' calc msd of test loop count
        pop ecx             '' recover lsd of reference loop count
        sub eax, ecx        '' calc lsd of corrected loop count
        pop ecx             '' recover msd of reference loop count
        sbb edx, ecx        '' calc msd of corrected loop count
        mov DWORD PTR [counter_cycles], eax
        mov DWORD PTR [counter_cycles+4], edx
    end asm
    counter_cycles /= _counter_loopcount_
#endmacro

''=============================================================================
The only way I can see to benchmark DOS extenders is to compare the speed of various functions. I made an arbitrary selection here: one FreeBASIC function call, one real-mode (RM) interrupt call, and one DPMI function call. To get a meaningful comparison, I think you would need to test a larger number of functions, starting with the most commonly used.

Code:

#include "counter.bas"

dim as any ptr p

sleep 4000

for i as integer = 1 to 4

    COUNTER_BEGIN( 1000000 )
        p = allocate(16)
    COUNTER_END()
    print counter_cycles

    COUNTER_BEGIN( 1000000 )
        asm
            ''-------------------------------------------
            '' Call the BIOS Return Memory Size service.
            ''-------------------------------------------
            int 0x12
        end asm
    COUNTER_END()
    print counter_cycles

    COUNTER_BEGIN( 1000000 )
        asm
            ''---------------------------------------
            '' Call the DPMI Get Page Size function.
            ''---------------------------------------
            mov ax, 0x0604
            int 0x31
        end asm
    COUNTER_END()
    print counter_cycles
    print

next

sleep
All tests were done on a P2 system running MS-DOS 6.22.

CWSDPMI:

Code:

324
4676
4724

324
4676
4724

325
4676
4724

325
4676
4724
HDPMI:

Code:

280
1901
261

279
1901
261

279
1901
261

279
1901
261
As you can see, the repeatability was very good. AFAIK Japheth expended a lot of effort on HDPMI, coding it entirely in assembly IIRC.
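To put the two result sets side by side, here is a minimal sketch that divides the CWSDPMI counts by the HDPMI counts. The numbers are copied from the listings above (using the most frequent value where a run varied by a cycle); the `speedup` helper is a hypothetical name introduced only for this illustration. On these figures HDPMI comes out at roughly 1.2, 2.5 and 18 times fewer cycles for the three calls, on this one P2 system only.

```freebasic
'' Ratio of CWSDPMI cycles to HDPMI cycles for each measured call.
'' speedup() is a hypothetical helper, not part of counter.bas.
function speedup( byval slow_cycles as longint, byval fast_cycles as longint ) as double
    '' "/" is floating-point division in FreeBASIC, so the result is a double
    return slow_cycles / fast_cycles
end function

print using "allocate(16)      ##.## x"; speedup( 324, 279 )
print using "int 0x12          ##.## x"; speedup( 4676, 1901 )
print using "DPMI fn 0x0604    ##.## x"; speedup( 4724, 261 )
sleep
```

The ratios only compare these three calls, so they say nothing about functions that were not measured.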
lassar
Posts: 306
Joined: Jan 17, 2006 1:35

Re: Which is the fastest DPMI extender for freebasic?

Post by lassar »

Looks like good work.

I have been doing my testing in DOSBox.

So far in DOSBox the fastest DPMI host is CWSDPMI. But it is a game of chance whether your program will work with it.

I had uncommented 3 lines: two print statements and a sleep statement.

In DOSBox, under CWSDPMI, this would cause my program to crash.

I commented these 3 lines out again, and there was no more crashing in DOSBox.
MichaelW
Posts: 3500
Joined: May 16, 2006 22:34
Location: USA

Re: Which is the fastest DPMI extender for freebasic?

Post by MichaelW »

Another set of cycle counts, running under the Windows XP NTVDM:

Code:

 478
 13237
 6336

 456
 13208
 6336

 443
 13213
 6336

 474
 13213
 6336
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: Which is the fastest DPMI extender for freebasic?

Post by counting_pine »

What is a cycle in DOSBox? Does it give an accurate record of real-world timing?
I suspect DOSBox may work at different speeds on different platforms, especially if some hosts have a dynamic recompiler.
lassar
Posts: 306
Joined: Jan 17, 2006 1:35

Re: Which is the fastest DPMI extender for freebasic?

Post by lassar »

I don't know what a cycle is in DOSBox, but I do know you can set the speed, in cycles, at which you want a program to run.
MichaelW
Posts: 3500
Joined: May 16, 2006 22:34
Location: USA

Re: Which is the fastest DPMI extender for freebasic?

Post by MichaelW »

I don’t know much about DOSBox, but my quick search shows multiple reports of people using RDTSC and CPUID with it. Given this capability, under DOSBox you should be able to measure the effective clock speed of the processor, emulated or otherwise.

Code:

dim as double t
dim as longint tsc1,tsc2
asm
    xor eax, eax
    cpuid
    rdtsc
    mov DWORD PTR [tsc1], eax
    mov DWORD PTR [tsc1+4], edx
end asm
t = timer + 10
do
loop until timer > t
asm
    xor eax, eax
    cpuid
    rdtsc
    mov DWORD PTR [tsc2], eax
    mov DWORD PTR [tsc2+4], edx
end asm
print using "#####MHz";(tsc2-tsc1)/10000000
sleep
The above code, running under the Windows XP NTVDM, displays a clock speed that is within a fraction of a percent of the clock speed that Windows reports.
rugxulo
Posts: 219
Joined: Jun 30, 2006 5:31
Location: Usono (aka, USA)

Re: Which is the fastest DPMI extender for freebasic?

Post by rugxulo »

MichaelW wrote: I don’t know much about DOSBox, but my quick search shows multiple reports of people using RDTSC and CPUID with it. Given this capability, under DOSBox you should be able to measure the effective clock speed of the processor, emulated or otherwise.
...
The above code, running under the Windows XP NTVDM, displays a clock speed that is within a fraction of a percent of the clock speed that Windows reports.
I don't understand it all, but I'm not sure RDTSC is reliable in all situations, e.g. over long durations (esp. on earlier CPUs), on multi-core systems, or under power management (ACPI, e.g. Turbo Boost). Newer CPUs (e.g. Intel Nehalem/Westmere) have RDTSCP. And really, you should always test for CPUID availability, then use CPUID to test for RDTSC before using it.
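A minimal sketch of that two-step check, assuming a 32-bit x86 FreeBASIC target such as the DOS port. CPUID is available if bit 21 (ID) of EFLAGS can be toggled, and RDTSC is reported by bit 4 of EDX after CPUID leaf 1. The function names `has_cpuid` and `has_rdtsc` are hypothetical, chosen only for this example.

```freebasic
'' Non-zero if the EFLAGS ID bit (bit 21) can be toggled,
'' which means the CPUID instruction is available.
function has_cpuid( ) as integer
    dim as integer toggled
    asm
        pushfd
        pop eax                 '' eax = original EFLAGS
        mov ecx, eax            '' keep a copy
        xor eax, 0x200000       '' flip the ID bit
        push eax
        popfd                   '' try to write it back
        pushfd
        pop eax                 '' read EFLAGS again
        push ecx
        popfd                   '' restore the original EFLAGS
        xor eax, ecx            '' bits that actually changed
        and eax, 0x200000
        mov DWORD PTR [toggled], eax
    end asm
    return toggled
end function

'' Non-zero if CPUID reports the time stamp counter (leaf 1, EDX bit 4).
function has_rdtsc( ) as integer
    dim as integer features
    if has_cpuid( ) = 0 then return 0
    asm
        push ebx                '' cpuid overwrites ebx, so save it explicitly
        mov eax, 1
        cpuid
        pop ebx
        mov DWORD PTR [features], edx
    end asm
    return features and &h10
end function

if has_rdtsc( ) then
    print "RDTSC is available"
else
    print "RDTSC is not available; fall back to TIMER"
end if
sleep
```

This only tells you that the instruction exists. It does not address the reliability concerns above (power management, multiple cores, or an emulator's own notion of a cycle).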

And that brings us to DOSBox. DOSBox is not a full emulator. Its official and primary purpose (last I heard, unless something changed, which I highly doubt) is "only for games"! In other words, they either ignore or refuse any suggestions for features that aren't immediately necessary for old commercial games. In fact, older versions didn't even properly let you test for CPUID via flags (even though it was/is supported in its 486 DX2 emulation) because of this. DOSBox is not using V86 (since it's meant to be portable to non-x86 machines as well), so it's very slow, and the timing is nowhere near that of any native x86 PC. It's been said that you need at least a 1 GHz host PC just to emulate a "fast" 486. (I would recommend native DOS, esp. via bootable USB from Rufus, or maybe DOSEMU or VirtualBox instead of DOSBox for any development or testing.)

As for fastest "extender" ... there is no perfect answer. AFAIK, FreeBASIC (and "most" DJGPP-based stuff) since DJGPP v2 is "DPMI only", hence there is no use of real extended int 21h DOS API. CWSDPMI is 32-bit DPMI only, no extensions. HDPMI32 also has popular extensions and partial DPMI 1.0, but almost nobody (except maybe?? DPMIONE) supports full DPMI 1.0, including Windows or OS/2 (supposedly reports 0.95 for partial support, which was not a real version). And they all have various bugs and limitations (e.g. CWSDPMI keeps page tables in low memory). OpenWatcom usually requires "extenders" (DOS/4GW, PMODE/W, Causeway) for its apps and doesn't (usually) play well with CWSDPMI due to this.

If I had to guess, I'd say WDOSX (ring 0) is fastest, and it's only 10 KB, but it's always embedded into your app, you can't unstub, it always compresses (shrouds), and it's also got a few bugs (like everything in life). D3X is easier to build (NASM), can unstub, and at least appears to be friendly to CWSDPMI (unlike most others), but it is non-commercial only. In other words, it's easiest to let the user choose, e.g. leave things as default (or just assume CWSDPMI except in rare circumstances).

In short, nothing is perfect, you really just have to know what you're doing, test everything, be patient, accept bugs or fix them yourself, etc. :-/
lassar
Posts: 306
Joined: Jan 17, 2006 1:35

Re: Which is the fastest DPMI extender for freebasic?

Post by lassar »

Finally managed to get PMODE/DJ to work. Maybe I was using the wrong version or something.

In DOSBox, it seems to be about 4 to 8 times faster than CWSDPMI.

My program now runs halfway decently at 100 cycles in DOSBox.