I (finally!!!) updated my file I/O test for FB 0.20 to use
the new bytes-read count returned by GET.
http://freefile.kristopherw.us/uploads/temp/fileio.zip
The good news is that it works as intended. The less good, but not
really new, news is how the file I/O performance degrades; see the shot.
FB's GET (block size <= 32 KiB) costs ca. 12% of performance
compared to DJGPP's FREAD (block size <= 32 KiB).
To get the full truth, it should be noted that several other "layers"
exist with the same sort of problem, some even more severe.
DJGPP (block size <= 32 KiB) costs ca. 10% compared to real mode (block size = 60 KiB).
Running the Win32 version in DOS via HX is noticeably faster than the "native" DOS
version ... the "translation" done by HDPMI in Ring 0 causes a loss of under 4% only.
FAT costs ca. 20% compared to raw sector reading (block size = 64 KiB).
Reading raw sectors using INT $13 via the XDMA driver caused no
noticeable slowdown compared to "direct" DMA usage.
And, last and worst, the mainboard + HD (IDE/ATA stuff) cost about half (!!!) of the
theoretical performance; in this test even more, in other tests a bit less.
So finally the read performance degrades from 33 MiB/sec to 11 MiB/sec.
Writing is even worse: writing my 260'000'000-byte (248 MiB) file from real mode
(block size = 60 KiB) took 55 sec (4.5 MiB/sec, EDR-DOS)
or 58.5 sec (4.3 MiB/sec, FreeDOS).
Exact values depend strongly on the hardware used (CPU, mainboard, HD), the filesystem
(type, fragmentation), and the software (DOS kernel, cache). In my tests, there was
no cache effect and no fragmentation. While little can be done about the
IDE/ATA stuff (except a hardware upgrade; I'm not sure how far the fastest SATA add-on
cards can be used in DOS efficiently, or at all), the biggest flaw is the FAT filesystem.
While enabling an additional cache might help to some degree, the real solution would
be a better filesystem for DOS. Let me know if you want to alpha-test my draft. ;-)
GET 0.20 improvements | File I/O performance
Your comparison to the ATA-33 interface transfer rate is probably meaningless. For a long time now, hard drives have supported interface transfer rates that exceed the sustained data transfer rate of the drive, even on the outer tracks where the rate is typically at its maximum. For marketing purposes, manufacturers tend to support the maximum transfer rate from the latest ATA standard, regardless of the drive's capability.
Well, I wrote my own program based on yours, using both the rtl and crt file access methods, and my result was: it doesn't matter. Times were very close to each other, and the larger factor was the disk caching of the OS. Maybe if I compiled it for DOS and tried it on an actual DOS machine it would make a difference, but I doubt it.
On my machine I tested with various files (the bigger the better, so that the OS can't cache the whole thing and by the end of the run the beginning is invalidated) and I got much better results than you did. However, I am running XP x64 with a SATA150 drive and 4 GiB of memory, so finding files the OS can't cache is tricky (which is also why I put in a function to create a large temporary file).
There are other things you have to consider too, such as data locality (is your disk heavily fragmented?) and the OS's driver itself (is it keeping the FAT in memory, or constantly jumping back to it on the disk?). Both of these cause a lot of seeking on the disk, which is where you will lose most of your time.
If you really want to benchmark the hardware, you need to write a tool that talks to the hardware directly instead of going through millions of software layers (rtl -> crt -> pmode-to-rmode translation layer -> OS -> BIOS -> hardware).
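Short of talking to the hardware directly, there is a middle ground for taking the OS cache out of the picture on systems with a POSIX layer (not DOS, where the tests above ran without any cache anyway): evict the file's cached pages before each timed pass. A hedged sketch, assuming Linux semantics for POSIX_FADV_DONTNEED:

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Advise the kernel to drop cached pages for one file so the next
   read actually hits the disk.  Returns 0 on success, -1 on error.
   The call is advisory: Linux evicts clean page-cache pages; other
   systems may ignore it entirely. */
int drop_cache(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    fsync(fd);   /* dirty pages cannot be evicted; flush them first */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    close(fd);
    return rc == 0 ? 0 : -1;
}
```

Calling drop_cache(path) before each timed run gives cold-cache numbers without needing a test file larger than RAM.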
/'
RUTMN
Random Useless Tool to Measure Nothing but it accesses the disk a lot.
copyleft Eric Cowles, 2008
'/
#include once "crt.bi"
#include once "vbcompat.bi"
#Define TEMP_FILE "rutmn.tmp"
#Define Show_Rate( Rate, UnitSize ) Format( ( Int( ( Rate * 10 ) / ( UnitSize ^ 2 ) ) / 10 ), "#,###.0" )
#Macro Show( Access_Control, Rate )
Color 7
Print Tab( 4 );Access_Control;
Color 14
Print Tab( 31 ); Show_Rate( Rate, 1024 );
Color 10
Print Tab( 41 ); Show_Rate( Rate, 1000 )
Color 7
#EndMacro
#Macro ShowProgress( Current, Total )
Bars = Current / Total * 50
If( Bars > lBar )Then
Locate , 1, 0
Color 7
Print " Progress : [";
If( Bars > 0 )Then Print String( Bars, "#" );
If( Bars < 50 )Then Print String( 50 - Bars, "." );
Print "]";
lBar = Bars
End If
#EndMacro
Function GetParam Overload ( Param As String, Default As String ) As String
Dim As String tCom = lCase( Command )
Dim As String tParam = lCase( Param )
Dim As String rVal = Default
If( Instr( tCom, tParam ) )Then
tCom = Mid( tCom, Instr( tCom, tParam ) + Len( tParam ) )
If( Instr( tCom, "-" ) )Then
tCom = Left( tCom, Instr( tCom, "-" ) - 1 )
End If
rVal = Trim( tCom )
End If
Function = rVal
End Function
Function GetParam Overload ( Param As String, Default As Integer ) As Integer
Dim As String tCom = lCase( Command )
Dim As String tParam = lCase( Param )
Dim As Integer rVal = Default
If( Instr( tCom, tParam ) )Then
tCom = Mid( tCom, Instr( tCom, tParam ) + Len( tParam ) )
If( Instr( tCom, "-" ) )Then
tCom = Left( tCom, Instr( tCom, "-" ) - 1 )
End If
rVal = Val( tCom )
End If
Function = rVal
End Function
Function FileSize ( Pathname As String ) As uLongInt
Dim As Integer FF = FreeFile
If( Open( Pathname For Binary Access Read As #FF ) )Then
Function = 0
Else
Function = Lof( FF )
Close #FF
End If
End Function
Sub ShowRates()
Print "Maximum physical layer transfer rates:"
Print Tab( 4 ); "Access Control"; Tab( 31 ); "MiB/sec"; Tab( 41 ); "MB/sec"
Show( "PIO-0" , 2000000000 / 600 )
Show( "SCSI-1" , 5000000 * ( 8 / 8 ) )
Show( "PIO-1" , 2000000000 / 383 )
Show( "PIO-2" , 2000000000 / 240 )
Show( "Fast SCSI-2" , 10000000 * ( 8 / 8 ) )
Show( "PIO-3" , 2000000000 / 180 )
Show( "Multi-Word DMA-1" , 2000000000 / 150 )
Show( "PIO-4, MWDMA-2, UDMA-0", 2000000000 / 120 )
Show( "Fast-Wide SCSI" , 10000000 * ( 16 / 8 ) )
Show( "Ultra SCSI" , 20000000 * ( 8 / 8 ) )
Show( "PIO-5" , 2000000000 / 100 )
Show( "PIO-6, UDMA-1" , 2000000000 / 80 )
Show( "UDMA-2, UATA/33" , 2000000000 / 60 )
Show( "SSA" , 200000000 * ( 1 / 8 ) )
Show( "Ultra-Wide SCSI" , 20000000 * ( 16 / 8 ) )
Show( "Ultra2 SCSI" , 40000000 * ( 8 / 8 ) )
Show( "UDMA-3" , 2000000000 / 45 )
Show( "UDMA-4, UATA/66" , 2000000000 / 30 )
Show( "Ultra2-Wide SCSI" , 40000000 * ( 16 / 8 ) )
Show( "UDMA-5, UATA/100" , 2000000000 / 20 )
Show( "UDMA-6, UATA/133" , 2000000000 / 15 )
Show( "SATA-150" , ( 1500000000 / 8 ) * 0.80 )
Show( "Ultra3 SCSI" , 40000000 * ( 16 / 8 ) * 2 )
Show( "SATA-300" , ( 3000000000 / 8 ) * 0.80 )
Show( "Ultra-320 SCSI" , 80000000 * ( 16 / 8 ) * 2 )
Show( "SATA-600" , ( 6000000000 / 8 ) * 0.80 )
Show( "Ultra-640 SCSI" , 160000000 * ( 16 / 8 ) * 2 )
Print
End Sub
Sub ShowHelp()
Print "syntax:"
Print " rutmn <options>"
Print
Print "options:"
Print " -b:size Sets the internal buffer size to use in KB (default: 64)."
Print " -c:size Creates a test file to use (-f is ignored). The size is in MB."
Print " -f:pathname Specifies the fully qualified pathname to use."
Print " -h Shows this help message."
Print " -r Shows the list of common physical transport layers and speeds."
Print
End Sub
Dim As FILE Ptr hFile
Dim As Uinteger iFile
Dim As Ubyte Ptr pBuffer
Dim As Double T1 = Timer, T2 = Timer
Dim As Double Elapsed( 0 To 1 ) = { Timer, Timer }
Dim As uLongInt FileLength
Dim As String Pathname
Dim As Integer Buffer_Size
Dim As uLongInt Temp_Size
Dim As Integer lBar
Dim As Integer Bars
Print
Print "Random Useless Tool to Measure Nothing but it accesses the disk a lot."
Print "copyleft Eric Cowles, 2008"
Print
Temp_Size = GetParam( "-c:", 0 )
If( Temp_Size )Then
Pathname = TEMP_FILE
FileLength = Temp_Size * 1024^2
Else
Pathname = GetParam( "-f:", "" )
FileLength = FileSize( Pathname )
End If
Buffer_Size = GetParam( "-b:", 64 ) * 1024
If( Instr( Command, "-h" ) )OrElse( Command = "" )Then ShowHelp
If( Instr( Command, "-r" ) )Then ShowRates
If( FileLength = 0 )Or( Pathname = "" )Then Goto Main_Abort
Print "Program Fun Facts:"
Print " Filename : "; Pathname
Print " File size : "; Format( FileLength , "###,###,###,###,###" ); " bytes"
Print " Buffer : "; Format( Buffer_Size, "###,###,###,###,###" ); " bytes"
Print
If( Temp_Size )Then
ReDim As Byte Dummy( 0 To 65535 )
lBar = -1
Open Pathname For Binary As #123
For Index As uLongInt = 0 To FileLength - 1 Step 65536
Put #123, , Dummy()
ShowProgress( Index, FileLength )
Next
Print : Print
End If
pBuffer = Allocate( Buffer_Size )
If( pBuffer = 0 )Then
Print "Unable to allocate space for file buffer"
Goto Main_Abort
End If
Print __FB_SIGNATURE__; " rtl:"
iFile = FreeFile
lBar = -1
If( Open( Pathname For Binary Access Read As #iFile ) = 0 )Then
Dim As Integer BytesRead
Dim As uLongInt TotalBytes
T1 = Timer
Do
If( Get( #iFile, , *pBuffer, Buffer_Size, BytesRead ) )Then
Print " ERROR READING FILE!"
Close #iFile
Goto Main_Abort
End If
TotalBytes += BytesRead
ShowProgress( TotalBytes, FileLength )
Loop Until ( TotalBytes = FileLength )
T2 = Timer
Elapsed( 0 ) = T2 - T1
Print : Print " Read Time : "; Format( Cint( Elapsed( 0 ) * 1000 ), "###,###,###" ); " ms"
Print " Transfer Rate: "; : Color 12 : Print Show_Rate( FileLength / Elapsed( 0 ), 1024 ); : Color 7 : Print " MiB/sec"
Print
Close #iFile
Else
Print " ERROR OPENING FILE!"
Goto Main_Abort
End If
Print
Print "Standard crt: "
hFile = FOpen( Pathname, "rb" )
lBar = -1
If( hFile )Then
Dim As Integer BytesRead
Dim As uLongInt TotalBytes
T1 = Timer
Do
BytesRead = FRead( pBuffer, 1, Buffer_Size, hFile )
If( BytesRead = 0 )Then
Print " ERROR READING FILE!"
FClose( hFile )
Goto Main_Abort
End If
TotalBytes += BytesRead
ShowProgress( TotalBytes, FileLength )
Loop Until ( TotalBytes = FileLength )
T2 = Timer
Elapsed( 1 ) = T2 - T1
Print : Print " Read Time : "; Format( Cint( Elapsed( 1 ) * 1000 ), "###,###,###" ); " ms"
Print " Transfer Rate: "; : Color 12 : Print Show_Rate( FileLength / Elapsed( 1 ), 1024 ); : Color 7 : Print " MiB/sec"
Print
FClose( hFile )
Else
Print " ERROR OPENING FILE!"
Goto Main_Abort
End If
Main_Abort:
If( pBuffer )Then Deallocate( pBuffer )
If( Temp_Size )Then Kill TEMP_FILE
Print "If this closes before you see it, that is because this is a CONSOLE application"
Print "and you are running it in a window."
Locate , , 1
End
file I/O
MichaelW wrote: "Your comparison to the ATA-33 interface transfer rate is probably meaningless."
It isn't. It shows where the bottlenecks are.
MichaelW wrote: "Manufacturers tend to support the maximum transfer rate from the latest ATA standard, regardless of the drive capability."
Strangely, an ATA-33 HD is faster than an ATA-100 model on an ATA-33 mainboard.
1000101 wrote: "I wrote my own program based on yours using both the rtl and crt file access methods and my results were as follows - it doesn't matter. Times were very close to each other and the larger factor was the disk caching of the OS."
Possibly true. The large cache of your OS (and of the HD?) probably hides the inefficiencies. I deliberately used no cache except what the DOS kernel offers.
1000101 wrote: "There are other things you have to consider too such as data coherency (is your disk majorly fragmented?)"
NOT fragmented, see above.
1000101 wrote: "If you really want to benchmark the hardware, you need to write a tool which talks to the hardware directly instead of going through millions of software layers (rtl->crt->pmode to rmode translation layer->OS->BIOS->hardware)."
NO need ... such a tool already exists and I used it too (see the bottom line, the 15.7 MiB/sec result). As said, I show where the bottlenecks are. BTW, there are many complaints in this forum about bad file I/O performance of FB (including but not limited to DOS), so our "Random Useless Tools to Measure Nothing" are indeed very useless :-D
Re: file I/O
DOS386 wrote: "BTW, there are many complaints about bad file I/O performance (including but not limited to DOS) of FB in this forum"
Well, to be honest, I've written a few disk drive benchmark programs in the past in different languages (VB6, QB, C (16-bit DOS), asm (16-bit DOS)) and seen the results of third-party software doing the same, and they do show some differences (most notably, the tools reading the disk directly are much faster, especially when disregarding data content and just measuring raw transfer speed).
Also, I question whether the rated speed is the sustained or the burst transfer rate. My SATA150 drive right now (running a tool which just looks at the raw transport layer) is getting a sustained rate of 78.9 MB/sec where the specification says it should be giving me almost 150 MB/sec. A burst test gives me 122 MB/sec. Obviously the drives are very much under-performing, or the data is misleading.
In any event, while my drive gets an 80 MB/sec read transfer rate (and it drops further along the disk; I set the starting position to 50% and it now gets 68.5 MB/sec, a 14% drop), there are many layers of software overhead which cannot be hidden in the wait states because of data dependencies (i.e. navigating the partition and moving to the proper sectors). If you look at the source, the rtl does a little bit of housekeeping for you. It's this housekeeping that makes most code slower, and in a lot of cases it's unneeded, as the coder is doing their own checking and handling of unexpected situations.
After looking at this, I wonder about making a "slipstreamed" language dialect for fb which doesn't do things like filling the buffer with nulls when it hits EOF, or all the locking checks (if they can be removed, or whether they are there strictly for the thread-safe rtl), but otherwise behaves exactly like the lang fb dialect... but then I realize why not - I'm a lazy SOB.
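That housekeeping cost is easy to isolate by reading the same file once through stdio's buffered layer and once through raw descriptor reads. A POSIX sketch (not FB's actual rtl, just an illustration of the layering):

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Total bytes delivered by buffered stdio reads. */
size_t via_stdio(const char *path, char *buf, size_t block)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return 0;
    size_t total = 0, n;
    while ((n = fread(buf, 1, block, f)) > 0)
        total += n;
    fclose(f);
    return total;
}

/* Total bytes delivered by raw read() syscalls: no stdio buffer,
   no locking, no bookkeeping - one syscall per block. */
size_t via_syscall(const char *path, char *buf, size_t block)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 0;
    size_t total = 0;
    ssize_t n;
    while ((n = read(fd, buf, block)) > 0)
        total += (size_t)n;
    close(fd);
    return total;
}
```

Both deliver identical byte counts; timing each loop over a large file shows what the extra layer costs, and with big blocks it is usually little, since the per-syscall overhead dominates.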
Re: file I/O
1000101 wrote: "My SATA150 drive is getting a sustained rate of 78.9MB/sec where the specification says it should be giving me almost 150MB/sec. Doing a burst test gives me 122MB/sec. Obviously the drives are very much under performing or the data is misleading."
Got 57 MiB/sec raw sector reading on another PC with ATA-100 :-)
1000101 wrote: "If you look at the source, the rtl does a little bit of house keeping for you. It's this house keeping that is making most code slower"
Right. And it breaks support for files > 2 GiB on DOS. But no big problem, I have my ways to bypass both the FB RTL and DJGPP I/O if I need maximum performance and file sizes. ;-)
Re: file I/O
1000101 wrote: "After looking at this I wonder about making a 'slip streamed' language dialect for fb which doesn't do the stuff like filling the buffer with nulls when it gets to EOF, all the locking checks (if they can be removed or are there strictly for the thread-safe rtl) but otherwise behaving exactly like the lang fb dialect"
The FB_LOCK/UNLOCK macros are no-ops in the non-mt rtlib, so there is no extra overhead there. The filling of the buffer with nulls only happens at the end of the file, so there shouldn't be an excessive amount of overhead there either.
DOS386 wrote: "I have my ways to bypass both the FB RTL and DJGPP I/O if I need maximum performance and file sizes. ;-)"
What sort of filesystem and version of DOS do you use to test these > 2 and/or 4 GiB files? I would be quite willing to improve the large file support for DOS if I could test it.
Re: file I/O
DOS386 wrote (replying to "Your comparison to the ATA-33 interface transfer rate is probably meaningless"): "It isn't. It shows where the bottlenecks are."
Instead of assuming that the drive is capable of any particular transfer rate, you should consult the drive specs to determine what it is actually capable of, and then consult the drive to determine what transfer mode it's actually using.
DOS386 wrote: "Strangely an ATA-33 HD is faster than an ATA-100 model on an ATA-33 mainboard."
If that is so, then there is probably something wrong with the ATA-100 drive and/or its configuration. Transfer rates to and from the drive media have steadily improved, and a nominal ATA-100 drive should be able to saturate an ATA-33 interface.
Re: file I/O
MichaelW wrote: "Consult the drive to determine what transfer mode it's actually using."
The driver tells ATA-33 :-)
MichaelW wrote: "If it is then there is probably something wrong with the ATA-100 drive and/or configuration."
Maybe ... or maybe it's just optimized for ATA-100, and the ATA-33 support of this model targets only minimal correctness, not maximal performance.
Re: file I/O
DrV wrote: "I would be quite willing to improve the large file support for DOS if I could test it."
:-) Going to look deeper into the FB GET source (C ... :-|) and send you some info + code by mail (might take some days to put together).