How initialize a data memory area as quickly as possible?

noop
Posts: 130
Joined: Sep 18, 2006 10:29

Re: How initialize a data memory area as quickly as possible

Post by noop »

MichaelW, apparently FB doesn't reserve the memory until it's actually needed. Therefore the first method is slowed down significantly. If you initialize the array beforehand (at least on my computer) the first and last version seem to be equally fast. The assembler code is a tad quicker.

Also: Why do you push and pop the rdi register? Is this good practice but not really necessary here or is this actually needed here as well?
dodicat wrote:MichaelW
I assume that the assembler instructions are for 64 bit.
I get a pile of errors here on 32 bit windows.

Yes, this is for 64-bit. It uses the 8-byte registers. This should work for 32-bit:

Code: Select all

asm
    push    edi
    mov     edi, [p]
    mov     eax, 0x05050505
    mov     ecx, DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE / 4
    rep     stosd
    pop     edi
end asm
dodicat wrote:A stroke of luck that Provoni needs only byte or ubyte arrays, otherwise looping in the values would be only way I reckon!
MichaelW's assembler code solves this problem. He just puts the 1-byte number 5, eight times into the 8-byte register rax. This is then written to the memory as in memset but in 8-byte steps with "rep stosq". You could also put an 8-byte number once or a 4-byte number twice into this register.
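
As a rough illustration of the register patterns described above, here is a minimal sketch in plain FreeBASIC (no asm) that builds the 8-byte value which "rep stosq" would then write. The variable names b, w and pattern are made up for this example and do not come from MichaelW's code.

```freebasic
' Sketch only: builds the 8-byte fill pattern; it does not fill any memory.
' b, w and pattern are illustrative names, not taken from the thread.

dim as ubyte b = 5
' Multiplying by &h0101010101010101 copies the byte into all 8 byte positions.
dim as ulongint pattern = b * &h0101010101010101ull
print hex(pattern, 16)      ' 0505050505050505

dim as ulong w = &h12345678
' A 4-byte value fits twice: once in the upper half, once in the lower half.
pattern = (culngint(w) shl 32) or w
print hex(pattern, 16)      ' 1234567812345678
```

The same idea applies to 2-byte values (four copies). The element count passed in rcx then has to be the byte size of the area divided by 8.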

The downside, of course, is the reliance on the assembler code. I can think of another alternative to a simple loop: you can use memcpy. Write the first 32 or so bytes in a loop (or even hardcode them) and then copy those 32 bytes with memcpy to get a total of 64 bytes filled. Then copy the 64 bytes to get a total of 128 bytes filled, and so on. This isn't terribly fast, but it is faster than a simple loop. I get the following timings for writing integer values in 32-bit:
simple loop 4.46 seconds
memcpy 1.37 seconds
rep stosd 0.45 seconds
MrSwiss wrote:@MichaelW,
a really nice piece of code. However, you've used what I call "The Disc-Manufacturers Cheat" to calculate Bytes:
1e9
instead of
1024e6
which, should actually be used ... to obtain correct results.
To be fair, kilo, mega, giga and so on are SI prefixes standing for 10^3, 10^6, 10^9 and so on. The IEC has introduced separate binary prefixes to resolve this problem. So now it is on us to change our habits and accept that 10^3 bytes are one kB and 2^10 bytes are one KiB ;-)
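
To make the difference concrete, here is a small sketch that expresses one byte count in both unit systems. The variable name nbytes is made up for this example. Note that 1024e6 is 1 024 000 000, which equals neither 10^9 nor 2^30 (1 073 741 824).

```freebasic
' Sketch only: one byte count expressed in SI and IEC units.
' nbytes is an illustrative name, not taken from the thread.
dim as double nbytes = 2 ^ 30

print nbytes / 1e9;    " GB  (SI, 10^9 bytes)"
print nbytes / 2 ^ 30; " GiB (IEC, 2^30 bytes)"
print 2 ^ 30 - 1024e6; " bytes lie between 1024e6 and 2^30"
```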
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Post by dodicat »

Thanks noop.
Must get myself a 64 bit box.
To use memcpy rather than clear or memset then perhaps:

Code: Select all

 

#include "crt.bi"

#define DIMSIZE 40

redim as double array(DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1)

dim as integer jmp=@array(1,1,1,1,1)-@array(0,0,0,0,0)+1

dim as double a(jmp)
for n as integer=0 to jmp
    a(n)=566.67
next n

for z as integer=0 to dimsize-2
memcpy(@array(z,z,z,z,z),@a(0),(jmp)*sizeof(double))
next z

dim as integer i1,i2,i3,i4,i5
for i as integer = 1 to 100
    i1 = int(rnd * DIMSIZE)
    i2 = int(rnd * DIMSIZE)
    i3 = int(rnd * DIMSIZE)
    i4 = int(rnd * DIMSIZE)
    i5 = int(rnd * DIMSIZE)
    print array(i1,i2,i3,i4,i5),
next
print
print array(0,0,0,0,0)
print array(DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1)
sleep
noop
Posts: 130
Joined: Sep 18, 2006 10:29

Post by noop »

Something along those lines, yeah. But rather than using an extra array and looping over one dimension, you can store the data in the original array and do a sort of binary-memcpy:

Code: Select all

#include "crt.bi"

#macro fillArray(p,sz,v)
    scope
    if (p = 0) then
        print "Null pointer.": sleep: end
    end if
    
    if (sizeof(v) <> sizeof(*p)) then
        print "Types don't match.": sleep: end
    end if
    
    dim as integer m = iif(sz > 32, 31,sz-1)
    for i as integer = 0 to m
        p[i] = v
    next i
    
    if (sz > 32) then
        dim as integer type_sz = sizeof(v)
        dim as integer j,i
        j = 32
        i = 64
        while (i <= sz)
            memcpy(p+j,p,type_sz*j)
            j = i
            i *= 2
        wend
        memcpy(p+j,p,type_sz*(sz-j))
    end if
    end scope
#endmacro

dim as integer sz_i = 600
redim as single array(sz_i-1,sz_i-1,sz_i-1)
dim as single ptr p = @array(0,0,0)
dim as uinteger sz = sz_i^3
dim as single v = 99.5
dim as double t1,t2

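' initialize first so that the memory is actually reserved
' (csng keeps the value the same size as *p, so the size check also passes on 64-bit)
fillArray(p,sz,csng(0))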

print "array size = ";str(int(sizeof(*p)*sz/2^20+0.5));"MiB"
print "value to set: ";str(v)

t1 = timer
    fillArray(p,sz,v)
t2 = timer
print str(t2-t1);" seconds"
print "value test: ";p[int(rnd*sz)]

sleep
On my machine this takes about half the time. The key seems to be this binary-memcpy usage, since the first loop in your code doesn't really add to the time taken.

Note that I used single as the type here in order to compare it with the 32-bit asm version. Your double array is indeed a case that I don't see how to solve with the "rep stos?" instructions in 32-bit mode, since an 8-byte value does not fit into one 32-bit register. So either upgrading to 64-bit is necessary, or falling back on slower solutions like the one based on memcpy.

Edit: I've put this together which may or may not work correctly in all cases ;-)

Code: Select all

#include "crt.bi"

#macro fillArrayL(p,sz,v)
    scope
    print "Falling back onto a memcpy solution."
    
    if (p = 0) then
        print "Null pointer.": sleep: end
    end if
    
    if (sizeof(v) <> sizeof(*p)) then
        print "Types don't match.": sleep: end
    end if
    
    dim as integer m = iif(sz > 32, 31,sz-1)
    for i as integer = 0 to m
        p[i] = v
    next i
    
    if (sz > 32) then
        dim as integer type_sz = sizeof(v)
        dim as integer j,i
        j = 32
        i = 64
        while (i <= sz)
            memcpy(p+j,p,type_sz*j)
            j = i
            i *= 2
        wend
        memcpy(p+j,p,type_sz*(sz-j))
    end if
    end scope
#endmacro

#macro fillArray(p,sz,v)
    scope
    if (p = 0) then
        print "Null pointer.": sleep: end
    end if
    
    if (sizeof(v) <> sizeof(*p)) then
        print "Types don't match.": sleep: end
    end if
    
    #ifdef __FB_64BIT__
    ' p is a pointer, thus sizeof(p)=8 bytes
    asm mov rdi, p
    
    asm mov rcx, qword ptr sz
    
    select case sizeof(v)
    case 1: asm
                mov al, byte ptr v
                rep stosb
            end asm
    case 2: asm
                mov ax, word ptr v
                rep stosw
            end asm
    case 4: asm
                mov eax, dword ptr v
                rep stosd
            end asm
    case 8: asm
                mov rax, qword ptr v
                rep stosq
            end asm
    case else: fillArrayL(p,sz,v)
    end select
    #else
    ' p is a pointer, thus sizeof(p)=4 bytes
    asm mov edi, [p]
    
    asm mov ecx, dword ptr [sz]
    
    select case sizeof(v)
    case 1: asm
                mov al, byte ptr [v]
                rep stosb
            end asm
    case 2: asm
                mov ax, word ptr [v]
                rep stosw
            end asm
    case 4: asm
                mov eax, dword ptr [v]
                rep stosd
            end asm
    case else: fillArrayL(p,sz,v)
    end select
    #endif
    end scope
#endmacro

dim as integer sz_i = 470
dim as integer sz = sz_i^3
redim as double array(sz_i-1,sz_i-1,sz_i-1)
dim as double ptr p = @array(0,0,0)
dim as double v = -132.94
if (p = 0) then
    print "Error: Null pointer.": sleep: end
end if
dim as double t1,t2

print "array size = ";str(int(sizeof(*p)*sz/2^20+0.5));"MiB"
print "value to set: ";str(v)

print "--------------------"

print "initialization (clear)"
t1 = timer
    clear(*p,0,sz*sizeof(*p))
t2 = timer
print str(t2-t1);" seconds"
print "value test: ";p[int(rnd*sz)]

print "--------------------"

print "ASM"
t1 = timer
    fillArray(p,sz,v)
t2 = timer
print str(t2-t1);" seconds"
print "value test: ";p[int(rnd*sz)]

print "--------------------"

print "clearing"
t1 = timer
    clear(*p,0,sz*sizeof(*p))
t2 = timer
print str(t2-t1);" seconds"
print "value test: ";p[int(rnd*sz)]

print "--------------------"

print "loop"
t1 = timer
for i1 as integer = 0 to sz_i-1
    for i2 as integer = 0 to sz_i-1
        for i3 as integer = 0 to sz_i-1
            array(i1,i2,i3) = v
        next i3
    next i2
next i1
t2 = timer
print str(t2-t1);" seconds"
print "value test: ";p[int(rnd*sz)]

sleep
RockTheSchock
Posts: 252
Joined: Mar 12, 2006 16:25

Post by RockTheSchock »

Why do you need such a big byte array? What do you want to do? Are you making a chess engine? Maybe you could use another algorithm that uses less memory.
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Post by fxm »

Why not add these two efficient filling methods, on 64-bit floating-point data (double):

Code: Select all

print "--------------------"
print "loop on dereferenced pointer"
t1 = timer
for i as integer = 0 to sz - 1   ' valid indices are 0 to sz-1
    p[i] = v
next i
t2 = timer
print str(t2-t1);" seconds"
print "value test: ";p[int(rnd*sz)]

print "--------------------"
print "loop on 'poke'"
t1 = timer
for i as double ptr = @array(0,0,0) to @array(sz_i-1,sz_i-1,sz_i-1)
    poke double, i, v
next i
t2 = timer
print str(t2-t1);" seconds"
print "value test: ";p[int(rnd*sz)]
integer
Posts: 408
Joined: Feb 01, 2007 16:54
Location: usa

Post by integer »

fxm wrote:On my PC, the first test is always measured the slower, regardless of the order of tests!
Perhaps because of the cache?
@fxm
On my xp sp3
using the: asm, clear, and memset routines
I was expecting the same results as you -- just wanted to see the time difference
However,
The first test is always faster.
??

My mistake.
I misread GB/s as the duration.
Last edited by integer on Sep 07, 2015 14:32, edited 1 time in total.
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Post by fxm »

fxm wrote:On my PC, the first test is always measured the slower, regardless of the order of tests!
Perhaps because of the cache?
This holds when comparing only memset and clear!

Code: Select all

#include "crt.bi"

Dim As Integer size = 2^28
Redim As Byte array(size - 1)
Dim as byte ptr p = @array(0)
Dim As Double t

For i As Integer = 1 To 4
  Sleep 1000
  t = Timer
  memset(p, 3, size)
  t = Timer - t
  Print "memset: " & t / size * 2^30 & " s/GB"
Next i

Sleep

Code: Select all

memset: 0.6893367732535154 s/GB
memset: 0.4596383059923568 s/GB
memset: 0.4594304583440163 s/GB
memset: 0.4604886933888395 s/GB

Code: Select all

Dim As Integer size = 2^28
Redim As Byte array(size - 1)
Dim as byte ptr p = @array(0)
Dim As Double t

For i As Integer = 1 To 4
  Sleep 1000
  t = Timer
  Clear(*p, 3, size)
  t = Timer - t
  Print "clear: " & t / size * 2^30 & " s/GB"
Next i

Sleep

Code: Select all

clear: 0.6885500810850189 s/GB
clear: 0.4744401872374127 s/GB
clear: 0.4601501028695552 s/GB
clear: 0.4632220016768045 s/GB

Code: Select all

#include "crt.bi"

Dim As Integer size = 2^28
Redim As Byte array(size - 1)
Dim as byte ptr p = @array(0)
Dim As Double t

Sleep 1000
t = Timer
memset(p, 3, size)
t = Timer - t
Print "memset: " & t / size * 2^30 & " s/GB"

Sleep 1000
t = Timer
Clear(*p, 3, size)
t = Timer - t
Print "clear : " & t / size * 2^30 & " s/GB"

Sleep

Code: Select all

memset: 0.6903033765454794 s/GB
clear : 0.4599120838040847 s/GB

Code: Select all

#include "crt.bi"

Dim As Integer size = 2^28
Redim As Byte array(size - 1)
Dim as byte ptr p = @array(0)
Dim As Double t

Sleep 1000
t = Timer
Clear(*p, 3, size)
t = Timer - t
Print "clear : " & t / size * 2^30 & " s/GB"

Sleep 1000
t = Timer
memset(p, 3, size)
t = Timer - t
Print "memset: " & t / size * 2^30 & " s/GB"

Sleep

Code: Select all

clear : 0.6895021573900735 s/GB
memset: 0.4604484648226688 s/GB
integer
Posts: 408
Joined: Feb 01, 2007 16:54
Location: usa

Post by integer »

Thanks fxm
In the above post I misread the quantity per second as the duration.

The sequence of operations, reworked:

Code: Select all

DECLARE SUB DisplayResult()

    #define BitSys 32       '' or 64
    #include "crt.bi"
    #define DIMSIZE 58      '' above that the PC is slower; only 700 MB mem available

#macro _init_memset    
    print "using: memset": ' [asg]
    t1 = timer
    memset( p, 3, DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE )
    t2 = timer
    print size / 1e9 / (t2-t1); " GB/s "; using " (##.#### s)";t2-t1
#ENDMACRO

#MACRO _init_clear
    print "using: clear"
    t1 = timer
    clear( *p, 7, size )
    t2 = timer
    print size / 1e9 / (t2-t1); " GB/s"; using " (##.#### s)";t2-t1
#ENDMACRO

#macro _init_asm
    print "using: asm ":    '[asg]
    t1 = timer
  #if (BitSys = 64)   /'  FOR THE 64 BIT SYS '/
    asm
        push    rdi
        mov     rdi, p
        mov     rax, 0x0505050505050505
        mov     rcx, DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE / 8
        rep     stosq
        pop     rdi    
    end asm
  #else               /'  FOR THE 32 BIT SYS '/
    asm
        push    edi
        mov     edi, [p]
        mov     eax, 0x05050505
        mov     ecx, DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE / 4
        rep     stosd
        pop     edi
    end asm
  #endif
    t2 = timer
    print size / 1e9 / (t2-t1); " GB/s"; using " (##.#### s)";t2-t1
#endmacro
 
    redim shared as byte array(DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1)
    dim as byte ptr p = @array(0,0,0,0,0)
    dim as double t1,t2
    dim as longint size
    dim shared as integer i1, i2, i3, i4, i5, LimitOutput = 10

    size = DIMSIZE^5
    print size;" bytes"
    print
    dim as double Startg, ending
    Startg = timer

    print "order 1"
    _Init_clear
    DisplayResult()

    _Init_memset    
    DisplayResult()
    
    _Init_asm
    DisplayResult()
    
    '& reorder2
    print "order 2"
    _Init_asm
    DisplayResult()
    
    _Init_memset    
    DisplayResult()
    
    _Init_clear
    DisplayResult()

    '& reorder3
    print "order 3"
    _Init_memset    
    DisplayResult()
    
    _Init_clear
    DisplayResult()

    _Init_asm
    DisplayResult()
    
    Ending = timer
    
    print "total time "; using"###.### s"; Ending - Startg

    sleep
    STOP

SUB DisplayResult()
    for i as integer = 1 to LimitOutput
        i1 = int(rnd * DIMSIZE)
        i2 = int(rnd * DIMSIZE)
        i3 = int(rnd * DIMSIZE)
        i4 = int(rnd * DIMSIZE)
        i5 = int(rnd * DIMSIZE)
        print array(i1,i2,i3,i4,i5);" ";
    next
    print
END SUB
THE OUTPUT:

Code: Select all

 656356768 bytes
 
order 1
using:  CLEAR
 0.1478221464135912 GB/s ( 4.4402 s)
 7  7  7  7  7  7  7  7  7  7   
 
using:  memset
 0.0381982203284193 GB/s (17.1829 s)
 3  3  3  3  3  3  3  3  3  3   
 
using: ASM 
 0.1413880456907694 GB/s ( 4.6422 s)
 5  5  5  5  5  5  5  5  5  5   
 
order 2
using: ASM 
 0.9322578763569851 GB/s ( 0.7041 s)
 5  5  5  5  5  5  5  5  5  5   
 
using:  memset
 0.9391797301182753 GB/s ( 0.6989 s)
 3  3  3  3  3  3  3  3  3  3   

using CLEAR
 0.9415705357199505 GB/s ( 0.6971 s)
 7  7  7  7  7  7  7  7  7  7   
 
order 3
using:  memset
 0.9408714586104721 GB/s ( 0.6976 s)
 3  3  3  3  3  3  3  3  3  3   
 
using CLEAR
 0.9385093146110889 GB/s ( 0.6994 s)
 7  7  7  7  7  7  7  7  7  7   
 
using: ASM 
 0.9425386638610432 GB/s ( 0.6964 s)
 5  5  5  5  5  5  5  5  5  5   
 
total time  30.690 s
MichaelW
Posts: 3500
Joined: May 16, 2006 22:34
Location: USA

Post by MichaelW »

MrSwiss wrote: However, you've used what I call "The Disc-Manufacturers Cheat" to calculate Bytes:
1e9
instead of
1024e6
Yes, I should have used the correct value, but in my defense it's the ratio of the results that matters.
MichaelW
Posts: 3500
Joined: May 16, 2006 22:34
Location: USA

Post by MichaelW »

noop wrote: Why do you push and pop the rdi register?
Because per the Microsoft x64 calling convention, RDI is one of the callee-saved registers; a short description is here.
I don't know whether preserving RDI is actually necessary for my code, but it might be, and the conservative approach is to preserve it.
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Post by MrSwiss »

@MichaelW,

regarding the article you've pointed to ... the code for calling the function reads:

Code: Select all

        mov     dword ptr [rsp+0x20], 5     ; output parameter 5
        mov     r9d, 4                      ; output parameter 4
        mov     r8d, 3                      ; output parameter 3
        mov     edx, 2                      ; output parameter 2
        mov     ecx, 1                      ; output parameter 1
        call    SomeFunction                ; Go Speed Racer!
However, there seem to be some faults:
1) dword ptr [rsp+0x20] (shouldn't that be qword ptr?)
2) r9d (is this just the lower dword of r9?)
3) the same goes for r8d?
4) edx (rdx?)
5) ecx (rcx?)
This seems to be misleading information. (The writer says he's corrected those mistakes?)
Could you please shed some light on those issues? Thanks in advance.
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands
Contact:

Post by marcov »

All those issues stem from the same thing: the arguments are 32-bit, so only a 32-bit position on the stack (dword ptr) is used, as well as the 32-bit representations of the relevant registers.
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Post by MrSwiss »

@marcov,

thanks for clearing that up. I still think the author failed to mention that (very important) fact!
So in lang "FB" it means that the parameters are all either Long or ULong ...
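
For reference, a short sketch that prints the sizes of the FreeBASIC integer types involved. Long is 32 bits on both targets, whereas Integer follows the pointer size, which is why Long/ULong are the types that match 32-bit arguments in 64-bit code.

```freebasic
' Sketch only: byte sizes of FreeBASIC integer types on the current target.
print "Long:    "; sizeof(long)      ' 4 on 32-bit and on 64-bit targets
print "LongInt: "; sizeof(longint)   ' 8 on 32-bit and on 64-bit targets
print "Integer: "; sizeof(integer)   ' 4 on 32-bit, 8 on 64-bit targets
print "Pointer: "; sizeof(any ptr)   ' same size as Integer
```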
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Post by dodicat »

If you choose to make your large array a UDT array with a static field (byte, integer, string or whatever), then it loads really fast.
Here I have compared it with a relatively slow memcpy method.
I use DIMSIZE 50 here, since my machine has only 2 GB of memory.

Code: Select all

#include "crt.bi"
#define DIMSIZE 50

type _integer
    static as integer i
    dim as byte    dummy
end type:dim _integer.i as integer

type _string
    static as string i
    dim as byte    dummy
end type:dim _string.i as string
'=====================================================
#macro _init_memcpy(array,datatype,num) 
   scope
   print "using: memcpy ":print #datatype
   t1=timer
   dim as datatype ptr dp=@array(0,0,0,0,0)
   dim as integer jmp=@array(1,1,1,1,1)-dp+1,n
for n =0 to jmp
    dp[n]=num
next n
for n=1 to dimsize-2
     memcpy(@array(n,n,n,n,n),dp,(jmp)*sizeof(datatype))
next n
t2=timer
print size / 1e9 / (t2-t1); " GB/s "; using " (##.########## s)";t2-t1
end scope
   #endmacro
'========================================================   
   #macro _init_udt(array,datatype,num,X)
   print "using: udt "::print #datatype
   X.i=num
   t1=timer
   redim  as datatype array(DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1)
   t2=timer
   print size / 1e9 / (t2-t1); " GB/s "; using " (##.########## s)";t2-t1
   #endmacro
 '========================================================  

#macro display(array,dot)
for i as integer = 1 to 15
        i1 = int(rnd * DIMSIZE)
        i2 = int(rnd * DIMSIZE)
        i3 = int(rnd * DIMSIZE)
        i4 = int(rnd * DIMSIZE)
        i5 = int(rnd * DIMSIZE)
        print (array(i1,i2,i3,i4,i5)dot),
    next
    'print t2-t1
    print
    #endmacro
    
redim shared as single array2(DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1,DIMSIZE-1)

 dim as double t1,t2
 dim as longint size = DIMSIZE^5
 dim as integer i1,i2,i3,i4,i5
 
    dim as _integer X
   _init_udt(array1,_integer,9,X)
    display(array1,.i)
    
    
    _init_memcpy(array2,single,11.1)
    display(array2,)
    
    
    dim as _string s
    _init_udt(array3,_string,"Hi",s)
    display(array3,.i)
    
    sleep 
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Post by fxm »

There is no miracle solution!

A static member variable exists only once and is shared by all instances of the UDT.
This is also true for all elements of a UDT array.

Whatever i1, i2, i3, i4, i5:
@array(i1, i2, i3, i4, i5).i = @UDT.i
(all are aliases of one single static variable)

So only one single variable is really initialized in that case!

Code: Select all

#macro display(array,dot)
for i as integer = 1 to 15
        i1 = int(rnd * DIMSIZE)
        i2 = int(rnd * DIMSIZE)
        i3 = int(rnd * DIMSIZE)
        i4 = int(rnd * DIMSIZE)
        i5 = int(rnd * DIMSIZE)
        print @(array(i1,i2,i3,i4,i5)dot);":";(array(i1,i2,i3,i4,i5)dot),
    next
    'print t2-t1
    print
    #endmacro

Code: Select all

using: udt
_integer
 15755.03935511327 GB/s  ( 0.0000198349 s)
4264000: 9    4264000: 9    4264000: 9    4264000: 9    4264000: 9
4264000: 9    4264000: 9    4264000: 9    4264000: 9    4264000: 9
4264000: 9    4264000: 9    4264000: 9    4264000: 9    4264000: 9

using: memcpy
single
 0.3100826934538999 GB/s  ( 1.0077956835 s)
1128504024: 11.1            343187896: 11.1             384876852: 11.1
689246724: 11.1             804411600: 11.1             1062554512: 11.1
657810204: 11.1             896405536: 11.1             445913464: 11.1
806618132: 11.1             349363028: 11.1             1126825864: 11.1
278625748: 11.1             620902252: 11.1             1171720624: 11.1

using: udt
_string
 6862.624660745883 GB/s  ( 0.0000455365 s)
4264008:Hi    4264008:Hi    4264008:Hi    4264008:Hi    4264008:Hi
4264008:Hi    4264008:Hi    4264008:Hi    4264008:Hi    4264008:Hi
4264008:Hi    4264008:Hi    4264008:Hi    4264008:Hi    4264008:Hi
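
The static-member behaviour can also be reduced to a minimal standalone sketch. The type name counter_t and its member names are made up for this example and do not come from dodicat's code; only the pattern (one static field plus a dummy byte) is the same.

```freebasic
' Sketch only: a static member is a single variable shared by every instance.
' counter_t, shared_value and dummy are illustrative names.
type counter_t
    static as integer shared_value   ' one variable for the whole type
    dim as byte dummy                ' the only per-instance data
end type
dim counter_t.shared_value as integer

dim as counter_t a(0 to 2)
a(0).shared_value = 9                ' "fills" every element at once

dim as integer ptr p0 = @a(0).shared_value
dim as integer ptr p2 = @a(2).shared_value

print a(2).shared_value              ' 9
print (p0 = p2)                      ' -1 (true): both are the same variable
print sizeof(counter_t)              ' 1: only dummy occupies instance space
```

So the array itself stores nothing but the dummy bytes, which is why the "fill" appears to take almost no time.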