Simplistic copy for dynamic array fields when assigning between UDT instances

General FreeBASIC programming questions.
fxm
Posts: 8970
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Simplistic copy for dynamic array fields when assigning between UDT instances

Postby fxm » Jul 30, 2018 9:07

It seems that the assignment between 2 UDT instances each containing a dynamic array of equal size is not optimized like the assignment between 2 dynamic strings of equal size.
For large array sizes, the assigned array is moved in memory before data copying.
There is no reason, because the two arrays having the equal size, the memory reallocation is useless.
It seems that the particular case of arrays of equal size is not taken into account, and memory reallocation is always requested before copying.
This therefore leads to a severely degraded execution time in this case of large arrays of equal size.

Note: The global size of the array is easily accessible through its descriptor (third information with the value expressed in Bytes).

Example (firstly equal small size arrays, then equal large size arrays):

Code: Select all

Type UDT
  Dim As Byte b(Any)
End Type

Dim As UDT u1, u2
Dim As Integer N

N = 500000
Redim u1.b(N)
Redim u2.b(N)
Print @u1.b(0), @u2.b(0)
u2 = u1
Print @u1.b(0), @u2.b(0)
Print

N = 500000000
Redim u1.b(N)
Redim u2.b(N)
Print @u1.b(0), @u2.b(0)
u2 = u1
Print @u1.b(0), @u2.b(0)
Print

Sleep

Code: Select all

33292360      33792376
33292360      33792376 <-- OK

34365472      534442016
34365472      534409248 <-- NOK


No problem with strings of equal size:

Code: Select all

Type UDT
  Dim As String s
End Type

Dim As UDT u1, u2
Dim As Integer N

N = 500000
u1.s = String(N, 1)
u2.s = String(N, 2)
Print Strptr(u1.s), Strptr(u2.s)
u2 = u1
Print Strptr(u1.s), Strptr(u2.s)
Print

N = 500000000
u1.s = String(N, 1)
u2.s = String(N, 2)
Print Strptr(u1.s), Strptr(u2.s)
u2 = u1
Print Strptr(u1.s), Strptr(u2.s)
Print

Sleep

Code: Select all

7553056       33427488
7553056       33427488 <-- OK

34045984      596619296
34045984      596619296 <-- OK


[edit]
- Behavior above on Windows 10 (gas or gcc).
Last edited by fxm on Aug 03, 2018 13:46, edited 3 times in total.
jj2007
Posts: 1159
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby jj2007 » Jul 30, 2018 10:03

My results:

Code: Select all

9306184       9806200
9306184       9806200

36831264      536870944
36831264      536870944

My interpretation is that u2 = u1 causes these steps:
1. free old HeapAlloc'ed u2
2. alloc new u2
3. copy from u1 to u2

Now, in step 2, there are two options:
- HeapAlloc (more precisely: RtlAllocateHeap) is lazy and just uses the old slot, given that it's equal size
- HeapAlloc stumbles over another free memory area and decides to take that one.

Which option is taken may depend on the OS version, on available RAM, or on the current lunar cycle, which is full moon ;-)
srvaldez
Posts: 1945
Joined: Sep 25, 2005 21:54

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby srvaldez » Jul 30, 2018 10:10

hello fxm
the results of your first example are as you note on Windows, though the numbers vary from run to run
on MacOS the result is

Code: Select all

FreeBASIC Compiler - Version 1.06.0 (03-01-2018), built for darwin-x86_64 (64bit)
first run
4466884608    4467392512
4466884608    4467392512

4628267008    5128269824
4628267008    5128269824

second run
4498014208    4498522112
4498014208    4498522112

4615065600    5115068416
4615065600    5115068416

third run
4489957376    4490465280
4489957376    4490465280

4670455808    5170458624
4670455808    5170458624

on Ubuntu 16.04 LTS the salt is

Code: Select all

FreeBASIC Compiler - Version 1.06.0 (02-26-2018), built for linux-x86_64 (64bit)
139943441715216             139943441211408
139943441715216             18244912

139942923399184             139942423396368
139942923399184             139942423396368
dodicat
Posts: 5759
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby dodicat » Jul 30, 2018 11:40

Code: Select all

 
 
 
  For the UDT:
   
windows xp
9437256       9937272
9437256       9937272

10485792      510525472
10485792      510525472



scientific linux 6
3075768328    3075264520
3075768328    138947424

2575257608    2075254792
2575257608    2075254792



windows 10
7864392      8364406
7864392      8364406

35352608     535412768
35352608     535371808






fxm
Posts: 8970
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby fxm » Jul 30, 2018 14:45

It seems that the consequence (destination array moved in memory or not) depends on the OS used, but in any case the compiler principle is surely the same and it should not resize the destination array if it is already the same size as the array to copy (test easy to code because the respective total sizes of arrays are already calculated in their descriptors).

It seems that this type of improvement (elimination of lost time to resize unnecessarily with the same size) has well been implemented in the case of the copy of strings.
Last edited by fxm on Aug 02, 2018 16:42, edited 1 time in total.
fxm
Posts: 8970
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby fxm » Jul 30, 2018 15:13

Below is a small code to trap the case where the destination array is moved into memory (<escape> to quit):
(maybe the occurrence of the phenomenon depends on the OS and the array size)

Code: Select all

Type UDT
  Dim As Byte b(Any)
End Type

Dim As UDT u1, u2
Dim As Integer N = 500000000

Do
  Redim u1.b(N)
  Redim u2.b(N)
  Dim As Byte Ptr address = @u2.b(0)
  u2 = u1
  If @u2.b(0) <> address Then
    Print @u2.b(0)
    Print address
    Sleep
    Exit Do
  Else
    Erase u1.b
    Erase u2.b
  End If
  Sleep 10
Loop until Inkey = Chr(27)


[edit]
Same trapping for string copying:

Code: Select all

Type UDT
  Dim As String s
End Type

Dim As UDT u1, u2
Dim As Integer N = 500000000

Do
  u1.s = String(N, 0)
  u2.s = String(N, 0)
  Dim As Byte Ptr address = Strptr(u2.s)
  u2 = u1
  If Strptr(u2.s) <> address Then
    Print Strptr(u2.s)
    Print address
    Sleep
    Exit Do
  Else
    u1.s = ""
    u2.s = ""
  End If
  Sleep 10
Loop until Inkey = Chr(27)
Last edited by fxm on Jul 30, 2018 16:59, edited 2 times in total.
MrSwiss
Posts: 3081
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby MrSwiss » Jul 30, 2018 15:52

@fxm, question:
would you call this a "bug" or a necessary "feature request"?
It should IMO be fixed, anyway ... (same treatment as: String!)

@dev's/admin's, same question?
fxm
Posts: 8970
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby fxm » Jul 30, 2018 17:00

In theory, a bug is a behavior different from the one specified.
In this specific case of array copying, nothing like this is specified (as for copying strings).

But for this type of inconvenient implementation defect (because it penalizes very much the execution time for large arrays copying), and knowing that it was a priori already taken into account in the algorithm of string copying previously defined, I usually fill in a bug report (but it's only my feeling).
jj2007
Posts: 1159
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby jj2007 » Jul 30, 2018 17:13

I tested your "trap" on Win7-64 - no errors. On Win10, however, the new array is exactly 4096 (1000h) bytes below the old one.

As written above, the heap manager can decide to move or not to move. In the Win10 case, it seems that it discovers a free slot 4096 bytes below the old one, and decides to use it. That is not particularly efficient but legitimate.

You mentioned, however, that this move may occur "just before copying". And that would indeed be surprising. Can you demonstrate such behaviour with a modification of the "trap" example?
fxm
Posts: 8970
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby fxm » Jul 30, 2018 20:51

Algorithm proposal for improving the copying of member dynamic arrays in a UDT

When both arrays have the same total size, the reallocation/resizing of the destination array is useless.
In addition to the copy of the data block from the source array to the one of the destination array, this just also needs to copy the contents of the source array descriptor into that of the destination array (after determining the array descriptors addresses), except for:
- the second field of destination array descriptor (pointer to the first real element: '@array (lbound1, lbound2, ...)') which must not be modified,
- the first field of destination array descriptor (pointer to the real or virtual element: '@array(0, 0,...)') which must be recalculated (same offset with the second field than for the source array).
Last edited by fxm on Aug 04, 2018 9:06, edited 3 times in total.
jj2007
Posts: 1159
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby jj2007 » Jul 30, 2018 22:29

fxm wrote:When both arrays have the same total size, the reallocation/resizing of the destination array is useless.
The heap manager looks for a non-committed slot of a given size. If the first contiguous free block of at least that size is at pOldArray-4096, then it will be taken. Afterwards, the data are copied from source to destination. There is no "reallocation" in the sense "throw away the old data, create new memory, put new data there". There is only a heap manager that tells the copy routine "here is a pointer to unused memory". Where that pointer is, straight on the old array or 4k earlier, is completely irrelevant for the copying task. It won't be any faster if by chance the old pointer is being picked.

It seems that Windows 10 has a better mechanism of identifying free contiguous blocks: If there are two de-committed blocks of 4k and 4,000k that happen to be just one after the other, then
- Windows 10 says, "oh, there is one 4,004kB block"
- Windows 7 sees one 4k block and one 4,000k block, but doesn't realise that they form a contiguous block.

Btw this case of u2 = u1 should not be confused with the u2 = u2 + u3 case:
- with the latter, HeapReAlloc without changing the pointer (if possible) makes sense, because you must copy only u3.
- with u2 = u1, you must HeapAlloc and copy the whole array anyway.
dodicat
Posts: 5759
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby dodicat » Jul 31, 2018 0:29

Except for pure string fields, you can always resort to memcpy.

Code: Select all

 
#include "crt.bi"
Type UDT
  Dim As string b(Any)
  as long z(any)
  declare operator let(as udt)
  as string s
End Type

operator udt.let(s as udt)
memcpy(@this.b(lbound(this.b)),@s.b(lbound(s.b)),(ubound(s.b)-lbound(s.b)+1)*sizeof(string))
memcpy(@this.z(lbound(this.z)),@s.z(lbound(s.z)),(ubound(s.z)-lbound(s.z)+1)*sizeof(long))
this.s=s.s
end operator

Dim As UDT u1, u2
Dim As Integer N

N = 500000
Redim u1.b(N)
Redim u2.b(N)
Print @u1.b(0), @u2.b(0)
u2 = u1
Print @u1.b(0), @u2.b(0)
Print

N = 50000000
Redim u1.b(N)
u1.b(123456)="Hello"
Redim u2.b(N)

redim u1.z(N)
u1.z(654321)=-2
redim u2.z(N)
u1.s=string(1000,"z")
Print @u1.b(0), @u2.b(0)
u2 = u1
Print @u1.b(0), @u2.b(0)
Print
print u2.b(123456)
print u2.z(654321)
print u2.s
print "Done"
Sleep 


Although I think there will be an exit penalty for the string array.
Have not tested on Linux yet, but I feel sure there will be a comment.
fxm
Posts: 8970
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby fxm » Jul 31, 2018 5:09

jj2007 wrote:There is only a heap manager that tells the copy routine "here is a pointer to unused memory". Where that pointer is, straight on the old array or 4k earlier, is completely irrelevant for the copying task. It won't be any faster if by chance the old pointer is being picked

But how is it that for the copy of strings of equal sizes, the character data of the destination string are never moved into memory?
fxm
Posts: 8970
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby fxm » Jul 31, 2018 7:47

fxm wrote:Algorithm proposal for improving the copying of member dynamic arrays in a UDT

When both arrays have the same total size, the reallocation/resizing of the destination array is useless.
In addition to the copy of the data block from the source array to the one of the destination array, this just also needs to copy the contents of the source array descriptor into that of the destination array (after determining the array descriptors addresses), except for:
- the second field of destination array descriptor (pointer to the first real element: '@array (lbound1, lbound2, ...)') which must not be modified,
- the first field of destination array descriptor (pointer to the real or virtual element: '@array(0, 0,...)') which must be recalculated (same offset with the second field than for the source array).

Example of such implementation in FreeBASIC code (for 1D arrays):

Code: Select all

#include "crt/string.bi"

Function arrayDescriptorPtrFunction (Byval p As Any Ptr) As Any Ptr
  Return p
End function

#macro arrayDescriptorPtr(array, p)
  Scope
    Dim As Function (() As Typeof((array))) As Any Ptr f
    f = Cast(Function (() As Typeof((array))) As Any Ptr, @arrayDescriptorPtrFunction)
    p = f(array())
  End Scope
#endmacro

Sub arrayCopy1D (dest() As Byte, src() As Byte)
  Type descriptor1D
    Dim As Any Ptr ptr0
    Dim As Any Ptr ptrLbound
    Dim As Uinteger globalSize
    Dim As Uinteger elementSize
    Dim As Uinteger dimensionNumber
    'dimension #1
      Dim As Uinteger elementNumber
      Dim As Integer lowBound
      Dim As Integer upBound
  End Type
   
  Dim As descriptor1D Ptr pdescriptorDest
  arrayDescriptorPtr(dest, pdescriptorDest)
  Dim As descriptor1D Ptr pdescriptorSrc
  arrayDescriptorPtr(src, pdescriptorSrc)
   
  If pdescriptorDest <> pdescriptorSrc Then
    If pdescriptorDest->globalSize <> pdescriptorSrc->globalSize Then
      If pdescriptorSrc->globalSize = 0 Then
        Erase dest
      Else
        Redim dest(pdescriptorSrc->lowBound To pdescriptorSrc->upBound)
      End If
    Else
      Dim As Any Ptr ptrLbound = pdescriptorDest->ptrLbound
      *pdescriptorDest = *pdescriptorSrc
      pdescriptorDest->ptrLbound = ptrLbound
      pdescriptorDest->ptr0 = ptrLbound + (pdescriptorSrc->ptr0 - pdescriptorSrc->ptrLbound)
    Endif
    If pdescriptorSrc->globalSize > 0 Then
      memcpy(pdescriptorDest->ptrLbound, pdescriptorSrc->ptrLbound, pdescriptorSrc->globalSize)
    End If
  End If
End Sub


Type UDT
  Dim As Integer I
  Dim As Byte b(Any)
End Type


Sub instanceCopy (Byref dest as UDT, Byref src As UDT)
  If @dest <> @src Then
    dest.I = src.I
    arrayCopy1D(dest.b(), src.b())
  End If
End Sub


Dim As UDT u1, u2
Redim u1.b (3 To 7)
For I As Integer = 3 To 7
  u1.b(I) = 10 + I
Next I
Redim u2.b (-7 To -3)

instanceCopy(u2, u1)

For I As Integer = Lbound(u2.b) To Ubound(u2.b)
  Print I,u2.b(I)
Next I

Sleep
Last edited by fxm on Aug 04, 2018 10:44, edited 9 times in total.
jj2007
Posts: 1159
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Weird assignment between two UDT instances each containing a dynamic array of equal large size

Postby jj2007 » Jul 31, 2018 8:29

fxm wrote:
jj2007 wrote:There is only a heap manager that tells the copy routine "here is a pointer to unused memory". Where that pointer is, straight on the old array or 4k earlier, is completely irrelevant for the copying task. It won't be any faster if by chance the old pointer is being picked

But how is it that for the copy of strings of equal sizes, the character data of the destination string are never moved into memory?

Sorry, I honestly do not understand your question.

Return to “General”

Who is online

Users browsing this forum: No registered users and 5 guests