-vec 2 enables intra-expression vectorization.
I'm not sure, but it seems this functionality is not completely implemented.
In the function taken from ast-vectorize.bas (see end of post), only the add operator is handled; there is no case for sub or any other operator. Why?
So Z=NR(1)-NR(2), which the pass never touches, works perfectly:
##Z=NR(1)-NR(2)
movss xmm6, [ebp-16]
subss xmm6, dword ptr [ebp-12]
movss [ebp-56], xmm6
With -vec 2, the addition loses an instruction:
##Z=NR(1)+NR(2)
movss xmm6, [ebp-16]
## <----- the addss line is missing here; it is present with -vec 1, see below
movss [ebp-56], xmm6
But with -vec 1
##Z=NR(1)+NR(2)
movss xmm6, [ebp-16]
addss xmm6, dword ptr [ebp-12]
movss [ebp-56], xmm6
With simple variables there is no problem (z = f1 + f2), but with array elements and UDT fields (as reported in the bug list) it doesn't work.
After a quick look at the code, it could be an error in the comparison of the addresses of the variables involved in the statement.
If I have the time I will try to confirm the cause of the bug.
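For anyone who wants to reproduce it, here is a small test along these lines. It is only a sketch: the file name, the PAIR type and the values are made up for the test, and the build line is my assumption of the options needed (-vec needs -fpu sse, and -R should keep the generated .asm file so the addss line can be checked).

```freebasic
'' vec2test.bas (hypothetical file name)
'' Assumed build line: fbc -fpu sse -vec 2 -R vec2test.bas

type PAIR                '' hypothetical UDT, only for this test
    as single a, b
end type

dim as single f1 = 1.5, f2 = 2.25
dim as single NR(1 to 2) = { 1.5, 2.25 }
dim as PAIR p
dim as single Z

p.a = 1.5
p.b = 2.25

'' the same sum written three ways; each should give 3.75
Z = f1 + f2
print "simple variables: "; Z

Z = NR(1) + NR(2)
print "array elements:   "; Z

Z = p.a + p.b
print "UDT fields:       "; Z
```

If the three printed values differ between -vec 1 and -vec 2, the lines following the ##Z=... comments in the kept .asm file show which statement lost its addss.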
private function astIntraTreeVectorize _
    ( _
        byval n as ASTNODE ptr _
    ) as integer
    dim as ASTNODE ptr l = any
    dim as integer changed = FALSE
    if( n = NULL ) then return FALSE
    if( n->class = AST_NODECLASS_BOP ) then
        if( n->op.op = AST_OP_ADD ) then '' <----- only add
            maxVectorWidth = 4
            vectorWidth = 0
            if( hMergeNode( n->l, n->r, FALSE ) ) then
                maxVectorWidth = 4
                vectorWidth = 0
                hMergeNode( n->l, n->r, TRUE )
                '' check for multiple HADDs
                l = n->l
                if( l->class = AST_NODECLASS_UOP ) then
                    if( l->op.op = AST_OP_HADD ) then
                        *n = *l
                        '' remove this node
                        astDelNode( l )
                        n->vector = 0
                        return TRUE
                    end if
                end if
                astDelTree( n->r )
                n->r = NULL
                n->class = AST_NODECLASS_UOP
                n->op.op = AST_OP_HADD
                n->vector = 0
                return TRUE
            end if
        end if
    end if
    if( astIntraTreeVectorize( n->l ) = TRUE ) then
        changed = TRUE
    end if
    if( astIntraTreeVectorize( n->r ) = TRUE ) then
        changed = TRUE
    end if
    function = changed
end function