Normal Mapping

Tourist Trap

Oh the scene!

Post by Tourist Trap »

Paradox, Defcon, and so many others whose names I have forgotten. The scene was at its pinnacle in the C64/Amiga days. Nice to keep this alive.
Stonemonkey

Re: Normal Mapping

Post by Stonemonkey »

D.J.Peters wrote: Good job, but sometimes I wonder about your coding style:
you use SSE to get higher fps, but some parts of your BASIC code look far from optimized.

Another point for your software renderer's to-do list: https://www.youtube.com/watch?v=j_ghQRN5z9M
(blitting the result to the screen is OpenGL, but the rest is software, you know?)

Don't worry, I have liked your work and results all these years, my friend :-)

Joshy

PS.
I personally stopped working on my software renderer. I use WebGL now (the future for getting your stuff running on countless devices).

Nice video with the lens flares, I'll look into that.
I'm willing to listen to suggestions for improvement, though I know I don't always. I'm self-taught, so I have my own ways, which I know aren't always right. There may also be things I do for reasons that aren't obvious, and I'm open to discussing those. I very much appreciate your input, even when I don't listen.
Stonemonkey

Re: Normal Mapping

Post by Stonemonkey »

Parallax mapping as it stands at the moment.

Code:

type gfx_buffer
    wwidth as long
    height as long
    pitch as long
    pixels as ulong pointer
    height_map as ubyte pointer
    depth as single pointer
    normal_data as ubyte ptr
    normal as single pointer
    vp_l as long
    vp_r as long
    vp_t as long
    vp_b as long
    u_mask as long
    v_mask as long
    v_shift as long
    wwidth_f as single
    height_f as single
    ripmap_u as gfx_buffer pointer
    ripmap_v as gfx_buffer pointer
    flags as ulong
    
    declare constructor(byval as long,byval as long,byval as long=0)
    declare destructor()
end type

type vertex
    as single x,y,z,pad0
    as single nx,ny,nz,pad1
    as single tx,ty,tz,pad2
    as single bx,by,bz,pad3
    as single rx,ry,rz,pad4
    as single sx,sy,sz,pad5
    as single b,g,r,a
    as single sb,sg,sr,sa
    as single u0,v0,u1,v1
    as single su0,sv0,su1,sv1
    as single scx,scy,scz,scw
    as single slx,sly,slz,slw
    as vertex ptr next_vertex
    as any ptr vertex_data
end type

type vertex_data
    as ubyte dat(0 to sizeof(vertex)+15)
end type

type triangle
    as vertex ptr v0,v1,v2
    as single nx,ny,nz,nw
    as single tx,ty,tz,tw
    as single bx,by,bz,bw
    texture as gfx_buffer ptr
    n_map as gfx_buffer ptr
    as ulong argb
    as triangle ptr next_triangle
end type

type model
    as triangle ptr first_triangle
    as vertex ptr first_vertex
    as model ptr next_model
end type

type entity
    as ubyte matrix_data(80+15)
    as single ptr m
    as model ptr model
    as entity ptr next_entity
end type

type world
    as entity ptr camera
    as entity ptr light
    as entity ptr first_entity
    as model ptr first_model
end type

function create_world()as world ptr
    return new world
end function

function create_camera(byval world as world ptr)as entity ptr
    world->camera=new entity
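    ' round the address of matrix_data up to the next 16-byte boundary so m is aligned for SSE (assumes 32-bit pointers)
    world->camera->m=cast(single ptr,(cast(ulong,@world->camera->matrix_data(0))+15)and &hfffffff0)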
    world->camera->m[0]=1.0
    world->camera->m[5]=1.0
    world->camera->m[10]=1.0
    return world->camera
end function

function create_light(byval world as world ptr)as entity ptr
    world->light=new entity
    world->light->m=cast(single ptr,(cast(ulong,@world->light->matrix_data(0))+15)and &hfffffff0)
    world->light->m[0]=1.0
    world->light->m[5]=1.0
    world->light->m[10]=1.0
    return world->light
end function

function create_model(byval world as world ptr)as model ptr
    dim as model ptr model=new model
    model->next_model=world->first_model
    world->first_model=model
    return model
end function

function create_entity(byval world as world ptr,byval model as model ptr)as entity ptr
    dim as entity ptr mesh=new entity
    mesh->m=cast(single ptr,(cast(ulong,@mesh->matrix_data(0))+15)and &hfffffff0)
    mesh->m[0]=1.0
    mesh->m[5]=1.0
    mesh->m[10]=1.0
    mesh->model=model
    mesh->next_entity=world->first_entity
    world->first_entity=mesh
    return mesh
end function

    
constructor gfx_buffer(byval wwidth as long,byval height as long,byval display as long=0)
    if display=0 then
        this.wwidth=wwidth
        this.height=height
        this.pitch=wwidth
        this.pixels=new ulong[wwidth*height]{any}
        this.flags and=&h7fffffff
    else
        screenres wwidth,height,32,2',&h10
        screenset 0,1
        this.pixels=screenptr()
        if this.pixels=0 then end
        screeninfo (this.wwidth,this.height,,,this.pitch)
        this.pitch shr=2
        this.flags or=&h80000000
    end if
    this.vp_l=0
    this.vp_r=wwidth-1
    this.vp_t=0
    this.vp_b=height-1
end constructor

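destructor gfx_buffer()
    if (flags and &h80000000)=&h80000000 then
        this.pixels=0 ' screen memory is owned by the gfx library, do not free it
    else
        delete[] this.pixels ' allocated with new[], so release with delete[]
        this.pixels=0
    end if
    if this.depth<>0 then delete[] this.depth:this.depth=0
    if this.normal_data<>0 then delete[] this.normal_data:this.normal_data=0:this.normal=0
    if this.height_map<>0 then delete[] this.height_map:this.height_map=0 ' allocated in enable_normal
end destructor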

sub set_normal(byval buffer as gfx_buffer ptr,_
                byval x as ulong,_
                byval y as ulong,_
                byval nx as single,_
                byval ny as single,_
                byval nz as single,byval h as single=0.0)
    if ((x-buffer->vp_l)<=(buffer->vp_r-buffer->vp_l)) and _
        ((y-buffer->vp_t)<=(buffer->vp_b-buffer->vp_t)) then
            dim as single d=1.0/sqr(nx*nx+ny*ny+nz*nz)
            buffer->normal[(x+y*buffer->pitch)*4]=nx*d
            buffer->normal[(x+y*buffer->pitch)*4+1]=ny*d
            buffer->normal[(x+y*buffer->pitch)*4+2]=nz*d
            buffer->normal[(x+y*buffer->pitch)*4+3]=h
            buffer->height_map[x+y*buffer->pitch]=h
    end if
end sub

sub set_pixel(byval buffer as gfx_buffer ptr,_
                byval x as ulong,_
                byval y as ulong,_
                byval argb as ulong)
    if ((x-buffer->vp_l)<=(buffer->vp_r-buffer->vp_l)) and _
        ((y-buffer->vp_t)<=(buffer->vp_b-buffer->vp_t)) then
            buffer->pixels[x+y*buffer->pitch]=argb
    end if
end sub

sub set_pixel_alpha(byval buffer as gfx_buffer ptr,_
                    byval x as long,_
                    byval y as long,_
                    byval argb as ulong)
    if ((x>=buffer->vp_l)and(x<=buffer->vp_r)and(y>=buffer->vp_t)and(y<=buffer->vp_b)) then
        dim as ulong ptr p=buffer->pixels+(x+y*buffer->pitch)

asm
    mov eax,dword ptr[p]
    mov eax,dword ptr[eax]  ' load destination colour (c1)
    mov ebx,dword ptr[argb] ' load source colour (c2)
    mov edi,ebx
    shr edi,24
    movzx ecx,ah          ' isolate green 1 and move to low byte
    movzx edx,bh          ' isolate green 2 and move to low byte
    sub ebx,eax           ' (c2-c1) for red/blue
    sub edx,ecx           ' (c2-c1) for green
    imul ebx,edi ' *alpha for red/blue
    imul edx,edi ' *alpha for green
    sar ebx,8             ' shift red/blue back to position
    add edx,ecx             ' +c1 for green
    add ebx,eax           ' +c1 for red/blue
    sub ebx,edx     
    mov bh,dh             ' recombine green with red/blue channel
    mov eax,dword ptr[p]
    mov dword ptr[eax],ebx  'write colour
end asm
        
        
    end if
end sub

function get_pixel(byval buffer as gfx_buffer ptr,_
                    byval x as ulong,_
                    byval y as ulong)as ulong
    if (x<buffer->wwidth) and _
        (y<buffer->height) then
            return buffer->pixels[x+y*buffer->pitch]
    else
        return -1
    end if
end function

sub draw_image(byval buffer as gfx_buffer ptr,_
                byval srce as gfx_buffer ptr,_
                byval xp as long,_
                byval yp as long)
    dim as long x0=xp,y0=yp,x1=x0+srce->wwidth-1,y1=y0+srce->height-1
    if x0<buffer->vp_l then x0=buffer->vp_l
    if x1>buffer->vp_r then x1=buffer->vp_r
    if y0<buffer->vp_t then y0=buffer->vp_t
    if y1>buffer->vp_b then y1=buffer->vp_b
    dim as long y=y0
    dim as ulong ptr dv=buffer->pixels+x0+y*buffer->pitch,sv=srce->pixels+(x0-xp)+(y-yp)*srce->pitch
    while y<=y1
        dim as long x=x0
        dim as ulong pointer d=dv,s=sv
        while x<=x1
            *d=*s
            d+=1
            s+=1
            x+=1
        wend
        dv+=buffer->pitch
        sv+=srce->pitch
        y+=1
    wend
end sub

sub draw_masked_image(byval buffer as gfx_buffer ptr,_
                byval srce as gfx_buffer ptr,_
                byval xp as long,_
                byval yp as long,_
                byval mask as ulong)
    dim as long x0=xp,y0=yp,x1=x0+srce->wwidth-1,y1=y0+srce->height-1
    if x0<buffer->vp_l then x0=buffer->vp_l
    if x1>buffer->vp_r then x1=buffer->vp_r
    if y0<buffer->vp_t then y0=buffer->vp_t
    if y1>buffer->vp_b then y1=buffer->vp_b
    dim as long y=y0
    dim as ulong ptr dv=buffer->pixels+x0+y*buffer->pitch,sv=srce->pixels+(x0-xp)+(y-yp)*srce->pitch
    while y<=y1
        dim as long x=x0
        dim as ulong pointer d=dv,s=sv
        while x<=x1
            if *s<>mask then *d=*s
            d+=1
            s+=1
            x+=1
        wend
        dv+=buffer->pitch
        sv+=srce->pitch
        y+=1
    wend
end sub

sub gtriangle(byval buffer as gfx_buffer ptr,_
                byval triangle as triangle pointer)
    type edge
        as single e,x,pad0,pad1
        as single b,g,r,a
        as single de,dx,dpad0,dpad1
        as single db,dg,dr,da
    end type
    dim as vertex pointer v0=triangle->v0,v1=triangle->v1,v2=triangle->v2
    if v2->sy<v0->sy then swap v0,v2
    if v1->sy<v0->sy then swap v0,v1 else if v2->sy<v1->sy then swap v1,v2
    
    dim as ubyte ptr edge_bytes=new ubyte[sizeof(edge)+15]{any},scan_bytes=new ubyte[sizeof(edge)+15]{any}
    dim as edge ptr edge=cast(edge ptr,cast(ulong,edge_bytes+15)and &hfffffff0)
    dim as edge ptr scan=cast(edge ptr,cast(ulong,scan_bytes+15)and &hfffffff0)
    
    dim as single ptr xl=any,xr=any
    edge->de=(v1->sx-v0->sx)/(v1->sy-v0->sy)
    dim as single dy=1.0/(v2->sy-v0->sy)
    edge->dx=(v2->sx-v0->sx)*dy
    if edge->dx>edge->de then xl=@edge->e:xr=@edge->x else xl=@edge->x:xr=@edge->e end if
    edge->dr=(v2->sr-v0->sr)*dy
    edge->dg=(v2->sg-v0->sg)*dy
    edge->db=(v2->sb-v0->sb)*dy
    dy=v1->sy-v0->sy
    dim as single dx=1.0/(v1->sx-(v0->sx+edge->dx*dy))
    scan->dr=(v1->sr-(v0->sr+edge->dr*dy))*dx
    scan->dg=(v1->sg-(v0->sg+edge->dg*dy))*dx
    scan->db=(v1->sb-(v0->sb+edge->db*dy))*dx
    dim as integer y_start=v0->sy+0.4999,y_end=v1->sy-0.4999,y_fin=v2->sy-0.4999
    if y_start<buffer->vp_t then y_start=buffer->vp_t
    if y_end>buffer->vp_b then y_end=buffer->vp_b
    if y_fin>buffer->vp_b then y_fin=buffer->vp_b
    dim as single d=y_start-v0->sy
    edge->e=v0->sx+edge->de*d
    edge->x=v0->sx+edge->dx*d
    edge->r=v0->sr+edge->dr*d
    edge->g=v0->sg+edge->dg*d
    edge->b=v0->sb+edge->db*d

asm 
    mov eax,dword ptr[scan]
    movaps xmm0,[eax+offsetof(edge,db)]
    mov eax,dword ptr[edge]
    movaps xmm3,[eax+offsetof(edge,b)]
    movaps xmm4,[eax+offsetof(edge,db)]
end asm

    dim as ulong ptr l_start=buffer->pixels+y_start*buffer->pitch
    while(y_start<=y_fin)
        while(y_start<=y_end)
            dim as integer x_start=*xl+0.4999,x_end=*xr-0.4999
            if x_start<buffer->vp_l then x_start=buffer->vp_l
            x_start and=not(x_start shr 31)
            if x_end>buffer->vp_r then x_end=buffer->vp_r
            dim as single ddx=x_start-edge->x
            
asm 
    movss xmm1,[ddx]
    shufps xmm1,xmm1,0
    mulps xmm1,xmm0
    addps xmm1,xmm3
    mov eax,dword ptr[x_start]
    mov ebx,dword ptr[x_end]
    shl eax,2
    shl ebx,2
    add eax,dword ptr[l_start]
    add ebx,dword ptr[l_start]
    cmp eax,ebx
    ja gtriangle_no_scanline
    gtriangle_tri_loop:
        cvtps2pi mm0,xmm1
        movhlps xmm2,xmm1
        cvtps2pi mm1,xmm2
        packuswb mm0,mm1
        packuswb mm0,mm0
        movd [eax],mm0
        add eax,4
        addps xmm1,xmm0
        cmp eax,ebx
    jbe gtriangle_tri_loop ' unsigned compare: eax/ebx hold addresses, matching the ja above
    gtriangle_no_scanline:
    emms
    addps xmm3,xmm4
end asm
            edge->x+=edge->dx:edge->e+=edge->de:l_start+=buffer->pitch:y_start+=1
        wend
        if y_start<=y_fin then
            y_end=y_fin
            edge->de=(v2->sx-v1->sx)/(v2->sy-v1->sy)
            dim as single d=y_start-v2->sy
            edge->e=v2->sx+edge->de*d
        end if
    wend
    delete[] edge_bytes
    delete[] scan_bytes
end sub

sub ftriangle(byval buffer as gfx_buffer ptr,byval triangle as triangle pointer)
    
    type edge
        as single e,x
        as single de,dx
    end type
    
    dim as vertex pointer v0=triangle->v0,v1=triangle->v1,v2=triangle->v2
    
    if v2->sy<v0->sy then swap v0,v2
    if v1->sy<v0->sy then swap v0,v1 else if v2->sy<v1->sy then swap v1,v2
    
    dim as integer y_start=v0->sy+0.4999,y_end=v1->sy-0.4999,y_fin=v2->sy-0.4999
    if y_start<buffer->vp_t then y_start=buffer->vp_t
    if y_end>buffer->vp_b then y_end=buffer->vp_b
    if y_fin>buffer->vp_b then y_fin=buffer->vp_b

    dim as edge edge=any,scan=any
    
    edge.de=(v1->sx-v0->sx)/(v1->sy-v0->sy)
    edge.dx=(v2->sx-v0->sx)/(v2->sy-v0->sy)
    dim as single ptr xl=any,xr=any
    if edge.dx>edge.de then xl=@edge.e:xr=@edge.x else xl=@edge.x:xr=@edge.e end if
    dim as single d=y_start-v0->sy
    edge.e=v0->sx+edge.de*d
    edge.x=v0->sx+edge.dx*d
    dim as ulong argb=triangle->argb
    dim as ulong ptr l_start=buffer->pixels+y_start*buffer->pitch
    while(y_start<=y_fin)
        while(y_start<=y_end)
            dim as integer x_start=*xl+0.4999,x_end=*xr-0.4999
            if x_start<buffer->vp_l then x_start=buffer->vp_l
            if x_end>buffer->vp_r then x_end=buffer->vp_r

asm 
    mov ecx,dword ptr[l_start]
    mov eax,dword ptr[x_start]
    mov ebx,dword ptr[x_end]
    lea ecx,[ecx+ebx*4]
    sub eax,ebx
    jg ftriangle_no_scanline ' skip only when x_start>x_end, so one-pixel spans are still drawn
    mov edx,dword ptr[argb]
    ftriangle_tri_loop:
        mov dword ptr[ecx+eax*4],edx
        inc eax
    jng ftriangle_tri_loop
    ftriangle_no_scanline:
end asm
            edge.x+=edge.dx:edge.e+=edge.de:l_start+=buffer->pitch:y_start+=1
        wend
        if y_start<=y_fin then
            y_end=y_fin
            edge.de=(v2->sx-v1->sx)/(v2->sy-v1->sy)
            dim as single d=y_start-v2->sy
            edge.e=v2->sx+edge.de*d
        end if
    wend
end sub

sub set_depth(byval buffer as gfx_buffer ptr,_
                byval x as ulong,_
                byval y as ulong,_
                byval depth as single)
    if buffer->depth=0 then exit sub ' depth buffer not enabled
    if ((x-buffer->vp_l)<=(buffer->vp_r-buffer->vp_l)) and _
        ((y-buffer->vp_t)<=(buffer->vp_b-buffer->vp_t)) then
            buffer->depth[x+y*buffer->pitch]=1.0/depth ' store 1/z in the depth buffer, not the colour buffer
    end if
end sub

function get_depth(byval buffer as gfx_buffer ptr,_
                    byval x as ulong,_
                    byval y as ulong)as single
    if buffer->depth=0 then return -1.0 ' depth buffer not enabled
    if (x<buffer->wwidth) and _
        (y<buffer->height) then
            return 1.0/buffer->depth[x+y*buffer->pitch] ' stored value is 1/z
    else
        return -1.0
    end if
end function

sub enable_depth(byval buffer as gfx_buffer ptr)
    if buffer->depth=0 then
        buffer->depth=new single[buffer->pitch*buffer->height]{any}
    end if
end sub

sub enable_normal(byval buffer as gfx_buffer ptr)
    if buffer->normal=0 then
        buffer->normal_data=new ubyte[sizeof(single)*buffer->pitch*buffer->height*4+15]{any}
        buffer->normal=cast(single ptr,(cast(ulong,buffer->normal_data)+15)and &hfffffff0)
        buffer->height_map=new ubyte[buffer->pitch*buffer->height]
    end if
end sub

sub disable_depth(byval buffer as gfx_buffer ptr)
    if buffer->depth<>0 then
        delete[] buffer->depth
        buffer->depth=0
    end if
end sub

sub clear_buffer(byval buffer as gfx_buffer ptr,_
                byval c as ulong=0,_
                byval z as single=0.0)
    if buffer->depth=0 then
        dim as ulong ptr d=buffer->pixels,e=buffer->pixels+buffer->pitch*buffer->height
        while d<e
            *d=c
            d+=1
        wend
    else
        dim as ulong ptr ds=buffer->pixels,e=buffer->pixels+buffer->pitch*buffer->height
        dim as single ptr dz=buffer->depth
        while ds<e
            *dz=z
            *ds=c
            dz+=1
            ds+=1
        wend
    end if
end sub

sub clear_viewport(byval buffer as gfx_buffer ptr,_
                    byval c as ulong=0,_
                    byval z as single=0.0)
    if buffer->depth=0 then
        dim as long x0=buffer->vp_l,y0=buffer->vp_t,x1=buffer->vp_r,y1=buffer->vp_b
        dim as long y=y0
        dim as ulong ptr dv=buffer->pixels+x0+y*buffer->pitch
        while y<=y1
            dim as long x=x0
            dim as ulong pointer d=dv
            while x<=x1
                *d=c
                d+=1
                x+=1
            wend
            dv+=buffer->pitch
            y+=1
        wend
    else
        dim as long x0=buffer->vp_l,y0=buffer->vp_t,x1=buffer->vp_r,y1=buffer->vp_b
        dim as long y=y0
        dim as ulong ptr dv=buffer->pixels+x0+y*buffer->pitch
        dim as single ptr zv=buffer->depth+x0+y*buffer->pitch
        while y<=y1
            dim as long x=x0
            dim as ulong ptr d=dv
            dim as single ptr zp=zv
            while x<=x1
                *d=c
                *zp=z
                d+=1
                zp+=1
                x+=1
            wend
            dv+=buffer->pitch
            zv+=buffer->pitch
            y+=1
        wend
    end if
end sub

sub transition_alpha(byval buffer as gfx_buffer ptr,_
                    byval x as long,_
                    byval y as long,_
                    byval argb as ulong)
    if ((x>=buffer->vp_l)and(x<=buffer->vp_r)and(y>=buffer->vp_t)and(y<=buffer->vp_b)) then
        dim as ulong ptr p=buffer->pixels+(x+y*buffer->pitch)

asm
    mov eax,dword ptr[p]
    mov eax,dword ptr[eax]  ' load destination colour (c1)
    mov ebx,dword ptr[argb] ' load source colour (c2)
    mov edi,ebx
    shr edi,24
    movzx ecx,ah          ' isolate green 1 and move to low byte
    movzx edx,bh          ' isolate green 2 and move to low byte
    sub ebx,eax           ' (c2-c1) for red/blue
    sub edx,ecx           ' (c2-c1) for green
    imul ebx,edi ' *alpha for red/blue
    imul edx,edi ' *alpha for green
    sar ebx,8             ' shift red/blue back to position
    add edx,ecx             ' +c1 for green
    add ebx,eax           ' +c1 for red/blue
    sub ebx,edx     
    mov bh,dh             ' recombine green with red/blue channel
    mov eax,dword ptr[p]
    mov dword ptr[eax],ebx  'write colour
end asm
        
        
    end if
end sub





sub draw_triangle(byval buffer as gfx_buffer ptr,byval triangle as triangle ptr)
    dim as single ambient=.1,zero=0.0,m256=256
    dim as ulong argb=triangle->argb
    type edge
        as single z,x,e,p1
        as single dz,dx,de,p3
        as single u0,v0,u1,v1
        as single du0,dv0,du1,dv1
        as single b,g,r,a
        as single db,dg,dr,da
        as single cx,cy,cz,cw
        as single dcx,dcy,dcz,dcw
        as single lx,ly,lz,lw
        as single dlx,dly,dlz,dlw
        as single mcx,mcy,mcz,mcw
        as ulong abs_mask,am1,am2,am3
        as single tex_shift,ts1,ts2,ts3
    end type
    dim as vertex ptr v0=triangle->v0,v1=triangle->v1,v2=triangle->v2
    if (v2->sx<v0->sx) then swap v0,v2
    if (v1->sx<v0->sx) then swap v0,v1 else if (v2->sx<v1->sx) then swap v1,v2
    if (v0->sx<=buffer->vp_r)and(v2->sx>=buffer->vp_l) then
        if (v2->sy<v0->sy) then swap v0,v2
        if (v1->sy<v0->sy) then swap v0,v1 else if (v2->sy<v1->sy) then swap v1,v2
        if (v0->sy<=buffer->vp_b)and(v2->sy>=buffer->vp_t) then
            dim as integer y_start=v0->sy+.49999,y_end=v1->sy-.49999,y_fin=v2->sy-.49999
            if (y_start<buffer->vp_t) then y_start=buffer->vp_t
            if (y_end>buffer->vp_b) then y_end=buffer->vp_b
            if (y_fin>buffer->vp_b) then y_fin=buffer->vp_b
            dim as ubyte ptr edge_bytes=new ubyte[sizeof(edge)+15]{any}
            dim as ubyte ptr scan_bytes=new ubyte[sizeof(edge)+15]{any}
            dim as edge ptr edge=cast(edge ptr,(cast(ulong,edge_bytes)+15)and &hfffffff0)
            dim as edge ptr scan=cast(edge ptr,(cast(ulong,scan_bytes)+15)and &hfffffff0)
            scan->mcx=255.0
            scan->mcy=255.0
            scan->mcz=255.0
            scan->abs_mask=&h7fffffff
            scan->am1=&h7fffffff
            scan->am2=&h7fffffff
            scan->am3=&h7fffffff
            scan->tex_shift=1.0
            scan->ts1=256.0
            scan->ts2=1.0
            scan->ts3=1.0
            dim as single tw=triangle->texture->wwidth shl 8,th=triangle->texture->height shl 8
            'dim as single tw=triangle->texture->wwidth,th=triangle->texture->height
            
            v0->su0=v0->u0*tw*v0->sz
            v0->sv0=v0->v0*th*v0->sz
            v1->su0=v1->u0*tw*v1->sz
            v1->sv0=v1->v0*th*v1->sz
            v2->su0=v2->u0*tw*v2->sz
            v2->sv0=v2->v0*th*v2->sz
            dim as ulong v_shift=8'16-(log(triangle->texture->wwidth)/log(2))
            dim as ulongint vshift64=8
            dim as ulongint uv_mask=(triangle->texture->wwidth*triangle->texture->height)-1
            dim as ulong ptr texels=triangle->texture->pixels
            dim as single ptr normals=triangle->texture->normal
            dim as ubyte ptr height_map=triangle->texture->height_map
    dim as single dy01=v1->sy-v0->sy
    dim as single dy02=1.0/(v2->sy-v0->sy)
    edge->de=(v1->sx-v0->sx)/dy01
    edge->dx=(v2->sx-v0->sx)*dy02
    edge->dz=(v2->sz-v0->sz)*dy02
    dim as single dx_scan=1.0/(v1->sx-(v0->sx+edge->dx*dy01))
    dim as single d_y_start=y_start-v0->sy
    scan->dz=(v1->sz-(v0->sz+edge->dz*dy01))*dx_scan
    edge->e=v0->sx+edge->de*d_y_start
    edge->x=v0->sx+edge->dx*d_y_start
    edge->z=v0->sz+edge->dz*d_y_start
    dim as single ptr xl=any,xr=any
    if edge->dx>edge->de then xl=@edge->e:xr=@edge->x else xl=@edge->x:xr=@edge->e end if
    
#define sse_
#ifdef sse_
    asm
        movss xmm7,[dy02]
        shufps xmm7,xmm7,0
        mov eax,[v0]
        mov ebx,[v1]
        mov ecx,[v2]
        mov edx,[edge]
        
        movaps xmm0,[ecx+offsetof(vertex,sb)]
        movaps xmm1,[ecx+offsetof(vertex,su0)]
        movaps xmm2,[ecx+offsetof(vertex,scx)]
        movaps xmm3,[ecx+offsetof(vertex,slx)]
        subps xmm0,[eax+offsetof(vertex,sb)]
        subps xmm1,[eax+offsetof(vertex,su0)]
        subps xmm2,[eax+offsetof(vertex,scx)]
        subps xmm3,[eax+offsetof(vertex,slx)]
        mulps xmm0,xmm7
        mulps xmm1,xmm7
        mulps xmm2,xmm7
        mulps xmm3,xmm7
        movaps [edx+offsetof(edge,db)],xmm0
        movaps [edx+offsetof(edge,du0)],xmm1
        movaps [edx+offsetof(edge,dcx)],xmm2
        movaps [edx+offsetof(edge,dlx)],xmm3
        
        mov ecx,[scan]
        
        movss xmm7,[dy01]
        shufps xmm7,xmm7,0
        
        mulps xmm0,xmm7
        mulps xmm1,xmm7
        mulps xmm2,xmm7
        mulps xmm3,xmm7
        movaps xmm4,[ebx+offsetof(vertex,sb)]
        movaps xmm5,[ebx+offsetof(vertex,su0)]
        movaps xmm6,[ebx+offsetof(vertex,scx)]
        movaps xmm7,[ebx+offsetof(vertex,slx)]
        addps xmm0,[eax+offsetof(vertex,sb)]
        addps xmm1,[eax+offsetof(vertex,su0)]
        addps xmm2,[eax+offsetof(vertex,scx)]
        addps xmm3,[eax+offsetof(vertex,slx)]
        subps xmm4,xmm0
        subps xmm5,xmm1
        subps xmm6,xmm2
        subps xmm7,xmm3
        movss xmm0,[dx_scan]
        shufps xmm0,xmm0,0
        mulps xmm4,xmm0
        mulps xmm5,xmm0
        mulps xmm6,xmm0
        mulps xmm7,xmm0
        movaps [ecx+offsetof(edge,db)],xmm4
        movaps [ecx+offsetof(edge,du0)],xmm5
        movaps [ecx+offsetof(edge,dcx)],xmm6
        movaps [ecx+offsetof(edge,dlx)],xmm7
        
        movss xmm7,[d_y_start]
        shufps xmm7,xmm7,0
        
        movaps xmm0,[edx+offsetof(edge,db)]
        movaps xmm1,[edx+offsetof(edge,du0)]
        movaps xmm2,[edx+offsetof(edge,dcx)]
        movaps xmm3,[edx+offsetof(edge,dlx)]
        mulps xmm0,xmm7
        mulps xmm1,xmm7
        mulps xmm2,xmm7
        mulps xmm3,xmm7
        addps xmm0,[eax+offsetof(vertex,sb)]
        addps xmm1,[eax+offsetof(vertex,su0)]
        addps xmm2,[eax+offsetof(vertex,scx)]
        addps xmm3,[eax+offsetof(vertex,slx)]
        movaps [edx+offsetof(edge,b)],xmm0
        movaps [edx+offsetof(edge,u0)],xmm1
        movaps [edx+offsetof(edge,cx)],xmm2
        movaps [edx+offsetof(edge,lx)],xmm3
    end asm
#else
    edge->da=(v2->sa-v0->sa)*dy02
    edge->dr=(v2->sr-v0->sr)*dy02
    edge->dg=(v2->sg-v0->sg)*dy02
    edge->db=(v2->sb-v0->sb)*dy02
    
    edge->du0=(v2->su0-v0->su0)*dy02
    edge->dv0=(v2->sv0-v0->sv0)*dy02
    edge->du1=(v2->su1-v0->su1)*dy02
    edge->dv1=(v2->sv1-v0->sv1)*dy02
    
    edge->dcx=(v2->scx-v0->scx)*dy02
    edge->dcy=(v2->scy-v0->scy)*dy02
    edge->dcz=(v2->scz-v0->scz)*dy02
    edge->dcw=(v2->scw-v0->scw)*dy02
    
    edge->dlx=(v2->slx-v0->slx)*dy02
    edge->dly=(v2->sly-v0->sly)*dy02
    edge->dlz=(v2->slz-v0->slz)*dy02
    edge->dlw=(v2->slw-v0->slw)*dy02
    
    scan->da=(v1->sa-(v0->sa+edge->da*dy01))*dx_scan
    scan->dr=(v1->sr-(v0->sr+edge->dr*dy01))*dx_scan
    scan->dg=(v1->sg-(v0->sg+edge->dg*dy01))*dx_scan
    scan->db=(v1->sb-(v0->sb+edge->db*dy01))*dx_scan
    
    scan->du0=(v1->su0-(v0->su0+edge->du0*dy01))*dx_scan
    scan->dv0=(v1->sv0-(v0->sv0+edge->dv0*dy01))*dx_scan
    scan->du1=(v1->su1-(v0->su1+edge->du1*dy01))*dx_scan
    scan->dv1=(v1->sv1-(v0->sv1+edge->dv1*dy01))*dx_scan
    
    scan->dcx=(v1->scx-(v0->scx+edge->dcx*dy01))*dx_scan
    scan->dcy=(v1->scy-(v0->scy+edge->dcy*dy01))*dx_scan
    scan->dcz=(v1->scz-(v0->scz+edge->dcz*dy01))*dx_scan
    scan->dcw=(v1->scw-(v0->scw+edge->dcw*dy01))*dx_scan
    
    scan->dlx=(v1->slx-(v0->slx+edge->dlx*dy01))*dx_scan
    scan->dly=(v1->sly-(v0->sly+edge->dly*dy01))*dx_scan
    scan->dlz=(v1->slz-(v0->slz+edge->dlz*dy01))*dx_scan
    scan->dlw=(v1->slw-(v0->slw+edge->dlw*dy01))*dx_scan
    
    edge->a=v0->sa+edge->da*d_y_start
    edge->r=v0->sr+edge->dr*d_y_start
    edge->g=v0->sg+edge->dg*d_y_start
    edge->b=v0->sb+edge->db*d_y_start
    
    edge->u0=v0->su0+edge->du0*d_y_start
    edge->v0=v0->sv0+edge->dv0*d_y_start
    edge->u1=v0->su1+edge->du1*d_y_start
    edge->v1=v0->sv1+edge->dv1*d_y_start
    
    edge->cx=v0->scx+edge->dcx*d_y_start
    edge->cy=v0->scy+edge->dcy*d_y_start
    edge->cz=v0->scz+edge->dcz*d_y_start
    edge->cw=v0->scw+edge->dcw*d_y_start
    
    edge->lx=v0->slx+edge->dlx*d_y_start
    edge->ly=v0->sly+edge->dly*d_y_start
    edge->lz=v0->slz+edge->dlz*d_y_start
    edge->lw=v0->slw+edge->dlw*d_y_start
#endif
    
    
    
    dim as ulong ptr l_start=buffer->pixels+y_start*buffer->pitch
    while(y_start<=y_fin)
        while(y_start<=y_end)
            dim as integer x_start=*xl+0.4999,x_end=*xr-0.4999
            if x_start<buffer->vp_l then x_start=buffer->vp_l
            if x_end>buffer->vp_r then x_end=buffer->vp_r
            dim as single dx_scan=x_start-edge->x
#ifdef sse_
    asm
        movss xmm7,[dx_scan]
        shufps xmm7,xmm7,0
        mov eax,[scan]
        mov ebx,[edge]
        movaps xmm0,[eax+offsetof(edge,db)]
        movaps xmm1,[eax+offsetof(edge,du0)]
        movaps xmm2,[eax+offsetof(edge,dcx)]
        movaps xmm3,[eax+offsetof(edge,dlx)]
        movss xmm4,[eax+offsetof(edge,dz)]
        mulps xmm0,xmm7
        mulps xmm1,xmm7
        mulps xmm2,xmm7
        mulps xmm3,xmm7
        mulss xmm4,xmm7
        addps xmm0,[ebx+offsetof(edge,b)]
        addps xmm1,[ebx+offsetof(edge,u0)]
        addps xmm2,[ebx+offsetof(edge,cx)]
        addps xmm3,[ebx+offsetof(edge,lx)]
        addss xmm4,[ebx+offsetof(edge,z)]
    end asm
#else
            scan->r=edge->r+scan->dr*dx_scan
            scan->g=edge->g+scan->dg*dx_scan
            scan->b=edge->b+scan->db*dx_scan
#endif
            dim as ulong ptr p_start=l_start+x_start,p_end=l_start+x_end
            while(p_start<=p_end)
                
#ifdef sse_
    asm
        
        mov esi,[height_map]
        mov edi,[scan]
        movq mm0,[vshift64]
        movq mm1,[uv_mask]
        shufps xmm4,xmm4,0  'move 1/z into all slots
        
        
        movaps xmm7,xmm2     'get camera vector
        andps xmm7,[edi+offsetof(edge,abs_mask)]  'absolute camera vector
        
        movaps xmm5,xmm7     'copy absolute camera vector
        
        shufps xmm7,xmm7,&b11100101' move y into first slot

        maxss xmm7,xmm5            'get highest value (dx,dy)
        rcpps xmm5,xmm4          'get z in all slots
        mulps xmm5,xmm1      'initial tex coords
        xor ebx,ebx          'zero ebx
        rcpss xmm7,xmm7      '1.0/max(dx,dy)
        mulps xmm5,[edi+offsetof(edge,tex_shift)]
        cvtps2pi mm3,xmm5    'convert tex coords to int
        mulss xmm7,[m256]
        pshufw mm2,mm3,&b1100     'get lower 16 bits of u/v
        shufps xmm7,xmm7,0
        mov ecx,[v_shift]    '
        mulps xmm7,xmm2
        
        movd edx,mm2         '
        'shr edx,8            'get rid of fraction
        'shl dx,cl            'combine uv
        shr edx,cl
        and edx,[uv_mask]    'mask for wrapping
        cmp bh,[esi+edx]
        jae no_ploop
        mulps xmm7,[edi+offsetof(edge,tex_shift)]
        cvtps2pi mm4,xmm7
        movhlps xmm7,xmm7
        cvtss2si eax,xmm7
'xor edi,edi
        cmp eax,ebx
        jbe no_ploop
        'pshufw mm3,mm3,8
        'pshufw mm4,mm4,8
        movss xmm6,xmm7
        

ploop1:     paddd mm3,mm4
            add bx,ax
            pshufw mm2,mm3,&b1100 
            psrld mm2,mm0
            pand mm2,mm1
            movd edx,mm2
            cmp bh,[esi+edx]
        jb ploop1
no_ploop:
        mov edi,[texels]
        mov eax,[p_start]
        shl edx,4
        mov esi,[normals]
        movaps xmm5,[esi+edx]
        movaps xmm6,xmm3
        movaps xmm7,xmm3
        mulps xmm6,xmm6
        mulps xmm7,xmm5
        haddps xmm6,xmm6
        haddps xmm7,xmm7
        haddps xmm6,xmm6
        haddps xmm7,xmm7
        rsqrtss xmm6,xmm6
        mulss xmm6,xmm7
        maxss xmm6,[ambient]
        shufps xmm6,xmm6,0
        
        mulps xmm6,xmm0
        divps xmm6,xmm4
        
        cvtps2pi mm0,xmm6
        movhlps xmm6,xmm6
        cvtps2pi mm1,xmm6
        packuswb mm0,mm1
        
        shr edx,2
        movd mm1,[edi+edx]
        pxor mm7,mm7
        punpcklbw mm1,mm7    'combine tex and colour
        movq mm3,mm1
        

        
        pshufw mm3,mm3,&b11111111
        pmullw mm0,mm1
        psrlw mm0,8
        


        shl edx,2
        movaps xmm5,[esi+edx]
        mulps xmm5,xmm3
        haddps xmm5,xmm5
        haddps xmm5,xmm5
        addss xmm5,xmm5
        shufps xmm5,xmm5,0
        mulps xmm5,[esi+edx]
        movaps xmm6,xmm3
        subps xmm6,xmm5
        mulps xmm6,xmm2
        haddps xmm6,xmm6
        haddps xmm6,xmm6
        movaps xmm7,xmm3
        mulps xmm7,xmm7
        haddps xmm7,xmm7
        haddps xmm7,xmm7
        movaps xmm5,xmm2
        mulps xmm5,xmm5
        haddps xmm5,xmm5
        haddps xmm5,xmm5
        mulss xmm5,xmm7
        rsqrtss xmm5,xmm5
        mulss xmm5,xmm6
        minss xmm5,[zero]
        mulss xmm5,xmm5
        mulss xmm5,xmm5
        'mulss xmm5,xmm5
        'mulss xmm5,xmm5
        'mulss xmm5,xmm5
        'mulss xmm5,xmm5
        'mulss xmm5,xmm5
        mulss xmm5,[m256]
        
        cvtps2pi mm2,xmm5
        
        pshufw mm2,mm2,0
        pmullw mm2,mm3
        psrlw mm2,8
        
        
'movd mm6,ebx
'pshufw mm6,mm6,0
'psrlw mm6,10
'paddusw mm0,mm6

        paddusw mm0,mm2 'add specular

        'movq mm0,mm2   'specular component only
        packuswb mm0,mm0
        
        'movd mm0,[argb] 'triangle random colour
        'packuswb mm1,mm1 'texture/no shading (write mm1 into [eax] in next instruction)
        
        movd [eax],mm0       'write pixel
        
no_fill:
        
        mov eax,[scan]
        addps xmm0,[eax+offsetof(edge,db)]
        addps xmm1,[eax+offsetof(edge,du0)]
        addps xmm2,[eax+offsetof(edge,dcx)]
        addps xmm3,[eax+offsetof(edge,dlx)]
        addss xmm4,[eax+offsetof(edge,dz)]
    end asm
#else
                *p_start=(scan->r shl 16)or(scan->g shl 8)or(scan->b)
                scan->r+=scan->dr
                scan->g+=scan->dg
                scan->b+=scan->db
#endif
                p_start+=1
            wend
#ifdef sse_
    asm
        emms
        mov eax,[edge]
        movaps xmm0,[eax+offsetof(edge,b)]
        movaps xmm1,[eax+offsetof(edge,u0)]
        movaps xmm2,[eax+offsetof(edge,cx)]
        movaps xmm3,[eax+offsetof(edge,lx)]
        movaps xmm4,[eax+offsetof(edge,z)]
        addps xmm0,[eax+offsetof(edge,db)]
        addps xmm1,[eax+offsetof(edge,du0)]
        addps xmm2,[eax+offsetof(edge,dcx)]
        addps xmm3,[eax+offsetof(edge,dlx)]
        addps xmm4,[eax+offsetof(edge,dz)]
        movaps [eax+offsetof(edge,b)],xmm0
        movaps [eax+offsetof(edge,u0)],xmm1
        movaps [eax+offsetof(edge,cx)],xmm2
        movaps [eax+offsetof(edge,lx)],xmm3
        movaps [eax+offsetof(edge,z)],xmm4
    end asm
#else
            edge->r+=edge->dr
            edge->g+=edge->dg
            edge->b+=edge->db
            edge->x+=edge->dx
            edge->e+=edge->de
#endif
            l_start+=buffer->pitch
            y_start+=1
        wend
        if y_start<=y_fin then
            y_end=y_fin
            edge->de=(v2->sx-v1->sx)/(v2->sy-v1->sy)
            dim as single d=y_start-v2->sy
            edge->e=v2->sx+edge->de*d
        end if
    wend
    
    
            
            delete edge_bytes
            delete scan_bytes
        end if
    end if
end sub

sub transform_vertices(byval buffer as gfx_buffer ptr,byval world as world pointer,byval mesh as entity pointer)
    dim as entity pointer cam=world->camera
    dim as single w=buffer->wwidth*.5,h=buffer->height*.5
    
    dim as single lig_xt=world->light->m[16]-mesh->m[16]
    dim as single lig_yt=world->light->m[17]-mesh->m[17]
    dim as single lig_zt=world->light->m[18]-mesh->m[18]
    dim as single lx=mesh->m[0]*lig_xt+mesh->m[4]*lig_yt+mesh->m[8]*lig_zt
    dim as single ly=mesh->m[1]*lig_xt+mesh->m[5]*lig_yt+mesh->m[9]*lig_zt
    dim as single lz=mesh->m[2]*lig_xt+mesh->m[6]*lig_yt+mesh->m[10]*lig_zt
    
    dim as single cam_xt=mesh->m[16]-cam->m[16]
    dim as single cam_yt=mesh->m[17]-cam->m[17]
    dim as single cam_zt=mesh->m[18]-cam->m[18]
    dim as single cx=-(mesh->m[0]*cam_xt+mesh->m[4]*cam_yt+mesh->m[8]*cam_zt)
    dim as single cy=-(mesh->m[1]*cam_xt+mesh->m[5]*cam_yt+mesh->m[9]*cam_zt)
    dim as single cz=-(mesh->m[2]*cam_xt+mesh->m[6]*cam_yt+mesh->m[10]*cam_zt)
    dim as single ox=cam->m[0]*cam_xt+cam->m[4]*cam_yt+cam->m[8]*cam_zt
    dim as single oy=cam->m[1]*cam_xt+cam->m[5]*cam_yt+cam->m[9]*cam_zt
    dim as single oz=cam->m[2]*cam_xt+cam->m[6]*cam_yt+cam->m[10]*cam_zt
    dim as single ux0=(cam->m[0]*mesh->m[0]+cam->m[4]*mesh->m[4]+cam->m[8]*mesh->m[8])
    dim as single ux1=(cam->m[0]*mesh->m[1]+cam->m[4]*mesh->m[5]+cam->m[8]*mesh->m[9])
    dim as single ux2=(cam->m[0]*mesh->m[2]+cam->m[4]*mesh->m[6]+cam->m[8]*mesh->m[10])
    dim as single uy0=(cam->m[1]*mesh->m[0]+cam->m[5]*mesh->m[4]+cam->m[9]*mesh->m[8])
    dim as single uy1=(cam->m[1]*mesh->m[1]+cam->m[5]*mesh->m[5]+cam->m[9]*mesh->m[9])
    dim as single uy2=(cam->m[1]*mesh->m[2]+cam->m[5]*mesh->m[6]+cam->m[9]*mesh->m[10])
    dim as single uz0=(cam->m[2]*mesh->m[0]+cam->m[6]*mesh->m[4]+cam->m[10]*mesh->m[8])
    dim as single uz1=(cam->m[2]*mesh->m[1]+cam->m[6]*mesh->m[5]+cam->m[10]*mesh->m[9])
    dim as single uz2=(cam->m[2]*mesh->m[2]+cam->m[6]*mesh->m[6]+cam->m[10]*mesh->m[10])
   
    dim as vertex pointer vert=mesh->model->first_vertex
    
    while vert<>0
        vert->rx=vert->x*ux0+vert->y*ux1+vert->z*ux2+ox
        vert->ry=vert->x*uy0+vert->y*uy1+vert->z*uy2+oy
        vert->rz=vert->x*uz0+vert->y*uz1+vert->z*uz2+oz
        
        
        vert->sz=1.0/vert->rz
        vert->sx=w+w*2.0*(vert->rx)*(vert->sz)
        vert->sy=h-w*2.0*(vert->ry)*(vert->sz)
        'vert->su0=vert->u0*vert->sz
        'vert->sv0=vert->v0*vert->sz
        
        vert->sr=vert->r*256.0*vert->sz
        vert->sg=vert->g*256.0*vert->sz
        vert->sb=vert->b*256.0*vert->sz
        
        dim as single lpx=lx-vert->x,lpy=ly-vert->y,lpz=lz-vert->z
        vert->slx=(lpx*vert->bx+lpy*vert->by+lpz*vert->bz)*vert->sz
        vert->sly=(lpx*vert->tx+lpy*vert->ty+lpz*vert->tz)*vert->sz
        vert->slz=(lpx*vert->nx+lpy*vert->ny+lpz*vert->nz)*vert->sz
        
        dim as single d=1.0'/sqr(vert->slx*vert->slx+vert->sly*vert->sly+vert->slz*vert->slz)
        vert->slx*=d
        vert->sly*=d
        vert->slz*=d
        
        dim as single cpx=cx-vert->x,cpy=cy-vert->y,cpz=cz-vert->z
        vert->scx=(cpx*vert->bx+cpy*vert->by+cpz*vert->bz)*vert->sz
        vert->scy=(cpx*vert->tx+cpy*vert->ty+cpz*vert->tz)*vert->sz
        vert->scz=(cpx*vert->nx+cpy*vert->ny+cpz*vert->nz)*vert->sz
        
        d=1.0'/sqr(vert->scx*vert->scx+vert->scy*vert->scy+vert->scz*vert->scz)
        vert->scx*=d
        vert->scy*=d
        vert->scz*=d
        
        vert=vert->next_vertex
    wend
end sub

sub render_mesh(byval dest as gfx_buffer pointer,byval mesh as entity pointer)
    dim as triangle pointer tri=mesh->model->first_triangle
    while tri<>0
        if (tri->v0->rz>4.0)and(tri->v1->rz>4.0)and(tri->v2->rz>4.0) then
            dim as single dx0=tri->v1->sx-tri->v0->sx
            dim as single dy0=tri->v1->sy-tri->v0->sy
            dim as single dx1=tri->v2->sx-tri->v0->sx
            dim as single dy1=tri->v2->sy-tri->v0->sy
            if (dx0*dy1)>(dx1*dy0) then draw_triangle(dest,tri)
        end if
        tri=tri->next_triangle
    wend
end sub

sub render_world(byval dest as gfx_buffer pointer,byval world as world pointer)
    dim as entity pointer mesh=world->first_entity
    while mesh<>0
        transform_vertices(dest,world,mesh)
        render_mesh(dest,mesh)
        mesh=mesh->next_entity
    wend
end sub

function add_vertex(byval model as model ptr,_
                    byval x as single,_
                    byval y as single,_
                    byval z as single,_
                    byval u as single=0.0,_
                    byval v as single=0.0)as vertex ptr
    dim as vertex_data ptr dat=new vertex_data
    dim as vertex ptr vertex=cast(vertex ptr,(cast(ulong,dat)+15)and &hfffffff0)
    vertex->vertex_data=dat
    vertex->x=x
    vertex->y=y
    vertex->z=z
    vertex->u0=u
    vertex->v0=v
    vertex->r=1.0
    vertex->g=1.0
    vertex->b=1.0
    vertex->next_vertex=model->first_vertex
    model->first_vertex=vertex
    function=vertex
end function

sub set_vertex_colour(byval v as vertex pointer,byval r as single,byval g as single,byval b as single)
    v->r=r
    v->g=g
    v->b=b
end sub

sub set_vertex_texture(byval vertex as vertex ptr,byval u as single,byval v as single)
    vertex->u0=u
    vertex->v0=v
end sub

function add_triangle(byval model as model ptr,_
                        byval v0 as vertex ptr,_
                        byval v1 as vertex ptr,_
                        byval v2 as vertex ptr,_
                        byval texture as gfx_buffer ptr)as triangle ptr
    dim as triangle ptr triangle=new triangle
    triangle->v0=v0
    triangle->v1=v1
    triangle->v2=v2
    triangle->argb=rnd*&hffffff
    triangle->texture=texture
    triangle->next_triangle=model->first_triangle
    model->first_triangle=triangle
    function=triangle
end function

sub normalise_model(byval model as model ptr)
    dim as triangle ptr tri=model->first_triangle
    while tri<>0
        dim as vertex ptr v0=tri->v0,v1=tri->v1,v2=tri->v2
        
        dim as single vx0=v1->x-v0->x,vy0=v1->y-v0->y,vz0=v1->z-v0->z
        dim as single vx1=v2->x-v0->x,vy1=v2->y-v0->y,vz1=v2->z-v0->z
        
        dim as single d'=1.0/sqr(vx0*vx0+vy0*vy0+vz0*vz0)
        'vx0*=d
        'vy0*=d
        'vz0*=d
        'd=1.0/sqr(vx1*vx1+vy1*vy1+vz1*vz1)
        'vx1*=d
        'vy1*=d
        'vz1*=d
        
        tri->nx=vy0*vz1-vz0*vy1
        tri->ny=vz0*vx1-vx0*vz1
        tri->nz=vx0*vy1-vy0*vx1
        'd=1.0/sqr(tri->nx^2+tri->ny^2+tri->nz^2)
        'tri->nx*=d
        'tri->ny*=d
        'tri->nz*=d
        
        dim as single du0=v1->u0-v0->u0
        dim as single dv0=v1->v0-v0->v0
        dim as single du1=v2->u0-v0->u0
        dim as single dv1=v2->v0-v0->v0
        
        'd=1.0/sqr(du0*du0+dv0*dv0)
        'du0*=d
        'dv0*=d
        'd=1.0/sqr(du1*du1+dv1*dv1)
        'du1*=d
        'dv1*=d
        
        d=1.0/(du0*dv1-du1*dv0)
        tri->bx=d*(-dv1*vx0+dv0*vx1)
        tri->by=d*(-dv1*vy0+dv0*vy1)
        tri->bz=d*(-dv1*vz0+dv0*vz1)
        tri->tx=d*(du1*vx0-du0*vx1)
        tri->ty=d*(du1*vy0-du0*vy1)
        tri->tz=d*(du1*vz0-du0*vz1)
        'd=1.0/sqr(tri->tx*tri->tx+tri->ty*tri->ty+tri->tz*tri->tz)
        'tri->tx*=d
        'tri->ty*=d
        'tri->tz*=d
        'd=1.0/sqr(tri->bx*tri->bx+tri->by*tri->by+tri->bz*tri->bz)
        'tri->bx*=d
        'tri->by*=d
        'tri->bz*=d
        
        d=1/sqr(vx0^2+vy0^2+vz0^2)
        vx0*=d:vy0*=d:vz0*=d
        d=1/sqr(vx1^2+vy1^2+vz1^2)
        vx1*=d:vy1*=d:vz1*=d
        d=acos(vx0*vx1+vy0*vy1+vz0*vz1)
        v0->nx+=d*tri->nx
        v0->ny+=d*tri->ny
        v0->nz+=d*tri->nz
        v0->tx+=d*tri->tx
        v0->ty+=d*tri->ty
        v0->tz+=d*tri->tz
        v0->bx+=d*tri->bx
        v0->by+=d*tri->by
        v0->bz+=d*tri->bz
        vx0=v0->x-v1->x:vy0=v0->y-v1->y:vz0=v0->z-v1->z
        vx1=v2->x-v1->x:vy1=v2->y-v1->y:vz1=v2->z-v1->z
        d=1/sqr(vx0^2+vy0^2+vz0^2)
        vx0*=d:vy0*=d:vz0*=d
        d=1/sqr(vx1^2+vy1^2+vz1^2)
        vx1*=d:vy1*=d:vz1*=d
        d=acos(vx0*vx1+vy0*vy1+vz0*vz1)
        v1->nx+=d*tri->nx
        v1->ny+=d*tri->ny
        v1->nz+=d*tri->nz
        v1->tx+=d*tri->tx
        v1->ty+=d*tri->ty
        v1->tz+=d*tri->tz
        v1->bx+=d*tri->bx
        v1->by+=d*tri->by
        v1->bz+=d*tri->bz
        vx0=v0->x-v2->x:vy0=v0->y-v2->y:vz0=v0->z-v2->z
        vx1=v1->x-v2->x:vy1=v1->y-v2->y:vz1=v1->z-v2->z
        d=1/sqr(vx0^2+vy0^2+vz0^2)
        vx0*=d:vy0*=d:vz0*=d
        d=1/sqr(vx1^2+vy1^2+vz1^2)
        vx1*=d:vy1*=d:vz1*=d
        d=acos(vx0*vx1+vy0*vy1+vz0*vz1)
        v2->nx+=d*tri->nx
        v2->ny+=d*tri->ny
        v2->nz+=d*tri->nz
        v2->tx+=d*tri->tx
        v2->ty+=d*tri->ty
        v2->tz+=d*tri->tz
        v2->bx+=d*tri->bx
        v2->by+=d*tri->by
        v2->bz+=d*tri->bz
        
        tri=tri->next_triangle
    wend
    dim as vertex ptr vert=model->first_vertex
    while vert<>0
        dim as single d=1.0/sqr(vert->nx^2+vert->ny^2+vert->nz^2)
        vert->nx*=d
        vert->ny*=d
        vert->nz*=d
        d=1.0/sqr(vert->tx^2+vert->ty^2+vert->tz^2)
        vert->tx*=d
        vert->ty*=d
        vert->tz*=d
        d=1.0/sqr(vert->bx^2+vert->by^2+vert->bz^2)
        vert->bx*=d
        vert->by*=d
        vert->bz*=d
        vert=vert->next_vertex
    wend
    
end sub

sub translate_entity(byval entity as entity pointer,byval x as single,byval y as single,byval z as single)
    entity->m[16]+=x
    entity->m[17]+=y
    entity->m[18]+=z
end sub

sub position_entity(byval entity as entity pointer,byval x as single,byval y as single,byval z as single)
    entity->m[16]=x
    entity->m[17]=y
    entity->m[18]=z
end sub

sub move_entity(entity as entity pointer,x as single,y as single,z as single)  
    entity->m[16]+=x*entity->m[0]+y*entity->m[1]+z*entity->m[2]
    entity->m[17]+=x*entity->m[4]+y*entity->m[5]+z*entity->m[6]
    entity->m[18]+=x*entity->m[8]+y*entity->m[9]+z*entity->m[10]
end sub

sub rotate_entity(byval entity as entity pointer,byval b as single,byval a as single,byval c as single)  
    a*=3.1415926/180.0
    b*=3.1415926/180.0
    c*=3.1415926/180.0
    dim csa as single=cos(a)
    dim sna as single=sin(a)
    dim csb as single=cos(b)
    dim snb as single=sin(b)
    dim csc as single=cos(-c)
    dim snc as single=sin(-c)
    dim x as single
    dim y as single
    dim z as single
    
    x=entity->m[0]*csa+entity->m[8]*sna
    z=entity->m[8]*csa-entity->m[0]*sna
    entity->m[8]=z*csb+entity->m[4]*snb
    y=entity->m[4]*csb-z*snb
    entity->m[0]=x*csc+y*snc
    entity->m[4]=y*csc-x*snc
    
    x=entity->m[1]*csa+entity->m[9]*sna
    z=entity->m[9]*csa-entity->m[1]*sna
    entity->m[9]=z*csb+entity->m[5]*snb
    y=entity->m[5]*csb-z*snb
    entity->m[1]=x*csc+y*snc
    entity->m[5]=y*csc-x*snc
    
    x=entity->m[2]*csa+entity->m[10]*sna
    z=entity->m[10]*csa-entity->m[2]*sna
    entity->m[10]=z*csb+entity->m[6]*snb
    y=entity->m[6]*csb-z*snb
    entity->m[2]=x*csc+y*snc
    entity->m[6]=y*csc-x*snc
end sub

sub turn_entity(byval entity as entity pointer,byval b as single,byval a as single,byval c as single)  
    a*=3.1415926/180.0
    b*=3.1415926/180.0
    c*=3.1415926/180.0
    dim csa as single=cos(-a)
    dim sna as single=sin(-a)
    dim csb as single=cos(-b)
    dim snb as single=sin(-b)
    dim csc as single=cos(c)
    dim snc as single=sin(c)
    dim x as single
    dim y as single
    dim z as single
    x=entity->m[0]*csa+entity->m[2]*sna
    z=entity->m[2]*csa-entity->m[0]*sna
    entity->m[2]=z*csb+entity->m[1]*snb
    y=entity->m[1]*csb-z*snb
    entity->m[0]=x*csc+y*snc
    entity->m[1]=y*csc-x*snc
    x=entity->m[4]*csa+entity->m[6]*sna
    z=entity->m[6]*csa-entity->m[4]*sna
    entity->m[6]=z*csb+entity->m[5]*snb
    y=entity->m[5]*csb-z*snb
    entity->m[4]=x*csc+y*snc
    entity->m[5]=y*csc-x*snc
    x=entity->m[8]*csa+entity->m[10]*sna
    z=entity->m[10]*csa-entity->m[8]*sna
    entity->m[10]=z*csb+entity->m[9]*snb
    y=entity->m[9]*csb-z*snb
    entity->m[8]=x*csc+y*snc
    entity->m[9]=y*csc-x*snc
end sub

sub copy_rotation(byval dst as entity ptr,_
                    byval src as entity ptr)
    dst->m[0]=src->m[0]
    dst->m[1]=src->m[1]
    dst->m[2]=src->m[2]
    dst->m[4]=src->m[4]
    dst->m[5]=src->m[5]
    dst->m[6]=src->m[6]
    dst->m[8]=src->m[8]
    dst->m[9]=src->m[9]
    dst->m[10]=src->m[10]
end sub

sub copy_position(byval dst as entity ptr,_
                    byval src as entity ptr)
    dst->m[16]=src->m[16]
    dst->m[17]=src->m[17]
    dst->m[18]=src->m[18]
end sub

function create_test_texture(byval w as long,byval h as long)as gfx_buffer ptr
    dim as gfx_buffer ptr texture=new gfx_buffer(w,h)
    enable_normal(texture)
    dim as single height(0 to w-1,0 to h-1)
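    dim as single min=-100  'despite the name this tracks the largest height, which is subtracted below so all heights end up <=0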
    for y as integer=0 to h-1
        for x as integer=0 to w-1
            dim as single r=sqr((y-h/2)^2+(x-w/2)^2)
            height(x,y)=20*cos(r*0.1)'+rnd*20
            if ((x and 15)<5)or((y and 15)<5) then height(x,y)=-20
            height(x,y)+=rnd*10
            if (r>100)and(r<120) then height(x,y)=50-10*abs(r-110)
            height(x,y)+=(150-r)*sin(r*.2)
            'height(x,y)+=rnd*10
            
            
            
            if height(x,y)>min then min=height(x,y)
            
            
        next
    next
    for y as integer=0 to h-1
        for x as integer=0 to w-1
            height(x,y)-=min
            
            dim as single hh
            hh=-x*8
            if hh>height(x,y) then height(x,y)=hh
            hh=-y*8
            if hh>height(x,y) then height(x,y)=hh
            hh=-((w-1)-x)*8
            if hh>height(x,y) then height(x,y)=hh
            hh=-((h-1)-y)*8
            if hh>height(x,y) then height(x,y)=hh
            '(rnd*8*sin(y*.1))xor x
            'if (abs(x-128)<90)and(abs(y-128)<90) then height(x,y)-=200
        next
    next
    for y as long=0 to h-1
        for x as long=0 to w-1
            set_pixel(texture,x,y,(rnd*&hffffff)or &hff000000)
            if height(x,y)>(1-min) then set_pixel(texture,x,y,&hffff0000)
            if ((x>40)and(x<80))or((y>40)and(y<80)) then set_pixel(texture,x,y,&hffffff00)
            
            'set_pixel(texture,x,y,get_pixel(texture,x,y)and &hffffff)
            'set_pixel(texture,x,y,&hffffff00)
            'if ((x and 15)=0)or((y and 15)=0) then set_pixel(texture,x,y,0)'&hffff00)
            
            dim as single dx,dy
            
            dx=height((x+1) mod w,y)-height((x+w-1) mod w,y)    'wrap at the texture edges
            dy=height(x,(y+1) mod h)-height(x,(y+h-1) mod h)
            set_normal(texture,x,y,dx,dy,40,abs(height(x,y)*.1))
            if abs(height(x,y)*.5)>255 then end
        next
    next
    function=texture
end function

function create_cylinder_model(byval world as world ptr,byval size as single)as model ptr
    dim as gfx_buffer ptr texture=create_test_texture(256,256)
    dim as vertex ptr v(0 to 24,0 to 3)
    dim as model ptr model=create_model(world)
    dim as single r=size
    for a as integer=0 to 24
        dim as single x=r*sin(a*15.0/180.0*3.1415926)
        dim as single z=r*cos(a*15.0/180.0*3.1415926)
        v(a,0)=add_vertex(model,x,size,z,a/8.0,0)
        v(a,1)=add_vertex(model,x,-size,z,a/8.0,1)
        v(a,2)=add_vertex(model,x,size,z,0.5+0.5*sin(a*15.0/180.0*3.1415926),0.5+0.5*cos(a*15.0/180.0*3.1415926))
        v(a,3)=add_vertex(model,x,-size,z,0.5+0.5*sin(a*15.0/180.0*3.1415926),0.5+0.5*cos(a*15.0/180.0*3.1415926))
        for i as integer=0 to 3
            set_vertex_colour(v(a,i),0.5,0.5,.2)
        next
    next
    dim as vertex ptr t=add_vertex(model,0,size*1,0,0.5,0.5)
    dim as vertex ptr b=add_vertex(model,0,-size*1,0,0.5,0.5)
    for a as integer=0 to 23
        add_triangle(model,v(a,0),v(a,1),v(a+1,1),texture)
        add_triangle(model,v(a+1,1),v(a+1,0),v(a,0),texture)
        
        add_triangle(model,t,v(a,2),v(a+1,2),texture)
        add_triangle(model,v(a,3),b,v(a+1,3),texture)
    next
    normalise_model(model)
    return model
end function

function create_cylinder_model2(byval world as world ptr,byval size as single)as model ptr
    dim as gfx_buffer ptr texture=create_test_texture(256,256)
    dim as vertex ptr v(0 to 1,0 to 1)
    dim as model ptr model=create_model(world)
    dim as single r=size
    for a as integer=0 to 23
        dim as single x=r*sin(a*15.0/180.0*3.1415926)
        dim as single z=r*cos(a*15.0/180.0*3.1415926)
        v(0,0)=add_vertex(model,x,size,z,1,a/8.0)
        v(0,1)=add_vertex(model,x,-size,z,0,a/8.0)
        x=r*sin((a+1)*15.0/180.0*3.1415926)
        z=r*cos((a+1)*15.0/180.0*3.1415926)
        v(1,0)=add_vertex(model,x,size,z,1,(a+1)/8.0)
        v(1,1)=add_vertex(model,x,-size,z,0,(a+1)/8.0)
        add_triangle(model,v(0,0),v(0,1),v(1,1),texture)
        add_triangle(model,v(1,1),v(1,0),v(0,0),texture)
    next
    
    normalise_model(model)
    return model
end function

function create_cube_model(byval world as world ptr,byval size as single)as model ptr
    dim as single coff=1
    dim as gfx_buffer ptr texture=create_test_texture(256,256)
    dim as model ptr model=create_model(world)
    dim as vertex ptr v0,v1,v2,v3,v4
    v0=add_vertex(model,-size, size,-size,0,0)
    v1=add_vertex(model, size, size,-size,1,0)
    v2=add_vertex(model, size,-size,-size,1,1)
    v3=add_vertex(model,-size,-size,-size,0,1)
    v4=add_vertex(model,0,0,-size*coff,.5,.5)
    'set_vertex_colour(v0,0,0,1)
    'set_vertex_colour(v1,0,1,0)
    'set_vertex_colour(v2,1,0,0)
    'set_vertex_colour(v3,1,0,1)
    
    add_triangle(model,v0,v1,v2,texture)
    add_triangle(model,v2,v3,v0,texture)
    
    'add_triangle(model,v0,v1,v4,texture)
    'add_triangle(model,v1,v2,v4,texture)
    'add_triangle(model,v2,v3,v4,texture)
    'add_triangle(model,v3,v0,v4,texture)
    
    v0=add_vertex(model,-size, size, size,0,0)
    v1=add_vertex(model,-size, size,-size,1,0)
    v2=add_vertex(model,-size,-size,-size,1,1)
    v3=add_vertex(model,-size,-size, size,0,1)
    set_vertex_colour(v0,0.5,0.5,0.5)
    set_vertex_colour(v1,0.5,0.5,0.5)
    set_vertex_colour(v2,0.5,0.5,0.5)
    set_vertex_colour(v3,0.5,0.5,0.5)
    
    add_triangle(model,v0,v1,v2,texture)
    add_triangle(model,v2,v3,v0,texture)
    
    v0=add_vertex(model, size, size, size,0,0)
    v1=add_vertex(model,-size, size, size,1,0)
    v2=add_vertex(model,-size,-size, size,1,1)
    v3=add_vertex(model, size,-size, size,0,1)
    set_vertex_colour(v0,0,0,1)
    set_vertex_colour(v1,0,0,1)
    set_vertex_colour(v2,0,0,1)
    set_vertex_colour(v3,0,0,1)
    
    add_triangle(model,v0,v1,v2,texture)
    add_triangle(model,v2,v3,v0,texture)
    
    v0=add_vertex(model, size, size,-size,0,0)
    v1=add_vertex(model, size, size, size,1,0)
    v2=add_vertex(model, size,-size, size,1,1)
    v3=add_vertex(model, size,-size,-size,0,1)
    set_vertex_colour(v0,0,0,0)
    set_vertex_colour(v1,0,0,0)
    set_vertex_colour(v2,0,0,0)
    set_vertex_colour(v3,0,0,0)
    
    add_triangle(model,v0,v1,v2,texture)
    add_triangle(model,v2,v3,v0,texture)
    
    v0=add_vertex(model,-size, size, size,0,0)
    v1=add_vertex(model, size, size, size,1,0)
    v2=add_vertex(model, size, size,-size,1,1)
    v3=add_vertex(model,-size, size,-size,0,1)
    set_vertex_colour(v0,0.2,0.2,0)
    set_vertex_colour(v1,0.2,0.2,0)
    set_vertex_colour(v2,0.2,0.2,0)
    set_vertex_colour(v3,0.2,0.2,0)
    
    add_triangle(model,v0,v1,v2,texture)
    add_triangle(model,v2,v3,v0,texture)
    
    v0=add_vertex(model,-size,-size,-size,0,0)
    v1=add_vertex(model, size,-size,-size,1,0)
    v2=add_vertex(model, size,-size, size,1,1)
    v3=add_vertex(model,-size,-size, size,0,1)
    
    add_triangle(model,v0,v1,v2,texture)
    add_triangle(model,v2,v3,v0,texture)
    normalise_model(model)
    return model
end function

sub main
    dim as gfx_buffer ptr display=new gfx_buffer(640,480,1)
    
    dim as world ptr world=create_world()
    dim as entity ptr camera=create_camera(world)
    dim as entity ptr light=create_light(world)
    position_entity light,0,3000,-500
    dim as entity ptr cylinder=create_entity(world,create_cylinder_model(world,100))
    dim as entity ptr cube=create_entity(world,create_cube_model(world,100))
    position_entity cube,-180,0,0
    position_entity cylinder,180,0,0
    move_entity camera,0,0,-650
    
    dim as double t=timer,delta=timer
    dim as string a
    while a<>chr(27)
        a=inkey
        dim as double tt=timer
        dim as double fps=1.0/(tt-t)
        t=tt
        'windowtitle(str$(fps))
        if a="a" then turn_entity cube,-1,0,0
        if a="q" then turn_entity cube,1,0,0
        if a="w" then turn_entity cube,0,-1,0
        if a="s" then turn_entity cube,0,1,0
        if a="e" then turn_entity cube,0,0,-1
        if a="d" then turn_entity cube,0,0,1
        
        while delta<timer
            turn_entity(cube,0.08,0.5,0)
            rotate_entity(cube,.11,0,.13)
            turn_entity(cylinder,0.11,-0.3,0)
            rotate_entity(cylinder,-.13,0,.07)
            delta+=.01
        wend
        
        dim as double render_start=timer    'separate name so the outer frame timer t is not shadowed
        render_world(display,world)
        windowtitle(str(timer-render_start))
        flip
        'color ,&hff00ff
        cls
    wend
    
end sub

main
end

Boromir
Posts: 463
Joined: Apr 30, 2015 19:28
Location: Oklahoma,U.S., Earth,Solar System

Re: Normal Mapping

Post by Boromir »

The code compiles, but it crashes when I run the program.
My system:
Windows XP 32-bit
1.8 GHz Pentium M CPU
Stonemonkey
Posts: 649
Joined: Jun 09, 2005 0:08

Re: Normal Mapping

Post by Stonemonkey »

What version of SSE does it support? It might be an instruction like horizontal add that's causing the crash. Sorry.
D.J.Peters
Posts: 8586
Joined: May 28, 2005 3:28

Re: Normal Mapping

Post by D.J.Peters »

@stonemonkey First, good job, it looks amazing.
Compile with -gen gcc and you will see that one dword ptr is missing:

line 861: and edx, dword ptr [uv_mask]

@Boromir I tested it on Win XP 32-bit SP3 with no problems here.
Does your Pentium M CPU have full SSE support?

Joshy
D.J.Peters
Posts: 8586
Joined: May 28, 2005 3:28

Re: Normal Mapping

Post by D.J.Peters »

@Boromir Your CPU should be OK.
From Wikipedia: https://en.wikipedia.org/wiki/Pentium_M it looks like it supports MMX, SSE and SSE2.
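
Rather than relying on a spec page, the program can ask the CPU directly. The following is a minimal sketch in C (not FreeBASIC), assuming a GCC or Clang toolchain that ships `<cpuid.h>`; the function names `cpu_has_sse3` and `cpu_has_sse2` are made up for this illustration and are not part of Stonemonkey's renderer.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Returns 1 if the CPU reports SSE3 (CPUID leaf 1, ECX bit 0), else 0. */
int cpu_has_sse3(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                      /* leaf 1 is not available */
    return (int)(ecx & 1u);            /* ECX bit 0 = SSE3 */
#else
    return 0;                          /* not an x86 build */
#endif
}

/* Returns 1 if the CPU reports SSE2 (CPUID leaf 1, EDX bit 26), else 0. */
int cpu_has_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((edx >> 26) & 1u);    /* EDX bit 26 = SSE2 */
#else
    return 0;
#endif
}
```

A renderer could call such a check once at startup and pick a code path, or exit with a message, instead of crashing on an unsupported instruction.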

@stonemonkey Do you use SSE2?
If so, all your data (inside your structs too) must be 16-byte aligned. Maybe the Pentium M is stricter in this case!

Joshy
Stonemonkey
Posts: 649
Joined: Jun 09, 2005 0:08

Re: Normal Mapping

Post by Stonemonkey »

I'm using the SSE3 instruction haddps (horizontal add), which may not be available on the Pentium M.
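
For anyone who wants to try the code on a CPU without SSE3, a horizontal add can be built from SSE1 shuffles instead. This is only a sketch in C intrinsics (not the renderer's FreeBASIC asm); `hsum4` and `dot4` are hypothetical helper names, and the `__SSE__` test assumes a GCC/Clang-style compiler, with a plain scalar fallback otherwise.

```c
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE1 intrinsics only, nothing from SSE3 */
#endif

/* Sum of the four floats at v, without haddps. */
float hsum4(const float *v)
{
#if defined(__SSE__)
    __m128 x    = _mm_loadu_ps(v);                               /* (v0, v1, v2, v3) */
    __m128 shuf = _mm_shuffle_ps(x, x, _MM_SHUFFLE(2, 3, 0, 1)); /* (v1, v0, v3, v2) */
    __m128 sums = _mm_add_ps(x, shuf);                           /* (v0+v1, v0+v1, v2+v3, v2+v3) */
    shuf = _mm_movehl_ps(shuf, sums);                            /* lane 0 = v2+v3 */
    sums = _mm_add_ss(sums, shuf);                               /* lane 0 = v0+v1+v2+v3 */
    return _mm_cvtss_f32(sums);
#else
    return (v[0] + v[1]) + (v[2] + v[3]);                        /* no SSE at all */
#endif
}

/* Dot product of two 4-float vectors: multiply lane by lane, then sum. */
float dot4(const float *a, const float *b)
{
    float p[4];
    for (int i = 0; i < 4; ++i)
        p[i] = a[i] * b[i];
    return hsum4(p);
}
```

In the asm listing each `mulps` followed by two `haddps` computes a dot product of this kind, so the same shuffle-and-add sequence could stand in for those pairs.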

My data is 16 byte aligned.
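
For reference, the alignment is done the way add_vertex does it in the listing: allocate a slightly larger block, round the address up to a multiple of 16, and keep the original pointer for freeing. Here is a rough C equivalent; `align16` and `alloc16` are illustrative names, and it uses `uintptr_t` where the FreeBASIC code casts to `ulong`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Round p up to the next 16-byte boundary (p itself if already aligned). */
void *align16(void *p)
{
    return (void *)(((uintptr_t)p + 15u) & ~(uintptr_t)15u);
}

/* Allocate size bytes with 16-byte alignment.
   *raw receives the pointer that must later be passed to free(). */
void *alloc16(size_t size, void **raw)
{
    *raw = malloc(size + 15);          /* 15 spare bytes leave room to round up */
    return *raw ? align16(*raw) : NULL;
}
```

`movaps` faults on an address that is not a multiple of 16, which is why the raw pointer and the aligned pointer have to be kept separately.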
D.J.Peters
Posts: 8586
Joined: May 28, 2005 3:28
Contact:

Re: Normal Mapping

Post by D.J.Peters »

In that case, if SSE_ is not defined your #else part is incomplete (black screen)!

Tip: use

Code: Select all

#if defined(SSE_)
  xxx
  #if defined (SSE3_)
    yyy
  #endif
#else
  ' no SSE at all
  zzz
#endif
Stonemonkey
Posts: 649
Joined: Jun 09, 2005 0:08

Re: Normal Mapping

Post by Stonemonkey »

I know, I have not yet written the equivalent FB code.
D.J.Peters
Posts: 8586
Joined: May 28, 2005 3:28

Re: Normal Mapping

Post by D.J.Peters »

Stonemonkey wrote:I know, I have not yet written the equivalent FB code.
stupid lazy nerd :lol:
Stonemonkey
Posts: 649
Joined: Jun 09, 2005 0:08

Re: Normal Mapping

Post by Stonemonkey »

D.J.Peters wrote:
Stonemonkey wrote:I know, I have not yet written the equivalent FB code.
stupid lazy nerd :lol:
Don't blame me, I'm the product of self learning in the early 80s and writing machine code into a hex pad. Just using a keyboard is a luxury.
Boromir
Posts: 463
Joined: Apr 30, 2015 19:28
Location: Oklahoma,U.S., Earth,Solar System

Re: Normal Mapping

Post by Boromir »

Powered on my desktop to test this out.
It's amazing. I've never seen software rendering this beautiful before.

So where are you headed with this? Do you plan on adding anything more to it?
Stonemonkey
Posts: 649
Joined: Jun 09, 2005 0:08

Re: Normal Mapping

Post by Stonemonkey »

Thanks. Yeah, there's lots more I want to add to it, including some stuff I've done before like shadows and cubemapping. There's still a lot to be done to get the parallax mapping working fully, though, and I want to port it to Android at some point.
vdecampo
Posts: 2992
Joined: Aug 07, 2007 23:20
Location: Maryland, USA

Re: Normal Mapping

Post by vdecampo »

Can any of these functions be generalized into a software rendering library of sorts? This is very nice work.

-Vince