OpenCL (Open Computing Language) for FreeBASIC
Re: OpenCL (Open Computing Language) for FreeBASIC
The posted code was an early test on 32-bit (fixed now).
The line:
dim as cl_int nChars
must be:
dim as size_t nChars
Joshy
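A small sketch of why the type matters: clCreateProgramWithSource takes a pointer to size_t lengths, and size_t is pointer-sized while cl_int is always 32 bits. The two alias names below are stand-ins invented for this snippet so it compiles without cl.bi; they are assumed to mirror what the binding declares.

```freebasic
' Stand-in aliases (hypothetical names) so this compiles without cl.bi.
' Assumption: cl_int is a 32-bit signed integer, size_t is pointer-sized.
type demo_cl_int as long
type demo_size_t as uinteger

print "sizeof(cl_int) = " & sizeof(demo_cl_int)
print "sizeof(size_t) = " & sizeof(demo_size_t)

#ifdef __FB_64BIT__
print "64-bit build: the API reads 8 bytes, a cl_int variable only provides 4"
#else
print "32-bit build: both are 4 bytes, so the wrong type goes unnoticed"
#endif
```

On a 32-bit build both sizes match, which would explain why the early test worked there.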
Re: OpenCL (Open Computing Language) for FreeBASIC
speedfixer wrote: Simply needed to install the dev packages, that's all.
Would you mind providing a link for that? I have been messing around with this and can't get anything to work.
Cheers,
Mike
Re: OpenCL (Open Computing Language) for FreeBASIC
Hi, Mike
To tell the truth, I have no clue which ones I chose: I simply selected those in Synaptic that looked appropriate.
I'm on Lubuntu (call it little Ubuntu: same kernel, smaller/lighter desktop, etc.).
If you are NOT on an Ubuntu system, let me know. Also, tell me what video card you have and which driver it uses.
In Synaptic, or whatever your software package manager may be, search for "opencl" and note the installed packages.
Something should be present for the video card you have installed.
If there is not, you should still have a large list of choices: libraries, tools, interfaces.
Something should match your vendor and/or your current video driver.
This particular system I am on is Nvidia.
Using the opencl search, I NOW show installed:
1 - clinfo - an OpenCL info tool - it will verify OpenCL capability
x - gdb - handy, but if you need this you know it; if you don't know it, don't bother
x - (gdbserver)
x - nvidia-cuda-doc
* - nvidia-opencl-dev
2 - nvidia-opencl-icd-367 (says transitional for 375 - I chose this)
x - nvidia-opencl-icd-375 (my nvidia driver is 375) --- choose one for YOUR driver
x - nvidia-profiler
x - nvidia-visual-profiler
* - ocl-icd-libopencl1 (should be installed with an icd)
3 - opencl-headers
x - should not be needed, but may be installed with some other choice.
Install clinfo and see what it says. It runs from the command line, no sudo needed. It may not run if your driver doesn't support OpenCL.
I would suggest installing one package at a time until the compile works. No need for more than required, right?
We would all like to know how it worked out.
David
Re: OpenCL (Open Computing Language) for FreeBASIC
Thanks for the detailed answer!
This gives me something to work with. I could really use the extra speed this would provide.
I'm running Win 10 on a newer Lenovo laptop with an AMD Radeon graphics card with an R6 chipset and 10 cores.
Cheers,
Mike
Re: OpenCL (Open Computing Language) for FreeBASIC
@h4tt3n You need the Windows AMD OpenCL SDK for your Radeon chip, that's all.
Joshy
Re: OpenCL (Open Computing Language) for FreeBASIC
Could you please post a working example
that transfers two or three float4 values to the kernel and back?
Re: OpenCL (Open Computing Language) for FreeBASIC
@12val12newakk I won't teach you OpenCL, you have to learn it yourself ;-)
But you can use my FreeBASIC OpenCL "template" as a starting point for your own experiments.
Try to understand every step of the template!
Once you have the details,
save it under a new name
and use float4 in the kernel!
Allocate the input / output buffers with the right size for float4!
type cl_float4
as cl_float x, y, z, w
end type
NOTE: float4 in host memory must be 16-byte aligned!
By the way, on github.com there are sample OpenCL kernels that solve the simple N-body problem you are working on.
Learning by doing, have fun!
Joshy
file: template.bas
Code: Select all
' file: template.bas
#include "cl.bi"
#ifndef NULL
#define NULL cptr(any ptr,0)
#endif
' GLOBALS
' Input data is stored here.
dim shared as cl_float ptr g_input
' Output data is stored here.
dim shared as cl_float ptr g_output
' Multiplier is stored in this variable
dim shared as cl_float g_multiplier
' problem size (width) for 1D algorithm
dim shared as cl_uint g_width
' problem size for 2D algorithm (width x height)
dim shared as cl_uint g_height
' The memory buffer that is used
' as input/output for OpenCL kernel
dim shared as cl_mem g_inputBuffer
dim shared as cl_mem g_outputBuffer
dim shared as cl_context g_context
dim shared as cl_device_id ptr g_devices
dim shared as cl_command_queue g_commandQueue
dim shared as cl_program g_program
' This program uses only one kernel
' and this serves as a handle to it
dim shared as cl_kernel g_kernel
' FUNCTION DECLARATIONS
' OpenCL related initialisations are done here.
' Context, Device list, Command Queue are set up.
' Calls are made to set up OpenCL memory buffers that this program uses
' and to load the programs into memory and get kernel handles.
declare sub initializeCL()
' This is called once the OpenCL context, memory etc. are set up,
' the program is loaded into memory and the kernel handles are ready.
' It sets the values for kernels' arguments and enqueues calls to the kernels
' on to the command queue and waits till the calls have finished execution.
'
' It also gets kernel start and end time if profiling is enabled.
declare sub runCLKernels()
' Releases OpenCL resources (Context, Memory etc.)
declare sub cleanupCL()
' Releases program's resources
declare sub cleanupHost()
' Prints no more than 256 elements of the given array.
' Prints full array if length is less than 256.
' Prints Array name followed by elements.
declare sub print1DArray(arrayName as string, _
arrayData as integer ptr, _
length as uinteger)
declare sub print1DFloatArray(arrayName as string, _
arrayData as cl_float ptr, _
length as uinteger)
' Host Initialization
' Allocate and initialize memory on the host
sub initializeHost
print "initializeHost"
g_width = 16
g_input = NULL
g_output = NULL
g_multiplier = 1.5
' Allocate and initialize memory used by host
dim as cl_uint sizeInBytes = g_width * sizeof(cl_float)
g_input = allocate(sizeInBytes)
if (g_input= NULL) then
print "Error: Failed to allocate input memory on host"
return
end if
g_output = callocate(sizeInBytes)
if (g_output=NULL) then
print "Error: Failed to allocate output memory on host"
return
end if
' fill the input array
for i as cl_uint = 0 to g_width-1
g_input[i] = i
next
end sub
' OpenCL related initialization
' Create Context, Device list, Command Queue
' Create OpenCL memory buffer objects
' CL code, compile, link CL source
' Build program and kernel objects
sub initializeCL
print "initializeCL"
dim as cl_int status = 0
dim as size_t deviceListSize
dim as cl_uint numPlatforms
status = clGetPlatformIDs(0, NULL, @numPlatforms)
if (status<>CL_SUCCESS) or (numPlatforms<1) then
print "Error: Getting Platforms. (clGetPlatformIDs)"
return
end if
dim as cl_platform_id platform
status = clGetPlatformIDs(1, @platform, NULL)
if (status<>CL_SUCCESS) then
print "Error: Getting Platform Ids. (clGetPlatformIDs)"
return
end if
dim as cl_context_properties cps(2)
cps(0) = CL_CONTEXT_PLATFORM
cps(1) = cast(cl_context_properties, platform)
cps(2) = NULL
' Create an OpenCL context from platform ID
g_context = clCreateContextFromType(@cps(0), _
CL_DEVICE_TYPE_GPU, _
NULL, _
NULL, _
@status)
if (status<>CL_SUCCESS) then
print "Error: Creating Context. (clCreateContextFromType)"
return
end if
' First, get the size of device list data
status = clGetContextInfo(g_context, _
CL_CONTEXT_DEVICES, _
0, _
NULL, _
@deviceListSize)
if(status <> CL_SUCCESS) then
print "Error: Getting Context Info (device list size, clGetContextInfo)"
return
else
print "Info: deviceListSize = " & deviceListSize
end if
' Detect OpenCL devices
g_devices = callocate(deviceListSize)
' Now, get the device list data
status = clGetContextInfo(g_context, _
CL_CONTEXT_DEVICES, _
deviceListSize, _
g_devices, _
NULL)
if (status<>CL_SUCCESS) then
print "Error: Getting Context Info (device list, clGetContextInfo)"
return
end if
' Create an OpenCL command queue
g_commandQueue = clCreateCommandQueue(g_context, _
g_devices[0], _
0, _
@status)
if (status<>CL_SUCCESS) then
print "Error: Creating Command Queue. (clCreateCommandQueue)"
return
end if
' Create OpenCL memory buffers
g_inputBuffer = clCreateBuffer(g_context, _
CL_MEM_READ_WRITE or CL_MEM_USE_HOST_PTR, _
sizeof(cl_float) * g_width, _
g_input, _
@status)
if (status<>CL_SUCCESS) then
print "Error: clCreateBuffer (inputBuffer)"
return
end if
g_outputBuffer = clCreateBuffer(g_context, _
CL_MEM_READ_WRITE or CL_MEM_USE_HOST_PTR, _
sizeof(cl_float) * g_width, _
g_output, _
@status)
if (status<>CL_SUCCESS) then
print "Error: clCreateBuffer (outputBuffer)"
return
end if
' build CL program object
' create CL kernel object
dim as string strSource
strSource &= !"// Sample kernel which multiplies every element of the input array \n"
strSource &= !"// with a constant and stores it at the corresponding output array \n"
strSource &= !"__kernel void FloatTestKernel(__global float * output, \n"
strSource &= !" __global float * input, \n"
strSource &= !" const float multiplier) { \n"
strSource &= !" uint tid = get_global_id(0); \n"
strSource &= !" output[tid] = input[tid] * multiplier; \n"
strSource &= !"} \n"
dim as zString ptr pSource = strptr(strSource)
dim as size_t nChars = len(strSource)
g_program = clCreateProgramWithSource(g_context, _
1, _
@pSource, _
@nChars, _
@status)
if (status<>CL_SUCCESS) then
print "Error: (clCreateProgramWithSource) !"
beep:sleep:return
end if
' create a cl program executable
' for the devices specified (one device in this case)
status = clBuildProgram(g_program, 1, g_devices, NULL, NULL, NULL)
if (status<>CL_SUCCESS) then
print "Error: Building Program (clBuildProgram)"
beep:sleep:return
end if
' get a kernel object handle for
' a kernel with the given name
g_kernel = clCreateKernel(g_program, "FloatTestKernel", @status)
if (status<>CL_SUCCESS) then
print "Error: Creating Kernel from program. (clCreateKernel)"
return
end if
end sub
' Run OpenCL program
' Bind host variables to kernel arguments
' Run the CL kernel
sub runCLKernels ()
print "runCLKernels"
dim as cl_int status
dim as cl_event events(1)
dim as size_t globalThreads(1)
dim as size_t localThreads(0)
globalThreads(0) = g_width
localThreads(0) = 1
' Set appropriate arguments to the kernel
' ARG 1 the output array to the kernel
status = clSetKernelArg(g_kernel, _
0, _
sizeof(cl_mem), _
@g_outputBuffer)
if (status<>CL_SUCCESS) then
print "Error: Setting kernel argument. (output)"
return
end if
' ARG 2 the input array to the kernel
status = clSetKernelArg(g_kernel, _
1, _
sizeof(cl_mem), _
@g_inputBuffer )
if (status<>CL_SUCCESS) then
print "Error: Setting kernel argument. (input)"
return
end if
' ARG 3 the multiplier
status = clSetKernelArg(g_kernel, _
2, _
sizeof(cl_float), _
@g_multiplier )
if (status<>CL_SUCCESS) then
print "Error: Setting kernel argument. (multiplier)"
return
end if
' Enqueue a kernel run call.
status = clEnqueueNDRangeKernel(g_commandQueue, _
g_kernel, _
1, _
NULL, _
@globalThreads(0), _
@localThreads(0), _
0, _
NULL, _
@events(0))
if (status<>CL_SUCCESS) then
print "Error: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)"
return
end if
' wait for the kernel call to finish execution
status = clWaitForEvents(1, @events(0))
if (status<>CL_SUCCESS) then
print "Error: Waiting for kernel run to finish. (clWaitForEvents)"
return
end if
status = clReleaseEvent(events(0))
if (status<>CL_SUCCESS) then
print "Error: clReleaseEvent. (events[0])"
return
end if
' Enqueue readBuffer
status = clEnqueueReadBuffer(g_commandQueue, _
g_outputBuffer, _
CL_TRUE, _
0, _
g_width * sizeof(cl_float), _
g_output, _
0, _
NULL, _
@events(1) )
if (status<>CL_SUCCESS) then
print "Error: clEnqueueReadBuffer failed. (clEnqueueReadBuffer)"
' events(1) is not valid after a failed enqueue, so do not wait on it
return
end if
' Wait for the read buffer to finish execution
status = clWaitForEvents(1, @events(1))
if (status<>CL_SUCCESS) then
print "Error: Waiting for read buffer call to finish. (clWaitForEvents)"
return
end if
status = clReleaseEvent(events(1))
if (status<>CL_SUCCESS) then
print "Error: clReleaseEvent. (events[1])"
return
end if
end sub
' Release OpenCL resources (Context, Memory etc.)
sub cleanupCL
print "cleanupCL"
dim as cl_int status
status = clReleaseKernel(g_kernel)
if (status<>CL_SUCCESS) then
print "Error: In clReleaseKernel"
return
end if
status = clReleaseProgram(g_program)
if (status<>CL_SUCCESS) then
print "Error: In clReleaseProgram"
return
end if
status = clReleaseMemObject(g_inputBuffer)
if (status<>CL_SUCCESS) then
print "Error: In clReleaseMemObject (inputBuffer)"
return
end if
status = clReleaseMemObject(g_outputBuffer)
if (status<>CL_SUCCESS) then
print "Error: In clReleaseMemObject (outputBuffer)"
return
end if
status = clReleaseCommandQueue(g_commandQueue)
if (status<>CL_SUCCESS) then
print "Error: In clReleaseCommandQueue"
return
end if
status = clReleaseContext(g_context)
if (status<>CL_SUCCESS) then
print "Error: In clReleaseContext"
return
end if
end sub
' Releases program's resources
sub cleanupHost ()
print "cleanupHost"
if (g_input<>NULL) then
deallocate g_input
g_input = NULL
end if
if (g_output<>NULL) then
deallocate g_output
g_output = NULL
end if
if (g_devices<>NULL) then
deallocate g_devices
g_devices = NULL
end if
end sub
' Print no more than 256 elements of the given array.
' Print Array name followed by elements.
sub print1DArray(arrayName as string, _
arrayData as integer ptr, _
length as uinteger)
dim as cl_uint i
dim as cl_uint numElementsToPrint = iif(length>256,256,length)
print arrayName
for i = 0 to numElementsToPrint-1
print arrayData[i] & " ";
next
print
end sub
sub print1DFloatArray(arrayName as string, _
arrayData as cl_float ptr, _
length as uinteger)
dim as cl_uint i
dim as cl_uint numElementsToPrint = iif(length>256,256,length)
print arrayName
for i = 0 to numElementsToPrint-1
print arrayData[i] & " ";
next
print
end sub
'
' main
'
' Initialize Host application
initializeHost()
' Initialize OpenCL resources
initializeCL()
' print input array
print1DFloatArray("Input",g_input,g_width)
' Run the CL program
runCLKernels()
' Print output array
print1DFloatArray("Output",g_output,g_width)
' Releases OpenCL resources
cleanupCL()
' Release host resources
cleanupHost()
sleep
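On the 16-byte alignment note above: the template hands its host arrays to clCreateBuffer with CL_MEM_USE_HOST_PTR, so the address of a float4 array matters there. One way to get an aligned block from the plain allocator is to over-allocate and round the address up. This is only a sketch; the type name float4_t and the constant N are made up for the example, and no OpenCL call is involved.

```freebasic
type float4_t            ' hypothetical name, same layout as the cl_float4 type above
  as single x, y, z, w
end type

const N = 4

' over-allocate by 15 bytes, then round the address up to the next multiple of 16
dim as any ptr raw = callocate(N * sizeof(float4_t) + 15)
if raw = 0 then end 1
dim as uinteger addr = (cast(uinteger, raw) + 15) and (not cast(uinteger, 15))
dim as float4_t ptr p = cast(float4_t ptr, addr)

p[0].x = 1 : p[0].y = 2 : p[0].z = 3 : p[0].w = 4
print "sizeof(float4_t) = " & sizeof(float4_t)
print "address mod 16   = " & (addr mod 16)

' always free the pointer that callocate returned, not the aligned one
deallocate raw
```

The aligned pointer p is what would be passed as the host pointer; raw is kept only for the final deallocate.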
Re: OpenCL (Open Computing Language) for FreeBASIC
I tested float4 successfully for you!
Joshy
Code: Select all
' file: float4.bas
' test of: float4
#include once "cl.bi"
sub fail(errcode as cl_int,msg as string)
if (errcode<>CL_SUCCESS) then
if ScreenPtr() then sleep 2000,1:screen 0
print msg
print getCLErrorString(errcode)
beep:sleep:end
end if
end sub
function GetPlatformID() as cl_platform_id
dim as cl_platform_id platformid
dim as cl_int failed = clGetPlatformIDs(1,@platformid, NULL)
if failed then fail(failed,"Unable to enumerate platform IDs")
return platformid
end function
function GetDeviceID(byval platformid as cl_platform_id, _
byval devicetype as cl_device_type=CL_DEVICE_TYPE_GPU) as cl_device_id
dim as cl_device_id deviceid
dim as cl_int failed = clGetDeviceIDs(platformid,devicetype,1,@deviceid,NULL)
'if (failed = CL_DEVICE_NOT_FOUND) andalso (devicetype=CL_DEVICE_TYPE_GPU) then
' try CPU also
' failed = clGetDeviceIDs(platformid,CL_DEVICE_TYPE_CPU,1,@deviceid,NULL)
'end if
if failed then fail(failed,"Unable to enumerate GPU/CPU device IDs")
return deviceid
end function
function CreateContext(deviceid as cl_device_id) as cl_context
dim as cl_int failed
dim as cl_context context = clCreateContext(0,1,@deviceid,NULL,NULL,@failed)
if failed then fail(failed,"Unable to create context")
return context
end function
function CreateCommandQueue(byval context as cl_context, _
byval deviceid as cl_device_id) as cl_command_queue
dim as cl_int failed
dim as cl_command_queue queue = clCreateCommandQueue(context,deviceid,0,@failed)
if failed then fail(failed,"Unable to create command queue")
return queue
end function
function CreateProgram(byval context as cl_context, _
byval deviceid as cl_device_id, _
byval sourcecode as string) as cl_program
dim as cl_int failed
dim as zstring ptr pCode = strptr(sourcecode)
dim as cl_program program = clCreateProgramWithSource(context,1,@pCode,NULL,@failed)
if failed or program=NULL then
fail(failed,"Unable to create program from source code")
end if
failed = clBuildProgram(program,0,NULL,NULL,NULL,NULL)
if failed then
dim as size_t nChars
clGetProgramBuildInfo(program,deviceid,CL_PROGRAM_BUILD_LOG,0,NULL,@nChars)
dim as string strLog=space(nChars)
clGetProgramBuildInfo(program,deviceid,CL_PROGRAM_BUILD_LOG,nChars,strptr(strLog),NULL)
fail(failed,!"Unable to build program " & strLog)
end if
return program
end function
function CreateKernel(byval program as cl_program, _
byval kernelname as zstring ptr) as cl_kernel
dim as cl_int failed
dim as cl_kernel kernel = clCreateKernel(program,kernelname,@failed)
if failed then
clReleaseProgram(program)
fail(failed,"Unable to create kernel")
end if
return kernel
end function
function CreateReadBuffer(byval context as cl_context, _
byval nBytes as size_t) as cl_mem
dim as cl_int failed
dim as cl_mem p = clCreateBuffer(context,CL_MEM_READ_ONLY,nBytes,NULL,@failed)
if failed then fail(failed,"Unable to create READ_ONLY buffer")
return p
end function
function CreateWriteBuffer(byval context as cl_context, _
byval nBytes as size_t) as cl_mem
dim as cl_int failed
dim as cl_mem p = clCreateBuffer(context,CL_MEM_WRITE_ONLY,nBytes,NULL,@failed)
if failed then fail(failed,"Unable to create WRITE_ONLY buffer")
return p
end function
function CreateReadWriteBuffer(byval context as cl_context, _
byval nBytes as size_t) as cl_mem
dim as cl_int failed
dim as cl_mem p = clCreateBuffer(context,CL_MEM_READ_WRITE,nBytes,NULL,@failed)
if failed then fail(failed,"Unable to create READ_WRITE buffer")
return p
end function
sub SetKernelArg(byval kernel as cl_kernel, _
byval nArgument as cl_uint, _
byval ArgSize as size_t, _
byval pValue as any ptr)
dim as cl_int failed = clSetKernelArg(kernel,nArgument,ArgSize,pValue)
if failed then fail(failed,"Unable to set kernel arg " & nArgument)
end sub
function GetKernelWorkGroupSize(byval kernel as cl_kernel, _
byval deviceid as cl_device_id) as size_t
dim as size_t WorkGroupSize
dim as cl_int failed = clGetKernelWorkGroupInfo(kernel,deviceid,CL_KERNEL_WORK_GROUP_SIZE, sizeof(size_t),@WorkGroupSize,NULL)
if failed then fail(failed,"Unable to get kernel work-group size")
return WorkGroupSize
end function
function GetDeviceMaxWorkItemDimensions(byval deviceid as cl_device_id) as cl_uint
' CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS is reported as a cl_uint, not a size_t
dim as cl_uint dimensions
clGetDeviceInfo(deviceid,CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS,sizeof(cl_uint),@dimensions,NULL)
return dimensions
end function
function GetDeviceMaxWorkItemSizes(byval deviceid as cl_device_id, byval dimension as cl_uint) as size_t
var maxdims = GetDeviceMaxWorkItemDimensions(deviceid)
if dimension>=maxdims then return 0 ' valid indices are 0 .. maxdims-1
dim as size_t sizes(maxdims-1)
clGetDeviceInfo(deviceid,CL_DEVICE_MAX_WORK_ITEM_SIZES,maxdims*sizeof(size_t),@sizes(0),NULL)
return sizes(dimension)
end function
'
' main
'
#if 1
type cl_float4
as cl_float x,y,z,w
end type
#else
type cl_float4
union : as cl_float x,r,s : end union
union : as cl_float y,g,t : end union
union : as cl_float z,b,u : end union
union : as cl_float w,a,v : end union
end type
#endif
dim as string code
code = !"__kernel void testFloat4( __global float4 *pBuffer, uint uWidth, uint uHeight) { \n"
code &= !" uint id = get_global_id(0); \n"
code &= !" float x = (float)(id % uWidth) / (float)(uWidth); \n"
code &= !" float y = (float)(id / uWidth) / (float)(uHeight); \n"
code &= !" pBuffer[id] = (float4)(x,y,1-x,1); \n"
code &= !"} \n"
var platform = GetPlatformID()
var device = GetDeviceID(platform)
var context = CreateContext(device)
var queue = CreateCommandQueue(context,device)
var program = CreateProgram(context,device,code)
var kernel = CreateKernel(program,"testFloat4")
var workGroupSize = GetKernelWorkGroupSize(kernel,device)
'var nDimensions = GetDeviceMaxWorkItemDimensions(device)
'for i as integer=0 to nDimensions-1
' print "workdim[" & i & "] = " & GetDeviceMaxWorkItemSizes(device,i)
'next
dim as integer w=256,h=256
'screeninfo w,h
'w*=.5 : h*=.5
dim as size_t globalSize = w*h
dim as cl_uint uWidth=w,uHeight=h
screenres w,h,32
screenset 1,0
var img = ImageCreate(w,h)
dim as ubyte ptr pPixel,pImagePixels
dim as integer iPitch
imageinfo img,,,,iPitch,pImagePixels
iPitch shr=2 ' bytes to pixels
' the kernel writes into this buffer, so it must not be CL_MEM_READ_ONLY
var pBuffer = CreateWriteBuffer(context,globalSize*sizeof(cl_float4))
SetKernelArg(kernel, 0, sizeof(cl_mem) ,@pBuffer)
SetKernelArg(kernel, 1, sizeof(cl_uint),@uWidth)
SetKernelArg(kernel, 2, sizeof(cl_uint),@uHeight)
dim as integer frames,fps
dim as double tNow,tLast=Timer
while inkey()=""
' run the kernel
dim as cl_int failed = clEnqueueNDRangeKernel(queue,kernel,1,NULL,@globalSize, NULL, 0, NULL, NULL)
if failed then fail(failed,"Unable to enqueue kernel")
' map the GPU buffer in HOST memory space
dim as cl_float4 ptr pFloat4 = clEnqueueMapBuffer(queue, pBuffer,CL_TRUE,CL_MAP_READ,0,globalSize*sizeof(cl_float4), 0, NULL, NULL, @failed)
if failed then fail(failed,"Unable to map buffer")
' read the mapped memory
pPixel = pImagePixels
for i as integer = 0 to globalSize-1
pPixel[2] = pFloat4[i].x*255
pPixel[1] = pFloat4[i].y*255
pPixel[0] = pFloat4[i].z*255
pPixel[3] = pFloat4[i].w*255
pPixel+=4
next
' put the image on screen
draw string img,(0,0),"frame: " & frames & " fps: " & fps,RGB(255,255,255)
put (0,0),img,PSET
' unmap the memory
failed = clEnqueueUnmapMemObject( queue, pBuffer, pFloat4, 0, NULL, NULL)
if failed then fail(failed,"Unable to unmap buffer")
' prepare the queue for next loop
failed = clFinish(queue)
if failed then fail(failed,"Unable to finish the queue")
flip : frames+=1
if frames mod 100=0 then
tNow=timer() : fps=100/(tNow-tLast) : tLast=tNow
end if
' sleep 10
wend
' free all resources
clReleaseMemObject(pBuffer)
clReleaseKernel(kernel)
clReleaseProgram(program)
clReleaseCommandQueue(queue)
clReleaseContext(context)
Last edited by D.J.Peters on Dec 21, 2019 19:33, edited 1 time in total.
Re: OpenCL (Open Computing Language) for FreeBASIC
D.J.Peters, thank you for your help!
I could not run this code,
but test01 and test02 work well.
"
Unable to build program C:\Users\I54590~1\AppData\Local\Temp\\OCL6836T1.cl:1:15: error: kernel cannot be called 'main'
__kernel void main( __global float4 *dst, uint width, uint height) {
^
1 error generated.
error: Clang front-end compilation failed!
Frontend phase failed compilation.
Error: Compiling CL to IR "
PS
Is "main" a forbidden name for a __kernel?
I renamed it to "openCL_sub" and then it works.
-----
I did not understand float4.
Could you give a specific example for dummies?
How do I directly convert x(i), y(i), z(i), w(i) --> float4
and Vx(i), Vy(i), Vz(i), Vw(i) --> float4?
I want to run this kernel:
https://github.com/ubernaut/stableorbit ... y/part2.cl
and then change the gravitational potential to the one I need, the L-J potential,
then enter events and watch the reaction of the system.
Re: OpenCL (Open Computing Language) for FreeBASIC
12val12newakk wrote: how directly convert x(i), y(i), z(i), w(i)--> float4
Vx(i),Vy(i),Vz(i),Vw(i)--> float4 ?
Don't use x(i), etc. Use something(i).x, see:
Code: Select all
#define cl_float single '<-- I don't have cl_float
type cl_float4
as cl_float x, y, z, w
end type
const N_OBJ = 100
dim as cl_float4 p(N_OBJ-1) 'position
dim as cl_float4 v(N_OBJ-1) 'velocity
for i as integer = 0 to N_OBJ-1
print p(i).x, p(i).y, p(i).z, p(i).w
print v(i).x, v(i).y, v(i).z, v(i).w
next
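If the data already lives in separate x(), y(), z(), w() arrays, it can be packed into one array of float4 records before upload. A minimal sketch, assuming single-precision values and using a made-up type name (float4_t) and made-up sample data; it does not call OpenCL.

```freebasic
type float4_t              ' hypothetical name, same layout as cl_float4 above
  as single x, y, z, w
end type

const N_OBJ = 3

' separate coordinate arrays, as in the question (sample values only)
dim as single x(N_OBJ-1), y(N_OBJ-1), z(N_OBJ-1), w(N_OBJ-1)
for i as integer = 0 to N_OBJ-1
  x(i) = i : y(i) = 10*i : z(i) = 100*i : w(i) = 1
next

' pack them: one float4 record per object
dim as float4_t p(N_OBJ-1)
for i as integer = 0 to N_OBJ-1
  p(i).x = x(i) : p(i).y = y(i) : p(i).z = z(i) : p(i).w = w(i)
next

' @p(0) now points at N_OBJ contiguous float4 records
print "bytes to upload: " & (N_OBJ * sizeof(float4_t))
for i as integer = 0 to N_OBJ-1
  print p(i).x, p(i).y, p(i).z, p(i).w
next
```

The same loop with Vx(), Vy(), Vz(), Vw() fills a second array for the velocities.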
Re: OpenCL (Open Computing Language) for FreeBASIC
D.J.Peters
-----
Your examples do some direct copying; I can't use that.
What I need is:
array host_pos (float4) to CreateReadWriteBuffer
array host_vel (float4) to CreateReadWriteBuffer
iteration calculation
CreateReadWriteBuffer to array host_pos (float4)
CreateReadWriteBuffer to array host_vel (float4)
Please, without "clEnqueueMapBuffer" and "clEnqueueUnmapMemObject".
Re: OpenCL (Open Computing Language) for FreeBASIC
I have temporarily abandoned float4.
I can't extend your examples, no way.
When I try to increase the number of transferred arrays
by cloning
the kernel arguments, everything falls apart.
I tearfully ask for a bidirectional parameter transfer (ReadWrite) example for a simplified kernel like this one.
Can the array sizes be specified directly as N_OBJ * 4?
Code: Select all
__kernel void SimpleKernel ( __global float * posx,
                             __global float * posy,
                             __global float * posz,
                             __global float * velx,
                             __global float * vely,
                             __global float * velz,
                             __global float * mass,
                             __global int * clr,
                             const float dt)
{
    // one work-item per particle: update element i, not the pointer itself
    size_t i = get_global_id(0);
    posx[i] = posx[i]+dt;
    posy[i] = posy[i]+dt;
    posz[i] = posz[i]+dt;
    velx[i] = velx[i]-dt;
    vely[i] = vely[i]-dt;
    velz[i] = velz[i]-dt;
    mass[i] = mass[i]+dt;
    clr[i] = clr[i]-1;
}
Without width and height: I do not draw pictures. I arrange the particles and calculate their integral interaction through pair
I only draw circles with the radius of zero force.
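A round trip without mapping could look roughly like the sketch below: clEnqueueWriteBuffer copies the host arrays to the device, the kernel runs, and clEnqueueReadBuffer copies the results back. It assumes the cl.bi from this thread declares these calls with the standard OpenCL 1.x signatures, and it needs an OpenCL GPU device to run. The kernel "integrate", the type float4_t, the sub check and the sample values are invented for this example. Sizes are declared as uinteger so the code should fit both the older size_t headers and the newer cl_size_t ones.

```freebasic
#include "cl.bi"            ' binding from this thread (assumed standard signatures)
#ifndef NULL
#define NULL cptr(any ptr,0)
#endif

type float4_t               ' hypothetical host-side twin of the kernel's float4
  as single x, y, z, w
end type

sub check(status as cl_int, what as string)
  if status <> CL_SUCCESS then
    print "OpenCL error " & status & " in " & what
    end 1
  end if
end sub

const N_OBJ = 4             ' array size is given directly, no width/height
dim as string code
code  = !"__kernel void integrate(__global float4 *pos, __global float4 *vel, const float dt) {\n"
code &= !"  size_t i = get_global_id(0);\n"
code &= !"  pos[i] += vel[i] * dt;\n"
code &= !"  vel[i] *= 0.99f;\n"
code &= !"}\n"

dim as float4_t host_pos(N_OBJ-1), host_vel(N_OBJ-1)
for i as integer = 0 to N_OBJ-1
  host_pos(i).x = i : host_vel(i).x = 1
next

dim as cl_int status
dim as cl_platform_id platform
dim as cl_device_id device
check(clGetPlatformIDs(1, @platform, NULL), "clGetPlatformIDs")
check(clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, @device, NULL), "clGetDeviceIDs")
dim as cl_context context = clCreateContext(0, 1, @device, NULL, NULL, @status)
check(status, "clCreateContext")
dim as cl_command_queue queue = clCreateCommandQueue(context, device, 0, @status)
check(status, "clCreateCommandQueue")

dim as zstring ptr pCode = strptr(code)
dim as cl_program program = clCreateProgramWithSource(context, 1, @pCode, NULL, @status)
check(status, "clCreateProgramWithSource")
check(clBuildProgram(program, 0, NULL, NULL, NULL, NULL), "clBuildProgram")
dim as cl_kernel kernel = clCreateKernel(program, "integrate", @status)
check(status, "clCreateKernel")

dim as uinteger nBytes = N_OBJ * sizeof(float4_t)
dim as cl_mem bufPos = clCreateBuffer(context, CL_MEM_READ_WRITE, nBytes, NULL, @status)
check(status, "clCreateBuffer (pos)")
dim as cl_mem bufVel = clCreateBuffer(context, CL_MEM_READ_WRITE, nBytes, NULL, @status)
check(status, "clCreateBuffer (vel)")

' host -> device (blocking copies)
check(clEnqueueWriteBuffer(queue, bufPos, CL_TRUE, 0, nBytes, @host_pos(0), 0, NULL, NULL), "write pos")
check(clEnqueueWriteBuffer(queue, bufVel, CL_TRUE, 0, nBytes, @host_vel(0), 0, NULL, NULL), "write vel")

dim as cl_float dt = 0.5
check(clSetKernelArg(kernel, 0, sizeof(cl_mem), @bufPos), "arg 0")
check(clSetKernelArg(kernel, 1, sizeof(cl_mem), @bufVel), "arg 1")
check(clSetKernelArg(kernel, 2, sizeof(cl_float), @dt), "arg 2")

' one work-item per object
dim as uinteger globalSize = N_OBJ
check(clEnqueueNDRangeKernel(queue, kernel, 1, NULL, @globalSize, NULL, 0, NULL, NULL), "clEnqueueNDRangeKernel")

' device -> host (blocking copies, they also wait for the kernel)
check(clEnqueueReadBuffer(queue, bufPos, CL_TRUE, 0, nBytes, @host_pos(0), 0, NULL, NULL), "read pos")
check(clEnqueueReadBuffer(queue, bufVel, CL_TRUE, 0, nBytes, @host_vel(0), 0, NULL, NULL), "read vel")

for i as integer = 0 to N_OBJ-1
  print host_pos(i).x, host_vel(i).x
next

clReleaseMemObject(bufPos) : clReleaseMemObject(bufVel)
clReleaseKernel(kernel) : clReleaseProgram(program)
clReleaseCommandQueue(queue) : clReleaseContext(context)
```

For an iteration loop, the write step would be done once, the kernel enqueued repeatedly, and the read step done only when the host needs the positions, for example for drawing.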
Re: OpenCL (Open Computing Language) for FreeBASIC
D.J.Peters wrote: The Open Computing Language for FreeBASIC.
download: OpenCL.zip from: Jan 03, 2020
The 32/64-bit import libs and the OpenCL specs are included.
Don't forget the "readme" ;-)
Joshy
Simple test:
The default OpenGL shader from shadertoy.com as an OpenCL version:
Code: Select all
#include "cl.bi"
dim as string CODE
CODE &= !"kernel void shader (global uchar4 * pixels, uint width, uint height, float time) \n"
CODE &= !"{\n"
CODE &= !" uint ix = get_global_id(0); \n"
CODE &= !" uint iy = get_global_id(1); \n"
CODE &= !" uint i = ix+width*iy; \n"
CODE &= !" float x = ix/(float)width; \n"
CODE &= !" float y = iy/(float)height; \n"
CODE &= !" uchar r = (uchar)(x*255.f); \n"
CODE &= !" uchar g = (uchar)(y*255.f); \n"
CODE &= !" uchar b = (uchar)(255.f * (0.5f + 0.5f*sin(time))); \n"
CODE &= !" pixels[i] = (uchar4)(b,g,r,255); \n"
CODE &= !"}\n"
dim as zstring ptr pCode = strptr(CODE)
dim as cl_platform_id platform
' get first platform
dim as cl_int status = clGetPlatformIDs(1,@platform,NULL)
' get first device from platform, try a GPU device first
dim as cl_device_id device
status = clGetDeviceIDs(platform,CL_DEVICE_TYPE_GPU,1,@device,NULL)
if status = CL_DEVICE_NOT_FOUND then
  ' fallback: no GPU device found, try a CPU device also
  status = clGetDeviceIDs(platform,CL_DEVICE_TYPE_CPU,1,@device,NULL)
end if
if status then
  print "error: can't create OpenCL device !"
  flip : beep : sleep : end 1
end if
' create a device context
dim as cl_context context = clCreateContext(0,1,@device,NULL,NULL,@status)
' create program object in context from sourcecode
dim as cl_program program = clCreateProgramWithSource(context,1,@pCode,NULL,@status)
' compile and link the program
status = clBuildProgram(program, 0, NULL, NULL, NULL, NULL)
if status then
  print "error: can't compile and link kernel !"
  dim as size_t nChars
  ' get size of log
  clGetProgramBuildInfo(program,device,CL_PROGRAM_BUILD_LOG,0,0,@nChars)
  dim as string strLog=space(nChars)
  ' get log from compiler/linker
  clGetProgramBuildInfo(program,device,CL_PROGRAM_BUILD_LOG,nChars,strptr(strLog),NULL)
  print strLog
  flip : beep : sleep : end 1
end if
' create a kernel object
dim as cl_kernel kernel = clCreateKernel(program, "shader", @status)
' get size of workgroup from device
dim as size_t WorkGroupSize
status = clGetKernelWorkGroupInfo(kernel,device,CL_KERNEL_WORK_GROUP_SIZE, sizeof(size_t),@WorkGroupSize,NULL)
' calculate local worksize
' (must be workgroupsize div power of two)
dim as size_t LocalSize = WorkGroupSize
while LocalSize*LocalSize > WorkGroupSize
  LocalSize shr=1
wend
dim as cl_int GlobalWidth=640,GlobalHeight = GlobalWidth/16*9
' calculate global worksize (should be a multiple of local worksize)
GlobalWidth -=GlobalWidth mod LocalSize
GlobalHeight-=GlobalHeight mod LocalSize
screenres GlobalWidth,GlobalHeight,32,2
screenset 1,0
windowtitle "global: " & GlobalWidth & " x " & GlobalHeight & " local: " & LocalSize & " x " & LocalSize & " worksize"
dim as any ptr pixels,img=imagecreate(GlobalWidth,GlobalHeight)
imageinfo img,,,,,pixels
dim as size_t bufSize = GlobalWidth*GlobalHeight*4
dim as size_t globalSizes(...) => {GlobalWidth,GlobalHeight}
dim as size_t localSizes (...) => {LocalSize ,LocalSize}
' create a memory buffer in the context
dim as cl_mem buffer = clCreateBuffer(context, CL_MEM_READ_WRITE, bufSize, NULL, @status)
' set the parameters of the kernel
status = clSetKernelArg(kernel, 0, sizeof(cl_mem ),@buffer)
status = clSetKernelArg(kernel, 1, sizeof(cl_uint),@GlobalWidth)
status = clSetKernelArg(kernel, 2, sizeof(cl_uint),@GlobalHeight)
' create command queue
dim as cl_command_queue queue = clCreateCommandQueue(context,device,0,@status)
if status then
  print "error: can't create a command queue !"
  flip : beep : sleep : end 1
end if
dim as integer frames,fps
dim as double tStart=timer
dim as double tLast =tStart
while inkey()=""
  dim as single runtime = timer()-tStart
  ' set the time parameter
  status = clSetKernelArg(kernel, 3, sizeof(cl_float),@runtime)
  ' execute the "2D" kernel
  status = clEnqueueNDRangeKernel(queue,kernel,2, NULL, @globalSizes(0), @localSizes(0),0,NULL,NULL)
  ' read/copy the device buffer in the image buffer
  status = clEnqueueReadBuffer (queue,buffer,CL_TRUE,0,bufSize,pixels,0,NULL,NULL)
  ' draw the runtime and fps in the image
  draw string img,(0,0),"fps: " & fps & " time: " & runtime
  ' put it on the screen
  put (0,0),img,PSET
  ' make hidden page visible
  flip
  ' count the frames
  frames+=1
  ' update the fps every 60 frames
  if frames mod 60=0 then
    var tNow=timer()
    fps=60/(tNow-tLast)
    tLast=tNow
  end if
wend
' free all resources
clReleaseCommandQueue queue
clReleaseKernel kernel
clReleaseProgram program
clReleaseMemObject buffer
clReleaseContext context
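The work-size arithmetic in the shader test can be tried on its own without any OpenCL device. In this sketch the work-group size 256 is a made-up value standing in for what clGetKernelWorkGroupInfo would report, and it is assumed to be a power of two.

```freebasic
' Hypothetical value; the real one comes from clGetKernelWorkGroupInfo
dim as integer WorkGroupSize = 256

' halve the edge until a LocalSize x LocalSize square fits into one work-group
dim as integer LocalSize = WorkGroupSize
while LocalSize * LocalSize > WorkGroupSize
  LocalSize shr= 1
wend

' 16:9 window, same numbers as in the shader test
dim as integer GlobalWidth = 640
dim as integer GlobalHeight = (GlobalWidth \ 16) * 9

' round both edges down to a multiple of the local edge
GlobalWidth  -= GlobalWidth  mod LocalSize
GlobalHeight -= GlobalHeight mod LocalSize

print "local : " & LocalSize & " x " & LocalSize
print "global: " & GlobalWidth & " x " & GlobalHeight
```

With these inputs the local size comes out as 16 x 16, and the 640 x 360 window is trimmed to 640 x 352 so that each global edge divides evenly by the local edge.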
Re: OpenCL (Open Computing Language) for FreeBASIC
Hello D.J.Peters
size_t is not defined; I had to #include "crt\ctype.bi", after which it compiled and ran OK.
Thank you for this and a happy new year :-)
Re: OpenCL (Open Computing Language) for FreeBASIC
The latest version (Jan 2020) is independent of the C runtime!
You now have to use "cl_size_t" as a replacement for "size_t".
Joshy
from file "cl.bi"
Code: Select all
'#include "crt/stdint.bi"
'#include "crt/stdlib.bi"
...
' added
type cl_size_t as uinteger ' 32/64 bit