Windows Thread Pool
Does anybody have any experience using the windows thread pool api's, and, what are best practices?
1. CreateThreadpool
2. SetThreadpoolThreadMaximum
3. SetThreadpoolThreadMinimum
4. CreateThreadpoolWork
5. SubmitThreadpoolWork
6. CloseThreadpoolWork
7. CloseThreadpool
Re: Windows Thread Pool
Not knowing much about things, I put this small snippet of code together and it seems to work with both 32 and 64 bit compiles.
Code: Select all
const _WIN32_WINNT = &h0602
#INCLUDE ONCE "windows.bi"
declare sub myThread (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
dim shared lpCriticalSection as CRITICAL_SECTION
dim iError as long
dim ptpp as PTP_POOL
dim reserved as PVOID
dim ucbe as TP_CALLBACK_ENVIRON
dim cbe as PTP_CALLBACK_ENVIRON
dim Work as PTP_WORK
dim iIndex as integer
cbe = cast(PTP_CALLBACK_ENVIRON,varptr(ucbe))
InitializeCriticalSection(ByVal VarPtr(lpCriticalSection))
ptpp = CreateThreadpool(reserved)
iError = GetLastError()
if ptpp = 0 THEN
print "CreateThreadPool failed,error=" + str(iError)
Print "press q to quit"
Do
Sleep 1, 1
Loop Until Inkey = "q"
END
END IF
print "CreateThreadPool successful..."
TpInitializeCallbackEnviron(cbe)
TpSetCallbackLongFunction(cbe)
for iIndex = 1 to 3
Work = CreateThreadpoolWork(cast(PTP_WORK_CALLBACK,@myThread),cast(PVOID,iIndex),cbe)
SubmitThreadpoolWork(Work)
CloseThreadpoolWork(Work)
NEXT
sleep 1000,1
TpDestroyCallbackEnviron(cbe)
CloseThreadpool(ptpp)
DeleteCriticalSection(ByVal VarPtr(lpCriticalSection))
Print "press q to quit"
Do
Sleep 1, 1
Loop Until Inkey = "q"
end
sub myThread (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
print "Thread started,Instance=" + str(Instance) + ",Context=" + str(Context)
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
end sub
-
- Posts: 4315
- Joined: Jan 02, 2017 0:34
- Location: UK
- Contact:
Re: Windows Thread Pool
Quote: Does anybody have any experience using the windows thread pool api's, and, what are best practices?
Good question - never heard of them. I have used threads quite a lot and learnt that they are not cheap to create. With AES-HMAC I created a secondary thread of execution for the HMAC on decryption, but only if the file being processed exceeded 1MB; otherwise the object was defeated.
From that you will have gathered that what I know about thread pooling can be put on the back of a postage stamp. Anyway it looked interesting enough to put the kettle on. Thread pooling in Windows XP was poor and saw a revamp in Windows Vista. The code above requires Windows Vista and later.
First thing, CloseThreadpoolWork() is nothing like ThreadDetach, which releases a thread handle without waiting for the thread to finish. CloseThreadpoolWork releases the specified work object. In other words it appears to kill the thread: I added some work to myThread but it was not being done.
Secondly, if we comment out the 'Sleep 5000,1' we go flying into the cleanup section prematurely and get a 'stopped working' message from the system. We don't have a WaitForMultipleObjects as with thread creation, but we do have WaitForThreadpoolWorkCallbacks, and we should use that before the cleanup section and/or before any further calls to SubmitThreadpoolWork using the same TP_WORK structure.
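A minimal sketch of the wait-then-release ordering described above, in place of a fixed Sleep: each work object is submitted, waited on with WaitForThreadpoolWorkCallbacks, and only then closed. The names demoCallback and cs are made up for illustration, and passing Null as the callback environment is my assumption for keeping the example short (it means the process default pool is used).

```freebasic
const _WIN32_WINNT = &h0602
#include once "windows.bi"

declare sub demoCallback (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)

dim shared cs as CRITICAL_SECTION   ' hypothetical name; serialises console output
dim Work(1 to 3) as PTP_WORK
dim i as integer

InitializeCriticalSection(@cs)

' Create and submit three work objects. A Null environment means the
' callbacks run on the process default thread pool.
for i = 1 to 3
    Work(i) = CreateThreadpoolWork(cast(PTP_WORK_CALLBACK, @demoCallback), cast(PVOID, i), Null)
    if Work(i) <> 0 then SubmitThreadpoolWork(Work(i))
next

' Block until each callback has finished, then release its work object.
for i = 1 to 3
    if Work(i) <> 0 then
        WaitForThreadpoolWorkCallbacks(Work(i), FALSE)
        CloseThreadpoolWork(Work(i))
    end if
next

DeleteCriticalSection(@cs)
print "All callbacks finished"
sleep

sub demoCallback (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
    EnterCriticalSection(@cs)
    print "Callback running, Context=" + str(Context)
    LeaveCriticalSection(@cs)
end sub
```

With the waits in place the critical section is only deleted after every callback has returned, so no timing guess is needed before the cleanup section.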
Reading a few blogs where folks mentioned how expensive creating threads can be, and that this can be overcome by thread pooling, resulted in the following code, adapted from rpkelly's code above. Are you Rick Kelly from PB? From MSDN's definition of the first parameter of CreateThreadpoolWork it occurred to me that when a work object has 'done its bit', rather than create another thread we simply 'Submit' it again. With my AES-HMAC a 100MB file gets processed using 400 x 256KB buffers. That is 400 thread creations. The AES is quicker than HMAC-SHA256 so I had to wait for the HMAC before the next AES. It was worthwhile because the AES was effectively being done for 'free'. I am now wondering what it would be like with 400 x submissions instead of 400 thread creations. <big smile>
In the following only one CreateThreadpoolWork is employed but the work is submitted a couple of times. No doubt the code is an absolute disgrace but I am walking in front with a red flag.<laugh>
The primary thread has two 'Sleep 5000,1' and the secondary thread polls Timer for three seconds.
This is what I get:
Code: Select all
CreateThreadPool successful...
Do some work 22:29:08
Thread started,Instance=12189332,Context=1
22:29:08
3.00 Finished secondary work at 22:29:11
Finished primary work at 22:29:13
Do more work 22:29:13
Thread started,Instance=12189332,Context=1
22:29:13
3.01 Finished secondary work at 22:29:16
Finished primary work at 22:29:18
press q to quit
I think that there is more to the cleanup section so we may have a leak - more reading required.
There are a handful of applications which can benefit from using a thread pool one of which is:
"An application that creates and destroys a large number of threads that each run for a short time. Using the thread pool can reduce the complexity of thread management and the overhead involved in thread creation and destruction."
All of my work using threads falls into that category. My 'A fast CPRNG' is one such. It is very fast as is - I will go back to that to see what 'pooling' will do.
With regard to best practices there are a 'pile' of them.
This is a 'bruiser' of a subject.
Code: Select all
const _WIN32_WINNT = &h0602
#INCLUDE ONCE "windows.bi"
declare sub myThread (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
dim shared lpCriticalSection as CRITICAL_SECTION
dim iError as long
dim ptpp as PTP_POOL
dim reserved as PVOID
dim ucbe as TP_CALLBACK_ENVIRON
dim cbe as PTP_CALLBACK_ENVIRON
dim Work as PTP_WORK
dim iIndex as integer
cbe = cast(PTP_CALLBACK_ENVIRON,varptr(ucbe))
InitializeCriticalSection(ByVal VarPtr(lpCriticalSection))
ptpp = CreateThreadpool(reserved)
iError = GetLastError()
if ptpp = 0 THEN
? "CreateThreadPool failed,error=" + str(iError)
? "press q to quit"
Do
Sleep 1, 1
Loop Until Inkey = "q"
END
END IF
? "CreateThreadPool successful..."
TpInitializeCallbackEnviron(cbe)
TpSetCallbackLongFunction(cbe)
Work = CreateThreadpoolWork(cast(PTP_WORK_CALLBACK,@myThread),cast(PVOID,1),cbe) '
SubmitThreadpoolWork(Work) ' First outing
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
? : ? "Do some work ";Time
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
Sleep 5000, 1
? : ? "Finished primary work at ";Time
WaitForThreadpoolWorkCallbacks(Work,FALSE)
SubmitThreadpoolWork(Work) ' Second outing
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
? : ? "Do more work ";Time
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
Sleep 5000, 1
? : ? "Finished primary work at ";Time
WaitForThreadpoolWorkCallbacks(Work,FALSE)
TpDestroyCallbackEnviron(cbe)
CloseThreadpool(ptpp)
DeleteCriticalSection(ByVal VarPtr(lpCriticalSection))
? : ? "press q to quit"
Do
Sleep 1, 1
Loop Until Inkey = "q"
sub myThread (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
Dim As Double t, done
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
? : ? "Thread started,Instance=" + str(Instance) + ",Context=" + str(Context)
? Time
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
t = timer
do
sleep 1,1
done = timer - t
loop Until done >= 3
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
? Using "#.##"; done;
? " Finished secondary work at ";Time
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
end sub
Re: Windows Thread Pool
With regard to CloseThreadpoolWork, MSDN says "If there is a cleanup group associated with the work object, it is not necessary to call this function; calling the CloseThreadpoolCleanupGroupMembers function releases the work, wait, and timer objects associated with the cleanup group." Since we are only using a work object, CloseThreadpoolCleanupGroupMembers seems to me to be overkill. So we should use CloseThreadpoolWork, I reckon just before TpDestroyCallbackEnviron(cbe).
Re: Windows Thread Pool
Quote: and/or before any further calls to SubmitThreadpoolWork using the same TP_WORK structure.
This is not true. I don't see the point of 2 x SubmitThreadpoolWork(Work), but if we did that we would get two instance IDs with the two threads running in parallel. Of course we can have umpteen calls to SubmitThreadpoolWork on work objects that differ in the second parameter of CreateThreadpoolWork: "Optional application-defined data to pass to the callback function."
With regard to the code above, it seems to me that both TpInitializeCallbackEnviron(cbe) and TpSetCallbackLongFunction(cbe) are redundant. There are reasons for our getting involved in defining a callback environment, but if they do not exist then we can use Null for the third parameter of CreateThreadpoolWork. With TpSetCallbackLongFunction, "The thread pool may use this information to better determine when a new thread should be created." So far my imagination is being stifled by a lack of knowledge and I have not gone beyond my being in control of when a new thread should be created.
Taking the above into account, and after a tidy-up, my code above reduces to:
Code: Select all
const _WIN32_WINNT = &h0602
#INCLUDE ONCE "windows.bi"
declare sub myThread (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
dim shared lpCriticalSection as CRITICAL_SECTION
dim iError as long
dim Pool as PTP_POOL
dim Work as PTP_WORK
InitializeCriticalSection(ByVal VarPtr(lpCriticalSection))
Pool = CreateThreadpool(Null)
iError = GetLastError()
if Pool = 0 THEN
DeleteCriticalSection(ByVal VarPtr(lpCriticalSection))
? "CreateThreadPool failed,error=" + str(iError)
? "press q to quit"
Do
Sleep 1, 1
Loop Until Inkey = "q"
END
END IF
? "CreateThreadPool successful..."
Work = CreateThreadpoolWork(cast(PTP_WORK_CALLBACK,@myThread),cast(PVOID,1), Null)
SubmitThreadpoolWork(Work) ' First outing
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
? : ? "Do some work ";Time
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
Sleep 3000, 1
? : ? "Finished primary work at ";Time
WaitForThreadpoolWorkCallbacks(Work,FALSE)
SubmitThreadpoolWork(Work) ' Second outing
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
? : ? "Do more work ";Time
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
Sleep 3000, 1
? : ? "Finished primary work at ";Time
WaitForThreadpoolWorkCallbacks(Work,FALSE)
CloseThreadpoolWork(Work)
CloseThreadpool(Pool)
DeleteCriticalSection(ByVal VarPtr(lpCriticalSection))
? : ? "press q to quit"
Do
Sleep 1, 1
Loop Until Inkey = "q"
sub myThread (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
Dim As Double t, done
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
? : ? "Thread started,Instance=" + str(Instance) + ",Context=" + str(Context)
? Time
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
t = timer
do
sleep 1,1
done = timer - t
loop Until done >= 3
EnterCriticalSection(ByVal VarPtr(lpCriticalSection))
? Using "#.##"; done;
? " Finished secondary work at ";Time
LeaveCriticalSection(ByVal VarPtr(lpCriticalSection))
end sub
Thread Pool API
Thread Pools
Understanding Thread Pool Enhancements
Developing with Thread Pool Enhancements
Using the Thread Pool Functions
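One thing worth noting about the listings above: with Null as the third parameter of CreateThreadpoolWork the callbacks run on the process default pool, so the pool returned by CreateThreadpool is never actually used. The following is only a sketch of how a private pool might be tied to a work object and sized with SetThreadpoolThreadMaximum and SetThreadpoolThreadMinimum from the original list. The name poolCallback and the limits of 1 and 4 threads are made up, and I am assuming that windows.bi exposes the TpSetCallbackThreadpool helper alongside the Tp* helpers already used in this thread.

```freebasic
const _WIN32_WINNT = &h0602
#include once "windows.bi"

declare sub poolCallback (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)

dim Pool as PTP_POOL
dim Work as PTP_WORK
dim env as TP_CALLBACK_ENVIRON

Pool = CreateThreadpool(Null)
if Pool = 0 then
    print "CreateThreadpool failed, error=" + str(GetLastError())
    end 1
end if

' Size the private pool: between 1 and 4 threads (arbitrary example values).
SetThreadpoolThreadMaximum(Pool, 4)
if SetThreadpoolThreadMinimum(Pool, 1) = FALSE then
    print "SetThreadpoolThreadMinimum failed, error=" + str(GetLastError())
end if

' Tie a callback environment to the private pool.
' Assumption: the header provides TpSetCallbackThreadpool.
TpInitializeCallbackEnviron(@env)
TpSetCallbackThreadpool(@env, Pool)

' Passing @env instead of Null makes the callback run on Pool.
Work = CreateThreadpoolWork(cast(PTP_WORK_CALLBACK, @poolCallback), cast(PVOID, 1), @env)
if Work <> 0 then
    SubmitThreadpoolWork(Work)
    WaitForThreadpoolWorkCallbacks(Work, FALSE)
    CloseThreadpoolWork(Work)
end if

TpDestroyCallbackEnviron(@env)
CloseThreadpool(Pool)
sleep

sub poolCallback (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
    print "Callback on private pool, Context=" + str(Context)
end sub
```

If the private pool is not needed, dropping CreateThreadpool and CloseThreadpool altogether and keeping Null for the environment is the simpler route.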
Re: Windows Thread Pool
I'm the PB guy having transitioned to FB. I'm developing SQLite Client/Server classes when the whole threadpool api set came into view.
See my take at:
https://github.com/breacsealgaire/FreeB ... Lite-Class
What I found out is that I can have a thread cleanup group without a callback, and that the CloseThreadpoolCleanupGroupMembers function then blocks until all currently executing callback functions finish. That was important to allow all outstanding SQLite threads to finish before shutting down the server.
My choices in implementing the thread pool api's were, of course, oriented to SQLite and a multithreaded server.
As I learn more about thread pools, I'm made to think that I should do all my threading this way.
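A rough sketch of the cleanup group idea described in this post, not the code from the linked class. The names groupCallback and cs are made up, and I am assuming that windows.bi exposes the TpSetCallbackCleanupGroup helper alongside the other Tp* helpers; Null is passed where a cancel callback could go.

```freebasic
const _WIN32_WINNT = &h0602
#include once "windows.bi"

declare sub groupCallback (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)

dim shared cs as CRITICAL_SECTION   ' hypothetical name; serialises console output
dim env as TP_CALLBACK_ENVIRON
dim Group as PTP_CLEANUP_GROUP
dim Work as PTP_WORK
dim i as integer

InitializeCriticalSection(@cs)

Group = CreateThreadpoolCleanupGroup()
if Group = 0 then
    print "CreateThreadpoolCleanupGroup failed, error=" + str(GetLastError())
    end 1
end if

' Attach the cleanup group to the environment, with no cancel callback.
' Assumption: the header provides TpSetCallbackCleanupGroup.
TpInitializeCallbackEnviron(@env)
TpSetCallbackCleanupGroup(@env, Group, Null)

' Every work object created with @env becomes a member of the group.
for i = 1 to 3
    Work = CreateThreadpoolWork(cast(PTP_WORK_CALLBACK, @groupCallback), cast(PVOID, i), @env)
    if Work <> 0 then SubmitThreadpoolWork(Work)
next

' Blocks until outstanding callbacks have finished and releases the member
' work objects, so CloseThreadpoolWork is not called for them.
CloseThreadpoolCleanupGroupMembers(Group, FALSE, Null)
CloseThreadpoolCleanupGroup(Group)
TpDestroyCallbackEnviron(@env)
DeleteCriticalSection(@cs)
print "Cleanup group closed"
sleep

sub groupCallback (byval Instance as PTP_CALLBACK_INSTANCE, byval Context as PVOID, byval Work as PTP_WORK)
    EnterCriticalSection(@cs)
    print "Group member running, Context=" + str(Context)
    LeaveCriticalSection(@cs)
end sub
```

The attraction for a server is that one call waits for and releases everything in the group, instead of tracking each work object separately.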
Re: Windows Thread Pool
Hi Rick
Quote: See my take at:
Wow, that is well outside of my comfort zone. <smile>
Quote: As I learn more about thread pools, I'm made to think that I should do all my threading this way.
That is the impression that I am getting from all the blogs that I have read.
I should think that I will keep life simple by concentrating on work objects.
David Roberts
Re: Windows Thread Pool
Grab my class and sample script and try them with your work flow. I'd be interested in seeing if it holds up.
Welcome to the 64 bit world...
Re: Windows Thread Pool
As the cCTServerThreadPool.bi stands I reckon that you don't need
Code: Select all
TpInitializeCallbackEnviron(This.cbe)
TpSetCallbackLongFunction(This.cbe)
and, therefore, cbe.
Quote: Welcome to the 64 bit world...
I will be 70 bits in December but I stopped counting at 32 bits.
Re: Windows Thread Pool
You may be correct. I only included the environment api's since I thought threads working with SQLite would be thought of as "long". I have a lot of testing remaining and I'll find out when I throw a few hundred connections at my server class as fast as I can create them.
70 bits just means you have to keep smiling while you have most of your teeth....:-)
Re: Windows Thread Pool
Quote: You may be correct.
You may be correct as well. I have found some answers without being able to determine what the question was - a bit like being handed a pair of oars without knowing what a rowing boat is. <smile> The secret, of course, is to just keep reading until the penny drops. It helps if there are several sources of reading - it is less helpful when the source is dominated by MSDN.
Quote: I have a lot of testing remaining and I'll find out when I throw a few hundred connections at my server class as fast as I can create them.
That is often the only way to go. Throw a brick at our code and see what happens.
Re: Windows Thread Pool
I have just finished CryptoRndBufferII which uses thread pooling as opposed to thread creation. The coding was not as easy - I had to use four work objects.
I expected the exhaustion stutter to be less. The exhaustion stutter occurs when a buffer exhausts before the other buffer has filled. The worst case scenario is when we request random numbers and nothing else, i.e. flat out. I reckoned that the throughput might increase, but only marginally.
Here is a comparison.
CryptoRndBuffer
Requested 10485760
BufferSize 131072 NumberOfBuffers 80
67.358ms Time to crunch
155 Million per second
Stutter 0.5871898950504352
CryptoRndBufferII
Requested 10485760
BufferSize 131072 NumberOfBuffers 80
36.103ms Time to crunch
290 Million per second
Stutter 0.1915569701375841
I have been using a buffer size of 128KB to keep the worst case stutter at a manageable level - the larger the buffer the greater the exhaustion stutter. The new version has a stutter down to a third of the original version.
The shock result is the throughput - pushing twice as fast. CRB was already faster than FB's option 2 generator, CMC, but is now leaving that standing. FB's Mersenne Twister comes in at 85 Million per second. It is worth remembering that we are talking about a CPRNG here and not a PRNG.
Needless to say much testing is required and a 1TB PractRand run is a must to make sure nothing untoward is happening.
All my ThreadCreate applications are now shaking in their boots.<smile>
Thanks, Rick.
Re: Windows Thread Pool
Wow! Thanks for the feedback. The overhead for creating threads is higher than I had imagined and yours is the first proof that thread pools have their place. I'm looking forward to throwing that proverbial brick at my SQLite server class. I have a connection pool class that manages the SQLite connection handles on a check-out basis to keep opens to a minimum, for much the same reason as using a thread pool.
Keep us posted on your journey.
Rick
Re: Windows Thread Pool
Quote: Keep us posted on your journey.
OK, here is some breaking news. <smile>
This was begging to be done. The following compares 100,000 SubmitThreadpoolWork calls with 100,000 ThreadCreate calls, both using threads which do absolutely nothing.
The test was done five times - relying on a single test is a bad practice which many folk fall into.
In comparison with SubmitThreadpoolWork the overhead for Threadcreate is enormous.
As Microsoft says "Using the thread pool can reduce the complexity of thread management and the overhead involved in thread creation and destruction."
Boy, did they get that right. We are talking 10 microseconds compared with 250 microseconds.
Given two heavy duty tasks which can be executed in parallel but only once then 250 microseconds is neither here nor there and ThreadCreate is easier to implement. However, if done many more times than once, and not necessarily heavy duty, then we have a very different ball game. All of my thread work falls into the latter case and it has been worthwhile. The mind boggles at what thread pooling can do for them.
Results (in seconds; in each pair the first figure is for SubmitThreadpoolWork and the second is for ThreadCreate):
Code: Select all
1.020808976953255
27.28130401556713
0.9288334450974496
19.20958011653508
1.27032038656305
26.62135063937834
1.228162935241453
27.38241969856165
0.864315230326298
26.46788907886365
Code: Select all
Const _WIN32_WINNT = &h0602
#INCLUDE ONCE "windows.bi"
Declare Sub myThread1(As PTP_CALLBACK_INSTANCE, As PVOID, As PTP_WORK)
Declare Sub myThread2( As Any Ptr )
Dim As Long i, j
Dim Pool As PTP_POOL
Dim Work As PTP_WORK
Dim As Any Ptr hThread, x
Dim t As Double
Pool = CreateThreadpool(Null)
Work = CreateThreadpoolWork(Cast(PTP_WORK_CALLBACK,@myThread1),Cast(PVOID,1), Null)
For j = 1 To 5
t = Timer
For i = 1 To 100000
SubmitThreadpoolWork(Work)
WaitForThreadpoolWorkCallbacks(Work,FALSE)
Next
Print Timer - t
t = Timer
For i = 1 To 100000
hThread = Threadcreate( @myThread2, x )
Threadwait( hThread )
Next
Print Timer - t
Print
Next
CloseThreadpoolWork(Work)
CloseThreadpool(Pool)
Sleep
Sub myThread1(Byval Instance As PTP_CALLBACK_INSTANCE, Byval Context As PVOID, Byval Work As PTP_WORK)
'Do nothing
End Sub
Sub myThread2( Byval x As Any Ptr )
' Do nothing
End Sub
Re: Windows Thread Pool
Your comparison results are similar to mine although I only used 1000 threads.
Links I found useful for tuning thread pools are at:
https://blogs.msdn.microsoft.com/pedram ... ol-thread/
http://www.thejoyofcode.com/Tuning_the_ThreadPool.aspx