Optimising execution speed
Is there a resource that provides general tips and tricks for optimising the execution speed of FreeBasic code?
Re: Optimising execution speed
Not really, because optimizations are very dependent on the underlying program. Some obvious tips off the top of my head:
- Precompute as much as you can
- Avoid virtual calls as much as possible, especially in tight loops
- Pick good algorithms for the task at hand
- Complement of the above: pick the right data structures for the right task (ie don't use a linked list if you need to iterate over many elements often)
- Not so obvious, especially for beginners: try to split your code in a logical, organized way (ie favor subs and functions over shared variables). This is because a sub/function provides an abstraction layer, which you can then profile and optimize separately if needed
- Old optimization tricks (ie loop unrolling) don't work as well as they used to, because processors nowadays are quite different from those of the early days. Organize your data, especially data you need to iterate over often, sequentially in memory (ie use arrays). This has a huge impact on performance, especially if you have data that can be processed linearly
- Try to minimize allocations, especially if they're small and happen often (ie in a game loop). Instead, preallocate as much as possible and work within those limits (or only reallocate if you absolutely need to). See the sketch after this list.
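To illustrate that last tip, a minimal sketch (the particle pool and names are hypothetical, just for the example):

' Allocate the pool once, up front, instead of allocating per frame.
Const MAX_PARTICLES = 4096

Type Particle
    x As Single
    y As Single
End Type

Dim Shared particles(0 To MAX_PARTICLES - 1) As Particle
Dim Shared particleCount As Integer = 0

Sub spawnParticle(ByVal x As Single, ByVal y As Single)
    ' Work strictly within the preallocated limit: no allocation
    ' in the hot path, just an index bump.
    If particleCount < MAX_PARTICLES Then
        particles(particleCount).x = x
        particles(particleCount).y = y
        particleCount += 1
    End If
End Sub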
Re: Optimising execution speed
Some of the terms are unfamiliar to me (eg loop unrolling and virtual calls), but at least one of your tips has paid off so far. I think it was Wikipedia's article on loop unrolling that advised against relying too much on pointer arithmetic, and I've now cleaned up that mess in my code.
Thanks, and happy new year.
Re: Optimising execution speed
Virtual calls have some minor overhead; basically, it is an indirect call instead of a direct one.
Are they really the first thing to optimise these days?
Re: Optimising execution speed
Mostly in tight loops (ie when you need to iterate over lots of entities in, say, a game). I've measured the overhead of a virtual call in FreeBasic: about 22%. So depending on usage, yes, it can add up to a lot.
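If you want to measure it yourself, here's a rough sketch of the kind of micro-benchmark I mean (the loop count and the Entity type are illustrative, not the original test):

Type Entity Extends Object
    value As Integer
    Declare Virtual Sub update()
End Type

Sub Entity.update()
    This.value += 1
End Sub

Sub plainUpdate(ByRef e As Entity)
    e.value += 1
End Sub

Dim e As Entity
Dim ep As Entity Ptr = @e
Dim As Double t

t = Timer
For i As Integer = 1 To 100000000
    ep->update()      ' virtual: indirect call through the vtable
Next
Print "virtual:"; Timer - t

t = Timer
For i As Integer = 1 To 100000000
    plainUpdate(e)    ' direct call, no lookup
Next
Print "direct :"; Timer - t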
The first thing to optimize these days is data, not so much code. Compilers are pretty efficient already; unless you're really experienced and need to squeeze every last drop of performance, just arranging data in a cache-coherent fashion and processing it linearly can get you significant speed improvements. It does require practice, though (it's basically a different paradigm)
Re: Optimising execution speed
It is a waste of time to think about optimization before a working program comes out.
Re: Optimising execution speed
I recently wrote a program to scan each sector in a dd image file of an 80GB HDD, looking for any of 17 five-byte signatures. I noticed that the 5th byte of each signature was always 0 or 1, so I tested this byte separately, before the others. I consider this to be optimisation, and in this case it was best done at the outset.
On a related matter, I used a series of Select Case statements to perform the comparisons, but I could also have used If Then ElseIf statements. Assuming there are no constants to be tested, is there any reason, other than neatness, to use Select Case rather than If?
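Something like the following is what I mean; the helper name and signature bytes here are made up for illustration, not my actual code:

Function matchesSignature(ByVal p As UByte Ptr) As Boolean
    ' Hypothetical 5-byte signature; note the 5th byte is 0.
    Dim sig(0 To 4) As UByte = {&h52, &h45, &h43, &h4F, &h00}
    For i As Integer = 0 To 4
        If p[i] <> sig(i) Then Return False
    Next
    Return True
End Function

Dim sector(0 To 511) As UByte
' (in the real program, Get # fills sector() from the dd image)

For offset As Integer = 0 To UBound(sector) - 4
    ' Cheap test first: the 5th byte of every signature is 0 or 1,
    ' so most offsets are rejected with a single comparison.
    Select Case sector(offset + 4)
        Case 0, 1
            If matchesSignature(@sector(offset)) Then
                Print "possible hit at offset"; offset
            End If
    End Select
Next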
Re: Optimising execution speed
I assume the indirection frustrates branch prediction, though even then, the second time around it could do it speculatively. It is possible to make a micro benchmark out of just about everything: if all you do is put an indirection in a loop then yes, the indirection is measurable. But I wonder whether it is worthwhile to focus on it as part of the first round of optimisation.
Some compilers support devirtualisation as part of their global optimisations.
paul doe wrote: The first thing to optimize these days is data, not so much code. […]
Cache coherency is a strategy to keep multiple caches coherent (to avoid dirty reads etc)? Don't see the relevance for programming.
But optimising for cache can be important. I'm in image processing, so you don't have to tell me. Btw, I had quite some success with loop tiling as a strategy.
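A minimal sketch of the tiling idea, using a hypothetical image transpose (a transpose is the classic worst case for strided access):

Const W = 1024, H = 1024, TILE = 32

Dim Shared src(0 To W * H - 1) As UByte
Dim Shared dst(0 To W * H - 1) As UByte

' Process TILE x TILE blocks so both the rows read from src and the
' columns written to dst stay resident in cache.
For ty As Integer = 0 To H - 1 Step TILE
    For tx As Integer = 0 To W - 1 Step TILE
        For y As Integer = ty To ty + TILE - 1
            For x As Integer = tx To tx + TILE - 1
                dst(x * H + y) = src(y * W + x)
            Next
        Next
    Next
Next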
But usually in such instances you have already identified the rate-determining step, and identified that it is really calculation (rather than some busy waiting) that is the problem.
Re: Optimising execution speed
marcov wrote: ↑Jan 01, 2025 12:51 I assume the indirection frustrates branch prediction […] Some compilers support devirtualisation as part of their global optimisations.
Nono, we're talking about a real program here that had to manage several thousand entities at the same time. Now, for something simple and/or without many calls per loop, then yes, the performance loss is irrelevant. But yeah, it messes with branch prediction (not to mention having to perform a lookup vs. just making the call).
marcov wrote: Cache coherency is a strategy to keep multiple caches coherent (to avoid dirty reads etc)? Don't see the relevance for programming. […]
'Cache coherency' in the sense that everything that needs to be processed together is kept tightly packed in memory as much as possible. Think the way you arrange and pass data to GPUs, but for everything else as well (like in Entity-Component-System schemes). For images this is easy (all the pixels are packed together sequentially), but when you're dealing with a state that can change completely from one frame to the next, it becomes really challenging.
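As a sketch of what I mean by 'tightly packed' (hypothetical component arrays, structure-of-arrays style):

Const MAX_ENTITIES = 100000

' Each component lives in its own contiguous array, so an update
' pass touches memory strictly linearly.
Dim Shared posX(0 To MAX_ENTITIES - 1) As Single
Dim Shared posY(0 To MAX_ENTITIES - 1) As Single
Dim Shared velX(0 To MAX_ENTITIES - 1) As Single
Dim Shared velY(0 To MAX_ENTITIES - 1) As Single

Sub updatePositions(ByVal count As Integer, ByVal dt As Single)
    For i As Integer = 0 To count - 1
        posX(i) += velX(i) * dt
        posY(i) += velY(i) * dt
    Next
End Sub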
This is perhaps the most immediately observable optimization one can make. To convince myself back when I read about the concept, I coded a very simple simulation: one using virtual calls and the 'usual' patterns (ie an entity->update() virtual call) vs a simple procedural approach (with the proper data arrangements, of course). In this particular case, the 'simple' approach was 17 times faster than the virtual call approach. Of course, it was just for this test, not a real app, but it served to change the way I approach certain tasks (ie use virtual calls and OOP for coarse, architectural layers; and use simple arrays and linear iterations for everything else).
I've read the book about Data Oriented Design and I've found it really good, but I usually don't overdo it if I don't have to. I'm lazy, you know, and for me the best optimization approach will always be 'don't write more code than absolutely necessary to accomplish the task'

Re: Optimising execution speed
paul doe wrote: ↑Jan 01, 2025 14:01 Nono, we're talking about a real program here that had to manage several thousand entities at the same time. […]
So what kind of program? A program that builds a heterogeneous tree, like a compiler?
paul doe wrote: 'Cache coherency' in the sense that everything that needs to be processed together is kept tightly packed in memory as much as possible.
It is not. That is cache optimisation. Cache coherency is mostly a silicon thing: synchronising between multiple caches, plus a few rules for programmers so as not to violate the model. The model is to make multiple memory systems with multiple caches seem like a single logical memory system.
As both cache coherency and cache optimisation are, well, cache related, they work at cache-line granularity (as in: a write from multiple sources to the same cache line is a problem). But they are not the same thing. Coherency is the set of rules to follow so that memory doesn't get inadvertently corrupted; cache optimisation is about not unnecessarily hitting the (silicon) barriers that ensure cache coherency.
Also, for cache optimisation, extreme memory packing is not always a beneficial strategy. If you know multiple CPUs access two variables, you surely don't want them in the same cache line, as that will invalidate multiple caches and reduce speed.
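A sketch of the padding trick, assuming 64-byte cache lines (the counter type and thread count are made up for the example):

Type PaddedCounter
    value As LongInt
    pad(0 To 55) As UByte    ' 8 + 56 = 64 bytes: one counter per line
End Type

Dim Shared counters(0 To 3) As PaddedCounter

Sub worker(ByVal param As Any Ptr)
    Dim id As Integer = Cast(Integer, param)
    For i As Integer = 1 To 10000000
        ' Without the padding, all four counters could share one
        ' cache line and the writes would ping-pong between cores.
        counters(id).value += 1
    Next
End Sub

Dim threads(0 To 3) As Any Ptr
For t As Integer = 0 To 3
    threads(t) = ThreadCreate(@worker, Cast(Any Ptr, t))
Next
For t As Integer = 0 To 3
    ThreadWait(threads(t))
Next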
paul doe wrote: Think the way you arrange and pass data to GPUs, but for everything else as well (like in Entity-Component-System schemes).
I usually pass data as PBOs to GPUs. But I assume you mean exactly the opposite of ECS: not having references for everything, but folding the data into the game objects as much as possible. Those are tricks we did back in the day, when I still did LPC MUD work on P-I systems, yes.
But I still consider that an extreme case, to be pursued only after all other avenues have been exhausted.
paul doe wrote: For images this is easy (all the pixels are packed together sequentially), but when you're dealing with a state that can change completely from one frame to the next it becomes really challenging.
In general, the rule is that every operation needs to reduce the size of the (intermediate) result as much as possible. The newer processors got, the more processing was allowed to reach that goal. In the past that was mostly fixed-point stuff in SSE registers (as floating point (singles) would decrease SIMD granularity and throughput).
Actually, that might be one of the good things of the AI revolution: having cheap SIMD floating-point types.

paul doe wrote: This is perhaps the most immediately observable optimization one can make. […] In this particular case, the 'simple' approach was 17 times faster than the virtual call approach.
It might not have been just the virtual call, but also the references, and having multiple allocations vs. one. Make absolutely sure that the number of allocations in both scenarios is exactly the same. Talking about virtual call overhead mixed with allocations (dynamic memory allocation is a much 'heavier' mechanism) is futile.
paul doe wrote: I've read the book about Data Oriented Design and I've found it really good, but I usually don't overdo it if I don't have to. […]
For any good optimisation strategy, you need to revisit it after a time, to make sure that the improvement you felt when solving something was actually caused by the change you attribute it to. Eliminate all other factors to make absolutely sure.
So many rigidly held beliefs that I had in the nineties, when I first did non-trivial programming (I started with a C=64 in the eighties), have been toppled since.
Re: Optimising execution speed
Any idea how costly (slow) recursive functions are, compared to a loop with a custom stack implementation for the data?
E.g. for this crossword generator project using the 'recursiveMagic' function, where the depth is limited by the amount of words to fit (currently testing with 25 words).
Update: I got a working non-recursive version; it's 15% slower at the moment, and uglier.
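For reference, the general shape of the transformation I tried (not the actual 'recursiveMagic' routine, just a hypothetical tree-sum example):

Type TreeNode
    value As Integer
    childL As TreeNode Ptr
    childR As TreeNode Ptr
End Type

Function sumRecursive(ByVal n As TreeNode Ptr) As LongInt
    If n = 0 Then Return 0
    Return n->value + sumRecursive(n->childL) + sumRecursive(n->childR)
End Function

Function sumWithStack(ByVal root As TreeNode Ptr) As LongInt
    ' The call stack becomes a preallocated pointer array.
    Dim stack(0 To 1023) As TreeNode Ptr
    Dim As Integer top = 0
    Dim As LongInt total = 0
    If root = 0 Then Return 0
    stack(0) = root : top = 1
    While top > 0
        top -= 1
        Dim As TreeNode Ptr n = stack(top)
        total += n->value
        If n->childL Then stack(top) = n->childL : top += 1
        If n->childR Then stack(top) = n->childR : top += 1
    Wend
    Return total
End Function

' Tiny usage: a three-node tree; both calls print 6.
Dim As TreeNode a, b, c
a.value = 1 : b.value = 2 : c.value = 3
a.childL = @b : a.childR = @c
Print sumRecursive(@a), sumWithStack(@a)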
Re: Optimising execution speed
How I see the issue: there are several reasons that may explain the lack of tips and tricks regarding code optimization.
- Look on the forum; there are some particularly fast codes (Dodicat, Srvaldez, Juergen Kulhwein, etc.), and yet all these codes are very different. Coding style is a coherent way of thinking: it's absurd to change your coding style to gain a few milliseconds; the price to pay is a conceptualization guided by... nothing. Before thinking about optimization, think about building your own coding style.
- Avoid rigid or doctrinal positions as much as possible (i.e., arrays are better or lists are better, the use of pointers is irrelevant, recursive programming is better or worse, etc.). Keep in mind that the more options you manage to preserve, the more optimization paths you'll preserve. This is very difficult, because the easiest way is, of course, to lock yourself into a coding style. Before thinking about optimization, keep as many options as possible open to you.
- Optimize, yes, but against what? Take the example of recursive programming: often faster, but also a major consumer of a very specific resource: the stack. Pointers: sometimes slower than a dedicated function but less cache-intensive. Pure, stand-alone optimization in a "hard science" way doesn't always correspond to what we're looking for. Before thinking about optimization, think about the system as a whole and over time.
- Optimize, yes, but what? Identify the bottlenecks and focus on them. To be effective, before thinking about optimization, think about efficiency.
Think coding style, think about flexibility, think about the system (a little empathy for that unfortunate computer), think about efficiency, and then, only then, think about optimization; finally, with experience and technique, you will find your way.
That being said, nothing prevents you from researching technical tips: think about the process and minimize unnecessary loops as much as possible, reduce as much as possible everything that must be in a loop, and above all think through the algorithms correctly.
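For example, reducing what must be inside a loop, in a minimal sketch (hypothetical names; optimizing compilers often hoist this themselves, but it is a good habit in hot loops):

Dim As Double values(0 To 9999), scaled(0 To 9999)
Dim As Double angle = 0.75

' Before: the invariant Sqr(2 * angle) is recomputed on every pass:
'   For i As Integer = 0 To 9999
'       scaled(i) = values(i) * Sqr(2 * angle)
'   Next

' After: compute the invariant once, outside the loop.
Dim As Double factor = Sqr(2 * angle)
For i As Integer = 0 To 9999
    scaled(i) = values(i) * factor
Next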
One last piece of advice, perhaps the most important: place immense importance on conceptualization, trying to think through optimization up front as well. Optimizing code to death will never correct a conceptual error made at the outset.
Re: Optimising execution speed
edited : wrong topic
Re: Optimising execution speed
Hi there,
Here is my optimized 2-million-particle simulation with collisions, running on the CPU and written in FreeBasic. There are some optimization tips in the comments.
https://youtu.be/g3jBiWW4BzE?si=qoYAdzTZGETX7e4Z