Next generation name is FB++ ?

For other topics related to the FreeBASIC project or its community.
badidea
Posts: 1457
Joined: May 24, 2007 22:10
Location: The Netherlands

Re: Next generation name is FB++ ?

Postby badidea » Jul 13, 2018 12:49

The chm file works fast here (linux). The only crappy thing here is my current viewer "xCHM", KDE's "KCHM" works better.
If it is slow, it is probably the virus scanner. Somehow Microsoft thought is was a good idea to make a 'Compiled Help File'.
jj2007
Posts: 1214
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Next generation name is FB++ ?

Postby jj2007 » Jul 13, 2018 13:07

As a rule, the *.hlp reader is about twice as fast in loading and displaying a topic. Besides, if you ask e.g. for CreateWindowE instead of CreateWindowEx,
*.hlp will display a list of possible matches (and the one already selected is CreateWindowEx)
*.chm will display "Cannot visualise that page" - a lousy answer for a system supposed to help the user, not to confuse him.
dodicat
Posts: 5913
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Next generation name is FB++ ?

Postby dodicat » Jul 13, 2018 13:36

jj2007 wrote:
dodicat wrote:I tested masm basic member Hutch's idea with html to overcome the .chm problem
I hope Hutch won't see your post, he doesn't really want to be associated with MasmBasic. First, he is more a PowerBasic fan, and second, he is the guy who created Masm32 and maintains the most important assembler forum worldwide...


For goodness sake jj2007 I didn't say anything about Hutch, I only mentioned his masmbasic chm fix.
What would he dislike in that? <I know what you mean really>

I don't know anything about worldwide important assembler forums.
Me:

Let not Ambition mock their useful toil,
Their homely joys, and destiny obscure;
Nor Grandeur hear with a disdainful smile
The short and simple annals of the Poor.

(T.Gray)
In a verse.

I am not impressed by much really.
But I must admit It's nice to see THE DONALD coming for a game of golf anytime soon.
jj2007
Posts: 1214
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Next generation name is FB++ ?

Postby jj2007 » Jul 13, 2018 13:48

dodicat wrote:For goodness sake jj2007 I didn't say anything about Hutch, I only mentioned his masmbasic chm fix.
What would he dislike in that? <I know what you mean really>
My toy has guest status at his Masm32 forum, so he might be offended being called "masm basic member". But he is Australian and has a good sense of humour, don't worry <smile>
marcov
Posts: 2760
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

Re: Next generation name is FB++ ?

Postby marcov » Jul 14, 2018 10:11

badidea wrote:The chm file works fast here (linux). The only crappy thing here is my current viewer "xCHM", KDE's "KCHM" works better.
If it is slow, it is probably the virus scanner. Somehow Microsoft thought is was a good idea to make a 'Compiled Help File'.


There is also GNOCHM, but KDE's KChmviewer is generally the best. Or maybe least worst, all are signuficantly slower than the windows CHM viewer. Note that when I last used it, kchmviewer had options to select differing rendering engines. If you have rendering problems, you can play with it.
marcov
Posts: 2760
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

Re: Next generation name is FB++ ?

Postby marcov » Jul 14, 2018 10:16

jj2007 wrote:
marcov wrote:You are quite quick to consider something crappy.
Yes, that's true. After only a few years of bad experience, I immediately consider something crappy! But I am not the only one with that bad habit, see e.g. here:
A CHM topic ID that contains special characters like dots, spaces and a number of other characters is invalid and simply will not work.


Dots work, both leading and internal.

That is one of the many restrictions of the ancient Microsoft CHM system. (Actually, Windows itself doesn't really support them either, but current versions do a lot of acrobatics to make it look as though it does.)


Well, nothing is perfect. But not being perfect doesn't automatically make it crappy. One must be very careful to call out bugs, because some bugs are not caused by the viewer, but by the composing app (e.g. somebody using an old, buggy chm compiler)

Btw *.hlp is even more ancient, but it works much better than *.chm.


I never liked its RTF basis. The generation side of it was Hell for anything non trivial.

Not that its matters, there is simply no substitute with CHM. HLP is deprecated and cumbersome to install for newbs, and has no cross-platform support.

Linux never even made it to even the most basic helpsystem, pointing a browser at some lose html is as far as they got in 25 years.

Nevermind, I have no problem with you being fond of trees, you just picked a really bad example. Let's blame Micros**t ;-)


Yes, apparently I picked I topic you can't be rational about.
coderJeff
Site Admin
Posts: 3016
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: Next generation name is FB++ ?

Postby coderJeff » Jul 14, 2018 13:18

marcov wrote:P.s. it would be fun to have a data structures thread.

Yes. The concepts for stack, list, tree, etc, often have only a few simple rules. I find what gets interesting/challenging is writing bug free code to manage memory and reliably access the data in a way that's easy/convenient/simple to program. Text books I have read like to end chapters with removal/deletion/destruction left as exercise for the reader; and that's the hardest part in my opinion.
FB can do pointers, so that's available, but with support for dynamic sized arrays, in structures too, indexing methods may be convenient, especially for copying data, or saving and loading structures in files.
marcov
Posts: 2760
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

Re: Next generation name is FB++ ?

Postby marcov » Jul 14, 2018 14:48

coderJeff wrote:
marcov wrote:P.s. it would be fun to have a data structures thread.

Yes. The concepts for stack, list, tree, etc, often have only a few simple rules. I find what gets interesting/challenging is writing bug free code to manage memory and reliably access the data in a way that's easy/convenient/simple to program. Text books I have read like to end chapters with removal/deletion/destruction left as exercise for the reader; and that's the hardest part in my opinion.
FB can do pointers, so that's available, but with support for dynamic sized arrays, in structures too, indexing methods may be convenient, especially for copying data, or saving and loading structures in files.


The subject has always been of interest to me. I now have about 15 years of professional coding after graduation, and it gives me a certain perspective. The last few years in the Pascal worlds Generics/templates etc were introduced, and it has required some adapting.

Lessons I've learned:
  1. Big O notation is not the gospel, but a tool to do quick back-of-beercoaster estimations of performance. If your number of elements won't go to infinite, you need to regard the constant overheads too.
  2. Many datastructure classes over-focus on lookup, and the lessons scale bad to situations where the number of lookups is not an magnitude over other operations, and are not for single values.
  3. smart indexing and caching goes a long way.
  4. Ordered lists are relatively ignored by datastructure classes.
  5. Generics must support value types. Generic types that only support reference types and/or classes are limited.
  6. Token replay generics (like C++) are superior technically to interface based generics (only use what you declare like .NET). However error hunting and compiler complexity is a concern. If you go for token replay, also allow interface based generics. If you are a native compiler without single rooted type system (like .NET,Java to a high degree have), you will probably need token-replay.

I learned more about datastructure performance estimation in some of the advanced SQL classes than in datastructure class. The SQL classes introduced a linear approximation (performance = A * elements + B) , and unless it is some known extreme case (only doing lookup of single elements) that is a better estimator than big O()
Lost Zergling
Posts: 240
Joined: Dec 02, 2011 22:51
Location: France

Re: Next generation name is FB++ ?

Postby Lost Zergling » Jul 14, 2018 20:44

I wish to react to the idea of the "Left Child Right Sibling tree" which is very effective in the most common cases but nevertheless presents to me some limitations for those who wish a more complete tool. Compared to the "Left Child Right Sibling Tree", LZLE allows two-way parsing and upwelling of tree branches. These two functions are useful for significant optimization of speed in some common cases. Another important feature is the ability to track the data in the chronological order of the key sequence (or in a progammatic way you choose): this allows you to map the tree in a sorted way while keeping the possibility to browse the keys in the chronological order they where mapped. The sorted mapping is often faster than the unsorted mapping, so anticipating the sort results in a negative resource cost (in speed, not in memory). The concept of rollbackreduce (RestoreHash) is to my knowledge completely original as a set of instruction. Thus, by launching a NodeFlat followed by a HashSort (1) and a RestoreHash, it is possible to sort an index on demand. Combined with the functions of "Snatch", it becomes possible, using a second instance, to sort a selected part of the tree (or to perform other operations). The NodeFlat (Reduce) function can also be set according to the execution context of the algorithm (in a loop or not using NFrecursive) to match the expected result. Up to my opinion, this results in a versatile and intuitive tool with globally correct performance once the step taken in hand acquired.
coderJeff
Site Admin
Posts: 3016
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: Next generation name is FB++ ?

Postby coderJeff » Jul 15, 2018 12:20

marcov, heh, my level of knowledge and experience may be less than engaging for you... I will try anyway.

It's been many years since I programmed professionally, so my projects these days are just for myself. and tend to use simple approaches using stacks, lists, hashes, that are often kind of fragile around the edges.

Last summer I started working on a balanced & sorted binary tree API, only because I had never really used one. I was going to share it online, and I only needed to write the unit tests for it, but got side tracked rewriting fbc's test-suite in the process and haven't been back to it.

Even if the end result fails, I plan to try it with a program I wrote. Years ago I wrote a program to synchronize files & directories between 2 locations. I know there are tools available to do this, but this is something I wanted to control myself, and could adjust the heuristics to my work method. The first version, which worked well for me for many years, is starting to show scaling problems as directory tree it synchronized has gone 10,000 files to 200,000 files.

Here's what I was working on (from the description I left for myself in the code):

Code: Select all

'' SYS_TTREE/SYS_TTREE_CTX
''
'' - self-balancing binary search tree
'' - data is stored in the node
'' - uses sys_tlist...() for memory management
''   - nodes are fixed length
''   - multiple nodes per system allocation
'' - in addition to the user data, each node has left
''   right, parent pointers and a balance integer
'' - balance is automatically updated on insert/remove
''   and rotations
'' - when balance is more than +/- 2, rotation performed


It started off as red/black tree, but because I had to store the red/black bit somewhere, I just ended up using an integer (same size as pointer) to store the balance. When implementing the methods for remove/rotate, I found it was really convenient to have a parent pointer. And because I already had a solid LIST type API, I used that to manage the node allocations.

So in total, overhead is 5 pointers and an integer per node. On 64-bit system that's 48 bytes per node, on a record length of 512 bytes. If I update to support unicode, the record length would probably go to 1024 bytes.

Two things about my implementation:
- the overhead seems high (10%), but I have no basis to make that judgement
- right now, API is only a straight-up pointer implementation, with procedural API. No class or interface wrapper. I look back at some of my API's I have written and that tends to be my pattern. Data structures with a procedural API, that I then later wrap with a class/interface for better/safer end usages

Side note to anyone reading: I am fairly satisfied with code I write, and do want to share. But, some of the stuff I write could totally destroy files or hard drive on a pretty low level. So I am very nervous about sharing without writing the unit tests first. Someday...
caseih
Posts: 1366
Joined: Feb 26, 2007 5:32

Re: Next generation name is FB++ ?

Postby caseih » Jul 16, 2018 22:03

jj2007 wrote:I have several interesting and well-tested ways to test the speed of data structures and algorithms.
By this I hope you mean well-established and understood standard algorithms with good polynomial or logarithmic big-O runtimes, though I suspect you're referring to your own ad-hoc, brute-force methods that may or may not scale beyond your smaller scales. EDIT: I'm not saying one's own algorithms are a bad idea; for small sizes of n, it certainly doesn't matter if something is O(n^2) or not.
No, in this case I am talking about my experience with the crappy CHM format because markov is fond of it:
marcov wrote:I'm fond of B+Trees, specially if the nodes hold value types (because then if links are indexes instead of pointers, you can just memorymap it). Microsoft uses it heavily in its IStorage like fileformat. CHM is internally a derivate of that.
See also this post of today: Having a problem with help file - masm32.chm
Marcov never said he was fond of CHM. He said he was fond of B+ trees, which happens to be what CHM format uses internally. B+ trees are very useful and widely used because of their characteristiscs. But besides that, CHM is not the same thing as the MS CHM viewer.
Last edited by caseih on Jul 16, 2018 22:28, edited 1 time in total.
marcov
Posts: 2760
Joined: Jun 16, 2005 9:45
Location: Eindhoven, NL
Contact:

Re: Next generation name is FB++ ?

Postby marcov » Jul 16, 2018 22:27

coderJeff wrote:
It started off as red/black tree, but because I had to store the red/black bit somewhere, I just ended up using an integer (same size as pointer) to store the balance.


The classic way is to use the lower bit(s) of your pointers. These are nearly always 0 (addressed handled out by a suballocator are typically dividable by 8 or even 16 or 32). You can store data there by orring a number <8 and getting the pure pointer value by doing realptr=stuffedptr and not 7

This way you don't need to change your basic element allocation size. The downside is that you must access all pointers via the (ANDing) macros.
caseih
Posts: 1366
Joined: Feb 26, 2007 5:32

Re: Next generation name is FB++ ?

Postby caseih » Jul 16, 2018 22:33

badidea wrote:The chm file works fast here (linux). The only crappy thing here is my current viewer "xCHM", KDE's "KCHM" works better.
If it is slow, it is probably the virus scanner. Somehow Microsoft thought is was a good idea to make a 'Compiled Help File'.

At one time, this would have probably been a reasonable thing to do and expect. There were really only two choices. Either compile it into a binary tree once and store the result to a file for fast loading (probably memory-mapped and usable directly), or compile it every time the help file is open. Given the hardware constraints of the time, I can understand why they chose the former. Now of course we would choose the latter every time, as that divorces the in-memory format from the disk format, which is desirable for portability and standardization reasons, as well as flexibility. Just like how we now use XML-based formats for MS Office.
jj2007
Posts: 1214
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Next generation name is FB++ ?

Postby jj2007 » Jul 16, 2018 22:47

caseih wrote:I hope you mean well-established and understood standard algorithms with good polynomial or logarithmic big-O runtimes, though I suspect you're referring to your own ad-hoc, brute-force methods
Is there any specific reason why you play that "me professional - you hobbyist" game? Does it make you feel better? If so, go ahead, I am always glad to help.
caseih
Posts: 1366
Joined: Feb 26, 2007 5:32

Re: Next generation name is FB++ ?

Postby caseih » Jul 17, 2018 1:16

Not at all; you're reading in tone and innuendo where none was intended and really doesn't exist.

I just see a propensity for some on this list to roll their own ad-hoc algorithms when a standard algorithm is often simpler and faster. Even If one is a low-level enthusiast then using existing such things should be appealing because they are so efficient. On the other hand if one just wants to get the job done, such things would be even more appealing because the hard work is already done. Besides all that, it's really fun to see how these things work.

Even if you're not a professional programmer, I still highly recommend this path. Even if just for its educational value. If I was smarter I might be able to develop my own algorithms that scale and work well, but I'm certainly not so I rely as much as possible on trusted algorithms and data structures, even when I implement them myself.

I thoroughly enjoyed my university days and I learned a lot about computer science. I guess I error in assuming that others, hobbiests and professionals, alike share the same interest in learning how to use these things to write fun and useful programs.

Return to “Community Discussion”

Who is online

Users browsing this forum: No registered users and 3 guests