The Ultimate FB HashMap

Post your FreeBASIC source, examples, tips and tricks here. Please don’t post code without including an explanation.
Post Reply
Zamaster
Posts: 1025
Joined: Jun 20, 2005 21:40
Contact:

The Ultimate FB HashMap

Post by Zamaster »

I know we've all done 96 of these, but are yours as overboard as this? I'm updating my personal FB libraries and wanted to redo my associated array via templated Hash map, so I did. Design choices: Key types are strictly long and zstring for now, and are not explicitly OOP constructs. Key types must have macros associated with them hard coded, this was done entirely for speed reasons. Assembly is used for the actual hashing, because lets face it FB ain't about to optimize any routines, fear not its x64 friendly.

Also, this table scales INSANELY well (adding a million items or one, insert/remove/seek times are the same): it never rehashes the whole table and rarely needs to allocate storage outside of the contiguous table block, in fact, a great deal of effort was put into making it cache friendly (like allocating fixed space for small strings). This table also favors (faster seek time) items stored more recently.

If you want to see how fast this thing is, add 100000 items with string keys and seek the whole table. While there are obviously faster ways to write a hash map, this implementation should provide a strong balance of not just speed, but density, scalability, and access consistency. No iterators, that's for a HashList.

Because I know how much everyone loves single file libs, just #include "hashmap.bi" , and be sure to define it somewhere at the top of your code (see the example). Don't worry about defining it multiple times in all needed files (poor C like template libraries have this problem), it has guards for that. Also its wrapped in a namespace called "dsm."

hashmap.bi

Code: Select all

#ifndef HASHMAP_BI
#define HASHMAP_BI

'from dsm\dsmstd.bi
namespace dsm

    enum bool
        false = 0
        true = 1
    end enum
    type char as ubyte
    #define NULL 0
    type size_t as ulong

end namespace

#define HASHMAP_CONTIGUOUS_BLOCK_N 4
#define HASHMAP_INITIAL_ROW_N 8
#define HASHMAP_SPLIT_RATIO 0.8
#define HASHMAP_COMPACT_RATIO 0.2

namespace dsm
    #macro HASHMAP_DELETION_LOGIC()
        for j = i to start - 1
            cur_row->slots(j) = cur_row->slots(j + 1)
        next j
        cur_row->size -= 1
        if cur_row->size = 0 then
            if last_row then
                last_row->next_block = cur_row->next_block
                deallocate(cur_row)
            else
                if cur_row->next_block then
                    *(cur_row) = *(cur_row->next_block)
                end if
            end if
        end if
        used_size -= 1
        if (cdbl(used_size) / (capacity * HASHMAP_CONTIGUOUS_BLOCK_N)) < _
            HASHMAP_COMPACT_RATIO then down_split_entry()
        return false
    #endmacro
    ' -------------------------- LONG ----------------------------
    #macro HASHMAP_long_STORAGE_DATA()
        key as long
    #endmacro
    #macro HASHMAP_long_DESTRUCT_LOGIC()
        ''
    #endmacro
    #macro HASHMAP_long_INSERTION_LOGIC()
        .key = _key
    #endmacro
    #macro HASHMAP_long_DELETION_LOGIC()
        if _key = cur_row->slots(i).key then
            HASHMAP_DELETION_LOGIC()
        end if
    #endmacro
    #macro HASHMAP_long_EXISTS_LOGIC()
        if _key = cur_row->slots(i).key then return true
    #endmacro
    #macro HASHMAP_long_RETRIEVE_LOGIC()
        if _key = cur_row->slots(i).key then 
            _item = cur_row->slots(i).data_
            return true
        end if
    #endmacro
    #macro HASHMAP_long_RETRIEVE_R_LOGIC()
        if _key = cur_row->slots(i).key then 
            return cur_row->slots(i).data_
        end if
    #endmacro   
    #macro HASHMAP_long_CALC_NEW_POS(_KEYTYPE_, _INDEX_)
        new_pos = hash_##_KEYTYPE_(this, cur_row->slots(_INDEX_).key)
    #endmacro
    #macro HASHMAP_DECLARE_HASH_WRAP_long(_KEYTYPE_, _TYPENAME_)
    function hash_wrap_long naked cdecl _
    ( _
        byref _table as HashMap_##_KEYTYPE_##_TYPENAME_, _
        _key as long _
    ) as size_t    
    #endmacro
    #macro HASHMAP_DEFINE_HASH_WRAP_long(_KEYTYPE_, _TYPENAME_)
    function HashMap_##_KEYTYPE_##_TYPENAME_.hash_wrap_long naked cdecl _
    ( _
        byref _table as HashMap_##_KEYTYPE_##_TYPENAME_, _
        _key as long _
    ) as size_t
        asm
        #ifdef __FB_64BIT__
                mov     rcx,                    &h890390f1daf308c 
                mov     rax,                    qword ptr [rsp+16]
                shr     qword ptr [rsp+16],     32
                xor     rax,                    qword ptr [rsp+16]
                mul     rcx
        #ifndef HASHMAP_FAST
                mov     qword ptr [rsp+16],     rax
                shr     qword ptr [rsp+16],     32
                xor     rax,                    qword ptr [rsp+8]
                mul     rcx
        #endif
                mov     qword ptr [rsp+16],     rax
                shr     qword ptr [rsp+16],     32
                xor     rax,                    qword ptr [rsp+16]
                mov     rcx,                    qword ptr [rsp+8]
                mov     rdx,                    rax
                and     rax,                    qword ptr [rcx+40]
                cmp     rax,                    qword ptr [rcx+8]
                jl      dsm_hashmap_hashw_long_upperlevel_64
                ret
            dsm_hashmap_hashw_long_upperlevel_64:
                and     rdx,                    qword ptr [rcx+48]
                mov     rax,                    rdx
                ret
        #else
                mov     ecx,                    &h45D9F3B
                mov     eax,                    dword ptr [esp+8]
                shr     dword ptr [esp+8],      16
                xor     eax,                    dword ptr [esp+8]
                mul     ecx
        #ifndef HASHMAP_FAST
                mov     dword ptr [esp+8],      eax
                shr     dword ptr [esp+8],      16
                xor     eax,                    dword ptr [esp+8]
                mul     ecx
        #endif
                mov     dword ptr [esp+8],      eax
                shr     dword ptr [esp+8],      16
                xor     eax,                    dword ptr [esp+8]
                mov     ecx,                    dword ptr [esp+4]
                mov     edx,                    eax
                and     eax,                    dword ptr [ecx+20]
                cmp     eax,                    dword ptr [ecx+4]
                jl      dsm_hashmap_hashw_long_upperlevel_32
                ret
            dsm_hashmap_hashw_long_upperlevel_32:
                and     edx,                    dword ptr [ecx+24]
                mov     eax,                    edx
                ret
        #endif        
        end asm
    end function
    #endmacro
    #macro HASHMAP_DECLARE_HASH_long(_KEYTYPE_, _TYPENAME_)
    declare function hash_long naked cdecl _
    ( _
        _key as long _
    ) as size_t    
    #endmacro
    #macro HASHMAP_DEFINE_HASH_long(_KEYTYPE_, _TYPENAME_)
    function HashMap_##_KEYTYPE_##_TYPENAME_.hash_long naked cdecl _
    ( _
        _key as long _
    ) as size_t
        asm
        #ifdef __FB_64BIT__
                mov     rcx,                    &h890390f1daf308c 
                mov     rax,                    qword ptr [rsp+8]
                shr     qword ptr [rsp+8],      32
                xor     rax,                    qword ptr [rsp+8]
                mul     rcx
        #ifndef HASHMAP_FAST
                mov     qword ptr [rsp+8],      rax
                shr     qword ptr [rsp+8],      32
                xor     rax,                    qword ptr [rsp+8]
                mul     rcx
        #endif
                mov     qword ptr [rsp+8],      rax
                shr     qword ptr [rsp+8],      32
                xor     rax,                    qword ptr [rsp+8]
                ret
        #else
                mov     ecx,                    &h45D9F3B
                mov     eax,                    dword ptr [esp+4]
                shr     dword ptr [esp+4],      16
                xor     eax,                    dword ptr [esp+4]
                mul     ecx
        #ifndef HASHMAP_FAST
                mov     dword ptr [esp+4],      eax
                shr     dword ptr [esp+4],      16
                xor     eax,                    dword ptr [esp+4]
                mul     ecx
        #endif
                mov     dword ptr [esp+4],      eax
                shr     dword ptr [esp+4],      16
                xor     eax,                    dword ptr [esp+4]
                ret
        #endif        
        end asm         
    end function
    #endmacro
    ' -------------------------- ZSTRING ----------------------------
    #define HASHMAP_zstring_BUFFER_N 32
    #macro HASHMAP_zstring_STORAGE_DATA()
        is_contiguous as bool
        union
            key_internal as zstring * HASHMAP_zstring_BUFFER_N
            key_external as zstring ptr
        end union
    #endmacro
    #macro HASHMAP_zstring_DESTRUCT_LOGIC()
        dim as integer j
        for j = 0 to cur_block->size - 1
            if cur_block->slots(j).is_contiguous = false then
                deallocate(cur_block->slots(j).key_external)
            end if
        next j    
    #endmacro
    #macro HASHMAP_zstring_INSERTION_LOGIC()
        dim as size_t key_length = len(_key)
        if key_length < HASHMAP_zstring_BUFFER_N then
            .key_internal = _key
            .is_contiguous = true
        else
            .key_external = allocate(key_length + 1)
            *(.key_external) = _key 
            .is_contiguous = false  
        end if      
    #endmacro
    #macro HASHMAP_zstring_DELETION_LOGIC()
        if cur_row->slots(i).is_contiguous = false then
            if _key = *(cur_row->slots(i).key_external) then
                deallocate(cur_row->slots(i).key_external)
                HASHMAP_DELETION_LOGIC()
            end if
        else
            if _key = cur_row->slots(i).key_internal then
                HASHMAP_DELETION_LOGIC()
            end if              
        end if
    #endmacro
    #macro HASHMAP_zstring_EXISTS_LOGIC()
        if cur_row->slots(i).is_contiguous = true then
            if _key = cur_row->slots(i).key_internal then
                return true
            end if
        else
            if _key = *(cur_row->slots(i).key_external) then
                return true
            end if              
        end if
    #endmacro
    #macro HASHMAP_zstring_RETRIEVE_LOGIC()
        if cur_row->slots(i).is_contiguous = true then
            if _key = cur_row->slots(i).key_internal then
                _item = cur_row->slots(i).data_
                return true
            end if
        else
            if _key = *(cur_row->slots(i).key_external) then
                _item = cur_row->slots(i).data_
                return true
            end if              
        end if
    #endmacro  
    #macro HASHMAP_zstring_RETRIEVE_R_LOGIC()
        if cur_row->slots(i).is_contiguous = true then
            if _key = cur_row->slots(i).key_internal then
                return cur_row->slots(i).data_
            end if
        else
            if _key = *(cur_row->slots(i).key_external) then
                return cur_row->slots(i).data_
            end if              
        end if
    #endmacro    
    #macro HASHMAP_zstring_CALC_NEW_POS(_KEYTYPE_, _INDEX_)
        if cur_row->slots(_INDEX_).is_contiguous = true then
            new_pos = hash_##_KEYTYPE_ _
            ( _
                cur_row->slots(_INDEX_).key_internal _
            )
        else
            new_pos = hash_##_KEYTYPE_ _
            ( _
                *(cur_row->slots(_INDEX_).key_external) _
            )
        end if
    #endmacro
    #macro HASHMAP_DECLARE_HASH_WRAP_zstring(_KEYTYPE_, _TYPENAME_)
    declare function hash_wrap_zstring naked cdecl _
    ( _
        byref _table as HashMap_##_KEYTYPE_##_TYPENAME_, _
        _key as zstring _
    ) as size_t    
    #endmacro
    #macro HASHMAP_DEFINE_HASH_WRAP_zstring(_KEYTYPE_, _TYPENAME_)
    function HashMap_##_KEYTYPE_##_TYPENAME_.hash_wrap_zstring naked cdecl _
    ( _
        byref _table as HashMap_##_KEYTYPE_##_TYPENAME_, _
        _key as zstring _
    ) as size_t
        asm
        #ifdef __FB_64BIT__
                push    rbx
                mov     rax,                    14695981039346656037
                mov     rcx,                    1099511628211
                mov     rbx,                    dword ptr [esp+32]
            dsm_hashmap_hashw_zstring_loopstart_64:
                mov     dl,                     byte ptr [rbx]
                or      dl,                     dl
                jz      dsm_hashmap_hashw_zstring_return_64
                xor     al,                     dl          
                mul     rcx
                inc     rbx
                jmp     dsm_hashmap_hashw_zstring_loopstart_64
            dsm_hashmap_hashw_zstring_return_64:
                pop     rbx
                mov     rcx,                    qword ptr [rsp+8]
                mov     rdx,                    rax
                and     rax,                    qword ptr [rcx+40]
                cmp     rax,                    qword ptr [rcx+8]
                jl      dsm_hashmap_hashw_zstring_upperlevel_64
                ret
            dsm_hashmap_hashw_zstring_upperlevel_64:
                and     rdx,                    qword ptr [rcx+48]
                mov     rax,                    rdx
                ret                         
        #else   
                push    ebx
                mov     eax,                    2166136261
                mov     ecx,                    16777619
                mov     ebx,                    dword ptr [esp+16]
            dsm_hashmap_hashw_zstring_loopstart_32:
                mov     dl,                     byte ptr [ebx]
                or      dl,                     dl
                jz      dsm_hashmap_hashw_zstring_return_32
                xor     al,                     dl
                mul     ecx
                inc     ebx
                jmp     dsm_hashmap_hashw_zstring_loopstart_32
            dsm_hashmap_hashw_zstring_return_32:
                pop     ebx
                mov     ecx,                    dword ptr [esp+4]
                mov     edx,                    eax
                and     eax,                    dword ptr [ecx+20]
                cmp     eax,                    dword ptr [ecx+4]
                jl      dsm_hashmap_hashw_zstring_upperlevel_32
                ret
            dsm_hashmap_hashw_zstring_upperlevel_32:
                and     edx,                    dword ptr [ecx+24]
                mov     eax,                    edx
                ret
        #endif    
        end asm      
    end function
    #endmacro
    #macro HASHMAP_DECLARE_HASH_zstring(_KEYTYPE_, _TYPENAME_)
    declare function hash_zstring naked cdecl (_key as zstring) as size_t    
    #endmacro
    #macro HASHMAP_DEFINE_HASH_zstring(_KEYTYPE_, _TYPENAME_)
    function HashMap_##_KEYTYPE_##_TYPENAME_.hash_zstring naked cdecl _
    ( _
        _key as zstring _
    ) as size_t
        asm
        #ifdef __FB_64BIT__
                push    rbx
                mov     rax,                    14695981039346656037
                mov     rcx,                    1099511628211
                mov     rbx,                    dword ptr [esp+24]
            dsm_hashmap_hash_zstring_loopstart_64:
                mov     dl,                     byte ptr [rbx]
                or      dl,                     dl
                jz      dsm_hashmap_hash_zstring_return_64
                xor     al,                     dl          
                mul     rcx
                inc     rbx
                jmp     dsm_hashmap_hash_zstring_loopstart_64
            dsm_hashmap_hash_zstring_return_64:
                pop     rbx          
                ret
        #else   
                push    ebx
                mov     eax,                    2166136261
                mov     ecx,                    16777619
                mov     ebx,                    dword ptr [esp+12]
            dsm_hashmap_hash_zstring_loopstart_32:
                mov     dl,                     byte ptr [ebx]
                or      dl,                     dl
                jz      dsm_hashmap_hash_zstring_return_32
                xor     al,                     dl
                mul     ecx
                inc     ebx
                jmp     dsm_hashmap_hash_zstring_loopstart_32
            dsm_hashmap_hash_zstring_return_32:
                pop     ebx
                ret
        #endif    
        end asm        
    end function
    #endmacro
end namespace

#define HashMap(_KEYTYPE_, _TYPENAME_) HashMap_##_KEYTYPE_##_TYPENAME_

namespace dsm
    type HashMap_Initialization_Object
        public:
            declare constructor(_hashmap_construct as sub(),_
                                _hashmap_destruct as sub())
            declare destructor()
        private:
            as sub() hashmap_destruct
    end type
    constructor HashMap_Initialization_Object(_hashmap_construct as sub(),_
                                              _hashmap_destruct as sub())
        _hashmap_construct()
        hashmap_destruct = _hashmap_destruct
    end constructor
    destructor HashMap_Initialization_Object()
        hashmap_destruct()
    end destructor
end namespace

#define HASHMAP_ROWSIZE(_K_, _T_) sizeof(HashMap_##_K_##_T_##_Row)

#macro dsm_HashMap_define(_KEYTYPE_, _TYPENAME_)

#ifndef HASHMAP_INITIALIZED_##_KEYTYPE_##_TYPENAME_
#define HASHMAP_INITIALIZED_##_KEYTYPE_##_TYPENAME_ 

namespace dsm
    
    type HashMap_##_KEYTYPE_##_TYPENAME_##_key_pair
        HASHMAP_##_KEYTYPE_##_STORAGE_DATA()
        as _TYPENAME_ data_
    end type
    type HashMap_##_KEYTYPE_##_TYPENAME_##_Row
        as size_t size
        as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr next_block
        as HashMap_##_KEYTYPE_##_TYPENAME_##_key_pair _
            slots(0 to HASHMAP_CONTIGUOUS_BLOCK_N-1)
    end type
    
    type HashMap_##_KEYTYPE_##_TYPENAME_
        
        public:
            
            declare constructor()
            declare destructor()
    
            declare sub insert(_key as _KEYTYPE_, byref _item as _TYPENAME_)
            declare function remove(_key as _KEYTYPE_) as bool
            declare function exists(_key as _KEYTYPE_) as bool            
            declare function retrieve(_key as _KEYTYPE_, _
                                      byref _item as _TYPENAME_) as bool            
            declare function retrieve(_key as _KEYTYPE_) byref as _TYPENAME_    
                
            declare sub clear()
        protected:
            declare static sub static_construct()
            declare static sub static_destruct()
    
        private:
            static as HashMap_Initialization_Object init_object
            
            HASHMAP_DECLARE_HASH_WRAP_##_KEYTYPE_(_KEYTYPE_, _TYPENAME_)
            HASHMAP_DECLARE_HASH_##_KEYTYPE_(_KEYTYPE_, _TYPENAME_)
            
            declare sub init()
            declare sub clear_data()
            declare sub up_split_entry()
            declare sub down_split_entry()

            as char ptr data_
            as size_t split
            as size_t capacity
            as size_t used_size
            as size_t level
            as size_t level_wrap_mask
            as size_t level_wrap_mask_2x

            static as size_t row_size_adjust
            static as size_t row_shift_mul
    end type
    type HashMap_##_KEYTYPE_##_TYPENAME_##_Accessor extends _
        HashMap_##_KEYTYPE_##_TYPENAME_
        public:
            declare static sub construct_()
            declare static sub destruct_()
        private:
            as integer _placeholder_
    end type
    sub HashMap_##_KEYTYPE_##_TYPENAME_##_Accessor.construct_()
        static_construct()
    end sub
    sub HashMap_##_KEYTYPE_##_TYPENAME_##_Accessor.destruct_()
        static_destruct()
    end sub
    
    dim as HashMap_Initialization_Object _
        HashMap_##_KEYTYPE_##_TYPENAME_.init_object = _
        HashMap_Initialization_Object _
        ( _
            @HashMap_##_KEYTYPE_##_TYPENAME_##_Accessor.construct_, _ 
            @HashMap_##_KEYTYPE_##_TYPENAME_##_Accessor.destruct_ _
        ) 
        
    dim as size_t HashMap_##_KEYTYPE_##_TYPENAME_.row_size_adjust = 1
    dim as size_t HashMap_##_KEYTYPE_##_TYPENAME_.row_shift_mul = 0

    sub HashMap_##_KEYTYPE_##_TYPENAME_.static_construct()
        dim as size_t temp_size
        temp_size = sizeof(HashMap_##_KEYTYPE_##_TYPENAME_##_Row)
        do
            row_shift_mul += 1
            temp_size shr= 1
        loop while (temp_size <> 0)
        row_size_adjust shl= row_shift_mul
        if ((row_size_adjust shr 1) = sizeof( _
            HashMap_##_KEYTYPE_##_TYPENAME_##_Row)) then
            
            row_shift_mul -= 1
            row_size_adjust shr= 1
        end if
    end sub
    
    sub HashMap_##_KEYTYPE_##_TYPENAME_.static_destruct()
        ''
    end sub
    
    sub HashMap_##_KEYTYPE_##_TYPENAME_.init()
        split = 0
        level = HASHMAP_INITIAL_ROW_N
        level_wrap_mask = level - 1
        level_wrap_mask_2x = level shl 1 - 1
        used_size = 0
        capacity = HASHMAP_INITIAL_ROW_N    
    end sub
    
    constructor HashMap_##_KEYTYPE_##_TYPENAME_()
        init()
        data_ = callocate(row_size_adjust * capacity)
    end constructor
    
    sub HashMap_##_KEYTYPE_##_TYPENAME_.clear_data()
        dim as integer i
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_block
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr next_block
        dim as bool not_first_block
        
        for i = 0 to capacity - 1
            cur_block = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                             data_ + i shl row_shift_mul)
            not_first_block = false
            while(cur_block)
                next_block = cur_block->next_block
                HASHMAP_##_KEYTYPE_##_DESTRUCT_LOGIC()
                if not_first_block then deallocate(cur_block)
                not_first_block = true
                cur_block = next_block
            wend
        next i       
    end sub
    
    destructor HashMap_##_KEYTYPE_##_TYPENAME_()
        clear_data()
        deallocate(data_)
    end destructor
    
    HASHMAP_DEFINE_HASH_WRAP_##_KEYTYPE_(_KEYTYPE_, _TYPENAME_)
    HASHMAP_DEFINE_HASH_##_KEYTYPE_(_KEYTYPE_, _TYPENAME_)

    sub HashMap_##_KEYTYPE_##_TYPENAME_.insert _
    ( _
        _key as _KEYTYPE_, _
        byref _item as _TYPENAME_ _
    )
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_row
        cur_row = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                       data_ + hash_wrap_##_KEYTYPE_##(this, _key) shl _
                       row_shift_mul)
        while(cur_row->next_block)
            cur_row = cur_row->next_block
        wend
        if cur_row->size = HASHMAP_CONTIGUOUS_BLOCK_N then
            cur_row->next_block = callocate(HASHMAP_ROWSIZE(_KEYTYPE_, _
                                                            _TYPENAME_))
            cur_row = cur_row->next_block
        end if
        with cur_row->slots(cur_row->size)
            .data_ = _item
            HASHMAP_##_KEYTYPE_##_INSERTION_LOGIC()
        end with
        cur_row->size += 1
        used_size += 1
        if (cdbl(used_size) / (capacity * HASHMAP_CONTIGUOUS_BLOCK_N)) > _
            HASHMAP_SPLIT_RATIO then up_split_entry()
    end sub
    
    
    function HashMap_##_KEYTYPE_##_TYPENAME_.remove _
    ( _
        _key as _KEYTYPE_ _
    ) as bool
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_row
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr last_row
        dim as integer i
        dim as integer j
        dim as integer start
        cur_row = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                       data_ + hash_wrap_##_KEYTYPE_##(this, _key) shl _
                       row_shift_mul)
        last_row = NULL
        do
            start = cur_row->size - 1
            for i = start to 0 step -1
                HASHMAP_##_KEYTYPE_##_DELETION_LOGIC()
            next i
            last_row = cur_row
            cur_row = cur_row->next_block
        loop while (cur_row)
        return true 
    end function
    
    function HashMap_##_KEYTYPE_##_TYPENAME_.exists _
    ( _
        _key as _KEYTYPE_ _
    ) as bool
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_row
        dim as integer i
        dim as integer start
        cur_row = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                       data_ + hash_wrap_##_KEYTYPE_##(this, _key) shl _
                       row_shift_mul)
        do
            start = cur_row->size - 1
            for i = start to 0 step -1
                HASHMAP_##_KEYTYPE_##_EXISTS_LOGIC()
            next i
            cur_row = cur_row->next_block
        loop while (cur_row)
        return false    
    end function

    function HashMap_##_KEYTYPE_##_TYPENAME_.retrieve _
    ( _
        _key as _KEYTYPE_, _
        byref _item as _TYPENAME_ _
    ) as bool
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_row
        dim as integer i
        dim as integer start
        cur_row = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                       data_ + hash_wrap_##_KEYTYPE_##(this, _key) shl _
                       row_shift_mul)
        do
            start = cur_row->size - 1
            for i = start to 0 step -1
                HASHMAP_##_KEYTYPE_##_RETRIEVE_LOGIC()
            next i
            cur_row = cur_row->next_block
        loop while (cur_row)
        return false                      
    end function
    
    function HashMap_##_KEYTYPE_##_TYPENAME_.retrieve _
    ( _
        _key as _KEYTYPE_ _
    ) byref as _TYPENAME_ 
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_row
        dim as integer i
        dim as integer start
        cur_row = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                       data_ + hash_wrap_##_KEYTYPE_##(this, _key) shl _
                       row_shift_mul)
        do
            start = cur_row->size - 1
            for i = start to 0 step -1
                HASHMAP_##_KEYTYPE_##_RETRIEVE_R_LOGIC()
            next i
            cur_row = cur_row->next_block
        loop while (cur_row)
    end function
    
    sub HashMap_##_KEYTYPE_##_TYPENAME_.clear()
        dim as integer i
        clear_data()
        init()
        deallocate(data_)
        data_ = callocate(row_size_adjust * capacity)
    end sub
    
    sub HashMap_##_KEYTYPE_##_TYPENAME_.up_split_entry()
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_row
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr next_row
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr last_row        
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_insert_row  
        dim as integer i
        dim as integer j
        dim as integer start
        dim as integer insert_i
        dim as size_t new_pos

        capacity += 1
        data_ = reallocate(data_, capacity * row_size_adjust)
        cur_insert_row = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                              data_ + (capacity - 1) shl row_shift_mul) 
        cur_insert_row->size = 0
        cur_insert_row->next_block = NULL
        cur_row = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                       data_ + split shl row_shift_mul) 
                       
        if cur_row->size > 0 then
            last_row = NULL
            do
                next_row = cur_row->next_block
                start = cur_row->size - 1
                i = cur_row->size - 1
                while(i >= 0)
                    HASHMAP_##_KEYTYPE_##_CALC_NEW_POS(_KEYTYPE_, i)
                    new_pos and= level_wrap_mask_2x
                    if new_pos <> split then
                        if cur_insert_row->size < _
                           HASHMAP_CONTIGUOUS_BLOCK_N then
                            cur_insert_row->size += 1
                            cur_insert_row->slots _
                            ( _
                                cur_insert_row->size - 1 _
                            ) = cur_row->slots(i) 
                        else
                            cur_insert_row->next_block = callocate _
                            ( _
                                HASHMAP_ROWSIZE(_KEYTYPE_, _TYPENAME_) _
                            )
                            cur_insert_row = cur_insert_row->next_block
                            cur_insert_row->slots(0) = cur_row->slots(i)
                            cur_insert_row->size = 1
                        end if
                        for j = i to start - 1
                            cur_row->slots(j) = cur_row->slots(j + 1)
                        next j
                        cur_row->size -= 1
                        start -= 1
                    end if
                    i -= 1
                wend
                if cur_row->size = 0 then
                    if last_row = NULL then
                        if next_row then 
                            *cur_row = *next_row
                            deallocate(next_row)
                        else
                            cur_row = next_row
                        end if
                    else
                        last_row->next_block = next_row
                        deallocate(cur_row)
                        cur_row = next_row 
                    end if
                else
                    last_row = cur_row
                    cur_row = next_row      
                end if
            loop while cur_row
        end if
        split += 1
        if split >= level then
            level shl= 1
            split = 0
            level_wrap_mask = level - 1
            level_wrap_mask_2x = level shl 1 - 1
        end if
    end sub

    sub HashMap_##_KEYTYPE_##_TYPENAME_.down_split_entry()  
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_row
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr next_row
        dim as HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr cur_insert_row       
        dim as integer i
        dim as integer insert_i
        dim as integer start
        dim as size_t new_pos
        dim as bool not_first_block
        
        if capacity > HASHMAP_INITIAL_ROW_N then
            if split = 0 then
                level /= 2
                split = level - 1
                level_wrap_mask = level - 1
                level_wrap_mask_2x = level shl 1 - 1
            else
                split -= 1
            end if

            cur_row = cast(HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                           data_ + (capacity - 1) shl row_shift_mul)    
            if cur_row->size > 0 then
                not_first_block = false
                HASHMAP_##_KEYTYPE_##_CALC_NEW_POS(_KEYTYPE_, 0)
                new_pos and= level_wrap_mask
                cur_insert_row = cast _
                ( _
                    HashMap_##_KEYTYPE_##_TYPENAME_##_Row ptr, _
                    data_ + new_pos shl row_shift_mul _
                )
                do
                    next_row = cur_row->next_block
                    start = cur_row->size - 1
                    for i = start to 0 step -1
                        do
                            if cur_insert_row->size < _
                               HASHMAP_CONTIGUOUS_BLOCK_N then
                                cur_insert_row->size += 1
                                cur_insert_row->slots _
                                ( _
                                    cur_insert_row->size - 1 _
                                ) = cur_row->slots(i) 
                                exit do
                            elseif cur_insert_row->next_block = NULL then
                                cur_insert_row->next_block = callocate _
                                ( _
                                    HASHMAP_ROWSIZE(_KEYTYPE_, _TYPENAME_) _
                                )
                                cur_insert_row = cur_insert_row->next_block
                                cur_insert_row->slots(0) = _
                                    cur_row->slots(i)
                                cur_insert_row->size = 1
                                exit do
                            else
                                cur_insert_row = cur_insert_row->next_block
                            end if
                        loop
                    next i
                    if not_first_block then deallocate(cur_row)
                    not_first_block = true
                    cur_row = next_row
                loop while (cur_row)
            end if
            
            capacity -= 1
            data_ = reallocate(data_, capacity * row_size_adjust)
            
        end if
        
    end sub
    
end namespace
#endif
#endmacro


#endif
main.bas

Code: Select all

#include "hashmap.bi"
#define TEST_HASHES 100000

dsm_HashMap_define(zstring, integer)

using dsm

dim as HashMap(zstring, integer) table
dim as integer i


function generateGarbageString() as string
    dim as integer rand_char
    dim as integer length
    dim as integer i
    dim as integer capitalize
    dim as string rand_string
    rand_string = ""
    length = int(rnd * 64) + 3
    capitalize = 1
    for i = 0 to length
        if int(rnd * 5) = 0 andAlso capitalize = 0 then
            rand_string += " "
            capitalize = 1
        else
            rand_char = int(rnd * (asc("z") - asc("a"))) + asc("a")
            if capitalize = 1 then
                rand_string += ucase(chr(rand_char))
            else
                rand_string += chr(rand_char)
            end if
            capitalize = 0
        end if
    next i
    return rand_string
end function

redim as string hash_code(0 to TEST_HASHES-1)

for i = 0 to TEST_HASHES-1
    hash_code(i) = generateGarbageString()
    table.insert(hash_code(i), i + 1)
    print i + 1, hash_code(i)
next i
print "inserted all!"
sleep
table.clear()
print "cleared all!"
sleep
for i = 0 to TEST_HASHES-1
    table.insert(hash_code(i), i + 1)
    print i + 1, hash_code(i)
next i
print "inserted all!"
sleep
dim as integer x
x = 0
for i = 0 to TEST_HASHES-1
    x += table.retrieve(hash_code(i))
next i
print x
print "sought all!"
sleep
for i = 0 to TEST_HASHES*0.5-1
    if table.remove(hash_code(i)) then
        print "failure! : "; i + 1;  
        print hash_code(i)
        sleep
    else 
        print i + 1
    end if
next i
print "removed half!"
sleep
for i = 0 to TEST_HASHES*0.5-1
    table.insert(hash_code(i), i + 1)
    print i + 1, hash_code(i)
next i
print "inserted half!"
sleep
for i = 0 to TEST_HASHES-1
    if table.remove(hash_code(i)) then
        print "failure! : "; i + 1;  
        print hash_code(i)
        sleep
    else 
        print i + 1
    end if
next i
print "removed all!"
sleep
end
MOD
Posts: 555
Joined: Jun 11, 2009 20:15

Re: The Ultimate FB HashMap

Post by MOD »

mdTypes also provides some Map classes.

Simple example code:

Code: Select all

#Include Once "md/util/mdMap.bi"

mdMapDeclare(String, String)
Dim As mdMap(String, String) map

Dim As String temp
temp = map.put("key1", "value1")
temp = map.put("key2", "value2")
temp = map.put("key3", "value3")

Dim As mdSet(String) set = map.keySet()

Dim As String element
ForEach(String, element In set)
    Print "key: " & element & " - value: " & map.get(element)
NextEach

Sleep
Zamaster
Posts: 1025
Joined: Jun 20, 2005 21:40
Contact:

Re: The Ultimate FB HashMap

Post by Zamaster »

Right right, I know there are tons of these lying around, I just thought mine was pretty quick :)
brennus
Posts: 6
Joined: Nov 17, 2011 11:03

Re: The Ultimate FB HashMap

Post by brennus »

Hello,

I want to try it but i can't compile it it :

Error: incorrect register `rbx' used with `l' suffix

Thanks for your help


"C:\My Program Files\FreeBASIC\fbc.exe" -v -R -g -s console -exx -O 3 "C:\tcc\main.bas" (dans le dossier : C:\My Program Files\FreeBASIC\)
FreeBASIC Compiler - Version 1.07.1 (2019-09-27), built for win64 (64bit)
Copyright (C) 2004-2019 The FreeBASIC development team.
standalone
target: win64, x86-64, 64bit
compiling: C:\tcc\main.bas -o C:\tcc\main.c (main module)
compiling C: C:\My Program Files\FreeBASIC\bin\win64\gcc.exe -m64 -march=x86-64 -S -nostdlib -nostdinc -Wall -Wno-unused-label -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-main -Werror-implicit-function-declaration -O3 -fno-strict-aliasing -frounding-math -fno-math-errno -fwrapv -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -Wno-format -g -masm=intel "C:\tcc\main.c" -o "C:\tcc\main.asm"
assembling: C:\My Program Files\FreeBASIC\bin\win64\as.exe --64 "C:\tcc\main.asm" -o "C:\tcc\main.o"
C:\tcc\main.asm: Assembler messages:
C:\tcc\main.asm:13: Error: incorrect register `rbx' used with `l' suffix
C:\tcc\main.asm:40: Error: incorrect register `rbx' used with `l' suffix

assembling failed: 'C:\My Program Files\FreeBASIC\bin\win64\as.exe' terminated with exit code 1
Compilation échouée.
SARG
Posts: 1755
Joined: May 27, 2005 7:15
Location: FRANCE

Re: The Ultimate FB HashMap

Post by SARG »

it's not possible to assign a dword (32bit) in a 64bit register.
Try after changing dword by qword lines 297 and 360.

line 297

Code: Select all

mov     rbx,                    dword ptr [esp+32]
-->

Code: Select all

mov     rbx,                    qword ptr [esp+32]
line 360

Code: Select all

mov     rbx,                    dword ptr [esp+24]
-->

Code: Select all

mov     rbx,                    qword ptr [esp+24]
Not tested. If it fails I'll look a bit more.
marcov
Posts: 3454
Joined: Jun 16, 2005 9:45
Location: Netherlands
Contact:

Re: The Ultimate FB HashMap

Post by marcov »

Or assign it to ebx. If you assign ebx, the high 32-bits should be auto cleared.
SARG
Posts: 1755
Joined: May 27, 2005 7:15
Location: FRANCE

Re: The Ultimate FB HashMap

Post by SARG »

marcov wrote:Or assign it to ebx. If you assign ebx, the high 32-bits should be auto cleared.
Yes but as it's a pointer it's cleaner to use rbx even it should also work with ebx.
Makoto WATANABE
Posts: 231
Joined: Apr 10, 2010 11:41
Location: Japan
Contact:

Re: The Ultimate FB HashMap

Post by Makoto WATANABE »

Hi !

I tried to make a program that aggregates the order amount by comparing 15,000 item masters and 10,000 order list.
Since I used the associative array, matching was completed in an instant (less than 1 second)!
Thanks for "The Ultimate FB HashMap".

(I don't know how to sort a multi-dimensional array, so I'm forcing it into a one-dimensional array.)

region.txt

Code: Select all

Alaska
Alabama
Arkansas
Arizona
California
Colorado
Connecticut
District of Columbia
Delaware
Florida
Georgia
Hawaii
Iowa
Idaho
Illinois
Indiana
Kansas
Kentucky
Louisiana
Massachusetts
Maryland
Maine
Michigan
Minnesota
Missouri
Mississippi
Montana
North Carolina
North Dakota
Nebraska
New Hampshire
New Jersey
New Mexico
Nevada
New York
Ohio
Oklahoma
Oregon
Pennsylvania
Rhode Island
South Carolina
South Dakota
Tennessee
Texas
Utah
Virginia
Vermont
Washington
Wisconsin
West Virginia
Wyoming
PrepareTestData.bas

Code: Select all

'Prepare test Data
# include "vbcompat.bi"

Dim Shared Counter As Integer
Dim Shared ItemID As String
Dim STARTT As Long
Dim ENDTIME As Long
Dim Minut As Integer


Sub FabricateData()
   
   Dim RegionArray() As String
   Dim file_name As String 
   Dim file_num As Integer
   Dim FirstDigit As Integer
   Dim SecondDigit As Integer
   Dim ThirdDigit As Integer
   Dim CharacterString As String
   Dim Regions As Integer

   'Fabricate ItemMaster
   
   file_name = "region.txt"
   file_num = FreeFile( )                 '' 有効なファイル番号を検索します
   Regions = 0

   '' ファイルを開きます。そして、ファイル番号をそれに結び付けます。エラーが有れば、抜けます。
   If( Open( file_name For Input As #file_num ) ) Then
      Print "ERROR: 開こうとしたファイル名 " ; file_name
      Sleep
      End -1
   End If

   Do Until EOF( file_num )               '' ファイルの端に達するまで、繰り返します。
      Regions = Regions + 1
      ReDim Preserve RegionArray(Regions)
      Line Input #file_num, RegionArray(Regions)           '' テキストの行を読みます。
      Print RegionArray(Regions)                           '' 画面にそれを出力します。
   Loop
               
   Close #file_num                        '' ファイル番号を通したファイルを閉じます。
   
   Open "ItemMaster.csv"  For Output As #1
   
   Counter = 0

   For FirstDigit = 1 To 26
      For SecondDigit = 1 To 26
         For ThirdDigit = 1 To 26
            CharacterString = ""
            Counter = Counter + 1
            If Counter > 15000 Then Exit For
            ItemID = Chr(64 + FirstDigit) & Chr(64 + SecondDigit) & Chr(64 + ThirdDigit) 'ItemID
            CharacterString = ItemID
            CharacterString = CharacterString & ",""" & RegionArray(Int(Rnd() * Regions) + 1)
            CharacterString = CharacterString & """," &  (Int(Rnd() * 1000) + 1) * 10    'price
            CharacterString = CharacterString & "," &  (Int(Rnd() * 100) + 1) * 10       'weight
            'Print CharacterString
            'Sleep
            Print #1, CharacterString
         Next ThirdDigit
      Next SecondDigit
   Next FirstDigit
   
   Close #1
   
   'Fabricate OrderList
   
   Open "OrderList.csv"  For Output As #1

   For Counter = 1 To 10000
      ItemID = Chr(64 + Int(Rnd() * 26) + 1) & Chr(64 + Int(Rnd() * 26) + 1) & Chr(64 + Int(Rnd() * 26) + 1)
      CharacterString = ItemID & "," & Int(Rnd() * 100) + 1               'quantity
      Print #1, CharacterString
   Next Counter
   
   Close #1
      
End Sub


STARTT=Val(Left(Time,2))*3600+Val(Mid(Time,4,2))*60+Val(Right(Time,2))

Randomize

FabricateData()

ENDTIME = Val(Left(Time,2))*3600+Val(Mid(Time,4,2))*60+Val(Right(Time,2))
Minut=(ENDTIME-STARTT)\60
Print
Print Using "Processing time was ## minutes ## seconds."; Minut; (ENDTIME-STARTT)-Minut*60
Print "*******************************************************"
Print "Please enter any key to exit."
Sleep
OrderSorting.bas

Code: Select all

'Order sorting

'The Ultimate FB HashMap
'by Zamaster ≫ Feb 23, 2016 8:11 
'https://www.freebasic.net/forum/viewtopic.php?p=216439

'table.insert(hash_code(i), i + 1) '索引を生成
'table.clear()                     '索引を消去
'table.retrieve(hash_code(i))      '設定済の索引を取得。未登録のキーを検索すると異常終了する
'table.remove(hash_code(i)) 
'table.exists(hash_code(i))        '索引の有無チェック
'function exists(_key as _KEYTYPE_) as bool         '索引の有無チェック


#Include "hashmap.bi"                                   '★1★★★★★★★★★★★

dsm_HashMap_define(ZString, Integer)                    '★2★★★★★★★★★★★

Using dsm                                               '★3★★★★★★★★★★★

Dim As HashMap(ZString, Integer) MasterItemID           '★4★★★★★★★★★★★
Dim As HashMap(ZString, Integer) OrderItemID            '★4★★★★★★★★★★★

Dim STARTT As Long
Dim ENDTIME As Long
Dim Minut As Integer
Dim Shared ItemID As String     
Dim Shared Region As String
Dim Shared Price As String
Dim Shared Weight As String
Dim Shared Counter As Integer
Dim i As Integer
Dim BoolVar As Boolean
'Dim Quantity As Integer
Dim QuantityString As String
Dim Amount As String
Dim Dimension As Integer
Dim cellString As String
   
Dim file_name As String 
Dim file_num As Integer
Dim CharacterString As String
Dim Regions As Integer
Dim ItemMasterNo As Integer
Dim IntegratedNo As Integer

Dim Shared ItemMasterArray(20000,3) As String 'Region, price, weight
Dim Shared IntegratedOrderArray(20000,5) As String 'Region, ItemID, Quantity, Amount, Weight
Dim Shared SortArray(20000) As String

'****************************************************************

'★★★文字列ソート★★★
Sub StringSort _
    (Array(Any) As String , _
     ByVal StartLine As Integer , _
     ByVal EndLine As Integer)
	
 Dim CenterLine As Integer                       'CenterLine を格納する変数
 Dim Reference As String                         'Referenceを 格納する変数
 Dim Position As Integer                         'Position
 Dim Temporary As String                         '値をスワップするための作業域
 Dim i As Integer                                'ループカウンタ
 
    If StartLine >= EndLine Then Exit Sub        '終了番号が開始番号以下の場合、プロシージャを抜ける
    CenterLine = (StartLine + EndLine) \ 2       'CenterLineを求める
    Reference = Array(CenterLine)                '中央の値をReferenceとする
    Array(CenterLine) = Array(StartLine)         '中央の要素に開始番号の値を格納
    Position = StartLine                         'Positionを開始番号と同じにする
    For i = (StartLine + 1) To EndLine Step 1    '開始番号の次の要素から終了番号までループ
        If Array(i) < Reference  Then            '値がReferenceより小さい場合
            Position = Position + 1              'Positionをインクリメント
            Temporary = Array(Position)          'Array(i) と Array(Position) の値をスワップ
            Array(Position) = Array(i)
            Array(i) = Temporary
        End If
    Next
    Array(StartLine) = Array(Position)           'Array(Position) を開始番号の値にする
    Array(Position) = Reference                  'Referenceを Array(Position) に格納
    StringSort(Array(), StartLine, Position - 1) '分割された配列をクイックソート(再帰)
    StringSort(Array(), Position + 1, EndLine)   '分割された配列をクイックソート(再帰)

End Sub

'****************************************************************
'****************************************************************
   Print "Read ""ItemMaster.csv"" and store each line in ItemMasterArray, indexing ItemID."
'****************************************************************
   Print
   
   file_name = "ItemMaster.csv"
   file_num = FreeFile( )                 '' 有効なファイル番号を検索します

   '' ファイルを開きます。そして、ファイル番号をそれに結び付けます。エラーが有れば、抜けます。
   If( Open( file_name For Input As #file_num ) ) Then
      Print "ERROR: 開こうとしたファイル名 " ; file_name
      Sleep
      End -1
   End If
   
   Counter = 0

   Do Until Eof( file_num )               '' ファイルの端に達するまで、繰り返します。
      ItemID ="": Region ="": Price ="" : Weight =""

      Line Input #file_num, CharacterString           '' テキストの行を読みます。
      ItemID = Left(CharacterString,3)
      Region = Mid(CharacterString,6,InStrRev(CharacterString,"""")-6)
      Price  = Mid(CharacterString,InStrRev(CharacterString,"""")+2,InStrRev(CharacterString,",")-InStrRev(CharacterString,"""")-2)
      Weight = Right(CharacterString,Len(CharacterString)-InStrRev(CharacterString,","))

      BoolVar = MasterItemID.exists(ItemID)              '★6★★★★★★★★★★★

      If BoolVar = False Then
         Counter = Counter + 1
         MasterItemID.insert( ItemID, Counter )          '★5★★★★★★★★★★★
         
         ItemMasterArray(Counter,1) = Region
         ItemMasterArray(Counter,2) = Price
         ItemMasterArray(Counter,3) = Weight   
      Else
         Print BoolVar 
         Print CharacterString                           '' キー重複を画面に出力します。
         Print ItemID , Region , Price ,  Weight
         Print Counter
         Sleep
      End If
      
   Loop
   
   Print CharacterString                           '' 画面に最終行を出力します。
   Print ItemID , Region , Price ,  Weight
   Print Counter
   Print
   
   Close #file_num                        '' ファイル番号を通したファイルを閉じます。


'****************************************************************
'****************************************************************
   Print "Read ""OrderList.csv"" and compare it with ItemMasterArray to aggregate price and weight by item and region."
'****************************************************************
   Print
   
   Open "OrderItemError.csv"  For Output As #1

   file_name = "OrderList.csv"
   file_num = FreeFile( )                 '' 有効なファイル番号を検索します

   '' ファイルを開きます。そして、ファイル番号をそれに結び付けます。エラーが有れば、抜けます。
   If( Open( file_name For Input As #file_num ) ) Then
      Print "ERROR: 開こうとしたファイル名 " ; file_name
      Sleep
      End -1
   End If
   
   Counter = 0
   STARTT=VAL(LEFT(TIME,2))*3600+VAL(MID(TIME,4,2))*60+VAL(RIGHT(TIME,2))

   Do Until Eof( file_num )               '' ファイルの端に達するまで、繰り返します。
      CharacterString = "" : ItemID = "" : QuantityString = ""

      Line Input #file_num, CharacterString           '' テキストの行を読みます。
      ItemID = Left(CharacterString,3)
      QuantityString = Right(CharacterString,Len(CharacterString)-InStrRev(CharacterString,","))
      
      BoolVar = MasterItemID.exists(ItemID)

      If BoolVar = False Then      '★6★★★★★★★★★★★
         'エラー出力
         Print #1, CharacterString         
      Else
         ItemMasterNo = MasterItemID.retrieve(ItemID)
         
         'ItemMasterArray(20000,3)      'Region, price, weight
         'IntegratedOrderArray(20000,5) 'Region, ItemID, Quantity, Amount, Weight
         
         Region = ItemMasterArray(ItemMasterNo,1) & Space(20-Len(ItemMasterArray(ItemMasterNo,1)))
         Amount = Str(Val(QuantityString)*Val(ItemMasterArray(ItemMasterNo,2)))
         Weight = Str(Val(QuantityString)*Val(ItemMasterArray(ItemMasterNo,3)))

         BoolVar = OrderItemID.exists(ItemID)
   
         If BoolVar = False Then                           '★6★★★★★★★★★★★
            Counter = Counter + 1
            OrderItemID.insert( ItemID, Counter )          '★5★★★★★★★★★★★
            
            IntegratedOrderArray(Counter,1) = Region
            IntegratedOrderArray(Counter,2) = ItemID
            IntegratedOrderArray(Counter,3) = QuantityString
            IntegratedOrderArray(Counter,4) = Amount
            IntegratedOrderArray(Counter,5) = Weight
            'Print Counter,Region,ItemID,QuantityString,Amount,Weight
            'sleep
   
         Else
            IntegratedNo = OrderItemID.retrieve(ItemID) 
            
            IntegratedOrderArray(IntegratedNo,3) = Str(Val(IntegratedOrderArray(IntegratedNo,3))+Val(QuantityString))
            IntegratedOrderArray(IntegratedNo,4) = Str(Val(IntegratedOrderArray(IntegratedNo,4))+Val(Amount))
            IntegratedOrderArray(IntegratedNo,5) = Str(Val(IntegratedOrderArray(IntegratedNo,5))+Val(Weight))
            'Print IntegratedNo,ItemID,QuantityString,Amount,Weight
            'sleep
   
         End If
      End If
      
   Loop
   
   ENDTIME = VAL(LEFT(TIME,2))*3600+VAL(MID(TIME,4,2))*60+VAL(RIGHT(TIME,2))
   Minut=(ENDTIME-STARTT)\60
   Print Using "processing time: ## minutes ## seconds"; Minut; (ENDTIME-STARTT)-Minut*60
   Print
   
   Close #1
   Close #file_num                        '' ファイル番号を通したファイルを閉じます。


'****************************************************************
'****************************************************************
   Print "Output the order aggregation result as ""OrderSorting.txt""."
'****************************************************************
   Print

   'IntegratedOrderArray(20000,5) 'Region, ItemID, Quantity, Amount, Weight
   Open "OrderSorting.txt"  For Output As #1
      For i = 1 To Counter
         CharacterString = ""
         For Dimension =1 To 5
            cellString = IntegratedOrderArray(i,Dimension)
            If Dimension = 1 Then
               cellString = """" & cellString & """"
            EndIf
            If Dimension > 1 Then
               cellString = "," & cellString
            EndIf
            CharacterString = CharacterString & cellString
         Next Dimension

         Print #1, CharacterString

      Next i
   Close #1
   
'****************************************************************
'****************************************************************
   Print "Sort the totaled results by Region and ItemID."
'****************************************************************
   Print
   
   file_name = "OrderSorting.txt"
   file_num = FreeFile( )                 '' 有効なファイル番号を検索します

   '' ファイルを開きます。そして、ファイル番号をそれに結び付けます。エラーが有れば、抜けます。
   If( Open( file_name For Input As #file_num ) ) Then
      Print "ERROR: 開こうとしたファイル名 " ; file_name
      Sleep
      End -1
   End If
   
   Counter = 0

   Do Until Eof( file_num )               '' ファイルの端に達するまで、繰り返します。

      Line Input #file_num, CharacterString           '' テキストの行を読みます。
      Counter = Counter + 1
      MasterItemID.insert( ItemID, Counter )          '★5★★★★★★★★★★★
      SortArray(Counter) = CharacterString
   Loop
   
   Close #file_num                        '' ファイル番号を通したファイルを閉じます。
   
   StringSort(SortArray(),0,Counter)
   

   Open "OrderSorting.csv"  For Output As #1

   For i = 1 To Counter
      CharacterString = SortArray(i)
      Print #1, CharacterString
   Next i
   
   Close #1
'****************************************************************
'****************************************************************
   Print "Output of the sorted order aggregate has been completed as ""OrderSorting.csv""."
   Print "*******************************************************"
   Print "Please enter any key to exit."
Sleep
Makoto WATANABE
Posts: 231
Joined: Apr 10, 2010 11:41
Location: Japan
Contact:

Re: The Ultimate FB HashMap

Post by Makoto WATANABE »

Hi !

I learned a multi-dimensional array sorting program from badidea san and fxm san.
viewtopic.php?f=3&t=27993&start=30#p266914

I modified my code by incorporating a multi-dimensional array sort program.

P.S. Although I increased the number of orders list to 50,000, matching with the item master was still completed within 1 second.

region.txt

Code: Select all

Alaska
Alabama
Arkansas
Arizona
California
Colorado
Connecticut
District of Columbia
Delaware
Florida
Georgia
Hawaii
Iowa
Idaho
Illinois
Indiana
Kansas
Kentucky
Louisiana
Massachusetts
Maryland
Maine
Michigan
Minnesota
Missouri
Mississippi
Montana
North Carolina
North Dakota
Nebraska
New Hampshire
New Jersey
New Mexico
Nevada
New York
Ohio
Oklahoma
Oregon
Pennsylvania
Rhode Island
South Carolina
South Dakota
Tennessee
Texas
Utah
Virginia
Vermont
Washington
Wisconsin
West Virginia
Wyoming
PrepareTestData2.bas

Code: Select all

'Prepare test Data
# include "vbcompat.bi"

Dim Shared Counter As Integer
Dim Shared ItemID As String
Dim STARTT As Long
Dim ENDTIME As Long
Dim Minut As Integer


Sub FabricateData()
   
   Dim RegionArray() As String
   Dim file_name As String 
   Dim file_num As Integer
   Dim FirstDigit As Integer
   Dim SecondDigit As Integer
   Dim ThirdDigit As Integer
   Dim CharacterString As String
   Dim Regions As Integer

   'Fabricate ItemMaster
   
   file_name = "region.txt"
   file_num = FreeFile( )                 '' 有効なファイル番号を検索します
   Regions = 0

   '' ファイルを開きます。そして、ファイル番号をそれに結び付けます。エラーが有れば、抜けます。
   If( Open( file_name For Input As #file_num ) ) Then
      Print "ERROR: 開こうとしたファイル名 " ; file_name
      Sleep
      End -1
   End If

   Do Until EOF( file_num )               '' ファイルの端に達するまで、繰り返します。
      Regions = Regions + 1
      ReDim Preserve RegionArray(Regions)
      Line Input #file_num, RegionArray(Regions)           '' テキストの行を読みます。
      Print RegionArray(Regions)                           '' 画面にそれを出力します。
   Loop
               
   Close #file_num                        '' ファイル番号を通したファイルを閉じます。
   
   Open "ItemMaster.csv"  For Output As #1
   
   Counter = 0

   For FirstDigit = 1 To 26
      For SecondDigit = 1 To 26
         For ThirdDigit = 1 To 26
            CharacterString = ""
            Counter = Counter + 1
            If Counter > 15000 Then Exit For
            ItemID = Chr(64 + FirstDigit) & Chr(64 + SecondDigit) & Chr(64 + ThirdDigit) 'ItemID
            CharacterString = ItemID
            CharacterString = CharacterString & ",""" & RegionArray(Int(Rnd() * Regions) + 1)
            CharacterString = CharacterString & """," &  (Int(Rnd() * 1000) + 1) * 10    'price
            CharacterString = CharacterString & "," &  (Int(Rnd() * 100) + 1) * 10       'weight
            'Print CharacterString
            'Sleep
            Print #1, CharacterString
         Next ThirdDigit
      Next SecondDigit
   Next FirstDigit
   
   Close #1
   
   'Fabricate OrderList
   
   Open "OrderList.csv"  For Output As #1

   For Counter = 1 To 50000
      ItemID = Chr(64 + Int(Rnd() * 26) + 1) & Chr(64 + Int(Rnd() * 26) + 1) & Chr(64 + Int(Rnd() * 26) + 1)
      CharacterString = ItemID & "," & Int(Rnd() * 100) + 1               'quantity
      Print #1, CharacterString
   Next Counter
   
   Close #1
      
End Sub


STARTT=Val(Left(Time,2))*3600+Val(Mid(Time,4,2))*60+Val(Right(Time,2))

Randomize

FabricateData()

ENDTIME = Val(Left(Time,2))*3600+Val(Mid(Time,4,2))*60+Val(Right(Time,2))
Minut=(ENDTIME-STARTT)\60
Print
Print Using "Processing time was ## minutes ## seconds."; Minut; (ENDTIME-STARTT)-Minut*60
Print "*******************************************************"
Print "Please enter any key to exit."
Sleep
OrderSorting2.bas

Code: Select all

'Order sorting

'The Ultimate FB HashMap
'by Zamaster ≫ Feb 23, 2016 8:11 
'https://www.freebasic.net/forum/viewtopic.php?p=216439

'table.insert(hash_code(i), i + 1) '索引を生成
'table.clear()                     '索引を消去
'table.retrieve(hash_code(i))      '設定済の索引を取得。未登録のキーを検索すると異常終了する
'table.remove(hash_code(i)) 
'table.exists(hash_code(i))        '索引の有無チェック
'function exists(_key as _KEYTYPE_) as bool         '索引の有無チェック


'https://www.freebasic.net/forum/viewtopic.php?f=7&t=11538&p=273317#p273317
'by srvaldez ≫ Jun 19, 2020 12:04
'
'there's another hasmap implementation by Zamaster viewtopic.php?p=216439 but it works in 32-bit only, 
'he uses inline-asm and the 64-bit parts are wrong
'however, I run the previous test using his hashmap and there were no problems


#Include "hashmap.bi"                                   '★1★★★★★★★★★★★

dsm_HashMap_define(ZString, Integer)                    '★2★★★★★★★★★★★

Using dsm                                               '★3★★★★★★★★★★★

Dim As HashMap(ZString, Integer) MasterItemID           '★4★★★★★★★★★★★
Dim As HashMap(ZString, Integer) OrderItemID            '★4★★★★★★★★★★★

Dim STARTT As Long
Dim ENDTIME As Long
Dim Minut As Integer
Dim Shared ItemID As String     
Dim Shared Region As String
Dim Shared Price As String
Dim Shared Weight As String
Dim Shared Counter As Integer
Dim i As Integer
Dim BoolVar As Boolean
Dim QuantityString As String
Dim Amount As String
Dim Dimension As Integer
Dim cellString As String
   
Dim file_name As String 
Dim file_num As Integer
Dim CharacterString As String
Dim Regions As Integer
Dim ItemMasterNo As Integer
Dim IntegratedNo As Integer
Dim Orders As Integer

Dim Shared ItemMasterArray(20000,3) As String 'Region, price, weight
Dim Shared IntegratedOrderArray(20000,5) As String 'Region, ItemID, Quantity, Amount, Weight
Dim Shared SortArray(20000) As String

'****************************************************************
'****************************************************************

'★★★文字列ソート★★★
'How to sort datasheet (x-dim array) by a few columns (at once in 1 loop) ?
'by badidea ≫ Dec 15, 2019 21:54 
'https://www.freebasic.net/forum/viewtopic.php?f=3&t=27993&start=30#p266914

'by fxm ≫ Jun 25, 2020 6:49 
'https://www.freebasic.net/forum/viewtopic.php?f=3&p=273475&sid=547977eb5d3cc9cf3e6ee7323616cff3#p273475

#Include "crt/stdlib.bi"
#Include "string.bi"

'------------------------------- class: row_type -------------------------------

Type row_type
   Dim As String col(Any)
   Declare Operator Cast () As String
End Type

Operator row_type.cast () As String
   Dim As String tempStr
   For i As Integer = 1 To UBound(col)
      If i = 1 Then tempStr &= col(i) Else tempStr &= !"\t" & col(i) 
   Next
   Return tempStr
End Operator

'------------------------------ class: sort_type -------------------------------

Type sort_type
   Dim As Short column
   Dim As Short direction
   Declare Constructor()
   Declare Constructor(column As Short, direction As Short)
End Type

'a stupid constructor
Constructor sort_type()
   this.column = 0
   this.direction = 0
End Constructor

'another stupid constructor
Constructor sort_type(column As Short, direction As Short)
   this.column = column
   this.direction = direction
End Constructor

'------------------------------ class: data_type -------------------------------

Type data_type
   Static As sort_type sortOrder(1 To 3)
   Dim As Integer numRows, numCols
   Dim As row_type row(Any)
   Declare Constructor(numRows As Integer, numPivotMS As Integer)
   Declare Destructor()
   Declare Sub copyTo(dst As data_type)
   Declare Sub sort(sort1st As sort_type, sort2nd As sort_type, sort3rd As sort_type)
   Declare Static Function qSortCallback Cdecl(pRow1 As row_type Ptr, pRow2 As row_type Ptr) As Long
End Type

Dim As sort_type data_type.sortOrder(1 To 3)

Constructor data_type(numRows As Integer, numCols As Integer)
   ReDim row(numRows)
   this.numCols = numCols
   For iRow As Integer = 1 To numRows
      ReDim (row(iRow).col)(numCols) 'weird syntax, compiler wants the extra ( )
   Next
End Constructor

Destructor data_type()
   For iRow As Integer = 1 To numRows
      Erase row(iRow).col
   Next
   Erase row
End Destructor

Sub data_type.sort(sort1st As sort_type, sort2nd As sort_type, sort3rd As sort_type)
   'disable invalid sort filters
   sortOrder(1) = IIf(sort1st.column < 1 Or sort1st.column >= numCols, sort_type(0,0), sort1st)
   sortOrder(2) = IIf(sort2nd.column < 1 Or sort2nd.column >= numCols, sort_type(0,0), sort2nd)
   sortOrder(3) = IIf(sort3rd.column < 1 Or sort3rd.column >= numCols, sort_type(0,0), sort3rd)
   qsort(@row(1), UBound(row), SizeOf(row_type), CPtr(Any Ptr, @qSortCallback))
End Sub

Function data_type.qSortCallback Cdecl(pRow1 As row_type Ptr, pRow2 As row_type Ptr) As Long
   For i As Integer = 1 To 3
      With sortOrder(i)
         Select Case .direction
         Case +1
            If pRow1->col(.column) < pRow2->col(.column) Then Return -1
            If pRow1->col(.column) > pRow2->col(.column) Then Return +1
         Case -1
            If pRow1->col(.column) > pRow2->col(.column) Then Return -1
            If pRow1->col(.column) < pRow2->col(.column) Then Return +1
         Case Else
            'skip, including direction = 0
         End Select
      End With
   Next
   Return 0
End Function


'****************************************************************
'****************************************************************
   Print "Read ""ItemMaster.csv"" and store each line in ItemMasterArray, indexing ItemID."
'****************************************************************
   Print
   
   file_name = "ItemMaster.csv"
   file_num = FreeFile( )                 '' 有効なファイル番号を検索します

   '' ファイルを開きます。そして、ファイル番号をそれに結び付けます。エラーが有れば、抜けます。
   If( Open( file_name For Input As #file_num ) ) Then
      Print "ERROR: 開こうとしたファイル名 " ; file_name
      Sleep
      End -1
   End If
   
   Counter = 0

   Do Until Eof( file_num )               '' ファイルの端に達するまで、繰り返します。
      ItemID ="": Region ="": Price ="" : Weight =""

      Line Input #file_num, CharacterString           '' テキストの行を読みます。

      ItemID = Left(CharacterString,3)
      Region = Mid(CharacterString,6,InStrRev(CharacterString,"""")-6)
      Price  = Mid(CharacterString,InStrRev(CharacterString,"""")+2,InStrRev(CharacterString,",")-InStrRev(CharacterString,"""")-2)
      Weight = Right(CharacterString,Len(CharacterString)-InStrRev(CharacterString,","))

      BoolVar = MasterItemID.exists(ItemID)              '★6★★★★★★★★★★★

      If BoolVar = FALSE Then
         Counter = Counter + 1
         MasterItemID.insert( ItemID, Counter )          '★5★★★★★★★★★★★
         
         ItemMasterArray(Counter,1) = Region
         ItemMasterArray(Counter,2) = Price
         ItemMasterArray(Counter,3) = Weight   
      Else
         Print BoolVar 
         Print CharacterString                           '' キー重複を画面に出力します。
         Print ItemID , Region , Price ,  Weight
         Print Counter
         Sleep
      End If
      
   Loop
   Print "Numbers of Item Master = ";Counter
   Print 
   Print "Contents of the last Item Master : ";CharacterString  ' 画面に最終行を出力します。
   Print ItemID , Region , Price ,  Weight 
   Print
   
   Close #file_num                        '' ファイル番号を通したファイルを閉じます。

'****************************************************************
'****************************************************************
   Print "Read ""OrderList.csv"" and compare it with ItemMasterArray to aggregate price and weight by item and region."
'****************************************************************
   Print
   
   Open "OrderItemError.csv"  For Output As #1

   file_name = "OrderList.csv"
   file_num = FreeFile( )                 '' 有効なファイル番号を検索します

   '' ファイルを開きます。そして、ファイル番号をそれに結び付けます。エラーが有れば、抜けます。
   If( Open( file_name For Input As #file_num ) ) Then
      Print "ERROR: 開こうとしたファイル名 " ; file_name
      Sleep
      End -1
   End If
   
   Counter = 0
   Orders  = 0
   
   STARTT=Val(Left(Time,2))*3600+Val(Mid(Time,4,2))*60+Val(Right(Time,2))

   Do Until Eof( file_num )               '' ファイルの端に達するまで、繰り返します。
      CharacterString = "" : ItemID = "" : QuantityString = ""

      Line Input #file_num, CharacterString           '' テキストの行を読みます。
      Orders  = Orders + 1
      ItemID = Left(CharacterString,3)
      QuantityString = Right(CharacterString,Len(CharacterString)-InStrRev(CharacterString,","))
      
      BoolVar = MasterItemID.exists(ItemID)

      If BoolVar = FALSE Then      '★6★★★★★★★★★★★
         'エラー出力
         Print #1, CharacterString         
      Else
         ItemMasterNo = MasterItemID.retrieve(ItemID)
         
         'ItemMasterArray(20000,3)      'Region, price, weight
         'IntegratedOrderArray(20000,5) 'Region, ItemID, Quantity, Amount, Weight
         
         Region = ItemMasterArray(ItemMasterNo,1)
         Amount = Str(Val(QuantityString)*Val(ItemMasterArray(ItemMasterNo,2)))
         Weight = Str(Val(QuantityString)*Val(ItemMasterArray(ItemMasterNo,3)))

         BoolVar = OrderItemID.exists(ItemID)
   
         If BoolVar = FALSE Then                           '★6★★★★★★★★★★★
            Counter = Counter + 1
            OrderItemID.insert( ItemID, Counter )          '★5★★★★★★★★★★★
            
            IntegratedOrderArray(Counter,1) = Region
            IntegratedOrderArray(Counter,2) = ItemID
            IntegratedOrderArray(Counter,3) = QuantityString
            IntegratedOrderArray(Counter,4) = Amount
            IntegratedOrderArray(Counter,5) = Weight
            'Print Counter,Region,ItemID,QuantityString,Amount,Weight
            'sleep
   
         Else
            IntegratedNo = OrderItemID.retrieve(ItemID) 
            
            IntegratedOrderArray(IntegratedNo,3) = Str(Val(IntegratedOrderArray(IntegratedNo,3))+Val(QuantityString))
            IntegratedOrderArray(IntegratedNo,4) = Str(Val(IntegratedOrderArray(IntegratedNo,4))+Val(Amount))
            IntegratedOrderArray(IntegratedNo,5) = Str(Val(IntegratedOrderArray(IntegratedNo,5))+Val(Weight))
            'Print IntegratedNo,ItemID,QuantityString,Amount,Weight
            'sleep
   
         End If
      End If
      
   Loop
   Print "Number of orders in the list : ";Orders
   Print
   ENDTIME = Val(Left(Time,2))*3600+Val(Mid(Time,4,2))*60+Val(Right(Time,2))
   Minut=(ENDTIME-STARTT)\60
   Print Using "processing time: ## minutes ## seconds"; Minut; (ENDTIME-STARTT)-Minut*60
   Print
   
   Close #1
   Close #file_num                        '' ファイル番号を通したファイルを閉じます。



   
'****************************************************************
'****************************************************************
   Print "Sort the totaled results by Region and ItemID."
'****************************************************************
   Print
   

'-------------------------------- main program ---------------------------------

Dim As Integer numRows = Counter, numCols = 5

Var myData = data_type(numRows, numCols)

For iRow As Integer = 1 To Counter
   With myData.row(iRow)
      For iCol As Integer = 1 To UBound(.col)
         .col(iCol) = IntegratedOrderArray(iRow,iCol)
      Next
   End With
Next iRow

myData.sort(sort_type(1, +1), sort_type(2, +1), sort_type(0, 0))

For iRow As Integer = 1 To Counter
   With myData.row(iRow)
      For iCol As Integer = 1 To UBound(.col)
         IntegratedOrderArray(iRow,iCol) = .col(iCol)
      Next
   End With
Next iRow


'****************************************************************
'****************************************************************
   Print "Out put the sorted results to ""OrderSorting.csv"""
'****************************************************************
   Print

   'IntegratedOrderArray(20000,5) 'Region, ItemID, Quantity, Amount, Weight
   Open "OrderSorting.csv"  For Output As #1
      For i = 1 To Counter
         CharacterString = ""
         For Dimension =1 To 5
            cellString = IntegratedOrderArray(i,Dimension)
            If Dimension = 1 Then
               cellString = """" & cellString & """"
            EndIf
            If Dimension > 1 Then
               cellString = "," & cellString
            EndIf
            CharacterString = CharacterString & cellString
         Next Dimension

         Print #1, CharacterString

      Next i
   Close #1
   
'****************************************************************
'****************************************************************
   Print "Output of the sorted order aggregate has been completed."
   Print "*******************************************************"
   Print "Please enter any key to exit."
Sleep
paul doe
Moderator
Posts: 1730
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: The Ultimate FB HashMap

Post by paul doe »

Zamaster wrote:...fear not its x64 friendly.
No, it is not. It doesn't compile (see corrections by SARG above)

Also, there's this:

Code: Select all

/' ... '/
namespace dsm

    enum bool
        false = 0
        true = 1
    end enum
/' ... '/    
Which clashes with other header definitions (the Windows headers, no less). See this post by Makoto Watanabe, were he's having trouble trying to compile this with the Windows9 library:

Simultaneous use of "The Ultimate FB HashMap" and "FB GUI library for Windows"
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: The Ultimate FB HashMap

Post by MrSwiss »

paul doe wrote:Which clashes with other header definitions (the Windows headers, no less).
Nope, nothing to do with headers (the TRUE/FALSE issue) it is FBC related.

It's a change in FBC itself: OK, up to and including version: 1.03.0
From version: 1.04.0 (introduction of Boolean type into FBC) it must be removed.

The "built-in" definition in versions >= 1.04.0 is:

Code: Select all

Const false = 0
Const true = Not false
Since this implementation, if seen from the FB side, is different:
false (self explaining)
true = -1 (behind the scenes however, for C compatibility, the stored value is 1 (positive))

This is just one of the reasons, that old code might conflict, with later FBC versions.
Post Reply