Storing data efficiently

General FreeBASIC programming questions.
neil
Posts: 555
Joined: Mar 17, 2022 23:26

Storing data efficiently

Post by neil »

I am new here I wasn't sure where to post this. I am taking a programming class our teacher is teaching us how things work at a at a binary level. Also learning to work with hexadecimal numbers. The program is nothing special he just wanted us to learn how to store data more efficiently. I am mainly posting because about 4 days ago I saw something about a lookup table for this type of data. Could anyone help me find this lookup table method ? We are still on this topic at school. Here's the type of stuff I am learning.

Code: Select all

'merging 2 Binary strings to convert 2 Bytes into 1 Byte 
'no magic here the 4 bit numbers have to be in a range of 0 - 15

Dim As String binary1,binary2,binary3
Dim As UByte decimalnumber

'binary1 = 7  binary2 = 8   merged  this should be 78 in Hexadecimal
binary1 = "00000111":binary2 = "00001000"

'remove 4 zero's from each string and merge them together
binary3 = right(binary1,4) + right(binary2,4)

print "strings merged ";binary3

'convert binary string to a decimal number
decimalnumber = val(str("&B" + binary3))

print "decimal value = ";decimalnumber
print "Hex Value = ";Hex(decimalnumber)
 
'now the decimal number can be written to a file using Put #f,,decimalnumber
'this will save storage space
badidea
Posts: 2586
Joined: May 24, 2007 22:10
Location: The Netherlands

Re: Storing data efficiently

Post by badidea »

neil wrote: Mar 18, 2022 1:04I am mainly posting because about 4 days ago I saw something about a lookup table for this type of data. Could anyone help me find this lookup table method ?
Hi, I think it was >this< topic. Unfortunately, the one who started the topic deleted all his posts.
paul doe
Moderator
Posts: 1730
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Storing data efficiently

Post by paul doe »

Hi neil. Welcome to the forum.

Indeed, he deleted all his posts. However, the scheme wasn't all that mysterious:

Code: Select all

sub permute( a as string, l as long, r as long )
  if( l = r ) then
    ? a
  else
    for i as integer = l to r
      swap a[ l ], a[ i ]
      permute( a, l + 1, r )
      swap a[ l ], a[ i ]
    next
  end if
end sub

dim as string s = "12345678"

permute( s, 0, len( s ) - 1 )

sleep()
That function will show you all permutations for a given string, without duplicates. You can modify the code to instead write them to a file or array, and then use an index to refer to the permutation instead of the complete string. That's why the OP of that thread thought he was 'compressing' stuff; it doesn't. It's just a look-up table to a set of permutations.
neil
Posts: 555
Joined: Mar 17, 2022 23:26

Re: Storing data efficiently

Post by neil »

Thanks for the permutation code. I sent this to my teacher. He will use this formula in his python script and create learning exercise's with it. I was thinking it was a lookup table in the code. This is nice. My teacher asked me to write a program to prove there are no duplicate permutations. I am still thinking about that one ? Thank you again
BasicCoder2
Posts: 3906
Joined: Jan 01, 2009 7:03
Location: Australia

Re: Storing data efficiently

Post by BasicCoder2 »

To work out a proof maybe start with a small examples and look at the patterns.
s="1"
s="12"
s="123"
neil
Posts: 555
Joined: Mar 17, 2022 23:26

Re: Storing data efficiently

Post by neil »

Thanks for your tips. I was trying to find a snippet of code of how a text editor does string compares.
neil
Posts: 555
Joined: Mar 17, 2022 23:26

Re: Storing data efficiently

Post by neil »

My teacher did proofs on the lookup table. No duplicates the tables fine. He said in theory you could store 8 bytes with 2 bytes using line numbers. This would reduce a file with limited range numbers even better then zip could. He said sometimes theory's don't work out. Maybe storing the line numbers could get out of order. Something was overlooked. He asked me not to pursue this. He said there has to be a flaw and he was going to find it. Thank you for the lookup table algorithm.
badidea
Posts: 2586
Joined: May 24, 2007 22:10
Location: The Netherlands

Re: Storing data efficiently

Post by badidea »

You cannot store all possible 8 bytes combinations (64-bit) with 2 byte line numbers.
neil
Posts: 555
Joined: Mar 17, 2022 23:26

Re: Storing data efficiently

Post by neil »

My teacher's theory is based on claims made by the OP guy. Keeping the numbers in a limited range and used against the permutation table with all combination's in the table. If your limited range number is 56814327 is found you would only have to store the line number 2. In theory possible my teacher's problem is jumping all over to find a match against your numbers and keeping track of line numbers and storing them correctly. Has anybody seen this proven by the OP guy ? This only a theory my teacher does not think it will work and want's to prove it's not possible.

permutation table example
1 12345678
2 56814327
paul doe
Moderator
Posts: 1730
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Storing data efficiently

Post by paul doe »

neil wrote: Mar 19, 2022 22:36 ...
Has anybody seen this proven by the OP guy ? This only a theory my teacher does not think it will work and want's to prove it's not possible.
...
It's not a 'theory'. It's simple maths: as long as the number of permutations is below 65535 you can store them in a file and use a short integer (2 bytes) to index them. Anything above that and you'll need a 4 byte integer (up to 4294967295 permutations). It's not that hard to 'prove'. You still need the permutation table available (either in file or memory) to relate the index to the permutation. So, there's no 'compression', just a look-up table.
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Storing data efficiently

Post by dodicat »

For permutations of 12345678 saved to a file and retrieved from the file:

Code: Select all

#include "file.bi"

Sub Permutate(s As String,perm() As String,OptionalStop As String="")
      Dim As long p,i,j,result
      Dim As String s2=s
      var lens2=len(s2)
      Dim As Double factorial
      Dim temp As Double=1
      If Len(s2) >1 Then
            For n As Integer =1 To Len(s2)
                  temp =temp * n
            Next
            factorial =temp
      Else
            factorial =1
      End If
      Redim perm(1 To factorial)
      For p1 As long =0 To Len(s2)-2
            For p2 As long =p1 + 1 To Len(s2)-1
                  If s2[p1]>s2[p2] Then Swap s2[p1],s2[p2]
            Next p2
      Next p1
      Do
            p=p+1
            perm(p)=s2
            If s2=OptionalStop Then exit do
            Do
                  For i=Lens2-2 To 0 Step -1
                        If s2[i] <s2[i+1] Then Exit For
                  Next
                  If i <0 Then Result=0:Exit Do
                  j =Lens2-1
                  While s2[j] <= s2[i]: j -=1 : Wend
                  Swap s2[i], s2[j]
                  i +=1
                  j =Lens2-1
                  While i <j
                        Swap s2[i], s2[j]
                        i +=1
                        j -=1
                  Wend
                  result=-1:Exit Do
            Loop
      Loop Until result=0
      Redim Preserve perm(1 To p)
End Sub


Function savefile(filename As String,p() As Ulong) As String
      Dim As Long n=Freefile
      If Open (filename For Binary Access Write As #n)=0 Then
            Put #n,,p()
            Close #n
      Else
            Print "Unable to save " + filename:Sleep:End
      End If
      Return filename
End Function

Sub loadfile(file As String,u() As Ulong)
      Var  f=Freefile
      Var L=Filelen(file)/Sizeof(Typeof(u))
      Redim u(1 To L)
      If Open (file For Binary Access Read As #f)=0  Then
            If Lof(f) > 0 Then  Get #f, ,u()
            Close #f
      Else
            Print "Unable to load ";file
      End If
End Sub

dim as double t=timer
Redim As String  p()
permutate("12345678",p())

Dim As Ulong q(Lbound(p) To Ubound(p))

For n As Long=Lbound(p) To Ubound(p)
      q(n)=Valulng(p(n)) 'tranfer string to ulong to save in a file
Next
Erase(p)'finished with p

savefile("permutations.txt",q())
Erase(q)'finished with q

Print "file length "; Filelen("permutations.txt")

'PART 2 using the file 
'now load from file
Redim As Ulong NewArray()
loadfile("permutations.txt",NewArray())
Print Lbound(NewArray);"  to  ";Ubound(NewArray)
Print
Print "From file"
For n As Long=1 To 20
      Print n,NewArray(n)
Next
Print "..."
Print "..."
For n As Long=Ubound(NewArray)-20 To Ubound(NewArray)
      Print n,NewArray(n)
Next
Print

Print "done, time taken  ";timer-t
Sleep
Kill "permutations.txt"


 
neil
Posts: 555
Joined: Mar 17, 2022 23:26

Re: Storing data efficiently

Post by neil »

Thanks for the explanation about math permutations. My teacher did not like the OP's guys claim about storing 8 bytes with 2 bytes without proving it with code. My teacher is testing the encoder with random data his line numbers were out of order because he forgot to reset the file pointer to the lookup table. He said it looks like it going to work. Thanks
neil
Posts: 555
Joined: Mar 17, 2022 23:26

Re: Storing data efficiently

Post by neil »

dodicat I just saw this post. Is this the same permutation table that paul doe posted ? It now can be written to a file and put into an array. I asked about being the same permutation table because my teacher will ask me to prove that it has no duplicates. thanks
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Storing data efficiently

Post by dodicat »

If you sort paul doe's table then it is the same table.
In order to check for duplicates then you must sort the table and check every number does not equal it's neighbour in the table.
Here is a quicksort for numerical numbers, determined by datatype you want (first line)

Code: Select all

#define datatype ulong
Enum direction
    up
    down
End Enum

 Sub QuickSort(array() As datatype,begin As Long,Finish As Long,dirn As direction)
 Dim As long i=begin,j=finish
 Dim As datatype x =array(((I+J)\2))
   While I <= J
    If dirn=up Then
     While array(I) < X :I+=1:Wend
     While array(J) > X :J-=1:Wend
 Else
    While array(I) > X :I+=1:Wend
    While array(J) < X :J-=1:Wend 
End If
 If I<=J Then Swap array(I),array(J): I+=1:J-=1
 Wend
 If J >begin Then QuickSort(array(),begin,J,dirn)
 If I <Finish Then QuickSort(array(),I,Finish,dirn)
End Sub

'sample
dim as ulong L(1 to 6)={45,2,9,0,12,2}
Quicksort(L(),lbound(L),ubound(L),up)
for n as long=lbound(L) to ubound(L)
      print L(n)
next n
print
'check for duplicates
for n as long=lbound(L) to ubound(L)-1
      if L(n)=L(n+1) then print "Duplicate ";L(n)
next n


      sleep

 
Please note that I saved the array in binary in my last post, giving a file size of 161280 bytes, compared to 362880 bytes which would be required to save it as readable text.(8 characters plus a line end for each permutation)
neil
Posts: 555
Joined: Mar 17, 2022 23:26

Re: Storing data efficiently

Post by neil »

dodicat I thought your table was stored as as text file. When I saw 161280 bytes, compared to paul doe's 362880 byte table stored as text. That's why I thought they were different. Storing as a binary file is more efficient. Thanks for explaining that.
The tables should be the same. Thanks for the quick sort.
Post Reply