After seeing that the Burrows Wheeler Transform was not a full sorting algorithm and it worked great for text.
I wanted to try to come up with a minimal sort for binary data. The reason so I would not have to store a lot of data for the unsort.
Here's the sort I came up with: This is not a full sort; it only sorts the numbers by the most significant bit. to see what's going on. I printed it out in binary mode. It groups the numbers by whether the most significant bit has a one or zero in it. This won't work if all the numbers are above 127 or all are below 128. Using it on a random data file, there was enough change in the file to compress it.
paq8px -8 randomsorted.bin
Total input size : 415248
Total archive size : 380561
The file compressed to about 34k smaller.
Code: Select all
'minimal file sort for data compression experiment by neil
DIM AS UBYTE a,b,c,d,e,f,g,h
a = 128:b = 155:c = 63:d = 8:e = 165:f = 81:g = 85:h = 159
Print "Before sorting":
print a;" ";bin(a,8)
print b;" ";bin(b,8)
print c;" ";bin(c,8)
print d;" ";bin(d,8)
print e;" ";bin(e,8)
print f;" ";bin(f,8)
print g;" ";bin(g,8)
print h;" ";bin(h,8)
print
print
Print "After sorting":
if (a and 128) = 0 Then print a;" ";bin(a,8)
if (b and 128) = 0 Then print b;" ";bin(b,8)
if (c and 128) = 0 Then print c;" ";bin(c,8)
if (d and 128) = 0 Then print d;" ";bin(d,8)
if (e and 128) = 0 Then print e;" ";bin(e,8)
if (f and 128) = 0 Then print f;" ";bin(f,8)
if (g and 128) = 0 Then print g;" ";bin(g,8)
if (h and 128) = 0 Then print h;bin(g,8)
if (a and 128) Then print a;" ";bin(a,8)
if (b and 128) Then print b;" ";bin(b,8)
if (c and 128) Then print c;" ";bin(c,8)
if (d and 128) Then print d;" ";bin(d,8)
if (e and 128) Then print e;" ";bin(e,8)
if (f and 128) Then print f;" ";bin(f,8)
if (g and 128) Then print g;" ";bin(g,8)
if (h and 128) Then print h;" ";bin(h,8)
print
sleep