CSV to fixlength data files

General FreeBASIC programming questions.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: CSV to fixlength data files

Post by jj2007 »

I gave it a try, see Converting a *.csv file to a fixed size file. Conversion of a 19MB file with 44,000 lines takes 2 seconds. The length of the records is calculated automatically, and the individual lengths are shown in the console window. Let me know if this solves your problem.
Last edited by jj2007 on Aug 12, 2018 0:32, edited 1 time in total.
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: CSV to fixlength data files

Post by MrSwiss »

Just to restate an old fact: Windows 2000 <--- some worlds appart ---> DOS

I think it's about time, to face facts, then draw the appropriate conclsions and,
for a change, stick to them! (meaning: burry DOS, 6 feet deep, at least).
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: CSV to fixlength data files

Post by Gablea »

jj2007 wrote:I gave it a try, see Converting a *.csv file to a fixed size file. Conversion of a 19MB file with 44,000 lines takes 2 seconds. The length of the records is calculated automatically, and the individual lengths are shown in the console window. Let me know if this solves your problem.
Thanks I shall try that when I lm back at the office tomorrow
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: CSV to fixlength data files

Post by Gablea »

MrSwiss wrote:Just to restate an old fact: Windows 2000 <--- some worlds appart ---> DOS

I think it's about time, to face facts, then draw the appropriate conclsions and,
for a change, stick to them! (meaning: burry DOS, 6 feet deep, at least).
The code is just being developed on windows 2000 as it’s the only 32 bit machine I have that can run the dos complier. My main PC is a 64bit machine and it refuses to run the dos complier.
caseih
Posts: 2157
Joined: Feb 26, 2007 5:32

Re: CSV to fixlength data files

Post by caseih »

dosbox will run the FB/DOS compiler on any machine, on any architecture, any OS. It's not the fastest little emulator in the world, but it's at least as fast as a 386 computer. It's not ideal for your POS program, though, as it doesn't have any direct way of communicating with actual hardware (serial ports, parallel ports, etc). But it would allow you to develop on a more modern PC.
caseih
Posts: 2157
Joined: Feb 26, 2007 5:32

Re: CSV to fixlength data files

Post by caseih »

jj2007 wrote:see http://masm32.com/board/index.php?topic ... 9#msg80189, Converting a *.csv file to a fixed size file
I think it would be far more helpful if you could post a FB program here that did that, rather than a win32 assembler program that won't help Gablea as it won't run on DOS or Linux.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: CSV to fixlength data files

Post by jj2007 »

caseih wrote:
jj2007 wrote:see http://masm32.com/board/index.php?topic ... 9#msg80189, Converting a *.csv file to a fixed size file
I think it would be far more helpful if you could post a FB program here that did that, rather than a win32 assembler program that won't help Gablea as it won't run on DOS or Linux.
OP says he gets output from a VB.net application. Does that run on Linux?
Gablea wrote:I need to get the data from a VB.net application that I struggling to output fix length data files.
I am just trying to be helpful here. The csv format is one of the lousiest formats around, and I happen to have a fast routine that can handle csv. So I wrote a little proggie that let's you drag a *.csv over the exe and a millisecond later there is its *.fix equivalent. If you have a better or "far more helpful" alternative, please go ahead.
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: CSV to fixlength data files

Post by Gablea »

@jj2007
The VB.net app runs on Windows 7 64bit.

And that app is a step forward to allow me to see what is the best way forward.

I know you all think I’m stupid for using dos but I like it and it’s what the market is requiring a simple dos based system that can be update to Windows / Linux at a latter date with out having to change the main database system
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: CSV to fixlength data files

Post by jj2007 »

Gablea wrote:The VB.net app runs on Windows 7 64bit.
Same OS here. I have no experience with VB.net, but probably you can just insert, after having created the csv database in VB, something like System("CsvToFixedLength.exe"). The app can read the commandline, too, but if there are no arguments, it will use the most recent *.csv file in the current folder.

Just found a ref at MSDN Social googling for vb.net launch exe:

Code: Select all

Dim proc As New System.Diagnostics.Process()
proc = Process.Start("d:\windows\notepad.exe", "")
Doesn't Micros**t feel ashamed to impose such an ugly syntax on poor programmers??

P.S.: CsvToFixedLength is optimised to produce the smallest possible output (i.e. it checks the length of all records and uses the smallest size that fits them all). If you prefer hard-coded record sizes, e.g. stored in an ini file, let me know.
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: CSV to fixlength data files

Post by Gablea »

Small works for me but how would the FB app read it in? Does it not need fixed sizes?


And yes the syntax in vb is not the best. Why you think I’m trying my best to understand FreeBASIC again it’s do I can ditch the vb version as well as Dutch windows (been looking at using Debian as the operating system once we move over to Linux)
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: CSV to fixlength data files

Post by jj2007 »

Gablea wrote:Small works for me but how would the FB app read it in? Does it not need fixed sizes?
Yes, that is a problem. You would open the file in random access mode, so you need to know the size of a record. One could write a header string containing info on record size and start points of individual fields; possible but a bit of an overkill for such small databases.

In your case, however, it might be simpler to use fixed sizes. That bloats the output a little bit but handling would be much easier. One could define record and field sizes through an ini file.
badidea
Posts: 2586
Joined: May 24, 2007 22:10
Location: The Netherlands

Re: CSV to fixlength data files

Post by badidea »

Gablea wrote:The files can be upto 30,000 products in them (I have customers running my windows version and they have 30,000 products and that is not struggling at all)
I created a 285 MB CSV-file containing 3,000,000 products (all "BAREFOOT CHARDONY", with random barcodes). Finding (with input #) one of the last items takes ~12 seconds here. Now I can try faster alternatives...

Code used:

Code: Select all

'018341751055,"BAREFOOT CHARDONY",0005,1,10,S,699,A,0,-,0,-,0,0,0,-,0,-,0,0,0,0

type product_type
	dim as string barcodenumber, posdescription, salelocation
	dim as integer agerestricted, agelimit
	dim as string pricetype
	dim as integer retailprice
	dim as string vatcode
	dim as integer print_guarantee_message
	dim as string print_guarantee_code
	dim as integer DisplayMessage
	dim as string DisplayMessage_code
	dim as integer sendtoppr, requestserial, ItemNotAllowed
	dim as string ItemNotAllowedReason
	dim as integer RestrictProductQty
	dim as string RestrictProductAllowed
	dim as integer DiscountNotAllowed, RefundNotAllowed, AskQtyBeforeSelling, HelhtlyStartVoucherOK
end type

function randomBarCode() as string
	return str(int(rnd*1e12))
end function

'dim shared as string oneLineText
dim shared as product_type product

function findProduct(PathToProductDatabase as string, BarcodeNumber as string) as integer
	dim as integer i = 0, ProductFound = 0
	dim as integer ProductFileNumber = freefile()
	Dim r1 As Integer = Open(PathToProductDatabase For Input As #ProductFileNumber)
	If r1 <> 0 Then
		print "Open error"
		return -1
	Else
		Do until EOF(ProductFileNumber)
			with product
				Input #ProductFileNumber, .barcodenumber, .posdescription, .salelocation, .agerestricted, .agelimit, .pricetype, .retailprice, .vatcode, .print_guarantee_message, .print_guarantee_code, .DisplayMessage, .DisplayMessage_code, .sendtoppr, .requestserial, .ItemNotAllowed, .ItemNotAllowedReason, .RestrictProductQty, .RestrictProductAllowed, .DiscountNotAllowed, .RefundNotAllowed, .AskQtyBeforeSelling, .HelhtlyStartVoucherOK
				If Trim(BarcodeNumber) = Trim(.barcodenumber) Then
					print i, .barcodenumber, .posdescription, .salelocation, .agerestricted, .agelimit, .pricetype, .retailprice, .vatcode, .print_guarantee_message, .print_guarantee_code, .DisplayMessage, .DisplayMessage_code, .sendtoppr, .requestserial, .ItemNotAllowed, .ItemNotAllowedReason, .RestrictProductQty, .RestrictProductAllowed, .DiscountNotAllowed, .RefundNotAllowed, .AskQtyBeforeSelling, .HelhtlyStartVoucherOK
					ProductFound = 1
					Exit Do
				End If
			end with
			i += 1
		Loop
		Close #ProductFileNumber
		return ProductFound
	End if
end function

function createProductDB(outFileName as string, numProducts as integer) as integer
	dim as integer i, outFile = freefile()
	Dim r2 As Integer = Open(outFileName For Binary, Access Write, As #outFile)
	for i = 0 to numProducts-1
		with product
			write #outFile, randomBarCode(), .posdescription, .salelocation, .agerestricted, .agelimit, .pricetype, .retailprice, .vatcode, .print_guarantee_message, .print_guarantee_code, .DisplayMessage, .DisplayMessage_code, .sendtoppr, .requestserial, .ItemNotAllowed, .ItemNotAllowedReason, .RestrictProductQty, .RestrictProductAllowed, .DiscountNotAllowed, .RefundNotAllowed, .AskQtyBeforeSelling, .HelhtlyStartVoucherOK
		end with
	next
	Close #outFile
	return 0
end function

dim as double t

sleep 1,1
t = timer
'print findProduct("products.csv", "1234567890")
'print findProduct("products.csv", "018341751055")
	print findProduct("products2.csv", "896024039248")
print timer - t

't = timer
'createProductDB("products2.csv", 3000000)
'print timer - t
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: CSV to fixlength data files

Post by Gablea »

That’s look like something I was hoping for.

Once they are in the random data file I can update records a lot easier then with the csv file.

Maybe I could have the csv file to feed data to the till and then have the till concert it to random data format that it can use (have this done at startup) and then have a function under a menu key (say 99 function key) and it would the. Regenerate the data file (this could be done once new items was added to the database in the back office system) or if I can work it out have just a update csv file that just holds the items that have changed

Does that make sense or am I thinking along the wrong lines?
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: CSV to fixlength data files

Post by jj2007 »

badidea wrote:I created a 285 MB CSV-file containing 3,000,000 products (all "BAREFOOT CHARDONY", with random barcodes). Finding (with input #) one of the last items takes ~12 seconds here. Now I can try faster alternatives...
Nice test ;-)

Your code takes 36 seconds on my Core i5 - you must have a damn fast machine. But my own beta spreadsheet control loads the file in slightly under one second, and finds items towards the end in 1.5 seconds. There is room for improvement...

With a fixed size file, loading the file would be faster, and the speed of finding would be determined by Instr() for the whole content; afterwards, you can divide the position by the record size to get the record number. That is a lot faster, of course. Quick test:

Code: Select all

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
loading takes 166 ms
found in 242 ms
badidea
Posts: 2586
Joined: May 24, 2007 22:10
Location: The Netherlands

Re: CSV to fixlength data files

Post by badidea »

I think that the OS is fooling me (or a mistake). Loading the whole file in 1 binary get() takes 0.14 seconds. This is ~10 x faster then the SSD read speed (a Samsung thing with multilayer technology, forgot the exact type). Maybe I should do the tests with a external USB mechanical disk.
Post Reply