File I/O issues [solved, hardware problem]

Gonzo
Posts: 722
Joined: Dec 11, 2005 22:46

Post by Gonzo »

this problem has been solved
i had a problem with my hard drive; moving the project to another drive fixed everything :)
Last edited by Gonzo on May 23, 2011 19:57, edited 3 times in total.
vdecampo
Posts: 2992
Joined: Aug 07, 2007 23:20
Location: Maryland, USA

Post by vdecampo »

You need to post some code to get any real help. I use FB file I/O extensively and I have no problems.
Gonzo
Posts: 722
Joined: Dec 11, 2005 22:46

Post by Gonzo »

outdated code :)
Last edited by Gonzo on May 23, 2011 21:01, edited 1 time in total.
Gonzo
Posts: 722
Joined: Dec 11, 2005 22:46

Post by Gonzo »

this is starting to anger me greatly

here is an image that proves many things:
http://fbcraft.fwsnet.net/stresstest.png

it's an image of a world-eater job in progress
it's a single job that expands exponentially until the work queue is full
at around 3000 jobs the I/O stuff fails, pretty much every time

1. nothing gets corrupted, ever
it's not an engine problem, except for the I/O-related stuff, i guess
if it were something other than an I/O issue, bad things would happen very fast

2. what's saved to disk is saved to disk, but can sometimes be zeroed out:
2a. this test proves that if a test doesn't require reading from disk, nothing gets zeroed out.
the world eater test was designed to prove this

2b. the falling blocks test requires reading from disk, and most of the time it fails miserably because the data on disk is zeroed out

the most likely reason for this is that (if you look at the code in the other post) some operations fail, such as getting the location of previous data
the engine then thinks there is no data in that chunk file and writes as if it were the first entry in the file, meaning everything else is lost

i can prove this with screenshots, and i see the effects of this regularly
basically, falling blocks is the same as a cave-in of sand blocks
if i make a huge area cave in, most of that cave-in is reset completely when i restart the engine, except for just one or two sectors :)
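
The failure mode described above (a failed read being treated as an empty chunk file) can be guarded against by checking the error codes that FreeBASIC's Open and Get return when used in their function form. A minimal sketch, assuming a hypothetical ChunkHeader layout and ReadHeader helper; neither is taken from the engine:

```freebasic
' Hypothetical record layout, for illustration only (not the engine's format).
Type ChunkHeader
	entries As Long   ' number of sector entries stored in the file
End Type

' Hypothetical helper: returns 0 on success, otherwise the runtime
' error code reported by Open or Get.
Function ReadHeader(ByRef path As String, ByRef hdr As ChunkHeader) As Long
	Dim As Long f = FreeFile()
	Dim As Long e = Open(path For Binary Access Read As #f)
	' Could not open (this includes a missing file): report the error
	' instead of silently assuming the file is empty.
	If e <> 0 Then Return e
	If LOF(f) < CLngInt(SizeOf(ChunkHeader)) Then
		hdr.entries = 0          ' file opened fine and really is empty
	Else
		e = Get(#f, 1, hdr)      ' function form returns an error code
	End If
	Close #f
	Return e
End Function
```

With something like this, a caller would start a fresh file only when ReadHeader returns 0 and hdr.entries is 0. Any non-zero result means the old data may still be on disk, so the write should be postponed or logged rather than overwriting the file. A missing file also produces a non-zero code here, so a real engine would need to handle that case separately.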

3. excerpt from log telling me the first errors for the world-eater test:
05-21-2011 14:19:46:: Error writing sectoral data: 4647129
05-21-2011 14:19:46:: Error writing sectoral data: 5147085
05-21-2011 14:19:46:: Error writing sectoral data: 5163477

summary:
falling blocks reads and writes: it causes file resets (my problem) but proves a root problem
world eater only writes: the test grows to a point where it doesn't continue, and it never resets any files (proves the same root problem)

something has to be wrong with the fbc file I/O system

edit: added more logging to get some meaningful queue data
05-21-2011 14:58:12:: Error writing sectoral data: 1909665
05-21-2011 14:58:12:: Amount of chunks in queue: 1
05-21-2011 14:58:12:: Amount of sectors in cq(0): 21

that's 1 file, but approx. 21 Get # and 21 Put # calls per update (0.01 seconds)
Gonzo
Posts: 722
Joined: Dec 11, 2005 22:46

Post by Gonzo »

i figured out a way to make it a lot more tolerant
i had another file open (i have the logfile open constantly)

never opening the logfile added a lot more tolerance to the whole thing
now i can run a small block fall cascade, blow up a few TNTs and no issues
if i do heavier stuff, it will still bug out
MichaelW
Posts: 3500
Joined: May 16, 2006 22:34
Location: USA

Post by MichaelW »

Have you considered that this may be a file system issue?
What OS are you running?
What file system are you running?
What sort of disk hardware are you running and how is it configured?
How many physical disks and logical drives are involved?
Are your files on the same disk/drive as the OS or on a different disk/drive?
Is disk caching enabled?
What is the fragmentation status of the target drive?
How much free disk space do you have on the target drive?
How many files are you working with, what are the sizes, and what is the total size?
Are your access patterns sequential, random, or something else?
How many threads are actively accessing the files?
Etc?
elsairon
Posts: 207
Joined: Jul 02, 2005 14:51

Post by elsairon »

I had a similar problem with multiple file reads and writes (although from the looks of it my program runs through a lot less data than yours).

Instead of keeping my log file open, which seemed to cause problems when other files were accessed, I just open the log, write to it, and close it again. This substantially increases the number of times the file is opened and closed and creates a lot more reads/writes, but it stops the problem of losing all the data.
Gonzo
Posts: 722
Joined: Dec 11, 2005 22:46

Post by Gonzo »

yep, but it didn't solve my problem completely
any data loss at all is unacceptable
Gonzo
Posts: 722
Joined: Dec 11, 2005 22:46

Post by Gonzo »

i've operated under the assumption that the bug was mine, but
i tried disabling caching for the drive that the engine uses, and shockingly the problem still happens :)
must be something i'm doing then, right?

i still don't get why writing to files sometimes fails with completely valid offsets..

05-22-2011 21:37:19:: Error writing sectoral data: 16389
05-22-2011 21:37:19:: Amount of chunks in queue: 1
05-22-2011 21:37:19:: Amount of sectors in cq(0): 1

back to debugging...
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Post by TJF »

You could try writing the log to a pipe instead of a file (and maybe let the OS redirect the output to a file)?

Have you considered bad sectors on the HD?
Gonzo
Posts: 722
Joined: Dec 11, 2005 22:46

Post by Gonzo »

05-23-2011 01:04:00:: *** Starting up
05-23-2011 01:04:00:: Reading configuration file
05-23-2011 01:04:00:: Read 56 variables
05-23-2011 01:04:00:: Initializing GLFW...
05-23-2011 01:04:00:: Generating GL textures...
05-23-2011 01:04:01:: Precomputing vertex data
05-23-2011 01:04:01:: *** Loading world data for world: worlds\w3
05-23-2011 01:04:01:: Initializing world generator, using seed: 474
05-23-2011 01:04:01:: Initializing & generating sectors
05-23-2011 01:04:01:: Creating job threads
05-23-2011 01:04:01:: Initializing sound system
05-23-2011 01:09:38:: Game ending, user terminated
05-23-2011 01:04:01:: Game started

i'm sure you can immediately see what's strange here
i don't know much about file I/O, but the last 2 lines are 5 minutes apart and in the wrong order

Code:


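' Append one timestamped line to the log; the file is opened and closed on every call
Sub AddLog(ByRef l As String)
	ChDir ExePath
	Dim As UInteger logfile = FreeFile()   ' next unused file number
	Open ExePath + "\testclient.log" For Append As #logfile
	Print #logfile, Date() + " " + Time() + ":: " + l
	Close #logfile
End Sub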

note that the first part of the log isn't written in a threaded environment
nor are any other files open; the startup is completely problem-free
there isn't any logging in the 5 minute span of the engine run, because nothing bad happened
but as i quit, it's supposed to add one last line and exit

so i guess there's a potential problem with my computer?

winxp 64bit, hdd is a western digital raptor 10,000 rpm
have had no issues with it at all..
badidea
Posts: 2591
Joined: May 24, 2007 22:10
Location: The Netherlands

Post by badidea »

"so i guess there's a potential problem with my computer?"

Test on a different computer?

"note that the first part of the log isn't written in a threaded environment"

Could it be a threading problem?

Last 2 lines should be swapped? like:

05-23-2011 01:04:01:: Game started
05-23-2011 01:09:38:: Game ending, user terminated

One of these 2 logged by a different thread?
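
One way to rule threading in or out is to serialize all log writes behind a mutex, so that the FreeFile/Open/Print/Close sequence can never interleave between threads. A minimal sketch using FreeBASIC's MutexCreate, MutexLock, MutexUnlock and MutexDestroy; the names logMutex and AddLogLocked are hypothetical and not part of the posted engine:

```freebasic
' Hypothetical global mutex guarding the log file (not in the original code).
Dim Shared As Any Ptr logMutex

' Same idea as the posted AddLog, but only one thread at a time can be
' between FreeFile and Close.
Sub AddLogLocked(ByRef l As String)
	MutexLock(logMutex)
	Dim As Long logfile = FreeFile()
	' Open returns 0 on success; skip the write if the log cannot be opened
	If Open(ExePath + "\testclient.log" For Append As #logfile) = 0 Then
		Print #logfile, Date() + " " + Time() + ":: " + l
		Close #logfile
	End If
	MutexUnlock(logMutex)
End Sub

' Create the mutex once, before any job threads are started.
logMutex = MutexCreate()
AddLogLocked("Game started")
' Destroy it only after all threads that log have finished.
MutexDestroy(logMutex)
```

If the out-of-order lines still appear with every log call going through a wrapper like this, threading in the logger is unlikely to be the cause.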
Gonzo
Posts: 722
Joined: Dec 11, 2005 22:46

Post by Gonzo »

ok, i've made a program that does heavy writing and reading tests, and tested it on all my drives
only my raptor drive has this issue
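
The test program itself was not posted. A minimal sketch of what such a write/read-back test might look like, assuming a hypothetical file name stress.tmp and record count; run it from a folder on the drive under test:

```freebasic
' Hypothetical stress test: write numbered records, read them back,
' and count every failed Put/Get or mismatched value.
Const RECORDS = 10000
Dim As String path = "stress.tmp"   ' assumed name, created in the current folder
Dim As Long f = FreeFile()
Dim As Long failures = 0

If Open(path For Binary As #f) <> 0 Then
	Print "open failed"
	End 1
End If

' Write pass: record i holds the value i (file positions are 1-based).
For i As Long = 0 To RECORDS - 1
	Dim As Long value = i
	If Put(#f, 1 + i * SizeOf(Long), value) <> 0 Then failures += 1
Next

' Read pass: every record must come back with the value that was written.
For i As Long = 0 To RECORDS - 1
	Dim As Long value
	If Get(#f, 1 + i * SizeOf(Long), value) <> 0 OrElse value <> i Then failures += 1
Next

Close #f
Kill path
Print "failures: "; failures
```

Reads issued right after the writes may be served from the OS cache rather than the disk, so a clean run here does not prove the drive is healthy; a non-zero failure count on one drive only is the interesting result.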

strangely though, im using this drive for all my games, and all my heavy programs
i'm conflicted on whether it's a drive issue or not :)

EDIT: well well well
it's thread closing time
i moved my project to another drive, and everything works fine!
kiyotewolf
Posts: 1009
Joined: Oct 11, 2008 7:42
Location: ABQ, NM

Post by kiyotewolf »

"You need to post some code to get any real help"

Hmm? Gonzo post code? Pfft, not likely.



:M