File I/O issues [solved, hardware problem]
this problem has been solved
i had a problem with my hard drive; moving the project to another drive fixed everything :)
Last edited by Gonzo on May 23, 2011 19:57, edited 3 times in total.
this is starting to anger me greatly
here is an image that proves many things:
http://fbcraft.fwsnet.net/stresstest.png
it's an image of a world-eater job in progress
it's a single job that expands exponentially until the work queue is full
at around 3000 jobs the I/O stuff fails, pretty much every time
1. nothing gets corrupted, ever
it's not an engine problem, except for the I/O-related parts, i guess
if it were something other than an I/O issue, bad things would happen very fast
2. what's saved to disk stays saved to disk, but can sometimes be zeroed out:
2a. the world-eater test was designed to prove that if a test doesn't need to read from disk, nothing gets zeroed out
2b. the falling blocks test requires reading from disk, and most of the time it fails miserably because the data on disk is zeroed out
the most likely reason (if you look at the code in the other post) is that some operations fail, such as getting the location of previous data
the engine then thinks there is no data in that chunk file and writes as if it were the first entry in the file, so everything else is lost
i can prove this with screenshots, and i see the effects of it regularly
basically, falling blocks is the same as a cave-in of sand blocks
if i make a huge area cave in, most of that cave-in is reset completely when i restart the engine, except for just one or two sectors :)
3. excerpt from the log showing the first errors for the world-eater test:
05-21-2011 14:19:46:: Error writing sectoral data: 4647129
05-21-2011 14:19:46:: Error writing sectoral data: 5147085
05-21-2011 14:19:46:: Error writing sectoral data: 5163477
summary:
falling blocks reads and writes: it causes file resets (my problem), but points to a root problem
world eater only writes: the test grows to a point where it doesn't continue, and it never resets any files (points to the same root problem)
something has to be wrong with the fbc file I/O system
edit: added more logging to get some meaningful queue data
05-21-2011 14:58:12:: Error writing sectoral data: 1909665
05-21-2011 14:58:12:: Amount of chunks in queue: 1
05-21-2011 14:58:12:: Amount of sectors in cq(0): 21
that's 1 file, but approx. 21 get#'s and 21 put#'s per update (0.01 seconds)
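The failure mode described in 2b (a failed read being mistaken for an empty chunk file) can be guarded against by telling "file is empty" apart from "read failed". The following is a minimal sketch, not the engine's actual code: the `ChunkHeader` layout, the `ReadHeader` function, the `ReadResult` enum, and the file name are all hypothetical, since the real chunk format is not shown in this thread. It relies on the function forms of `Open` and `Get`, which return 0 on success. As far as I know, `Get` also reports success when it hits end of file, so the sketch checks the number of bytes actually read as well.

```freebasic
#include "file.bi"   ' for FileExists

' Hypothetical header layout; the real chunk format is not shown in the thread.
Type ChunkHeader
    entries As Long       ' number of sector entries stored in the file
    firstOffset As Long   ' file position of the first entry
End Type

Enum ReadResult
    READ_OK       ' header was read completely
    READ_EMPTY    ' file is missing or has length 0: writing a first entry is safe
    READ_FAILED   ' I/O error or short read: do NOT treat the file as empty
End Enum

Function ReadHeader(ByRef path As String, ByRef hdr As ChunkHeader) As ReadResult
    If FileExists(path) = 0 Then Return READ_EMPTY

    Dim As Long f = FreeFile()
    If Open(path For Binary Access Read As #f) <> 0 Then Return READ_FAILED

    If Lof(f) = 0 Then
        Close #f
        Return READ_EMPTY
    End If

    ' Read one header from position 1 and record how many bytes arrived.
    Dim As UInteger got = 0
    Dim As Long e = Get(#f, 1, hdr, 1, got)
    Close #f

    If e <> 0 OrElse got <> SizeOf(ChunkHeader) Then Return READ_FAILED
    Return READ_OK
End Function

' Usage: only start a fresh file when the result is READ_EMPTY.
Dim As ChunkHeader hdr
Select Case ReadHeader("chunk_example.dat", hdr)
Case READ_OK
    Print "append after the existing entries: "; hdr.entries
Case READ_EMPTY
    Print "no data yet, safe to write the first entry"
Case READ_FAILED
    Print "read failed, keep the job queued and do not overwrite"
End Select
```

With a three-way result like this, a transient read error would leave the existing data on disk untouched instead of being overwritten as if the file were new.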
i figured out a way to make it a lot more tolerant
i had another file open (i keep the logfile open constantly)
never opening the logfile added a lot more tolerance to the whole thing
now i can run a small block fall cascade and blow up a few TNTs with no issues
if i do heavier stuff, it will still bug out
Have you considered that this may be a file system issue?
What OS are you running?
What file system are you running?
What sort of disk hardware are you running and how is it configured?
How many physical disks and logical drives are involved?
Are your files on the same disk/drive as the OS or on a different disk/drive?
Is disk caching enabled?
What is the fragmentation status of the target drive?
How much free disk space do you have on the target drive?
How many files are you working with, what are the sizes, and what is the total size?
Are your access patterns sequential, random, or something else?
How many threads are actively accessing the files?
Etc?
Gonzo wrote:
i figured out a way to make it alot more tolerant
i had another file open (i have the logfile open constantly)
never opening the logfile added alot more tolerance to the whole thing
now i can run a small block fall cascade, blow up a few TNTs and no issues
if i do heavier stuff, it will still bug out
I had a similar problem with multiple file read and writes (although from the looks of it my program runs through a lot less data than yours).
Instead of keeping my log file open, which seemed to cause problems when other files were accessed, I just open the log, write to it, and close it again. This substantially increases the number of times the files are opened and closed, creates a lot more read/writes, but stops the problem of losing all the data.
elsairon wrote:
I had a similar problem with multiple file read and writes (although from the looks of it my program runs through a lot less data than yours).
Instead of keeping my log file open, which seemed to cause problems when other files were accessed, I just open the log, write to it, and close it again. This substantially increases the number of times the files are opened and closed, creates a lot more read/writes, but stops the problem of losing all the data.
yep, but it didnt solve my problem completely
any data loss at all is unacceptable
i've operated under the assumption that the bug was mine, but
i tried disabling caching for the drive that the engine uses, and shockingly the problem still happens :)
must be something i'm doing then, right?
i still don't get why writing to files sometimes fails with completely valid offsets..
05-22-2011 21:37:19:: Error writing sectoral data: 16389
05-22-2011 21:37:19:: Amount of chunks in queue: 1
05-22-2011 21:37:19:: Amount of sectors in cq(0): 1
back to debugging...
05-23-2011 01:04:00:: *** Starting up
05-23-2011 01:04:00:: Reading configuration file
05-23-2011 01:04:00:: Read 56 variables
05-23-2011 01:04:00:: Initializing GLFW...
05-23-2011 01:04:00:: Generating GL textures...
05-23-2011 01:04:01:: Precomputing vertex data
05-23-2011 01:04:01:: *** Loading world data for world: worlds\w3
05-23-2011 01:04:01:: Initializing world generator, using seed: 474
05-23-2011 01:04:01:: Initializing & generating sectors
05-23-2011 01:04:01:: Creating job threads
05-23-2011 01:04:01:: Initializing sound system
05-23-2011 01:09:38:: Game ending, user terminated
05-23-2011 01:04:01:: Game started
i'm sure you can immediately see what's strange here
i don't know much about file I/O, but the last 2 lines are 5 minutes apart and in the wrong order
note that the first part of the log isn't written in a threaded environment
no other files are open either, and the startup is completely problem-free
there isn't any logging during the 5-minute engine run, because nothing bad happened
but as i quit, it's supposed to add one last line and exit
so i guess there's a potential problem with my computer?
winxp 64-bit, the hdd is a western digital raptor, 10,000 rpm
i've had no issues with it at all..
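Code:
Sub AddLog(ByRef l As String)
    ' open, append one timestamped line, and close again on every call
    ChDir ExePath
    Dim As Long logfile = FreeFile()
    ' Open returns non-zero on failure; skip the write rather than print to an unopened file
    If Open(ExePath + "\testclient.log" For Append As #logfile) <> 0 Then Exit Sub
    Print #logfile, Date() + " " + Time() + ":: " + l
    Close #logfile
End Sub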
"so i guess theres a potential problem with my computer?"
Test on a different computer?
"note that the first part of the log isnt in a threaded enviroment"
Could it be a threading problem?
Should the last 2 lines be swapped? Like:
05-23-2011 01:04:01:: Game started
05-23-2011 01:09:38:: Game ending, user terminated
Was one of these 2 logged by a different thread?
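One way to rule out the threading question raised above is to serialize the logging sub with a mutex. This is a sketch under assumptions, not the engine's code: `AddLogLocked`, `logMutex`, and `Worker` are hypothetical names, and only the log file name and line format come from the thread. The concern it addresses is that `FreeFile` only reports the next unused file number, so two threads calling it before either has run `Open` could, as far as I know, receive the same number. It would not explain the out-of-order timestamps by itself.

```freebasic
' Hypothetical global lock shared by every thread that logs.
Dim Shared As Any Ptr logMutex

Sub AddLogLocked(ByRef l As String)
    ' Hold the lock across FreeFile, Open, Print and Close so that no other
    ' logging thread can grab the same file number in between.
    MutexLock(logMutex)
    Dim As Long logfile = FreeFile()
    If Open(ExePath + "\testclient.log" For Append As #logfile) = 0 Then
        Print #logfile, Date() + " " + Time() + ":: " + l
        Close #logfile
    End If
    MutexUnlock(logMutex)
End Sub

' Demo worker: each thread appends ten lines.
Sub Worker(ByVal userdata As Any Ptr)
    For i As Long = 1 To 10
        AddLogLocked("worker line " + Str(i))
    Next
End Sub

logMutex = MutexCreate()

Dim As Any Ptr threads(0 To 3)
For t As Long = 0 To 3
    threads(t) = ThreadCreate(@Worker, 0)
Next
For t As Long = 0 To 3
    ThreadWait(threads(t))
Next

MutexDestroy(logMutex)
```

The lock here only covers logging. If job threads also call `FreeFile` and `Open` for chunk files, those calls would need to take the same lock (or use fixed file numbers per thread) for the protection to mean anything.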
ok, i've made a program that does heavy writing and reading tests, and tested it on all my drives
only my raptor drive has this issue
strangely though, i'm using this drive for all my games and all my heavy programs
i'm conflicted on whether it's a drive issue or not :)
EDIT: well well well
it's thread closing time
i moved my project to another drive, and everything works fine!
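The test program mentioned in this post was not shared, so the following is only a guess at what a per-drive write and read-back test could look like. `StressTest`, the record size, the record count, and the file name are all assumptions. It writes a known byte pattern at fixed offsets with `Put`, reads each record back with `Get`, and counts every error code, short read, or mismatch.

```freebasic
Const RECORD_BYTES = 4096    ' assumed record size
Const RECORD_COUNT = 2000    ' assumed number of records

' Returns the number of failed or mismatching records, or -1 if the file cannot be opened.
Function StressTest(ByRef path As String) As Long
    Dim As UByte outBuf(0 To RECORD_BYTES - 1)
    Dim As UByte inBuf(0 To RECORD_BYTES - 1)
    Dim As Long failures = 0

    Dim As Long f = FreeFile()
    If Open(path For Binary Access Read Write As #f) <> 0 Then Return -1

    ' Write phase: every record gets a pattern derived from its index.
    For i As Long = 0 To RECORD_COUNT - 1
        For j As Long = 0 To RECORD_BYTES - 1
            outBuf(j) = (i + j) And 255
        Next
        Dim As LongInt offset = CLngInt(i) * RECORD_BYTES + 1   ' file positions are 1-based
        If Put(#f, offset, outBuf(0), RECORD_BYTES) <> 0 Then failures += 1
    Next

    ' Read phase: verify the error code, the byte count, and the contents.
    For i As Long = 0 To RECORD_COUNT - 1
        Dim As LongInt offset = CLngInt(i) * RECORD_BYTES + 1
        Dim As UInteger got = 0
        If Get(#f, offset, inBuf(0), RECORD_BYTES, got) <> 0 OrElse got <> RECORD_BYTES Then
            failures += 1
            Continue For
        End If
        For j As Long = 0 To RECORD_BYTES - 1
            If inBuf(j) <> ((i + j) And 255) Then
                failures += 1
                Exit For
            End If
        Next
    Next

    Close #f
    Return failures
End Function

' Run once per drive by changing the path (hypothetical file name).
Print "failed records: "; StressTest("iotest.bin")
```

Note that a read-back done right after the write may be served from the OS or drive cache rather than the platters, so a clean run of a test like this does not fully clear a drive. A run with failures on only one drive, as reported here, is still a strong hint.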