Author Topic: mySQL and Bad RAM (Read 2262 times)

DaveLembke · « **on:** March 15, 2017, 08:55:42 AM »

Just sharing how I thought MySQL would be pretty solid, given that it's a transactional database. I have a laptop that I was using to build up data by parsing public-domain info online and dropping it into my database to run an analysis on. I knew going in that this laptop has a RAM stick that failed MEMTEST86, but otherwise it never crashed and acted fine. I only ran MEMTEST86 on it because I run it on every system I get as a quick test of system health. The RAM in this system didn't fail every test loop, but about 2 out of 6 would show a pattern mismatch.

So I used it mainly because it otherwise never crashes on me, never shows any signs of illness, and was the closest laptop to grab and load up for this project.

Well, after 9 hours of running, all was going well. I went to end the program I had automating the info gathering, and that's when the BSOD hit. After rebooting the laptop, I went to start MySQL, which I have set to manual start for the service. MySQL showed many error messages, but the database came up and was running.

Before messing around in the database I ran a backup of it with an all-databases dump, and it completed way too quickly. The dump of all databases usually takes about a minute, and this one completed in about 8 seconds. So I looked at the SQL dump file to see how far it got before it bailed out, and which table it was dumping when it hit the failure. I wrote that down, then looked at the log file and saw the error messages.
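A quick way to spot a truncated dump like this without timing it is to look at the end of the file. The sketch below assumes the dump was made by `mysqldump` with its default comment output, which ends the file with a `-- Dump completed` line; if comments were turned off (for example with `--skip-comments`), that marker will not be there and this check does not apply. The function name `dump_looks_complete` is just made up for illustration.

```python
import os


def dump_looks_complete(path, tail_bytes=4096):
    """Return True if the last non-empty line is mysqldump's completion comment.

    Assumes default mysqldump comment output; a dump made with comments
    disabled will always look "incomplete" to this check.
    """
    with open(path, "rb") as f:
        # Only read the tail so this stays cheap on large dump files.
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - tail_bytes))
        tail = f.read().decode("utf-8", errors="replace")

    lines = [ln.strip() for ln in tail.splitlines() if ln.strip()]
    return bool(lines) and lines[-1].startswith("-- Dump completed")
```

A dump that bailed out partway through a table would normally end in the middle of its data and fail this check, which is a cheaper first signal than noticing the run time was off.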

So then I went to perform a database repair. I thought MySQL had something like a shadow copy feature, where it kept a previous good state that the repair could rebuild the database from, but I was wrong. It seems the only way to maybe protect myself from a total trashing of the MySQL database is to have multiple machines running that database in master-master mode, or a cluster with a team of systems to handle this. However... if you have one server that has BAD RAM, I am curious whether it would poison the database mirrors on the other healthy systems, since the rules are thrown out the door when RAM goes BAD. It might just crash the one server that had the bad memory while the healthy ones keep the database up, but it could also allow corrupt information from a bad memory stick to be written to all the other systems' databases, and one bad apple spoils the bunch.

Curious if anyone knows the answer to this: can bad memory on one system poison the other systems' databases through replication, to the point where redundancy isn't even a cure for a problem like this? Is there any fail-safe against a system turning rogue when it loses its mind due to memory errors, to protect the other systems, such as a check of the system's health before data is replicated between systems? If nothing like this exists, it could make a neat little fail-safe project for someone: a program that works with MySQL or MariaDB etc. and checks another system's event logs for signs of trouble before each replication interval. However, a RAM failure could still happen during the replication process itself, where it wouldn't be detected.
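To make the fail-safe idea concrete, here is a minimal sketch of the "check the logs before replicating" gate. Everything in it is hypothetical: the function name `safe_to_replicate`, the pattern list, and the assumption that the other system's log lines have already been fetched as plain text. It is not an existing MySQL or MariaDB feature.

```python
import re

# Hypothetical trouble signatures. The right patterns depend entirely on the
# operating system and on what its event logs actually record.
TROUBLE_PATTERNS = [r"machine check", r"memory error", r"bugcheck"]


def safe_to_replicate(log_lines, patterns=TROUBLE_PATTERNS):
    """Return (ok, hits): ok is False if any log line matches a trouble pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    hits = [line for line in log_lines if any(c.search(line) for c in compiled)]
    return (len(hits) == 0, hits)
```

The catch is the one already raised above: ordinary non-ECC RAM does not detect its own errors, so a flipped bit usually leaves nothing in any log to match against. A gate like this would only catch trouble that had already surfaced as a crash or a logged hardware event.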

The good thing is that I didn't lose all my data and have to start from scratch. I perform a dump after each run, so I have a restore point to continue building from. But 9 hours of processing was lost because I was too trusting of a system that otherwise behaved, even though MEMTEST86 warned me to steer clear of using it for anything important.

Just replaced the BAD Samsung (2 x 1GB DDR2 533MHz) RAM sticks with a set of known-good Kingston (2 x 1GB DDR2 533MHz) KTD-INSP6000A/1G sticks. They came out of a parts-donor laptop that is otherwise healthy but has a single-core Celeron M that is far too weak in processing power, and went into this more powerful Core 2 Duo laptop. I'm running MEMTEST86 on the system now to verify that the RAM issue is fixed and not memory controller related.

BC_Programmer · « **Reply #1 on:** March 15, 2017, 09:37:36 AM »

There is no realistic fault tolerance for any database system if memory errors cannot be detected, except to have, say, daily backups from the database itself and hope to be able to go back to before the corruption.

"Real" database servers will have ECC memory. (They also don't usually run on old, second-hand/third-hand laptops and desktops...) Memory errors will cause hardware failure before they propagate to the database. In most such servers those sorts of errors fall under the same category as too many fans failing, in that the system shuts down. So as soon as memory starts to have issues, the system starts to die and you know immediately that service is needed. Basically, as soon as it becomes unsuitable as a database server, it dies.

While the system is running it replicates to a second server. That second server acts as a fail-over if the first one dies for any reason. Because memory errors cause an instant shutdown, the first server won't replicate them to the second, so the fail-over can be considered reliable.

And that second one can have a third fail-over and so forth, depending on how much reliability is needed or how many failures can be expected.

DaveLembke · « **Reply #2 on:** March 15, 2017, 03:20:44 PM »

Thanks for that info BC ... Now I realize the importance of a server-grade system vs a standard off-the-shelf computer for database servers. ECC memory has error-correction advantages that standard computers don't have.

I have a server with two dual-core CPUs and 16GB of ECC RAM that I could use I suppose, but it's power hungry, so I think I will take my chances with this laptop with the new good used RAM in it that passed 10 of 10 full tests. The server has a 1000 watt power supply in it and consumes about 365 watts when idle. The Core 2 Duo laptop uses about 52 watts, so over 9 hours of runtime the server consumes 3.285 kWh and the laptop 0.468 kWh. If the project was mission critical then sure, I'd pay the power company more money to run this, but for now it's worth the risk of another crash with this laptop. I doubt it will crash again, as MEMTEST86 shows the pair of memory sticks installed today as healthy, whereas the pair that came with the Gateway laptop failed 2 out of 6 full tests with a pattern mismatch at the same memory address in the 632MB block. It's not for a business or anything like that; it's just information gathered, with reports created afterward that show statistical info.

Cost comparison based on 15 cents per kWh shows the server using 49.3 cents of power for what the laptop can do for 7 cents.
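For anyone who wants to redo the numbers with their own wattage or electricity rate, the arithmetic is just watts times hours divided by 1000 to get kWh, then kWh times the rate. A small sketch using the figures from this post (the function names are made up for illustration):

```python
def energy_kwh(watts, hours):
    """Energy used by a constant load, in kilowatt-hours."""
    return watts * hours / 1000.0


def cost_cents(kwh, cents_per_kwh=15):
    """Cost of that energy at a flat rate, in cents."""
    return kwh * cents_per_kwh


HOURS = 9
server_kwh = energy_kwh(365, HOURS)  # idle draw of the server
laptop_kwh = energy_kwh(52, HOURS)   # draw of the Core 2 Duo laptop

print(f"server: {server_kwh:.3f} kWh, {cost_cents(server_kwh):.1f} cents")
print(f"laptop: {laptop_kwh:.3f} kWh, {cost_cents(laptop_kwh):.1f} cents")
```

Note the server figure uses its idle draw, so the real gap under load would likely be wider.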

If it was a business project where it's mission critical, I'd pay the extra to be safe I suppose.

Computer Hope Forum
