
MySQL Problems! – RED ALERT

The last two days have been maddening for me. On two servers, out of the blue, MySQL is freakin’ out, and freakin’ out hard. It started with corrupted tables and progressed to memory performance issues. I then jacked up some memory values, and now both servers are crashing. I think I might be dealing with two problems (the second related to the first). I posted on the Gentoo forum, but it became another classic ‘talking-to-myself’ thread. Man, I miss the days when you’d get several responses within minutes. My problems/questions seem pretty difficult lately, and that’s probably a big reason for the lack of replies.

Below is the basic thread of my problem. If anyone has any suggestions, please comment!

I’m getting desperate for some MySQL help on one of my servers. It started mysteriously last night. One of my applications (whose code has not changed in several months to a year) complained about corrupted tables. Viewing the logs, I was definitely seeing MyISAM corruption, which I could replicate by issuing an OPTIMIZE after the SQL performed an UPDATE. Again, this code has been like this for many months. I commented out the OPTIMIZE, repaired the tables, and that seemed to fix the problem.
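
For anyone who wants to follow along, here is roughly the pattern that triggered the corruption, plus the repair I ran afterwards. This is a sketch: the table name comes from the logs below, but the UPDATE’s columns are made up, and my application’s actual query is different.


-- The pattern that triggered the corruption: an UPDATE followed
-- immediately by an OPTIMIZE on the same MyISAM table.
UPDATE tblUsers SET last_login = NOW() WHERE user_id = 42;  -- hypothetical columns
OPTIMIZE TABLE tblUsers;  -- the call I have since commented out

-- Checking and repairing the table once it was marked as crashed:
CHECK TABLE tblUsers;
REPAIR TABLE tblUsers;

Here is what the first server’s error log showed: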


071029 16:58:30 InnoDB: Started; log sequence number 0 4530733
071029 16:58:31 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.44' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Gentoo Linux mysql-5.0.44
071106 22:38:14 [Warning] Warning: Optimize table got errno 12 on compdb.tblUsers, retrying
071106 22:38:14 [ERROR] /usr/sbin/mysqld: Incorrect key file for table './compdb/tblUsers.MYI'; try to repair it
071106 22:38:14 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed
071106 22:38:14 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed
071106 22:44:19 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed
071106 22:44:19 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed
071106 23:34:14 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed
071106 23:34:14 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed
071106 23:34:14 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed
071106 23:34:14 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed
071106 23:34:14 [ERROR] /usr/sbin/mysqld: Table './compdb/tblUsers' is marked as crashed and last (automatic?) repair failed

Later that night, another one of my servers began seeing the same thing, but I received additional error info (slightly different from the first server)…


071107 23:35:28 [Warning] Warning: Optimize table got errno 12 on compdb2.tblProducts, retrying
071107 23:36:59 [Warning] Warning: Optimize table got errno 12 on compdb2.tblProducts, retrying
071107 23:42:42 [Warning] Warning: Optimize table got errno 12 on compdb2.tblProducts, retrying
071107 23:42:42 [ERROR] /usr/sbin/mysqld: Incorrect key file for table './compdb2/tblProducts.MYI'; try to repair it
071107 23:42:42 [ERROR] /usr/sbin/mysqld: Table './compdb2/tblProducts' is marked as crashed and last (automatic?) repair failed
071107 23:42:42 [ERROR] /usr/sbin/mysqld: Table './compdb2/tblProducts' is marked as crashed and last (automatic?) repair failed
071108 2:50:24 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 179256 bytes)
071108 2:50:24 [ERROR] Out of memory; check if mysqld or some other process uses all available memory; if not, you may have to use 'ulimit' to allow mysqld to use more memory or you can add more swap space
071108 3:01:47 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 179256 bytes)
071108 3:01:47 [ERROR] Out of memory; check if mysqld or some other process uses all available memory; if not, you may have to use 'ulimit' to allow mysqld to use more memory or you can add more swap space

Now, this server has a similar environment; not much has changed (ntp, debianutils, and pciutils are about it for both servers). As we can see, this one is complaining about memory usage, and the earlier errno 12 warnings on the OPTIMIZE point the same way, since errno 12 is ENOMEM (“Cannot allocate memory”). The odd thing is that no swap was used and the load was under 1 during all of this. I restarted the MySQL server, and everything looked good again. I also removed the OPTIMIZE call here as well. After some investigation, I saw that you could raise memory limits to help the OPTIMIZE, so I edited my.cnf on both servers, changing the following:


#sort_buffer_size = 512K
sort_buffer_size = 2M
#read_buffer_size = 256K
read_buffer_size = 2M
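
After restarting mysqld, a quick sanity check confirms the new values actually took effect (the expected byte values assume the 2M settings above):


-- Verify the per-connection buffers picked up the new my.cnf values.
SHOW VARIABLES LIKE 'sort_buffer_size';  -- expect 2097152 (2M)
SHOW VARIABLES LIKE 'read_buffer_size';  -- expect 2097152 (2M)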

Now, the first server spirals out of control (at random) with the following messages:


071108 8:06:25 InnoDB: Started; log sequence number 0 4530733
071108 8:06:25 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.44' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Gentoo Linux mysql-5.0.44
071108 15:14:22 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 15:14:26 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 15:15:01 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677564 bytes)
071108 15:15:25 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 15:15:27 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1258272 bytes)
071108 15:15:36 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 15:16:16 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677696 bytes)
071108 15:16:22 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677720 bytes)
071108 15:16:22 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677720 bytes)
071108 15:16:37 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677720 bytes)
--- mysql restart here

Approximately two hours later, it happened again…


071108 15:39:40 InnoDB: Started; log sequence number 0 4530733
071108 15:39:40 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.44' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Gentoo Linux mysql-5.0.44
071108 17:15:10 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 17:15:21 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 17:15:30 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 17:15:40 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 17:16:01 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 17:16:21 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
071108 17:16:26 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677720 bytes)
071108 17:16:30 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 1677372 bytes)
--- mysql restart here

The odd thing is that load is not outrageous when this happens: load averages are really low on both servers (1 at the highest), and both were seeing zero swap usage. Both servers run PHP applications that have not changed in many months. Both servers have had the standard updates without any incident. I’ll supply the genlop output in case there is an issue I missed:


Sun Nov 4 08:37:17 2007 >>> sys-process/lsof-4.78-r1
Sun Nov 4 08:37:31 2007 >>> sys-apps/pciutils-2.2.7-r1
Tue Nov 6 07:53:22 2007 >>> net-misc/ntp-4.2.4_p4
Thu Nov 8 08:31:13 2007 >>> sys-apps/debianutils-2.25
Thu Nov 8 08:37:17 2007 >>> dev-libs/openssl-0.9.8g

Here is my current version of MySQL:


[ebuild R ] dev-db/mysql-5.0.44 USE="latin1 perl ssl -berkdb -big-tables -cluster -debug -embedded -extraengine -max-idx-128 -minimal (-selinux) -static" 23,869 kB

Here is a snapshot of top so you can see how much memory I have, etc.:


top - 18:19:18 up 10 days, 10:21, 5 users, load average: 0.08, 0.08, 0.10
Tasks: 113 total, 1 running, 112 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.9%us, 0.5%sy, 0.0%ni, 95.8%id, 0.7%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 1540848k total, 1492248k used, 48600k free, 266468k buffers
Swap: 976744k total, 0k used, 976744k free, 537800k cached

Here is the ps output for mysqld:


31159 mysql 18 0 1336m 30m 4776 S 0.7 2.0 0:08.91 mysqld
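
In hindsight, one thing worth flagging: sort_buffer_size and read_buffer_size are per-connection buffers, not global ones, so raising them from 512K/256K to 2M each multiplies across every thread MySQL may open. A rough worst-case figure can be pulled straight from the server. This is a back-of-the-envelope sketch, not an exact accounting (these buffers are only allocated when a query actually needs them):


-- Back-of-the-envelope worst case: the global key buffer plus the
-- per-connection buffers multiplied out across max_connections.
SELECT (@@key_buffer_size
        + @@max_connections * (@@sort_buffer_size
                               + @@read_buffer_size
                               + @@read_rnd_buffer_size))
       / 1024 / 1024 AS worst_case_mb;

If that number climbs past the 1.5GB of physical RAM shown above, mysqld failing its allocations under load would make sense even with swap untouched; a per-process limit (the ulimit the error log itself mentions) could also produce this pattern.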

So, after doing some more tests, re-emerging MySQL, etc., I thought I had it. Of course, this morning, server 2 crashed. The difference is that the ‘new’ memory values were still in place on server 2, whereas I had already reverted them on server 1. I reverted server 2’s values and restarted. I also used MySQL Administrator to watch memory usage via its graphs, and I was able to recreate the crash. Here are some graphs to look at:

This is an illustration of the crash. I was hitting a site hard, and the crash happened within minutes:
http://www.uno-code.com/files/crash.gif

Here is a graph after reverting the memory values. I’m still hitting the site, and it’s not crashing. Is it fixed now? Who knows.
http://www.uno-code.com/files/crash2.gif