alanwilliamson

What do you do when Java is gobbling up 99% CPU?

One of our servers, which had been running happily for months without any major problem, suddenly started jumping to 99% CPU and never coming back down to normal levels. The traffic on the site hadn't changed, so something else was causing the jump. This went on for days and I still couldn't pinpoint it. It seemed the machine was stuck in a tight loop of some sort. I carefully profiled all the CFML running on the machine, but there was nothing in the pages that would cause it to enter a tight loop and never return.

The time had come to explore a thread dump. I had attempted this before, but the JVM never came back from the tight loop that was producing the 99% CPU load, so kill -3 <pid> yielded nothing. After a little searching around, I discovered a tip from BEA's WebLogic support suggesting that you start the JVM with the JIT disabled (java -nojit). That didn't help.
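For anyone who wants the same information as kill -3 from inside the JVM, here is a minimal sketch using the standard java.lang.management API. It is not what I ran at the time: the class name HotThreadDump and the dump method are made up for illustration. Collecting stack traces may well stall in the same way the signal-based dump did if the JVM is truly wedged, so treat it as something to try rather than a guaranteed way in.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists live threads, busiest (by CPU time) first.
class HotThreadDump {

    /** Builds a text dump of live threads with up to maxFrames stack frames each. */
    static String dump(int maxFrames) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // CPU timing is optional on some JVMs, so check before asking for it.
        boolean cpuTimes = mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();

        List<long[]> rows = new ArrayList<>(); // each row: {threadId, cpuNanos}
        for (long id : mx.getAllThreadIds()) {
            long cpu = cpuTimes ? mx.getThreadCpuTime(id) : -1L;
            rows.add(new long[] {id, cpu});
        }
        rows.sort((a, b) -> Long.compare(b[1], a[1])); // highest CPU time first

        StringBuilder out = new StringBuilder();
        for (long[] row : rows) {
            ThreadInfo info = mx.getThreadInfo(row[0], maxFrames);
            if (info == null) {
                continue; // thread ended between the two calls
            }
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState())
               .append(" cpuMs=").append(row[1] < 0 ? -1 : row[1] / 1_000_000)
               .append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump(20));
    }
}
```

A thread that sits at the top of this list with an ever-growing cpuMs and the same stack on every run is the likely culprit.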

The next thing to look at was a tip suggesting that you attach the Java debugger (jdb) to the process and see if you can suspend it that way. Again, the source of this information was another J2EE provider, Resin. You start the Java process with the following flags: -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5432 and then, in another shell, attach the debugger to it using: $JAVA_HOME/bin/jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=5432. That all worked beautifully.

However, when the time came to actually suspend the process, the debugger simply hung and never returned. So the debugger route was a washout.

So now I was kinda stuck as to what to do. Then, in a moment of desperation, I simply typed a huge, long, silly question into Google asking why it wasn't working. On page 2 of the results was my answer, from yet another J2EE server, Tomcat. The MySQL JDBC driver I was using had a very strange bug: when you were paging through a result set using LIMIT x,y and reached the last page, it would go into an infinite loop if the page boundary was exact. For example, with 100 results at 10 results per page, the 10th page would throw the driver into a loop.
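To make the "exact page boundary" condition concrete, here is a small sketch of the paging arithmetic. The class and method names are mine, purely for illustration, and this only computes the LIMIT values. It does not reproduce the driver bug itself.

```java
// Hypothetical helper showing when the last page of a LIMIT x,y query is exactly full.
class PagingMath {

    /** Number of pages needed to show totalRows at pageSize rows per page. */
    static int pageCount(int totalRows, int pageSize) {
        return (totalRows + pageSize - 1) / pageSize; // ceiling division
    }

    /** Zero-based row offset for a 1-based page number. */
    static int offsetFor(int page, int pageSize) {
        return (page - 1) * pageSize;
    }

    /** The LIMIT clause for a 1-based page number, in MySQL's "LIMIT offset,count" form. */
    static String limitClause(int page, int pageSize) {
        return "LIMIT " + offsetFor(page, pageSize) + "," + pageSize;
    }

    /** True when the final page is completely full, i.e. the boundary is exact. */
    static boolean lastPageIsExact(int totalRows, int pageSize) {
        return totalRows > 0 && totalRows % pageSize == 0;
    }

    public static void main(String[] args) {
        int pages = pageCount(100, 10);
        System.out.println(pages);                    // prints 10
        System.out.println(limitClause(pages, 10));   // prints LIMIT 90,10
        System.out.println(lastPageIsExact(100, 10)); // prints true
        System.out.println(lastPageIsExact(95, 10));  // prints false
    }
}
```

With 100 rows the 10th page is LIMIT 90,10 and comes back completely full, which is the case that tripped the driver. With 95 rows the last page holds only 5 rows, so the boundary is not exact.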

I upgraded the MySQL driver and, after setting some backward-compatibility switches on it, the whole thing burst into life, with no problems reported since. Armed with this newfound information, I went looking at my queries and noticed that a new feature had been introduced to page through results. This was indeed the cause.

So thanks to WebLogic, Resin and finally Tomcat for helping me get to the root of my MySQL problem!


 
