WAS: tomcat 5.0.16 Replication
2004-01-12 - By jean-philippe.belanger@(protected)
That sounds good.
I'll get the CVS head and check this out. We won't really put much stress on those servers for a while, but as long as the behavior is the same, I buy! :)
btw: is there a pool config or is it hardcoded for now?
Thanks again Filip.
Jean-Philippe Belanger
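For readers wondering what "pooled" could look like in practice: the quoted message below says the mode stays synchronous but sends over a pool of sockets. Here is a minimal sketch of that idea, assuming a fixed-size pool backed by a BlockingQueue. It is not the catalina-cluster implementation, and SimplePool, withResource and idleCount are hypothetical names made up for illustration. The pooled item is generic so the sketch runs without a network; in a real sender it would be a connected Socket.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical fixed-size pool; NOT Tomcat's pooled replication sender.
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    // Creates all pooled resources up front (e.g. one socket per slot).
    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Borrows a resource, runs the work on the calling thread (so the call
    // stays synchronous), and always hands the resource back afterwards.
    <R> R withResource(Function<T, R> work) {
        T resource;
        try {
            resource = idle.take(); // waits if every resource is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a pooled resource", e);
        }
        try {
            return work.apply(resource);
        } finally {
            idle.add(resource); // a slot is free because we took this resource out
        }
    }

    // Number of resources currently not borrowed.
    int idleCount() {
        return idle.size();
    }
}
```

With a pool like this, several request threads can replicate at the same time, each on its own connection, while every individual call still waits for its own send to finish. Whether the pool size is configurable in the real code is exactly the open question above.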
Filip Hanik wrote:
>Steve and Jean-Philippe,
>I've been working on some more replication stuff and made a major change
>that I think you might want to use.
>I have added a third configuration to the parameter replicationMode,
>
>replicationMode="pooled"
>
>With this setting it still is synchronized replication, but uses a pool of
>sockets to replicate the data.
>It improves performance a lot. Try it out, and let me know how it works for
>you.
>You will notice the improvement under load.
>
>of course, get latest from cvs first
>
>Filip
>
>-----Original Message-----
>From: Steve Nelson [mailto:Steve.Nelson@(protected)]
>Sent: Friday, January 09, 2004 12:05 PM
>To: 'Tomcat Users List'
>Subject: RE: tomcat 5.0.16 Replication
>
>Hrmmm, perhaps I should reboot using the non-SMP kernel and try it. I'll
>have to do that when I get back to the servers.
>
>-----Original Message-----
>From: Steve Nelson [mailto:Steve.Nelson@(protected)]
>Sent: Friday, January 09, 2004 2:04 PM
>To: 'Tomcat Users List'
>Subject: RE: tomcat 5.0.16 Replication
>
>uname -a
>machine #1) Linux draco 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux
>machine #2) Linux scorpio 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux
>
>java -version:
>java version "1.4.2_03"
>Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_03-b02)
>Java HotSpot(TM) Client VM (build 1.4.2_03-b02, mixed mode)
>
>same on both
>
>-----Original Message-----
>From: Filip Hanik [mailto:devlists@(protected)]
>Sent: Friday, January 09, 2004 1:56 PM
>To: Tomcat Users List
>Subject: RE: tomcat 5.0.16 Replication
>
>[root@(protected) bin]# uname -a
>Linux rh9 2.4.20-8 #1 Thu Mar 13 17:54:28 EST 2003 i686 i686 i386 GNU/Linux
>
>[root@(protected) bin]# java -version
>java version "1.4.2_03"
>Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_03-b02)
>Java HotSpot(TM) Client VM (build 1.4.2_03-b02, mixed mode)
>
>-----Original Message-----
>From: Steve Nelson [mailto:Steve.Nelson@(protected)]
>Sent: Friday, January 09, 2004 11:05 AM
>To: 'Tomcat Users List'
>Subject: RE: tomcat 5.0.16 Replication
>
>sun JDK 1.4.2 for Linux
>Kernel 2.4.20-8smp
>Tomcat 5.0.16 with catalina-cluster.jar from CVS head
>
>Hrmmm....are yours SMP servers? Could be something odd with synch if that is
>the case.
>
>-----Original Message-----
>From: Filip Hanik [mailto:devlists@(protected)]
>Sent: Friday, January 09, 2004 1:01 PM
>To: Tomcat Users List
>Subject: RE: tomcat 5.0.16 Replication
>
>interesting, mine doesn't work at all unless I set the LD_ASSUME_KERNEL
>
>what VM (version and name) are you using?
>
>Filip
>
>-----Original Message-----
>From: Steve Nelson [mailto:Steve.Nelson@(protected)]
>Sent: Friday, January 09, 2004 10:59 AM
>To: 'Tomcat Users List'
>Subject: RE: tomcat 5.0.16 Replication
>
>Now that's really very strange. I am running RH9 and everything seems to go
>through just fine.
>
>-----Original Message-----
>From: jean-philippe.belanger@(protected)
>[mailto:jean-philippe.belanger@(protected)]
>Sent: Friday, January 09, 2004 12:56 PM
>To: Tomcat Users List
>Subject: Re: tomcat 5.0.16 Replication
>
>The replication message ACK never gets back to the sender,
>so my web pages never load without that flag.
>
>I think it is only needed under Red Hat 9.
>
>Jean-Philippe Bélanger
>
>Steve Nelson wrote:
>
>>I don't seem to need the ld_assume_kernel thing. What are the symptoms when
>>it is required?
>>
>>-----Original Message-----
>>From: jean-philippe.belanger@(protected)
>>[mailto:jean-philippe.belanger@(protected)]
>>Sent: Friday, January 09, 2004 12:33 PM
>>To: Tomcat Users List
>>Subject: Re: tomcat 5.0.16 Replication
>>
>>Just tried the CVS head and everything works without any CPU going crazy!
>>only if ld_assume_kernel is set to 2.4
>>
>>One more question for you Filip, is the useDirtyFlag working at all? It
>>seems like even if it's set to true, the whole session gets replicated
>>after each request. :(
>>
>>Jean-Philippe
>>
>>jean-philippe.belanger@(protected) wrote:
>>
>>>Hurray for Filip! :)
>>>
>>>I'll get the CVS head for the module today and test this out.
>>>Happy to see that it got fixed that quickly!
>>>
>>>Thanks again and I'll let you know how it goes
>>>
>>>Jean-Philippe
>>>
>>>Filip Hanik wrote:
>>>
>>>>Jean-Philippe and Steve,
>>>>I fixed the bug, and tried replication on RH9. Immediately it didn't
>>>>work.
>>>>The problem is that when RH9 tries to write the ACK back to the NIO
>>>>socket, it never reaches the other node, and times out after a long time.
>>>>
>>>>I set LD_ASSUME_KERNEL=2.4 and it started to work
>>>>
>>>>Filip
>>>>
>>>>-----Original Message-----
>>>>From: Filip Hanik [mailto:devlists@(protected)]
>>>>Sent: Thursday, January 08, 2004 6:43 PM
>>>>To: Tomcat Users List
>>>>Subject: RE: tomcat 5.0.16 Replication
>>>>
>>>>ok guys,
>>>>good news. The 100% cpu is totally my fault. I messed up on that one.
>>>>I was registering OP_WRITE as an interest
>>>>this is not good :)
>>>>checking in the working code in 15 min, some more regression tests
>>>>Filip
>>>>
>>>>-----Original Message-----
>>>>From: Filip Hanik [mailto:devlists@(protected)]
>>>>Sent: Thursday, January 08, 2004 2:54 PM
>>>>To: Tomcat Users List
>>>>Subject: RE: tomcat 5.0.16 Replication
>>>>
>>>>another code change was, that I am now accepting keys for OP_READ and
>>>>OP_WRITE. before it was only OP_READ,
>>>>but for synchronous replication I need both.
>>>>
>>>>this is good info, I just got RH9 installed. will be trying it out
>>>>this and next week.
>>>>
>>>>Filip
>>>>
>>>>-----Original Message-----
>>>>From: jean-philippe.belanger@(protected)
>>>>[mailto:jean-philippe.belanger@(protected)]
>>>>Sent: Thursday, January 08, 2004 11:46 AM
>>>>To: Tomcat Users List
>>>>Subject: Re: tomcat 5.0.16 Replication
>>>>
>>>>The only change in the ReplicationListener class is the try catch that
>>>>was added.
>>>>
>>>>The code logic is the same. Weird enough. So it's probably elsewhere
>>>>that something changed in the state of the SelectionKey.
>>>>
>>>>Jean-Philippe Bélanger
>>>>
>>>>Steve Nelson wrote:
>>>>
>>>>>I was just about to try this actually. I found through googling a lot of
>>>>>people having problems with select with 1.4 and NIO with Redhat 9. They
>>>>>were actually experiencing crashes though.
>>>>>
>>>>>To verify your results I just put a Thread.sleep(1); where you suggested
>>>>>and I also see the jump in performance.
>>>>>
>>>>>Something must have changed in ReplicationListener that causes this
>>>>>because the 5.0.16 version doesn't seem to have the problem. I'll see if
>>>>>I can figure it out when I get back to where I can diff the files.
>>>>>
>>>>>-Steve
>>>>>
>>>>>-----Original Message-----
>>>>>From: jean-philippe.belanger@(protected)
>>>>>[mailto:jean-philippe.belanger@(protected)]
>>>>>Sent: Thursday, January 08, 2004 12:25 PM
>>>>>To: Tomcat Users List
>>>>>Subject: Re: tomcat 5.0.16 Replication
>>>>>
>>>>>More content for you Filip.
>>>>>
>>>>>I've checked and followed the code of the listen event in
>>>>>ReplicationListener.java
>>>>>
>>>>>Here's what's happening:
>>>>>
>>>>>selector.select(timeout) -> returns immediately with one SelectionKey
>>>>>available
>>>>>
>>>>>That key is not Acceptable and not Readable so it immediately skips those
>>>>>IFs and loops back to the beginning.
>>>>>
>>>>>I've put traces and this is executed once every millisecond, hence the
>>>>>100% load on the server.
>>>>>Just to make sure, I've put a Thread.sleep(10) at the end of the loop
>>>>>and the CPU dropped back to 0% and the replication still worked nicely
>>>>>but probably a little slower because of the 10 ms wait.
>>>>>
>>>>>I don't know much about those NIO packages but it seems like the
>>>>>select(timeout) method shouldn't return a SelectionKey in that state
>>>>>without any waiting.
>>>>>
>>>>>Let me know what you can dig from those.
>>>>>
>>>>>Jean-Philippe Bélanger
>>>>>
>>>>>jean-philippe.belanger@(protected) wrote:
>>>>>
>>>>>>Hi Filip.
>>>>>>
>>>>>>I did some profiling of 40mins of tomcat with and without a 2nd node
>>>>>>up. here are the results with
>>>>>>-Xrunhprof:cpu=samples,thread=y,file=/u01/portal/java.hprof.txt,depth=10:
>>>>>>
>>>>>>Those numbers are cpu=times and not samples since the latter freezes
>>>>>>on my systems.
>>>>>>So that list shows the time spent in each method.
>>>>>>
>>>>>>The major difference is the calls to the sun.nio.ch.PollArrayWrapper
>>>>>>class. I don't know much about those NIO packages but 819000 calls in
>>>>>>40 mins is a lot.
>>>>>>The Socket Interface was called more than twice as often with 2 hosts
>>>>>>than with a single one, which seems normal.
>>>>>>
>>>>>>Maybe this can help.
>>>>>>If you need the complete hprof files I can send them to you.
>>>>>>
>>>>>>1 host in cluster:
>>>>>>CPU TIME (ms) BEGIN (total = 19701) Thu Jan 8 10:00:59 2004
>>>>>>rank   self  accum   count trace method
>>>>>>   1 11.48% 11.48%      54    85 java.lang.Object
>>>>>>   2 11.46% 22.94%     117    86 java.lang.Object
>>>>>>   3 10.95% 33.89%    4115   215 java.net.PlainDatagramSocketImpl
>>>>>>   4 10.93% 44.81%    4114   224 java.lang.Thread
>>>>>>   5 10.91% 55.73%   19005   214 sun.nio.ch.PollArrayWrapper.poll0
>>>>>>   6  7.37% 63.09%      28   495 java.lang.Object
>>>>>>   7  7.24% 70.34%      10   576 java.lang.Object
>>>>>>   8  4.57% 74.90%      90   716 java.lang.Thread
>>>>>>   9  4.48% 79.38%       1   909 java.lang.Object
>>>>>>  10  4.48% 83.86%       1   908 java.lang.Object
>>>>>>  11  4.48% 88.34%      15   810 java.lang.Object
>>>>>>  12  4.47% 92.81%       1   910 java.net.PlainSocketImpl
>>>>>>  13  0.71% 93.52%       2   623 java.lang.Object
>>>>>>  14  0.56% 94.08%       2   706 java.lang.Object
>>>>>>  15  0.38% 94.46%       2   914 java.lang.Object
>>>>>>  16  0.24% 94.70%     775   913 java.lang.String
>>>>>>  17  0.23% 94.93%       3   475 java.lang.Thread
>>>>>>  18  0.16% 95.09%       2   472 java.lang.Object
>>>>>>  19  0.15% 95.24%       2   595 java.lang.Thread
>>>>>>  20  0.15% 95.40%       2   586 java.lang.Thread
>>>>>>  21  0.15% 95.55%       2   703 java.lang.Thread
>>>>>>  22  0.15% 95.70%       2   476 java.lang.Thread
>>>>>>  23  0.15% 95.85%       2   692 java.lang.Thread
>>>>>>  24  0.12% 95.97%  218595   385 java.lang.CharacterDataLatin1.toLowerCase
>>>>>>  25  0.12% 96.09%  218595   408 java.lang.Character
>>>>>>  26  0.11% 96.20%  218595   433 java.lang.CharacterDataLatin1.getProperties
>>>>>>  27  0.10% 96.30%  210925   389 java.lang.String
>>>>>>  28  0.08% 96.38%  157259   387 java.lang.String
>>>>>>  29  0.08% 96.46%       1   646 java.lang.Thread
>>>>>>  30  0.08% 96.53%       1   634 java.lang.Thread
>>>>>>  31  0.08% 96.61%       1   903 java.lang.Thread
>>>>>>  32  0.08% 96.69%       1   714 java.lang.Thread
>>>>>>  33  0.08% 96.76%       1   811 java.lang.Thread
>>>>>>  34  0.08% 96.84%       1   715 java.lang.Thread
>>>>>>
>>>>>>2 hosts:
>>>>>>CPU TIME (ms) BEGIN (total = 37247) Thu Jan 8 11:01:28 2004
>>>>>>rank   self  accum   count trace method
>>>>>>   1  9.56%  9.56%      52    85 java.lang.Object
>>>>>>   2  9.56% 19.12%      29    86 java.lang.Object
>>>>>>   3  9.30% 28.43%       3   267 java.lang.Object
>>>>>>   4  9.25% 37.68%    6644   224 java.lang.Thread
>>>>>>   5  9.23% 46.91%   13116   215 java.net.PlainDatagramSocketImpl
>>>>>>   6  7.67% 54.58%       3   266 java.lang.Object
>>>>>>   7  5.90% 60.47%      39   847 java.lang.Object
>>>>>>   8  5.76% 66.24%      12   503 java.lang.Object
>>>>>>   9  3.90% 70.14%     145   975 java.lang.Thread
>>>>>>  10  3.90% 74.04%       1  1174 java.lang.Object
>>>>>>  11  3.90% 77.94%       1  1173 java.lang.Object
>>>>>>  12  3.90% 81.84%      25   973 java.lang.Object
>>>>>>  13  3.90% 85.74%       1  1175 java.net.PlainSocketImpl
>>>>>>  14  3.88% 89.62%  819692   214 sun.nio.ch.PollArrayWrapper.poll0
>>>>>>  15  0.75% 90.37%       2   958 java.lang.Object
>>>>>>  16  0.28% 90.65%       2   457 java.lang.Object
>>>>>>  17  0.26% 90.91%       2  1181 java.lang.Object
>>>>>>
>>>>>>Filip Hanik wrote:
>>>>>>
>>>>>>>I'll try to get an instance going today. Will let you know how it
>>>>>>>goes
>>>>>>>also, try asynchronous replication, does it still go to 100%?
>>>>>>>
>>>>>>>Filip
>>>>>>>
>>>>>>>-----Original Message-----
>>>>>>>From: Steve Nelson [mailto:Steve.Nelson@(protected)]
>>>>>>>Sent: Wednesday, January 07, 2004 12:08 PM
>>>>>>>To: 'Tomcat Users List'
>>>>>>>Subject: RE: tomcat 5.0.16 Replication
>>>>>>>
>>>>>>>Okay, did that, got this
>>>>>>>
>>>>>>>BEGIN TO RECEIVE
>>>>>>>SENT:Default 1
>>>>>>>RECEIVED:Default 1 FROM /10.0.0.110:5555
>>>>>>>SENT:Default 2
>>>>>>>BEGIN TO RECEIVE
>>>>>>>RECEIVED:Default 2 FROM /10.0.0.110:5555
>>>>>>>SENT:Default 3
>>>>>>>BEGIN TO RECEIVE
>>>>>>>RECEIVED:Default 3 FROM /10.0.0.110:5555
>>>>>>>SENT:Default 4
>>>>>>>BEGIN TO RECEIVE
>>>>>>>RECEIVED:Default 4 FROM /10.0.0.110:5555
>>>>>>>
>>>>>>>*shrug*
>>>>>>>
>>>>>>>BTW It didn't go to 100% CPU utilization before I started using the
>>>>>>>code from CVS.
>>>>>>>Of course the Manager would almost always time out before it would
>>>>>>>receive the message.
>>>>>>>
>>>>>>>Now it gets the message right away, but maxes my machine out.
>>>>>>>
>>>>>>>-----Original Message-----
>>>>>>>From: Filip Hanik [mailto:devlists@(protected)]
>>>>>>>Sent: Wednesday, January 07, 2004 1:58 PM
>>>>>>>To: Tomcat Users List
>>>>>>>Subject: RE: tomcat 5.0.16 Replication
>>>>>>>
>>>>>>>100% cpu can mean that you have a multicast problem, try to run
>>>>>>>
>>>>>>>java -cp tomcat-replication.jar MCaster
>>>>>>>
>>>>>>>download the jar from http://cvs.apache.org/~fhanik/
>>>>>>>
>>>>>>>Filip
>>>>>>>
>>>>>>>-----Original Message-----
>>>>>>>From: Steve Nelson [mailto:Steve.Nelson@(protected)]
>>>>>>>Sent: Wednesday, January 07, 2004 6:51 AM
>>>>>>>To: 'tomcat-user@(protected)'
>>>>>>>Subject: tomcat 5.0.16 Replication
>>>>>>>
>>>>>>>I was having random problems with clustering when starting up. Mostly
>>>>>>>it had to do with timing out when the manager was starting up. I built
>>>>>>>the CVS version and it solved that problem. But it has caused some
>>>>>>>serious performance problems.
>>>>>>>
>>>>>>>First a little background.
>>>>>>>
>>>>>>>I have 2 servers, dual 300mhz cpq proliants, both running Redhat 9,
>>>>>>>Tomcat 5.0.16 (with catalina-cluster.jar built from cvs). The multicast
>>>>>>>packets are restricted to a crossover link between the servers. There
>>>>>>>are 3 hosts in the server.xml, all with clustering set up. They all
>>>>>>>function just fine.
>>>>>>>
>>>>>>>But.....the cpu's spike up to 100% if I start up both servers. I know
>>>>>>>this didn't happen without the new catalina-cluster.jar. If I shut down
>>>>>>>1 server (doesn't matter which) everything returns to normal. But when
>>>>>>>both are running both servers are at 100% CPU. I am trying to profile it
>>>>>>>now, but I figured if someone has already experienced this they could
>>>>>>>save me some time.
>>>>>>>
>>>>>>>Oh, and there isn't anything relevant in my logs. It's not throwing
>>>>>>>millions of errors or something.
>>>>>>>
>>>>>>>-Steve Nelson
>>>>>>>
>>>>>>>---------------------------------------------------------------------
>>>>>>>To unsubscribe, e-mail: tomcat-user-unsubscribe@(protected)
>>>>>>>For additional commands, e-mail: tomcat-user-help@(protected)
>>>>
>>>>--
>>>>Jean-Philippe Bélanger
>>>>(514)228-8800 ext 3060
>>>>111 Duke
>>>>CGI
>
>--
>Jean-Philippe Bélanger
>(514)228-8800 ext 3060
>111 Duke
>CGI
--
Jean-Philippe Bélanger
(514)228-8800 ext 3060
111 Duke
CGI

---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-user-unsubscribe@(protected)
For additional commands, e-mail: tomcat-user-help@(protected)
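
The 100% CPU symptom discussed in the thread (select(timeout) returning at once with a key that is neither acceptable nor readable, traced back to OP_WRITE being registered as an interest) can be reproduced outside Tomcat. The sketch below is a hypothetical demo, not code from ReplicationListener: it connects two channels over loopback, sends nothing, and counts how often select() wakes up for a given interest set. SelectorSpinDemo and countWakeups are made-up names; the 50 ms timeout and 300 ms window are arbitrary choices.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; it is not taken from Tomcat's ReplicationListener.
class SelectorSpinDemo {

    // Registers one idle, connected channel with the given interest ops and
    // counts how often select(timeoutMs) returns during a window of windowMs.
    static int countWakeups(int interestOps, long timeoutMs, long windowMs) {
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector selector = Selector.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // any free local port
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                accepted.register(selector, interestOps);

                int wakeups = 0;
                long deadline = System.nanoTime() + windowMs * 1_000_000L;
                while (System.nanoTime() < deadline) {
                    selector.select(timeoutMs);      // returns when a key is ready or the timeout expires
                    selector.selectedKeys().clear(); // nothing is read or written; the peer stays silent
                    wakeups++;
                }
                return wakeups;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int readOnly = countWakeups(SelectionKey.OP_READ, 50, 300);
        int readWrite = countWakeups(SelectionKey.OP_READ | SelectionKey.OP_WRITE, 50, 300);
        System.out.println("OP_READ only:       " + readOnly + " wakeups");
        System.out.println("OP_READ | OP_WRITE: " + readWrite + " wakeups");
    }
}
```

An idle socket with room in its send buffer is always writable, so with OP_WRITE in the interest set the loop never actually waits, which matches the jump in poll0 calls in the two-host profile. A common way around this, offered here as a general NIO pattern and not as a description of what was checked into CVS, is to add OP_WRITE only while data is waiting to be written and clear it again afterwards with key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE).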