Path: ...!weretis.net!feeder9.news.weretis.net!panix!.POSTED.spitfire.i.gajendra.net!not-for-mail
From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.os.vms
Subject: Re: Apache + mod_php performance
Date: Fri, 27 Sep 2024 19:39:21 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <vd71l9$mnl$1@reader1.panix.com>
References: <vcv0bl$39mnj$1@dont-email.me> <vd6dh4$nrif$1@dont-email.me> <vd6env$gfu$1@reader1.panix.com> <vd6n70$q3fm$1@dont-email.me>
Injection-Date: Fri, 27 Sep 2024 19:39:21 -0000 (UTC)
Injection-Info: reader1.panix.com; posting-host="spitfire.i.gajendra.net:166.84.136.80"; logging-data="23285"; mail-complaints-to="abuse@panix.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: cross@spitfire.i.gajendra.net (Dan Cross)
Bytes: 7953
Lines: 178

In article <vd6n70$q3fm$1@dont-email.me>,
Arne Vajhøj <arne@vajhoej.dk> wrote:
>On 9/27/2024 10:16 AM, Dan Cross wrote:
>> In article <vd6dh4$nrif$1@dont-email.me>,
>> Arne Vajhøj <arne@vajhoej.dk> wrote:
>>> On 9/27/2024 9:18 AM, Craig A. Berry wrote:
>>>> The only thing I can think of that hasn't already been mentioned
>>>> is that Tomcat code is JIT-compiled, which is likely to be pretty good,
>>>> optimized code, whereas Apache is probably either cross-compiled or
>>>> native-compiled with an early enough field test compiler that there are
>>>> no optimizations.
>>>
>>> That is a possible explanation.
>>>
>>> But the difference in numbers are crazy big.
>>>
>>> Apache getting a static text file with 2 bytes: 22 req/sec
>>>
>>> Tomcat with Quercus and PHP getting data out of a MySQL database on
>>> Windows and outputting HTML: over 200 req/sec
>>>
>>> Tomcat using JSP (which get triple compiled) getting data out of a MySQL
>>> database on Windows (with db connection pool) and outputting HTML: over
>>> 600 req/sec.
>>>
>>> My gut feeling is that cross-compilation may contribute to but not
>>> fully explain the difference.
>>
>> Almost certainly not; this is an IO bound application, not CPU
>> bound.
>
>With static content yes.

Correct.  That's all you ought to be looking at until you
understand why that's slow.

>With dynamic content and the volume Apache+mod_php delivers yes.

Maybe, but without a profile you really don't know.

But beyond that, it is currently irrelevant.  You see
approximately the same numbers with static and dynamic content;
this heavily implies that the dynamic content case is not
related to the present slow-down.  Including it now is premature,
and likely just masks what's _actually_ wrong.

>With dynamic content and high volume then CPU can matter. Tomcat
>and Quercus can do over 200 req/sec, but CPU utilization fluctuate
>between 150% and 250% - 4 VCPU used so not CPU bound, but could
>have been if it had been just 2 VCPU.

See above.  You know that there's a problem with Apache and
static content, but you don't know _what_ that problem is.  Why
would you get ahead of yourself worrying about things like that
until you actually understand what's going on?

In this case, concentrating on static content, CPU time consumed
by Apache itself due to poor optimization or something seems
like a low-probability root cause of the performance problems
you are seeing, as static file service like this is IO, not
compute, bound.  Keep your eye on the ball.

>> My strong suspicion is that what you're seeing is the result of
>> a serious impedance mismatch between the multi-process model
>> Apache was written to use, and its realization using the event
>> signalling infrastructure on VMS.
>
>Yes.

Maybe.  You really haven't done enough investigation to know, at
least going by what you've reported here.

>Or actually slightly worse.
>
>Prefork MPM is the multi-process model used in Apache 1.x - it is still
>around in Apache 2.x, but Apache 2.x on Linux use event or worker
>MPM (that are a mix of processes and threads) and Apache 2.x on Windows
>use winnt MPM (that is threads only).
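(Whichever MPM the VMS port is actually built with, "keep the MPM
but throttle it" can be expressed purely in configuration.  A
hypothetical prefork fragment pinning the server to a single
worker process - these are stock Apache 2.x directive names, and
whether the VMS build honors all of them is an assumption to
verify:)

```apache
# Hypothetical single-worker baseline for the prefork MPM.
# Stock Apache 2.x directive names; confirm the VMS build
# accepts them before trusting the results.
<IfModule mpm_prefork_module>
    StartServers            1
    MinSpareServers         1
    MaxSpareServers         1
    MaxRequestWorkers       1
    MaxConnectionsPerChild  0
</IfModule>
```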
Ok, sure.  But as you posted earlier, Apache on VMS, as you're
using it, is using the prefork MPM model, no?

>> Again, I would try to establish a baseline.  Cut out the MPM
>> stuff as much as you can;
>
>MPM is the core of the server.

No, you misunderstand.  Try to cut down on contention due to
coordination between multiple entities; you do this by
_lowering_ the number of things at play (processes, threads,
whatever).  The architecture of the server is irrelevant in this
case; what _is_ relevant is minimizing concurrency in its
_configuration_.  Does that make sense?

>> ideally, see what kind of numbers you
>> can get fetching your text file from a single Apache process.
>> Simply adding more threads or worker processes is unlikely to
>> significantly increase performance, and indeed the numbers you
>> posted are typical of the performance collapse one usually sees
>> due to some kind of contention bottleneck.
>
>It increases but not enough.
>
>1 -> 0.1 req/sec
>150 -> 11 req/sec
>300 -> 22 req/sec
>
>> Some things to consider: are you creating a new network
>> connection for each incoming request?
>
>Yes. Having the load test program keep connections alive
>would be misleading as real world clients would be on different
>systems.

Again, you're getting ahead of yourself.  Try simulating a
single client making multiple, repeated requests to a single
server, ideally reusing a single HTTP connection.  This will
tell you whether the issue is with query processing _inside_ the
server, or if it has something to do with handling new
connections for each request.  If you use HTTP keep-alives and
the number of QPS jumps up, you've narrowed down your search
space.  If it doesn't, you've eliminated one more variable, and
again, you've cut down on your search space.  Does that make
sense?

>> It's possible that that's
>> hitting a single listener, which is then trying to dispatch the
>> connection to an available worker,
>
>That is the typical web server model.
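(Coming back to the keep-alive experiment for a moment: here is
a rough sketch of it using nothing but Python's standard
library, run against a throwaway local server rather than your
VMS box.  The handler, port, and request count are all made up
for illustration; the absolute numbers mean nothing, only the
_ratio_ between the two modes is the signal.)

```python
# Sketch: issue the same N requests twice, once opening a fresh TCP
# connection per request, once over a single HTTP/1.1 keep-alive
# connection.  The throwaway local server stands in for the real
# one; only the ratio between the two timings is meaningful.
import http.client
import http.server
import threading
import time

class TwoByteHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # required for keep-alive

    def do_GET(self):
        body = b"ok"                # stands in for the 2-byte file
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # silence per-request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), TwoByteHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

def run(n, keep_alive):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    start = time.perf_counter()
    for _ in range(n):
        conn.request("GET", "/")
        assert conn.getresponse().read() == b"ok"
        if not keep_alive:
            conn.close()            # force a new connection next time
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed

N = 200
fresh = run(N, keep_alive=False)
kept = run(N, keep_alive=True)
print(f"new connection per request:   {N / fresh:.0f} req/sec")
print(f"single keep-alive connection: {N / kept:.0f} req/sec")
server.shutdown()
```

If the second number is dramatically higher, connection handling
is implicated; if the two are close, you've eliminated it.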
No, it is _a_ common model, but not _the_ "typical" model.  For
instance, many high-performance web servers are built on an
asynchronous model, which effectively implements state machines
whose state transitions yield callbacks that are distributed
across a collection of executor threads.  There's no single
"worker" or dedicated handoff.

Moreover, there are many different _ways_ to implement the
"listener hands connection to worker" model, and it _may_ be
that the way that Apache on VMS is trying to do it is inherently
slow.  We don't know, do we?  But that's what we're trying to
figure out, and that's why I'm encouraging you to start simply
and build on what you can actually know from observation, as
opposed to faffing about making guesses.

>> using some mechanism that is
>> slow on VMS.
>
>It is a good question how Apache on VMS is actually doing that.
>
>All thread based solutions (OSU, Tomcat etc.) just pass a
>pointer/reference in memory to the thread. Easy.
>
>Fork create a process copy with the open socket. I am not quite
>sure about the details of how it works, but it works.
>
>If the model on VMS is:
>
>---(HTTP)---parent---(IPC)---child
>
>then it could explain being so slow.
>
>I may have to read some of those bloody 3900 lines of code (in a
>single file!).

Precisely.  And maybe run some more experiments.

>> Is there a profiler available?  If you can narrow
>> down where it's spending its time, that'd provide a huge clue.
>
>Or I take another path.

This is a useful exercise either way; getting to the root cause
of a problem like this may teach you something you could apply
to other, similar, problems in the future.

	- Dan C.
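P.S.: The "fork creates a process copy with the open socket"
model you mention is easy to see in miniature.  A sketch in
Python for brevity (POSIX-only, and purely illustrative - this
is the classic prefork idea, not a claim about how Apache on VMS
does it): the parent binds and listens once, forks workers, and
each worker calls accept() itself on the inherited socket, so no
per-connection parent-to-child IPC handoff is needed.

```python
# Sketch of the "children inherit the listening socket across
# fork()" model: workers accept() directly on the inherited
# socket; the parent never touches individual connections.
import os
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))   # ephemeral port for the demo
listener.listen(16)
port = listener.getsockname()[1]

NUM_CHILDREN = 2
for _ in range(NUM_CHILDREN):
    if os.fork() == 0:
        # Child: the open listening socket came along with the
        # fork.  Serve exactly one connection, then exit.
        conn, _addr = listener.accept()
        conn.sendall(b"hello from pid %d\n" % os.getpid())
        conn.close()
        os._exit(0)

# Parent plays client here, to show the children answering directly.
replies = []
for _ in range(NUM_CHILDREN):
    client = socket.create_connection(("127.0.0.1", port))
    replies.append(client.recv(1024))
    client.close()
for _ in range(NUM_CHILDREN):
    os.wait()
print(b"".join(replies).decode(), end="")
```

If the VMS port can't do this and instead relays every byte over
parent-to-child IPC, that alone could account for a lot.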