Notes

Notes - notes.io

On Mindcraft's April 1999 Benchmark

I simply compiled [Apache] with pretty much all modules disabled.... I'm using the highperformance-conf.dist config file from the distribution." See also Karthik's post on linux-kernel and its followups. This sounds rather like the behavior Mindcraft reported ("After the restart, Apache performance climbed back to within 30% of its peak from a low of about 6% of the peak performance").

Kernel issue #2: Wake-One and the Thundering Herd

(Note: A "task-exclusive" wake-one patch was identified by the Linux Scalability Project in their paper on the thundering herd issue. However, Andrea claims that as of 2.4.0-test10 it still wakes up processes in the order in which they were put to sleep, which is not optimal for caching; waking them in reverse order would be better. See also the Nov 2000 measurements by Andrew Morton (post 1, post 2) and Linus' reply.)

- Phillip Ezolt, 5 May 1999, in linux-kernel ( "Overscheduling DOES happen with high web server load." ): "When running a SPECWeb96 strobe with Alpha/linux I found that 18% of my time was spent in scheduling." (Russinovich discussed something similar in his critique of Linux.) This post started a very lively thread in linux-kernel (now in its second week). Looks like the scheduler (and possibly Apache) are in for some changes.
- Rik van Riel, 6 May 1999, in linuxperf ( Re: [linuxperf] Possible fix for Mindcraft Apache problem ): ... The web benchmark's main problem remains. The way Apache and Linux "cooperate", there is a lot of trouble with the 'thundering herd' problem. This means that when a signal is received, all processes are woken and the scheduler must choose one of the many new runnable processes ..... The real solution is to switch from wake-all semantics to a wake-one style, to avoid the huge runqueues that Phillip Ezolt, the DEC guy, experienced. The good news is that it's a simple fix that can probably be made within a few days...
- Tony Gale, 6 May 1999, in linuxperf ( Re: [linuxperf] Possible fix for Mindcraft Apache problem ): Apache uses file locking to serialise access to the accept call. This can be very expensive on some systems. I haven't had time to run the Linux numbers yet for the 10 or more server models that are available to find the most efficient. Check Stevens UNPv1 2nd Edition Chapter 27 for details.
- Andrea Arcangeli, 12 May 1999, in linux-kernel ( [patch] wake_one for accept(2) [was: Re: Overscheduling DOES happen with high web server load.] ): I released a new andrea patch against 2.2.8. This new one contains my new, straightforward wake-one code for accept(2). However, to get the improvement, you must ensure that your apache tasks are sleeping in accept(2). Running strace -p on the pid of an apache process should tell you that. The patch is linked to from here.
- David Miller's reply to the above: ... on every TCP connection, there are 2 spurious and unsolicited wakeups. These wakeups originate in the write_space socket callback. This is because we free up SYN frames and wake up listening socket sleepers. I've been working today on solving this very issue.
- Ingo Molnar, 13 May 1999, in linux-kernel ( Re: [RFT] 2.2.8_andrea1 wake-one [Re: Overscheduling DOES happen with high web server load.] ): It is important to note that pre-2.3.1 already implements accept() in a wake-one manner... and there are many more.
- Phillip Ezolt, 14 May 1999, in linux-kernel ( Great News!! Was: [RFT] 2.2.8_andrea1 wake-one ): I've been doing some more SPECWeb96 tests, and with Andrea's patch to 2.2.8 (ftp://ftp.suse.com/pub/people/andrea/kernel/2.2.8_andrea1.bz) **On identical hardware, I get web-performance nearly identical to Tru64!** ... Tru64 4ms, 2.2.5 100ms, 2.2.8 9ms, 2.2.8_a 4ms ... My web-performance is almost identical to Tru64's, according to this Iprobe report: ... As you can see, the number of SPECWeb96 MaxOps per second has risen. **Please put the wake-one patch in the 2.2.X kernel if it isn't already.**

Larry Sendlosky tested the patch and said: Your 2.2.8 patch definitely improves apache performance on a single-CPU system, but there is no performance improvement on a 2-CPU SMP system.
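To make the difference between the wake-up policies discussed above concrete, here is a toy model in Python. It is only a sketch, not kernel code: the function name `simulate` and the policy labels are made up for illustration, and it assumes that each connection is fully handled before the next one arrives, so every worker is asleep in accept() again by then. It counts how many processes the scheduler has to wake, and how many distinct workers end up serving requests (a rough stand-in for how cache-friendly the wake order is).

```python
from collections import deque


def simulate(num_workers, num_connections, policy):
    """Toy model of workers sleeping in accept().

    policy is one of (hypothetical labels, not kernel names):
      "wake_all"       - every sleeper is woken, one wins, the rest sleep again
      "wake_one_fifo"  - only the longest-sleeping worker is woken
      "wake_one_lifo"  - only the most recently queued worker is woken
    Returns (total_wakeups, distinct_workers_that_served_a_connection).
    """
    # Worker ids, oldest sleeper on the left.
    waitq = deque(range(num_workers))
    wakeups = 0
    served_by = set()

    for _ in range(num_connections):
        if policy == "wake_all":
            woken = list(waitq)
            waitq.clear()
            wakeups += len(woken)      # the whole herd becomes runnable
            winner = woken[0]          # one of them gets the connection
            waitq.extend(woken[1:])    # the losers go straight back to sleep
        elif policy == "wake_one_fifo":
            winner = waitq.popleft()
            wakeups += 1
        elif policy == "wake_one_lifo":
            winner = waitq.pop()
            wakeups += 1
        else:
            raise ValueError("unknown policy: %r" % (policy,))

        served_by.add(winner)
        waitq.append(winner)           # winner finishes and sleeps again

    return wakeups, len(served_by)


if __name__ == "__main__":
    for policy in ("wake_all", "wake_one_fifo", "wake_one_lifo"):
        print(policy, simulate(16, 100, policy))
```

In this model, 16 workers and 100 connections cost 1600 wakeups under wake-all but only 100 under either wake-one policy. The FIFO variant rotates through all 16 workers, while the LIFO variant keeps reusing the same one, which is the reverse-order behavior Andrea argues is better for caching.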

Also see Dimitris Michailidis's post of 14 May 1999 in linux-kernel. It might have some SMP bottleneck fixes, too.

Kernel issue #3: SMP Bottlenecks in 2.2 Kernel

Juergen Schmidt, 19 May 1999, in linux-kernel ( Bad Apache performance with Linux SMP ), asked why Apache was performing poorly under SMP. Andi Kleen replied:

One culprit is most likely that the TCP data copy runs completely serialized. This can be fixed by replacing the

    skb->csum = csum_and_copy_from_user(from, skb_put(skb, copy), copy, 0, &err);

in tcp.c:tcp_do_sendmsg with

    unlock_kernel();
    skb->csum = csum_and_copy_from_user(from, skb_put(skb, copy), copy, 0, &err);
    lock_kernel();

The patch does not violate any locking requirements in the kernel... [To fix your connection refused errors,] try:

    echo 32768 > /proc/sys/fs/file-max
    echo 65536 > /proc/sys/fs/inode-max

Overall it should be clear that the current Linux kernel doesn't scale to multiple CPUs for system load (user load is fine). It is true, but I blame the Linux vendors. ... All these problems are being fixed. [2.3 will be fixed first, then the changes will be backported to 2.2.]

[Note: Andi's TCP unlocking fix appears to be in 2.2.9-ac3.]

Andrea Arcangeli responded, describing his own version of this fix ( ftp://ftp.suse.com/pub/people/andrea/kernel/2.3.3_andrea2.bz2 ) as less cluttered: If you look at my patch (the second one, in the first one I missed the reaquire_kernel_lock done before returning from schedule, woops :) then you'll see my approach to address the unlock-during-uaccess. My patch doesn't change tcp/ip, ext2 etc... but it touches only uaccess.h and usercopy.c. I don't like to have unlock_kernel everywhere.

Juergen Schmidt, 26 May 1999, in linux-kernel and new-httpd ( Linux/Apache and SMP - my fault ), retracted his earlier problem report: I had reported "disastrous performance" for Linux and Apache on an SMP system. To double-check, I downloaded clean kernel sources (2.2.8 and 2.2.9) and confirmed that they do not show the reported penalty when running on SMP systems. My error was to use the installed kernel sources (which I patched from 2.2.5 to 2.2.8 after seeing the first very bad results). However, these sources had already been modified before I got the machine. I should have thrown them away. Please accept my apologies.

Others have reported modest performance gains (around 20%) when Andrea's SMP fix is used, but only with large files (100 kilobytes).

Juergen has completed his testing. Unfortunately, he neglected to compile Apache with -DSINGLE_LISTEN_UNSERIALIZED_ACCEPT, an omission that (according to Andrea) significantly hurts Apache performance. If Juergen missed this, it is too hard to figure out. To make it easier to get good performance in the future, we need the wake-one patch added to a stable kernel (say, 2.2.10), and we need Apache's configuration script to notice that the system is being compiled for 2.2.10 or later, and automatically select SINGLE_LISTEN_UNSERIALIZED_ACCEPT.

Other Apache users getting help solving performance problems

Mike Whitaker, 22 May 1999, in linuxperf ( High load under Apache1.3.3/mod_perl1.16/Linux2.2.7 SMP ), described a strange performance problem: Our typical webserver is a dual PII450 with 1G and split httpd's: typically 300 static httpd's serving pages and 80-100 dynamic ones serving mod_perl ads. Unneeded modules are disabled and hostname lookups are turned off, as any sensible person would do. There are usually between one and three mod_perl requests per page, plus the usual dozen or so inline images. The kernel (2.2.7) has MAX_TASKS upped to 4090, and the unlock_kernel/lock_kernel around csum_and_copy_from_user() in tcp_do_sendmsg that Andi Kleen suggested. The performance is.. interesting. The load on the machine fluctuates from 10 to 120, while user CPU goes between 15% (80% idle) and 180% (0% idle, machine *crawling*), roughly once every minute. Vmstat shows that the number of processes in the runnable state ranges from 0 (when load has been low) to 30-40. The static servers can manage 60-70 peak hits/sec. Without the dynamic httpd's, everything *flies*.

He was advised to try a kernel with wake-one support, and reported back on two identical systems (dual PII450s, 1G, and two disk controllers): The wake-one patch, as far as I can tell, is doing its thing: the 2.2.7 machine still has cycles where the load goes into three figures, while the 2.3.3 machine hasn't managed a load above 1. UNFORTUNATELY, observation suggests that the 2.3.3 machine/Apache combination is dropping/ignoring about one connection in ten, maybe more. (Network error - connection reset by peer.)

His next update, on May 25th, reads: More progress from the bleeding edge: (Reminder: the config here is split static/mod_perl httpd's, with a pretty CPU-intensive mod_perl script serving ads as an SSI as the probable bottleneck.) Linux kernel 2.2.9 plus the 2.2.9_andrea3 (wake-one) patch seems to work. It can handle hits at a rate that suggests it's pushing the ad server to its limit. (As I stated in a previous note: avoid 2.2.8 like the plague. It trashes HDs. See threads on linux-kernel.) However... it does fall over if it is overstressed. When idle CPU drops to essentially zero (i.e. advert requests arrive faster than they can be processed), which can be caused by spikes in demand, it is difficult to get out of that state. In fact, the only way out is to take the machine out of the (hopefully short-TTL) DNS round-robin while it calms down. The counterintuitive fix is to *REDUCE* MaxClients and hope that the TCP listen queue can absorb a load surge. This seems to work, according to experience. (Aside: this is a perfect use case for Eddieware's load-balancing DNS.)

- Eric Hicks, 26 May 1999, in linux-kernel ( Apache/kernel problem? ): ... I am having major problems with the fact that a single PII 400MHz, or a single AMD 400, will outrun a dual PII 450 when serving Apache requests. ... HTTP server tests. Data: 100 1MByte MPEG files stored on local drives. Results:
  - AMD 400MHz K6, 128MB, Linux 2.0.36; handles 1000 simultaneous clients @ 57.6Kbits/sec.
  - PII 400MHz, 512MB, Linux 2.0.36; handles 1000 simultaneous clients @ 57.6Kbits/sec.
  - Dual PII 450MHz, 512MB, Linux 2.2.8 and Linux 2.0.36; handles far fewer than 300 simultaneous clients @ 57.6Kbits/sec.

I advised him to use 2.2.9_andrea3, and he said that he would try it and report back.

Kernel issue #4: Interrupt Bottlenecks

According to Zach, the Mindcraft benchmark's use of four Fast Ethernet cards and a quad SMP system exposes a bottleneck in Linux's interrupt processing; the kernel spent a lot of time in synchronize_bh(). (A single Gigabit Ethernet card would lessen this bottleneck.) Mingo states that TCP throughput now scales better with the number of CPUs than it did in 2.2.10 (although he hasn't tested it with multiple Ethernet cards).

Steven Guo and Steve Underwood also commented on the issue of interrupts under heavy loads.

See also Linus's "State of Linux" talk at Usenix '99, where he talks about the Mindcraft benchmark and SMP scalability.

SCT's Jan 2000 comments on progress in scaling can be found here.

Softnet is coming! Kernel version 2.3.43 includes the new softnet networking changes. Softnet changes the interface to the network card drivers, so every driver must be updated. However, network performance should be much better on large SMP systems. (For more info, see Alexey's readme.softnet, his softnet-howto, or his Feb 15 post about how to convert old drivers.)

The Feb '00 thread Gigabit Ethernet Bottlenecks (especially its second week) has lots of interesting tidbits about what interrupt (and other) bottlenecks remain, and how they are being addressed in the 2.3 kernel.

Ingo Molnar's 27 February 2000 post explains in detail the improvements made to interrupt handling in the IA32 code. It seems these improvements will be integrated into the core kernel in 2.5.

Kernel issue #5: Mysterious network slowdown

This is a bug, not a scaling problem. Numerous 2.2 users have reported that network speeds sometimes slow down to 1 to 10 percent of normal, with high ping times. The problem can be temporarily fixed by cycling the interface.

- Oystein Svendsen reported on 29 June 1999 that TCP performance has been affected by the upgrade to the 2.2 series: I can restore normal performance by taking down the interface and reinserting the eepro100 module into the kernel. After that, performance stays normal for a few days to weeks.
- David Stahl reported on 29 Jun 1999: I have 3 computers running 2.2.10 [with multiple] 3COM 905b PCI [cards] ... After approximately two days of uptime I will begin to notice ping times jump to 7-20 secs on the local network. There is no loss, just some very high latency. ... It seems to depend on the network load -- lighter loads can lead to longer periods without problems. The problem ALSO is gradual -- it'll start at 4 second pings, then 7 second pings about 20 minutes later, then 30 minutes later it's up to 12-20 seconds.
- Another eepro100 alert.
- A tulip report. Less likely to happen again.
- David Stahl wrote on 13 July 1999: What DID fix the problem was a private reply from someone else (sorry about the credit, but i'm not in the mood to sieve 10k emails right now), to try the alpha version of the latest 3c59x.c driver from Donald Becker (http://cesdis.gsfc.nasa.gov/linux/drivers/vortex.html). 3c59x.c:v0.99L 5/28/99 is the version that fixed it, from ftp://cesdis.gsfc.nasa.gov/pub/linux/drivers/test/3c59x.c
- On 23 Sep 1999, Alexey posted a one-line patch that clears up a similar mysterious slowdown. 2.2.13 and Red Hat 6.1 already have this patch applied. The patch was applied to three Red Hat 6.x systems I know of that have Masq support installed and are connected to cable modems. It corrected a bug that caused very high ping times after even short bursts of heavy TCP transfer to distant hosts.
- Rickard Cedergren and Michael Brown reported that Alexey's fix greatly reduced the problem, although it is not completely gone.
- Tony Hoyle has been experiencing some long delays with 2.2.13.
- Jeremy Fitzhardinge reported a new delay. The replies indicate that it is likely due to the Tulip driver.

Kernel issue #6: 2.2.x/NT TCP slowdown

Petru Paler, 10 July 1999, in linux-kernel ( [BUG] TCP connections between Linux and NT ), reported that any type of TCP connection between Linux 2.2.10 and an NT Server 4 Service Pack 5 machine slows down to a crawl. With 2.0.37, the problem was less severe (6kbytes/sec). A tcpdump log of a slow connection allowed Andi Kleen to see that NT took a long time to ACK a particular data packet, which was causing Linux to stall.

Solved: false alarm! It wasn't Linux's fault at all. It turned out that NT had to be told not to use full duplex mode with the ethernet card.

Kernel issue #7: Scheduler

Phil Ezolt, 22 Jan 2000, in linux-kernel ( Re: Interesting analysis of linux kernel threading by IBM ): When I run SPECWeb96 tests here, I see both a large number of running processes and a huge number of context switches. ... Here's some sample vmstat data:

    procs memory swap io system cpu
    r b w swpd free buff cache si so bi bo in cs us sy id
    ...
    24 0 0 2320 2066936 590088 1010664 0 03961
    24 0 0 2320 2065752 590664 1061064 0 03961 1

Notice: 24 running processes and ~7000 context switches. That is a lot of overhead. Every second, 7000*24 goodness values are calculated, not the (20*3) that a desktop system sees. This is a scalability issue. A better scheduler means better scalability. Don't tell me that benchmark data is useless. Unless you can give me data from real systems showing where the faults are, benchmark data is all we have. SPECWeb96 pushes Linux until it bleeds. I'm telling you where it bleeds. You have two options. It might not be what your system is seeing right now, but it will be in time. Would you rather fix it now, or wait until someone else has thrown down the performance gauntlet? ... Here's an interesting fact. During my runs I see 98% contention on the [2.2.14] kernel lock, and it's accessed A LOT. I don't have big-memory support for 2.3.40, so I don't know how it compares. Hopefully Andrea will be kind enough to give me a patch, and then I'll be able to see if things have improved.

[Phil's data is for the SPECWeb96 web server, an ES40 4 CPU EV6 running Redhat 6.0 with kernel v2.2.14 plus the SGI performance patch; the interfaces receiving load are two ACENic gigabit ethernet cards.]

Kernel issue #8: SMP bottlenecks in 2.4 kernel

Manfred Spraul, 21 April 2000, in linux-kernel ( [PATCH] f_op->poll() without lock_kernel() ): [email protected] noticed that select() caused high contention for the kernel lock, so here is a patch that removes lock_kernel() from poll(). [tested] with version 2.3.99.

There was some discussion as to whether this was a good idea at this late date, but Linus and David Miller were enthusiastic. It looks like one more bottleneck is on its way out.

On 26 April 2000, [email protected] posted benchmark results in linux-kernel with and without the lock_kernel() in poll(). The thread also produced a kernel patch to improve checksum performance, and a patch for Apache 1.3 to align its buffers on 32-byte boundaries. Linus praised Dean Gaudet's patch for Apache 1.3, relaying a claim that it can speed up SPECWeb results by up to 3%. This was an interesting thread.

Kernel issue #9: csum_partial_copy_generic

[email protected], 19 May 2000, in linux-kernel ( [PATCH] Fast csum_partial_copy_generic and more ), reports a 3% reduction in total CPU time compared to 2.3.99-pre8 on i686 by optimizing the cache behavior of csum_partial_copy_generic. The workload was ZD's WebBench. He adds: The benchmark we used has almost same setting as the MINDCRAFT ones, but the apache setting is [changed] slightly not to use symlink checking. We used maximum of 24 independent clients and number of apache processes is 16.

A four-way XEON processor system is used, and the performance is more than twice that of a single CPU.

Note that in ZD's benchmarks with 2.2.6, a four-way XEON system only achieved a 1.5x speedup over a single CPU. Kumon is reporting a > 2x speedup. This seems to be similar to the speedup NT 4.0 SP3 achieved using 4 CPUs with 24 clients. It's encouraging to hear that things may have improved in the 11 months since the 2.2.6 tests. Kumon stated: The major improvement is between pre3 and pre5, which is poll optimization. Until pre4 (I forget exact version), kernel-lock prevents performance improvement. If you can retrieve l-k mails from Apr 20-25, the following mails will help to understand the background:

- subject: namei() query
- subject: [PATCH] f_op->poll() without lock_kernel()
- subject: lockless poll() (was Re: namei() query)
- subject: "movb" for spin-unlock (was Re: namei() query)

On 4 Sept 2000, kumon posted again, noting that his change still hadn't made it into the kernel.

Kernel issue #10: getname() and poll() optimizations

On 22 May 2000, Manfred Spraul posted a patch on linux-kernel which optimized kmalloc(), getname(), and select() a bit, speeding up apache by about 1.5% on 2.3.99-pre8.

Kernel issue #11: Reducing lock contention, poll overhead in 2.4

Alexander Viro posted a patch that removes a big lock in close_flip(). Kumon ran a benchmark and reported: I measured viro's ac6D patch using WebBench on a 4cpu Xeon computer. I applied it to 2.4.0-test1, not ac6. The patch reduced stext_lock by 50% and 4% respectively. ... do_select causes some overhead with kmalloc/kfree. This can easily be eliminated by using a small array on the stack.

Kumon then posted a patch which avoids kmalloc/kfree for select() and poll() when the number of fd's is less than 64.

Kernel issue #12: Poor disk seek behavior in 2.2, new elevator code in 2.4

On 20 July 2000, Robert Cohen posted a report in linux-kernel listing netatalk (appletalk file sharing) benchmarks comparing 2.0, 2.2, and several versions of 2.4.0-pre. The elevator code in 2.4 seems to help (some versions of 2.4 can handle 5 benchmark clients instead of 2) but ... The more recent test4 and test5pre2 don't fare quite so well. They can handle 2 clients on a 128 Meg server perfectly, so they are doing better than 2.2. However, they choke and go seek-bound with 4 clients. Things have changed a lot since test1-ac22.

Here's an update. The *only* 2.4 kernel versions that could handle 5 clients were 2.4.0-test1-ac22-riel and 2.4.0-test1-ac22-class 5+; everything before and after (up to 2.4.0-test5pre4) can only handle 2.

On 26 Sept 2000, Robert Cohen posted an update which included a simple program to demonstrate the problem, which appears to be in the elevator code. Jens Axboe responded that he and Andrea had a patch almost ready for 2.4.0-test9-pre5 that fixes this problem.

On 4 October 2000, Robert Cohen posted an update that included benchmark results for many kernels. These results showed that the problem is still present in 2.4.0-test9.

Kernel issue #13: Fast Forwarding / Hardware Flow Control

On 18 Sept 2000, Jamal posted a note in linux-kernel describing proposed changes to the 2.4 kernel's network driver interface; the changes add hardware flow control and several other refinements. He wrote: Robert Olson and myself decided that we would aim to reach the 100Mbps (148.8Kpps) routing peak before year end. I am afraid the bar has been raised. Robert is already hitting ~148Kpps with 2.4.0-test7, using an ASUS CUBX motherboard with a PIII 700MHz Coppermine, at about 65% CPU utilization. With a single PII-based Dell machine, I was able to achieve a consistent 110Kpps. The new goal is to get to 500Kpps by the year end. ... I believe we could have done better in the Mindcraft tests with these changes in 2.2 (and HW FC turned on). [update] BTW: I was informed that the Linux people were not allowed to modify the hardware during those tests, so I don't think they could have used these modifications even if they had been available then.

Kernel tuning issue: hitting TIME_WAIT

On 30 March 2000, Takashi Richard Horikawa posted a report in linux-kernel listing SPECWeb96 results for both 2.2.14 and 2.3.41. Performance between a 2.2.14 client and a 2.2.14 server was poor because too few ports were being used: ports were not done with TIME_WAIT by the time the port number was required again for a new connection.

The lesson here is to tune clients and servers to use as wide a range of ports as possible, e.g. with

    echo 1024 65535 > /proc/sys/net/ipv4/ip_local_port_range

to avoid bumping into this situation when trying to simulate large numbers of clients with a small number of client machines.

Mr. Horikawa reported the problem resolved on 2 April 2000.

Suggestions for future benchmarks

- Become familiar with the linux-kernel and Apache mailing lists, as well as the Linux newsgroups on Usenet (try a DejaNews power search in forums matching 'linux*').
- Post your proposed configuration to see if others agree. Post intermediate results and be open about what you have done. You should probably expect to spend a week or so mulling over ideas with these mailing lists during the course of your tests.
- If possible, use a modern benchmark like SPECWeb99 rather than the simple ones used by Mindcraft.
- It may be useful to inject latency into the path between the server and the clients to better model the Internet.
- If possible, benchmark both single and multiple CPUs, as well as single and multiple Ethernet interfaces. The networking performance of Linux kernel version 2.2.x does not scale well with more CPUs or Ethernet cards. This is mostly true for static pages and cached pages. Noncached dynamic pages take a lot of CPU time and should scale well as you add more CPUs. A cache can be used to save frequently generated pages, which brings dynamic page speeds closer to static page speeds.
- When testing dynamic content: don't use the old model of running a separate process for each request; nobody running a big web site uses that interface anymore, as it's too slow. Use a modern interface to generate dynamic content (e.g. mod_perl for Apache).

Configuring Linux

Tuning problems probably resulted in less than a 20% performance decrease in Mindcraft's test, so as of 3 October 1999, most people will be happy with a stock 2.2.13 kernel or whatever comes with Red Hat 6.1. The 2.4 kernel will improve SMP performance when it is available.

Here are some notes if you want to see what people going for the utmost were trying in June:

- As of June 1, Linux kernel 2.2.9 plus 2.2.9_andrea3 has been mentioned as performing well on a dual-processor system (see above). (2.2.9_andrea3 seems to include both a wake-one scheduler fix and an SMP unlock_kernel fix.) (andrea3 only works on x86, so people who have Alphas or PPCs will need to apply other wake-one and tcp_unlock patches.) Jan Gruber writes: "The 2.2.9_andrea3 patch does not compile with SMP support disabled. Andrea told me to use ftp://ftp.suse.com/pub/people/andrea/kernel-patches/2.2.9_andrea-VM4.gz instead."
- Andrea Arcangeli, 7 June 1999, asked: If you are going to do benchmarks, could you also benchmark the patch below? ftp://e-mind.com/pub/andrea/kernel-patches/2.2.9_andrea-perf1.gz
- On 11 Oct 1999, Andrea Arcangeli posted his list of pending 2.2.x patches, waiting to go into 2.2.13 or so. These patches could improve the performance of SMP systems or systems that are subject to heavy I/O. They may be worth considering if your system is experiencing bottlenecks.
- For the truly adventurous, you might consider using the kernel-mode http web server, khttpd, as a front end for Apache. It accelerates static page fetches greatly. It's at version 0.1, so use caution.
- linux-kernel (week 1, week 2) is currently (8 June 1999) discussing Apache benchmarking. Linus Torvalds, who is generally supportive of khttpd or a similar program, points out that NT is doing essentially the same thing.

Configuring Apache

- The usual optimizations should be applied (all unused modules should be left out when compiling, host name lookup should be disabled, and symbolic links should be followed; see http://www.apache.org/docs/misc/perf-tuning.html)
- Apache should be compiled to block in accept, e.g. env CFLAGS='-DSINGLE_LISTEN_UNSERIALIZED_ACCEPT' ./configure
- The http://www.arctic.org/~dgaudet/apache/1.3/top_fuel.patch may be worth applying. PC Week used top_fuel in their recent benchmarks. (See also Dean Gaudet's interesting comments in new-httpd and linux-kernel.) According to some reports, mod_mmap_static and top_fuel.patch together can reduce the number of syscalls per request from 18 to 9.
- For static file benchmarks, try compiling mod_mmap_static into Apache (see http://www.apache.org/docs/mod/mod_mmap_static.html) and configuring Apache to memory-map the static documents, e.g. by creating a config file that lists every file found under /www/htdocs in an mmapfile directive.
- According to several people, using Squid as a front end to Apache would speed up static page fetches.
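The memory-mapping configuration mentioned above can be generated mechanically. The following Python sketch walks a document root and produces one MMapFile line per file; the function name `mmapfile_directives` is made up for illustration, and the /www/htdocs and /www/conf/mmap.conf paths in the usage comment are only assumed example locations, not something your installation necessarily uses.

```python
import os


def mmapfile_directives(docroot):
    """Return a sorted list of 'MMapFile <path>' lines, one for every
    non-directory entry found under docroot (hypothetical helper)."""
    lines = []
    for dirpath, _dirnames, filenames in os.walk(docroot):
        for name in filenames:
            lines.append("MMapFile " + os.path.join(dirpath, name))
    return sorted(lines)


if __name__ == "__main__":
    # Assumed example paths; adjust to your own layout:
    #   python this_script.py > /www/conf/mmap.conf
    for line in mmapfile_directives("/www/htdocs"):
        print(line)
```

The resulting file would then be pulled into the server configuration. Since mod_mmap_static maps files at startup, the list should be regenerated and the server restarted whenever the static documents change.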

Related reading

- Usenet posts showing slow Apache and Linux connections:
  - "Apache isn't as fast as people claim?", 1999/04/05, comp.infosystems.www.servers.unix: "...when we run WebBench to test the requests/sec and total throughput, Microsoft IIS 4 is 3 times faster for both Linux and Mac OS X."
  - "Re: Apache vs IIS 4: IIS 4 3 times faster", 1999/04/02, comp.infosystems.www.servers.unix: "Why are you surprised? I assumed Apache was slow. I haven't tested IIS but I did compare Apache to a few other servers last year. I found some that were three to four times faster."

Ways to profile the kernel:

- Kernel Spinlock Metering for Linux IA32, a tool to measure SMP contention. Also see the test results comparing version 2.2 to version 2.3, and a spinlock metering example that identifies and fixes a kernel bottleneck in 2.3.39.
- Andrea Arcangeli's ikd
- SGI's gprof kernel profiling patch (original announcement)
- Ingo Molnar's ktracer -- for 2.1.x. Example of ktracer use. Example of both ktracer and ikd profile output.
- Christoph Lameter's perfstat patch, at Captech's Linux Performance, Stability and Scalability Project -- see also their 25 Oct 99 post on linuxperf

Ways to profile user programs:

- The old favorite: compile with -pg, and analyze gmon.out with gprof.
- Mikael Pettersson's x86 performance-monitoring counters patch. Supports 2.3.22, 2.2.13. Includes list of other related tools.
- Hardware performance counters with Linux by David Mentre
- The Performance Counter Library -- supports many architectures.
- Stephan Meyer's MSR patch -- only supports up to 2.2.6. No longer actively developed.
- Richard Gooch's MSR/PTC patch -- only supports version 2.2. Requires devfs.

A few linux-kernel posts:

- "2.2.5 optimizations for web benchmarks?", 16 Apr 1999 -- Karthik Prabhakar, who is about to perform serious SPECWeb96-based benchmarking, asks the right question. The followups are very interesting.
- "Re: 2.2.5 optimizations for web benchmarks?", 16 Apr 1999 -- Dean Gaudet's response. Interesting insights from an Apache insider.
- "[patch] new scheduler", 9 May 1999 -- Rik van Riel started this thread about possible scheduler modifications.

Other links:

- The smbtorture benchmark, which allows you to test an SMB server just like the big guys
- Rik van Riel's Linux Performance Tuning site
- The Linux Scalability Project
- The C10K problem - why can't Johnny serve 10000 clients?
- Banga and Druschel's paper on web server benchmarking
- Linus's "State of Linux" talk at Usenix '99, where he talks about the Mindcraft benchmark and SMP scalability
- My NT vs. Linux Server Benchmark Graphs page
- A post on comp.unix.bsd.freebsd.misc from June '99 which mentions that FreeBSD has SMP scaling properties similar to Linux's on tests like those run by Mindcraft
- Mike Abbott of SGI's performance patches for Apache 1.3.9
- Note: Apache 2.0 supports sendfile(), which ought to help its flat file performance.
