I want to highlight some aspects of the performance and server load during the event. The server runs on a single, pretty standard machine (not server-grade hardware):
- CPU - Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
- Memory - 6 GB
- Hard disk - 1 TB
- Ethernet - 100 Mbit/s
After the press published the news on their sites, users hit the server. That day, from 16:30 till 00:00, Google Analytics registered the following data:
- 3402 visits
- 3154 visitors
- 13861 page views
The problem was a spike right after the media sites published the news, at which point the site started to lag. I did not know how many users to expect, so the approach was "let's wait and see". Sure, I ran various "ab" benchmarks, but I did not know how many users would actually use the service or where the bottleneck would be: CPU, RAM, IO or bandwidth. We also run other services like WMS/WFS that are CPU and RAM heavy. The problem with static tiles is that one visitor produces about 30 HTTP requests each time they zoom.
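To put the numbers above in perspective, here is a rough back-of-envelope sketch in Python. It only computes averages over the whole 7.5 hour window, so it says nothing about the size of the spike itself. The figure of one zoom per page view is my assumption for illustration, not something that was measured, and the function name is made up for this example.

```python
# Back-of-envelope load estimate from the Google Analytics numbers above.
VISITS = 3402
PAGE_VIEWS = 13861
WINDOW_SECONDS = 7.5 * 3600   # 16:30 till 00:00
REQUESTS_PER_ZOOM = 30        # from the post: about 30 HTTP requests per zoom

# Assumption (not measured): each page view triggers roughly one zoom.
ZOOMS_PER_PAGE_VIEW = 1


def average_load(visits, page_views, window_seconds,
                 requests_per_zoom, zooms_per_page_view):
    """Return average page views per visit and average tile requests per second."""
    page_views_per_visit = page_views / visits
    page_views_per_second = page_views / window_seconds
    requests_per_second = (page_views_per_second
                           * zooms_per_page_view
                           * requests_per_zoom)
    return page_views_per_visit, requests_per_second


pv_per_visit, req_per_sec = average_load(
    VISITS, PAGE_VIEWS, WINDOW_SECONDS, REQUESTS_PER_ZOOM, ZOOMS_PER_PAGE_VIEW
)
print(f"page views per visit: {pv_per_visit:.2f}")
print(f"average tile requests per second: {req_per_sec:.1f}")
```

Under these assumptions the average works out to roughly 15 tile requests per second. The real peak right after the news went out was certainly higher than that average, which is why an average like this is only a starting point for capacity planning.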
Munin graphs for the event:
As you can see from the graphs, the problem was in the IO: our hard disk just could not handle the load. The traffic was pretty high too. All in all I am pretty happy with the results, and I now have a better understanding of how to scale the portal horizontally in case the load grows, but that's for another post.








