I am currently looking at various software vendors for performance testing of our (medium-sized) web site. Can someone tell me the best way to estimate the number of concurrent users I need to target, based on, say, the number of hits per day on the web site? I have a year's worth of data. Thanks in advance.
Re: Concurrent Usage
Originally posted by Sadasivan:
"Can someone tell me the best way to estimate the concurrent number of users I need to target, based on say number of hits per day on the web site."
What you "need" to target is what you plan to support. That is what your estimate should be based on and, of course, you can have multiple estimates: light load, medium load, heavy load, etc. You can also do educated estimates by looking at your server logs and deriving information from them.
In any event, at a simple level you can take "unique visitors per day" and relate that to twenty-four hours. Or you could use "unique visitors per minute" or "unique visitors per hour". The problem is that people arrive and leave at different points. So generally what you need to do is work out how many actual transactions you expect the "average user" to perform and the time it will take them to perform those tasks. (This is part of Customer Behavior Modeling.) Then you can convert these figures into how many actual concurrent users you expect at any given time. (One issue is making sure you realize that browsers can open their own concurrent connections, usually four, with just one user.)
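As a rough sketch of that conversion, one common way to go from daily volume and time-per-visit to concurrency is Little's Law (concurrency = arrival rate * time spent on the site). This is a swapped-in technique, not necessarily what any particular tool or book uses. The function names, the `peak_factor` multiplier, and the default of four connections per user (taken from the remark above) are illustrative assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_concurrent_users(sessions_per_day, avg_session_seconds, peak_factor=1.0):
    """Estimate average concurrent users with Little's Law.

    sessions_per_day:    visits (sessions) per day, e.g. derived from server logs
    avg_session_seconds: how long the "average user" takes to finish their tasks
    peak_factor:         assumed multiplier for busy periods (1.0 = daily average)
    """
    arrivals_per_second = sessions_per_day / SECONDS_PER_DAY
    return arrivals_per_second * avg_session_seconds * peak_factor


def estimate_concurrent_connections(concurrent_users, connections_per_user=4):
    """Convert users to connections, assuming each browser opens several at once."""
    return concurrent_users * connections_per_user


if __name__ == "__main__":
    # Hypothetical figures: 20,000 sessions a day, five minutes per session.
    users = estimate_concurrent_users(20_000, 300)
    print(f"average concurrent users: {users:.1f}")
    print(f"concurrent connections:   {estimate_concurrent_connections(users):.1f}")
```

With those made-up figures the daily average works out to about 69 concurrent users. Traffic is rarely spread evenly across the day, so the peak figure you actually test against will usually be higher.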
I usually prefer to derive my estimates from other calculations and then plug those into a performance model to see what actually comes out. For example, we all know that, in a small and simplistic nutshell, [total bandwidth = number of concurrent connections at a given bandwidth * the given bandwidth]. That means I can assume a certain number of concurrent users, multiply that by the bandwidth of the access method and work out the bits per second (or kilobits per second). That tells you the kind of bandwidth you need under the assumption of that number of concurrent users.
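To make the bandwidth relationship concrete, here is a minimal sketch of that multiplication. The function name and the sample figures (200 connections over 56 kbps modems) are hypothetical and only there to show the arithmetic.

```python
def required_bandwidth_kbps(concurrent_connections, per_connection_kbps):
    """Total bandwidth = concurrent connections * bandwidth per connection."""
    return concurrent_connections * per_connection_kbps


if __name__ == "__main__":
    # Hypothetical load: 200 concurrent connections, each on a 56 kbps modem.
    total = required_bandwidth_kbps(200, 56)
    print(f"required bandwidth: {total} kbps ({total / 1000:.1f} Mbps)")
```

A mix of access methods can be handled by running the same calculation once per access method and adding up the results.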
I highly recommend chapter eleven of the book Scaling for E-Business. Within that chapter I specifically recommend section 11.4, "From HTTP Logs to CBMGs" which explains how to estimate things like this via server logs. (A CBMG, in case you do not know, is a Customer Behavior Model Graph.)
Bear in mind that you can also measure concurrent users not just by the user but, again, by the effects of the user. As I said, browsers can open multiple connections. This means you can consider a "concurrent user" as being an HTTP transaction. (This is how many performance models do it.) Or you could take something like Microsoft's approach which is to take the threads multiplied by the sockets per thread as being indicative of a "concurrent user". Remember, when doing this, that you also have to consider the access method being used because that can affect things.
As just a quick example of that last point, looking at requests per second can be useless unless you compare it against the concurrent connections over the same period, and this is especially so when dealing with various access methods. If concurrent connections are increasing but the requests per second stay at a fairly stable level, then the chances are fairly good that the connections are being forced to stay open longer to service requests. This could mean that the number of concurrent connections allowed has to be increased on the server or that the processing power of the server needs to be increased. However, it could also mean that there are too many data exchanges going on for the bandwidth of the network or for slower access methods.
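The pattern described above (connections climbing while requests per second stay flat) can be checked mechanically against monitoring samples. The sketch below is only one possible way to do it. The function name, the sample format, and both thresholds are assumptions, not values from any tool or from the article mentioned in this thread.

```python
def connections_outpacing_requests(samples, growth_threshold=0.25, flat_tolerance=0.10):
    """Flag the case where connections rise while requests/sec stay roughly flat.

    samples: list of (concurrent_connections, requests_per_second) tuples,
             ordered in time.
    growth_threshold: assumed minimum relative growth in connections (25%).
    flat_tolerance:   assumed maximum relative change in requests/sec (10%).
    """
    if len(samples) < 2:
        return False
    first_conn, first_rps = samples[0]
    last_conn, last_rps = samples[-1]
    if first_conn <= 0 or first_rps <= 0:
        return False  # cannot compute a relative change from a zero baseline
    conn_growth = (last_conn - first_conn) / first_conn
    rps_change = abs(last_rps - first_rps) / first_rps
    return conn_growth >= growth_threshold and rps_change <= flat_tolerance


if __name__ == "__main__":
    # Hypothetical samples: connections more than double, throughput barely moves.
    samples = [(100, 50.0), (150, 51.0), (220, 49.0)]
    if connections_outpacing_requests(samples):
        print("Connections are staying open longer; check server limits, CPU and bandwidth.")
```

This compares only the first and last sample, so it is a coarse check. A positive result says the symptom is present. It does not say which of the causes listed above is responsible.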
With all of that being said, hopefully you realize that concurrent users can be very misleading. This was said very well in the article Trade Secrets from a Web Testing Expert. The first section is "Misunderstanding Concurrent Users". I recommend looking at the article. (Please note that this link is a direct PDF link.)
Re: Concurrent Usage
I agree with JeffNyman and I love the article he references. I have also written three articles on this topic that will talk you through user community modeling (what Jeff refers to as Customer Behavior Modeling) using a common-sense approach that is easy on the math. They will allow you to take your user model and convert it, quickly and easily, into the number of concurrent user licenses required to simulate that load (which is your real question, yes? How big a license do I need?).
These articles are supposed to be posted in the download section here, but have not been uploaded yet. Please email me and I will send them to you.
Sr. Performance Engineer
Re: Concurrent Usage
Thanks guys. I think I have enough to imbibe and make a good start.
PS: Really, I didn't think there was so much to it!