I was wondering what kind of metrics other companies are using to measure things such as software reliability.
Re: QA Metrics
Strict reliability means "the ability of a system to perform its required functions under stated conditions whenever required." It also refers to having a long mean time between failures. However, I do not know if you are using the term in the strict sense or a more general sense.
Going with the strict sense, one metric that is often used is the rate of occurrence of failures. (Sometimes you see this given as an ROCOF number.) This is the frequency with which unexpected behavior occurs. So a value of 0.05 means that five failures are likely in every 100 operational time units. (For example, with minutes as the time unit, you would expect about five failures, say failed credit card transactions, in every 100 minutes of operation.)
Another metric is the probability of failure on demand, which is a measure of the likelihood that a system will fail when a given service request is made. A lot of times this metric is given as POFOD, and something like 0.05 would mean that out of every 100 service requests, five are likely to result in failure.
There is also mean time to failure (MTTF), which is just the average observed time between failures.
Another metric is availability, which is a measure of the proportion of time the system is available for use, obviously done in relation to whatever time scale you want. (Downtime from things like reboots, server restarts, etc. should be counted against it.)
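As a rough illustration of the availability idea, here is a minimal Python sketch. The function name, the 30-day window, and the outage durations are all made up for the example; it simply assumes you have logged how long each outage lasted.

```python
def availability(window_seconds, outage_seconds):
    """Fraction of the observation window during which the system was usable.

    window_seconds: length of the observation window (hypothetical input).
    outage_seconds: iterable of individual outage durations, in seconds.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total_down = sum(outage_seconds)
    if total_down > window_seconds:
        raise ValueError("total downtime cannot exceed the window")
    return (window_seconds - total_down) / window_seconds


# Made-up data: a 30-day window with a reboot, a server restart,
# and a crash recovery (durations in seconds).
window = 30 * 24 * 3600
outages = [120, 900, 45]
print(f"availability: {availability(window, outages):.5f}")
```

Whether planned maintenance counts as downtime is a policy decision; this sketch treats every logged outage the same way.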
Some ways to calculate them are pretty obvious in general. If you want to calculate POFOD, count the number of system failures for a given number of transactions/interactions with the system. To compute ROCOF or MTTF, just measure the time between system failures or the number of transactions/interactions between system failures.
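The calculations above could be sketched in Python roughly as follows. The function names and the sample numbers are hypothetical, and the MTTF function is a simplification that assumes failure timestamps are all you have (it ignores repair time).

```python
from statistics import mean


def pofod(failed_requests, total_requests):
    """Probability of failure on demand: failures per service request."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return failed_requests / total_requests


def rocof(failure_count, operational_time):
    """Rate of occurrence of failures: failures per operational time unit."""
    if operational_time <= 0:
        raise ValueError("operational_time must be positive")
    return failure_count / operational_time


def mttf(failure_times):
    """Mean gap between consecutive failure timestamps."""
    ordered = sorted(failure_times)
    if len(ordered) < 2:
        raise ValueError("need at least two failures to measure a gap")
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps)


# Made-up observations: 5 failed requests out of 100, 5 failures in
# 100 minutes of operation, and failures logged at these minute marks.
print(pofod(5, 100))
print(rocof(5, 100))
print(mttf([10, 30, 60, 100]))
```

The same functions work if you count transactions instead of minutes; just keep the unit consistent between the failure log and the operational total.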
Just to be complete, since I am not sure of your focus, the IEEE also defines things like requirements reliability metrics, test reliability metrics, etc. Having said all this, here is my caveat: these are the things that I do at the organizations I work at with the understanding that I have of what reliability means.