A simple answer might be that there are really three big categories of performance testing, but lots and lots of names for each of them.
Either you run a test with user pauses (think time) designed to replicate existing or expected user loads on the system - a Load Test (or a dozen other names: scalability test, volume test, production readiness test, etc.)
OR you run a test with larger than existing or expected user loads, and/or with shorter or no pauses or think time in the script - a Stress Test (or volume test, resource test, etc.)
OR you run a test until the server(s) "break" - a Negative Test (or failover test, degradation test, etc.).
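To make the distinction concrete, here is a rough sketch of how the three categories differ only in a few knobs: number of users, think time, and whether you keep ramping until something fails. The names `Profile`, `build_profile`, and `stress_factor` are made up for illustration and are not from any particular load testing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    users: int                 # number of simulated concurrent users
    think_time_s: float        # pause between user actions, in seconds
    ramp_until_failure: bool   # keep adding load until the server breaks


def build_profile(kind: str, expected_users: int, think_time_s: float,
                  stress_factor: int = 2) -> Profile:
    """Return the settings for one of the three broad test categories.

    stress_factor is an assumed multiplier; pick whatever suits your system.
    """
    if kind == "load":
        # Replicate existing or expected load, with realistic pauses.
        return Profile(expected_users, think_time_s, False)
    if kind == "stress":
        # More users than expected and no think time.
        return Profile(expected_users * stress_factor, 0.0, False)
    if kind == "negative":
        # Start from expected load and keep going until something breaks.
        return Profile(expected_users, 0.0, True)
    raise ValueError(f"unknown test kind: {kind}")


if __name__ == "__main__":
    for kind in ("load", "stress", "negative"):
        print(kind, build_profile(kind, expected_users=500, think_time_s=5.0))
```

Whatever name your team uses, the test comes down to which of these settings you pick.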
Think less about the terms used to describe it, and more about what you're trying to accomplish. Then slap any old name on it you want.
My volume testing consists of creating millions of files for our application to work with.
My performance testing uses large numbers of files and includes benchmarks for different functions used with those datasets.
For example, if you were testing a backup program, you could test how the application deals with 100,000,000 files for the volume test, and how long it takes to back up 1,000,000 files for the performance test.
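A minimal sketch of that split, assuming a stand-in `backup` function that just copies a directory with `shutil.copytree` (a real backup program would be invoked here instead). The helper names and the tiny file counts are illustrative only; the real runs would use counts like the ones above.

```python
import shutil
import tempfile
import time
from pathlib import Path


def make_files(root: Path, count: int) -> None:
    """Create `count` small files under `root`."""
    root.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        (root / f"file_{i:09d}.txt").write_text("x")


def backup(src: Path, dst: Path) -> int:
    """Stand-in for the program under test: copy src to dst, return file count."""
    shutil.copytree(src, dst)
    return sum(1 for p in dst.rglob("*") if p.is_file())


def volume_test(count: int) -> bool:
    """Volume test: does the application cope with `count` files at all?"""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "src", Path(tmp) / "dst"
        make_files(src, count)
        return backup(src, dst) == count


def performance_test(count: int) -> float:
    """Performance test: how many seconds does backing up `count` files take?"""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "src", Path(tmp) / "dst"
        make_files(src, count)
        start = time.perf_counter()
        backup(src, dst)
        return time.perf_counter() - start


if __name__ == "__main__":
    print("volume test passed:", volume_test(1000))
    print("backup time (s):", performance_test(1000))
```

The volume test only asks whether the job completes correctly at that scale; the performance test records a number you can compare between builds.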