Predicting Java memory
In general, you can predict fairly closely how much memory a given object will require. There is a relatively fixed overhead (the object header), plus the instance fields in the object, plus a modest amount of padding. The object size is then rounded up to an alignment boundary, commonly 8 or 16 bytes depending on the JVM (HotSpot defaults to 8), and some JVMs round some object sizes up to larger boundaries (to allow the use of standard-sized pre-allocated object frames). All of this is relatively fixed for a given JVM.
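As a rough illustration of that arithmetic, the sketch below adds a header size to the field sizes and rounds the sum up to an alignment boundary. The class name, the 12-byte header, and the alignment values are assumptions chosen for the example, not figures for any particular JVM, and padding between individual fields is ignored.

```java
// Rough per-object size estimate: header + fields, rounded up to the alignment.
// Header size and alignment are assumed inputs; real values depend on the JVM.
final class ObjectSizeEstimate {

    /** Rounds size up to the next multiple of alignment (alignment must be > 0). */
    static long alignUp(long size, long alignment) {
        return ((size + alignment - 1) / alignment) * alignment;
    }

    /** Estimated shallow size of one object, in bytes. */
    static long estimate(long headerBytes, long fieldBytes, long alignment) {
        return alignUp(headerBytes + fieldBytes, alignment);
    }

    public static void main(String[] args) {
        // Hypothetical object: an assumed 12-byte header plus one int and one long.
        long fields = Integer.BYTES + Long.BYTES; // 12 bytes of instance data
        System.out.println(estimate(12, fields, 8));  // 24 with 8-byte alignment
        System.out.println(estimate(12, fields, 16)); // 32 with 16-byte alignment
    }
}
```

The same 24 bytes of header and data cost 24 or 32 bytes depending only on the alignment, which is why the per-object figure is predictable once the JVM is fixed.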
What varies, of course, is the overhead required for garbage collection. A naive copying collector requires 100% overhead (at least one free byte for every allocated byte), though certain varieties of “generational” collectors can improve on this to a degree. On most JVMs, how much space GC needs depends heavily on the workload.
The other problem is that when you’re running at a relatively low level of allocation (using maybe only 10% of the maximum available heap), garbage will accumulate. It is no longer referenced, but the bits of garbage are interspersed with your live objects, so it still takes up working set. As a result, your working set tends to be roughly equal to your current overall garbage-collected heap size (plus other system overhead).
You can, of course, “throttle” the heap size so that you run at a higher percentage utilization, but that increases the frequency of garbage collection and, to a lesser degree, the overall cost of GC.
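To see where a running program sits on that trade-off, a minimal sketch using the standard `java.lang.Runtime` counters is shown below. The class and method names are made up for the example. Note that the "used" figure includes garbage that has not been collected yet, so it overstates the live data.

```java
// Reports how much of the heap is in use, using only java.lang.Runtime.
final class HeapUtilization {

    /** Fraction of the budget in use (budgetBytes must be > 0). */
    static double utilization(long usedBytes, long budgetBytes) {
        return (double) usedBytes / budgetBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory();       // heap currently reserved by the JVM
        long used = committed - rt.freeMemory(); // live objects plus uncollected garbage
        long max = rt.maxMemory();               // upper limit the heap may grow to

        System.out.printf("used=%d committed=%d max=%d%n", used, committed, max);
        System.out.printf("utilization of max heap: %.1f%%%n",
                100.0 * utilization(used, max));
    }
}
```

Running the same workload with a smaller maximum heap (on HotSpot, for example, via `-Xmx`) should show a higher utilization figure, at the price of more frequent collections.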