Back of the envelope
Everyone who writes software, and most who use it, should be familiar with "back of the envelope" calculations. I first read about this style of thinking about problems in Jon Bentley's Programming Pearls, Chapter 7.
Part of the work we are doing in core is around rendering and performance. We are working to update our rendering abstraction and how we view rendering in general, as well as how we think about and measure performance.
As part of this we have started doing some calculations related to bus bandwidth and how much of the bus each subsystem can and should use. This is really our first attempt at these “back of the envelope” calculations and we should have done them long ago.
Anyway, enough with the recriminations and on to the data. First, here are the bus bandwidth rates we are using:
Bus            Expected perf rate (GB/s)   Expected rate (MB/frame at 60 FPS)   Date introduced
PCIe 16x 2.0   4                           68                                   2007
PCIe 16x 1.0   2                           34                                   2004
PCIe 8x 1.0    1                           17                                   2004
AGP 8x 1.0     1                           17                                   2002
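The MB/frame column can be reproduced from the GB/s column with a quick calculation. The sketch below is only a rough check, not anything from the FSX code base: the helper name `mb_per_frame` is made up for illustration, and it assumes 1 GB = 1024 MB, which is the convention that reproduces the table's 68, 34, and 17.

```python
# Assumption: 1 GB is treated as 1024 MB, which reproduces the table's rounding.
MB_PER_GB = 1024


def mb_per_frame(gb_per_second, fps=60):
    """Convert a sustained bus rate in GB/s into a per-frame budget in MB."""
    return gb_per_second * MB_PER_GB / fps


# Expected rates from the table above, in GB/s.
buses = {
    "PCIe 16x 2.0": 4,
    "PCIe 16x 1.0": 2,
    "PCIe 8x 1.0": 1,
    "AGP 8x 1.0": 1,
}

for name, rate in buses.items():
    print(f"{name}: {round(mb_per_frame(rate))} MB/frame at 60 FPS")
```

Passing `fps=30` doubles every per-frame figure, so the 60 FPS numbers are best read as a convenient reference point rather than a hard limit.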
OK, so let’s think about this for just the Autogen tree subsystem. In FSX, most auto-generated trees have 12 vertices, each containing 32 bytes of data[1], giving the model a size of (12 * 32 bytes) = 384 bytes per instance.
When using batching, and assuming 20% of the bus bandwidth is available for auto-generated trees, a PCIe 16x 1.0 bus can transfer (0.20 * 34 MB / 384 bytes per instance) = 18,568 batched trees per frame at 60 Hz.
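The batching estimate can be checked in a few lines. This is a minimal sketch under stated assumptions: the names are hypothetical, a megabyte is taken as 1024 * 1024 bytes (the convention that yields the figure quoted above), and the vertex layout is the one given in footnote [1].

```python
import struct

# Assumption: binary megabytes, which matches the result quoted in the post.
BYTES_PER_MB = 1024 * 1024

# Vertex layout from footnote [1]: 3 position + 3 normal + 2 texture floats.
VERTEX_BYTES = struct.calcsize("<8f")  # eight 4-byte floats
VERTICES_PER_TREE = 12
TREE_BYTES = VERTICES_PER_TREE * VERTEX_BYTES


def batched_trees_per_frame(frame_budget_mb=34, share_percent=20):
    """Whole trees that fit in the given share of one frame's bus budget."""
    budget_bytes = frame_budget_mb * BYTES_PER_MB * share_percent // 100
    return budget_bytes // TREE_BYTES


print(TREE_BYTES)                 # 384
print(batched_trees_per_frame())  # 18568
```

Integer arithmetic is used throughout so the result is a whole number of trees without any floating-point rounding surprises.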
Now, a typical scene has 50 cells of 1 km x 1 km each. Autogen trees are pegged at 4500 max per cell (when the slider is all the way to the right). You can set the max up to 6000, so let’s call it 5000 for easier math: 5000 * 50 = 250,000 trees.
Holy Autogen, Batman! Yes, this is why autogen brings the system to its knees: 250,000 trees is roughly 13 times the 18,568 that the bandwidth budget allows. Crysis doesn’t try to render 250k trees. If you are having a major perf problem with Autogen, try using the “max in cell” tweak to reduce the max to 500. Then your max is 500 * 50 = 25,000.
It should be obvious the same holds true for Autogen buildings.
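To see how far the requested tree counts sit from the batching budget, the comparison can be sketched as follows. The function and constant names are illustrative only; the 18,568 figure is the batched budget worked out earlier in the post.

```python
CELLS_IN_SCENE = 50     # typical scene, per the post
BATCHED_BUDGET = 18568  # batched trees per frame at 20% of a 34 MB frame budget


def trees_requested(max_per_cell, cells=CELLS_IN_SCENE):
    """Worst-case number of trees the scene asks the engine to draw."""
    return max_per_cell * cells


for max_per_cell in (5000, 500):
    requested = trees_requested(max_per_cell)
    ratio = requested / BATCHED_BUDGET
    print(f"{max_per_cell} per cell: {requested} trees, {ratio:.1f}x the budget")
```

Even with the tweak set to 500 per cell, the worst case is still somewhat over the batched budget, which is why the tweak helps a great deal without closing the gap entirely.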
Part of our performance work is to turn the engine from a batching engine into an instancing engine to help bridge this gap. When using instancing, and assuming 10% of the bus bandwidth is available for non-animated instance data, (0.10 * 34 MB / 48 bytes per instance) = 74,274 instances per frame at 60 Hz can be sent across a PCIe 16x 1.0 bus. If we give Autogen trees 20%, then we get about 150,000 trees. 3000 trees per cell (3000 * 50 = 150,000) is certainly within those limits. So that change alone gets us close to "within bounds".
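The instancing numbers follow from the same kind of arithmetic. The sketch below assumes binary megabytes and uses made-up names; the 48 bytes per instance, the 34 MB frame budget, and the 50 cells are the figures from the post.

```python
# Assumption: binary megabytes, which matches the result quoted in the post.
BYTES_PER_MB = 1024 * 1024
INSTANCE_BYTES = 48   # per-instance data size quoted in the post
CELLS_IN_SCENE = 50   # typical scene, per the post


def instances_per_frame(share_percent, frame_budget_mb=34):
    """Whole instances that fit in the given share of one frame's bus budget."""
    budget_bytes = frame_budget_mb * BYTES_PER_MB * share_percent // 100
    return budget_bytes // INSTANCE_BYTES


print(instances_per_frame(10))                    # 74274
print(instances_per_frame(20))                    # 148548
print(instances_per_frame(20) // CELLS_IN_SCENE)  # 2970 trees per cell
```

The exact 20% figure is 148,548, which the post rounds to 150,000; spread over 50 cells that is just under 3000 trees per cell.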
Given there are sliders and config entries, users can adjust their settings to the local conditions of their hardware and the style of flying they do. That is why this gap isn’t “tragic”. Still, it is a rude surprise to most people that FSX tries to do so much; that is a good part of the root of the problem, and it is why the FSX engine is so different from other engines.
And there are some other things we are doing, so the engine isn’t so overcommitted. But that is another post. :)
PS.
https://support.microsoft.com/kb/555739 lists the Autogen max per cell tweak.
PPS
The slider stops correspond to the following percentages:
0, 10, 20, 45, 70, 100
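As a rough illustration, the slider stops can be turned into per-cell tree caps. This sketch assumes each percentage scales the 4500-tree maximum linearly, which the post does not state outright; the names are hypothetical.

```python
SLIDER_PERCENTAGES = (0, 10, 20, 45, 70, 100)  # slider stops listed above
MAX_TREES_PER_CELL = 4500                      # cap with the slider fully right


def trees_per_cell(percent, max_per_cell=MAX_TREES_PER_CELL):
    """Assumed linear scaling of the per-cell tree cap by slider percentage."""
    return max_per_cell * percent // 100


for percent in SLIDER_PERCENTAGES:
    print(f"{percent:3d}% -> {trees_per_cell(percent)} trees per cell")
```

Under that assumption, the 45% stop works out to 2025 trees per cell, comfortably under the roughly 3000 per cell that the instancing estimate allows.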
[1] 3 floating point numbers for position, 3 for normal, and 2 for texture coordinates: ((3 + 3 + 2) * 4 bytes) = 32 bytes.
Comments
Anonymous
April 17, 2008
Great post Phil and one of the reasons we rework AG in GEX the way we do. We look at the objects and add up the weight and then try and maintain a percentage which will play nice with the rest of a typical render. It’s really great to see you guys are looking at all this now. Very uplifting to know things are being considered based on REALITY instead of just FEATURES. I knew you were the man for this job!!! I look forward to your next installment on changes. Also, don’t forget to test it in combat. The theory is great, but it must always be tested on different systems to be sure the design is in fact on the right track :)
Anonymous
April 17, 2008
Hi Phil! I concur with Nick, great post and insight indeed. Especially reading the last part; going from batching to (semi) instancing, got me very excited! Thanks! :)
Anonymous
April 17, 2008
Hi Phil, thanks for all the work that you've put into FSX. Talking of autogen and its impact on performance, there is one thing I never quite understood. When rendering a forest, you render thousands of individual trees. Wouldn't it be quite a lot easier to render a forest? I'm thinking of a rectangular or triangular object of let's say 500 by 500 meters with a forest texture? This should save a considerable amount of performance. Best, Matthias
Anonymous
April 18, 2008
In response to Matthias, I've also thought that rendering a forest would make more sense; not only would it hugely reduce the polygon count, it would probably look more realistic as well. In real life the only time you can see individual trees in a forest of reasonable density is when you're actually in it. Colin
Anonymous
April 18, 2008
Imposter systems can be tricky too. And we'd have to be re-writing that for Trains to get the 10 foot experience, so in the long run this is the right way to author the feature.
Anonymous
April 18, 2008
Hi Phil, I was under the impression, from reading your post back at the end of last year, that you had delivered and fulfilled your promise with the release of SP1 + 2 and that you were moving on to more pressing things. Was I wrong? So what are we talking about, reading the above posts... SP3?? I am suddenly excited!
Anonymous
April 21, 2008
SteveyB: this discussion is about progress made on TrainSim2, not about an SP3 for FSX. There is no plan for an additional service pack.
Anonymous
April 21, 2008
Ted: I used 60 just as a point of discussion. Yes, if you cut the FPS down to 30, that effectively doubles the amount you can push.
Anonymous
April 21, 2008
Phil, thanks for this post. I'm glad to see this sort of thing being done. Knowing the practical limitations of what you are trying to engineer is a very valuable thing! Alas, I'm no engine programmer or 3D guru at all, so can you tell me what batching versus instancing are, or at least tell me where I can learn? Keep up the good work and informative posts. Mark
Anonymous
April 21, 2008
VSS: you can read up on batching vs instancing in the DX SDK, or by searching for "Direct3D batching instancing".
Anonymous
May 04, 2008
"Crysis doesn’t try to render 250k trees." Nope, but it does render individual, animated leaves, animated vegetation, has dynamic realtime soft shadows on everything, full HDR rendering etc. :) A tree in Crysis and a tree in FSX are completely different things. But I get the point: you can't use a graphics engine intended for shooters in a flight sim, and the opposite holds true too. Also, with the low draw distance of autogen, I very much doubt you see that many trees. Forests in FSX look fairly dense close up, but the trees quickly thin out farther from the viewer.