Back of the envelope

Everyone who writes software, and most who use it, should be familiar with "back of the envelope" calculations. I first read about this style of thinking about problems in Jon Bentley's Programming Pearls, Chapter 7.

Part of the work we are doing in core is around rendering and performance. We are working to update our rendering abstraction and how we view rendering in general, as well as how we think about and measure performance.

As part of this we have started doing some calculations related to bus bandwidth and how much of the bus each subsystem can and should use. This is really our first attempt at these “back of the envelope” calculations and we should have done them long ago.

Anyway, enough with the recriminations and on to the data. First, here are the bus bandwidth rates we are using:

| Bus | Expected perf rate (GB/s) | Expected rate (MB/frame at 60 FPS) | Date introduced |
| --- | --- | --- | --- |
| PCIe 16x 2.0 | 4 | 68 | 2007 |
| PCIe 16x 1.0 | 2 | 34 | 2004 |
| PCIe 8x 1.0 | 1 | 17 | 2004 |
| AGP 8x 1.0 | 1 | 17 | 2002 |
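
The per-frame column follows directly from the per-second rate. Here is a minimal sketch of that conversion; the function name is made up for illustration, and it assumes 1 GB = 1024 MB, which is what reproduces the 68 / 34 / 17 figures:

```python
# Per-frame bus budget from a sustained transfer rate.
# Assumption: 1 GB = 1024 MB, which reproduces the table's per-frame column.
MB_PER_GB = 1024


def mb_per_frame(rate_gb_per_s: float, fps: int = 60) -> float:
    """Return how many MB can cross the bus during one frame."""
    return rate_gb_per_s * MB_PER_GB / fps


# Expected (not theoretical peak) rates from the table, in GB/s.
buses = {
    "PCIe 16x 2.0": 4,
    "PCIe 16x 1.0": 2,
    "PCIe 8x 1.0": 1,
    "AGP 8x 1.0": 1,
}

for name, rate in buses.items():
    print(f"{name:<14} {mb_per_frame(rate):6.1f} MB/frame")
```

Passing a different `fps` shows how sensitive the budget is to the frame rate target: `mb_per_frame(2, fps=30)` is twice `mb_per_frame(2)`.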

OK, so let’s think about this for just the Autogen tree subsystem. In FSX, most auto-generated trees have 12 vertices, each containing 32 bytes of data[1], giving the model a size of (12 * 32 bytes) = 384 bytes per instance.

When using batching, and assuming 20% of the bandwidth of a PCIe 16x 1.0 bus (34 MB per frame) is available for auto-generated trees, it is possible to transfer (0.20 * 34 MB / 384 bytes per instance) = 18,568 batched trees per frame at 60 Hz.
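
The batched-tree estimate can be checked in a few lines. This is only a sketch of the arithmetic: the names are hypothetical, the vertex layout is the one in footnote [1], and a megabyte is assumed to be 1024 * 1024 bytes because that is what yields the 18,568 figure.

```python
import struct

# Vertex layout from footnote [1]: position (3 floats), normal (3 floats),
# texture coordinates (2 floats), 4 bytes per float.
VERTEX_BYTES = struct.calcsize("<3f3f2f")    # 32 bytes
VERTS_PER_TREE = 12
TREE_BYTES = VERTS_PER_TREE * VERTEX_BYTES   # 384 bytes per batched tree

# Assumption: binary megabytes, which matches the post's result.
BYTES_PER_MB = 1024 * 1024


def batched_trees_per_frame(bus_mb_per_frame: float, share: float) -> int:
    """Whole trees that fit in `share` of the per-frame bus budget."""
    budget_bytes = share * bus_mb_per_frame * BYTES_PER_MB
    return int(budget_bytes // TREE_BYTES)


# PCIe 16x 1.0 at 60 FPS (34 MB/frame), 20% of the bus for trees.
print(batched_trees_per_frame(34, 0.20))
```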

Now, a typical scene has 50 1km x 1km cells. Autogen trees are pegged at 4500 max per cell (when the slider is all the way to the right). You can set the max up to 6000, so let’s call it 5000 for easier math: 5000 * 50 = 250,000 trees.

Holy Autogen Batman! That is more than 13 times the 18,568 trees the bus budget allows, and yes, this is why autogen brings the system to its knees. Crysis doesn’t try to render 250k trees. If you are having a major perf problem with Autogen, try using the “max in cell” tweak to reduce the max to 500. Then your max is 500 * 50 = 25,000.
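
To see how far apart demand and budget are, here is a small sketch comparing the worst-case tree count with the batched budget worked out earlier. It assumes, as the estimate in this post does, that every tree in the scene crosses the bus every frame; the names are illustrative only.

```python
CELLS_IN_SCENE = 50
BATCHED_BUDGET = 18_568  # trees per frame, from the 20%-of-bus batching estimate


def trees_in_scene(max_per_cell: int, cells: int = CELLS_IN_SCENE) -> int:
    """Worst-case number of Autogen trees in a scene."""
    return max_per_cell * cells


for max_per_cell in (5000, 500):
    demand = trees_in_scene(max_per_cell)
    ratio = demand / BATCHED_BUDGET
    print(f"max {max_per_cell}/cell -> {demand:,} trees, {ratio:.1f}x the budget")
```

Under these assumptions the default worst case overshoots the budget by more than an order of magnitude, while the 500-per-cell tweak brings it to the same order of magnitude as the budget.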

It should be obvious the same holds true for Autogen buildings.

Part of our performance work is to turn the engine from a batching engine into an instancing engine to help bridge this gap. When using instancing, and assuming 10% of the bus bandwidth is available for non-animated instance data, (0.10 * 34 MB / 48 bytes per instance) = 74,274 instances per frame at 60 Hz can be sent across a PCIe 16x 1.0 bus. If we give Autogen trees 20%, then we get roughly 150,000 trees. 3000 trees per cell (3000 * 50 = 150,000) is right about at those limits. So that change alone gets us close to "within bounds".

Given there are sliders and config entries, users can adjust their settings to the local conditions of their hardware and the style of flying they do. That is why this gap isn’t “tragic”. Still, it is a rude surprise to most people that FSX tries to do so much; that is a good part of the root of the problem, and it is why the FSX engine is so different from other engines.

And there are some other things we are doing so that the engine isn’t so overcommitted. But that is another post. :)
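
The instancing numbers work the same way, with 48 bytes per instance in place of 384 bytes per batched tree. A rough sketch follows; the names are hypothetical, the layout of the 48 bytes is not given here so it is treated as an opaque size, and binary megabytes are assumed because they reproduce 74,274.

```python
BYTES_PER_MB = 1024 * 1024   # assumption: binary megabytes
BUS_MB_PER_FRAME = 34        # PCIe 16x 1.0 at 60 FPS
INSTANCE_BYTES = 48          # per-instance size quoted above; layout not specified
CELLS_IN_SCENE = 50


def instances_per_frame(share: float) -> int:
    """Whole instances that fit in `share` of the per-frame bus budget."""
    budget_bytes = share * BUS_MB_PER_FRAME * BYTES_PER_MB
    return int(budget_bytes // INSTANCE_BYTES)


at_10_percent = instances_per_frame(0.10)
at_20_percent = instances_per_frame(0.20)
print(at_10_percent)                    # instances at 10% of the bus
print(at_20_percent)                    # trees at 20% of the bus
print(at_20_percent // CELLS_IN_SCENE)  # trees per cell across 50 cells
```

Because an instance is one eighth the size of a batched tree (48 vs. 384 bytes), the same share of the bus carries about eight times as many trees.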

PS.

https://support.microsoft.com/kb/555739 lists the Autogen max per cell tweak.

PPS

The slider stops correspond to the following percentages:

0, 10, 20, 45, 70, 100
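
As a rough illustration, the stops can be turned into trees per cell. This sketch assumes each percentage scales the configured max per cell, which the list above implies but does not spell out; the function name is made up.

```python
SLIDER_STOPS_PERCENT = (0, 10, 20, 45, 70, 100)


def trees_per_cell(max_per_cell: int) -> list[int]:
    """Trees per cell at each slider stop, assuming stops scale the max."""
    return [max_per_cell * pct // 100 for pct in SLIDER_STOPS_PERCENT]


print(trees_per_cell(4500))  # default max per cell
print(trees_per_cell(500))   # with the "max in cell" tweak set to 500
```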


[1] 3 floating point numbers for position, 3 for normal, and 2 for texture coordinates: ((3 + 3 + 2) * 4 bytes) = 32 bytes.

Comments

  • Anonymous
    April 17, 2008
    Great post Phil, and one of the reasons we rework AG in GEX the way we do. We look at the objects and add up the weight and then try to maintain a percentage which will play nice with the rest of a typical render. It’s really great to see you guys are looking at all this now. Very uplifting to know things are being considered based on REALITY instead of just FEATURES. I knew you were the man for this job!!! I look forward to your next installment on changes. Also, don’t forget to test it in combat. The theory is great, but it must always be tested on different systems to be sure the design is in fact on the right track :)

  • Anonymous
    April 17, 2008
    Hi Phil! I concur with Nick, great post and insight indeed. Especially reading the last part; going from batching to (semi) instancing, got me very excited! Thanks! :)

  • Anonymous
    April 17, 2008
    Hi Phil, thanks for all the work that you've put into FSX. Talking of autogen and its impact on performance, there is one thing I never quite understood. When rendering a forest, you render thousands of individual trees. Wouldn't it be quite a lot easier to render a forest? I'm thinking of a rectangular or triangular object of let's say 500 by 500 meters with a forest texture? This should save a considerable amount of performance. Best, Matthias

  • Anonymous
    April 18, 2008
    In response to Matthias, I've also thought that rendering a forest would make more sense; not only would it hugely reduce the polygon count it would probably look more realistic as well. In real life the only time you can see individual trees in a forest of reasonable density is when you're actually in it. Colin

  • Anonymous
    April 18, 2008
    Imposter systems can be tricky too. And we'd have to be re-writing that for Trains to get the 10 foot experience so in the long run this is the right way to author the feature.

  • Anonymous
    April 18, 2008
    Hi Phil, I was under the impression, from reading your post at the end of last year, that you had delivered on and fulfilled your promise with the release of SP1 + 2 and that you were moving on to more pressing things. Was I wrong? So what are we talking about, reading the above posts... SP3? I am suddenly excited!

  • Anonymous
    April 21, 2008
    SteveyB: this discussion is about progress made on TrainSim2, not about a SP3 for FSX, there is no plan for an additional service pack.

  • Anonymous
    April 21, 2008
    Ted: I used 60 just as a point of discussion, yes if you cut the FPS down to 30 that effectively doubles the amount you can push.

  • Anonymous
    April 21, 2008
    Phil, Thanks for this post.  I'm glad to see this sort of thing being done.  Knowing the practical limitations of what you are trying to engineer is a very valuable thing!  Alas, I'm no engine programmer or 3D guru at all, so can you tell me what batching versus instancing are, or at least tell me where I can learn? Keep up the good work and informative posts. Mark

  • Anonymous
    April 21, 2008
    VSS: batching vs instancing can be read up on in the DX SDK or a search with "Direct3D batching instancing".

  • Anonymous
    May 04, 2008
    "Crysis doesn’t try to render 250k trees." Nope, but it does render individual, animated leaves, animated vegetation, has dynamic realtime soft shadows on everything, full HDR rendering etc. :) A tree in Crysis and a tree in FSX are completely different things. But I get the point, you can't use a graphics engine intended for shooters in a flight sim, but the opposite holds true too. Also with the low draw distance of autogen, I very much doubt you see that many trees. Forests in FSX look fairly dense close up, but the trees quickly thin out farther from the viewer.