Performance Improvements for ASP.NET Shared Hosting Scenarios in .NET 4.5

Disclaimer: The article below discusses pre-release features and performance numbers. Some of these features could change in the RTM version, and the actual performance numbers could change between now and when we ship.

During .NET 4.5 development we worked to improve the performance of ASP.NET in shared hosting scenarios, where many web sites share the same machine. In such environments traffic to each site is usually low, so a site's worker process is often not running when a request arrives, and startup time becomes very important. By startup time we mean the time it takes a web site to receive a first request and respond to it when the worker process was down and the web site was already compiled, which is the most common case. We also assume that other worker processes are running other web sites, so shared components are already loaded.
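
The startup measurement described above can be sketched as timing a single first request to a site. This is only a minimal illustration, not the harness used for the numbers in this article: the function name, the `fetch` and `clock` parameters, and the URL in the usage note are all hypothetical, and it assumes the site's worker process has already been stopped before the call.

```python
import time
import urllib.request


def time_first_request(url, fetch=None, clock=time.perf_counter):
    """Return the seconds elapsed while one request is sent and fully read.

    Assumption: the caller has made sure the site's worker process is down
    (and the site is already compiled) before calling this function.
    `fetch` and `clock` are hypothetical hooks that make the sketch testable.
    """
    if fetch is None:
        def fetch(target):
            # Read the whole body so the timing covers the complete response.
            with urllib.request.urlopen(target, timeout=60) as response:
                return response.read()

    start = clock()
    fetch(url)
    return clock() - start
```

A call such as `time_first_request("http://localhost/mysite/")` (a placeholder URL) would return the elapsed seconds for that one request. Repeating the measurement and averaging is a common way to reduce noise.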

Another important metric is site density. The more sites we can fit on a box, the lower the cost for hosters. The site density metric we used was the unshared working set: the total working set minus the shared working set.
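
As a small worked example of the density metric, the sketch below computes the unshared working set from the two quantities named above, plus a very rough estimate of how many such sites fit in a given amount of memory. The function names and the numbers in the usage note are made up for illustration, and the estimate ignores the operating system, the shared pages themselves, and any other overhead.

```python
def unshared_working_set_kb(total_kb, shared_kb):
    """Unshared working set = total working set - shared working set (KB)."""
    if shared_kb < 0 or shared_kb > total_kb:
        raise ValueError("shared working set must be between 0 and the total")
    return total_kb - shared_kb


def estimated_sites(available_kb, unshared_kb_per_site):
    """Rough upper bound on sites that fit in `available_kb` of memory.

    Assumption: every site costs exactly its unshared working set and
    nothing else competes for the memory, which is optimistic.
    """
    if unshared_kb_per_site <= 0:
        raise ValueError("per-site unshared working set must be positive")
    return available_kb // unshared_kb_per_site
```

For instance, with a hypothetical total of 30,000 KB and 12,000 KB shared, `unshared_working_set_kb(30000, 12000)` gives 18,000 KB, and `estimated_sites(1_000_000, 18000)` gives 55.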

Scenarios.

The first thing we needed to validate our work was a set of scenarios that would let us measure our progress against these two metrics: startup and site density. We decided to use web applications created and used by the community, and gathered a number of applications deployed by the Web Platform Installer (Web PI).

We focused on three applications: DotNetNuke, Blogengine and IBuySpy+*. The decision was based on our planned work, which included features that deal with assemblies in the Bin folder, so we picked three apps with different levels of Bin folder use. DotNetNuke makes heavy use of assemblies in the Bin folder, Blogengine makes some use, and IBuySpy+ makes none.

Startup

We added several features during this product cycle to improve startup time. These features are the result of work by several teams at Microsoft, including CLR and Windows Server. The ASP.NET team worked on enabling them in our product. The feature list is below:

  1. Prefetch.
  2. Bin assemblies interning.
  3. Multi-core JIT.

Memory.

The features implemented to reduce memory use, and so improve site density, were:

  1. HighDensityWebHosting performance scenario.
  2. Bin assemblies interning.

Numbers.

Baseline: Windows Server 2008 R2, .NET 4.0

Results: Windows 8 Server, .NET 4.5

App Name      Metric                 Unit   Baseline    Result      Diff
blogengine    Unshared Working Set   KB     17,212.67   12,985.33   24.56%
dotnetnuke    Unshared Working Set   KB     27,383.33   16,743.33   38.86%
ibuyspyplus   Unshared Working Set   KB     17,299.33   13,744.67   20.55%
blogengine    Cold Startup Time      secs   1.63        1.09        33.05%
dotnetnuke    Cold Startup Time      secs   2.58        1.71        33.53%
ibuyspyplus   Cold Startup Time      secs   1.52        1.06        30.27%
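
The Diff values above are consistent with a relative reduction, (Baseline - Result) / Baseline. The sketch below recomputes it for the unshared working set rows; the function and variable names are my own. Recomputing the startup rows from the displayed values gives slightly different percentages (about 33.13% rather than 33.05% for blogengine), presumably because the displayed times are rounded to two decimals.

```python
def improvement_percent(baseline, result):
    """Relative reduction from baseline to result, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - result) / baseline * 100.0


# Unshared working set rows from the table above, in KB: (baseline, result).
WORKING_SET_KB = {
    "blogengine": (17212.67, 12985.33),
    "dotnetnuke": (27383.33, 16743.33),
    "ibuyspyplus": (17299.33, 13744.67),
}

if __name__ == "__main__":
    for app, (baseline, result) in WORKING_SET_KB.items():
        print(f"{app}: {improvement_percent(baseline, result):.2f}%")
```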

Hardware used:

  <Machine>
    <Hardware>
      <Processor Num="2">Intel(R) Xeon(R) CPU 5160 @ 3.00GHz</Processor>
      <Memory>4096</Memory>
      <HardDrives Num="2">
        <HardDrive>HP DF146ABAA9 SCSI Disk Device SCSI[136.73 GB]</HardDrive>
        <HardDrive>SEAGATE ST3300656SS SCSI Disk Device SCSI[279.39 GB]</HardDrive>
      </HardDrives>
    </Hardware>
  </Machine>

Other resources for the features implemented:

https://www.asp.net/vnext/whats-new#_Toc_perf

 

*IBuySpy+ is a variation of IBuySpy (a web app we use extensively on the ASP.NET perf team) that uses some of the features added to ASP.NET and the .NET Framework during the v3.5 timeframe.

Comments

  • Anonymous
    November 09, 2014
    Related questions:   1)  Does the assembly being interned have to target .NET 4.5 or later, or just the web app that is hosted in the app pool?  I.e., can I intern a net35 assembly referenced by a net45 web app/web site?   2)  I assume the interned assembly's singletons, static vars, file paths, and all resources are unique and local per app domain, with no unwanted global effect or shared memory between app pools, correct?  We would not want a per-app-domain singleton (manual or via DI container) to suddenly become a cross-app singleton instance.  I'm guessing it has to be per app domain, that each app pool's memory is isolated, and that there is no unwanted shared memory access or marshaling.