How Do The Script Garbage Collectors Work?

UPDATE: This article was written in 2003. Since that time the JScript garbage collector has been completely rewritten so as to be more performant in general, to handle the larger working sets entailed by modern web applications that we had absolutely no idea were coming when we designed the JScript GC back in 1995, to be better at predicting when there is garbage that needs collecting, and to be better at handling circular references involving browser objects. I did not do any of that work; I haven't worked on the script team for almost a decade now. I do not know how the modern JScript GC works; I've had the architect describe the basics to me but I am not an expert on it. This article should be considered "for historical purposes only"; it does not reflect how JScript works today.


JScript and VBScript both are automatic storage languages. Unlike, say, C++, the script developer does not have to worry about explicitly allocating and freeing each chunk of memory used by the program. The internal device in the engine which takes care of this task for the developer is called the garbage collector.

Interestingly enough though, JScript and VBScript have completely different garbage collectors. Occasionally people ask me how the garbage collectors work and what the differences are.

JScript uses a nongenerational mark-and-sweep garbage collector. It works like this:

  • Every variable which is "in scope" is called a "scavenger". A scavenger may refer to a number, an object, a string, whatever. We maintain a list of scavengers -- variables are moved on to the scav list when they come into scope and off the scav list when they go out of scope.
  • Every now and then the garbage collector runs. First it puts a "mark" on every object, variable, string, etc – all the memory tracked by the GC. (JScript uses the VARIANT data structure internally and there are plenty of extra unused bits in that structure, so we just set one of them.)
  • Second, it clears the mark on the scavengers and the transitive closure of scavenger references. So if a scavenger object references a nonscavenger object then we clear the bits on the nonscavenger, and on everything that it refers to. (I am using the word "closure" in a different sense than in my earlier post.)
  • At this point we know that all the memory still marked is allocated memory which cannot be reached by any path from any in-scope variable. All of those objects are instructed to tear themselves down, which destroys any circular references.
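
The steps above can be sketched as a toy collector. This is only an illustration under stated assumptions: the names `makeObj`, `collect`, `marked` and `refs` are hypothetical, objects are plain JavaScript records rather than VARIANTs, and every scavenger is assumed to be tracked in the heap list.

```javascript
// Toy heap entry: a mark bit plus the list of objects it refers to.
function makeObj(name) {
  return { name: name, marked: false, refs: [] };
}

// Runs one collection. `heap` is every tracked object; `scavengers` are the
// in-scope roots (assumed to be in `heap` too). Returns the survivors.
function collect(heap, scavengers) {
  var i;
  // Step 1: put a mark on everything the collector tracks.
  for (i = 0; i < heap.length; i++) {
    heap[i].marked = true;
  }
  // Step 2: clear the mark on the scavengers and on everything reachable
  // from them (the transitive closure of their references).
  var pending = scavengers.slice();
  while (pending.length > 0) {
    var obj = pending.pop();
    if (!obj.marked) continue;      // already visited
    obj.marked = false;
    for (i = 0; i < obj.refs.length; i++) {
      pending.push(obj.refs[i]);
    }
  }
  // Step 3: whatever is still marked is unreachable, so tear it down.
  var survivors = [];
  for (i = 0; i < heap.length; i++) {
    if (heap[i].marked) {
      heap[i].refs = [];            // dropping references breaks any cycle
    } else {
      survivors.push(heap[i]);
    }
  }
  return survivors;
}

// Usage: a refers to b; c and d form a cycle that no scavenger can reach.
var a = makeObj("a"), b = makeObj("b"), c = makeObj("c"), d = makeObj("d");
a.refs.push(b);
c.refs.push(d);
d.refs.push(c);
var live = collect([a, b, c, d], [a]);
```

In this run only `a` and `b` survive; the `c`/`d` cycle is torn down even though each still referred to the other.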

Actually it is a little more complex than that, as we must worry about details like "what if freeing an item causes a message loop to run, which handles an event, which calls back into the script, which runs code, which triggers another garbage collection?" But those are just implementation details. (Incidentally, every JScript engine running on the same thread shares a GC, which complicates the story even further.)

You'll note that I hand-waved a bit there when I said "every now and then..." Actually what we do is keep track of the number of strings, objects and array slots allocated. We check the current tallies at the beginning of each statement, and when the numbers exceed certain thresholds we trigger a collection.
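
A rough sketch of that bookkeeping follows. The limits are the figures quoted in the comments further down; the function names, the tally object, and the idea of resetting all three counters after a collection are assumptions made for illustration, not the engine's actual code.

```javascript
// Limits quoted in the comment thread; treat them as illustrative.
var LIMITS = { vars: 0x100, arraySlots: 0x1000, stringBytes: 0x10000 };

function makeTally() {
  return { vars: 0, arraySlots: 0, stringBytes: 0, collections: 0 };
}

// Called by a (hypothetical) allocator whenever something is allocated.
function noteAllocation(tally, kind, amount) {
  tally[kind] += amount;
}

// Called at the beginning of each statement. Triggers a collection when any
// tally has passed its limit since the previous collection.
function beforeStatement(tally, runCollection) {
  if (tally.vars > LIMITS.vars ||
      tally.arraySlots > LIMITS.arraySlots ||
      tally.stringBytes > LIMITS.stringBytes) {
    runCollection();
    tally.vars = 0;
    tally.arraySlots = 0;
    tally.stringBytes = 0;
    tally.collections++;
    return true;
  }
  return false;
}

// Usage: reaching a limit exactly does nothing; passing it collects.
var tally = makeTally();
noteAllocation(tally, "stringBytes", 0x10000);
var first = beforeStatement(tally, function () {});
noteAllocation(tally, "stringBytes", 1);
var second = beforeStatement(tally, function () {});
```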

The benefits of this approach are numerous, but the principal benefit is that circular references are not leaked unless the circular reference involves an object not owned by JScript.

However, there are some downsides as well. Performance is potentially not good on large-working-set applications -- if you have an app where there are lots of long-term things in memory and lots of short-term objects being created and destroyed then the GC will run often and will have to walk the same network of long-term objects over and over again. That's not fast.

The opposite problem is that perhaps a GC will not run when you want one to. If you say "blah = null" then the memory owned by blah will not be released until the GC releases it. If blah is the sole remaining reference to a huge array or network of objects, you might want it to go away as soon as possible. Now, you can force the JScript garbage collector to run with the CollectGarbage() method, but I don't recommend it. The whole point of JScript having a GC is that you don't need to worry about object lifetime. If you do worry about it then you're probably using the wrong tool for the job.
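
For completeness, here is a minimal sketch of what that looks like. `CollectGarbage` is the JScript function named above; the `typeof` guard and the helper name `forceCollectionIfAvailable` are assumptions added so the snippet does nothing harmful on hosts that lack it.

```javascript
// Forces a collection only if the host actually exposes CollectGarbage.
function forceCollectionIfAvailable() {
  if (typeof CollectGarbage === "function") {
    CollectGarbage();
    return true;
  }
  return false;
}

var blah = new Array(100000);   // stand-in for a huge array or object network
blah = null;                    // drops the last reference; nothing is freed yet
var forced = forceCollectionIfAvailable();
```

The advice above still applies: if code like this feels necessary, that is a hint the job may call for a different tool.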

VBScript, on the other hand, has a much simpler stack-based garbage collector. Scavengers are added to a stack when they come into scope, removed when they go out of scope, and any time an object is discarded it is immediately freed.

You might wonder why we didn't put a mark-and-sweep GC into VBScript. There are two reasons. First, VBScript did not have classes until version 5, but JScript had objects from day one; VBScript did not need a complex GC because there was no way to get circular references in the first place! Second, VBScript is supposed to be like VB6 where possible, and VB6 does not have a mark-n-sweep collector either.

The VBScript approach pretty much has the opposite pros and cons. It is fast, simple and predictable, but circular references of VBScript objects are not broken until the engine itself is shut down.
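
To see why immediate freeing is predictable but cannot break cycles, consider a toy reference-counting model. This is an assumption for illustration (VBScript objects are COM objects, and COM lifetimes are reference counted), not a description of the engine's internals; `makeCounted`, `addRef`, `release` and `freed` are hypothetical names.

```javascript
var freed = [];   // names of freed objects, in the order they were freed

function makeCounted(name) {
  return { name: name, refCount: 0, fields: {} };
}

function addRef(obj) {
  obj.refCount++;
  return obj;
}

function release(obj) {
  obj.refCount--;
  if (obj.refCount === 0) {
    freed.push(obj.name);
    // Freeing an object releases everything it was holding.
    for (var key in obj.fields) {
      release(obj.fields[key]);
    }
    obj.fields = {};
  }
}

// Non-circular: a variable holds x, and x holds y.
var x = addRef(makeCounted("x"));            // variable comes into scope
x.fields.child = addRef(makeCounted("y"));
release(x);                                  // out of scope: x and y are freed at once

// Circular: p and q hold each other.
var p = addRef(makeCounted("p"));
var q = addRef(makeCounted("q"));
p.fields.other = addRef(q);
q.fields.other = addRef(p);
release(p);                                  // both variables go out of scope...
release(q);                                  // ...but each count is stuck at 1
```

After this runs, `freed` contains only `x` and `y`; `p` and `q` keep each other alive, which is the leak described above.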

The CLR GC is also mark-n-sweep, but it is generational: the more collections an object survives, the less often it is checked for life. This dramatically improves performance for large-working-set applications. Of course, the CLR GC was designed for industrial-grade applications, whereas the JScript GC was designed for simple little web pages.

What happens when you have a web page, ASP page or WSH script with both VBScript and JScript? JScript and VBScript know nothing about each other's garbage collection semantics. A VBScript program which gets a reference to a JScript object just sees another COM object. The same goes for a VBScript object passed to JScript. A circular reference between VBScript and JScript objects would not be broken and the memory would leak (until the engines were shut down). A noncircular reference will be freed when the object in question goes out of scope in both languages (and the JScript GC runs).

Comments

  • Anonymous
    February 11, 2004
    Eric, we've coded an entire framework in JS to do as much work as possible in the client without having to post back to the server every time. So our case doesn't exactly fit in the "simple little web pages"... Do you know of something that can help us, for example in tuning the GC... registry keys, anything?

  • Anonymous
    February 12, 2004
    As I said in my later article ("Thin to my Chagrin"), you are not alone. A lot of people use the wrong tool for the job and end up in a bad way.

    Unfortunately, if you need to tune the garbage collector, that's a sign that you're using the wrong tool.

    Every tool has tradeoffs. The JScript garbage collector was simply not designed to be performant in situations where there is a large working set and frequent allocations and releases.

    If you need fine-grained control over memory usage of a complex application, there are lots of good languages for that -- C, for example.

  • Anonymous
    July 24, 2004
    Who can solve the following problem: VBScript runs out of memory?

    Dim oHTTP

    i = 0
    Call start()
    wscript.sleep 500

    Sub start()
        i = i + 1
        Set oHTTP = CreateObject("Microsoft.XMLHTTP")

        oHTTP.Open "GET", "http://www.yahoo.com", False

        oHTTP.send

        If oHTTP.statusText = "OK" Then
            wscript.echo i & " OK"
        Else
            wscript.echo "Error: " & oHTTP.statusText
        End If

        'release system resource
        Set oHTTP = nothing
        wscript.sleep 500
        Call start()
    End sub

    wscript.Quit(0)

  • Anonymous
    July 24, 2004
    I can simplify your repro:

    start
    sub start
    start
    end sub

    You have an infinite recursive descent. Don't do that -- it will eat all the stack and then error out.

  • Anonymous
    June 06, 2005
    Avoid deadlocks when tearing down JScript in your script host by calling into the IActiveScriptGarbageCollector implementation.

  • Anonymous
    July 08, 2005
    Tim,

    JavaScript was initially designed exactly for "simple little web pages". Obviously, we are using JavaScript for far more complicated projects. We are using JavaScript for purposes that Java was originally designed for (at least with respect to the web) -- safe, mobile code.

    So, nobody should get offended by that statement, although some will, anyway.

  • Anonymous
    August 31, 2005
    In the article you mentioned "We check the current tallies at the beginning of each statement, and when the numbers exceed certain thresholds we trigger a collection". What is the threshold limit in IE for various versions of browsers and Windows OS?

  • Anonymous
    August 31, 2005
    The heuristics are we do a GC on the next statement after any one of the following limits are passed since the previous GC:

    0x100 variables/temps/etc allocated
    0x1000 array slots allocated
    0x10000 bytes of strings allocated

    The array slot heuristic was not present in all versions of JScript -- we discovered that some ASP pages were producing ENORMOUS integer arrays over and over again and never triggering a collection because they were always assigned to the same variables and never allocating strings. But aside from that, the heuristic has been pretty much the same in all versions. As you can see it's a pretty naive heuristic.

  • Anonymous
    February 27, 2006
    So... we shouldn't use JavaScript for major client-side coding because it's the wrong tool... and it's the wrong tool by design--because nobody uses JavaScript for major client-side coding...

    Circular arguments like this one eventually get garbage-collected, you know.  ;)

  • Anonymous
    March 05, 2006
    So what's going to happen in IE7?  It's clear that JS is no longer a language for tiny webpages: people are building fully functional applications with it.  JScript's memory management performance is one of the bigger difficulties people face in making such apps run on IE6.

    I'm sure you know that there are much better GC algorithms than mark-and-sweep, and just tweaking the heuristics could make life much better.  Can we look forward to an improvement here?  

    Folks here are claiming that the JScript GC time cost is exponential in the amount of data: http://blogs.ebusiness-apps.com/dave/?p=45#comment-500
    From your description it seems like it should be n^2 rather than exp(n) but it still amounts to a wall that JS applications on IE6 are hitting when they reach a certain level of complexity.

  • Anonymous
    March 05, 2006
    Sorry, that heuristic is not naive, it's just plain wrong.  Allocation of particular types or slots does not predict garbage, and the magnitudes are wrong compared to the cost of GC'ing a large live object graph.

    I too hope IE7 can fix this bug.  What are the chances at this point?

    /be

  • Anonymous
    March 07, 2006
    Hey Brendan,

    I agree with you that these heuristics are broken and have been for a long time.  Unfortunately, it's been out of my hands for some time now, and I cannot say whether the team that is taking over ownership of the legacy script code for future releases is going to do any work on the GC or not.  They haven't told me their plans.

  • Anonymous
    March 07, 2006
    And yes, to address the earlier comment, the GC behaviour is n-squared time complexity as the working set grows, both theoretically and practically.

  • Anonymous
    January 11, 2007
    For Vista/IE 7 (JScript 5.7) we did improve the heuristics for the JScript garbage collector. For some applications, the performance improvement is dramatic. Here's how it works: The initial thresholds and items counted are the same, but for vars and slots, they double (up to a large maximum) each time a collection recovers less than 15% of the outstanding items. The collector thereby roughly sizes itself to an app's working set as it grows. When a collection recovers more than 85% of the items, the counts are reset to the starting default. The threshold on total bytes in the SysAllocString string space does not adapt - just the vars and slots. To trim memory usage once the app has reached steady-state, a collection is also triggered every 10 seconds. For the timer-triggered collections, the thresholds are not changed.
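
The adaptive rule described in this comment might be sketched as follows. Several details are assumptions, since the comment does not give them: the value of `MAX_LIMIT` (it only says "a large maximum"), the use of a single recovery fraction for both vars and slots, and all of the names.

```javascript
var DEFAULTS = { vars: 0x100, slots: 0x1000 };
var MAX_LIMIT = 0x1000000;   // assumption: the real maximum is not stated

function makeLimits() {
  return { vars: DEFAULTS.vars, slots: DEFAULTS.slots };
}

// Adjusts the limits after a collection.
// `outstanding` is the item count before the collection, `recovered` the
// number of items freed. Timer-triggered collections never change limits.
function adaptLimits(limits, outstanding, recovered, timerTriggered) {
  if (timerTriggered || outstanding === 0) {
    return limits;
  }
  var fraction = recovered / outstanding;
  if (fraction < 0.15) {
    // Mostly live data: collect less often by doubling the limits.
    limits.vars = Math.min(limits.vars * 2, MAX_LIMIT);
    limits.slots = Math.min(limits.slots * 2, MAX_LIMIT);
  } else if (fraction > 0.85) {
    // Mostly garbage: go back to the starting defaults.
    limits.vars = DEFAULTS.vars;
    limits.slots = DEFAULTS.slots;
  }
  return limits;
}

// Usage: a collection that recovers 10% doubles both limits; a
// timer-triggered collection leaves them alone.
var limits = makeLimits();
adaptLimits(limits, 1000, 100, false);
adaptLimits(limits, 1000, 500, true);
```

The string-bytes threshold is left out on purpose, because the comment says it does not adapt.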

  • Anonymous
    March 14, 2007
    Hey Eric, can you clarify what the thresholds control? There seems to be a little difference (mainly around the GcValTrigger) between what you mention above and the MS KB Hotfix article which seems to reference this issue.

    From http://support.microsoft.com/kb/919237

    0x100 GcVarTrigger, Variables Allocated
    0x1000 GcValTrigger, Literal Values Allocated
    0x10000 GccbSysTrigger, String Bytes Allocated

    From your comments above

    0x100 variables/temps/etc allocated (Does this include literals? String literals?)
    0x1000 array slots allocated (assuming this is GcValTrigger, hence my confusion)
    0x10000 bytes of strings allocated

    I think this clarification would be useful. For example, if I knew that it was the literal threshold which was constantly being hit due to my code structure, I could look at optimizing code by setting the literals up as constants outside of my critical path/loop.

    Unfortunately, some of the things you'd change to reduce the number of times you hit the GC thresholds (like reducing short term variables/objects) have other performance impacts (like scope lookup) so it seems like it's a bit of a balancing act - but all I'm saying is that it would be good to have accurate info before starting off down this path.

    NOTE: I agree that JS developers really should not be trying to write code centered around GC performance, especially due to the balancing act above, but the reality is that there are an ever increasing number of complex JS based web applications out there with IE 6 support requirements and it would be really beneficial to find out what impact code structure has - since it's the only factor under the developer's control.

  • Anonymous
    March 30, 2007
    Hi, I have a question about JavaScript. I have an array, and I need to do three operations on it:

  1. Add elements (added as and when I get updates from the database)
  2. Do some processing (without changing the array values, just iterating over them and using them)
  3. Since the work is done (these elements are used only once), remove them. Basically, free up the memory.

    For this third point, is it OK if I just do myArray=new Array(); Will this free up the memory used for the previous elements? Since I am not sure about this, what I am doing is: myArray=null; myArray=new Array(); Please suggest which one is correct. Regards, Archana.

  • Anonymous
    March 30, 2007
    First, if your array is so large that you feel that for performance reasons, you need to manage the memory yourself, then you are using the wrong programming language. You should be using a language like C that lets you have fine-grained control over memory management. So my advice in this case is "don't worry about it". If JScript works for you for this problem then let JScript manage your memory.

    Second, supposing that you still do wish to free up the memory for performance reasons, well, if it is a local variable then it will be collected when the local variable goes out of scope. So, don't worry about it. If it is a global variable then... why are you using a global variable to store huge amounts of memory that you are processing temporary data with??? Stop doing that, use a more modular design to put this processing into a helper function, and let the script engine GC deal with scavenging the locals when they go out of scope. So again, my advice is "let the script engine manage your memory".

    If that still doesn't dissuade you: to let the GC know that an array is available for scavenging, all you need to do is remove the last live reference to the array. It doesn't matter what the former live reference refers to afterwards, just so long as it isn't the thing you're trying to free. You can set it to null, zero, "your mother", whatever you want. null seems like a reasonable choice.

    Of course, getting rid of the last live reference STILL doesn't do a collection. You have to wait for the GC to run. If you need to force the GC to run, see point one. You are using the wrong language. If that still doesn't dissuade you, call the CollectGarbage method.
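
The "helper function" advice could look something like this minimal sketch. The function name `processBatch` and the summing step are hypothetical stand-ins for whatever processing the application does.

```javascript
// Handles one batch of updates. The working array is a local variable, so
// once the function returns nothing refers to it and the collector is free
// to reclaim it whenever it next runs.
function processBatch(updates) {
  var items = [];                        // local, not global
  for (var i = 0; i < updates.length; i++) {
    items.push(updates[i]);              // 1. add elements as updates arrive
  }
  var total = 0;
  for (var j = 0; j < items.length; j++) {
    total += items[j];                   // 2. process without changing them
  }
  return total;                          // 3. `items` goes out of scope here
}

var result = processBatch([1, 2, 3]);
```

No explicit `myArray = null` is needed in this arrangement, because the only reference to the array disappears with the function's scope.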

  • Anonymous
    April 12, 2007
    Unfortunately when the JScript engine was being "tweaked" for the IE 7 release a statement limit bug was introduced.  Starting with jscript.dll 5.7.0.5730 dated Oct 17, 2006, a code block is now limited to 32767 statements.  It was easily reproduced by 2 MSVPs.  See my comments at http://www.microsoft.com/communities/newsgroups/list/en-us/default.aspx?query=bug&dg=microsoft.public.scripting.jscript&cat=en_us_d7935fc4-a2a7-4ca3-be64-b36165171379&lang=en&cr=us&pt=&catlist=&dglist=&ptlist=&exp=&sloc=en-us.  The thread is called "Statement limit bug".

  • Anonymous
    April 30, 2007
    People are already aware of IE-JScript circular memory leak problem. Let me take some time and explain

  • Anonymous
    May 04, 2007
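    Yesterday I found a bug that can cause a memory leak in IE's JScript engine, along with the code that triggers it. Afterwards two readers, 问题男 and Laser.NET, contributed a lot of very meaningful discussion, and of course ccBoy also offered quite a few suggestions. However, ccBoy was more concerned with the efficiency of innerHTML and appendChild and passed over the memory leak in a sentence, as if it were no big deal at all.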

  • Anonymous
    June 01, 2007
    We are developing a large client side executing application using javascript.  You are making it sound like it is just the nature of the beast and we shouldn't use javascript.  I would accept that, except Firefox performs extremely well with our application: about 6 to 10 times faster. I don't accept it is a limitation of javascript; it is a limitation of the javascript engine inside of IE.  I appreciate the good info and we are auditing our IE 6 implementation to see if we can do anything.

  • Anonymous
    June 01, 2007
    Bill Robertson: "Javascript" is a trademark of Sun Microsystems and is the name of the implementation of ECMAScript created by Netscape and those who inherited their code. I have never worked on or with Javascript, I have only rarely run a Javascript program. I know very little about it. I write a blog about JScript, which is Microsoft's implementation of the ECMAScript programming language.  I know rather a lot about that. Unless specifically otherwise noted, my comments are always about JScript, the Microsoft implementation of ECMAScript.  They should not be construed to be about other third-party implementations which I have little experience with or knowledge of.

  • Anonymous
    July 03, 2007
    @ Bill Robertson: 6 to 10 times faster ECMAScript implementation in Firefox might mean that its script engine was designed with different objectives than JScript in IE, i.e. designed to work not only "for simple little web pages", but "for industrial-grade applications" as well :)

  • Anonymous
    September 17, 2007
    @Mike U.: ECMAScript implementation in Firefox SHOULD be designed with different objectives, because Firefox extensions are written in ECMAScript too and I suppose they share the same garbage collector.

  • Anonymous
    April 23, 2008
    Hello friends, today I am going to talk about some of the garbage collector improvements we have made.

  • Anonymous
    May 03, 2008
    Good to see that the three thresholds are fixed, but this website doesn't render this good. All text runs through the screen to the right.

  • Anonymous
    May 29, 2008
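    Collected notes on JavaScript memory leaks: ordinary circular-reference memory leaks and closure memory leaks. To understand JavaScript memory leaks, the first thing to understand is how the JavaScript GC works. ...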

  • Anonymous
    June 06, 2008
    I too hope this bug will be fixed in IE8

  • Anonymous
    June 11, 2008
    Hi Eric, I have a question.

    function makecb(ele) {
        return function () {
            alert(ele);
        };
    }
    document.getElementById("div1").onclick = makecb(document.getElementById("div1"));

    So this is a memory leak, since the div DOM/COM object is not managed by the JScript GC and will be neither marked nor swept. But given the circular reference div1 -> JS closure -> div1, will the "JS closure" created by makecb be GCed or not? There doesn't seem to be any "scavenger" referencing it.

  • Anonymous
    June 12, 2008
    In IE7 and earlier: The garbage collector will release it, but of course div1 is also holding onto a reference to the closure. Only when all the references are gone will it go away. But we have a circular reference, so the reference will never go away. Therefore, memory leak.

    In IE8: The garbage collector has been rewritten so that IE-owned objects participate in the garbage collection process. This should not leak anymore.
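
For the IE7-and-earlier case, two commonly used ways of avoiding the cycle are sketched below. This is an illustration rather than anything taken from the reply above: `div1` is a plain object standing in for the DOM element so the snippet runs without a browser, and `describeSelf` and `detachHandler` are hypothetical names.

```javascript
// Stand-in for a DOM element; in a page this would come from
// document.getElementById("div1").
var div1 = { id: "div1", onclick: null };

// Option 1: the handler does not capture the element. It relies on `this`,
// which is the element when the handler is invoked as element.onclick().
function describeSelf() {
  return "clicked " + this.id;
}
div1.onclick = describeSelf;     // element -> function, but no function -> element

var message = div1.onclick();    // simulate the click

// Option 2: if a closure over the element is unavoidable, break the cycle
// explicitly (for example from an unload handler) by clearing the property.
function detachHandler(element) {
  element.onclick = null;
}
detachHandler(div1);
```

Either way the element and the script object no longer refer to each other, so neither collector is asked to see through the other's references.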

  • Anonymous
    June 12, 2008
    I hope IE8 will adopt international standards. Then it shouldn't be a problem, just like in Firefox.

  • Anonymous
    July 02, 2008
    The garbage collector distribution includes a C string package that provides for fast concatenation and substring operations on long strings. A simple curses- and win32-based editor that represents the entire file as a cord is included as a sample application.

  • Anonymous
    July 07, 2008
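    Ordinary circular-reference memory leaks and closure memory leaks. To understand JavaScript memory leaks, the first thing to understand is how the JavaScript GC works. I remember that in the rhino book, "JavaScript: The De...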

  • Anonymous
    September 15, 2008
    Have you seen Google's Chrome?  They have generational garbage collection for JavaScript.  Looks like they'd like to see a trend toward rich clients written in JavaScript.

  • Anonymous
    October 02, 2008
    All these different ways of coding and different programming languages. I wonder where we would be if everybody would be using the same computer language...

  • Anonymous
    November 27, 2009
    Hi, We're trying to follow a specific API which does not copy and return an object but which sorts it (i.e., along with its keys) by reference--necessitating our deleting properties and then adding them back. I know iteration order is not guaranteed, but all major browsers seem to usually iterate via for...in starting with the first property added. However, in IE, when such properties are deleted and then added back, they still go back into the same relative order as before. I was wondering whether it were possible to force object properties to be truly deleted (i.e., so that a function can add back properties without them falling into the same relative position as before). To give a simple example:

    function reverseObj (obj) {
        var keys = [], vals = [];
        for (var p in obj) {
            keys.push(p);
            vals.push(obj[p]);
            delete obj[p];
        }
        keys.reverse();
        vals.reverse();
        for (var i=0; i < keys.length; ++i) {
            // In IE, adds back in same position as before; other browsers append to the end (the latter properties will show up last when iterating in a for...in loop in these other browsers)
            obj[keys[i]] = vals[i];
        }
    }

    Thanks!

  • Anonymous
    December 13, 2009
    Hi Eric, it's been a good post. I have a question. Think of <input click ="javascript function". Upon GC or page unload, the function is not freed from memory because of the dependency between the DOM element and the JavaScript. If it is held in memory, is it possible to access or identify it, or to get a hook to the method, in the reloaded page? Or is there any means to get a hook to access the leaked JavaScript objects?

  • Anonymous
    January 22, 2010
    Good point, I hope IE8 will adopt international standards too. It shouldn't be a problem anymore, just like in Firefox.

  • Anonymous
    February 08, 2010
    Have you seen Google's Chrome lately? It has a generational garbage collector for JavaScript.