Variable names in C# Part 3

How descriptive should a variable name be (and when should you abbreviate)?

This entry on internal naming convention in C# covers one of the more esoteric elements of stylistic convention.  When is it appropriate to abbreviate, and when is it appropriate to give variables extremely descriptive names?  This topic can be broken down into a few key categories:

  • Local variables
  • Containers
  • Member categorization

Local Variables

Most local names fall into a few basic categories -- parameter names, indexers, temporary state, and temporary return values.  Indexers are probably the easiest to start with.  Code tends to use indexers a lot; when stepping through an array or list of objects, there may be multiple places where an indexer is used.  Brevity is often the key, and can help indicate the nature of the variable.  For most loops, I recommend the i, j, k approach.  The notation is standard in linear algebra, and is common enough to almost be a coding standard in and of itself.  I would avoid index as a local variable name; it's difficult to nest in a logical way, and it makes lines of indexing code less readable.  Simply indicate the level of a nested indexer by its name.

This rule is not all-encompassing, though.  Sometimes it may seem logical to index using the variable names row and col.  These names are short and maintain the brevity rule, while giving a logical clue as to what is being indexed.  I would not use column as a variable name because it creates a visual inconsistency with row.

One thing that I often see used inappropriately (particularly in my work on Managed DirectX) is the set of indexes x, y, z.  The only time these should be used is when your data set is specifically axis-aligned, which is a much less common situation than usually thought.  If I'm doing a lookup into a 2D image bitmap, I would always use i, j variable names.  First, there may not be a guarantee of final bitmap orientation (I'm thinking of the image as a contiguous piece of data in memory, not a "picture").  Second, if I'm treating a bitmap as a single-dimension array, i makes a lot more sense than x.  If I were indexing into screen coordinates, this might be an acceptable time to use x, y names, because that is a system that indicates a specific orientation.
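
As a rough sketch of the indexer conventions above (the class, method names, and sample data are made up for illustration and are not from any particular project), nested loops can use i and j, while a lookup into a flat, row-major buffer can use row and col:

```csharp
using System;

static class IndexerNamingSketch
{
    // Nested loops: the indexer name (i, then j) indicates the nesting level.
    static int SumNested(int[][] blocks)
    {
        int total = 0;
        for (int i = 0; i < blocks.Length; i++)
        {
            for (int j = 0; j < blocks[i].Length; j++)
            {
                total += blocks[i][j];
            }
        }
        return total;
    }

    // Flat, row-major buffer: row and col hint at what is being indexed
    // without implying screen-space axes the way x and y would.
    static int SumRow(int[] pixels, int width, int row)
    {
        int total = 0;
        for (int col = 0; col < width; col++)
        {
            total += pixels[row * width + col];
        }
        return total;
    }

    static void Main()
    {
        int[][] blocks = { new int[] { 1, 2 }, new int[] { 3 } };
        int[] pixels = { 1, 2, 3, 4, 5, 6 };      // 2 rows of 3 values

        Console.WriteLine(SumNested(blocks));     // 1 + 2 + 3 = 6
        Console.WriteLine(SumRow(pixels, 3, 1));  // 4 + 5 + 6 = 15
    }
}
```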

Parameter names are some of the trickiest locally scoped variables to get right.  This is particularly true for public members (esp. constructors) that set data in some way.  If a parameter has the exact same name as the internal member that it is supposed to fill, how do you make the set code readable?  It's possible to use:

public void Func(object theData) { this.theData = theData; }

While it maintains the descriptiveness of your exposed parameter names, it introduces a somewhat inappropriate use of the this keyword.  There are a few common ways to avoid this problem.

  • Set by category.  If your set function is called SetInstanceData and will be setting an internal member called instanceData, label the parameter data.
  • Use a member-prefix.  I don't use this very often, but there are definitely times that it might be helpful.  If your member is called _instanceData and your parameter is called instanceData, then you don't have a problem.
  • Use a parameter prefix/suffix.  You can use an affix to make the parameter name more descriptive than the member data.  A single letter example might be adding an i prefix for in parameters and an o prefix for outputs.
    • I'm not a fan of this method, but it can be used to good effect on projects with lots of implementation in the public functions.  Any time this kind of issue pops up, I always ask myself if there's not some better way to encapsulate to improve overall maintainability first. 
    • Using an o prefix for an output, for example, might cause a developer to erroneously leave out the ref or out keywords.  This turns a style issue into a potential code defect.
  • Shorten the parameter name.  The function prototype includes type information and additional information gleaned from the function name.  You should be able to abbreviate or otherwise reduce the amount of information in a parameter name. This is the method I use in most cases, and makes for the most visually appealing code (though some would argue that it might be less maintainable because of the lack of scope indication).
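
A minimal sketch of two of the options above, set by category and the member prefix.  The Instance class, its members, and the ToString override are hypothetical and exist only to make the example runnable:

```csharp
using System;

class Instance
{
    private object instanceData;   // plain camelCase member
    private string _ownerName;     // member-prefix variant

    // Set by category: the method name already says "InstanceData",
    // so the parameter can simply be called data.
    public void SetInstanceData(object data)
    {
        instanceData = data;
    }

    // Member prefix: the parameter and the member differ by the underscore,
    // so no this qualifier is needed.
    public void SetOwnerName(string ownerName)
    {
        _ownerName = ownerName;
    }

    public override string ToString()
    {
        return _ownerName + ": " + instanceData;
    }
}

static class Program
{
    static void Main()
    {
        Instance instance = new Instance();
        instance.SetInstanceData(42);
        instance.SetOwnerName("sample");
        Console.WriteLine(instance);   // sample: 42
    }
}
```

Neither setter needs this to disambiguate the assignment, which is the point of both conventions.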

Temporary state and return values are usually pretty easy.  If I have a function that returns an object, I often name the return value temp.   This is a personal clue, and should have minimal impact on code readability, provided you are consistent in your own code and with your team.  Temporary state values should follow basic rules for members (camelCase, no sequential capital letters).  I tend to abbreviate short-lived items or items that occur on only a few lines of code.  As long as a developer can view the entire usage of the temporary variable in a single “screen” of code, there shouldn't be ambiguity problems when abbreviating in this way.

Containers

Containers are a case where being descriptive may trump the rule about not using type-names in variable-names.  There are times when it may be appropriate to use types in member names, but such is not always the case with container classes.  Collections are the obvious time to use them -- in many cases the type of objects contained and the function of the collection are synonymous.  Sometimes it's not necessary.  If I have an ArrayList that stores Int32 values that represent user ids, then calling the collection userIds should be descriptive enough.  The non-obvious case would be generics (as found in the Visual Studio 2005 beta).  Templatized types are a new addition to the C# world, and demand some discussion.  If I create a Vector<UserInformation> that stores active users, what would make an appropriate name?  I would probably call it either activeUsers or activeUserInformation.  In this case I'd have to look at the rest of the members to make a decision.  Do I have other vectors of active user data of a different type?  In that case I may want to use the long name.  However, if there's no room for ambiguity, the short name may actually prove more readable since it indicates function (and type is assumed from declaration and compiler / IDE hints).
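
A small sketch of the container names discussed above.  It assumes List<T> from System.Collections.Generic as a stand-in for the Vector type mentioned in the text, and the UserInformation class and its fields are invented for the example:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type, kept deliberately small.
class UserInformation
{
    public int Id;
    public string Name;

    public UserInformation(int id, string name)
    {
        Id = id;
        Name = name;
    }
}

static class Program
{
    static void Main()
    {
        // The name says what the values represent; the element type is
        // already visible in the declaration.
        List<int> userIds = new List<int>();
        userIds.Add(1001);
        userIds.Add(1002);

        // Short name: no other collection of active-user data exists here,
        // so activeUsers is unambiguous.
        List<UserInformation> activeUsers = new List<UserInformation>();
        activeUsers.Add(new UserInformation(1001, "ada"));

        Console.WriteLine(userIds.Count);        // 2
        Console.WriteLine(activeUsers[0].Name);  // ada
    }
}
```

If the same class also held, say, a list of active user sessions, the longer activeUserInformation name would start to earn its keep.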

Member Categorization

Categorizing members by function or type may seem like an instant indication of problems in encapsulation and scope.  However, there are plenty of times I've found myself wanting granular representation of elements, but grouped into logical categories.  Categories tend to fall into three patterns: by type, by function, by usage.  This kind of organization is very subjective, so I'll list my thoughts as well as some uses of categorization.

  • Categorization by type
    • Description
      • Members with similar function are differentiated by their types.
      • Examples include:
        • Xml documents versus nodes
        • Labels versus their corresponding UI controls
    • My thoughts
      • I like this kind of organization with UI elements.  Often, I'll use the element type first, followed by its description.  For example, if I tend to get Text from TextBox controls all at once (like on an OK button click), I'll use names like textBoxName, textBoxAge to organize the controls by the usage implied by their type.
      • If I simply want a hint about a member's usage, I'll avoid suffixing a type name onto a variable name unless there is a substantial amount of class hierarchy consideration.  For example, a currentXmlDoc and currentXmlPositionNode might be categorically related, but will give an important hint when using inherited members.
      • Overuse of this will break naming-by-function rules, and it is easy to mistakenly use as a replacement for functional description.
  • Categorization by function
    • Description
      • Functional categorization is useful when affixing a word (or words) indicating the function of multiple elements that are similar in function.
      • Examples
        • currentUserName, currentUserId
        • particleFactory, particleCollector
        • nameTokenizer, headerTokenizer
    • My Thoughts
      • I like to use common words and phrases to link members visually.
      • Prefixing members with categorical terms can aid in IntelliSense lookup.
      • Suffix categorization is perfectly fine if lookup is not essential.
  • Categorization by usage
    • Description
      • Usage categorization is typically useful as a prefix term linking multiple elements that have similar usage patterns.  A good example is a set of elements that need to be initialized or cleaned up.  The prefix gives a strong visual hint that a similar operation must be applied to multiple variables.
      • Examples
        • resourceTextures, resourceGeometry  (in this case, we might have two elements that are linked by their need to be initialized at similar times)
    • My Thoughts
      • I find myself using this a lot, particularly for things that need to be cleaned up or re-initialized after specific events.
      • This can be mistaken for type-names, but there are cases when different types will require similar usage pattern.  The ideal time to use this convention is when categorizing several variables of different types that have similar usage patterns.
      • If using Auto-Complete / Intellisense lookups, this can improve efficiency and accuracy when variable use is localized into certain sections (like an Initialize() method that sets up several similar private members).
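
To make the function and usage categories concrete, here is a rough sketch.  The SceneSketch class, its members, and the array types are placeholders chosen so the example runs on its own; only the currentUser and resource prefixes come from the examples above:

```csharp
using System;

class SceneSketch
{
    // Categorized by function: a shared "currentUser" prefix links the pair.
    private string currentUserName;
    private int currentUserId;

    // Categorized by usage: different types, but both must be set up and
    // released together, so they share the "resource" prefix.
    private int[] resourceTextures;
    private float[] resourceGeometry;

    public void Initialize(string userName, int userId)
    {
        currentUserName = userName;
        currentUserId = userId;
        resourceTextures = new int[4];
        resourceGeometry = new float[12];
    }

    // The shared prefix makes it easy to see that nothing was missed here.
    public void Release()
    {
        resourceTextures = null;
        resourceGeometry = null;
    }

    public bool HasResources
    {
        get { return resourceTextures != null && resourceGeometry != null; }
    }
}

static class Program
{
    static void Main()
    {
        SceneSketch scene = new SceneSketch();
        scene.Initialize("ada", 1001);
        Console.WriteLine(scene.HasResources);  // True
        scene.Release();
        Console.WriteLine(scene.HasResources);  // False
    }
}
```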

 

So that's probably my most subjective entry on style.  Nothing here is strictly a rule, but it will hopefully get people thinking about how they use names in their own code.  I'm looking forward to your feedback on this one.  Only one more part to go!

Comments

  • Anonymous
    July 14, 2004
    If you use consistent schemes for member variables, arguments and locals, most of these points become somewhat moot. Specifically, I use:
    * A for parameter names
    * F for field names
    * camel case for locals
    So, properties with backing storage would go kind of like this:
    public int Value {
        get { return FValue; }
        set { FValue = value; }
    }
    and a constructor like this:
    Test(int AValue) {
        FValue = AValue;
    }
  • Anonymous
    July 14, 2004
    I use similar conventions as sebmol. As far as I remember I got used to the habit of prefixing fields with an F while programming in Delphi.

    - I use all lowercase and camelCase for locals and arguments.
    - I prefix all instance fields with an 'F'
    - I prefix all static fields with a 'G' (for 'Global')

    When looking through code written by others I really hate it when I can't see from a variable's name whether it is local (or an argument) or a field.
  • Anonymous
    July 14, 2004
    Now a question...

    Why are you set against the use of this? I find it much easier to read when looking at others' code.

    I can't see how there would be any form of performance problem as everything will map back to parameter 0 in a method in IL regardless of the use of this.

    Can you tell me a bit more?

  • Anonymous
    July 14, 2004
    I have three primary reasons, actually. None of them have to do with perf -- it's all about code maintenance.

    1. Assignment Consistency. Take the following example:

    private string userName, location, serviceName;

    public void SetUserInfo(string name, string location)
    {
        userName = name;
        this.location = location;
    }

    In this case, we have some mixed signals for someone reading this code. Do we then insist upon using this.userName = name? That would be superfluous. It's a no-win situation.

    2. Required usage.

    For me, the 'this' keyword has some very specific uses for which it is absolutely necessary, such as: using this to pass the current object as an argument, when declaring indexers, or when defining constructor overloads. The 'this' keyword is my clue for these situations. If I've done my job well and used a naming convention or solid descriptions to differentiate my members from parameters from locals, the 'this' hint becomes a distraction rather than a helper.


    3. Assignment ambiguity. See example.

    string name, shortName;
    public void SetFullName(string name)
    {
        this.name = name.ToUpper();
        this.shortName = name.Substring(0, 5);
    }

    Now here there is a very good possibility of introducing a code error simply because one developer sees a mistake here and another does not. There is no way to get intent from this block of code -- is there a mistake (oops, I forgot the this)? Simply using a different parameter name (like fullName) might reduce ambiguity in design, as below:

    string name, shortName;
    public void SetFullName(string fullName)
    {
        name = fullName.ToUpper();
        shortName = fullName.Substring(0, 5);
    }

    Now there is no question of intent, and the code is much more maintainable.