Performance Quiz #6 -- Chinese/English Dictionary reader
Raymond Chen is running a series of articles about how to build and optimize the startup time of a Chinese/English dictionary.
- Developing a Chinese/English dictionary: Introduction
- Loading the dictionary, part 1: Starting point (and then my analysis)
- Loading the dictionary, part 2: Character conversion (and then my analysis)
- Loading the dictionary, part 3: Breaking the text into lines (and then my analysis)
- Loading the dictionary, part 4: Character conversion redux (and then my analysis)
- Analyzing the managed code
- Loading the dictionary, part 5: Avoiding string copying (and then my analysis)
- Optimizing the managed version
- Loading the dictionary, part 6: Taking advantage of our memory allocation pattern (and then my analysis)
- Performance Quiz #6 -- Conclusion, Studying the Space
Truth be told, I got a look at his articles quite some time ago, as he was kind enough to ask me for comments well in advance. At the time I couldn't resist writing a managed version of the same program to see how it would do. So I encourage you to watch as Raymond works through the various steps of optimizing his program and see how it comes along.
This managed code is a line-for-line conversion of his initial program, done in the dumbest possible way, with no attempt whatsoever to optimize anything.
And then, the question of the hour: How does Raymond's program fare vs. the equivalent managed code below?
Feel free to comment on the code, the problem, or just the unfairness of it all, but please don't accuse me of concluding too much from the result of just this one benchmark :) :)
using System;
using System.IO;
using System.Text;
using System.Collections;

namespace NS
{
    class Test
    {
        // High-resolution timer from Win32, used to time the dictionary load.
        [System.Runtime.InteropServices.DllImport("Kernel32.dll")]
        private static extern bool QueryPerformanceCounter(out long lpPerformanceCount);

        [System.Runtime.InteropServices.DllImport("Kernel32.dll")]
        private static extern bool QueryPerformanceFrequency(out long lpFrequency);

        static void Main(string[] args)
        {
            long startTime, endTime, freq;

            QueryPerformanceFrequency(out freq);
            QueryPerformanceCounter(out startTime);

            // The constructor reads and parses the whole file; that is what we time.
            Dictionary dict = new Dictionary();

            QueryPerformanceCounter(out endTime);

            Console.WriteLine("Length: {0}", dict.Length());
            Console.WriteLine("frequency: {0:n0}", freq);
            Console.WriteLine("time: {0:n5}s", (endTime - startTime)/(double)freq);
        }

        class DictionaryEntry
        {
            private string trad;
            private string pinyin;
            private string english;

            static public DictionaryEntry Parse(string line)
            {
                DictionaryEntry de = new DictionaryEntry();

                // Traditional characters: everything up to the first space.
                int start = 0;
                int end = line.IndexOf(' ', start);
                if (end == -1) return null;
                de.trad = line.Substring(start, end - start);

                // Pinyin: the text between '[' and ']'.
                start = line.IndexOf('[', end);
                if (start == -1) return null;
                end = line.IndexOf(']', ++start);
                if (end == -1) return null;
                de.pinyin = line.Substring(start, end - start);

                // English: from the first '/' after the pinyin to the last '/' on the line.
                start = line.IndexOf('/', end);
                if (start == -1) return null;
                start++;
                end = line.LastIndexOf('/');
                if (end == -1) return null;
                if (end <= start) return null;
                de.english = line.Substring(start, end-start);

                return de;
            }
        };

        class Dictionary
        {
            ArrayList dict;

            public Dictionary()
            {
                // Code page 950 is Big5 (Traditional Chinese).
                StreamReader src = new StreamReader(
                    "cedict.b5",
                    System.Text.Encoding.GetEncoding(950));
                string s;
                DictionaryEntry de;
                dict = new ArrayList();

                while ((s = src.ReadLine()) != null)
                {
                    // Skip blank lines and '#' comment lines.
                    if (s.Length > 0 && s[0] != '#') {
                        if (null != (de = DictionaryEntry.Parse(s))) {
                            dict.Add(de);
                        }
                    }
                }
            }

            public int Length() { return dict.Count; }
        };
    }
}
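To make the parsing step concrete, here is a minimal sketch of what Parse pulls out of a single line. It assumes each entry looks like `traditional [pinyin] /english/`, which is the shape the IndexOf calls above imply. The sample line, the ParseDemo class and the Split helper are made up for illustration and are not part of the program being timed.

```csharp
using System;

static class ParseDemo
{
    // Mirrors the IndexOf/Substring steps of DictionaryEntry.Parse,
    // but returns the three fields in an array so they can be inspected.
    // Returns null when the line does not have the assumed shape.
    public static string[] Split(string line)
    {
        // Traditional characters: everything up to the first space.
        int end = line.IndexOf(' ');
        if (end == -1) return null;
        string trad = line.Substring(0, end);

        // Pinyin: the text between '[' and ']'.
        int start = line.IndexOf('[', end);
        if (start == -1) return null;
        end = line.IndexOf(']', ++start);
        if (end == -1) return null;
        string pinyin = line.Substring(start, end - start);

        // English: from the first '/' after the pinyin to the last '/'.
        start = line.IndexOf('/', end);
        if (start == -1) return null;
        start++;
        end = line.LastIndexOf('/');
        if (end <= start) return null;
        string english = line.Substring(start, end - start);

        return new string[] { trad, pinyin, english };
    }

    static void Main()
    {
        // Hypothetical sample line (an assumption, not quoted from cedict.b5).
        string[] fields = Split("中國 [Zhong1 guo2] /China/Middle Kingdom/");
        Console.WriteLine("trad:    {0}", fields[0]);
        Console.WriteLine("pinyin:  {0}", fields[1]);
        Console.WriteLine("english: {0}", fields[2]);
    }
}
```

For that sample line the three fields are 中國, Zhong1 guo2 and China/Middle Kingdom. Because the end of the English text is found with LastIndexOf, the slashes between alternative definitions stay inside the english string rather than being split out.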