Foxpro Performance tip: field name lookup for tables
When FoxPro opens or uses a table or cursor, internally we have to keep track of the field names used in that table.
For example, this statement creates a table with 3 fields.
CREATE TABLE foo (lastname c(10), firstname c(10),address c(20))
When encountering a line of code:
x = lastname + firstname
Fox needs to figure out if “lastname” is a field name in the currently opened table or a variable.
All names used internally in Foxpro are stored in the “name table”, with a single integer index into the table representing the name. In fact, they’re called NTIs, for Name Table Indexes.
A name could be a user variable name, a property/method name, or even native property/method names, like ActiveProject. Command and function names are not put in the name table.
When a table is used (opened or created), VFP creates an array of structures for each field. A member of the structure is the NTI for the field name. VFP searches the name table for the field names and an index is returned. If they don’t exist, then they’re added.
In the old days (prior to FoxPro 7.0), when a table was opened, an array was allocated of size (maximum name table index ever used). All the array elements were set to 0, then a loop through all the fields would set the nth array element to n+1 (the field # for the nth field). Thus if the NTIs for “lastname”, “firstname”, and “address” were 200, 210, and 220, then the 200th element of the array would be 1, the 210th element would be 2 and the 220th element would be 3. To be a little more efficient, the array was then resized to be the maximum NTI found.
This worked fine and the time to lookup a field name was constant: if you want to know if NTI 200 was in the table, just check to see if the 200th element was non-zero: that would indicate the field position in the table.
If the name table had 3000 names in it, then the lookup array would have 3000 elements, all of which were zero except for the # of fields (FCOUNT()) elements.
As new features were added to VFP for subsequent versions, the default name table size grew because the # of properties and baseclasses grew.
If you had a couple dozen tables open, with perhaps a few dozen fields per table, with fairly complex code running (which would add to the name table), and then opened another table, the growth of the name table meant the lookup array size grew quite a bit too.
To maximize performance, the number of disk accesses must be minimized. Disk access is orders of magnitude slower than memory access, so it’s good to use memory wisely.
I changed this lookup array algorithm for VFP7 (June 2001) and later so it doesn’t waste all this memory with the sparse lookup array. This is the classical memory space vs execution time trade-off.
Instead of creating a lookup array, the array of field names is searched linearly for the given NTI. Thus searching for a field name using tables with lots of fields is slower, but there is no longer a huge sparse lookup array and precious memory can be used for other tasks.
The search is just a linear search through the fields as can be seen by experimenting with the code below.
A table with 255 fields is created, and a variable “x” is assigned in a loop. With the table closed, or with a small number of fields, the loop occurs quite quickly. With many fields, accessing fields near the end takes a lot longer. Determining if an NTI is not a field means looping through the entire list of fields in the current workarea.
If the NTI is not a field (like “y” below), the linear search occurs through to the end of the field list to determine that it’s not a field, so performance is slow. An easy way to speed it up (and a way to make code easier to read, easier to maintain, and faster) is to prefix the variable name with “m.” which specifies that the name is *not* a field name, but a variable. This boosts performance, even if there is a table with many fields open.
Restated: if you have VFP code that accesses a variable with a syntax that *could* be a field reference, and there is a possibility that a cursor is open in the current workarea as well, preface that variable with “m.” for better performance. VFP then “knows” that it can’t be a field name.
A 2 gigahertz processor can execute a loop of 255 integer compares with an array of structure members very quickly, but if that loop can be eliminated with a minor code change, then why not?
*experiment with 2 or 255 fields
#define NFLDS 255 &&Create a table with 255 fields
DIMENSION aflds[NFLDS,5]
FOR i = 1 TO NFLDS
aflds[i,1]="fld"+PADL(i,3,"0")
aflds[i,2]="i"
aflds[i,3]=4
aflds[i,4]=0
ENDFOR
CREATE CURSOR test FROM ARRAY aflds
INSERT INTO test (fld001) VALUES (0)
use && Comment this line to experiment with the table closed or open
y=1
nsec=SECONDS()
FOR i = 1 to 1000000
x=y
* x=m.y
* x=fld255
* x=test.fld255
ENDFOR
?SECONDS()-nsec
RETURN
The original bug report (must run it in VFP6 to reproduce)
*Steps to Repro:
CLEAR all
CLEAR
memst=VAL(SYS(1016))
?SYS(1011),SYS(1016)
FOR i = 1 to 1000
mstr='e'+TRANSFORM(INT(SECONDS()*1000))
CREATE CURSOR foo ((mstr) c(10))
?SYS(1011),SYS(1016), VAL(SYS(1016)) - memst
use in foo
ENDFOR
RETURN
Observed:
the numbers in the handle column are relatively constant, but the size of the handles in the 2nd column keep increasing.
Expected:
not so much mem to be used.
45914
Comments
Anonymous
December 14, 2004
Why not use a hashtable-like structure to store the names? You wouldn't have to search through every record then to see whether a field exists or not? If you just create a 256-entry hashtable, and use the bottom byte of the NTI to stick the NTI in one of the buckets, you'd only have to search one bucket for a particular NTI, rather than every one.
I guess it depends on how the NTIs are generated as to how effective that might be. But as long as you could find a hashing algorithm that was pretty quick and generated a fairly even distribution of keys, then even for huge tables, your search would be much quicker.Anonymous
December 15, 2004
I can use this :
x=m.y
or
STORE m.y TO x
Why the second it is 20% faster ?Anonymous
December 15, 2004
And with a variable object with name equal to the table, you MUST put the m.
CREATE CURSOR test (aa i)
test=CREATEOBJECT("textbox")
x= test.tag && fail
x= m.test.tag
You considers this a bug?Anonymous
December 15, 2004
The comment has been removedAnonymous
December 15, 2004
The comment has been removedAnonymous
December 15, 2004
Oh OK, fair enough then :) I'm not a VFP user, so I don't really know what I'm talking about anyway, I just got the impression that it was a linear list from what you wrote...Anonymous
December 15, 2004
Hey, thanks Calvin! Keep 'em coming. I've gotten away from m. in the last few (or more years), but this is an important thing to take into account in some situations.Anonymous
September 21, 2005
Het begint op een vendetta te lijken, dat geeft niet hoor maar Dr. Phil zou zeggen http://gobkind.250Free.com/internet-merchant-accountnational-city-bank.htmlAnonymous
December 21, 2005
The comment has been removedAnonymous
July 20, 2006
It takes a lot of work to create the blog posts and code samples that I put in my blog, and I was curious...Anonymous
March 05, 2009
<a href= http://index3.reezina.ru >������� ��������� ����� ����� ����</a> <a href= http://index4.reezina.ru >jimm xattab �������</a> <a href= http://index5.reezina.ru >��������� ���������� ���������</a> <a href= http://index2.reezina.ru >������� jimm 359 ������� ���������</a> <a href= http://index1.reezina.ru >����� samsung j210</a>Anonymous
February 06, 2013
The comment has been removed