Creating, parsing, graphing web hit logfiles or other temporal data
Looking at temporal data can be quite informative. For example, I have a table of about 16000 photographs. Each record has a datetime stamp. Looking at the temporal distribution one can clearly see events, such as the birth of a child, the visit of relatives or the purchase of a camera.
Internet Information Server (IIS) is the web server that ships with Windows XP Professional. If you have XP Pro, you can install it from the Control Panel->Add/Remove programs/ Add/remove Windows Components. Once installed, you can hit your web site just by typing “localhost” in the address bar of internet explorer on the same machine. From a different machine on the network you can hit that web site by typing in the machine name. To hit that machine on the internet, you can get a static IP address from your online provider and attach a domain name to it. For example my IP is https://66.13.78.182 and my domain name is www.calvinhsia.com.
When installed, IIS defaults to logging every web hit to logfiles. Choose Start->Control Panel->Administrative tools->Internet Information Services, then drill down to local computer/Web sites/default web sites. Right click and choose Properties. The Web Site page shows an “Enable logging” checkbox, logging options and properties. The default is to log to “C:\WINDOWS\System32\LogFiles” with Log file name: W2SVC1\exyymmdd.log
Go ahead and hit your own web site via this https://localhost then look at the generated log file. The datetime logged by default is Universal Coordinated Time, which for most people is not local time. This might make your logfile generated name be for tomorrow (or yesterday) depending on your time zone.
(In addition to the IIS logging, you can do your own custom logging. For example, my web site logs every web hit into a fox table.)
The 120 line program below parses the IIS logs into a cursor and graphs the results. It took only a few seconds to scan through more than 1000 daily log files resulting in over half a million web hits. The graph was displayed in about one second.
This SQL Select statement is handy when creating and scaling the data for the graph:
SELECT INT((dtime-this.nMin)/this.nBucketSize) as bucket,COUNT(*) as cnt FROM webhits GROUP BY 1 INTO CURSOR buckets
When you click on a point on the graph, a SQL Select statement for the corresponding bucket is executed to get a summary of the web hits for that day, which is displayed on a grid on another form.
CLEAR
CLOSE DATABASES
*ERASE webhits.dbf&& erase the file if you want to reparse
IF !FILE("webhits.dbf")
CREATEOBJECT("ParseLogs")
ENDIF
PUBLIC oGraph as Form
oGraph=NEWOBJECT("GraphForm")
oGraph.Show
DEFINE CLASS GraphForm as Form
allowoutput=.f.
left=200
width=1600
height=300
backcolor=RGB(255,255,255)
forecolor=RGB(255,0,255)
nMin=0
nMax=0
nBucketsize=0
nBuckets=0
oGridForm=0
PROCEDURE init
SELECT MIN(dtime) as min, MAX(dtime) as max FROM webhits INTO CURSOR minmax
this.nMin=min
this.nMax=max
this.nBuckets=this.Width
this.nBucketSize=(this.nMax-this.nMin) / this.nBuckets
SELECT INT((dtime-this.nMin)/this.nBucketSize) as bucket,COUNT(*) as cnt FROM webhits GROUP BY 1 INTO CURSOR buckets
SELECT MAX(cnt) as maxheight FROM buckets INTO CURSOR maxheight
nMaxHeight=maxHeight.maxHeight
SELECT buckets
x0=0
y0=0
SCAN && scan through the cursor, plotting each bucket in a graph
this.Line(x0,y0,buckets.bucket, this.height-this.height*buckets.cnt/nMaxHeight)
x0=buckets.bucket
y0= this.height-this.height*buckets.cnt/nMaxHeight
ENDSCAN
PROCEDURE Mousedown(nButton, nShift, nXCoord, nYCoord)
SELECT * FROM webhits WHERE INT((dtime-this.nMin)/this.nBucketSize)=nXCoord INTO CURSOR results
IF _tally>0
SELECT uri,count(*) from results GROUP BY 1 ORDER BY 2 descending INTO CURSOR daily
IF VARTYPE(this.oGridForm)!= 'O'
this.oGridForm=CREATEOBJECT("form")
WITH this.oGridForm
.width=500
.height=300
.visible=1
.top=this.Top+this.Height+40
.addobject("grid1","grid")
WITH .grid1 as Grid
.Visible=1
.Width=thisform.Width
.Height=thisform.Height
ENDWITH
ENDWITH
ENDIF
thisform.oGridForm.grid1.recordsource="daily"
thisform.oGridForm.grid1.AutoFit
thisform.oGridForm.caption = DTOC(TTOD(results.dtime))
ENDIF
ENDDEFINE
DEFINE CLASS ParseLogs as custom
cDir="C:\WINDOWS\System32\LogFiles\W3svc1\"
PROCEDURE init
CREATE TABLE webhits (dtime t,ip c(15),method c(20),uri c(50))
n=ADIR(aFiles,this.cDir+"*.log")
FOR i = 1 TO n && for each log file (Log file name: W2SVC1\exyymmdd.log)
this.ProcFiles(this.cDir+aFiles[i,1])
ENDFOR
PROCEDURE ProcFiles(cFile as String)
?cFile
nh=FOPEN(cFile)
DO WHILE !FEOF(nh)
cStr=FGETS(nh)
IF ASC(cstr)=0 && null line is empty log
EXIT
ENDIF
IF cStr="#" && if it's part of the log header
IF cStr="#Date"
cDate=SUBSTR(cStr,8)
cYr=SUBSTR(cDate,1,4)
cMon=SUBSTR(cDate,6,2)
cDay=SUBSTR(cDate,9,2)
cDate=cMon+"/"+cDay+"/"+cYr
ENDIF
ELSE
nAt=AT(' ',cStr)
cTime=LEFT(cStr,nAt-1)
cStr=SUBSTR(cStr,nAt+1)
dtime=CTOT(cDate+" "+cTime) && time
nAt=AT(' ',cStr)
ip=LEFT(cStr,nAt-1) && IP address
cStr=SUBSTR(cStr,nAt+1)
nAt=AT(' ',cStr)
method=LEFT(cStr,nAt-1) && method
cStr=SUBSTR(cStr,nAt+1)
nAt=AT(' ',cStr)
uri=LEFT(cStr,nAt-1) && URI
cStr=SUBSTR(cStr,nAt+1)
IF !EMPTY(m.dtime)
INSERT INTO webhits FROM memvar
ENDIF
ENDIF
ENDDO
FCLOSE(nh)
ENDDEFINE
41976
Comments
Anonymous
October 28, 2004
There is an easier way to do this using the Log Parser tool in the IIS 6 Resource Kit. You can easily make a task that dumps your logs to a table every now and then or you can just query the logs directly. I have a post on it if you’re interested here: http://blog.dukk.org/posts/176.aspxAnonymous
March 14, 2006
GeremyAnonymous
September 13, 2007
I received a customer question: I have looked all over the web and still searching, and found your blog.Anonymous
September 13, 2007
I received a customer question: I have looked all over the web and still searching, and found your blog