Sunday, December 02, 2007

 

Powerful CGI applications


In the article Myths About CGI Scalability, published some time ago by Lars from z505, one of the authors of Powerful Web Utilities, he exposes a problem every CGI programmer faces with database applications. As that document explains, every time a CGI is called by a web browser, the operating system executes it, waits until it finishes, and shows the results in the client's browser. After the first execution the operating system caches the CGI executable, so the following calls run faster.

If a web application uses a database, it's very common to create a CGI that connects to the database, performs a query, then formats the results and returns them to the client. With this approach, every time a user runs the CGI program a new connection to the database is created, resulting in slow and inefficient response times.

To solve this problem, I created a simple CGI that sends socket commands to a program (I'll call it AppServer) which takes care of the database connections and heavy processing and returns the results to the CGI to be shown in the client's browser, using only one database connection to handle every CGI request. If the CGI finds the AppServer isn't running, it starts an instance and then sends the commands.

An interesting part of this approach is that it runs on cheap hosting accounts, using Apache, IIS or any CGI-capable web server. It can also be compiled with FreePascal or Delphi.

The example uses PWU and Synapse for the AppServer, but you can use WebBroker or any other CGI technology.

You can download the CGI client and server to give it a try.

The CGI side is this:
uses
  Classes,
  SysUtils,
  PwuMain,
  HttpSend,
  {$IFDEF WIN32}
  Windows, ShellApi;
  {$ELSE}
  Unix;
  {$ENDIF}

function GetHtml: Boolean;
var
  lHttp: THTTPSend;
  lHtml: TStringList;
begin
  Result := False;
  lHtml := TStringList.Create;
  lHttp := THTTPSend.Create;
  try
    // Ask the AppServer (listening on port 85) for the page
    lHttp.HTTPMethod('GET', 'localhost:85/');
    lHtml.LoadFromStream(lHttp.Document);
    Result := lHtml.Text <> '';
    // Hand the AppServer's output back to the client's browser
    out(lHtml.Text);
  finally
    lHttp.Free;
    lHtml.Free;
  end;
end;

var
  lHandle: longint;

begin
  if not GetHtml then
  begin
    (* If not connected, start the AppServer process *)
    {$IFDEF WIN32}
    lHandle := ShellExecute(0, 'open', '.\httpserver.exe',
      nil, nil, SW_HIDE);
    {$ELSE}
    lHandle := Shell('./httpserver');
    {$ENDIF}
    // Retry until the AppServer is up and answers the request
    while True do
    begin
      if GetHtml then
        Break;
      Sleep(100);  // avoid a tight busy-wait while the AppServer starts
    end;
  end;
end.
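
For reference, the AppServer side can be little more than a Synapse TCP listener that answers the CGI's request on port 85. The sketch below is only a rough outline under that assumption, not the actual code from the download: it uses blcksock's TTCPBlockSocket and returns a placeholder HTML body, whereas the real server opens its single database connection at startup and formats the query results with PWU:

program httpserver;

{ Rough sketch of the AppServer side: just the Synapse listener loop
  with a placeholder HTML body. The real server also opens its single
  database connection at startup and formats the query results. }

uses
  SysUtils,
  blcksock;

const
  CRLF = #13#10;

var
  ListenerSocket, ConnectionSocket: TTCPBlockSocket;
  RequestLine, Header, Body: string;

begin
  ListenerSocket := TTCPBlockSocket.Create;
  ConnectionSocket := TTCPBlockSocket.Create;
  try
    ListenerSocket.CreateSocket;
    ListenerSocket.SetLinger(True, 10000);
    ListenerSocket.Bind('0.0.0.0', '85');   // same port the CGI client calls
    ListenerSocket.Listen;
    repeat
      if ListenerSocket.CanRead(1000) then
      begin
        ConnectionSocket.Socket := ListenerSocket.Accept;
        // first line of the HTTP request, e.g. "GET / HTTP/1.1"
        RequestLine := ConnectionSocket.RecvString(5000);
        // discard the remaining request headers, up to the blank line
        repeat
          Header := ConnectionSocket.RecvString(5000);
        until Header = '';
        // here the real AppServer runs the query on its one long-lived
        // database connection and builds the HTML from the results
        Body := '<html><body>Result for ' + RequestLine + '</body></html>';
        ConnectionSocket.SendString('HTTP/1.0 200 OK' + CRLF +
          'Content-Type: text/html' + CRLF +
          'Content-Length: ' + IntToStr(Length(Body)) + CRLF +
          'Connection: close' + CRLF + CRLF + Body);
        ConnectionSocket.CloseSocket;
      end;
    until False;
  finally
    ConnectionSocket.Free;
    ListenerSocket.Free;
  end;
end.

Because the listener process stays alive between requests, the database connection can be opened once and reused for every CGI call, which is exactly what a plain CGI cannot do on its own.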

Comments:
Interesting!

Did you write a way to handle database connections so that two people do not collide in the same connection?

i.e. pooling?
 
Also, to add to and clarify your comment about the operation being cached in CGI by the OS..

There are three things that are cached which programmers generally forget about or do not know about, and this causes the myth about CGI being slow.

1. File system access caches, such as reopening the same pwmain.TemplateOut() files over and over again. Not just static HTML files served by Apache, but any file on the hard disk is smartly cached if it is continually reopened.. and templates are something we always reopen, along with pwmain.FileOut() calls.

2. EXE/ELF code segment caches.. unlike Windows 3.1, the data segment is not shared.. but the code segment is cached/shared when the same exe is called, on Unix and Windows. Most people thought only Windows 3.1 and only DLL code segments are cached/reused, but that's not so. All OSes cache the EXE code segment as if it were a DLL, especially when it is being reused. In fact an EXE is just a DLL in disguise.

3. Database caching and indexing.. databases have connection slowdown which of course we must address.. but there is caching of queries and of indexes done by the DB automatically, which many programmers try to program themselves, not realizing it is as simple as tweaking the database config settings.

Of course, even with all this being said.. it is still obviously useful to optimize further with database pooling when necessary.. on large servers that need it.

CGI also cleans up garbage automatically, such as leaked memory.. since each CGI is its own little black box, separate from all the other CGIs. This is discussed in another article too.
 
Yes, depending on which type of access I need, I use connection pooling or simply a Critical Section.

If the system must handle Inserts, Updates or Deletes to the database, I use connection pooling. If it only responds to Selects, I acquire a Critical Section, then process the query, then release it.

I'll be posting an example of this.
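
In the meantime, here is a rough idea of the Select-only case: one long-lived connection guarded by a Critical Section, so concurrent CGI requests take turns instead of colliding. The unit and the names in it are only illustrative (they are not taken from the download), and the query itself is just a placeholder string:

unit sharedconn;

{ Illustrative sketch only: one long-lived database connection shared by
  all requests, guarded by a Critical Section so Select queries are
  processed one at a time. The query itself is just a placeholder. }

interface

function RunSharedSelect(const SQL: string): string;

implementation

uses
  SyncObjs;

var
  ConnLock: TCriticalSection;   // guards the single shared DB connection

function RunSharedSelect(const SQL: string): string;
begin
  ConnLock.Acquire;
  try
    // placeholder: the real AppServer would run SQL on its shared
    // connection here and format the rows for the CGI
    Result := 'rows for: ' + SQL;
  finally
    ConnLock.Release;           // let the next request use the connection
  end;
end;

initialization
  ConnLock := TCriticalSection.Create;

finalization
  ConnLock.Free;

end.

The same Acquire/Release pair would wrap the real query call in the AppServer; for Inserts, Updates and Deletes a pool of connections avoids serializing all writers behind a single lock.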
 
I will add your information to the Powtils web page and documentation as soon as we can..

Thanks much for the experiment with database pooling. We will really find this useful in Powtils to offer a "scalability" option for people who are worried about their application growing huge..
 