AN INTRODUCTION TO WRITING WEBSHARE CGI SCRIPTS

Melinda Varian

Office of Computing and Information Technology
Princeton University
87 Prospect Avenue
Princeton, NJ 08544 USA

Email:     Melinda@Princeton.EDU
Web:       http://PUCC.Princeton.EDU/~Melinda
Telephone: 1-609-258-6016

VM Workshop
June, 1996

Rick Troth, the great builder of VM tools for the Internet, is the
author of a widely-used VM Web server called Webshare.  You can get
Webshare from Rick's home page:

   http://ua1vm.ua.edu/~troth/

One of the many nice features of Webshare is that its Common Gateway
Interface (CGI) routines are written as CMS Pipelines filters.  Thus, if
you serve Web forms from a Webshare server, you can process the input
from those forms using REXX and CMS Pipelines.  In this session, I will
introduce some Pipelines techniques that you may find useful in writing
CGIs for Webshare, but much of what I will be discussing here is
applicable to using Pipes for writing service machines in general.

If you are a Web page novice, there are many places to turn for help:

o  You will need to learn some HTML (Hyper-Text Markup Language).  HTML
   is a formatting language rather reminiscent of GML and is so simple
   that there is not much to learn.  A good place to start is at the
   NCSA Web site:

      http://www.ncsa.uiuc.edu/General/Internet/WWW/HTML.Primer.html

   That will get you to NCSA's HTML primer.  In the primer, you will
   find links to other, more advanced papers, including some on HTML
   forms, which you will need to learn about before doing a Web page
   that invokes a CGI.

o  You will also want to learn about writing CGI routines in REXX.
   The best person to teach you that is Les Cottrell at SLAC:

      http://www.slac.stanford.edu/~cottrell/rexx/share/

   Les has written a very good introduction to the concepts of CGI
   routines in general and of REXX CGI routines specifically.  Although
   his examples are written in uni-REXX, they are quite applicable to
   VM.

o  You should also read all of the help files that come with Webshare.
   They are sparse, but informative.

o  You will likely find it useful to subscribe to the WWW-VM mailing
   list.  You can do that by sending email to
   listserv@sjuvm.stjohns.edu and making the body of your mail the
   words "subscribe www-vm" followed by your name.  WWW-VM is a very
   good place for asking questions about implementing Web pages under
   VM.

o  You will also want to learn to use an HTML validation service and to
   use it to check your forms before you put them into production.  One
   service that I use is:

      http://www.webtechs.com/html-val-svc/

Now, let's start with the HTML for a very simple Web form:

+----------------------------------------------------------------------+
  Sample Form for Sample CGI

  Userid:
  Password:
+----------------------------------------------------------------------+

This example builds a Web form that prompts the user for two fields, a
userid and a password.  When the user clicks on the "Send" button, the
specified action takes place, which means in this case that the Web
server invokes a program named POSTTEST CGI, passing it the information
that the user entered into the form.

When your Webshare CGI is invoked, it has four sources of information:

1. The primary input stream for your CGI routine contains the data that
   the user posted to your Web form.  There is one input record per
   field, each record in the format:  name=value.

2. The secondary input stream for your CGI routine contains one record
   per HTTP header line; you are not likely to need this information in
   the beginning.

3. The standard CGI "environment variables" are available to your CGI
   through the CMS GLOBALV command.  Enter HELP CGI ENVIRONMENT to get
   a list of the variables.

4. If the URL that addressed your CGI contained a question mark,
   anything after that question mark is passed as a calling argument.
   You can access it using a REXX Parse Arg instruction.

Your CGI routine has a single output stream, in addition to its two
input streams.  When you write HTML on that output stream, it is sent
to the client browser to be displayed for the user.

Since a Webshare CGI routine is a pipeline stage, you need to understand
the basics of writing a REXX pipeline filter, but little more than that
is absolutely necessary.  If you wish, you can simply read the input
records using the CMS Pipelines PEEKTO or READTO commands and write your
responses to the user using OUTPUT commands.
The CMSHELP CGI that comes with Webshare illustrates this technique
quite well, so we will concentrate instead on writing CGI routines that
make more interesting uses of CMS Pipelines.

Our simple Web form invoked POSTTEST CGI, which comes with Webshare.
POSTTEST is a good thing to play around with when you are starting to do
your own forms and CGIs.  It is invoked from a form to display the
values entered on the form.  Here is a portion of POSTTEST:

+----------------------------------------------------------------------+
  /* Copyright 1994, Richard M. Troth
   *
   *        Name: POSTTEST CGI
   *              to verify the correct operation of HTML "forms"
   *      Author: Rick Troth, Houston, Texas, USA
   *        Date: 1994-Aug-15
   *              last updated for CMS HTTP 1.1.6 and 1.1.6v
   */

  'OUTPUT' "<html>"                   /* Start HTML stream.         */

     . . .

  'OUTPUT' "<hr> Results (what you posted, if anything): <br>"
  'CALLPIPE',                         /* Display posted fields:     */
     '*: |',                          /* Stream 0 is posted fields. */
     'change /</&lt;/ |',             /* Make "<" safe to display.  */
     'change />/&gt;/ |',             /* Make ">" safe to display.  */
     'spec 1-* 1 /<br>/ next |',      /* <BR> at end of each line.  */
     '*:'                             /* HTML output to browser.    */

  'OUTPUT' "<hr> Environment (of the server): <br>"

  'CALLPIPE',               /* Display environment variables:       */
     'command GLOBALV SELECT HTTPD LIST |',  /* Environment vars.   */
     'drop first 1 |',                /* Drop title line.           */
     'nfind _TYPE.| nfind _PIPE.|',   /* Discard uninteresting stuff.*/
     'nfind _GTYPE.| nfind _MTYPE.|',
     'sort |',                        /* Sort by variable name.     */
     'change /</&lt;/ |',             /* Make "<" safe to display.  */
     'change />/&gt;/ |',             /* Make ">" safe to display.  */
     'spec 1-* 1 /<br>/ next |',      /* <BR> at end of each line.  */
     '*:'                             /* HTML output to browser.    */

  'OUTPUT' "<hr></html>"              /* Show end of HTML.          */
+----------------------------------------------------------------------+

POSTTEST first issues an OUTPUT to write a record denoting the beginning
of an HTML file.  The next OUTPUT writes the header for a display.  Then
a CALLPIPE is used to read the posted fields from the primary input
stream, massage them to encode left and right angle brackets, insert a
line break at the end of each record, and then write them out to the
browser, thus reflecting the data from the form back to the user as
HTML.  Another OUTPUT writes another header and another CALLPIPE is used
to issue a CMS GLOBALV command to get the environment variables, which
are formatted as HTML and sent on the output stream to the browser.
Finally, another OUTPUT command writes the record denoting the end of an
HTML stream.

Thus, when your form invokes POSTTEST, you can see the data that your
own CGI will receive when you change the form to invoke your CGI rather
than POSTTEST.  When you are ready to do that, you might change the
ACTION attribute of the HTML FORM tag as follows:

   <FORM METHOD=POST ACTION="/local/sample.*cgi">

That will cause the routine SAMPLE CGI to be invoked when the user
clicks on the "Send" button.  Here is SAMPLE CGI, which we will discuss
one section at a time:

+----------------------------------------------------------------------+
  /* SAMPLE CGI:  Common Gateway Interface example */

  parm.userid   = ''                  /* Initialize form variables. */
  parm.password = ''

  'CALLPIPE (name GetForm)',          /* Input:  name=value         */
     '*: |',                          /* Get input from sample form.*/
     'xlate 1-* 05 40 |',             /* Convert tabs to blanks.    */
     'strip |',                       /* Strip leading/trailing.    */
     'locate 1 |',                    /* Discard null lines, if any.*/
     'xlate fieldsep = f1 upper |',   /* Upper-case variable names. */
     'change //=PARM./ |',            /* ==> =PARM.name=value       */
     'varload'                        /* Set PARM.name variables.   */

  If parm.userid == '' | parm.password == '' Then Do  /* Invalid?   */
     /* Your processing here */       /* Yes, do something.         */
     Exit                             /* And exit.                  */
  End

  Address Command 'CP SET MSG IUCV'   /* To receive MSGes in pipe.  */

  'CALLPIPE (endchar ? name ToServer)',  /* Send request to server: */
     'literal SMSG SAMPSRV' parm.userid parm.password '|',
     'cp |',                          /* Issue the SMSG command.    */
     'console',                       /* Display any CP response.   */
  '?',
     'starmsg |',                     /* Get MSGes from the server. */
     'find 00000001SAMPSRV_|',        /* Ignore if anything else.   */
     'spec 17-* 1 |',                 /* Remove MSG header.         */
     'take 1 |',                      /* Done when get one line.    */
     'var response |',                /* Save that response.        */
     'pipestop',                      /* Terminate DELAY stage.     */
  '?',
     'literal +20 |',                 /* Specify 20 secs from now.  */
     'delay |',                       /* Wait up to 20 seconds.     */
     'count lines |',                 /* Remember if DELAY completed.*/
     'var timeout |',                 /* Store 0 or 1 in variable.  */
     'pipestop'                       /* Terminate STARMSG stage.   */

  If timeout Then Do                  /* Tell user if timed out.    */
     'CALLPIPE (name GotTimeOut)',
        'literal <html>|',
        'append literal <head><title>Error Message</title></head>|',
        'append literal <hr>Server is not available.<hr>|',
        'append literal </html>|',
        '*:'                          /* To Webshare to send out.   */
     Address Command 'CP SET MSG ON'
     Exit                             /* Abort the request.         */
  End

  Else                                /* Tell user if successful.   */
     'CALLPIPE (name SendResp)',
        'literal <html>|',
        'append literal <head><title>Response</title></head>|',
        'append literal <h2>Response to your request:</h2>|',
        'append literal <hr>|',
        'append var response |',      /* Load response from server. */
        'append literal <hr></html>|',
        '*:'                          /* To Webshare to send out.   */

  Address Command 'CP SET MSG ON'     /* Restore setting.           */

  Exit
+----------------------------------------------------------------------+

The first thing to note is that the filetype of this program is CGI, not
REXX, even though it is a pipeline filter.  The first two statements
initialize the variables in which we will store the data the user enters
into the two boxes on the form:

+----------------------------------------------------------------------+
  /* SAMPLE CGI:  Common Gateway Interface example */

  parm.userid   = ''                  /* Initialize form variables. */
  parm.password = ''
+----------------------------------------------------------------------+

When Webshare invokes a CGI, it passes the information from the form as
records on the primary input stream of the CGI routine.  These records
are of the format "name=value", where "name" is the name specified as
the NAME attribute of the HTML INPUT tag, and "value" is the data the
user entered for that field on the form.  There is one such record for
each INPUT tag in the form.
Thus, our SAMPLE CGI should receive two input records, one for the
userid and one for the password.

The CGI routine can read these records with a CMS Pipelines PEEKTO
command, or it can read them into a subroutine pipeline that connects to
the primary input stream of the CGI routine, as we do here:

+----------------------------------------------------------------------+
  'CALLPIPE (name GetForm)',          /* Input:  name=value         */
     '*: |',                          /* Get input from sample form.*/
     'xlate 1-* 05 40 |',             /* Convert tabs to blanks.    */
     'strip |',                       /* Strip leading/trailing.    */
     'locate 1 |',                    /* Discard null lines, if any.*/
     'xlate fieldsep = f1 upper |',   /* Upper-case variable names. */
     'change //=PARM./ |',            /* ==> =PARM.name=value       */
     'varload'                        /* Set PARM.name variables.   */
+----------------------------------------------------------------------+

The first thing this subroutine pipeline does to the records that come
through its "*:" input connector is use an xlate stage to translate tab
characters to blanks, as some Web browsers treat the two as equivalent.
Then a strip stage strips away any leading or trailing blanks.  In case
that removes all the characters from a record, a locate 1 is used to
discard null records (all records that aren't at least one byte long).
There should not be any null input records, but you never know what some
strange Web browser may send you.

The second xlate stage is a bit more complicated.  It defines the equal
sign as its field separator.(1)  Thus, since the format of the records
is "name=value", the first field of the records is the name, and xlate
field 1 upper translates just the first field, the name, to uppercase.
(This is done in preparation for using the name field as part of a REXX
variable name.)

--------------------

(1) The xlate filter was enhanced to support the field and field
    separator keywords in CMS 11 (VM/ESA 1.2.2).

Then the change stage inserts "=PARM." at the beginning of each record,
thus converting each record to the format "=PARM.name=value".  This is
exactly the form required by the varload stage to store "value" as the
value of the REXX variable "PARM.name".  That is, varload expects a
record that begins with a delimiter character (the equal sign in this
case) followed by a variable name, the same delimiter character again,
and then the variable's desired value.

So, this subroutine pipeline as a whole receives the two records that
the Web server creates from the input the user enters into the two boxes
on the form and stores the values the user entered as the REXX variables
PARM.USERID and PARM.PASSWORD.  The reason for prefixing the name fields
from the form with "PARM." is simple:  That way, you control the names
of the variables that are stored, thus handling the case where a
malicious user sends a bogus form with unexpected field names.  When you
have used this technique, you need be suspicious of only those variables
that have the stem PARM.

The next bit of processing is indicated here in skeleton form:

+----------------------------------------------------------------------+
  If parm.userid == '' | parm.password == '' Then Do  /* Invalid?   */
     /* Your processing here */       /* Yes, do something.         */
     Exit                             /* And exit.                  */
  End
+----------------------------------------------------------------------+

You would insert code appropriate for validating the user's input.  When
the input is in error, you would take suitable action, such as sending
an error message, before exiting from the CGI routine.  We will see an
example of sending such an error message later.

The next statement issues the CP command SET MSG IUCV, which tells CP
that you wish to receive any messages from other users in a buffer in
memory, rather than on your console:

+----------------------------------------------------------------------+
  Address Command 'CP SET MSG IUCV'   /* To receive MSGes in pipe.  */
+----------------------------------------------------------------------+

This allows you to capture messages from other users in a pipeline for
processing there.  (Note that the CP command is issued with Address
Command.  Because this is a CMS Pipelines filter, the default is to
address CMS Pipelines, not the CP/CMS command environment.)

The big subroutine pipeline that comes next is the heart of the CGI.  It
is rather unusual in that it is composed of three pipeline segments,
none of which is connected to the others.  We will discuss it one
segment at a time.

The first segment sends the data the user entered on the Web form to a
service virtual machine called SAMPSRV:

+----------------------------------------------------------------------+
  'CALLPIPE (endchar ? name ToServer)',  /* Send request to server: */
     'literal SMSG SAMPSRV' parm.userid parm.password '|',
     'cp |',                          /* Issue the SMSG command.    */
     'console',                       /* Display any CP response.   */
+----------------------------------------------------------------------+

literal builds a CP SMSG command for sending both the userid and the
password to SAMPSRV.  (SMSG stands for "special message"; the SMSG
command is often used for sending information to service machines.)
This SMSG record is received by the cp stage, which issues it as a CP
command.  If the SMSG command completes with an error message, such as
"SAMPSRV not logged on", the console stage displays that, so it will
appear in the spooled console.

So, the job of this first pipeline segment is to send information about
the user's request to be processed in another virtual machine, SAMPSRV.
The bulk of the processing is done there, rather than in the Web server,
where our CGI routine is executing.  You want to keep the part of your
processing that is actually done in the Web server as "light-weight" as
possible, for reasons of both security and integrity.

The second segment of the subroutine pipeline waits for a response from
SAMPSRV:

+----------------------------------------------------------------------+
  '?',
     'starmsg |',                     /* Get MSGes from the server. */
     'find 00000001SAMPSRV_|',        /* Ignore if anything else.   */
     'spec 17-* 1 |',                 /* Remove MSG header.         */
     'take 1 |',                      /* Done when get one line.    */
     'var response |',                /* Save that response.        */
     'pipestop',                      /* Terminate DELAY stage.     */
+----------------------------------------------------------------------+

starmsg is the pipeline stage that receives messages from other virtual
machines when you have issued the CP SET MSG IUCV command.  starmsg
feeds each message it receives into the pipeline as a record preceded by
a 16-byte header field.  The first 8 bytes are "00000001", which shows
that the input is from a CP MSG command; the second 8 bytes are the name
of the virtual machine that sent the message.

This find stage discards any input captured by starmsg that either was
not the result of a CP MSG command or was not sent by SAMPSRV.  (The
underscore specifies that there must be one blank after the characters
"SAMPSRV".)  The purpose of discarding such records is to protect your
CGI routine from random messages from other users.  After the 16-byte
header has been validated, it is removed by a spec stage, since it is of
no further interest.

In this application, each transaction should result in one message from
SAMPSRV.
The take 1 stage causes this segment of the pipeline to terminate as
soon as a transaction is complete (that is, as soon as SAMPSRV has sent
one message).  In another application, you might need to use some other
stage to terminate the processing, such as a tolabel that looks for a
string known to mark the end of a transaction.  Unless you do something
to terminate this pipeline segment, starmsg will continue to wait for
more input from CP.

var stores the response line in the REXX variable RESPONSE.  In addition
to setting the value of that variable, however, var also copies the line
to its output stream, where it becomes the input for a pipestop stage.
Whenever pipestop receives an input record, it terminates any stages in
the pipeline set that are waiting for an external event.  In this case,
that would be the delay stage in the next pipeline segment, which, as we
will see shortly, is waiting for a timer "pop".

So, what this second pipeline segment as a whole does is receive a
response from the SAMPSRV service machine, store it in a REXX variable,
and then terminate both itself and the third pipeline segment.

The third pipeline segment is used to make sure that the entire pipeline
does not hang forever if, for example, SAMPSRV is not logged on:

+----------------------------------------------------------------------+
  '?',
     'literal +20 |',                 /* Specify 20 secs from now.  */
     'delay |',                       /* Wait up to 20 seconds.     */
     'count lines |',                 /* Remember if DELAY completed.*/
     'var timeout |',                 /* Store 0 or 1 in variable.  */
     'pipestop'                       /* Terminate STARMSG stage.   */
+----------------------------------------------------------------------+

literal creates a record that says "+20", which is read into delay.  To
delay, such a record means "wait 20 seconds".
After 20 seconds, delay copies the "+20" record to its output and then
terminates (since it has no more input).

If delay runs long enough to write that output record, count lines will
count the record and write an output record containing a count of 1.  On
the other hand, if delay is stopped by the pipestop in the second
pipeline segment before the 20 seconds have elapsed, delay will not
write an output record, so count lines will receive no input and will
write an output record containing a count of 0.

var stores the output of count lines in the REXX variable TIMEOUT.  The
value of TIMEOUT will therefore be either 0 or 1, so TIMEOUT can be used
as a Boolean variable indicating whether or not the request timed out.
var also copies the record it receives from count lines to its output
stream, where it is read by pipestop, which stops the starmsg stage in
the second pipeline segment, which is still waiting for a message to
arrive.

The assumption in all this is that if SAMPSRV has not responded in 20
seconds, it is likely never to respond, so the request should be
aborted.  The request is aborted in the next section of code:

+----------------------------------------------------------------------+
  If timeout Then Do                  /* Tell user if timed out.    */
     'CALLPIPE (name GotTimeOut)',
        'literal <html>|',
        'append literal <head><title>Error Message</title></head>|',
        'append literal <hr>Server is not available.<hr>|',
        'append literal </html>|',
        '*:'                          /* To Webshare to send out.   */
     Address Command 'CP SET MSG ON'
     Exit                             /* Abort the request.         */
  End
+----------------------------------------------------------------------+

If the value of TIMEOUT is "true" (1), a subroutine pipeline generates
the HTML to display an error message to the user of the Web page.  This
This 1 0 Writing Webshare CGI Scripts Page 13 ------------------------------------------------------------------------ 0 HTML goes out the "*:" output connector of the subroutine pipeline, where the Web server will receive it and transmit it across the network to the user's Web browser. 0 If you wish, you can also send HTML to the user by using the CMS + ___ Pipelines OUTPUT command. Whatever is written to the primary output + _________ stream of the CGI routine, either by a subroutine pipeline or by an OUTPUT command, is sent by the Web server to the Web client. 0 Similarly, if the value of TIMEOUT is "false" (0), because SAMPSRV did + ___ respond in less than 20 seconds, the last subroutine pipeline in SAMPLE CGI is invoked to send the response to the user (wrapped in the appropriate HTML): 0 +----------------------------------------------------------------------+ | | | Else /* Tell user if successful. */ | | 'CALLPIPE (name SendResp)', | | 'literal <html>|', | | 'append literal <head><title>Response</title></head>|', | | 'append literal <h2>Response to your request:</h2>|', | | 'append literal <hr>|', | | 'append var response |', /* Load response from server. */ | | 'append literal <hr></html>|', | | '*:' /* To Webshare to send out. */ | | | +----------------------------------------------------------------------+ 0 And that is all there is to it. 0 Here is a skeleton for the SAMPSRV service machine, which you can study for yourself: 1 0 Page 14 Writing Webshare CGI Scripts ------------------------------------------------------------------------ 0 +----------------------------------------------------------------------+ | | | /* SAMPSRV EXEC: Sample service machine for sample Web form. */ | | | | 'CP SET SMSG IUCV' /* To receive SMSGes in pipe. */ | | | | Do Forever /* Loop processing requests. */ | | | | 'PIPE (name SampleIn)', /* Get next request from CGI: */ | | 'starmsg |', /* Listen for SMSGes. 
*/ | | 'find 00000004HTTPD _|', /* Must be SMSG from Web servr.*/ | | 'take 1 |', /* Stop listening when get 1. */ | | 'xlate |', /* Upper-case the request. */ | | 'spec', /* Extract request fields: */ | | '17-* 1', /* userid and password; */ | | 'write', /* write those as a record; */ | | '9.8 nw |', /* sender's ID from header. */ | | 'var request |', /* Store request in variable. */ | | 'drop 1 |', /* Discard that record. */ | | 'var sender' /* Next record is sender's ID. */ | | | | If Symbol(sender) == 'LIT' /* Pipe halted with HMSG? */ | | Then Leave /* Yes, leave the loop. */ | | | | /* Process the request here */ | | | | 'CP MSG' sender 'COMPLETE' /* Reply to the waiting CGI. */ | | | | End | | | +----------------------------------------------------------------------+ 0 Note that the find stage asumes that the name of the VM Web server begins "HTTPD" and that it is no longer than six characters. Note also that this service machine can be halted by typing the immediate command "HMSG" on its virtual console, in which case the variable SENDER will be dropped, because var will have received no input.
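
Returning to the validation step in SAMPLE CGI, the placeholder "Your
processing here" might be filled in along the following lines.  This is
only a sketch:  it reuses the error-page technique from the timeout
branch, and it assumes that PARM.USERID and PARM.PASSWORD have already
been set by the GetForm pipeline.  The pipeline name BadInput and the
wording of the message are illustrative choices, not part of the
original SAMPLE CGI.

```rexx
/* Sketch only:  one way to fill in the validation placeholder.     */
/* The name BadInput and the message text are assumptions.          */
If parm.userid == '' | parm.password == '' Then Do  /* Invalid?     */
   'CALLPIPE (name BadInput)',        /* Build a small error page:  */
      'literal <html>|',
      'append literal <head><title>Error Message</title></head>|',
      'append literal <hr>Please enter both a userid and a password.<hr>|',
      'append literal </html>|',
      '*:'                            /* To Webshare to send out.   */
   Exit                               /* And exit.                  */
End
```

Because this test runs before CP SET MSG IUCV is issued, there is no
message setting to restore on this path, unlike the timeout branch.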