Cramming for the Journeyman Plumber Exam

Part I:  Record Flow in CMS Pipelines

Melinda Varian

Office of Computing and Information Technology
Princeton University
87 Prospect Avenue
Princeton, NJ 08544 USA

Email:  Melinda@princeton.edu
Web:  http://pucc.princeton.edu/~Melinda
Telephone:  (609) 258-6016

VM Workshop
June 1997


I.  INTRODUCTION

Once one has a thorough understanding of the flow of records through a pipeline, writing complex CMS Pipelines applications becomes much easier.  One is no longer troubled by pipeline stalls, so one can concentrate on the function required, rather than on the infrastructure.  Every journeyman plumber should understand record flow well enough to be able to write pipelines that give the effect of multi-tasking, such as, for example, a server that will interact with multiple clients concurrently.  Fortunately, once one understands the basics of record flow, the writing of such "multi-tasking" pipelines becomes an essentially trivial exercise.

I will be borrowing heavily from John Hartmann's paper CMS Pipelines Explained, as that is still the canonical work for understanding CMS Pipelines.  I will also be quoting from PIPELINE CFORUM postings by two of IBM's master plumbers, Glenn Knickerbocker and Steve Hayes.


II.  RECORD FLOW

The flow of records through a pipeline is under the control of the pipeline dispatcher, which dispatches the individual programs (or "stages") of the pipeline to make data flow in an orderly fashion.

   The programs in a pipeline do not exchange data directly; rather, they exchange data [under the auspices of] the pipeline dispatcher.  They call the pipeline dispatcher to read a record from an input stream or to write a record to an output stream.(1)

   The dispatcher does not buffer records between stages; instead, the dispatcher minimises the number of records in flight by giving priority to stages that can pump data out of the pipeline.

--------------------
(1)  All quotes not otherwise marked are from John Hartmann's CMS Pipelines Explained.


Reading and Writing Records

To see how records flow through a pipeline, let's step through the running of a very simple pipeline that displays the count of the words in a file:

   pipe disk input file | count words | console

1.  The dispatcher calls disk, which reads a record from the CMS file and calls the dispatcher to write the record into the pipeline.

2.  The dispatcher tests whether the stage that is connected to disk's output is waiting to read a record.  This is not the case; the dispatcher must suspend disk, because it cannot return to disk before the record has been processed.  It then looks for other work to do.

3.  The dispatcher now starts the next stage, count.  That is, it calls the entry point for count.

4.  count calls the pipeline dispatcher to read a record from the pipeline.  disk and count are now at a rendez-vous; disk has produced a record and count intends to consume one.  The dispatcher already has a record, which it passes on to count as it returns to count.  (Note that it does not call count again; there are no recursions into stages.)

5.  count scans the record it has just obtained and counts the words.  It then calls the dispatcher to get the next record, in effect discarding the record it has just processed.

6.  The dispatcher suspends count because it has no record available for it.  But now the record that disk wrote has been processed; the dispatcher can resume disk in the hope that it will then write another record, which can be passed on to count.

7.  Alas, the file has only one record.  disk has no more to do; it returns to the dispatcher.

8.  The initialisation/termination part of the dispatcher regains control and cleans up after disk.  It notices that count is waiting for its next input record, but clearly there will never be one; the dispatcher sets a return code for count to indicate end-of-file and resumes count; that is, it returns to count.

9.  count notes the end-of-file condition and writes a record containing the count of the words to its output.  It does this by calling the dispatcher entry point that writes a record into the pipeline, just as disk did.

10.  count is now in the same situation as disk in step 2; count is suspended; the dispatcher finally starts console.

11.  console calls the dispatcher to read a record.

12.  The dispatcher makes available the record that count just wrote and returns to console.

13.  console writes a line on the terminal and calls the dispatcher to read another record.

14.  The dispatcher suspends console and resumes count.

15.  count has no more to do and returns to the dispatcher.

16.  The dispatcher resumes console indicating end-of-file.

17.  console returns to the dispatcher.

18.  All stages of the pipeline have now returned to the dispatcher; the PIPE command returns to CMS.

You will see from this description that a program in a pipeline does not read all of its input records and write all of its output records and then terminate before the next program begins processing (as in DOS pipes), nor does a program in a pipeline produce an indeterminate number of bytes that must be buffered somewhere until the kernel switches to the next program to process those bytes (as in UNIX pipes).  Instead, when a stage in CMS Pipelines writes an output record, the pipeline dispatcher does not allow it to resume running again until after its output record has been read by the next stage in the pipeline.

   One might think that the dispatcher [would] gobble up all the output from disk [in the example above] and then present it as a unit to count....  Such a strategy would, however, limit the size of files that could be processed and cause unnecessary complications in the dispatcher; instead, the dispatcher transfers control between the stages in a pipeline using a strategy that is designed to expedite the flow of data and minimise the number of records in flight.

   The dispatcher never buffers the records flowing between stages, nor does it write them to an intermediate file.  In fact, the dispatcher never touches the records at all.
   Rather, when a stage writes a record into the pipeline, it simply gives the pipeline dispatcher a pointer to the buffer containing that record and waits until the dispatcher allows it to continue processing.  When a stage requests a record from the pipeline, the dispatcher just gives it the pointer to the record in the buffer belonging to the stage that produced the record.

   The stage on the left-hand side of a connection produces a record when it writes one into the pipeline.  When the stage on the other side of the connection consumes the record, the rendez-vous between the producer and consumer is over; each can continue its own processing.  The record has moved across the connection:  the producer can re-use the output area; the consumer can read another record.

Thus, you will see that the two programs must operate in lock-step.  The right-hand program must either consume the record or terminate before the left-hand program will be allowed to do any further processing.  This behavior is necessary because the record exists only in the buffer belonging to the left-hand program.  That program must not be allowed to regain control and possibly alter the contents of its buffer until after the consumer has moved the record out of the buffer.

The fact that the producing and consuming stages operate in lock-step is one of the tools that allow you to control the flow of records through the pipeline, a topic we will return to later.


Peeking at records

In a typical pipeline, one record goes through all of the stages from the beginning to the end before the next record starts that journey.  This behavior is so different from what most people expect that it often takes a while for new plumbers really to believe that pipelines behave this way.  Every beginning plumber should get hold of Chuck Boeheim's brilliant PIPEDEMO program and watch it as it runs its demonstration programs.  Watching PIPEDEMO display the progress of records through a few pipelines is a good way to become convinced that in general most pipelines have only one record passing through at a time.

I mentioned earlier that the dispatcher minimizes the number of records in flight by giving priority to stages that can pump out data.  This strategy would tend to result in the observed behavior, but in most cases the dispatcher has no real choice of which stage to dispatch next, because the producer and consumer of a record operate in lock-step, as we have just seen, and because most stages produce their output record before they consume the input record from which that output record was derived.  That is, most pipeline stages "peek" at the record written by the left-hand stage, write an output record derived from that input record, and then consume the input record.

   The peekto pipeline command ... peeks at a record without consuming it.  A peek is a non-destructive read; a particular record can be peeked any number of times before being consumed with a readto pipeline command.  When a peek completes, the producer is guaranteed to be blocked in a write, or there is no producer--the stream has been severed and is at end-of-file.  Blocking the producer ensures that the contents of a record are not changed while it is [still] being processed by a subsequent stage.

Thus, a stage can determine whether it likes a record before consuming it; in fact, a stage can produce a derivative of a record it has peeked before it consumes that record.

And, in fact, most pipeline stages do produce a derivative of their input record before they consume that input record.  They do this by processing each record through a peekto-output-readto loop.

One needs to be clear on the difference between the readto and peekto pipeline commands.  In MVS terms, this is the difference between a move-mode read and a locate-mode read.  That is, a readto command copies the record from the producer's buffer to the consumer's buffer, thereby freeing up the producer to produce additional records using that buffer.  On the other hand, peekto in effect operates on the record while it is still in the producer's buffer; therefore, the producer must not be allowed to run, lest it change the contents of its buffer.
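
To make this concrete, here is a minimal sketch (an illustration only, not a built-in program) of a user-written REXX filter built around the peekto-output-readto loop just described.  It merely upper-cases each record, something the built-in xlate stage already does, but it shows the shape that most stages share:

   /* UPCASE REXX -- illustrative peekto-output-readto loop */
   Signal On Error
   Do Forever                       /* Do until end-of-file.             */
      'PEEKTO record'               /* Peek; the producer stays blocked. */
      'OUTPUT' Translate(record)    /* Write the derivative record.      */
      'READTO'                      /* Only now consume the input.       */
   End
   Error: Exit RC*(RC<>12)

Because the output command completes before the readto is issued, the producer remains blocked until the derivative record has itself been consumed; this is the property that will later be described as "not delaying the record".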

Let's step through another simple pipeline to see how this peekto-output-readto scheme works:

   pipe < input file | change /abc/def/ | xlate | > output file a

1.  The dispatcher starts <, which reads a record from the CMS file and does an output to write the record into the pipeline.

2.  The dispatcher suspends <, which must wait for its output command to complete.

3.  The dispatcher looks for a stage that can consume the record and starts change.

4.  change does a peekto and is given the pointer to the record that < wrote to the pipeline.  < remains blocked, as its output command will not complete until change has consumed the record.

5.  change may or may not change the record.  Either way, it does an output to write the record to the pipeline.

6.  change is blocked in its write (as is <), so the dispatcher looks for another stage to run and starts xlate.

7.  xlate does a peekto to get the record, upper-cases it, and does an output to write the upper-cased record into the pipeline.

8.  Now, as you will see, <, change, and xlate are all blocked waiting for their output records to be consumed, so the dispatcher starts >, which is the only stage that can run.

9.  > does a peekto and writes the peeked record to a CMS file.  Then, since it has no output stream connected, > consumes its input record by doing a readto.  It is then ready to process another record, so it does a peekto to get the next record.  Since there is no record available, the dispatcher suspends > and looks for another stage to dispatch.

10.  xlate can now be dispatched, because its output command completed when > consumed the record with readto.  Now that the derivative record has been safely written, xlate does a readto to consume its input record and then does a peekto to get another record to process.

11.  There is no record available for xlate, so the dispatcher looks for another stage to run and finds that the only stage that can run is change, which can now be dispatched, because its output command completed when xlate consumed the record.  change can now do a readto to consume the input record it has finished processing.  It then peeks to get another record, which causes it to give up control to the dispatcher.

12.  The dispatcher finds that change, xlate, and > are all waiting for input, so the only stage it can dispatch is <, which can now run because its output command has completed.  < reads from the CMS file.
If there is another record, that record will flow through the pipeline in exactly the same way the first record did, going all the way to the end before the next record gets started through the pipeline.  And this sequence will repeat until all the records in the input file have been processed.

13.  When all the records have been read, < returns to the dispatcher, causing its output stream to be severed.

14.  The dispatcher looks for work to do and sees that the only stage it can dispatch is change, which is sitting in a peekto waiting for a record on the stream that has just been severed.  It dispatches change with a return code of 12 to indicate end-of-file on that peekto.

15.  change sees that it has no more input records, so it returns to the dispatcher, causing its output stream to be severed.

16.  Similarly, xlate and then > are dispatched with return code 12 on their peekto commands, so they also return to the dispatcher.

What happened here was that the last stage wrote each record before the first stage consumed it.  Stage 1 wrote a record and became blocked, stage 2 peeked it and wrote it and became blocked, stage 3 peeked it and wrote it and became blocked, etc.  Finally, stage n consumed it, stage n-1 consumed it, ... stage 1 consumed it.  Then the next record started through the pipeline, and it, too, shot through all of the stages as quickly as possible.

Thus, in this simple pipeline and, indeed, in most pipelines, the pipeline dispatcher never had a choice of which stage to run next.  Because most pipeline stages produce their output record before they consume the corresponding input record, the flow of records through a pipeline is completely predictable.  The dispatcher simply runs the stage that can consume a record, if there is one waiting to be consumed; if there is no record waiting to be consumed, it finds a stage that is runnable and runs it.


Pipeline Stalls

What happens when the dispatcher discovers that no stage can run?  The pipeline stalls.

A pipeline is stalled when no stage can run and no stages are waiting for external events, but not all stages have terminated.  A stall can occur only if at least one stage of a pipeline specification has secondary streams.  A stall is easily provoked; for example, this two-stage pipeline stalls:

     +---------+     +-------+
     | literal |---->|0     0|---+
     +---------+     |       |   |
                     | fanin |   |
                     |       |   |
                 +-->|1      |   |
                 |   +-------+   |
                 |               |
                 +---------------+

   pipe literal abc | i: fanin | i:

   Pipeline stalled.
   ... Issued from stage 2 of pipeline 1.
   ... Running "fanin".
   Stage is wait out.
   ... Issued from stage 1 of pipeline 1.
   ... Running "literal abc".
   Stage is wait out.
   ... Issued from stage 2 of pipeline 1.
   ... Running "fanin".
   Ready(-4095);

The label reference at the end of the pipeline specification defines fanin's secondary streams; its primary output stream is connected to its secondary input stream.

fanin works by first passing all records on the primary input stream to the primary output stream; it then passes all records on the secondary input stream to the primary output stream, and so on.  In this example, it peeks at the input record on the primary input stream and writes it to the primary output stream.  The dispatcher blocks fanin (makes it non-dispatchable) because it is waiting for an output record to be consumed.  But it is the very same stage that must consume the record, and (being non-dispatchable) fanin cannot possibly consume an input record to satisfy its own write.  The pipeline dispatcher now tries to find a stage to run to consume the record, but cannot find one; the pipeline is stalled.  To continue, the pipeline dispatcher then severs all connections and makes all stages dispatchable with return code -4095.

Pipeline stalls most commonly arise from this sort of pipeline topology:

      +----------+    +------+    +------+    +----------+
      |          |----|      |----|      |----|          |
      |          |    +------+    +------+    |          |
   ---| splitter |                            |  joiner  |---
      |          |          +------+          |          |
      |          |----------|      |----------|          |
      +----------+          +------+          +----------+

A "splitter" stage writes records to two or more output streams.  Later in the pipeline, a "joiner" stage reads records from those same streams.

As we have seen, when any stage writes an output record, it remains blocked until that record has been consumed by the stage connected to its output stream.  Thus, when a splitter stage (such as locate, chop, drop, or fanout) writes a record on one output stream, it must then wait until that record has been consumed before it can write another record to that same stream or to any other stream.

But, as we have seen, most pipeline stages do not consume a record when they first read it.  They process every record through a peekto-output-readto loop.  They peek their input record and write a derivative and do not consume their input until after they have written the derivative to their own output stream and it has been consumed by the stage to which they wrote it.

If all the stages in the multistream portions of a pipeline like the one shown above use peekto-output-readto loops, each of them passes the record along without consuming it until after each of the subsequent stages has consumed it.  Ultimately, then, the splitter stage must wait for the joiner stage to consume each record before the splitter can write the next one.  If the joiner stage cannot consume a record that the splitter stage is trying to write, the pipeline stalls.

If the joiner stage is faninany, this configuration will never stall, because faninany always reads any record that is available on any of its input streams.  Other joiner stages are more exacting, however.  Some, such as spec and synchronise, wait until they have a record available on each of their input streams before they consume any of them and then consume them in stream-number order.  Others, such as collate and merge, wait until they have a record available on each of their input streams and then choose which one to consume next based on the contents of the records.
fanin and lookup have the most extreme requirements; they consume all the records from one input stream before they will read any records from any other input stream.

A further complication is that a stage in the multistream portion of a pipeline may "buffer" the records; that is, some stages, such as sort and instore, consume all their input records before writing any output records and, thus, may keep the joiner stage waiting for records on one of its input streams.

So, in a pipeline where there is a stage that reunites streams that originated in a single stage earlier in the pipeline, there is a potential for pipeline stalls.  The splitter stage may try to write onto one stream while the joiner stage is trying to read from another stream.  When that happens, the pipeline stalls.

To prevent stalls, one inserts between the splitter and joiner stages (on one or more of the streams) a pipeline stage that will unblock the splitter stage by consuming the necessary number of records and holding them until the joiner stage is ready to read them.

The number of records that need to be "buffered" in this way varies.  In the case with the least requirement for such buffering, the splitter stage writes a record to each of its output streams in stream-number order; the joiner stage reads records in stream-number order; and all of the intermediate stages contain peekto-output-readto loops.  In this case, the flow of the records is not data-dependent, and a stall can be prevented simply by introducing copy stage(s) on the low-numbered stream(s):

      +----------+    +------+    +----------+
      |          |----| copy |----|          |
      |          |    +------+    |          |
   ---|  fanout  |                |   spec   |---
      |          |    +------+    |          |
      |          |----|      |----|          |
      +----------+    +------+    +----------+

copy is a very simple built-in program consisting of a readto-output loop.  That is, copy violates the rule we have been discussing; it does not use peekto.  This would be the equivalent of copy in REXX:

   /* COPY REXX */
   Signal On Error
   Do Forever                /* Do until end-of-file.   */
      'READTO record'        /* Consume a record.       */
      'OUTPUT' record        /* Copy it to the output.  */
   End
   Error: Exit RC*(RC<>12)

copy does a consuming read to get a record and then copies that record to its output stream.  So, the copy stage in the diagram above consumes the record that fanout writes to its primary output stream, freeing fanout to write a record to its secondary output stream.  Meanwhile, copy writes that first record to its output stream.  As a result, spec then finds records available on both of its input streams and consumes them both, freeing fanout to write to its primary output stream again and freeing copy to read from that stream again.

   The readto pipeline command [in copy] performs a consuming read; the producer is not blocked and can produce another record while [copy's] output pipeline command ... is blocked in its write.  Whether this actually happens depends on the implementation of the pipeline dispatcher; this is deliberately unspecified.

In other words, in general, either copy or fanout could write the next record.  In the general case, that could introduce some indeterminacy into your pipeline.  In this specific case, however, the pipeline is completely determinate.  spec will not process a record from either input stream until it has peeked a record on both input streams.  Therefore, it doesn't matter whether copy or fanout produces its output record first, for spec will not process either record until it has peeked both of them.  copy has introduced just enough buffering to prevent a stall.
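
As a concrete (and admittedly artificial) rendering of the diagram above, the sketch below joins the two copies of each record that fanout produces; the input file name is hypothetical, and the spec arguments follow the same pattern as the examples later in this paper:

   /* Hypothetical illustration of the fanout/copy/spec topology */
   'PIPE (endchar ? name CopyUnblocks)',
      '< input file |',                      /* Read some records.        */
      'f: fanout |',                         /* Two copies of each.       */
      'copy |',                              /* Absorb the primary write. */
      's: spec 1-* 1 select 1 1-* next |',   /* Join the two copies.      */
      '> output file a',
      '?',
      'f: |',                                /* Secondary copy.           */
      's:'                                   /* Into spec's secondary.    */

Without the copy stage, fanout would be left blocked in its write to spec's primary input while spec waited for a record to appear on its secondary input, and the pipeline would stall.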

At the other extreme in the amount of buffering required to prevent a stall is the case where the joiner stage is fanin.  Because fanin will read no records from its secondary stream until it has read all the records from its primary stream, one must introduce a stage such as buffer to consume all the records on the secondary stream and hold them until fanin is ready for them:

      +----------+    +--------+    +----------+
      |          |----|        |----|          |
      |          |    +--------+    |          |
   ---|  fanout  |                  |  fanin   |---
      |          |    +--------+    |          |
      |          |----| buffer |----|          |
      +----------+    +--------+    +----------+

Otherwise, the splitter stage (fanout here) would become blocked trying to write the first record to fanin's secondary input and would never be able to write records 2-n to fanin's primary input.

The same requirement to insert a stage to buffer all the records on a stream may arise because one of the other streams contains a stage, such as sort, that buffers all the records:

      +----------+    +--------+    +----------+
      |          |----|  sort  |----|          |
      |          |    +--------+    |          |
   ---|  fanout  |                  |   spec   |---
      |          |    +--------+    |          |
      |          |----| buffer |----|          |
      +----------+    +--------+    +----------+

In the intermediate case, one or more of the streams may need some buffering of records, depending on the order in which the splitter stage decides to write and the order in which the joiner stage decides to read.  This is a job for elastic:

      +----------+    +---------+                +-----------+
      |          |----| elastic |----------------|           |
      |          |    +---------+                |           |
   ---|  fanout  |                               |  collate  |---
      |          |    +---------+    +--------+  |           |
      |          |----|  xlate  |----| locate |--|           |
      +----------+    +---------+    +--------+  +-----------+

elastic copies its input records to its output, buffering as many as may be necessary to prevent a pipeline stall.  It reads input records whenever they become available and writes output records as they are consumed, while attempting to minimize the number of records in its buffer.

The fanout stage here makes two copies of each record.  The copy that is written to fanout's primary output stream is not changed as it passes through this pipeline.  The copy that is written to fanout's secondary output stream is modified and filtered by a series of stages.
The collate stage reads records from both of its input streams, to find pairs that have matching keys.  When collate finds a pair of records that match, it copies them to its output stream, discarding all others.

The elastic stage here prevents a pipeline stall by consuming the records fanout writes on its primary output stream, as it writes them, thus freeing fanout to write to its secondary output stream.  elastic then makes the records available to collate whenever it wants them, so that collate always has a record available on its primary input when locate selects a record and writes it to collate's secondary input.  elastic's buffering introduces no indeterminacy here, because the order of the records written by collate is determined by the contents of the records, not by the timing of their arrival.

copy would not do the job here because more than one record may need to be buffered at a given time, depending on how many records locate discards.  buffer could be used, but there would then need to be a buffer on the other stream as well, and collate would see no records at all until after fanout had terminated.  This would require enough virtual memory to hold one whole and one partial copy of the input.  Thus, elastic is preferred here because it does just the minimum buffering required to prevent this pipeline from stalling.

In general, one can use elastic wherever buffering is needed to prevent a stall.  However, if one knows that no more than one record will ever need to be buffered at a time, copy is more efficient than elastic, while if one knows that the entire file must be buffered, buffer is more efficient than elastic.


Debugging Pipeline Stalls

If one keeps these models in mind, it is usually straightforward to write pipelines that don't stall, but now and then one slips up, and a pipeline stalls.  If the pipeline is complex, the cause of the stall may not be obvious, but fixing stalls is not difficult if one understands how records flow through a pipeline.

When fixing a stall, it is important to keep in mind that pipeline stalls are the penalty we pay for the fact that the pipeline dispatcher moves records through a multistream pipeline in such a way that the order of their arrival at the end of the pipeline is predictable.  By reasoning carefully about record flow, one can learn to prevent stalls while retaining predictability.  "Fixing" a stall by introducing bunches of buffer or copy or elastic stages at random may cause your pipeline to use more virtual memory than necessary, but, more importantly, it may introduce indeterminacy into the pipeline.  It is well worth learning to cure stalls by inserting only the minimal buffering required to solve the problem.

There are three tools that you can use to analyze a stall:

1.  Pipe dumps:  When a pipeline stalls, a file called PIPDUMP LISTING is written to the A-disk.  It contains a dump of Pipelines control blocks somewhat cryptically formatted.  It is possible to solve a stall by reading a PIPDUMP.  I did it once, but I don't recommend it.  In fact, you may prefer to turn PIPDUMPs off by issuing the CMS command GLOBALV SETP PIPDUMP OFF.

2.  Stall traces:  When a pipeline stalls, a stall trace is written to the console unless you have suppressed Pipelines messages by modifying the MSGLEVEL setting.(2)  It is quite feasible to shoot a stall using the stall trace.

--------------------
(2)  The option nomsglevel 6 disables the message "Issued from ...", but retains the message "Running ...".

Here is a pipeline that stalls:

   'PIPE (nomsglevel 6 endchar ? name StallingPipeline)',
      '< input file |',             /* Read input file.          */
      'f: find V|',                 /* Select detail records.    */
      'l: lookup 13.8 master |',    /* Compare with masters.     */
      '> output file a',            /* Write matched masters.    */
      '?',
      'f: |',                       /* Non-detail records here.  */
      'locate 2 /C/ |',             /* Select master records.    */
      'l:'                          /* Into LOOKUP's secondary.  */

   Pipeline stalled.
   ... Running "> output file a".
   Stage is wait out.
   ... Running "< input file".
   Stage is wait out.
   ... Running "find V".
   Stage is wait loc.
   ... Running "lookup 13.8 master".
   Stage is wait loc.
   ... Running "> output file a".
   Stage is wait loc.
   ... Running "locate 2 /C/".
   Ready(-4095);

One thing you need to know is that the "Pipeline stalled" message in the trace does not point to the stage responsible for the stall; it just points to the last stage the dispatcher tried to dispatch.  In other words, the dispatcher had tried to dispatch all of the other stages and had been unable to, so when it got to this one it realized that the pipeline was stalled.  In fact, a stall is not caused by a specific stage; it is caused by the topology of the pipeline.

The approach I find most useful in debugging a stall is to check the stall trace for all stages that are marked "wait out".  Those stages were blocked waiting for their output to be consumed.  I next look at the stages that were supposed to consume those records and try to figure out why they could not consume them.  Either they are also marked "wait out" (i.e., we have a cascade of stages waiting for the same record to be consumed), or they are marked "wait loc", which means that they were waiting for input, but on a different input stream.  Until that other stream produces an input record, the stage marked "wait loc" will not switch to read the record from the stage marked "wait out".  When you have figured out why that other stream did not produce a record, you have figured out the cause of your stall.

Looking at the PIPE command and this stall trace, we see:

o  < is waiting for its output to be consumed.  Its only output stream is connected to find, which is also waiting for its output to be consumed.  Therefore, find had peeked a record from <, leaving < blocked in an output while find tried to write that record itself.

o  find has two output streams.  Its primary output is connected to lookup, which is waiting to read input.
If it had been the case that (1) find was trying to write to its primary output and (2) lookup was trying to read from its primary input, they would have been able to do so.  Therefore, we can conclude that at least one of those statements is untrue.

o  The input stream of locate is connected to the secondary output of find, and locate is waiting to read.  If find had been trying to write to its secondary output, locate would have been able to read the record.  From that we can conclude that find must have been trying to write to its primary output.

o  Therefore, lookup must not have been reading from its primary input, which is why the pipeline stalled.

You will recall that lookup must read all records from its secondary input before it reads any records from its primary input.  (That is, it reads all of the master records into its reference before it begins processing its detail records and matching them against that reference.)  In this pipeline, find can't write any records to its primary output until lookup begins reading from its primary input, but lookup won't start reading from its primary input until it has seen end-of-file on its secondary input.  It will not see end-of-file on its secondary input until find has read all of its input records and copied them to either its primary output or its secondary output.  This is an impasse, which is why the pipeline stalled.

To fix the stall, one must allow find to write to its primary output when it finds a record that should go to that stream.  The way to do that is to insert a buffer stage between find and lookup to absorb all of the records that find writes to its primary output and hold them until lookup begins reading from its primary input:

   'PIPE (endchar ? name NonStallingPipeline)',
      '< input file |',             /* Read input file.          */
      'f: find V|',                 /* Select detail records.    */
      'buffer |',                   /* Hold until masters read.  */
      'l: lookup 13.8 master |',    /* Compare with masters.     */
      '> output file a',            /* Write matched masters.    */
      '?',
      'f: |',                       /* Non-detail records here.  */
      'locate 2 /C/ |',             /* Select master records.    */
      'l:'                          /* Into LOOKUP's secondary.  */

Note that introducing a buffer before the lookup will not introduce unpredictability into this pipeline.  The order of the output records here is determined solely by the order of the records on the primary input of the lookup, which is not changed by holding those records in a buffer for a while.

Here is another pipeline that stalls:

   /* BADUPW REXX -- *BAD* Uppercase word in left margin */
   signal on novalue
   'callpipe (name BadUpW nomsglevel 6 endchar ?)',
      '| *:',
      '|c: chop blank',                    /* Truncate after the first word */
      '| xlate upper',                     /* Make first word uppercase     */
      '|i: spec 1-* 1 select 1 1-* next',  /* Rebuild record                */
      '| *:',
      '?c:',                               /* Rest of record                */
      '|i:'                                /* Into SPEC                     */
   exit RC

   pipe (nomsglevel 6) literal abc def | badupw | console

   Pipeline stalled.
   ... Running "console".
   Stage is wait out.
   ... Running "literal abc def".
   Stage is unavail.
   ... Running "badupw".
   Stage is wait loc.
   ... Running "console".
   Stage is wait out.
   ... Running "chop blank".
   Stage is wait out.
   ... Running "xlate upper".
   Stage is wait loc.
   ... Running "spec 1-* 1 select 1 1-* next".
   Ready(-4095);

From this stall trace, we see:

o  The REXX filter badupw is marked "unavail"; this means that its callpipe command is running.  Therefore, the input and output streams of the badupw stage have been taken over by the subroutine pipeline; specifically, the output of the literal in the main pipeline is connected to the input of the chop in the subroutine pipeline, and the output of the spec in the subroutine pipeline is connected to the input of the console stage in the main pipeline.

o  literal is waiting for its output record to be consumed by chop.

o  chop is also waiting for its output to be consumed, but we don't yet know to which output stream that record is being written.

o  xlate is also waiting for its output to be consumed.  We know that xlate does not consume its input until after its output has been consumed, so xlate must have blocked chop by peeking at chop's output.  This tells us that chop is trying to write to its primary output stream and that the cascade of literal, chop, and xlate must be working on the same record.

o  spec is waiting to read.  Given that xlate is waiting to write to the primary input of spec, we can assume that spec must be trying to read from its other input; otherwise, spec would read xlate's output.

o  We have already concluded that chop is blocked trying to write to its primary output, so it will never be able to satisfy spec's read from its secondary output, which is why the pipeline is stalled.

The problem is that chop produces a record on two output streams one at a time, but spec requires both of its inputs to be available at the same time.  The fix is quite simple.
One inserts a copy on the primary output of chop to buffer one record:

   /* BADUPW REXX -- *GOOD* Uppercase word in left margin */
   signal on novalue
   'callpipe (name GoodUpW nomsglevel 6 endchar ?)',
      '| *:',
      '|c: chop blank',                    /* Truncate after the first word */
      '| copy',                            /* Add quantum delay so no stall */
      '| xlate upper',                     /* Make first word uppercase     */
      '|i: spec 1-* 1 select 1 1-* next',  /* Rebuild record                */
      '| *:',
      '?c:',                               /* Rest of record                */
      '|i:'                                /* Into SPEC                     */
   exit RC

Again, this fix does not introduce unpredictability, because the flow of records is still determinate.  The two records that chop produces from each original input record are still processed by spec at the same time, so the order of the records is not perturbed.

3.  Jeremy:  If you have a data-dependent stall, you may need to see the contents of the records that are being written before you can determine the problem.  My pipeline formatter jeremy(3) can help by displaying the contents of your pipeline's output buffers.  jeremy will also tell you which stream each stage was reading from or writing to, so you can skip a good deal of the analysis we needed to do when we were looking at the two traces above.  Here is the output from jeremy for the stall we have just analyzed:

   Pipeline specification 1 "JeremyTest"
   Stage 1 "literal abc def" is wait.out: "abc def".
   Stage 2 "badupw" is wait.subr.
   Stage 3 "console" is wait.locate.
   Pipeline specification 2 "BadUpW"
   Stage 2 "chop blank" is wait.out: "abc".
     input 0 selected.
     output 0 selected.
   Stage 3 "xlate upper" is wait.out: "ABC".
   Stage 4 "spec 1-* 1 select 1 1-* n" is wait.locate.
     input 1 selected -- producer's stream not selected.
     output 0 selected.

You will note that the records being written by the stages that are blocked in output commands are displayed, and we are told that spec has selected its secondary input stream but that its producer (chop) does not have that stream selected.

--------------------
(3)  jeremy is available from Princeton's CMS Pipelines Runtime Library Distribution Web page, http://pucc.princeton.edu/~pipeline/.  jeremy will be built into Pipelines in CMS 14.

One other comment about pipeline stalls:  if you get a stall with one or more stages marked "wait com" ("wait for commit"), seek professional help.


Dispatching Order

As we have seen, stalls happen when the dispatcher can find no stage to run.  Now, let's consider what happens when the dispatcher has a choice of stages to run.  Actually, the dispatcher often has some choice of which stage to run next.  In the pipelines we stepped through earlier, there were places where I glossed over this point, because it made no difference.
For example, when the readto command completes in a stage that contains a peekto-output-readto loop, there are, at that instant, two stages that could run, the consumer stage (the stage that did the readto) and the producer stage (the stage that issued the output command that produced the record that was consumed by the readto).  Usually, it makes no difference which gets dispatched next, because nothing useful will happen until the producer stage runs.  If the consumer stage is dispatched first, it will immediately become blocked in its next peekto, which won't complete until the producer is dispatched and produces a record.  The two stages will operate in lock-step, no matter what the dispatching order.

Hanging around the Plumbers' Hall, apprentices hear that "the order of dispatching is unspecified".  This is quite true, but it is usually irrelevant.  As the Piper has said, "Even though the order of dispatching is unspecified, if only one stage can run, that will be the one running".  However, in those relatively rare pipelines in which the dispatcher has a real choice of which stage to run next, the algorithm by which it makes that choice is not documented and, in fact, has changed over the years as the dispatcher has been improved.

Typically, the lesson that the dispatching order is unspecified doesn't really sink in until one learns it the hard way.  Often, this is when one writes a pipeline with more than one stage that can be started first, and one assumes that the order in which the stages will begin executing is the order in which they appear in the pipeline specification.

In fact, of course, any stage can be started first, but that doesn't matter, because most stages try to read from their input stream as soon as they are started.  If they have no input waiting, they become blocked immediately and can't do anything useful until they get some input.  Thus, the order in which they are actually started is irrelevant.  However, there are stages for which that is not true.  One case that trips novices up is the command processor stages (e.g., cms, command, cp, subcom, etc.).  When those stages have a command specified as their argument, they will issue that command as soon as they are started.  And there is no reason why the dispatcher can't start them before other stages, even if they are positioned very late in the pipeline specification.  If you want to issue a command only after earlier portions of the pipeline have finished processing, this is the idiom to use:

   pipe . . . | hole | append command FINIS * * A | hole

append passes all of the records from its input to its output and doesn't launch the stage specified in its argument until after it has finished passing all of those records.  The process of copying the input records to the output completes as soon as end-of-file is received on the input stream or on the output stream.  The first hole stage here runs until it receives end-of-file on its input.  When that hole stage terminates, append sees that it has no more input records to copy to its output, so it runs the command stage.  There is nothing tricky about that; it is the way people usually use append.  What makes this case tricky is that append is free to run its argument stage earlier than one might have expected, if its output is not connected.  If it discovers that its output is not connected, it will not try to copy its input to its output.  It will then run its argument stage as soon as the dispatcher dispatches it.  So, the purpose of putting a stage after append here is to keep append from getting premature end-of-file on its output.  Some other stage, such as console, would do instead of hole, but the output of append should be connected when append is used to wait to issue a command until after earlier portions of the pipeline have completed.
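
For instance, in this hypothetical example (the file names are made up, and the FINIS is there only to illustrate the ordering), the command is not issued until the whole file has been copied:

   pipe < daily data a | > archive data a | hole | append command FINIS * * A | hole

The first hole absorbs the records that > passes along, so append does not see end-of-file on its input until the copy is complete; the final hole keeps append's output connected, so that append does not run its argument stage prematurely.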

Another case that trips up plumbers in training is a pipeline that contains two independent input device drivers.  Again, there is a tendency to assume that stages run in the order they occur in the pipeline specification.  Here is an example:

   'PIPE (endchar ?) var banana | f: faninany | > output file a',
      '? stem glorp. | f:'

What will be the order of the records in the output file?  Novices may assume that the output of the var stage will be first, followed by the records from the stemmed array.  That may be the case, but it could be the other way around, or the records could be intermixed.  (In fact, the results have been different at different levels of CMS Pipelines.)  Both var and stem are able to run as soon as the pipeline starts up, so the dispatcher could choose to run either of them first.  Since the sources of the two streams are independent, the order in which the records arrive at faninany is left to the whim of the dispatcher.

Nothing in this pipeline constrains the relative order in which the two input device drivers (var and stem) are initiated and dispatched; they operate completely independently of one another, each writing its output to faninany.  faninany has no preference for one of its input streams over the other.  Each device driver operates in lock-step with faninany, in the sense that neither can produce another output record until after faninany has consumed the earlier record from that device driver, but the order in which faninany will read from its streams when both have a record available is not specified, nor is the order in which the two device drivers will be started.

In a case like this, where multiple input device drivers operate independently of one another, the plumber must take steps to constrain the order of the records in the output file to be what he wishes.  In this case, the most obvious solution is to replace faninany with fanin, as fanin will read all of the records produced by one of the device drivers before reading any of the records produced by the other.  Which stream is read first is determined by the arguments to fanin.
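
For example, this variant (with the same hypothetical banana variable and glorp. stem) replaces faninany with fanin; because fanin with no arguments reads its primary input to end-of-file before touching its secondary input, all of the var records precede all of the stem records in the output file:

   'PIPE (endchar ?) var banana | f: fanin | > output file a',
      '? stem glorp. | f:'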

Both device drivers may get started and produce a record before fanin is started.  One will then have to wait for that record to be consumed until after fanin has read all of the records from the other device driver.  Or, in fact, one of the device drivers may not even be dispatched to produce its first record until after the other has terminated.

It is possible to determine the dispatching order by tracing a pipeline.  Indeed, there have been numerous misguided requests to document the relative starting order of various input device drivers.  It is a mistake, however, to write pipelines that depend on that order.  The order has changed in the past and is likely to do so again.  Pipelines has mechanisms for enforcing deterministic behaviour, and it behooves the journeyman plumber to learn to use them.


Record Delay

Now, let's go on to the more interesting case where one can get burned by making assumptions about dispatching order.

In one's early career as a plumber, one can get along nicely without understanding the concept of record delay, but later it becomes essential, so we will discuss record delay at some length.  In fact, we have already been discussing record delay, but without calling it by name.  I have avoided the term "delay the record" up until now, because it seems to be a difficult one to absorb.  As Steve Hayes has said:

   Perhaps "distract the record" would be a better term.

   "Record delay" occurs when a stage tells the stage upstream that it has finished with a record before this is really true.  The plumbers' jargon for this is "consuming the record".  A stage that does a readto before an output unblocks its upstream stage while it still has work to do, leaving two stages able to run at the same time.  You can't predict which the pipeline dispatcher will choose, so your records may arrive in the "wrong" order.

   There can be many good reasons for doing this; for instance, unique last must do this, because it needs to read record n+1 before deciding whether to write record n, and the Pipelines model ensures that a stage can get records from a stream only one at a time.

   But if you have a simple peekto-output-readto loop, the output completes before the readto occurs, which ensures that the downstream stage has finished with the record before it tells the upstream stage that it has finished with it.(4)

--------------------
(4)  From PIPELINE CFORUM, 6 April 1994, 15 May 1996, and 6 September 1996.

In other words, a stage that processes its records through a peekto-output-readto loop does not delay the record.  Thus, a stage that does not delay the record has the properties we have already been discussing:

   In general, in a cascade of stages that do not delay the record, a single record moves from stage to stage, leaving all the stages it passes through blocked.  When the record reaches the end of the pipeline, then the last stage, which does no output operation to the pipeline and thus does not become blocked in a write, can consume the record; this unblocks the next-to-last stage, which can then consume its input record, unblocking the previous stage, until all the stages unblock, like dominoes falling.

Thus, a cascade of filters that do not delay the record behaves as a unit.  Only one record will be traversing the cascade at any one time; the order of the individual records cannot be changed in the course of traversing the cascade.  This remains true even if the cascade includes multiple streams and even if there are different numbers of stages in different paths.  Records leave the cascade in the same order they enter it; that is, they are not delayed relative to one another.
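
To see what a stage that does delay the record might look like, here is a hypothetical user-written filter in the spirit of unique last:  it must consume record n+1 before it lets go of record n, so its upstream stage is unblocked while the held record is still inside the filter, and a record on a parallel path can overtake it.  (This is an illustration only, not a built-in program.)

   /* HOLDONE REXX -- hypothetical stage that delays the record by one */
   Signal On Error
   'READTO held'                /* Consume the first record; hold it.  */
   Do Forever
      'READTO record'           /* Consume record n+1; the upstream    */
                                /* stage is now free to run.           */
      'OUTPUT' held             /* Only now write record n.            */
      held = record
   End
   Error:
   If RC = 12 & Symbol('held') == 'VAR' Then 'OUTPUT' held  /* Flush at end-of-file. */
   Exit RC*(RC<>12)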

Record delay is relevant only in multistream pipelines, so let's look at a simple multistream pipeline that appends a certain string to any record that contains the name "Sid":

   'PIPE (endchar ?)',
      '< input file |',
      'l: locate /Sid/ |',
      'spec 1-* 1 /(!)/ next |',
      'f: faninany |',
      '> output file a',
      '?',
      'l: |',
      'f:'

Assuming the input file contains the records on the left [below], we would expect the output file to contain the records on the right:

   Don't forget        Don't forget
   to tell Sid         to tell Sid(!)
   about this.         about this.

There would be cause for concern if the records in the output file were not in the same order as the records in the input file, but how can we be sure this is the case and how can we reason about other pipeline topologies?

Let's walk the three records through this pipeline network:

      +----------+    +--------+    +------------+
      |          |----|  spec  |----|            |
      |          |    +--------+    |            |
   ---|  locate  |                  |  faninany  |---
      |          |                  |            |
      |          |------------------|            |
      +----------+                  +------------+

1.  locate peeks at the first record, which does not contain the required string; locate writes the record to its secondary output stream rather than discarding it.  locate is now blocked in a write.  What the dispatcher now does is "unspecified", but it is clear that no further data can flow before faninany reads from its secondary input stream.

2.  faninany starts and issues a call to the dispatcher to wait until a record is available on any one (or more) of its input streams.

3.  faninany is resumed, because there is a record available on its secondary input stream.  We know that locate is blocked after it has written the first record to its secondary output stream.  spec is probably waiting for an input record, but locate cannot produce one on its primary output stream while it is blocked waiting to complete the write on its secondary output stream.

4.  faninany peeks at the record on its secondary input stream and writes it to its primary output stream.  The record has now left the multistream part of the pipeline; locate is still blocked waiting for its output to complete.

5.  The stage connected to faninany's output stream consumes the record.  faninany can now run; it consumes the record on the secondary input stream (which releases locate) and waits for the next input record on any of its input streams.

6.  locate consumes the first input record and peeks at the second one, which does contain the specified string; locate writes that record to its primary output stream.

7.  spec finally gets a record to work with.  It builds the output record in a work area (copying the input record and appending the literal string) and writes the amended output record.  Both locate and spec are now blocked waiting for their output to complete.
faninany is resumed; it peeks at the record, writes it, consumes it (which releases spec), and waits for the next record wherever it may appear. 0 9. spec is resumed; it in turn consumes the record (releasing locate)(5) and peeks for some more to do. 0 10. locate consumes the second record, peeks the third, and writes it to the secondary output stream because it does not contain the specified string. locate is now blocked in a write; spec is blocked in a peek; again, only faninany can run. 0 11. faninany writes the third record, consumes it (releasing locate), and waits for the next one. 0 12. locate receives end-of-file on its input and returns. This severs its two output streams. 0 13. spec is resumed with an indication of end-of-file; it also returns, severing its output stream. 0 14. faninany is resumed with an end-of-file return code because all of its input streams are now severed; it returns as well. 0 We see that the output records arrive in the expected order, because peeking a record ensures that the producer cannot run and thus cannot produce another record while the first one is still in flight. The important piece of information is that spec does not delay the record. This means that a record being processed by spec cannot be overtaken by another record that takes a different path. 0 So far we have met only built-in programs that do not delay the record; are there any that do? sort must of necessity see all input records before it can begin to produce output. However, a more subtle form of delay occurs with unique last, which compares pairs of records and discards duplicates, retaining the last of a set of identical records. Clearly, unique last must buffer one record internally while it examines the next one; as a result the output record is delayed relative to records that take a path that bypasses the unique last stage. Now, insert unique last instead of spec in the multistream example and run the file through it again: 0 0 -------------------- 0 (5) Both spec and the stage to its left [(locate)] can now run; conceptually they run in parallel, but spec will be blocked if it peeks and there is no new record yet available. 1 0 Page 26 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 +----------------------------------------------------------------+ | | | +----------+ +--------+ +----------+ | | | |---| unique |---| | | | | | +--------+ | | | | ---| locate | | faninany |--- | | | | | | | | | |----------------| | | | +----------+ +----------+ | | | +----------------------------------------------------------------+ 0 Watch this happen: 0 1. The first record does not contain the specified string; it bypasses unique and becomes the first output record. 0 2. The second record is read into unique's buffer; unique now peeks to see the next input record. 0 3. The third record bypasses unique as well. Thus it becomes the second output record. 0 4. Finally, unique receives end-of-file; it writes the record from its buffer, making it the third output record. 0 Note that even though unique has only a one-record delay, the effect of the pipeline topology may mean that many more records can overtake the one being buffered. 0 Thus, to reason about this in general, the concept of a record + ______ delay (or simply delay) is introduced. This delay is not the time + _____ it takes for a stage to process a record; rather, it represents a change in the relative order of records that pass through different paths in a multistream pipeline. 
If a record takes a path entirely through stages that do not delay the record, then the record must arrive at the end of the pipeline ahead of any record that enters the pipeline path after it does. If a record passes through a stage that delays records, then the record may arrive at the end of the pipeline later than a record that takes a path without delay. 0 All CMS Pipelines built-in programs are written not to delay the + _____________ record, except where required by the function they perform. A user-written stage might delay records if it processes them with a readto-output loop; it will not delay them if it processes them with a peekto-output-readto loop. 0 Appendix A contains examples of user-written stages with various degrees of delay. The author's help files (which you can display using the undocumented pipe ahelp command) define quite rigorously the record delay introduced by each built-in stage. Let's look at a few examples from these help files: 1 0 Record Flow in CMS Pipelines Page 27 ------------------------------------------------------------------------ 0 o Strictly non-delaying stage (console): 0 Record Delay: console strictly does not delay the record. 0 Like most other stages, console uses a peekto-output-readto loop, so it writes each output record before it consumes the corresponding input record and does not delay the record. 0 o Strictly non-delaying stage (delay): 0 Record Delay: delay strictly does not delay the record. That is, delay consumes the input record after it has copied it to the primary output stream; records are delayed in time, but the relative order of records that originate in a particular filter is unchanged. 0 delay also uses a peekto-output-readto loop, so records passing through it, no matter how long they may take, cannot be bypassed by records going through the pipeline on a parallel stream, if none of the stages delays the record. Note, however, that this does not mean that the entire pipeline necessarily waits while a delay stage is waiting for its timer to expire. Whether that happens depends upon the topology of the pipeline, but if all the stages between delay and the source of its records are ones that do not delay the record, that source stage will not be able to produce other records before delay's timer has expired. 0 o Non-delaying stage (getfiles): 0 Record Delay: getfiles writes all output for an input record before consuming the input record. 0 getfiles does not delay the record. It writes all of the records from a file it has gotten before it consumes the input record that specified the name of that file. The reason that getfiles is not described as being "strictly" non-delaying is that it can produce more than one output record for each input record. When getfiles is used in a multistream pipeline containing only non-delaying stages, all of its output records derived from a given input record will get to the end of the pipeline ahead of the records derived from later input records, no matter what path they take through the pipeline. That is, it truly does not delay the record. 0 o Stage with different levels of delay on different streams (cms): 0 Record Delay: cms writes all output for an input record before consuming the input record. When the secondary output stream is defined, the record containing the return code is written to the secondary output stream with no delay. 
0 When a cms stage issues the command contained in an input record, it writes all of the lines of the response to its output before it 1 0 Page 28 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 consumes that input record. In this, it is exactly parallel to getfiles; neither of them delays the record. cms (like most of the other command processor stages) has the added wrinkle that it writes the return code from the command to its secondary output, if there is a secondary output. One return code record is written for each input record, and the return code record is written before the input is consumed, so the return code record is strictly not delayed. 0 o Stage with different delay depending on the keywords used (drop): 0 Record Delay: drop first does not delay the record. drop last delays the specified number of records. 0 drop first reads and discards the specified number of records and then shorts its input to its output, so it does not delay the records; indeed, it does not even see them. 0 drop last must hold the specified number of records in a rotating buffer until end-of-file, when it discards them. As it reads each new record from its input, it can write out the record that has been in its buffer the longest, keeping the specified number of records in its buffer. Thus, it delays the specified number of records. 0 o Potentially delaying stage (copy): 0 Record Delay: copy has the potential to delay one record. 0 The copy program can be thought of as a one-step elastic; it can delay one record, but it need not. As we have seen, copy uses a readto-output loop. After copy consumes its input record, the dispatcher may allow it to write its output record immediately, in which case the record will not be delayed. However, the dispatcher may instead allow the producer stage to run and produce another output record before copy is allowed to write its output record, in which case the record is delayed by one. 0 o Stage with unspecified delay (block): 0 Record Delay: block delays input records as required to build an output record. The delay is unspecified. 0 There is not a one-to-one equivalence between the input and output records for block; it may need to span the bytes from a given input record across more than one output record, so the record delay is undefined. 0 o Stage that delays the entire file (hole): 0 Record Delay: hole delays all records until end-of-file. 0 hole writes no output records, but it keeps its consumer stage blocked in a peekto until after it has seen end-of-file on its own input; thus, it delays all records until end-of-file. 1 0 Record Flow in CMS Pipelines Page 29 ------------------------------------------------------------------------ 0 Now, let's review the concept of record delay by looking at explanations by a couple of master plumbers. First is Glenn Knickerbocker: 0 "Delay", in Pipelines terms, is any time that elapses between + _________ releasing the input record and releasing the output record. 0 When every stage releases its output record before releasing its input record, every record must go through the whole pipeline before another record can be processed. The record may take a microsecond or an hour to get through the pipeline, but it is not delayed in relation to the other records. 0 When any stage releases its input record before releasing its output record, processing may (or may not) start on the next record before this one is fully processed. 
This record has been delayed in relation to the next record entering the pipeline. 0 [unique last] delays the record by exactly one record. Each output record is released after the next input record arrives and before it is released. 0 A readto-output loop, on the other hand, delays the record by some unspecified amount less than or equal to one record. The exact amount is determined by the rest of the pipeline and the whim of the dispatcher. But it is the act of releasing the input record before the output record that introduces the delay in the first place. 0 You might say the terminology is backwards, and what is really happening is not that the stage has delayed processing of the first record but that it has allowed processing of the next record to begin early. But remember, despite how it looks from outside, Pipelines does only one thing at a time. If it is processing the + _________ next record, it is not processing this one, so the processing of this record is delayed.(6) 0 And this is from Steve Hayes: 0 I find the best way to look at the term "delay the record" is that a stage that delays the record contains a place where that record stays whilst other processing goes on.... A readto-output loop contains such a place: the variable used on the readto. A peekto-output-readto loop does not, since the peekto, unlike the readto, does not allow stages upstream to execute and so the record has "left" its variable before any other processing goes on, and therefore there is no delay. 0 0 -------------------- 0 (6) From PIPELINE CFORUM, 5 September 1996. 1 0 Page 30 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 An important thing to realise is that there are stages which may + ___ delay records (e.g., copy and elastic) and ones which do delay + ____ __ records (e.g., [unique last]). The former can introduce needed + ____ flexibility or unwanted (and often unnoticed) indeterminacy in the pipeline. The latter cannot, although they can produce a stall (and you then may need one of the former to solve it).(7) 0 III. WRITING "MULTI-TASKING" PIPELINES + III. WRITING "MULTI-TASKING" PIPELINES + _______________________________________ 0 So far, we have been discussing how to write pipelines through which records flow in an entirely predictable manner, so that the order of records is never altered except by stages, such as sort, that are specifically intended to reorder records. In such pipelines, the goal is to make sure that the dispatcher has no leeway, that it makes the stages of the pipeline operate in lock-step, generally moving one record at a time through the entire length of the pipeline. 0 That model will hardly do, however, when one wants to build a multi-tasking pipeline, such as a server that will serve multiple clients concurrently. In that case, the records representing the requests from concurrent clients must be able to flow through the pipeline simultaneously and independently of one another. Fortunately, if one has a basic understanding of the flow of records through a pipeline, it is extremely easy to write a "multi-tasking" pipeline. Although Pipes has recently been enhanced to support CMS Multi-Tasking, + _____ what I will be discussing here is how to achieve the effect of multi-tasking within a single pipeline set, whether or not CMS Multi-Tasking is active. 0 In CMS Pipelines, each stage is, in a very real sense, an independent + _____________ thread of execution. And pipeline segments behave like processes. 
The power of the Pipelines dispatcher allows properly coded pipeline + _________ segments to overlap their execution quite effectively, with or without the use of CMS Multi-Tasking. 0 CMS Pipelines itself is not MP-capable, so the sort of pipeline I will + _____________ be showing you cannot use more than one virtual processor at a time, and the entire pipeline set is suspended when one of the stages issues a synchronous DIAGNOSE or a CMSCALL. Other than that, the independent pipeline "processes" will continue running in parallel as long as one takes a bit of care to make sure that they can get work when they are ready for it and that they don't become blocked trying to write to other portions of the pipeline. 0 It doesn't matter whether the pipeline segments to be overlapped are created by callpipe or addpipe or are just part of a big multistream pipeline, so long as they are all part of the same pipeline set and, 0 -------------------- 0 (7) From PIPELINE CFORUM, 6 September 1996. 1 0 Record Flow in CMS Pipelines Page 31 ------------------------------------------------------------------------ 0 thus, are controlled by the same instantiation of the pipeline dispatcher. 0 I've been experimenting over the past two years with writing such multi-tasking pipelines and have been delighted with the results I've gotten and the ease with which I've gotten them. I have a multi-tasking SMTP client that uses multiple connections to a server SMTP to move more data than was possible with a single connection. (This SMTP client is light-weight enough to run very well on a P/370 system, where it serves as an Internet/BITNET gateway.) I also have a multi-tasking version of Rick Troth's Webshare Web server that supports multiple simultaneous Web clients with quite good responsiveness (even while running CGIs for one or more of them). 0 There is no point, of course, in writing a multi-tasking pipeline unless at least some of the pipeline segments wait on asynchronous events outside of the virtual machine. One could certainly write a multi-tasking pipeline that, say, split a file into two streams and then performed the same filtering or changing operations on both streams. That would work, but it would be slower than putting all of the records through a single stream, because the dispatcher overhead would be greater. However, when you have a process that must wait for input from the network, you can markedly increase total throughput by using multiple processes, thus allowing one process to execute while another is waiting on network I/O. 0 My approach in both the SMTP client and the Web server was to create multiple identical processes and to let a deal stage dole work out to them as they request it. When a process completes a task, it queues for more work by writing a record containing its process number. In return, deal passes it a record that describes its next task. (In the SMTP case, this record contains the spoolid of the next input file; in the Webshare case, the record describes the socket for a client connection request.) 0 Each process must be careful to consume its input record "quickly", so that deal can continue passing out records to the other processes. Similarly, when the process writes a record to queue for more work, something in the central portion of the pipeline must consume that record quickly, so that the process doesn't become blocked waiting for its output to be consumed. 0 In other words, all there is to writing multi-tasking pipelines is to let the ends of the processes flap a bit. 
That is very easy to do, as I + very will now show you. 0 To see how to build up a multi-tasking pipeline, let's start with a simple SMTP client that has only one connection to its server: 0 PIPE starmsg | find ... | spec 26.4 1 | smtp 1 0 Page 32 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 A starmsg stage captures CP information messages, and a find (which isn't completely shown) selects the messages describing arriving reader files. spec isolates the spoolid number, which is passed to the smtp stage, signalling it to process the corresponding spool file. When the smtp stage has finished processing the spool file, it will read another spoolid record from its input and then process that file, continuing to process one file at a time forever. 0 An easy way to add more connections to this scheme is to build a pipeline containing multiple smtp stages and to use a deal stage to "round-robin" the files amongst the smtp stages: 0 +----------------------------------------------------------------------+ | | | 'PIPE (endchar ?)', /* Dispatch by round-robin: */ | | 'starmsg |', /* Capture spool arrivals. */ | | 'find' arrival_msg'|', /* Select file arrival message.*/ | | 'spec 26.4 1 |', /* Isolate spoolids. */ | | 'd: deal', /* Dole them out to SMTPs. */ | | '?', | | 'd: | copy | smtp', /* DEAL's output stream 1. */ | | '?', | | 'd: | copy | smtp', /* DEAL's output stream 2. */ | | '?', | | 'd: | copy | smtp' /* DEAL's output stream 3. */ | | | |----------------------------------------------------------------------| + + + | | | +------+ +------+ +--------+ | | -->| find |-->| spec |-->|0 0|-| | | +------+ +------+ | | | | | | +------+ +------+ | | | 1|-->| copy |-->| smtp |-| | | | | +------+ +------+ | | | deal | | | | | +------+ +------+ | | | 2|-->| copy |-->| smtp |-| | | | | +------+ +------+ | | | | | | | | +------+ +------+ | | | 3|-->| copy |-->| smtp |-| | | +--------+ +------+ +------+ | | | +----------------------------------------------------------------------+ 0 deal was added in CMS 12. It works like a card player dealing cards, passing records to its output streams in turn. In this example, deal has no primary output stream, so it passes the first record it receives to its secondary output stream (stream 1), the next record to its tertiary (stream 2), the next to it quaternary (stream 3), and then the next to its secondary again, continuing in sequence to round-robin its output. 1 0 Record Flow in CMS Pipelines Page 33 ------------------------------------------------------------------------ 0 If you want these smtp stages to run in parallel, though, you must be careful that the output of deal is consumed "quickly", so that deal can continue passing out spoolid records to other streams. The copy stages consume the output of deal to unblock it immediately. (As we have seen earlier, copy can be thought of as a one-record buffer.) 0 This scheme is sufficient to allow the smtp stages to run in parallel. However, one of the smtp stages might be sending a large file on a slow link, so it might not be ready to consume a spoolid record the next time through the round-robin process. That would block deal and cause the other smtp stages to go idle, because deal could not give them any work to do. One could change each copy stage to elastic, of course. That would prevent the blocking of deal, but would still leave files queued in elastic's buffer waiting for a busy smtp stage to complete. 
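For reference, that variation is only a one-word change in each of deal's output streams; it is shown here merely as a sketch, since, as just noted, it only moves the queue from deal into elastic's buffers:
0
   'PIPE (endchar ?)',             /* Round-robin, elastic buffers*/
   'starmsg |',                    /* Capture spool arrivals.     */
   'find' arrival_msg'|',          /* Select file arrival message.*/
   'spec 26.4 1 |',                /* Isolate spoolids.           */
   'd: deal',                      /* Dole them out to SMTPs.     */
   '?',
   'd: | elastic | smtp',          /* DEAL's output stream 1.     */
   '?',
   'd: | elastic | smtp',          /* DEAL's output stream 2.     */
   '?',
   'd: | elastic | smtp'           /* DEAL's output stream 3.     */
0
The files queued in the elastic buffers remain tied to particular connections, even if another connection has gone idle, which is what motivates the scheme that follows.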
0 A better scheme would be not to dispatch the SMTP processes in round-robin fashion at all, but instead to let each process queue for a file when it is ready for one. This pipeline does that: 1 0 Page 34 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 +----------------------------------------------------------------------+ | | | 'PIPE (endchar ?)', /* Dispatch on request: */ | | 'starmsg |', /* Capture spool arrivals. */ | | 'find' arrival_msg'|', /* Select file arrival message.*/ | | 'spec 26.4 1 |', /* Isolate spoolids. */ | | 'd: deal secondary', /* Dole them out to SMTPs. */ | | '?', | | 'y: faninany | elastic |', /* Secondary input of DEAL. */ | | 'd: | copy | smtp | literal | spec /1/ 1 | y:', /* Process 1. */ | | '?', | | 'd: | copy | smtp | literal | spec /2/ 1 | y:', /* Process 2. */ | | '?', | | 'd: | copy | smtp | literal | spec /3/ 1 | y:' /* Process 3. */ | | | |----------------------------------------------------------------------| + + + | | | +----+ +------+ +-----+ | | -->|spec|-->|0 0|-| |-|0 0|--+ | | +----+ | | | | | | | +----+ | | +----+ +----+ +-------+ +----+ | f | | | | +->|elas|-->|1 1|->|copy|->|smtp|->|literal|->|spec|-->|1 a | | | | | |tic | | | +----+ +----+ +-------+ +----+ | n | | | | | +----+ | deal | | i | | | | | | | +----+ +----+ +-------+ +----+ | n | | | | | | 2|->|copy|->|smtp|->|literal|->|spec|-->|2 a | | | | | | | +----+ +----+ +-------+ +----+ | n | | | | | | | | y | | | | | | | +----+ +----+ +-------+ +----+ | | | | | | | 3|->|copy|->|smtp|->|literal|->|spec|-->|3 | | | | | +------+ +----+ +----+ +-------+ +----+ +-----+ | | | | | | | +------------------------------------------------------------------+ | | | +----------------------------------------------------------------------+ 0 As before, each SMTP process gets its input from an output stream of deal and consumes it quickly, but now each smtp stage also has an output stream of its own that feeds ultimately into the secondary input of deal (through the spec, faninany, and elastic). deal is invoked with the secondary option, which says to read a stream of stream numbers from its secondary input and write the records from its primary input to the specified output streams. So, when an SMTP process is ready for another input file, it simply produces a record containing its stream number, thus queuing for output from deal secondary. That is, when a process is ready for work, it writes an output record containing its stream number to an input stream of faninany, and faninany feeds those stream number records to elastic, which consumes them immediately and buffers them for feeding into the secondary input of deal secondary (the second occurrence in the pipeline of the label "d:"), in the order in which they were received. (The literal in each segment produces a first record to get the process started.) 1 0 Record Flow in CMS Pipelines Page 35 ------------------------------------------------------------------------ 0 The multi-tasking scheme used here is quite effective. In the production version of this SMTP client, the total throughput with four processes is twice as great as with one. (It increases little with additional processes, presumably as the result of the synchronization imposed by the DIAGNOSE used to read from the spool.) Letting the processes queue for work rather than dispatching them in round-robin fashion definitely improves throughput. 
With queuing, the elapsed time to deliver a thousand identical files using five processes was reduced by nine percent in comparison with strict round-robining. Obviously, the improvement would be greater in situations in which the files were of differing sizes. 0 Any pipeline process can be made multi-tasking using the same scheme. This is how it works with my variant of Rick Troth's TCPSHELL: 0 +----------------------------------------------------------------------+ | | | processes = '' /* Initialize pipeline segment.*/ | | Do n = 1 to processno /* Build server processes. */ | | processes = processes, /* Append another process. */ | | 'd: |', /* Next spool file id to here. */ | | 'tcpshell' n '|', /* Invoke server process #n. */ | | 'y:', /* Streamnumber recs from here.*/ | | '?' | | End | | | | 'PIPE (endchar ? name TCPShell)', /* Run TCP server processes: */ | | 'tcplisten' server_port '|',/* Wait for the next client. */ | | 'd: deal secondary', /* Clients to ready processes. */ | | '?', | | 'y: faninany |', /* Stream of streamnumber recs.*/ | | 'elastic |', /* Hold until ready for them. */ | | processes /* Feed to server processes. */ | | | +----------------------------------------------------------------------+ 1 0 Page 36 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 +----------------------------------------------------------------------+ | | | +-----------+ +------+ +-----+ | | -->| tcplisten |-->|0 0|-| |-|0 0|--+ | | +-----------+ | | | | | | | +-----------+ | | +-----------+ | f | | | | +->| elastic |-->|1 1|-->| process 1 |-->|1 a | | | | | +-----------+ | | +-----------+ | n | | | | | | deal | | i | | | | | | | +-----------+ | n | | | | | | 2|-->| process 2 |-->|2 a | | | | | | | +-----------+ | n | | | | : : : : y : : | | : : : : : : | | | | | +-----------+ | | | | | | | n|-->| process n |-->|n | | | | | +------+ +-----------+ +-----+ | | | | | | | +------------------------------------------------------+ | | | +----------------------------------------------------------------------+ 0 A pipeline segment is built up to contain the specified number of processes, each of which reads records describing work to do from its input stream and writes a record containing its stream number each time it is ready for work. In this case, the work-to-do records are generated by a tcplisten stage as it receives client connection requests, but I think you can see that this scheme is equally applicable to other kinds of servers. 0 Indeed, I will submit that one can use this framework to build multi-tasking servers even without having passed the journeyman plumber exam. One must take some care that the processes can co-exist peaceably. For example, if they use GLOBALV, they should name their variables in accordance with their process number to avoid destroying one another's variables. If they use tapes, they should have a scheme for making sure that they don't all try to attach their tape drives at the same virtual address. But such considerations are easily sorted out and one quickly finds that one has built a robust server that handles concurrent clients gracefully. 1 0 REXX Pipeline Stages with Different Delays Page 37 ------------------------------------------------------------------------ 0 Appendix A + Appendix A 0 REXX PIPELINE STAGES WITH DIFFERENT DELAYS + REXX PIPELINE STAGES WITH DIFFERENT DELAYS + __________________________________________ 0 The following skeleton pipeline stages were written by Steve Hayes, of IBM. 
They are examples of stages that do not, may, and do delay one record: 0 o REXX stage that does not delay the record: NULL REXX uses a + REXX stage that does not delay the record: peekto-output-readto loop in the canonical way, so it does not delay the record. It is a good model to use for one's own REXX filters. 0 +----------------------------------------------------------------------+ | | | /* NULL REXX: Skeleton filter that does not delay the record */ | | Signal on Novalue /* No uninitialised variables */ | | Signal on Failure /* Allow RC > 0 for a moment */ | | 'MAXSTREAM INPUT' /* Check only one stream */ | | Signal On Error /* now stop for any error */ | | if RC <> 0 then 'ISSUEMSG 264 PIPSJH'/* too many streams: crash */ | | do forever /* until EOF */ | | 'PEEKTO record' /* read from primary input */ | | 'OUTPUT' record /* write to primary output */ | | 'READTO' /* release input */ | | end /* next record */ | | Failure: | | Error: | | Exit (RC * (RC <> 12)) /* RC = 0 if EOF */ | | | +----------------------------------------------------------------------+ 1 0 Page 38 REXX Pipeline Stages with Different Delays ------------------------------------------------------------------------ 0 o REXX stage that may delay the record: COPY REXX may delay the + REXX stage that may delay the record: record, depending on the dispatching order. It is equivalent to the copy built-in program. 0 +----------------------------------------------------------------------+ | | | /* COPY REXX: Skeleton filter that may delay one record */ | | SIGNAL ON NOVALUE /* No uninitialised variables */ | | SIGNAL ON FAILURE /* Allow RC > 0 for a moment */ | | 'MAXSTREAM INPUT' /* Check only one stream */ | | SIGNAL ON ERROR /* now stop for any error */ | | if RC <> 0 then 'ISSUEMSG 264 PIPSJH'/* too many streams: crash */ | | do forever /* until EOF */ | | 'READTO record' /* read from primary input */ | | 'OUTPUT' record /* write to primary output */ | | end /* next record */ | | FAILURE: | | ERROR: | | Exit (RC * (RC <> 12)) /* RC = 0 if EOF */ | | | +----------------------------------------------------------------------+ 0 o REXX stage that does delay the record: HICCUP REXX has a one-record + REXX stage that does delay the record: delay. It reads a record into a buffer and peeks the next record before copying the first record to its output. 0 +----------------------------------------------------------------------+ | | | /* HICCUP REXX: Skeleton filter that delays one record */ | | SIGNAL ON NOVALUE /* No uninitialised variables */ | | SIGNAL ON FAILURE /* Allow RC > 0 for a moment */ | | 'MAXSTREAM INPUT' /* Check only one stream */ | | SIGNAL ON ERROR /* now stop for any error */ | | if RC <> 0 then 'ISSUEMSG 264 PIPSJH'/* too many streams: crash */ | | do forever /* until EOF */ | | 'READTO record1' /* read from primary input */ | | 'PEEKTO record2' /* check next input ready */ | | 'OUTPUT' record1 /* write to primary output */ | | end /* next record */ | | FAILURE: | | ERROR: | | if RC = 12 & symbol('record1') = 'VAR' & symbol('record2') = 'LIT' | | then 'OUTPUT' record1 /* input record pending */ | | Exit (RC * (RC <> 12)) /* RC = 0 if EOF */ | | | +----------------------------------------------------------------------+
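These skeletons can be exercised directly; a REXX filter is invoked by its filename, so with HICCUP REXX on an accessed disk one might try something like this (the input is arbitrary, chosen only for illustration):
0
   pipe literal one two three | split | hiccup | console
0
In a single stream the one-record delay cannot change the order of the records, of course; the difference between these three skeletons shows itself only when one of them is placed on one path of a multistream pipeline, such as the locate/faninany example earlier in this paper, where a delaying stage allows records on the other path to overtake the one it is holding. (COPY REXX shares its name with the built-in copy program, so one would want to rename it, or invoke it explicitly through the rexx built-in program, before experimenting with it.)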