Cramming for the Journeyman Plumber Exam

Part I:  Record Flow in CMS Pipelines

Melinda Varian

Office of Computing and Information Technology
Princeton University
87 Prospect Avenue
Princeton, NJ 08544 USA

Email:  Melinda@princeton.edu
Web:  http://pucc.princeton.edu/~Melinda
Telephone:  (609) 258-6016

VM Workshop
June 1997


I.  INTRODUCTION

Once one has a thorough understanding of the flow of records through a pipeline, writing complex CMS Pipelines applications becomes much easier.  One is no longer troubled by pipeline stalls, so one can concentrate on the function required, rather than on the infrastructure.  Every journeyman plumber should understand record flow well enough to be able to write pipelines that give the effect of multi-tasking, such as, for example, a server that will interact with multiple clients concurrently.  Fortunately, once one understands the basics of record flow, the writing of such "multi-tasking" pipelines becomes an essentially trivial exercise.

I will be borrowing heavily from John Hartmann's paper CMS Pipelines Explained, as that is still the canonical work for understanding CMS Pipelines.  I will also be quoting from PIPELINE CFORUM postings by two of IBM's master plumbers, Glenn Knickerbocker and Steve Hayes.


II.  RECORD FLOW

The flow of records through a pipeline is under the control of the pipeline dispatcher, which dispatches the individual programs (or "stages") of the pipeline to make data flow in an orderly fashion.

   The programs in a pipeline do not exchange data directly; rather, they exchange data [under the auspices of] the pipeline dispatcher.  They call the pipeline dispatcher to read a record from an input stream or to write a record to an output stream.(1)

   The dispatcher does not buffer records between stages; instead, the dispatcher minimises the number of records in flight by giving priority to stages that can pump data out of the pipeline.

--------------------
(1)  All quotes not otherwise marked are from John Hartmann's CMS Pipelines Explained.


Reading and Writing Records

To see how records flow through a pipeline, let's step through the running of a very simple pipeline that displays the count of the words in a file:

   pipe disk input file | count words | console

1.  The dispatcher calls disk, which reads a record from the CMS file and calls the dispatcher to write the record into the pipeline.

2.  The dispatcher tests whether the stage that is connected to disk's output is waiting to read a record.  This is not the case; the dispatcher must suspend disk, because it cannot return to disk before the record has been processed.  It then looks for other work to do.

3.  The dispatcher now starts the next stage, count.  That is, it calls the entry point for count.

4.  count calls the pipeline dispatcher to read a record from the pipeline.  disk and count are now at a rendez-vous; disk has produced a record and count intends to consume one.  The dispatcher already has a record, which it passes on to count as it returns to count.  (Note that it does not call count again; there are no recursions into stages.)

5.  count scans the record it has just obtained and counts the words.  It then calls the dispatcher to get the next record, in effect discarding the record it has just processed.

6.  The dispatcher suspends count because it has no record available for it.  But now the record that disk wrote has been processed; the dispatcher can resume disk in the hope that it will then write another record, which can be passed on to count.

7.  Alas, the file has only one record.  disk has no more to do; it returns to the dispatcher.

8.  The initialisation/termination part of the dispatcher regains control and cleans up after disk.  It notices that count is waiting for its next input record, but clearly there will never be one; the dispatcher sets a return code for count to indicate end-of-file and resumes count; that is, it returns to count.

9.  count notes the end-of-file condition and writes a record containing the count of the words to its output.  It does this by calling the dispatcher entry point that writes a record into the pipeline, just as disk did.

10.  count is now in the same situation as disk in step 2; count is suspended; the dispatcher finally starts console.

11.  console calls the dispatcher to read a record.

12.  The dispatcher makes available the record that count just wrote and returns to console.

13.  console writes a line on the terminal and calls the dispatcher to read another record.

14.  The dispatcher suspends console and resumes count.

15.  count has no more to do and returns to the dispatcher.

16.  The dispatcher resumes console indicating end-of-file.

17.  console returns to the dispatcher.

18.  All stages of the pipeline have now returned to the dispatcher; the PIPE command returns to CMS.

You will see from this description that a program in a pipeline does not read all of its input records and write all of its output records and then terminate before the next program begins processing (as in DOS pipes), nor does a program in a pipeline produce an indeterminate number of bytes that must be buffered somewhere until the kernel switches to the next program to process those bytes (as in UNIX pipes).  Instead, when a stage in CMS Pipelines writes an output record, the pipeline dispatcher does not allow it to resume running again until after its output record has been read by the next stage in the pipeline.

   One might think that the dispatcher [would] gobble up all the output from disk [in the example above] and then present it as a unit to count....  Such a strategy would, however, limit the size of files that could be processed and cause unnecessary complications in the dispatcher; instead, the dispatcher transfers control between the stages in a pipeline using a strategy that is designed to expedite the flow of data and minimise the number of records in flight.

   The dispatcher never buffers the records flowing between stages, nor does it write them to an intermediate file.  In fact, the dispatcher never touches the records at all.
   Rather, when a stage writes a record into the pipeline, it simply gives the pipeline dispatcher a pointer to the buffer containing that record and waits until the dispatcher allows it to continue processing.  When a stage requests a record from the pipeline, the dispatcher just gives it the pointer to the record in the buffer belonging to the stage that produced the record.

   The stage on the left-hand side of a connection produces a record when it writes one into the pipeline.  When the stage on the other side of the connection consumes the record, the rendez-vous between the producer and consumer is over; each can continue its own processing.  The record has moved across the connection:  the producer can re-use the output area; the consumer can read another record.

Thus, you will see that the two programs must operate in lock-step.  The right-hand program must either consume the record or terminate before the left-hand program will be allowed to do any further processing.  This behavior is necessary because the record exists only in the buffer belonging to the left-hand program.  That program must not be allowed to regain control and possibly alter the contents of its buffer until after the consumer has moved the record out of the buffer.

The fact that the producing and consuming stages operate in lock-step is one of the tools that allow you to control the flow of records through the pipeline, a topic we will return to later.


Peeking at records

In a typical pipeline, one record goes through all of the stages from the beginning to the end before the next record starts that journey.  This behavior is so different from what most people expect that it often takes a while for new plumbers really to believe that pipelines behave this way.  Every beginning plumber should get hold of Chuck Boeheim's brilliant PIPEDEMO program and watch it as it runs its demonstration programs.  Watching PIPEDEMO display the progress of records through a few pipelines is a good way to become convinced that in general most pipelines have only one record passing through at a time.

I mentioned earlier that the dispatcher minimizes the number of records in flight by giving priority to stages that can pump out data.  This strategy would tend to result in the observed behavior, but in most cases the dispatcher has no real choice of which stage to dispatch next, because the producer and consumer of a record operate in lock-step, as we have just seen, and because most stages produce their output record before they consume the input record from which that output record was derived.  That is, most pipeline stages "peek" at the record written by the left-hand stage, write an output record derived from that input record, and then consume the input record.

   The peekto pipeline command ... peeks at a record without consuming it.  A peek is a non-destructive read; a particular record can be peeked any number of times before being consumed with a readto pipeline command.  When a peek completes, the producer is guaranteed to be blocked in a write, or there is no producer--the stream has been severed and is at end-of-file.  Blocking the producer ensures that the contents of a record are not changed while it is [still] being processed by a subsequent stage.

Thus, a stage can determine whether it likes a record before consuming it; in fact, a stage can produce a derivative of a record it has peeked before it consumes that record.

And, in fact, most pipeline stages do produce a derivative of their input record before they consume that input record.  They do this by processing each record through a peekto-output-readto loop.

One needs to be clear on the difference between the readto and peekto pipeline commands.  In MVS terms, this is the difference between a move-mode read and a locate-mode read.  That is, a readto command copies the record from the producer's buffer to the consumer's buffer, thereby freeing up the producer to produce additional records using that buffer.  On the other hand, peekto in effect operates on the record while it is still in the producer's buffer; therefore, the producer must not be allowed to run, lest it change the contents of its buffer.
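
To make this concrete, here is a minimal sketch (an illustration only, not a built-in program) of a user-written REXX filter built around the peekto-output-readto loop just described.  It merely upper-cases each record, something the built-in xlate stage already does, but it shows the shape that most stages share:

   /* UPCASE REXX -- illustrative peekto-output-readto loop */
   Signal On Error
   Do Forever                       /* Do until end-of-file.             */
      'PEEKTO record'               /* Peek; the producer stays blocked. */
      'OUTPUT' Translate(record)    /* Write the derivative record.      */
      'READTO'                      /* Only now consume the input.       */
   End
   Error: Exit RC*(RC<>12)

Because the output command completes before the readto is issued, the producer remains blocked until the derivative record has itself been consumed; this is the property that will later be described as "not delaying the record".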

Let's step through another simple pipeline to see how this peekto-output-readto scheme works:

   pipe < input file | change /abc/def/ | xlate | > output file a

1.  The dispatcher starts <, which reads a record from the CMS file and does an output to write the record into the pipeline.

2.  The dispatcher suspends <, which must wait for its output command to complete.

3.  The dispatcher looks for a stage that can consume the record and starts change.

4.  change does a peekto and is given the pointer to the record that < wrote to the pipeline.  < remains blocked, as its output command will not complete until change has consumed the record.

5.  change may or may not change the record.  Either way, it does an output to write the record to the pipeline.

6.  change is blocked in its write (as is <), so the dispatcher looks for another stage to run and starts xlate.

7.  xlate does a peekto to get the record, upper-cases it, and does an output to write the upper-cased record into the pipeline.

8.  Now, as you will see, <, change, and xlate are all blocked waiting for their output records to be consumed, so the dispatcher starts >, which is the only stage that can run.

9.  > does a peekto and writes the peeked record to a CMS file.  Then, since it has no output stream connected, > consumes its input record by doing a readto.  It is then ready to process another record, so it does a peekto to get the next record.  Since there is no record available, the dispatcher suspends > and looks for another stage to dispatch.

10.  xlate can now be dispatched, because its output command completed when > consumed the record with readto.  Now that the derivative record has been safely written, xlate does a readto to consume its input record and then does a peekto to get another record to process.

11.  There is no record available for xlate, so the dispatcher looks for another stage to run and finds that the only stage that can run is change, which can now be dispatched, because its output command completed when xlate consumed the record.  change can now do a readto to consume the input record it has finished processing.  It then peeks to get another record, which causes it to give up control to the dispatcher.

12.  The dispatcher finds that change, xlate, and > are all waiting for input, so the only stage it can dispatch is <, which can now run because its output command has completed.  < reads from the CMS file.
If there is another record, that record will flow through the pipeline in exactly the same way the first record did, going all the way to the end before the next record gets started through the pipeline.  And this sequence will repeat until all the records in the input file have been processed.

13.  When all the records have been read, < returns to the dispatcher, causing its output stream to be severed.

14.  The dispatcher looks for work to do and sees that the only stage it can dispatch is change, which is sitting in a peekto waiting for a record on the stream that has just been severed.  It dispatches change with a return code of 12 to indicate end-of-file on that peekto.

15.  change sees that it has no more input records, so it returns to the dispatcher, causing its output stream to be severed.

16.  Similarly, xlate and then > are dispatched with return code 12 on their peekto commands, so they also return to the dispatcher.

What happened here was that the last stage wrote each record before the first stage consumed it.  Stage 1 wrote a record and became blocked, stage 2 peeked it and wrote it and became blocked, stage 3 peeked it and wrote it and became blocked, etc.  Finally, stage n consumed it, stage n-1 consumed it, ... stage 1 consumed it.  Then the next record started through the pipeline, and it, too, shot through all of the stages as quickly as possible.

Thus, in this simple pipeline and, indeed, in most pipelines, the pipeline dispatcher never had a choice of which stage to run next.  Because most pipeline stages produce their output record before they consume the corresponding input record, the flow of records through a pipeline is completely predictable.  The dispatcher simply runs the stage that can consume a record, if there is one waiting to be consumed; if there is no record waiting to be consumed, it finds a stage that is runnable and runs it.


Pipeline Stalls

What happens when the dispatcher discovers that no stage can run?  The pipeline stalls.

A pipeline is stalled when no stage can run and no stages are waiting for external events, but not all stages have terminated.  A stall can occur only if at least one stage of a pipeline specification has secondary streams.  A stall is easily provoked; for example, this two-stage pipeline stalls:

     +---------+     +-------+
     | literal |---->|0     0|---+
     +---------+     |       |   |
                     | fanin |   |
                     |       |   |
                 +-->|1      |   |
                 |   +-------+   |
                 |               |
                 +---------------+

   pipe literal abc | i: fanin | i:

   Pipeline stalled.
   ... Issued from stage 2 of pipeline 1.
   ... Running "fanin".
   Stage is wait out.
   ... Issued from stage 1 of pipeline 1.
   ... Running "literal abc".
   Stage is wait out.
   ... Issued from stage 2 of pipeline 1.
   ... Running "fanin".
   Ready(-4095);

The label reference at the end of the pipeline specification defines fanin's secondary streams; its primary output stream is connected to its secondary input stream.

fanin works by first passing all records on the primary input stream to the primary output stream; it then passes all records on the secondary input stream to the primary output stream, and so on.  In this example, it peeks at the input record on the primary input stream and writes it to the primary output stream.  The dispatcher blocks fanin (makes it non-dispatchable) because it is waiting for an output record to be consumed.  But it is the very same stage that must consume the record, and (being non-dispatchable) fanin cannot possibly consume an input record to satisfy its own write.  The pipeline dispatcher now tries to find a stage to run to consume the record, but cannot find one; the pipeline is stalled.  To continue, the pipeline dispatcher then severs all connections and makes all stages dispatchable with return code -4095.

Pipeline stalls most commonly arise from this sort of pipeline topology:

      +----------+    +------+    +------+    +----------+
      |          |----|      |----|      |----|          |
      |          |    +------+    +------+    |          |
   ---| splitter |                            |  joiner  |---
      |          |          +------+          |          |
      |          |----------|      |----------|          |
      +----------+          +------+          +----------+

A "splitter" stage writes records to two or more output streams.  Later in the pipeline, a "joiner" stage reads records from those same streams.

As we have seen, when any stage writes an output record, it remains blocked until that record has been consumed by the stage connected to its output stream.  Thus, when a splitter stage (such as locate, chop, drop, or fanout) writes a record on one output stream, it must then wait until that record has been consumed before it can write another record to that same stream or to any other stream.

But, as we have seen, most pipeline stages do not consume a record when they first read it.  They process every record through a peekto-output-readto loop.  They peek their input record and write a derivative and do not consume their input until after they have written the derivative to their own output stream and it has been consumed by the stage to which they wrote it.

If all the stages in the multistream portions of a pipeline like the one shown above use peekto-output-readto loops, each of them passes the record along without consuming it until after each of the subsequent stages has consumed it.  Ultimately, then, the splitter stage must wait for the joiner stage to consume each record before the splitter can write the next one.  If the joiner stage cannot consume a record that the splitter stage is trying to write, the pipeline stalls.

If the joiner stage is faninany, this configuration will never stall, because faninany always reads any record that is available on any of its input streams.  Other joiner stages are more exacting, however.  Some, such as spec and synchronise, wait until they have a record available on each of their input streams before they consume any of them and then consume them in stream-number order.  Others, such as collate and merge, wait until they have a record available on each of their input streams and then choose which one to consume next based on the contents of the records.
fanin and lookup have the most extreme requirements; they consume all the records from one input stream before they will read any records from any other input stream.

A further complication is that a stage in the multistream portion of a pipeline may "buffer" the records; that is, some stages, such as sort and instore, consume all their input records before writing any output records and, thus, may keep the joiner stage waiting for records on one of its input streams.

So, in a pipeline where there is a stage that reunites streams that originated in a single stage earlier in the pipeline, there is a potential for pipeline stalls.  The splitter stage may try to write onto one stream while the joiner stage is trying to read from another stream.  When that happens, the pipeline stalls.

To prevent stalls, one inserts between the splitter and joiner stages (on one or more of the streams) a pipeline stage that will unblock the splitter stage by consuming the necessary number of records and holding them until the joiner stage is ready to read them.

The number of records that need to be "buffered" in this way varies.  In the case with the least requirement for such buffering, the splitter stage writes a record to each of its output streams in stream-number order; the joiner stage reads records in stream-number order; and all of the intermediate stages contain peekto-output-readto loops.  In this case, the flow of the records is not data-dependent, and a stall can be prevented simply by introducing copy stage(s) on the low-numbered stream(s):

      +----------+    +------+    +----------+
      |          |----| copy |----|          |
      |          |    +------+    |          |
   ---|  fanout  |                |   spec   |---
      |          |    +------+    |          |
      |          |----|      |----|          |
      +----------+    +------+    +----------+

copy is a very simple built-in program consisting of a readto-output loop.  That is, copy violates the rule we have been discussing; it does not use peekto.  This would be the equivalent of copy in REXX:

   /* COPY REXX */
   Signal On Error
   Do Forever                /* Do until end-of-file.   */
      'READTO record'        /* Consume a record.       */
      'OUTPUT' record        /* Copy it to the output.  */
   End
   Error: Exit RC*(RC<>12)

copy does a consuming read to get a record and then copies that record to its output stream.  So, the copy stage in the diagram above consumes the record that fanout writes to its primary output stream, freeing fanout to write a record to its secondary output stream.  Meanwhile, copy writes that first record to its output stream.  As a result, spec then finds records available on both of its input streams and consumes them both, freeing fanout to write to its primary output stream again and freeing copy to read from that stream again.

   The readto pipeline command [in copy] performs a consuming read; the producer is not blocked and can produce another record while [copy's] output pipeline command ... is blocked in its write.  Whether this actually happens depends on the implementation of the pipeline dispatcher; this is deliberately unspecified.

In other words, in general, either copy or fanout could write the next record.  In the general case, that could introduce some indeterminacy into your pipeline.  In this specific case, however, the pipeline is completely determinate.  spec will not process a record from either input stream until it has peeked a record on both input streams.  Therefore, it doesn't matter whether copy or fanout produces its output record first, for spec will not process either record until it has peeked both of them.  copy has introduced just enough buffering to prevent a stall.
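
As a concrete (and admittedly artificial) rendering of the diagram above, the sketch below joins the two copies of each record that fanout produces; the input file name is hypothetical, and the spec arguments follow the same pattern as the examples later in this paper:

   /* Hypothetical illustration of the fanout/copy/spec topology */
   'PIPE (endchar ? name CopyUnblocks)',
      '< input file |',                      /* Read some records.        */
      'f: fanout |',                         /* Two copies of each.       */
      'copy |',                              /* Absorb the primary write. */
      's: spec 1-* 1 select 1 1-* next |',   /* Join the two copies.      */
      '> output file a',
      '?',
      'f: |',                                /* Secondary copy.           */
      's:'                                   /* Into spec's secondary.    */

Without the copy stage, fanout would be left blocked in its write to spec's primary input while spec waited for a record to appear on its secondary input, and the pipeline would stall.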

At the other extreme in the amount of buffering required to prevent a stall is the case where the joiner stage is fanin.  Because fanin will read no records from its secondary stream until it has read all the records from its primary stream, one must introduce a stage such as buffer to consume all the records on the secondary stream and hold them until fanin is ready for them:

      +----------+    +--------+    +----------+
      |          |----|        |----|          |
      |          |    +--------+    |          |
   ---|  fanout  |                  |  fanin   |---
      |          |    +--------+    |          |
      |          |----| buffer |----|          |
      +----------+    +--------+    +----------+

Otherwise, the splitter stage (fanout here) would become blocked trying to write the first record to fanin's secondary input and would never be able to write records 2-n to fanin's primary input.

The same requirement to insert a stage to buffer all the records on a stream may arise because one of the other streams contains a stage, such as sort, that buffers all the records:

      +----------+    +--------+    +----------+
      |          |----|  sort  |----|          |
      |          |    +--------+    |          |
   ---|  fanout  |                  |   spec   |---
      |          |    +--------+    |          |
      |          |----| buffer |----|          |
      +----------+    +--------+    +----------+

In the intermediate case, one or more of the streams may need some buffering of records, depending on the order in which the splitter stage decides to write and the order in which the joiner stage decides to read.  This is a job for elastic:

      +----------+    +---------+                +-----------+
      |          |----| elastic |----------------|           |
      |          |    +---------+                |           |
   ---|  fanout  |                               |  collate  |---
      |          |    +---------+    +--------+  |           |
      |          |----|  xlate  |----| locate |--|           |
      +----------+    +---------+    +--------+  +-----------+

elastic copies its input records to its output, buffering as many as may be necessary to prevent a pipeline stall.  It reads input records whenever they become available and writes output records as they are consumed, while attempting to minimize the number of records in its buffer.

The fanout stage here makes two copies of each record.  The copy that is written to fanout's primary output stream is not changed as it passes through this pipeline.  The copy that is written to fanout's secondary output stream is modified and filtered by a series of stages.
The collate stage reads records from both of its input streams, to find pairs that have matching keys.  When collate finds a pair of records that match, it copies them to its output stream, discarding all others.

The elastic stage here prevents a pipeline stall by consuming the records fanout writes on its primary output stream, as it writes them, thus freeing fanout to write to its secondary output stream.  elastic then makes the records available to collate whenever it wants them, so that collate always has a record available on its primary input when locate selects a record and writes it to collate's secondary input.  elastic's buffering introduces no indeterminacy here, because the order of the records written by collate is determined by the contents of the records, not by the timing of their arrival.

copy would not do the job here because more than one record may need to be buffered at a given time, depending on how many records locate discards.  buffer could be used, but there would then need to be a buffer on the other stream as well, and collate would see no records at all until after fanout had terminated.  This would require enough virtual memory to hold one whole and one partial copy of the input.  Thus, elastic is preferred here because it does just the minimum buffering required to prevent this pipeline from stalling.

In general, one can use elastic wherever buffering is needed to prevent a stall.  However, if one knows that no more than one record will ever need to be buffered at a time, copy is more efficient than elastic, while if one knows that the entire file must be buffered, buffer is more efficient than elastic.


Debugging Pipeline Stalls

If one keeps these models in mind, it is usually straightforward to write pipelines that don't stall, but now and then one slips up, and a pipeline stalls.  If the pipeline is complex, the cause of the stall may not be obvious, but fixing stalls is not difficult if one understands how records flow through a pipeline.

When fixing a stall, it is important to keep in mind that pipeline stalls are the penalty we pay for the fact that the pipeline dispatcher moves records through a multistream pipeline in such a way that the order of their arrival at the end of the pipeline is predictable.  By reasoning carefully about record flow, one can learn to prevent stalls while retaining predictability.  "Fixing" a stall by introducing bunches of buffer or copy or elastic stages at random may cause your pipeline to use more virtual memory than necessary, but, more importantly, it may introduce indeterminacy into the pipeline.  It is well worth learning to cure stalls by inserting only the minimal buffering required to solve the problem.

There are three tools that you can use to analyze a stall:

1.  Pipe dumps:  When a pipeline stalls, a file called PIPDUMP LISTING is written to the A-disk.  It contains a dump of Pipelines control blocks somewhat cryptically formatted.  It is possible to solve a stall by reading a PIPDUMP.  I did it once, but I don't recommend it.  In fact, you may prefer to turn PIPDUMPs off by issuing the CMS command GLOBALV SETP PIPDUMP OFF.

2.  Stall traces:  When a pipeline stalls, a stall trace is written to the console unless you have suppressed Pipelines messages by modifying the MSGLEVEL setting.(2)  It is quite feasible to shoot a stall using the stall trace.

--------------------
(2)  The option nomsglevel 6 disables the message "Issued from ...", but retains the message "Running ...".

Here is a pipeline that stalls:

   'PIPE (nomsglevel 6 endchar ? name StallingPipeline)',
      '< input file |',             /* Read input file.          */
      'f: find V|',                 /* Select detail records.    */
      'l: lookup 13.8 master |',    /* Compare with masters.     */
      '> output file a',            /* Write matched masters.    */
      '?',
      'f: |',                       /* Non-detail records here.  */
      'locate 2 /C/ |',             /* Select master records.    */
      'l:'                          /* Into LOOKUP's secondary.  */

   Pipeline stalled.
   ... Running "> output file a".
   Stage is wait out.
   ... Running "< input file".
   Stage is wait out.
   ... Running "find V".
   Stage is wait loc.
   ... Running "lookup 13.8 master".
   Stage is wait loc.
   ... Running "> output file a".
   Stage is wait loc.
   ... Running "locate 2 /C/".
   Ready(-4095);

One thing you need to know is that the "Pipeline stalled" message in the trace does not point to the stage responsible for the stall; it just points to the last stage the dispatcher tried to dispatch.  In other words, the dispatcher had tried to dispatch all of the other stages and had been unable to, so when it got to this one it realized that the pipeline was stalled.  In fact, a stall is not caused by a specific stage; it is caused by the topology of the pipeline.

The approach I find most useful in debugging a stall is to check the stall trace for all stages that are marked "wait out".  Those stages were blocked waiting for their output to be consumed.  I next look at the stages that were supposed to consume those records and try to figure out why they could not consume them.  Either they are also marked "wait out" (i.e., we have a cascade of stages waiting for the same record to be consumed), or they are marked "wait loc", which means that they were waiting for input, but on a different input stream.  Until that other stream produces an input record, the stage marked "wait loc" will not switch to read the record from the stage marked "wait out".  When you have figured out why that other stream did not produce a record, you have figured out the cause of your stall.

Looking at the PIPE command and this stall trace, we see:

o  < is waiting for its output to be consumed.  Its only output stream is connected to find, which is also waiting for its output to be consumed.  Therefore, find had peeked a record from <, leaving < blocked in an output while find tried to write that record itself.

o  find has two output streams.  Its primary output is connected to lookup, which is waiting to read input.
If it had been the case that (1) find was trying to write to its primary output and (2) lookup was trying to read from its primary input, they would have been able to do so.  Therefore, we can conclude that at least one of those statements is untrue.

o  The input stream of locate is connected to the secondary output of find, and locate is waiting to read.  If find had been trying to write to its secondary output, locate would have been able to read the record.  From that we can conclude that find must have been trying to write to its primary output.

o  Therefore, lookup must not have been reading from its primary input, which is why the pipeline stalled.

You will recall that lookup must read all records from its secondary input before it reads any records from its primary input.  (That is, it reads all of the master records into its reference before it begins processing its detail records and matching them against that reference.)  In this pipeline, find can't write any records to its primary output until lookup begins reading from its primary input, but lookup won't start reading from its primary input until it has seen end-of-file on its secondary input.  It will not see end-of-file on its secondary input until find has read all of its input records and copied them to either its primary output or its secondary output.  This is an impasse, which is why the pipeline stalled.

To fix the stall, one must allow find to write to its primary output when it finds a record that should go to that stream.  The way to do that is to insert a buffer stage between find and lookup to absorb all of the records that find writes to its primary output and hold them until lookup begins reading from its primary input:

   'PIPE (endchar ? name NonStallingPipeline)',
      '< input file |',             /* Read input file.          */
      'f: find V|',                 /* Select detail records.    */
      'buffer |',                   /* Hold until masters read.  */
      'l: lookup 13.8 master |',    /* Compare with masters.     */
      '> output file a',            /* Write matched masters.    */
      '?',
      'f: |',                       /* Non-detail records here.  */
      'locate 2 /C/ |',             /* Select master records.    */
      'l:'                          /* Into LOOKUP's secondary.  */

Note that introducing a buffer before the lookup will not introduce unpredictability into this pipeline.  The order of the output records here is determined solely by the order of the records on the primary input of the lookup, which is not changed by holding those records in a buffer for a while.

Here is another pipeline that stalls:

   /* BADUPW REXX -- *BAD* Uppercase word in left margin */
   signal on novalue
   'callpipe (name BadUpW nomsglevel 6 endchar ?)',
      '| *:',
      '|c: chop blank',                    /* Truncate after the first word */
      '| xlate upper',                     /* Make first word uppercase     */
      '|i: spec 1-* 1 select 1 1-* next',  /* Rebuild record                */
      '| *:',
      '?c:',                               /* Rest of record                */
      '|i:'                                /* Into SPEC                     */
   exit RC

   pipe (nomsglevel 6) literal abc def | badupw | console

   Pipeline stalled.
   ... Running "console".
   Stage is wait out.
   ... Running "literal abc def".
   Stage is unavail.
   ... Running "badupw".
   Stage is wait loc.
   ... Running "console".
   Stage is wait out.
   ... Running "chop blank".
   Stage is wait out.
   ... Running "xlate upper".
   Stage is wait loc.
   ... Running "spec 1-* 1 select 1 1-* next".
   Ready(-4095);

From this stall trace, we see:

o  The REXX filter badupw is marked "unavail"; this means that its callpipe command is running.  Therefore, the input and output streams of the badupw stage have been taken over by the subroutine pipeline; specifically, the output of the literal in the main pipeline is connected to the input of the chop in the subroutine pipeline, and the output of the spec in the subroutine pipeline is connected to the input of the console stage in the main pipeline.

o  literal is waiting for its output record to be consumed by chop.

o  chop is also waiting for its output to be consumed, but we don't yet know to which output stream that record is being written.

o  xlate is also waiting for its output to be consumed.  We know that xlate does not consume its input until after its output has been consumed, so xlate must have blocked chop by peeking at chop's output.  This tells us that chop is trying to write to its primary output stream and that the cascade of literal, chop, and xlate must be working on the same record.

o  spec is waiting to read.  Given that xlate is waiting to write to the primary input of spec, we can assume that spec must be trying to read from its other input; otherwise, spec would read xlate's output.

o  We have already concluded that chop is blocked trying to write to its primary output, so it will never be able to satisfy spec's read from its secondary output, which is why the pipeline is stalled.

The problem is that chop produces a record on two output streams one at a time, but spec requires both of its inputs to be available at the same time.  The fix is quite simple.
One inserts a copy on the primary output of chop to buffer one record:

   /* BADUPW REXX -- *GOOD* Uppercase word in left margin */
   signal on novalue
   'callpipe (name GoodUpW nomsglevel 6 endchar ?)',
      '| *:',
      '|c: chop blank',                    /* Truncate after the first word */
      '| copy',                            /* Add quantum delay so no stall */
      '| xlate upper',                     /* Make first word uppercase     */
      '|i: spec 1-* 1 select 1 1-* next',  /* Rebuild record                */
      '| *:',
      '?c:',                               /* Rest of record                */
      '|i:'                                /* Into SPEC                     */
   exit RC

Again, this fix does not introduce unpredictability, because the flow of records is still determinate.  The two records that chop produces from each original input record are still processed by spec at the same time, so the order of the records is not perturbed.

3.  Jeremy:  If you have a data-dependent stall, you may need to see the contents of the records that are being written before you can determine the problem.  My pipeline formatter jeremy(3) can help by displaying the contents of your pipeline's output buffers.  jeremy will also tell you which stream each stage was reading from or writing to, so you can skip a good deal of the analysis we needed to do when we were looking at the two traces above.  Here is the output from jeremy for the stall we have just analyzed:

   Pipeline specification 1 "JeremyTest"
   Stage 1 "literal abc def" is wait.out: "abc def".
   Stage 2 "badupw" is wait.subr.
   Stage 3 "console" is wait.locate.
   Pipeline specification 2 "BadUpW"
   Stage 2 "chop blank" is wait.out: "abc".
     input 0 selected.
     output 0 selected.
   Stage 3 "xlate upper" is wait.out: "ABC".
   Stage 4 "spec 1-* 1 select 1 1-* n" is wait.locate.
     input 1 selected -- producer's stream not selected.
     output 0 selected.

You will note that the records being written by the stages that are blocked in output commands are displayed, and we are told that spec has selected its secondary input stream but that its producer (chop) does not have that stream selected.

--------------------
(3)  jeremy is available from Princeton's CMS Pipelines Runtime Library Distribution Web page, http://pucc.princeton.edu/~pipeline/.  jeremy will be built into Pipelines in CMS 14.

One other comment about pipeline stalls:  if you get a stall with one or more stages marked "wait com" ("wait for commit"), seek professional help.


Dispatching Order

As we have seen, stalls happen when the dispatcher can find no stage to run.  Now, let's consider what happens when the dispatcher has a choice of stages to run.  Actually, the dispatcher often has some choice of which stage to run next.  In the pipelines we stepped through earlier, there were places where I glossed over this point, because it made no difference.
For example, when the readto command completes in a stage that contains a peekto-output-readto loop, there are, at that instant, two stages that could run, the consumer stage (the stage that did the readto) and the producer stage (the stage that issued the output command that produced the record that was consumed by the readto).  Usually, it makes no difference which gets dispatched next, because nothing useful will happen until the producer stage runs.  If the consumer stage is dispatched first, it will immediately become blocked in its next peekto, which won't complete until the producer is dispatched and produces a record.  The two stages will operate in lock-step, no matter what the dispatching order.

Hanging around the Plumbers' Hall, apprentices hear that "the order of dispatching is unspecified".  This is quite true, but it is usually irrelevant.  As the Piper has said, "Even though the order of dispatching is unspecified, if only one stage can run, that will be the one running".  However, in those relatively rare pipelines in which the dispatcher has a real choice of which stage to run next, the algorithm by which it makes that choice is not documented and, in fact, has changed over the years as the dispatcher has been improved.

Typically, the lesson that the dispatching order is unspecified doesn't really sink in until one learns it the hard way.  Often, this is when one writes a pipeline with more than one stage that can be started first, and one assumes that the order in which the stages will begin executing is the order in which they appear in the pipeline specification.

In fact, of course, any stage can be started first, but that doesn't matter, because most stages try to read from their input stream as soon as they are started.  If they have no input waiting, they become blocked immediately and can't do anything useful until they get some input.  Thus, the order in which they are actually started is irrelevant.  However, there are stages for which that is not true.  One case that trips novices up is the command processor stages (e.g., cms, command, cp, subcom, etc.).  When those stages have a command specified as their argument, they will issue that command as soon as they are started.  And there is no reason why the dispatcher can't start them before other stages, even if they are positioned very late in the pipeline specification.  If you want to issue a command only after earlier portions of the pipeline have finished processing, this is the idiom to use:

   pipe . . . | hole | append command FINIS * * A | hole

append passes all of the records from its input to its output and doesn't launch the stage specified in its argument until after it has finished passing all of those records.  The process of copying the input records to the output completes as soon as end-of-file is received on the input stream or on the output stream.  The first hole stage here runs until it receives end-of-file on its input.  When that hole stage terminates, append sees that it has no more input records to copy to its output, so it runs the command stage.  There is nothing tricky about that; it is the way people usually use append.  What makes this case tricky is that append is free to run its argument stage earlier than one might have expected, if its output is not connected.  If it discovers that its output is not connected, it will not try to copy its input to its output.  It will then run its argument stage as soon as the dispatcher dispatches it.  So, the purpose of putting a stage after append here is to keep append from getting premature end-of-file on its output.  Some other stage, such as console, would do instead of hole, but the output of append should be connected when append is used to wait to issue a command until after earlier portions of the pipeline have completed.
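
For instance, in this hypothetical example (the file names are made up, and the FINIS is there only to illustrate the ordering), the command is not issued until the whole file has been copied:

   pipe < daily data a | > archive data a | hole | append command FINIS * * A | hole

The first hole absorbs the records that > passes along, so append does not see end-of-file on its input until the copy is complete; the final hole keeps append's output connected, so that append does not run its argument stage prematurely.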

Another case that trips up plumbers in training is a pipeline that contains two independent input device drivers.  Again, there is a tendency to assume that stages run in the order they occur in the pipeline specification.  Here is an example:

   'PIPE (endchar ?) var banana | f: faninany | > output file a',
      '? stem glorp. | f:'

What will be the order of the records in the output file?  Novices may assume that the output of the var stage will be first, followed by the records from the stemmed array.  That may be the case, but it could be the other way around, or the records could be intermixed.  (In fact, the results have been different at different levels of CMS Pipelines.)  Both var and stem are able to run as soon as the pipeline starts up, so the dispatcher could choose to run either of them first.  Since the sources of the two streams are independent, the order in which the records arrive at faninany is left to the whim of the dispatcher.

Nothing in this pipeline constrains the relative order in which the two input device drivers (var and stem) are initiated and dispatched; they operate completely independently of one another, each writing its output to faninany.  faninany has no preference for one of its input streams over the other.  Each device driver operates in lock-step with faninany, in the sense that neither can produce another output record until after faninany has consumed the earlier record from that device driver, but the order in which faninany will read from its streams when both have a record available is not specified, nor is the order in which the two device drivers will be started.

In a case like this, where multiple input device drivers operate independently of one another, the plumber must take steps to constrain the order of the records in the output file to be what he wishes.  In this case, the most obvious solution is to replace faninany with fanin, as fanin will read all of the records produced by one of the device drivers before reading any of the records produced by the other.  Which stream is read first is determined by the arguments to fanin.
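
For example, this variant (with the same hypothetical banana variable and glorp. stem) replaces faninany with fanin; because fanin with no arguments reads its primary input to end-of-file before touching its secondary input, all of the var records precede all of the stem records in the output file:

   'PIPE (endchar ?) var banana | f: fanin | > output file a',
      '? stem glorp. | f:'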

Both device drivers may get started and produce a record before fanin is started.  One will then have to wait for that record to be consumed until after fanin has read all of the records from the other device driver.  Or, in fact, one of the device drivers may not even be dispatched to produce its first record until after the other has terminated.

It is possible to determine the dispatching order by tracing a pipeline.  Indeed, there have been numerous misguided requests to document the relative starting order of various input device drivers.  It is a mistake, however, to write pipelines that depend on that order.  The order has changed in the past and is likely to do so again.  Pipelines has mechanisms for enforcing deterministic behaviour, and it behooves the journeyman plumber to learn to use them.


Record Delay

Now, let's go on to the more interesting case where one can get burned by making assumptions about dispatching order.

In one's early career as a plumber, one can get along nicely without understanding the concept of record delay, but later it becomes essential, so we will discuss record delay at some length.  In fact, we have already been discussing record delay, but without calling it by name.  I have avoided the term "delay the record" up until now, because it seems to be a difficult one to absorb.  As Steve Hayes has said:

   Perhaps "distract the record" would be a better term.

   "Record delay" occurs when a stage tells the stage upstream that it has finished with a record before this is really true.  The plumbers' jargon for this is "consuming the record".  A stage that does a readto before an output unblocks its upstream stage while it still has work to do, leaving two stages able to run at the same time.  You can't predict which the pipeline dispatcher will choose, so your records may arrive in the "wrong" order.

   There can be many good reasons for doing this; for instance, unique last must do this, because it needs to read record n+1 before deciding whether to write record n, and the Pipelines model ensures that a stage can get records from a stream only one at a time.

   But if you have a simple peekto-output-readto loop, the output completes before the readto occurs, which ensures that the downstream stage has finished with the record before it tells the upstream stage that it has finished with it.(4)

--------------------
(4)  From PIPELINE CFORUM, 6 April 1994, 15 May 1996, and 6 September 1996.

In other words, a stage that processes its records through a peekto-output-readto loop does not delay the record.  Thus, a stage that does not delay the record has the properties we have already been discussing:

   In general, in a cascade of stages that do not delay the record, a single record moves from stage to stage, leaving all the stages it passes through blocked.  When the record reaches the end of the pipeline, then the last stage, which does no output operation to the pipeline and thus does not become blocked in a write, can consume the record; this unblocks the next-to-last stage, which can then consume its input record, unblocking the previous stage, until all the stages unblock, like dominoes falling.

Thus, a cascade of filters that do not delay the record behaves as a unit.  Only one record will be traversing the cascade at any one time; the order of the individual records cannot be changed in the course of traversing the cascade.  This remains true even if the cascade includes multiple streams and even if there are different numbers of stages in different paths.  Records leave the cascade in the same order they enter it; that is, they are not delayed relative to one another.
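
To see what a stage that does delay the record might look like, here is a hypothetical user-written filter in the spirit of unique last:  it must consume record n+1 before it lets go of record n, so its upstream stage is unblocked while the held record is still inside the filter, and a record on a parallel path can overtake it.  (This is an illustration only, not a built-in program.)

   /* HOLDONE REXX -- hypothetical stage that delays the record by one */
   Signal On Error
   'READTO held'                /* Consume the first record; hold it.  */
   Do Forever
      'READTO record'           /* Consume record n+1; the upstream    */
                                /* stage is now free to run.           */
      'OUTPUT' held             /* Only now write record n.            */
      held = record
   End
   Error:
   If RC = 12 & Symbol('held') == 'VAR' Then 'OUTPUT' held  /* Flush at end-of-file. */
   Exit RC*(RC<>12)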

Record delay is relevant only in multistream pipelines, so let's look at a simple multistream pipeline that appends a certain string to any record that contains the name "Sid":

   'PIPE (endchar ?)',
      '< input file |',
      'l: locate /Sid/ |',
      'spec 1-* 1 /(!)/ next |',
      'f: faninany |',
      '> output file a',
      '?',
      'l: |',
      'f:'

Assuming the input file contains the records on the left [below], we would expect the output file to contain the records on the right:

   Don't forget        Don't forget
   to tell Sid         to tell Sid(!)
   about this.         about this.

There would be cause for concern if the records in the output file were not in the same order as the records in the input file, but how can we be sure this is the case and how can we reason about other pipeline topologies?

Let's walk the three records through this pipeline network:

      +----------+    +--------+    +------------+
      |          |----|  spec  |----|            |
      |          |    +--------+    |            |
   ---|  locate  |                  |  faninany  |---
      |          |                  |            |
      |          |------------------|            |
      +----------+                  +------------+

1.  locate peeks at the first record, which does not contain the required string; locate writes the record to its secondary output stream rather than discarding it.  locate is now blocked in a write.  What the dispatcher now does is "unspecified", but it is clear that no further data can flow before faninany reads from its secondary input stream.

2.  faninany starts and issues a call to the dispatcher to wait until a record is available on any one (or more) of its input streams.

3.  faninany is resumed, because there is a record available on its secondary input stream.  We know that locate is blocked after it has written the first record to its secondary output stream.  spec is probably waiting for an input record, but locate cannot produce one on its primary output stream while it is blocked waiting to complete the write on its secondary output stream.

4.  faninany peeks at the record on its secondary input stream and writes it to its primary output stream.  The record has now left the multistream part of the pipeline; locate is still blocked waiting for its output to complete.

5.  The stage connected to faninany's output stream consumes the record.  faninany can now run; it consumes the record on the secondary input stream (which releases locate) and waits for the next input record on any of its input streams.

6.  locate consumes the first input record and peeks at the second one, which does contain the specified string; locate writes that record to its primary output stream.

7.  spec finally gets a record to work with.  It builds the output record in a work area (copying the input record and appending the literal string) and writes the amended output record.  Both locate and spec are now blocked waiting for their output to complete.
faninany is resumed; it peeks at the record, writes it, consumes it (which releases spec), and waits for the next record wherever it may appear. 0 9. spec is resumed; it in turn consumes the record (releasing locate)(5) and peeks for some more to do. 0 10. locate consumes the second record, peeks the third, and writes it to the secondary output stream because it does not contain the specified string. locate is now blocked in a write; spec is blocked in a peek; again, only faninany can run. 0 11. faninany writes the third record, consumes it (releasing locate), and waits for the next one. 0 12. locate receives end-of-file on its input and returns. This severs its two output streams. 0 13. spec is resumed with an indication of end-of-file; it also returns, severing its output stream. 0 14. faninany is resumed with an end-of-file return code because all of its input streams are now severed; it returns as well. 0 We see that the output records arrive in the expected order, because peeking a record ensures that the producer cannot run and thus cannot produce another record while the first one is still in flight. The important piece of information is that spec does not delay the record. This means that a record being processed by spec cannot be overtaken by another record that takes a different path. 0 So far we have met only built-in programs that do not delay the record; are there any that do? sort must of necessity see all input records before it can begin to produce output. However, a more subtle form of delay occurs with unique last, which compares pairs of records and discards duplicates, retaining the last of a set of identical records. Clearly, unique last must buffer one record internally while it examines the next one; as a result the output record is delayed relative to records that take a path that bypasses the unique last stage. Now, insert unique last instead of spec in the multistream example and run the file through it again: 0 0 -------------------- 0 (5) Both spec and the stage to its left [(locate)] can now run; conceptually they run in parallel, but spec will be blocked if it peeks and there is no new record yet available. 1 0 Page 26 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 +----------------------------------------------------------------+ | | | +----------+ +--------+ +----------+ | | | |---| unique |---| | | | | | +--------+ | | | | ---| locate | | faninany |--- | | | | | | | | | |----------------| | | | +----------+ +----------+ | | | +----------------------------------------------------------------+ 0 Watch this happen: 0 1. The first record does not contain the specified string; it bypasses unique and becomes the first output record. 0 2. The second record is read into unique's buffer; unique now peeks to see the next input record. 0 3. The third record bypasses unique as well. Thus it becomes the second output record. 0 4. Finally, unique receives end-of-file; it writes the record from its buffer, making it the third output record. 0 Note that even though unique has only a one-record delay, the effect of the pipeline topology may mean that many more records can overtake the one being buffered. 0 Thus, to reason about this in general, the concept of a record + ______ delay (or simply delay) is introduced. This delay is not the time + _____ it takes for a stage to process a record; rather, it represents a change in the relative order of records that pass through different paths in a multistream pipeline. 
If a record takes a path entirely through stages that do not delay the record, then the record must arrive at the end of the pipeline ahead of any record that enters the pipeline path after it does. If a record passes through a stage that delays records, then the record may arrive at the end of the pipeline later than a record that takes a path without delay. 0 All CMS Pipelines built-in programs are written not to delay the + _____________ record, except where required by the function they perform. A user-written stage might delay records if it processes them with a readto-output loop; it will not delay them if it processes them with a peekto-output-readto loop. 0 Appendix A contains examples of user-written stages with various degrees of delay. The author's help files (which you can display using the undocumented pipe ahelp command) define quite rigorously the record delay introduced by each built-in stage. Let's look at a few examples from these help files: 1 0 Record Flow in CMS Pipelines Page 27 ------------------------------------------------------------------------ 0 o Strictly non-delaying stage (console): 0 Record Delay: console strictly does not delay the record. 0 Like most other stages, console uses a peekto-output-readto loop, so it writes each output record before it consumes the corresponding input record and does not delay the record. 0 o Strictly non-delaying stage (delay): 0 Record Delay: delay strictly does not delay the record. That is, delay consumes the input record after it has copied it to the primary output stream; records are delayed in time, but the relative order of records that originate in a particular filter is unchanged. 0 delay also uses a peekto-output-readto loop, so records passing through it, no matter how long they may take, cannot be bypassed by records going through the pipeline on a parallel stream, if none of the stages delays the record. Note, however, that this does not mean that the entire pipeline necessarily waits while a delay stage is waiting for its timer to expire. Whether that happens depends upon the topology of the pipeline, but if all the stages between delay and the source of its records are ones that do not delay the record, that source stage will not be able to produce other records before delay's timer has expired. 0 o Non-delaying stage (getfiles): 0 Record Delay: getfiles writes all output for an input record before consuming the input record. 0 getfiles does not delay the record. It writes all of the records from a file it has gotten before it consumes the input record that specified the name of that file. The reason that getfiles is not described as being "strictly" non-delaying is that it can produce more than one output record for each input record. When getfiles is used in a multistream pipeline containing only non-delaying stages, all of its output records derived from a given input record will get to the end of the pipeline ahead of the records derived from later input records, no matter what path they take through the pipeline. That is, it truly does not delay the record. 0 o Stage with different levels of delay on different streams (cms): 0 Record Delay: cms writes all output for an input record before consuming the input record. When the secondary output stream is defined, the record containing the return code is written to the secondary output stream with no delay. 
0 When a cms stage issues the command contained in an input record, it writes all of the lines of the response to its output before it 1 0 Page 28 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 consumes that input record. In this, it is exactly parallel to getfiles; neither of them delays the record. cms (like most of the other command processor stages) has the added wrinkle that it writes the return code from the command to its secondary output, if there is a secondary output. One return code record is written for each input record, and the return code record is written before the input is consumed, so the return code record is strictly not delayed. 0 o Stage with different delay depending on the keywords used (drop): 0 Record Delay: drop first does not delay the record. drop last delays the specified number of records. 0 drop first reads and discards the specified number of records and then shorts its input to its output, so it does not delay the records; indeed, it does not even see them. 0 drop last must hold the specified number of records in a rotating buffer until end-of-file, when it discards them. As it reads each new record from its input, it can write out the record that has been in its buffer the longest, keeping the specified number of records in its buffer. Thus, it delays the specified number of records. 0 o Potentially delaying stage (copy): 0 Record Delay: copy has the potential to delay one record. 0 The copy program can be thought of as a one-step elastic; it can delay one record, but it need not. As we have seen, copy uses a readto-output loop. After copy consumes its input record, the dispatcher may allow it to write its output record immediately, in which case the record will not be delayed. However, the dispatcher may instead allow the producer stage to run and produce another output record before copy is allowed to write its output record, in which case the record is delayed by one. 0 o Stage with unspecified delay (block): 0 Record Delay: block delays input records as required to build an output record. The delay is unspecified. 0 There is not a one-to-one equivalence between the input and output records for block; it may need to span the bytes from a given input record across more than one output record, so the record delay is undefined. 0 o Stage that delays the entire file (hole): 0 Record Delay: hole delays all records until end-of-file. 0 hole writes no output records, but it keeps its consumer stage blocked in a peekto until after it has seen end-of-file on its own input; thus, it delays all records until end-of-file. 1 0 Record Flow in CMS Pipelines Page 29 ------------------------------------------------------------------------ 0 Now, let's review the concept of record delay by looking at explanations by a couple of master plumbers. First is Glenn Knickerbocker: 0 "Delay", in Pipelines terms, is any time that elapses between + _________ releasing the input record and releasing the output record. 0 When every stage releases its output record before releasing its input record, every record must go through the whole pipeline before another record can be processed. The record may take a microsecond or an hour to get through the pipeline, but it is not delayed in relation to the other records. 0 When any stage releases its input record before releasing its output record, processing may (or may not) start on the next record before this one is fully processed. 
This record has been delayed in relation to the next record entering the pipeline. 0 [unique last] delays the record by exactly one record. Each output record is released after the next input record arrives and before it is released. 0 A readto-output loop, on the other hand, delays the record by some unspecified amount less than or equal to one record. The exact amount is determined by the rest of the pipeline and the whim of the dispatcher. But it is the act of releasing the input record before the output record that introduces the delay in the first place. 0 You might say the terminology is backwards, and what is really happening is not that the stage has delayed processing of the first record but that it has allowed processing of the next record to begin early. But remember, despite how it looks from outside, Pipelines does only one thing at a time. If it is processing the + _________ next record, it is not processing this one, so the processing of this record is delayed.(6) 0 And this is from Steve Hayes: 0 I find the best way to look at the term "delay the record" is that a stage that delays the record contains a place where that record stays whilst other processing goes on.... A readto-output loop contains such a place: the variable used on the readto. A peekto-output-readto loop does not, since the peekto, unlike the readto, does not allow stages upstream to execute and so the record has "left" its variable before any other processing goes on, and therefore there is no delay. 0 0 -------------------- 0 (6) From PIPELINE CFORUM, 5 September 1996. 1 0 Page 30 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 An important thing to realise is that there are stages which may + ___ delay records (e.g., copy and elastic) and ones which do delay + ____ __ records (e.g., [unique last]). The former can introduce needed + ____ flexibility or unwanted (and often unnoticed) indeterminacy in the pipeline. The latter cannot, although they can produce a stall (and you then may need one of the former to solve it).(7) 0 III. WRITING "MULTI-TASKING" PIPELINES + III. WRITING "MULTI-TASKING" PIPELINES + _______________________________________ 0 So far, we have been discussing how to write pipelines through which records flow in an entirely predictable manner, so that the order of records is never altered except by stages, such as sort, that are specifically intended to reorder records. In such pipelines, the goal is to make sure that the dispatcher has no leeway, that it makes the stages of the pipeline operate in lock-step, generally moving one record at a time through the entire length of the pipeline. 0 That model will hardly do, however, when one wants to build a multi-tasking pipeline, such as a server that will serve multiple clients concurrently. In that case, the records representing the requests from concurrent clients must be able to flow through the pipeline simultaneously and independently of one another. Fortunately, if one has a basic understanding of the flow of records through a pipeline, it is extremely easy to write a "multi-tasking" pipeline. Although Pipes has recently been enhanced to support CMS Multi-Tasking, + _____ what I will be discussing here is how to achieve the effect of multi-tasking within a single pipeline set, whether or not CMS Multi-Tasking is active. 0 In CMS Pipelines, each stage is, in a very real sense, an independent + _____________ thread of execution. And pipeline segments behave like processes. 
The power of the Pipelines dispatcher allows properly coded pipeline + _________ segments to overlap their execution quite effectively, with or without the use of CMS Multi-Tasking. 0 CMS Pipelines itself is not MP-capable, so the sort of pipeline I will + _____________ be showing you cannot use more than one virtual processor at a time, and the entire pipeline set is suspended when one of the stages issues a synchronous DIAGNOSE or a CMSCALL. Other than that, the independent pipeline "processes" will continue running in parallel as long as one takes a bit of care to make sure that they can get work when they are ready for it and that they don't become blocked trying to write to other portions of the pipeline. 0 It doesn't matter whether the pipeline segments to be overlapped are created by callpipe or addpipe or are just part of a big multistream pipeline, so long as they are all part of the same pipeline set and, 0 -------------------- 0 (7) From PIPELINE CFORUM, 6 September 1996. 1 0 Record Flow in CMS Pipelines Page 31 ------------------------------------------------------------------------ 0 thus, are controlled by the same instantiation of the pipeline dispatcher. 0 I've been experimenting over the past two years with writing such multi-tasking pipelines and have been delighted with the results I've gotten and the ease with which I've gotten them. I have a multi-tasking SMTP client that uses multiple connections to a server SMTP to move more data than was possible with a single connection. (This SMTP client is light-weight enough to run very well on a P/370 system, where it serves as an Internet/BITNET gateway.) I also have a multi-tasking version of Rick Troth's Webshare Web server that supports multiple simultaneous Web clients with quite good responsiveness (even while running CGIs for one or more of them). 0 There is no point, of course, in writing a multi-tasking pipeline unless at least some of the pipeline segments wait on asynchronous events outside of the virtual machine. One could certainly write a multi-tasking pipeline that, say, split a file into two streams and then performed the same filtering or changing operations on both streams. That would work, but it would be slower than putting all of the records through a single stream, because the dispatcher overhead would be greater. However, when you have a process that must wait for input from the network, you can markedly increase total throughput by using multiple processes, thus allowing one process to execute while another is waiting on network I/O. 0 My approach in both the SMTP client and the Web server was to create multiple identical processes and to let a deal stage dole work out to them as they request it. When a process completes a task, it queues for more work by writing a record containing its process number. In return, deal passes it a record that describes its next task. (In the SMTP case, this record contains the spoolid of the next input file; in the Webshare case, the record describes the socket for a client connection request.) 0 Each process must be careful to consume its input record "quickly", so that deal can continue passing out records to the other processes. Similarly, when the process writes a record to queue for more work, something in the central portion of the pipeline must consume that record quickly, so that the process doesn't become blocked waiting for its output to be consumed. 0 In other words, all there is to writing multi-tasking pipelines is to let the ends of the processes flap a bit. 
That is very easy to do, as I + very will now show you. 0 To see how to build up a multi-tasking pipeline, let's start with a simple SMTP client that has only one connection to its server: 0 PIPE starmsg | find ... | spec 26.4 1 | smtp 1 0 Page 32 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 A starmsg stage captures CP information messages, and a find (which isn't completely shown) selects the messages describing arriving reader files. spec isolates the spoolid number, which is passed to the smtp stage, signalling it to process the corresponding spool file. When the smtp stage has finished processing the spool file, it will read another spoolid record from its input and then process that file, continuing to process one file at a time forever. 0 An easy way to add more connections to this scheme is to build a pipeline containing multiple smtp stages and to use a deal stage to "round-robin" the files amongst the smtp stages: 0 +----------------------------------------------------------------------+ | | | 'PIPE (endchar ?)', /* Dispatch by round-robin: */ | | 'starmsg |', /* Capture spool arrivals. */ | | 'find' arrival_msg'|', /* Select file arrival message.*/ | | 'spec 26.4 1 |', /* Isolate spoolids. */ | | 'd: deal', /* Dole them out to SMTPs. */ | | '?', | | 'd: | copy | smtp', /* DEAL's output stream 1. */ | | '?', | | 'd: | copy | smtp', /* DEAL's output stream 2. */ | | '?', | | 'd: | copy | smtp' /* DEAL's output stream 3. */ | | | |----------------------------------------------------------------------| + + + | | | +------+ +------+ +--------+ | | -->| find |-->| spec |-->|0 0|-| | | +------+ +------+ | | | | | | +------+ +------+ | | | 1|-->| copy |-->| smtp |-| | | | | +------+ +------+ | | | deal | | | | | +------+ +------+ | | | 2|-->| copy |-->| smtp |-| | | | | +------+ +------+ | | | | | | | | +------+ +------+ | | | 3|-->| copy |-->| smtp |-| | | +--------+ +------+ +------+ | | | +----------------------------------------------------------------------+ 0 deal was added in CMS 12. It works like a card player dealing cards, passing records to its output streams in turn. In this example, deal has no primary output stream, so it passes the first record it receives to its secondary output stream (stream 1), the next record to its tertiary (stream 2), the next to it quaternary (stream 3), and then the next to its secondary again, continuing in sequence to round-robin its output. 1 0 Record Flow in CMS Pipelines Page 33 ------------------------------------------------------------------------ 0 If you want these smtp stages to run in parallel, though, you must be careful that the output of deal is consumed "quickly", so that deal can continue passing out spoolid records to other streams. The copy stages consume the output of deal to unblock it immediately. (As we have seen earlier, copy can be thought of as a one-record buffer.) 0 This scheme is sufficient to allow the smtp stages to run in parallel. However, one of the smtp stages might be sending a large file on a slow link, so it might not be ready to consume a spoolid record the next time through the round-robin process. That would block deal and cause the other smtp stages to go idle, because deal could not give them any work to do. One could change each copy stage to elastic, of course. That would prevent the blocking of deal, but would still leave files queued in elastic's buffer waiting for a busy smtp stage to complete. 
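For reference, that variation is only a one-word change in each of deal's output streams; it is shown here merely as a sketch, since, as just noted, it only moves the queue from deal into elastic's buffers:
0
   'PIPE (endchar ?)',             /* Round-robin, elastic buffers*/
   'starmsg |',                    /* Capture spool arrivals.     */
   'find' arrival_msg'|',          /* Select file arrival message.*/
   'spec 26.4 1 |',                /* Isolate spoolids.           */
   'd: deal',                      /* Dole them out to SMTPs.     */
   '?',
   'd: | elastic | smtp',          /* DEAL's output stream 1.     */
   '?',
   'd: | elastic | smtp',          /* DEAL's output stream 2.     */
   '?',
   'd: | elastic | smtp'           /* DEAL's output stream 3.     */
0
The files queued in the elastic buffers remain tied to particular connections, even if another connection has gone idle, which is what motivates the scheme that follows.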
0 A better scheme would be not to dispatch the SMTP processes in round-robin fashion at all, but instead to let each process queue for a file when it is ready for one. This pipeline does that: 1 0 Page 34 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 +----------------------------------------------------------------------+ | | | 'PIPE (endchar ?)', /* Dispatch on request: */ | | 'starmsg |', /* Capture spool arrivals. */ | | 'find' arrival_msg'|', /* Select file arrival message.*/ | | 'spec 26.4 1 |', /* Isolate spoolids. */ | | 'd: deal secondary', /* Dole them out to SMTPs. */ | | '?', | | 'y: faninany | elastic |', /* Secondary input of DEAL. */ | | 'd: | copy | smtp | literal | spec /1/ 1 | y:', /* Process 1. */ | | '?', | | 'd: | copy | smtp | literal | spec /2/ 1 | y:', /* Process 2. */ | | '?', | | 'd: | copy | smtp | literal | spec /3/ 1 | y:' /* Process 3. */ | | | |----------------------------------------------------------------------| + + + | | | +----+ +------+ +-----+ | | -->|spec|-->|0 0|-| |-|0 0|--+ | | +----+ | | | | | | | +----+ | | +----+ +----+ +-------+ +----+ | f | | | | +->|elas|-->|1 1|->|copy|->|smtp|->|literal|->|spec|-->|1 a | | | | | |tic | | | +----+ +----+ +-------+ +----+ | n | | | | | +----+ | deal | | i | | | | | | | +----+ +----+ +-------+ +----+ | n | | | | | | 2|->|copy|->|smtp|->|literal|->|spec|-->|2 a | | | | | | | +----+ +----+ +-------+ +----+ | n | | | | | | | | y | | | | | | | +----+ +----+ +-------+ +----+ | | | | | | | 3|->|copy|->|smtp|->|literal|->|spec|-->|3 | | | | | +------+ +----+ +----+ +-------+ +----+ +-----+ | | | | | | | +------------------------------------------------------------------+ | | | +----------------------------------------------------------------------+ 0 As before, each SMTP process gets its input from an output stream of deal and consumes it quickly, but now each smtp stage also has an output stream of its own that feeds ultimately into the secondary input of deal (through the spec, faninany, and elastic). deal is invoked with the secondary option, which says to read a stream of stream numbers from its secondary input and write the records from its primary input to the specified output streams. So, when an SMTP process is ready for another input file, it simply produces a record containing its stream number, thus queuing for output from deal secondary. That is, when a process is ready for work, it writes an output record containing its stream number to an input stream of faninany, and faninany feeds those stream number records to elastic, which consumes them immediately and buffers them for feeding into the secondary input of deal secondary (the second occurrence in the pipeline of the label "d:"), in the order in which they were received. (The literal in each segment produces a first record to get the process started.) 1 0 Record Flow in CMS Pipelines Page 35 ------------------------------------------------------------------------ 0 The multi-tasking scheme used here is quite effective. In the production version of this SMTP client, the total throughput with four processes is twice as great as with one. (It increases little with additional processes, presumably as the result of the synchronization imposed by the DIAGNOSE used to read from the spool.) Letting the processes queue for work rather than dispatching them in round-robin fashion definitely improves throughput. 
With queuing, the elapsed time to deliver a thousand identical files using five processes was reduced by nine percent in comparison with strict round-robining. Obviously, the improvement would be greater in situations in which the files were of differing sizes. 0 Any pipeline process can be made multi-tasking using the same scheme. This is how it works with my variant of Rick Troth's TCPSHELL: 0 +----------------------------------------------------------------------+ | | | processes = '' /* Initialize pipeline segment.*/ | | Do n = 1 to processno /* Build server processes. */ | | processes = processes, /* Append another process. */ | | 'd: |', /* Next spool file id to here. */ | | 'tcpshell' n '|', /* Invoke server process #n. */ | | 'y:', /* Streamnumber recs from here.*/ | | '?' | | End | | | | 'PIPE (endchar ? name TCPShell)', /* Run TCP server processes: */ | | 'tcplisten' server_port '|',/* Wait for the next client. */ | | 'd: deal secondary', /* Clients to ready processes. */ | | '?', | | 'y: faninany |', /* Stream of streamnumber recs.*/ | | 'elastic |', /* Hold until ready for them. */ | | processes /* Feed to server processes. */ | | | +----------------------------------------------------------------------+ 1 0 Page 36 Record Flow in CMS Pipelines ------------------------------------------------------------------------ 0 +----------------------------------------------------------------------+ | | | +-----------+ +------+ +-----+ | | -->| tcplisten |-->|0 0|-| |-|0 0|--+ | | +-----------+ | | | | | | | +-----------+ | | +-----------+ | f | | | | +->| elastic |-->|1 1|-->| process 1 |-->|1 a | | | | | +-----------+ | | +-----------+ | n | | | | | | deal | | i | | | | | | | +-----------+ | n | | | | | | 2|-->| process 2 |-->|2 a | | | | | | | +-----------+ | n | | | | : : : : y : : | | : : : : : : | | | | | +-----------+ | | | | | | | n|-->| process n |-->|n | | | | | +------+ +-----------+ +-----+ | | | | | | | +------------------------------------------------------+ | | | +----------------------------------------------------------------------+ 0 A pipeline segment is built up to contain the specified number of processes, each of which reads records describing work to do from its input stream and writes a record containing its stream number each time it is ready for work. In this case, the work-to-do records are generated by a tcplisten stage as it receives client connection requests, but I think you can see that this scheme is equally applicable to other kinds of servers. 0 Indeed, I will submit that one can use this framework to build multi-tasking servers even without having passed the journeyman plumber exam. One must take some care that the processes can co-exist peaceably. For example, if they use GLOBALV, they should name their variables in accordance with their process number to avoid destroying one another's variables. If they use tapes, they should have a scheme for making sure that they don't all try to attach their tape drives at the same virtual address. But such considerations are easily sorted out and one quickly finds that one has built a robust server that handles concurrent clients gracefully. 1 0 REXX Pipeline Stages with Different Delays Page 37 ------------------------------------------------------------------------ 0 Appendix A + Appendix A 0 REXX PIPELINE STAGES WITH DIFFERENT DELAYS + REXX PIPELINE STAGES WITH DIFFERENT DELAYS + __________________________________________ 0 The following skeleton pipeline stages were written by Steve Hayes, of IBM. 
They are examples of stages that do not, may, and do delay one record: 0 o REXX stage that does not delay the record: NULL REXX uses a + REXX stage that does not delay the record: peekto-output-readto loop in the canonical way, so it does not delay the record. It is a good model to use for one's own REXX filters. 0 +----------------------------------------------------------------------+ | | | /* NULL REXX: Skeleton filter that does not delay the record */ | | Signal on Novalue /* No uninitialised variables */ | | Signal on Failure /* Allow RC > 0 for a moment */ | | 'MAXSTREAM INPUT' /* Check only one stream */ | | Signal On Error /* now stop for any error */ | | if RC <> 0 then 'ISSUEMSG 264 PIPSJH'/* too many streams: crash */ | | do forever /* until EOF */ | | 'PEEKTO record' /* read from primary input */ | | 'OUTPUT' record /* write to primary output */ | | 'READTO' /* release input */ | | end /* next record */ | | Failure: | | Error: | | Exit (RC * (RC <> 12)) /* RC = 0 if EOF */ | | | +----------------------------------------------------------------------+ 1 0 Page 38 REXX Pipeline Stages with Different Delays ------------------------------------------------------------------------ 0 o REXX stage that may delay the record: COPY REXX may delay the + REXX stage that may delay the record: record, depending on the dispatching order. It is equivalent to the copy built-in program. 0 +----------------------------------------------------------------------+ | | | /* COPY REXX: Skeleton filter that may delay one record */ | | SIGNAL ON NOVALUE /* No uninitialised variables */ | | SIGNAL ON FAILURE /* Allow RC > 0 for a moment */ | | 'MAXSTREAM INPUT' /* Check only one stream */ | | SIGNAL ON ERROR /* now stop for any error */ | | if RC <> 0 then 'ISSUEMSG 264 PIPSJH'/* too many streams: crash */ | | do forever /* until EOF */ | | 'READTO record' /* read from primary input */ | | 'OUTPUT' record /* write to primary output */ | | end /* next record */ | | FAILURE: | | ERROR: | | Exit (RC * (RC <> 12)) /* RC = 0 if EOF */ | | | +----------------------------------------------------------------------+ 0 o REXX stage that does delay the record: HICCUP REXX has a one-record + REXX stage that does delay the record: delay. It reads a record into a buffer and peeks the next record before copying the first record to its output. 0 +----------------------------------------------------------------------+ | | | /* HICCUP REXX: Skeleton filter that delays one record */ | | SIGNAL ON NOVALUE /* No uninitialised variables */ | | SIGNAL ON FAILURE /* Allow RC > 0 for a moment */ | | 'MAXSTREAM INPUT' /* Check only one stream */ | | SIGNAL ON ERROR /* now stop for any error */ | | if RC <> 0 then 'ISSUEMSG 264 PIPSJH'/* too many streams: crash */ | | do forever /* until EOF */ | | 'READTO record1' /* read from primary input */ | | 'PEEKTO record2' /* check next input ready */ | | 'OUTPUT' record1 /* write to primary output */ | | end /* next record */ | | FAILURE: | | ERROR: | | if RC = 12 & symbol('record1') = 'VAR' & symbol('record2') = 'LIT' | | then 'OUTPUT' record1 /* input record pending */ | | Exit (RC * (RC <> 12)) /* RC = 0 if EOF */ | | | +----------------------------------------------------------------------+
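These skeletons can be exercised directly; a REXX filter is invoked by its filename, so with HICCUP REXX on an accessed disk one might try something like this (the input is arbitrary, chosen only for illustration):
0
   pipe literal one two three | split | hiccup | console
0
In a single stream the one-record delay cannot change the order of the records, of course; the difference between these three skeletons shows itself only when one of them is placed on one path of a multistream pipeline, such as the locate/faninany example earlier in this paper, where a delaying stage allows records on the other path to overtake the one it is holding. (COPY REXX shares its name with the built-in copy program, so one would want to rename it, or invoke it explicitly through the rexx built-in program, before experimenting with it.)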