13 July, 1983          Multics Technical Bulletin          MTB628

From:     Edward A. Ranzenbach

To:       MTB Distribution

Date:     July 13, 1983

Subject:  This MTB addresses issues concerning the support of the operators' system interface in a post-MR10.1 environment.

ABSTRACT

A recent RPQ effort has identified a need to review the operator's console device control module in order to provide a means of changing bootload consoles without interrupting user service. During this review several problems were discovered with the device control module, ocdcm_. The most serious causes the processor handling an interrupt from the console to loop, masked, in ring zero until the operation completes. If this operation is a read, the termination status being looped on can take as long as thirty seconds to arrive. The processor, and process, servicing this interrupt are effectively unresponsive for this period. Because of problems such as this it was decided that a complete redesign of ocdcm_ was needed.

Please direct any comments or questions to the author:

By Multics mail at System-M to:

     Ranzenbach.Multics

Or by mail to:

     Edward A. Ranzenbach
     Honeywell Information Systems Inc.
     Cambridge Information Systems Laboratory
     575 Technology Square HED MA22
     Cambridge Mass. 02139

Phone:  617 492-9350
HVN:    HVN 261-9350

________________________________________

Multics Project internal working documentation. Not to be reproduced or distributed outside the project without consent of the author or director, Multics System Development.

THE NEED FOR A NEW INTERFACE

Justification for developing a new console interface comes from many different sources. These include but are not limited to the following reasons:

o  The current interface does not function correctly. As previously stated, a serious design deficiency exists which can severely affect system performance.

o  Different models of console available at this time have different line lengths.
   The current software does not utilize the full capabilities of the printers attached to the consoles.

o  There is an RPQ requirement to provide a means of reconfiguring an IOM and all devices attached to it. This includes the console. At least one site other than the site providing the RPQ funds has requested the ability to switch consoles for maintenance purposes.

o  Current console recovery strategies are inadequate. Under the current implementation, when the console is inoperative, syserr messages are passed to the Message Coordinator while normal I/O from the Initializer is ignored.

o  Sites have requested a means of providing multiple system control stations. The Message Coordinator is not robust enough to meet these needs. Although most functions can be performed from the Message Coordinator, there are instances where it cannot perform a particular task.

o  The current interface utilizes an obsolete IOS interface.

o  Work being done on bootload Multics currently has much of the code to be changed open for modification. It is therefore a good time to coordinate the design effort, providing a more reliable interface to bootload Multics.

THE NEW DESIGN

The following features are provided by the new console support software and will be covered in separate paragraphs:

o  Interrupt driven I/O facility.

o  Dynamic error recovery from console failure.

o  Explicit console reconfiguration.

o  Console line length support.

o  Chronologically ordered I/O with priority to syserr messages.

INTERRUPT DRIVEN I/O

The console is a half duplex device and as such requires some form of I/O synchronization on the part of the software. As a result the following rules apply to console I/O:

1.)  The console has two modes of operation: output mode and input mode. While in input mode no output can occur.

2.)  Input mode is only entered when a read I/O is pending and the operator hits the REQUEST key.(1)

3.)
     The console will remain in input mode until a null input line is received or until a thirty second hardware timer expires, indicating that the operator was distracted.

This is because the Initializer process "listens" to the console by performing a loop which queues a read and goes blocked awaiting its completion. Therefore there is always a read pending except when the Initializer is busy processing input from the last read.

The current console interface operates on an interrupt driven basis, but this interface is not without problems. A process wishing to perform console I/O queues the I/O, sends a connect and subsequently goes blocked awaiting a wakeup indicating I/O completion. When the IOM completes the operation it sends an interrupt which is fielded by an interrupt routine within the console DIM. The process unlucky enough to be on the processor fielding the interrupt takes action depending upon the type of I/O being terminated.

Under the current design a ring zero race condition can occur when an operator command entered as input produces output. In this scenario the console remains in input mode awaiting the operator's next input while the previous input has generated output. If the operator does not leave input mode by entering a null input line, the process fielding the interrupt from the IOM loops, masked, in ring zero, awaiting the completion of the pending read so that it can perform the write.(2)

When this situation occurs it can effectively take the processor fielding the interrupt away from multi-programming for the length of time that it takes to complete the read. In a case where the operator never leaves input mode this can amount to as much as thirty seconds, as that is the hardware timer timeout value.

________________________________________

(1) This is the RETURN key for csu6601, (LCC), consoles.

(2) At times this situation is relieved by the presence of a Message Coordinator.
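
The three half duplex rules listed earlier in this section can be read as a small state machine. The following Python sketch is illustrative only and is not taken from ocdcm_. The class and method names are hypothetical, and it assumes that the thirty second timer restarts on each line the operator enters, which the text does not state.

```python
from enum import Enum

INPUT_TIMEOUT_SECONDS = 30.0  # hardware timer value given in the text


class Mode(Enum):
    OUTPUT = "output"
    INPUT = "input"


class Console:
    """Hypothetical model of the half duplex console rules."""

    def __init__(self):
        self.mode = Mode.OUTPUT
        self.read_pending = False
        self.timer_start = None

    def queue_read(self):
        # The Initializer normally keeps a read pending at all times.
        self.read_pending = True

    def press_request(self, now):
        # Rule 2: input mode is entered only when a read is pending
        # and the operator hits the REQUEST key.
        if self.read_pending and self.mode is Mode.OUTPUT:
            self.mode = Mode.INPUT
            self.timer_start = now
            return True
        return False

    def can_write(self):
        # Rule 1: no output can occur while in input mode.
        return self.mode is Mode.OUTPUT

    def enter_line(self, line, now):
        # Rule 3: a null input line returns the console to output mode.
        # Assumption: a non-null line restarts the thirty second timer.
        if self.mode is not Mode.INPUT:
            return
        if line == "":
            self._leave_input()
        else:
            self.timer_start = now

    def tick(self, now):
        # Rule 3: the hardware timer also ends input mode.
        if self.mode is Mode.INPUT and now - self.timer_start >= INPUT_TIMEOUT_SECONDS:
            self._leave_input()

    def _leave_input(self):
        self.mode = Mode.OUTPUT
        self.read_pending = False
        self.timer_start = None
```

In this model a write queued while can_write() is false has to wait for either a null line or the timer, which is the window in which the ring zero loop described above occurs.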
In the new design this situation is avoided as much as possible by making non-syserr I/O totally interrupt driven. The process performing the I/O queues it and then goes blocked awaiting its completion. The process fielding the interrupt simply takes the necessary action to properly terminate a successful I/O and then sends a wakeup to the blocked process.

Syserr I/O is performed via a separate entrypoint into the DIM. This I/O is considered a priority I/O and is performed ahead of any queued I/O. The process performing the I/O loops awaiting its completion.

It must be stated here that syserr I/O requires special handling. The prompt reporting of system errors is of such priority that this ring zero loop must be tolerated. Currently the design of syserr does not allow for input; therefore the described race condition can only occur if the priority_io entrypoint of the DIM is utilized for input. Usage of this entrypoint for input is currently restricted to prevent this condition, but the restriction can easily be rescinded should it become necessary to perform such operations.

DYNAMIC ERROR RECOVERY

The current interface design allows only one console to be actively configured to the system at a time. This has serious drawbacks, as a system console failure will result in the eventual crash of the system and probable loss of Emergency Shut Down. Even sites fortunate enough to have a spare console on the floor must physically swap cables in order to effect console and system recovery. The current software allows syserr traffic to be passed to the Message Coordinator, but non-syserr traffic is simply discarded. This is an unnecessary and unacceptable situation.

The new interface allows multiple consoles to be specified in the config_deck. The format of the card describing the console is presented and described in the following paragraphs.

CONFIG CARD FORMAT

A new config card format is defined which describes the system consoles, their location, type, line length, state and the actions to be taken in the event of console recovery failure. It has the format:

     PRPH OPC<tag> IOM CHANNEL MODEL LINE_LENG STATE ACTION

Each console is described as being in one of the following states:

On    -  Specifies that the console is selected as the bootload console and is the primary recipient of I/O. There must be one and only one console specified with a state of ON. In the event of bootload console failure the console's state will be changed to INOP.

Alt   -  Specifies that the console is to be utilized as an alternate console in the event of bootload console failure. If the bootload console becomes inoperative the DIM searches the config_deck for an alternate console to use. If an alternate console is found its state is changed to ON and it becomes the bootload console. Alternate consoles are selected in the order that they appear in the config_deck.

Off   -  Specifies that the console exists but that it is not to be used as an alternate console.

Inop  -  Specifies that the console is inoperative. This state is normally assigned dynamically during console recovery. Any console found in this state at console initialization time will have its state changed to OFF.

The action field defines the action to be taken in the event of complete console recovery failure. The field is only checked for the card describing the bootload console, (a state of ON), and is set to null for all others during console initialization. It may have the following values:

CRSH  -  Specifies that the system will crash. A system running without an operative console has no recourse but to log all messages without reporting them, so some security messages may not be noticed in a timely manner. Sites that consider this a problem would choose this parameter.

RUN   -  Specifies that the system will continue to run.
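
The card format and the state rules above can be sketched in Python as follows. This is a minimal illustration, not the ocdcm_ implementation. The function names are hypothetical, the sample field values in the usage note are invented, and the sketch assumes that every field is present and that LINE_LENG is a plain integer. What the action field of a newly promoted alternate becomes is not stated in the text, so the sketch leaves it unchanged.

```python
from dataclasses import dataclass
from typing import List, Optional

VALID_STATES = {"on", "alt", "off", "inop"}
VALID_ACTIONS = {"crsh", "run"}


@dataclass
class ConsoleCard:
    name: str               # OPC<tag>, for example "opca"
    iom: str
    channel: str
    model: str
    line_leng: int
    state: str
    action: Optional[str]   # None stands for the "null" action


def parse_card(text: str) -> ConsoleCard:
    """Parse 'PRPH OPC<tag> IOM CHANNEL MODEL LINE_LENG STATE ACTION'."""
    fields = text.lower().split()
    if len(fields) != 8 or fields[0] != "prph" or not fields[1].startswith("opc"):
        raise ValueError(f"not a console card: {text!r}")
    _, name, iom, channel, model, line_leng, state, action = fields
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state}")
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return ConsoleCard(name, iom, channel, model, int(line_leng), state, action)


def initialize_consoles(deck: List[ConsoleCard]) -> ConsoleCard:
    """Apply the initialization rules and return the bootload console."""
    for card in deck:
        if card.state == "inop":        # INOP at initialization becomes OFF
            card.state = "off"
    bootload = [card for card in deck if card.state == "on"]
    if len(bootload) != 1:              # one and only one console may be ON
        raise ValueError("exactly one console must have a state of ON")
    for card in deck:
        if card.state != "on":          # action is null for all other cards
            card.action = None
    return bootload[0]


def fail_over(deck: List[ConsoleCard]) -> Optional[ConsoleCard]:
    """Mark the bootload console INOP and promote the first ALT in deck order."""
    current = next((card for card in deck if card.state == "on"), None)
    if current is not None:
        current.state = "inop"
    for card in deck:
        if card.state == "alt":
            card.state = "on"
            return card
    return None                         # no alternate console is available
```

For example, a deck built from the invented cards "prph opca a 30 6001 80 on crsh" and "prph opcb b 31 6601 132 alt run" initializes with opca as the bootload console, and a call to fail_over leaves opca in state INOP and opcb in state ON.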

If an unrecoverable bootload console error occurs the system will take the following actions:

1.)  Search the configuration for an alternate console. The alternate chosen will be the next alternate encountered in the config_deck. Upon finding a usable alternate the system will:

     1.1)  Unassign the current bootload console and change its state to INOP.

     1.2)  Assign the new bootload console and change its state to ON.

     1.3)  Notify the operator that automatic console recovery has occurred.

2.)  If no alternate consoles exist and there is an active Message Coordinator the system will mark the state of the bootload console as INOP and then proceed as follows:

     2.1)  For syserr traffic, send the Message Coordinator a wakeup in which the wakeup message contains the syserr sequence number of the syserr message. The Message Coordinator will extract the syserr message from the log and print it.

     2.2)  For normal traffic, send the Message Coordinator a wakeup in which the wakeup message contains the negated unique_id of the console message as stored in oc_data. The Message Coordinator will call ocdcm_$get_mc_output with this message UID to retrieve the contents of the message for printing.

     2.3)  Notify the operator that automatic console recovery has occurred.

3.)  If no alternate consoles exist and the Message Coordinator is not active then the current console's state will be changed to INOP and the following actions will be taken with respect to the action field of the config card:

     3.1)  If the action field is set to CRSH then the system will crash with the message:

                OCDCM_ (CONSOLE_RECOVERY): CONSOLE INOP.

           If the console is truly inoperative this message will not be seen but will appear in the flagbox and syserr log. At this time Multics may be restarted by making the bootload console operative and typing "go". This action will result in the bootload console being placed back into operation by the console recovery software.

     3.2)  If the action field is set to RUN then the fact that the console is inoperative will be logged in the syserr log and the system will continue running. Subsequent console traffic will also be sent to the syserr log. An attentive operator who notices a lack of console output or the inability to do console input should consult the site maintainer for additional instructions. This person may then take any action necessary to effect console recovery or orderly system shutdown.

THE BOS CONSOLE

It must be noted that BOS stores the location of the console that it was booted from. To change this information would require either patching the BOS partition in the appropriate location or changing BOS to look for the current console location whenever control is returned to it. Because of the pending release of bootload Multics it was decided that no effort should be spent modifying BOS for console recovery. In this respect BOS will always attempt to converse with what it thinks is the BOS console. If this console is inoperative the site will have no option but to replace it with a functioning console, (by cable swap), or to repair it before attempting to perform BOS commands.

EXPLICIT CONSOLE RECONFIGURATION

In addition to automatic console reconfiguration during error recovery, sites may also explicitly reconfigure any console defined in the config_deck. These services are provided via the set_system_console (ssc) command as described later in this MTB.

CONSOLE LINE LENGTH SUPPORT

Multics currently supports a variety of consoles, each with its own characteristics. In addition we must consider the possible characteristics of future consoles. The first step in supporting the characteristics of different consoles is to recognize that different consoles have different line lengths. The new console interface allows the site to specify the console line length in the config_deck. This line length is respected by the DIM support software.
It is hoped that this is the first enhancement in a series that will be designed to make full use of the capabilities of the available console devices.

CHRONOLOGICAL I/O WITH SYSERR PRIORITY

The current console support software design allows anomalous behavior. There is no guarantee that console messages are printed in the order that they were queued. Additionally, syserr traffic is not given any special consideration. As a result syserr traffic, critical to the overall well being of the system, may be delayed because of a backup of gratuitous console traffic. The new console support software guarantees that messages are printed in the proper order by time stamping each as it is queued. Syserr messages are not put into the same queue as normal traffic but have their own special entry which is always given priority for printing.

CONSOLE CONFIGURATION COMMANDS

Currently the only command that affects console operation is the unlock_oc command. This command purports to correct situations in which the console has become wedged by software anomalies but in effect does little to actually ensure restoration of the console to operative status. This command is obsoleted by the new software in favor of a more general purpose console configuration command. This is the set_system_console command and is described below:

------------------                              ------------------
set_system_console                              set_system_console
------------------                              ------------------

Name:  set_system_console, ssc

SYNTAX AS A COMMAND

   ssc {console_name} {-control_args}

FUNCTION

   controls the configuration of system consoles.

ARGUMENTS

console_name
   is the name of the console to be affected as described for that console in the config_deck. If not provided the bootload console is assumed.

CONTROL ARGUMENTS

-crash
   specifies that the system should crash in the event of console recovery failure.

-run
   specifies that the system should continue running in the event of console recovery failure.

-reset
   force resets the bootload console as well as the oc_data database. If the console specified by console_name is not the bootload console no action is taken.

-state STATE
   changes the operational state of the specified console to the newly specified state. The state may be given as one of the following:

   on
      changes the state of the specified console to on. This will effectively make the specified console the bootload console. If a bootload console is currently assigned it will be made an alternate console.

   alternate, alt
      changes the state of the specified console to alternate.

   off
      changes the state of the console to off. If this is the bootload console it will be unassigned.

NOTE

If the bootload console's state is set to off and no other console is assigned as the bootload console, Multics will send subsequent output to the Message Coordinator. If no Message Coordinator is available Multics will act with respect to the action field in the opc card for the bootload console.

Although it is possible to run the system from the Message Coordinator, it is not recommended that sites run without an active bootload console for extended periods of time. During the period that there is no bootload console sites will be restricted to running only those commands executable at a Message Coordinator. Failure of the FNP to which the Message Coordinator terminals are attached while running without a bootload console could produce severe problems.

FUTURE CONSOLE SUPPORT

Future plans for console support suggest the implementation of a scheme that will replace the aging Message Coordinator facility with Operator Stations. These stations will provide any level of operator functionality as defined at the site level.
For example, Site X has a separate I/O facility and Tape Library which are not physically located in the same general area as the mainframe. Under such a scheme the I/O area would have full control of the printer Daemons, their software and devices. The Tape Librarian would be restricted to the ability to deny tape requests based upon the availability of the tape. As always, the operator at the bootload console may consult a master log and field the requests directed to any of these remote stations. Remote stations would be simple terminals under control of processes other than the Initializer's.

It must be emphasized here that this is the type of facility that I envision, not a commitment to produce such a facility. Designs in this area will be covered in subsequent MTBs.

CONTENTS

Abstract
The Need for A New Interface
The New Design
   Interrupt Driven I/O
   Dynamic Error Recovery
   Config Card Format
   The BOS Console
   Explicit Console Reconfiguration
   Console Line Length Support
   Chronological I/O With Syserr Priority
Console Configuration Commands
   set_system_console
Future Console Support