Multics Technical Bulletin MTB-634
DM: Shutdown
To: Distribution
From: Lee A. Newcomb
Date: 10/11/83
Subject: Data Management: System Shutdown
1 ABSTRACT
A data management system should be shut down when the Multics
system it is running on is shut down. This allows data management
to be made available more quickly to users in the next Multics
bootload by avoiding crash recovery. It also gives some extra
insurance that users' protected files are consistent. The reader
should note that some hardcore changes are required to support
new inter-process signals (IPS) and the required static handlers.
Comments should be sent to the author:
via Multics forum:
>udd>Multics>Spratt>meetings>DMS_Development
via Multics Mail:
Newcomb.Multics on System M or
LNewcomb.Multics on MIT Multics.
via US Mail:
Lee A. Newcomb
Honeywell Information Systems, Inc.
4 Cambridge Center
Cambridge, Massachusetts 02142
via telephone:
(HVN) 261-9332, or (617) 492-9332
_________________________________________________________________
Multics project internal working documentation. Not to be
reproduced or distributed outside the Multics project without the
consent of the author or the author's management.
CONTENTS
1 Abstract
2 Introduction
3 Basic Shutdown Steps
4 Start shutdown and warn users
   4.1 Mark DMS State
   4.2 Stop New Transactions
   4.3 Warn Users
   4.4 Set Daemon Timer
5 User Shutdown
   5.1 Mark DMS State
   5.2 Signal User Processes
   5.3 Set Daemon Logout Timer
6 Final Shutdown
7 Problems
   7.1 Invalidating User DMS References
   7.2 IPS' (inter-process signals) Ignored
   7.3 Hardcore Changes to support new IPS'
2 INTRODUCTION
The major objective of Data Management is to keep users'
protected files consistent. To this end, there are several
termination mechanisms employed to recover from user process
deadlocks, a Multics system crash (with or without ESD), etc.
The shutdown of a running Data Management System (DMS) is an
extra guarantee that protected files are correct. Just as
important, users will see a DMS available more quickly after a
Multics bootload since crash recovery, which can be very time
consuming, is avoided.
DMS shutdown will normally occur just prior to Multics
system shutdown. There are also occasions when a system
administrator may wish to shut down a DMS without taking down
Multics. For example, there may be some priority jobs that must
run and do not use any protected files. It is possible to
shut down the DMS until these jobs are finished and then bring it
back up. These occasions are expected to be very rare; however,
the capability is easy to add to Data Management and also
facilitates development testing.
It must be remembered that DMS shutdown is not an absolute
necessity; the DMS crash recovery mechanism will put protected
files back in order. If DMS shutdown does complete, recovery
at the next DMS bootload will have nothing to do. Because
recovery runs in either case, shutdown does not have to complete:
recovery will still roll back any transactions left over. Any
shutdown work that is completed means less crash recovery time
and faster availability on the next bootload.
The reader is assumed to be familiar with the MTBs covering
the initialization and recovery of a DMS. These are MTBs 508,
592, and 603.
3 BASIC SHUTDOWN STEPS
Following are the basic steps in the shutdown of a Data
Management System. These will be discussed in detail later. The
objective is to have no transactions in progress when the
caretaker Daemon of the DMS logs out, implying all protected
files and before journals are closed (and therefore consistent).
o The DMS state is set to "shutdown warning"; no new
transactions are allowed to begin. The
"dm_shutdown_warning_" inter-process signal (IPS) is sent
to the current users of the DMS to warn them that the DMS
is shutting down and that there is a finite amount of time
to finish their work.
o When the time limit for users to finish is reached, the
DMS state is set to "user shutdown". The
"dm_user_shutdown_" IPS is sent to any remaining users of
the DMS. The default action for user processes in this
case will be to call transaction_manager_$user_shutdown to
finish any active transactions, close all protected files
and journals, and invalidate their per-process DM data.
o When all transactions have been finished, the DMS state is
set to "normal shutdown" and the Daemon logs out. At this
time, all protected files in this system are consistent
and crash recovery will do nothing on the next bootload of
DMS.
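The three steps above can be summarized as a small table of DMS states and the signal sent on entering each. This is an illustrative sketch in Python, not code that runs on Multics; the state and signal names come from the text, while SHUTDOWN_SEQUENCE and signal_for are names invented for the illustration.

```python
# (DMS state, IPS sent to users on entering that state), in shutdown order.
# State and signal names are from the text; the table itself is hypothetical.
SHUTDOWN_SEQUENCE = [
    ("shutdown warning", "dm_shutdown_warning_"),
    ("user shutdown", "dm_user_shutdown_"),
    ("normal shutdown", None),  # no signal: the Daemon simply logs out
]


def signal_for(state):
    """Return the IPS name sent when the DMS enters `state`, or None."""
    for name, ips in SHUTDOWN_SEQUENCE:
        if name == state:
            return ips
    raise ValueError(f"not a shutdown state: {state!r}")
```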
4 START SHUTDOWN AND WARN USERS
At some time, someone or something decides a running DMS
should be shut down and informs the DMS' caretaker Daemon. It is
anticipated this will normally occur when it is decided to
shut down the Multics system running the DMS. There will also be
an administrative interface to allow a privileged user to start
DMS shutdown.
4.1 Mark DMS State
The current state of the DMS (in dm_system_data_) must be
set to "shutdown warning". This is in case the caretaker Daemon
dies before completing the shutdown tasks. A new Daemon will
note this state and pick up the shutdown work instead of trying
to continue normal DMS operation.
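How a replacement Daemon might decide what work remains can be sketched as follows. This is a Python illustration under stated assumptions: the function name is hypothetical, any state other than the three shutdown states is assumed to mean normal operation, and the recorded step is assumed to be repeated because the previous Daemon may have died part way through it.

```python
# Shutdown states in the order the Daemon passes through them (from the text).
SHUTDOWN_STATES = ["shutdown warning", "user shutdown", "normal shutdown"]


def steps_to_resume(recorded_state):
    """Return the shutdown steps a replacement Daemon still has to perform.

    The recorded step itself is included (assumption: it may be incomplete).
    Any other state is assumed to mean normal operation: nothing to resume.
    """
    if recorded_state not in SHUTDOWN_STATES:
        return []
    return SHUTDOWN_STATES[SHUTDOWN_STATES.index(recorded_state):]
```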
4.2 Stop New Transactions
No new transactions will be started once shutdown has
started. This is enforced by calling
transaction_manager_$begins_off to set a global flag in the
current DM system. This does not prohibit currently active
transactions from continuing.
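The effect of the global flag can be modeled in a few lines. This Python sketch is only a toy model of the behavior described: the class, its attribute names, and the commit method are assumptions, and only the begins_off name is taken from the text.

```python
class TransactionManager:
    """Toy model: one system-wide flag gates new transactions only."""

    def __init__(self):
        self.begins_allowed = True  # hypothetical name for the global flag
        self.active = set()         # ids of transactions in progress

    def begins_off(self):
        # Models transaction_manager_$begins_off: refuse new begins from now on.
        self.begins_allowed = False

    def begin(self, txn_id):
        if not self.begins_allowed:
            raise RuntimeError("DMS is shutting down; no new transactions")
        self.active.add(txn_id)

    def commit(self, txn_id):
        # Deliberately does not test the flag: active work may still finish.
        self.active.remove(txn_id)
```

Checking the flag only in begin is what lets transactions that are already active run to completion during the warning period.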
4.3 Warn Users
The Daemon sends the "dm_shutdown_warning_" inter-process signal
(IPS) to all users of the current DMS. The default static handler in
the user ring for this IPS reports to the user the amount of time
remaining to finish a transaction before DMS user shutdown
actually occurs. If the process does not have an active
transaction, the static handler will act as if the user's grace
time has expired. See the "USER SHUTDOWN" section below.
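The decision made by the default handler can be sketched as below. This is a Python illustration, not the handler itself: the function name and the report and user_shutdown callbacks are hypothetical stand-ins for whatever the user ring handler would actually call.

```python
def on_shutdown_warning(has_active_txn, seconds_left, report, user_shutdown):
    """Model of the default handling of the shutdown warning signal.

    `report` and `user_shutdown` are hypothetical callbacks supplied by
    the caller; `seconds_left` is the remaining user grace time.
    """
    if has_active_txn:
        # The user still has work in progress: say how long is left.
        report(f"DMS shutting down: {seconds_left} seconds left to finish")
    else:
        # Nothing to finish: behave as if the grace time had already expired.
        user_shutdown()
```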
4.4 Set Daemon Timer
The DMS caretaker Daemon then sets a timer to wake itself up
when the user grace time is over to force users out of the DMS.
This will be the DMS shutdown time for users, not the final
shutdown time.
5 USER SHUTDOWN
When the Daemon's timer for user shutdown goes off, all
active transactions must be aborted or abandoned (this allows us
to use the normal rollback procedures for shutdown instead of
writing new ones). In addition, all users of the DMS must
invalidate their references to DMS per-system and per-process
data. This is mainly to avoid segment faults in the DM ring
(which is generally lower than a user's login ring) if the DMS
bootload directory is deleted (expected to be the most common
case).
5.1 Mark DMS State
The current state of the DMS (in dm_system_data_) is set to
"user shutdown". Again, this is in case the current caretaker
Daemon dies and a new one must pick up the shutdown work.
5.2 Signal User Processes
The Daemon sends the "dm_user_shutdown_" IPS to all users of
the DMS. The default static handler in the user ring for this
signal will call the new program
transaction_manager_$user_shutdown. This will call
transaction_manager_$abandon_txn if the user has an active
transaction so the Daemon may roll back the active transaction
using the currently existing code for this function. In
addition, the user_shutdown entry will invalidate the user's DMS
per-process data (e.g. dm_data_, lm_data_) and references to
per-system tables, and terminate the Data Management ring
transfer vectors. This termination allows the user to use a new
DMS if one is booted again in this Multics bootload; otherwise
the user must new_proc. Shutting down the DMS while Multics stays
up is expected to be rare and may only be used in development and
testing.
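The work attributed to the user_shutdown entry can be sketched in order. This Python model is an assumption-laden illustration: the process dictionary and its keys are invented here, and the real entry operates on per-process segments rather than on a dictionary.

```python
def user_shutdown(process):
    """Model of the steps attributed to transaction_manager_$user_shutdown.

    `process` is a hypothetical dict standing in for one user process.
    Returns the id of the transaction left for the Daemon, or None.
    """
    abandoned = process["active_txn"]
    if abandoned is not None:
        # Models transaction_manager_$abandon_txn: the Daemon rolls it back.
        process["active_txn"] = None
    process["per_process_data"].clear()        # e.g. dm_data_, lm_data_
    process["system_table_refs"].clear()       # references to per-system tables
    process["transfer_vectors_bound"] = False  # DM ring transfer vectors
    return abandoned
```

Calling the function a second time on the same process finds nothing left to do, which matches the intent that a process with no DMS references is unaffected.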
5.3 Set Daemon Logout Timer
The Daemon now sets a timer for when it is to log out. This
is much like the user warning timer: it defines the finite
amount of time the Daemon has to clean up transactions abandoned
by users. This may be unnecessary if shutdown is occurring as
part of Multics shutdown.
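The relationship between the two Daemon timers can be shown with a short calculation. This Python sketch assumes the logout timer is set at the moment the user shutdown timer goes off; the function and parameter names are hypothetical, and no actual time limits are given in this bulletin.

```python
from datetime import datetime, timedelta


def shutdown_schedule(start, user_grace, daemon_cleanup):
    """Return (user shutdown time, Daemon logout time).

    `start` is when the shutdown warning is sent, `user_grace` is the time
    users get to finish, and `daemon_cleanup` is the time the Daemon gets
    to clean up abandoned transactions. All three are illustrative inputs.
    """
    user_shutdown_at = start + user_grace
    daemon_logout_at = user_shutdown_at + daemon_cleanup
    return user_shutdown_at, daemon_logout_at
```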
6 FINAL SHUTDOWN
When there are no more users of the DMS bootload, the
caretaker Daemon will mark the DMS state as "normal shutdown".
It will then call the new procedure
dm_dir_$old_bootload_dir_disposition to either rename or delete
the current bootload directory just as the DMS crash recovery
mechanism would when it is finished.
When the above two steps are finished, the Daemon will
log out. It may log out without doing any of the above if forced
by the Multics operator.
7 PROBLEMS
7.1 Invalidating User DMS References
Users who do not have active transactions must also be
notified that the DMS is shutting down so they may invalidate
their per-process data and references to per-system tables, and
terminate references to the Data Management ring transfer
vectors. This is only a concern when Multics is not going down,
just the DMS, and it is expected that the DMS will be re-booted
within this Multics bootload. If a user has references to a
previous DMS (now inactive), the user's process will take segment
faults in the Data Management ring if attempts are made to use
the shut down DMS.
There are several options in this case. One method is to
follow the scheme presented in the main description of shutdown
above. This is more work for the Daemon and requires more coding
effort, but is easier for users and for booting multiple DMS's
within the same Multics invocation for development testing.
The most convenient solution for development is to only be
concerned with users having active transactions. Since DMS
shutdown will usually coincide with Multics shutdown, the DMS
will usually not be re-booted, so a user without an active
transaction at DMS shutdown time will not take segment faults in
the inner ring.
If the DMS is re-booted, a warning could be sent to all users to
new_proc or call the user_shutdown entry in transaction_manager_.
It would also be possible to handle segment faults in the
inner ring code, but the faults would require considerable
analysis. There are better ways to use our time.
7.2 IPS' (inter-process signals) Ignored
It is possible for the user to mask the
"dm_shutdown_warning_" and "dm_user_shutdown_" IPS'. In this
case, the user process may never receive the shutdown warning or
call the user_shutdown entry. This is an unavoidable problem
with the way things are done. The Daemon certainly cannot wait
forever for the user process to respond. The simplest solution is
to ignore the fact that an active transaction still exists when
the time comes for the Daemon to log out; recovery at the next
DMS bootload will roll it back.
An alternative solution is for the caretaker Daemon to
forcibly take over any transaction not abandoned by a user
process which ignores the "dm_user_shutdown_" IPS. The Daemon
would process those transactions given up voluntarily first, and
then take over any left over. In addition, users without
transactions, but with DMS per-system bootload tables initiated
must be "kicked out". This all amounts to running
transaction_manager_$user_shutdown for a user (I wonder if we can
charge extra for this?). This will require some modifications to
transaction_manager_ to be able to take over transactions; and
will probably also increase the Daemon's working set to keep
pointers to all users' per-process data in the DM ring. There is
a minor advantage to this: force takeover could allow for future
handling of transaction timeouts by the Daemon on an active DMS
to ease holding of before journals, deadlocks, etc.
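The order in which the Daemon would process transactions under this alternative can be sketched as follows. This is a Python illustration only: the function name and the input mapping are hypothetical, and the takeover itself (the modification to transaction_manager_) is not modeled.

```python
def rollback_order(transactions):
    """Order the Daemon's work: abandoned transactions first, then takeovers.

    `transactions` maps a transaction id to True if its owner gave it up
    voluntarily and False if the Daemon must take it over by force.
    """
    abandoned = [t for t, given_up in transactions.items() if given_up]
    taken_over = [t for t, given_up in transactions.items() if not given_up]
    return abandoned + taken_over
```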
Another solution is to force the user to logout, which will
cause the Daemon to be notified that the TDT entry for the
process needs cleaning out. This is a rather drastic measure
that only matters if the DMS is being shut down but Multics will
stay up and the DMS will be re-booted later in the same Multics
invocation. If the forced takeover of user transactions and
forced invalidation of user DMS data is used, this method is
unnecessary. If required, a temporary interface with the
Initializer to destroy the process could be created.
7.3 Hardcore Changes to support new IPS'
This is not strictly a problem, just a subtask of the
strategy presented above. Two new IPS' need to be created and
static handlers written to handle them. In addition,
some programs that deal with the character representation of the
IPS names must be modified (e.g. sys_info, create_ips_mask_),
and the default static handlers must be set up in all stacks
greater than the DM ring (except when the DM ring is the user's
login ring); this requires modifications to make_stack_.
It is also being proposed that the four character limitation
on the name of an IPS be expanded to 32 characters. This gives
us the ability to name the IPS' according to function more
clearly and make them self-documenting. Instead of the "dmw_"
and "dms_" signals with the four character restriction, we get
"dm_shutdown_warning_" and "dm_user_shutdown_".
The system programs dealing with IPS are limited enough that
the above changes should not be hard to do.