1 " ***********************************************************
2 " * *
3 " * Copyright, (C) Honeywell Bull Inc., 1987 *
4 " * *
5 " * Copyright, (C) Honeywell Information Systems Inc., 1982 *
6 " * *
7 " * Copyright (c) 1972 by Massachusetts Institute of *
8 " * Technology and Honeywell Information Systems, Inc. *
9 " * *
10 " ***********************************************************
11
12 " HISTORY COMMENTS:
13 " 1) change(87-01-19,Fawcett), approve(87-01-19,MCR7531),
14 " audit(87-01-19,Martinson), install(87-01-20,MR12.0-1288):
15 " Change to set the bb pointer before it is used. Also add segdef sys_trouble.
16 " END HISTORY COMMENTS
17
18
19 inhibit on <+><+><+><+><+><+><+><+><+><+><+><+>
20
21 use main
22
23
24 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
25 "
26 " Before spinning waiting for console I/O to die, and before going to
27 " bce, the fault vector for lockup is patched to do an SCU/RCU
28 " in absolute mode. This must be in absolute mode, in case the lockup
29 " happens in early bce. The target of the SCU/RCU is in this program
30 " rather than prds$ignore_data, since the latter may not be in the
31 " low-order 256K.
32 "
33 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
34 eight
35 ignore_data:
36 bss ,8 " SCU data for SCU/RCU on lockup fault
37
38 ignore_scu_rcu:
39 scu 0 " Put into fault vector for lockup fault
40 rcu 0 " after absolute address inserted into Y-field
41
42 " ^L
43 segdef sys_trouble
44 sys_trouble:
45 lda prds$processor_pattern get bit pattern for this CPU
46 cana scs$bos_restart_flags are we restarting this processor?
47 tnz restart if so, get it running again
48
49
50 " If this is the first processor to enter this code,
51 " a system trouble connect must be sent to all other
52 " processors to stop them too.
53
54 lda scs$processor get flags for all running CPU's
55 stac scs$trouble_flags are we the first processor?
56 tnz *+2 if not, skip broadcast
57 tsx0 broadcast broadcast system trouble connects to others
58
59 tsx0 fim_util$set_mask save mask and mask down
60
61 " ^L
62
63 " Copy the machine conditions into prds$sys_trouble_data.
64 " This prevents overwriting the data when another
65 " system trouble interrupt is used to restart CPU's.
66
67 lda bp|mc.scu+scu.fi_num_word get fault code
68 ana scu.fi_num_mask,dl mask fault code
69 arl scu.fi_num_shift right-justify
70 cmpa FAULT_NO_CON,dl connect fault?
71 tnz no_copy if not, conditions already in trouble_data
72
73 eppap prds$sys_trouble_data ap -> cache for machine conditions
74 tsx0 fim_util$copy_mc copy the machine conditions
75 no_copy:
76
77 eppbb pds$history_reg_data bb -> place to store history regs
78 tsx0 fim_util$check_mct go copy cpu type into machine conditions
79 tsx0 fim_util$force_hist_regs save the history registers in pds
80
81 lda prds$processor_tag CPU tag in A
82 als 1 multiply by 2
83 sdbr scs$trouble_dbrs,al save DBR for debugging
84
85
86 " If this is the bootload CPU, enter bce.
87 " Otherwise, die gracefully.
88
89 lca 1,dl all one's in A
90 era prds$processor_pattern CPU pattern mask in A
91 ansa scs$processor indicate that this CPU is stopped
92
93 lda prds$processor_tag processor tag in A
94 cmpa scs$bos_processor_tag is this the bootload CPU?
95 tze enter_bce if so, go to bce
96
97 die:
98 dis -1,du stop
99 tra *-1 I said stop!
100
101 " ^L
102
103 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
104 "
105 " The second trouble connect for restarting processors
106 " causes control to be transferred here.
107 "
108 "
109 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
110
111 restart:
112 lda prds$processor_pattern get bit for this processor
113 orsa scs$processor indicate CPU is running again
114 era =-1 complement to make a mask
115 ansa scs$bos_restart_flags indicate processor has been restarted
116 ansa scs$sys_trouble_pending turn off trouble flag for this processor
117
118 eppbp wired_fim$trouble_prs,* bp -> system trouble m.c. area
119 tsx0 fim_util$restore_mask restore original controller mask
120
121 szn scs$faults_initialized see if system ready for cache
122 tze trouble_exit transfer if not
123 tsx0 fim_util$reset_mode_reg restore mode and cache mode regs
124
125 odd
126 trouble_exit:
127 tsx0 fim_util$v_time_calc start virtual time meters again
128
129 lpl bp|mc.eis_info restore ptrs and lgths
130 lreg bp|mc.regs_word and regs
131 lpri bp|mc.prs and prs
132 rcu wired_fim$trouble_scuinfo,* get running again
133
134 " ^L
135
136 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
137 "
138 " The following code copies an error message into the bce
139 " flagbox message buffer.
140 "
141 "
142 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
143
144 enter_bce:
145 lda scs$sys_trouble_pending get flags
146 als 18 extract low-order
147 ars 18 could be negative number
148 neg 0 or zero
149 tze rtb_no_message if zero, no message
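"
" A worked example of the als/ars/neg sequence above. The values are
" illustrative only: this assumes the pending code sits in the low-order
" 18 bits as a negative number, as the comments suggest. The real codes
" are defined in sys_trouble_codes, which is not reproduced here.
"
"	low-order half of scs$sys_trouble_pending = 777771 (octal)
"	als 18		A = 777771000000
"	ars 18		A = 777777777771	(-7, sign extended)
"	neg 0		A = 000000000007
"
" A then serves as a 1-origin index through trouble_messages-1,al, so a
" value of 7 would pick the seventh descriptor in the table below,
" "Execute fault by operator."
"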
150 eppbb flagbox$ bb -> bce flagbox
151 cmpa trbl_exec_flt,dl execute fault?
152 tnz not_manual_crash if not, don't flag a manual crash
153 ldq fgbx.manual_crash,du
154 orsq bb|fgbx.rtb
155 not_manual_crash:
156 mlr (),(pr) copy program ID
157 desc9a sys_trouble_name,13
158 desc9a bb|fgbx.message,13
159
160 mlr (id),(pr),fill(040) copy error message
161 arg trouble_messages-1,al
162 desc9a bb|fgbx.message+3(1),64-13
163
164 cmpa trbl_r0_drl_flt,dl is it a ring-0 derail?
165 tnz non_drl nope, that's all
166 szn scs$drl_message_pointer augment the message
167 tze non_drl nothing to say
168 lprplb scs$drl_message_pointer
169 lda lb|0 acc length in upper 9
170 arl 27 lower 9, now
171 mlr (pr,rl),(pr),fill(040) Your life story in 32 characters.
172 desc9a lb|0(1),al
173 desc9a bb|fgbx.message+8,64-32
174
175 non_drl: ldq fgbx.mess+fgbx.alert,du set flags for message printing
176 orsq bb|fgbx.rtb ..
177
178 tra rtb_no_message now go back to bce
179
180
181 sys_trouble_name:
182 aci "sys_trouble: "
183
184 " ^L
185
186 macro message
187 desc9a &U,&l1
188 maclist off,save
189 use message
190 maclist restore
191 &U:
192 aci "&1"
193 maclist off,save
194 use main
195 maclist restore
196 &end
197
198 trouble_messages:
199 message (Page fault while on prds.)
200
201 message (Fault/interrupt while on prds.)
202
203 message (Fault in idle process.)
204
205 message (Fault/interrupt with PTL set.)
206
207 message (Unrecognized fault.)
208
209 message (Unexpected fault.)
210
211 message (Execute fault by operator.)
212
213 message (Out-of-Segment-Bounds on prds.)
214
215 message (Fault while in masked environment.)
216
217 message (Fault while in bound_interceptors.)
218
219 message (Ring 0 derail.)
220
221 " ^L
222
223 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
224 "
225 " The following code enters bce by placing the two
226 " absolute mode instructions needed to enter bce
227 " into the fault vector slot for the derail fault.
228 " NOTE: bp must be preserved across call to bce since
229 " we use it to restore pointer registers upon return.
230 "
231 "
232 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
233
234 rtb_no_message:
235 eppbb fault_vector$+0 bb -> fault vector segment
236 ldaq bb|2*FAULT_NO_LUF+fv.fpair
237 staq lp|save_lockup_fault save SCU/TRA
238 absa ignore_data abs addr in 0-23
239 als 6 abs addr in 0-17 of Areg
240 eaq 0,au abs addr in 0-17 of Qreg
241 oraq ignore_scu_rcu replace lockup fault vector
242 staq bb|2*FAULT_NO_LUF+fv.fpair
243
244 szn scs$processor all CPU's stopped?
245 tnz *-1 if not, wait here
246
247 lda 4,du wait for operator's console output to finish
248 odd
249 sba 1,dl to allow I/O to drain off
250 tnz *-1 ..
251
252 " ^L
253
254 " Here is the channel masking code.
255
256 epp0 iom_data$
257 eax1 pr0|iom_data.per_device
258 lxl0 pr0|iom_data.n_devices
259 mask.next_device:
260 lda pr0|per_device.flags,x1
261 cana per_device.in_use,du
262 tze mask.skip_device " not in use now.
263
264 ldq pr0|per_device.iom,x1 " which iom?
265 mpy per_iom_size,dl
266 eax2 -per_iom_size+iom_data.per_iom,ql " pr0|per_iom.XXX,x2
267 ldq pr0|per_device.iom,x1
268 mpy iom_mailbox_size,dl " address iom mbx
269 epp1 iom_mailbox$+iom_mailbox_seg.iom_mailbox-iom_mailbox_size,ql
270 " pr1|mailbox
271
272 ldq pr0|per_device.channel,x1
273 cmpq =o10,dl " it has to be bigger than this
274 tmi mask.skip_device " overhead
275
276 qls 27 " channel position
277 oraq MASK_PCW
278 staq pr1|connect.pcw
279 ldq pr0|per_iom.connect_lpw,x2 " take template LPW
280 stq pr1|connect.lpw " set up for real
281 lda 50,dl " retry count for the wait loop
282 cioc pr0|per_iom.cow,x2 " BANG
283 odd
284 mask.connect_loop:
285 cmpq pr1|connect.lpw " connect taken yet?
286 tnz mask.skip_device " yes (LPW changed), on to next device
287 sba 1,dl
288 tnz mask.connect_loop " keep waiting
289
290 mask.skip_device:
291 eax1 per_device_size,x1 " next device
292 sbx0 1,du " how many done?
293 tnz mask.next_device " not all
294
295 " ^L
296
297
298
299
300 ldaq bb|2*FAULT_NO_DRL+fv.fpair grab SCU-TRA pair from fault vector
301 staq lp|save_derail_fault
302 ldaq toehold$+2*TOE_HOLD_MULTICS_ENTRY pick up code to enter bce
303 staq bb|2*FAULT_NO_DRL+fv.fpair set it in fault vector
304
305 drl: drl 0 ****** bce is entered here ******
306
307 szn scs$connect_lock did we enter through pmut call?
308 tze drl if not, cannot restart
309
310 ldac scs$trouble_flags get and clear trouble flags
311 sta scs$bos_restart_flags set for restarting CPU's
312
313 ldaq lp|save_derail_fault
314 staq bb|2*FAULT_NO_DRL+fv.fpair ..
315 ldaq lp|save_lockup_fault restore lockup faults
316 staq bb|2*FAULT_NO_LUF+fv.fpair ..
317
318 tsx0 broadcast send trouble connects to start CPU's
319
320 tra restart restart the bootload CPU
321
322 " ^L
323
324 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
325 "
326 " BROADCAST - Send system trouble connects to all other
327 " processors.
328 "
329 "
330 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
331
332 broadcast:
333 ldq hbound_processor_data,dl initialize the Q with maximum configurable CPUs
334 broadcast_loop:
335 cmpq prds$processor_tag test for ourselves
336 tze broadcast_next don't hit ourselves
337 lda scs$processor_data,ql get processor data for this CPU
338 cana processor_data.online,du is it configured?
339 tze broadcast_next if not, don't hit it
340
341 cioc scs$cow_ptrs,ql* Zap
342
343 broadcast_next:
344 sbq 1,dl step to next CPU
345 tpl broadcast_loop if more, get the others
346 tra 0,0 return to caller
347
348 "^L
349
350
351 maclist off
352 include make_data_macros
353 include iom_word_macros
354 maclist on
355
356 make_pcw MASK_PCW,0,0,0,record,terminate,0,0,mask
357
358 " ^L
359
360 use internal_static
361 join /link/internal_static
362
363 even
364 save_lockup_fault:
365 bss ,2 place to save lockup fault SCU and TRA
366
367 save_derail_fault:
368 bss ,2 place to save derail fault SCU and TRA
369
370
371
372 " ^L
373
374 include scs
375
376 include flagbox
377
378 include sys_trouble_codes
379
380 include fault_vector
381
382 include toe_hold
383
384 include iom_data
385
386 equ connect.pcw,connect_channel*channel_mailbox_size+channel_mailbox.scw
387 equ connect.lpw,connect_channel*channel_mailbox_size+channel_mailbox.lpw
388
389
390 " ^L
391
392 " BEGIN MESSAGE DOCUMENTATION
393 "
394 " Message:
395 " sys_trouble: Page fault while on prds.
396 "
397 " S: $crash
398 "
399 " T: $run
400 "
401 " M: $err
402 "
403 " A: $recov
404 "
405 "
406 " Message:
407 " sys_trouble: Fault/interrupt while on prds.
408 "
409 " S: $crash
410 "
411 " T: $run
412 "
413 " M: $err
414 "
415 " A: $recov
416 "
417 "
418 " Message:
419 " sys_trouble: Fault in idle process.
420 "
421 " S: $crash
422 "
423 " T: $run
424 "
425 " M: $err
426 "
427 " A: $recov
428 "
429 "
430 " Message:
431 " sys_trouble: Fault/interrupt with PTL set.
432 "
433 " S: $crash
434 "
435 " T: $run
436 "
437 " M: $err
438 "
439 " A: $recov
440 "
441 "
442 " Message:
443 " sys_trouble: Unrecognized fault.
444 "
445 " S: $crash
446 "
447 " T: $run
448 "
449 " M: Unexpected or unrecognized fault subcondition.
450 " Probable hardware malfunction.
451 "
452 " A: $contact
453 "
454 "
455 " Message:
456 " sys_trouble: Unexpected fault.
457 "
458 " S: $crash
459 "
460 " T: $init
461 "
462 " M: $err
463 "
464 " A: $recov
465 "
466 "
467 " Message:
468 " sys_trouble: Execute fault by operator.
469 "
470 " S: $crash
471 "
472 " T: $run
473 "
474 " M: Operator depressed execute pushbutton on processor.
475 "
476 " A: $recov
477 "
478 "
479 " Message:
480 " sys_trouble: Out-of-Segment-Bounds on prds.
481 "
482 " S: $crash
483 "
484 " T: $run
485 "
486 " M: While running with the prds as a stack, an attempt was
487 " made to reference beyond the end of the prds. The likely
488 " cause was stack overflow, due either to a recursive loop
489 " in the procedures running on the prds or insufficient
490 " space allocated for the prds. If the latter, the size of
491 " the prds should be increased by means of the TBLS Configuration
492 " Card.
493 "
494 " A: $recov
495 "
496 "
497 " Message:
498 " sys_trouble: Fault while in masked environment.
499 "
500 " S: $crash
501 "
502 " T: $run
503 "
504 " M: During processing of a fault, it was noticed that interrupts
505 " were masked in user-ring, an invalid condition. This is a
506 " debug trap crash, enabled by the hidden tuning parameter
507 " trap_invalid_masked.
508 "
509 " A: Contact the Multics System Development staff.
510 "
511 "
512 " Message:
513 " sys_trouble: Fault while in bound_interceptors.
514 "
515 " S: $crash
516 "
517 " T: $run
518 "
519 " M: A fault occurred while handling another fault.
520 "
521 " A: $recov
522 "
523 "
524 " Message:
525 " sys_trouble: Ring 0 derail. MESSAGE
526 "
527 " S: $crash
528 "
529 " T: $run
530 "
531 " M: A supervisor software module discovered an untenable situation, and
532 " crashed the system by executing a derail (DRL) instruction.
533 " If MESSAGE is also present, it will be of the form:
534 " "module: explanation", and further details can be found in
535 " this documentation in the description of "module".
536 "
537 " A: $recov
538 "
539 " END MESSAGE DOCUMENTATION
540
541
542
543 end