1
2 09/21/87 hardcore
3 Known errors in the current release of hardcore.
4 # Associated TR's
5 Description
6
7 922 phx20933
8 Some hardcore modules need to know whether the disk is operative. This is
9 done by calling disk_control$test_disk or dctl$test_disk. The module
10 loops until the IO is complete. The problem comes when the hardware is
11 broken in such a way that the IO never completes, so that
12 pvte.testing is never reset. I guess that this is another place where
13 the disk dim should give up, because it knows that the IO did not
14 complete and that it is a test type IO. One of the problems with
15 making disk_control smarter is that more pages need to be wired in
16 ring 0.
17
18 921 phx20930
19 During a BCE restore, tape record sequence errors are occurring at the
20 end of the tape. Sometimes the sequence error shows an actual disk
21 record missing; other times only the tape record numbers appear to be
22 in error, with the disk record numbers still in sequence (no gap).
23
24 920 phx20152
25 vacate_pv is setting pvte.pc_vacating and pvte.vacating. The use of
26 pvte.vacating is to keep new segments from being created on this pv.
27 pc_vacating will inhibit any new pages being created on this pv. The
28 contract of vacate_pv is only to keep new segments from being created.
29 Therefore pvte.pc_vacating should not be set in vacate_pv.pl1.
30
31 919 phx20908
32 Another call in disk_queue to code in >udd>m>lib. The fix will be to
33 remove the -interpret support from the disk_queue command. This should
34 not present any great problems because it could not be working at sites
35 other than system M.
36
37 918 phx20868
38 The TR claims a 17th level can exist in the hierarchy and problems
39 exist if the pack is demounted when this segment is active?
40 deactivate_for_demount.pl1 line 261.
41
42 917 phx20922
43 disk_control will, on certain types of disk errors such as MPC data
44 alerts, continuously retry the failing IO. The main complaint is for
45 bootload_io type IO at BCE. This includes such things as the copy_disk,
46 save, and restore commands. The reason is that disk_control
47 determines that this is a "bad_path" status, so it deletes the
48 channel and tries another, until all channels save one have been
49 deleted. It then adds them all back and keeps doing this over
50 and over again.
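
As an illustration only, the endless delete-and-re-add cycle described above could be bounded by counting full passes over the channels. The sketch below is not disk_control's code; run_io, attempt and max_passes are hypothetical names introduced for the example.

```python
def run_io(channels, attempt, max_passes=3):
    """Try an I/O on each channel in turn, giving up after max_passes passes.

    `attempt(channel)` is a hypothetical callback that returns True on
    success and False on a "bad_path" style failure.
    """
    for _ in range(max_passes):
        for channel in channels:
            if attempt(channel):
                return channel      # the I/O completed on this channel
    return None                     # every channel failed on every pass
```

A caller that gets None back would report the error instead of retrying forever.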
51
52 897 phx13424 phx17773 phx17819
53 Problems with directory quota management/enforcement.
54
55 895
56 No automatic hierarchy salvage is occurring when "boot rpvs" or "boot
57 rlvs" is done.
58
59 894 phx20661
60 Linkage error at bce early loading firmware in mpcs.
61
62 891
63 delete_ calls hcs_$get_segment_ptr_path to determine if a segment is
64 known in the calling ring (it wants to call term_ only when known
65 segments are being deleted). The hcs_ gate target is
66 initiate_$get_segment_ptr_path, which currently calls
67 dc_find$obj_initiate to find the object's directory entry. This can
68 cause a superfluous GRANT audit message, since $get_segment_ptr_path
69 only returns a pointer to the segment if it is already known in any
70 ring to the process. And it can cause a superfluous DENY audit
71 message, since no operation is performed unless the segment is known.
72
73 The fix involves creating a new entrypoint, dc_find$obj_initiate_priv,
74 which bypasses access checks and auditing, and changing
75 initiate_$get_segment_ptr_path to call this new entrypoint.
76
77 The intent of the fix would be to never audit the operation of
78 hcs_$get_segment_ptr_path. This is true even if the caller asks about
79 a segment known only in a ring other than the caller's ring. Since the
80 original audit message included the ring brackets of the segment, it
81 documents the caller's access to the segment from all rings within
82 those ring brackets.
83
84 890 phx19527
85 ioa_$ioa_stream prints garbage or blows up when no control string is
86 given.
87
88 887 phx19986
89 The disk_control$test_drive entry does not wait for an interrupt for
90 its I/O, but polls the status word. For FIPS devices or those on a
91 DAU, this will not work since the status words are not valid until the
92 interrupt is sent.
93
94 885
95 The program install_ttt_ does no auditing.
96
97 884
98 The hcs_$truncate_file entry logs a DENIED message even though other
99 entries log GRANTED, as the reason the call fails (this operation is
100 not allowed for a directory) has nothing to do with access control.
101
102 882
103 It appears that hcs_$make_entry does not null its output argument when
104 it returns an error code, although the documentation states that it
105 does. Since it doesn't modify the output argument at all in this case,
106 this is not a security problem.
107
108 881
109 Several problems with hcs_$fs_move_file and hcs_$fs_move_seg.
110
111 They return an error code if the caller has rw access to both the
112 source and destination segments, but null access to the directory in
113 which they are contained. The audit messages show various GRANTED and
114 DENIED fs_obj_prop_read's. The reason is that the inner ring module
115 attempts to get the status on the destination to find out its current
116 length. Unfortunately it uses an entry in status_ which returns more
117 information which requires S on the parent.
118
119 Since the entries are considered obsolete, it's not worth fixing this
120 silly restriction.
121
122 Another, more serious problem with hcs_$fs_move_file is that if the
123 user does not have RW access to the destination, error_table_$no_move
124 is returned, but no DENIED is logged. It audits GRANTED read of fs_obj
125 prop, and GRANTED initiation of FS_obj. This was in a case where the
126 user's authorization was greater than the access class of the existing
127 destination segment, so the process had R effective access to the
128 segment and S effective access to the containing dir. This bug should
129 be fixed, but it requires a new entry into dc_find.
130
131 880
132 Many filesystem operations consist of a name lookup followed by an
133 access check. The way dc_find implements these, an operation which
134 requires more than S access to a directory can fail with
135 error_table_$namedup or error_table_$seg_not_found and generate no
136 audit message, even though the caller has insufficient access to
137 perform the operation. This occurs when the eventual failure of the
138 operation can be determined from the name lookup.
139
140 879
141 The hcs_$tty_get_name returns a channel name for a channel belonging to
142 a process other than the caller.
143
144 877
145 None of the entries in the dm_hcs_ gate do any auditing.
146
147 876
148 Several file system attribute setting operations generate audit
149 messages which say GRANTED even though the operation is later denied.
150 This happens when M access is required to the parent and the process
151 must be in the write bracket of the entry. Worse, no DENIED audit
152 message is ever generated. The entries in question are: set_$copysw
153 volume_dump_switches safety_sw_ptr safety_sw_path synchronized_sw
154 max_length_ptr max_length_path entry_bound_ptr entry_bound_path
155
156 With the fixing of the bug described in entry 23, the entries
157 set$damaged_sw_path damaged_sw_ptr dnzp_sw_path dnzp_ptr must be
158 added to the list.
159
160 875
161 Upgraded directories created under dir privilege are left in a
162 process's address space after dir privilege is turned off. The
163 suspected cause is that the pathname associative memory is not being
164 flushed when dir privilege is turned off.
165
166 This poses no security problems since only a person with privileges
167 could have gotten into this position.
168
169 874
170 log_read_$position_time will not find any messages later than the
171 latest message in the log at the time that the log was opened.
172 log_read_$position_sequence has the equivalent problem.
173
174 872
175 There is an ambiguity in the definition of "security auditing" that is
176 particularly apparent in the case of append. The ambiguity is this:
177 some system operations make both security-related and
178 non-security-related checks. Either check can fail. If the security
179 check passes, but the non-security check fails, it is unclear what the
180 "correct" security audit message is: Grant, or Deny?
181
182 The ideal implementation would probably be to indicate the exact
183 situation in the audit message: that access would have been granted,
184 but was not.
185
186 The current implementation of append and others is to audit the
187 access grant, but later abort the operation if the non-security check
188 fails. This is particularly confusing in the case where the requested
189 multi-class max authorization is above the process authorization or in
190 the case that the requested authorization is below the containing
191 directory access class. This is considered to be a non-security
192 related failure (no attempt was made to access information or destroy
193 it), but the error code, ai_restricted, appears security-related.
194 Nonetheless, the audit is a GRANT.
195
196 This behavior should be documented in MDD004 and in the MDD on
197 Ring 0 Auditing and Logging.
198
199 863 phx19695
200 If Data Management has not yet been used during a bootload, and a fault
201 while in ring-0 causes verify_lock to be invoked, a ring-0 loop will
202 result because verify_lock attempts to reference dm_journal_seg without
203 first checking the switch sst$dm_enabled to determine if data
204 management has been enabled.
205
206 861 phx19582
207 The entry dc_find$dir_move_quota performs a superfluous and incorrect
208 AIM check. It is superfluous because the KST access modes will ensure
209 there is no writedown path and it is incorrect because the call to
210 aim_check_ attempts to compare the access class in the directory header
211 with the access class in the entry for the directory -- both of these
212 should always be equal. The check may be safely removed.
213
214 858 phx19491
215 The alarm_clock_meters command is missing its addname, "acm". The
216 documentation claims the addname exists.
217
218 856 phx19472
219 ioi_page_table$ptx_to_ptp may return an invalid pointer if the supplied
220 ptx is invalid. The verify_ptx internal subroutine causes a non-local
221 return via procedure quit if the ptx is invalid; this will result in
222 a return to the caller of ioi_page_table$ptx_to_ptp with an invalid
223 return pointer.
224
225 852 phx19433
226 The check_vtoce dir salvager and the volume retriever can both produce
227 segments whose security out of service switch is set on. reset_soos,
228 however, refuses to work on non-directory segments.
229
230 851 phx19285
231 sys_trouble.alm lacks message documentation for "Fault while in masked
232 environment"
233
234 850 phx16984
235 Nothing in MDC will replace missing add-names in >lv. This can cause
236 various inconsistencies.
237
238 849 phx17979
239 Disk MPC's get confused when individual drives generate many, many,
240 errors, and begin to report errors for other drives. This is reported
241 here to cover the TR and to record it for future reference.
242
243 841 phx19270
244 Because page control will not decrement a quota through zero, this can
245 invalidate the assumptions made by fix_quota_used with respect to the
246 constancy of the quota error during operation.
247
248 839 phx19254
249 initiate_ does not distinguish calls from phcs_$initiate's gate target
250 ring0_init_$initiate from calls to hcs_$initiate. For the former,
251 attempts to initiate a directory should return error_table_$moderr if
252 user does not have proper ACL or AIM access to the directory. For the
253 latter, it should return the "traditional" error_table_$dirseg, since
254 directories can never be initiated via hcs_$initiate from an outer
255 ring.
256
257 Fixing this may require a change to dc_find$obj_initiate and
258 $obj_initiate_raw since these entrypoints currently map
259 error_table_$moderr into error_table_$dirseg. And the fix may require
260 separating the entrypoint in initiate_ used by ring 0 modules (e.g.
261 ring0_init_) from that used by hcs_$initiate.
262
263 836 phx19180
264 vtoc buffer allocation and usage can too easily crash the system from
265 lack of buffers. A more graceful way to warn about pending doom
266 appears in the TR, along with a suggestion for avoiding the problem at
267 ast flush time.
268
269 835 phx15923
270 hc_ipc$send_wakeup should protest if a non-null info pointer is
271 supplied for a fast channel.
272
273 833 phx19071
274 quota uses error_table_$invalid_qmax for any error. It should be more
275 informative.
276
277 832 phx19073
278 You can set maxe as high as max_maxe. Unfortunately, this is too high
279 (it does not count max stopped stack_0's) and therefore crashes the system
280 when the system runs out of stack_0's.
281
282 831 phx19074
283 The two calls to range in hc_tune for setting mine are out of order.
284 As such, attempts to set mine above maxe produce the wrong error
285 message.
286
287 829 phx18779
288 add_bit_offset_ and the corresponding addbitno pl1 builtin do not
289 properly handle negative bit offsets. Similarly, add_char_offset_ and
290 the corresponding addcharno pl1 builtin function do not properly
291 handle negative character offsets.
292
293 The failure lies in the abd and a9bd instructions, which assume that
294 only positive offsets will be used. These instructions assume that
295 negative offsets will be handled by negating the offset and using the
296 sbd or s9bd instruction to subtract from the bit or character
297 displacement. The proper solution is to detect negative offsets,
298 negate the offsets and use the sbd or s9bd instruction.
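
The proposed fix can be modelled outside the hardware. The sketch below is a Python illustration, not the PL/1 or ALM code: it treats an address as a (word, bit) pair on 36-bit words and routes negative offsets to a separate subtract path, in the spirit of using sbd instead of abd. The function names are hypothetical.

```python
BITS_PER_WORD = 36

def sub_bit_offset(word, bit, amount):
    """Subtract a non-negative bit count from a (word, bit) address."""
    words, bits = divmod(amount, BITS_PER_WORD)
    word -= words
    bit -= bits
    if bit < 0:                 # borrow one word
        bit += BITS_PER_WORD
        word -= 1
    return word, bit

def add_bit_offset(word, bit, offset):
    """Add a signed bit offset to a (word, bit) address."""
    if offset < 0:
        # Detect the negative case, negate, and subtract instead.
        return sub_bit_offset(word, bit, -offset)
    total = bit + offset
    return word + total // BITS_PER_WORD, total % BITS_PER_WORD
```

The character-offset case would follow the same pattern with the number of characters per word in place of BITS_PER_WORD.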
299
300 828 phx15340
301 terminate_proc should not truncate the ring 0 stack; it should leave it
302 around for analysis. terminate_proc needs clean up in general.
303
304 827 phx18873
305 Inner rings should not be allowed to set search rules or working dirs.
306
307 825 phx15219
308 Attempts to type start after a call to sub_err_ with the can't restart
309 option cause an illegal return.
310
311 822 phx18837
312 make_msf_ copies the IACL from a dir onto the components of an MSF it
313 creates. If the IACL does not give the specified user w access to
314 these components, then copy/move will fail to be able to copy/move the
315 MSF into the directory.
316
317 815 phx18756
318 Having any AIM privilege on makes RCP think that you are a system_high
319 process.
320
321 810 phx18607
322 If an SCU's size (as correctly described by its config card) is less
323 than the port switches on the CPUs (i.e. it is 3M whereas the CPU says
324 4M, as it must), running ISOLTS memory tests in this case can crash
325 the system with a store fault.
326
327 809 phx18517
328 The system has been known to crash in ioi_masked while processing a
329 channel time-out.
330
331 806 phx18566
332 Typos in fim, et al, misinterpret the hregs bits associated with parity
333 faults.
334
335 805 phx18565
336 The history registers for a parity fault that crashes the system do not
337 appear in the pds. See the TR for details.
338
339 798 phx18352
340 sct_manager_$get is supposed to return a null pointer for non-set sct
341 values. However, it checks for the sct entry being null after
342 converting the null value (a zero packed pointer) into an unpacked
343 pointer. This unpacked pointer is not all zero so sct_manager_'s zero
344 check fails. The fix is to check for zero before the pointer
345 assignment.
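
The ordering problem can be shown with a toy model. In the Python sketch below the representation of an unpacked pointer (a dict carrying a non-zero tag) is purely an assumption made for illustration, as are the function names; the point is only that the zero test must be applied to the packed value before conversion.

```python
def unpack(packed):
    """Toy conversion: the unpacked form always carries a non-zero tag."""
    return {"tag": 0o43, "value": packed}

def get_buggy(table, index):
    ptr = unpack(table[index])
    if ptr == 0:                # never true: the converted value is not zero
        return None
    return ptr

def get_fixed(table, index):
    packed = table[index]
    if packed == 0:             # test for "not set" before converting
        return None
    return unpack(packed)
```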
346
347 783 phx09958
348 The default potential attributes for a resource in the RTDT can be
349 mistreated when the RTDT is installed. The symptoms are that the
350 attributes are shifted in the attributes word, causing all attempts to
351 access the resource to fail.
352
353 775 phx17026
354 The limit and process_limit fields in the rtdt are ignored. Actually
355 only the values for the fields in the default_rtdt on the MST are
356 used.
357
358 765 phx18243
359 The ring zero derail fault mechanism needs improvement. In particular,
360 it should save as much information as other faults (fault_time
361 especially) so that azm displays this fault in proper order with the
362 others.
363
364 760 phx18185
365 Calling hcs_$grow_lot makes your lot of max size. Calling it again
366 causes a FPE even if you have more room left in the lot.
367
368 754 phx17875
369 It has been experienced, on single physical volume logical volumes,
370 that when the volume becomes full and a user encounters the logical
371 volume full error, deleting segments from the volume does not
372 seem to reset the logical volume full condition for some number of
373 minutes afterwards. This is not well understood.
374
375 751 phx17482
376 msf_manager_ does not understand multiclass msfs. For such an msf,
377 msf_manager_ will add new components at the aim level of the dir that
378 is the msf, rather than at the aim class of the components of the msf.
379
380 749 phx17981
381 ips signals are not correctly masked in mrd_util_. As a result, it is
382 possible to hit QUIT or have other conditions which can cause
383 operations to fail, killing off the daemon in question. A fix is
384 known.
385
386 744 phx17943 phx18054
387 status_ won't allow the allocated return structures to be in a
388 different segment than the segment supplied as the return area (that
389 is, it doesn't allocate into extensible areas).
390
391 742 phx17838
392 The volume salvager should report page and vtoce bit map
393 inconsistencies.
394
395 735 phx17815
396 set_mdir_quota correctly sets the quota in the vtoce, but incorrectly
397 sets the value in the aste, when inferior dirs have terminal quotas.
398
399 733 phx17690
400 If an error is indicated when an i/o completion of a volmap page is
401 posted, volmap_page does not strip the state away from the page number,
402 producing a bogus error message.
403
404 732 phx15640
405 Hardcore sets damage switches for directories and there is no way for
406 users to turn them off. The Salvager should be changed to salvage
407 directories that have the damage switch set and turn it off once
408 salvaging is complete.
409
410 731 phx17688
411 Hardcore should validate pds$stacks validation_level before using it.
412
413 730 phx17662
414 A second call to delmain to delete a frame previously deleted will
415 cause the calling process to hang on a bogus page wait event.
416
417 723 phx17551
418 More errors in hdx: not copying args, not terminating segments
419 correctly.
420
421 722 phx17553
422 More errors in mdx: not copying parameters, not terminating
423 disk_table_.
424
425 721 phx17615
426 init_disk_pack_ (actually calling countervalidate_label_) produces an
427 error message not documented within init_disk_pack_.
428
429 720 phx17614
430 init_disk_pack_ references an unreferenced variable when looking for
431 the undocumented copy option.
432
433 718 phx17552
434 mdc_status_ does not properly copy all of its args. For that matter,
435 it doesn't even compile.
436
437 717 phx17597
438 io_syserr_msg is declared to be three words long, but is overlaid with
439 a structure which is five words long.
440
441 712 phx17186
442 You will die if another process deletes your working dir.
443
444 711 phx16992
445 A page error uses mc.errcode to encode the relevant information.
446 Unfortunately, system_startup_ cannot decipher this and crashes the
447 system (which would probably have happened anyway).
448
449 708 phx17416
450 hcs_$status_mins does not work on the root.
451
452 707 phx17413
453 act_proc uses the wrong value when determining maximum possible access
454 class.
455
456 705 phx17394
457 A timeout from resetting a channel from a timeout will cause a fault
458 while in masked environment, crashing the system.
459
460 704 phx17374
461 hcs_$quota_read returns "Some directory in path..." instead of "Entry
462 not found" when the target does not exist but its parent does.
463
464 701 phx17302
465 hcs_$fs_get_brackets will not return the ring brackets of an inner ring
466 object.
467
468 700 phx17259
469 attach_lv references the non-existent error_table_$notacted.
470
471 699 phx17257
472 scavange_vol refers to the non-existent error_table_$no_arg.
473
474 698 phx17219
475 disk_rebuild examines too many bits in a vtoce file map when examining
476 it to see if it is free, when performing volmap compression. This
477 sometimes causes the compression to fail.
478
479 696 phx17141
480 The aste/vtoce.dtm fields are examined to set the dbm_map bits used by
481 the volume dumper when dumping objects. For directories, these fields
482 lead to an incorrect interpretation as to whether a directory has been
483 modified, leading to extraneous directory dumping.
484
485 694 phx17132
486 The volume retriever does not collect enough AIM related information.
487 To process a retrieval request, it needs to store, in ring 1, the user
488 auth, and max auth. Now it only stores the auth, which is
489 automatically stored by message_segment_.
490
491 The volume retriever needs its own gate to ring 1 which will store the
492 ring, auth, and max auth securely in the message.
493
494 693 phx17132
495 append$retv_append cannot possibly append a multi-class object, since
496 it only has two of the three quantities:
497
498 user auth
499 user max auth
500 desired object max acc
501
502 The structure passed to it needs to be changed.
503
504 692 phx17141
505 The volume dumper examines the wrong field when determining if it
506 should dump a directory, thus dumping unneeded directories.
507
508 691 phx16992
509 A page_fault_error occurring at the Initializer's ring-1 command level
510 causes a crash, but the attempt to produce the crash message itself
511 produces a crash because the ring-1 condition handler cannot interpret
512 the mc.errcode value.
513
514 690 phx15255
515 The SCU can return the same value for the clock twice. Some software
516 uniquification is needed.
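
One common way to uniquify clock readings in software is to remember the last value handed out and bump any reading that is not strictly greater. The Python sketch below illustrates that idea only; UniqueClock and read_clock are hypothetical names, and a real implementation would also need to make the update atomic across processors.

```python
class UniqueClock:
    """Wrap a clock source so that successive readings strictly increase."""

    def __init__(self, read_clock):
        self._read_clock = read_clock   # hypothetical raw clock reader
        self._last = None

    def next(self):
        value = self._read_clock()
        if self._last is not None and value <= self._last:
            value = self._last + 1      # same (or older) reading: bump it
        self._last = value
        return value
```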
517
518 689 phx14716
519 When the directory salvager determines that the sons LVID in a
520 directory header is different from the value in the branch for the
521 directory, it mindlessly copies the value from the branch into the
522 directory header. This has the effect that if the value is wrong in
523 the branch, it will be wrong everywhere afterwards.
524
525 At least, the salvager should check the value to see whether it's zero
526 and obviously invalid before propagating it.
527
528 This is a genuine problem, and not already on the hardcore error list.
529 The particular problem that provoked this report has been fixed
530 elsewhere, and is no longer relevant, but the general problem remains.
531
532 685 phx17055
533 Various modules, in particular sys_trouble, are missing some error
534 message documentation.
535
536 684 phx15585
537 A situation not understood exists in which the records used exceeds
538 the current length, preventing further access to the segment.
539
540 682 phx15752
541 core flushing for pleasure from the as should not flush pdir segs.
542 Also, the scheduling of the core flush is not at precise times.
543
544 681 phx15833
545 reclassify_seg should avoid the work if what it is reclassifying is
546 already at the level it needs to be.
547
548 679 phx15852
549 Both illegal_procedure.pl1 and the documentation suggest that illegal
550 op_code, illegal addr/modifier and other illegal procedure faults
551 should be audited. This third group, however, is not.
552
553 678 phx15172
554 syserr_real should check its error code parameter for non-zero-ness
555 when producing the message text.
556
557 676 phx14420
558 The ascii_to_ebcdic_ and ebcdic_to_ascii_ tables and routines should
559 handle the 256 character ebcdic set and map it onto some extended ascii
560 set.
561
562 664 phx17116
563 The vtoce_checksum implementation is hamstrung by two problems:
564
565 1) the "checksum_valid" flag is quite likely to be turned off by
566 damage, causing the checksum to be recalculated for invalid data.
567
568 2) part 3 has no checksum, and disk damage quite frequently fries it.
569
570 663 phx17010
571 Hot buffers can fill up vtoc_buffer_seg, crashing the system. The
572 retry fix for 662 reduces the problem, but not all of it, since an
573 authentically broken disk can fill up the segment.
574
575 657 phx17050
576 No gullibility checking, checksumming, or other protection against
577 damage exists for
578
579 Record 6 -- the vtoc map
580 Record 0 -- the label (except for "Multics Storage System Volume")
581
582 Damage to these areas can cause widespread disaster, due to confusion
583 as to the location of the paging region!
584
585 We need:
586
587 1) Sentinels on all records of the label
588 2) Checksums on all records of the label
589 3) One or more safe-store records that store only permanent
590 information, for recovery from damage to one of the records (like the
591 vtoc map) that contain both permanent and dynamic information.
592
593 656 phx17052
594 Detaching a device with I/O in progress can cause a fault while in
595 wired environment due to an uninitialized pointer in the reset_device
596 entry of the program ioi_masked.
597
598 654 phx16046
599 No re-verification of the label of an offline disk is made when it
600 comes back online. As a result, mistakes with patch plugs are
601 extremely dangerous. disk_control should not declare a disk back to
602 life unless the label checks out in some simple fashion.
603
604 652 phx16979
605 The ring 0 portion of the three-ring circus volume management is not
606 protected by a cleanup handler, and can leave pvtes in an inconsistent
607 state.
608
609 647 phx16929
610 See the TR for a complete exposition of this. When all 4K aste's have
611 a page in memory, get_aste behaves very badly (very slowly).
612
613 643 phx16592
614 master directory acs checking should use raw access. Otherwise, it is
615 impossible to get e access to work right in both ring 4 and ring 1.
616
617 642 phx16743
618 disk_pack.incl.pl1 has the wrong include file listed as the home of the
619 dumper bit map.
620
621 634 phx16905
622 boundsfault.pl1 does not recognize the case where the bound is less
623 than the msl but still within the page table size. This breaks setting
624 the max length within the page table but larger for active segments,
625 since the 10.2 performance optimization for set_max_length took out the
626 setfaults in this case.
627
628 628 phx14990
629 Volume backup to an IO disk does not work with the current
630 implementation of rdisk_ stream IO. The current version has no
631 buffering ability and no sense of logical End of Space (a la EOT on
632 tape) and physical End of Space on the pack, which is needed to allow
633 flushing of IO when this EOT is detected.
634
635 626 phx16692
636 append$retv_append has a bug wherein it misuses the "max_authorization"
637 field of the structure. It should just consider that the max to put in
638 the multi-class segment max.
639
640 There is a companion bug in the volume retriever that fills in the
641 structure wrong to begin with. The field has to be filled in with the
642 authorization out of the message segment for the retrieval request.
643
644 625 phx16548
645 When you try to terminate a segment with more than about 250 ref names,
646 the call aborts with the message "The RNT is in an inconsistent state."
647
648 623 phx02779
649 Because of a problem with accepting a zero buffer size, it has been
650 found that a returned hardware status that contains channel or central
651 fault status is being overlooked and assumed to be good.
652
653 614 phx16489
654 ring0_get_ misdeclares code parameters as fixed bin.
655
656 613 phx16351
657 set_bc should not let you set a negative bit count. set or change.
658
659 611 phx16506
660 append only checks mountedp when segments are appended, not dirs or
661 links. While this may be convenient, it is inconsistent. The marginal
662 utility of creating dirs and links on unmounted LV's is outweighed by:
663
664 1) the inconsistency: some operations work, some don't.
665 2) for private LV's: the desire to have NOTHING happen to the LV when
666 unmounted. Even if your access to attach a private logical volume
667 has been taken away, you can still append links and dirs.
668 3) If we ever move dirs onto the LV that they describe, this will
669 clearly have to have the restriction.
670 4) LV aim restrictions cannot be enforced if the LV is not mounted.
671
672 605 phx16501
673 check_mdcs does not salvage quota inconsistencies between master
674 directories and their registration in the mdcs. Only register_mdir
675 does this. This requires the administrator to run register_mdir over
676 each mdir on a logical volume to be sure that everything is consistent.
677
678 Also, check_mdcs does not validate that a master directory actually has
679 the correct sons logical volume.
680
681 604 phx16500
682 Master directory control allows up to fixed bin 35 worth of quota for
683 an entire logical volume, but many fields are only declared fixed bin.
684 This creates periodic disasters in the control segments.
685
686 603 phx16499
687 Master directory control was not updated when quota was increased to 18
688 bits. This can cause a wide variety of misbehaviors.
689
690 593 phx16015
691 The file system should log or meter invalid quota changes (attempts to
692 decrement used below 0).
693
694 592 phx16093
695 quota_received is not supported very nicely. The TR complains that it
696 is not reported by any existing hcs_ entry. There are other problems,
697 such as failure of salvagers to correct it, and lack of a way to forcibly set it.
698
699 587 phx15298 phx16005
700 peruse_crossref bugs: does not detect LV not mounted; does not
701 initialize brief_sw; does not print satisfactory message when module is
702 not referenced.
703
704 583 phx15258 phx15275
705 Invalid iacl terms cause append to fail. asd_ allows acl terms that
706 are invalid, like R..*, to be added to an initial acl. append fails
707 trying to copy then the assumption that the
708 entire RVL will be mounted, else you will be doing 1pack recovery a
709 risky assumption.
710
711 This is a limitation rather than a suggestion since we really ought to
712 have such a mechanism.
713
714 581 phx15044
715 fim should not save history registers that have just been freshly
716 cleared by fim_util.
717
718 572 phx14942
719 act_proc$create fails to return the empty APT entry in almost all error
720 cases.
721
722 569 phx14225
723 Incorrect warning message from scas_init.
724
725 568 phx14877
726 It is impossible to run hc_pf_meters without phcs_ access; metering
727 gate access should be sufficient.
728
729 566 phx14824
730 sweep_pv (segment_mover, actually) cannot move rpv-only segments. This
731 makes it difficult or impossible to compress the RPV VTOC.
732
733 565 phx14875
734 When the operator does an x deny using RCPRM at site the process
735 still thinks it has the drive.
736
737 561 phx14705
738 The accept_fs_disk check for partitions overlapping gets confused by
739 HIGH hardcore partitions.
740
741 557 phx14657
742 ebcdic_to_ascii_ and ascii_to_ebcdic_ should be in the same bound
743 segment, and not bound in with anything that uses them. This will
744 allow people to replace them when reading tapes with nonstandard or
745 non-Multics EBCDIC encoding.
746
747 529 phx10098
748 save_dir_info fails if any of the entries in the dir are connection
749 failures.
750
751 527 phx08068
752 Strange things are done with the IC for certain faults in the FIM.
753 Perhaps they should be improved. In particular, the IC reported in the
754 machine conditions for dfmp taking underflows is unexpected.
755
756 523 phx05319
757 ioa_ ^ and ^ execute at least once, instead of zero times, when fed
758 zero things to iterate over.
759
760 520 phx14440
761 page_error displays an erroneous disk address in the error message for
762 an I/O error on the volume map. The fix is to ANA -1,dl before saving
763 the Areg, which contains the disk address in the lower half.
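
The effect of the proposed mask can be sketched in Python, on the assumption that ANA -1,dl ANDs the A register with an operand whose upper 18 bits are zero and whose lower 18 bits are all ones. The function name is hypothetical.

```python
LOWER_HALF_MASK = 0o777777      # lower 18 bits of a 36-bit word

def lower_half(a_reg):
    """Keep only the lower 18 bits of a 36-bit register value."""
    return a_reg & LOWER_HALF_MASK
```

Anything left in the upper half of the register is discarded before the value is saved for the error message.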
764
765 518 phx14405
766 print_configuration_deck does not display negative numbers correctly.
767 It prints them as very large positive numbers. This is not currently a
768 problem, since the BOS command parser does not understand negative
769 numbers completely and marks them as octal in the config deck. It
770 will be a problem when BOS is fixed or superseded.
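
The symptom is the usual one of printing a two's complement word as if it were unsigned. The following sketch assumes 36-bit config deck fields; the helper name is hypothetical and it is not taken from print_configuration_deck.

```python
# Sketch only: shows why a negative 36-bit value prints as a huge positive one.
WORD_BITS = 36


def as_signed(word: int) -> int:
    """Interpret a 36-bit word as a two's complement signed integer."""
    if word & (1 << (WORD_BITS - 1)):   # sign bit set
        return word - (1 << WORD_BITS)
    return word


word = (-5) & ((1 << WORD_BITS) - 1)    # -5 as stored in a 36-bit word
print(word)             # the "very large positive number"
print(as_signed(word))  # the value the operator meant
```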
771
772 516 phx14381
773 copy_out will fail if requested to copy a segment whose length is
774 larger than 255K. In this case, it should attempt to set the max
775 length to 256K via phcs_ or hcs_$something when this operation
776 becomes non-privileged.
777
778 514 phx14387
779 rebuild_disk for the RPV may not copy the root directory correctly.
780 Specifically, modified pages in memory will not be copied; instead,
781 the earlier instances on disk will be copied. This may cause a
782 crash during the subsequent initialization until the root is salvaged
783 due to bad_dir_. The problem is that disk_rebuild (the ring-0 module
784 which does the rebuild) does not call pc$cleanup for entry-held
785 segments (indeed, it should not do so in general). The root directory
786 is entry-held, and so it goes.
787
788 513 phx14276
789 If a trouble fault occurs at a point where it is not caught by
790 fim_util$check_fault, the history registers from the trouble fault will
791 be over-written by those from the subsequent sys_trouble connect. This
792 destroys potentially useful diagnostic data.
793
794 501 phx14181
795 There is a window in ring-0 ITT message processing. If a fault occurs
796 in that window, ITT entries are lost for the bootload. Further, they
797 are lost in a way which disables the logic in pxss which prevents ITT
798 overflow. The likely result is a crash in pxss when the system runs
799 out of ITT entries.
800
801 498 phx05686
802 The time-record product maintained for a directory with a terminal
803 quota account is only an approximation to an ideal space-time integral
804 of disk usage. This approximation is reasonably accurate for accounts
805 which have stable usage, but it has several anomalies for more volatile
806 accounts. The problem is that the cumulative time-record product is
807 updated only when the directory VTOCE is updated; it is incremented by
808 the product of the instantaneous quota used and the delta-time since
809 the last update. If, for example, a large amount of space was used
810 and returned in the interval between updates, there is no accounting
811 for that space. A visible anomaly results from a further approximation
812 when get_quota is invoked. At this time, the time-record product is
813 reported as the value it would have if the VTOCE were being updated at
814 that time (although it is not). For the reasons cited, this can cause
815 the time-record product to decrease with time. The only reasonable
816 solution is to maintain time-record product continuously. This would
817 not be expensive computationally, but it would require significantly
818 more wired storage per active segment.
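
Both anomalies can be reproduced with a small model. This is a sketch under stated assumptions: usage is a step function given as (time, records used) pairs, the function names are hypothetical, and nothing here is taken from the actual quota code.

```python
# Sketch only: compares an exact space-time integral with the
# update-time approximation described in this entry.

def exact_trp(usage_changes, end_time):
    """Integrate a step function of (time, records_used) up to end_time."""
    total = 0
    following = usage_changes[1:] + [(end_time, 0)]
    for (t0, used), (t1, _) in zip(usage_changes, following):
        total += used * (t1 - t0)
    return total


def approximate_trp(usage_changes, update_times):
    """At each update, add (records used now) * (time since last update)."""
    total = 0
    last_update = usage_changes[0][0]
    for now in update_times:
        used_now = [u for t, u in usage_changes if t <= now][-1]
        total += used_now * (now - last_update)
        last_update = now
    return total


# 10 records, a burst to 500 records from t=20 to t=80, then back to 10.
usage = [(0, 10), (20, 500), (80, 10)]

print(exact_trp(usage, 100))           # what an ideal integral would charge
print(approximate_trp(usage, [100]))   # a single update at t=100 misses the burst
# get_quota-style projections with no real update in between:
print(approximate_trp(usage, [50]), approximate_trp(usage, [100]))
```

The last line shows the visible anomaly: the value projected at t=50 is larger than the value projected at t=100, so the reported product appears to decrease.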
819
820 497 phx14069
821 Most store faults should be recorded into the Syserr Log, as they are
822 usually indicative of faulty hardware (sic). hardware_fault should
823 filter out store faults in BAR mode, however, as they are caused by
824 program error.
825
826 490 phx13931
827 Values for select_switch parameters to hcs_$star_XXX entries in
828 star_structures.incl.pl1 are declared as fixed bin (2) (e.g.,
829 star_LINKS_ONLY). They should be fixed bin (3).
830
831 487 phx13896
832 It should be possible to change the size of the AST pools while the
833 system is running (well, it should be possible to increase them,
834 anyway). If the SST is expanded to multiple segments, this could be
835 done with moderately more work.
836
837 486 phx13897 phx14320
838 A volume which is inoperative cannot be demounted. There should be a
839 way to do this, such as abandoning everything associated with the
840 volume which is in memory (VTOC buffers, ASTEs, pages, etc.) and
841 marking it as demounted. Also, disk I/O error processing should be
842 smarter about detecting inoperative devices, particularly devices which
843 appear operative but cannot do I/O without errors.
844
845 Note that this is the one case where it is safe to abandon VTOCE
846 buffers, since nobody will do an await_vtoce afterwards (and lose) if
847 demounting does things in the proper order. If there are I/O errors
848 and the volume remains mounted, it is never safe to abandon VTOCE
849 buffers.
850
851 468 phx13716
852 The various tables used in disk volume management (ring-0, ring-1, and
853 ring-4) can become inconsistent. Several instances of this problem
854 have been corrected. One which has not shows itself after an "alv"
855 followed by an "av -all". The ring-4 copy of the disk table is not
856 updated after the second command, preventing pdir_volume_manager_ from
857 knowing that the logical volume is mounted and hence eligible for
858 pdirs.
859
860 460 phx13544
861 Master directory control can become confused if a master directory has a
862 subordinate directory with quota. A set_mdir_quota plus or minus X
863 will cause the page control quota of the master directory to be the same
864 as the master directory quota.
865
866 448 phx12864
867 KST overflow has strange effects, not readily traceable to this problem.
868 KST overflow should probably be signalled, rather than indicated by an
869 error code.
870
871 436 phx05497
872 When signaller.alm pushes a stack frame, it first extends the previous
873 frame by 48 words to allow for interrupted push operations. If a
874 non-local goto is used to transfer control back into that extended stack
875 frame, it never gets shrunk. Repeated occurrences of this will
876 eventually use up the stack.
877
878 The fix should be to change signaller.alm to put the new frame 48 words
879 up the stack without doing an extension of the existing frame. This
880 requires hand-coding the push, but that's not too hard. The alternative
881 is to try to use a cleanup handler to shrink it, which would be awfully
882 hard since the cleanup handler would be associated with the frame above,
883 which would still be on the stack. It's hard to shrink your caller's
884 stack frame.
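
The leak and the proposed fix can be modelled with a list of frame sizes. This is a toy sketch, not signaller.alm: the 48-word figure comes from this entry, while the frame sizes and function names are hypothetical.

```python
# Sketch only: a stack modelled as a list of frame sizes, in words.
EXTENSION = 48  # words reserved to allow for an interrupted push


def signal_then_goto_back(frames):
    """Behaviour described above: extend the previous frame, push a handler
    frame, then unwind back into the (still extended) previous frame."""
    frames[-1] += EXTENSION   # previous frame is extended in place
    frames.append(100)        # handler frame; the size is arbitrary here
    frames.pop()              # non-local goto discards the handler frame
    # nothing shrinks the extended frame again


def signal_then_goto_back_fixed(frames):
    """Proposed behaviour: put the new frame 48 words up the stack instead."""
    frames.append(EXTENSION + 100)  # gap and handler frame belong to the new frame
    frames.pop()                    # unwinding releases both together


leaky = [200]
fixed = [200]
for _ in range(1000):
    signal_then_goto_back(leaky)
    signal_then_goto_back_fixed(fixed)
print(sum(leaky), sum(fixed))  # stack words in use after 1000 repetitions
```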
885
886 429 phx12689
887 When cpt is invoked with the -lg control argument, it does not print
888 full pathnames in the summary report. It does, however, print full
889 pathnames in the detailed trace file if -trace is also specified.
890
891 410 phx12355
892 Attempted logins to ring-6 or ring-7 fail, since makestack requires
893 non-null effective access at the validation level of the initial ring
894 to signal_, unwinder_, operator_pointers_, and pl1_operators_. These
895 have ring brackets of 0,5,5. The general solution is not clear. Rings
896 6 and 7 are supposed to be available for totally encapsulated
897 subsystems, with only facilities provided explicitly by the subsystem
898 available. The difficulty is to balance this against the need to
899 provide a rudimentary environment to initialize the subsystem.
900
901 409 phx12251
902 A more compact method of logging I/O errors is needed. Currently, each
903 I/O error is logged into the syserr log. This can flood the log with
904 largely meaningless I/O error messages (for example, when reading a tape
905 of marginal quality). An approach is to write summary records
906 periodically (based on time or on error thresholds) and optionally
907 record detailed messages.
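
One possible shape for such summary logging is sketched below. It is an illustration only: the class, its parameters, and the message format are hypothetical, and a Python list stands in for the syserr log.

```python
# Sketch only: count errors per device and write one summary record
# when a count threshold or a time interval is reached.
from collections import Counter


class ErrorSummarizer:
    def __init__(self, threshold, interval):
        self.threshold = threshold   # flush when a device reaches this many errors
        self.interval = interval     # or when this many seconds have passed
        self.counts = Counter()
        self.last_flush = 0.0
        self.log = []                # stand-in for the syserr log

    def record(self, device, now):
        self.counts[device] += 1
        due = now - self.last_flush >= self.interval
        if self.counts[device] >= self.threshold or due:
            self.flush(now)

    def flush(self, now):
        for device, n in sorted(self.counts.items()):
            self.log.append(f"{now:.0f}: {device}: {n} I/O errors")
        self.counts.clear()
        self.last_flush = now


s = ErrorSummarizer(threshold=100, interval=60)
for i in range(250):                 # 250 errors from one marginal tape
    s.record("tape_01", now=i * 0.1)
print(len(s.log), "summary records instead of 250 messages")
```

Detailed per-error messages could still be kept behind an option; the sketch covers only the summary path.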
908
909 407 phx12250
910 Deletion of a segment with wired pages causes the segment not to be
911 deleted (it is left active, with PTWs for the wired pages having nulled
912 addresses and wired bits on). Under some circumstances this can cause a
913 system crash. This situation can be caused by a user wiring pages
914 through hphcs_. This can also happen if a process terminates with an
915 active ioi buffer.
916
917 399 phx12134
918 append$retv should validate the entry supplied more carefully. An
919 instance of the problem is that the cross-retrieval of an
920 object with multiple names will contain a non-null forward name thread
921 in the primary name field.
922
923 393 phx12070 phx10495
924 Segments should be created with access of r to *.SysDaemon rather than
925 rw.
926
927 383
928 There should be a system-maintained database which keeps track of recent
929 crash history and types of shutdowns. Possibly it could be as simple
930 as logging at bootload the time and type of the last shutdown. The
931 syserr log is probably robust enough and can easily be scanned to find
932 the information.
933
934 382 phx04847
935 fix_quota_used should also adjust TRP totals in accordance with the
936 adjustment being applied to quota used and the length of time since the
937 last ESD failure crash. This should be automatically driven from the
938 last crash info and be manually overridable if necessary.
939
940 378 phx12013
941 setfaults should have a recovery strategy for page_fault_errors on a
942 target dseg; probably it should kill the other process rather than
943 crashing the system with a crawlout with AST lock set.
944
945 376 phx12003
946 trace_mc should use a hardcore segment for the buffer to avoid problems
947 with recursive faults caused by flushing trailers or dseg ptw misses.
948
949 364 phx01612
950 The iocb structure in iocb.incl.pl1 contains an implicit word of padding
951 between iocb.name and iocb.actual_iocb_ptr which should be explicitly
952 declared as pad.
953
954 362 phx11904
955 verify_lock should check all ring-0 locks which could be held on
956 call-side. It should not allow a process to crawlout with any ring-0
957 lock held. For some locks detected by verify_lock the system should be
958 crashed immediately; for others (vtoc buffer lock) some recovery is
959 possible.
960
961 360 phx11870
962 On a multi-process salvage, one of the processes may take an unexpected
963 error (page_fault_error, for example). This will cause the process to
964 go to a new command level and wait for terminal input. Eventually all
965 other processes will hang blocked waiting for this process to respond
966 to the dispatch wakeup. The solution is probably for do_subtree to
967 establish an any_other handler and do something appropriate on
968 unexpected signals.
969
970 357 phx11839
971 The supervisor should take more pains to ensure that a setfaults
972 operation is performed on segments dynamically marked as damaged either
973 when the damage is detected or soon thereafter.
974
975 356 phx10004
976 The primitive for setting the damaged switch should perform a setfaults
977 operation since it operates in a better environment than page control
978 does when doing so and it is desirable to provide damage notification
979 as quickly as possible to other processes.
980
981 352 phx11831
982 If a directory hash table overflows while the directory is being rebuilt
983 by salv_dir_checker_, some names on the entry which caused the overflow
984 may not be hashed in correctly. This is because the special-case code
985 to keep hash from faulting on the partially rebuilt directory does not
986 ensure that all the names already processed are rehashed.
987
988 306 phx11600
989 The entry structure (dir_entry.incl.pl1) is misdeclared; the structure
990 takes only 37 words despite the comment claiming that it takes 38.
991 This seems to be benign but should be rectified.
992
993 305 phx11593
994 Although there are hcs_ entries to set it, the DNZP switch is not
995 reported by any status_ entrypoints.
996
997 303 phx11555 phx06112 phx04846
998 The quota salvager should correct inconsistencies in quota allocated and
999 quota received fields as well as quota used. There is presently no way
1000 to repair these fields other than BOS PATCH.
1001
1002 300 phx11553
1003 Damage to >lv and >disk_table_ should be detected and acted upon
1004 automatically at bootload rather than requiring use of BOOT NOLV and
1005 NODT.
1006
1007 272 phx11009
1008 traffic_control_queue should never be reporting a negative value for
1009 tssc. It does so because the snap of the APTEs consumes non-negligible
1010 time due to paging with no locks held. A fix is to read the current
1011 time immediately after copying out the APTEs.
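
The ordering problem can be shown with a simulated clock. This sketch assumes tssc is computed as the current time minus a per-APTE state-change time; the field and function names are hypothetical.

```python
# Sketch only: reading the clock before a slow snapshot can make
# (now - state_change_time) negative; reading it afterwards cannot.

class FakeClock:
    def __init__(self, start):
        self.t = start

    def now(self):
        return self.t

    def advance(self, dt):
        self.t += dt


def snapshot(clock, apte, copy_cost):
    """Copy the APTE; the process changes state while the copy is paging."""
    clock.advance(copy_cost)
    apte["state_change_time"] = clock.now()
    clock.advance(copy_cost)
    return dict(apte)


def tssc_clock_first(clock, apte, copy_cost):
    now = clock.now()                      # time read before the copy
    snap = snapshot(clock, apte, copy_cost)
    return now - snap["state_change_time"]


def tssc_clock_after(clock, apte, copy_cost):
    snap = snapshot(clock, apte, copy_cost)
    now = clock.now()                      # time read after the copy
    return now - snap["state_change_time"]


before = tssc_clock_first(FakeClock(100), {"state_change_time": 90}, 5)
after = tssc_clock_after(FakeClock(100), {"state_change_time": 90}, 5)
print(before, after)
```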
1012
1013 260 phx10996
1014 A volume administrator can adjust the quota on a master directory of
1015 which he is not the owner if he has sma access. This use charges the
1016 quota account of the Initializer, which is clearly bogus.
1017
1018 239 phx10114
1019 Although the salvager can set the security-out-of-service bit for
1020 segment branches as well as directories, the privileged gate entry to
1021 reset the switch works only on directories. It should work on segments
1022 as well.
1023
1024 229 phx09675
1025 There should be a mechanism for establishing hardcore crash handlers
1026 which would be executed by sys_trouble before crashing the system, so
1027 that, for instance, the IMPDIM could shut itself down by establishing a
1028 handler to send a going-down connect to the IMP.
1029
1030 223 phx09383
1031 Attempting to add a memory which is already online causes an OOB fault
1032 in reconfigure (line 193) because it fumbles one of the error codes.
1033
1034 222 phx09341
1035 The error message for incorrect access should be specific about the type
1036 of access which the process lacks: ACL, ring bracket, or AIM.
1037 Presently some primitives distinguish between ring bracket and ACL
1038 violations and others do not. AIM violations would have to be detected
1039 specially; there is no error code for this today. See also entries 78
1040 and 157.
1041
1042 219 phx09240 phx11009
1043 system_performance_graph cannot properly represent more than 100
1044 logged-in users. It should use a different scale or wrap around.
1045
1046 217 phx09162
1047 When walking the AST to demount a volume, demount_pv gives up upon
1048 encountering very minor anomalies, causing ESD to fail completely when
1049 it should have almost succeeded. It needs a better way of walking the
1050 AST to eliminate the "demount_pv: AST out of sync" message. The AST
1051 pools should be described by pointers and counts kept in the SST rather
1052 than just by count.
1053
1054 215 phx09082 phx12302
1055 Checking of CPUs which are being added should be both more complete and
1056 more flexible. Proper settings for both cache and associative memories
1057 should be checked. It should also be possible for a site to over-ride
1058 these checks by arguments to add_cpu.
1059
1060 214 phx09047
1061 There should be a DRL instruction at the beginning of page_fault so
1062 that history registers would be saved if a wild transfer occurred.
1063
1064 213 phx08965
1065 There should be more state recorded in the PVT when a volume cannot be
1066 accessed (such as the real fsdisk error code) rather than just
1067 pvte.device_inoperative. This lack causes add_vol to be unable to
1068 distinguish between "drive in protect" and "drive offline".
1069
1070 212 phx08963
1071 The check_trailers procedure can only be enabled by recompilation. It
1072 should be possible to simply patch something.
1073
1074 211 phx10123
1075 Messages from hardcore (disk_control, get_aste, hc_dmpr_primitives,
1076 etc.) should include the physical volume name where appropriate. This
1077 must be preceded by putting the name into ring zero (see entry 210).
1078
1079 210 phx11769 phx08952
1080 The ring one volume management tables should be direct copies of the
1081 ring zero PVT and LVT, which should be changed to include all the
1082 information (names and special flags) now only in the disk_table. This
1083 is the only real way to fix the problems due to inconsistencies between
1084 these databases.
1085
1086 203 phx11765
1087 hcs_$fs_get_mode always returns the 4 bit set in directory modes. It
1088 should leave this bit off, like hcs_$get_user_effmode.
1089
1090 199 phx11761
1091 The ioa controls ^e and ^f have difficulty formatting integers. For
1092 instance, ^.2f gives completely inappropriate results when given
1093 1234567, though it does fine with 1234567.12.
1094
1095 193 phx08451 phx11705
1096 There should be special entries to status_ for the primary name, the
1097 link path, and the list of names. The existing status_ interfaces are
1098 seriously defective here (see entry 192). See phx11705 for interface
1099 details.
1100
1101 189 phx08286
1102 There should be a way to turn on the audit flag in the branch. A
1103 primitive mechanism, but better than nothing. Now that the audit flag
1104 does nothing, this will become a limitation until a proper per-branch
1105 audit mechanism is created.
1106
1107 188 phx08284
1108 The privileged quota-setting primitives should log a message when used,
1109 to aid in keeping track of the operations.
1110
1111 187 phx08076
1112 When a process running ISOLTS is terminated abnormally, the CPU and
1113 memory it was using for the test are not released. This despite the
1114 code in deact_proc which appears to do just that.
1115
1116 186 phx08263 phx03859 phx06694
1117 There should be a way to interrupt the Initializer process "no matter
1118 what". Perhaps a tiny debugging environment entered on receipt of an
1119 execute fault.
1120
1121 184 phx10589
1122 The MPC error counters should be read out and stored in the syserr log
1123 when a pack is mounted or dismounted; this would make it much easier to
1124 keep track of per-drive error histories.
1125
1126 183 phx07983 phx11700
1127 The system should perform probabilistic verification of disk writes,
1128 checking some small fraction of them for success. The fraction would be
1129 increased if errors occurred, decreased as the drive was seen to
1130 operate, and be manually tunable as well.
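
A minimal sketch of such an adaptive sampling policy follows. The class name, the starting fraction, and the multipliers are all assumptions chosen for illustration; the entry itself does not specify them.

```python
# Sketch only: verify a random fraction of writes, raising the fraction
# after a failed check and lowering it slowly after successful ones.
import random


class WriteVerifier:
    def __init__(self, fraction=0.01, floor=0.001, ceiling=1.0, rng=None):
        self.fraction = fraction      # manually tunable starting point
        self.floor = floor
        self.ceiling = ceiling
        self.rng = rng or random.Random()

    def should_verify(self):
        """Decide whether this particular write gets read back and checked."""
        return self.rng.random() < self.fraction

    def report(self, ok):
        """Feed back the result of one verification."""
        if ok:
            self.fraction = max(self.floor, self.fraction * 0.99)
        else:
            self.fraction = min(self.ceiling, self.fraction * 4)


v = WriteVerifier(fraction=0.01, rng=random.Random(1))
v.report(ok=False)          # an error quadruples the sampling rate
print(v.fraction)
for _ in range(1000):       # a long run of good checks lets it decay again
    v.report(ok=True)
print(v.fraction)
```

The floor keeps at least some writes under observation even on a drive that has behaved well for a long time.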
1131
1132 181 phx08237
1133 There should be a way to change the time zone (CLOK card and sys_info
1134 correction constant) while the system is running.
1135
1136 179 phx07814
1137 verify_lock will recurse faulting if it tries to unlock a directory
1138 which is no longer accessible due to seg_fault_error or page_fault_error
1139 problems. It should have condition handlers for this.
1140
1141 176 phx07711
1142 The traffic_control_queue command should display the states of all the
1143 interesting APTE flags; pre_empt_pending in particular.
1144
1145 170 phx06979
1146 The system should further analyze the MOS EDAC error messages to the
1147 extent that it determines which pages in the SCU are affected by the
1148 error, so that the pages can be removed either manually or
1149 automatically. This will also save syserr log space.
1150
1151 167 phx06374
1152 When a hardware fault occurs as a result of an Illegal Action from an
1153 SCU, software should unlock the SCU history registers on that SCU to
1154 allow data from a fault which crashes the system later to be retained.
1155 Unfortunately, it is not possible for software to read these registers.
1156
1157 166 phx06326
1158 The hp_delete command tries to set some AIM flags in the directory it is
1159 trying to delete. This will not work if the directory is
1160 connection-failed. Since initiate was changed to activate directories
1161 immediately, this problem is masked, but hp_delete shouldn't do this
1162 anyway.
1163
1164 164 phx04854 phx05954
1165 The UID generator and pxss should check the difference between the last
1166 clock reading and the current one periodically and crash the system if
1167 it is too large. This situation arises when a clock makes a sudden
1168 jump, and could otherwise seriously damage the file system.
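
The check itself is simple; a sketch is given below. The names and the limit are hypothetical, an exception stands in for crashing the system, and treating a backward step as a jump is an added assumption that the entry does not state.

```python
# Sketch only: compare successive clock readings and refuse to continue
# if the difference is implausibly large.

class ClockJumpError(Exception):
    """Stands in for crashing the system."""


def check_clock(last_reading, current_reading, max_delta):
    """Return current_reading if it is a plausible successor of last_reading."""
    delta = current_reading - last_reading
    # Assumption: a clock that runs backwards is treated like a forward jump.
    if delta < 0 or delta > max_delta:
        raise ClockJumpError(f"clock moved by {delta}, limit is {max_delta}")
    return current_reading


last = check_clock(1_000, 1_500, max_delta=10_000)   # normal progress
try:
    check_clock(last, 10**9, max_delta=10_000)       # sudden jump
except ClockJumpError as err:
    print("would crash:", err)
```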
1169
1170 163 phx04854 phx05954
1171 Dates in VTOCEs and directories should be corrected by the volume and
1172 hierarchy salvagers. Dates in the future should be set to the current
1173 time and dates from before NSS should be set to some early date. This
1174 situation can arise either from damage or because the clock was
1175 incorrectly set. UIDs should also be checked for validity and reset to
1176 new UIDs from getuid if they fall outside the range of acceptable
1177 times.
1178
1179 161 phx07238
1180 The system should make some attempt to determine whether all the
1181 configured IOMs can access a memory module being added. This is
1182 probably difficult to do, since it would have to be done by experiment,
1183 which might prove disastrous if the IOM configuration panel were not
1184 set properly.
1185
1186 157 phx06101
1187 When attempting to append an entry, if the append cannot be performed
1188 because of containing directory ring brackets, the error message should
1189 be "Validation level not in ring bracket" rather than "Incorrect access to
1190 directory containing entry".
1191
1192 155 phx06075
1193 When a name on a branch is changed, it should be changed in place so it
1194 remains in the same place in the list of names rather than behave as if
1195 it had been deleted and added back.
1196
1197 145 phx03708
1198 The attach_lv command should accept -a as well as -all.
1199
1200 142 phx03109
1201 The FIM should distinguish via different error codes for termination
1202 between an out-of-bounds on the ring zero stack and one on an outer ring
1203 stack to aid in identifying situations which cause this particular ring
1204 zero error condition.
1205
1206 139 phx07240
1207 When there is bad parity in memory, the resulting error messages are
1208 very verbose. Especially at ESD time, they should simply be flushed.
1209 This requires more specific info about the messages in question to solve
1210 well.
1211
1212 137 phx08082
1213 The reclassify_sys_seg primitive doesn't work when system_high equals
1214 system_low because it requires that the segment end up with an access
1215 class greater than that of the containing directory. This is a
1216 limitation derived from the implementation of multi-class segments
1217 which are required by various modules of directory control to really be
1218 multi-class.
1219
1220 135 phx07543
1221 When a directory is deleted from another process, strange things happen
1222 when it is referenced. Most often, lock takes a fault trying to look at
1223 the UID. Perhaps it should have a handler for that condition.
1224
1225 130 phx05245
1226 It is possible for a user's virtual CPU time to become very inaccurate as
1227 the result of a large number of faults because of the adjustment which
1228 must be applied to compensate for fault processing time. There is no
1229 real way to fix this.
1230
1231 121
1232 A crawlout may leave a directory initiated which really should be
1233 terminated, cluttering the KST.
1234
1235 119
1236 Reference names for inner ring segments can be made available to outer
1237 ring programs; a violation of security. Not well understood.
1238
1239 118
1240 copy_on_write makes the copy unencachable until the next setfaults
1241 restores access. Not well understood.
1242
1243 114
1244 The messages in the syserr log describing page control errors are
1245 truncated when printed. This appears to be a problem in the printing
1246 routines rather than in page_error or the log itself.
1247
1248 108 phx04071 phx04955
1249 The cleanup handler in an absentee job is never executed if the absentee
1250 terminates by a call to cu_$cl. This mechanism should be considerably
1251 more robust.
1252
1253 102 phx03345 phx09268
1254 The fim does not properly handle EIS decimal overflows and underflows
1255 in that it does not respect the values to be reset and also does not
1256 reset the IC properly.
1257
1258 95 phx03943
1259 The machine conditions resulting from inability to add a processor
1260 should be saved somewhere for later analysis. Presently they are just
1261 discarded by init_processor.
1262
1263 80 phx03232
1264 The write_limit is reset at each memory reconfiguration, resulting in
1265 the PARM WLIM value apparently being ignored if reconfigurations occur.
1266 Should fix it by having reconfiguration not reset it.
1267
1268 77 phx11596
1269 The error code from hcs_$fs_move_xxx is not specific enough, partly due
1270 to the lack of a corresponding source/target switch.
1271
1272 73
1273 Pathnames can be much longer than 168 characters (max is 16*32+1 = 513).
1274 This causes problems for all the interfaces which use the standard char
1275 (168) declarations. Fortunately, find_ can handle it, but many user
1276 ring programs behave inconsistently. The solution is not easy.
1277
1278 69 phx03152
1279 The initializer can "find" directories by its linker search rules due
1280 to the special-casing in access_mode$effective. This leads to
1281 surprising (though harmless) behaviour.
1282
1283 68 phx11588
1284 The structure for hcs_$create_branch_ has not kept up to date with file
1285 system changes and no longer contains all the values which might want
1286 to be set when a branch is created. It should be upgraded whenever the
1287 file system is changed.
1288
1289 65
1290 The SST, limited in size to but one segment, cannot be made large
1291 enough to optimally support the largest configurations available today,
1292 and this situation can only get worse. The fix is to split it up into
1293 several tables, possibly using more than one segment for the AST
1294 itself. This is very hard. 83-01-18: well, try this. Get a pointer
1295 register back by changing all references to sst|foo to sst$foo (use
1296 pr4, that is). Now make a wired table of packed pointers to astes.
1297 Interpret the aste threads as indexes into this table. This costs only
1298 1 word per aste, as opposed to changing all 6 threads to packed
1299 pointers (3 words). It should just be grunt work to implement.
1300
1301 60
1302 There is no general mechanism for determining how many pages should be
1303 wired by pmut$wire_and_mask, since error cases (calls to syserr, mainly)
1304 may use up a large amount of stack space not normally required. This
1305 has been partially fixed by changing syserr to run on the PRDS when
1306 called masked.
1307
1308 53 phx01533 phx01978
1309 ESD will fail if an MPC is broken. Multics should be more robust about
1310 dealing with bad hardware and delete the devices more rapidly.
1311
1312 32
1313 Many system meters overflow when the system stays up for a long time.
1314 This causes faults in the idle process and in various places in ring
1315 zero. This is a catch-all error list entry to be reserved for the
1316 general solution if we ever invent one. Other specific entries address
1317 specific instances of the problem.
1318
1319 22 phx02203
1320 The quota moving primitives sometimes fail to adjust things properly
1321 when working on active directories. More details are not known at this
1322 time.
1323
1324 19
1325 If the HC partition on the RPV is not large enough, it may not be
1326 possible to boot with a partial RLV.
1327
1328 11
1329 A bad error message is provided if process initialization fails; for
1330 instance, if the user has incorrect access to the process overseer.
1331 This is possibly an answering service problem, actually.
1332
1333 10
1334 The linker and the fim look at instructions in the object segment
1335 itself rather than in the SCU data. This is just one more reason why
1336 execute-only code does not work.
1337
1338 9
1339 The system loops or otherwise misbehaves when the permanent syserr log
1340 (>sc1>perm_syserr_log) is damaged. This is partly a vfile_ problem in
1341 dealing with trashed keyed vfiles. Should fix syserr_log_man_ to be
1342 better about dealing with problems in >sc1>perm_syserr_log. If it has
1343 difficulty, it should rename the old one and create a new one rather
1344 than simply giving up and not copying the partition.