Multics Technical Bulletin MTB-688
Multics C Impl. Spec.
To: Distribution
From: Douglas Howe
Date: 26 April 1985
Subject: Multics C Implementation Specification
1. Abstract
This document contains the specifications required to bring up a
System V Release 2.0 compatible C on Multics. The C compiler on
Multics will be as Multics compatible as possible without
becoming incompatible with System V Release 2.0 C.(1)
Changes will be marked with change bars. |
Comments should be sent to the authors:
via Multics mail to:
DGHowe.Multics
via posted mail to:
Douglas G. Howe
Advanced Computing Technology Centre
Foothills Professional Building
1620 29th St., N.W.
Calgary Alberta Canada T2N-4L7
via telephone to:
(403)-270-5400
(403)-270-5437 (Howe)
via forum on System-M to:
>udd>m>DGHowe>mtgs_dir>c>c_imp (c)
_________________________________________________________________
Multics project internal documentation; not to be reproduced or
distributed outside the Multics project.
(1) Unix and System V Release 2.0 are registered trademarks of
AT&T
TABLE OF CONTENTS
Section   Subject
=======   =======
1         Abstract
2         Preface
3         Introduction
3.1       . . Goal
3.2       . . References For This Document
4         Execution Environment
4.1       . . Stack Disciplines
4.2       . . Argument List Creation
5         Object Segment Format
5.1       . . Symbol Section
5.2       . . Statement Map
6         Entrypoints
6.1       . . Main Entrypoints
7         Calling Conventions
7.1       . . Calling C to C
7.2       . . Calling a Main Program
7.3       . . Calls from C to Non-C Procedures
7.4       . . Calls from Non-C Procedures to C Functions
8         Storage Allocation
9         Code Conventions
9.1       . . Forbidden Instructions
9.2       . . Use of Pointer Registers
9.3       . . Identifiers
10        General Information
10.1      . . Data Type Sizes
10.1.1    . . . . Conversion of Data Types
2. Preface
This document defines the format of a C object segment on Multics
and describes how C programs should use pl1_operators_. This
MTB, MTB-647 and the other related MTBs are intended to supply
most of the documentation needed to implement the C compiler for
Multics.
We wish to thank those people who have made this possible either
by creating tools for analysis or through input of subject
matter. These people are Ron Barstad, Greg Baryza, Rick Gray,
Steve Herbst, Dave Mason, Audrey Neal, Tom Oke, Doug Robinson,
Melanie Weaver and Brian Westcott.
3. Introduction
3.1. Goal
The goal of this project is to create a native C compiler for
Multics. This compiler will allow the porting of existing
software and the use of basic Multics tools. The compiler is to
be compatible with System V Release 2.0 while giving up as little
Multics functionality as possible. It will be accompanied by the
C runtime library, with some routines redesigned to understand
the Multics environment. To accomplish this goal the compiler
will be developed in two versions, defined as follows:
I. Demo Compiler
This version of the compiler will be used to bring up C.
This will be done using an alm(1) intermediate; C programs
will be translated to alm source code, and then compiled on
Multics. In the initial transfer of the compiler a Unix
system will be used to generate the alm source. It is
intended that this version will be usable in some form to
allow third party software to be brought to Multics.
II. Production Compiler
This will be the first general release of the compiler and
will be an extension of the demo compiler. It is not
decided at this point if the second version will still
generate alm source or if it will do object generation
directly. The second version should include some
improvements in efficiency and will be able to use probe to
some extent. The full definition of this version will be
given at a later date.
_________________________________________________________________
(1) Assembler Language Multics
3.2. References For This Document
1) MTB-647 created by Greg Baryza.
2) The C Programming Language
Kernighan, Brian W. & Ritchie, Dennis M.
Prentice-Hall (1978)
Englewood Cliffs, New Jersey
4) Multics Programmers Reference Manual (10.2 AG91-03A)
(hereafter referred to as MPRM)
5) MTB 689 titled The C Runtime System on Multics by Doug Howe.
6) MTB 691 titled The C External Execution Environment by Doug
Howe.
8) MTB entitled the Multics Link Editor by Dean Elhard and Doug
Howe.
9) MTB 707 entitled C Required Changes To ALM Specification.
4. Execution Environment
The execution environment to be used by the production compiler
on Multics will allow the use of Multics user tools such as
probe, trace and profile. It will be compatible with the current
PL1 environment.
The Multics standard execution environment is documented in the
MPRM Appendix H.
4.1. Stack Disciplines
Like most other languages, C will use the same stack as the
Multics command environment for its local storage. All
activities that affect the size of the stack, such as pushing,
popping and extending stack frames, will be done via
`pl1_operators_'.
4.2. Argument List Creation
In Multics, all calls that pass arguments should create a
structure defining where the arguments can be found and where a
set of descriptors defining their data type can be located. A
complete description can be found in the MPRM H-20.
C has no runtime requirement for argument descriptors. Therefore
argument descriptors will not be included in the demo version of
the compiler. These argument descriptors will be added to the
production version of the compiler as required for the support of
various Multics tools.
If required a new descriptor structure and a new method for the |
calling sequence will be designed. |
5. Object Segment Format
The object segments generated by the C compiler will be in
Multics standard format by default due to the use of alm as an
intermediate language. This format is defined in the MPRM
Appendix G. Declarations for all structured items are included.
There are two exceptions to the format: the Symbol Section and
the Statement Map.
5.1. Symbol Section
Due to the use of alm as the intermediate language of the
compiler, C will be lacking complete Symbol Section information
in the demo version. Complete Symbol Section information will be
added as a function of the C compiler or as a series of pseudo |
ops added to ALM (see MTB 707). |
5.2. Statement Map
A statement map will be produced in the production version of the
compiler, either through pseudo ops in alm or through direct
object creation. The Statement Map will refer to the original
source segment. Macros will be seen in their non-expanded form.
6. Entrypoints
All entrypoints (except for static entrypoints and main_; see
below) will be defined as external entrypoints that refer to the
`pl1_operators_' entry ext_entry to perform the stack setup.
All entries of functions that push their own stack frames must be
preceded by the structured information described for the entry
sequence on page G-3 of the MPRM. This will be generated by an |
ALM pseudo op as defined in MTB 707. |
6.1. Main Entrypoints
Because of the Multics standard entry procedure, the C `main'
program would not be found by the standard search method. For
this reason, and to provide a place for initial setup, C programs
containing a `main' program will have an added entrypoint called
`main_', as is currently done with Fortran. The entrypoint
`main' will be defined as an external entrypoint.
The entrypoint main_ will have to perform a series of precise
functions. These functions will be fully defined in another MTB
entitled The C External Execution Environment (MTB 691).
Initially `main_' will be a separate program generated and link
edited with the main program.
7. Calling Conventions
Copying of arguments to be passed by value will be done by the
caller. As usual in Multics, if the name of the routine to be
called contains a "$" it will be assumed to be of the form
segment_name$entry_name.
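The "$" rule above can be illustrated with a short C sketch. The
helper name split_external_name and its buffer-based interface
are hypothetical and are not part of this specification; the
sketch only shows how a name might be divided at the "$".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration only): splits a
 * routine name of the form segment_name$entry_name at the first '$'.
 * Returns  1 when a '$' is present and both parts were copied,
 *          0 when the name contains no '$',
 *         -1 when a caller-supplied buffer is too small.
 */
int split_external_name(const char *name,
                        char *segment, size_t segment_size,
                        char *entry, size_t entry_size)
{
    const char *dollar = strchr(name, '$');
    size_t segment_length;

    if (dollar == NULL)
        return 0;                       /* plain name, no segment part */

    segment_length = (size_t)(dollar - name);
    if (segment_length + 1 > segment_size ||
        strlen(dollar + 1) + 1 > entry_size)
        return -1;                      /* would overflow a buffer */

    memcpy(segment, name, segment_length);
    segment[segment_length] = '\0';
    strcpy(entry, dollar + 1);          /* size was checked above */
    return 1;
}
```

For example, the name "hcs_$initiate" would be divided into the
segment name "hcs_" and the entry name "initiate", while a name
such as "printf" would be left unqualified.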
There are four different situations that involve calls. These
are: calls from one C function to another, calls to main
programs, calls from C to non-C procedures and calls from non-C
procedures to C functions.
7.1. Calling C to C
A call from C to C will be done directly with the use of
`pl1_operators_'. The types of the arguments will be as
described in MTB 647.
7.2. Calling a Main Program
C programs will be callable in two ways: through `main_', which
expects its arguments in the standard Multics command processor
format, and through `main', which expects its arguments in the
standard C argc, argv format. The normal entry sequence for a C
program will be via the command processor linking to `main_'.
Within an execution unit, calls to main will be resolved to the
standard C entry `main'. Although both entrypoints are
accessible to the user, it remains the user's responsibility to
ensure that the correct values are passed as parameters.
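The difference between the two argument formats can be sketched
in C. The functions that `main_' must actually perform are
defined in MTB 691, not here; the sketch below simply assumes
that each command processor argument arrives as a character
string with an explicit length and no terminating NUL. The names
command_arg, build_argv and free_argv are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed model of one command processor argument: text plus an
 * explicit length, not NUL terminated. */
struct command_arg {
    const char *text;
    size_t length;
};

/* Copy `length' characters into a fresh NUL-terminated string. */
static char *copy_counted(const char *text, size_t length)
{
    char *copy = malloc(length + 1);

    if (copy != NULL) {
        memcpy(copy, text, length);
        copy[length] = '\0';
    }
    return copy;
}

/* Release a vector produced by build_argv (stops at the first NULL). */
void free_argv(char **argv)
{
    size_t i;

    if (argv == NULL)
        return;
    for (i = 0; argv[i] != NULL; i++)
        free(argv[i]);
    free(argv);
}

/* Build a C style argv: argv[0] is the program name, argv[argc] is
 * NULL.  Returns NULL if storage cannot be obtained. */
char **build_argv(const char *program_name,
                  const struct command_arg *args, int arg_count,
                  int *argc_out)
{
    int argc = arg_count + 1;
    char **argv = calloc((size_t)argc + 1, sizeof *argv);
    int i;

    if (argv == NULL)
        return NULL;
    for (i = 0; i < argc; i++) {
        argv[i] = (i == 0)
            ? copy_counted(program_name, strlen(program_name))
            : copy_counted(args[i - 1].text, args[i - 1].length);
        if (argv[i] == NULL) {
            free_argv(argv);            /* frees the copies made so far */
            return NULL;
        }
    }
    *argc_out = argc;
    return argv;
}
```

Under these assumptions a `main_' style wrapper would build the
vector, call `main' with it, and release it afterwards.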
7.3. Calls from C to Non-C Procedures
C will be able to call non-C functions if the non-C function
being called understands the data types being passed to it. For
this reason only pointers and some basic arithmetic data types
will be compatible with non-C languages.
7.4. Calls from Non-C Procedures to C Functions
Non-C functions will be able to call C if the C functions
understand the data types being passed to them. For this reason
only pointers and some basic arithmetic data types will be
compatible with C.
8. Storage Allocation
C will follow the Multics standard for the allocation of its
variables. The only exception to this standard is due to the
definition of C external variables. C external variables are
defined by the normal C environment to be on a
per-execution-process basis, while Multics external variables are
on a per-login-process basis. For this reason C external and
static variables will be allocated as normal external variables,
but the execution unit will be expected to be linked as defined
in MTB 691.
9. Code Conventions
Multics has a few conventions that must be followed. The major
conventions are listed in the following paragraphs.
9.1. Forbidden Instructions
C will not use any alm instructions that may become obsolete in
future releases of Multics.
9.2. Use of Pointer Registers
Pointer registers are widely used in Multics because of the
segmented address space. Everything outside of the current
segment must be addressed via a segment number. Some pointer
registers have defined uses:
- PR6 should always point to the current stack frame.
- PR0 is set by the operators for programs using
`pl1_operators_'. It points to the `pl1_operators_' transfer
vector except during a call, when it points to the argument
list. The following instruction can be used to reset PR0 to
the transfer vector: epp0 pr7|28,* where PR7 points to the
base of the stack.
- PR4 is usually used when a pointer to the linkage/static
section is needed. The entry operators store it in the stack
frame, so it can be reloaded by the following instruction:
epp4 pr6|36,*
- PR7 points to the base of the stack segment when a program is
entered. It may be reused by the program and reloaded by the
instruction epbp7 pr6|0,*
`pl1_operators_' does not save the values of other pointer
registers across calls.
9.3. Identifiers
In the demo version of the compiler variable names will likely
have a maximum length of 32 characters. A name must begin with a
letter or the underscore character, which may be followed by any
series of letters, digits, underscores and $ characters. If the
identifier name contains a single $ it will be taken to represent
a Multics external identifier. The following BNF style grammar
defines the variable name.
<character> ::= a|b|c|d|.....|y|z|A|B|C|D|......|Y|Z|_
<character str> ::= <character> | <character> <character str>
<number> ::= 0|1|2|3|.....|8|9 | <number> <number>
<identifier> ::= <character>[ <character str>| <number>| "$"]*
External Identifiers on Multics have the form
"segname$entry_name" where:
<segname> ::= <character>[<character str>| <number>]*
<entry_name> ::= <identifier>
where []* means zero or more.
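The grammar above can be checked mechanically. The following C
sketch is one possible reading of it, not the compiler's actual
lexer: the function name classify_identifier and the enumeration
are hypothetical, the 32 character limit is the "likely" demo
limit, and treating a name with more than one $ as an ordinary
identifier is an assumption, since the text only defines the
single $ case.

```c
#include <ctype.h>
#include <string.h>

#define MAX_IDENTIFIER_LENGTH 32   /* proposed demo compiler limit */

enum identifier_kind { ID_INVALID, ID_PLAIN, ID_EXTERNAL };

/* <character> in the grammar: a letter or the underscore. */
static int is_name_start(int c)
{
    return isalpha(c) || c == '_';
}

/* Classify a name against the grammar.  ID_EXTERNAL is reported only
 * for exactly one '$' followed by a valid <entry_name> start. */
enum identifier_kind classify_identifier(const char *name)
{
    size_t length = strlen(name);
    size_t dollar_count = 0;
    size_t dollar_position = 0;
    size_t i;

    if (length == 0 || length > MAX_IDENTIFIER_LENGTH)
        return ID_INVALID;
    if (!is_name_start((unsigned char)name[0]))
        return ID_INVALID;

    for (i = 1; i < length; i++) {
        unsigned char c = (unsigned char)name[i];

        if (c == '$') {
            dollar_count++;
            dollar_position = i;
        } else if (!is_name_start(c) && !isdigit(c)) {
            return ID_INVALID;          /* character outside the grammar */
        }
    }

    if (dollar_count == 1 &&
        is_name_start((unsigned char)name[dollar_position + 1]))
        return ID_EXTERNAL;             /* segname$entry_name */
    return ID_PLAIN;
}
```

With this reading, "counter_1" is an ordinary identifier,
"hcs_$initiate" is an external identifier, and "1abc" is rejected
because it does not begin with a <character>.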
10. General Information
10.1. Data Type Sizes
At the time of this writing, the following sizes are proposed for
the basic data types:
Type              Size                          Alignment
====              ====                          =========
short int         36 or 18 bits                 word or half word
int               36 bits                       word
long int          72 bits                       double word
unsigned int      36 bits                       word (1)
unsigned long     72 bits                       double word (2)
char              9 bits per char               word
float             36 bits (8 bit exponent,      word
                  28 bit mantissa)
double            72 bits (8 bit exponent,      double word
                  64 bit mantissa)
pointer (ITS)     72 bits                       double word
pointer (packed)  36 bits                       word
In the demo version of the compiler short int types will be 36
bits long and will be word aligned. The intent is for short ints
to be 18 bits long and half word aligned in the production
version of the compiler.
_________________________________________________________________
(1) This is a change from MTB 647
(2) This is a change from MTB 647
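The proposed sizes can be restated as a small C table, which
makes it easy to work out how many 36 bit words a group of
elements occupies. This is an illustrative sketch only: the
names type_layout, find_layout and words_for_elements are
hypothetical, the short int entry uses the demo compiler size,
and packing consecutive elements at their stated size is an
assumption.

```c
#include <stddef.h>
#include <string.h>

#define BITS_PER_WORD 36

struct type_layout {
    const char *name;
    unsigned size_bits;
    unsigned align_bits;
};

/* Sizes and alignments as proposed in the table of section 10.1. */
static const struct type_layout proposed_layouts[] = {
    { "short int",      36, 36 },   /* demo; 18/18 intended later */
    { "int",            36, 36 },
    { "long int",       72, 72 },
    { "unsigned int",   36, 36 },
    { "unsigned long",  72, 72 },
    { "char",            9, 36 },
    { "float",          36, 36 },
    { "double",         72, 72 },
    { "pointer ITS",    72, 72 },
    { "pointer packed", 36, 36 },
};

/* Look a type up by name; returns NULL if it is not in the table. */
const struct type_layout *find_layout(const char *name)
{
    size_t count = sizeof proposed_layouts / sizeof proposed_layouts[0];
    size_t i;

    for (i = 0; i < count; i++)
        if (strcmp(proposed_layouts[i].name, name) == 0)
            return &proposed_layouts[i];
    return NULL;
}

/* Words needed for `count' elements stored back to back (assumed
 * packing), rounded up to a whole word. */
unsigned words_for_elements(const struct type_layout *layout,
                            unsigned count)
{
    unsigned total_bits = layout->size_bits * count;

    return (total_bits + BITS_PER_WORD - 1) / BITS_PER_WORD;
}
```

Under these assumptions five chars occupy 45 bits and therefore
two words, and three doubles occupy six words.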
10.1.1. Conversion of Data Types
Conversion of C Pointers will be handled as follows.
1. The size of a pointer in C will be 72 bits.
2. Conversion of an int with a value of zero will yield a null
pointer, that is, a pointer value of -1|1.
3. Conversion of int to pointer or pointer to int will be done
via the pack and unpack pointer instructions.
4. No conversion will take place on the passing or receiving of
pointers as parameters.
5. Conversion of pointers to long ints or long ints to pointers
will be done directly on a bit to bit relationship.
6. Conversion of a null pointer will lead to an integer value of |
zero. |
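Rules 2 and 6 can be modelled in portable C as follows. This is
a sketch for illustration, not the generated code: the structure
mx_pointer and the function names are hypothetical, and the
layout used for non-zero values (a 12 bit segment number above an
18 bit word offset, with no bit offset) is an assumption standing
in for the pack and unpack pointer instructions of rule 3.

```c
#include <stdint.h>

#define OFFSET_BITS 18
#define OFFSET_MASK UINT64_C(0777777)   /* 18 bit word offset */
#define SEGNO_MASK  UINT64_C(07777)     /* assumed 12 bit segment field */

/* Hypothetical model of a pointer as segment number | word offset. */
struct mx_pointer {
    int32_t segno;
    uint32_t offset;
};

/* The null pointer value -1|1. */
static const struct mx_pointer MX_NULL = { -1, 1 };

int mx_is_null(struct mx_pointer p)
{
    return p.segno == -1 && p.offset == 1;
}

/* Rule 2: an int of zero becomes the null pointer.  Other values are
 * unpacked with the assumed layout.  The word is held in the low 36
 * bits of a 64 bit integer. */
struct mx_pointer mx_int_to_pointer(uint64_t word)
{
    struct mx_pointer p;

    if (word == 0)
        return MX_NULL;
    p.segno = (int32_t)((word >> OFFSET_BITS) & SEGNO_MASK);
    p.offset = (uint32_t)(word & OFFSET_MASK);
    return p;
}

/* Rule 6: the null pointer becomes an int of zero.  Other pointers
 * are packed with the assumed layout. */
uint64_t mx_pointer_to_int(struct mx_pointer p)
{
    if (mx_is_null(p))
        return 0;
    return (((uint64_t)(uint32_t)p.segno & SEGNO_MASK) << OFFSET_BITS)
         | ((uint64_t)p.offset & OFFSET_MASK);
}
```

In this model the integer zero is reserved for the null pointer,
so the usual C test of a pointer against zero keeps its meaning
even though the null pointer is not stored as all zero bits.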