Multics Technical Bulletin MTB 644
To: MTB Distribution
From: Ron Barstad
Date: 12/20/83
Subject: The View Manager Facility: SQL Parser
ABSTRACT
The Sequential Query Language (SQL) for the View Manager Facility
is described. The use of a LALR parser to convert the SQL info a
canonical form is described.
This MTB is only one of a group of related documents in the View
Manager series. These documents are
MTB-641 The View Manager Facility
MTB-642 The View Manager Facility: Subroutine Interfaces
MTB-643 The View Manager Facility: The View Master Subsystem
MTB-644 The View Manager Facility: SQL Parser
MTB-645 The View Manager Facility: Data Dictionary Interface
Notes for printing:
This document contains lines longer than the standard pagewidth
of 65. The control argument "-forms elite" should be used for
hardcopies from the x9700 and "-rqt pmdc_12c" should be used on
the PMDC diablo.
Comments may be made
via forum:
>udd>Multics>meetings>End_User_Data_Access (euda)
via electronic mail:
RBarstad.Multics at System M
via telephone:
(HVN) 357-6617 or (602) 862-6617
_________________________________________________________________
Multics Project internal working documentation. Not to be
reproduced outside the Multics Project.
MTB 644 Multics Technical Bulletin
1. INTRODUCTION
The Structured Query Language (SQL) defined by IBM in the mid
70's for their System R project has become the defacto industry
standard. Many vendors, including GCOS, are converting to or
adding a SQL like interface to their data base products. A
Multics dialect of SQL will be used for the View Manager
Facility. This dialect will be familiar to users of other SQL
systems.
Earlier proposals by Jim Gray and others to provide a unified
query language which was a combination of SQL, MRDS and LINUS
were found to be wanting. It was felt that these approaches
provided a lowest-common-denominator language that combined the
worst features of all the others. The syntax for Gray's unified
language also proved to be ambiguous and incapable of LALR
analysis without major change.
Multics Technical Bulletin MTB 644
2. FUNCTIONAL DESCRIPTION
2.1 Overview
The parser translates SQL into a Canonical Query Language (CQL).
The SQL is the user entered text such as "select foo from bar"
and the CQL is an internal representation passed on down to the
routines that do the user's bidding. (The CQL is not described
here.)
The parser can be considered a kind of compiler that converts the
SQL source statements into the CQL object. This is true even
though the parser deals with only a single statement at a time
and appears to the user to be a pre-processor. Errors in syntax
or semantics will be detected and passed back to the caller.
Some checking of column names and expressions may be done to
verify their appropriateness for the requested table. Any error
would prevent the creation of the CQL.
The core of the parser is a LALR translator built from a BNF
definition of the SQL. Semantic processing code will build the
CQL as each production is scanned and verified.
2.2 About LALR
LALR means "Look Ahead, Left to right, Right most derivation".
The Multics LALR system translates a BNF description of a grammar
into a parser for that language. The parser is efficient
compared to traditional "hand coding" and provides an expeditious
way to build and test grammars like SQL. After the decision was
made to use the LALR parser for the View Manager SQL, we found
that IBM made the same decision when building their parser for
the original SEQUEL for System R. It too was a PL/1 based
interpreter that created a "canonical" form of the SEQUEL.
Based on an original design by Dave Ward and Jim Falksen, the
Multics LALR system is now maintained by Pat Prange at BCO.
Development there is continuing with LALR as part of the SLANG
project to support Ada/SIL. Expectations are that LALR will be
included in MR11.
2.3 Interfaces, hooks, handles
SQL statements are built at the terminal in a variety of the
tradition ways: request loop, editor, ec, and so forth. The
parser expects a complete statement for parsing and cannot make
much sense of fragments. However, it may be possible to take
advantage of the error recovery mechanisms to provide some fancy
user interfaces. For example, if the parser were warned that the
statement passed to it was incomplete, it could (after verifying
correctness so far) return a list of possibilities for the next
token.
MTB 644 Multics Technical Bulletin
As in any language, there are three classes of error possible in
SQL. First are simple syntax errors. The parser will give up
processing a statement on the first one and indicate the errant
token and most probable error. Semantic errors will be delt with
similarily. The parser will be able to find some, but not all,
execution errors. It may be able to verify from the data
dictionary that the requested columns are in the referenced table
or that the tables even exist but it can't know if the selection
condition (where clause) is valid.
2.4 Subroutine Interface
To allow for growth, the parser is a free-standing module not
bound with the rest of the view manager. This design allows
other parsers, as long as they produce CQL, to be substited or
added.
Multics Technical Bulletin MTB 644
entry: convert_sql_to_cql_
This entry point is used to parse a sql statement string and
generate the associated cql description.
USAGE
dcl convert_sql_to_cql_ entry options (variable);
call convert_sql_to_cql_ (dd_path, sql_string,
return_area_ptr, replacement_arg_1, ...
replacement_arg_N, cql_ptr);
ARGUMENTS
dd_path
is a character string char (*) that contains the pathname
of the data dictionary to be used while parsing the sql.
(input)
sql_string
is a character string char (*) that contains the sql
statement to be parsed. (input)
return_area_ptr
is a pointer to an area where this routine will allocate
the returned cql structure. (input)
replacement_arg_I
are a variable number of arguments (possibly none), each
char (*), that will be used as substitution arguments for
&N in the sql_str. The &N is associated with the Nth
replacement_arg. (input)
cql_ptr
is a pointer to the generated cql_structure that is
allocated in the area pointed to by return_area_ptr.
(output)
MTB 644 Multics Technical Bulletin
3. DESCRIPTION OF SQL
The following BNF defines the SQL syntax. This is the same BNF
that is passed as input to the lalr command.
It has been shown to be a unambiguous grammar and capable of
correctly parsing SQL statements.
3.1 Reserved words
These reserved words are defined for the SQL. They may not be
used in any other context, such as for table or column names.
access create_tablespace immediate register_table
add create_view in remove
admin current index restore
all dec insert revoke
alter decimal insertion rollback
alter_tablespace delete intersect save
and deletion into select
any deregister is set
as deregister_link like share
asc deregister_table link special
ascending desc lock sum
assertion descending max synonym
avg does min table
between drop minus tablespace
bin drop_synonym mode then
binary drop_table new to
bit drop_tablespace nonvar trigger
but drop_view nonvarying uc
by exclusive not union
char exists null unique
character fix of update
check fixed old using
column float on values
comment for option var
commit from or varying
contain grant order view
contains group read where
count having records with
create if register work
create_table image register_link
Multics Technical Bulletin MTB 644
3.2 SQL BNF
(* Begin Multics View Manager SQL *)
<statement> ::= (* 83-12-20.01 *)
<query> |
<dml_statement> |
<ddl_statement> |
<control_statement> !
<dml_statement> ::=
<insertion> |
<deletion> |
<update> !
<query> ::=
<query_expr> <order_clause> !
<insertion> ::=
insert into <receiver> <insert_spec> !
<deletion> ::=
delete <from_clause> <where_current_clause> !
<update> ::=
update <table_label_name> set <set_clause_list> <where_current_clause> !
<order_clause> ::=
order by <ord_spec_list> | !
<receiver> ::=
<table_name> |
<table_name> ( <column_name_list> ) !
<column_name_list> ::=
<column_name> |
<column_name_list> , <column_name> !
<insert_spec> ::=
<query_expr> |
values <literal_row> !
MTB 644 Multics Technical Bulletin
<table_label_name> ::=
<table_name> <label_name> |
<table_name> !
<set_clause_list> ::=
<set_clause> |
<set_clause_list> , <set_clause> !
<set_clause> ::=
<column_name> = <expr> |
<column_name> = ( <query_block> ) !
<query_expr> ::=
<query_block> |
<query_expr> <set_op> <query_block> |
( <query_expr> ) !
<set_op> ::=
intersect |
union |
minus !
<query_block> ::=
<sub_query_block> |
<sub_query_block> <group_clause> !
<sub_query_block> ::=
<select_clause> <from_clause> <where_clause> !
<select_clause> ::=
select <sel_list_spec> <sel_expr_list> |
select <sel_list_spec> * !
<from_clause> ::=
from <table_list> !
<where_current_clause> ::=
<where_clause> |
where current of <cursor_name> !
<where_clause> ::=
where <boolean_expr> | !
<group_clause> ::=
group by <column_spec_list> |
group by <column_spec_list> <having_clause> !
<having_clause> ::=
having <boolean_expr> !
Multics Technical Bulletin MTB 644
<sel_list_spec> ::=
all |
unique | !
<sel_expr_list> ::=
<sel_expr> |
<sel_expr_list> , <sel_expr> !
<sel_expr> ::=
<expr> |
<table_name> . * !
<table_list> ::=
<table_label_name> |
<table_list> , <table_label_name> !
<column_spec_list> ::=
<column_spec> |
<column_spec_list> , <column_spec> !
<ord_spec_list> ::=
<column_spec> <direction> |
<ord_spec_list> , <column_spec> <direction> |
<integer> !
<direction> ::=
asc |
ascending |
desc |
descending !
<boolean_expr> ::=
<boolean_term> |
<boolean_expr> <or_op> <boolean_term> !
<boolean_term> ::=
<boolean_factor> |
<boolean_term> <and_op> <boolean_factor> !
<boolean_factor> ::=
<boolean_primary> |
<not_op> <boolean_primary> !
<boolean_primary> ::=
<predicate> |
( <boolean_expr> ) !
<predicate> ::=
<expr> <comparison> <expr_or_table_spec> |
<table_spec> <comparison> <table_spec_lit> |
exists ( <query_expr> ) |
<expr> <maybe_not> like <regular_expr> |
MTB 644 Multics Technical Bulletin
<expr> is <maybe_not> null |
<expr> between <expr> <and_op> <expr> |
if <predicate> then <predicate> (* assert only *) !
<expr_or_table_spec> ::=
<expr> |
<table_spec> !
<regular_expr> ::=
<string> !
<table_spec> ::=
( <query_expr> ) |
'< <literal_row_list> > |
<literal_row> !
<table_spec_lit> ::=
( <query_expr> ) |
<literal> !
<expr> ::=
<arith_term> |
<expr> <add_op> <arith_term> !
<arith_term> ::=
<arith_factor> |
<arith_term> <mult_op> <arith_factor> !
<arith_factor> ::=
<add_op> <primary> |
<primary> !
<primary> ::=
<column_spec> |
old <column_spec> |
new <column_spec> |
<set_function> ( <maybe_unique> <expr> ) |
<builtin> ( <arg_list> ) |
count ( * ) |
<constant> |
( <expr> ) !
<column_spec> ::=
<column_name> |
<table_name> . <column_name> !
<comparison> ::=
<comp_op> |
<comp_op> any |
<comp_op> all |
contains |
does <not_op> contain |
Multics Technical Bulletin MTB 644
is <maybe_not> in |
<maybe_not> in !
<maybe_unique> ::=
unique | !
<comp_op> ::=
= |
^= |
> |
>= |
'< |
'<= |
^> |
^'< !
<add_op> ::=
+ |
- !
<mult_op> ::=
* |
/ !
<or_op> ::=
or |
'| !
<and_op> ::=
and |
& !
<not_op> ::=
not |
^ !
<maybe_not> ::=
<not_op> | !
<set_function> ::=
avg |
max |
min |
sum |
count |
<symbol> !
<builtin> ::=
<symbol> !
MTB 644 Multics Technical Bulletin
<arg_list> ::=
<arg> |
<arg_list> , <arg> !
<literal> ::=
'< <literal_row_list> > |
<literal_row> |
( <entry_list> ) |
<constant> !
<literal_row_list> ::=
<literal_row> |
<literal_row_list> , <literal_row> !
<literal_row> ::=
'< <entry_list> > !
<entry_list> ::=
<entry> |
<entry_list> , <entry> !
<entry> ::=
<constant> | !
<constant> ::=
<string> |
<real literal> |
<integer> |
null !
(*
Multics Technical Bulletin MTB 644
*)
<ddl_statement> ::=
<create_table> |
<alter_table> |
<create_tablespace> |
<alter_tablespace> |
<create_index> |
<create_view> |
<create_synonym> |
<comment> |
<drop> !
<create_table> ::=
create table <table_name> ( <column_defn_list> ) !
<column_defn_list> ::=
<column_defn_list> , <column_defn> |
<column_defn> !
<column_defn> ::=
<column_name> <data_type> <maybe_type_mod> !
<data_type> ::=
<arith_data_type> |
<string_data_type> !
<string_data_type> ::=
<string_type> ( <integer> ) <variability> !
<string_type> ::=
character |
char |
bit !
<variability> ::=
varying |
var |
nonvarying |
nonvar !
<arith_data_type> ::=
<scale> <base> <precision> !
MTB 644 Multics Technical Bulletin
<scale> ::=
fixed |
fix |
float !
<base> ::=
binary |
bin |
decimal |
dec !
<precision> ::=
( <integer> ) |
( <integer> , <integer> ) !
<maybe_type_mod> ::=
, <type_mod> | !
<type_mod> ::=
not null |
image <image_mod> |
image !
<image_mod> ::=
unique |
uc !
<alter_table> ::=
alter table <table_name> add <column_defn> !
<create_tablespace> ::=
create tablespace <tablespace_name> in <path> <maybe_with_records> !
<maybe_with_records> ::=
with <integer> records | !
<alter_tablespace> ::=
alter tablespace <tablespace_name> <alter_mode> <integer> records !
<alter_mode> ::=
add |
remove !
Multics Technical Bulletin MTB 644
<create_index> ::=
create <maybe_unique> index <index_name> on <table_name> ( <ord_spec_list> ) !
<create_view> ::=
create view <view_name> <maybe_column_name_list> as <query> <maybe_check_option> !
<maybe_check_option> ::=
with check option | !
<drop> ::=
drop <system_entity> <name> !
<system_entity> ::=
assertion |
index |
synonym |
table |
tablespace |
trigger |
view !
<create_synonym> ::=
create synonym <table_name> for <table_name> !
<comment> ::=
comment on table <table_name> is <string> |
comment on column <table_name>.<column_name> is <string> !
(*
MTB 644 Multics Technical Bulletin
*)
<control_statement> ::=
<assert> |
<trigger> |
<grant> |
<revoke> |
<lock> |
<register_table> |
<deregister_table> |
<register_link> |
<deregister_link> |
<commit> |
<rollback>
<save> |
<restore> !
<assert> ::=
create assertion <assert_name> <maybe_immed> <maybe_assert_cond> is <boolean_expr> !
<maybe_immed> ::=
immediate | !
<maybe_assert_cond> ::=
on <assert_condition> | !
<assert_condition> ::=
<action_list> |
<table_label_name> !
<action_list> ::=
<action> |
<action_list> , <action> !
<action> ::=
insertion of <table_label_name> |
deletion of <table_label_name> |
update of <table_label_name> <maybe_column_name_list> !
<trigger> ::=
create trigger <trigger_name> on <trigger_condition> is ( <statement_list> ) !
<trigger_condition> ::=
<action> |
read of <table_label_name> !
Multics Technical Bulletin MTB 644
<statement_list> ::=
<conditional_statement> |
<statement_list> ; <conditional_statement> !
<conditional_statement> ::=
<statement> |
if <boolean_expr> then <statement> !
<grant> ::=
grant <authorization> <maybe_on_table> to <user_list> <maybe_grant_opt> !
<revoke> ::=
revoke <authorization> <maybe_on_table> from <user_list> !
<authorization> ::=
all |
<privilege_list> |
all but <privilege_list> !
<maybe_on_table> ::=
on <table_name> | !
<user_list> ::=
<user_list> , <user_name> |
<user_name> !
<maybe_grant_opt> ::= (* not to *.*.* *)
with grant option | !
<privilege_list> ::=
<privilege_list> , <privilege> |
<privilege> !
<privilege> ::=
<table_privilege> |
<dict_privilege> |
<tablespace_privilege> |
null !
<table_privilege> ::=
alter | (* not views *)
delete |
index | (* not views *)
insert |
link |
select |
update <maybe_column_name_list> !
MTB 644 Multics Technical Bulletin
<maybe_column_name_list> ::=
( <column_name_list> ) | !
<dict_privilege> ::=
access |
admin |
create_tablespace |
create_view |
deregister_link |
deregister_table |
drop_synonym |
drop_tablespace |
drop_view |
register_link |
register_table !
<tablespace_privilege> ::=
alter_tablespace |
create_table |
drop_table !
<lock> ::=
lock table <table_name> in <lock_mode> mode !
<lock_mode> ::=
exclusive |
share !
<register_table> ::=
register table <path> <maybe_as_table> <maybe_col_defn_list> <maybe_using> <special> !
<path> ::=
<symbol> |
<string> !
<maybe_as_table> ::=
as <table_name> | !
<maybe_col_defn_list> ::=
( <column_defn_list> ) | !
<maybe_using> ::=
using <procedure_name> | !
<special> ::=
special <string> | !
Multics Technical Bulletin MTB 644
<deregister_table> ::=
deregister table <table_name> !
<register_link> ::=
register link <link_name> <maybe_as_link> from <path> !
<maybe_as_link> ::=
as <link_name> | !
<deregister_link> ::=
deregister link <link_name> !
<commit> ::=
commit work !
<rollback> ::=
rollback work !
<save> ::=
save <save_name> !
<restore> ::=
restore |
restore <save_name> !
(*
MTB 644 Multics Technical Bulletin
*)
<name> ::=
<symbol> !
<assert_name> ::=
<symbol> !
<column_name> ::=
<symbol> !
<cursor_name> ::=
<symbol> !
<index_name> ::=
<symbol> !
<label_name> ::=
<symbol> !
<link_name> ::=
<symbol> !
<procedure_name> ::=
<symbol> !
<save_name> ::=
<symbol> !
<table_name> ::=
<symbol> !
<tablespace_name> ::=
<symbol> !
<trigger_name> ::=
<symbol> !
<view_name> ::=
<symbol> !
<user_name> ::=
<symbol> |
<symbol> . <symbol> |
* . <symbol> |
<symbol> . * !
(* End Multics View Manager SQL *)
Multics Technical Bulletin MTB 644
4. BIBLIOGRAPHY
Astrahan, M.M., et al. "A History and Evaluation of System R",
IBM Research Report RJ2843, June 1980.
Astrahan, M.M., et al. "System R: Relational Approach to
Database Management" In Transactions on Database Systems, Vol.
1, No. 2, June 1976.
Astrahan, M.M. and Chamberlin, D.D., "Implementation of a
Structured English Query Language", In Communications of the ACM,
Vol. 18, No. 10, October 1975.
Chamberlin, D.D., et al. "SEQUEL 2: A Unified Approach to Data
Definition, Manipulation, and Control", In IBM Journal of
Research and Development, Vol. 20, No. 6, November 1976.
Codd, E.F. and Blasgen, Michael, "Relational Data Base
Management", ACM Professional Development Seminar lecture notes,
October 1979.
Gray, Jim, "New Translator System for MRDS/LINUS Query Language",
MDC Proposal, Rev. 4.0, September, 16, 1981.
Kroenke, David M., Database Processing: Fundamentals, Design,
Implementation, (2nd ed.) Science Research Associates, 1983.
IBM, Database 2 SQL Usage Guide, IBM Document GG24-1583-00, 1983.
Prange, P., "LALR, a Translator Construction System", SLANG
Project Technical Bulletin, July 26, 1983.
Prange, P. and Margulies, Benson, "LALR, a Translator
Construction System", MTB 602, October 4, 1982.
Reisner, Phyllis, et al. "Human factors evaluation of two data
base query languages--Square and Sequel", In Proceedings of the
National Computer Conference, AFIPS, 1975.
Relational Software Inc., ORACLE User's Guide, Version 2.3, 1981.
Rosensteel, K., Relational Access Manager (RAM) EPS-1, LCPD
document number 58075029, Rev. 4, September 21, 1982.
Shneiderman, Ben, "Improving the Human Factors Aspect of Database
Interactions", In ACM Transactions on Database Systems, Vol. 3,
No. 4, December 1978.