Free Academic Seminars And Projects Reports

Full Version: Relational Model & Algebra
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
[attachment=14777]
Relational Model & Algebra
Recall Our Initial Discussion
There are a variety of ways of representing data, each with trade-offs
Free text [often need a human]
Shapes/points in space
Objects with properties
In general, our emphasis will be on the last item
though there are spatial databases, OO databases, text databases, and the like
The Relational Data Model (1970)
Lessons from the Codd paper
Let s separate physical implementation from logical
Model the data independently from how it will be used (accessed, printed, etc.)
Describe the data minimally and mathematically
A relation describes an association between data items tuples with attributes
We generally think of tables and rows, but that s somewhat imprecise
Use standard mathematical (logical) operations over the data these are the relational algebra or relational calculus
How does this model relate to objects, properties? What are its abilities and limitations?
Why Did It Take So Many Years to Implement Relational Databases?
Codd s original work: 1969-70
Earliest relational database research: 1976
Oracle 2.0 : 1979
Why the gap?
You could do the same thing in other ways
Nobody wants to write math formulas
Why would I turn my data into tables?
It won t perform well
What do you think?
Getting More Concrete: Building a Database and Application
Start with a conceptual model
On paper using certain techniques we ll discuss next week
We ignore low-level details focus on logical representation
Design & implement schema
Design and codify (in SQL) the relations/tables
Do physical layout indexes, etc.
Import the data
Write applications using DBMS and other tools
Many of the hard problems are taken care of by other people (DBMS, API writers, library authors, web server, etc.)
Conceptual Design for CIS Student Course Survey
Example Schema
Our focus now: relational schema set of tables
Can have other kinds of schemas XML, object,
Some Terminology
Columns of a relation are called attributes or fields
The number of these columns is the arity of the relation
The rows of a relation are called tuples
Each attribute has values taken from a domain, e.g., subj has domain string
Theoretically: a relation is a set of tuples; no tuple can occur more than once
Real systems may allow duplicates for efficiency or other reasons we ll ignore this for now
Objects and XML may also have the same content with different identity
Describing Relations
A schema can be represented many ways
In relational DBs, we use relation(attribute:domain)
To the DBMS, use data definition language (DDL) like programming language type definitions
More on Attribute Domains
Relational DBMSs have very limited built-in domains: either tables or scalar attributes int, string, byte sequence, date, etc.
But more generally:
We can have nested relations
Object-oriented, object-relational systems allow complex, user-defined domains lists, classes, etc.
XML systems allow for XML trees (or lists of trees) that follow certain structural constraints
Database people, when they are discussing design, often assume domains are evident to the reader: STUDENT(sid, name)
Integrity Constraints
Domains and schemas are one form of constraint on a valid data instance
Other important constraints include:
Key constraints:
Subset of fields that uniquely identifies a tuple, and for which no subset of the key has this property
May have several candidate keys; one is chosen as the primary key
A superkey is a subset of fields that includes a key
Inclusion dependencies (referential integrity constraints):
A field in one relation may refer to a tuple in another relation by including its key
The referenced tuple must exist in the other relation for the database instance to be valid
SQL: Structured Query Language
The standard language for relational data
Invented by folks at IBM, esp. Don Chamberlin
Actually not a particularly elegant language
Beat a more elegant competing standard, QUEL, from Berkeley
Separated into a DML (data manipulation language) & DDL
DML based on relational algebra & (mostly) calculus, which we discuss this week
Later we ll see how it s embedded in a host language
Table Definition: SQL-92 DDL and Constraints
Example Data Instance
From Tables SQL Web Application
<html>
<body>
<!-- hypotheticalEmbeddedSQL:
SELECT * FROM STUDENT, Takes, COURSE
WHERE STUDENT.sid = Takes.sID
AND Takes.cID = cid
-->
</body>
</html>
Codd s Relational Algebra
A set of mathematical operators that compose, modify, and combine tuples within different relations
Relational algebra operations operate on relations and produce relations ( closure )
f: Relation Relation f: Relation x Relation Relation
Codd s Logical Operations: The Relational Algebra
Six basic operations:
Projection ®
Selection ®
Union R1 [ R2
Difference R1 R2
Product R1 R2
(Rename) b ®
And some other useful ones:
Join R1 R2
Semijoin R1 R2
Intersection R1 R2
Division R1 R2
Data Instance for Operator Examples
Projection,
Selection,
Product X
Join, : A Combination of Product and Selection
Union
Difference
Rename, a b
The rename operator can be expressed several ways:
The book has a very odd definition that s not algebraic
An alternate definition:
a b(x) Takes the relation with schema Returns a relation with the attribute list
Rename isn t all that useful, except if you join a relation with itself
Why would it be useful here?
Mini-Quiz
This completes the basic operations of the relational algebra. We shall soon find out in what sense this is an adequate set of operations.
Try writing queries for these:
The names of students named Bob
The names of students expecting an A
The names of students in Milo Martin s 501 class
The sids and names of students not enrolled
Deriving Intersection
Intersection: as with set operations, derivable from difference
Division
A somewhat messy operation that can be expressed in terms of the operations we have already defined
Used to express queries such as The fid's of faculty who have taught all subjects
Paraphrased: The fid s of professors for which there does not exist a subject that they haven t taught
Division: R1 R2
Requirement: schema(R1) schema(R2)
Result schema: schema(R1) schema(R2)
Professors who have taught all courses :

What about Courses that have been taught by all faculty ?
Division Using Our Existing Operators
All possible teaching assignments: Allpairs:
NotTaught, all (fid,subj) pairs for which professor fid has not taught subj:
Answer is all faculty not in NotTaught:
The Big Picture: SQL to Algebra to Query Plan to Web Page
Hint of Future Things: Optimization Is Based on Algebraic Equivalences
Relational algebra has laws of commutativity, associativity, etc. that imply certain expressions are equivalent in semantics
They may be different in cost of evaluation!
Next Time: An Equivalent, But Very Different, Formalism
Codd invented a relational calculus that he proved was equivalent in expressiveness
Based on a subset of first-order logic declarative, without an implicit order of evaluation
More convenient for describing certain things, and for certain kinds of manipulations
And, in fact, the basis of SQL!