Slide1: Databases on the Internet David Billard David.Billard@cui.unige.ch 4th CEENet Workshop on Network Technology Bratislava, Slovakia, August 21-31, 1998
Slide2: Outline Classical databases
Why should it be different in the Internet?
Organizing the data with the Araneus methodology
Building dynamically a web site with the Target Form Expression
Dealing with security
Using transactions and workflows
Slide3: Classical Databases Entity-Relationship
(ER) model Relational model Oracle DBMS
Slide4: Classical Database Environment A classical Database Management System (DBMS) is composed of: interpretation of requests ACID properties Atomicity (all or nothing)
Consistency
Isolation
Durability
Slide5: What is different with the Internet? Hypertext data: relational tables are 2-dimensions:
tuples (lines)
attributes (columns)
relational tables do not content navigational data ;
hypertext pages implements the third dimension of access path to the data Entity-relation scheme must evolve;
Relations must be queried with the third dimension in mind. Consequences:
Slide6: What is different with the Internet? Internet technology: no guaranty of service (the RSVP is not fully implemented in IP, expected to be in IPng);
unsecure channel of communication;
lack of the notion of user (only machines);
high probability of system crash;
very sensible to scale problems. we must implement recovery protocols;
we must implement security features. Consequences:
Slide7: What is different with the Internet? User behaviour: the user's actions are unknown and unpredictable we cannot use tightly coupled integration of the servers;
we must define a minimal set of mandatory functionalities for interoperation. Strong heterogeneity: a DBMS cannot know every other DBMS technology to interact Consequences:
Slide8: Organizing the data for their use via the Web Web sites containing valuable pieces of information
Web sites containing high volume of data, retrieved from databases
Data organized in hypertextual form (access paths are part of the model) What we have:
Slide9: Organizing the data for their use via the Web What we need: a methodology for the:
Database design process
Hypertext design process
a tool for:
generating web sites from databases
maintaining these web sites
Slide10: The ARANEUS Project (Univ. Roma 3)
Slide11: The ARANEUS Project 1:N 1:1 University ER scheme 1:N 1:1 1:1 1:N name
type day
hour roomNum phone name
photo
e-mail
Slide12: The ARANEUS Project From ER schemes to Hypertext Conceptual schemes Selecting Macroentities (objects relevant and independent)
Designing Directed Relationships (precise the direction of navigation)
Designing Union Nodes and Aggregations (representation of the IS-A relationship)
Slide13: The ARANEUS Project 1:N 1:1 Selecting Macroentities (objects relevant and independent) 1:N 1:1 1:1 1:N name
type day
hour roomNum phone name
photo
e-mail
Slide14: The ARANEUS Project 1:N 1:1 Selecting Macroentities (objects relevant and independent) 1:N 1:1 1:1 1:N name
type day
hour roomNum phone name
photo
e-mail
Slide15: The ARANEUS Project 1:N 1:1 Designing Directed Relationships (precise the direction of navigation) 1:N 1:1 1:1 1:N name
type day
hour roomNum phone name
photo
e-mail
Slide16: The ARANEUS Project Designing Union Nodes and Aggregations (representation of the IS-A relationship) 1:N 1:1 1:N 1:1 1:1 1:N name
type day
hour roomNum phone name
photo
e-mail
Slide17: The ARANEUS Project Department Hypertext Conceptual Scheme type =
undergraduate type =
graduate name
phone
photo
e-mail 1:N 1:1 day
hour
roomNum 1:N name type
Slide18: The ARANEUS Project
From Hypertext Conceptual schemes to Hypertext Logical Design Mapping Macroentities (to pages or lists)
Mapping Directed Relationships (to links between pages)
Mapping Aggregations (pages + lists + links)
Slide19: The ARANEUS Project Mapping Macroentities to pages name
phone
photo
e-mail
room
...
Slide20: The ARANEUS Project Mapping Macroentities to lists title
author
date
hour
room
...
Slide21: The ARANEUS Project Mapping Directed relationships to links name
phone
photo
e-mail
room
... 1:N 1:1 day
hour
roomNum 1:N name type
Slide22: The ARANEUS Project (Univ. Roma 3) 1 2 3 4 5 6 Dynamic page generation
Slide23: Querying the databases (browsing and navigating) a query language to make DB requests
a tool for presenting the result of the request:
in HTML form
in any form What we need:
Slide24: The Target Form Expression (TFE) project (Univ. of Keio, Japan) Idea: Extending SQL to add publishing facilities
the result of a query is presented in a structured document (HTML, Java, LaTeX, ...) To give a comparison: Allaire's Cold Fusion does not allow grouping or hyperlink generation (necessary for structuring documents)
Slide25: The Target Form Expression project GENERATE keyword GENERATE <medium> <TFE> <medium> = HTML, LaTeX, Java, Excel, TCLTK, O2C, SQL <TFE> = expression , = tuple connector
! = row connector
% = depth (link) operator [ and ] = repeaters [emp.name, emp.salary]! = list of tuples (names, salary) [store.name, [dept.name ! [emp.name]!, [item.name]!]!]%
Slide26: The Target Form Expression project Example of a movie database cast
Slide27: The Target Form Expression project
Slide28: The Target Form Expression project select a category generate html verb(select a category) !
[f.type %
[f.year, [f.title %
{f.title ! imagefile(f.pict) !
[imagefile(a.face), a.name, a.birth]!}]!]!],
from film f, cast c, actor a
where f.id = c.film and c.actor = a.id
Slide29: The Target Form Expression project generate html verb(select a category) !
[f.type %
[f.year, [f.title %
{f.title ! imagefile(f.pict) !
[imagefile(a.face), a.name, a.birth]!}]!]!],
from film f, cast c, actor a
where f.id = c.film and c.actor = a.id
Slide30: The Target Form Expression project generate html verb(select a category) !
[f.type %
[f.year, [f.title %
{f.title ! imagefile(f.pict) !
[imagefile(a.face), a.name, a.birth]!}]!]!],
from film f, cast c, actor a
where f.id = c.film and c.actor = a.id
Slide31: The Target Form Expression project generate html verb(select a category) !
[f.type %
[f.year, [f.title %
{f.title ! imagefile(f.pict) !
[imagefile(a.face), a.name, a.birth]!}]!]!],
from film f, cast c, actor a
where f.id = c.film and c.actor = a.id
Slide32: The Target Form Expression project The first wives club generate html verb(select a category) !
[f.type %
[f.year, [f.title %
{f.title ! imagefile(f.pict) !
[imagefile(a.face), a.name, a.birth]!}]!]!],
from film f, cast c, actor a
where f.id = c.film and c.actor = a.id
Slide33: The Target Form Expression project The first wives club generate html verb(select a category) !
[f.type %
[f.year, [f.title %
{f.title ! imagefile(f.pict) !
[imagefile(a.face), a.name, a.birth]!}]!]!],
from film f, cast c, actor a
where f.id = c.film and c.actor = a.id
Slide34: The Target Form Expression project The first wives club generate html verb(select a category) !
[f.type %
[f.year, [f.title %
{f.title ! imagefile(f.pict) !
[imagefile(a.face), a.name, a.birth]!}]!]!],
from film f, cast c, actor a
where f.id = c.film and c.actor = a.id
Slide35: The Target Form Expression project INVOKE keyword allow dynamic queries inside queries
(recursive queries)
Slide36: Classical Database Environment Environment of a classical DBMS: secure environment
Slide37: Classical Database Environment Multidatabases - Federated DBMS
Cooperative work, ...
Slide38: Securing the databases The Internet introduces threats for the DBMS and the users. 4 attacks among the more frequent Interception
Modification
Fabrication
Interruption
Slide39: Confidentiality Cannot help a message to be intercepted;
The message must not be disclosed; Confidentiality of data (Privacy)
Encryption (e.g. Secure Socket Layer - SSL) Interception
Slide40: Dangerous behaviours Modification Fabrication
Slide41: Dangerous behaviours A user cannot deny having received or sent a message. Non-Repudiation schemas, based on the authentication of user. A user must really be who he claims to be. Authentication of user
Authentication certificates delivered by a "Thrustee"
Slide42: Dangerous behaviours Interruption A user cannot help a communication to being cut. Recovery procedures, based on time-outs and logging.
Slide43: Securing the databases The iSaSiLk toolbox (Univ. of Vienna)
Written in Java
Provides cryptography primitives (possibility to implement SSL sockets)
Provides authentication primitives
Provides certificate management
Free for use for academic partners
(not free for industry)
Slide44: Transactions in the Internet Transactions are a very common tool in databases
They provide isolation of concurrent activities
They are fault tolerant processes
They have been extensively studied in distributed environment But They lack security and scalability in the Internet
Slide45: Example in Electronic Commerce Internet
Slide46: Example in Electronic Commerce Internet I want...
Slide47: Example in Electronic Commerce Internet the same shirt as
Sandra Bullock's in
"The Net", and...
Slide48: Example in Electronic Commerce Internet the same hat as
Ingrid Bergman's
in "Casablanca"!
Slide49: Example in Electronic Commerce Internet
Slide50: Example in Electronic Commerce Internet ACID Atomicity
Consistency
Isolation
Durability
Slide51: Example in Electronic Commerce Internet ACID Atomicity
Consistency
Isolation
Durability Secure Confidentialty
Authentication
Non-Repudiation
Slide52: Example in Atomic File Transfer Protocol Internet Sofware update Server A Server B System administrator
Slide53: Example in Atomic File Transfer Protocol Internet Server A Server B System administrator
Slide54: Example in Atomic File Transfer Protocol Internet Server A Server B System administrator
Slide55: Example in Atomic File Transfer Protocol Internet Server A Server B Installation application
Slide56: Example in Internet Aided Manufacturing Internet New extension card for PC
Slide57: Example in Internet Aided Manufacturing Internet New extension card for PC Printed circuit manufacturer
Slide58: Example in Internet Aided Manufacturing Internet New extension card for PC Printed circuit manufacturer Electronic component supplier
Slide59: Example in Internet Aided Manufacturing Internet New extension card for PC Printed circuit manufacturer Electronic component supplier
Slide60: Example in Internet Aided Manufacturing Internet Integrator Towards workflows
Slide61: Transactions in the Internet Transaction Internet Protocol (TIP)
Corba OTS (Object Transaction Service)
X/Open DTP (Distributed Transaction Processing) Related work on transactions:
Slide62: Workflows on the Internet A workflow business process:
Slide63: Workflows on the Internet Process (separation of business logic from function logic)
Organization (who is doing what)
Infrastructure (what has to be done manually, with computer, ...) 3 levels in a workflow:
Slide64: Workflows on the Internet Flowmark
Flowman
InConcert
Staffware
ViewStar
...
Slide65: Resume of part 1 We know how to modelize and represent a database
We know how to query the database and visualize the results
We are aware of security and fault-tolerance problems