Relational Databases

Information Technology Resources

A database is a collection of records stored in a computer in a systematic way, so that a computer program can consult it to answer questions. For better retrieval and sorting, each record is usually organized as a set of data elements (facts). The items retrieved in answer to queries become information that can be used to make decisions. The computer program used to manage and query a database is known as a database management system (DBMS). The properties and design of database systems are included in the study of information science.

The central concept of a database is that of a collection of records, or pieces of knowledge. Typically, for a given database, there is a structural description of the type of facts held in that database: this description is known as a schema. The schema describes the objects that are represented in the database, and the relationships among them. There are a number of different ways of organizing a schema, that is, of modeling the database structure: these are known as database models (or data models). The model in most common use today is the relational model, which in layman's terms represents all information in the form of multiple related tables each consisting of rows and columns (the true definition uses mathematical terminology). This model represents relationships by the use of values common to more than one table. Other models such as the hierarchical model and the network model use a more explicit representation of relationships.

Strictly speaking, the term database refers to the collection of related records, and the software should be referred to as the database management system or DBMS. When the context is unambiguous, however, many database administrators and programmers use the term database to cover both meanings.

Database models

Various techniques are used to model data structure. Most database systems are built around one particular data model, although it is increasingly common for products to offer support for more than one model. For any one logical model various physical implementations may be possible, and most products will offer the user some level of control in tuning the physical implementation, since the choices that are made have a significant effect on performance. An example of this is the relational model: all serious implementations of the relational model allow the creation of indexes which provide fast access to rows in a table if the values of certain columns are known.

A data model is not just a way of structuring data: it also defines a set of operations that can be performed on the data. The relational model, for example, defines operations such as selection, projection, and join. Although these operations may not be explicit in a particular query language, they provide the foundation on which a query language is built.

Flat model

This may not strictly qualify as a data model, as defined above.

The flat (or table) model consists of a single, two-dimensional array of data elements, where all members of a given column are assumed to be similar values, and all members of a row are assumed to be related to one another. For instance, columns for name and password that might be used as a part of a system security database. Each row would have the specific password associated with an individual user. Columns of the table often have a type associated with them, defining them as character data, date or time information, integers, or floating point numbers. This model is, incidentally, a basis of the spreadsheet.

Hierarchical model

In a hierarchical model, data is organized into a tree-like structure. Hierarchical structures were widely used in the early mainframe database management systems, such as the Information Management System (IMS) by IBM. Most desktop computers also employ a hierarchical file system. This structure allows one 1:N relationship between two types of data. This structure is very efficient to describe some of the relationships in the real world. However, the hierarchical structure is inappropriate in many cases and is inefficient for certain database operations.

Network model

The network model (defined by the CODASYL specification) organizes data using two fundamental constructs, called records and sets. Records contain fields (which may be organized hierarchically, as in COBOL). Sets (not to be confused with mathematical sets) define one-to-many relationships between records: one owner, many members. A record may be an owner in any number of sets, and a member in any number of sets.

The operations of the network model are navigational in style: a program maintains a current position, and navigates from one record to another by following the relationships in which the record participates. Records can also be located by supplying key values.

Although it is not an essential feature of the model, network databases generally implement the set relationships by means of pointers that directly address the location of a record on disk. This gives excellent retrieval performance, at the expense of operations such as database loading and reorganization.

Relational model

The relational model was introduced in an academic paper by E. F. Codd in 1970 as a way to make database management systems more independent of any particular application. It is a mathematical model defined in terms of predicate logic and set theory.

The products that are generally referred to as relational databases in fact implement a model that is only an approximation to the mathematical model defined by Codd. The data structures in these products are tables, rather than relations: the main differences being that tables can contain duplicate rows, and that the rows (and columns) can be treated as being ordered. The same criticism applies to the SQL language which is the primary interface to these products. There has been considerable controversy, mainly due to Codd himself, as to whether it is correct to describe SQL implementations as "relational": but the fact is that the world does so, and the following description uses the term in its popular sense.

A relational database contains multiple tables, each similar to the one in the "flat" database model. Relationships between tables are not defined explicitly; instead, keys are used to match up rows of data in different tables. A key is a collection of one or more columns in one table whose values match corresponding columns in other tables: for example, an Employee table may contain a column named Location which contains a value that matches the key of a Location table. Any column can be a key, or multiple columns can be grouped together into a single key. It is not necessary to define all the keys in advance; a column can be used as a key even if it was not originally intended to be one.

A key that can be used to uniquely identify a row in a table is called a unique key. Typically one of the unique keys is the preferred way to refer to row; this is defined as the table's primary key.

A key that has an external, real-world meaning (such as a person's name, a book's ISBN, or a car's serial number), is sometimes called a "natural" key. If no natural key is suitable (think of the many people named Brown), an arbitrary key can be assigned (such as by giving employees ID numbers). In practice, most databases have both generated and natural keys, because generated keys can be used internally to create links between rows that cannot break, while natural keys can be used, less reliably, for searches and for integration with other databases. (For example, records in two independently developed databases could be matched up by social security number, except when the social security numbers are incorrect, missing, or have changed.)

Relational operations

Users (or programs) request data from a relational database by sending it a query that is written in a special language, usually a dialect of SQL. Although SQL was originally intended for end-users, it is much more common for SQL queries to be embedded into software that provides an easier user interface. Many web sites perform SQL queries when generating pages.

In response to a query, the database returns a result set, which is just a list of rows containing the answers. The simplest query is just to return all the rows from a table, but more often, the rows are filtered in some way to return just the answer wanted.

Often, data from multiple tables gets combined into one, by doing a join. Conceptually, this is done by taking all possible combinations of rows (the "cross-product"), and then filtering out everything except the answer. In practice, relational database management systems rewrite ("optimize") queries to perform faster, using a variety of techniques.

There are a number of relational operations in addition to join. These include project (the process of eliminating some of the columns), restrict (the process of eliminating some of the rows), union (a way of combining two tables with similar structures), difference (which lists the rows in one table that are not found in the other), intersect (which lists the rows found in both tables), and product (mentioned above, which combines each row of one table with each row of the other). Depending on which other sources you consult, there are a number of other operators - many of which can be defined in terms of those listed above. These include semi-join, outer operators such as outer join and outer union, and various forms of division. Then there are operators to rename columns, and summarizing or aggregating operators, and if you permit relation values as attributes (RVA - relation-valued attribute), then operators such as group and ungroup. The SELECT statement in SQL serves to handle all of these except for the group and ungroup operators.

The flexibility of relational databases allows programmers to write queries that were not anticipated by the database designers. As a result, relational databases can be used by multiple applications in ways the original designers did not foresee, which is especially important for databases that might be used for decades. This has made the idea and implementation of relational databases very popular with businesses.

Dimensional model

The dimensional model is a specialized adaptation of the relational model used to represent data in data warehouses in a way that data can be easily summarized using OLAP queries. In the dimensional model, a database consists of a single large table of facts that are described using dimensions and measures. A dimension provides the context of a fact (such as who participated, when and where it happened, and its type) and is used in queries to group related facts together. Dimensions tend to be discrete and are often hierarchical; for example, the location might include the building, state, and country. A measure is a quantity describing the fact, such as revenue. It's important that measures can be meaningfully aggregated - for example, the revenue from different locations can be added together.

In an OLAP query, dimensions are chosen and the facts are grouped and added together to create a summary.

The dimensional model is often implemented on top of the relational model using a star schema, consisting of one table containing the facts and surrounding tables containing the dimensions. Particularly complicated dimensions might be represented using multiple tables, resulting in a snowflake schema.

A data warehouse can contain multiple star schemas that share dimension tables, allowing them to be used together. Coming up with a standard set of dimensions is an important part of dimensional modeling.

Object database models

In recent years, the object-oriented paradigm has been applied to database technology, creating a new programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch) of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases.

A variety of these ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects manipulated by the program persistent. This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities.

Object databases suffered because of a lack of standardization: although standards were defined by ODMG, they were never implemented well enough to ensure interoperability between products. Nevertheless, object databases have been used successfully in many applications: usually specialized applications such as engineering databases or molecular biology databases rather than mainstream commercial data processing. However, object database ideas were picked up by the relational vendors and influenced extensions made to these products and indeed to the SQL language.

Links

Oracle Database Software Downloads : All software downloads are free, and each comes with a Development License that allows you to use full versions of the products only while developing and prototyping your applications. You can buy products with full-use licenses at any time from the online Store or from your sales representative. You must accept the OTN License Agreement to download this software. If you use the application you develop under this license for any internal data processing or for any commercial or production purposes, or you want to use the programs for any purpose other than as permitted under this agreement, you must obtain a production release version of the program by contacting us or an Oracle reseller to obtain the appropriate license.

Microsoft SQL Server 2012 Express Edition : Microsoft® SQL Server® 2012 Express is a powerful and reliable free data management system that delivers a rich and reliable data store for lightweight Web Sites and desktop applications. Designed for easy deployment and rapid prototyping, this download includes support for Sysprep, Microsoft's System Preparation Utility for Microsoft Windows operating system deployment.

ODMG Home Page: Shape the future of object storage! Whether your organization is engaged in database and tools development, consulting or deploying object technology, you can take advantage of this emerging market opportunity by joining the fast-growing ODMG, the standards organization for object storage. The ODMG submitted the ODMG Java Binding to the Java Community Process as a basis for the Java Data Objects Specification (JDO). A Public Review Draft of the the JDO specification is now available.

MySQL: MySQL is the world's most popular Open Source Database, designed for speed, power and precision in mission critical, heavy load use. MySQL AB is the company owned by the MySQL founders. We have made our product available at zero price under the GNU General Public License (GPL), and we also sell it under a commercial license to those who do not wish to be bound by the terms of the GPL.

Object-Oriented Database Management Systems: This page is intended to point to as much of the OODBMS-related material to be found out there as possible. You are encouraged to let me know about the many things I have missed. Emphasis will be on items that can be accessed in an electronic form. (Such sources are highlighted with icons below.)

Bitpipe: Bitpipe Inc. (http://www.bitpipe.com) is the leading syndicator of in-depth information technology (IT) content. Bitpipe distributes content from over 3,500 leading IT vendors and over 60 top analyst firms including Gartner, IDC, Aberdeen Group, Meta Group, and Yankee Group via the Bitpipe Network. The Bitpipe Network reaches over 21.7 million unique visitors

Oracle Corporation : Oracle Corporation (Nasdaq: ORCL) is the world's largest enterprise software company, providing enterprise software to the world's largest and most successful businesses. With annual revenues of more than $10.8 billion, the company offers its database, tools and application products, along with related consulting, education, and support services. Headquartered in Redwood Shores, California, Oracle is the first software company to develop and deploy 100 percent Internet-enabled enterprise software across its entire product line: database, server, enterprise business applications, and application development, and decision support tools.

Empress.com: Empress RDBMS is a full-featured database engine designed for embedded, real-time applications. Empress is the only database to provide total control to the developer delivering high-performance, deterministic data management. Empress, The Embedded Real-Time Database, is compact, agile and maintenance-free and is suited for embedded systems, real-time, communications, military & defense, process control and scientific & engineering applications. Empress runs on Unix, Linux, Windows and Real Time systems.

Sybase home page : We are the software integration company. We can help you integrate all the data and business applications in your enterprise and extend them to any location in the world. So you can improve efficiency while preserving your existing infrastructure investments. Click for more.

Intelligent Enterprise: Intelligent Enterprise is the only magazine dedicated to the strategic business applications that create intelligence from data. Intelligent Enterprise shows high level technology professionals how to plan, deploy, and manage the strategic applications their companies depend on to improve decision making, optimize business performance, and maximize the value of customer relationships.

About Databases: The About network consists of hundreds of Guide sites neatly organized into 23 channels. The sites cover more than 50,000 subjects with over 1 million links to the best resources on the Net and the fastest-growing archive of high quality original content. Topics range from pregnancy to cars, palm pilots to painting, weight loss to video game strategies. No one has greater depth and breadth than About.

Clash of the Titans: SQL Databases: For this story, we brought five database servers into PC Magazine Labs to evaluate their features and test their performance. We looked at IBM's DB2 Universal Database 7.2, Microsoft SQL Server 2000 SP2, MySQL 4.0, Oracle9i Database, and Sybase Adaptive Server Enterprise 12.5.

Data Warehousing Information Center: This site's aim is to help readers learn about data warehousing and decision support (i.e., business intelligence) systems. The site will: Publish this site's author's essays about data warehousing and decision support Point the reader to external publications - some of the better articles and white papers that are web accessible, books, technical evaluations, periodicals, and other non-vendor sources of information

searchDatabase.com: The mission of SearchDatabase.com is to be the leading Web resource for quality information about database software, and to be a vital community for database administrators, developers, and users. The idea is simple: Give the IT professional, whose success hinges on the efficiency and performance of a database, a place to find the information needed to excel. Whether it be technical advice on a specific project, keeping up-to-date with recent industry news, seeking a vendor partnership, being in-the-know on new products and technologies, or assistance on a career path, SearchDatabase.com is "one stop shopping" for the enterprise database professional

Database Debunking: is the web site that sets matters straight by telling the truth about database management.It is the forum for concepts, principles and methods and their practical implications that receive little, incorrect, or no coverage from the trade media and no consideration from vendors and industry pundits. It is dedicated to and intended for MIS professionals, application developers, managers, users-- experienced or novices -- academics and students who think for themselves, want to understand database management, rather than follow the prevailing "cookbook" approach, and who are interested in minimizing the severe costs imposed by mindless technology and marketing fads. The site is focused on database education -- as distinct from product-specific training -- and should be, therefore, useful, regardless of which DBMS software is used.

Database Systems & Logic Programming Bibliography: I am a lecturer ("akademischer Rat") at the Department of Computer Science of the University of Trier (Germany). My primary interests are database systems, information retrieval, digital libraries and electronic publishing. My main activities are to maintain the DBLP bibliography server and to edit The ACM SIGMOD Anthology. (Former activities: DB&LP Demos, Courses, LILOG-DB, ITL, ...)

ITToolBox Database (DB): ITtoolbox provides a knowledge network and support environment for the IT industry. As IT professionals or business decision-makers need information to complete a task or make a decision, they turn to ITtoolbox to quickly get specific answers to their unique questions. ITtoolbox maintains knowledge bases for each major segment of the IT industry. Each knowledge base contains information sourced from the user community, licensed from leading content providers, and linked to on the World Wide Web.

The Official Win32:ODBC Home Page: Here you will not learn how to program ODBC; you won't learn SQL; you will not learn how to program Perl and you will not learn how to create Perl 5 extensions. What you will find is help for those commonly asked questions that you may have regarding the Win32::ODBC extension. ODBC has gained so much momentum over the years that it is now considered to be the standard Call Level Interface (CLI) for many database engines. It has been ported to the most popular platforms (Win32, UNIX, Mac) and is so ubiquitous that you may be using it without knowing it.

1SQLStreet.com : 1SQLStreet has Over five million lines of free source code! And huge database of articles and interestng and useful information on databases. Ian Ippolito is the original creator of 1 SQL Street. It was first written with Microsoft Visual Interdev 1.0 and Microsoft Access. The scripting is done with a technology called Active Server Pages (ASP) and is done using VBScript. Currently it is running from Microsoft Visual Interdev 6.0 and Microsoft SQL Server 7.0.

DevX.com: Get Help is where you can network directly with other developers to get answers to your tough technical questions. You can browse through the categories below to read our library of questions and answers. You'll also find links to our "10-Minute Solution" how-to articles, as well as expert programming tips, tricks, and shortcuts!

SQL Standards: JCC's SQL Std. Page. This page is designed to be a central source of information about the SQL standards process and its current state. It will contain a number of pointers to other sources of information about the SQL standard. The information available here is:

Database Applications from ZDNet.com: ZDNet, www.zdnet.com, operates a worldwide network of web sites that offer content, services, and commerce opportunities that enable IT professionals and business influencers to gain an edge in business.

ITWorld Database Management Systems: ITworld.com is an online publisher of news, educational content and broadband interactive media for the IT community. As a producer of daily IT news, dozens of topic specific newsletters, leading IT white papers, and a wide variety of webcast programs, ITworld delivers measurable, performance-based results for marketers by building customized lead generation programs around the distribution of content.

Share This


Suggestions for Further Reading

Attribution: Information for this article has been adapted from wikipedia under GNLU licence.