A database is a collection of records stored in a computer in a systematic way, so
that a computer program can consult it to answer questions. For better retrieval and sorting, each record is usually organized as a set of data elements (facts). The items retrieved in answer to queries become information that can be used to make decisions. The computer program used to manage and query a database is known as a database management system (DBMS). The properties and design of database systems are included in the study of information science.
The central concept of a database is that of a collection of records, or pieces of knowledge. Typically, for a given database, there is a structural description of the type of facts held in that database: this description is known as a schema. The schema describes the objects that are represented in the database, and the relationships among them. There are a number of different ways of organizing a schema, that is, of modeling the database structure: these are known as database models (or data models). The model in most common use today is the relational model, which in layman's terms represents all information in the form of multiple related tables each consisting of rows and columns (the true definition uses mathematical terminology). This model represents relationships by the use of values common to more than one table. Other models such as the hierarchical model and the network model use a more explicit representation of relationships.
Strictly speaking, the term database refers to the collection of related records, and the software should be referred to as the database management system or DBMS. When the context is unambiguous, however, many database administrators and programmers use the term database to cover both meanings.
Various techniques are used to model data structure. Most database systems are built around one particular data model, although it is increasingly common for products to offer support for more than one model. For any one logical model various physical implementations may be possible, and most products will offer the user some level of control in tuning the physical implementation, since the choices that are made have a significant effect on performance. An example of this is the relational model: all serious implementations of the relational model allow the creation of indexes which provide fast access to rows in a table if the values of certain columns are known.
A data model is not just a way of structuring data: it also defines a set of operations that can be performed on the data. The relational model, for example, defines operations such as selection, projection, and join. Although these operations may not be explicit in a particular query language, they provide the foundation on which a query language is built.
This may not strictly qualify as a data model, as defined above.
The flat (or table) model consists of a single, two-dimensional array of data elements, where all members of a given column are assumed to be similar values, and all members of a row are assumed to be related to one another. For instance, columns for name and password that might be used as a part of a system security database. Each row would have the specific password associated with an individual user. Columns of the table often have a type associated with them, defining them as character data, date or time information, integers, or floating point numbers. This model is, incidentally, a basis of the
In a hierarchical model, data is organized into a tree-like structure. Hierarchical structures were widely used in the early mainframe database management systems, such as the Information Management System (IMS) by IBM. Most desktop computers also employ a hierarchical file system. This structure allows one 1:N relationship between two types of data. This structure is very efficient to describe some of the relationships in the real world. However, the hierarchical structure is inappropriate in many cases and is inefficient for certain database operations.
The network model (defined by the CODASYL specification) organizes data using two fundamental constructs, called records and sets. Records contain fields (which may be organized hierarchically, as in COBOL). Sets (not to be confused with mathematical sets) define one-to-many relationships between records: one owner, many members. A record may be an owner in any number of sets, and a member in any number of sets.
The operations of the network model are navigational in style: a program maintains a current position, and navigates from one record to another by following the relationships in which the record participates. Records can also be located by supplying key values.
Although it is not an essential feature of the model, network databases generally implement the set relationships by means of pointers that directly address the location of a record on disk. This gives excellent retrieval performance, at the expense of operations such as database loading and
The relational model was introduced in an academic paper by E. F. Codd in 1970 as a way to make database management systems more independent of any particular application. It is a mathematical model defined in terms of predicate logic and set theory.
The products that are generally referred to as relational databases in fact implement a model that is only an approximation to the mathematical model defined by Codd. The data structures in these products are tables, rather than relations: the main differences being that tables can contain duplicate rows, and that the rows (and columns) can be treated as being ordered. The same criticism applies to the SQL language which is the primary interface to these products. There has been considerable controversy, mainly due to Codd himself, as to whether it is correct to describe SQL implementations as "relational": but the fact is that the world does so, and the following description uses the term in its popular sense.
A relational database contains multiple tables, each similar to the one in the "flat" database model. Relationships between tables are not defined explicitly; instead, keys are used to match up rows of data in different tables. A key is a collection of one or more columns in one table whose values match corresponding columns in other tables: for example, an Employee table may contain a column named Location which contains a value that matches the key of a Location table. Any column can be a key, or multiple columns can be grouped together into a single key. It is not necessary to define all the keys in advance; a column can be used as a key even if it was not originally intended to be one.
A key that can be used to uniquely identify a row in a table is called a unique key. Typically one of the unique keys is the preferred way to refer to row; this is defined as the table's primary key.
A key that has an external, real-world meaning (such as a person's name, a book's ISBN, or a car's serial number), is sometimes called a "natural" key. If no natural key is suitable (think of the many people named Brown), an arbitrary key can be assigned (such as by giving employees ID numbers). In practice, most databases have both generated and natural keys, because generated keys can be used internally to create links between rows that cannot break, while natural keys can be used, less reliably, for searches and for integration with other databases. (For example, records in two independently developed databases could be matched up by social security number, except when the social security numbers are incorrect, missing, or have changed.)
Users (or programs) request data from a relational database by sending it a query that is written in a special language, usually a dialect of SQL. Although SQL was originally intended for end-users, it is much more common for SQL queries to be embedded into software that provides an easier user interface. Many web sites perform SQL queries when generating pages.
In response to a query, the database returns a result set, which is just a list of rows containing the answers. The simplest query is just to return all the rows from a table, but more often, the rows are filtered in some way to return just the answer wanted.
Often, data from multiple tables gets combined into one, by doing a join. Conceptually, this is done by taking all possible combinations of rows (the "cross-product"), and then filtering out everything except the answer. In practice, relational database management systems rewrite ("optimize") queries to perform faster, using a variety of techniques.
There are a number of relational operations in addition to join. These include project (the process of eliminating some of the columns), restrict (the process of eliminating some of the rows), union (a way of combining two tables with similar structures), difference (which lists the rows in one table that are not found in the other), intersect (which lists the rows found in both tables), and product (mentioned above, which combines each row of one table with each row of the other). Depending on which other sources you consult, there are a number of other operators - many of which can be defined in terms of those listed above. These include semi-join, outer operators such as outer join and outer union, and various forms of division. Then there are operators to rename columns, and summarizing or aggregating operators, and if you permit relation values as attributes (RVA - relation-valued attribute), then operators such as group and ungroup. The SELECT statement in SQL serves to handle all of these except for the group and ungroup operators.
The flexibility of relational databases allows programmers to write queries that were not anticipated by the database designers. As a result, relational databases can be used by multiple applications in ways the original designers did not foresee, which is especially important for databases that might be used for decades. This has made the idea and implementation of relational databases very popular with businesses.
The dimensional model is a specialized adaptation of the relational model used to represent data in data warehouses in a way that data can be easily summarized using OLAP queries. In the dimensional model, a database consists of a single large table of facts that are described using dimensions and measures. A dimension provides the context of a fact (such as who participated, when and where it happened, and its type) and is used in queries to group related facts together. Dimensions tend to be discrete and are often hierarchical; for example, the location might include the building, state, and country. A measure is a quantity describing the fact, such as revenue. It's important that measures can be meaningfully aggregated - for example, the revenue from different locations can be added together.
In an OLAP query, dimensions are chosen and the facts are grouped and added together to create a summary.
The dimensional model is often implemented on top of the relational model using a star schema, consisting of one table containing the facts and surrounding tables containing the dimensions. Particularly complicated dimensions might be represented using multiple tables, resulting in a snowflake schema.
A data warehouse can contain multiple star schemas that share dimension tables, allowing them to be used together. Coming up with a standard set of dimensions is an important part of dimensional modeling.
Object database models
In recent years, the object-oriented paradigm has been applied to database technology, creating a new programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch) of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases.
A variety of these ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects manipulated by the program persistent. This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities.
Object databases suffered because of a lack of standardization: although standards were defined by ODMG, they were never implemented well enough to ensure interoperability between products. Nevertheless, object databases have been used successfully in many applications: usually specialized applications such as engineering databases or molecular biology databases rather than mainstream commercial data processing. However, object database ideas were picked up by the relational vendors and influenced extensions made to these products and indeed to the SQL language.
Links to Database Studies
Database 10g Downloads : All software downloads are free, and each comes
with a Development License that allows you to use full versions of the products
only while developing and prototyping your applications. You can buy products
with full-use licenses at any time from the online Store or from your sales
Server 2005 Today: You can now try out SQL Server 2005 in two waysregister
to download SQL Server 2005 for a 180-day trial, or download the SQL Server 2005
Express Edition. The fully functional trial software offers you the opportunity
to experience the features and functionality of SQL Server 2005. The trial
automatically expires 180 days after you install it on your servers.
Home Page: Shape the future of object storage! Whether your organization
is engaged in database and tools development, consulting or deploying object
technology, you can take advantage of this emerging market opportunity by
joining the fast-growing ODMG, the standards organization for object storage.
The ODMG submitted the ODMG Java Binding to the Java Community Process as a
basis for the Java Data Objects Specification (JDO). A Public Review Draft
of the the JDO specification is now available.
MySQL is the world's most popular Open Source Database, designed for speed,
power and precision in mission critical, heavy load use. MySQL AB is the company
owned by the MySQL founders. We have made our product available at zero price
under the GNU General Public License (GPL), and we also sell it under a
commercial license to those who do not wish to be bound by the terms of the GPL.
Database Management Systems: This page is intended to point to as much
of the OODBMS-related material to be found out there as possible. You are
encouraged to let me know about the many things I have missed. Emphasis will be
on items that can be accessed in an electronic form. (Such sources are
highlighted with icons below.)
Bitpipe Inc. (http://www.bitpipe.com) is the leading syndicator of in-depth
information technology (IT) content. Bitpipe distributes content from over 3,500
leading IT vendors and over 60 top analyst firms including Gartner, IDC,
Aberdeen Group, Meta Group, and Yankee Group via the Bitpipe Network. The
Bitpipe Network reaches over 21.7 million unique visitors
Corporation : Oracle Corporation (Nasdaq: ORCL) is the world's largest
enterprise software company, providing enterprise software to the world's
largest and most successful businesses. With annual revenues of more than $10.8
billion, the company offers its database, tools and application products, along
with related consulting, education, and support services. Headquartered in
Redwood Shores, California, Oracle is the first software company to develop and
deploy 100 percent Internet-enabled enterprise software across its entire
product line: database, server, enterprise business applications, and
application development, and decision support tools.
Empress RDBMS is a full-featured database engine designed for embedded,
real-time applications. Empress is the only database to provide total control to
the developer delivering high-performance, deterministic data management.
Empress, The Embedded Real-Time Database, is compact, agile and maintenance-free
and is suited for embedded systems, real-time, communications, military &
defense, process control and scientific & engineering applications. Empress
runs on Unix, Linux, Windows and Real Time systems.
home page : We are the software integration company. We can help you
integrate all the data and business applications in your enterprise and extend
them to any location in the world. So you can improve efficiency while
preserving your existing infrastructure investments. Click for more.
Enterprise: Intelligent Enterprise is the only magazine dedicated to the
strategic business applications that create intelligence from data. Intelligent
Enterprise shows high level technology professionals how to plan, deploy, and
manage the strategic applications their companies depend on to improve decision
making, optimize business performance, and maximize the value of customer
Databases: The About network consists of hundreds of Guide sites
neatly organized into 23 channels. The sites cover more than 50,000 subjects
with over 1 million links to the best resources on the Net and the
fastest-growing archive of high quality original content. Topics range from
pregnancy to cars, palm pilots to painting, weight loss to video game
strategies. No one has greater depth and breadth than About.
of the Titans: SQL Databases: For this story, we brought five database
servers into PC Magazine Labs to evaluate their features and test their
performance. We looked at IBM's DB2 Universal Database 7.2, Microsoft SQL Server
2000 SP2, MySQL 4.0, Oracle9i Database, and Sybase Adaptive Server Enterprise
Warehousing Information Center: This site's aim is to help readers learn
about data warehousing and decision support (i.e., business intelligence)
systems. The site will: Publish this site's author's essays about data
warehousing and decision support Point the reader to external publications -
some of the better articles and white papers that are web accessible, books,
technical evaluations, periodicals, and other non-vendor sources of information
The mission of SearchDatabase.com is to be the leading Web resource for quality
information about database software, and to be a vital community for database
administrators, developers, and users. The idea is simple: Give the IT
professional, whose success hinges on the efficiency and performance of a
database, a place to find the information needed to excel. Whether it be
technical advice on a specific project, keeping up-to-date with recent industry
news, seeking a vendor partnership, being in-the-know on new products and
technologies, or assistance on a career path, SearchDatabase.com is "one
stop shopping" for the enterprise database professional
Debunking: is the web site that sets matters straight by telling the truth
about database management.It is the forum for concepts, principles and methods
and their practical implications that receive little, incorrect, or no coverage
from the trade media and no consideration from vendors and industry pundits. It
is dedicated to and intended for MIS professionals, application developers,
managers, users-- experienced or novices -- academics and students who think for
themselves, want to understand database management, rather than follow the
prevailing "cookbook" approach, and who are interested in minimizing
the severe costs imposed by mindless technology and marketing fads. The site is
focused on database education -- as distinct from product-specific training --
and should be, therefore, useful, regardless of which DBMS software is used.
Systems & Logic Programming Bibliography: I am a lecturer ("akademischer
Rat") at the Department of Computer Science of the University of Trier
(Germany). My primary interests are database systems, information retrieval,
digital libraries and electronic publishing. My main activities are to maintain
the DBLP bibliography server and to edit The ACM SIGMOD Anthology. (Former
activities: DB&LP Demos, Courses, LILOG-DB, ITL, ...)
Database (DB): ITtoolbox provides a knowledge network and support
environment for the IT industry. As IT professionals or business decision-makers
need information to complete a task or make a decision, they turn to ITtoolbox
to quickly get specific answers to their unique questions. ITtoolbox maintains
knowledge bases for each major segment of the IT industry. Each knowledge base
contains information sourced from the user community, licensed from leading
content providers, and linked to on the World Wide Web.
Official Win32:ODBC Home Page: Here you will not learn how to program ODBC;
you won't learn SQL; you will not learn how to program Perl and you will not
learn how to create Perl 5 extensions. What you will find is help for those
commonly asked questions that you may have regarding the Win32::ODBC extension.
ODBC has gained so much momentum over the years that it is now considered to be
the standard Call Level Interface (CLI) for many database engines. It has been
ported to the most popular platforms (Win32, UNIX, Mac) and is so ubiquitous
that you may be using it without knowing it.
: 1SQLStreet has Over five million lines of free source code! And huge database
of articles and interestng and useful information on databases. Ian Ippolito is
the original creator of 1 SQL Street. It was first written with Microsoft Visual
Interdev 1.0 and Microsoft Access. The scripting is done with a technology
called Active Server Pages (ASP) and is done using VBScript. Currently it is
running from Microsoft Visual Interdev 6.0 and Microsoft SQL Server 7.0.
Get Help is where you can network directly with other developers to get answers
to your tough technical questions. You can browse through the categories below
to read our library of questions and answers. You'll also find links to our
"10-Minute Solution" how-to articles, as well as expert programming
tips, tricks, and shortcuts!
Our goal at Devdex is to keep you informed of all the latest Microsoft web
technologies. With the release of Windows 2000 and IIS 5.0 there is huge flood
of Articles, FAQ's, Examples, Books, Jobs and most importantly everyday
questions that are being answered on all the Usenet Newsgroups. Our index
collects nearly 500 new resources every day from the most popular ASP, SQL, VB,
and XML web sites including 13 Usenet Newsgroups.Every day we update our Job
categories with all the new job postings for the day. Our Index keeps jobs in
the system for 2 weeks then they are discarded.
Standards: JCC's SQL Std. Page. This page is designed to be a central source
of information about the SQL standards process and its current state. It will
contain a number of pointers to other sources of information about the SQL
standard. The information available here is:
Applications from ZDNet.com: ZDNet, www.zdnet.com, operates a worldwide
network of web sites that offer content, services, and commerce opportunities
that enable IT professionals and business influencers to gain an edge in
Database Management Systems: ITworld.com is an online publisher of news,
educational content and broadband interactive media for the IT community. As a
producer of daily IT news, dozens of topic specific newsletters, leading IT
white papers, and a wide variety of webcast programs, ITworld delivers
measurable, performance-based results for marketers by building customized lead
generation programs around the distribution of content.
Books on Database
links on Databases