Skip to main content

System Status: 

Data Warehouse Terms

See a list of commonly used data warehouse, database and UCSD specific terms.

If you notice any omissions or mistakes, please tell us.

Quickjump : A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

- A -

additive fact

Additive facts are measurements in a fact table that can be added across all of the dimensions.


An alias is a short substitute or nickname for a table name.


American Standard Code for Information Interchange (ASCII) is an 8-bit code for character representation, including 7 bits plus parity. The code represents English characters as numbers, with each letter assigned a number from 0 to 127. Most computers use ASCII codes to represent text, which makes it possible to transfer data from one computer to another.


An attribute field is represented by a column within an object (entity). An object may be a table, view, or report.
- B -

back-end tool

A back-end tool is a software application, typically resident on both the client and the server, that assists in the production data extract process. See front-end tool for comparison.


A browser is a software application used to locate and display Web pages. The 2 most commonly used browsers are Netscape Navigator and Microsoft Internet Explorer.
- C -


A client is a software application on your computer, used to extract or download some application, data, or service from a host system.

client/ server architecture

Client/ server architecture is a networked environment in which a smaller system such as a PC interacts with a larger, faster system. The processing is performed on the larger system which frees the PC. The larger system can connect and disconnect from the clients to process the data more efficiently.

conforming dimensions

Conforming dimensions are dimensions with exactly the same set of primary keys and same number of records. Two conformed dimensions can be combined into a single dimension by creating a union of the attributes.
- D -

Data Directory

The Data Directory, a component of ACT's DataLink, provides descriptions of all databases, tables, and fields used in DARWIN. The Data Directory contains a physical data model (also called a schema or entity-relationship diagram) for each subject area in DARWIN and a subset of the SQL-DSE. You can use the search option to find all structures containing the data field of interest.
DataLink is a Web-based tool, developed by UCSD's Data Warehouse team, that describes all databases, tables, and fields used in DARWIN. DataLink has information on overall data, available SQL queries, and data refresh history.


Data Repository and Windows to Information (DARWIN) is UCSD's primary data environment, running on a Sun/ Sybase server, with full data directory support, integrity checks, and training classes.

data mart

A data mart is a subject-oriented subset of data classically designed for ease of use.

degenerate dimension

A digenerate dimension is a dimension key, such as an invoice number or a ticket number, that has no attributes and hence has no actual dimension table.


Denormalize means to allow redundancy in a table so that the table can remain flat.

dimensional modeling

Dimensional modeling is an accepted practice that the data warehouse industry uses to structure data intended for user access, analysis, and reporting in dimensional data models. The models are specifically designed to meet the twin goals of ease-of-use and performance. Dimensional modeling for data warehousing is now an industry standard. The technique makes retrieving data faster and understanding it easier, because the models are developed based on how the business sees the data.
- E -


An entity is a database object such as a table, view, report, or screen.

entity relation model

An entity relation model is a model of an organization's data in which the objective was to remove all repeated values by creating more tables.
- F -

fact table

A fact table is the central table in a star join schema characterized by a composite key, each of whose elements is a foreign key drawn from a dimension table.


FAQ refers to an Internet page of answers to Frequently Asked Questions. FAQs are compiled by subject matter, and are valuable resources for learning about particular subjects.

foreign key

A foreign key is a field in a relational database table whose values are drawn from the values of a primary key in another table.

front-end tool

A front-end tool is a client tool that fetches or manipulates data stored on a relational database. See back-end tool for comparison.


File Transfer Protocol (FTP) is the most common way to transfer files from one computer to another.
- G -


A gigabyte equals 1 billion bytes.


Graphical Query Language (GQL) is a software product designed by Andyne, used to access information stored in a SQL database. With GQL, you may query the data, download data into your own environment, and generate reports.


GUI (pronounced gooey) stands for Graphical User Interface, a program interface that takes advantage of the computer's graphics capabilities to make the program easier to use.
- H -


HyperText Markup Language (HTML), a subset of Standard Generalized Markup Language (SGML), is the authoring language used to create Web documents. Tags or elements tell the browser how to display the information. The tags are used to define the structure and layout of a Web document in a hierarchical format.


The HyperText Transfer Protocol (HTTP)is a fixed set of commands used during a hypertext link between a client and server.
A hyperlink is an element in an electronic document that links to a related piece of information in the same document or in a different document. You can quickly jump from one document or location to another transparently.


A hypertext document is a combination of text, links to other documents, sound, graphic images, and film clips.
- I -


Integrated Financial Information Systems (IFIS)is the central campus financial system that processes accounts payable, budget and staffing, chart of accounts, travel, purchasing, general accounting, and express order systems.


An index is a link between 1 table and another for rapid access to the rows of a table based on the values of 1 or more columns in another table.


The Internet is a global network linked together through low level protocols.

IP address

Every device on a network is assigned an IP (Internet Protocol) address that is used for communicating from device to device. The format of an IP address is a 32-bit numeric address written as 4 numbers separated by periods.


Integrated Student Information Systems (ISIS) is the central campus student system that processes academic history, accounts receivable, admissions, course catalog/schedue, facilities, grades, housing, registration, and student aid management systems.
- J -
- K -
- L -

local-area network

A local-area network (LAN) is a grouping of devices such as PC's, fax machines, and printers, that are physically connected by ethernet wiring within a fairly limited location.
- M -

many-to-many relationship

A many-to-many relationship is a logical data relationship in which the value of 1 data element can exist in combination with many values of another data element and vice versa.

meta data

Meta data figuratively means "data about data." For a breakdown of the kinds of meta data in the Data Warehouse, see the glossary definitions for Data Directory as well as DataLink.
- N -


Newsgroups are online discussion groups that enable the exchange of ideas by posting messages. Newsgroups are organized by subject or interest.

nonadditive fact

A nonadditive fact is one that cannot logically be added between records.


Normalize is the process of removing redundancy in data by separating the data into multiple tables.


A null value is a trigger to let you know that the value for that row is either missing, unknown, not yet known, or inapplicable. Placing a zero in the row would not reflect the accurate state of the row, because zero is a value. This way you can search for missing data. SQL supports the use of null values.
- O -


Open DataBase Connectivity (ODBC) is an interface protocol used to access data in a relational database environment. Because ODBC is vendor neutral it allows access to data from several servers using just one application, including the DB-Lib protocol portion of the interface between the client and the server. ODBC is Microsoft's interface layer to various databases. When comparing ODBC with Open Client, ODBC would effectively replace the DB-Lib portion of Sybases' Open Client.


OnLine Analytic Processing (OLAP) is a loosely defined set of software tools that provides a dimensional framework for decision support. The term OLAP also defines a confederation of vendors who offer non-relational, proprietary products aimed at decision support.

Open Client

Open Client is the Sybase connectivity software used to establish a connection between various client applications and Sybase's SQL Server software. On Intel platforms, Sybase has split Open Client into the combination of Net-Lib and DB-Lib. The Net-Lib portion establishes the connection and the DB-Lib poriton is specific to the interface (e.g., character verses graphics).


The object owner is a user or users who have authority over that object, where object is a table, view, or attribute.

outer join

An outer join is the merger of 2 SQL answer sets of data, making row headers the union of all values in the 2 sets of data.
- P -

physical design

Physical design is the phase of a database design following the logical design. It identifies the actual database tables and index structures used to implement the logical design.


Payroll Personnel System (PPS) comprises computer-based processes and applications that support the administration of employee-related information for the UCSD Campus and Medical Center. The system is maintained, in the form of a base, or standard version of the system, by the University of California, Office of the President (UCOP). In this role, UCOP could be viewed as the vendor of the system. UCSD maintains and operates its own local, modified versions of the standard system on its own platform.

primary key

A primary key is a column or combination of columns whose values uniquely identify a row or record in the table. The primary key(s) have a unique value for each record or row in the table.


A protocol is a formal description of message formats and the rules 2 or more devices must follow to exchange data.
- Q -


QueryLink is a Web-based tool for easy access to Data Warehouse information without knowing a programming language.
- R -


Relationships in a relational database are represented by common data values stored in the 2 tables. With this type of relationship, you can easily retrieve related data from the database by manipulating the relationships.

relational database

A relational database is a system that supports the full range of standard SQL. It is entirely composed of tables. Each column is a particular kind of data and each row is a unique instance of that data. Each row is uniquely identified by a primary key. Sybase is the relational database used by UCSD's Data Warehouse.
- S -


A schema is the logical and physical definition of data elements, physical charateristics, and interrelationships.


A server is a network computer where the server software resides. Servers provide such services such as filing, mail, communications, and authentication . For example, a file server is a computer and storage device dedicated to storing files. Any user on the network can store files on the server.

slice and dice

Slice and dice is the standard description of accessing data equally through any of its dimensions.

slowly changing dimensions

Slowly changing dimensions reflect the tendency of dimension records to change gradually or occasionally over time.

snowflake schema

A snowflake schema is a set of tables comprised of a single, central fact table surrounded by normalized dimensions.

star schema

A star schema is a set of tables comprised of a single, central fact table surrounded by de-normalized dimensions.


Structured Query Language for Departmental Support Environment (SQL-DSE) is a data environment that will eventually be integrated where feasible into the DARWIN environment. Currently the SQL-DSE environment does not have the same level of integrity checks, training classes, or meta data as DARWIN.

Structured Query Language (SQL)

Structured Query Language is pronounced "sequel" or "S.Q.L." Used with relational databases, it allows users to define the structure and organization of stored data, verify and maintain data integrity, control access to the data, and define relationships among the stored data items. A major function of SQL is retrieval of data from the database by a user or an application program. A user or an application program can update the database (add, delete, or modify data). SQL is a fourth generation language which lets you tell the computer what data you want without telling the computer how to get it.
- T -


Transmission Control Protocol/Internet Protocol (TCP/IP)enables a connection to transmit data across the Internet.


Telnet is a protocol that allows you to remotely interact with another device. You can login to a remote device and simulate a direct connect. When you login to the mainframe and access ISIS, IFIS, PPS, or DSE, you are connecting through Telnet.


A terabyte equals one trillion bytes.


Tagged Image File Format (TIFF) is a graphic file format used to transfer graphic images across the Internet.

twinkling database

In a twinkling database, the data you're attempting to query is constantly changing.
- U -


Uniform Resource Locator (URL) is the path information in an HTML-coded source file used to locate another document or image. The format for the URL is:
- V -


A view is a database component that behaves just like a table but has no independent existence of its own.
- W -

World Wide Web

The World Wide Web is a hypermedia application used for access of data over the Internet. The WWW is based on the HTML standard of marking up documents.
- X -
- Y -
- Z -