Understanding Relational Tables
MySQL is a relational database. This means that data records can be split across several tables in a logical way. You will learn how to design a relational database in Lesson 16, “Designing Your Database,” so for now, you’ll consider a simple example from the sample tables that you have already seen.
A relationship exists between the customers and customer_contacts tables. There might be more than one contact person for each customer, assuming that a customer is a company and that you can deal with people in different departments. By splitting this information into two tables, you avoid duplicating data that is always the same for contacts within the same company.
By not duplicating columns, you not only reduce the amount of storage required for the database, but you also eliminate the possibility of data inconsistency that might otherwise exist. For instance, when a company’s details are stored in just one record in the customers table instead of in every row in the customer_contacts table, there is no danger that the customer data will be inconsistent from one record to another.
To keep the examples in this book simple, the sample tables contain a much smaller amount of information than you would find in a real-world database. Still, the concept is the same. The sample table customers contains a name field, which is the name of the customer. In a live database, this table would also contain a corporate address and any other information that is the same for a customerand is the same for all the contact persons for that customer.
The customer_contacts table is used to store multiple contacts for a customer. The customer_code field in this table acts as a link back to the customer data. The value of contacts.customer_code corresponds to the customers.customer_code value and is the key field used when joining these tables.
The customer_code column in customers is a primary key field. It contains a unique value that can be used to identify exactly one record from the table. The column with the same name in customer_contacts is known as a foreign key. You will learn more about keys in Lesson 17, “Keys and Indexes.”
Joining Two Tables
To join two tables, give both their names in the FROM clause of a SELECT statement. To indicate the relationship between the two tables, you must include an appropriate condition in the WHERE clause.
The following query joins the customers and customer_contacts tables to produce the complete contact information from the sample database. Because this query retrieves the entire contacts database, there is only one WHERE condition; this tells MySQL that the relationship between the tables is that the customer_code value is a reference to the customer_code column in customers.
mysql> SELECT name, -> CONCAT (last_name, ', ', first_name) as contact_name -> FROM customers, customer_contacts -> WHERE customers.customer_code = -> customer_contacts.customer_code -> ORDER BY name, contact_name; +-------------------------+---------------------+ | name | contact_name | +-------------------------+---------------------+ | Musicians of America | Britten, Benjamin | | Musicians of America | Gershwin, George | | Musicians of America | Lennon, John | | Presidents Incorporated | Lincoln, Abraham | | Presidents Incorporated | Nixon, Richard | | Presidents Incorporated | Roosevelt, Franklin | | Presidents Incorporated | Roosevelt, Theodore | | Science Corporation | Curie, Marie | | Science Corporation | Darwin, Charles | | Science Corporation | Einstein, Albert | | Science Corporation | Franklin, Benjamin | +-------------------------+---------------------+ 11 rows in set (0.02 sec)
Column Naming The sample tables in this book have
used a naming convention in which each foreign key has the same name as the corresponding primary key in the other table.
However, this is not a requirement. Another popular convention is to name the primary key in each table as simply id or code. For example, the customer_code value in customer_contacts would relate to a column named code in customers.
The condition in the WHERE clause that indicates the relationship between joined tables is very important. To see why, look at what happens when you specify two tables in the FROM clause without any conditions.
Two small tables are included in the sample database, named t1 and t2. These tables contain just a few rows to demonstrate joins.First, familiarize yourself with their data:
mysql> SELECT * FROM t1; +----+--------+ | id | letter | +----+--------+ | 1 | A | | 2 | B | | 3 | C | +----+--------+ 3 rows in set (0.01 sec) mysql> SELECT * FROM t2; +----+--------+ | id | letter | +----+--------+ | 1 | X | | 2 | Y | | 3 | Z | +----+--------+ 3 rows in set (0.00 sec)
Now perform a join on the two tables, with no condition specified.
mysql> SELECT * -> FROM t1, t2; +----+--------+----+--------+ | id | letter | id | letter | +----+---- ---+----+--------+ | 1 | A | 1 | X | | 2 | B | 1 | X | | 3 | C | 1 | X | | 1 | A | 2 | Y | | 2 | B | 2 | Y | | 3 | C | 2 | Y | | 1 | A | 3 | Z | | 2 | B | 3 | Z | | 3 | C | 3 | Z | +----+--------+----+--------+ 9 rows in set (0.00 sec)
As you can see, each row from t1 is combined with each row from t2. Three rows in each table produce a join, resulting in a total of nine pairs of records. This is known as a Cartesian product, and it is rarely the result you would want from a query.
Notice how the SELECT * notation in the previous example caused every column from both tables to be returned. As a result, two columns named id and two named letter were returned. These columns belong to the tables in the order given in the FROM clause.
The total number of rows returned is the product of the number of rows in each table. You can imagine how large the resulting data set can grow when the individual tables contain a large number of rows themselves.
A Cartesian product is sometimes known as a cross join.
Adding a WHERE condition to this join instructs MySQL to filter out the pairs of records for which there is no relationship between the data. This is the majority of the rows that would be returned in a Cartesian product. Assuming a relationship between the id column in each table, the join then produces the following result:
mysql> SELECT * -> FROM t1, t2 -> WHERE t1.id = t2.id; +----+--------+----+--------+ | id | letter | id | letter | +----+--------+----+--------+ | 1 | A | 1 | X | | 2 | B | 2 | Y | | 3 | C | 3 | Z | +----+--------+----+--------+ 3 rows in set (0.00 sec)
Take a moment to look back at the result of the Cartesian product, and you will see that the three rows returned by the previous example appear, but the other rows have been discarded.
A join that involves a condition in which columns in two tables are specified as equal is known as an equijoin or inner join. You will learn about other types of joins in Lesson 12, “Creating Advanced Joins.”
Even if your table definition has a foreign key defined (you will learn how to do this in Lesson 17), you must include a WHERE clause that specifies the join condition. The relationship to be used for a join is never stored at the database level.
Joining Multiple Tables
You can join more than two tables in one query by specifying all the table names in the FROM clause. The following example uses the relationships among orders, order_lines, and products tables to produce an order history for a particular customer:
mysql> SELECT o.order_id, o.order_date, l.quantity, p.name -> FROM orders o, order_lines l, products p -> WHERE o.order_id = l.order_id -> AND p.product_code = l.product_code -> AND o.customer_code = 'SCICORP' -> ORDER BY o.order_date; +----------+-------------+----------+------------------+ | order_id | order_date | quantity | name | +----------+-------------+----------+------------------+ | 3 | 2006-01-23 | 16 | Small product | | 4 | 2006-02-02 | 16 | Small product | | 4 | 2006-02-02 | 10 | Large product | | 5 | 2006-02-05 | 10 | Medium product | | 5 | 2006-02-05 | 10 | Large product | +----------+-------------+----------+------------------+ 5 rows in set (0.00 sec)
To perform an equijoin on multiple tables, a condition in the WHERE clause must specify a relationship from every table to another table. In this example, there are relationships between orders and order_lines, and between order_lines and products.
Each row in order_lines adds the specified quantity of a product to an order. The order_id column in order_lines joins that table to orders, and the product_code column joins it to products. The first two conditions in the WHERE clause indicate these relationships. The third filter is used to return only the order history for SCICORP.
In the previous example, because there is no direct relationship between orders and products, no condition in the WHERE clause tries to link these tables.
Although each table joined should have a relationship to another table in the query, every table does not have to beand usually will not berelated to every other table.