SQL Database Interview Questions

SQL (Structured Query Language) is the standard language for managing and manipulating relational databases. Whether you are preparing for a junior or senior database role, mastering these 20 commonly asked SQL interview questions will help you build a solid foundation and ace your next interview.

1. What is SQL? What are the types of SQL commands?

SQL stands for Structured Query Language. It is used to communicate with relational databases to create, read, update, and delete data. SQL commands are grouped into five categories:

DDL (Data Definition Language) — CREATE, ALTER, DROP, TRUNCATE
DML (Data Manipulation Language) — SELECT, INSERT, UPDATE, DELETE
DCL (Data Control Language) — GRANT, REVOKE
TCL (Transaction Control Language) — COMMIT, ROLLBACK, SAVEPOINT
DQL (Data Query Language) — SELECT (sometimes categorized separately)

SQL Command Types

-- DDL (Data Definition Language)
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE
);

ALTER TABLE users ADD COLUMN age INT;
DROP TABLE users;
TRUNCATE TABLE users;

-- DML (Data Manipulation Language)
INSERT INTO users (name, email) VALUES ('John', 'john@example.com');
UPDATE users SET name = 'Jane' WHERE id = 1;
DELETE FROM users WHERE id = 1;

-- DCL (Data Control Language)
GRANT SELECT, INSERT ON users TO 'app_user'@'localhost';
REVOKE INSERT ON users FROM 'app_user'@'localhost';

-- TCL (Transaction Control Language)
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;  -- or ROLLBACK;

2. What is the difference between SQL and NoSQL?

SQL databases are relational (table-based) while NoSQL databases are non-relational and can store data in documents, key-value pairs, graphs, or wide-column formats.

Feature	SQL	NoSQL
Structure	Fixed schema, tables	Flexible schema, documents
Scalability	Vertical	Horizontal
ACID	Strong ACID compliance	Eventual consistency (usually)
Examples	MySQL, PostgreSQL, SQL Server	MongoDB, Redis, Cassandra

SQL vs NoSQL Examples

-- SQL: Relational Database (MySQL)
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100),
    email VARCHAR(255) UNIQUE
);

SELECT * FROM users WHERE name = 'John';

-- NoSQL: Document Database (MongoDB syntax for comparison)
-- db.users.insertOne({
--     name: "John",
--     email: "john@example.com",
--     hobbies: ["reading", "coding"],  -- flexible schema
--     address: { city: "Phnom Penh" }  -- nested documents
-- });
--
-- db.users.find({ name: "John" });

3. What is a Primary Key vs Foreign Key?

A Primary Key uniquely identifies each record in a table. It must be unique and cannot be NULL. A Foreign Key is a column that creates a relationship between two tables by referencing the primary key of another table.

Primary Key & Foreign Key

-- Primary Key: Uniquely identifies each row
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL
);

-- Foreign Key: References another table's primary key
CREATE TABLE orders (
    id INT PRIMARY KEY AUTO_INCREMENT,
    user_id INT NOT NULL,
    product VARCHAR(100),
    amount DECIMAL(10, 2),
    FOREIGN KEY (user_id) REFERENCES users(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

-- Composite Primary Key
CREATE TABLE order_items (
    order_id INT,
    product_id INT,
    quantity INT,
    PRIMARY KEY (order_id, product_id),
    FOREIGN KEY (order_id) REFERENCES orders(id),
    FOREIGN KEY (product_id) REFERENCES products(id)
);

Key difference: A table can have only one primary key but can have multiple foreign keys. Foreign keys enforce referential integrity between tables.

4. What are Constraints in SQL?

Constraints are rules enforced on data columns to ensure accuracy and reliability. The most common constraints are:

NOT NULL — Column cannot have NULL values
UNIQUE — All values in a column must be different
CHECK — Values must satisfy a specific condition
DEFAULT — Sets a default value when none is specified
PRIMARY KEY — Combination of NOT NULL and UNIQUE
FOREIGN KEY — References a primary key in another table

SQL Constraints

CREATE TABLE employees (
    id INT PRIMARY KEY AUTO_INCREMENT,

    -- NOT NULL: Cannot be empty
    name VARCHAR(100) NOT NULL,

    -- UNIQUE: No duplicate values allowed
    email VARCHAR(255) UNIQUE,

    -- CHECK: Must satisfy condition
    age INT CHECK (age >= 18 AND age <= 65),

    -- DEFAULT: Automatic value if not provided
    status VARCHAR(20) DEFAULT 'active',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,

    -- Combining constraints
    salary DECIMAL(10, 2) NOT NULL CHECK (salary > 0)
);

-- Adding constraints after table creation
ALTER TABLE employees
ADD CONSTRAINT chk_salary CHECK (salary >= 30000);

5. What is the SELECT statement? How do ORDER BY and LIMIT work?

SELECT retrieves data from one or more tables. ORDER BY sorts the results in ascending (ASC) or descending (DESC) order. LIMIT restricts the number of rows returned.

SELECT, ORDER BY, LIMIT

-- Basic SELECT
SELECT * FROM users;
SELECT name, email FROM users;

-- WHERE clause to filter rows
SELECT * FROM users WHERE age > 25;
SELECT * FROM users WHERE name LIKE 'J%';
SELECT * FROM users WHERE age BETWEEN 20 AND 30;
SELECT * FROM users WHERE status IN ('active', 'pending');

-- ORDER BY: Sort results
SELECT * FROM users ORDER BY name ASC;         -- A to Z
SELECT * FROM users ORDER BY created_at DESC;  -- newest first
SELECT * FROM users ORDER BY age DESC, name ASC; -- multi-column sort

-- LIMIT: Restrict number of rows
SELECT * FROM users LIMIT 10;           -- first 10 rows
SELECT * FROM users LIMIT 10 OFFSET 20; -- skip 20, get next 10 (pagination)

-- Combining everything
SELECT name, email, age
FROM users
WHERE status = 'active'
ORDER BY age DESC
LIMIT 5;

6. What are Aggregate Functions?

Aggregate functions perform a calculation on a set of values and return a single result. The most commonly used ones are COUNT, SUM, AVG, MIN, and MAX.

Aggregate Functions

-- COUNT: Number of rows
SELECT COUNT(*) AS total_users FROM users;
SELECT COUNT(DISTINCT department) AS dept_count FROM employees;

-- SUM: Total of a numeric column
SELECT SUM(salary) AS total_salary FROM employees;

-- AVG: Average value
SELECT AVG(salary) AS avg_salary FROM employees;

-- MIN / MAX: Smallest / Largest value
SELECT MIN(salary) AS lowest_salary FROM employees;
SELECT MAX(salary) AS highest_salary FROM employees;

-- Combining aggregate functions
SELECT
    COUNT(*) AS total_employees,
    SUM(salary) AS total_salary,
    AVG(salary) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary
FROM employees
WHERE status = 'active';

7. What is GROUP BY and HAVING?

GROUP BY groups rows that share the same values into summary rows. HAVING filters groups after aggregation (unlike WHERE, which filters rows before aggregation).

GROUP BY & HAVING

-- GROUP BY: Group rows by column values
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

-- GROUP BY with aggregate functions
SELECT
    department,
    COUNT(*) AS total,
    AVG(salary) AS avg_salary,
    MAX(salary) AS max_salary
FROM employees
GROUP BY department;

-- HAVING: Filter groups (after aggregation)
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;

-- WHERE vs HAVING
SELECT department, COUNT(*) AS count, AVG(salary) AS avg_salary
FROM employees
WHERE status = 'active'          -- filters rows BEFORE grouping
GROUP BY department
HAVING COUNT(*) > 5              -- filters groups AFTER grouping
ORDER BY avg_salary DESC;

WHERE vs HAVING: Use WHERE to filter individual rows before grouping. Use HAVING to filter groups after aggregation.

8. What are the different types of JOINs?

JOINs combine rows from two or more tables based on a related column. The main types are:

INNER JOIN — Returns only matching rows from both tables
LEFT JOIN — Returns all rows from the left table, with matches from the right
RIGHT JOIN — Returns all rows from the right table, with matches from the left
FULL OUTER JOIN — Returns all rows when there is a match in either table
CROSS JOIN — Returns the Cartesian product of both tables

Types of JOINs

-- Sample tables:
-- users (id, name, email)
-- orders (id, user_id, total)

-- INNER JOIN: Only matching rows from both tables
SELECT u.name, o.id AS order_id, o.total
FROM users u
INNER JOIN orders o ON u.id = o.user_id;

-- LEFT JOIN: All users, even without orders (NULLs for no match)
SELECT u.name, o.id AS order_id, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;

-- RIGHT JOIN: All orders, even without matching users
SELECT u.name, o.id AS order_id, o.total
FROM users u
RIGHT JOIN orders o ON u.id = o.user_id;

-- FULL OUTER JOIN: All rows from both tables
SELECT u.name, o.id AS order_id, o.total
FROM users u
FULL OUTER JOIN orders o ON u.id = o.user_id;

-- CROSS JOIN: Every combination (Cartesian product)
SELECT u.name, p.product_name
FROM users u
CROSS JOIN products p;

-- Multiple JOINs
SELECT u.name, o.id, p.product_name, oi.quantity
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id;

9. What are Subqueries?

A subquery is a query nested inside another query. There are three types: Scalar (returns a single value), Row (returns a single row), and Table (returns a result set that acts as a virtual table).

Subqueries

-- Scalar Subquery: Returns a single value
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

-- Row Subquery: Returns a single row
SELECT *
FROM employees
WHERE (department, salary) = (
    SELECT department, MAX(salary)
    FROM employees
    WHERE department = 'Engineering'
);

-- Table Subquery: Returns a result set (used with IN, EXISTS)
SELECT name, department
FROM employees
WHERE department IN (
    SELECT department
    FROM employees
    GROUP BY department
    HAVING COUNT(*) > 10
);

-- Correlated Subquery: References outer query
SELECT e.name, e.salary, e.department
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department = e.department
);

-- EXISTS Subquery
SELECT u.name
FROM users u
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.user_id = u.id
);

10. What is a Self Join?

A self join is when a table is joined with itself. This is useful for finding relationships within the same table, such as employees and their managers.

Self Join

-- Self Join: Join a table with itself
-- Example: Find employees and their managers

-- employees table:
-- id | name    | manager_id
-- 1  | Alice   | NULL       (CEO, no manager)
-- 2  | Bob     | 1          (reports to Alice)
-- 3  | Charlie | 1          (reports to Alice)
-- 4  | David   | 2          (reports to Bob)

SELECT
    e.name AS employee,
    m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;

-- Result:
-- employee | manager
-- Alice    | NULL
-- Bob      | Alice
-- Charlie  | Alice
-- David    | Bob

-- Find employees who earn more than their manager
SELECT
    e.name AS employee,
    e.salary AS emp_salary,
    m.name AS manager,
    m.salary AS mgr_salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;

11. What is CASE WHEN?

CASE WHEN is SQL's conditional expression, similar to if-else statements in programming. It allows you to add conditional logic directly in your queries.

CASE WHEN

-- Simple CASE WHEN
SELECT
    name,
    salary,
    CASE
        WHEN salary >= 100000 THEN 'Senior'
        WHEN salary >= 60000  THEN 'Mid-level'
        WHEN salary >= 30000  THEN 'Junior'
        ELSE 'Intern'
    END AS level
FROM employees;

-- CASE WHEN in WHERE clause
SELECT * FROM orders
WHERE CASE
    WHEN status = 'shipped' THEN shipped_date IS NOT NULL
    WHEN status = 'pending' THEN created_at > '2026-01-01'
    ELSE TRUE
END;

-- CASE WHEN for conditional aggregation
SELECT
    department,
    COUNT(*) AS total,
    SUM(CASE WHEN status = 'active' THEN 1 ELSE 0 END) AS active_count,
    SUM(CASE WHEN status = 'inactive' THEN 1 ELSE 0 END) AS inactive_count,
    ROUND(
        SUM(CASE WHEN status = 'active' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 1
    ) AS active_percentage
FROM employees
GROUP BY department;

12. What are Window Functions?

Window functions perform calculations across a set of rows related to the current row without collapsing them into a single output. Common window functions include ROW_NUMBER(), RANK(), and DENSE_RANK().

Window Functions

-- ROW_NUMBER: Unique sequential number for each row
SELECT
    name,
    department,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;

-- RANK: Same rank for ties, skips next numbers
-- DENSE_RANK: Same rank for ties, does NOT skip
SELECT
    name,
    department,
    salary,
    RANK()       OVER (ORDER BY salary DESC) AS rank_num,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank_num
FROM employees;
-- salary: 100k, 90k, 90k, 80k
-- RANK:       1,    2,    2,    4  (skips 3)
-- DENSE_RANK: 1,    2,    2,    3  (no skip)

-- PARTITION BY: Window function per group
SELECT
    name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
    AVG(salary)  OVER (PARTITION BY department) AS dept_avg_salary
FROM employees;

-- Get top 3 earners per department
SELECT * FROM (
    SELECT
        name, department, salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
) ranked
WHERE rn <= 3;

RANK vs DENSE_RANK: RANK() skips numbers after ties (1, 1, 3), while DENSE_RANK() does not skip (1, 1, 2).

13. What is a CTE (Common Table Expression)?

A CTE is a temporary named result set defined using the WITH keyword. It makes complex queries more readable and can be referenced multiple times within the same query. CTEs can also be recursive.

Common Table Expression

-- Basic CTE
WITH active_users AS (
    SELECT id, name, email
    FROM users
    WHERE status = 'active'
)
SELECT au.name, COUNT(o.id) AS order_count
FROM active_users au
LEFT JOIN orders o ON au.id = o.user_id
GROUP BY au.name;

-- Multiple CTEs
WITH
    dept_stats AS (
        SELECT department, AVG(salary) AS avg_salary
        FROM employees
        GROUP BY department
    ),
    high_earners AS (
        SELECT e.name, e.department, e.salary
        FROM employees e
        JOIN dept_stats d ON e.department = d.department
        WHERE e.salary > d.avg_salary
    )
SELECT * FROM high_earners ORDER BY salary DESC;

-- Recursive CTE: Org chart / hierarchy
WITH RECURSIVE org_chart AS (
    -- Base case: top-level (no manager)
    SELECT id, name, manager_id, 1 AS level
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive case: employees with managers
    SELECT e.id, e.name, e.manager_id, oc.level + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart ORDER BY level, name;

14. What is the difference between UNION and INTERSECT?

UNION combines results from two queries and removes duplicates (UNION ALL keeps duplicates). INTERSECT returns only the rows that appear in both result sets. EXCEPT (or MINUS) returns rows from the first query that are not in the second.

UNION, INTERSECT, EXCEPT

-- UNION: Combine results, remove duplicates
SELECT name, email FROM customers
UNION
SELECT name, email FROM suppliers;

-- UNION ALL: Combine results, keep duplicates (faster)
SELECT name, email FROM customers
UNION ALL
SELECT name, email FROM suppliers;

-- INTERSECT: Only rows that appear in BOTH queries
SELECT name FROM customers
INTERSECT
SELECT name FROM suppliers;

-- EXCEPT (or MINUS in Oracle): Rows in first query NOT in second
SELECT name FROM customers
EXCEPT
SELECT name FROM suppliers;

-- Practical example: Find users who placed orders but never left a review
SELECT user_id FROM orders
EXCEPT
SELECT user_id FROM reviews;

15. What is an Index? Clustered vs Non-clustered?

An index improves the speed of data retrieval on a table at the cost of additional storage and slower writes. A Clustered Index sorts and stores data rows physically in order (one per table). A Non-clustered Index creates a separate structure with pointers to the data rows (multiple per table).

Indexes

-- Create a basic index
CREATE INDEX idx_users_email ON users(email);

-- Composite index (multi-column)
CREATE INDEX idx_users_name_dept ON employees(department, name);

-- Unique index
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- Clustered Index (one per table, determines physical row order)
-- In MySQL InnoDB, the PRIMARY KEY is automatically the clustered index
CREATE TABLE users (
    id INT PRIMARY KEY,  -- This is the clustered index
    name VARCHAR(100),
    email VARCHAR(255)
);

-- Non-clustered Index (multiple per table, separate structure)
CREATE INDEX idx_name ON users(name);       -- non-clustered
CREATE INDEX idx_email ON users(email);     -- non-clustered

-- Show indexes on a table
SHOW INDEX FROM users;

-- Drop an index
DROP INDEX idx_users_email ON users;

-- Check query performance with EXPLAIN
EXPLAIN SELECT * FROM users WHERE email = 'john@example.com';

When to use indexes: Add indexes on columns frequently used in WHERE, JOIN, and ORDER BY clauses. Avoid over-indexing as it slows down INSERT/UPDATE/DELETE operations.

16. What is a View? What is a Materialized View?

A View is a virtual table based on the result of a SQL query. It does not store data itself. A Materialized View physically stores the result set and must be refreshed to reflect changes in the underlying data.

Views

-- Create a View (virtual table, no data stored)
CREATE VIEW active_employees AS
SELECT id, name, department, salary
FROM employees
WHERE status = 'active';

-- Query the view like a regular table
SELECT * FROM active_employees WHERE department = 'Engineering';

-- Update a view
CREATE OR REPLACE VIEW active_employees AS
SELECT id, name, department, salary, hire_date
FROM employees
WHERE status = 'active';

-- Drop a view
DROP VIEW active_employees;

-- Materialized View (stores data physically, PostgreSQL)
CREATE MATERIALIZED VIEW monthly_sales AS
SELECT
    DATE_TRUNC('month', order_date) AS month,
    SUM(total) AS total_sales,
    COUNT(*) AS order_count
FROM orders
GROUP BY DATE_TRUNC('month', order_date);

-- Must refresh manually to update data
REFRESH MATERIALIZED VIEW monthly_sales;

-- Query materialized view (fast, reads stored data)
SELECT * FROM monthly_sales ORDER BY month DESC;

17. What are Transactions and Isolation Levels?

A transaction is a sequence of SQL operations treated as a single unit of work. Transactions follow ACID properties (Atomicity, Consistency, Isolation, Durability). Isolation levels control how transactions interact with each other:

READ UNCOMMITTED — Can read uncommitted changes (dirty reads)
READ COMMITTED — Only reads committed data
REPEATABLE READ — Guarantees same data if read again
SERIALIZABLE — Full isolation, transactions run sequentially

Transactions

-- Start a transaction
START TRANSACTION;

-- Transfer $500 from account 1 to account 2
UPDATE accounts SET balance = balance - 500 WHERE id = 1;
UPDATE accounts SET balance = balance + 500 WHERE id = 2;

-- If everything is OK
COMMIT;

-- If something goes wrong
-- ROLLBACK;

-- SAVEPOINT: Partial rollback
START TRANSACTION;
INSERT INTO orders (user_id, total) VALUES (1, 99.99);
SAVEPOINT order_created;

INSERT INTO order_items (order_id, product_id, qty) VALUES (1, 5, 2);
-- Oops, wrong item! Roll back to savepoint
ROLLBACK TO order_created;

INSERT INTO order_items (order_id, product_id, qty) VALUES (1, 8, 1);
COMMIT;

-- Set Isolation Level
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

18. What is a Stored Procedure vs Function?

Both are reusable SQL code blocks, but they differ in usage. A Stored Procedure can perform actions (INSERT, UPDATE, DELETE) and does not need to return a value. A Function must return a value and can be used inside SQL statements.

Stored Procedure vs Function

-- Stored Procedure: Performs actions, does not need to return a value
DELIMITER //
CREATE PROCEDURE raise_salary(
    IN emp_id INT,
    IN percentage DECIMAL(5, 2)
)
BEGIN
    UPDATE employees
    SET salary = salary * (1 + percentage / 100)
    WHERE id = emp_id;
END //
DELIMITER ;

-- Call a stored procedure
CALL raise_salary(1, 10.00);  -- Give employee 1 a 10% raise

-- Stored Procedure with OUT parameter
DELIMITER //
CREATE PROCEDURE get_employee_count(
    IN dept VARCHAR(100),
    OUT total INT
)
BEGIN
    SELECT COUNT(*) INTO total
    FROM employees
    WHERE department = dept;
END //
DELIMITER ;

CALL get_employee_count('Engineering', @count);
SELECT @count;

-- Function: Must return a value, can be used in SELECT
DELIMITER //
CREATE FUNCTION calculate_bonus(salary DECIMAL(10,2))
RETURNS DECIMAL(10,2)
DETERMINISTIC
BEGIN
    RETURN CASE
        WHEN salary >= 100000 THEN salary * 0.20
        WHEN salary >= 60000  THEN salary * 0.15
        ELSE salary * 0.10
    END;
END //
DELIMITER ;

-- Use function in a query
SELECT name, salary, calculate_bonus(salary) AS bonus
FROM employees;

19. What is Database Normalization?

Normalization is the process of organizing data to reduce redundancy and improve data integrity. The main normal forms are:

1NF — Each column contains atomic (indivisible) values; no repeating groups
2NF — 1NF + every non-key column depends on the entire primary key
3NF — 2NF + no transitive dependencies (non-key depends only on key)
BCNF — 3NF + every determinant is a candidate key

Normalization Example

-- UNNORMALIZED (Repeating groups, multiple values in one column)
-- | id | name  | phones                |
-- | 1  | John  | 012345678, 098765432  |  -- violates 1NF

-- 1NF: Atomic values, no repeating groups
CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(100)
);
CREATE TABLE user_phones (
    id INT PRIMARY KEY AUTO_INCREMENT,
    user_id INT,
    phone VARCHAR(20),
    FOREIGN KEY (user_id) REFERENCES users(id)
);

-- 2NF: No partial dependencies (all non-key columns depend on FULL primary key)
-- BAD (partial dependency on order_id only):
-- | order_id | product_id | product_name | quantity |
--
-- GOOD (separate the product info):
CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(100)
);
CREATE TABLE order_items (
    order_id INT,
    product_id INT,
    quantity INT,
    PRIMARY KEY (order_id, product_id)
);

-- 3NF: No transitive dependencies (non-key depends only on key)
-- BAD: employee -> department_id -> department_name
-- GOOD: separate department into its own table
CREATE TABLE departments (
    id INT PRIMARY KEY,
    name VARCHAR(100)
);
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES departments(id)
);

20. What is an ER Diagram?

An ER (Entity-Relationship) Diagram is a visual representation of the entities (tables), their attributes (columns), and the relationships between them in a database. It is used during the database design phase to plan the structure before writing SQL.

Entity — A table (e.g., Users, Orders, Products)
Attribute — A column (e.g., id, name, email)
Relationship — How entities relate (one-to-one, one-to-many, many-to-many)

ER Diagram as SQL

-- ER Diagram represented as SQL tables
-- Entities: Users, Orders, Products, Categories

-- One-to-Many: One user has many orders
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL
);

CREATE TABLE orders (
    id INT PRIMARY KEY AUTO_INCREMENT,
    user_id INT NOT NULL,
    order_date DATE DEFAULT CURRENT_DATE,
    total DECIMAL(10, 2),
    FOREIGN KEY (user_id) REFERENCES users(id)
);

-- One-to-Many: One category has many products
CREATE TABLE categories (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE products (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    price DECIMAL(10, 2),
    category_id INT,
    FOREIGN KEY (category_id) REFERENCES categories(id)
);

-- Many-to-Many: Orders can have many products, products in many orders
-- Requires a junction/pivot table
CREATE TABLE order_items (
    order_id INT,
    product_id INT,
    quantity INT DEFAULT 1,
    price DECIMAL(10, 2),
    PRIMARY KEY (order_id, product_id),
    FOREIGN KEY (order_id) REFERENCES orders(id),
    FOREIGN KEY (product_id) REFERENCES products(id)
);

-- Relationships Summary:
-- users     1 --- * orders       (one-to-many)
-- categories 1 --- * products    (one-to-many)
-- orders    * --- * products     (many-to-many via order_items)

ER Diagram tools: You can create ER diagrams using tools like dbdiagram.io, MySQL Workbench, Lucidchart, or draw.io. The diagram helps visualize relationships before writing CREATE TABLE statements.

Summary

✓SQL Basics — DDL, DML, DCL, TCL command categories
✓Keys & Constraints — Primary Key, Foreign Key, NOT NULL, UNIQUE, CHECK, DEFAULT
✓Querying — SELECT, ORDER BY, LIMIT, Aggregate Functions, GROUP BY, HAVING
✓JOINs — INNER, LEFT, RIGHT, FULL OUTER, CROSS, Self Join
✓Advanced Queries — Subqueries, CASE WHEN, Window Functions, CTEs
✓Set Operations — UNION, INTERSECT, EXCEPT
✓Performance — Indexes (Clustered vs Non-clustered)
✓Database Objects — Views, Materialized Views, Stored Procedures, Functions
✓Transactions — ACID properties, Isolation Levels
✓Design — Normalization (1NF to BCNF), ER Diagrams