What is the GROUP BY Clause?
The GROUP BY clause of T-SQL is used to divide the results returned into groups. This technique is often utilized to obtain aggregate information from the table as in the number of users you have from each state or number of books an author has written. In addition you can also use the group by command to return the results with related articles proceeding each other so that you can associate the data into sub objects.
The basic format for the GROUP BY clause is:
GROUP BY column1, column2, ... columnn
where it comes after the WHERE clause but before the ORDER BY clause in a SELECT statement.
Sample GROUP BY example
In this example lets assume we have the following two Tables: BOOKS and SALES
Using the above above tables lets do a query to find the total quantity sold for a book. To do this we are going to join the two tables based on the Book ID as well as grouping on that ID.
SELECT b.ID as BOOK_ID, sum(s.QUANTITY) as NUM_SOLD FROM BOOK b LEFT JOIN SALES s ON b.ID = s.BOOK_ID GROUP BY b.ID
This will return the following results:
Utilizing the HAVING clause to restrict results
The HAVING clause is used to limit the groups that are returned from a query. In this command you can specify that the results must have a column equal to some value or that the sum or aggregate of columns of the group match some criteria.
The format for the HAVING command is:
GROUP BY column1, column2, ... HAVING criteria and|or criteria2 ..
Sample HAVING clause example
Utilizing the above database tables from the previous example we could add a HAVING clause to restrict the results to only return books that have sold more than 10 copies.
SELECT b.ID as BOOK_ID, sum(s.QUANTITY) as NUM_SOLD FROM BOOK b LEFT JOIN SALES s ON b.ID = s.BOOK_ID GROUP BY b.ID HAVING sum(s.QUANTITY) >=10
This query would have the following results: