Select which has max date or latest date
Here are two tables.
SCHOOL_CODE + STAFF_TYPE_NAME + LAST_UPDATE_DATE_TIME + PERSON_ID ================================================================= ABE Principal 24-JAN-13 111222 ABE Principal 09-FEB-12 222111
PERSON_ID + NAME ================= 111222 ABC 222111 XYZ
Here is my oracle query.
SELECT MAX(LAST_UPDATE_DATE_TIME) AS LAST_UPDATE, SCHOOL_CODE, PERSON_ID FROM SCHOOL_STAFF WHERE STAFF_TYPE_NAME='Principal' GROUP BY SCHOOL_CODE, PERSON_ID ORDER BY SCHOOL_CODE;
which gives this results
LAST_UPDATE SCHOOL_CODE PERSON_ID ===========+===========+========= 24-JAN-13 ABE 111222 09-FEB-12 ABE 222111
I want to select the first one for the school which has latest date.
Your current query is not giving the desired result because you are using a
GROUP BYclause on the
PERSON_IDcolumn which has a unique value for both entries. As a result you will return both rows.
There are a few ways that you can solve this. You can use a subquery to apply the aggregate function to return the
select s1.LAST_UPDATE_DATE_TIME, s1.SCHOOL_CODE, s1.PERSON_ID from SCHOOL_STAFF s1 inner join ( select max(LAST_UPDATE_DATE_TIME) LAST_UPDATE_DATE_TIME, SCHOOL_CODE from SCHOOL_STAFF group by SCHOOL_CODE ) s2 on s1.SCHOOL_CODE = s2.SCHOOL_CODE and s1.LAST_UPDATE_DATE_TIME = s2.LAST_UPDATE_DATE_TIME;
Or you can use use a windowing function to return the rows of data for each school with the most recent
select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME from ( select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME, row_number() over(partition by SCHOOL_CODE order by LAST_UPDATE_DATE_TIME desc) seq from SCHOOL_STAFF where STAFF_TYPE_NAME='Principal' ) d where seq = 1;
This query implements
row_number()which assigns a unique number to each row in the partition of
SCHOOL_CODEand placed in a descending order based on the
As a side note, the JOIN with aggregate function is not exactly the same as the
row_number()version. If you have two rows with the same event time the JOIN will return both rows, while the
row_number()will only return one. If you want to return both with a windowing function, then consider using the
rank()windowing function instead as it will return ties:
select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME from ( select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME, rank() over(partition by SCHOOL_CODE order by LAST_UPDATE_DATE_TIME desc) seq from SCHOOL_STAFF where STAFF_TYPE_NAME='Principal' ) d where seq = 1;
Thanks, I find the inner join to a subquery table (example 1 above) to be the most intuitive.. and doesn't require me to learn what _partition by_ is all about. Here is a look at a similar syntax to example 1: select oT.dateField, oT.siteID, oT.field1, oT.field2, oT.field3, from **originalTable** as **oT** inner join (select max(dateField) as newestDate, siteID from originalTable group by siteID ) as newTable on oT.siteID = newTable.site_ID and oT.dateField = newTable.newestDate order by oT.siteID asc To me, that better explains what's happening in the subquery.
I'm surprised nobody has taken advantage of window functions beyond row_number()
Here's some data to play with:
CREATE TABLE SCHOOL_STAFF ( LAST_UPDATE_DATE_TIME VARCHAR(20), SCHOOL_CODE VARCHAR(20), PERSON_ID VARCHAR(20), STAFF_TYPE_NAME VARCHAR(20) ); INSERT INTO SCHOOL_STAFF VALUES ('24-JAN-13', 'ABE', '111222', 'Principal'); INSERT INTO SCHOOL_STAFF VALUES ('09-FEB-12', 'ABE', '222111', 'Principal');
The OVER() clause creates a window for which you will define your aggregate groups. In this case, I am only partitioning on the SHOOL_CODE, so we will see the FIRST_VALUE, which will come from LAST_UPDATE_DATE_TIME, grouped by SCHOOL_CODE, and in the order of LAST_UPDATE_DATE_TIME by descending order. This value will be applied to the entire column for each SCHOOL_CODE.
It is important to pay close attention to your partitioning and ordering in the over() clause.
SELECT DISTINCT FIRST_VALUE(LAST_UPDATE_DATE_TIME) OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS LAST_UPDATE ,FIRST_VALUE(SCHOOL_CODE) OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS SCHOOL_CODE ,FIRST_VALUE(PERSON_ID) OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS PERSON_ID FROM SCHOOL_STAFF WHERE STAFF_TYPE_NAME = 'Principal' ORDER BY SCHOOL_CODE
24-JAN-13 ABE 111222
This should eliminate your need for GROUP BY and Subqueries for the most part. You will want to make sure to include DISTINCT though.
select LAST_UPDATE_DATE_TIME as LAST_UPDATE, SCHOOL_CODE, PERSON_ID from SCHOOL_STAFF WHERE STAFF_TYPE_NAME='Principal' AND LAST_UPDATE_DATE_TIME = (SELECT MAX(LAST_UPDATE_DATE_TIME) FROM SCHOOL_STAFF s2 WHERE PERSON_ID = s2.PERSON_ID)