Fetch the rows which have the Max value for a column for each distinct value of another column

Table:

UserId, Value, Date. 

I want to get the UserId, Value for the max(Date) for each UserId. That is, the Value for each UserId that has the latest date. Is there a way to do this simply in SQL? (Preferably Oracle)

Update: Apologies for any ambiguity: I need to get ALL the UserIds. But for each UserId, only that row where that user has the latest date.

Total Answers: 35


Popular Answers:

  1. This will retrieve all rows for which the my_date column value is equal to the maximum value of my_date for that userid. This may retrieve multiple rows for the userid where the maximum date is on multiple rows.

    select userid,
           my_date,
           ...
    from
    (
      select userid,
             my_date,
             ...
             max(my_date) over (partition by userid) max_my_date
      from   users
    )
    where my_date = max_my_date

    “Analytic functions rock”

    Edit: With regard to the first comment …

    “using analytic queries and a self-join defeats the purpose of analytic queries”

    There is no self-join in this code. There is instead a predicate placed on the result of the inline view that contains the analytic function — a very different matter, and completely standard practice.

    “The default window in Oracle is from the first row in the partition to the current one”

    The windowing clause is only applicable in the presence of the order by clause. With no order by clause, no windowing clause is applied by default and none can be explicitly specified.

    The code works.
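
    To try the inline-view pattern without an Oracle instance, here is a minimal sketch using Python's sqlite3 module. SQLite (3.25 or later, for window functions) is only a stand-in for Oracle here, and the table contents and the value column are made-up sample data, not anything from the question.

```python
import sqlite3

# Hypothetical sample data: user 'B' has two rows sharing its latest date.
rows = [
    ("A", 10, "2008-12-31"),
    ("A", 20, "2009-01-01"),
    ("B", 30, "2008-12-31"),
    ("B", 40, "2009-01-01"),
    ("B", 50, "2009-01-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid TEXT, value INTEGER, my_date TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)

# The inline view computes the per-user maximum date with an analytic
# function; the outer predicate keeps only rows that match it.
query = """
    SELECT userid, value, my_date
    FROM (
        SELECT userid, value, my_date,
               MAX(my_date) OVER (PARTITION BY userid) AS max_my_date
        FROM users
    )
    WHERE my_date = max_my_date
    ORDER BY userid, value
"""
latest = conn.execute(query).fetchall()
print(latest)
```

    With this data both of B's rows dated 2009-01-01 come back, which is the "multiple rows for the userid" case described above.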

  3. SELECT userid, MAX(value) KEEP (DENSE_RANK FIRST ORDER BY date DESC) FROM table GROUP BY userid 
  4. I don’t know your exact columns names, but it would be something like this:

     select userid, value from users u1 where date = (select max(date) from users u2 where u1.userid = u2.userid) 
  5. Not being at work, I don’t have Oracle to hand, but I seem to recall that Oracle allows multiple columns to be matched in an IN clause, which should at least avoid the options that use a correlated subquery, which is seldom a good idea.

    Something like this, perhaps (can’t remember if the column list should be parenthesised or not):

    SELECT * FROM MyTable WHERE (User, Date) IN ( SELECT User, MAX(Date) FROM MyTable GROUP BY User) 

    EDIT: Just tried it for real:

    SQL> create table MyTable (usr char(1), dt date);
    SQL> insert into mytable values ('A','01-JAN-2009');
    SQL> insert into mytable values ('B','01-JAN-2009');
    SQL> insert into mytable values ('A', '31-DEC-2008');
    SQL> insert into mytable values ('B', '31-DEC-2008');
    SQL> select usr, dt from mytable
      2  where (usr, dt) in
      3  ( select usr, max(dt) from mytable group by usr)
      4  /

    U DT
    - ---------
    A 01-JAN-09
    B 01-JAN-09

    So it works, although some of the new-fangly stuff mentioned elsewhere may be more performant.
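
    For anyone without Oracle to hand, the same multi-column IN can be tried with Python's sqlite3 module. This is only a sketch: it assumes SQLite 3.15 or later (which added row-value comparisons), and ISO date strings stand in for Oracle DATE values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (usr TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("A", "2009-01-01"),
        ("B", "2009-01-01"),
        ("A", "2008-12-31"),
        ("B", "2008-12-31"),
    ],
)

# Match (usr, dt) pairs against each user's maximum date.
# The subquery is not correlated, so it is evaluated once.
pairs = conn.execute(
    """
    SELECT usr, dt
    FROM mytable
    WHERE (usr, dt) IN (SELECT usr, MAX(dt) FROM mytable GROUP BY usr)
    ORDER BY usr
    """
).fetchall()
print(pairs)
```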

  6. I know you asked for Oracle, but in SQL 2005 we now use this:

     -- Single Value
     ;WITH ByDate AS (
         SELECT UserId, Value,
                ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY Date DESC) RowNum
         FROM UserDates
     )
     SELECT UserId, Value
     FROM ByDate
     WHERE RowNum = 1

     -- Multiple values where dates match
     ;WITH ByDate AS (
         SELECT UserId, Value,
                RANK() OVER (PARTITION BY UserId ORDER BY Date DESC) Rnk
         FROM UserDates
     )
     SELECT UserId, Value
     FROM ByDate
     WHERE Rnk = 1
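
    The difference between the two variants only shows up when a user has several rows on its latest date. A small sketch of that, run through Python's sqlite3 module as a stand-in for SQL Server (SQLite 3.25+ assumed; the user_dates table, its dt column and the latest_rows helper are hypothetical names for this illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_dates (userid INTEGER, value TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO user_dates VALUES (?, ?, ?)",
    [
        (1, "a", "2000-01-01"),
        (1, "b", "2002-02-02"),
        (2, "c", "2000-01-01"),
        (2, "d", "2003-03-03"),
        (2, "e", "2003-03-03"),  # tie with 'd' on user 2's latest date
    ],
)


def latest_rows(connection, ranking_function):
    """Return rows ranked first per user by the given window function."""
    # ranking_function is a fixed keyword chosen by the caller below,
    # never user input, so interpolating it into the SQL is safe here.
    query = f"""
        SELECT userid, value
        FROM (
            SELECT userid, value,
                   {ranking_function}() OVER (
                       PARTITION BY userid ORDER BY dt DESC
                   ) AS rnk
            FROM user_dates
        )
        WHERE rnk = 1
        ORDER BY userid, value
    """
    return connection.execute(query).fetchall()


single = latest_rows(conn, "ROW_NUMBER")  # exactly one row per user
multiple = latest_rows(conn, "RANK")      # every row tied on the latest date
print(single)
print(multiple)
```

    ROW_NUMBER picks one of the tied rows arbitrarily unless the ORDER BY has a tiebreaker, so which of 'd' or 'e' it returns for user 2 is not guaranteed.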
  7. I don’t have Oracle to test it, but the most efficient solution is to use analytic queries. It should look something like this:

    SELECT DISTINCT UserId, MaxValue
    FROM (
        SELECT UserId,
               FIRST_VALUE(Value) OVER (PARTITION BY UserId ORDER BY Date DESC) MaxValue
        FROM SomeTable
    )

    I suspect that you can get rid of the outer query and put distinct on the inner, but I’m not sure. In the meantime I know this one works.

    If you want to learn about analytic queries, I’d suggest reading http://www.orafaq.com/node/55 and http://www.akadia.com/services/ora_analytic_functions.html. Here is the short summary.

    Under the hood analytic queries sort the whole dataset, then process it sequentially. As you process it you partition the dataset according to certain criteria, and then for each row you look at some window (it defaults to the span from the first row in the partition to the current row, and that default is also the most efficient) and can compute values using a number of analytic functions (the list of which is very similar to the aggregate functions).

    In this case here is what the inner query does. The whole dataset is sorted by UserId then Date DESC. Then it processes it in one pass. For each row you return the UserId and the Value from the first row seen for that UserId (since dates are sorted DESC, that's the row with the max date). This gives you your answer with duplicated rows. Then the outer DISTINCT squashes duplicates.

    This is not a particularly spectacular example of analytic queries. For a much bigger win consider taking a table of financial receipts and calculating for each user and receipt, a running total of what they paid. Analytic queries solve that efficiently. Other solutions are less efficient. Which is why they are part of the 2003 SQL standard. (Unfortunately Postgres doesn’t have them yet. Grrr…)
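
    The running-total case mentioned above might look roughly like this. It is a sketch run through Python's sqlite3 module rather than Oracle (SQLite 3.25+ assumed), and the receipts table with its paid_on and amount columns is invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE receipts (userid TEXT, paid_on TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO receipts VALUES (?, ?, ?)",
    [
        ("A", "2009-01-01", 10),
        ("A", "2009-01-02", 15),
        ("A", "2009-01-03", 5),
        ("B", "2009-01-01", 7),
        ("B", "2009-01-05", 3),
    ],
)

# With ORDER BY inside OVER, the window runs from the start of the
# partition to the current row, so SUM yields a per-user running total.
running = conn.execute(
    """
    SELECT userid, paid_on, amount,
           SUM(amount) OVER (PARTITION BY userid ORDER BY paid_on) AS running_total
    FROM receipts
    ORDER BY userid, paid_on
    """
).fetchall()
for row in running:
    print(row)
```

    The sample dates are unique per user on purpose: with the default window, rows that share the same paid_on would be treated as peers and get the same running total.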

  8. Wouldn’t a QUALIFY clause be both simplest and best?

    select userid, my_date, ... from users qualify rank() over (partition by userid order by my_date desc) = 1 

    For context, on Teradata a decent-sized test of this runs in 17s with this QUALIFY version and in 23s with the 'inline view'/Aldridge solution #1.

  9. In Oracle 12c+, you can use Top n queries along with analytic function rank to achieve this very concisely without subqueries:

    select * from your_table order by rank() over (partition by user_id order by my_date desc) fetch first 1 row with ties; 

    The above returns all the rows with max my_date per user.

    If you want only one row with max date, then replace the rank with row_number:

    select * from your_table order by row_number() over (partition by user_id order by my_date desc) fetch first 1 row with ties; 
  10. With PostgreSQL 8.4 or later, you can use this:

    select user_id, user_value_1, user_value_2
    from (
        select user_id, user_value_1, user_value_2,
               row_number() over (partition by user_id order by user_date desc)
        from users
    ) as r
    where r.row_number = 1
  11. Use ROW_NUMBER() to assign a unique ranking on descending Date for each UserId, then filter to the first row for each UserId (i.e., ROW_NUMBER = 1).

    SELECT UserId, Value, Date FROM (SELECT UserId, Value, Date, ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY Date DESC) rn FROM users) u WHERE rn = 1; 
  12. Just had to write a “live” example at work 🙂

    This one supports multiple values for UserId on the same date.

    Columns: UserId, Value, Date

    SELECT DISTINCT UserId, MAX(Date) OVER (PARTITION BY UserId ORDER BY Date DESC), MAX(Values) OVER (PARTITION BY UserId ORDER BY Date DESC) FROM ( SELECT UserId, Date, SUM(Value) As Values FROM <<table_name>> GROUP BY UserId, Date ) 

    You can use FIRST_VALUE instead of MAX and look it up in the explain plan. I didn’t have the time to play with it.

    Of course, if searching through huge tables, it’s probably better if you use FULL hints in your query.

  13. I’m quite late to the party but the following hack will outperform both correlated subqueries and any analytics function but has one restriction: values must convert to strings. So it works for dates, numbers and other strings. The code does not look good but the execution profile is great.

    select userid, to_number(substr(max(to_char(date,'yyyymmdd') || to_char(value)), 9)) as value, max(date) as date from users group by userid 

    The reason why this code works so well is that it only needs to scan the table once. It does not require any indexes and most importantly it does not need to sort the table, which most analytics functions do. Indexes will help though if you need to filter the result for a single userid.
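
    Here is a sketch of the same concatenation trick that can be run through Python's sqlite3 module. SQLite's STRFTIME, the || operator and CAST stand in for Oracle's to_char and to_number, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid TEXT, value INTEGER, my_date TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("A", 10, "2008-12-31"),
        ("A", 20, "2009-01-01"),
        ("B", 30, "2008-12-31"),
        ("B", 40, "2009-01-01"),
    ],
)

# Prefix each value with its date as a fixed-width 8-character string.
# The string MAX then belongs to the latest date, and SUBSTR(..., 9)
# strips the date prefix to recover the value that travelled with it.
query = """
    SELECT userid,
           CAST(SUBSTR(MAX(STRFTIME('%Y%m%d', my_date) || value), 9) AS INTEGER) AS value,
           MAX(my_date) AS my_date
    FROM users
    GROUP BY userid
    ORDER BY userid
"""
packed = conn.execute(query).fetchall()
print(packed)
```

    One thing to keep in mind: if a user has several rows on the latest date, the value that wins is the largest as a string, so '9' would beat '10' unless the values are padded to a fixed width.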

  14. If you’re using Postgres, you can use array_agg like

    SELECT userid,MAX(adate),(array_agg(value ORDER BY adate DESC))[1] as value FROM YOURTABLE GROUP BY userid 

    I’m not familiar with Oracle. This is what I came up with

    SELECT userid, MAX(adate), SUBSTR( (LISTAGG(value, ',') WITHIN GROUP (ORDER BY adate DESC)), 0, INSTR((LISTAGG(value, ',') WITHIN GROUP (ORDER BY adate DESC)), ',')-1 ) as value FROM YOURTABLE GROUP BY userid 

    Both queries return the same results as the accepted answer. See SQLFiddles:

    1. Accepted answer
    2. My solution with Postgres
    3. My solution with Oracle
  15. I think something like this. (Forgive me for any syntax mistakes; I’m used to using HQL at this point!)

    EDIT: Also misread the question! Corrected the query…

    SELECT UserId, Value FROM Users AS user WHERE Date = ( SELECT MAX(Date) FROM Users AS maxtest WHERE maxtest.UserId = user.UserId ) 
  16. I think you should make this variant of the previous query:

    SELECT UserId, Value FROM Users U1 WHERE Date = ( SELECT MAX(Date) FROM Users where UserId = U1.UserId) 
  17. Select UserID, Value, Date From Table, ( Select UserID, Max(Date) as MDate From Table Group by UserID ) as subQuery Where Table.UserID = subQuery.UserID and Table.Date = subQuery.mDate
  18. select VALUE from TABLE1 where TIME = (select max(TIME) from TABLE1 where DATE= (select max(DATE) from TABLE1 where CRITERIA=CRITERIA)) 
  19. (T-SQL) First get all the users and their maxdate. Join with the table to find the corresponding values for the users on the maxdates.
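
    The two steps described in this answer could be sketched as below. It runs through Python's sqlite3 module rather than SQL Server, and the user_values table and its columns are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_values (userid TEXT, value INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO user_values VALUES (?, ?, ?)",
    [
        ("A", 10, "2008-12-31"),
        ("A", 20, "2009-01-01"),
        ("B", 30, "2008-12-31"),
        ("B", 40, "2009-01-02"),
    ],
)

# Step 1 (derived table m): each user with its maximum date.
# Step 2 (join): go back to the table to pick up the value on that date.
joined = conn.execute(
    """
    SELECT t.userid, t.value, t.dt
    FROM user_values AS t
    JOIN (
        SELECT userid, MAX(dt) AS max_dt
        FROM user_values
        GROUP BY userid
    ) AS m
      ON m.userid = t.userid AND m.max_dt = t.dt
    ORDER BY t.userid
    """
).fetchall()
print(joined)
```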
  20. SELECT FIRST, LAST, SUM(POINTS) AS TOTAL FROM STUDENTS S, RESULTS R WHERE S.SID = R.SID AND R.CAT = 'H' GROUP BY S.SID, FIRST, LAST HAVING SUM(POINTS) >= ALL (SELECT SUM (POINTS) FROM RESULTS WHERE CAT = 'H' GROUP BY SID)
  21. Just tested this and it seems to work on a logging table

    select ColumnNames, max(DateColumn) from log group by ColumnNames order by 1 desc 
  22. Assuming Date is unique for a given UserID, here’s some TSQL:

    SELECT UserTest.UserID, UserTest.Value FROM UserTest INNER JOIN ( SELECT UserID, MAX(Date) MaxDate FROM UserTest GROUP BY UserID ) Dates ON UserTest.UserID = Dates.UserID AND UserTest.Date = Dates.MaxDate 
  23. Solution for MySQL which doesn’t have concepts of partition KEEP, DENSE_RANK.

    select userid, my_date, ... from ( select @sno:= case when @pid<>userid then 0 else @sno+1 end as serialnumber, @pid:=userid, my_Date, ... from users order by userid, my_date ) a where a.serialnumber=0 

    Reference: http://benincampus.blogspot.com/2013/08/select-rows-which-have-maxmin-value-in.html

  24. select userid, value, date from thetable t1 , ( select t2.userid, max(t2.date) date2 from thetable t2 group by t2.userid ) t3 where t3.userid = t1.userid and t3.date2 = t1.date 

    IMHO this works. HTH

  25. I think this should work?

    Select T1.UserId, (Select Top 1 T2.Value From Table T2 Where T2.UserId = T1.UserId Order By Date Desc) As 'Value' From Table T1 Group By T1.UserId Order By T1.UserId 
  26. First try I misread the question, following the top answer, here is a complete example with correct results:

    CREATE TABLE table_name (id int, the_value varchar(2), the_date datetime);
    INSERT INTO table_name (id,the_value,the_date) VALUES(1 ,'a','1/1/2000');
    INSERT INTO table_name (id,the_value,the_date) VALUES(1 ,'b','2/2/2002');
    INSERT INTO table_name (id,the_value,the_date) VALUES(2 ,'c','1/1/2000');
    INSERT INTO table_name (id,the_value,the_date) VALUES(2 ,'d','3/3/2003');
    INSERT INTO table_name (id,the_value,the_date) VALUES(2 ,'e','3/3/2003');

    select id, the_value
    from table_name u1
    where the_date = (select max(the_date) from table_name u2 where u1.id = u2.id)

    id          the_value
    ----------- ---------
    2           d
    2           e
    1           b

    (3 row(s) affected)
  27. This will also take care of duplicates (return one row for each user_id):

    SELECT * FROM ( SELECT u.*, FIRST_VALUE(u.rowid) OVER(PARTITION BY u.user_id ORDER BY u.date DESC) AS last_rowid FROM users u ) u2 WHERE u2.rowid = u2.last_rowid 
  28. This should be as simple as:

    SELECT UserId, Value FROM Users u WHERE Date = (SELECT MAX(Date) FROM Users WHERE UserID = u.UserID) 
  29. select UserId,max(Date) over (partition by UserId) value from users; 
  30. If (UserID, Date) is unique, i.e. no date appears twice for the same user then:

    select TheTable.UserID, TheTable.Value from TheTable inner join (select UserID, max([Date]) MaxDate from TheTable group by UserID) UserMaxDate on TheTable.UserID = UserMaxDate.UserID and TheTable.[Date] = UserMaxDate.MaxDate; 
  31. Check this link. If your question seems similar to that page, then I would suggest the following query, which gives the solution for that link:

    select distinct sno,item_name,max(start_date) over(partition by sno),max(end_date) over(partition by sno),max(creation_date) over(partition by sno), max(last_modified_date) over(partition by sno) from uniq_select_records order by sno,item_name asc;

    This will give accurate results related to that link.

  32. Use the code:

    select T.UserId, T.dt from (select UserId, dt, max(dt) over (partition by UserId) as max_dt from t_users) T where T.dt = T.max_dt; 

    This will retrieve the results, irrespective of duplicate values for UserId. If your UserId is unique, it becomes simpler:

    select UserId,max(dt) from t_users group by UserId; 
  33. SELECT a.userid,a.values1,b.mm FROM table_name a,(SELECT userid,Max(date1)AS mm FROM table_name GROUP BY userid) b WHERE a.userid=b.userid AND a.DATE1=b.mm; 
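  34. The query below can work. The row number has to be computed in a subquery, because rn cannot be referenced in the WHERE clause of the same SELECT that defines it:

    SELECT user_id, value, date FROM ( SELECT user_id, value, date, row_number() OVER (PARTITION BY user_id ORDER BY date desc) AS rn FROM table_name ) t WHERE rn = 1 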
  35. SELECT a.* FROM user a INNER JOIN (SELECT userid,Max(date) AS date12 FROM user1 GROUP BY userid) b ON a.date=b.date12 AND a.userid=b.userid ORDER BY a.userid;

Tags: sql, oracle