Arranging a column in PySpark using Group By

The purpose of groupBy is to group rows that share the same values in one or more columns so that aggregate functions can be computed per group. Calling groupBy on a DataFrame does not return data directly. It returns a GroupedData object (the PySpark counterpart of Scala's RelationalGroupedDataset), whose aggregation methods such as agg, count, sum, and avg produce the aggregated DataFrame.


Solution:

Can you try the following?

from pyspark.sql import Window, functions as F
df.withColumn("rank", F.rank().over(Window.partitionBy("A", "B").orderBy("C")))

