DataFrame.DropDuplicates Method
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Overloads
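DropDuplicates(String, String[]) | Returns a new DataFrame with duplicate rows removed, considering only the subset of columns.
DropDuplicates() | Returns a new DataFrame that contains only the unique rows from this DataFrame. This is an alias for Distinct().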
DropDuplicates(String, String[])
Returns a new DataFrame with duplicate rows removed, considering only the subset of columns.
public Microsoft.Spark.Sql.DataFrame DropDuplicates (string col, params string[] cols);
member this.DropDuplicates : string * string[] -> Microsoft.Spark.Sql.DataFrame
Public Function DropDuplicates (col As String, ParamArray cols As String()) As DataFrame
Parameters
- col (String): Column name
- cols (String[]): Additional column names
Returns
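A new DataFrame object in which each distinct combination of values in the specified columns appears in only one row.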
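Examples
The following is a minimal sketch of how this overload might be used, not an official sample. It assumes a working .NET for Apache Spark environment (the Microsoft.Spark package and a Spark runtime). The class name, application name, column names, and sample rows are hypothetical and chosen only for illustration. When several rows share the same value in the chosen columns, this page does not say which one is kept, so the example does not assume any particular surviving row.

```csharp
using System.Collections.Generic;
using Microsoft.Spark.Sql;
using Microsoft.Spark.Sql.Types;

class DropDuplicatesBySubsetExample
{
    static void Main()
    {
        // Hypothetical application name; any name works.
        SparkSession spark = SparkSession
            .Builder()
            .AppName("drop-duplicates-subset")
            .GetOrCreate();

        // Illustrative schema: a name column and an age column.
        var schema = new StructType(new[]
        {
            new StructField("name", new StringType()),
            new StructField("age", new IntegerType())
        });

        // Two rows share the same name but differ in age.
        var rows = new List<GenericRow>
        {
            new GenericRow(new object[] { "Alice", 30 }),
            new GenericRow(new object[] { "Alice", 31 }),
            new GenericRow(new object[] { "Bob", 25 })
        };

        DataFrame people = spark.CreateDataFrame(rows, schema);

        // Only "name" is compared, so the two "Alice" rows collapse into one.
        // Which of the two is retained should not be relied upon.
        DataFrame byName = people.DropDuplicates("name");
        byName.Show();

        // Both columns are compared; no two rows match on name and age,
        // so all three rows remain.
        DataFrame byNameAndAge = people.DropDuplicates("name", "age");
        byNameAndAge.Show();

        spark.Stop();
    }
}
```

The first argument is required and any further column names are passed through the params array, so DropDuplicates("name") and DropDuplicates("name", "age") both bind to this overload.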
DropDuplicates()
Returns a new DataFrame that contains only the unique rows from this DataFrame. This is an alias for Distinct().
public Microsoft.Spark.Sql.DataFrame DropDuplicates ();
member this.DropDuplicates : unit -> Microsoft.Spark.Sql.DataFrame
Public Function DropDuplicates () As DataFrame
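Examples
As a rough illustration of the parameterless overload, the sketch below compares it with Distinct(), which this page describes as equivalent. It assumes a working .NET for Apache Spark environment. The class name, application name, schema, and sample rows are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Spark.Sql;
using Microsoft.Spark.Sql.Types;

class DropDuplicatesAllColumnsExample
{
    static void Main()
    {
        // Hypothetical application name; any name works.
        SparkSession spark = SparkSession
            .Builder()
            .AppName("drop-duplicates-all-columns")
            .GetOrCreate();

        // Illustrative schema: a name column and an age column.
        var schema = new StructType(new[]
        {
            new StructField("name", new StringType()),
            new StructField("age", new IntegerType())
        });

        // The first two rows are identical in every column.
        var rows = new List<GenericRow>
        {
            new GenericRow(new object[] { "Alice", 30 }),
            new GenericRow(new object[] { "Alice", 30 }),
            new GenericRow(new object[] { "Bob", 25 })
        };

        DataFrame people = spark.CreateDataFrame(rows, schema);

        // With no arguments, whole rows are compared.
        DataFrame unique = people.DropDuplicates();
        unique.Show();

        // Distinct() is documented as the same operation,
        // so the two row counts are expected to match.
        long viaDropDuplicates = unique.Count();
        long viaDistinct = people.Distinct().Count();
        Console.WriteLine($"DropDuplicates: {viaDropDuplicates}, Distinct: {viaDistinct}");

        spark.Stop();
    }
}
```

Rows that differ in any column are treated as distinct here; to compare only selected columns, use the DropDuplicates(String, String[]) overload instead.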