The assertSmallDataFrameEquality method can be used to compare two DataFrames. val sourceDF = Seq( (1), (5) ).toDF("number") val expectedDF = Seq( (1), (3) ).toDF("number") assertSmallDataFrameEquality(sourceDF,
The assertSmallDatasetEquality method can be used to compare two Datasets or DataFrames (Dataset[Row]). Nicely formatted error messages are displayed when the Datasets are not equal. Here is an example of a content mismatch: val sourceDS = Seq( Person("juan", 5), Person("bob", 1), Person("...
I recommend using spark-sql for this kind of aggregation. If your data is structured, try loading it into DataFrames and performing the grouping and other...
which rows of one dataset / dataframe to add, delete or change to get to the other dataset / dataframe. For example, in Scala: val left = Seq((1, "one"), (2, "two"), (3, "three")).toDF("id", "value") val right = Seq((1, "one"), (2, "Two"), (4, "four"))....
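The idea of a keyed row diff can be sketched without Spark at all. The following is a minimal plain-Scala stand-in, not the library's implementation: it assumes rows are (id, value) pairs keyed by id, and the names RowDiffSketch and DiffRow as well as the single-letter action labels are this sketch's own, hypothetical choices.

```scala
// Plain-Scala stand-in for a keyed row diff; no Spark required.
// All names here (RowDiffSketch, DiffRow, the action labels) are hypothetical.
object RowDiffSketch {
  // action: "N" = no change, "C" = changed, "D" = delete from left, "I" = insert into left
  final case class DiffRow(action: String, id: Int, left: Option[String], right: Option[String])

  def diff(left: Seq[(Int, String)], right: Seq[(Int, String)]): Seq[DiffRow] = {
    val l = left.toMap
    val r = right.toMap
    // Walk the union of ids so rows present on only one side are reported too.
    (l.keySet ++ r.keySet).toSeq.sorted.map { id =>
      (l.get(id), r.get(id)) match {
        case (Some(a), Some(b)) if a == b => DiffRow("N", id, Some(a), Some(b))
        case (Some(a), Some(b))           => DiffRow("C", id, Some(a), Some(b))
        case (Some(a), None)              => DiffRow("D", id, Some(a), None)
        case (None, b)                    => DiffRow("I", id, None, b)
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Same sample rows as in the snippet above.
    val left  = Seq((1, "one"), (2, "two"), (3, "three"))
    val right = Seq((1, "one"), (2, "Two"), (4, "four"))
    diff(left, right).foreach(println)
  }
}
```

With the sample rows, id 1 is unchanged, id 2 is changed, id 3 exists only on the left, and id 4 exists only on the right. On real DataFrames the same classification would typically be done with a full outer join on the key column rather than with in-memory maps.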
The assertSmallDatasetEquality method can be used to compare two Datasets (or two DataFrames). val sourceDF = Seq( (1), (5) ).toDF("number") val expectedDF = Seq( (1, "word"), (5, "word") ).toDF("number", "word") assertSmallDataFrameEquality(sourceDF, expectedDF) // throws a DatasetSchemaMismatch...
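The two failure modes described in these snippets, a schema mismatch and a content mismatch, can be illustrated with a small plain-Scala sketch. It is a simplified stand-in under stated assumptions, not the library's code: Table, SchemaMismatch, ContentMismatch and assertSmallEquality are hypothetical names, a "schema" is reduced to a list of column names, and rows are compared in order.

```scala
// Simplified stand-in for a "small dataset" equality check; no Spark required.
// Every name in this object is hypothetical and chosen for illustration only.
object SmallEqualitySketch {
  // A tiny table: column names plus rows of values.
  final case class Table(columns: Seq[String], rows: Seq[Seq[Any]])

  final case class SchemaMismatch(msg: String) extends Exception(msg)
  final case class ContentMismatch(msg: String) extends Exception(msg)

  def assertSmallEquality(actual: Table, expected: Table): Unit = {
    // Check the schema first: comparing rows of different shapes is not meaningful.
    if (actual.columns != expected.columns)
      throw SchemaMismatch(
        s"actual columns: ${actual.columns.mkString(", ")}; " +
          s"expected columns: ${expected.columns.mkString(", ")}")
    // Then compare the rows, in order, and report both sides on failure.
    if (actual.rows != expected.rows)
      throw ContentMismatch(
        s"actual rows: ${actual.rows.mkString(", ")}; " +
          s"expected rows: ${expected.rows.mkString(", ")}")
  }

  def main(args: Array[String]): Unit = {
    val source   = Table(Seq("number"), Seq(Seq(1), Seq(5)))
    val widened  = Table(Seq("number", "word"), Seq(Seq(1, "word"), Seq(5, "word")))
    val changed  = Table(Seq("number"), Seq(Seq(1), Seq(3)))
    println(scala.util.Try(assertSmallEquality(source, widened))) // schema differs
    println(scala.util.Try(assertSmallEquality(source, changed))) // content differs
  }
}
```

Checking the schema before the rows keeps the error message focused: a missing column is reported once instead of surfacing as a mismatch on every row.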