You cannot do this inside a UDF. As you mentioned, this problem is best solved with a join. IIUC, you are looking for something like:

    from pyspark.sql import Window
    import pyspark.sql.functions as F

    df1.alias("L").join(df2.alias("R"), (df1.n == df2.x1) | (df1.n == df2.x2), how="left")\
        .select("L.*", F.sum("w").ov...
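
To make the join semantics concrete, here is a rough plain-Python sketch (no Spark required) of what a left join on `n == x1 OR n == x2` followed by a per-`n` sum of `w` would compute. It is an illustration, not the snippet above: the sample rows, the output column name `gamma`, and the helper `left_join_sum` are all hypothetical. It also assumes that the truncated expression sums `w` for each `n`, and that the `n` values in `df1` are unique.

```python
def left_join_sum(df1_rows, df2_rows):
    """For each left row, sum w over the right rows where n equals x1 or x2.

    Left rows with no match get None, mirroring a left join that leaves
    nulls, where a sum over only-null values is itself null.
    """
    result = []
    for left in df1_rows:
        # Join condition: (n == x1) OR (n == x2); a right row matches at most once.
        matches = [
            right["w"]
            for right in df2_rows
            if left["n"] == right["x1"] or left["n"] == right["x2"]
        ]
        gamma = sum(matches) if matches else None
        result.append({**left, "gamma": gamma})
    return result


# Hypothetical sample data standing in for df1 and df2.
df1_rows = [{"n": 1}, {"n": 2}, {"n": 3}]
df2_rows = [
    {"x1": 1, "x2": 2, "w": 0.5},
    {"x1": 2, "x2": 1, "w": 1.5},
    {"x1": 1, "x2": 4, "w": 2.0},
]

joined = left_join_sum(df1_rows, df2_rows)
for row in joined:
    print(row)
```

Because the lookup is expressed as a join rather than a per-row function call, Spark can plan and distribute it, which is why a join fits here where a UDF does not.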