Here is an example of changing the column type:

    val df2 = sqlContext.load("com.databricks.spark.csv", Map("path" -> "file:///Users/vshukla/projects/spark/sql/core/src/test/resources/cars.csv", "header" -> "true"))
    df2.printSchema()
    val df4 = df2.withColumn("year2", 'year....
PySpark

In PySpark, we can use the cast method to change the data type.

    from pyspark.sql.types import IntegerType
    from pyspark.sql import functions as F

    # first method
    df = df.withColumn("Age", df.age.cast("int"))
    # second method
    df = df.withColumn("Age", df.age.cast(IntegerType()))
    # third method
    d...
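    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.types.DataType

    object DFHelper {
      // Returns df with column cn cast to dataType.
      def castColumnTo(df: DataFrame, cn: String, dataType: DataType): DataFrame = {
        df.withColumn(cn, df(cn).cast(dataType))
      }
    }

which is used like:

    import org.apache.spark.sql.types.IntegerType
    import DFHelper._
    val df2 = castColumnTo(df, "year", IntegerType)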
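The same helper idea carries over to PySpark. The sketch below is one possible counterpart to castColumnTo, not an established API: the name cast_columns and its dictionary argument are hypothetical. It relies only on DataFrame.withColumn and Column.cast, which accepts either a type name such as "int" or a DataType instance.

```python
def cast_columns(df, type_by_column):
    """Return a new DataFrame with each listed column cast to the given type.

    df             -- a Spark DataFrame (assumed to contain every listed column)
    type_by_column -- hypothetical mapping of column name to a type name
                      such as "int", or to a DataType instance
    """
    for name, data_type in type_by_column.items():
        # withColumn replaces the existing column when the name already exists
        df = df.withColumn(name, df[name].cast(data_type))
    return df
```

Assuming df is an existing DataFrame with year and age columns, cast_columns(df, {"year": "int", "age": "double"}) would return a new DataFrame with both columns converted. The original df is left unchanged, since withColumn always returns a new DataFrame.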
I tried setting the configuration "" to true, but I keep getting an error message such as "cannot resolve column1 in INSERT clause given columns source.column2, source.column3" when I try to load new source data with only column2 ...
