In this post, we will learn how to add months to, or subtract months from, a date in PySpark, with examples.
Creating dataframe – Sample program
With the following program, we first create a dataframe df whose single column dt holds the date value '2019-02-28'.
import findspark
findspark.init()
from pyspark.sql import SparkSession
from pyspark.sql.functions import add_months, date_format
# Create (or reuse) the SparkSession that createDataFrame() is called on
spark = SparkSession.builder.getOrCreate()
# Creating a dataframe df with date column
df = spark.createDataFrame([('2019-02-28',)], ['dt'])
print("Printing df below")
df.show()
Output
The dataframe is created with the date value, as shown below.
Printing df below
+----------+
|        dt|
+----------+
|2019-02-28|
+----------+
Adding months – Sample program
In the next step, we create another dataframe df1 by adding months to the column dt using add_months().
date_format() formats the value in dt as a string in the pattern 'yyyy-MM-dd'; it does not itself return a date. Spark then casts that string to a date when it is passed to add_months(). Since dt is already in yyyy-MM-dd form, to_date() would be the more direct way to convert the string into a date.
You can read more about date_format() at https://beginnersbug.com/how-to-change-the-date-format-in-pyspark/
# Adding the months
df1 = df.withColumn("months_add", add_months(date_format('dt', 'yyyy-MM-dd'), 1))
print("Printing df1 below")
df1.show()
Output
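add_months(column, number of months) takes two inputs: the date column and the number of months to add (positive) or subtract (negative). The output below is what Spark 2.x prints: because 2019-02-28 is the last day of its month, the result is moved to the last day of March. Spark 3.0 and later keep the day of the month and return 2019-03-28.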
Printing df1 below
+----------+----------+
|        dt|months_add|
+----------+----------+
|2019-02-28|2019-03-31|
+----------+----------+
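To make the month arithmetic easier to follow, here is a minimal plain-Python sketch of how a date can be shifted by whole months. It is not Spark's implementation: the function name add_months_sketch and the keep_month_end flag are made up for illustration. With keep_month_end=False it mimics the Spark 3.0+ result, and with keep_month_end=True it mimics the Spark 2.x last-day-of-month adjustment.

```python
import calendar
from datetime import date


def add_months_sketch(d, months, keep_month_end=False):
    """Shift d by `months` calendar months; negative values go backwards.

    Hypothetical helper for illustration only, not part of PySpark.
    """
    # Count months from year 0 so that year rollover is plain integer maths.
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1

    last_day_of_target = calendar.monthrange(year, month)[1]
    was_month_end = d.day == calendar.monthrange(d.year, d.month)[1]

    if keep_month_end and was_month_end:
        # Spark 2.x style: a month-end input stays a month-end.
        day = last_day_of_target
    else:
        # Keep the day of month, clamped to the length of the target month.
        day = min(d.day, last_day_of_target)
    return date(year, month, day)


print(add_months_sketch(date(2019, 2, 28), 1))                       # 2019-03-28
print(add_months_sketch(date(2019, 2, 28), 1, keep_month_end=True))  # 2019-03-31
print(add_months_sketch(date(2019, 1, 31), 1))                       # 2019-02-28
```

The last call shows the clamping case: January 31 plus one month cannot be February 31, so the day falls back to the end of February.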
Subtracting months – Sample program
We can also move the date back by passing a negative number of months.
# Subtracting the months
df2 = df.withColumn("months_sub", add_months(date_format('dt', 'yyyy-MM-dd'), -1))
print("Printing df2 below")
df2.show()
Output
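Hence we get the date one month earlier using the same function. This output is from Spark 2.x, which keeps a month-end date at the month end; Spark 3.0 and later return 2019-01-28.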
Printing df2 below
+----------+----------+
|        dt|months_sub|
+----------+----------+
|2019-02-28|2019-01-31|
+----------+----------+
Reference
https://spark.apache.org/docs/2.2.0/api/python/pyspark.sql.html#pyspark.sql.functions.add_months