Categories
collections java

join two arraylist in java with example

In this post, we will learn to join two arraylist in java with example

ArrayList is a dynamic array concept in java. We use ArrayList in so many places.

Syntax
list1.addAll(list2);

From the above syntax list2 will added to list1.

Example
import java.util.ArrayList;
import java.util.List;

public class Join2List {

	public static void main(String[] args) {
		try {
			List<String> list1 = new ArrayList<String>();
			List<String> list2 = new ArrayList<String>();

			list1.add("Rajesh");
			list1.add("Usha");

			list2.add("Kumar");
			list2.add("Nandhini");

			System.out.println("---List1 value before adding list2---");
			for (String string : list1) {
				System.out.println(string);
			}

			// Adding list2 to list1
			list1.addAll(list2);

			System.out.println("---List1 value After adding list2---");

			for (String string : list1) {
				System.out.println(string);
			}
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

}
Output
---List1 value before adding list2---
Rajesh
Usha
---List1 value After adding list2---
Rajesh
Usha
Kumar
Nandhini
Github

https://github.com/rkumar9090/BeginnersBug/blob/master/BegineersBug/src/com/geeks/example/list/Join2List.java

Related Articles

ArrayList in java with example

Categories
pyspark

Inner join in pyspark dataframe with example

In this post, We will learn about Inner join in pyspark dataframe with example. 

Types of join in pyspark dataframe

Before proceeding with the post, we will get familiar with the types of join available in pyspark dataframe.

Types  of join: inner join, cross join, outer join, full join, full_outer join, left join, left_outer join, right join, right_outer join, left_semi join, and left_anti join

What is Inner join ?

As similar to  SQL , Inner join helps us to get the matching records between two datasets . To understand it better , we will create two dataframes with the following piece of code .

Sample program for creating two dataframes
spark = SparkSession.builder.appName("Inner Join").getOrCreate()
from pyspark.sql import Row
# Creating dictionary with columns Emp_id and Emp_name
dict=[{"Emp_id" : 123 , "Emp_name" : "Raja" }, {"Emp_id" : 456 , "Emp_name" : "Ravi"}]
# Creating RDD from the above dictionary using parallelize method
rdd=sc.parallelize(dict)
# Converting RDD to dataframe 
df=rdd.toDF()
print("Printing the first dataframe df")
df.show()
Printing the first dataframe df 
+------+--------+
|Emp_id|Emp_name|
+------+--------+
|   123|    Raja|
|   456|    Ravi|
+------+--------+
# Creating dictionary with columns Emp_id and Dep_name 
dict1=[{"Emp_id" : 123 , "Dep_name" : "Computer" } , {"Emp_id" : 456 ,"Dep_name"  :"Economy"} , {"Emp_id" : 789 , "Dep_name" : "History"}]
# Creating RDD from the above dictionary using parallelize method
rdd1=sc.parallelize(dict1)
# Converting RDD to dataframe  
df1=rdd1.toDF()
print("Printing the second dataframe df1")
df1.show()
Printing the second dataframe df1
+--------+------+
|Dep_name|Emp_id|
+--------+------+
|Computer|   123|
| Economy|   456|
| History|   789|
+--------+------+
How to do inner join ?

The syntax of join requires three parameters to be passed –

1) The dataframe to be joined with

2) Column to be checked for

3) Type of join to be do . 

By default , Inner join will be taken for the third parameter if no input is passed .

First method

Let us see the first method in understanding Inner join in pyspark dataframe with example.

# Inner joining the two dataframes df and df1 based on the column Emp_id 
df2=df.join(df1,['Emp_id'], how = 'inner')
print("Printing the dataframe df2")
df2.show()
Printing the dataframe df2
+------+--------+--------+
|Emp_id|Emp_name|Dep_name|
+------+--------+--------+
|   123|    Raja|Computer|
|   456|    Ravi| Economy|
+------+--------+--------+
Second method
# Inner joining the two dataframes df and df1 based on the column Emp_id with default join i.e inner join
df3=df.join(df1,['Emp_id'])
print("Printing the dataframe df3")
df3.show()
Printing the dataframe df3
+------+--------+--------+
|Emp_id|Emp_name|Dep_name|
+------+--------+--------+
|   123|    Raja|Computer|
|   456|    Ravi| Economy|
+------+--------+--------+
Reference

https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=join#pyspark.sql.DataFrame.join