In this post, we will learn how to create a DataFrame in Python using pandas. There are multiple ways to do this.
DataFrame
A DataFrame is a data structure provided by the pandas library, not a built-in Python type like str or int. It looks like a table.
It consists of rows and columns, so we can think of it as a labeled two-dimensional array.
Here we are using pandas to create the data frame. Pandas is a fast and powerful open-source package.
For more details, refer to the documentation below
https://pandas.pydata.org/
Installing pandas using pip
pip install pandas
Installing pandas using conda
conda install pandas
To use pandas, we first need to install the pandas package on our machine.
Open the terminal or command prompt and run either one of the above commands.
Once pandas is installed, import it using the command below
import pandas as pd
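A quick way to confirm that the installation worked is to print the installed version. This is only a minimal check, and the variable name installed_version is illustrative; the version string you see depends on your environment.

```python
import pandas as pd

# If the import above raises ImportError, pandas is not installed
# in the Python environment that is currently active.
installed_version = pd.__version__
print("pandas version:", installed_version)
```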
Here I am going to create a data frame with Avengers details, as shown in the image below.
Below are several ways to create a DataFrame in Python using pandas.
- Create data frame from list
- Create data frame using dictionary
- Create data frame from CSV file
- Load MySQL table as a data frame using pandas
- Load MongoDB collection as a data frame
1. Create data frame from list
import pandas as pd
avengers_column_details = ['ID', 'Character Name', 'Real Name']
avengers_data = [
    [1, 'Hulk', 'Mark Ruffalo'],
    [2, 'Thor', 'Chris Hemsworth'],
    [3, 'Black Widow', 'Scarlett Johansson'],
    [4, 'Iron Man', 'Robert Downey Jr'],
    [5, 'Captain America', 'Chris Evans'],
]
df_avengers_details = pd.DataFrame(avengers_data, columns=avengers_column_details)
print("Created Dataframe using List")
print(df_avengers_details)
Output using list
Created Dataframe using List
   ID   Character Name           Real Name
0   1             Hulk        Mark Ruffalo
1   2             Thor     Chris Hemsworth
2   3      Black Widow  Scarlett Johansson
3   4         Iron Man    Robert Downey Jr
4   5  Captain America         Chris Evans
In the above example, we created a data frame from a list of lists. Each inner list becomes one row, and the columns argument supplies the column names.
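To see the rows-and-columns structure for yourself, the sketch below builds a shortened version of the same data and inspects it. The names df_named and df_unnamed are illustrative; shape, ndim and columns are standard DataFrame attributes. It also shows what happens when the columns argument is left out.

```python
import pandas as pd

avengers_data = [
    [1, 'Hulk', 'Mark Ruffalo'],
    [2, 'Thor', 'Chris Hemsworth'],
    [3, 'Black Widow', 'Scarlett Johansson'],
]

# Each inner list is one row; `columns` names the positions within a row.
df_named = pd.DataFrame(avengers_data, columns=['ID', 'Character Name', 'Real Name'])
print(df_named.shape)  # (number of rows, number of columns)
print(df_named.ndim)   # number of dimensions of the data frame

# Without `columns`, pandas falls back to integer labels 0, 1, 2, ...
df_unnamed = pd.DataFrame(avengers_data)
print(list(df_unnamed.columns))
```

Passing explicit column names is usually worth it, since later code can then select columns by a readable label instead of a position.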
2. Create data frame using dictionary
import pandas as pd
dict_avengers_data = {
    "ID": [1, 2, 3, 4, 5],
    "Character Name": ['Hulk', 'Thor', 'Black Widow', 'Iron Man', 'Captain America'],
    "Real Name": ['Mark Ruffalo', 'Chris Hemsworth', 'Scarlett Johansson', 'Robert Downey Jr', 'Chris Evans'],
}
df_avengers_dict = pd.DataFrame(dict_avengers_data)
print("Created Dataframe using dict")
print(df_avengers_dict)
Output
Created Dataframe using dict
   ID   Character Name           Real Name
0   1             Hulk        Mark Ruffalo
1   2             Thor     Chris Hemsworth
2   3      Black Widow  Scarlett Johansson
3   4         Iron Man    Robert Downey Jr
4   5  Captain America         Chris Evans
Here we created a data frame from a dictionary and printed the output. Each key becomes a column name, and each list supplies that column's values.
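A dictionary of lists is column-oriented. When the data arrives one record at a time, a list of dictionaries may be more natural. The sketch below is one possible variation, not part of the original example; avengers_records is a hypothetical name, and one record omits a key on purpose to show how pandas handles it.

```python
import pandas as pd

# Hypothetical row-oriented records: one dictionary per character.
avengers_records = [
    {"ID": 1, "Character Name": "Hulk", "Real Name": "Mark Ruffalo"},
    {"ID": 2, "Character Name": "Thor", "Real Name": "Chris Hemsworth"},
    {"ID": 3, "Character Name": "Black Widow"},  # "Real Name" left out on purpose
]

df_avengers_records = pd.DataFrame(avengers_records)
print(df_avengers_records)

# A key that is missing from a record shows up as a missing value in that row.
missing_real_name = df_avengers_records["Real Name"].isna().tolist()
print(missing_real_name)
```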
3. Create data frame from csv file
In the code below, we load a CSV file into a data frame with the pandas read_csv function
import pandas as pd
df_avenger_data_csv = pd.read_csv("D:/avenger_details.csv")
print("Created Dataframe using csv file")
print(df_avenger_data_csv)
print("\n")
Output
Created Dataframe using csv file
   ID   Character Name           Real Name
0   1             Hulk        Mark Ruffalo
1   2             Thor     Chris Hemsworth
2   3      Black Widow  Scarlett Johansson
3   4         Iron Man    Robert Downey Jr
4   5  Captain America         Chris Evans
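If you do not have the CSV file at hand, the same call can be tried against an in-memory string. The sketch below assumes the file has a header row and comma separators, which matches the output above but is not confirmed by the post; csv_text is a stand-in for the file contents. It also shows the index_col parameter of read_csv, which uses a column as the row index.

```python
import io
import pandas as pd

# In-memory stand-in for avenger_details.csv (assumed layout: header row, commas).
csv_text = """ID,Character Name,Real Name
1,Hulk,Mark Ruffalo
2,Thor,Chris Hemsworth
3,Black Widow,Scarlett Johansson
4,Iron Man,Robert Downey Jr
5,Captain America,Chris Evans
"""

# read_csv accepts a file path or any file-like object.
df_from_text = pd.read_csv(io.StringIO(csv_text))
print(df_from_text.shape)

# index_col makes the ID column the row index instead of 0, 1, 2, ...
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col="ID")
print(df_indexed.loc[3, "Character Name"])
```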
4. Load MySQL table as a data frame using pandas
To load MySQL table data as a data frame, we need a MySQL connector library. You can install it using the command below
pip install mysql-connector-python
Once the MySQL connector is installed, create a MySQL connection object and pass it, together with the query, to pandas as shown below. Recent pandas versions officially support SQLAlchemy connectables and sqlite3 connections in read_sql, so passing a mysql.connector connection generally works but may emit a UserWarning.
import pandas as pd
import mysql.connector
mysql_connection = mysql.connector.connect(host="localhost", user="root", password="password", database="avengers")
df = pd.read_sql("select * from avengersdetails", mysql_connection)
print("Created Dataframe from mysql table")
print(df)
mysql_connection.close()
Output
Created Dataframe from mysql table
   ID    CharacterName            RealName
0   1             Hulk        Mark Ruffalo
1   2             Thor     Chris Hemsworth
2   3      Black Widow  Scarlett Johansson
3   4         Iron Man    Robert Downey Jr
4   5  Captain America         Chris Evans
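The read_sql pattern can be tried without a running MySQL server. The sketch below swaps in Python's built-in sqlite3 module with an in-memory database, so it is a stand-in and not the MySQL setup itself. The table and column names mirror the example above, and the try/finally block is one way to make sure the connection is closed even if the query fails. Note that the placeholder syntax depends on the driver: sqlite3 uses ?, while mysql-connector-python uses %s.

```python
import sqlite3
import pandas as pd

# In-memory SQLite database used as a stand-in for the MySQL server.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE avengersdetails (ID INTEGER, CharacterName TEXT, RealName TEXT)"
)
connection.executemany(
    "INSERT INTO avengersdetails VALUES (?, ?, ?)",
    [(1, "Hulk", "Mark Ruffalo"), (2, "Thor", "Chris Hemsworth")],
)
connection.commit()

try:
    # Same call shape as the MySQL example: a query plus a connection object.
    # `params` passes the value separately instead of formatting it into the SQL.
    df_sqlite = pd.read_sql(
        "SELECT * FROM avengersdetails WHERE ID >= ?", connection, params=(1,)
    )
finally:
    connection.close()  # release the connection even if the query fails

print(df_sqlite)
```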
5. Load MongoDB collection as a data frame
To load MongoDB collection data as a data frame, we need the pymongo library. You can install it using the command below
pip install pymongo
Once pymongo is installed, create the MongoDB connection object, query the collection, and convert the resulting documents to a pandas data frame.
For connecting Python with MongoDB, refer to this post
https://beginnersbug.com/python-with-mongodb/
import pandas as pd
import pymongo
mongodb_connection = pymongo.MongoClient("mongodb://localhost:27017/")
mongodb_db = mongodb_connection["avengers"]
mongodb_avengers = mongodb_db["avengersdetails"].find()
df_mongodb_avengers = pd.DataFrame(list(mongodb_avengers))
print("Created Dataframe from mongodb collections")
print(df_mongodb_avengers)
Output
Created Dataframe from mongodb collections
                        _id  ID   Character Name           Real Name
0  5fd0e603549a851a24a48c36   1             Hulk        Mark Ruffalo
1  5fd0e603549a851a24a48c37   2             Thor     Chris Hemsworth
2  5fd0e603549a851a24a48c38   3      Black Widow  Scarlett Johansson
3  5fd0e603549a851a24a48c39   4         Iron Man    Robert Downey Jr
4  5fd0e603549a851a24a48c3a   5  Captain America         Chris Evans
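The output above includes the _id column that MongoDB adds to every document. If you do not need it, it can be dropped after loading. The sketch below uses plain dictionaries as a stand-in for the documents a find() cursor would yield, so it runs without a MongoDB server; the _id strings are placeholders, not real ObjectId values.

```python
import pandas as pd

# Plain dictionaries standing in for the documents returned by find().
# The "_id" strings are placeholders, not real ObjectId values.
mongodb_documents = [
    {"_id": "doc-1", "ID": 1, "Character Name": "Hulk", "Real Name": "Mark Ruffalo"},
    {"_id": "doc-2", "ID": 2, "Character Name": "Thor", "Real Name": "Chris Hemsworth"},
]

df_with_id = pd.DataFrame(mongodb_documents)

# Drop MongoDB's internal key so only the application fields remain.
df_without_id = df_with_id.drop(columns="_id")
print(df_without_id)
```

With a live collection, a projection such as find({}, {"_id": 0}) should exclude the field at query time, which avoids transferring it in the first place.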