Create a DataFrame in Python using pandas

In this post, we will learn how to create a DataFrame in Python using pandas. There are multiple ways to do this, and we will walk through five of them.

DataFrame

A DataFrame is a data structure provided by the pandas library. It is not a built-in Python type like str or int. It looks like a table.

It consists of rows and columns, so we can think of it as a two-dimensional array with labeled rows and columns.

Here we use pandas to create the DataFrame. pandas is a fast and powerful open-source data analysis package.
For more details, refer to the documentation below:
https://pandas.pydata.org/

Installing pandas using pip

pip install pandas

Installing pandas using conda

conda install pandas

To use pandas, we first need to install the pandas package on our machine.
Open the terminal/command prompt and run either one of the above commands.
Once pandas is installed, import it using the import statement below:

import pandas as pd
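
As a quick check that the installation and import worked, the sketch below prints the installed version and builds a tiny DataFrame to show the rows-and-columns structure described earlier. The variable name df_check and the two sample rows are only illustrative.

```python
import pandas as pd

# Confirm that the import works by printing the installed version.
print(pd.__version__)

# A tiny two-row, two-column DataFrame (illustrative data only).
df_check = pd.DataFrame({"ID": [1, 2], "Character Name": ["Hulk", "Thor"]})

print(df_check.shape)          # (number of rows, number of columns)
print(list(df_check.columns))  # column labels
```

The shape attribute returns a (rows, columns) tuple, which is a convenient way to confirm that the data is two-dimensional.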

Here I am going to create a DataFrame with Avengers details: an ID, a character name, and a real name for each character.

Below are multiple ways to create a DataFrame in Python using pandas.

  • Create data frame from list
  • Create data frame using dictionary
  • Create data frame from csv file
  • Load MySQL table as dataframe using pandas
  • Load MongoDB collection as dataframe
1. Create data frame from list
import pandas as pd

avengers_column_details = ['ID', 'Character Name', 'Real Name']
# Each inner list is one row: [ID, character name, real name]
avengers_data = [
    [1, 'Hulk', 'Mark Ruffalo'],
    [2, 'Thor', 'Chris Hemsworth'],
    [3, 'Black Widow', 'Scarlett Johansson'],
    [4, 'Iron Man', 'Robert Downey Jr'],
    [5, 'Captain America', 'Chris Evans'],
]

df_avengers_details = pd.DataFrame(avengers_data, columns=avengers_column_details)
print("Created Dataframe using List")
print(df_avengers_details)
Output using list
 Created Dataframe using List
    ID   Character Name           Real Name
 0   1             Hulk        Mark Ruffalo
 1   2             Thor     Chris Hemsworth
 2   3      Black Widow  Scarlett Johansson
 3   4         Iron Man    Robert Downey Jr
 4   5  Captain America         Chris Evans

In the above example, we created a DataFrame from a list of lists. Each inner list becomes one row, and the columns argument supplies the column names.
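
Sometimes the data arrives as one list per column rather than one list per row. A minimal sketch, assuming three separate lists (the names ids, character_names and real_names are made up for this illustration), is to combine them into rows with Python's built-in zip before passing them to pd.DataFrame:

```python
import pandas as pd

# Column-wise lists, one per field (first three Avengers only).
ids = [1, 2, 3]
character_names = ['Hulk', 'Thor', 'Black Widow']
real_names = ['Mark Ruffalo', 'Chris Hemsworth', 'Scarlett Johansson']

# zip() pairs up the i-th element of each list, giving one tuple per row.
rows = list(zip(ids, character_names, real_names))

df_from_zip = pd.DataFrame(rows, columns=['ID', 'Character Name', 'Real Name'])
print(df_from_zip)
```

pandas treats a list of tuples the same way as a list of lists here, so each tuple becomes one row.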

2. Create data frame using dictionary
import pandas as pd

dict_avengers_data={"ID": [1, 2, 3, 4, 5],
                    "Character Name": ['Hulk', 'Thor', 'Black Widow', 'Iron Man', 'Captain America'],
                    "Real Name": ['Mark Ruffalo', 'Chris Hemsworth', 'Scarlett Johansson', 'Robert Downey Jr', 'Chris Evans']}
df_avengers_dict = pd.DataFrame(dict_avengers_data)
print("Created Dataframe using dict")
print(df_avengers_dict)
Output
Created Dataframe using dict
    ID   Character Name           Real Name
 0   1             Hulk        Mark Ruffalo
 1   2             Thor     Chris Hemsworth
 2   3      Black Widow  Scarlett Johansson
 3   4         Iron Man    Robert Downey Jr
 4   5  Captain America         Chris Evans

Here we created a DataFrame from a dictionary and printed it. Each key becomes a column name, and each list holds the values of that column.
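
If the dictionary is organised by row instead of by column, pd.DataFrame.from_dict with orient="index" can be used. The sketch below assumes a hypothetical dictionary keyed by ID, where each value holds the remaining fields of that row:

```python
import pandas as pd

# Hypothetical row-oriented dictionary: key = ID, value = the row's other fields.
avengers_by_id = {
    1: ['Hulk', 'Mark Ruffalo'],
    2: ['Thor', 'Chris Hemsworth'],
    3: ['Black Widow', 'Scarlett Johansson'],
}

# orient="index" tells pandas that the dictionary keys are row labels.
df_by_id = pd.DataFrame.from_dict(
    avengers_by_id, orient="index", columns=['Character Name', 'Real Name']
)
df_by_id.index.name = 'ID'
print(df_by_id)
```

In this variant the IDs end up in the index rather than in a regular column, which may or may not be what you want.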

3. Create data frame from csv file

In the code below, we load a CSV file as a DataFrame using the pandas read_csv function.

import pandas as pd

# Adjust the path to wherever the CSV file is stored on your machine
df_avenger_data_csv = pd.read_csv("D:/avenger_details.csv")
print("Created Dataframe using csv file")
print(df_avenger_data_csv)
print("\n")
Output
Created Dataframe using csv file
    ID   Character Name           Real Name
 0   1             Hulk        Mark Ruffalo
 1   2             Thor     Chris Hemsworth
 2   3      Black Widow  Scarlett Johansson
 3   4         Iron Man    Robert Downey Jr
 4   5  Captain America         Chris Evans
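
To try read_csv without a file on disk, the sketch below feeds it the same data from an in-memory string. The CSV layout (a header row followed by comma-separated values) is an assumption about what avenger_details.csv contains. It also shows the index_col parameter, which uses an existing column as the row index.

```python
import io
import pandas as pd

# In-memory stand-in for avenger_details.csv (assumed layout: header row, comma-separated).
csv_text = """ID,Character Name,Real Name
1,Hulk,Mark Ruffalo
2,Thor,Chris Hemsworth
3,Black Widow,Scarlett Johansson
4,Iron Man,Robert Downey Jr
5,Captain America,Chris Evans
"""

# read_csv accepts a file path or any file-like object.
df_csv = pd.read_csv(io.StringIO(csv_text))
print(df_csv)

# index_col uses the ID column as the row index instead of the default 0..4.
df_csv_indexed = pd.read_csv(io.StringIO(csv_text), index_col="ID")
print(df_csv_indexed)
```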
4. Load MySQL table as dataframe using pandas

To load MySQL table data as a DataFrame, we need a MySQL connector library. You can install it using the command below:

 pip install mysql-connector-python

Once the MySQL connector is installed, create a MySQL connection object and pass it, together with the query, to pandas as shown below:

import pandas as pd
import mysql.connector

mysql_connection = mysql.connector.connect(host="localhost", user="root", password="password", database="avengers")
df = pd.read_sql("select * from avengersdetails", mysql_connection)
print("Created Dataframe from mysql table")
print(df)
mysql_connection.close()
Output
Created Dataframe from mysql table
    ID    CharacterName            RealName
 0   1             Hulk        Mark Ruffalo
 1   2             Thor     Chris Hemsworth
 2   3      Black Widow  Scarlett Johansson
 3   4         Iron Man    Robert Downey Jr
 4   5  Captain America         Chris Evans
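
The same read_sql pattern can be tried without a running MySQL server. The sketch below swaps in Python's built-in sqlite3 module with an in-memory database, so it is a stand-in rather than the MySQL setup above. The table and column names mirror the ones in this post, and the params argument shows one way to pass a value into the query without formatting it into the SQL string.

```python
import sqlite3
import pandas as pd

# In-memory SQLite database as a stand-in for the MySQL server.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE avengersdetails (ID INTEGER, CharacterName TEXT, RealName TEXT)"
)
connection.executemany(
    "INSERT INTO avengersdetails VALUES (?, ?, ?)",
    [
        (1, 'Hulk', 'Mark Ruffalo'),
        (2, 'Thor', 'Chris Hemsworth'),
        (3, 'Black Widow', 'Scarlett Johansson'),
    ],
)
connection.commit()

# params supplies the value for the ? placeholder separately from the SQL text.
df_sql = pd.read_sql(
    "SELECT * FROM avengersdetails WHERE ID <= ? ORDER BY ID",
    connection,
    params=(2,),
)
print(df_sql)
connection.close()
```

Note that placeholder syntax depends on the database driver: sqlite3 uses ?, while mysql-connector-python uses %s. Recent pandas versions may also print a warning when given a raw connection other than sqlite3, recommending an SQLAlchemy connectable instead.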
5. Load MongoDB collection as dataframe

To load MongoDB collection data as a DataFrame, we need the pymongo library. You can install it using the command below:

pip install pymongo

Once pymongo is installed, create a MongoDB connection object. After that, fetch the documents from the collection and convert them to a pandas DataFrame.

For connecting Python with MongoDB, refer to this post:
https://beginnersbug.com/python-with-mongodb/

import pandas as pd
import pymongo

mongodb_connection = pymongo.MongoClient("mongodb://localhost:27017/")
mongodb_db = mongodb_connection["avengers"]
mongodb_avengers = mongodb_db["avengersdetails"].find()
df_mongodb_avengers = pd.DataFrame(list(mongodb_avengers))
print("Created Dataframe from mongodb collections")
print(df_mongodb_avengers)
Output
Created Dataframe from mongodb collections
                         _id ID   Character Name           Real Name
 0  5fd0e603549a851a24a48c36  1             Hulk        Mark Ruffalo
 1  5fd0e603549a851a24a48c37  2             Thor     Chris Hemsworth
 2  5fd0e603549a851a24a48c38  3      Black Widow  Scarlett Johansson
 3  5fd0e603549a851a24a48c39  4         Iron Man    Robert Downey Jr
 4  5fd0e603549a851a24a48c3a  5  Captain America         Chris Evans
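
The _id column in the output is generated by MongoDB and is often not needed in the DataFrame. The sketch below uses plain dictionaries as a stand-in for the documents that find() returns, so it runs without a MongoDB server. The _id values are placeholder strings, not real ObjectIds.

```python
import pandas as pd

# Plain dictionaries standing in for the documents returned by find();
# the _id values here are placeholder strings, not real ObjectIds.
documents = [
    {"_id": "a1", "ID": 1, "Character Name": "Hulk", "Real Name": "Mark Ruffalo"},
    {"_id": "a2", "ID": 2, "Character Name": "Thor", "Real Name": "Chris Hemsworth"},
]

# A list of dictionaries becomes one row per dictionary, one column per key.
df_docs = pd.DataFrame(documents)

# Drop the MongoDB-generated _id column when it is not needed.
df_docs = df_docs.drop(columns=["_id"])
print(df_docs)
```

With pymongo, the field can also be excluded at query time by passing a projection, for example find({}, {"_id": 0}), so that it never reaches pandas.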