Pandas DataFrames
In this article, we will learn how to structure data with Pandas DataFrames: accessing data, modifying columns and rows, and using indexing techniques for efficient data analysis in Python.
Think of a DataFrame as a two-dimensional labeled data structure, like a spreadsheet with rows and columns. Each column represents a specific variable, and each row represents a data point (observation). Because each column can hold its own data type (for example, numbers in one and text in another), a single DataFrame can organize mixed data for analysis, which makes it useful across many data science tasks.
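To make the spreadsheet analogy concrete, here is a minimal sketch that builds a small DataFrame and inspects its rows, columns, and per-column data types. The names and values are illustrative sample data only.

```python
import pandas as pd

# Illustrative sample data: each key becomes a column, each list position a row.
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 22],
    "City": ["London", "New York", "Paris"],
}
df = pd.DataFrame(data)

n_rows, n_cols = df.shape        # (number of rows, number of columns)
column_names = list(df.columns)  # column labels
row_labels = list(df.index)      # default integer row labels

print(n_rows, n_cols)
print(column_names)
print(row_labels)
print(df.dtypes)                 # each column keeps its own data type
```

Note that the row labels default to 0, 1, 2 when no index is supplied.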
df.head(): Peek at the first few rows of your DataFrame to get a quick glimpse at the data.
df.tail(): Examine the last few rows for a sense of the data’s end.
df.columns: View a list of all column names (variable names) in your DataFrame.
Accessing Data by Label (Index):
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 22], 'City': ['London', 'New York', 'Paris']}
df = pd.DataFrame(data)
print(df['Name'])
Output:
0      Alice
1        Bob
2    Charlie
Name: Name, dtype: object
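Column labels are only one kind of label. As a hedged sketch, the example below also selects several columns at once and looks up a row by label with .loc. Using the 'Name' column as the row index (via set_index) is a choice made here for illustration, not something the original example does.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 22],
    "City": ["London", "New York", "Paris"],
})

ages = df["Age"]               # a single label returns a Series
subset = df[["Name", "City"]]  # a list of labels returns a DataFrame

# Assumption for this sketch: use 'Name' as the row labels.
by_name = df.set_index("Name")             # returns a new DataFrame, df is unchanged
bob_city = by_name.loc["Bob", "City"]      # row label, then column label

print(subset)
print(bob_city)
```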
# Select by integer position: rows 0 to 4 (or fewer if the DataFrame is shorter), all columns
first_few_rows = df.iloc[0:5, :]
print(first_few_rows)
Output:
      Name  Age      City
0    Alice   25    London
1      Bob   30  New York
2  Charlie   22     Paris
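The iloc example above asks for five rows but only three come back, because position-based slices are clipped to the data that exists. A small sketch of a few other position-based selections, using the same sample data:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 22],
    "City": ["London", "New York", "Paris"],
})

first_two = df.iloc[0:2, :]  # rows at positions 0 and 1 (end of slice is excluded)
last_row = df.iloc[-1]       # negative positions count from the end
cell = df.iloc[1, 2]         # row position 1, column position 2 -> 'New York'
oversized = df.iloc[0:5, :]  # slice end beyond the last row is clipped, no error

print(first_two)
print(last_row)
print(cell)
```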
# Add a new column 'Country' with sample data
df['Country'] = ['England', 'USA', 'France']
print(df)
Output (showing the new 'Country' column):
      Name  Age      City  Country
0    Alice   25    London  England
1      Bob   30  New York      USA
2  Charlie   22     Paris   France
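A new column does not have to come from a hand-written list; it can also be computed from existing columns. The sketch below shows two common ways to do that. The column names 'Age_in_5_years' and 'Is_adult' are hypothetical and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 22],
    "City": ["London", "New York", "Paris"],
})

# Derived column: the arithmetic is applied to every row at once.
df["Age_in_5_years"] = df["Age"] + 5

# assign() returns a new DataFrame and leaves df itself unchanged.
df2 = df.assign(Is_adult=df["Age"] >= 18)

print(df)
print(df2)
```

When assigning a plain list, as in the 'Country' example, the list needs one value per row.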
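# Create a new row dictionary (no 'City' value is given)
new_row = {'Name': 'David', 'Age': 35, 'Country': 'Germany'}
# Append the new row to the DataFrame.
# DataFrame.append() was removed in pandas 2.0, so use pd.concat() instead.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
print(df)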
Output (with the new row added; the missing 'City' value shows as NaN):
      Name  Age      City  Country
0    Alice   25    London  England
1      Bob   30  New York      USA
2  Charlie   22     Paris   France
3    David   35       NaN  Germany
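David's row was added without a 'City' value, so that cell is missing. As a hedged sketch, the example below rebuilds the same table, finds the missing entry, and fills it. The placeholder text 'Unknown' is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 22],
    "City": ["London", "New York", "Paris"],
    "Country": ["England", "USA", "France"],
})
new_row = {"Name": "David", "Age": 35, "Country": "Germany"}  # no 'City' key
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

missing_city = df["City"].isna()  # True only where 'City' is missing
print(df[missing_city])

# 'Unknown' is a placeholder chosen for this sketch; fillna returns a new DataFrame.
filled = df.fillna({"City": "Unknown"})
print(filled)
```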
# Modify the 'Age' value for Alice.
# The rows are labeled 0, 1, 2, 3 (not by name), so select her row by matching the 'Name' column.
df.loc[df['Name'] == 'Alice', 'Age'] = 30
print(df)
Output (with Alice's age modified):
      Name  Age      City  Country
0    Alice   30    London  England
1      Bob   30  New York      USA
2  Charlie   22     Paris   France
3    David   35       NaN  Germany
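With the default index, rows are labeled by integers, so a name such as 'Alice' is a value in the 'Name' column rather than a row label. The sketch below shows two ways to update a single value under that assumption: a boolean mask with .loc, and .at with the integer row label.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 22],
})

# Option 1: boolean mask, updates every row whose 'Name' equals 'Alice'.
df.loc[df["Name"] == "Alice", "Age"] = 30

# Option 2: .at sets one cell by row label and column label.
df.at[2, "Age"] = 23  # row label 2 is Charlie in this sample

print(df)
```

The mask version is usually the safer choice when the row position is not known in advance.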
Pandas DataFrames offer a vast array of functionalities for data analysis, including filtering, sorting, grouping, and aggregation. This guide provides a springboard for you to delve deeper and unlock the full potential of DataFrames in your Python projects.
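As a small taste of those operations, here is a minimal sketch of filtering, sorting, grouping, and aggregation. The data is hypothetical: David's country is set to 'USA' here purely so that one group has two members.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 22, 35],
    "Country": ["England", "USA", "France", "USA"],
})

over_24 = df[df["Age"] > 24]                      # filtering with a boolean mask
by_age = df.sort_values("Age", ascending=False)   # sorting, oldest first
mean_age = df.groupby("Country")["Age"].mean()    # grouping + aggregation
summary = df.groupby("Country")["Age"].agg(["mean", "max"])  # several aggregates

print(over_24)
print(by_age)
print(mean_age)
print(summary)
```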
Hi, I am Vishal Jaiswal. I have about a decade of experience working in MNCs such as Genpact, Savista, and Ingenious. Currently, I am working at EXL as a senior quality analyst. Using my writing skills, I want to share the experience I have gained and help as many people as I can.