Day 14 of Learning Python for Data Science
Welcome to Day 14 of our Python for Data Science journey! Today, we explored Seaborn, a high-level Python visualization library that builds on Matplotlib. It’s an essential tool for any aspiring data scientist who wants to uncover insights hidden within the data.
Seaborn offers:
We first created a synthetic dataset using numpy and pandas:
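import numpy as np
import pandas as pd
import seaborn as sns              # used by every plot below
import matplotlib.pyplot as plt    # used for titles and plt.show()

# 100 synthetic employee records; every column is drawn independently at random
data = {
    'Age': np.random.randint(18, 70, 100),               # integers from 18 to 69
    'Salary': np.random.randint(30000, 120000, 100),     # integers from 30000 to 119999
    'Department': np.random.choice(['HR', 'IT', 'Finance', 'Marketing'], 100),
    'Performance_Score': np.random.uniform(1, 5, 100).round(1)
}
df = pd.DataFrame(data)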
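Since these columns are drawn at random, each run yields a different dataset and slightly different plots. The following is a minimal sketch of a reproducible variant, assuming NumPy's `default_rng` generator and an arbitrary seed of 42. The helper name `make_employee_df` is hypothetical and not part of the original lesson.

```python
import numpy as np
import pandas as pd


def make_employee_df(seed=42, n=100):
    """Build the same kind of synthetic dataset, but reproducibly."""
    rng = np.random.default_rng(seed)  # same seed, same data on every run
    return pd.DataFrame({
        'Age': rng.integers(18, 70, n),
        'Salary': rng.integers(30000, 120000, n),
        'Department': rng.choice(['HR', 'IT', 'Finance', 'Marketing'], n),
        'Performance_Score': rng.uniform(1, 5, n).round(1),
    })


df_seeded = make_employee_df()
print(df_seeded.head())    # first five rows
print(df_seeded.dtypes)    # column types
```

Passing a different `seed` gives a different but equally repeatable dataset, which makes it easier to compare plots between runs.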
# Histogram of Age in 10 bins, with a kernel density estimate (KDE) curve overlaid
sns.histplot(df['Age'], kde=True, bins=10)
plt.title('Age Distribution')
plt.show()
Observation:
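To put numbers behind the histogram, the same binning can be approximated with NumPy. This is a rough sketch, assuming that `bins=10` means ten equal-width bins over the observed range; the `ages` array is a stand-in for `df['Age']`, generated with an arbitrary seed.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, for repeatability
ages = rng.integers(18, 70, 100)      # stand-in for df['Age']

# Split the observed age range into 10 equal-width bins and count each one.
counts, edges = np.histogram(ages, bins=10)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count}")
```

With uniformly random ages, the counts should be roughly similar across bins, with some run-to-run noise.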
# Bar plot: mean Salary per Department (error bars show a 95% confidence interval by default)
sns.barplot(x='Department', y='Salary', data=df)
plt.title('Average Salary by Department')
plt.show()
Observation:
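The bar heights are group means, so they can be checked with a plain pandas `groupby`. Here is a small sketch using a hand-made table with hypothetical values, chosen so the averages are easy to verify by eye.

```python
import pandas as pd

# Tiny hand-made table (hypothetical values) so the averages are easy to verify.
toy = pd.DataFrame({
    'Department': ['HR', 'IT', 'HR', 'IT'],
    'Salary': [40000, 60000, 50000, 80000],
})

# By default, sns.barplot draws the mean of y for each x category.
mean_salary = toy.groupby('Department')['Salary'].mean()
print(mean_salary)   # HR: 45000.0, IT: 70000.0
```

Running the same `groupby` on `df` gives the values behind each bar in the plot above.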
# Count plot: number of employees in each Department
sns.countplot(x='Department', data=df)
plt.title('Employee Count by Department')
plt.show()
Observation:
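A count plot is a picture of `value_counts()`. As a quick sketch, assuming a short hypothetical list of departments:

```python
import pandas as pd

# Hypothetical department labels for six employees.
departments = pd.Series(['HR', 'IT', 'IT', 'Finance', 'IT', 'HR'], name='Department')

counts = departments.value_counts()                 # rows per category
shares = departments.value_counts(normalize=True)   # same, as fractions of the total

print(counts)
print(shares.round(2))
```

On `df['Department']`, the four random categories should each land near 25 of the 100 rows.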
# Box plot: median, quartiles, and outliers of Salary within each Department
sns.boxplot(x='Department', y='Salary', data=df)
plt.title('Salary Distribution by Department')
plt.show()
Observation:
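The parts of a box plot can be computed by hand. The sketch below assumes the common convention of whiskers reaching at most 1.5 times the interquartile range (IQR) beyond the box, and uses hypothetical salaries with one deliberately extreme value.

```python
import numpy as np

# Hypothetical salaries; the last value is deliberately extreme.
salaries = np.array([30000, 40000, 50000, 60000, 70000, 200000])

# The box spans the first to third quartile, with a line at the median.
q1, median, q3 = np.percentile(salaries, [25, 50, 75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the box are drawn as individual outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = salaries[(salaries < lower_fence) | (salaries > upper_fence)]

print(q1, median, q3)   # 42500.0 55000.0 67500.0
print(outliers)         # [200000]
```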
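# Line plot: mean Performance_Score at each Age (repeated ages are averaged, with a confidence band)
sns.lineplot(x='Age', y='Performance_Score', data=df)
plt.title('Performance Score vs. Age')
plt.show()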
Observation:
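When several rows share the same x value, `sns.lineplot` by default draws the mean of y at that x. A small sketch of that aggregation, using hypothetical scores:

```python
import pandas as pd

# Hypothetical scores; ages 25 and 30 each appear more than once.
toy = pd.DataFrame({
    'Age': [25, 25, 30, 30, 30, 40],
    'Performance_Score': [3.0, 4.0, 2.0, 3.0, 4.0, 5.0],
})

# One averaged point per distinct age, which is what the line connects.
mean_by_age = toy.groupby('Age')['Performance_Score'].mean()
print(mean_by_age)   # 25: 3.5, 30: 3.0, 40: 5.0
```

Because ages repeat often in a 100-row random sample, the line in the plot above traces these per-age averages rather than individual employees.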
# Heatmap of pairwise correlations between the numeric columns
corr = df.select_dtypes(include=['number']).corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()
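The values in a correlation heatmap depend entirely on how the columns relate to each other. The sketch below contrasts a salary drawn independently of age with a hypothetical salary that is built to rise with age. The formula `30000 + 1000 * age + noise` and the noise level are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)   # arbitrary seed
n = 1000                         # larger sample, so the estimates are stable

age = rng.integers(18, 70, n)

# Drawn without reference to age: correlation should be near zero.
independent_salary = rng.integers(30000, 120000, n)

# Hypothetical rule: salary rises with age, plus substantial random noise.
dependent_salary = 30000 + 1000 * age + rng.normal(0, 30000, n)

demo = pd.DataFrame({
    'Age': age,
    'Independent_Salary': independent_salary,
    'Dependent_Salary': dependent_salary,
})

corr_demo = demo.corr()
print(corr_demo.round(2))
```

Only the dependent column should show a clearly positive correlation with `Age`; the independent one should stay close to zero.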
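Observation:
Because every column in this synthetic dataset is drawn independently at random, the correlations in the heatmap should all be close to zero. In real employee data, a moderate positive correlation between Age and Salary (~0.4–0.5) would suggest that older employees earn more, likely due to experience.
Performance_Score has weak or negligible correlation with other variables, implying performance is not strictly tied to age or salary in this dataset.
[…] Performance_Score.
[…] Age varies by Department.
Challenge:
Add a column Education_Level (e.g., ‘Graduate’, ‘Postgraduate’, ‘Diploma’) and visualize how it influences Salary.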
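One possible starting point for the challenge is sketched below, assuming the education levels are assigned at random like the other columns. The names `df_edu` and `plot_salary_by_education` are hypothetical, and a box plot is only one of several reasonable chart choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)   # arbitrary seed
n = 100
levels = ['Graduate', 'Postgraduate', 'Diploma']

df_edu = pd.DataFrame({
    'Salary': rng.integers(30000, 120000, n),
    'Education_Level': rng.choice(levels, n),   # random, so no real effect is built in
})

# Numeric summary of Salary within each education level.
summary = df_edu.groupby('Education_Level')['Salary'].agg(['count', 'mean', 'median'])
print(summary)


def plot_salary_by_education(frame):
    """Draw one box of Salary per Education_Level."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.boxplot(x='Education_Level', y='Salary', data=frame)
    plt.title('Salary Distribution by Education Level')
    plt.show()
```

Calling `plot_salary_by_education(df_edu)` draws the chart. Since the levels here are random, any visible difference between groups is noise rather than a real effect.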
On Day 15, we’ll dive deeper into Matplotlib, which gives us more control over visual elements. This will empower you to create professional and customized plots for reports and dashboards.
Until then, keep practicing and exploring. The more you visualize, the better your intuition becomes for understanding data.
We hope this article was helpful and that you learned a lot about data science from it. If you have friends or family members who would find it helpful, please share it with them or on social media.
Hi, I am Vishal Jaiswal. I have about a decade of experience working in MNCs like Genpact, Savista, and Ingenious. Currently, I am working at EXL as a senior quality analyst. Using my writing skills, I want to share the experience I have gained and help as many people as I can.