Pandas, a cornerstone library in Python’s data science arsenal, offers a treasure trove of tools for data manipulation and analysis. At its core lies the Series, a one-dimensional powerhouse capable of storing and manipulating data of various types. This guide empowers you to grasp the essence of Pandas Series, unlocking its potential for efficient data handling in your Python projects.
Imagine a single, flexible column in a spreadsheet – that’s the essence of a Pandas Series! It’s a one-dimensional labeled array, meaning each data point has a corresponding label (index) for effortless retrieval and organization. This structure excels in handling various data types, including numbers, text, and booleans, making it a versatile tool for diverse data wrangling tasks.
Here’s a table to illustrate the concept:
Index (Labels) | Data |
---|---|
Fruit 1 | Apple |
Fruit 2 | Banana |
Fruit 3 | Cherry |
Fruit 4 | Mango |
This table represents a Series with fruits as data and custom labels (“Fruit 1”, “Fruit 2”, etc.) as the index. You can access specific fruits using their corresponding labels or positions.
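As a rough sketch of the table above in code, the snippet below builds the same Series and looks at its two parts, the index and the data, separately. The variable name `fruits` is only an illustrative choice, and `.loc` / `.iloc` are used here as the explicit ways to look up by label and by position.

```python
import pandas as pd

# Build the Series from the table: values plus custom labels.
# (The name `fruits` is an assumption made for this example.)
fruits = pd.Series(
    ["Apple", "Banana", "Cherry", "Mango"],
    index=["Fruit 1", "Fruit 2", "Fruit 3", "Fruit 4"],
)

# The labels and the data can each be inspected on their own.
print(list(fruits.index))  # the four labels
print(fruits.tolist())     # the four values
print(len(fruits))         # number of elements

# .loc looks up by label, .iloc looks up by integer position.
by_label = fruits.loc["Fruit 2"]
by_position = fruits.iloc[1]
print(by_label, by_position)
```

Both lookups point at the same element here, because “Fruit 2” is the label of the second row (position 1).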
Creating a Series from a list:
import pandas as pd
data = [1, 5, 8, 2, 4]
my_series = pd.Series(data)
print(my_series)
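- `import pandas as pd` imports the pandas library under the alias `pd` for convenience.
- A list named `data` is created containing numerical values.
- The `pd.Series(data)` call creates a Series from the list `data`. Because no index is given, pandas assigns the default integer labels 0 through 4.
- `print(my_series)` displays the newly created Series.
Output:
0 1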
1 5
2 8
3 2
4 4
dtype: int64
The `dtype: int64` line at the end of the output indicates the data type of the elements in the Series, which in this case is integer (`int64`).
Creating a Series from a dictionary:
data = {"apple": 10, "banana": 15, "cherry": 20}
my_series = pd.Series(data)
print(my_series)
- A dictionary named `data` is created with keys representing fruit names and values representing their prices.
- The `pd.Series(data)` call creates a Pandas Series from the dictionary `data`.
- `print(my_series)` displays the Series with fruit names as the index and their corresponding prices as the data.
Output:
apple 10
banana 15
cherry 20
dtype: int64
The `dtype: int64` line indicates the data type of the Series elements (integers in this case).
Creating a Series with custom index labels:
data = ["apple", "banana", "cherry", "mango"]
labels = ["Fruit 1", "Fruit 2", "Fruit 3", "Fruit 4"]
my_series = pd.Series(data, index=labels)
print(my_series)
- A list named `data` is created containing fruit names.
- A list named `labels` is created with a custom label for each fruit.
- The `pd.Series(data, index=labels)` call creates a Series from the list `data` and assigns the custom labels from `labels` as the index. The two lists must have the same length.
- `print(my_series)` displays the Series with custom labels (“Fruit 1”, “Fruit 2”, etc.) as the index and fruit names as the data.
Output:
Fruit 1 apple
Fruit 2 banana
Fruit 3 cherry
Fruit 4 mango
dtype: object
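- The labels in the left column (“Fruit 1”, “Fruit 2”, and so on) come from the `labels` list.
- Each value comes from the `data` list at the matching position. For example, “apple” sits in the `data` list at the same position (0th index) as the “Fruit 1” label.
- `dtype: object` indicates that the elements are stored as generic Python objects, which is how pandas has traditionally stored strings. An object Series can hold values of various types.
Accessing an element by label:
my_series = pd.Series(["apple", "banana", "cherry", "mango"], index=["Fruit 1", "Fruit 2", "Fruit 3", "Fruit 4"])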
fruit_at_index_2 = my_series["Fruit 2"]
print(fruit_at_index_2)
- `my_series` is created with a list of fruits (`["apple", "banana", "cherry", "mango"]`) and custom labels (index) as a separate list (`["Fruit 1", "Fruit 2", "Fruit 3", "Fruit 4"]`).
- To access an element by label, use square brackets `[]` with the desired label name inside. In this case, `my_series["Fruit 2"]` retrieves the element associated with the label “Fruit 2”.
Output:
banana
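The output (`banana`) confirms that we successfully accessed the element with the label “Fruit 2”, which is “banana” in this Series.
Accessing an element by position:
first_fruit = my_series.iloc[0]
print(first_fruit)
Output:
apple
`my_series.iloc[0]` retrieves the element at position 0, which is “apple” in our case. With a non-integer index like this one, plain `my_series[0]` relies on a positional fallback that is deprecated in recent pandas releases, so `.iloc` is the safer choice for position-based access.
You can use comparison operators like `==`, `!=`, `<`, `>`, `<=`, and `>=` to create boolean Series based on conditions.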
prices = pd.Series([5, 12, 8, 15, 9])
expensive_items = prices > 10
print(expensive_items)
0 False
1 True
2 False
3 True
4 False
dtype: bool
- A Series of prices is created (`prices`).
- The comparison `prices > 10` produces a new Series (`expensive_items`) that identifies which prices in the original Series are greater than 10 using a boolean comparison.
- Printing the resulting Series (`expensive_items`) shows True for expensive items and False for non-expensive items.
You can use boolean operators like `&` (AND), `|` (OR), and `~` (NOT) to combine boolean Series.
prices = pd.Series([12, 5, 18, 9, 15])
in_stock = pd.Series([True, False, True, False, True])
expensive_items = prices > 10
available_expensive_items = (expensive_items & in_stock)
print(available_expensive_items)
0 True
1 False
2 True
3 False
4 True
dtype: bool
- The `expensive_items` Series holds True for items with prices greater than 10.
- `expensive_items` and `in_stock` are combined using the `&` operator. This ensures only items that are both expensive AND in stock are marked as True in `available_expensive_items`.
- The `print` statement shows which items are both expensive and available for purchase.
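A boolean Series is most often used as a mask to pull out the matching rows. The sketch below reuses the `prices` and `in_stock` data from the example above to show that step, along with the `|` and `~` operators. The result variable names are illustrative choices, not part of any pandas API.

```python
import pandas as pd

prices = pd.Series([12, 5, 18, 9, 15])
in_stock = pd.Series([True, False, True, False, True])

expensive = prices > 10

# Indexing with a boolean Series keeps only the rows where the mask is True.
available_expensive_prices = prices[expensive & in_stock]
print(available_expensive_prices)

# | (OR): True where an item is expensive, in stock, or both.
expensive_or_in_stock = expensive | in_stock
print(expensive_or_in_stock)

# ~ (NOT): flips every value, so this selects the out-of-stock items.
out_of_stock_prices = prices[~in_stock]
print(out_of_stock_prices)
```

When a comparison is written inline, wrap it in parentheses, as in `(prices > 10) & in_stock`. In Python, `&` binds more tightly than `>`, so leaving the parentheses out changes what is being compared.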
By understanding Pandas Series and their functionalities, you’ve unlocked a powerful tool for data manipulation and analysis in Python. You can leverage Series for tasks like filtering, sorting, and performing calculations on your data with ease.
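As a quick, hedged illustration of the sorting and calculation tasks mentioned above, the sketch below reuses the fruit prices from the dictionary example. The variable names are assumptions made for this example.

```python
import pandas as pd

# Fruit prices from the earlier dictionary example.
prices = pd.Series({"apple": 10, "banana": 15, "cherry": 20})

# Sorting: order by value, highest first. A new Series is returned.
sorted_prices = prices.sort_values(ascending=False)
print(sorted_prices)

# Calculations: arithmetic is applied to every element at once.
discounted = prices * 0.9
print(discounted)

# Aggregations reduce the whole Series to a single number.
total = prices.sum()
average = prices.mean()
print(total, average)
```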
Hi, I am Vishal Jaiswal. I have about a decade of experience working in MNCs like Genpact, Savista, and Ingenious. Currently I am working at EXL as a senior quality analyst. Using my writing skills, I want to share the experience I have gained and help as many people as I can.