What is a Series in Pandas?

In Python's pandas library, a Series is a one-dimensional, array-like object that can hold any data type (integers, strings, floating-point numbers, Python objects, etc.). A Series is similar to a column in a spreadsheet or database table, or to a vector in mathematics. Each value in the Series has a label, known as its index, which can be used to access it. Labels are usually unique, although pandas does not require this.

For example, imagine we have a series of four different fruits. In a pandas Series, this data will look something like this:

Index  Item
0      Apple
1      Banana
2      Cherry
3      Blueberry
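
As a minimal sketch, the fruit table above could be built as follows. The variable name fruits is only illustrative; no index is passed, so pandas assigns the default labels 0 to 3.

```python
import pandas as pd

# Build the fruit Series; pandas assigns the default integer index 0..3
fruits = pd.Series(['Apple', 'Banana', 'Cherry', 'Blueberry'])

print(fruits.index.tolist())  # [0, 1, 2, 3]
print(fruits.loc[2])          # Cherry
```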

 

The pandas.Series() constructor (pd.Series() under the usual import alias) is used to create a Series object in Python's pandas library.

Here's a brief breakdown of its simplified syntax and its most commonly used arguments:

pandas.Series(data, index, dtype, copy)
  1. data: This argument can be an array-like object (such as a list or a NumPy array), a dictionary, or a scalar value. This data is the core set of values that will populate the series. If the data argument is a dictionary, the keys of the dictionary will be used as the index of the series.

  2. index: This is an optional argument. The index is like an address for each element in the series, similar to the positions in a list or the keys of a dictionary. When the data is array-like, the index must be the same length as the data. Labels are usually unique, but pandas does not enforce this. If no index is provided, a default integer index starting from 0 is used.

  3. dtype: This is also an optional argument, which stands for data type. If you wish to specify the data type of the series explicitly, you can use this argument. For instance, you might specify int, float, str, etc. If no dtype is provided, pandas will infer the data type from the input data.

  4. copy: This is another optional argument. If set to True, the data will be copied into a new object, rather than just creating a reference to the original data. This is important if you want to make sure the original data isn't modified when you change the series.
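
The following sketch illustrates these arguments under a few assumptions: the variable names are illustrative, and NumPy is imported only to supply an array for the copy example.

```python
import numpy as np
import pandas as pd

# data as a scalar: the value is repeated for every label in index
constant = pd.Series(5, index=['a', 'b', 'c'])
print(constant.tolist())  # [5, 5, 5]

# dtype given explicitly: the integers are stored as floats
as_float = pd.Series([1, 2, 3], dtype=float)
print(as_float.dtype)  # float64

# copy=True: the Series gets its own data, so editing it leaves the array alone
arr = np.array([1, 2, 3])
copied = pd.Series(arr, copy=True)
copied.iloc[0] = 99
print(arr[0])  # 1
```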

How to Create a Series in Pandas?

The simplest way to create a series is as follows:

import pandas as pd

# Creating a pandas Series
s = pd.Series([1, 3, 5, 6, 8])
print(s)

Here, pd.Series is used to create the Series. Inside the parentheses, we pass a list of values that we want in our Series. In this case, our Series contains the numbers 1, 3, 5, 6, and 8.

When you print the series, you will see that an index has been automatically created. It looks like this:

0    1
1    3
2    5
3    6
4    8
dtype: int64
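
To check which data type pandas inferred, you can inspect the dtype attribute. In this small sketch (the variable names are illustrative), a list of whole numbers is stored as int64, while a single floating-point value in the list makes the whole Series float64.

```python
import pandas as pd

ints = pd.Series([1, 3, 5, 6, 8])
mixed = pd.Series([1, 3, 5.5, 6, 8])

print(ints.dtype)   # int64
print(mixed.dtype)  # float64
print(ints.index)   # RangeIndex(start=0, stop=5, step=1)
```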

What are Labels in Pandas Series?

In a Pandas Series, labels are identifiers that are used for the individual data points. They function much like keys in a dictionary, providing a name or identifier for each value in the Series. These labels make up the index of the Series.

Labels are useful for many reasons. They can be used to:

  • Access data: You can get the value corresponding to a label in the same way you would get a value from a dictionary using its key. For example, if you have a Series s with a label 'apple', you can get the value corresponding to 'apple' by using s['apple'].
  • Filter data: You can use labels to select a subset of data from the Series. For example, s[['apple', 'orange']] will return a new Series containing only the data for 'apple' and 'orange'.
  • Align data: When performing operations between two Series, pandas will align data based on its labels, not on its position. This means that the operation will be applied to the values with the same label in both Series, regardless of their position.
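
The three uses above can be sketched as follows. The Series names and fruit quantities are made up for illustration. Note that a label present in only one Series produces NaN in the result of an aligned operation.

```python
import pandas as pd

s = pd.Series([3, 5, 2], index=['apple', 'orange', 'pear'])
other = pd.Series([10, 20], index=['orange', 'apple'])

# Access: look up one value by its label
print(s['apple'])  # 3

# Filter: pass a list of labels to get a smaller Series
subset = s[['apple', 'orange']]
print(subset.tolist())  # [3, 5]

# Align: values are matched by label, not by position
total = s + other
print(total['apple'])   # 23.0
print(total['orange'])  # 15.0
print(total['pear'])    # nan ('pear' has no match in other)
```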

How to Create Labels in Pandas Series?

By default, if you don't provide labels when creating a Series, pandas will assign integer labels starting from 0. But you can also set your own labels by passing a list of labels to the index parameter when creating the Series.

For example:

import pandas as pd

data = [10, 20, 30, 40, 50]
labels = ['a', 'b', 'c', 'd', 'e']
s = pd.Series(data=data, index=labels)

print(s)
# a    10
# b    20
# c    30
# d    40
# e    50
# dtype: int64

In this example, 'a', 'b', 'c', 'd', and 'e' are the labels for the values 10, 20, 30, 40, and 50, respectively.

How to Create a Pandas Series from a Dictionary?

You can create a pandas Series directly from a Python dictionary. When you do this, the keys of the dictionary are used as the index (or labels) of the Series, and the values of the dictionary are used as the values in the Series.

Here's an example:

import pandas as pd

# Create a dictionary
data = {'a': 1,'b': 2,'c': 3,'d': 4,'e': 5}

# Create a pandas Series from the dictionary
s = pd.Series(data)

print(s)

# Output:
# a    1
# b    2
# c    3
# d    4
# e    5
# dtype: int64
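
If you pass both a dictionary and an index, pandas selects the dictionary values by label. The sketch below uses a made-up label 'z' that is not a key in the dictionary to show what happens to labels without a matching key.

```python
import pandas as pd

data = {'a': 1, 'b': 2, 'c': 3}

# Only the listed labels are kept, in the order given;
# 'z' is not a key in data, so its value is NaN
s = pd.Series(data, index=['b', 'c', 'z'])
print(s)
# b    2.0
# c    3.0
# z    NaN
# dtype: float64
```

Because NaN is a floating-point value, the Series is stored as float64 even though the dictionary holds integers.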

How to Access Values in a Pandas Series?

There are several ways to access elements in a Pandas Series. Here are a few examples:

  1. By index label: This is similar to how you would access a dictionary in Python. If the index label 'a' exists, you can get its corresponding value like so: s['a'].
  2. By integer location: Using the iloc property, you can access the value at a given integer location. For example, s.iloc[0] would give you the first value in the Series.
  3. By index label range: You can also get a range of values by specifying a range of index labels. For example, s['a':'c'] will return all the elements from 'a' to 'c' inclusive.
  4. By integer location range: Similarly, you can get a range of values by specifying a range of integer locations. For example, s.iloc[0:2] will return the first two values in the Series.

Here's an example covering all four cases:

import pandas as pd

# Creating a pandas Series
s = pd.Series([1, 3, 5, 6, 8], index=['a', 'b', 'c', 'd', 'e'])

print(s['a'])  # Output: 1
print(s.iloc[0])  # Output: 1
print(s['a':'c'])
# Output:
# a    1
# b    3
# c    5
# dtype: int64

print(s.iloc[0:2])
# Output:
# a    1
# b    3
# dtype: int64
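
When the labels are themselves integers, label-based and position-based access can give different results. The sketch below assumes a hypothetical Series whose integer labels run in reverse order, and uses the loc property, the label-based counterpart of iloc, to make the intent explicit.

```python
import pandas as pd

# Integer labels that deliberately differ from the positions
s = pd.Series([10, 20, 30], index=[2, 1, 0])

print(s.loc[0])   # 30 (the value labelled 0)
print(s.iloc[0])  # 10 (the value in the first position)
```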

 

