Pandas-Series

Japneet Singh Chawla
3 min read · Sep 2, 2018

Python is a powerful general-purpose language: web applications, desktop applications, ML models, and much more can be built with it.

Before Pandas, Python was used mostly for data munging and preparation and contributed very little to data analysis itself. Pandas solved this problem. With Pandas, we can carry out five typical steps in processing and analyzing data, regardless of where the data comes from: load, prepare, manipulate, model, and analyze.

Python with Pandas is used in a wide range of academic and commercial fields, including finance, economics, statistics, and analytics.

This post provides code for basic usage of the Series data type in Pandas.

Data types in Pandas

  • Series: 1D labeled, homogeneously typed array; size-immutable.
  • DataFrame: general 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
  • Panel: general 3D labeled, size-mutable array (deprecated in pandas 0.20 and removed in 0.25).
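
To make the first two structures concrete, here is a minimal sketch. The variable names and values are made up for illustration and are not from the original post.

```python
import pandas as pd

# A Series is a single labeled column whose values share one dtype.
prices = pd.Series([10.5, 20.0, 30.25], index=["a", "b", "c"])

# A DataFrame is a table of labeled columns; each column can have its own dtype.
frame = pd.DataFrame({"name": ["x", "y", "z"], "price": [10.5, 20.0, 30.25]})

print(prices.dtype)           # one dtype for the whole Series
print(frame.dtypes)           # one dtype per column
print(type(frame["price"]))   # selecting a single column gives back a Series
```

Each column of a DataFrame is itself a Series, which is why the rest of this post focuses on Series first.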

Series

#importing pandas and Series
import pandas as pd
from pandas import DataFrame, Series

Pandas constructor for Series

s = Series(data, index, dtype, copy)

A Series can be created from a list, a NumPy array, a dict, or a scalar.

#series from a list
s = Series([10, 20, 30, 40, 50])
s1 = Series([100, 200, 300, 400, 500])
print(s)

Output:
0    10
1    20
2    30
3    40
4    50
dtype: int64

#series from dict
d = {'a': 10, 'b': 20, 'c': 30}
#dict keys are used as the index
s = Series(d)
print(s)
#creating a series from dict with a custom index
#labels missing from the dict ('d' here) get NaN, so the dtype becomes float64
s = Series(d, index=['a', 'c', 'd'])
print(s)

Output:
a    10
b    20
c    30
dtype: int64
a    10.0
c    30.0
d     NaN
dtype: float64

#series from a scalar value (an index is required to set the length)
s = Series(4, index=[1, 2, 3, 4], dtype='float64')
print(s)

Output:
1    4.0
2    4.0
3    4.0
4    4.0
dtype: float64
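
A Series can also be built from a NumPy array, which the examples above do not show. The following is a small sketch with made-up values and labels.

```python
import numpy as np
import pandas as pd

arr = np.array([1.5, 2.5, 3.5])

# Without an index, pandas assigns the default integer labels 0, 1, 2.
s_default = pd.Series(arr)

# With an index, each value gets the label at the same position.
s_labeled = pd.Series(arr, index=["x", "y", "z"])

print(s_default)
print(s_labeled["y"])
```

The index passed in must have the same length as the array, otherwise pandas raises an error.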

Accessing Series Elements

#accessing elements from a series
s = Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
#all elements except the first
print(s[1:])
#elements at positions 1 to 6
print(s[1:7])
#last 3 elements
print(s[-3:])
#the last element
print(s.iloc[-1])

Output:
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64
1    2
2    3
3    4
4    5
5    6
6    7
dtype: int64
7     8
8     9
9    10
dtype: int64
10
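
When the index is not the default 0..n-1, it helps to be explicit about whether you mean a position or a label. The sketch below uses a hypothetical string index to show the difference between `iloc` (position) and `loc` (label).

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

first = s.iloc[0]              # by position: first element
last = s.iloc[-1]              # by position: last element
by_label = s.loc["c"]          # by label
label_slice = s.loc["b":"d"]   # label slices include both endpoints
pos_slice = s.iloc[1:3]        # position slices exclude the end position

print(first, last, by_label)
print(label_slice)
print(pos_slice)
```

Note that the label slice returns three elements while the position slice returns two.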

Functions in Series

#functions on series
s = Series([10, 20, 30, 40, 50])
#adding, subtracting, multiplying and dividing a series by a scalar
print("Addition ", s.add(1))
print("Subtraction ", s.subtract(2))
print("Multiplication ", s.multiply(2))
print("Divide ", s.divide(2))

Output:
Addition  0    11
1    21
2    31
3    41
4    51
dtype: int64
Subtraction  0     8
1    18
2    28
3    38
4    48
dtype: int64
Multiplication  0     20
1     40
2     60
3     80
4    100
dtype: int64
Divide  0     5.0
1    10.0
2    15.0
3    20.0
4    25.0
dtype: float64
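
These methods do the same thing as the plain operators `+`, `-`, `*` and `/`. The main reason to prefer the method form is the extra `fill_value` argument, which matters when two Series do not share the same labels. A minimal sketch with made-up data:

```python
import pandas as pd

a = pd.Series([10, 20, 30], index=["x", "y", "z"])
b = pd.Series([1, 2], index=["x", "y"])

# Operators align on labels; "z" has no partner in b, so the result is NaN there.
plain = a + b

# add() with fill_value treats the missing entry as 0 instead.
filled = a.add(b, fill_value=0)

print(plain)
print(filled)
```

With a scalar on the right-hand side, as in the examples above, `s.add(1)` and `s + 1` give identical results.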

Difference and Cumulative Functions in Series

#difference and cumulative functions on Series
#difference between each element and the one 2 positions before it
print("Difference with lag 2", s.diff(2), sep="\n")
#to calculate cumulative product
print("Cumulative Product", s.cumprod(), sep="\n")
#to calculate cumulative sum
print("Cumulative Sum", s.cumsum(), sep="\n")
#gives the summary of the series
print("Basic info about the series", s.describe(), sep="\n")

Output:
Difference with lag 2
0     NaN
1     NaN
2    20.0
3    20.0
4    20.0
dtype: float64
Cumulative Product
0          10
1         200
2        6000
3      240000
4    12000000
dtype: int64
Cumulative Sum
0     10
1     30
2     60
3    100
4    150
dtype: int64
Basic info about the series
count     5.000000
mean     30.000000
std      15.811388
min      10.000000
25%      20.000000
50%      30.000000
75%      40.000000
max      50.000000
dtype: float64
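
The functions above work over a lag or over the whole series so far. For a true moving-window calculation, pandas provides `Series.rolling`. The sketch below is not from the original post; the window sizes are chosen only for illustration. It also shows that `diff(2)` is the same as subtracting the series shifted by two positions.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])

# Sum over a sliding window of 2 elements; the first window is incomplete (NaN).
rolling_sum = s.rolling(window=2).sum()

# Mean over a sliding window of 3 elements; the first two windows are incomplete.
rolling_mean = s.rolling(window=3).mean()

# diff(2) is equivalent to s minus s shifted down by 2 positions.
lagged_diff = s - s.shift(2)

print(rolling_sum)
print(rolling_mean)
print(lagged_diff)
```

Incomplete windows produce NaN by default; the `min_periods` argument of `rolling` can relax that.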

Custom functions on Series

#applying a custom function
def custom_func(x):
    return (x * x) - 1
print("Custom Function ", s.apply(custom_func))

Output:
Custom Function  0      99
1     399
2     899
3    1599
4    2499
dtype: int64
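
For simple arithmetic like this, the same result can be obtained without `apply`, because operators already work element-wise on a Series. A short sketch comparing the three styles:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])

def custom_func(x):
    # Called once per element by apply().
    return (x * x) - 1

applied = s.apply(custom_func)
with_lambda = s.apply(lambda x: x * x - 1)   # same logic as an inline function
vectorized = s * s - 1                       # element-wise operators, no Python-level loop

print(applied)
print(vectorized)
```

`apply` is still the right tool when the logic cannot be expressed with built-in operators or methods.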

Statistical Functions on Series

#to get the values as a 1D NumPy array
#(as_matrix() was removed in pandas 1.0; .values works in old and new versions)
print("Array from series", s.values, sep="\n")
#autocorrelation at lag 4 (nan here: only one pair of values overlaps)
print("Autocorrelation is :", s.autocorr(4))
#to calculate the Pearson correlation between 2 series
print("Correlation is: ", s.corr(s1, 'pearson'))
#to find the covariance between two series (requires at least 2 valid pairs)
print("Covariance is : ", s.cov(s1, min_periods=2))

Output:
Array from series
[10 20 30 40 50]
Autocorrelation is : nan
Correlation is:  1.0
Covariance is :  2500.0
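
To see where these numbers come from, the sketch below recomputes them on the same two series. It assumes `s` and `s1` hold the values used earlier in this post; the variable names `lag1`, `lag4` and `manual_cov` are introduced here for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])
s1 = pd.Series([100, 200, 300, 400, 500])

# Lag 1 compares each value with the one before it: four overlapping pairs.
lag1 = s.autocorr(lag=1)

# Lag 4 leaves a single overlapping pair, so the correlation is undefined (nan).
lag4 = s.autocorr(lag=4)

# Sample covariance by hand: sum of products of deviations, divided by (n - 1).
manual_cov = ((s - s.mean()) * (s1 - s1.mean())).sum() / (len(s) - 1)

print(lag1, lag4)
print(manual_cov, s.cov(s1))
```

Because `s1` is exactly `s` multiplied by 10, the two series are perfectly correlated, and the covariance is 10 times the variance of `s`.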

Some other useful functions in Series

#largest element
print("Largest element ", s.max())
#smallest element
print("Smallest element ", s.min())
#position of the largest element
print("Index of largest element ", s.argmax())
#position of the smallest element
print("Index of minimum element ", s.argmin())
#returns the positions that would sort the series
print("Indexes of elements on a Series when sorted ", s.argsort())

Output:
Largest element  50
Smallest element  10
Index of largest element  4
Index of minimum element  0
Indexes of elements on a Series when sorted  0    0
1    1
2    2
3    3
4    4
dtype: int64
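
With the default integer index, positions and labels happen to coincide. With a custom index they differ, and `idxmax` / `idxmin` return the label rather than the position. A minimal sketch using a made-up string index:

```python
import pandas as pd

s = pd.Series([10, 50, 30], index=["a", "b", "c"])

label_of_max = s.idxmax()             # label of the largest value
label_of_min = s.idxmin()             # label of the smallest value
position_of_max = s.values.argmax()   # position, taken from the underlying NumPy array
sorted_values = s.sort_values()       # values sorted ascending, labels kept

print(label_of_max, label_of_min, position_of_max)
print(sorted_values)
```

`sort_values` is usually more convenient than `argsort` when the goal is simply to see the data in order.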
