Sunday, August 28, 2016

Pandas: Plotting a data series

A common need is to plot data from a pandas DataFrame that holds several series, for example one series per student. Consider the following data:

Student, Class, Marks
A,       1,     56
A,       2,     67
A,       3,     89
A,       4,     76
A,       5,     76
B,       1,     78
B,       2,     99
B,       3,     75
B,       4,     44
B,       5,     77
C,       1,     90
C,       2,     76
C,       3,     90
C,       4,     45
C,       5,     53
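
The rows above happen to be ordered by student and class. As a rough sketch of how an unordered file could be put into that order, assuming the same three column names, `sort_values` can be used. The shuffled sample below is hypothetical and only reuses a few rows from the table:

```python
import io
import pandas as pd

# Hypothetical shuffled sample with the same columns as the data above
raw = """Student,Class,Marks
B,2,99
A,3,89
A,1,56
B,1,78
A,2,67
"""
df = pd.read_csv(io.StringIO(raw))

# Sort by student first, then by class within each student
df_sorted = df.sort_values(['Student', 'Class']).reset_index(drop=True)
print(df_sorted)
```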

The data here is already sorted by student and class; if it were not, sorting it with a pandas DataFrame would not be difficult. To plot the data, you can use the following code snippet:

import pandas as pd
import matplotlib
import chardet
import matplotlib.pyplot as plt

# Load the ggplot style
matplotlib.style.use('ggplot')

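# Read the data file, passing the detected encoding
file_name = 'a.data'
with open(file_name, 'rb') as f:
    # chardet.detect expects the raw bytes of the file, not its name
    encoding_result = chardet.detect(f.read())
# skipinitialspace drops the padding after each comma, so the columns
# are named 'Class' and 'Marks' rather than ' Class' and ' Marks'
df = pd.read_csv(file_name, encoding=encoding_result['encoding'], skipinitialspace=True)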

# Group by student name and plot one line per student
# (a plain string key keeps each group label a scalar such as 'A')
for key, grp in df.groupby('Student'):
    plt.plot(grp['Class'],grp['Marks'], label=str(key))
plt.legend(loc='best')
plt.show()

And the result is:

[Figure: Series Plot]
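
As an alternative to looping over groups, the same kind of chart can be sketched by reshaping the data so that each student becomes a column and letting pandas draw the lines. This is only an illustrative variant, not the method used above. It assumes the column names `Student`, `Class` and `Marks`, and reads a small inline sample (a subset of the table) instead of the `a.data` file:

```python
import io
import pandas as pd

# Inline subset of the sample data, used here instead of a file
raw = """Student,Class,Marks
A,1,56
A,2,67
A,3,89
B,1,78
B,2,99
B,3,75
"""
df = pd.read_csv(io.StringIO(raw))

# Reshape: one row per Class, one column per Student
wide = df.pivot(index='Class', columns='Student', values='Marks')

# DataFrame.plot draws one line per column and adds the legend itself
ax = wide.plot(marker='o')
ax.set_xlabel('Class')
ax.set_ylabel('Marks')
```

In an interactive session, calling `plt.show()` from `matplotlib.pyplot` afterwards would display the figure. Note that `pivot` requires each (Class, Student) pair to appear only once, which holds for this data.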
