Interactive exploratory data analysis (EDA) of sensor data with Pandas: Multivariate time series data

Visualizing multivariate time series data with the pandas plotting API

This post shows the basic look and feel of the pandas plotting API applied to typical multivariate sensor data, with each sensor represented as a time series. Feel free to visit the source code repository, press the “Binder” button to open the repository in a Binder environment, and explore the interactivity of the plot types in the notebook time_series_multivariate.ipynb. Of course, not every plot type makes sense for visualizing multivariate time series data. However, for the sake of completeness, and to make it clear that some plot types make no sense, I’ve added GIFs for all of them. If a plotting backend does not support a plot type, I skipped the GIF in the corresponding section. In the Binder environment I’ve tried every plot type with every backend to force the exceptions to show up. These exceptions do not stem from wrong usage of the pandas plotting API, but they help to figure out that a plot type is simply not supported (yet).

Pandas DataFrames simplify working with time series data

In the example we construct univariate fake data of an ideal temperature sensor as a pandas time series (a Series with a DatetimeIndex), temperature_series. Ideal means that we ignore sensor data uncertainty for now.

import random
import pandas as pd

# ten distinct integer readings between 20 and 29, one per second
temperature_d = random.sample(range(20, 20+10), 10)
temperature_dti = pd.date_range("2020-01-01 12:00:00.000001", periods=10, freq="s").tz_localize("Europe/Berlin")
temperature_series = pd.Series(data=temperature_d, index=temperature_dti, name="Temperature")

In addition, we construct univariate fake data of an ideal humidity sensor as a pandas time series, humidity_series.

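# ten readings falling from 69 to 60, on the same timestamps as the temperature
humidity_d = list(reversed(range(60, 70)))
humidity_dti = pd.date_range("2020-01-01 12:00:00.000001", periods=10, freq="s").tz_localize("Europe/Berlin")
humidity_series = pd.Series(data=humidity_d, index=humidity_dti, name="Humidity")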

Again, ideal means that we ignore sensor data uncertainty for now. Ideal also means that the timestamps match exactly, which will never be true in the real world. Usually we’d choose a coarser timestamp resolution (e.g. seconds instead of microseconds). The microsecond resolution was chosen here only to show which resolution pandas can process in general. Another real-world option would be to use an IntervalIndex with datetimes as interval boundaries.
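Both real-world options mentioned above can be sketched in a few lines. This is only a minimal illustration under my own assumptions: the names coarse, breaks, intervals and interval_series are made up for this example, and treating each reading as valid for the left-closed one-second interval that starts at the floored timestamp is an arbitrary choice.

```python
import pandas as pd

# Humidity readings with microsecond-resolution timestamps, as in the post.
dti = pd.date_range("2020-01-01 12:00:00.000001", periods=10, freq="s").tz_localize("Europe/Berlin")
humidity_series = pd.Series(list(range(69, 59, -1)), index=dti, name="Humidity")

# Option 1: coarser resolution. Flooring to whole seconds drops the microseconds,
# so series from different sensors can share identical timestamps.
coarse = humidity_series.copy()
coarse.index = coarse.index.floor("s")

# Option 2: an IntervalIndex. Each reading is assumed to be valid for [second, next second).
breaks = pd.date_range("2020-01-01 12:00:00", periods=11, freq="s", tz="Europe/Berlin")
intervals = pd.IntervalIndex.from_breaks(breaks, closed="left")
interval_series = pd.Series(humidity_series.to_numpy(), index=intervals, name="Humidity")

# Any timestamp inside an interval can now be mapped to the position of its reading.
ts = pd.Timestamp("2020-01-01 12:00:03.5", tz="Europe/Berlin")
print(intervals.get_loc(ts))
```

Flooring is the simpler choice when all sensors sample at roughly the same rate. The interval variant keeps explicit for how long a reading is considered valid.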

To be able to plot several time series and relate them to each other, one has to combine the series into a DataFrame, df_row_wise. This can be done e.g. as follows

frames = [temperature_series.to_frame().T, humidity_series.to_frame().T]
df_row_wise = pd.concat(frames)

and results in this DataFrame:

Series combined into a dataframe (row wise).

However, plotting of the DataFrame data structure is column-centric, so it makes more sense to turn the Series index into the DataFrame index, with one column per Series. This can be achieved e.g. as follows

df_column_wise = df_row_wise.T

and results in this DataFrame:

Series combined into a dataframe (column wise).
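As a side note, the column-wise frame can probably be built in one step by concatenating the Series along the column axis. The sketch below rebuilds the fake data so that it runs on its own; df_direct is a name introduced only for this example and does not appear in the notebook.

```python
import random
import pandas as pd

dti = pd.date_range("2020-01-01 12:00:00.000001", periods=10, freq="s").tz_localize("Europe/Berlin")
temperature_series = pd.Series(random.sample(range(20, 30), 10), index=dti, name="Temperature")
humidity_series = pd.Series(list(range(69, 59, -1)), index=dti, name="Humidity")

# Route from the post: stack the series as rows, then transpose.
frames = [temperature_series.to_frame().T, humidity_series.to_frame().T]
df_row_wise = pd.concat(frames)
df_column_wise = df_row_wise.T

# Alternative: concatenate along the column axis; each Series name becomes a column label.
df_direct = pd.concat([temperature_series, humidity_series], axis=1)

# Compare both results (same index, same columns, same values expected).
print(df_direct.equals(df_column_wise))
```

The row-wise route is still useful for seeing how the data is laid out, while the axis=1 variant avoids the intermediate transpose.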

The following sections show what the plot types look like and how they behave for the different plotting backends in the default configuration. Comments on visualizing several time series in a single plot (relating data between time series, “data overlap”) have been added.
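To give an idea of how such a comparison across backends can be driven, here is a minimal sketch that tries every plot kind for one backend and records whether it raised an exception. It is not the code from the notebook: try_plot_kinds and PLOT_KINDS are hypothetical names, and the backend strings "altair", "pandas_bokeh" and "hvplot" are assumptions that only resolve if the corresponding packages are installed.

```python
import pandas as pd

# Plot kinds covered in the sections of this post.
PLOT_KINDS = ["area", "bar", "barh", "box", "hexbin", "hist", "kde", "line", "pie", "scatter"]


def try_plot_kinds(df, backend):
    """Return a dict mapping each plot kind to "ok" or to the name of the raised exception."""
    results = {}
    for kind in PLOT_KINDS:
        # hexbin and scatter relate two columns to each other, so they need x and y.
        if kind in ("hexbin", "scatter"):
            kwargs = {"x": df.columns[0], "y": df.columns[1]}
        else:
            kwargs = {}
        try:
            # The backend keyword overrides pd.options.plotting.backend for this call only.
            df.plot(kind=kind, backend=backend, **kwargs)
            results[kind] = "ok"
        except Exception as exc:  # broad on purpose: unsupported kinds fail in different ways
            results[kind] = type(exc).__name__
    return results
```

Calling try_plot_kinds(df_column_wise, "hvplot"), for example, would then show at a glance which plot kinds that backend rejects. Note that some backends may display every figure they create as a side effect.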

Area plot

area plot (altair)

The altair backend seems to order the DataFrame columns/time series so that the higher-value area does not hide the lower-value area. However, the backend seems to be buggy w.r.t. area plots: the plot shows a humidity of approx. 90, although the actual values lie between 60 and 69.
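One possible explanation for the value of roughly 90, which I have not verified against the backend itself, is that the areas are stacked on top of each other instead of being drawn from a common baseline. A quick way to check the plausibility is to compute the row-wise sum of both columns. The sketch uses sorted temperature values instead of the random sample to keep the result deterministic.

```python
import pandas as pd

dti = pd.date_range("2020-01-01 12:00:00", periods=10, freq="s", tz="Europe/Berlin")
df = pd.DataFrame(
    {"Temperature": list(range(20, 30)), "Humidity": list(range(69, 59, -1))},
    index=dti,
)

# In a stacked area chart the upper edge is the row-wise sum of all columns.
stacked_top = df.sum(axis=1)
print(stacked_top.min(), stacked_top.max())  # prints "89 89"
```

With the randomly shuffled temperatures from the post the sum would vary between 80 and 98, which is still close to 90. In the default matplotlib backend, DataFrame.plot.area accepts stacked=False to draw the areas unstacked; whether the altair backend honors that parameter is something I have not checked.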

area plot (pandas_bokeh)

The pandas_bokeh backend uses a transparency effect, which prevents overlapping areas from hiding information from the user. In addition, selecting data points is easy, and the values of all time series are shown for a given point in time.

area plot (hvplot/holoviews)

The hvplot backend hides the lower value area behind the higher value area.

Bar plots

bar plot (altair)
bar plot (pandas_bokeh)

Horizontal bar plot

barh plot (altair)
barh plot (pandas_bokeh)
barh plot (hvplot/holoviews)

Box plot

box plot (altair)
box plot (hvplot/holoviews)

Hexbin plot

hexbin plot (hvplot/holoviews)

Hist plot

hist plot (altair)
hist plot (pandas_bokeh)
hist plot (hvplot/holoviews)

KDE plot

KDE plot (hvplot/holoviews)

Line plot

line plot (altair)
line plot (pandas_bokeh)
line plot (hvplot/holoviews)

Pie plot

pie plot (pandas_bokeh)

Scatter plot

scatter plot (altair)
scatter plot (pandas_bokeh)
scatter plot (hvplot/holoviews)

Conclusion

As with the visualization of Series, the altair backend is recommended for the box plot.

As with the visualization of Series, the hvplot backend is the only plotting backend that supports the density plot and the KDE plot.

As with the visualization of Series, the pandas_bokeh backend is the most suitable one for all remaining plot types, with the line plot being the most important one.

For hist plots either the pandas_bokeh or the hvplot backend is recommended.
