Visualize data distribution professionally with Python | Written by Kurt Klingensmith

Learn 7 different ways to visualize data distribution

Exploratory data analysis and data visualization often involve examining the distribution of a dataset. Doing so provides important insights into your data: its range, outliers or unusual groupings, its central tendency, and any bias. Comparing subsets of the data reveals even more. Professionally constructed visualizations of a distribution make these insights available at a glance. This guide details several options for quickly creating clean and meaningful visualizations using Python.

Visualizations covered:

Histogram
KDE (density) plot
Joy plot (ridge plot)
Boxplot
Violin plot
Strip plots and swarm plots
ECDF plot

Data and code:

This article uses fully synthetic weather data generated according to the concepts in a previous article. The data for this article and the complete Jupyter notebook are available at the linked GitHub page. Feel free to download both and follow along, or refer to the code blocks below.

The libraries, imports, and settings used in this article are:

# Data Handling:
import pandas as pd
from pandas.api.types import CategoricalDtype

# Data Visualization Libraries:
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
from joypy import joyplot

# Display Configuration:
%config InlineBackend.figure_format='retina'

First, let’s load and prepare the data. This is a simple synthetic weather dataframe showing various temperature measurements for three cities over four seasons.

# Load data:
df = pd.read_csv('weatherData.csv')

# Set season as a categorical data type:
season = CategoricalDtype(['Winter', 'Spring', 'Summer', 'Fall'])
df['Season'] = df['Season'].astype(season)

Note that in this code, the Season column is set to a categorical data type. This will…
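One general effect of a categorical data type in pandas can be sketched briefly: sorting follows the declared category order instead of alphabetical order. The dataframe `seasons_demo` and its values below are hypothetical and serve only as an illustration; they are not taken from weatherData.csv.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical miniature dataframe with the seasons in arbitrary order.
seasons_demo = pd.DataFrame({
    "Season": ["Fall", "Summer", "Winter", "Spring"],
    "Temp": [12.0, 27.5, -1.0, 14.5],
})

# As plain strings, the seasons sort alphabetically.
alphabetical = seasons_demo.sort_values("Season")["Season"].tolist()

# After the cast, sorting follows the order given to CategoricalDtype.
season = CategoricalDtype(["Winter", "Spring", "Summer", "Fall"])
seasons_demo["Season"] = seasons_demo["Season"].astype(season)
by_category = seasons_demo.sort_values("Season")["Season"].tolist()

print(alphabetical)  # ['Fall', 'Spring', 'Summer', 'Winter']
print(by_category)   # ['Winter', 'Spring', 'Summer', 'Fall']
```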

Visualize data distribution professionally with Python | Written by Kurt Klingensmith | February 2024

Learn 7 different ways to visualize data distribution

Visualizations covered:

Data and code:

Related Posts