Plotting with seaborn - part 3

keshav
Published in Analytics Vidhya
3 min read · Mar 21, 2021

Displot

The displot() function provides access to several approaches for visualizing the univariate or bivariate distribution of data, including subsets of data defined by semantic mapping and faceting across multiple subplots. The kind parameter selects the approach to use:

histplot() (with kind="hist"; the default)
kdeplot() (with kind="kde")
ecdfplot() (with kind="ecdf"; univariate-only)

Loading the dataset:

import seaborn as sns
d = sns.load_dataset("tips")

The default plot is the histogram.

sns.displot(data=d, x = "total_bill")

We can use the kind parameter to select a different representation:

sns.displot(data=d, x="total_bill", kind="kde")

While in histogram mode, it is also possible to add a KDE curve:

sns.displot(data=d, x = "total_bill",kde=True)
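To make the KDE curve less of a black box, here is a minimal sketch of a Gaussian kernel density estimate in plain Python. The values in `bills`, the bandwidth of 3.0, and the helper name `gaussian_kde_sketch` are made up for illustration. seaborn chooses its own bandwidth, and this is not its implementation.

```python
import math

def gaussian_kde_sketch(samples, bandwidth):
    """Return a density function: the average of Gaussian bumps centred on each sample."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Made-up bill amounts, not taken from the tips dataset.
bills = [10.0, 12.5, 12.5, 20.0, 35.0]
density = gaussian_kde_sketch(bills, bandwidth=3.0)

# A density should integrate to roughly 1; check with a simple Riemann sum.
step = 0.1
grid = [i * step for i in range(-200, 701)]  # covers -20 to 70
area = sum(density(x) for x in grid) * step
print(round(area, 3))
```

A smaller bandwidth gives a bumpier curve and a larger one gives a smoother curve, which is the smoothing parameter that a histogram replaces with a bin width.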

To draw a bivariate plot, assign both x and y:

sns.displot(data=d,x="size",y="total_bill")
sns.displot(data=d,x="size",y="total_bill",kind="kde")

You can also show individual observations with a marginal “rug”:

sns.displot(data=d,x="size",y="total_bill",kind="kde",rug=True)

Each kind of plot can be drawn separately for subsets of data using hue mapping. Passing multiple="stack" stacks the subsets on top of each other instead of overlaying them:

sns.displot(data=d,x="total_bill",kind="kde",hue="sex")
sns.displot(data=d,x="total_bill",kind="kde",hue="day")
sns.displot(data=d,x="total_bill",kind="kde",hue="day",
            multiple="stack")
sns.displot(data=d,x="total_bill",hue="day",multiple="stack")

The figure is constructed using a FacetGrid, meaning that you can also show subsets on distinct subplots, or “facets”:

sns.displot(data=d, x="total_bill", hue="smoker", col="sex", kind="kde")

ecdfplot

An ECDF represents the proportion or count of observations falling below each unique value in a dataset. Compared to a histogram or density plot, it has the advantage that each observation is visualized directly, meaning that there are no binning or smoothing parameters that need to be adjusted. It also aids direct comparisons between multiple distributions. A downside is that the relationship between the appearance of the plot and the basic properties of the distribution (such as its central tendency, variance, and the presence of any bimodality) may not be as intuitive.
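The definition above can be sketched in a few lines of plain Python. This is only an illustration, not seaborn's implementation: the helper name `ecdf` and the values in `bills` are made up, and "below" is taken to mean "at or below" each value.

```python
from bisect import bisect_right

def ecdf(values):
    """Return the sorted unique values and the proportion of observations <= each one."""
    ordered = sorted(values)
    n = len(ordered)
    xs = sorted(set(ordered))
    # bisect_right counts how many observations are <= x in the sorted list.
    props = [bisect_right(ordered, x) / n for x in xs]
    return xs, props

# Made-up bill amounts, not taken from the tips dataset.
bills = [10.0, 12.5, 12.5, 20.0, 35.0]
xs, props = ecdf(bills)
for x, p in zip(xs, props):
    print(x, p)
```

No bin width or bandwidth appears anywhere in the computation, which is why the plot needs no tuning. The curve simply steps up at every observed value.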

Plot a univariate distribution along the x-axis:

sns.ecdfplot(data=d, x="total_bill")

Multiple ECDFs from a long-form dataset can be drawn with hue mapping:

sns.ecdfplot(data=d, x="total_bill",hue="day")
sns.ecdfplot(data=d, x="total_bill",hue="size")

The default distribution statistic is normalized to show a proportion, but you can show absolute counts instead:

sns.ecdfplot(data=d, x="total_bill",hue="size",stat="count")

It’s also possible to plot the empirical complementary CDF (1 - CDF), which shows how many observations lie above each value:

sns.ecdfplot(data=d,x="total_bill",hue="size",
             stat="count",complementary=True)
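As a rough illustration of what the count statistic and the complementary option mean, here is a plain-Python sketch. The helper name `ecdf_counts` and the values in `bills` are hypothetical, and this is not how seaborn computes the curve internally.

```python
from bisect import bisect_right

def ecdf_counts(values, complementary=False):
    """Return sorted unique values with the count of observations <= each one.

    With complementary=True, return the count of observations above each value.
    """
    ordered = sorted(values)
    n = len(ordered)
    xs = sorted(set(ordered))
    counts = [bisect_right(ordered, x) for x in xs]
    if complementary:
        counts = [n - c for c in counts]
    return xs, counts

# Made-up bill amounts, not taken from the tips dataset.
bills = [10.0, 12.5, 12.5, 20.0, 35.0]
xs, counts = ecdf_counts(bills)
_, remaining = ecdf_counts(bills, complementary=True)
print(xs)
print(counts)
print(remaining)
```

At every value the two results add up to the total number of observations, so the complementary curve falls from the sample size towards zero while the ordinary one rises.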
