Repeated y-axis ticks in violinplot

I want to plot the distribution of a set of values between 1 and 800 with a violinplot, and I have used the code below. I am very new to this.

import matplotlib.pyplot as plt
from matplotlib import ticker as mticker
import seaborn as sns
import numpy as np

# data['count'] holds the values listed further down
log_data = [[np.log10(d) for d in row] for row in [data['count']]]
print(log_data)

fig, ax = plt.subplots()
sns.violinplot(data=log_data, ax=ax)

plt.show()

Why do I have three 10^0s?

This is my data: [ 8, 7, 5, 1, 2, 6, 5, 1, 2, 31, 9, 40, 9, 53, 4, 8, 3, 1, 46, 2, 18, 4, 17, 26, 17, 2, 19, 14, 2, 16, 35, 42, 22, 2, 19, 13, 59, 11, 69, 33, 2, 2, 24, 86, 16, 11, 7, 5, 18, 22, 1, 2, 16, 28, 3, 2, 12, 16, 1, 8, 1, 2, 5, 4, 9, 1, 1, 5, 1, 4, 5, 2, 11, 25, 6, 45, 64, 6, 2, 63, 26, 2, 3, 8, 3, 16, 8, 2, 2, 99, 2, 51, 43, 5, 53, 10, 19, 20, 6, 9, 1, 4, 1, 19, 4, 2, 3, 2, 77, 4, 7, 3, 2, 1, 81, 15, 50, 22, 58, 21, 10, 1, 18, 8, 1, 35, 2, 32, 18, 12, 11, 7, 5, 27, 29, 1, 2, 5, 1, 2, 3, 3, 1, 45, 22, 1, 12, 2, 21, 4, 1, 19, 27, 23, 3, 1, 21, 1, 124, 13, 17, 1, 18, 33, 23, 3, 6, 2, 8, 3, 1, 228, 28, 1, 1, 122, 868, 47, 2, 1, 9, 108, 10, 1, 5, 40, 43, 5, 2, 137, 9, 11, 19, 19, 11, 21, 8, 1, 6, 2, 3, 3, 26, 42, 14, 1, 14, 15, 3, 30, 17, 5, 17, 3, 38, 11, 54, 3, 1, 1, 3, 3, 7, 3, 1, 1, 5, 9, 1, 5, 4, 7, 35, 8, 10, 6, 6, 5, 3, 28, 2, 2, 5, 13, 6, 2, 4, 3, 2, 7, 52, 31, 1, 7, 7, 216, 4, 13, 6, 14, 4, 4, 5, 102, 3, 15, 4, 12, 48, 5, 9, 3, 10, 35, 36, 2, 10, 2, 55, 15, 17, 2, 19, 14, 14, 15, 5, 4, 11, 1, 1, 18, 4, 63, 63, 22, 37, 2, 22, 8, 22, 8, 20, 104, 3, 2, 6, 11, 20, 1, 3, 78, 2, 1, 52, 33, 2, 4, 9, 1, 27, 9, 4, 4, 2, 9, 9, 2, 24, 137, 12, 2, 2, 1, 6, 11, 8, 1, 20, 23, 75, 5, 1, 14, 3, 31, 15, 4, 2, 26, 50, 9, 75, 42, 14, 4, 1, 2, 9, 34, 25, 37, 53, 122, 28, 52, 22, 1, 109, 1, 1, 11, 1, 15, 2, 9, 32, 23, 5, 6, 3, 2, 51, 9, 12, 10, 7, 5, 2, 1, 311, 41, 1, 6, 13, 2, 5, 18, 105, 13, 17, 3, 9, 48, 2, 15, 18, 16, 77, 13, 3, 2, 2, 8, 1, 3, 4, 93, 23, 169, 1, 24, 2, 1, 8, 36, 1, 1, 1, 6, 3, 1, 25, 1, 2, 59, 2, 3, 3, 1, 8, 2, 1, 6, 15, 1, 7, 29, 4, 4, 8, 22, 5, 80, 16, 3, 147, 23, 6, 16, 1, 8, 530]

I also tried using set_yscale:

ax.set_yscale('log')

sns.violinplot(data=first_issues_count, ax=ax)


Solution 1:[1]

A log-scale option for the violinplot is on the roadmap for seaborn 0.12. Meanwhile, you can draw the violinplot from the log10 of the data and apply some formatting tricks, similar to the question "Violin Plot troubles in Python on log scale".

The example code below shows how the formatting tricks could be adapted for your situation. For comparison, a sns.boxenplot is added, which doesn't have problems with a real log scale.

import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter
import seaborn as sns
import numpy as np

data = np.array([8, 7, 5, 1, 2, 6, 5, 1, 2, 31, 9, 40, 9, 53, 4, 8, 3, 1, 46, 2, 18, 4, 17, 26, 17, 2, 19, 14, 2, 16, 35, 42, 22, 2, 19, 13, 59, 11, 69, 33, 2, 2, 24, 86, 16, 11, 7, 5, 18, 22, 1, 2, 16, 28, 3, 2, 12, 16, 1, 8, 1, 2, 5, 4, 9, 1, 1, 5, 1, 4, 5, 2, 11, 25, 6, 45, 64, 6, 2, 63, 26, 2, 3, 8, 3, 16, 8, 2, 2, 99, 2, 51, 43, 5, 53, 10, 19, 20, 6, 9, 1, 4, 1, 19, 4, 2, 3, 2, 77, 4, 7, 3, 2, 1, 81, 15, 50, 22, 58, 21, 10, 1, 18, 8, 1, 35, 2, 32, 18, 12, 11, 7, 5, 27, 29, 1, 2, 5, 1, 2, 3, 3, 1, 45, 22, 1, 12, 2, 21, 4, 1, 19, 27, 23, 3, 1, 21, 1, 124, 13, 17, 1, 18, 33, 23, 3, 6, 2, 8, 3, 1, 228, 28, 1, 1, 122, 868, 47, 2, 1, 9, 108, 10, 1, 5, 40, 43, 5, 2, 137, 9, 11, 19, 19, 11, 21, 8, 1, 6, 2, 3, 3, 26, 42, 14, 1, 14, 15, 3, 30, 17, 5, 17, 3, 38, 11, 54, 3, 1, 1, 3, 3, 7, 3, 1, 1, 5, 9, 1, 5, 4, 7, 35, 8, 10, 6, 6, 5, 3, 28, 2, 2, 5, 13, 6, 2, 4, 3, 2, 7, 52, 31, 1, 7, 7, 216, 4, 13, 6, 14, 4, 4, 5, 102, 3, 15, 4, 12, 48, 5, 9, 3, 10, 35, 36, 2, 10, 2, 55, 15, 17, 2, 19, 14, 14, 15, 5, 4, 11, 1, 1, 18, 4, 63, 63, 22, 37, 2, 22, 8, 22, 8, 20, 104, 3, 2, 6, 11, 20, 1, 3, 78, 2, 1, 52, 33, 2, 4, 9, 1, 27, 9, 4, 4, 2, 9, 9, 2, 24, 137, 12, 2, 2, 1, 6, 11, 8, 1, 20, 23, 75, 5, 1, 14, 3, 31, 15, 4, 2, 26, 50, 9, 75, 42, 14, 4, 1, 2, 9, 34, 25, 37, 53, 122, 28, 52, 22, 1, 109, 1, 1, 11, 1, 15, 2, 9, 32, 23, 5, 6, 3, 2, 51, 9, 12, 10, 7, 5, 2, 1, 311, 41, 1, 6, 13, 2, 5, 18, 105, 13, 17, 3, 9, 48, 2, 15, 18, 16, 77, 13, 3, 2, 2, 8, 1, 3, 4, 93, 23, 169, 1, 24, 2, 1, 8, 36, 1, 1, 1, 6, 3, 1, 25, 1, 2, 59, 2, 3, 3, 1, 8, 2, 1, 6, 15, 1, 7, 29, 4, 4, 8, 22, 5, 80, 16, 3, 147, 23, 6, 16, 1, 8, 530])

sns.set_style('ticks')
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(16, 8))

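# draw the violin from the log10-transformed values on a linear axis
sns.violinplot(y=np.log10(data), ax=ax1)
# major ticks at whole powers of ten only, so each label appears once
major_ticks = np.arange(np.floor(np.log10(data).min()), np.log10(data).max() + 1)
ax1.yaxis.set_ticks(major_ticks, minor=False)
# minor ticks at 1x, 2x, ..., 10x each power of ten, as on a real log axis
ax1.yaxis.set_ticks([np.log10(x) for p in major_ticks for x in np.linspace(10 ** p, 10 ** (p + 1), 10)], minor=True)
# label the major tick at position x as 10^x
ax1.yaxis.set_major_formatter(StrMethodFormatter("$10^{{{x:.0f}}}$"))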

# for comparison: a boxenplot on a real log axis, with limits matching the violinplot
ax2.set_yscale('log')
sns.boxenplot(y=data, ax=ax2)
ymin, ymax = ax1.get_ylim()
ax2.set_ylim(10**ymin, 10**ymax)

plt.tight_layout()
plt.show()

[Figure: violinplot on log scale (left) and boxenplot (right)]
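
As for why a label such as 10^0 can show up three times: one plausible explanation, assuming the tick labels in the question came from a formatter like the StrMethodFormatter("$10^{{{x:.0f}}}$") used above, is that the automatic ticks on the log10 axis sit at non-integer positions and the .0f format rounds several of them to the same exponent. The sketch below reproduces the effect with plain str.format, which is essentially what StrMethodFormatter applies to each tick position. The positions in auto_ticks are made up for illustration and are not taken from the actual plot.

```python
# Format string as used with StrMethodFormatter in the code above
fmt = "$10^{{{x:.0f}}}$"

# Hypothetical automatic tick positions on the log10 axis (illustration only)
auto_ticks = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
auto_labels = [fmt.format(x=t) for t in auto_ticks]
print(auto_labels)  # several positions share a label after rounding

# Ticks restricted to whole exponents, as major_ticks does in the code above
integer_ticks = [0.0, 1.0, 2.0, 3.0]
integer_labels = [fmt.format(x=t) for t in integer_ticks]
print(integer_labels)  # every label is distinct
```

With these made-up positions, 0.0, 0.2 and 0.4 all print as 10^0, which is why the code above pins the major ticks to whole exponents before applying the formatter.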

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1