Saving Plots
I use Jupyter notebooks extensively for data analysis and exploration. It’s fantastic to be able to quickly see output, including plots, and have it all saved and viewable on GitHub.
However, when it comes time to prepare for publication, I need to save high-resolution and/or vector versions of the plots for use in LaTeX or Word. The display in Jupyter does not have nearly high enough resolution to copy and paste into a document and have it look acceptable.
Most of my projects, therefore, have a convenience function for plots that are going into the paper. This function saves the plot to disk (as both a PDF and a high-resolution PNG) and returns it so it can also be displayed in Jupyter. That way I don’t have two copies of the plot code (one for saving and one for interactive exploration) that can get out of sync.

Python Code

The make_plot function takes care of three things: building the plot from the data, aesthetics, and layers; applying the theme; and saving the result to disk.

This function is built for plotnine, a Grammar of Graphics plotting library for Python that I currently use for most of my statistical visualization. It should be possible to write a similar function for raw Matplotlib, or for Plotly, but I have not yet done so.

It uses a global variable _fig_dir to decide where to put the figures. The extra keyword arguments (kwargs) are passed directly to another theme call, to make per-figure theme customizations easy.

Code:

    import warnings
    import plotnine as pn

    # _fig_dir is a module-level pathlib.Path naming the figure directory.
    # theme_paper (shown further down) must be defined before this function,
    # because the default argument is evaluated when the def statement runs.
    def make_plot(data, aes, *args, file=None, height=5, width=7, theme=theme_paper(), **kwargs):
        # Build the plot and add each layer, scale, or label in turn.
        plt = pn.ggplot(data, aes)
        for a in args:
            plt = plt + a
        # Apply the base theme, then any per-figure theme overrides.
        plt = plt + theme + pn.theme(**kwargs)
        if file is not None:
            outf = _fig_dir / file
            if outf.suffix:
                warnings.warn('file has suffix, ignoring')
            # Vector copy, plus a PNG copy at 300 dpi.
            plt.save(outf.with_suffix('.pdf'), height=height, width=width)
            plt.save(outf.with_suffix('.png'), height=height, width=width, dpi=300)
        # Return the plot so Jupyter can display it.
        return plt

This can be used like this:

    make_plot(data, pn.aes('DataSet', 'value', fill='gender'),
              pn.geom_bar(stat='identity'),
              pn.scale_fill_brewer('qual', 'Dark2'),
              pn.labs(x='Data Set', y='% of Books', fill='Gender'),
              pn.scale_y_continuous(labels=lbl_pct),
              file='frac-known-books', width=4, height=2.5)

The width and height are in inches.

And here’s theme_paper, a custom theme that extends theme_minimal with some text cleanups:

    class theme_paper(pn.theme_minimal):
        def __init__(self):
            pn.theme_minimal.__init__(self, base_family='Open Sans')
            self.add_theme(pn.theme(
                axis_title=pn.element_text(size=10),
                axis_title_y=pn.element_text(margin={'r': 12}),
                panel_border=pn.element_rect(color='gainsboro', size=1, fill=None)
            ), inplace=True)

I use these functions in the book author gender code.

R Code

I also have an R version from some older projects, before I switched to Python. This one requires you to use + yourself; it doesn’t have any automatic ggplot calls.

    make_plot = function(plot, file=NA, width=5, height=3, ...) {
        if (!is.na(file)) {
            # 600 dpi PNG copy
            png(paste(file, "png", sep="."), width=width, height=height, units='in', res=600, ...)
            print(plot)
            dev.off()
            # vector PDF copy
            cairo_pdf(paste(file, "pdf", sep="."), width=width, height=height, ...)
            print(plot)
            dev.off()
        }
        # return the plot so the notebook can display it
        plot
    }

You can use it like this:

    make_plot(ggplot(frame, aes(x=DataSet, y=value, fill=gender))
              + geom_bar(stat='identity')
              + scale_fill_brewer(type='qual', palette='Dark2')
              + labs(x='Data Set', y='% of Books', fill='Gender')
              + scale_y_continuous(labels=lbl_pct),
              file="frac-known-books", width=4, height=2.5)

I also don’t have automatic theming in the R version, but it would be easy to add.