Scatter plots and Density Plot

Hello everyone, I am trying to visualize a power plant dataset and I would like some feedback.

First and foremost, the visualization was created with ggplot2 in R. I separated the primary fuels of each power plant into “Green” and “Non Green” to get a binary classification, and I kept only years after 1950 because the earlier years were too noisy. The top two scatter plots show the estimated generated power against the capacity (in MW) of each power plant, each with a fitted linear model and its 95% confidence interval. The bottom density graph shows the distribution of the two types of sources over the years (1950 - 2020).
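For anyone who wants to reproduce the layout, here is a minimal sketch of the two kinds of panels described above. It is not the original script: the data are simulated, and the column names (`capacity_mw`, `generation`, `fuel_class`, `year`) are hypothetical stand-ins rather than the actual fields of the WRI dataset.

```r
library(ggplot2)

# Simulated stand-in data; column names are hypothetical, not the WRI schema.
set.seed(1)
n <- 200
plants <- data.frame(
  capacity_mw = runif(n, 10, 1000),
  fuel_class  = sample(c("Green", "Non Green"), n, replace = TRUE),
  year        = sample(1950:2020, n, replace = TRUE)
)
plants$generation <- 3 * plants$capacity_mw + rnorm(n, sd = 300)

# Scatter plot per class with a linear fit and its 95% confidence band.
p_scatter <- ggplot(plants, aes(capacity_mw, generation, colour = fuel_class)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, level = 0.95) +
  facet_wrap(~ fuel_class) +
  labs(x = "Capacity (MW)", y = "Estimated generation")

# Density of the two classes over the years.
p_density <- ggplot(plants, aes(year, fill = fuel_class)) +
  geom_density(alpha = 0.4) +
  labs(x = "Year", y = "Density")
```

Printing `p_scatter` and `p_density` draws the panels. Stacking them into one figure would need an extra package such as patchwork or gridExtra, which is left out here.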

Source of dataset: http://datasets.wri.org/dataset/globalpowerplantdatabase

Thus, my main questions are:

  • Do you think this visualization is efficient?
  • The models are statistically significant. Should I include the p-values and slopes as well?
  • Do the colours suit the “Green” / “Non Green” classification, or is that irrelevant?
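In case it helps with the p-value question, this is roughly how the slope and its p-value can be pulled out of a fitted `lm` and turned into a label for the plot. It is only a sketch on simulated data; the variable names are hypothetical and the numbers are not my actual results.

```r
# Simulated stand-in data (not the real power plant values).
set.seed(1)
capacity_mw <- runif(100, 10, 1000)
generation  <- 3 * capacity_mw + rnorm(100, sd = 300)

fit <- lm(generation ~ capacity_mw)

# Coefficient table: one row per term; columns include the estimate and p-value.
coefs   <- summary(fit)$coefficients
slope   <- coefs["capacity_mw", "Estimate"]
p_value <- coefs["capacity_mw", "Pr(>|t|)"]

# Text that could be placed on the panel, e.g. with ggplot2::annotate("text", ...).
label <- sprintf("slope = %.2f, p = %.3g", slope, p_value)
```

One such label per panel would keep the fitted relationship readable without a separate table.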


Any other comments are very welcome.

Thanks in advance,

Thomas

Hello Thomas. I think this is an effective visualisation overall, as you are maximising the information by breaking it down into different components. I believe the choice of colours is also appropriate to convey the contrast between green and non green. Since the visualisation focuses on capacity vs generated power in 2015, my only observation concerns the bottom density graph and how relevant it is to the two scatterplots, which focus on 2015.