Clyde used data sets from the Berkeley Earth Surface Temperature project. It goes, of course, by the self-aggrandising acronym BEST.
Clyde’s article concentrated on heatwaves, using the covariance of moving-average data blocks, although that description is a simplification. I recommend reading the article.
My preference was to look at the data using pandas in JupyterLab to get a feel for what it could reveal. As usual, I have posted the notebook and data to my GitHub page if you want to download them and comment.
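To give a flavour of that kind of first look, here is a minimal sketch; it is not the notebook itself. It uses made-up daily temperatures rather than the BEST files, and the column names (`date`, `tmax`), the 30-day window and the seasonal shape are all assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a daily temperature series (NOT BEST data).
# Two years of days starting in 1883, with a crude seasonal cycle plus noise.
rng = np.random.default_rng(0)
dates = pd.date_range("1883-01-01", periods=730, freq="D")
seasonal = 10 * np.sin(2 * np.pi * (dates.dayofyear.to_numpy() - 100) / 365.25)
df = pd.DataFrame(
    {"date": dates, "tmax": 15 + seasonal + rng.normal(0, 3, len(dates))}
).set_index("date")

# First look: summary statistics and a count of missing values.
summary = df["tmax"].describe()
n_missing = int(df["tmax"].isna().sum())

# Centred 30-day moving average to smooth out day-to-day noise.
# The first and last few days have no full window, so they come out as NaN.
df["tmax_ma30"] = df["tmax"].rolling(window=30, center=True).mean()

# Mean of the daily values for each calendar year.
annual = df["tmax"].groupby(df.index.year).mean()

print(summary)
print("missing values:", n_missing)
print(annual)
```

With a real file you would swap the synthetic block for something like `pd.read_csv(...)` with the dates parsed, and check the missing-value count before trusting any average.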
The great thing about Jupyter is that the code for the analyses is clearly presented for anyone to follow and criticise or improve.
So here it is.
I sort of understand why people may be protective of the hard work they have put into creating data sets. But this data set seems so contrived as to be actually worthless. Why present it as the ‘BEST’ when it’s probably not even a good guess?
What is the confidence interval of a temperature data point for March the 5th 1883 in this data set? As that most eloquent and erudite POTUS said, “There’s an old saying in Tennessee — I know it’s in Texas, probably in Tennessee — that says, fool me once, shame on — shame on you. Fool me — you can’t get fooled again.”
Bubble Gum Data