Prediction results for 2017

Posted on January 28, 2018

I made 35 predictions last January, and judgement day has come. Well, technically it was the 1st, but I’m writing this today. I’ll use evidence as of the 1st where possible. I’m inverting some of them so all the predictions are ≥ 50%.

Politics

Trump takes office on inauguration day: 95%.
- TRUE. This one was pretty obvious. Maybe low.
Trump is not impeached: 85%.
- TRUE. Maybe low? Impeachment takes a long time.
Trump is not murdered nor does he survive an attempted murder: 95%.
- TRUE. According to Wikipedia, there have been 31 attempts in American history, successful or not. The presidency was established March 4th 1789, and there are 83,231 days between that and January 19 2017. So a rate of 31/83,231, or about 3.72e-4 per day. There were 346 days between January 20 2017 and January 1 2018. 1 - ((1 - 31/83231) ^ 346) gives us a 12.09% probability of at least one attempt, successful or not. So I think this one was high, especially accounting for how unpopular he is.
Construction on US-Mexico border wall doesn’t begin: 80%.
- TRUE.
ACLU launches at least one major lawsuit against new federal policy: 90%.
- TRUE. This might’ve been low.
No mass deportation program begins: 80%.
- TRUE. Politifact says deportations are actually down, though that may be an artifact of how they’re tracking. Regardless, nothing that qualifies as “mass”.
Trump’s approval rating as reported by Gallup is below 45%: 65%.
- TRUE. Gallup’s website is kind of awful, but they had him at 40% approve 55% disapprove on December 30th.
The US doesn’t use a nuclear weapon on a populated area: 95%.
- TRUE.
Legislation requiring the weakening/backdooring of cryptography and/or computer systems in general is not passed in the US: 75%.
- TRUE.
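
The base-rate arithmetic behind the assassination prediction above can be checked in a few lines. This is a minimal sketch using only the figures quoted in that entry (31 attempts, 83,231 days, 346 days); the variable names are illustrative, and it assumes attempts are independent with a constant daily rate, which is a simplification.

```python
# Figures quoted in the assassination entry above.
attempts = 31            # attempts in American history, successful or not
days_observed = 83_231   # March 4 1789 to January 19 2017
days_in_window = 346     # January 20 2017 to January 1 2018

# Assumption: each day carries the same independent chance of an attempt.
daily_rate = attempts / days_observed

# Probability of at least one attempt during the window.
p_at_least_one = 1 - (1 - daily_rate) ** days_in_window

print(f"daily rate: {daily_rate:.4e}")
print(f"P(at least one attempt): {p_at_least_one:.2%}")
```

Under these assumptions the window probability comes out near 12%, which is why a 95% prediction of "no attempt" looks too confident.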

Technology

Someone writes a web framework for Idris: 65%.
- FALSE. There already was one, last updated in 2014.
Conditioned on my going back to work on it, Idris gets a new build system: 90%.
- INVALID. I didn’t go back to work on it.
I go back to work on it: 65%. (I make no estimate on whether anyone else will rewrite it)
- FALSE. Early in the year I switched my projects to stuff that seemed more likely to get me a job, and later in the year I was working and developed pretty bad RSI.
There are at least two more papers published about/using Idris: 90%.
- TRUE. Two by Edwin Brady alone: Type-Directed Reasoning for Probabilistic, Non-Compositional Resources and Automatically Proving Equivalence by Type-Safe Reflection. Almost certainly more.
No program written in Idris gets substantial use outside of the Idris community: 90%.
- TRUE. AFAICT anyway. Happy to be proven wrong.
There are at least 36 jobs found when searching for Haskell on Stack Overflow Jobs. (18 at time of writing): 75%.
- FALSE. It’s 17 as of time of writing. Haskell seems to be getting less popular over time, which really doesn’t bode well. I’m not sure how to explain the trend. Stack Overflow trends.
There are at least 30 jobs found when searching for Tensorflow on Stack Overflow Jobs. (7 at time of writing): 75%.
- FALSE. 19 at time of writing.
There will be at least one large breach of user data reported at Yahoo, Facebook or Google/GMail: 80%.
- FALSE.

Personal life

I will be living in a different house: 85%.
- TRUE.
I will be living in a different metropolitan area: 65%.
- FALSE.
I will have a job programming: 80%.
- TRUE.
I will have any job at all: 95%.
- TRUE.
I will have more average positive social interaction per time, excluding work: 80%.
- Hard to tell. Not substantially up for sure.
My weight will be less than or equal to 180 lbs: 70%.
- FALSE. I was at 224.8 as of January 1. I’ve started a new weirdo diet as of the 13th which is working, so this might be improved next year.

Personal work

PDXFunc attendance will increase at least 50%: 75%.
- TRUE. 90%+ of the credit for this goes to Lyle.
I will publish software written mostly or entirely by me, outside of work, that has at least 5 users: 85%.
- FALSE. Beescheduler has 1 active user (hi!). A bunch of people signed up and then completed their goals, although I don’t actually know if they deauthorized the app before that.
The software quality survey will be online or completed: 80%.
- FALSE. Same reason as the Idris stuff above.
The collection of explanatory variables for the software quality study will have begun: 70%.
- FALSE. These are all counted as in addition to the above, not conditional.
I will have published conclusions from the study: 50%.
- FALSE.
I will not have published conclusions that are valid and actionable: 65%.
- TRUE.
No one other than me will take them into consideration: 87%.
- TRUE.

Media

No full length Half-Life game will be released: 95%.
- TRUE. We did get some nice prose from Marc Laidlaw though.
No announcement of same: 70%.
- TRUE.
Steven Universe is still airing: 85%.
- TRUE.
At least one gameplay patch for Dota 2: 90%.
- TRUE. There were 14 in 2017.

Analysis

My cross entropy score was 0.807 (range 0 - infinity), my Brier calibration was 0.0204 (range 0 - 1), my Brier refinement was 0.1496 (range 0 - 0.25) and my overall Brier score was 0.1701 (range 0 - 1). For all of those metrics, lower numbers are better. I was overconfident for the buckets from 50% to 85% and underconfident for 87% to 95%. Here’s a calibration chart:

The red line is perfect calibration, my buckets are in blue.
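
For readers without the chart, the bucket comparison can be sketched in code. The tallies below are counted by hand from the results listed above, leaving out the INVALID build-system prediction and the "hard to tell" social-interaction one; treat them as a reconstruction rather than the contents of the original spreadsheet. The `classify` helper is a hypothetical name introduced for illustration.

```python
# (stated probability, predictions that came true, predictions scored)
# Tallied by hand from the results listed in this post.
buckets = [
    (0.50, 0, 1),
    (0.65, 2, 5),
    (0.70, 1, 3),
    (0.75, 2, 4),
    (0.80, 3, 5),
    (0.85, 3, 4),
    (0.87, 1, 1),
    (0.90, 4, 4),
    (0.95, 5, 5),
]

def classify(stated, hits, total):
    """Compare the stated probability with the observed hit rate."""
    observed = hits / total
    if observed < stated:
        return "overconfident"
    if observed > stated:
        return "underconfident"
    return "calibrated"

for stated, hits, total in buckets:
    label = classify(stated, hits, total)
    print(f"{stated:.0%} stated, {hits}/{total} came true: {label}")
```

With these tallies, every bucket up to 85% comes out overconfident and every bucket from 87% up comes out underconfident, matching the description above.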

I’m not really sure what lessons to draw from this. In future, I’m not going to do this on a yearly basis - it’s much better to get feedback quicker and more often than once a year. I may start using PredictionBook, though annoyingly they compute Brier scores but not the decomposition into components.
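
For tools that only report the overall Brier score, the decomposition is easy to compute by hand. Here is a minimal sketch, assuming predictions are bucketed by their exact stated probability; the function name and the toy data are illustrative and are not the predictions from this post. With exact-value buckets, calibration plus refinement adds up to the overall Brier score.

```python
from collections import defaultdict

def brier_decomposition(forecasts, outcomes):
    """Return (brier, calibration, refinement) for binary outcomes (0 or 1).

    Predictions are grouped by their exact stated probability.
    """
    n = len(forecasts)
    brier = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / n

    # Group outcomes by stated probability.
    by_forecast = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        by_forecast[p].append(o)

    calibration = 0.0
    refinement = 0.0
    for p, group in by_forecast.items():
        hit_rate = sum(group) / len(group)
        # Calibration: gap between stated probability and observed hit rate.
        calibration += len(group) * (p - hit_rate) ** 2
        # Refinement: outcome variance remaining inside the bucket.
        refinement += len(group) * hit_rate * (1 - hit_rate)
    return brier, calibration / n, refinement / n

# Toy data, for illustration only.
toy_forecasts = [0.6, 0.6, 0.6, 0.9, 0.9]
toy_outcomes = [1, 0, 0, 1, 1]
brier, calibration, refinement = brier_decomposition(toy_forecasts, toy_outcomes)
print(brier, calibration, refinement)
```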

Technology was my worst category by cross-entropy - 1.3793. The breach prediction and the jobs predictions were worst. In the TF jobs one I was very overconfident. There are a lot of postings for the `machine-learning` tag, but not for TF specifically. I think I overestimated how much employers care about specific libraries and how complicated industrial ML work is. The Haskell prediction I talked about above.
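
The Technology figure can be recomputed from the results listed above. This is a sketch under two assumptions that are not stated in the post: the score is the mean negative base-2 log of the probability assigned to what actually happened, and the INVALID build-system prediction is excluded. Under those assumptions the result rounds to the 1.3793 quoted here.

```python
import math

def cross_entropy(predictions):
    """Mean negative log2 of the probability given to the actual outcome."""
    total = 0.0
    for stated, came_true in predictions:
        p_actual = stated if came_true else 1 - stated
        total += -math.log2(p_actual)
    return total / len(predictions)

# (stated probability, came true?) for the scored Technology predictions.
technology = [
    (0.65, False),  # web framework for Idris
    (0.65, False),  # going back to work on the build system
    (0.90, True),   # at least two more Idris papers
    (0.90, True),   # no Idris program with substantial outside use
    (0.75, False),  # Haskell jobs
    (0.75, False),  # Tensorflow jobs
    (0.80, False),  # large breach of user data
]

print(round(cross_entropy(technology), 4))
```

The four misses at 75% and 80% contribute 2 bits or more each, which is what drags the category average up.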

My second and third worst were the personal categories. I was overoptimistic about project difficulty, succumbing to the planning fallacy even though I’m supposed to know better, and I underestimated how much time and energy professional work would take up. Which is not to say it was a bad year; I really like my job. My RSI seems to be getting better so perhaps I’ll be able to spend time on personal projects in 2018.

(Spreadsheet here.)