Tiny Quiet Boycott

In 2013, someone I know was trying to fly to California for a break, but didn’t catch a break from the TSA, probably based on his appearance[1].

The TSA deserves much of the blame in the story. However, how does one reform the TSA?[2][3] By doing politics, of course! It’s a good thing that our national politics aren’t deadlocked, dysfunctional, and generally fucked[4].

But I note that JetBlue didn’t help once he was cleared by security. Sure, we can come up with reasons to explain away why this is the right thing to do[5], but none of them seem particularly strong. And, by symmetry, I know I wouldn’t want to get stranded after being put through the TSA wringer.

So JetBlue’s choice made me sad, but what can one do?

You’ve already seen the post title, so you know that I did end up doing something about it. I had been flying JetBlue regularly[6] (I had flown JetBlue the month before I read Aditya’s post), so I had leverage: I could just stop exchanging money for their services.

There are many reasons that people just stop giving companies money, but in this case I made my reason unambiguous: I filled out a feedback form on JetBlue’s website, telling them that I had resolved not to fly JetBlue for the next 5 years because of their poor customer service in this specific instance. And so I started my tiny, quiet boycott.

Why 5 years? It balances practicality against having some actual principles. In the specific case of airlines, I only fly at most a few times a year, and choosing some short time span like a year means that my choice to avoid JetBlue wouldn’t rise above noise, even to someone watching my airline choices in particular. On the other hand, forever is a long time[7], and I’ve seen enough of the world to know that “negative press cycles” sometimes happen for arbitrary reasons. However, not reacting at all because the news delivery channel is noisy just gets us to a worse equilibrium, where anyone can get away with murder because you can’t completely trust your newspaper. Hence, timeboxed principles[8].

Why tiny? Why quiet? Partly because trumpeting a cause on social media seemed (and still seems[9]) distasteful; it’s how we end up with a louder and louder echo chamber filled with attentional red queen races[10]. Another part is that while the instigating incident was sad and shitty, I didn’t have the energy to grow even a small mass movement around it[11]. So, tiny and quiet it was.


Well, it’s been 5 years since then, and I succeeded in refraining from flying JetBlue[12]. Miscellaneous thoughts[13]:

  • I nearly caved after 2 years; the price temptation is real. I feel like I now viscerally understand how we ended up price-pressuring away all the niceties that used to be associated with air travel.

    I will note the 5 year timebox helped me to stay the course. When I would see lower prices on a route through JetBlue, I could remind myself that I wasn’t bound to pass up lower prices forever, just for a while.

  • Was anyone paying attention? Functional Decision Theory (FDT)[14] at least implies that one ought to do some apparently irrational act like voting, even without communicating. The catch is that lots of knowledge impacting voting is public and shared, but there are relatively few people that know Aditya. As far as I know, I am the only one that took direct financial action as a result of his post, so the impact of this incident is capped at a few hundred dollars, if that. Once transactions are bucketed into annual reports and rounded to the nearest million dollars, my impact is negligible. However, if JetBlue’s company-wide policies are generally causing awful post-TSA treatment and people implement FDT[15], then maybe it could add up to really hitting them in the dollars[16].
  • Was it worth it? It was for a good cause, and pretty low cost to me. But I didn’t pin this down into the universe of different causes and actions. I didn’t do a cost/benefit analysis, didn’t run an impact analysis; intuitively it seems quiet boycotting would do atrociously on those measures. On the other hand, as part of a well rounded charity budget, we could file this under “probably not impactful, but feels righteous”.
  • What about actually reforming the TSA? That will have to wait for another day.

TLDR

You too can do timeboxed quiet boycotts. Tell your friends to be superrational, and cooperate without cooperating!


[1]  He looks mildly Middle Eastern, but only if you don’t recognize that Indians are at best geographically adjacent.

[2]  Note that no existing organization is going to take attempts to kill or reform it lying down. Samz[]dat: mass movements that lose their originating purpose will find some other way to perpetuate themselves.

[3]  I’m slightly amenable to the 2nd-level take: security theater keeps people from freaking out, and that’s great! Perhaps it’s comparable to gazelle stotting. But that doesn’t excuse the fact that we’re throwing people under the bus. Under the extended biological metaphor, the TSA is more of an autoimmune disease, with the organism expending resources to attack itself.

[4]  You got me: while I disparage US politics, I don’t follow politics particularly closely and don’t understand the levers of power. Hell, I haven’t even read The Power Broker yet.

[5]  It’s a holiday, so everything is booked up hard (especially since JetBlue has a no-overbooking policy, so they don’t normally have to bump people from flight to flight). Or maybe there were unintended consequences from a new policy. These excuses seem tepid: was there really no recourse?

[6]  It seems much harder to expect any boycott impact if you never planned on buying products from the target company. I could vow to never buy any makeup brand, and it would have zero impact on any bottom line in the world.

[7]  Another point: much of the time there’s some discretion being exercised by a corporate agent, and trying to pin infinite blame on an organization for the actions of a single person seems a bit much. For example, deciding you will never use T-Mobile Austria because one of their front-line Twitter help people didn’t escalate properly (Twitter: @troyhunt, Vice) is a bit of an overreaction. And, if “negative press cycles” happen more frequently than companies turn over, you’re eventually going to have a bad time, forced to choose between being principled and being practical.

[8]  Another way to think about this is as being akin to tit-for-tat: react, but with forgiveness. Obviously, this first-level analysis is simplistic.

[9]  Remember, I ain’t posting to Facebook.

[10]  It also calls to mind slacktivism, noise without meaningful action.

[11]  I might not have thought it at the time, but now I know my effective altruist sensibilities would have kicked in and forced me to ask whether I wanted to draw this from my non-altruistic project bucket, since it’s obvious this cause wouldn’t win any QALY contests.

[12]  In one case, even when I was not the person buying the plane tickets.

[13]  For some reason I thought that Alaska Airlines had bought JetBlue, and wrote up something about not considering all my bases: if I was flying Alaska, then was I effectively flying JetBlue? Except I had misremembered, and Alaska had bought Virgin instead. Score another point for memory being frighteningly malleable.

[14]  Also see Hofstadter’s superrationality.

[15]  Yeah, I know how this sounds. “If only people did X, we would be so much better off!”

[16]  I know, I’m re-implementing organized movements from the ground up. I know, the particular genre of “we’re going to replace common practice X by thinking from first principles, it’ll be great!” has a… poor track record.

Probability Theory by Jaynes: Summary, Thoughts

This is an overview/review of E.T. Jaynes’ textbook on Bayesian probability theory, appropriately named Probability Theory (GoodReads, Publisher).

This post is split into 3 parts:

  1. A fast high level overview, to provide background for the next two questions.
  2. My answer to the question “should you read this book?”
  3. A mildly more detailed breakdown of the book into chunks, paired chapters that share a theme. The intent is to provide a map of where certain material is, to help readers hit just the parts they want.

Fast Overview

What does the textbook cover?

Chapters 1-4: Deriving/applying probability theory

Jaynes starts by deriving probability theory from basic building blocks, rigorously(ish)[1] working from basic properties until he has covered many of the results presented in an introductory probability textbook[2]. Then Jaynes extends to results beyond introductory textbooks, to show off the flexibility of the methods developed.

Chapters 6-7, 11-12: Priors

As befits a Bayesian textbook, the choice of priors is a running topic. Jaynes starts with the principle of indifference, extends to maximum entropy distributions with an entire chapter devoted to the Gaussian, and includes pointers to marginalization and coding theory as prior generation methods in later chapters.

Chapters 8-10, 15, 16-17: Against orthodox statistics

Probability Theory is half Bayesian textbook, and half polemic against the old guard Fisher/Neyman orthodox school of statistics. The polemic nature of the textbook is not confined to these chapters, but these chapters are where Jaynes focuses on picking apart orthodox practices, tools, and sociology.

Chapters 13-14, 18-22: Other

The other chapters make sense for inclusion, but don’t share a strong common theme.

Personal thoughts

Originally I thought I would be badass and work through all the proofs: after spending around half a year of my free time on the first 6 chapters[3], I gave up on that plan. For calibration, the most advanced math course I took in college was partial differential equations[4]. If you’re more adept at math, you may be able to follow along with the flood of proofs more easily[5].

After skipping through the mathematical parts, there’s still interesting insight to be derived, but the book likely isn’t as clear as a work written from the ground up to serve as loose insight fodder. However, I don’t know that such a work exists yet[6], and so the textbook might still be worth reading if you want a look at this philosophical mindset.

On first reading, the polemic parts enlivened the work, but looking back I see them as offputting: even if it is true and necessary[7], the tone leaves a bad taste in my mouth.

Should you read this?

Yes, if:

  • You have a few months of your schedule cleared, mathematical aptitude, and a desire to learn the foundations of Bayesian probability theory inside and out.

Maybe:

  • You’re interested in how to motivate probability theory. Some folks recommend reading just the first 2 chapters, but I wonder if explanations of Cox’s theorem available on the internet are sufficient[8].
  • You’re a frequentist, want to learn about how frequentism and Bayesian approaches relate to each other, and (importantly) don’t mind feeling attacked.
  • You want a polemic; moreover, you want a highly technical polemic. I only hesitate, since it’s a long polemic, and there are likely papers written by Jaynes that encapsulate this feeling more efficiently[9].

No:

  • You need something practical, to use in your day to day work as quickly as possible. As befits a book with “Theory” in the name, this is not that book.
  • You want to learn about modern Bayesian approaches that take full advantage of the current glut of computational power. Jaynes died in 1998, just as the dotcom boom started taking off, and WinBUGS (an early Bayesian analysis package) was only just released. I can vouch for Statistical Rethinking as a beginner-friendly text, and Bayesian Data Analysis is regarded as a good non-introductory work[10].
  • You are bothered by incomplete works. Jaynes didn’t finish writing Probability Theory before he died, and the later chapters have big chunks missing and lines of inquiry dropped on the floor.

Chunk overviews

Most of the conceptual chunks consist of paired chapters.

Chapters 1-2: Grounding Probability Theory

The textbook starts by laying out the reasoning behind probabilistic thinking, first motivating thinking about probabilities by contrasting it with Aristotelian logic. Then, Jaynes sets out to chart the properties of probabilistic thinking, framing it as an extended logic.

Unsurprisingly, Jaynes delivers the usual definition of probability, with the usual product/sum/Bayes rule. Surprisingly, he derives these fixtures by starting with a simple set of desiderata (desired properties)[11]:

  • Plausibility is represented by continuous real numbers.
  • Qualitative correspondence with common sense. This means using syllogisms similar to but weaker than Aristotle’s.
  • Consistent reasoning, including:
    • Path independence. If an answer can be calculated multiple ways, then each calculation should give the same answer.
    • Non-ideological. The reasoner does not leave out information.
    • Equivalent states of knowledge are represented by the same number.

and simply working forwards via mathematical derivation, laying out the entirety of Cox’s Theorem. As befits a Bayesian probability textbook, a term for background/prior information is included from the beginning without much fanfare.

Chapters 3-4: Doing Inference

With the product and sum rules in place, Jaynes works out exact solutions to “draw a ball without replacement from an urn” problems, including a surprising backwards inference to the first draw given information about a later ball draw.
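The backwards inference has a tidy closed form: knowing a later draw was red is exactly like having one fewer red ball available for the first draw, so P(first red | second red) = (M−1)/(N−1). A brute-force check (my own illustration, not code from the book):

```python
from fractions import Fraction
from itertools import permutations

def p_first_red_given_second_red(red, total):
    """Enumerate all ordered 2-draws without replacement and condition
    on the second ball being red."""
    balls = ["R"] * red + ["W"] * (total - red)
    pairs = list(permutations(range(total), 2))
    second_red = [(i, j) for i, j in pairs if balls[j] == "R"]
    first_red = [(i, j) for i, j in second_red if balls[i] == "R"]
    return Fraction(len(first_red), len(second_red))

# With 3 red balls out of 10, learning the second draw was red leaves
# 2 red balls spread over the other 9 slots.
print(p_first_red_given_second_red(3, 10))  # 2/9
```

The symmetry is the surprising part: information about a later draw updates an earlier one exactly as if time ran the other way.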

Expanding to drawing with replacement, Jaynes takes the chance to draw a map vs territory distinction: randomization is throwing away data. While he goes on to derive the usual randomized draw results, he also extends the result to draws with (simplified) non-perfect randomization.

After working forwards from given generating information (given this urn, what is the probability of drawing a red ball?) Jaynes also works backwards to do hypothesis testing (given these draws, what does the urn look like?). There’s a bit of concept/terminology thrashing when Jaynes adopts and throws away terms (decibels of evidence, the log form of the likelihood) as he generalizes to multiple hypothesis testing.

By this point, it’s pretty clear that Jaynes has an axe to grind, with a constant exhortation to OBEY THE RULES OF PROBABILITY THEORY and a vendetta against taking underdefined limits to infinity[12].

Chapter 5: Queer uses for probability theory

A grab bag of topics, a break chapter of sorts. Jaynes talks about ESP, which leads to the counter-intuitive idea that different priors can lead to different people updating their probabilities in opposite directions (as an extreme example, priors may include “Bob is a paragon of truth” and “Bob is a compulsive liar”). He also talks about the importance of comparing alternatives, instead of evaluating hypotheses in a vacuum, offering a solution/dissolution of Hempel’s paradox (is seeing a white shoe supporting evidence for “all crows are black”?).
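A toy calculation (my own, not Jaynes’ exact numbers) makes the opposite-direction updating concrete: two reasoners hear Bob assert hypothesis H, but hold opposite beliefs about how Bob’s assertions track the truth.

```python
def posterior(prior_h, p_claim_given_h, p_claim_given_not_h):
    """P(H | Bob claims H), by Bayes' rule."""
    num = prior_h * p_claim_given_h
    return num / (num + (1 - prior_h) * p_claim_given_not_h)

prior = 0.5  # both reasoners start undecided on H

# "Bob is a paragon of truth": his claims track the truth.
trusting = posterior(prior, p_claim_given_h=0.9, p_claim_given_not_h=0.1)

# "Bob is a compulsive liar": his claims anti-track the truth.
distrusting = posterior(prior, p_claim_given_h=0.1, p_claim_given_not_h=0.9)

print(trusting)     # ≈ 0.9: belief in H rises
print(distrusting)  # ≈ 0.1: the same claim pushes belief down
```

Same data, same rules of probability, opposite conclusions; the divergence lives entirely in the priors about Bob.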

Chapters 6-7: What priors should we use?

Jaynes starts by generalizing hypothesis testing to a continuous domain, but I think this chunk is more properly thought of as starting to tackle the hard question of prior selection. He works out the impact on hypothesis testing of choosing a uniform prior, a truncated uniform prior (assuming at least 1 ball of each color in an urn), a concave prior (more uninformative than uniform), and a “binomial monkey” prior.

Chapter 7 is just about the normal/Gaussian distribution. Jaynes includes 3 different derivations of the distribution (by Herschel-Maxwell, Gauss, and Landon), which seems like overkill. However, his motivation is to explain the unreasonable effectiveness of the normal distribution (other distributions naturally become Gaussian, the Gaussian stays Gaussian under common operations, and it is the maximum entropy distribution given mean/variance), and to dispel the unease people feel when using a distribution that doesn’t match their (unknown) error distribution. I also think of this as his first attempt at making good on his promise in the preface to teach us maximum entropy methods.

Jaynes also weighs in on:

  • early stopping, basically stating that data sets that could have been observed but weren’t should not impact the analysis of the data set that actually was collected[13].
  • improving precision by aggregating data. Contravening folk wisdom, averaging a bunch of data with 3 significant digits means we can confidently state the average with 4 significant digits.
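The aggregation claim is easy to check by simulation (my own sketch, with invented numbers): round each noisy measurement to 3 decimal places, and the average still recovers finer precision than any single recorded value.

```python
import random

random.seed(0)

true_value = 1.234567
noise_sd = 0.001  # measurement noise comparable to the last recorded digit

# Each measurement is recorded to only 3 decimal places...
measurements = [round(true_value + random.gauss(0, noise_sd), 3)
                for _ in range(10_000)]

# ...but the error of the mean shrinks like 1/sqrt(N), so the average
# is good to better than one more digit.
mean = sum(measurements) / len(measurements)
print(abs(mean - true_value))  # well under the 0.0005 rounding granularity
```

The intuition: rounding adds a little extra noise, but noise (of any flavor) averages out, while the recorded resolution does not cap the precision of the aggregate.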

Chapters 8-10: Against frequentist tools

So far most of the topics have been about explaining the Bayesian approach to probability, but now Jaynes lays into frequentist tooling and philosophy.

I didn’t get much out of this chunk, since I wasn’t grounded in frequentist practice before. However, a list of topics Jaynes explains as redundant or supplanted by the Bayesian approach:

  • sufficient statistics.
  • the likelihood principle.
  • ancillary statistics.
  • ad-hoc evidence combination. Includes a cute parable about estimating the height of the emperor of China (1 billion people know the emperor’s height to ±1m, so averaging all the estimates should give an estimate with stdev 1m/√N ≈ 0.03mm. The key to the paradox is that the individual estimates are not independent) and a warning against something similar to Simpson’s paradox.
  • 𝛘2[14], or significance tests in general that purport to evaluate a hypothesis without any alternatives to compare against.
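The emperor-of-China parable above also simulates nicely (my own toy version): since every citizen’s guess is anchored on the same shared rumor, averaging converges on the rumor, not the truth.

```python
import random

random.seed(1)

true_height = 1.75          # meters; known to almost no one
rumor = true_height + 0.30  # the shared folklore everyone starts from

n = 100_000  # stand-in for a billion citizens
# Each citizen's estimate = shared rumor + independent personal noise (~±1 m).
estimates = [rumor + random.gauss(0, 1.0) for _ in range(n)]

mean = sum(estimates) / n
naive_precision = 1.0 / n ** 0.5  # the "stdev of the mean" argument

print(abs(mean - rumor))        # tiny: the average nails the rumor...
print(abs(mean - true_height))  # ...but stays ~0.3 m from the truth,
                                # far worse than naive_precision suggests
```

Only the independent part of the error averages out; the shared component survives at full strength no matter how many estimates you pool.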

Jaynes also spends around half the chunk explaining connections between Bayesian probability theory as thus far explained and frequentism, showing that in a pretty simple case the frequentist solution is equal to the Bayesian one with an ignorant prior.
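One minimal instance of that correspondence (my own sketch; not necessarily the exact case Jaynes works): estimating a Gaussian mean with known noise under a flat “ignorant” prior, where the posterior mean lands exactly on the frequentist estimate, the sample mean.

```python
import math

data = [4.9, 5.3, 5.1, 4.7, 5.0]
sigma = 0.5  # known measurement noise
xbar = sum(data) / len(data)

# Posterior over the unknown mean mu on a grid, with a flat prior:
# posterior(mu) proportional to likelihood(data | mu) * 1
grid = [4.0 + i * 0.001 for i in range(2001)]  # mu in [4.0, 6.0]

def loglik(mu):
    return -sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2)

w = [math.exp(loglik(mu)) for mu in grid]
post_mean = sum(mu * wi for mu, wi in zip(grid, w)) / sum(w)

# The posterior mean under the ignorant prior is the frequentist
# estimate, the sample mean.
print(xbar, post_mean)  # both ≈ 5.0
```

The numbers coincide; the philosophical interpretations (a distribution over beliefs vs. a sampling distribution of an estimator) do not.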

(Of interest to people from LW: Jaynes is certain that the probabilistic nature of quantum mechanics is false, that the quantum physicists have given up the cause too easily[15]. With that in mind, it feels like Yudkowsky’s quantum mechanics sequence is in part a response to this charge.)

Chapters 11-12: Discrete/continuous priors

Now in part 2, Jaynes loops back to expand on Bayesian concepts.

So far Jaynes has touched on maximum entropy priors, especially around the normal distribution, but now he lays out a more rigorous definition of information entropy, working from desiderata:

  • there’s a connection between uncertainty and entropy.
  • continuity: entropy is a continuous function of p.
  • if there are additional choices, uncertainty increases.
  • consistency: if there are multiple ways to get an answer, they should agree.

from which Jaynes derives information entropy, following the Wallis derivation. Using this definition, he expands maximum entropy distributions past just considering an average from prior data.
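The canonical worked example of going past “just an average” is Jaynes’ Brandeis dice problem: maximize entropy over faces 1–6 subject only to a mean of 4.5. The solution is exponential-family, p_i ∝ exp(λi), with λ tuned to hit the constraint; here is a bisection sketch (my own implementation):

```python
import math

def maxent_die(target_mean, faces=6):
    """Maximum entropy distribution on {1..faces} with a fixed mean.
    The solution has the form p_i ~ exp(lam * i); we bisect on lam."""
    def mean_for(lam):
        w = [math.exp(lam * i) for i in range(1, faces + 1)]
        return sum(i * wi for i, wi in zip(range(1, faces + 1), w)) / sum(w)

    lo, hi = -10.0, 10.0  # mean_for is increasing in lam
    for _ in range(100):
        mid = (lo + hi) / 2
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(lam * i) for i in range(1, faces + 1)]
    z = sum(w)
    return [wi / z for wi in w]

p = maxent_die(4.5)
print([round(pi, 3) for pi in p])
# Higher faces get more weight; the probabilities rise geometrically.
```

A mean above 3.5 forces λ > 0, tilting mass toward the high faces while staying as spread out (maximally entropic) as the constraint allows.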

He also extends to continuous maximum entropy distributions, tackling the problem of getting a distribution that is invariant under parameter changes. (For example, a uniform prior can lead to different results depending on whether it’s uniform over x or x⁵.)
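The parameterization problem is easy to see numerically (my own quick check): a prior uniform in x concentrates most of its mass in the low end of y = x⁵.

```python
import random

random.seed(2)

# Draw x uniformly on [0, 1] and look at y = x**5.
ys = [random.random() ** 5 for _ in range(100_000)]

# If y were also "uniformly ignorant", half the mass would sit below 0.5.
# Instead P(y <= 0.5) = P(x <= 0.5**(1/5)) = 0.5**0.2 ≈ 0.87.
frac_below_half = sum(y <= 0.5 for y in ys) / len(ys)
print(frac_below_half, 0.5 ** 0.2)
```

So “I’m ignorant, use a uniform prior” is not a complete instruction; you also have to say uniform over which parameterization, which is the gap invariance arguments try to close.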

Chapters 13-14: Decision Theory

This is a bit of a strange chunk: does this really belong in a textbook entitled Probability Theory?

First, Jaynes lays out some groundwork, including a demonstration of non-linear utility (the Tversky/Kahneman research program shows up a few other times in Probability Theory) and the usual square/linear/Dirac delta loss functions resulting in the usual mean/median/mode estimates.
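The loss-function/estimate correspondence is simple to verify by brute force (a sketch of my own, with made-up data): grid-search for the point estimate minimizing each total loss over a skewed sample.

```python
def best_estimate(data, loss):
    """Grid search for the point estimate minimizing total loss."""
    grid = [i / 100 for i in range(0, 1501)]  # candidates in [0, 15]
    return min(grid, key=lambda c: sum(loss(x - c) for x in data))

data = [1, 2, 3, 4, 10]  # skewed, so mean and median differ

mse = best_estimate(data, lambda e: e * e)  # squared loss -> mean
mae = best_estimate(data, abs)              # absolute loss -> median

print(mse, mae)  # 4.0 3.0
```

Squared loss picks the mean (4.0, dragged up by the outlier) and absolute loss picks the median (3.0); a Dirac delta loss would analogously pick the mode.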

Getting to the heart of the matter, Jaynes walks through Wald’s decision theory[16], and ties it to Bayesian inference by way of identifying a prior distribution in the likelihood calculations. Then, he walks through deriving different decision rule criteria (minimax) from a Bayesian criterion.

Given that Jaynes identifies these developments of decision theory as starting points for the Bayesian revolution, it makes sense why this topic shows up, even if only for historical context.

Chapter 15: Stop, Paradox time

Another break chapter: Jaynes picks apart paradoxes, most of which have to do with improper limits to infinity.

I think Jaynes included discussion of these paradoxes to shore up his position that probability theory is not just practically workable, but correct, applicable everywhere. In this light, tackling historical confusions makes sense.

Chapters 16-17: Against Orthodoxy

Jaynes returns to ragging on frequentists, this time with a more philosophical bent.

Jaynes presents his take on the sociology of orthodox statistics as encouraging learned helplessness, with a cookbook approach (Statistical Methods for Research Workers) and a doctor-client-like statistician relationship, instead of teaching researchers the underlying principles themselves. (I do wonder if he was typical minding and overestimating people’s ability to generate probability theory from scratch.) Unfortunately, he also stoops to comparing personal details about Fisher and Jeffreys, which made me feel like I was reading a high-brow academic tabloid instead of a textbook.

That said, there are more technical arguments as well, around the choice of unbiased estimators, the practice of prefiltering data (if you smooth your data, you get some future data into your current data), and the (mis)use of the sampling distribution width as a measure of estimator goodness.

An interesting insight that makes intuitive sense to me is that Fisher and Jeffreys were naturally responding to their fields: Fisher in biology had lots of data, but not as much theory, and Jeffreys in geophysics had well developed theory, but not much data. In that light, it is no wonder only Jeffreys considered priors important.

Chapters 18, 20: Future work

(The strongly thematic paired chapters start to fall apart here.)

Now we’re getting into some more speculative work, which Jaynes thought should be developed with more rigor in the future.

Take, for example, the Ap distribution, in which Jaynes models a distribution of probabilities, instead of a distribution of values, leading to the final probability of a particular hypothesis as an expectation of the distribution. It seems necessary in order to avoid recomputing probabilities with all the gathered data so far, but he didn’t have a principled motivation for the construct, nor a way to avoid infinitely regressing with a probability of a probability of a…

He also sneaks in a discussion of Laplace’s rule of succession, attempting to rescue it from rampant misunderstandings (it only applies with little to no prior information), and uses it as a vehicle to tie together probability/frequency again.

Jaynes also touches on model comparison, but only as an expanded sort of hypothesis testing. Most of the chapter is not super practical: for example, while he vaguely gestures towards the problem of overfitting, he doesn’t give concrete solutions to solve it. (The recent generation of Bayesian textbooks noted above give more detail on possible solutions.)

Chapters 19, 21: Wrapping up loose ends

Jaynes shows that Bayesian methods don’t need as much hand holding if the model can take into account bad/inaccurate data, down-weighting data the model concludes is bad without needing manual intervention (ex. outlier removal[17]).

Chapter 22: Example Application, Communication Theory

The chapter is kind of a worked application of max entropy, but otherwise it’s not clear why Jaynes decided to include this chapter. My best guess is that he meant to make clearer the use of coding theory for prior generation, but wasn’t able to do so before he died.

Plus, it’s not a great introduction to communication/coding theory when he won’t call a Huffman code a Huffman code.


[1]  Rigor as defined relative to me: others have described it as “fast and loose” in the tradition of the Griffiths physics textbooks.

[2]  Looking over the table of contents for A First Course in Probability by Sheldon Ross, many but not all of the chapters would be covered.

[3]  I was well into my software engineering job, which did leave me less intellectual energy. If you are not as heavily intellectually taxed, this timeline could be compressed.

[4]  Keep in mind that not all courses are the same; doing well in the equivalent course at MIT would be more impressive than what I did at my small liberal arts college, and hence bode better for following all the details.

[5]  For example, a good chunk of the time I was trying to puzzle out whether some step was in fact a legal operation, and how it was legal.

[6]  Portions of Rationality: A-Z come close, if you’re not interested strictly in probability theory and just want the general parts of Probability Theory.

[7]  Kind/true/necessary categorization stolen from the SSC comment policy.

[8]  Like, how often are you going to need to rederive the sum and product rule from basic principles, instead of simply leaning on the knowledge that probability theory has such a basis?

[9]  Or, House of Cards (GoodReads) is another polemic in the same vein, written for a more popular audience.

[10]  It’s unclear to me exactly how non-introductory you have to get. Like, do you just need to know about distributions? Have taken a graduate course?

[11]  It’s a bit weird to include 3 different sub-items under the same “consistent reasoning” heading, but apparently this is the traditional formulation of the desiderata.

[12]  To be fair, it does seem like it’s a problem that people tend to fall into easily.

[13]  You may have read warnings about early stopping (example), but those concerns only impact NHST.

[14]  There’s a fun worked example that underlines the ad-hocness of the 𝛘2 test: let’s say we have a (thick) coin, a person that knows the coin’s flip distribution (49.9% heads, 49.9% tails, 0.2% on edge), and a person only informed there are 3 outcomes (33.3% for each). For the data 14 heads/14 tails/1 edge, 𝛘2_coin = 15.33 and 𝛘2_equal = 11.66.

Note there are practical ways to overcome the problem of a small category with the 𝛘2 test, some more satisfying than others.
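The 𝛘2 numbers above can be reproduced directly (my own reimplementation of the footnote’s arithmetic):

```python
def chi_squared(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [14, 14, 1]  # heads / tails / edge, 29 flips
n = sum(observed)

coin = [0.499 * n, 0.499 * n, 0.002 * n]  # the person who knows the coin
equal = [n / 3] * 3                       # the person told only "3 outcomes"

print(round(chi_squared(observed, coin), 2))   # 15.33
print(round(chi_squared(observed, equal), 2))  # 11.66
```

The single on-edge observation blows up the (o−e)²/e term for the tiny expected count, so the better-informed hypothesis scores worse: exactly the ad-hocness being illustrated.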

[15]  Keep in mind this is coming from the man that coined “whenever there is a randomized way of doing something, then there is a nonrandomized way that delivers better performance but requires more thought”.

[16]  It seems like the Wald maximin model is the major contribution Wald made to decision theory, but it’s not obvious this is the same concept as in the textbook. It also doesn’t seem like this decision theory matches directly to CDT or EDT.

[17]  The outlier removal example used Euler’s attempt to estimate the orbital parameters of celestial bodies as a frame device. I’m half taken by the idea of using this frame as a demonstration of Bayesian methods: just go outside, make crappy measurements with crappy equipment, and derive an okay solution for the orbital parameters for the inner solar system.

Signing up for Cryonics: an incomplete and biased guide

Status: surprised no one has put this together yet, would have loved to have this 6 months ago. If you don’t know what cryonics is, good luck.

What this guide will cover:

  • The different steps necessary for signing up for cryonics.
  • Some choices that will have to be made in the process of signing up for cryonics, and some recommendations.

What this guide will not cover:

  • Whether cryonics is generally a good idea.
  • Whether or not cryonics is right for you.
  • Exhaustive detail on each choice in the cryonics process: for example, I won’t evaluate every life insurance provider in the US.

In other words, I will assume that you already want to sign up for cryonics, that you are facing the trivial inconvenience of learning more details about the process (or what a normal signup looks like), and that you are similar to American, June 2018 (~30 year old) me.

Finally, note that recommendations included are not chosen by some impartial arbiter: all judgments and recommendations are based on my incomplete knowledge, and no attempt to correct for my biases was made. If you make important life choices based solely on this guide (eg. signing up for cryonics), you might end up making bad choices, and we will wonder how you could have been so naive as to trust a random person on the internet.

TLDR/Summary

First, choose your cryonics provider and procedure:

  • Alcor, if you can afford it.
    • Choose neuropreservation (just brain freezing) if you want a cheaper Alcor option, and don’t care about preserving your body[1].
  • Cryonics Institute (CI), if you can’t afford Alcor.

Second, choose your type of life insurance.

Third, choose your insurance provider. Rudi Hoffman is the cryonics insurance agent, but you can go through another agent or try doing it yourself as well.

These recommendations might not be good if:

  • You are not young and reasonably healthy.
  • You plan on being outside of America when you get old and likely to die.

Choice 1: Cryonics Providers

There are two major cryonics institutions in the US: Alcor and the Cryonics Institute (CI). There are more cryonics companies, but due to time constraints I decided to pass on seriously evaluating them. In particular, I was lazy and wanted an established company so I could free ride on other people’s opinions[2]. If you are outside the US, service by these organizations will likely be more expensive or worse than usual, and it may make sense to evaluate other cryonics organizations.

Alcor

Alcor is higher profile[3], seemingly more established[4], and more expensive: the total cost for Alcor is $100k for neuropreservation (just the brain) and $220k for whole body preservation, with an additional annual membership fee of $525.

Notes about the above cost, particularly about options not included above:

  • The normal prices for Alcor are $80k for neuro and $200k for whole body: why are the prices I provide above $20k higher?

    Standby (stabilization and transportation of your body) can be paid for with either a $180/year fee, or by committing to an extra $20k payment on death, in addition to the freezing costs. If you end up paying for cryonics via life insurance (detailed in the next section), an extra $20k of coverage should work out to less than $180/year in insurance premiums, especially if you are young. Therefore, I ignored the annual standby fee option and included the $20k in the costs above.

    (Standby is technically just the process of having people that can stabilize and transport close by when a patient is close to death. Speed to freeze after death is important, so a dedicated team makes sense. However, I’m going to simply group all the costs related to standby, stabilization, and transport under standby for ease of reference, instead of repeating “standby, stabilization, and transport” a bunch of times.)

  • You can pay membership fees more frequently than annually, but the more frequent payments add up to a more expensive rate (though only marginally: $134 quarterly dues add up to $536/year vs. $525). I have no idea why you would pay more money and endure more inconvenience for the more frequent option.
  • The membership fees can be discounted if you are a student, or if you have a family member who is already signed up for cryonics.

Cryonics Institute

The Cryonics Institute (CI) is seen as more of a mom n’ pop budget organization, with concomitant lower costs: CI offers whole body preservation for $28k (CI does not offer a neuropreservation option[5]). Membership costs a one-time fee of $1250.

CI also does not handle standby directly (stabilization and transportation of your body), so you will need to contract that out separately. For example, if you use the services of Suspended Animation (SA; referenced here by CI), you could pay between $37.5k[6] and $60k for successful transport to CI. One can also pay for standby services via life insurance (detailed in part 2).

Notes about the above costs:

  • Regarding membership fees, there is also an option to pay an annual fee instead of the one-time fee. However, it costs $120/year (vs. $1250 once), and you have to pay a higher $35k procedure fee (vs. $28k), which adds up to a bad deal for just about anyone.
  • SA has lots of options, which can lower your costs a lot if you know when/how you are going to die (ex. if you have aggressive terminal cancer).

Cost Notes

Bringing the costs together again:

Alcor: $100k – $220k, $525/year. Depends on whether you choose neuro or whole body.

CI: $65.5k – $88k. Combines CI and SA costs.
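As a sanity check, these ranges are just sums of the component prices quoted above; a quick sketch:

```python
# Sanity check: the quoted ranges are sums of the component prices above.
ALCOR_NEURO = 80_000        # Alcor neuro base price
ALCOR_WHOLE_BODY = 200_000  # Alcor whole body base price
ALCOR_STANDBY = 20_000      # standby folded in as a one-time payment

CI_WHOLE_BODY = 28_000            # CI whole body price
SA_LOW, SA_HIGH = 37_500, 60_000  # SA standby/transport range

alcor_low = ALCOR_NEURO + ALCOR_STANDBY        # 100_000
alcor_high = ALCOR_WHOLE_BODY + ALCOR_STANDBY  # 220_000
ci_low = CI_WHOLE_BODY + SA_LOW                # 65_500
ci_high = CI_WHOLE_BODY + SA_HIGH              # 88_000
print(alcor_low, alcor_high, ci_low, ci_high)
```

(The $525/year of Alcor dues comes on top of the Alcor range.)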

Important! Note that these prices could rise at any time: in the beginning of cryonics, members were locked in at the price when they signed up, so if they died 30 years after signing up, the procedure may have become a lot more expensive (say, simply due to inflation, or advances in technique), but they would still be paying prices from 30 years ago. This predictably turned out to be financially unworkable, since the cryonics organizations suddenly needed to either get into the business of projecting when their members would die, or eat the potentially large differentials between current and future cryonics costs (see Gwern’s capture of Alcor hand-wringing about this problem, and Rudi talking about the funding change). Instead of doing that, the organizations now require new members to cover the cryonics cost at the time of death: if that is 50 years from now, the costs may be significantly different than they currently are.

Additionally, standby costs for tricky situations may be higher than normal: see this case of a patient being airlifted with funds provided beyond the costs stated above, and the potential need to pay for extra days of standby with SA.

The bottom line is that you will want to overfund your cryonics procedure relative to current prices. How much of a margin to overfund by is up to you, and strongly depends on how old you are: the younger you are, the more time prices have to grow. For reference, I am funding my procedure at 3x current prices, which I arrived at by spitballing, not rigorous modeling. You can compare this with the guy that runs the site fightaging.org, who is overfunding by 2x (at an unknown age). Also for reference, see Alcor’s past prices[7] (I could not find any CI charts).
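To get a feel for what a given overfunding margin assumes, here’s a toy compounding model (entirely my own illustration; the growth rate and horizon are made-up assumptions, not Alcor/CI projections):

```python
# Toy model: if cryonics prices grow at a steady annual rate, how much
# bigger are they after n years? (Inputs are illustrative assumptions.)
def price_multiple(annual_growth: float, years: int) -> float:
    return (1 + annual_growth) ** years

# Even a modest 3% annual growth over a 40 year horizon compounds past 3x,
# which is roughly the margin I used.
print(round(price_multiple(0.03, 40), 2))
```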

Alcor vs CI

It seems like people generally recommend Alcor if you can afford it, and CI otherwise.

Also note that the president of CI went on a charm offensive in 2016, popping up in a number of places on the internet (Quora/r/cryonics thread)[9].

Let’s ground this in a case study: Robin Hanson and Eliezer Yudkowsky (from We Agree, Get Froze) are signed up with Alcor and CI respectively (source), which fits: Robin the tenured professor can afford Alcor, while Eliezer the non-profit researcher[10] would have a harder time doing so.

For reference, I went with Alcor.

Choice 2: How to Pay for Cryonics

Now, let’s talk about paying for that big-ass freezing fee[11].

(And a reminder: I Am Not An Accountant[12]. If you uncritically take all the following assertions at face value, you deserve as much sympathy as someone who believes a random Redditor just gave them a hot tip about dogecoin, i.e. no sympathy at all.)

Most people don’t have $220k lying around. Even if they did, the organizations can’t accept your word that you will pay them those thousands of dollars, because you will be dead and unable to do things. You can will them the money, but people contest wills all the time, and the organizations can’t wait weeks or months for your will to settle: you need to be frozen as soon as possible to save as much brain structure as possible.

You could deposit money for the process with the cryonics organization, but the organizations won’t invest in risky equities, so your deposit might not keep up with procedure price increases. Plus, that’s a huge amount of money to be tied up for most people.

Instead, the usual way to pay for the service is to use life insurance. If you die, the policy pays out a guaranteed amount after death, when the money is needed. And, if you make the cryonics organization the owner of the policy (as suggested), it is difficult to drag the policy into a will battle, since it’s not an asset owned by the deceased. Problems, neatly solved.

(If you’re concerned about insurance fraud, because cryonics is all about a last ditch effort[13] to fight death, consider that life insurance companies are making the same amount of money on cryonicists as anyone else, since the insurance pays out on legal death, which is a lower standard than information-theoretic death. Besides, we don’t even know if cryonics works.[14])

There are two broad types of life insurance.

  • Term: coverage lasts for a fixed term, usually 10, 20, 30 years. You keep paying in premiums, and if you die within the coverage window, the policy pays out. Once the term is up, you will need to get another term policy. Term premiums are cheaper, which makes sense when you consider around 97% of people do not die while covered by term insurance[15]. On the flip side, if you’re old or sick, a new term policy will be expensive, perhaps to the point that you would be uninsurable.
  • Indexed universal life (IUL) (replacing older whole life insurance[16]): coverage lasts for your entire life, or whenever the cash value funds run out. As a simplified model, you can consider the policy as a growing fund that you pay into, with fees taken out of the fund to pay for an ongoing term-like policy in the background. The exact way the fund grows seems to be the main difference between the different sorts of whole/universal life: for example, IUL could peg growth to the S&P 500, hence the “indexed” (with some caveats, which we will cover below). The fund growth is important to offset the effects of growing fees as you grow older, and hence more likely to die[17].

It’s common wisdom that “term is better than whole” (including whole variants like IUL), which seems to usually be correct. Consider a more normal insurance scenario, in which a single income household wants to cover the death of the breadwinner while the kids grow up: in this case, 20 year term would cover the kids growing up, and afterwards the children are adults and can responsibly decide to squander their money on avocado toast and cronuts.

In the case of cryonics, though, you really do want coverage until you die, so whole life makes much more sense despite the higher premiums.

(Note that IUL premiums aren’t intrinsically higher, but are effectively higher due to needing to cover your entire life; following the suggestion to pay in higher premiums early helps grow the fund portion of the policy, which covers the higher fees later in life.)

That said, term insurance can make sense, but only if you are currently budget constrained, won’t be budget constrained in the future, and want to avoid cryo-crastinating[18] (you wouldn’t be reading this otherwise, right?). If you’re a university student, 10 year term life insurance should be cheap, and bridge you into a future where you have a job and the ability to pay for an IUL policy.

Let’s expand on the IUL a bit more. So you pay into the policy, and any funds left over grow pegged to an index like the S&P 500 (the indexed part of Indexed Universal Life). And that’s not all:

If you really want, you can squint at it and try to treat the entire package as a Roth IRA[19] without contribution limits[20].

This sounds too good to be true! How are we paying for this lunch? Or, what’s in it for the insurance company?[21]

  • There’s a cap on maximum yearly account growth (ex. if the index grows 20%, but the cap is 12%, the account will only grow 12%).
  • Or, there’s a cap on the proportion of the account that can “participate” in the market (ex. if the market grows 20%, but the participation rate is 75%, the account will only grow 15%).
  • These caps can be changed at the insurance company’s discretion (with the understanding that efficient markets means the companies aren’t going to give you awful rates for no reason; however, you can bet that they’re going to choose rates that will make this a worse “investment” vehicle than a normal brokerage account).
  • The market growth does not include dividends: putting dividends back into the market can account for a large portion of a normal investment account’s growth, and the indices are usually pegged to price returns sans dividends/interest, not total return.
  • Even if the cash account growth is floored at 0%, inflation effectively adds in negative growth on down years.
  • Surrendering the policy (ending the policy to access the cash value of the policy) means paying income taxes on any gains over premiums already paid. Using loans to cash out the policy is questionable: it’s illegal to sell IULs as an investment, which makes me wonder if some regulator is going to nail tax-evasive IUL usage to the wall at some point.
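To make the cap/participation/floor interaction concrete, here’s a minimal sketch of how an indexed account might credit a year’s growth (a simplified model of my own; real policies have more moving parts):

```python
# Simplified model of one year of IUL index crediting. The parameter
# names and defaults are my own illustration, not any policy's terms.
def credited_rate(index_return: float,
                  cap: float = 0.12,
                  participation: float = 1.0,
                  floor: float = 0.0) -> float:
    return max(floor, min(cap, participation * index_return))

print(credited_rate(0.20, cap=0.12))                     # 20% year, 12% cap: 12%
print(credited_rate(0.20, cap=1.0, participation=0.75))  # 75% participation: 15%
print(credited_rate(-0.10))                              # down year, floored at 0%
```

Note that `index_return` here is the price return sans dividends, which is part of why this underperforms a plain brokerage account.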

Taken all together, IULs are not a good investment[22]. But they work for our purposes: we want a vehicle that will pay out a guaranteed amount on death, will at least attempt to keep up with inflation, doesn’t require a huge upfront investment, and avoids the uninsurability problems of other schemes[23]. Given these upsides, it’s not surprising that life insurance is how most cryonicists pay for the procedure.

And to give you an idea of ballpark life insurance costs:

  • At 31yo, $500k coverage, 30 year term: $43/mo. (source)
  • At 31yo, $100k coverage, 30 year term: $14.55/mo[24].
  • Myself, at ~30yo, $240k coverage, IUL: ~$110/mo.

And as of 2018, Alcor’s site has a table of age vs life insurance coverage in the midst of their funding options page.

There are quote engines on the internet that will give you more ballpark estimates, especially if your age or coverage needs are different. But note, again, these are estimates, and your actual rates may vary depending on the underwriting process.

Choice 3: Choose your insurance provider

There are a multitude of insurance providers out there. However, some of them aren’t suitable for our purposes: the provider needs to be okay with the policy being used for a cryonics procedure, and needs to let us assign the cryonics organization as the policy owner.

(Note we’re not giving up any flexibility by doing this ownership transfer: at least Alcor (and probably CI also) will sign an agreement stating that they will hand back ownership of the policy on request (Alcor example) if you don’t want to be signed up for cryonics any longer.)

You could ask Alcor/CI for a list of preferred insurance agents, but you could do worse than to default to Rudi Hoffman, who is also signed up for cryonics himself. You can request an insurance quote from his website.

Fair warning: having gone through the process with Rudi, I can see why Alicorn decided not to go through him due to a personality conflict. Rudi is very much a salesperson, and I can see that certain people would be turned off by salesman-like rapport building. However, unless you are dead certain that you cannot work with such salespeople, I would recommend at least going through the first 5-15 minute call and seeing if you really can’t work with him. It would be a shame if you couldn’t take advantage of Rudi’s up to date market knowledge.

And if you find you really can’t work with Rudi? Alicorn did it herself through New York Life, but that was back in 2012, and the markets may have changed. You’ll have to shop around, and I don’t have any particular insight there.

The Sign Up Process

Once you’ve made your choices, it’s time to make it happen.

Sign up for Life Insurance

If you go through Rudi, what does the process look like?

  • Do an introductory call, chat about your choices about cryonics organization, life insurance type, and the amount of coverage you want.
  • (Possibly IUL specific) Rudi sends you an illustration, which outlines some worst/average case scenarios. I’m guessing this is required to highlight the market based nature of the account, that no rate of return is guaranteed, and ongoing costs could eat into paid in premiums.
  • Verbally assent that the illustration looks good.
  • Receive the full life insurance application, fill it out and send it in. You will likely need to provide the first month’s payment, or some way of charging you.
  • As part of the application process, go through a call with a nurse, who will take your medical history.
  • Do a physical exam, which the life insurance company will contract out. You’ll likely need to fast for half a day before the exam, and the exam will likely include a blood draw[25].
  • Wait a couple weeks, and get your shiny new policy in the mail! (For example, mine took 7 weeks to arrive from when I mailed my application in, not accounting for delays in taking my history or physical examination. If you are on point and get back to the insurance company promptly, you will probably have a faster turnaround time.)

Sign up for Cryonics

I’ll be detailing the steps for Alcor.

  • Print out the cryonics application, fill it out (you can do most of it on the computer with PDF form filling), and send it to Alcor. This requires an application fee of $300, and making choices like “what should happen to my remains if I die in a remote location and am not discovered for years?” These choices will feed into the exact text that is included in the agreement.

    You don’t necessarily need to wait until the life insurance process is done to send in the application, and can fill out the insurance details in the agreement you’ll get later. However, note that there’s a $90 fee if you take more than 4 months to finish the application process once you’ve started[26].

  • Alcor will send back a full agreement with legal verbiage matching your choices in the initial application. If you didn’t fill out your insurance information in the application, fill it out on the policy transfer agreement. This step may be redundant, since Rudi may have already sent your insurance policy to the cryonics organization.

    In addition to signing all the documents, you’ll need to get 2 witnesses to sign most of the documents, as well as get the included Last Will notarized. Banks will notarize documents; since my bank didn’t have a branch in my area, I went down to the UPS Store and got notarization services there for $2. Note you’ll need to get both your signature and your witnesses’ signatures notarized at the same time, so you’ll need to find some time that’s good for all the parties involved.

(Side note: the agreement contains a list detailing the plethora of ways cryonics can go wrong or fail to work. If you’re not sure you want to sign up for cryonics, I recommend reading this list first (available on Alcor’s website, starting on page 2 in section 9), before you spend lots of time and money on the process only to bail at this step.)

  • Wait about a week, and you’ll get back a packet, with a counter-signed copy[27] of the agreement, as well as necklace/wrist tags/wallet cards detailing what needs to happen in the event of your death[28].

Congratulations! You’re now signed up for cryonics.

Post-signup

There are some additional steps you should probably take.

  • Let your family know that you’ve signed up for cryonics, and make sure they are willing to support your cryonics procedure. There’s at least one horror story where the family of someone signed up for cryonics hid the fact that the cryonicist had died (and embalmed them!), so they could try to get the insurance payout. If your family will not support you in the cryonics process, you may want to assign your will’s executor and power of attorney to someone that will.

    Alcor also recommends getting family members to sign a relative’s affidavit, but that isn’t as important as making sure they know you’re signed up for cryonics, and support it.

  • Tell your friends you’re signed up for cryonics. While it doesn’t seem as helpful as telling your family[29], you may want to cover your bases: make sure the non-family people most likely to be around you know your wishes regarding cryonics, will call the cryonics organization if something happens, and are in a position to advocate for you if your family has a change of heart.
  • Wear the cryonics bracelet/necklace. This is probably not especially helpful, since it seems more likely that you’ll die of a cause you see coming, or you will die around people that know your wishes. However, if we’re already relying on long tail chances to succeed, it may make sense to cover the long tail failures with low-cost coverage.

Hopefully this guide has cleared up the moving parts around signing up for cryonics, and served as a good introduction to the process. Let me know if anything is confusing, or differed in your own cryonics sign up experience[30].

And remember: live forever, or die trying!


[1]  Some people worry that the nervous system outside of the brain will play a role in personality/subjective continuity. Maybe they’re right.

[2]  On the other hand, having another company would be great for redundancy; right now, if something went wrong with Alcor or CI for reasons unrelated to being a cryonics company, that would be bad for the cryonics market in the US. Having another major player would reduce the chances of catastrophic failure.

[3]  Part of this fame is due to suing relatives that refuse to allow their clients to get frozen.

[4]  People seem to believe that Alcor has more employees, but Wikipedia’s Alcor page lists their employee count as 8 (as of 2016/06), which implies that CI is really on a shoestring budget.

[5]  This seems fine, since the entire point of a neuro option is to be cheaper, and CI is already inexpensive.

[6]  Using the prepaid incremental plan $7.5k + the $30k completion fee. This is pretty aggressive, and I would be worried about the lack of flexibility, but the nice thing is that CI+SA allows you to make the call yourself.

[7]  Obviously, past performance does not guarantee future performance.

[8]  I remember reading a critical piece with more specific criticisms, but now I can’t find it. Sorry.

[9]  Also thought I saw him in more places, but when I went back, I didn’t have it all in one place, and couldn’t find it all.

[10]  I really, really wanted to put “math messiah” as his profession instead.

[11]  But why does cryonics cost so damn much? It wasn’t obvious this would be the case: you can read about early cryonicists projecting that demand for cryonics would become so large that economies of scale would kick in, and cryonics would cost $8.5k. Instead, in 2018, I have an Alcor ID number in the low 3000’s, which extends over 46 years of operation (Alcor was founded in 1972). That’s not scale by any stretch of the imagination.

[12]  Even if I were an accountant, I’m not your accountant.

[13]  Strangely, cryonics is a last ditch effort to avoid being lowered into a last ditch.

[14]  In that regard, you could also apply the same line of thought to any religion with an afterlife: “Ha! My life insurance company thought I died, but in reality I transcended to heaven and now my earthly family has all this money!”

[15]  The exact wording is “97% of term policies do not result in a death benefit”, which isn’t exactly the same as 97% of people not dying. I presume the wording is the way it is due to fraud cases.

[16]  It seems like the main benefit is lower costs from having the policyholder hold the risky investments.

[17]  Obviously, this doesn’t cover potential life extension; if people start routinely living to 200, problems are bound to happen. On the other hand, you might not even need cryonics if we hit life extension escape velocity.

[18]  Like many things, people put off signing up for cryonics. Unlike many things, signing up for cryonics has to be done before you die, and can’t easily be done for you by another person, so procrastinating on this particular thing is particularly bad. It can be done if everyone knows that you want it, supports your decision, is willing to put large sums of money into the process, and can make it happen while grieving: for example, see this case study (I vaguely remember seeing a case study that started after death, but I can’t find it). You’re already rolling the dice with vanilla cryonics; why gamble some more on whether your loved ones can get you frozen without prior arrangements?

[19]  “Withdrawals” appear to be done via loans against the IUL policy. As the breathless tone of the article might tip you off, the maneuver is not exactly not shady.

[20]  IRA contributions top out around several thousand dollars per year.

[21]  “Contributing” to profitability is not a bad thing: you don’t want your insurance company to go out of business, especially 30 years from now when it will be much more difficult for you to get favorable life insurance rates.

[22]  Unless you have absolute Fuck Tons(tm) of money, in which case you might actually want whole life insurance for estate planning. Have fun talking to an accountant.

[23]  Consider another approach, where we implement something like whole life ourselves: we create a 10 year term insurance ladder, renewing a policy every 10 years, and pay for it with normal investments which don’t have the same IUL restrictions. The main problem with this is that you have to have a new insurance policy underwritten every 10 years, during which you might become uninsurable. Same problem with doing a 30 year term and then dumping normal investment money into an IUL. If you plan on making so much money that pre-paying $220k is chump change, then good for you, but this guide is not for you.

[24]  The table must have a typo: there’s no way prices jump up $10 from 30-31yo, and then back down for 31-32yo.

[25]  You may get to see your blood panel results, if your insurance provider shares the data with you.

[26]  This isn’t necessarily consumer hostile: when people are known to cryocrastinate, a financial penalty might help them get over the last hump. It doesn’t seem likely that people start signing up for cryonics without knowing what they’re getting into, only to suddenly change their minds 4 months into the process.

[27]  If you want a counter-signed original for your records, you may need to print an additional set of documents or otherwise request it: I only received one set of documents.

[28]  Hopefully, you will know that you’re dying ahead of time, and be able to convey to your doctors that you are signed up for cryonics, and to please cooperate with the standby team. Even more ideally, you dodge death entirely.

[29]  Looking over the 8 cases in 2018 so far, it looks like many patients were in hospice care, where there is ample time to explain the patient’s cryonics wishes. In the most dynamic case, the patient choked to death, but his family was around to alert Alcor and explain the situation. And in the worst outcome, no amount of telling people about the patient’s cryonics wishes would have helped, since the patient died and wasn’t discovered for some amount of time.

[30]  I can’t guarantee that I can integrate everything into this specific guide. I’m thinking it might be a good idea to create a more general guide that does try to cover the entire possible decision tree. Let me know if you’re interested in such a project.

Time Levels

Summary: lays out a skill leveling system based on time. I don’t endorse it, but maybe you’ll have fun thinking about it.

Back in 201X[1], I was thinking about how to convey mastery of a skill.

For example, at some point I was talking to a young kid that also played the violin. The kid was playing through the Suzuki instruction program, and since he knew I also played the violin, he asked me which Suzuki book (1-10) I was working from at the moment. For context, I was in late high school/college, and had barely touched any of the Suzuki books (since I didn’t go through the program). And, the last book just contains Mozart’s Concerto 4, which is not difficult in the grand scheme of things[2].

I told the kid that there was an entire other universe outside of the Suzuki books, and at some point you have to leave them behind. The kid wandered away looking confused and unconvinced, but it got me thinking: how do I convey just how much distance he has to go? Or, turn it around: what would I need to hear in order to make the call “yes, this person is way better than I am at $TASK”?

Some fields have it easy, with rankings built in: chess and Elo come to mind, as do Go and dan ranks. Or if you’re an insider, you can use your knowledge to make judgments: for example, maybe you just happen to know that Mendelssohn’s violin concerto is harder to play than Bach’s Concerto for Two Violins, or that one youth symphony is generally known to be better than another.

But if you don’t happen to be looking at a field that self ranks, or a field that you deeply know, is there a way to get a sense of someone’s mastery?

What if we used time spent practicing?

Time Leveling

The general idea is that the amount of time someone has spent practicing a skill is correlated[3] with how good they are at that skill.

While thinking about this around 4 or 5 years ago, I randomly came up with the following way to map hours of practice to a “skill level”:

minimum hours of practice = level · 2^level

So, for the first 10 levels:

  • Level 1 → 2 hours
  • Level 2 → 8 hours
  • Level 3 → 24 hours
  • Level 4 → 64 hours
  • Level 5 → 160 hours
  • Level 6 → 384 hours
  • Level 7 → 896 hours
  • Level 8 → 2048 hours
  • Level 9 → 4608 hours
  • Level 10 → 10240 hours
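The thresholds above can be sketched in a few lines (the `level_for` helper is my own addition for going the other way):

```python
# minimum hours of practice = level * 2**level
def min_hours(level: int) -> int:
    return level * 2 ** level

def level_for(hours: float) -> int:
    """Highest level whose hour threshold has been met."""
    level = 0
    while min_hours(level + 1) <= hours:
        level += 1
    return level

print([min_hours(n) for n in range(1, 11)])
# [2, 8, 24, 64, 160, 384, 896, 2048, 4608, 10240]
print(level_for(5000))  # 9: past the 4608 hour threshold, short of 10240
```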

It’s appealing because the levels work out to a logarithmic relation with hours, and people love logarithmic magnitude progressions. Plus, it seems to intuitively fit the range from not knowing anything about the skill, up to the point where one should be more than competent at it. I’ll give examples in the next section to help show that the intuition seems correct. And, the system reaches level 10 around 10000 hours (from the 10000 hours to mastery meme), which makes it easy to re-derive the exact hour calculation.

Why hours? Minutes are overly fine grained, and days are too coarse grained: practicing 1 minute and 24 hours are vastly different outcomes, both of which might count as “I practiced today”. While it’s possible to use fractions of a day instead, hours are easier to think about as a sub-day increment, instead of a unit that is conflated between 24 hours and “around 16 hours of being able to do things”[4].

Examples

To demonstrate, I’ll run some Fermi estimates[5] of hours of practice I have done for different skills. I’ll be conservative with most of my estimates, because much of my practice was done before I had really rigorous time tracking[6].

Level 9 Programmer

I started programming around the end of middle school, but in a fantastically undisciplined and sporadic way, where sometimes I would hack feverishly through the night, and other times would do nothing for long periods of time. Surprisingly, this continued into actually joining a CS program, so I extended the range of really uncertain estimates[7] over 3 years of HS and 5 years of college.

8 years · 12 months/year · 1-10 hour / month = 72-720 hours

I graduated and went off to full time work, which is really where I started putting in lots of time to coding, minus weekends and some number of vacation days[8]. I’m uncertain whether I should discount from 8 hours/day quite so much, but I think it’s right for a target between deliberate practice and ass-in-seat time.

6 years · 250 weekdays/year · 3-4 hours / weekday = 4500-6000 hours

Total: 4572-6720 hours, just meeting[9] or well over the level 9 limit, 4608 hours.

Level 7 Violinist

I started playing in the 5th grade, and wasn’t especially serious about it until the 8th grade. The summer was especially bad, so I’ve heavily discounted a large portion of the year.

3 years · 2/3 of each year · 52 weeks/year · 0.5 hours/week = 52 hours

Then I got a private teacher, joined a youth symphony, and I started putting in more time during high school[10].

4 years · 50 weeks/year · 2-5 hours/week = 400-1000 hours

I continued into college, and was a bit more studious. And then I moved to New York, and basically stopped playing. I do vaguely remember having my shit more together during college, but I figure the weekly estimate is better kept on the conservative end.

3 years · 50 weeks/year · 4-8 hours/week = 600-1200 hours

Total: 1052-2252 hours, solidly past level 7 (896 hours) but not solidly into level 8 (2048 hours), so I rounded down.

Level 4 Statistician

I know I’ve picked up things like the normal distribution by osmosis throughout life, but have no idea how to price that in, so I’m leaving it out for now as “not even trying to learn statistics”.

0 hours

However, I do know that high school statistics was basically worthless.

0-4 hours

I took a statistics course during college, which was at best mildly better, and required actual thinking.

8-16 hours

I did pick up some probabilistic thinking from LessWrong, but I question whether anything beyond “An Intuitive Explanation of Bayes’ Theorem” really made a difference. However, it was great at making me care about probability/statistics.

1-8 hours

More recently, I spent some time doing statistical work while reanalyzing a game development postmortem. I time tracked the entire thing, so I have a precise figure without uncertainty[11].

13 hours

I somewhat recently read Jaynes’ textbook Probability Theory. Also time tracked.

101 hours

And, I spent 10 hours investigating how to use Stan, a statistical tool.

10 hours

And I’m currently reading/working through problems in the textbook Statistical Rethinking, which compared to “Probability Theory” is a billion times more practical and a billion times less theoretical[12].

9 hours (as of mid-August)

Total: 142-161 hours, solidly above level 4 (64 hours), but not well into level 5 (160 hours) yet[13].
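These totals are just interval sums; a quick sketch using the statistician figures above:

```python
# Sum up Fermi (low, high) ranges; the figures are the statistician
# estimates from above.
def total(ranges):
    return (sum(lo for lo, _ in ranges), sum(hi for _, hi in ranges))

stats_hours = [
    (0, 0),      # osmosis, not counted
    (0, 4),      # high school
    (8, 16),     # college course
    (1, 8),      # LessWrong
    (13, 13),    # postmortem reanalysis (time tracked)
    (101, 101),  # Jaynes' Probability Theory (time tracked)
    (10, 10),    # Stan
    (9, 9),      # Statistical Rethinking, as of mid-August
]
print(total(stats_hours))  # (142, 161)
```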

Level 4 Drawer

Much like many children, I spent a fair amount of time in school and at home scribbling at random. However, I didn’t do that much of it, and instead spent a lot of my post-primary school days reading instead of scribbling[14], so I’m heavily discounting the time I would have spent. Plus, I don’t have a good way to estimate my time doing art in primary school.

8-16 hours

More recently, after the point I had actually grown into a Real Person™, a friend gave out drawing lessons, and afterwards I spent a fair amount of time practicing sketching[15].

12 weeks (~3 months) · 2-4h/week = 24-48 hours

Since then, I’ve done some more miscellaneous doodling, for example in support of my current avatar image.

4-8 hours

Finally, I spent some time working on art for a puzzle. It didn’t go anywhere, but there’s still sketches saved to my hard drive, and I time tracked this work.

20 hours

Total: 56-92 hours, ranging from just under to well into level 4 (64 hours).

(Point in favor of the leveling system: I’m not a good artist, but even the short amount of time I’ve dedicated to actual practice has gotten me visibly better at drawing, compared to a random sampling of software engineering coworkers who could barely put out stick faces during a recent brainstorming exercise.)

Level 1 Potter

We’re going to discount all the time I spent playing with playdoh as a kid. Indeed, the only time I’ve worked with ceramics is in the last month, when I took a 2 hour wheel workshop.

2 hours

Total: 2 hours, exactly the threshold for level 1 (2 hours), going from “completely clueless” to “can slap clay on a wheel and intellectually understand how things have gone so wrong“.

Downsides

All that said, I don’t think this system is useful as anything other than a curiosity.

Practice is heterogeneous

Not all practice is equal. Deliberate practice yields greater dividends than forced feet-dragging practice. Practicing in bad conditions (say, before getting coffee) is not as useful as practicing when awake and alert[16]. People themselves are quick-witted or slow to learn[17].

And this is all before considering what is being practiced. Practicing tic-tac-toe has a much lower skill ceiling than playing Go, and playing the kazoo has a much lower skill ceiling than playing anything else[18].

That is, someone at some level could be worlds better/more impressive than another person at the same level, and the system makes no effort to distinguish between them.

10000 hours is bullshit

The system maps level 10 to a number of hours close to 10000, which Malcolm Gladwell presents in Outliers as the amount of practice needed to reach mastery. This is a blessing, because the 10000 hours meme[19] is everywhere and thus easy to remember. It is also a curse, because it’s a bad meme.

There’s a kernel of truth to it: you need to put in lots of time before becoming a master. But, as Anders Ericsson explains in Peak, a sort of response to Gladwell, the research that served as the source for Gladwell’s 10000 hour figure is at best approximate. Deliberate practice matters, and if you don’t do that sort of focused practice, you can plink away at the hours and barely make any progress.

So the 10000 hour link isn’t one that should be reinforced. Or, maybe the sort of disclaimer I just gave would arrive intact to whoever hears about this system, but it’s more likely that some radical simplification happens in transmission and nobody hears the more nuanced take.

(On the other hand, this system is a radical simplification of practice and mastery already; maybe it deserves to be tied to Gladwell.)

Equivalence with plain hours

Once you have an estimate of hours practiced, you can convert those hours to levels. But the two are roughly equivalent, so why go through the extra step?

You could argue the wide buckets are helpful:

  • since the buckets are so wide, people won’t be tempted to spend lots of effort getting a precise estimate that doesn’t matter.
  • again, it maps to a logarithmic understanding of the world, which seems correct: practicing for twice as long usually does not mean someone becomes twice as good.

However, what if given an hour count, we simply threw away all digits aside from the most significant one? So 4356 hours would map to 4000 hours, and 356 hours would map to 300. This has many of the same benefits as the system (aside from the logarithmic understanding), while avoiding an arbitrary division of hours into levels. The main downside would be that it’s only a step away from just reporting a precise hour count, and then we’re back at doing detailed accounting, just to show up your rivals by practicing 10 hours more than them even at 4000 aggregate hours.
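As a quick sketch of that alternative (the function name is mine, purely illustrative), keeping only the most significant digit of an hour count is a few lines:

```python
def round_to_msd(hours: int) -> int:
    """Keep only the most significant digit of an hour count,
    truncating the rest: 4356 -> 4000, 356 -> 300."""
    if hours < 10:
        return hours
    magnitude = 10 ** (len(str(hours)) - 1)  # e.g. 1000 for a 4-digit count
    return (hours // magnitude) * magnitude

print(round_to_msd(4356))  # 4000
print(round_to_msd(356))   # 300
```

Note this truncates rather than rounds, matching the “throw away the other digits” framing above.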

Adoption could have bad effects

Remember Goodhart’s law? “When a measure becomes a target, it ceases to be a good measure.” It would be easy to make this system a target (contrast this with IQ, which is not easily movable[20]). If we apply optimization pressure towards leveling, the obvious point of failure is that the system cannot capture how deliberate any practice is, so mindless practice is rewarded.

And then I do wonder if calling out the lifetime amount of time practiced would damage intrinsic motivation. Again, applying optimization pressure to a measure that only pays attention to time practiced would drive out other motivational factors.

Network effects

The system is primarily a social technology: it’s most useful when exchanging information quickly. If you need to explain what the hell “level 6” is each time you invoke it, then what’s the point?

And then on the other hand, you could run into the same network effects as lojban: “your communication would be unambiguous and logical, but with the kind of people who learn lojban.” In this case, you can exchange information quickly, but among people willing to ignore the caveats that ought to encrust the system.

What’s the point?

The most obvious use of the system is in an informal contest of skill among friends, acquaintances and enemies, or what one might possibly call a “dick measuring contest”. Given the flaws above, and the subjectivity that Fermi estimation introduces, it’s not especially interesting to judge the system by this usage.

On the other hand, one could construct cases in which it is difficult for laymen to identify masters. In such a case, the system might actually give the best estimate of skill available. However, it’s hard to see how this might be generally useful: sports teams have playoffs, Go has Kyu/Dan rankings, chess has Elo, the violin has competitions and visible positions for professionals. Hell, for the violin, even listening to someone should be enough to at least pick someone out of the “dying cat” category[21]. If nothing else, you could do the same thing as everyone else and outsource your judgment to college[22].

Somehow, the scenario where the system becomes useful requires profound ignorance on the part of the arbiter, combined with caring a lot about picking winners and losers. In that narrow case, a metric like this system could work. However, if mastery actually matters, profound ignorance of the thing you are trying to measure is an “oh shit” state of affairs, not something you want to keep long-term.


So, to close: I lightly endorse the (sparse) thinking that led up to the system, but with all the downsides I don’t recommend actually trying to use/distribute it.


[1]  For those confused, the X is deliberate: I don’t have a hard date to nail down exactly when I was doing this thinking, just that it was within the last 8 years.

[2]  For example, if you leave high school without being able to breeze through that concerto, and you want to play at a professional level, then something has gone terribly wrong.

[3]  The alarm bells should be going off right about now: everything is correlated with everything else (in fiction form), so just claiming any level of correlation is Not Helpful.

[4]  And don’t forget to knock off another 8 hours for work, and another couple hours for making sure you do things to stay alive.

[5]  Fermi estimates are non-ideal because it’s usually thought that they will only get you within an order of magnitude of the actual number. Unfortunately, that might mean getting us just within the ±1 level range.

[6]  If you want more on time tracking, and enjoy podcasts, you can check out episode 44 of Cortex.

[7]  The low end estimate seems really low: I feel like I must have spent half that time just physically punching programs into a TI-83.

[8]  It’s true that I do some coding on the side, but it’s a drop in the bucket compared to my day job.

[9]  If it bothers you I didn’t strictly hit the minimum, imagine I added in the couple extra months of full time work I dropped when I rounded down the time range from “6 years and change” to “6 years”.

[10]  It was clutch discovering that I could go practice when a “spirit” assembly was happening.

[11]  It’s true that this figure isn’t super-precise: ideally we would only count really deliberate practice, and I’m sure I must have spent some time staring off into space wondering what’s for lunch while I was on the clock. I’m sure, because I just did it today. However, we don’t have that, and time tracking means I’m reasonably on task.

[12]  NEWS FLASH: textbook with the word “Theory” in the title contains theoretical content! More at 11!

[13]  By the time this is published, it’s likely that I’ll be almost done with Statistical Rethinking, and solidly in level 5.

[14]  Plus, I had arty friends that I couldn’t compete with. For example, there was a marvelous sketch of a cat-me that I am sad to say probably no longer exists.

[15]  I recorded these sketch times, but it’s currently tied up in the sketchy books, and not easily accessible otherwise.

[16]  This isn’t universally true: if you’re special ops, then part of your job description is being able to function when not fully awake.

[17]  Only taking into account practice time could be seen as a systematic under weighting of the talented.

[18]  I will never turn down a chance to bash on the kazoo, because then I get to link to people doing insane things with kazoos, like making actual passable music.

[19]  This is the tiny hill I will have part of my soul die on: memes are the Dawkins concept, not whatever the EU just banned.

[20]  Otherwise, the system shares more than a few features with IQ: it’s a single ordinal number which noisily measures (all else equal, does it matter if someone has an IQ of 115 vs 112? : spent 400 or 500 hours practicing bonsai trimming?) some attribute (intelligence : mastery), and is heavily modulated by all sorts of things when you consider what you actually want done (conscientiousness : deliberateness of practice).

[21]  I hesitate to make any stronger of a claim, since I remember Joshua Bell playing in DC public transit and not getting any looks.

[22]  Elephant in the Brain (which I’ve read) borrows from The Case Against Education (which I haven’t read) to make the claim that higher education is primarily useful for credentialing. Or: education is not about learning.

Happy 10th Blog Birthday!

My blog is turning 10 this month[1]. Happy birthday, blog!

(It also happens that my high school 10-year reunion was within the last year, but it is difficult for me to care about it.)

Let’s take a look at what I posted in June 2008.

Oh no. What is this, using WordPress as unlimited character Twitter? Stream of consciousness meta-jabbering about how I was setting up the blog I was posting on? WHO WOULD WRITE SUCH CONTENTLESS DRIVEL?

Ahem.

So there have been some changes in the last 10 years. To note some of the larger ones:

  • Then, I was still at least nominally Christian. Now, I’m about as atheist as they get without foaming at the mouth.
  • Then, programming was a fun side hobby. Now, programming is my full time job.
  • Then, my most serious endeavor was playing the violin. Now, I haven’t played in months and haven’t practiced in years[2].
  • Then, I didn’t belong to a community. I had friends, I hung out at school and church, but I didn’t fit with a flock of people[3]. Now, I’ve found my people[4].
  • Then, I was just starting college in a small suburban city downwind of a sawdust mill. Now, I’ve been out in the working world for years in the middle of New York[5].
  • Then, I was absolutely awful at normal person things, like taking care of my personal appearance[6], shopping for clothes[7], socializing[8], and seeking out new experiences[9]. Now, I’m (somewhat) better at all those things.

If we really wanted to sum it up, I grew up. I got my shit together[10].

But this is all cocktail party level retrospection. Let’s dive in a bit deeper.

On the violin

For a large portion of my life I ostensibly cared about getting better at the violin, but I always had motivational problems carving out time to do any practice, much less practice that was hard or smart[11]. I got better at the mundane art of forcing myself to practice as I racked up years, but eventually I just didn’t have the will to keep it up after moving across the country[12].

In retrospect, I didn’t even have the patterns of thought to start tackling the problem of “how do I become a world class violinist?” I knew practicing was difficult for me, and the only thing I could do was try to practice harder, and then feel bad when I would once again procrastinate well into the evening, past the time people would put up with me practicing[13]. I didn’t think about changing my environment, about changing my drive and motivation, about how I could improve my feedback, about cultivating the habit of practicing. For example, if your practice time sheets aren’t motivating, maybe you should try something else!

Hell, I didn’t even alieve that “becoming a world class violinist” was a thing I could try. I could utter the words “I want to play in Benaroya Hall[14]”, but it was pie in the sky. How did people usually get to the pie? Did the pie taste good[15]? Did the pie have any money in it[16]? Didn’t know, didn’t bother finding out, it was just the sort of thing you said to signify an acceptable direction of “I’m-not-just-coasting-along”.

I think there are a few contributing factors: first, modesty norms prevented me from seriously pursuing a goal as immodest as “be the best in the world at <anything>”. Second, I seemed to naturally shy away from striving, from fully committing to doing something. As an article I might have read more than 10 years ago[17] put it, not giving it everything leaves room for uncertainty: if only I had gotten a bit luckier. If only I had tried a bit harder. If only.

As I ventured deeper into college, my awareness of my performance inadequacy paired with my growing love of technical things. The clincher came when I auditioned for the Columbia student orchestra, and was rejected. Up until that point there had been a continuous stretch of orchestral work from 5th grade to the junior year of college[18], and it’s telling that once any obstacle at all was placed in the path of that train, it immediately derailed.

(At least it derailed onto a track with material outcomes better than those of the starving artist path.)

On gaming

Let’s go back further than 10 years ago, to around 15 years ago. I would play strategy games like Age of Empires and Starcraft, but I didn’t understand how to play. That is, I could move units around, but I didn’t know how the game expected me to win. I didn’t grok simple concepts like macro and micro and logistics[19], to the point that the only way I could win was with lots and lots of cheats[20].

So we skip ahead a little, and by the time I hit college I have discretionary funds; more importantly, I also no longer have the paralyzing fear of spending said funds. So I bought The Orange Box, a bundle of great games (Half Life 2 + Episode 1 + Episode 2, Portal, Team Fortress 2 (TF2)) from back when Valve actually made games. I’m not sure, but I think TF2 might have been the spark: it turned out I was bad at FPS games. No, I was absolutely awful. If I didn’t pick medic and get points by assisting someone else racking up kills, I would sit at the bottom of the leaderboard, and routinely experienced the ignominious distinction of being forcibly switched around when teams got unbalanced.

But, I was having fun, which was good, because otherwise I might not have learned, deep down in my bones, that I was absolutely awful at something. Sure, I unambiguously flunked a college course[21], I didn’t get into MIT or Caltech, I was a pretty mediocre violin player, but I could spin those failures as temporary setbacks. I’ll show them. I’ll show them all!

Neal Stephenson (through Snow Crash) remarked that “Until a man is twenty-five, he still thinks, every so often, that under the right circumstances he could be the baddest motherfucker in the world.”[22] In matters of WASD and joystick, I was clearly not the baddest motherfucker, I was just the baddest. I couldn’t even avoid getting wiped by my freshman roommate in 1v2 in Call of Field or BattleDuty[23].

So fast forward a few years to after I graduated, when I finally got another computer capable of gaming. I picked up Civ V, and muddled my way through understanding the game, picking up an understanding[24] of game economics through a simple simulated economy. I cycled through getting my ass handed to me (Bismarck betraying me still stings) and learning how to avoid the obvious problems, which grew less obvious over time.

Perhaps even more instructive was seeing how different Civ games implemented different mechanics, and how those choices changed the game. Much like how I only got a sense of how science fiction worked by reading enough books that the differences stood out from the tropes[25], I only really understood Civ V by seeing what Civ IV and Alpha Centauri and Beyond Earth did similarly and differently[26].

Then cue a coworker getting me into the new XCOM, which eventually led me to beaglerush and getting into the optimizing mindset.

(Running parallel to this was delving into a community that celebrated finding hacks and snowballing. More on this in the next section.)

And eventually, the circle completed: I picked up the HD remake of Age of Empires II, and played the campaign on Hard. I understood the use of space, the offensive potential of a good economic base, rock-paper-scissors army composition. And I could win, no cheats needed[27].

On community

I didn’t really belong[28].

Don’t get me wrong, I had friends, good friends. But, I didn’t have a larger group of people I fit with: I drifted on the edges of social circles at church[29] and school, hanging out with the orch dorks and math geeks and fobs, but not settling in. Orchestra was instructive: sometime in high school I had hit a no-man’s-land between the decidedly mediocre school orchestras and the ambitious youth orchestras. It was a lot like hiking in the Boy Scouts[30]: I was too fast for the younger kids, but not fast enough to keep up with the older kids. I would routinely find myself hiking alone, alone with mother nature[31].

So that was longer than 10 years ago. Around 10 years ago, I joined the university physics club. It felt different: unlike a lot of high school, we weren’t thrown together as a social group to survive the prison that is secondary education. Instead, we were there by choice to nerd out. We hung out, we built a coilgun with big ole’ capacitors. It was probably one of my first tastes of being part of a group of competent peers[32], which was amazing.

(The aluminum reprap I started with the physics club was another stepping stone, an early time where I decided that I wanted to do something, and then I made it happen. Thank you, Alan Thorndike, for helping me become a more agenty person; I’m sorry we couldn’t save you.)

Then I transferred schools. I thought finding a few competent people was great, but finding a dozen was even better. ADI (Application Development Initiative) was my first foray into having dozens of people that all cared deeply about something with lots of energy. Let me emphasize: it was absolutely astounding to me that there was a veritable buffet of people that all cared about the same thing (building things and getting better at the craft of code), with around the same level of skill and prioritization of the thing we cared about.

It was my first taste of not feeling like I had to do everything myself. Group project in high school? People were incompetent or uncaring. Group project in physics club? There was one other person that was competent and cared, better hope they didn’t get sick. And in ADI, people went out and got things done, and I didn’t have to do them[33]. It’s a glorious feeling.

But still, something’s missing. Ivy league go-getters got things done, but it took a Harry Potter fanfiction meetup to put me in touch with my people, the rationalists.

My people aren’t much better on the execution axis: the ivy league kids win with vat-bred conscientiousness. They’re not more charismatic, don’t have better parties[34], don’t have more fun. No, what they have is vision: Kardashev 1 and above, the vector space of possible minds, hacking together a cycle of infinite wishes. Sure, they are a little wild in eye and beard, but fishing around for tomorrow’s out-of-context problems requires a bit of madness.

And maybe more importantly, they have spirit. We sing “Tomorrow can be brighter than today”, knowing perfectly well the Nash equilibria that might swallow us whole. We sing “Where we wanna go, who we want to be, in another, in five thousand years?”, a reminder of where we’re going when we’re drowning in elbow grease. We improve the self, because our challenges are poorly designed and don’t scale well (almost like they weren’t designed at all…) and losing isn’t an option.

For challenges shorter than 5 years, give me the ivy leaguers. For challenges further out, such that even Philip Tetlock has hazy vision, give me my people.

The Future

That was a brief snapshot of where I’ve been, and how I got here.

Now, where am I going?

I could rattle off my project list, but experience tells me it’ll change within a year. If we take an impressionist approach, within 10 years I hope that I’ll have completed something that has reach and is something I’m proud of[35]. Or, at least if Hamming sits down with me at lunch, I might have an answer to his questions. And, there’s the strong possibility that alien priorities will take hold of me and warp me into another person[36]. So either through forging or biology, I expect to be a different person.

Well, thanks for joining me in the last 10 years, and here’s to the next 10!


[1]  It might not actually be 10 years old: an early post indicates that I lost some posts in a hard drive crash.

[2]  Later I mention that I’ve grown up, but I might consider this one of the clear casualties of doing so, the sort of thing that warrants a camera pan to me staring sadly out at the sun setting into the ocean while sad violins play in the background.

[3]  I empathize with Scott’s description of not knowing just how atomized suburban life can get.

[4]  The post does point to me complaining about them, but it’s gotten better, and I’m more confident things will work out in the 5 year time frame.

[5]  To be fair, this is mostly due to inertia, which I tell myself is because NYC is great, despite the dirty streets. Also, fuck driving.

[6]  I’m sure if you dig really hard you may be able to find a picture of high school me with a crew cut.

[7]  I will admit this is debatable, since right now all my socks have holes in them.

[8]  It is true I’m usually socializing with bigger nerds than I, but surely we can ignore that minor point.

[9]  Example: back then, it took me at least a few months to go check out the university pizza shop in the basement of the student union building, precisely because there were tons of people, I hadn’t done high speed ordering before, and I wasn’t familiar with the dive bar atmosphere. Now, I get irrationally annoyed when people in front of me in line spend more than 10 seconds deciding what to order.

[10]  I also enjoy Ray’s description of the same getting-his-shit-together phenomenon in Sunset at Noon.

[11]  See the book Peak for more on practicing smart.

[12]  You can see me maintaining the fiction that I still cared in my earlier blog posts.

[13]  Note that part of this was being excessively nice: I didn’t really probe the boundaries of what I could do in pursuit of greatness, instead I used other people as an unwitting excuse to get out of putting in work.

[14]  Funny story, I’m certain I did, but I forget if it was with the Mahler Festival or some special youth symphony event, or maybe both. But that was never the intent of the phrase: soloing or at least being a regular orchestral member was.

[15]  The intermediate pies at least were somewhat delicious: I remember a friend telling me finishing performing a symphony was one of the handful of times I looked happy.

[16]  As it turns out, the pie sounds beautiful but has little money in it.

[17]  The original article got nixed, but the gist of this article is still the same.

[18]  Minus the obvious summer vacation when everything slumped over, but plus the week long summer orchestra camp that happened when nothing else was.

[19]  Looking back, I wonder if it wouldn’t have been better to just have access to Minecraft or the like. Maybe I would have eventually figured out how to game the economy, but it seems unlikely given my arc.

[20]  Show me the money.

[21]  There is a truly marvelous story about this occurrence, which this blog post is too small to contain because otherwise it would be too fucking long. The important thing is that it was clearly my incompetence that caused the flunking to happen.

[22]  Keep in mind that writers don’t necessarily endorse everything they write, for purposes of creating villains and flawed protagonists.

[23]  The nerds in the back can stop screaming, I meant that demonic mish-mash.

[24]  These days, once I get into a game, I devour it through wikis (RimWorld, Stellaris), discovering the mechanics of a game through bare stats and cold meta-gaming (alpaca or dromedary? Muffalo 4 lyfe). However, I think it was important that I muddle my way through organically grokking a game at least once.

[25]  Again, a bigger story than this footnote can hold.

[26]  If you want specifics: policies leading to more heterogeneous play styles (Alpha Centauri factions), the existence of doom stacks vs forcing local tactics with stack limits, different country boundary mechanics leading to different tactics.

[27]  That said, I did lean on some AI behaviors, like a reluctance to harass villagers.

[28]  I recognize this is exactly as emo as this could sound.

[29]  Church was weird, since I lived far away from the center of congregational mass, and was far on the Americanized side of most of the Asian churches I attended.

[30]  The Boy Scouts were another place I didn’t fit: I got to Life, but Eagle just didn’t seem to matter once I had gotten within striking distance.

[31]  It occurs to me now that the situation is probably something the scout leaders were trying to prevent (buddy system, etc) but whatever, I came through fine. And, it’s quiet in beauty.

[32]  Admittedly, the group was really small; I don’t know if we ever got more than a handful of people in a room at once.

[33]  Entering the workforce was also a bit like this: I suppose it helps that doing things is your entire job.

[34]  Although I hold that pi day is a perfectly good holiday for parties, let no one tell you otherwise. Just keep tau day as a twice as good perfectly good holiday.

[35]  I realize the irony between this hope and my earlier realization that I am not the baddest ass.

[36]  I know that version of me would be happy, but seeking maximally happy versions of myself without regards for anything else leads to wireheading.

Tax Charity Research

Epistemic status: amateur effort, on the order of half a day of casual investigation. Accidentally failed to do scholarship properly[1]. Possibly accepts bad premises[2].

It’s story time.

So tax season rolls around, and you wonder why taxes are so damn complicated, to the point that paying someone or something to help with your taxes is an attractive proposition. You would like to save the money you would spend on tax prep, but slogging through worksheets of impenetrable bureaucratese is bad enough that you’ll pay money for accountants or tax software.

Now look at Sweden, which has return-free filing. The government sends people their prefilled return, the people take a look and sign off if it’s correct, and for most people it takes minutes, instead of the hours it took me to do my taxes[3]. You only need to pay for tax help if you want to do weird things.

Now look back at my country: the large tax preparation conglomerates have an incentive to oppose return-free filing, an existential threat to their business model. If they can both keep return-free filing from being implemented and ensure the tax code is complicated enough that people have to get help, then they can push people to give them money with malicious UX.

With only mild public will to save a few hours a year, and an industry lobby to ensure those few hours stay unsaved, the outcome seems obvious. Companies will lobby for a bigger tax prep market, get money from the bigger market, and then lobby some more, ad infinitum.

It’s a compelling story, but I haven’t done the rigorous legwork of figuring out how true it is. However, for this post I will accept it as true[4], so we can focus on what happens next.

In particular, I pay for tax prep software but feel bad about feeding the grand tax prep loop[5]. What can I, a private citizen, do to nudge the nation towards getting an IRS that can implement magical return-free filing?

In this case I’m going to be lazy, not think too hard about ways to take more direct action[6], and simply look for nonprofits already operating in this space.

Summary

I mildly endorse the Institute on Taxation and Economic Policy and the Tax Policy Center. Both are either aligned with my politics or neutral. Unfortunately they are both general tax non-profits, with only tangential focus on return-free filing.

Non-profit structuring

A quick note on nonprofit structures: a 501(c)(3) organization is the usual nonprofit, with donations to the organization being tax deductible. However, 501(c)(3) organizations can’t campaign for candidates, and attempts to directly influence legislation can make up no more than an insubstantial part of their activities. Explicitly political organizations are instead covered under 501(c)(4) (social welfare organizations), for which donations are not tax deductible.

So, a somewhat common structure is a split: a backend 501(c)(3) that does the non-partisan analysis and support work, and a sibling 501(c)(4) that pushes specific policies when the analysis supports them. This doesn’t seem to prevent the 501(c)(3) organizations from being clearly tilted one way or another: one of the organizations below that I donated to stated “we’ll show how proposals to restructure or dismantle progressive income taxes will affect people across the income spectrum”, maintaining just that shade of plausible deniability[7].

However, with the recent tax bill restructuring, it’s a lot harder to itemize past the standard deduction ($12,000), so if you were going to donate to something, maybe it’d make sense to donate to 501(c)(4)s instead. Personally, I’m trying to blow past the standard deduction, so I’m primarily interested in the 501(c)(3)s, but I’ll note when there is a 501(c)(4) associated with the organizations below.
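To make that arithmetic concrete (a toy sketch; the $12,000 standard deduction is from the paragraph above, the function name and other numbers are illustrative): itemized deductions, donations included, only help to the extent their total exceeds the standard deduction.

```python
def extra_deduction(itemized_total: float, standard: float = 12_000.0) -> float:
    """Marginal deduction gained by itemizing rather than taking
    the standard deduction (0 if itemizing is worse)."""
    return max(0.0, itemized_total - standard)

# $10k of itemized deductions: you'd take the standard deduction instead,
# so the donations' tax deductibility buys nothing extra.
print(extra_deduction(10_000))  # 0.0
# $15k of itemized deductions: only the $3k past the threshold is a win.
print(extra_deduction(15_000))  # 3000.0
```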

Methodology

My methodology was super non-rigorous, politically biased, and not terribly in-depth[8].

  1. I drew from notes accumulated over the past year, which I made whenever I ran across a possible tax-related non-profit.
  2. I searched Google for other sources I may not have stumbled across randomly, and were prominent enough to be found with search terms like “tax nonprofit”.
  3. I searched Charity Navigator for charities related to taxes, skimming the list and cherry picking the ones I thought looked good.

Once I got a list of charities, I tried to answer some questions:

  • Are they a 501(c)(3) or a 501(c)(4)? Remember, donations to one are tax deductible, donations to the other aren’t.
  • What sort of work did they do? As we’ll see, there are some non-profits doing noble work, but it’s work that isn’t addressing the root causes.
  • What do their financials look like? For example, if they’re sitting on lots of cash, it’s less pressing to donate to them.
  • Do their goals align with mine?

Ideally, I would do a quasi-GiveWell-ian impact analysis, convert everything to something like QALY/$ but for something like “policy movement/$”[9], and figure out which charities were doing the best work and had funding shortfalls. However, I have neither the skill nor the time to do that, and mumbles something about perfect being the enemy of good.

What I’m donating to

Tax Policy Center (TPC)

  • Both parent institutes are 501(c)(3), so donations are tax deductible.
  • They mainly do modeling work, as well as produce educational materials. I like the high-level suggestions in 10 ways to simplify the tax system because they acknowledge that there are trade-offs to these simplifications. They also do bog-standard economic education, like how plastic bag taxes work, but an educational center can’t live on more advanced concepts alone.
  • The TPC is a… joint sub-institute?… of the Urban Institute and the Brookings Institution. Both are well off: the Urban Institute has $101M in income (Charity Navigator entry), and the Brookings Institution has $108M in income (Wikipedia, 2016; weirdly, I couldn’t find them on Charity Navigator). Unfortunately, it’s not clear how donations/assets are allocated to the TPC specifically.
  • The materials produced by the TPC are not clearly partisan: their reporting on the TCJA was remarkably even-handed. They do mention return-free tax filing[10], but it doesn’t appear to be a core pillar of their agenda. This isn’t such a big problem, because no one makes return-free filing a core part of their agenda. Additionally, per Wikipedia, both parent institutes are regarded as not especially partisan.

So, the TPC seems to be doing even-handed policy analysis work, with the downside that it is hosted by institutions already well funded relative to other charities.

TPC’s donation page; you may need to specify that you are earmarking a donation for the TPC when donating to either parent institute.

Institute on Taxation and Economic Policy (ITEP)

In short, ITEP focuses more specifically on tax policy, has a progressive bent I generally support, and has fewer resources than the TPC (especially CTJ[11]).

ITEP’s donation page.

Tax Help

Tax Aid, Community Tax Aid

There’s a class of 501(c)(3) organizations focused on helping local low-income folks fill out their tax returns. It’s a noble cause, but it’s not getting at the root of the problem, which is that they have to fill out returns at all.

Wut-level Charities

These are charities that are confusing in some way, or have goals inimical to mine.

Tax Analysts

  • Tax Analysts are a 501(c)(3) organization.
  • They produce tax analysis briefs, as their name implies.
  • For a tax-focused charity, they have a ton of money: $68M in assets, and $48M in income (Charity Navigator).
  • The briefs and the positions within seem even-handed, so my problem is not with the position of the charity[12]. No, it’s with Tax Notes: the charity appears to be linked to a subscription service for tax briefs, provided to tax professionals and other parties interested enough to pony up thousands of dollars for analyses. For example, “Tax Notes Today” is $2,500 annually. I’m not sure whether Tax Notes feeds into the Tax Analysts income above, but that would make sense given their large asset pool. Working under that assumption, it seems like Tax Analysts don’t need my money.

Americans for Tax Reform (ATR)

This is one of the first results that show up if you search for “tax charity”.

However, the ATR’s main (only?) goal is to lower taxes, period. Then, their Taxpayer Protection Pledge page is full of GOP pull quotes, making it clear who their demographic is, as if the banner “4,000,000 Americans (and counting) will receive Trump Tax Reform Bonuses” (complete with “Click here to see the employers paying bonuses!”) wasn’t clear enough[13].

I mean, I guess the maniacal focus on LOWER TAXES is refreshing in its clarity, but that’s not what I want.

Tax Foundation

The Tax Foundation is a 501(c)(3), and is correspondingly less on the nose about their target demographic than ATR. However, there are some clear indicators which way the Tax Foundation leans: the article “Tax Reform Isn’t Done” talks about making provisions of the recent Tax Cut and Jobs Act permanent[14], and their donation page has a pull quote from Mike Pence.

Tax Council Policy Institute (TCPI)

So the TCPI is a 501(c)(3) (Charity Navigator), but there’s no donation page on their website. What? What charity doesn’t want your money?

Looking at their about page makes it clear that the TCPI is affiliated with The Tax Council, and on their home page is the quote “Our membership is comprised of (but not limited to) Fortune 500 companies, leading accounting and law firms, and major trade associations.” Which makes it clear that they don’t need your dinky public donation, because they have industrial support.

Even if they did accept donations, a part of The Tax Council’s mission is “… contributing to a better understanding of complex and evolving tax laws…” with nary a note about simplifying those tax laws, or at least simplifying how people do their taxes.


[1]  What I should have done is look for other people trying to answer the same question, especially in an EA style. I did not do this, partly because honestly, I didn’t really expect a strong showing, and partly because I had just finished doing my taxes and I didn’t want to keep doing research. I would appreciate it if you let me know about stronger posts/guides on this topic.

[2]  But is it ever possible for me to not accept bad premises?

[3]  This is a little disingenuous: I expect most people have simpler returns than I do. If I only had one W-2, I would have spent much less time on my taxes.

[4]  That said, I would be shocked if the balance of evidence worked out that return-free filing was negative for American citizens.

[5]  Not using the tax prep industry seems like an obvious first step, except I would be piling a lot of suffering on myself for little gain. I usually check my federal return numbers with Excel 1040, but this year is the first time I couldn’t get my tax return within the same ballpark as the numbers given by the tax prep software. I could have sacrificed a weekend to figure out what was going on, but fuck that.

[6]  This is your periodic reminder that action space is really wide, and doing the lazy thing is sometimes much less effective than doing any direct action.

[7]  “We never said that the effects would be bad” seems to be the implied response to people charging it with partisan mongering.

[8]  I’ll probably spend more time writing and editing this post than I will have spent on actual research.

[9]  Yes, QALYs are weird, and the GiveWell approach is vulnerable to the streetlight effect. Understood.

[10]  The article also references Elizabeth Warren’s return-free bill, which does raise the question of why I don’t donate directly to Elizabeth Warren. I remember her advocating for weird policies, but apparently my go-to dumb policy I thought she backed was her anti-vax position, which was either blown out of proportion or reversed at some point. So, basically no reason.

[11]  Unfortunately, they don’t seem to have their IT locked down tight, since I found an almost certainly surreptitious CoinHive install on their site.

[12]  Even if the articles can be inanely focused on the minutiae of policy: when I was doing my research, the Tax Analysts featured articles list was full of articles about the grain glitch, a tax loophole.

[13]  Plug for Sarah’s post about the intertwining of politics and aesthetics, Naming the Nameless, which partially explains why the ATR uses language usually reserved for last generation click bait and aggressive ads.

[14]  I haven’t really been following along with the TCJA, and don’t have a strong opinion on the specific policy changes, so it’s more of a gut-level identity-based dislike of the support of the TCJA. Yes, yes, this is why we can’t have nice things.

Review/Rant: The Southern Reach Trilogy

Warnings: contains spoilers for Annihilation, Authority, Acceptance, The Quantum Thief, The Expanse, the Laundryverse, Dark Matter, and SCP (as much as SCP could be said to have spoilers). Discussion of horror works. Otherwise contains your regularly scheduled science fiction rant.

I recently[1] blew through Jeff VanderMeer’s Annihilation/Authority/Acceptance series, also known as the Southern Reach Trilogy, which I’ll abbreviate to SRT.

First things first: overall, it was pretty good. I enjoyed the writing, the clever turns of phrase (“Sheepish smile, offered up to a raging wolf of a narcissist.”[2]). It’s reasonably good at keeping up the tension, even while sitting around in bland offices with the characters politicking at each other.

So the writing is alright, but the real draw was kind of the setting, kind of the story structure, kind of the subject matter. In a way, it’s right up my alley. It’s just a… weird alley.


The most obvious weird is used as a driving force in the world building, forcing us to reconsider what exactly we’re reading.

Is this an environmental thriller? Kind of, but the environmental message is muted and bland, restricted to a repeated offhand remark “well, too bad the environment is fucked”. Is this an X-Files rip off? Kind of, but the paranormal is undeniable: you don’t want to believe it’s there, you want to believe there’s an explanation behind it all. Is this a romance? For the first book maybe, but with one of the pair entirely absent from the book[3]. The second book doesn’t help by introducing elements of the corporate thriller genre, and then axing any chance of finishing that transition by the end of the book.

Whatever it is, all throughout SRT is world building, but shot through with twists and turns. It reminds me of those creepy dolly zooms (examples), which undermine the sense of perception, but applied to narrative. For example, the biologist and the story at large constantly give up information that forces us to reconsider everything that came before:

  • By the way, my husband was part of the previous expedition.
  • By the way, there were way more expeditions than 12.
  • By the way, the danger lights don’t actually do anything[4].
  • By the way, I (the biologist) am glowing.
  • By the way, the 12th expedition psychologist was the director of Southern Reach.
  • By the way, said director was in the lighthouse picture.
  • By the way, Central was involved in the Science and Seance Brigade.
  • Did I mention Control’s mom was in the thick of it?

It’s sort of like Jeff is giving us an unreliable narrator with training wheels: we’re not left at any point with contradictory information, yet there’s a strong sense that our only line into the story is controlled by a grinning spin doctor. It’s an artful set of lies by omission.

My suspicion is that I enjoyed this particular aspect of the SRT for the same reason I enjoyed The Quantum Thief trilogy. Hannu does a bit less hand holding[5], like starting the series with the infamous cold open “As always, before the warmind and I shoot each other, I try to make small talk. ‘Prisons are always the same, don’t you think?'”[6]. And as an example, the trilogy never explicitly lays out who the hell Fedorov is: in fact, I didn’t even expect him to be a real person, but his ideology (or the Sobornost’s understanding of his Great Common Task) was so constrained by the plot happening around it that I never had to leave the story and, say, search Wikipedia, which was excellent story crafting. Anathem is another book that does this sort of “fuck it we’ll do it live” sketching of a world to great effect[7].

But while The Quantum Thief is sprinting through cryptographic hierarchies and Sobornost copy clans, it’s still grounded in a human story. The master thief/warrior/detective tropes serve as a reassuring life vest while Hannu tries to drown us with future shock[8][9]. The SRT doesn’t need as much of a touch point, since we never leave Earth and bum around a mostly normal forest and a mostly normal office building[10], but the organizational breakdowns in the expedition and the Southern Reach agency are eminently relatable in the face of a much larger and stranger unfolding universe.


Let’s unpack that unfolding universe.

The world of SRT is weird: while The Quantum Thief is a fire hose, it only spews the literary equivalent of water, easily digestible and only murky in tremendous quantities. The SRT finishes with loose ends, the author at some point shrugging his shoulders and leaving a dripping plot point open for the spectacle of it, and that’s okay. It’s weird fiction.

Another parallel: Solaris describes a truly alien, world-sized organism. What is it thinking? How does it think? How do you communicate with it? The story ends with all questions about the planet Solaris unresolved, the humans only finding out that broadcasting EEG waves into the planet does something[11]. No men in rubber suits here, just an ineffable consciousness. Even a hungry planet makes more sense to us: at least it has visible goals that we can model (even if they are horrifying[12]).

You end up with the same state in SRT: what is Area X doing? Why is it doing it? What the hell does the Markov chain sermon mean?[13]

I’m guessing this is why people don’t like it: there are barely any answers at the end. How did turning into an animal and leaping through a doorway help at all? Did Central ever get their shit together? What’s up with the burning portal world? If you were expecting a knowable “rockets and chemicals” world, it’d be disorienting.

In a way the story suffers a bit from a mystery box problem, where there are boxes that are never opened. However, in this case I think the unopened boxes are unimportant. Sure, the future of humanity is left uncertain, the mechanisms of Area X are still mysterious, but we know what happened to all the main characters, see how they played their parts and have some closure.

(I am miffed that J. J. Abrams is poisoning the proverbial storytelling well. Yes, mystery boxing makes economic sense, but now I see the mystery box like I hear the Wilhelm scream, and it’s not pretty.)


Okay, so we have a weird new world we explore, and weird fiction that is weird for the sake of being weird, but I’m neglecting the weird that gives people bad dreams.

On one level there’s simple horror based on things going bump in the night: think of the moaning psychologist in the reeds, or the slug/crawler able to kill those who interrupt its raving sermon. But that doesn’t show up in spades: the description of the 1st expedition’s disintegration cuts off after a sneak peek, omitting most of the ugly details. Jeff had plenty of opportunity to get into shock horror, and didn’t.

I think he instead wanted to emphasize the 2nd layer of Lovecraftian horror beyond the grasping tentacles, a horror driven by a tremendous and possibly/maybe/almost certainly malign world[14]. Area X casually pulls off impossible feats like time dilation and a barrier that transports things elsewhere (or nowhere). More concerning is the fact that Area X knows what humans look like. It’s an alien artifact, and somehow (something like the Integrated Information Theory of consciousness turns out to be right?) it knows what makes up a human, recognizes them as special and in need of twisting, and can’t help but twist with powers beyond our understanding. There’s something large and unspeakably powerful stalking humanity, and it is hungry.

Or maybe it’s not deliberately stalking humanity; maybe it’s just engaging subconscious-level reactions, and everything it has done so far is the equivalent of rolling over in its sleep: how would Area X know it just rolled over a butterfly of an expedition? This implies a second question: what happens when it finally wakes up?

It all reminds me of The Expanse series. Sure, there’s the radically simplified political/economic/military squabbling and the made-for-action-movie plot, but the protomolecule is what I’m thinking about. “It reaches out it reaches out it reaches out”: an entire asteroid of humans melted down for spare parts by the protomolecule is kept in abeyance for use, living and being killed again and again in simulation until the brute-force search finds something useful happening (which in turn reminds me of the chilling line “There is life eternal in the Eater of Souls”). Thousands die and live and die, all to check a cosmic answering machine.

If we want to draw an analogy, the first level of horror draws from being powerless in the face of malign danger: think of the axe murderer chasing the cheerleader. The second level of horror draws from the entirety of humanity being powerless in the face of vast malign danger. Samuel L. Jackson can handle an axe murderer, but up against the AM from “I Have No Mouth and I Must Scream”? No contest[15].

(We could even go further, and think about the third level as malign forces of nature: Samuel L. Jackson vs the concept of existential despair might be an example, not on the level of “overcoming your inner demons” but “eradicating the concept as a plausible state of mind for humans to be in”[16]. Now that I think about it, it would have been an interesting direction to take The Quantum Thief’s All-Defector, fleshing it out as a distillation of a game theoretic concept like Moloch. Maybe there’s room for a story about recarving the world without certain malign mathematical patterns… well, maybe without religious overtones either.)


But we’ve only been looking at what the rock of Area X has been doing to the humans. What about the hard place of the Southern Reach agency, and what they do to humans? The agency continually sends expeditions into a hostile world, getting little in return, and pulls stunts like herding rabbits into the boundary without rhyme or reason. In the face of failure to analyze, they can only keep sending people in, hoping that an answer to Area X will pop back out if they just figure out the right hyperparameter of “which people do we send?”.

In other words: a questionably moral quasi-government agency, operating from the shadows to investigate and prepare to combat an unknown force that might destroy all of humanity? And as if that wasn’t close enough, the SRT throws in the line “What if containment is a joke?”, and I almost laughed out loud. It’s all a dead ringer for the Foundation in the SCP universe.

A little background: SCP is one of those only-possible-with-the-internet media works[17], a collaborative wiki[18] detailing the workings of the Foundation, an extra-governmental agency with an international mandate to, well, secure, contain, and protect against a whole bevy of anomalous artifacts and entities. SCP. As with any wiki, there is an enormous range of work: some case files detail tame artifacts (a living drawing), or problems solvable with non-nuclear heavy weapons (basically a big termite), or with nukes (a… living fatberg?), or something a 5-year-old might come up with if you asked them to imagine the scariest possible thing (an invincible lizard! With acid blood!).

And then there are things a bit more disquieting. Light that converts biological matter to… something else. Infectious ideas. An object that can’t be described as it is, just as it is not (it’s definitely not safe).[19]

Area X slots into this menagerie well, an upper tier threat to humanity. It’s utterly alien and unpredictable, actively wielding unknown amounts of power to unknown ends. With the end of SRT, it seems likely that an “XK Class End of the World scenario” is in progress, a real proper apocalypse pulling the curtains on humanity.

On the other hand, the Southern Reach/Central agencies are vastly less competent at handling existential threats than the Foundation (this, despite a mastery of hypnosis the Foundation would kill for[20]). Part of it is the nonsensical strategy: for crying out loud, Central sends a mental weapon in to try and provoke Area X, and to what end? To hasten the end of the world? Then Lowry gaining control of the Area X project was absolutely atrocious organizational hygiene, a willful lack of consideration that contamination can go past biological bacteria and viruses, that the molecular assembly artifact under study can change your merely physical mind. An O5 Foundation overseer would have seen dormant memetic agents activate and rip through departments, and would have taken note of a field agent turned desk jockey who started accumulating more and more soft power in the branch investigating the same anomaly that nearly took his life…

Back to the first hand, both works partly derive their horror from the collision of staid and sterile office politics with the viscerally supernatural. Drawing from the savanna approximation, we weren’t built to work in cubicles, and there were definitely no trolleys, much less trolley problems[21]. And office organizations are unnatural, but are the most effective way we’ve found to get a great many things done. So press the WEIRD but effective organizational tool into service to call the shots on constant high-velocity high-stakes moral problems, except it’s not people on the tracks but megadeaths, and you start to get at why it’s so unnerving to read interdepartmental memos about how to combat today’s supernatural horror[22].

And there’s the “sending people to their death” aspect of both organizations, which conflicts with their nominally scientific venture: at least no one pretends the military hierarchy is trying to discover some deeper truth when it sends people into battle. So the faceless bureaucracy expends[23] its people[24] to chart the ragged edges of reality[25], and gets dubious returns back. The Southern Reach gets a lighthouse full of unread journals; the Foundation usually just figures out that yet another thing won’t destroy an artifact of interest.

And as an honorable mention, the Laundryverse by Charlie Stross bears a strong similarity to both works: Lovecraftian horrors are invokable with complicated math, the planets are slowly aligning, and world governments have created agencies to prepare for this eventuality, deal with “smaller” “supernatural” incidents, and find/house the nerds that accidentally discover “cosmic horror math”. This series focuses a bit more on the humorous side of office hijinks, and on threats a bit more tractable to the human mind: at least many of the threats Bob faces can be hurt with the Medusa camera he carries around.

If you want a taste of the Laundryverse, you could do worse than the freely available Tor stories (Down on the Farm, Overtime[26], Equoid (gross!)[27]), or the not-really-Laundryverse-but-pretty-damn-similar A Colder War[28], in which I remember Stross being inordinately pleased to include the line “so you’re saying we’ve got a, a Shoggoth gap?”.


In the end, I wasn’t entirely horrified: the best SCP has to offer rustled my jimmies more than Area X did. And the Laundryverse is somewhat more entertaining than the SRT. And Solaris does the “utterly alien” alien a bit better. SRT, though, strikes a balance between all these concerns, has much better writing quality than SCP, and has fewer of the hangups that turned me off The Expanse[29].

But let me rant for a bit.

On Goodreads, Annihilation has an average score of 3.6. I personally don’t think it deserves such a low score, but a fair number of people were turned off by the characters; it’s not everyone’s cup of tea, okay, sure, fine.

Dark Matter, a nominally science fiction novel, has a 4.1. 4.1! I only see acclaimed classics and amazing crowd favorites with those sorts of scores.

The problem is that Dark Matter is FUCKING TERRIBLE. I know, I complained about this before (on my newsletter), and I’ll complain again, because it’s a fucking travesty that Annihilation got relegated to bargain bin scores compared with an utterly predictable story with trash science and characterization so bland doctors prescribe it when you are shitting your brains out due to a norovirus infection[30].

Maybe I can say it another way:

Where lies the darkness that came from the hand of the writer I shall bring forth a fruit rotten with the tunnels of the worms that shine with the warmth of the flame of knowledge which consumes the hollow forms of a passing age and splits the fruit with a writhing of a monstrous absence which howl with worlds which never were and never will be. The forms will hack at the roots of the world and fell the tree of time which reveals the revelation of the fatal softness in the writer. All shall come to decide in the time of the revelation, and shall choose death[31] while the hand of the writer shall rejoice, for there is no sin in writing an action plot that the New York Times Bestseller list cannot forgive[32].

Again, a fucking travesty. Christ.


[1]  Not so recently by the time this post is published. I’m still a slow writer.

[2]  Okay, it’s a little too clever for its own good.

[3]  Surely there is Control/Grace rule 34. Or anyone/thousand-eye mutated Biologist. But as far as I know Biologist-husband is the only canon pairing.

[4]  I almost forgot these were a thing while reading Annihilation, so a quick refresher: “… a small rectangle of black metal with a glass-covered hole in the middle. If the hole glowed red, we had thirty minutes to remove ourselves to ‘a safe place.'”.

[5]  If you want a flavor of the info dump sort of style of The Quantum Thief, I recommend “Variations on an Apple” as an even more extreme example: I suspect that normal people feel the same way reading The Quantum Thief as when I first read that story.

[6]  Except where SRT slowly reveals the unnaturalness of the world, The Quantum Thief revels in it, fills the tub with weird and takes a luxurious bath. Like, it seems like Hannu tried really hard to get the “Toto, I don’t think we’re on Earth anymore” senses tingling right in the first sentence.

[7]  Well, if you’re willing to put up with/enjoy the made up words.

[8]  I mean, I do wonder if the author was too bad of a writer to pull off something less stereotypical while retaining the alien world, but maybe it was intentional. Sure, the writer has written some cringeworthy stuff (I never knew someone could string together the word “kawaii” so poorly), but that’s what the internet has given us, government officials with a publicly available teenager history.

[9]  Charlie Stross has more thoughts about drowning people with future shock as a genre, namely that it isn’t productive any longer because we’re already in a (future?) shocking world.

[10]  Breathing cafeteria wall notwithstanding.

[11]  Because EEG is somehow magical? Well, Solaris was written in the 1960s, so some amount of leeway is necessary. But even if you replace the EEG with some other brain state, you have to wonder what exactly Solaris would be doing with it… “Data can’t defend itself” and all that.

[12]  Another alternative is the cactus that doesn’t lift a finger to attain stated goals.

[13]  It turns out to be surprisingly understandable once you finish the trilogy, even if it reads like a digested Old Testament.

[14]  Yeah, we’re ignoring the icky parts of Lovecraft.

[15]  I’m ignoring the fact that any movie plot would somehow have Samuel L. Motherfuckin’ Jackson end up the winner: it’s too bad that our widely known “tough guy” archetypes are all actors, which then implies the presence of Hollywood plot armor.

[16]  General memetic hazards might be another example: Roko’s Basilisk is a shitty example of one.

[17]  Other examples I know of are Football in the Year 17776 (previously), Deep Rising (a little less so, it’s just a comic + music), Homestuck (a little less so, it’s just walls of text + animations), and every piece of interactive fiction: for example, Take (and a spoiler-ific analysis).

[18]  It seems almost like a fandom that didn’t coalesce around an existing body of work/author, one that just birthed into the void without a clear seeding work.

[19]  This isn’t the best that SCP has to offer. It’s just that there’s so damn much of it, and it’s not like I’m keeping records on which pages are the best.

[20]  A good life heuristic: if the Foundation would kill to get some capability, maybe you should rethink trying to get that capability.

[21]  But maybe we don’t want to be good at solving trolley problems?

[22]  The dispassionate Foundation reports are effective at conveying the sense of wrongness. There’s a brutal rhythm to the uniform format, leaving a feeling that in order to fight the monsters out there we had to suppress our humanity until we became monstrous in our own way.

[23]  Interesting yet morbid comment: “Well, you were properly expended, Gus. It was part of the price.”.

[24]  New head canon (if such a thing could be considered to exist in the SCP-verse): the replication crisis was suppressed by the Foundation to maintain the facade of the Milgram obedience experiment, which is useful for subconsciously convincing D-class they will eventually follow orders.

[25]  Line stolen from qntm‘s Ra (chapter link).

[26]  The frame story is a bit eye roll inducing, but I understand a man’s gotta publish.

[27]  No, really, it’s gross. Stross: “Stross explains his idea about the life cycle of unicorns to Scalzi and Anders. When he stops retching, Scalzi’s body language changes until it eerily matches Anders. ‘Don’t call us, we’ll call you,’ he says with icy-sober politeness, and beats a hasty retreat.”.

[28]  Home to my go-to chilling quotes “There is life eternal in the Eater of Souls” (previously referenced) and “Why is hell so cold this time of year?”.

[29]  Namely, the incredibly simplified politics and anti-corporation messages set up puppet villains that aren’t interesting: I’d be more into it if the trade offs were more nuanced. It’s still a good “Holden and friends fly around and have adventures” series, though.

[30]  The BRAT diet is bland for a reason: ask me how I know this!

[31]  No, not being emo here: the clones of the main character of Dark Matter (don’t make me look this up, please) end up choosing to fight each other because they can’t figure out functional decision theory. This would be fine, if the main character weren’t ostensibly eminent physics professor material.

[32]  Everything is based on some correspondence with what I actually mean, which fits with what Jeff VanderMeer also did with the original “strangling fruit” prose.

Making the Most of Bitcoin

Epistemic status: I believe I’m drawing on common wisdom up to part 5. After that I’m just making shit up, but in a possibly interesting way. Not proper financial advice, see the end of the post.

So let’s say you have some Bitcoin. What do you do with it?

#1. Cash out everything immediately

Lots of people think putting your money in Bitcoin is a bad idea: Jack Bogle (founder of Vanguard), Warren Buffett, Robert Shiller (Yale economics professor), Mr. Money Mustache, Jason Calacanis (angel investor)[1]. I tend to agree with them[2], and am basically following this action by not buying in[3].

However, you (hypothetical Bitcoin holder) already knew that Bitcoin was widely thought to be not the greatest investment vehicle, and bought in anyways. You’re not going to immediately cash out, ok, fine, whatever. What else could you do?

#2. Become a HODLR

You’re going to HODL the Bitcoin you have until it reaches THE MOON. It’s unclear what you’ll do once it reaches THE MOON. Maybe you’ll just slowly squander your satoshis on breeding Shiba Inus and kidnapping cryptography experts to ensure the sanctity of SHA-256.

Or maybe one day you’ll end up with 99% of your net worth in Bitcoin, and the next day you’ll have 0% of your net worth in Bitcoin because your kidnapping orders were read incorrectly, and SHA-256 was demonstrably broken by vendetta-driven cryptographers overnight[4]. Also, the Iranians are really mad at you[5].

Another way of looking at it is that it’s difficult to make money slowly with Bitcoin: there are no fundamentals[6] to inexorably drive value, so you can’t yell “gains through trade, buy ’em all and let the market sort ’em out!”, put your money in an index fund equivalent, and then forget about it.

The life of a HODLR is a life with a hell of a lot of volatility; maybe there’s a better strategy?

#3. Time the market

The key is to buy low, sell high. This advice is approximately as useful as “be attractive, don’t be unattractive” labeled as dating advice.

If you think you can beat the market, I’ll point you to all the rest of the brilliant ideas that have been tried and failed, and the anti-inductive nature of the market, and the seeming adequacy of liquid markets. If you still think you have a grand insight into market mechanics, the great thing is that you can go make a billion dollars if you’re right. Go on, and try to remember us little people.

Besides, if I knew how to do this, would I be here telling you? I would be out playing with my Shiba herd instead.

#4. Recoup your investment

This strategy has the virtue of simplicity:

  • Buy some Bitcoin.
  • Wait until the price of Bitcoin doubles.
  • Sell half your Bitcoin, making back your original “investment”. Now it’s not possible to be worse off than before.
  • … HODL?

It’s nice to not lose money (as long as the market doesn’t crash out before you reach your doubled price), but you have one point at which you cash out, and then you’re back to not having any strategy.

#5. Rebalance

Another strategy is to simply rebalance.

A quick tutorial detour: let’s say there are only 2 investments in the world, boonds and stoocks[7]. Boonds are low risk, low reward, and stoocks are high risk, high reward.

Let’s say you’re a young’un who has just entered the job market with $1000 to put into the market, and an appetite for risk in order to get good returns. Taking on higher risk is okay, since you’ll have plenty of years to rebuild if things go south. So you might go for a 90% stoocks/10% boonds allocation, or $900 stoocks/$100 boonds.

Now let’s say that the market absolutely tanks tomorrow. Boonds don’t change much since they’re low risk; let’s say boonds take a 10% hit. But stoocks, man, they took a 95% hit. Now we’ve ended up at $45 stoocks/$90 boonds, meaning our asset allocation is 33.3% stoocks/66.7% boonds. #1. This is super sad, we’ve lost a lot of money, but #2. This isn’t what we want at all! We have so many boonds that our risk of losing most of what we have is low, but our returns are also going to be super low. Besides, even if we do lose it all, we’ll make it back in salary over a few days.

So what we can do is rebalance: we sell our abundance of boonds, and buy more stoocks, until we have a 90% stoock/10% boond allocation again, which works out to $121.5 stoocks/$13.5 boonds[8].

To fill out the rebalancing example, now let’s say you’re older and about to retire. Over the years you’ve shifted your asset allocation to 10% stoocks/90% boonds with $100000 stoocks/$900000 boonds: this close to retirement, you’d be in a lot of trouble if most of your money disappeared overnight, so you want low risk.

Now let’s say stoocks do fantastically well tomorrow, growing 100x, so you end up with $10000000 stoocks/$900000 boonds. The problem is that your allocation by percentage is now 91.7% stoocks/8.3% boonds, and you’re about to enter retirement. All your wealth is in a super-risky investment! Could your heart even handle the bottom of the market dropping out? Instead of letting that happen, you could rebalance back to 10% stoocks/90% boonds, or $1090000 stoocks/$9810000 boonds[9].
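Both examples reduce to the same operation: total up the portfolio and buy/sell back to the target percentages. A minimal sketch in Python (the boond/stoock numbers are the ones from the text; the `rebalance` helper and rounding to cents are my own additions, not from the linked R code):

```python
def rebalance(stoocks, boonds, target_stoock_frac):
    """Buy/sell until the stoock fraction matches the target allocation."""
    total = stoocks + boonds
    stoocks_after = round(total * target_stoock_frac, 2)        # round to cents
    boonds_after = round(total * (1 - target_stoock_frac), 2)
    return stoocks_after, boonds_after

# Young investor: $900/$100 crashes to $45/$90; rebalance back to 90/10.
print(rebalance(45, 90, 0.9))                 # (121.5, 13.5)

# Near-retiree: stoocks boom to $10000000 against $900000 boonds; back to 10/90.
print(rebalance(10_000_000, 900_000, 0.1))    # (1090000.0, 9810000.0)
```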

What’s the moral of the story? If you have multiple asset risk classes, then you don’t have to put it all on black and ride the bubbles up and down like a cowboy: rebalancing is a simple strategy to target some amount of risk, and then you can just go long and not worry about the fine details.

There are finer details that do matter: you can’t rebalance Bitcoin often or you might get eaten alive by mining fees[10] (which peaked at an average of $50 when Bitcoin was around $10000). So maybe you’d target some large-ish percentage change and only rebalance once Bitcoin changes by that amount.

Let’s run some numbers: let’s say 1 Bitcoin is currently $1000, you have exactly 1 bitcoin, and you rebalance only whenever Bitcoin doubles in price (this basically extends the previous “double and sell” strategy). Now if Bitcoin goes from $1000 to $10000, you would rebalance 3 times: when Bitcoin is $2000, $4000, and $8000. If you have many more assets than $1000, you can hand wave away the exact percentage calculations and just sell half the Bitcoin at each point. Even if Bitcoin crashes to $0.001 after reaching $10000, you’ve “made” $3000 that you’ve rebalanced to other, stabler assets (minus ~$70 in fees). Not bad for riding a speculative bubble!
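The back-of-the-envelope math above is easy to check in code (Python here rather than the R used for the graphs; fees are ignored, so subtract the ~$70 yourself):

```python
def sell_half_at_doublings(start_price, peak_price, btc=1.0):
    """Sell half of the remaining BTC each time the price doubles,
    until the price peaks (and then, in this story, crashes)."""
    price, cash = start_price, 0.0
    while price * 2 <= peak_price:
        price *= 2
        cash += (btc / 2) * price   # sell half the remaining stack
        btc /= 2
    return cash, btc

cash, btc_left = sell_half_at_doublings(1000, 10000)
print(cash, btc_left)   # 3000.0 0.125: $1000 banked at each of $2000/$4000/$8000
```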

#6. Kind of rebalance-ish

On the other hand, only getting $3000 out of a maximum of $10000 Bitcoin seems… not a good show. Sure, you were going to get only $0.001 if you were a HODLR, but that $10000 is a juicy number, and $3000 is an awful lot smaller.

Or consider the scenario in which you read Gwern in 2011 speculating that Bitcoin could reach $10000, and you were convinced that you should be long on Bitcoin. However, it was still possible that Bitcoin wouldn’t reach $10000, falling prey to some unforeseen problem before then. You would want to hedge, but rebalancing would throw away most of your gains before you got close to $10000. For example, if you started with $1000 @$1/BTC for a total of 1000 BTC, and you rebalanced at every doubling, you would end up with $13000 cash and ~$1000 in BTC, compared to HODLing, which ends up with $10000000 in BTC. It’s a used car versus being the Pineapple Fund guy; I get it, it’s why HODLing is enticing.

The problem is that rebalancing doesn’t know anything about beliefs about long term outcomes, just about overall asset class volatility.

That said, if it’s possible to encode your beliefs as a probability distribution[11], you could run (appropriately named) Monte Carlo simulations of different selling strategies and see how they do, choosing a strategy that does well given what you expect the price of BTC to do.

I’ll work some simple examples, following some assumptions:

  • we start from a current price of $10000/BTC.
  • we don’t care about the day-to-day price: if BTC reaches $20000, dips back to $15000, and then rises to $50000, we aren’t concerning ourselves with trying to time the dip, just with the notion that BTC went from $20000 to $50000.
  • rebalancing is replaced with a hedge operation, where some fixed fraction of our Bitcoin stake is sold each time the price of BTC rises by some fixed proportion. We’ll fix our sell point at every doubling (except for a sensitivity analysis step below).
  • the transfer fees are set to be proportional to the price of BTC, at 0.5%: in practice, this just serves as a drag on the BTC-cash conversion. If you’re dealing with amounts much larger than 1 BTC (or SegWit works out), you might be able to amortize the transfer costs down to 0. To allow interpolating between both cases, we’ll simply give both 0.5% and 0% transaction drag simulations.
  • the price of Bitcoin is modeled as rising to some maximum amount, and then crashing to basically nothing. This can also cover cases where BTC crashes and stays low for such a long time that it would have been better to put your assets elsewhere.

The processes of adapting the general principle to real life, consulting the economic/finance literature for vastly superior modeling methods, using more sophisticated selling strategies than selling a constant fraction, and not betting your shirt on black is left as an exercise to the reader.

So let’s say our beliefs are described by a mangled normal distribution[12]: we’re certain BTC will reach the starting price (obviously, we’re already there), around 68% less certain BTC will reach 1 standard deviation above the starting price, 95% less certain BTC will reach the 2nd standard deviation, so on and so forth. We’re not interested in a max BTC price below our starting price, so we’re just chopping the distribution in half and doubling the positive side.

Since we’ve centered the normal distribution on our starting price, we have only one other parameter to choose, our standard deviation (stdev). Some values are obviously bad: choosing a stdev of $1 means you are astronomically confident that BTC will never go above $10100. While you might not believe in the fundamentals behind Bitcoin, it is odd to be so confident that the crash is going to happen in such a specific range of prices. On the other hand, I don’t have a formal inference engine from which I can get a stdev value that best fits my beliefs, so I’ll be generous and choose a middling value of $10000.
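A stripped-down version of this simulation might look like the following sketch (Python rather than the linked R; the sell-at-every-doubling rule, the 0.5% drag, and the $10000 stdev come from the assumptions above, while the function names are mine):

```python
import random

def payoff(peak_price, sell_frac, start=10000, fee=0.005):
    """Cash banked by selling `sell_frac` of the remaining BTC at each doubling,
    assuming the price crashes to ~0 after `peak_price` and the rest is lost."""
    price, btc, cash = start, 1.0, 0.0
    while price * 2 <= peak_price:
        price *= 2
        cash += btc * sell_frac * price * (1 - fee)   # 0.5% transaction drag
        btc *= 1 - sell_frac
    return cash

def draw_peak(start=10000, stdev=10000):
    """Mangled (half-) normal belief about the max price: chop the distribution
    at the starting price and keep only the positive side."""
    return start + abs(random.gauss(0, stdev))

# Simulate many runs at one sell fraction:
payoffs = [payoff(draw_peak(), 0.5) for _ in range(10_000)]
```

Sweeping `sell_frac` over a grid and fitting a trend line to the resulting scatter reproduces the charts below.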

So if we run a number of simulations where the price of BTC follows the described normal distribution, we get:

Price simulation with a normal distribution

Several things become apparent right away:

  • there’s an obvious stepping effect happening[13]. Thinking about it, each separate line describes the effects of selling at each doubling: the lowest line only manages to sell once, the next line sells twice, and so on.
  • as one might expect, selling everything is low variance, and holding more is higher variance. As a reference point, the 0.5 sell fraction is just the previously described rebalancing strategy.
  • even when hitting 4 sell points, the transaction drag on 1 BTC isn’t too bad.
  • fitting a trend line with LOESS gets us a rough[14] measure of expected profit. In particular, we seem to top out at $20000 around a 0.5 sell fraction.

An obvious sensitivity analysis comes to mind: does the fact we’re selling only at every doubling matter? What if we sold more often? We can re-run the analysis when we sell at every 1.2x:

Price simulation with a normal distribution, selling at every 1.2x increase

The stepping effect is still there, but less obvious: we hit more steps on the way to the crash price. The largest data points don’t go as high, but you can also see fewer zero values, since we pick up some selling points between $10000 and $20000. Additionally, the LOESS peaks at a lower sell fraction, which makes some sense: since we’re hitting more sell points, we can afford to hold on to more.

What if the normal distribution doesn’t describe our beliefs? Say we want more emphasis on the long term. Then our beliefs might be better modeled with the exponential distribution, which has a thicker tail than the normal.

If we use $10000 for the exponential distribution’s scale parameter (1/lambda), then our simulations look like:

Price simulation with an exponential distribution

The behavior isn’t too different, with the exception that some simulations start surviving to the 5th sell point. Additionally, the LOESS curves move to the left a bit compared to the normal, but only by a little: from eyeballing it, the peak might move from a sell fraction of 0.55 to 0.45.

Again, there are more sophisticated analyses; for example, maybe you think that your probability distribution peaks around $100k/BTC and falls off to either side, in which case you would want a more complicated strategy to take advantage of your more complicated beliefs.

However, there’s a theoretical problem with our analyses thus far. The distributions we’ve been using are unbounded, allowing BTC prices that can theoretically go to infinity. Sure, we can treat economics as effectively unbounded: there sure are a lot of stars out there, and no economic activity has even left Earth orbit (Starman, some bacteria, and drawings of naked people notwithstanding). But that’s in the long run[15], and we only really care about BTC in the short term, when it’s generating “returns” in excess of normal market returns. For example, if BTC is wildly successful and becomes the world currency, it becomes hard to see how BTC can continue to grow in value far beyond the economic growth of the rest of the world[16]. So we might assume that once BTC eats the world, BTC just follows the bog standard economic growth of the world, and ceases to be interesting relative to all other assets[17].

However, this does mean we can add two assumptions: our distributions should be bounded, and there’s a chance the value of our held BTC doesn’t all disappear in the end. I’ll bound our distributions at the current stock market cap (as of 2018/03/06 $80 trillion, rounded to $100 trillion for ease of math)[18], and use a 2nd function (not a probability distribution!) to encode the probability that if BTC reaches a certain price, it will crash.

For the probability of reaching a price, I’ll keep using the exponential distribution, but bounded and re-normalized to add up to 1 within the bounds[19]. For the probability that BTC will crash, we don’t need a distribution: we could imagine a function that always returns 100% for a crash (as we were assuming before), or 0%, or any value in between. Importantly for the math, we’re not beholden to normalization concerns. I essentially freehanded this function piecemeal with polynomials, with the goal of reflecting a belief that either BTC stabilizes as a small player in the financial markets, or becomes the world currency and is unlikely to lose value suddenly. Plotted on log axes:

Price distribution and probability BTC doesn't crash
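The exact polynomials live in the linked R code; purely as an illustration of the shape described (near-certain crash while BTC is a small player, approaching zero crash chance as BTC nears eating the world), such a function could look like this sketch, where the per-coin cap divides the ~$100 trillion bound over 21 million coins (my assumption, not the post’s):

```python
def crash_probability(price, cap=100e12 / 21e6):
    """Illustrative only: chance that a rally topping out at `price` ends in a
    crash. ~1 for small-player BTC, 0 once BTC is the world currency."""
    frac = min(price / cap, 1.0)   # how far along BTC is to eating the world
    return (1 - frac) ** 2         # hypothetical polynomial fall-off
```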

When we run simulations (displayed on a log y-axis):

Price simulation with a bounded exponential distribution, on a log scale

Up to now transaction drag hasn’t been a big deal, but here it shows up in force: if we end up in a world where the price of BTC goes long and retains value, 0.5% drag turns out to be super important, preventing us from getting close to the maximum $10000000 from our initial 1 BTC. It’s not too surprising, since more mundane investments also need to deal with fee[20] and tax drag.

But if these beliefs are correct, do we do better on average? Not really, especially with transaction drag factored in. This holds true even when we zoom in on a linear axis[21][22]:

Price simulation with a bounded exponential distribution, on a linear scale

I’ll end here. You could always make your models more complicated, but I’m making precisely $0 off this and that XCOM 2 isn’t going to play itself.


So after all this analysis, what do I recommend you do?

Trick question! I don’t recommend you do anything, because this post is not financial advice. If you persist in trying to take financial advice from someone who may frankly be a corgi, the world will laugh at you when BTC crashes to the floor and Dogecoin rises to take its place as the true master of cryptocurrencies. ALL HAIL THE SHIBA, WOOF WOOF.


R code used to generate the graphs available on github.


[1]  “But all those people are famous and invested in the status quo!” Okay, you got me, will linking to a non-super-rich acquaintance’s opinion on Bitcoin help?

To be even fairer, I could also come up with a similar list supporting Bitcoin instead, but I’m less interested in debating the merits of Bitcoin, and more interested in what you do once you wake up with a hangover and a wallet full of satoshis.

[2]  I disagree with Scott when he says that we should have won bigger with Bitcoin. Most of the gnashing of teeth over Bitcoin is pure hindsight bias.

[3]  Currently the only reason I would get any cryptocurrency is to use it as a distributed timestamping service.

[4]  It’s not just breaking the base crypto layer: the nations of the world could decide to get real and criminalize Bitcoin. Law enforcement could get better at deanonymizing transactions, causing all the criminals to leave for something like Monero. Price stabilization just never happens, and people get sick of waiting for it to happen. Transaction fees spike whenever people actually try to use Bitcoin as a currency, or the Lightning Network turns out to have deep technical problems after a mighty effort to put it into place (deep problems in a widely deployed technology? That could never happen!). Ethereum gets its shit together and eats Bitcoin’s lunch with digital kittens. There’s the first Mt. Gox-level hack since BTC started trading on actual exchanges. People decide they want to cash out of the tulip market en masse (although that might be unfair to the tulips).

[5]  It’s unclear where you would get a Shah today, but exhuming all past Shahs is probably enough to piss people off.

[6]  No, evading taxes/police actions is not a fundamental.

[7]  Names munged to emphasize that they’re fantasy financial instruments.

[8]  There’s something to be said about keeping a stable and liquid store like a savings account to make sure living expenses are covered for 6 months. You can replace the implied “all assets” with “all available assets” for a more non-toy policy.

[9]  If the market simply dropped back to its previous position before you could rebalance, then you aren’t any worse off than you were 2 days ago, so maybe it wouldn’t be so disappointing to miss this opportunity. But that’s just anchoring, and Homo Economicus in your position would be super bummed.

[10]  Normal investments have similar tax implications where you realize gains/losses at sale, covered by the general term tax drag.

[11]  More on probabilities as states of beliefs, instead of simply reflecting experimental frequencies.

[12]  Coming up with a better distribution is left as an exercise for the reader.

[13]  A mild amount of jittering was added to make this visible with more simulation points.

[14]  LOESS fits with squared loss, which emphasizes outliers, which you might not want. Additionally, LOESS is an ad hoc computational method (much like k-means) which won’t necessarily maximize anything; the main advantage is that it looks pretty if you choose the right spans to average over, and you don’t have to come up with a parametric model to fit to.

[15]  And as they say, in the long run we’re all dead. Yes, we’re working on that.

[16]  Sure, the bubble could continue, but bubbles pop at some point, and if it’s so damn important to the economy war isn’t out of the question, and if large scale nuclear war happens, more than just the price of Bitcoin is going to crash. “Here lies humanity: they committed suicide by hard math.”

Or a different perspective. Who would win?

  • Billions of people that didn’t buy into Bitcoin, all frozen out of the brave new economy, backed by all the military might of nations that care about the sovereignty of their money supply.
  • One chainy boi.

[17]  There’s reasons to believe BTC might act otherwise:

  • Bitcoin is deflationary, so it probably won’t act like a normal commodity in the limit if it eats the world: even companies can issue more stock, and more gold can be found.
  • The marginal Bitcoin might be way overpriced forever.

[18]  Interestingly, this implies that BTC only has around 1000x of hyper-growth headroom.

[19]  The distribution chart is not properly normalized, since the distribution is actually linear without the log axis, but it simulates correctly.

[20]  The movement to index funds seems partly rooted in avoiding high mutual fund fees.

[21]  I’m not entirely sure what that hump in the ideal price is doing: it shows up in the other LOESS curves, and persists with changes in the random seed.

[22]  We end up with a different maximum hump with the log and linear graphs: what’s going on here? Keep in mind that LOESS operates on minimizing squared error, and minimizing squared log error is a bit different than minimizing squared error.

Radical Transparency

Nothing here that hasn’t been said before, but it didn’t click until I thought about it some more and had an AHA! moment, so I’m doing my own write-up.

Let’s say that you’re faced with a Newcomb problem[1].

The basic gist is this: Omega shows up, an entity that you know can predict your actions almost perfectly. Concretely, out of the last million times it has played out this scenario, it has been right 99.99% of the time[2]. Omega presents you with two boxes, of which box A contains $1000000 or nothing, and box B always contains $1000. You have only two choices, take just box A (one boxing) or take both box A and B (two boxing). The twist is that if Omega predicted you would two box, then A is empty, but if it predicted you would one box, then box A contains the $1000000.

Causal decision theory (CDT) is a leading brand of decision theories that says you should two box[3]: once Omega presents you with the boxes, Omega has already made up its mind. In that case, there’s no direct causal relationship between your choice and the boxes having money in them, so the box A already has $1000000 or nothing in it. So, it’s always better to two box since you always end up with $1000 more than you would otherwise.

People who follow CDT to two boxing claim that one boxing is irrational, and that Omega is specifically rewarding irrational people. To me it seems clear CDT was never meant to handle problems that include minds modeling minds: is it also irrational to show up in Grand Central station at noon in Schelling’s coordination problem, despite the lack of causal connection between your actions and the actions of your anonymous compatriot? So you might agree that CDT just doesn’t do well in this case[4] and decide to throw CDT out the window for this particular problem, netting ourselves an expected $999900 from one boxing[5], instead of the expected $1100 payout from two boxing.
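The expected values fall straight out of the 99.99% accuracy figure; a quick check, with the payouts from the problem statement:

```python
accuracy = 0.9999  # Omega's observed track record

# One boxing: box A is full iff Omega correctly predicted one boxing,
# and an erroneously one-boxed empty box A pays nothing.
ev_one_box = accuracy * 1_000_000                 # ~= $999900
# Two boxing: box B's $1000 is guaranteed; box A is full only if Omega erred.
ev_two_box = 1_000 + (1 - accuracy) * 1_000_000   # ~= $1100
```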

But let’s throw in a further twist: let’s say the boxes are transparent, and you can see how much money is inside, and you see $1000000 inside box A, in addition to the $1000 inside box B. Now do you two box?


I previously thought “duh, of course”: you SEE the two boxes, both with money in them. Why wouldn’t you take both? A friend I respect told me that I was being crazy, but didn’t have time to explain, and I went away confused. Why would you still one box with an extra $1000 sitting in front of you?

(Feel free to think about the problem before continuing.)

The problem was that I was thinking too small: I was thinking about the worlds in which I had both boxes with money in them, but I wasn’t thinking about how often those worlds would happen. If Omega wants to maintain a 99.99% accuracy rate, it can’t just give anyone a box with $1000000. It has to be choosy, to look for people that will likely one box even when severely tempted.

That is, if you two box in clear-box situations and you get presented with a clear box with $1000000 in it, congratulations, you’ve won the lottery. However, people like you simply aren’t chosen often (at a 0.01% rate), so in the transparent Newcomb world it is better to be the sort of person that will one box, even when tempted with arguably free money.


The clear-box formulation makes it even clearer how Newcomb’s problem relates to ethics.

Yes, ethics. Let’s start with what Omega might put in an advertisement:

“I’m looking for someone that will likely one box when given ample opportunity to two box, and literally be willing to leave money on the table.”

Now, let’s replace some words:

“I’m looking for a <study partner> that will likely <contribute to our understanding of the class material> when given ample opportunity to <coast on our efforts>.”

“I’m looking for <a startup co-founder> that will likely <help build a great business> when given ample opportunity to <exploit the business for personal gain>.”

“I’m looking for <a romantic partner> that will likely <be supportive> when given ample opportunity to <make asymmetric relationship demands>.”

In some ways these derived problems are wildly different: these (lowercase) omegas don’t choose correctly as often as 99.99% of the time, there’s an iterated aspect, both parties are playing simultaneously, and there’s reputation involved[6]. But the important decision theory core carries over, and moreover it generalizes past “be nice” into alien domains that include boxes with $1000000 in them, and still correctly decides to get the $1000000.


[1]  I agree for most intents and purposes that the Parfit’s Hitchhiker formulation of the problem is strictly better because it lacks problems that commonly trip people up in Newcomb’s problem, like needing a weird Omega. However, then you get the clear-box problem right away, and I’m going for more incremental counter-intuitive-ness right now.

[2]  Traditional Newcomb problem formulations started with a perfect predictor, but it becomes a major point that people get tripped up over because it’s so damn “unrealistic”. I’m sure no one would object to Omega never losing tic-tac-toe, but no one seems to want to accept a hypothetical entity that can run TEMPEST attacks on human brains and do inference really well. Whatever, it’s ultimately not important to the problem, so it’s somewhat better to place realistic bounds on Omega.

[3]  Notably, Evidential Decision Theory says you should one box, but fails on other problems, and makes it a point to avoid getting news (which isn’t the worst policy when applied to most common news sources, but this applies to all information inflow).

[4]  I haven’t really grokked it, but friends are excited about functional decision theory, which works around some of the problems with CDT and EDT.

[5]  It’s not exactly $1000000, since Omega isn’t omniscient and only has 99.99% accuracy, so we take the average of the outcomes weighted by their probability: $1000000 * 0.9999 + $0 * 0.0001 = $999900, since in the 0.01% of cases where Omega wrongly predicts two boxing, box A is empty.

[6]  Notably, it starts to bear some resemblance to the iterated prisoner’s dilemma.

Tape is HOW expensive?

Maybe you've seen that hard drive prices aren't falling so quickly. Maybe you've seen the articles making claims like "tape offers $0.0089/GB!"[1], looked at recent hard drive prices, and seriously thought about finally fulfilling the old backup adage "have at least 3 backups, at least one of which is offsite" with some nice old-school tape[2].

So you'd open up a browser to start researching, and then close it right afterwards in horror: tape drive prices have HOW many digits? 4? The prices aren't just edging over $1000, either; they're usually solidly into the $2000s, or higher. Maybe then you start thinking about just forking over all your money to The Cloud™ to keep your data.

But maybe it's worth taking a look and seeing exactly how the numbers work out. As an extreme example, if you could buy a $2000 device that gives you infinite storage, that would be a really interesting proposition[3]. Of course, the media costs for tape aren't zero, but they are cheaper than the equivalent capacity in hard drives. Focusing in, the question becomes: when does the lower per-unit cost of tape storage overcome the fixed costs of tape, such that tape systems become competitive with hard drives?


Some background: tape formats are defined by the Linear Tape-Open (LTO) Consortium[4], which periodically defines bigger and better interoperable tape formats, helpfully labeled as LTO-N. Each jump in level roughly corresponds to a doubling of capacity: LTO-3 holds 400GB/tape, while the recent LTO-8 holds 12TB/tape.

And some points of clarification:

  • LTO tapes usually have two capacity numbers; for example, LTO-3 tapes usually advertise themselves as being able to contain 400 or 800GB. If you're lucky, the advertising material will suffix "(compressed)" sotto voce, notifying you that the 800GB number is inflated by some LTO-blessed pie-in-the-sky compression factor. Ignore this; just look at the LTO level numbers and their uncompressed capacity.
  • We usually talk about hard drives as a single unit (if you can see the individual hard drive platters, that means you are having a bad problem and you will not be storing data on that drive today), but tape is more closely related to the floppy/CD drives of yore, where media is freely exchangeable between drives.

First, I gathered some hard numbers on cost. I trawled Newegg and Amazon for drives and media for each LTO level from 3 to 8, grabbing prices for the first 3 drives from each source and 5 media from each. Sometimes this wasn't possible; for LTO-8, which is recent, I could only find 2 different drives. I restricted myself to a handful of pricing examples because I didn't want to gather data endlessly (there are a lot of people selling LTO tapes), though with so little data it was sometimes hard to tell whether unusually low/high prices were legitimate offers or indications that something was wrong with the seller/device. Whatever, I just got enough data to average out the noise[5].

Second, I took the average media cost for an LTO level and how much uncompressed data that level could store, and figured the cost per TB. It's true that some of the later LTO levels should look a lot more discretized: for example, storing 5 and 10 TB on an LTO-8 tape (which can store 12TB) will cost exactly the same, while you'll need around twice as many LTO-3 tapes. However, just making everything linear makes analysis a lot easier, and will give approximately correct answers. If it turns out that tape becomes competitive at some small media storage multiple, then we can re-run the numbers.

Then, it's just a matter of solving a couple of linear equations, one representing the tape fixed and variable costs, and the other the hard drive costs. To capture some variability in the hard drive cost, I compared the tapes against both a hypothetical cheap $100/4TB drive and a $140/4TB drive[6].

Cost_Tape = TapeMedia/TB × Storage + TapeDrive
Cost_HD = HD/TB × Storage

Finding the storage point where the costs become equal to each other:

Storage_competitive = TapeDrive / (HD/TB - TapeMedia/TB)
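Plugging hypothetical round numbers into the break-even formula (a $2000 drive, $15/TB media, and hard drives at $25-$35/TB; these are illustrative stand-ins, the real averages are in the linked sheet):

```python
def breakeven_tb(tape_drive_cost, tape_per_tb, hd_per_tb):
    """Storage (in TB) at which total tape cost equals total hard drive cost."""
    return tape_drive_cost / (hd_per_tb - tape_per_tb)

print(breakeven_tb(2000, 15, 35))  # 100.0 TB vs. pricier $140/4TB-class drives
print(breakeven_tb(2000, 15, 25))  # 200.0 TB vs. cheap $100/4TB-class drives
```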

When we solve with some actual data (Google Sheets), we get the smallest competitive capacity going to LTO-5 (1.5TB/tape). And yet, it doesn't look good: if we're comparing against expensive hard drives, we need to be storing ~100TB to become competitive, and if we're comparing against cheap hard drives, we need ~190TB to break even.

So I did some more sensitivity analysis: right now, drives and media are expensive for the recent LTO-7 and LTO-8 standards. Will our conclusions change once LTO-7/8 equipment drops to current LTO-5 prices? Comparing to expensive hard drives, the minimum competitive capacity drops to ~65TB, but that assumes no further HD R&D, and it's still way above the amount of data I will want to store in the near future[7].

In retrospect, it should have been obvious that the huge fixed cost of tape drives, combined with non-minuscule variable costs, just doesn't make sense for any data installation that isn't handling Web Scale™ data.

And that's not even fully considering all the weird hurdles tape has:

  • It's unclear whether there are RAID-like tape appropriate filesystems/data structures, especially when you don't have N drives that you can write to at the same time. You can read stories about wrestling with tape RAID, but it doesn't seem to be a feature of the standard Linear Tape File System.
  • Tied in with the previous point, you'll need to swap tapes once one of them fills up. Or if you're trying to get media redundancy, you'll need to do a media-swapping dance every time you want to back up. Needing to manage backup media isn't really great when you're trying to make backups so easy they're fire-and-forget.
  • Tape drives are super expensive, which makes them a giant single point of failure. Having redundant drives means you need even more data to stay competitive with normal hard drives.

So we've arrived at the same conclusion as our gut: tapes are overdetermined to be a bad idea for the common consumer. If you can get really cheap clearance/fire sale drives, it might become worth it, but keep in mind the other concerns listed above.

Data and analysis available on Google Sheets.


[1]  Which initially doesn't sound very impressive, given Backblaze's B2 offers $0.005/GB. However, that's an ongoing monthly cost: two months is enough to put tape back into the game, at least according to the linked Forbes article. (I've also remembered more impressive numbers in other articles, but maybe that's just my memory playing tricks on me.)

[2]  Tape has nice properties beyond just having a lower incremental storage cost. It's offline as opposed to constantly online: once you have access to a hard drive, you can quickly overwrite any part of it. Since it isn't possible to reach tapes that aren't physically in the drive, it becomes much more difficult to destroy all your data (say, in a ransomware attack). Tapes are possibly more stable in terms of shelf life, and you can theoretically write to it faster than hard drives.

[3]  If nothing else, owning as many universe-breaking/munchkin-approved pieces of technology as possible seems like a good policy.

[4]  Sure, you can use VCRs for storage with ArVid, but it is not competitive at all at 2GB on 2-hour tapes. It could probably be made to work better, since it uses only 2 luminance levels instead of a full 256+ gradations, but the graininess of home videos doesn't give me hope for much better resolution. Plus, you could do all that extra work and still only end up with capacity comparable to current Blu-Rays. And where are you going to find a bunch of VCR tapes these days?

[5]  Taking the median is probably better for outlier rejection, and taking the minimum price in each category would probably be a good sensitivity analysis step. I don't believe either choice drastically changes the output for me, since I have relatively small amounts of data to store, but you might want to run the numbers yourself if you have more than, say, 20TB to store.

[6]  It's true that there will likely be some additional hardware costs to actually access more than 12 hard drives, but if nothing else you could go the storage pod route and get 60 drives to a single computer, so we'll just handwave away the extra costs.

[7]  Honestly, I'm not even breaking 1TB at the moment.