Daniel May: current view, Machine Intelligence Research Institute
This is part of a series of posts where each research fellow describes their reasons for favouring a certain recipient of our final donation, as described further in the full post on Version 0.
Name: Daniel May
My current best guess is that we should donate to intervention/organisation:
Machine Intelligence Research Institute
My cost-effectiveness estimate for the intervention/organisation is:
Results of GPP model for AI safety: “Then adding a career (of typical quality for the area) to the field now adds about 3 people to the field total, which is 0.3% of the total. This is about 0.3% of the amount that would be needed to double the field, so we should expect it to avert about 0.03% of the bad outcomes, which is a total chance of 1 in 30,000 of averting existential catastrophe.”
Inputs: 10%, 1,000, 3, 10%
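The arithmetic in the quoted GPP result can be reproduced with a short sketch. It assumes the four inputs are, in order, the probability of existential catastrophe from AI (10%), the current size of the field (1,000 people), the number of people one career adds to the field (3), and the share of bad outcomes averted by doubling the field (10%). That mapping is my reading of the quote rather than something the post states, and the function and parameter names are hypothetical. It also assumes, as the quote does, that impact scales linearly with the fraction of a doubling.

```python
def chance_of_averting_catastrophe(p_catastrophe, field_size, people_added,
                                   reduction_from_doubling):
    """Rough chance that one extra career averts existential catastrophe.

    All parameter names are illustrative assumptions, not GPP's own terms.
    """
    # Fraction of the field (and so of a doubling) contributed by the career.
    share_of_doubling = people_added / field_size
    # Linear scaling: fraction of bad outcomes averted by that contribution.
    fraction_averted = share_of_doubling * reduction_from_doubling
    # Multiply by the baseline risk to get an absolute probability.
    return fraction_averted * p_catastrophe


result = chance_of_averting_catastrophe(
    p_catastrophe=0.10,
    field_size=1_000,
    people_added=3,
    reduction_from_doubling=0.10,
)
print(f"{result:.5%} chance, i.e. about 1 in {round(1 / result):,}")
```

Under these assumptions the product comes out at roughly 1 in 33,000, which the quote rounds to 1 in 30,000.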
“Now, the work to build a field which ties into existing AI research is happening, and is scaling up quite quickly. Examples:
Concrete Problems in AI Safety presents a research agenda which is accessible to a much broader community;
The Future of Life Institute made a number of grants last year;
The Open Philanthropy Project has given over $5 million to establish a Center for Human-Compatible AI at Berkeley;
Google DeepMind and OpenAI are both building teams of safety researchers.
I expect this trend to continue for at least a year or two. Moreover I think this work is significantly talent-constrained (and capacity-constrained) rather than funding-constrained. In contrast, MIRI has been developing a talent pipeline and recently failed to reach its funding target, so marginal funds are likely to have a significant effect on actual work done over the coming year. I think that this funding consideration represents a significant-but-not-overwhelming point in favour of MIRI over other technical AI safety work (perhaps a factor of between 5 and 20 if considering allocating money compared to allocating labour, but I’m pretty uncertain about this number).
A few years ago, I was not convinced that MIRI’s research agenda was what would be needed to solve AI safety. Today, I remain not convinced. However, I’m not convinced by any agenda. I think we should be pursuing a portfolio of different research agendas, focusing in each case on not optimising for technical results in the short term, but optimising for a solid foundation that we can build a field on and attract future talent to. As MIRI’s work looks to be occupying a much smaller slice of the total work going forwards than it has historically, adding resources to this part of the portfolio looks relatively more valuable than before. Moreover MIRI has become significantly better at clear communication of its agenda and work -- which I think is crucial for this objective of building a solid foundation -- and I know they are interested in continuing to improve on this dimension.”
“It seems plausible that money could be better spent by 80,000 Hours or CFAR in helping to develop a broader pipeline of talent for the field. However, I think that a significant bottleneck is the development of really solid agendas, and I think MIRI may be well-placed to do this.”
Quotes from the 2017 AI Risk literature review and organization overview:
“In general I am more confident the FHI work will be useful than the MIRI work, as it more directly addresses the issue. It seems quite likely general AI could be developed via a path that renders the MIRI roadmap unworkable (e.g. if the answer is just to add enough layers to your neural net), though MIRI’s recent pivot towards ML work seems intended to address this.
However, the MIRI work is significantly less replaceable - and FHI is already pretty irreplaceable! I basically believe that if MIRI were not pursuing it no-one else would. And if MIRI is correct, their work is more vital than FHI’s.”
What would change my mind:
Update: My view has changed since the meeting discussing current views, in favour of MIRI, GFI, and meta-orgs being more cost-effective, with MIRI at the top.
Global health (particularly AMF):
This seems like it has decent tractability and a solid evidence base. Neglectedness seems low in general, but there seem to be some charities working in more neglected areas (e.g. AMF, SCI).
Evidence that GiveWell’s analysis relies on assumptions that don’t hold true or roughly true (Sindy’s post on whether we should defer to GiveWell).
Existential risk (and global catastrophic risk):
I believe that the far future matters greatly due to the potential for very large numbers of worthwhile lives, and that I could be convinced that an organisation in this space is the most cost-effective. I am concerned about tractability, and in some cases (e.g. AI safety) neglectedness.
It seems like there are several x-risks with differing probabilities. I worry that by increasing the probability that one x-risk turns out okay, we are either merely holding it off for some years, or that another x-risk will occur anyway.
Preventing many x-risks seems to require humans to value certain things and act on them (e.g. caring about AI friendliness over potential selfish economic benefits, or valuing future people enough to spend one’s life slightly worse off; many people care about animals suffering on factory farms, but struggle to act on this). I am quite skeptical that we are or will be prepared for the necessary kinds of coordination, and I worry about the volatility of these values (I recall, for example, that being wealthier correlates with an expanded moral circle. I can imagine these values shifting backwards if things started to go relatively badly, which seems not unlikely.)
I would change my mind if I saw more evidence that coordination was possible or happening, such as more things like the Partnership on AI being created (and evidence of them having an effect), as well as by seeing evidence that increased the probability I assign to the future going well (e.g. people globally content, meeting basic needs) around times of significant probabilities of x-risk (e.g. evidence that climate change will be solved in time, evidence that we won’t lack for things like energy and other resources). For example, is wealth generated by future AIs that cause unemployment likely to be redistributed?
I think human lives as they are today are worth living, with high confidence, but I worry that our descendants’ lives (including non-human descendants, such as the potential for billions of animals suffering in the wild or in factory farms) might not be, in that they will be much less worth living, or that their existence will be net bad (see e.g. “This is the Dream Time”; Robin is more optimistic than I am about this future), though I also think it is possible that our descendants’ lives can be much more worth living, for example, by changing our biology. This matters less for Qays’ model, which only looks at humans up until the year 2100, but could make the case for models concerned with the further future better if I encountered evidence that far-future lives will be worth it.
Evidence of more or fewer people going into x-risk areas could convince me either way, as could the opinions of experts in the area (such as “Future Progress in Artificial Intelligence: A Survey of Expert Opinion” for more fields).
Causal chain for how the org will influence x-risk (though as Tom mentioned at the meeting, this seems built in to the GPP model).
I think that both factory farms and wild animal suffering are huge problems, and I worry that they could continue to be into the far future if alternative food such as artificial meat is not developed. I am concerned about its tractability, and I think it is fairly neglected (veg* advocacy is popular, but I am unsure how much of an effect it has, and I think there’s a decent chance that more neglected parts, like research into cultured meats, would be much more effective).
I am unaware of interventions for either of these with strong evidence backing (including the value of going vegetarian/vegan), but I do not have a good sense of the area and have not looked into it much. I could be convinced if I saw this kind of evidence, so I should first read Animal Charity Evaluators’ website, along with Open Phil’s investigation into artificial meat.
I could be convinced either way if I encountered evidence that animals’ lives matter much more or less than I currently think.
If more research and funding were going into cultured meat than I think, or were likely to increase significantly in future, I would decrease my estimate.
I would decrease my estimate if it seemed more likely that not eating animals was the default path of the future (expanding moral circle).
Concerns about counterfactuals and measuring impact (for EA advocacy work rather than e.g. meta research into cause prioritization). For example, would people give just as much if GWWC did not exist, and would EAs write career advice and offer coaching in their spare time to the same standard as 80k offers if it did not exist?