Monday, December 26, 2016

2016 Donations

In case anyone is interested, here's where I'm giving in 2016.  These reflect a fair bit of thought and communication with people in the EA community, but I haven't put nearly as much thought into it as have people who think about donations full-time.  For other perspectives, you can check out the recommendations/grants of the Open Philanthropy Project, GiveWell, Animal Charity Evaluators, and Giving What We Can.  Disclosure: I'm on the board of some of the charities I'm giving to, and am friends with the people running many of them.

TL;DR: (M = meta EA organization, D = direct work; A = animals, X = xrisk, P = global poverty, G = general/other.)

Largest:
     The Center for Effective Altruism (M/G)
     80,000 Hours (M/G)
Medium:
     The Humane League (D/A)
     The Future of Humanity Institute (D/X)
Small:
     Animal Charity Evaluators (M/A)
     Against Malaria Foundation (D/P)
     The Good Food Institute (D/A)
     The Machine Intelligence Research Institute (D/X)
   

Large Donations


This year I expect my largest donations to be to CEA and 80K.  This reflects my general excitement about the potential of meta Effective Altruism organizations.

Center for Effective Altruism

The Center for Effective Altruism (CEA) is an Oxford (and now Bay Area) effective altruist organization that has been instrumental in organizing the EA community.  It has filled a number of different roles over the years.  Many of the organizations within the EA movement, including  80,000 Hours, Animal Charity Evaluators, Giving What We Can, and the Global Priorities Project, are outgrowths of CEA.  This alone makes it partially responsible for, respectively, the primary EA career advice organization, the primary source for animal welfare charity recommendations, over $1 billion of lifetime donation pledges, and the primary interface between EA, governments, and policy.  CEA also works closely with the Future of Humanity Institute, one of the top sources of research and coordination in the AI safety field.  CEA has also handled PR for much of the EA movement, been one of the primary drivers of growth, helped to organize EA conferences, helped to raise a ton of money for the movement, and done crucial behind-the-scenes work to make sure that needs in the EA movement are being filled.  CEA was recently accepted to Y-Combinator's nonprofit section.  I personally have a ton of respect for CEA's leadership.

CEA has so far raised about $800K this year and is targeting about $3M.  You can donate here.


80,000 Hours

80,000 Hours (80K) is an EA career advice service.  Their main role in the EA community is to help promising college students figure out which careers they can do the most good in.  They have helped thousands of impressive students to find jobs working directly for EA organizations, become promising AI researchers, find influential jobs in politics, find particularly good jobs earning to give, and pledge to donate significant amounts of money.  They have generally helped to grow the EA community through outreach to students and, sometimes, wealthy donors.  They personally helped me with my career decision, and have some pretty impressive stats about their cost effectiveness.  They recently participated in Y-Combinator as a nonprofit.

80K is targeting a budget of about $1.5M.  You can donate here.

For what it's worth, CEA and 80K are generally targeting a donation ratio of $2 to CEA for each $1 to 80K; I plan to support in roughly that ratio.


Medium Donations

My medium donations for 2016 are to The Humane League and the Future of Humanity Institute.  While I don't personally think they will do as much with the marginal dollar this year as 80K or CEA, they are both impressive organizations doing a lot of good.

The Humane League

The Humane League (THL) is a farmed animal welfare organization.  THL has had huge impacts in corporate campaigning, movement growth, and leafleting, and has also helped make the animal welfare movement more data-driven.  Their victories in getting large fractions of US hens moved to cage-free environments through corporate campaigning are particularly impressive.  As with CEA and 80K, I'm impressed by THL's leadership.  They've been a consistent force for sensible, smart, and effective animal welfare approaches.

You can donate to The Humane League here.

The Future of Humanity Institute

The Future of Humanity Institute (FHI) is an Oxford-based research institute that's working on issues concerning the long-run future of the world, particularly AI existential risk.  FHI has been the base for a number of important projects in the x-risk space, including Nick Bostrom's work and a lot of progress coordinating between AI safety researchers, industry, academics, and governments.  I would consider making a larger donation to FHI, except that I'm not sure they're currently very funding constrained.  Like THL for animals, FHI has been a consistent force for reasonable, productive work in the AI x-risk community.

You can donate to FHI here.


Small Donations

I'll be giving small donations this year to MIRI, Animal Charity Evaluators, the Good Food Institute, and the Against Malaria Foundation.  I don't think any of them are the best uses of money right now but would like to cast a vote of support for what they do.

Animal Charity Evaluators

ACE researches charities working on improving animal welfare and attempts to find the best.  I don't think ACE is particularly funding constrained right now, but it has served an important and neglected role in the EA community, and has a record of choosing very impressive charities that are leading the way for an effective, rational animal welfare movement.  You can donate to ACE here.

The Against Malaria Foundation

AMF is an organization that distributes mosquito nets in Africa to help prevent malaria.  AMF has, for many years, been possibly the most effective global health charity in terms of lives saved per dollar (currently estimated at somewhere around $4,000 per life, though the estimates are sensitive to what assumptions you make).  You can donate to AMF here.

The Good Food Institute

GFI is a newer charity that is helping to promote the development and adoption of plant-based and/or cultured (i.e. grown in a lab) meat replacements.  I know relatively little about GFI but am excited about the potential of meat replacements to end factory farming, and am working partially off of ACE's recommendation.  You can donate to GFI here.

The Machine Intelligence Research Institute

MIRI is an organization doing technical research on AI safety.  I think (but am not sure!) that FHI's approach to x-risk is probably the more important one right now, and am uncertain of the usefulness of MIRI's output, but MIRI is one of the few places dedicated to technical AI x-risk work right now and was one of the early forces popularizing the idea.  You can donate to MIRI here.



Wednesday, August 12, 2015

Multiplicative Factors in Games and Cause Prioritization

TL;DR: If the impacts of two causes add together, it might make sense to heavily prioritize the one with the higher expected value per dollar.  If they multiply, on the other hand, it makes sense to more evenly distribute effort across the causes.  I think that many causes in the effective altruism sphere interact more multiplicatively than additively, implying that it's important to heavily support multiple causes, not just to focus on the most appealing one.

-----------


Part of the effective altruism movement was founded on the idea that, within public health charities, there is an incredibly wide spread between the most effective and least effective.  Effective altruists have recently been coming around to the idea that at least as important is the difference between the most and least effective cause areas.  But while most EAs will agree that global public health interventions are generally more effective, or at least have higher potential, than supporting your local opera house, there's a fair bit of disagreement over what the most effective cause area is.  Global poverty, animal welfare, existential risk, and movement building/meta-EA charities are the most popular, but there are also proponents of first world education, prioritization research, economics, life extension, and a whole host of other issues.

Recently there's been a lot of talk about whether one cause is so important that all other causes are rounding errors compared to it (though there's some disagreement over what that cause would be!).  The argument roughly goes: when computing the expected impact of causes, mine is 10^30 times higher than any other, so nothing else matters.  For instance, there are 10^58 potential future humans, so increasing the odds that they exist by even .0001% is still 10^44 times more important than anything that impacts current humans.  Similar arguments have been made where the "very large number" is the number of animals, or the intractability of a cause, or moral discounting of some group (often future humans or animals).

This line of thinking is implicitly assuming that the impacts of causes add together rather than multiply, and I think that's probably not a very good model.  But first, a foray into games.

Krug Versus Gromp


Imagine that you're playing some game against a friend.  You each have a character--yours is named Krug, and your opponent's is named Gromp.  The characters will eventually battle each other, once, to the death.  They each do some amount of damage per second D, and have some amount of health H.  They'll keep attacking each other continuously until one is dead.

If they fight, then Krug will take H_g / D_k seconds to kill Gromp, and Gromp will take H_k / D_g seconds to kill Krug, with the winner being the one who lasts longer.  Multiply through by D_g*D_k, and you get that the winner is the one who has the higher D*H--what you're trying to maximize is the product of damage per second, and health.  It doesn't matter what your opponent is doing--there's no rock, paper, scissors going on.  You just want to maximize health * damage.

Now let's say that before this fight, you each get to buy items to equip to your character.  You're buying for Krug.  Krug starts out with no health and no damage.  There are two items you can buy: swords that each give 5 damage per second, and shields that each give 20 health.  They both cost $1 each, and you have $100 to spend.  It turns out that the right way to spend your money is to spend $50 buying 50 swords, and $50 buying 50 shields, ending up with 250 damage per second, and 1,000 health.  (You can play around with other options if you want, but I promise this is the best.)

The really cool thing is that your money allocation is totally independent of the cost of swords and shields, and how much damage/health they give.  You should spend half your money on swords and half on shields, no matter what.  If swords cost $10 and gave 1 attack, and shields cost $1 and gave 100 health, you should still spend $50 on each.  One way to think about this is: the nth dollar I spend on swords will increase my damage per second by a factor of n/(n-1), and the nth dollar spent on shields will increase my health by n/(n-1).  Since all I care about is damage * health, I can just pull out these multiplicative factors--the actual scale of the numbers doesn't matter at all.
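If you don't feel like playing around with the options by hand, here's a quick brute force (my own toy sketch, not from the game): try every whole-dollar split of the $100 and confirm that half-and-half maximizes damage * health no matter what the per-dollar stats are.

```python
def best_split(budget, sword_dps, shield_hp):
    """Return the (swords, shields) dollar split that maximizes
    damage * health, where each $1 of swords adds sword_dps damage
    per second and each $1 of shields adds shield_hp health."""
    swords = max(range(budget + 1),
                 key=lambda s: (s * sword_dps) * ((budget - s) * shield_hp))
    return swords, budget - swords

print(best_split(100, 5, 20))   # (50, 50): the original setup
print(best_split(100, 1, 100))  # (50, 50): stats changed, split didn't
```

(The $10-sword case works the same way: a sword that costs $10 and gives 1 attack is just 0.1 damage per dollar, and the split stays even.)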

This turns out to be a useful way to look at a wide variety of games.  In Magic, 4/4's are better than 2/6's and 6/2's; in League of Legends, bruisers win duels; in Starcraft, Zerglings and Zealots are very strong combat units.  In most games, the most powerful duelers are the units that have comparable amounts of investment in attack and defense.

Sometimes there are other stats that matter, too.  For instance, there might be health, damage per attack, and attacks per second.  In this case your total badassery is the product of all three, and you should spend 1/3 of your money on shields, 1/3 on swords, and 1/3 on caffeine (or whatever makes you attack quickly).  Most combat stats in games are multiplicative, and you're usually best off spending equal amounts of money on all of them, unless you're specifically incentivized not to (e.g. by getting more and more efficient ways to buy swords the more you spend on swords).  In general, when factors each increase linearly in money spent and multiply with each other, you're best off spending equal amounts of money on each of the factors.  Let's call this the Principle of Distributed Power (PDP).
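For those who want the general statement behind the PDP (my gloss, not something I proved in the post): if you spend x_i on the i-th stat, each stat is linear in money, and power is the product of the stats, then AM-GM pins down the even split.

```latex
% Budget B split as x_1 + ... + x_n = B; stat i becomes c_i x_i.
\max_{x_1 + \cdots + x_n = B} \prod_{i=1}^{n} c_i x_i
  \;=\; \Big(\prod_{i=1}^{n} c_i\Big) \max_{x_1 + \cdots + x_n = B} \prod_{i=1}^{n} x_i
% By AM-GM, the product of the x_i is at most the n-th power of their mean:
\prod_{i=1}^{n} x_i \;\le\; \Big(\frac{x_1 + \cdots + x_n}{n}\Big)^{\!n} = \Big(\frac{B}{n}\Big)^{\!n}
% with equality exactly when x_1 = \cdots = x_n = B/n: spend B/n on each
% stat, regardless of the conversion rates c_i.
```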


Multiplicative Causes


So, what does this have to do with effective altruism?

I think that, in practice, the impacts of lots of causes multiply, instead of adding.  For instance, I think that a plausible way to view the future is that expected utility is X * G, where X is the probability that we avoid existential risk and make it to the far future, and G is the goodness of the world we create, assuming we succeed in avoiding x-risk. By the Principle of Distributed Power, you'd want to invest equal amounts of resources in X and G.  But within X there are actually lots of different forms of existential risk--AI, Global Warming, bioterrorism, etc.  And within G, there are lots and lots of factors, each of which might multiply with each other--technological advancement, the care with which we treat animals, ability to effectively govern ourselves, etc.  And the PDP implies that our prior should be to invest comparable resources in each of those terms.

The real world is a lot messier than the battle between Krug and Gromp.  One of the big differences is that the impact of work on most of these causes isn't linear.  If you invest $1M in global warming x-risk maybe you reduce the odds that it destroys us by .01%, but if you invest $10^30 clearly you don't decrease the odds by 10^22%--the odds can't go below 0.  Many of these causes have some best achievable outcome, and so at some point you have to have decreasing marginal utility of resources.

Another difference is that we're not starting from zero on all causes.  The world has already invested billions of dollars in fighting global warming, and so that should be subtracted from the amount that's efficient to further spend on it.  (If you start off with $100 already invested in swords, then your next $100 should be invested in shields before you go back to splitting up your investments.)

In practice, when considering causes that multiply together, the question of how to divide up resources depends on how much has already been invested, where on the probability distribution for that cause you currently think you are, and lots of other practicalities.  In other words, it depends on how much you think it costs to increase your probability of a desired outcome by 1%.

But as long as there are other factors that multiply with it, a factor's importance transfers to them as well.  Which, in some cases, is a fact long ago discovered: the whole reason that x-risk is important is because of how immensely important the future is, which is equally an argument for improving the future and for getting there.

None of this proves anything.  But it's significantly changed my prior, and I now think it's likely that the EA movement should heavily invest in multiple causes, not just one.

I've spent a lot of time in my life trying to decide what the single most important cause is, and pissing other people off by being an asshole when I think I've found it.  I also like playing AD carries.  But my winrate with them isn't very high.  Maybe it's time to build bruiser.




Monday, December 31, 2012

Pitcher Fatigue, Part 2: The Top 10

Earlier, I wrote a post on the declining effectiveness of starting pitchers as they get deeper into games, postulating that it comes from two major sources: first, that it's difficult to throw 100 pitches in a night without your arm getting temporarily tired; and second, that the second time a batter sees a pitcher, he already knows what kind of stuff the pitcher is throwing and so is better able to hit it.  Overall I estimated that by rotating pitchers frequently each game, so that no pitcher went through the lineup more than once, a team could save about 5.6 wins each season (ignoring other effects, like the fact that if you're in the NL you get to pinch hit more often).

Also, starting with this post I'm going to make a conscious effort to switch from using OPS as my default batting stat to wOBA. wOBA, which is on the same scale as on-base percentage, is basically a version of OPS that uses more accurate weightings for events.
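For reference, wOBA is just a weighted sum of a hitter's positive outcomes divided by (roughly) his plate appearances.  Here's a sketch, using weights close to the published 2012 linear weights (the exact coefficients vary a bit from season to season, so treat these as illustrative, not authoritative):

```python
def woba(ubb, hbp, singles, doubles, triples, hr, ab, bb, ibb, sf):
    """wOBA with approximate 2012 linear weights (illustrative values)."""
    weighted = (0.69 * ubb + 0.72 * hbp + 0.89 * singles
                + 1.26 * doubles + 1.60 * triples + 2.06 * hr)
    return weighted / (ab + bb - ibb + sf + hbp)
```

Unintentional walks (ubb) count in the numerator, intentional walks drop out of the denominator, and outs contribute nothing--which is what puts wOBA on the same scale as OBP.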

______________________________

On average, in 2012, the first time pitchers saw a batter they allowed a wOBA of about 0.338.  The second time they saw those batters, the wOBA jumped to about 0.350, for a difference in wOBA of about 0.011.  I'm going to name this statistic--wOBA for second plate appearances minus wOBA for first--w-diff.

So the league average w-diff in 2012 was about 0.011.  But different pitchers had different w-diffs.

Look, for instance, at R.A. Dickey.  Dickey is a knuckleballer, and so one would expect hitters to be unusually bad the first time they see him--they have no practice hitting a knuckleball--but to get much better the second time, meaning one would expect him to have an unusually large w-diff.  And, in fact, he does have a large w-diff over his career if you ignore all of the seasons in which he didn't have a large w-diff, which is a thing that makes a lot of sense to do if you have a personal vendetta against the year 2011.


Sunday, December 30, 2012

Being a Utilitarian, Part 2: Conventional Charities

This is the second post in a series on actually being a utilitarian in the world; for the first post, look here.  Also, for a more theoretical series on utilitarianism, look here.

______________

So, say that you're a utilitarian, and you're wondering what to do with your life.  (Even if you're not a utilitarian but are wondering what to do with your life, most of this will apply.)  What should you do?  What, in the current society, can an individual do to make the world a better place?  And what causes should you care about?

Is there anything you can do with your life to make the world a better place?


Sunday, December 23, 2012

Less Stupid Use of Pitchers: Pitcher Fatigue

A while ago I wrote a post about one of the most unenlightened areas of baseball strategy: the use of pitchers.  I proposed eliminating the distinction between starting pitchers, middle relievers, and closers in favor of a system that just uses a set of pitchers, each pitching different total numbers of innings, but no single pitcher pitching more than a few innings in a game; in other words, a starter would now throw two innings every few games instead of seven innings every five games.

The advantages of this, as I see it, are four fold.

1) If you're an NL team, you can pinch hit for your pitchers whenever they come up.

2) Pitchers don't have to throw 100 pitches in a game.

3) Batters never get to see the same pitcher twice in a game, and so can't get used to their pitches.

4) You can get the pitcher-batter match-ups you want all the time, instead of being stuck with your same pitcher the first three times through the lineup.

In the first post I estimated the size of effect (1): pinch hitting for your pitcher every time would let you score about 0.2 more runs per game, translating into about 3.2 wins per season (the difference between a .500 team and a .520 team).
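The arithmetic behind that conversion, for the curious, uses the standard sabermetric rule of thumb that roughly ten extra runs over a season are worth one win (an approximation that shifts with the run environment):

```python
runs_per_game = 0.2    # estimated gain from always pinch hitting for the pitcher
games = 162
runs_per_win = 10      # rule-of-thumb conversion; not exact

extra_wins = runs_per_game * games / runs_per_win
print(round(extra_wins, 1))          # 3.2 wins per season
print(round(extra_wins / games, 3))  # 0.02, i.e. a .500 team becomes .520
```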

Now I'm going to look at effects (2) and (3).

Tuesday, December 4, 2012

Being a Utilitarian, Part 1

I've written a series of posts about the different types of utilitarianism arguing for aggregate, classical, act, one-level utilitarianism.  I haven't, however, talked at all about what it would mean to be a utilitarian in the real world.

In the real world, obviously, you aren't faced with a series of trolley problems or utility monsters.  If you don't think about it very much, you might conclude that utilitarianism isn't actually useful because you can't calculate the total utility of each possible action.

However, as it turns out, utilitarianism can be useful even if you don't know the exact state of the universe.

In future posts I'll examine thornier, more wide-reaching issues, but for now I'll just talk about one issue--the first issue that I actually thought about in utilitarian terms.  For people familiar with utilitarianism it probably won't be that interesting or revolutionary, but it's a good way to remind yourself that just because a theory is complicated doesn't mean approximations can't be useful.  (It also parallels an argument Peter Singer has made on the subject.)


Re-starting the blog, and results of the second contest

As you may have noticed, after a hiatus while the school year started, I'm back to blogging.

First, I never resolved the second contest.  No one solved the puzzle but Matt Nass made partial progress, so he gets 3 Shadow-points.  I'm going to leave the puzzle open and if anyone solves it they get one Shadow-point.  Here's the puzzle again, with a little bit filled in as a hint:

Instructions for the puzzle are here.

Also, I think that weekly was probably too frequent for the contests, so they're going to change to bi-weekly; I'll have another one out soon.

If there's anything you want me to write about, put it in the comments here.

Friday, November 23, 2012

Newcomb's Decision

This post is partially a continuation of my previous posts on utilitarianism, and partially on philosophy in general; mostly, it's my two cents on one of the odder parts of consequentialist debate: decision theories.

Newcomb's Paradox

You, a mere mortal, encounter P, some super smart alien.  Or maybe it's a supercomputer, or maybe a god; versions of the paradox differ on this.  P comes up to you and says: "I have a deal for you.  I'm going to give you two boxes--box A, and box B.  Box B is transparent, and you can see $1,000 in it.  You can't see what's in box A.  I'm going to give you two choices.  The first is to take box A--you get whatever is in it.  The other choice is to take both boxes--you get box A, plus the $1,000 from box B."

So, you ask, why don't you take both boxes, getting the free $1,000?  Well, says P, there's a catch: "I have predicted whether you will take one box or two boxes."  (Or maybe I've simulated all of the atoms in the universe, or maybe studied your psychology, or maybe something else--versions of the paradox differ in how P knows how many boxes you're going to take.  But however he knows it, you believe him; maybe he has, in the past, predicted everyone who's taken this challenge successfully.)  "So I know what you're going to do", says P, "and before you arrived I decided how much money to put in box A.  If I predicted that you were going to take only box A, I put $1,000,000 in it.  Otherwise--if I predicted that you were going to take both boxes--I left box A empty."



"So", says P, "How many boxes do you want to take?"
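To see why the paradox has teeth, it helps to write down the naive expected values (my bookkeeping, not part of the setup above).  If you treat P's prediction as simply being correct with probability p, one-boxing wins whenever p is above 500,500/1,000,000 ≈ 50.05%--which is why P's track record matters so much:

```python
def one_box_ev(p):
    """Expected winnings from taking only box A, if P's prediction is
    correct with probability p."""
    return p * 1_000_000               # box A holds $1M only if P foresaw one-boxing

def two_box_ev(p):
    """Expected winnings from taking both boxes under the same reading."""
    return (1 - p) * 1_000_000 + 1_000  # the $1M is there only if P guessed wrong

for p in (0.5, 0.6, 0.99):
    print(p, "one-box better:", one_box_ev(p) > two_box_ev(p))
```

(Causal decision theorists object to this bookkeeping, of course--the money is already in the box by the time you choose--which is exactly the fight the paradox sets up.)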

Sunday, November 11, 2012

Elections and the Future

The Democrats' victory in the 2012 elections--primarily President Obama's reelection, but also the Democratic caucus in the Senate growing by three senators*--has caused a fair amount of hand-wringing in conservative circles about the future of the Republican party.  The new fashion in political circles seems to be guessing which of opposition to comprehensive immigration reform, opposition to gay rights, and opposition to tax hikes for the wealthy will have been felled by the 2012 election.  I agree in large part about the long-term trend of American politics, but I think it's important to keep it in perspective.


Tuesday, November 6, 2012

Election 2012 liveblog

Welcome to the Measuring Shadows 2012 election liveblog!  We'll be live chatting on the widget to the right, and writing longer comments on this page.





Monday, November 5, 2012

Lame38, and liveblogging the election

I will be liveblogging the election tomorrow night on this blog, starting with some posts early in the day but picking up around 7pm.  But why should you follow this blog?

Because there is a crisis waiting to unfold tomorrow night: what if polls have just closed in Virginia but Nate Silver hasn't updated fivethirtyeight yet?  How will you know if Obama's chances of winning the election have jumped to 92.5% or plummeted to 91.5%?

It is in the hopes that this catastrophe may be averted that I unveil Lame38: a shitty version of 538 that I promise to update frequently.  What is Lame38?  Well, I took all of the projections from fivethirtyeight, created a simplified model, and will run it with updated results--giving a real time projection of what Obama's chances of winning are, incorporating both states that have been called and the actual votes from noncompetitive states.




Q: Why is this better than 538?
A: Well maybe 538 will take like 15 minutes to update but Lame38 will only take 5 minutes.  That'd be pretty cool, right?

Q: So it's just a shittier version of 538?
A: That's kind of a rude question.

Q: What information does your model incorporate as the night goes on?
A: I incorporate not just what states have been called, but also popular vote results from each state to try to estimate biases in the projections.


Q: So how does Lame38 work?
A: I took the projected vote differences and errors for each state from fivethirtyeight, as well as the standard deviation of the national popular vote.  For each simulation I will sample a national popular vote, bias each state by its difference from the projected national popular vote (to estimate national bias), and then sample each state's vote.  I then run about 1,000,000 simulations.
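In code, the procedure looks roughly like this (a stripped-down sketch with made-up projections standing in for the actual fivethirtyeight numbers, and only three swing states shown):

```python
import random

# Hypothetical projections: state -> (Obama margin, margin std dev, electoral votes)
SWING = {"OH": (0.03, 0.03, 18), "FL": (-0.01, 0.03, 29), "VA": (0.02, 0.03, 13)}
NATIONAL_MARGIN, NATIONAL_SD = 0.025, 0.02  # projected national popular vote margin
SAFE_OBAMA_EV = 243                         # EVs from noncompetitive states (made up)

def simulate_once():
    # Sample a national popular vote; its miss from the projection
    # biases every state in the same direction.
    bias = random.gauss(NATIONAL_MARGIN, NATIONAL_SD) - NATIONAL_MARGIN
    ev = SAFE_OBAMA_EV
    for margin, sd, votes in SWING.values():
        if random.gauss(margin + bias, sd) > 0:  # then sample each state's vote
            ev += votes
    return ev >= 270

n = 100_000
p_obama = sum(simulate_once() for _ in range(n)) / n
print(p_obama)
```

The real model uses every state and about 1,000,000 simulations, but the shape is the same.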

Q: So, what are the odds Obama wins the election?
A: Well, 538 says 92.2%*, and Lame38 says 93.5%, so I'd say the answer is about 92.2%.

Q: What else will you be blogging about?
A: Senate races, the race for California's 15th congressional district, and whatever else is on my mind.

Q: Will this be a great liveblogging, or the greatest liveblogging?
A: I'm shooting for "worth reading".

I hope to see you there!



*When I ran the simulations; 538 has since updated to 92.0%.

Thursday, August 16, 2012

Checking in on Tim Lincecum

A month ago I wrote an article looking at changes in San Francisco Giants' starting pitcher Tim Lincecum's pitch command and velocity this season.  In particular, I found that the velocity on his fastball and slider had decreased by about a mile per hour, and that the average distance of his pitches from the edge of the strikezone had increased--assuming that, generally, pitches near the edge of the strikezone are better than those in the middle or nowhere near it.

Since the all-star break, though, Lincecum's ERA, at least, has been a respectable 3.66.  Have his command and speed improved as well?

It turns out that his speed is the same as earlier this year, with a fastball averaging around 90.4 mph, and about a mile per hour slower than last year.  So, no improvement on that front.

His average distance from the edge of the strikezone, on the other hand, has gotten a bit better--it's been at .940*, as compared to .923 last year and .961 for the first half of 2012**.

So, long story short, there are some signs that his pitching might be picking up but nothing conclusive.
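For anyone who wants to replicate the distance stat, here's roughly how it can be computed from pitch-location data (a sketch with my own assumed zone: PITCHf/x-style coordinates in feet, a half-width of 0.83 ft, and a 1.5-3.5 ft vertical zone; the boundaries I actually used may differ):

```python
def dist_from_zone_edge(px, pz, half_width=0.83, zone_bot=1.5, zone_top=3.5):
    """Distance (feet) from a pitch's plate-crossing location (px, pz) to the
    nearest edge of an assumed strike zone rectangle.  px is horizontal
    distance from the plate's center, pz is height off the ground."""
    dx = abs(px) - half_width               # signed distance past the side edge
    dz = max(zone_bot - pz, pz - zone_top)  # signed distance past top/bottom
    if dx <= 0 and dz <= 0:
        return min(-dx, -dz)                # inside the zone: nearest edge
    return (max(dx, 0) ** 2 + max(dz, 0) ** 2) ** 0.5  # outside: corner-aware

print(round(dist_from_zone_edge(0.0, 2.5), 2))  # 0.83: dead center
print(round(dist_from_zone_edge(1.0, 2.5), 2))  # 0.17: a bit outside
```

Averaging this over every pitch a pitcher throws gives a season-level number on roughly the scale of the .92-.96 values above.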


By the way, I'll be announcing results from the second contest and introducing the third one in the next day or two.

_________________________________________________________________________________
*, **: The value for the second half of this year is only statistically significantly different from the value from last year, and from the value for the first half of this year, at the p = .30 level--so the jury's out on this one.

Monday, August 13, 2012

Checking in on the Giants' lineup

Earlier I wrote a few posts on what the SF Giants' optimal starting lineup should be using Basim, a baseball simulator I wrote.  A lot has changed since then, though--Posey has become much better, Blanco and Pagan have cooled off, and Hunter Pence and Marco Scutaro have joined the club.  So, what should the Giants' lineup look like now?  What lineup do I hope they start tonight?

First off, here was my guess at their best lineup (once again assuming Zito is the pitcher):

1. Buster Posey
2. Brandon Belt
3. Melky Cabrera
4. Pablo Sandoval
5. Hunter Pence
6. Marco Scutaro
7. Angel Pagan
8. Brandon Crawford
9. Barry Zito

Running this lineup through the simulator*, it scored an average of 4.03 runs per game.

I then found that a random lineup (i.e. random ordering of the nine players) scored about 3.89 runs per game.  The lineups that the simulator liked the best generally had Posey, Belt, Cabrera, or Pence batting leadoff, which is unsurprising--each has either a high OBP or a high ground into double play rate that would be very painful in the heart of the order.  The single lineup that the simulator liked best** was the following:

1. Buster Posey
2. Angel Pagan
3. Hunter Pence
4. Brandon Belt
5. Melky Cabrera
6. Marco Scutaro
7. Pablo Sandoval
8. Brandon Crawford
9. Barry Zito

It scored an average of about 4.035 runs per game***.
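Basim itself isn't public, but the core idea can be sketched in a few lines (a toy Monte Carlo inning simulator of my own: single/homer/out outcomes only, crude baserunning--not the real simulator, which models many more events):

```python
import random

def avg_runs_per_game(lineup, n_games=5_000):
    """lineup: batting-order list of (p_single_or_walk, p_homer) per player.
    Toy model: a single/walk puts a runner on (forcing in a run only with
    the bases loaded), a homer scores everyone, anything else is an out."""
    total_runs = 0
    batter = 0                       # batting order carries over between innings
    for _ in range(n_games * 9):     # nine innings per game
        outs, runners = 0, 0
        while outs < 3:
            p_single, p_homer = lineup[batter % len(lineup)]
            batter += 1
            roll = random.random()
            if roll < p_single:
                if runners == 3:
                    total_runs += 1  # bases loaded: a run is forced in
                else:
                    runners += 1
            elif roll < p_single + p_homer:
                total_runs += runners + 1
                runners = 0
            else:
                outs += 1            # runners left on base are stranded
    return total_runs / n_games

# A made-up team: four good hitters, four average ones, and a pitcher.
team = [(0.30, 0.05)] * 4 + [(0.22, 0.02)] * 4 + [(0.12, 0.00)]
print(round(avg_runs_per_game(team), 2))
```

Searching over orderings of the nine tuples and keeping the highest-scoring one is, in miniature, the exercise described above.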


I then looked at two lineups that were close to what I predict the Giants will run; they differ only in whether the Giants play Theriot or Crawford.  The lineup Pagan, Scutaro, Cabrera, Posey, Sandoval, Pence, Belt, Theriot, Zito scored an average of 3.96 runs per game, while the same lineup but with Crawford batting for Theriot scored 4.00 runs per game on average.  So, it seems like about half the difference between my lineup and the one with Theriot just comes from the fact that Crawford is better than Theriot.

Anyway, here's to hoping the Giants will do something smart.

_________________________________________________________________________________
*: For Pence and Scutaro I used their pre-Giants numbers.

**: This should be taken with a grain of salt--to actually find the best would take days of simulation; treat this as a lineup that is pretty close to the best.

***: FWIW, the ten best starting lineups according to the simulator (with the same caveat as **), with each lineup's average runs scored per game, in ascending order:

Posey, Cabrera, Belt, Scutaro, Pence, Sandoval, Pagan, Crawford, Zito -- 4.015
Belt, Posey, Sandoval, Cabrera, Pence, Pagan, Crawford, Zito, Scutaro -- 4.017
Cabrera, Posey, Belt, Scutaro, Sandoval, Crawford, Pagan, Pence, Zito -- 4.018
Belt, Posey, Sandoval, Cabrera, Pence, Crawford, Pagan, Scutaro, Zito -- 4.022
Cabrera, Belt, Scutaro, Sandoval, Pagan, Pence, Posey, Crawford, Zito -- 4.022
Pence, Scutaro, Posey, Sandoval, Belt, Cabrera, Pagan, Crawford, Zito -- 4.025
Belt, Pence, Sandoval, Pagan, Cabrera, Crawford, Scutaro, Posey, Zito -- 4.032
Pence, Pagan, Belt, Cabrera, Posey, Sandoval, Scutaro, Crawford, Zito -- 4.032
Scutaro, Sandoval, Belt, Cabrera, Posey, Pence, Crawford, Pagan, Zito -- 4.035
Posey, Pagan, Pence, Belt, Cabrera, Scutaro, Sandoval, Crawford, Zito -- 4.037

Second contest ends tonight (Monday night)

Last week I announced the second contest, to solve this intimidating puzzle:




Details are at the first link.  So far no one has fully solved it, so send in your partial solutions--they'll probably make it to the top three.

The contest ends tonight (Monday night) at 11:59 pm.  I'll announce the third contest sometime Tuesday.

Friday, August 10, 2012

Swing Vote!


Hey everyone. Sam made a post a couple weeks ago analyzing votes (and money) for congressional elections. This is a post about analyzing votes for US Presidential elections. It's the result of a 2am conversation-calculation with Sam and a mutual friend, Gary Wang. Also, a lot of the intermediate numbers in the result come from the website FiveThirtyEight.

The question we're going to try to answer is the following. You are Joe Smith from Iowa (or, insert your favorite state here), who decides to stay home on November 6, election day. What's the probability you wake up the next morning, open up Google News, and feel really stupid? Or more succinctly, what's the probability that a single vote in Iowa will make the difference in November?
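A sketch of the simplest version of that calculation (not the FiveThirtyEight-based model we actually used): if each of the other n voters independently votes for candidate A with probability p, your vote is decisive exactly when the rest split evenly, and for p near one half a normal approximation gives a clean closed form.

```python
from math import comb, exp, pi, sqrt

def p_decisive(n_voters, p):
    """Probability the other n_voters (n even) split exactly evenly, so
    one extra vote decides the state.  Toy model: each voter is an
    independent coin flip with probability p of voting for candidate A."""
    if n_voters % 2:
        raise ValueError("need an even number of other voters")
    k = n_voters // 2
    return comb(n_voters, k) * p**k * (1 - p) ** k

def p_decisive_normal(n_voters, p):
    """Normal approximation to the same quantity, valid for p near 1/2:
    sqrt(2 / (pi * n)) * exp(-2 * n * (p - 1/2)^2).  Shows how fast the
    probability dies off as the state tilts away from a coin flip."""
    return sqrt(2 / (pi * n_voters)) * exp(-2 * n_voters * (p - 0.5) ** 2)
```

Even a 55/45 lean crushes the tie probability relative to a true toss-up, which is why only swing states matter for this question.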

Thursday, August 9, 2012

Traditionball: the most unenlightened area of baseball strategy

About ten years ago, baseball started to undergo a statistical revolution: youth became valued, OPS was born, and walks finally got the credit they deserved.  Fast forward a decade and OPS is now a mainstream stat, multiple sites are constructing competing ways to summarize the total value of a player, and even in baseball clubhouses sabermetrics are the new cool kid on the block.

But there are still a few areas of baseball strategy stuck in the dark ages of gut instincts and wild speculation, and chief among them is use of pitchers.

Right now in baseball there are three types of pitchers: starters, relievers, and closers.  Starters pitch the start of the game, stay in for at least five innings, and are eventually taken out; they pitch every five days.  Closers come in in the ninth inning with a lead of between one and three runs; they never pitch more than an inning, and never come in otherwise.  Middle relievers pitch in between starters and closers.

These roles bear an uncanny resemblance to two of the stupidest pitching statistics, wins and saves.

This system is, of course, not close to optimal.  Frequent pitching changes at the beginning of the game would allow a manager to get better matchups, keep pitchers fresher by stopping them from having to throw too many pitches in one day, and allow pitchers to throw however many pitches is best for them--not a bimodal distribution with peaks at fifteen and one hundred.

It would also give an NL team another advantage--they could always pinch hit for their pitchers (or at least as long as it wasn't a two out, none on situation).

I'll look at the first effects in a later post, but for now, how much would always pinch hitting help?

Well, first I found the number of runs scored by an average NL lineup from 2011 using Basim; it was 3.799.

Then, I substituted the average substitute player for the league in for the ninth spot in the lineup; Basim then simulated it and found an average of 4.006 runs per game.

That's roughly a 3.2 win difference right there--0.207 extra runs per game over a 162-game season is about 34 runs, or roughly 3.2 wins at the usual ten-runs-per-win conversion--the difference between a .500 team and a .520 team.
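Basim itself isn't shown here, but the experiment is easy to sketch with a much cruder model, where every plate appearance is either a single or an out. The absolute run totals from a toy model like this won't match Basim's, but the direction of the pinch-hitting effect shows up the same way:

```python
import random

def sim_game(obps, rng):
    """Crude game model: each plate appearance is a single with
    probability equal to the batter's on-base average, otherwise an out.
    Singles advance every runner one base, and a run scores from third.
    The batting order carries over between innings, as in a real game."""
    runs, batter = 0, 0
    for _ in range(9):                        # nine innings
        outs, bases = 0, [False, False, False]
        while outs < 3:
            if rng.random() < obps[batter % 9]:
                runs += bases[2]              # runner on third scores
                bases = [True, bases[0], bases[1]]
            else:
                outs += 1
            batter += 1
    return runs

def avg_runs(obps, n_games=20_000, seed=0):
    """Average runs per game for a lineup of nine on-base averages."""
    rng = random.Random(seed)
    return sum(sim_game(obps, rng) for _ in range(n_games)) / n_games
```

Comparing a lineup with a pitcher batting ninth (say .150) against the same lineup with a replacement-level pinch hitter (.310) reproduces the qualitative result: the pinch-hitting lineup scores more per game.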


It's true, of course, that implementing such a system could incite a revolt from pitchers--but it seems like there is too much to be gained for it to be worth ignoring as a manager.

Tuesday, August 7, 2012

Utilitarianism, part 6: To do, or not to do

This is the sixth post in a series about utilitarianism.  For an introduction, see the first post.  For a look at total vs. average utilitarianism, see here.  For a discussion of act vs. rule, hedonistic vs two level, and classical vs. negative utilitarianism, see here.  For a response to the utility monster and repugnant conclusion, see here.  And for a look at whether to count lives not yet in being, see here.


Also, note that I'm now putting page breaks in the middle of my posts so that you can see more than one on the front page...

-------------------------------------------------------------------------




I'm going to start off by making a note about something slightly different from the content of this post.  Earlier, I defined a philosophy as a preference ordering on all possible universes; the ordering had to be transitive, reflexive, etc.  Basically, a philosophy is something that compares any two possible universes; in other words, it tells you which options are the best (if you have complete information, that is).

Perhaps for you a philosophy is something different.  Maybe it's something that compares some situations but doesn't say anything about other comparisons.  Maybe it's a binary function that calls all actions either morally permissible or impermissible.  Maybe it's a framework for looking at actions that doesn't necessarily tell you which are best, but instead reveals some other difficult to define properties of them.  Probably it's a mechanism to justify your current way of life.  Anyway, if you don't think a philosophy should be a preference ordering on possible universes, there's probably very little I can do to convince you, just as there's probably little I can do if you think faith is more important than evidence, or that gut instincts are more important than statistics in baseball.  But from now on I am using that definition, and will look critically upon philosophies that fail to create a preference ordering.
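To make the definition concrete: a utility function on universes automatically induces a complete, transitive ordering, while an arbitrary pairwise "preference" need not be transitive at all. A small illustrative sketch (the universes and comparators here are made up):

```python
from itertools import permutations

def induced_ordering(universes, utility):
    """A utilitarian 'philosophy' in the post's sense: a utility function
    on universes induces a complete, transitive preference ordering."""
    return sorted(universes, key=utility)

def is_transitive(items, prefers):
    """Check an arbitrary pairwise comparator for transitivity:
    if a is preferred to b and b to c, a must be preferred to c."""
    return all(not (prefers(a, b) and prefers(b, c)) or prefers(a, c)
               for a, b, c in permutations(items, 3))
```

A numeric comparison passes the check; a rock-paper-scissors style cycle fails it, which is exactly the kind of "philosophy" the definition rules out.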


Act and Omission


Anyway, there is a large debate in philosophy about whether taking an action should be treated asymmetrically from failing to take an action--the act/omission distinction.  There are many phrasings of the problem, but here is one of the more famous ones: the trolley problem.  The trolley problem is a thought experiment in which you, the actor, are near some trolley tracks.  The tracks split, and past the split there are currently five people tied to one of the tracks, and three tied to the other.  You're standing next to the lever which controls which path the trolley takes; in the first version of the problem the lever is currently set so that the three people will die, and in the second version of the problem the lever is currently set so that the trolley will run over and kill the five people.  A trolley is coming.  You have time to pull the lever, if you want, but not to untie any of the people.  In the first version it's pretty clear you don't pull the lever--not only would you be causing the death of five people, but you'd only be saving three by doing it.  But how about the second version?  Do you pull the lever and switch the trolley, killing three other people, or do nothing and let the five people die?  That is, do you act, or not?  And should morality treat the omission of action, which results in two extra deaths, the same way it would treat the action of killing two people?  In other words, are these two scenarios the same?  Does it matter which way the lever is currently pointing?

Contest Number Two: Two Degrees of Separation

Last week I introduced the contest of the week: each week I will propose a contest, and award Shadow-points to the winners; every two months the person with the most Shadow-points (in the period) will get their name posted on the side of the blog, a $2 reward, and the chance to write any article they want for the blog.  The results of the first week's contest are here.

--------------------------------------------------------------


I'm going to try to alternate types of contests, so this week's will be a little bit different from last week's.  This week, the contest is a word puzzle of sorts.  The first person to solve it will get first place, second person second place, etc.

Without further ado, here's the puzzle:

Two Degrees of Separation*


Results from the First Contest

A week ago I proposed a contest for readers to construct the best possible lineup from the 2002 Giants roster; the rules are here and here.  The winner of the contest gets 3 Shadow-points; second place gets 2; and third place gets 1.  In addition everyone whose entry beats my own entry gets an additional Shadow-point.  Every two months the person with the most Shadow points gets their name on in the Shadow Hall of Fame, $2, and the ability to write any one article for the blog.

-----------------------------------------------


Today, I'm announcing the winners of the contest.  The average submission scored somewhere around 4.9 runs per game, and the best scored above 5.  I have more comments to make, but without further ado, the three winners:


Monday, August 6, 2012

Last Chance for Contest

A week ago, I introduced a contest to design the best lineup from the 2002 Giants roster; details are here and here.  Submissions for the contest are due tonight, so if you want to participate but haven't given me a lineup, either post it as a comment here or email it to me by 11:59 tonight.


I'll announce the results of the contests sometime tomorrow.

Also, I've run some more simulations with Basim on the 2000-2011 seasons; it looks like the correlation between RAA* and runs scored by a team is .963; without very many simulations (20,000 per player) the correlation between eWAA and runs scored is .952, but that number should go up with more simulations as the noise goes down (due to limited computing power it's taking a while to get a fuller result).
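For reference, the correlation being reported here is ordinary Pearson correlation between each team's stat total and its actual runs scored. A minimal stdlib version (the function name is just for illustration):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series,
    e.g. team RAA (or eWAA) totals versus actual team runs scored."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A value of .963 versus .952 is a small gap, which is why it's plausible that the eWAA number is mostly simulation noise and will rise as the per-player sample grows.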


Also, if anyone has a suggestion for what I should write on (baseball, philosophy, or anything else), let me know.



_________________________________________________________________________________

*: RAA, runs above average, is the baseline offensive stat used to construct WAR; I'm using a modified version that removes ballpark advantage, etc. to do an apples-to-apples comparison.
