Thursday, July 19, 2007
Not Necessarily the Doomsday Clock
by Tom Bozzo
Just going to show what happens when you drop off even the post-paywall NYT op-ed page, reaction to John Tierney's report that we have 46 years to colonize Mars Or Else Civilization is Dooooomed has been relatively muted over the Intertubes. Prof. Bainbridge quotes the Ole Perfesser without comment (see Roy at Alicublog for the omitted analysis) but also Charlie Stross's excellent post on the grim case for space colonization.
So how do you get that 46 years?
Suppose you're observing an Event that occurs during a fixed time interval (potentially a strong assumption). Suppose also that Baldrick is flying your space-time conveyance and drops you at a random point in the interval (potentially a very strong assumption). Suppose third that you know nothing else about the event. Your "best guess" as to where you've landed, in the expected value sense, is the midpoint of the interval. So if you then get your bearings and figure out how long ago the event started, which is all the information you have, your best guess is that the event will end the same amount of time in the future. That isn't a very good guess, though, in the sense that there's a 50% chance that the "true" end will come sooner than that and a 50% chance that it will come later.
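For the skeptical, here is a minimal simulation sketch of that 50/50 claim. It assumes exactly what the paragraph above assumes (a fixed interval and a uniformly random drop-off point); the function name, trial count, and seed are mine and purely illustrative.

```python
import random


def share_ending_later(n_trials=100_000, seed=0):
    """Estimate how often the remaining time exceeds the elapsed time.

    Assumes a fixed interval of length 1 (arbitrary units) and an arrival
    point drawn uniformly at random within it.
    """
    rng = random.Random(seed)
    later = 0
    for _ in range(n_trials):
        elapsed = rng.random()       # time since the event started
        remaining = 1.0 - elapsed    # time until the event ends
        if remaining > elapsed:
            later += 1
    return later / n_trials


share = share_ending_later()
print(f"Share of trials where the end comes later than the guess: {share:.3f}")
```

The share should hover around one half, which is the sense in which "same amount of time again" is a median guess and not much more.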
Apply this to the human spaceflight program, dated to 1961 (questionable [*]): by advanced mathematics, about 46 years have elapsed since then, and the information you have plus the assumptions above lead to the result. QED.
What the astrophysicist J. Richard Gott did, in a short paper, was to construct interval estimates with high confidence levels -- statements that the unknown end date for the event should fall between A and B 95 percent of the time. For the spaceflight case, A is 2008 (next year) and B is AD 3,801. But saying that you're 97.5 percent confident that the human spaceflight program will end in the next 1,800 years or so doesn't have the same sense of urgency. More generally, 95 percent confidence results in a range from 1/39th the age of the event on the low side to 39 times the age of the event on the high side. Call this the "Copernican formula" if you will. The proof methodology (see this paper [PDF], helpfully linked by Tierney) uses only undergraduate-level mathematical statistics, so read it yourself if you're so inclined.
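A hedged sketch of where the 39 comes from, as I understand the argument: if the fraction of the interval already elapsed is uniform on (0, 1), then with confidence c it lies between (1 - c)/2 and (1 + c)/2, which puts the remaining time between (1 - c)/(1 + c) and (1 + c)/(1 - c) times the elapsed time. The function name `gott_interval` is my own label, not anything from the paper.

```python
from fractions import Fraction


def gott_interval(t_past, confidence=Fraction(19, 20)):
    """Return (low, high) bounds on remaining duration, in the units of t_past.

    Assumes the observation point is uniformly distributed over the event's
    total lifetime. Pass the confidence as a Fraction to keep the
    arithmetic exact.
    """
    c = Fraction(confidence)
    low_factor = (1 - c) / (1 + c)    # 1/39 when c = 0.95
    high_factor = (1 + c) / (1 - c)   # 39 when c = 0.95
    return t_past * low_factor, t_past * high_factor


# Human spaceflight: about 46 years elapsed as of 2007.
low, high = gott_interval(46)
print(f"Remaining: between {float(low):.2f} and {float(high):.0f} years")
# Upper bound: 46 * 39 = 1794 years, i.e. AD 3801 counting from 2007.
```

Each tail outside the interval carries 2.5 percent, which is why the one-sided statement about the upper bound comes with 97.5 percent confidence.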
This leads me to strongly endorse John Quiggin's conclusion:
The real lesson from Bayesian inference is that, with little or no sample data, even limited prior information will have a big influence on the posterior distribution. That is, if you are dealing with the kinds of cases Gott is talking about, you’re better off thinking about the problem than relying on an almost valueless statistical inference.
Indeed, if observing the passage of a year and nothing else, the upper bound of the interval moves out 39 years. That can be a big deal in many applied circumstances! For example, here's Gott himself writing in the New Scientist in 1997. A subhead of "Living proof" suggests he isn't engaged in deliberate leg-pulling as he recounts:
As another test, I used my formula on the day my "Nature" paper was published to predict the future longevities of the 44 Broadway and off-Broadway plays and musicals then running in New York; 36 have now closed - all in agreement with the predictions. The "Will Rogers Follies", which had been open for 757 days, closed after another 101 days, and the "Kiss of the Spider Woman", open for 24 days, closed after another 765 days. In each case the future longevity was within a factor of 39 of the past longevity, as predicted.
In this application, a prediction within a factor of 39 of past longevity conceivably covers the range from total flops to huge hits to productions that will eventually be performed by automata in Wisconsin Dells. The "prediction" for the "Will Rogers Follies" is that it will (likely) close within the next 82 years. That's out on a limb. (And certain philosophers inclined to bash social scientists for theories with weak predictive value might put this in their pipe and smoke it.)
Reinforcing Quiggin's point on how posterior distributions may be influenced, had Gott's paper appeared a week earlier, he'd have missed on "Kiss" to the tune of 100 days, since the previous week of running time adds some 9 months to the prediction's upper bound. If it matters whether the production runs another week or another year, searching for information is not unlikely to be rewarded.
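The arithmetic behind the "a week earlier" counterfactual can be checked in a few lines. This is only a sketch using the day counts Gott quotes for the two shows; the helper `within_factor` and the variable names are hypothetical, and the week-earlier figures simply shift seven days from "past" to "future".

```python
def within_factor(past_days, future_days, factor=39):
    """True if the future run is within `factor` of the past run, either way."""
    return past_days / factor <= future_days <= past_days * factor


# As published: both shows land inside the factor-of-39 band.
will_ok = within_factor(757, 101)
kiss_ok = within_factor(24, 765)

# Counterfactual: the same prediction made one week earlier for "Kiss".
past_early = 24 - 7                       # 17 days open
future_early = 765 + 7                    # 772 days still to run
upper_early = past_early * 39             # 663-day upper bound
miss_days = future_early - upper_early    # overshoot past the bound
upper_shift = 7 * 39                      # what one week adds to the bound

print(will_ok, kiss_ok, within_factor(past_early, future_early))
print(f"Miss: {miss_days} days; one week moves the bound by {upper_shift} days")
```

The overshoot works out to 109 days and the week's worth of bound to 273 days, which squares with "to the tune of 100 days" and "some 9 months".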
Meanwhile, if you wanted to make some inference on whether both "Kiss" and "Will" would be playing at some future date, forget about it. The most interesting contribution comes from Brian Weatherson (at CT and Thoughts, Arguments, and Rants), who derives a neat result showing that if you infer the probability of both plays running at a future date based solely on the length of time they've run together (the information the method admits), it follows that if "Kiss" (the shorter-duration event) is still playing at that date, then "Will" (the longer-running event) will also be playing with probability 1. Weatherson concludes that there must be "something deeply mistaken with the Copernican formula."
My own little gloss, pending peer review in the self-correcting blogithingy, is here in the CT comments. What seems to be happening in this case is that Gott's method (1) throws away the information on how long "Will" has been running, and (2) sneaks in an additional assumption that the "Kiss" and "Will" events must be dependent or correlated. There may be circumstances under which these extremely strong assumptions are justified, and Weatherson maybe goes a bit too far, but they strike me as implying more than diffuse information on anything other than the elapsed times of the events.
Last, since you are by definition still with me here in the unlikely event you are reading this, here's the brief rant portion of the post: How the frack did Gott get 5 frackin' pages in Nature for this, which looks a lot more like it merits a paragraph of Mathematical News of the Weird?! I've been turned down cold (not even this 'reject and resubmit' stuff Drek writes about) for stuff a hundred times harder and at least somewhat more relevant, if I don't say so myself. (If you really have time to kill, you may note that part of what I'm talking about eventually came out via other researchers' efforts as part of this IIASA working paper a few years later.) And if that's happened to me, then so too must it have happened to everyone except Nick Bostrom, as I infer from Tom's Anti-Copernican Principle. W. T. F.
And BTW, Nature, what's up with US$30 for an e-print? Surely the revenue-maximizing price — which given the approximately $0 marginal cost, is also profit-maximizing — is not set at levels that make the likes of me think about sending junior staff to the library (were there a business case for actually obtaining the paper). Just saying.
[/rant]
[*] It's not like Yuri Gagarin's rocket just materialized on the pad and blasted off. And remember, going back even a few years into the preflight stages of human space programs puts a century or two on the upper bound.
Labels: Philosophy, Science, Social Science, Statistics