So I repeated the experiment with a younger group, taking the 1970 rosters and looking at guys born in 1944 or 1945. Now the survivors are between 64 and 66 years old, so we should expect the number of survivors to go up and the number of dead to go down, and these random samples meet those common-sense expectations. 9 of the 100 pro football players on the random list are dead, while 13 of the 100 major leaguers are gone.
If we do a chi-square significance test, the difference between 13 of 100 and 9 of 100 is not enough for us to say there is any real difference between the underlying populations. This could easily just be random variation. If I had taken the 100 older ballplayers from yesterday's work and looked at mortality exactly 10 years ago, 11 of that 100 would have been gone. Only 10 of the older football players from yesterday were gone as of March 2000. In other words, we have several samples that say about 10% of 25-year-olds don't make it to 65.
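For anyone who wants to check the arithmetic, here is a minimal sketch of that chi-square test in plain Python. It assumes a standard Pearson test on a 2x2 table (dead vs. alive, baseball vs. football) with no continuity correction; the function name `chi_square_2x2` is just an illustrative label, not something from a stats package.

```python
import math

def chi_square_2x2(dead_a, n_a, dead_b, n_b):
    """Pearson chi-square test (no continuity correction) for two groups."""
    observed = [[dead_a, n_a - dead_a], [dead_b, n_b - dead_b]]
    row_totals = [n_a, n_b]
    total = n_a + n_b
    col_totals = [dead_a + dead_b, total - dead_a - dead_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if both groups shared one underlying death rate.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# 13 of 100 major leaguers dead vs. 9 of 100 pro football players dead.
chi2, p = chi_square_2x2(13, 100, 9, 100)
print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
```

The statistic comes out to about 0.82, and the p-value is far above the usual 0.05 cutoff, which is what "could easily just be random variation" looks like in numbers.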
Here's the thing. If instead we work with the period life tables conveniently provided by the Social Security Administration, we see that out of 100 American males in the 24-26 age range in 1960, we would expect 37 not to survive to 2010. If we took a similar sample from 1970, we would expect about 16 not to survive to 2010. In other words, both baseball players and football players show greater longevity than their peers in everyday life.
Are these differences statistically significant? At a sample size of n=100, no. But the same proportions in a larger sample would be. That's a problem with statistics, and some folks like Dr. Deming considered it a major flaw. But whether the differences are significant or not, this data indicates that athletes actually have greater longevity on average.
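To see how sample size alone flips the answer, here is a rough sketch under some loud assumptions: it treats the life-table figure of about 16 deaths per 100 for the 1970 group as a fixed, known rate, uses a simple one-sample z-test (normal approximation), and the n=1000 rows are purely hypothetical, scaling the observed 13% and 9% up by ten. Nobody actually sampled 1000 players.

```python
import math

def proportion_z_test(deaths, n, expected_rate):
    """Two-sided one-sample z-test of an observed death rate vs. a fixed rate."""
    observed_rate = deaths / n
    std_err = math.sqrt(expected_rate * (1 - expected_rate) / n)
    z = (observed_rate - expected_rate) / std_err
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumption: about 16 of 100 from the 1970 group, per the life tables.
LIFE_TABLE_RATE = 0.16

results = {}
for scale in (1, 10):  # scale=10 is a hypothetical larger sample, not real data
    n = 100 * scale
    for label, deaths in (("baseball", 13 * scale), ("football", 9 * scale)):
        z, p = proportion_z_test(deaths, n, LIFE_TABLE_RATE)
        results[(label, n)] = p
        print(f"{label:8s} n={n:4d}  z={z:6.2f}  p={p:.4f}")
```

At n=100 neither group gets under the usual 0.05 cutoff (football comes close), while the identical proportions at n=1000 both do. Same death rates, different verdict, which is the complaint about significance testing in a nutshell.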
Could my data have flaws? Yes. My sampling method might not be random enough, and it could be that Wikipedia, Baseball-Reference.com, and nfl.com have missed some obituary notices, which would mean I incorrectly numbered some dead ballplayers among the living. But the overall message is this. In this case, the news has taken completely bogus numbers to argue for a solution to a problem that doesn't exist. No matter what your political persuasion, you have to believe that this isn't the first time.