Saturday, February 19, 2011

Galaxy Zoo


I think it was a Science Friday broadcast about two months ago that reminded me of Galaxy Zoo. I had heard of it before, but this time I logged on to http://www.galaxyzoo.org/ to see what it was all about.

The premise of the site is to get some help for astronomers tasked with classifying distant galaxies. There are something like 200 billion galaxies in the universe, far too many for the professional astronomers doing the work to go through in a lifetime. They developed programs to sort the galaxies, but found that a computer is not good at distinguishing a disk from a sphere or recognizing a disk seen edge-on. Humans are very good at this. So the images were split up by a computer and fed to a site where anyone can sign on to help. This is known as crowdsourcing, where volunteer citizen scientists help perform a long task. Right now, there are a quarter million volunteers helping with the work. Unfortunately, that still works out to 800,000 galaxies for each volunteer to classify. The work would still not get done in our lifetime at this rate.
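The per-volunteer figure is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: the galaxy and volunteer counts are the rough numbers quoted above, and the pace of 10 classifications per day is a made-up assumption, not anything Galaxy Zoo has published.

```python
# Rough inputs taken from the paragraph above; neither is a measured value.
GALAXIES = 200_000_000_000   # "something like 200 billion galaxies"
VOLUNTEERS = 250_000         # "a quarter million volunteers"


def galaxies_per_volunteer(galaxies: int, volunteers: int) -> float:
    """Share of the workload if it is split evenly among volunteers."""
    return galaxies / volunteers


def years_to_finish(share: float, per_day: float) -> float:
    """Years one volunteer needs to clear their share at a steady daily pace."""
    return share / per_day / 365


share = galaxies_per_volunteer(GALAXIES, VOLUNTEERS)
print(f"{share:,.0f} galaxies per volunteer")  # prints "800,000 galaxies per volunteer"

# Hypothetical pace: 10 classifications per volunteer, every single day.
print(f"about {years_to_finish(share, per_day=10):.0f} years at 10 per day")
```

At that assumed pace the job takes a couple of centuries, so the conclusion only changes if volunteers classify far faster or there are far more of them.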

We have had so many exploratory space probes, generating so much data over the last 40 years. The average person probably assumes that all this data has been thoroughly analyzed by teams of diligent scientists. The reality is that the information is beyond the ability of scientists to study. It's a simple matter of time constraints. Can you thoroughly catalog all of your personal photographs? Most people probably do not take the time to do this, and that task is simple compared to analyzing the photos from a single fly-by of a satellite.

I used to watch Star Trek and wonder why they wouldn't have everything figured out 300 years from now. They are always flying by some red dwarf star or some nebula and stopping to explore like it's a wonderful new thing. I remember thinking, "Surely they've seen this before?" It makes sense if you think about it. I have been paying attention to distance and speed in the various Star Trek series and have concluded that Warp 9 is not 9 times the speed of light. It's something much faster. There seems to be an exponential effect to warp factors, because they zip along doing 8 light years in a matter of hours. Even at these impressive rates, 300 years from now, we would still have barely scratched the surface of one quadrant of the galaxy. That's just our own galaxy, which stretches 100,000 light years across and contains 200 billion stars (give or take a hundred billion; we don't even know).
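To put a rough number on "much faster," here is a small sketch of the arithmetic. The 8 light years comes from the paragraph above; the trip durations of 4, 8, and 12 hours are only guesses at what "a matter of hours" might mean, and the function name is made up for this illustration.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours in a Julian year


def multiple_of_light_speed(distance_ly: float, travel_hours: float) -> float:
    """Average speed, as a multiple of the speed of light, for a given trip.

    Light needs distance_ly years to cover distance_ly light years, so the
    ratio of light's travel time to the ship's travel time is the speed-up.
    """
    return distance_ly * HOURS_PER_YEAR / travel_hours


# Hypothetical readings of "a matter of hours" for an 8 light year trip.
for hours in (4, 8, 12):
    factor = multiple_of_light_speed(8, hours)
    print(f"{hours} hours -> about {factor:,.0f} times the speed of light")
```

Whichever guess is closest, the answer lands in the thousands, nowhere near 9 times the speed of light.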

In the past, organizations like SETI (Search for Extra-Terrestrial Intelligence) have come up with ways that you can donate your computer's processor time to help crunch huge numbers. I believe they were analyzing huge amounts of radio spectrum looking for patterns that might be "man made". These efforts used distributed processing to run a large virtual supercomputer, but I have not heard that they have had much success. They certainly have not found an intelligent signal, but I would also count analyzing all the available data as success, and I'm not sure they've reached that milestone, either.

I've heard of crowdsourcing projects that search for habitable planets, supernova remnants in nebulae, surface features on planets and satellites in our solar system, and protein structures in living organisms. There is a lot of knowledge out there to ponder.

Last week, IBM put its new supercomputer called Watson on the game show Jeopardy to see if it could beat the two best Jeopardy contestants in history. It did so quickly and easily. This was seen as a huge challenge for computing, because answering the questions is not a simple database search, but an interpretation with some tricky aspects to it. Not all Jeopardy questions are straightforward; some involve word play and subtle twists. While the computer did admirably, the IBM team confessed that sometimes it would just miss the point completely. I witnessed this in the Final Jeopardy question on the second day. It was a question about airports being named after WWII battles and soldiers, and I knew it immediately (only because I stopped and read the historic plaque in O'Hare during one of several incredibly long layovers there). They said that Watson's strength is the ability to go over huge amounts of information in multiple databases, but it doesn't always make good sense of it.

One of the possible applications the IBM team suggested for Watson was to point things out and make suggestions that a human user or team could quickly sort through and discard. They used the example of putting a patient's symptoms into the computer and having it list several possibilities. The computer could search through all the recent medical papers and all the old archives and do what amounts to an enormous amount of research very quickly. It would take a human forever to read all the medical papers and keep up to date. However, if the computer did that portion of the job for us, we could concentrate on those few possibilities that have a high probability of yielding good results, thus focusing our time and making us more efficient. We can see patterns and connections that the computer doesn't understand.

Our curiosity and need to improve our lives will always push us to learn about all the unknowns. I find it encouraging that we are still finding new ways to think and learn how to learn.
