Archive by Author | anilseth

Update on Andromeda Project Analysis, Part I

All of us here at the Andromeda Project are really excited to re-launch the project for a second round of classification in mid-October. While we’re busily getting everything ready for you to look at, I thought we could provide you with an update on what we’ve been doing to analyze your work since last December.

The Basics — From Clicks to Clusters

The primary goal of the Andromeda Project is to find star clusters big and small. Our hard-working Round 1 participants looked at over 1 million images, clicking on all the clusters they found. This meant at least 80 of you looked at each of the 12,425 Round 1 images. Matt Wallace, a University of Utah undergraduate student who has joined the Andromeda Project Science Team, found looking at your image clicks a mesmerizing process and made this movie, in which the drawings are color-coded by the type of object (cluster=white, galaxy=green):

The simplest way we can translate these clicks into real clusters is to look at the fraction of people that called an object a cluster. If 72 of the 80 people that looked at an image circled the same object, then that object has a “ClusterFrac” of 0.90 or 90%, while if only 8 of the 80 people clicked on something, its ClusterFrac is 0.10. We can then find clusters by choosing a threshold ClusterFrac (e.g. 0.35): by picking only objects above this threshold we get mostly real clusters without including too many objects that aren’t clusters.
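To make the arithmetic concrete, here is a minimal Python sketch of the ClusterFrac calculation and threshold cut described above. The function names, the object labels, and the choice to keep objects exactly at the threshold are assumptions made for illustration; this is not the science team's actual pipeline.

```python
def cluster_frac(n_clicked, n_viewed):
    """Fraction of the people who viewed an image that marked this object."""
    if n_viewed <= 0:
        raise ValueError("n_viewed must be positive")
    return n_clicked / n_viewed


def select_clusters(candidates, threshold=0.35):
    """Return names of candidates whose ClusterFrac is at or above the threshold.

    candidates maps a (hypothetical) object name to (n_clicked, n_viewed).
    """
    return [
        name
        for name, (n_clicked, n_viewed) in candidates.items()
        if cluster_frac(n_clicked, n_viewed) >= threshold
    ]


# The first two entries use the numbers from the text; obj_c is made up.
candidates = {"obj_a": (72, 80), "obj_b": (8, 80), "obj_c": (40, 80)}
selected = select_clusters(candidates)
print(selected)
```

With the example threshold of 0.35, the object marked by 72 of 80 people is kept and the one marked by 8 of 80 is dropped.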

Our Testing Ground — the Year 1 Sample

How do we know if we’re finding the clusters that we want to find? One reference point is the “Year 1” cluster sample, published in Johnson et al. 2012. This sample is based on about 20% of the Andromeda Project images, which a group of professional astronomers looked through to create our initial cluster catalog of 601 good star clusters, as well as a catalog of galaxies and other non-cluster objects. We can use the fraction of these Year 1 clusters that were found to assess the completeness of the Andromeda Project cluster sample. A completeness value of 0.90 means that 90% of these Year 1 clusters were found by Andromeda Project users. The lower we make the ClusterFrac threshold, the higher the resulting completeness. We can also look at all the other objects that were found that might be contaminants; these include previously classified galaxies and objects we previously decided weren’t clusters, as well as objects that were not identified by professional astronomers during the Year 1 search (at least some of which may be real clusters!).
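The completeness measure can be sketched in a few lines of Python. The ClusterFrac values below are made up for ten hypothetical reference clusters (the real Year 1 catalog has 601), and the function name is an assumption for illustration only.

```python
def completeness(reference_fracs, threshold):
    """Fraction of reference clusters whose ClusterFrac meets the threshold."""
    if not reference_fracs:
        raise ValueError("need at least one reference cluster")
    found = sum(1 for frac in reference_fracs if frac >= threshold)
    return found / len(reference_fracs)


# Made-up ClusterFrac values for ten hypothetical reference clusters.
year1_fracs = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.45, 0.40, 0.36, 0.20]

for threshold in (0.35, 0.50):
    print(threshold, completeness(year1_fracs, threshold))
```

Lowering the threshold can only add reference clusters to the "found" set, which is why completeness never decreases as the threshold drops.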

The plot below summarizes how we make the comparison between the Andromeda Project data and the Year 1 cluster sample.

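[Figure: ClusterFrac threshold (top panel) and fraction of possible contaminants (bottom panel) as a function of completeness of the Year 1 cluster sample]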

The top panel of this plot shows the ClusterFrac threshold required to get the completeness shown on the horizontal axis. For instance, to achieve 90% completeness of the Year 1 cluster catalog, we need to use a ClusterFrac threshold of 0.35 (i.e. where at least 35% of people clicked on the candidates). In the bottom panel, we can look at the number of possible non-cluster contaminants we pick up along with the good clusters. For instance, for 90% completeness in the Year 1 sample, we find that a few percent of the objects in the sample are known galaxies, 7-8% are objects we previously decided were not clusters, and a similar number are objects that were previously unidentified. On the other hand, if we use a ClusterFrac threshold of 0.5, contaminants make up <10% of the sample, but we only include ~75% of the Year 1 clusters.

Our goal is to analyze the data in a way that maximizes the completeness and minimizes the number of contaminants. As part of this effort we’re weighting users based on how well they did at identifying good clusters, and trying to determine what objects might be missed by Andromeda Project users. This may sound kind of critical, so we’d like to emphasize how awesome the data are. Regardless of how we analyze the data, the Andromeda Project will produce the largest and best characterized sample of clusters known in any galaxy! Thanks to you!
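As an illustration of the user weighting mentioned above, here is one simple way a weighted ClusterFrac could be computed in Python. The post does not say how the weights are derived or applied, so the weighting scheme, function name, user labels, and weight values here are all assumptions, not the team's actual method.

```python
def weighted_cluster_frac(votes, weights):
    """Weighted fraction of viewers who marked an object as a cluster.

    votes maps a (hypothetical) user name to True if that user clicked on
    the object; weights maps the same user names to non-negative weights.
    """
    total = sum(weights[user] for user in votes)
    if total <= 0:
        raise ValueError("total weight must be positive")
    clicked = sum(weights[user] for user, did_click in votes.items() if did_click)
    return clicked / total


# Four made-up users: two clicked on the object, two did not.
votes = {"user_a": True, "user_b": True, "user_c": False, "user_d": False}
# Made-up weights: a higher weight stands for a better track record.
weights = {"user_a": 2.0, "user_b": 1.0, "user_c": 0.5, "user_d": 0.5}

print(weighted_cluster_frac(votes, weights))
```

Unweighted, this object would have a ClusterFrac of 0.5; giving more weight to the users who clicked on it raises the value to 0.75.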


Welcome to the Andromeda Project!

We’re very excited you’ve taken the time to visit the new Andromeda Project site!  With your help, we’re going to identify the largest sample of star clusters known in any spiral galaxy, including our own Milky Way.  We will use the clusters you find to study the history of Andromeda and to better understand how stars form.


The beautiful images you will be looking at come from the Hubble Space Telescope.  Since 2010, Hubble has spent nearly two months of time looking at Andromeda as part of the Panchromatic Hubble Andromeda Treasury (PHAT) survey.  For the Andromeda Project, we’ve broken these pictures up into more than 9,000 separate images; there are more than 3 billion pixels that we need you to help us examine!

Star clusters are groups of hundreds to millions of stars that are all born together. This means that their ages are (relatively) easy to determine, and the first thing we do after you find the clusters will be to measure their ages and masses. Once their ages and masses are known, we can use these clusters to mark major formation epochs in the galaxy’s history. We will also identify the youngest star clusters, which we can use to test theories of star formation.

In addition to star clusters, you’ll also be identifying background galaxies and image artifacts.   We will use the galaxies to study the gas and dust within Andromeda; Galaxy Zoo veteran Bill Keel will have a blog post about this soon.

We’d like to thank our wonderful beta testers who helped make this site more usable. If you have any questions or comments, take a look at the About and Guide sections, or create a post in the Talk section.

Good luck, cluster hunters!

Anil Seth is an assistant professor at the University of Utah’s Physics & Astronomy department (