
Tuesday, July 5, 2016

From Photons to Photos: Observational Astronomy with CCDs

Summer research season is underway for me! This year, I am working with Mike Brown on an observational astronomy project. I’ll be using some of the data collected on the Keck Telescope last winter to study Jupiter’s Trojan asteroids.

Observational astronomy is probably what most people imagine when they picture an astronomer’s work, although the exact image in mind might be a bit outdated. Today’s astronomers are more likely to be found behind a computer screen than behind the eyepiece of a telescope, even in work that isn’t strictly computational or theoretical. That’s because astronomy, like many aspects of modern life, has gone digital.

Astronomers first recorded their observations on paper, by hand, until the invention of photography. By the early twentieth century, groundbreaking discoveries, such as Edwin Hubble’s discovery of other galaxies and of the expansion of the universe, were being made with the assistance of photographic plates. As photography evolved, so did astronomy. Today, digital cameras capture images with sensors called CCDs, and so do most telescopes: astronomers affix these sensors to the foci of their telescopes in order to collect high-quality data.

How do CCDs capture astronomical images, assuming you have a telescope and a target for your observations? CCD stands for charge-coupled device, a name that gives a hint of how it works. A CCD is sectioned into pixels, or bins, and when exposed to light, the bins gain a charge. The charge on each bin is proportional to the amount of light it received: the more charge a bin has, the more light it was exposed to, and the brighter the area of the sky it observed. When the exposure is finished, the charge on each bin is read out and converted into a number, called a count. This transforms the image into an array of counts representing how much light was detected in each pixel of the CCD.

Arrays, simple lists of numbers, are very easy for computers to store, transfer, and manipulate, so they are a useful format for astronomical data. This kind of conversion from image to numbers just isn’t possible with sketches and photographic plates, and it opens up new possibilities for handling data. Some astronomers today work on “training” computers to perform automatic analyses of these arrays, so that basic tasks such as identifying variable stars or Kuiper Belt Objects can be accomplished quickly. Such programs are especially useful with the rise of large-scale digital sky surveys that produce enormous quantities of data on a nightly basis.


A small part of a typical array might look like this. While useful to a computer, it’s very difficult for a human brain to make sense of without some help. To understand what’s going on, we can rearrange the numbers a bit. To make things even clearer, we can map count values to colors. I’ll pick greyscale, so that we can keep in mind that more counts corresponds to more light.
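If you want to play with this yourself, here's a rough sketch of the idea in Python with NumPy and matplotlib. The count values below are invented for illustration, not taken from real data:

```python
import numpy as np
import matplotlib.pyplot as plt

# A made-up 5x5 patch of counts, standing in for a tiny piece of a CCD frame.
counts = np.array([
    [12, 15, 14, 13, 12],
    [13, 40, 95, 38, 14],
    [14, 98, 250, 99, 15],
    [12, 41, 97, 42, 13],
    [13, 14, 15, 14, 12],
])

# Map counts to greyscale: pixels with more counts (more light) appear brighter.
plt.imshow(counts, cmap="gray", origin="lower")
plt.colorbar(label="counts")
plt.show()
```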



Unfortunately, CCDs, as powerful and useful as they are, do introduce their own biases into the data, so our image doesn’t look very clean right now. This problem is easy to correct, as long as you are prepared to encounter it. The CCD-introduced bias can be fixed by taking two specific types of pictures, known as darks and flats, which act like a control in a scientific experiment.

The first type of control picture, the dark, is necessary because of thermal noise in the CCD chip. Thermal noise arises from heat in the sensor itself: thermal energy can free electrons in the pixels, mimicking the signal produced by incoming light. CCDs in telescopes are often cooled to low temperatures to reduce this noise, but it is present as long as the instrument is above absolute zero, so it cannot be eliminated completely. To combat this problem, astronomers prepare a dark, which is an exposure of the CCD in a completely lightless environment, a bit like taking a picture with a camera that still has the lens cap attached. This way, the CCD records only the thermal noise originating from the instrument itself. Here is what a dark might look like:



The second type of image, the flat, is an image of a flat field of uniform light, such as an evenly illuminated surface. Many astronomers take flats around sunset, when the evening sky is bright enough to wash out the background stars but not so bright that the sensor is overloaded. Since we know the image should be evenly lit, the flat field allows astronomers to pick up systematic defects in the CCD. Due to tiny imperfections in manufacturing, some pixels may be more or less sensitive than average, or the telescope optics might have imperfections that concentrate light in some areas of the image more than others. Flat field images let astronomers discover and correct for these effects. A typical flat might look like:



Now that we have our image, dark, and flat field, we can begin to process the data. First, we subtract the dark from the image of the object:



Then we divide that image by the flat field (normalized so its average value is one), giving a nice, clear picture of our target object:
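Here's a minimal sketch of that calibration in Python, assuming the raw image, dark, and flat have already been loaded as NumPy arrays of the same shape (and that the dark was taken with the same exposure time as the image). Real reduction pipelines handle more details than this:

```python
import numpy as np

def calibrate(raw, dark, flat):
    """Toy CCD calibration: remove the thermal signal, then correct pixel sensitivity."""
    # Subtract the dark frame to remove the thermal signal recorded by each pixel.
    dark_subtracted = raw - dark

    # Normalize the flat so its average value is 1, then divide it out to correct
    # for pixel-to-pixel sensitivity differences and uneven illumination.
    normalized_flat = flat / np.mean(flat)
    return dark_subtracted / normalized_flat

# calibrated = calibrate(raw_image, dark_frame, flat_frame)
```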



Now that we’ve done initial processing of the image to correct for bias, we can start to do more interesting analyses of the data. One very basic thing we can do is use this image to figure out how bright the object we are looking at is. We can sum up the counts that belong to the object to get a total brightness. In this case, the sum of the object counts is 550. But this number doesn’t mean very much on its own. The object might actually be quite dim but appear bright because a long exposure was taken, giving the CCD more time to collect light. Or, we could have taken a very short exposure of a very bright object. So, we need to find a reference star of known brightness in our image, and measure that. If we know how bright the object appears compared to the reference star in our images, and we know how bright the reference star is, we can infer the brightness of the object.
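As a toy example in Python, suppose our object sums to 550 counts, as above, and a reference star in the same frame sums to 5500 counts with a known magnitude of 12.0 (both reference numbers are invented for illustration):

```python
import numpy as np

# Invented numbers for illustration: the object sums to 550 counts (as above),
# while a reference star in the same frame sums to 5500 counts and is known
# to have magnitude 12.0.
object_counts = 550.0
reference_counts = 5500.0
reference_magnitude = 12.0

# Standard magnitude relation: every factor of 100 in brightness is 5 magnitudes,
# and fainter objects have larger magnitudes.
object_magnitude = reference_magnitude - 2.5 * np.log10(object_counts / reference_counts)
print(object_magnitude)  # 14.5 -- ten times fainter than the reference star
```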

If we have taken pictures of the same object in different filters, we can also create false-color images. Filters can be placed in the light path of the telescope in order to restrict which wavelengths reach the CCD and are counted, which lets astronomers choose which colors of light to record. To make a false color image, astronomers combine images from two or more different filters. Each separate image is assigned a color according to the filter it was taken in (perhaps blue for ultraviolet light, green for visible light, and red for infrared light), then the images are combined into one.
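A bare-bones version of this combination step might look like the following Python sketch, where the three input arrays stand in for images taken through infrared, visible, and ultraviolet filters:

```python
import numpy as np

def false_color(infrared, visible, ultraviolet):
    """Combine three single-filter images into one false-color RGB image."""
    def stretch(img):
        # Rescale each image to the 0-1 range so no single filter dominates.
        img = img.astype(float)
        return (img - img.min()) / (img.max() - img.min())

    # Red channel from the infrared image, green from visible, blue from ultraviolet.
    return np.dstack([stretch(infrared), stretch(visible), stretch(ultraviolet)])

# rgb = false_color(ir_image, vis_image, uv_image)
# plt.imshow(rgb) would then display the combined picture.
```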



False color images are useful because the color coding for each filter helps draw attention to important differences between the individual images while still allowing astronomers to see the structure of the object in many different filters at once. In planetary science, for instance, different colors in an image might reflect differences in the composition of a planet’s surface, revealing regions with strikingly different geological histories. Images can also be combined into “true color” pictures, using filters for different wavelengths of visible light, that closely mimic what astronomical objects would look like to human eyes. CCD technology has brought astronomy down to earth, quite literally, by producing images that reveal what the cosmos would look like, if only we could see it as well as our telescopes can.

Sunday, January 24, 2016

Blinky-Blinky: How to Discover Kuiper Belt Objects

The Remote Observing Facility doesn't look special from the outside. It's one of many featureless white doors on the maze-like first floor of Caltech's Cahill Center for Astronomy and Astrophysics. Tonight, though, it's the only door propped open, and from down the hallway, you can hear the voices of its occupants, busy setting up for the next twelve hours of work. When work begins for the night, it's 8 pm Pacific Time--that's 6 pm in Hawai'i, where the Keck Telescope is located.

Inside the windowless room, there are three digital clocks. The first gives the current Greenwich Mean Time, a number that is dutifully recorded at the beginning of each exposure of the telescope. Another gives the time in Hawai'i, keeping track of sunset, sunrise, and twilight on the distant island. Finally, the clock in the middle keeps track of the local time in Pasadena, the only real link to the rhythms of daily life outside of the strange limbo-like atmosphere of the office.

The first step is to turn on and calibrate the instruments and run through a series of checklists for the telescope. Under the row of clocks, a webcam whirs to life, and three panels on the monitor below it blink on. The first is dark--later, when observing starts, it connects us to the telescope operator. She sits at the summit of Mauna Kea, moves the telescope into position, and sets up guidance and tracking so that the telescope stays pointed in a fixed direction as the Earth rotates beneath it. The second shows the telescope technician, who is stationed at the base of the mountain and acts as IT support for the night. The last is an image of me and the other occupants of the ROF.

Observing officially begins at the end of astronomical twilight. The targets are potential Kuiper Belt Objects1, identified by another telescope on a previous night and picked out by a computer as likely candidates. The idea behind these detections is what researcher Mike Brown calls "blinky-blinky": observe a patch of sky at two different times, and see if anything has moved. Look a third time, just to make sure the apparent movement wasn't caused by random noise in the instrument. If the object is seen in three different places along a straight line, there's a good chance what you've found is real.

This is the same method Clyde Tombaugh used to discover Pluto, only he did it by literally blinking between two images. Today, Pluto-killer Mike Brown has autonomous programs to do it for him. For any given candidate object, the computer even spits out potential distances and orbits. Intuitively, this makes sense: objects that are farther away appear to move more slowly across the sky per unit time.

Looking at follow-up targets several months later allows for a confirmation of candidates' existence, as well as a narrowing down of orbital properties. Observations fall into a particular routine. Every two minutes, we move on to a new target. We press a button to expose the telescope--it's essentially the same process as taking a long-exposure image with a digital camera--and note the time. Two minutes later, we reposition the telescope, get an "okay" from the summit, and expose again. Every so often, the telescope gets re-aligned with a guide star of known position, and observations resume. Research continues like this for the next eight hours. Once we get two exposures of the same object, we can do our own makeshift "blinky-blinky" to get a first look at the data as it arrives.

Luckily for me, finals week means I leave the ROF early after only half a night of observing. Mike Brown2 and the rest of his team stay up until dawn, searching the clear Hawai'ian skies for distant worlds.

1 Kuiper Belt Objects are icy bodies that orbit near Neptune and beyond. When you think about Kuiper Belt Objects, think about Pluto-like objects.
2 Mike Brown also does some neat research on Europa.

Wednesday, August 26, 2015

Cross Sectional Astronomy

My research this summer came to an end last week with a seminar at which I presented alongside many other students in Caltech's Summer Undergraduate Research Program. In addition to presenting my work with Monte Carlo simulations, I also attended talks given by other students doing research in astronomy and physics.

Many of the astronomy projects I learned about focused on creating software for recognizing and analyzing different astronomical phenomena, from variable stars to pulsars and contact binary systems. Many large-scale sky surveys, such as the Palomar Transient Factory and the Sloan Digital Sky Survey, produce a wealth of data on astronomical objects. Computers are often the best way to analyze the abundance of data produced by these surveys in order to identify interesting targets for follow-up study. But why do astronomers need these huge sky surveys and millions of target objects to study?

Analyzing how any population changes over time, whether it is a population of people, stars, or starfish, is a common problem in many areas of science. It can be a tricky problem too, especially when trying to tease out correlation and causation from subtle differences between subgroups of the population. There are two main study methodologies for dealing with this problem: longitudinal studies and cross sectional studies.

Longitudinal studies are the intuitive approach to learning how a population changes over time: just watch as the population (or more realistically, a random sample of the population) evolves naturally. It makes sense, but it's difficult in a lot of situations. For example, longitudinal studies of humans take dedication and decades of research. For phenomena with long lifespans, such as stars, this type of study is simply impossible--the stars vastly outlast human lives and even human civilizations!

Cross sectional studies instead examine many individuals in the population at the same time. Each individual represents a slightly different stage of evolution, with slightly different characteristics: a random sample provided by nature. In humans, an example of a cross sectional study is gathering pictures of many different people at different ages in order to examine how appearance changes with age.

Since astronomers only have access to a snapshot of the universe as it appears today, cross sectional studies are what astronomers use to study populations of stars. The most famous example of a cross sectional study is the Hertzsprung-Russell diagram, a plot that correlates stars' surface temperatures (or colors) with their luminosities. The diagram shows stars in different stages of their evolution, from main sequence stars to red giants and white dwarfs, along with stars in transitional states between these major milestones. With the diagram, we can trace the development of different types of stars, and how this development changes with different intrinsic properties of the star (mass turns out to be the most important property in determining the ultimate fate of a star).

There are some problems with the cross sectional approach. For example, age itself may correlate with the evolution of the population in question. In the human example, improving health as time goes on might manifest itself in physical differences, such as an increase in height, between generations that are not caused by the aging process itself. In astronomy, a star that is now nearing the end of its life formed in a quite different universe than a protostar that has just reached the main sequence. We know from theoretical models that the concentration of metals in the universe has increased with time as stars convert hydrogen and helium into heavier elements. Luckily, we can attempt to correct for these effects. Due to the finite speed of light and the vast size of the universe, by looking further and further away, we effectively look back in time. This can help us to determine how conditions were different for older stars when they formed, when compared to stars which are forming today.

Having a large sample size is important in a cross sectional study because it ensures that a representative sample is available and that no important features of the population will be missed. Cross sectional methods and the large samples provided by surveys help astronomers discover how stars age, correlate properties among different populations of stars, and provide experimental confirmation of hypotheses for many types of astronomical objects. There is still much to be learned about a variety of astronomical systems--stars, planets, and more.

Thursday, August 13, 2015

DIY Random Distributions

In this post, I explained how to generate a random, uniform distribution of points on a disk. That problem turns out to be a special case of a more general technique that can be used to generate random numbers with any probability distribution you want. As a bonus, it also explains the seemingly magical fix (taking the square root of a random number to find r) that generates the desired result. But it's not magic--it's a really cool bit of mathematics.

As in my previous post on stochastic geometry, assume that you have a random number generator that outputs numbers randomly selected from 0 to 1. Each possible number has an equal chance of being chosen, so if you plotted how often each number was picked, you would get a uniform distribution.

Let's say, instead, you wanted to generate random numbers between 0 and 1 with a probability distribution function proportional to the polynomial p(x) = -8x⁴ + 8x³ - 2x² + 2x. Broadly speaking, we'd like to pick numbers around 0.7 the most often, with larger numbers being generated more often than smaller numbers. The distribution looks something like this1:


The first step is to find the cumulative distribution function c(x) of the probability distribution function p(x). If you imagine the chart above as a histogram, the cumulative distribution function gives, for every x, the total height of all the bars to the left of that x value. In other words, the cumulative distribution function gives the proportion of the area under the curve that lies to the left of x, compared to the total area under the curve. This should sound familiar if you've ever taken a calculus course: to find c(x), we integrate p(x) from 0 to x and divide by the integral of p(x) over the whole interval from 0 to 1. If you haven't taken calculus, don't worry. Taking an integral in this context just means finding the area under the curve between two endpoints, as described above.

Here is what c(x) looks like, plotted alongside p(x). 




The next step is the easiest. Use the random number generator to generate as many random numbers as you need between 0 and 1. I picked five2: 0.77375, 0.55492, 0.08021, 0.51151, and 0.18437.

Now, using c(x) as a sort of translator, we can figure out which random numbers in our non-uniform distribution these correspond to. It's important to realize that the random numbers we generated are values of c(x), not values of x. No matter what distribution we use, c(x) always ranges from 0 to 1, but the x values could span any interval of real numbers. In my research, I use this technique to generate random angles with values from 0 to π, for instance. So, using these values of c(x), we can interpolate to find the values of x that they correspond to.

Here is the process of interpolation for the numbers I chose. The red points represent the uniformly distributed random values for c(x). The yellow points represent the randomly generated x values that have the same probability distribution as p(x). Very roughly, 0.77375, 0.55492, 0.08021, 0.51151, and 0.18437 correspond to 0.78, 0.65, 0.25, 0.62, and 0.38 respectively, via the green graph of c(x). Although it's hard to tell right now, if I generated enough numbers, we would indeed find we were picking numbers around 0.7 the most often, with more large numbers being generated than small numbers.
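If you'd like to try this at home, here's a minimal Python version of the whole procedure, using a numerical running sum for c(x) and NumPy's interpolation to do the "translation" (the grid size and sample count are arbitrary choices):

```python
import numpy as np

# The example distribution from above: p(x) = -8x^4 + 8x^3 - 2x^2 + 2x on [0, 1].
def p(x):
    return -8 * x**4 + 8 * x**3 - 2 * x**2 + 2 * x

# Build c(x) numerically on a fine grid with a running sum, then normalize
# so that c(1) = 1.
xs = np.linspace(0.0, 1.0, 1001)
cdf = np.cumsum(p(xs))
cdf = cdf / cdf[-1]

# Uniform random numbers are values of c(x); interpolating "backwards"
# through the curve turns them into x values distributed like p(x).
uniform_samples = np.random.random(100000)
samples = np.interp(uniform_samples, cdf, xs)

# A histogram of `samples` peaks around 0.7, just like the plot of p(x).
```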


In the case of generating random points over a disk, we need to generate random values of r. We are more likely to find points at larger radii than smaller radii simply because a circle with a larger radius has a greater perimeter: perimeter is proportional to radius. Thus, our p(x) is proportional to r, and our c(x) is proportional to r². This is why we need to take the non-intuitive step of taking the square root when generating uniform, random coordinates for the disk! While to our eyes the result looks like a uniform covering of the disk, the underlying distribution of r isn't uniform at all.

This technique is also useful if you want to generate random values according to a Gaussian distribution, also known as a normal distribution or bell curve. These distributions are ubiquitous in statistics, and if you are familiar with image processing, they are the functions behind "Gaussian blur". But of course, the technique can be used to generate any probability distribution you like, not just these examples.

***

1 I picked this distribution because it's very easy to integrate and visually interesting. It's not actually related to my research at all, and there's nothing especially significant about it.
2 I really did generate these numbers with my computer--I didn't cherry pick them to look good!

Thursday, July 9, 2015

How Do You Build an X-Ray Telescope?

X-ray telescopes are tricky to engineer. With such high energy and short wavelengths, x-rays can pass through nearly any material we throw at them, making it very difficult to build mirrors that can direct and focus the light into a useful image. This is because x-rays, like any type of electromagnetic wave, interact and scatter off of objects similar in size to the wavelength of the wave itself.1 X-rays have such short wavelengths that they do not interact with most atoms and atom-sized objects--not because the wavelength is too big to interact, but because the wavelength is too small! X-rays can pass through the spaces between atoms in many materials.

One way around this problem is to use mirrors that intercept light at very, very high angles of incidence so that the x-rays merely graze the surface of the mirror. This decreases the effective spacing between the atoms as seen by an incoming x-ray.2 Many mirrors for x-ray telescopes are designed using rings of thin foils arranged so that the surface of the foil is nearly parallel (but not quite!) to the telescope's viewing direction.

This method doesn't work for very high energy x-rays, called "hard" x-rays. Instead, telescopes use a coded mask: a screen that blocks or admits x-rays in a very specific pattern. This pattern is projected onto the telescope's detector, like a shadow. By comparing the pattern on the mask to the pattern recorded by the detector, the position of the x-ray source can be quickly determined. Specifically, given the shift of the detected shadow relative to the mask, the geometry of the instrument, and some trigonometry, you can determine the celestial coordinates of the source.

Often, coded masks have a very particular grid pattern for blocking light.3 Why is this? It makes the shift between the shadow and the mask easier to identify. Imagine that, in order to figure out the shift, you have two pictures, one of the mask and one of the detected pattern, lying on top of each other. You are allowed to slide the picture of the detected pattern relative to the mask, but instead of seeing whether the two pictures match, you only get a number telling you the percentage of "matches" (places where dark lies on top of dark or light lies on top of light). When this percentage reaches 100%, you can be sure that the displacement of the detected pattern is the correct shift for determining the position of the source in the sky.

If, instead of a coded mask, the screen were simply randomly generated, what percentage readings would we expect from trying to match up the pictures? The percentage would still be 100% when the two images were aligned, but away from that peak, the percentages would vary greatly and unpredictably. In one position, 60% of the image might match, while shifted slightly to the left, only 1% would match. This makes the position of the best match difficult to detect, especially when, instead of knowing the percentage match, you only know how much better a match one position is compared to nearby positions. Relative readings like this are more representative of the real problem, since sometimes portions of the shadow will miss the detector entirely.

Instead, coded masks are designed so that the percentage of matches is constant for every position except for the position that represents the best fit. This way, the position where the shadow image best matches the screen pattern is very easy to identify.
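To make the matching game concrete, here's a toy one-dimensional version in Python. Note that it uses a randomly generated pattern rather than a true coded mask, so the off-peak match fractions jump around (exactly the problem described above); a real coded mask is designed so those off-peak values stay flat:

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up one-dimensional "mask": 1 where the screen is open, 0 where it blocks x-rays.
mask = rng.integers(0, 2, size=200)

# The detected "shadow" is the mask pattern displaced by an (unknown) shift.
true_shift = 17
shadow = np.roll(mask, true_shift)

def match_fraction(mask, shadow, shift):
    """Fraction of positions where the shadow, slid back by `shift`, agrees with the mask."""
    return np.mean(np.roll(shadow, -shift) == mask)

# Try every possible shift and look for the one where the patterns line up.
fractions = [match_fraction(mask, shadow, s) for s in range(len(mask))]
print(int(np.argmax(fractions)))  # prints 17, the shift we started with
```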

I learned about these interesting design considerations from my colleagues at Caltech's Space Radiation Laboratory. Some of the other SURF researchers I have met this summer are working on x-ray telescopes. I was surprised to find similarities between the design of x-ray telescopes and problems I had been tackling as part of my own research. The detector I am working on has two layers. By combining information about where incoming particles hit each layer, the direction of the incoming particles can be determined. Instead of using a screen to block out light, the first layer of the detector directly locates particles rather than creating a familiar pattern. However, the readings of the second detector can almost be thought of as a shadow of the reading on the first detector, shifted by an amount that depends on the particle trajectories.

1 I discuss how this affects optical telescopes in my post about Palomar.
2 If you're having trouble imagining this, picture an x-ray coming at the mirror edge-on. It would appear as if all the atoms overlapped along the same line of sight. A few degrees from edge-on, there is still a considerable amount of overlap.
3 You can see a few nice examples and a great explanation of coded masks in this video.

Wednesday, July 1, 2015

Stochastic Geometry and Monte Carlo Simulations

For the past few weeks, I have been working on a summer research project with the Caltech SURF program. Today, I finally started to understand an interesting mathematical tool I had been using the whole time. It's called stochastic geometry.

Let's say that you'd like to generate a random, uniform distribution of points over some region. Assume that you have a function that outputs uniformly distributed random real numbers between 0 and 1.1 The method you use to generate a uniform distribution of points depends on the shape of the region over which you generate them. A rectangular region is relatively simple: for a rectangle that is n units wide and m units tall, you just generate a random x coordinate and a random y coordinate and multiply the resulting numbers by n and m, respectively. This gives two values that can be used as Cartesian coordinates to locate a random point in the rectangle. Do this many times and plot the resulting points, and the result will be a rectangle filled uniformly with random points.

What about other shapes, such as the interior of a circle? This is a bit more complicated, because the range of possible y coordinates depends on the x coordinate chosen. You could, in theory, generate points in a square that contains the circular area of interest, and throw away points that were generated outside of the disk.2 But this isn't very efficient. Fortunately, polar coordinates make shapes like circles a bit easier to deal with. In polar coordinates, the location of any point is described by a radial distance r from the center of the disk and an angular distance θ from the zero angle. At first, it seems like you could generate random values for r and θ just like we did for the square, but this ends up favoring points in the center of the circle. Instead of being a random, uniform distribution, the disk, when filled, has an overabundance of points near the middle.3

To make a truly uniform, random distribution of points on the disk, we need to favor larger radii over smaller radii. This makes sense: there are more possible points to pick from on the perimeter of a larger circle than on the perimeter of a smaller circle, simply because bigger circles have bigger perimeters. It turns out that the solution is to generate random values for r², and then take the square root of these values to find r for plotting purposes.4
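Here's a short Python sketch of the recipe, if you'd like to see the square-root trick in action (try deleting the np.sqrt to watch the points pile up in the center):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()
n = 5000

# A uniform angle from 0 to 2*pi, and a radius drawn as the square root of a
# uniform number, which favors larger radii in exactly the right proportion.
theta = 2 * np.pi * rng.random(n)
r = np.sqrt(rng.random(n))

x, y = r * np.cos(theta), r * np.sin(theta)
plt.scatter(x, y, s=1)
plt.gca().set_aspect("equal")
plt.show()
```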

What good is generating a uniform distribution of points? In Monte Carlo simulations, a large number of random events are simulated in order to make statistical predictions about systems that cannot be easily modeled analytically. Being able to generate a uniform random distribution ensures that any deviations from uniform data are true properties of the system and not artifacts of the random generation process itself. For my research project, Monte Carlo simulations are being used to model the response of a telescope to incoming particles. Because the particular instrument I am working on is not easily analyzable, Monte Carlo simulations are invaluable for modeling what will happen when the telescope begins to take data.
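My simulations are too specialized to reproduce here, but the flavor of a Monte Carlo calculation is easy to show with a classic toy example: estimating the area of a disk by throwing uniform random points at a square and counting how many land inside.

```python
import numpy as np

rng = np.random.default_rng()
n = 1_000_000

# Throw uniform random points into the square [-1, 1] x [-1, 1] and count
# how many land inside the unit disk.
x = rng.uniform(-1, 1, n)
y = rng.uniform(-1, 1, n)
inside = x**2 + y**2 <= 1.0

# The square has area 4, so the disk's area is roughly 4 times the fraction
# of points that landed inside -- an estimate of pi.
print(4 * inside.mean())
```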

***

1 This is a common feature of most programming languages.
2 Mathematicians call circular regions disks. Technically, a circle is just points on the perimeter of the disk.
3 It looks like this. Wolfram Mathworld has lots of other entries related to this subject; check it out if you are so inclined!
4 θ can simply be randomly generated. This is a consequence of the fact that generating a uniform random distribution of points on the perimeter of a circle is as simple as generating a random value for θ and multiplying by 2π.