Thursday, July 9, 2015

How Do You Build an X-Ray Telescope?

X-ray telescopes are tricky to engineer. With such high energy and short wavelengths, x-rays can pass through nearly any material we throw at them, making it very difficult to make mirrors that can direct and focus light into a useful image. This is because x-rays, like any type of electromagnetic wave, can interact and scatter off of objects similar in size to the wavelength of the wave itself.1 X-rays have such short wavelengths that they do not interact with most atoms and atom-sized objects--not because the wavelength is too big to interact, but because the wavelength is too small! X-rays can pass through the spaces between atoms in many materials.

One way around this problem is to use mirrors that intercept light at very, very high angles of incidence so that the x-rays merely graze the surface of the mirror. This decreases the effective spacing between the atoms as seen by an incoming x-ray.2 Many mirrors for x-ray telescopes are designed using rings of thin foils arranged so that the surface of the foil is nearly parallel (but not quite!) to the telescope's viewing direction.

This method doesn't work for very high energy x-rays, called "hard" x-rays. Instead, telescopes use a coded mask, a screen that blocks or admits x-rays in a very specific pattern. This pattern is then projected onto the telescope detector, like a shadow. By comparing the pattern on the screen with the pattern detected by the instrument, the position of the x-ray source can be quickly determined. Specifically, given the shift between the detected shadow and the actual position of the screen, along with some trigonometry, you can determine the celestial coordinates of the source.
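The "some trigonometry" step can be sketched in one dimension. This is only an illustration under simplifying assumptions that are mine, not from any particular instrument: the source is far enough away that its x-rays arrive parallel, the mask is flat and parallel to the detector, and the names `off_axis_angle`, `shadow_shift`, and `mask_to_detector_distance` are hypothetical.

```python
import math


def off_axis_angle(shadow_shift, mask_to_detector_distance):
    """Angle (in radians) between the telescope axis and the source.

    Assumes parallel incoming rays and a mask parallel to the detector,
    so the shadow slides sideways by distance * tan(angle).
    """
    return math.atan2(shadow_shift, mask_to_detector_distance)


# Hypothetical numbers: a 2 cm shift with the mask 100 cm above the detector.
angle = off_axis_angle(2.0, 100.0)
print(math.degrees(angle))
```

A real telescope would do this in two dimensions and then convert the two angles into celestial coordinates using the pointing direction of the spacecraft.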

Often, coded masks have a very particular grid pattern for blocking light.3 Why is this? It makes the shift between the shadow and the screen easier to identify. Imagine that, in order to figure out the shift, you have two pictures, one of the screen, and one of the detected pattern, lying on top of each other. You are allowed to move the picture of the detected pattern relative to the screen, but instead of seeing if the two pictures match each other, you only get a number telling you the percentage of "matches" (places where dark is on top of dark or light is on top of light). When this percentage reaches 100% you can be sure that the displacement of the detected pattern is the correct shift for determining the position of the source in the sky.

If, instead of using a specially designed pattern, the screen were simply randomly generated, what percentage readings would we expect from trying to match up the pictures? The percentage would still be 100% when the two images were aligned, but away from the peak, the percentages would vary greatly and unpredictably. In one position, 60% of the image might match, while shifted slightly to the left, only 1% would match. This makes detecting the position of the best match difficult, especially when instead of knowing the percentage match, you only know how much better a match one position is compared to nearby positions. Relative readings like this are more representative of the problem, since sometimes portions of the shadow will miss the detector entirely.

Instead, coded masks are designed so that the percentage of matches is constant for every position except for the position that represents the best fit. This way, the position where the shadow image best matches the screen pattern is very easy to identify.
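Here is a small one-dimensional sketch of the matching game described above. The pattern I use is an assumption chosen for illustration: an 11-cell mask that is open wherever the cell number is a perfect square modulo 11. I picked it because it has the "flat everywhere except the peak" property; real coded masks are two-dimensional and I am not claiming any telescope uses this exact pattern. The shadow is also assumed to wrap around the edges, which real shadows do not do.

```python
def match_fraction(mask, shadow, shift):
    """Fraction of cells where the mask, slid over by `shift`, agrees with the shadow."""
    n = len(mask)
    matches = sum(mask[i] == shadow[(i + shift) % n] for i in range(n))
    return matches / n


def find_shift(mask, shadow):
    """Try every cyclic shift and return the best one plus all match fractions."""
    n = len(mask)
    fractions = [match_fraction(mask, shadow, s) for s in range(n)]
    best = max(range(n), key=lambda s: fractions[s])
    return best, fractions


# Illustrative mask: open (1) where the cell index is a nonzero square modulo 11.
P = 11
squares = {(i * i) % P for i in range(1, P)}
mask = [1 if i in squares else 0 for i in range(P)]

# Pretend the source moved the shadow over by 4 cells.
true_shift = 4
shadow = [mask[(i - true_shift) % P] for i in range(P)]

best, fractions = find_shift(mask, shadow)
print(best)       # 4
print(fractions)  # 1.0 at shift 4; every other shift matches 5 of the 11 cells
```

Swapping `mask` for a randomly generated list of 0s and 1s shows the contrast: the peak at the true shift is still there, but the other readings are no longer all the same.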

I learned about these interesting design considerations from my colleagues at Caltech's Space Radiation Laboratory. Some of the other SURF researchers I have met this summer are working on X-ray telescopes. I was surprised to find similarities in the design of x-ray telescopes to problems I had been tackling as part of my research. The detector I am working on has two layers. By combining information from where the incoming particles hit on both layers of the detector, the direction of the incoming particles can be determined. Instead of using a screen to block out light, the first layer of the detector directly locates particles, rather than creating a familiar pattern. However, the readings of the second detector can almost be thought of as a shadow of the reading on the first detector, shifted by some amount that depends on particle trajectories.

1 I discuss how this affects optical telescopes in my post about Palomar.
2 If you're having trouble imagining this, think of what would happen if the x-ray came at the mirror edge on. It would appear as if all the atoms overlapped along the same line of sight. A few degrees from edge on, there is still a considerable amount of overlap.
3 You can see a few nice examples and a great explanation of coded masks in this video.

Sunday, July 5, 2015

Art and Science - Sestina Numbers

Earlier this year, I wrote a sestina which was then published in Totem, Caltech's literary magazine. Sestinas are one of my favorite types of poems, because they use a complicated repetition scheme that gives structure to the poem without using rhyme. Knowing this, one of my friends pointed out to me a generalization of the structure of sestinas that produces a sequence of numbers with interesting and surprising properties.

Sestinas have 39 lines each, with six stanzas and a three-line ending. Ignoring the last three lines, each of the six stanzas uses the same six words at the end of each line. This is what makes writing a sestina difficult--you have to pick six versatile words and avoid repeating the same message in each stanza! However, in each stanza, the six words are in different orders. The pattern by which the words get shuffled to produce the next stanza is the same from one stanza to the next: the last word of the previous stanza is always the end word for the first line of the next. By the end of the poem, five shuffles later, repeating the shuffling procedure will produce the original order of the six words. Here's a picture of how it works:

1 goes to 2, 2 goes to 4, 3 goes to 6, 4 goes to 5, 5 goes to 3, and 6 goes to 1

Mathematicians call this reshuffling a permutation. If you've been looking closely, you might have noticed how it works for this particular permutation. To figure out where the nth word goes, first figure out if n is in the first half or second half of the stanza. For n in the first half, the nth word will be the 2nth word in the next stanza. For n in the second half, the nth word will be in the (2*(6-n)+1)th position. The paper linked to below suggests a good way to think of the permutation: as a shuffle (alternating between the first three numbers and the second three numbers) with the second group of "cards" turned upside down (so that the last word becomes the first word). This type of permutation can be generalized to any number m of end words and stanzas, but m will only be a sestina number if the original order is first obtained after exactly m permutations.
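The rule above is short enough to try out in code. This is a sketch with function names of my own choosing, and it generalizes the 6 in the formula to m. It assumes that m counts as a sestina number only when the original order comes back for the first time after exactly m shuffles, not earlier.

```python
def spiral_position(n, m):
    """Where the nth end word (counting from 1) lands in the next stanza of m lines."""
    if 2 * n <= m:               # n is in the first half
        return 2 * n
    return 2 * (m - n) + 1       # n is in the second half


def next_stanza(words):
    """Apply the sestina shuffle once to a list of end words."""
    m = len(words)
    shuffled = [None] * m
    for n, word in enumerate(words, start=1):
        shuffled[spiral_position(n, m) - 1] = word
    return shuffled


def is_sestina_number(m):
    """True if the original order first returns after exactly m shuffles."""
    start = list(range(1, m + 1))
    order = start
    for step in range(1, m + 1):
        order = next_stanza(order)
        if order == start:
            return step == m
    return False


print(next_stanza(["a", "b", "c", "d", "e", "f"]))        # ['f', 'a', 'e', 'b', 'd', 'c']
print([m for m in range(1, 11) if is_sestina_number(m)])  # [1, 2, 3, 5, 6, 9]
```

Notice that 4, 7, 8, and 10 drop out: with 4 end words, for example, the third word never moves, so the order comes back after only 3 shuffles.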

The picture below shows a braid pattern that represents each permutation. The black lines designate the final word ordering for each stanza. Notice how, if you wrapped the picture around on itself, the colors would connect to each other in the same order as they started (representing the order of the first stanza). For knot theorists, this means sestinas form links with six loops. 


As it turns out, sestina numbers have a lot of interesting mathematical properties related to prime numbers. For instance, if s is a sestina number, then 2s+1 is a prime number! Many sestina numbers are prime numbers themselves, too. One method for proving that an infinite number of sestina numbers exist depends on the truth of the Riemann Hypothesis, an unsolved problem in mathematics that is deeply connected to the distribution of prime numbers. Unexpectedly, sestina numbers are related to both beautiful poetry and beautiful mathematics.

Read more about sestina numbers (also called Queneau Numbers) here
Read more about sestinas here

Wednesday, July 1, 2015

Stochastic Geometry and Monte Carlo Simulations

For the past few weeks, I have been working on a summer research project with the Caltech SURF program. Today, I finally started to understand an interesting mathematical tool that I had been using the whole time. It's called stochastic geometry.

Let's say that you'd like to generate a random, uniform distribution of points over some region. Assume that you have a function that outputs random real numbers between 0 and 1, uniformly distributed.1 The method you use to generate a uniform distribution of points in any given region depends on the shape of the region over which you generate the points. A rectangular region is relatively simple. For a rectangular region that is n units wide and m units tall, you just need to generate a random x coordinate and a random y coordinate and multiply the resulting numbers by n and m, respectively. This generates two values that can be used as Cartesian coordinates to locate a random point in the rectangle. Do this many times and plot the resulting points on the rectangle. The result will be a rectangle filled uniformly with random points.
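As a minimal sketch of the rectangle recipe, here is one way it could look in Python, using the standard library's `random.random()` as the 0-to-1 generator. The function name and the optional `rng` argument are my own choices for illustration.

```python
import random


def random_points_in_rectangle(width, height, count, rng=random):
    """Return `count` points spread uniformly over a width-by-height rectangle."""
    points = []
    for _ in range(count):
        x = rng.random() * width    # random number in [0, 1) scaled to the width
        y = rng.random() * height   # independent random number scaled to the height
        points.append((x, y))
    return points


# A rectangle 4 units wide and 3 units tall, filled with 1000 points.
points = random_points_in_rectangle(4, 3, 1000)
print(points[:3])
```

Passing `rng=random.Random(0)` gives the same points every run, which is handy when checking a simulation.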

What about other shapes, such as the interior of a circle? This is a bit more complicated, because the range of possible y coordinates depends on the x coordinate chosen. You could, in theory, generate points in a square that contains the circular area of interest, and throw away points that were generated outside of the disk.2 But this isn't very efficient. Fortunately, polar coordinates make shapes like circles a bit easier to deal with. In polar coordinates, the location of any point is described by a radial distance r from the center of the disk and an angular distance θ from the zero angle. At first, it seems like you could generate random values for r and θ just like we did for the square, but this ends up favoring points in the center of the circle. Instead of being a random, uniform distribution, the disk, when filled, has an overabundance of points near the middle.3
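The "generate in a square and throw away the misses" approach can be sketched like this. The names are hypothetical, and the attempt counter is only there to show how much work is wasted: the disk covers π/4 of the square, so a bit more than a fifth of the points get thrown away.

```python
import random


def random_points_in_disk_by_rejection(radius, count, rng=random):
    """Fill a disk uniformly by sampling its bounding square and discarding misses.

    Returns the accepted points and the total number of attempts made.
    """
    points = []
    attempts = 0
    while len(points) < count:
        x = (2 * rng.random() - 1) * radius   # uniform in [-radius, radius)
        y = (2 * rng.random() - 1) * radius
        attempts += 1
        if x * x + y * y <= radius * radius:  # keep only points inside the disk
            points.append((x, y))
    return points, attempts


points, attempts = random_points_in_disk_by_rejection(1.0, 5000)
print(len(points) / attempts)  # close to pi/4, about 0.785
```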

To make a truly uniform, random distribution of points on the disk, we need to favor larger radii over smaller radii. This makes sense, because there are more possible points to pick from on the perimeter of a larger circle than on the perimeter of a smaller circle, because bigger circles have bigger perimeters. It turns out that the solution to this problem is to generate random values for r², and then take the square root of these values to find r for plotting purposes.4
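To see the difference, here is a sketch comparing the naive polar method with the square-root method on a disk of radius 1. The function names are mine. The check at the end counts how many points land within half the radius: that inner disk has a quarter of the total area, so a uniform fill should put about 25% of the points there.

```python
import math
import random


def naive_point_in_disk(radius, rng=random):
    """Pick r and theta uniformly. This crowds points toward the center."""
    r = radius * rng.random()
    theta = 2 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)


def uniform_point_in_disk(radius, rng=random):
    """Pick r squared uniformly, then take the square root to get r."""
    r = radius * math.sqrt(rng.random())
    theta = 2 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)


def fraction_within(points, distance):
    """Fraction of points closer to the center than `distance`."""
    inside = sum(1 for x, y in points if math.hypot(x, y) < distance)
    return inside / len(points)


naive = [naive_point_in_disk(1.0) for _ in range(20000)]
uniform = [uniform_point_in_disk(1.0) for _ in range(20000)]
print(fraction_within(naive, 0.5))    # about 0.5, too many points near the middle
print(fraction_within(uniform, 0.5))  # about 0.25, matching the area fraction
```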

What good is generating a uniform distribution of points? In Monte Carlo simulations, a large number of random events are simulated in order to make statistical predictions about systems that cannot be easily modeled analytically. Being able to generate a uniform random distribution ensures that any deviations from uniform data are true properties of the system and not artifacts of the random generation process itself. For my research project, Monte Carlo simulations are being used to model the response of a telescope to incoming particles. Because the particular instrument I am working on is not easily analyzable, Monte Carlo simulations are invaluable for modeling what will happen when the telescope begins to take data.

***

1 This is a common feature of most programming languages.
2 Mathematicians call circular regions disks. Technically, a circle is just points on the perimeter of the disk.
3 It looks like this. Wolfram MathWorld has lots of other entries related to this subject. Check it out if you are so inclined!
4 θ can simply be randomly generated. This is a consequence of the fact that generating a uniform random distribution of points on the perimeter of a circle is as simple as generating a random value between 0 and 1 and multiplying it by 2π to get θ.