Bruce Yellin, Data Center Architect
COMPUTER SCIENCE HELPS SHIELD EARTH FROM ASTEROIDS
2016 EMC Proven Professional Knowledge Sharing 2
Table of Contents
The Threat
Finding The Threats: A Brief History of Asteroid Detection
How Do We Find Asteroids Today?
  Optical Telescopes
    Charge-Coupled Device – CCD
  Radio and Radar Telescopes
  Ground-Based Telescopes
    Large Synoptic Survey Telescope - LSST - Optical Telescope
    Asteroid Terrestrial-impact Last Alert System – ATLAS – Optical Telescope
  Satellite Telescopes
    NEOWISE – Optical Telescope
    Gaia Space Telescope – Optical Telescope
    The Square Kilometer Array – Mankind’s Largest Big Data Challenge – Radio Telescope
Using Hadoop To Spot An Asteroid
3D Asteroid Modeling – Try It Yourself!
Taking Action
High-Performance Computing and Big Data
Conclusion
Appendix - Glossary
Appendix – Draw an Ellipse in Excel
Footnote
Disclaimer: The views, processes or methodologies published in this article are those of the
author. They do not necessarily reflect EMC Corporation’s views, processes or methodologies.
[Figure: Chelyabinsk asteroid orbit, shown with the Sun, the orbits of Venus, Earth, and Mars, and Earth at impact]
“…it came dangerously close to wiping us all
out.” – Prof. Brian Cox
Earth is facing an asteroid threat from outer
space, and it isn’t the Arachnids of Klendathu
from the 1997 science fiction film Starship
Troopers hurling them at our planet. It is a real
threat from one of the hundreds of millions of
asteroids that orbit the Sun and travel between
Mars and Jupiter and beyond. In essence,
Earth sits in an asteroid shooting gallery.
Many were caught off guard early Friday, February 15,
2013, when a medium-sized 66-foot wide meteoroid
weighing 28 million pounds (13,000 metric tons)
approached Earth at 43,000 mph1. (Meteoroids traveling at
160,000 mph can enter the atmosphere, eventually
decelerating to a much slower speed2.) Coming in at a
steep 30° angle3, friction made it glow 23-29 miles above
the ground, and it exploded in the atmosphere 18 miles
over Chelyabinsk, Russia, producing a Sun-bright light.
With kinetic explosive energy greater than 20-30
WWII atomic bombs, the shockwave broke glass
windows and hurt nearly 2,000 people4.
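The comparison to atomic bombs can be checked with a rough kinetic energy estimate. The sketch below uses the mass and speed quoted above and the 15-kiloton Hiroshima yield quoted later in this article; the function name and the decision to ignore atmospheric deceleration are assumptions made for illustration. Under those assumptions the result comes out at several hundred kilotons of TNT, consistent with "greater than 20-30" such bombs.

```python
# Rough kinetic-energy estimate for the Chelyabinsk meteoroid.
# Mass and speed come from the text; the conversion factors are standard.
# Assumption: no allowance is made for deceleration in the atmosphere.

MPH_TO_M_PER_S = 0.44704            # 1 mile per hour in meters per second
JOULES_PER_KILOTON_TNT = 4.184e12   # conventional energy of 1 kiloton of TNT
HIROSHIMA_KILOTONS = 15             # yield quoted later in this article


def kinetic_energy_kilotons(mass_kg: float, speed_mph: float) -> float:
    """Return the kinetic energy (1/2 m v^2) in kilotons of TNT."""
    speed_m_per_s = speed_mph * MPH_TO_M_PER_S
    joules = 0.5 * mass_kg * speed_m_per_s ** 2
    return joules / JOULES_PER_KILOTON_TNT


mass_kg = 13_000 * 1_000            # 13,000 metric tons
energy_kt = kinetic_energy_kilotons(mass_kg, 43_000)
bombs = energy_kt / HIROSHIMA_KILOTONS
print(f"{energy_kt:.0f} kilotons of TNT, roughly {bombs:.0f} Hiroshima-sized bombs")
```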
Astronomers never saw the meteoroid coming – it
was just too small and it came from behind the Sun
so Earth’s telescopes could not detect it. This orbit
diagram, constructed after the event, shows the path
in yellow-green5. Current estimates indicate there
could be as many as 80 million “rocks” of this size6.
In a short 8-day period from March 4-11, 2014, four asteroids silently
approached Earth. The largest would have likely wiped out a city the
size of London. On March 4, a 380-foot asteroid called “2014 DU110”
came within 13 million miles of Earth. The next day, an asteroid discovered by telescope only 5
days earlier named “2014 DX110” passed the Earth from about the same distance as the Moon.
Given the vastness of space, many would call this a near-miss. On March 6, a 100-foot “2014
EC” asteroid (orbit diagrams to the
right7), discovered only 2 days
earlier, came within 38,300 miles of
our planet – less than 1/6th the
distance to the moon and just above
the 22,000 mile geosynchronous
orbit of some satellites. According to
University of Manchester physicist
Dr. Brian Cox, there is an “asteroid with our name on it” and it is only a matter of time before an
asteroid large enough to wipe out the human race collides with Earth.8
Asteroid impacts are not rare. While
the chance that a large one will
obliterate a city is once in a century9,
this map shows a total of 556
impacts from 1994-2013, with 26
asteroids, containing a force of 1 to
600 kilotons of TNT, exploding in the
atmosphere. By contrast, the
Hiroshima atomic bomb equaled 15
kilotons of TNT. One might conclude
our current strategy to protect the planet consists of “blind luck”.
In 1908, an asteroid perhaps as big as “2014 CU13” exploded 3-6 miles above the city of
Vanavara, Russia. Called the Tunguska Event, it destroyed a 770-square-mile area about 2,200
miles east of Moscow. The damage equaled 10-15 megatons of TNT (up to 1,000 times the
energy of the WWII atom bomb).
An explosion of that magnitude
over a heavily populated area like
New York City would wipe it out,
kill perhaps a million people, create
an unparalleled ecological disaster
and plunge the world’s economy
into chaos10.
[Figure: The inner Solar System showing Mercury, Venus, Earth, Mars, and Jupiter. The main asteroid belt is 100 million miles wide and 111 million miles from Earth. The Trojan group of asteroids is also shown.]
Sixty-five million years ago, as noted by the Alvarez hypothesis11, an asteroid 6-7 miles in
diameter (10-12 kilometers) traveling at 45,000 mph (20
km/s)12 struck offshore near the Yucatán Peninsula with the
force of three billion WWII atomic bombs13. It created a 15-mile
deep, 110-mile wide Chicxulub (Chi’-shoo-loob) crater and a
100-meter (328 feet) tsunami. The impact triggered the
planet’s fifth mass extinction event14, eradicating dinosaurs
and most other species15, and marked the end of the 350 million-year-old Age of Reptiles16.
Asteroids of this size hitting Earth would convert kinetic energy into an instantaneous inferno
with “hot-coal colored” rocks shooting into the sky eventually causing global firestorms. Ash
would fill the air and block out the sun. Food and breathable air would be gone. If this happened
today, perhaps landing further offshore, U.S. Gulf states like Florida, Alabama, Mississippi,
Louisiana and Texas might disappear underwater. The human race would be extinct.
While astronomers believe a devastating strike is unlikely at any given time,
one eventually seems inevitable. And if one does hit, mankind would be
eradicated. Earth needs an approach that gives scientists and leaders
enough notice to deflect an asteroid when it is millions of miles away.
We are scanning the skies for asteroids. We have plans to protect the
human race. Asteroid defense is a big data analysis problem.
The Threat
Asteroids are minor planets that orbit our part of the Solar System in 4 distinct regions. The
main asteroid belt contains millions of bodies 200 million miles from the Sun and is found
between the orbits of Mars and Jupiter18. There are
also Trojan groups which pace and follow Jupiter by
±60°, a Kuiper belt or region which ranges from
2,800 to 4,650 million miles away19, and the Oort
cloud which is thought to be 100,000 AU or 9,300
billion miles from the Sun20. This image shows the
expected location of the main asteroid belt (shown in
red/pink in this diagram) and the Trojan group (green
in the diagram) on June 28, 201621.
BIG DATA: “When accumulated data exceeds the capacity or capture rate of local resources, local storage and manipulation is impractical at best, impossible at worst.”17
Asteroid Size

Diameter                    Quantity
A few hundred miles         Several dozen
Tens of miles               Hundreds
A few miles                 Thousands
Large fraction of a mile    Tens of thousands
Small fraction of a mile    Hundreds of thousands

Source: http://cseligman.com/text/asteroids/sizedistribution.htm
While most asteroids “peacefully” orbit the
Sun, there are those that travel through our
inner solar system and are of primary concern
should they strike the Earth. These are called
Near Earth Asteroids (NEAs), and when
combined with Near Earth Objects (NEOs)
such as satellite debris, create a hazard
ranging from fireballs in the sky to the dinosaur
extinction documented by Alvarez.
For the most part, asteroids are 4.5 billion-
year-old rotating, irregular solar system
building blocks. They are sometimes called
planetoids. Comprised of clay, silicates, and
nickel-iron, they can weigh from 1,200 billion
billion tons (5,000 times lighter than Earth)22 in the case of the largest called Ceres, down to the
weight of a car or even a pebble. They can also be as
large as Ceres’s 590-mile diameter (Earth’s diameter is
7,918 miles). About 10 million NEAs are larger than 10
meters wide while many millions of asteroids are tiny
with little mass.23
Current asteroid hunting initiatives mainly scan space for objects larger than 1 kilometer – 3,280
feet – or about 500 feet higher than Burj Khalifa in Dubai, the world’s tallest building.
Astronomers estimate they have found about 95% of civilization-ending asteroids24.
With asteroids 30 feet wide passing near our Moon every week, a study that examined the last
20 years of data from global nuclear weapons testing sensors concluded that perhaps 60
asteroids approaching 20 meters in size have hit Earth's atmosphere, exceeding previous
estimates25. In 2005, the U.S. Congress instructed NASA to find 90% of the asteroids 140
meters wide (1.5 football fields long) by the year 202026, but as of late 2014, they have only
found 10% of them27. There is no mandated program for asteroids smaller than 500 feet long.
The Minor Planet Center (MPC) maintains a database of over 140 million asteroid observations
and tracks over 700,000 asteroids28. Orbit calculations must be constantly revised because they
change (for example, when objects collide). The following Hubble Space Telescope image
shows the 460-foot diameter asteroid “P/2010 A2” gaining a dust and
gravel trail after being struck by another asteroid29, undoubtedly
changing its orbit. It is presently beyond our “big data” technology to
comprehensively monitor all of the main asteroid belt activity.
An asteroid’s path can also be altered by the Yarkovsky effect – when the Sun warms an
asteroid, the heat is dissipated in another direction as it rotates30. Accurate orbit predictions
require that all of these influences be tracked. From Earth, one way to track an asteroid’s rotation is by observing
the timing of light reflecting off its surface. Spherical asteroids have a fairly constant amount of
reflected light31. Asteroid occultation, occurring when an asteroid passes in front of a star
temporarily blocking its light, can also help us measure its size, shape and exact position32.
Finding The Threats: A Brief History of Asteroid Detection
If astronomers could predict meteoroid and asteroid strikes years in advance, Earth would
conceivably have time to prepare for the disaster or possibly even prevent it. It all starts with
finding the threats and the first such discovery occurred in 1801.
An Italian astronomer, Giuseppe Piazzi, was in Palermo searching the
Italian sky with the telescope to the left, looking to prove a then-
prevailing theory that a planet orbited between Mars and Jupiter33. He
recorded the position of a small dot of light on January 1, 1801, along
with angular measurements and exact times as shown in the table below. (A
precursor to today’s rows and columns in Excel and database theory, the use of data tables to
record information can be traced to
the Sumerians of 3100 BC34). He
wasn’t sure if it was a star or a
comet35. On subsequent nights, he
observed the dot move from its
original position and in front of
known stars. Overall, he made 22
observations of a large object for 41
days until it disappeared behind the
Sun on February 11, 1801. He named the object Ceres Ferdinandea in honor of the Roman era
goddess of agriculture (Ceres or Cerere in Italian) and King Ferdinand of Sicily36, although it
was later known as Ceres. After publishing his data, other astronomers tried to find the object in
the August and September sky, without success.

Ceres Piazzi Gauss Calculations

Observation Date   Time (HH:MM:SS)   Right Ascension   Declination
Jan 2, 1801        08:39:04.6        51° 47′ 49″       15° 41′ 05″
Jan 22, 1801       07:20:21.7        51° 42′ 21″       17° 3′ 18″
Feb 11, 1801       06:11:58.2        54° 10′ 23″       18° 47′ 59″

[Figure: An asteroid’s elliptical orbit around the Sun, labeled with the semi-major axis, perihelion, and aphelion]
[Figure: Positions of Ceres on Jan 2, Jan 22, and Feb 11]
A 24-year-old German mathematician, Carl Friedrich Gauss, studied the complex
problem, taking into account that Piazzi’s observations were made (1) from an Earth
rotating once every 24 hours, (2) while the planet moved along its elliptical orbit around
the Sun, and (3) while the object itself also orbited the Sun. Gauss needed to
understand the object’s orbit through an ever changing, time-sensitive set of motions.
In general, the orbit of a planet or asteroid is classified by how closely it resembles a
circle. This measure is called eccentricity and is the deviation from a
circle, which has an eccentricity of 0. An ellipse has an eccentricity between 0 and 1,
a parabola has an eccentricity of exactly 1, and a hyperbola has an
eccentricity greater than 1.
[NOTE: If you would like to try your hand
at constructing an ellipse, please see the appendix.] No
one knew what type of orbit Ceres was following, but Gauss
assumed it was elliptical - i.e. an eccentricity between 0 and 1. Mathematicians and
astronomers had no known methods to compute an elliptical orbit from available observations.
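The role of eccentricity can be sketched in a few lines of Python. The function names are illustrative, and the Ceres values (a semi-major axis of about 2.77 AU and an eccentricity of about 0.08) are approximate modern figures used only as an example, not numbers from Piazzi or Gauss.

```python
def classify_orbit(eccentricity: float) -> str:
    """Name the conic section that corresponds to an orbital eccentricity."""
    if eccentricity < 0:
        raise ValueError("eccentricity cannot be negative")
    if eccentricity == 0:
        return "circle"
    if eccentricity < 1:
        return "ellipse"
    if eccentricity == 1:
        return "parabola"
    return "hyperbola"


def perihelion_aphelion(semi_major_axis_au: float, eccentricity: float):
    """Closest and farthest distance from the Sun for a closed orbit, in AU."""
    if not 0 <= eccentricity < 1:
        raise ValueError("only circles and ellipses have an aphelion")
    closest = semi_major_axis_au * (1 - eccentricity)
    farthest = semi_major_axis_au * (1 + eccentricity)
    return closest, farthest


# Approximate modern values for Ceres, for illustration only.
CERES_SEMI_MAJOR_AXIS_AU = 2.77
CERES_ECCENTRICITY = 0.08

shape = classify_orbit(CERES_ECCENTRICITY)
closest, farthest = perihelion_aphelion(CERES_SEMI_MAJOR_AXIS_AU, CERES_ECCENTRICITY)
print(f"Ceres follows an {shape}: {closest:.2f} AU to {farthest:.2f} AU from the Sun")
```

An eccentricity this close to 0 means the orbit of Ceres is nearly circular, which is why Gauss's assumption of an ellipse was a safe one.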
From Piazzi’s 22 observations, Gauss decided to work with only three
from January 2, January 22, and February 1137. The actual orbit of the
Earth was well understood in 1801, so Gauss could pinpoint Piazzi’s
position for these
observations. Using the exact
time to the fraction of a
second, and two angles down
to the tenths of seconds of arc,
but lacking the distance from
Palermo to the white dot,
Gauss was able to construct
11 equations in 6 unknowns
and solve this complex problem using a “least squares” approximation
method he had developed years earlier to analyze the Moon’s orbit.
Example: The meaning behind the name of asteroid "2012 DA14"

Year: 2012

First letter (half-month of discovery; the letter I is not used): D

Month   Starting day 1   Starting day 16
Jan     A                B
Feb     C                D
Mar     E                F
Apr     G                H
May     J                K
Jun     L                M
Jul     N                O
Aug     P                Q
Sep     R                S
Oct     T                U
Nov     V                W
Dec     X                Y

Second letter (the letter I is not used): A

A=1  B=2  C=3  D=4  E=5  F=6  G=7  H=8  J=9  K=10  L=11  M=12  N=13
O=14  P=15  Q=16  R=17  S=18  T=19  U=20  V=21  W=22  X=23  Y=24  Z=25

Subscript: 14

Multiply the subscript by 25 and add the number of the second letter. So 14 becomes 14*25+1 = 351.
As a result, asteroid "2012 DA14" was the 351st object found in 2012 in the 2nd half of February.
Least squares can help estimate an orbit when there are more
equations than unknowns. It is often used to determine the
approximate shape and direction of a best fitting curve with a
given set of points. This is done by minimizing the sum of the
squares of the offsets of the data points from the curve. On the left is an
example of red data points and the resulting blue curve that
could be drawn as the line that would best represent the points.
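A minimal sketch of the idea, fitting a straight line rather than an orbit: the closed-form solution below minimizes the sum of squared vertical offsets between the points and the line. The sample points are made up for illustration, and this is the textbook line fit, not a reconstruction of Gauss's orbit calculation.

```python
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xx = sum(x * x for x, _ in points)
    sum_xy = sum(x * y for x, y in points)
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical")
    slope = (n * sum_xy - sum_x * sum_y) / denominator
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


def sum_squared_offsets(points, slope, intercept):
    """The quantity that least squares minimizes."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


# Made-up noisy measurements that lie near the line y = 2x + 1.
observations = [(0, 1.1), (1, 2.9), (2, 5.2), (3, 6.8), (4, 9.1)]
slope, intercept = fit_line(observations)
best = sum_squared_offsets(observations, slope, intercept)
print(f"y = {slope:.2f}x + {intercept:.2f}, sum of squared offsets = {best:.3f}")
```

Nudging the slope or intercept away from the fitted values always increases the sum of squared offsets, which is what "best fitting" means here.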
In Gauss’s case as
shown on the right,
using just 3
observation points could mean the object is traveling
through space in a circular, parabolic, elliptical, or
hyperbolic curve. Gauss leveraged the work of
Johannes Kepler almost two centuries earlier and
assumed Ceres followed an elliptical orbit.
On November 25, 1801, astronomers were able to find Ceres in the sky not far from where
Gauss had predicted it would be38. The basis of Gauss’s calculations is still used today to
calculate post-flight trajectory simulations of solid and liquid fueled rockets39.
As an asteroid, it was soon given the name “1 Ceres” as early discoveries were given a number
followed by a mythical name such as 2 Juno, 3 Pallas, 4 Vesta, and so on40. Over time, the
MPC adopted other naming conventions including a provisional designation and a permanent
designation. These
names can be confusing.
To the right is an
explanation of the
provisional designation
for asteroid “2012 DA14”
discovered on February
23, 201241. Permanent
numbers are assigned by
the International Astronomical Union (IAU) when the object has enough observations to ensure
it can be found at another time.
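The provisional designation scheme is mechanical enough to decode in code. The sketch below follows the rules described above (half-month letters A through Y, order letters A through Z, the letter I skipped, and a subscript counting completed cycles of 25). The function name and the returned dictionary layout are illustrative choices, and the parser only handles the simple "year, space, two letters, optional number" form.

```python
import re

LETTERS = "ABCDEFGHJKLMNOPQRSTUVWXYZ"   # 25 letters; I is not used
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def decode_provisional(designation: str) -> dict:
    """Decode a designation such as '2012 DA14' into its parts."""
    match = re.fullmatch(r"(\d{4}) ([A-HJ-Y])([A-HJ-Z])(\d*)", designation)
    if match is None:
        raise ValueError(f"unrecognized designation: {designation!r}")
    year, first, second, subscript = match.groups()
    half_index = LETTERS.index(first)            # 0 = first half of January
    cycles = int(subscript) if subscript else 0  # completed runs through A..Z
    return {
        "year": int(year),
        "month": MONTHS[half_index // 2],
        "half": "first" if half_index % 2 == 0 else "second",
        "order": cycles * 25 + LETTERS.index(second) + 1,
    }


print(decode_provisional("2012 DA14"))
print(decode_provisional("2014 EC"))
```

For "2012 DA14" this gives the second half of February 2012 and discovery number 351, matching the worked example.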
How Do We Find Asteroids Today?
Telescopes are designed to receive electromagnetic waves within particular ranges of wavelength.
We are very familiar with visible light, which allows us to see colors in the 400–700
nanometer (nm) wavelength
range, but there are many
wavelengths that we cannot
see. There are shorter X-ray
and ultraviolet
wavelengths, as well as longer infrared and radio wavelengths.
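Wavelength and frequency are two descriptions of the same wave, linked by the speed of light (frequency = speed of light / wavelength). A small sketch, with an illustrative function name, converts the visible range quoted above into frequencies:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # exact, by definition of the meter


def frequency_thz(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometers to a frequency in terahertz."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return SPEED_OF_LIGHT_M_PER_S / wavelength_m / 1e12


for wavelength_nm in (400, 700):
    print(f"{wavelength_nm} nm is about {frequency_thz(wavelength_nm):.0f} THz")
```

Shorter wavelengths correspond to higher frequencies, so the 400 nm (violet) end of the visible range has the higher frequency.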
Optical telescopes are either ground-based or space-based, use lenses or mirrors, and are generally
designed to capture light in the infrared through X-ray spectrum. Their images can be affected
by atmospheric distortions, so they are often located on high mountain tops to minimize the
interference, or in space42. Asteroids appear much brighter in infrared than in visible light.43
Radio telescopes are found mostly on Earth, and use parabolic receivers to capture long
wavelengths. Asteroids that reflect sunlight can be seen by optical telescopes while very dark
non-reflective asteroids are best viewed by a radio telescope. This set of Crab Nebula images
shows the amount of information available in each of the wavelengths44.
[Figure: The Crab Nebula imaged in radio wave, infrared, visible light, ultraviolet, and X-ray wavelengths]
Optical Telescopes
There are three basic types of optical telescopes – refractor, reflector, and compound.
A refractor telescope has a large glass lens at its far end that bends (refracts) light
to the focal point, where it is magnified when viewed through the eyepiece45. Isaac Newton
invented the reflector telescope. Light bounces (reflects) off a rear mirror until it reaches a
flat mirror. It is then directed to the eyepiece after reaching the focal point. The compound or
catadioptric telescope uses reflecting and refracting to reduce optical error. Light is bounced off
a curved mirror in the back, then bent by a lens towards the front, and finally sent backward again
through its focal point and out the eyepiece.
Charge-Coupled Device – CCD
This miracle of integrated circuits revolutionized the world of
photography and optical telescope-based astronomy. Up until
1980, modern astronomers relied on film cameras. Invented at Bell
Labs in 1969 for use as a memory device46, the CCD ushered in
the era of digital photography, which meant images could be
transmitted and digitally stored on a disk. This is the same camera technology that we now take
for granted in our smartphones. Whereas film uses silver halides suspended in an emulsion to
capture certain wavelengths of photons, the silicon CCD transforms wavelengths into electric
signals. Without the CCD and powerful processors with large memory capacity, telescopes such
as the Hubble Space Telescope would be nearly impossible, since they could not rely on film for imagery.
A CCD contains an array of photodiodes that
essentially absorb photons of light and convert them into a
measurable electrical charge47. Comprised of silicon,
they absorb photons and store the resulting charge like a capacitor
such that the greater the number of photons, the
higher the electrical charge. In rapid succession, single
pixels contained in shifting rows of image information
are processed by dedicated circuits and handed off to
a serial shift register – something that assembler
language programmers are very familiar with.
Electron packets accurately timed by a horizontal shift
register clock are shifted one row at a time to an
output amplifier which registers the photodiode
charge. When the array has been exposed to light,
the values are stored in memory - see the illustration
to the left48.
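The readout sequence can be imitated with ordinary Python lists. This is a deliberately simplified sketch: the function names are made up, the last row of the array is assumed to be the one nearest the serial shift register, and real devices add clocking, amplification, and analog-to-digital conversion that are not modeled here.

```python
def read_out(charges):
    """Return pixel charges in the order a simplified CCD would deliver them.

    `charges` is a list of rows. The last row is treated as the one nearest
    the serial shift register (an assumption of this sketch).
    """
    rows = [list(row) for row in charges]   # copy so the input is untouched
    stream = []
    while rows:
        serial_register = rows.pop()        # shift the nearest row into the register
        while serial_register:
            # shift one pixel at a time toward the output amplifier
            stream.append(serial_register.pop(0))
    return stream


def rebuild_image(stream, width):
    """Reassemble the serial stream into rows, restoring top-to-bottom order."""
    rows = [stream[i:i + width] for i in range(0, len(stream), width)]
    rows.reverse()
    return rows


# Made-up charge values for a 3 x 3 exposure.
exposure = [
    [0, 12, 3],
    [5, 40, 7],
    [1, 9, 2],
]
stream = read_out(exposure)
image = rebuild_image(stream, width=3)
print(stream)
print(image == exposure)
```

The point of the exercise is that a two-dimensional picture leaves the chip as a one-dimensional stream, and software must know the row width to put it back together.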
A 1-bit asteroid representation:

0 0 0 0
0 1 1 0
0 0 1 1
0 0 1 0

Photodiode material          Wavelength (nm)
Silicon                      190–1100
Germanium                    400–1700
Indium gallium arsenide      800–2600
Lead(II) sulfide             <1000–3500
Mercury cadmium telluride    400–14000

Source: https://en.wikipedia.org/wiki/Photodiode
The CCD memory images are bitmap (raster) graphics – a series of black and white dots (pixels).
The images lend themselves to a
table layout similar to Excel’s (x, y)
addressing scheme of rows,
columns, and cells. This allows the
data to easily be manipulated using
most computer languages. In this
simple example, you see a
magnified asteroid shape translated
into a 1-bit matrix image of zeroes
and ones. With an 8-bit image, up to 256 shades of gray can be represented in each cell based
on the electron charge of each pixel. More bits per pixel mean finer shading and a larger disk storage
requirement.
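Because a bitmap is just a grid of numbers, it maps directly onto a list of lists. The sketch below stores the 4 x 4 one-bit asteroid from the example and works out how bit depth affects the number of gray shades and the uncompressed storage size. The helper names are illustrative.

```python
import math

# The 1-bit asteroid from the example: 1 marks a lit pixel.
ASTEROID_1BIT = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]


def shades(bits_per_pixel: int) -> int:
    """Number of distinct values one pixel can hold."""
    return 2 ** bits_per_pixel


def storage_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed size of an image, rounded up to whole bytes."""
    return math.ceil(width * height * bits_per_pixel / 8)


def lit_pixels(image):
    """(x, y) addresses of every non-zero cell, row by row."""
    return [(x, y) for y, row in enumerate(image)
            for x, value in enumerate(row) if value]


print(lit_pixels(ASTEROID_1BIT))
print(shades(8), "shades of gray in an 8-bit image")
print(storage_bytes(100, 100, 8), "bytes for a 100 x 100 8-bit image")
```

A 100 x 100 image at 8 bits per pixel needs 10,000 bytes, roughly 10 KB.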
The material used to build the CCD photodiode dictates the
wavelengths it records. For example, a silicon photodiode
captures light from 190 to 1100 nm, a range that spans ultraviolet,
visible, and near-infrared wavelengths.
Fairchild Semiconductor produced the first commercial CCD in 1973. With a
resolution of 100 x 100 pixels (~10 KB), it was used in a telescope the
following year49. In 1975, Kodak built the first digital camera. It weighed 8
pounds and recorded a 0.01 megapixel (100 x 100 pixels) black and
white photo to cassette tape (shown to the right of the blue body of the
camera) in 23 seconds50. In comparison, the iPhone 6s incorporates a
12-megapixel camera51.
Color filters enable a grayscale CCD to record color
images. A red filter allows only red light to pass through
to the pixel, a green filter absorbs all the colors of
visible light except for green, and so forth. CCDs can
be arranged in a mosaic with discrete color “Bayer”
filters as shown to the left, with each CCD mapped to a
primary color.
Multi-chip mosaics are a cost-effective way to gain the
advantages of a much larger CCD or can be used to build
a camera with far greater resolution than might be
available with a single chip design. The image to the right
is from the wide-field Chilean VLT Survey Telescope that
uses 32 CCD chips, each with 2K x 4K pixels, making the
entire mosaic a 16K-by-16K array, or 268 megapixels52.
Radio and Radar Telescopes
All telescopes capture
photons. Optical telescopes
capture photons with a
wavelength of about 390-
700 nm (purple to red) and
record them with a CCD
camera. Radio telescopes capture the longest wavelengths, typically 1 millimeter up to
hundreds of meters, and do not use a CCD camera.
Even though the same object in the sky emits
photons across all wavelengths, our eyes can only
process certain wavelengths – i.e., we cannot see
or hear a radio wave. The parabolic shape of the
radio dish antenna focuses the low energy photons
at the antenna. The antenna absorbs the energy
and hands the weak space signal to an amplifier.
From there, the signals are usually recorded on a
disk drive and processed by computer.
Radar telescopes detect asteroids (or any other
object) by first sending a radio signal into space, and if
it bounces off an asteroid, the antenna receives that
signal – a “ping” and “echo”. The amount of time the
radio wave takes to make the round trip is used to calculate the distance from the dish to the
asteroid. The technique is called ranging and is the basis of RADAR (Radio Detection and
Ranging).
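Ranging comes down to one formula: distance equals the speed of light multiplied by the round-trip time, divided by two because the signal travels out and back. The sketch below uses illustrative function names and, as its example, the 0.0472 AU close approach of asteroid “2007 PA8” mentioned later in this article.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
KM_PER_AU = 149_597_870.7              # one astronomical unit in kilometers


def range_km(round_trip_seconds: float) -> float:
    """Distance to the target from the time between ping and echo."""
    return SPEED_OF_LIGHT_KM_PER_S * round_trip_seconds / 2


def round_trip_seconds(distance_km: float) -> float:
    """Time for a radio signal to reach a target and return."""
    return 2 * distance_km / SPEED_OF_LIGHT_KM_PER_S


distance_km = 0.0472 * KM_PER_AU       # close approach of 2007 PA8
echo_delay = round_trip_seconds(distance_km)
print(f"An echo from {distance_km:,.0f} km away returns after {echo_delay:.1f} seconds")
```

Even at several million kilometers the echo returns in well under a minute, which is why timing must be precise: an error of one millisecond in the round trip corresponds to about 150 km in range.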
[Figure: Radar imaging of an asteroid, with plots of wavelength against time around the broadcast wavelength. Panel captions: “Radio dish sends signal”; “Signal reflects from closest parts of asteroid first”; “Reflected wavelengths compressed from parts rotating towards antenna, extended from parts rotating away”; “Radio dish sees return signals at many wavelengths around broadcast one”]
The following set of 5 images is based on the work of Emily Lakdawalla53 and depicts a radio
dish sending a signal towards the asteroid. The asteroid is moving, rotating and irregularly
shaped. The signal bounces off the closest part of the asteroid first, with subsequent waves
bouncing back as they reach the farthest portions of the asteroid. As the dish receives and
processes the reflected signals, a waveform image of the asteroid begins to appear.
Eventually, the dish receives the entire reflected signal, including those parts bouncing off the
farthest face of the asteroid.
Since the object is irregular, rotating, and moving (left to right,
near to far, etc.), the imagery taken over days would show
multiple facets of the asteroid. For example, in this radar image
taken of asteroid “2007 PA8”, these 9 reflected images were
taken over a 2 week period and show multiple sides of this
rotating and moving object.
From the orbit diagram of November 5, 2012, the asteroid came within 0.0472 AU or 4 million
miles from the radar dish on Earth54 (Earth’s “white” orbit appears
next to the 2007 PA8 “blue” orbit.) Processing the radar
image makes it possible to estimate the size of the asteroid and its
movement since the radio signals are transmitted and received at the speed of light.
With a radar telescope, astronomers are not tied to reflected sunlight or radiation. By bouncing
a signal off an object, day or night, clear sky or cloudy, the object is illuminated by reflected
radio waves allowing them to evaluate its intensity, direction, orbit and other deduced data.
Ground-Based Telescopes
Telescopes can be located on Earth or in space, with pros and cons for each approach. For
example, Earth-bound telescopes can use very large mirrors such as the 10-meter mirror in the
Keck Observatory in Hawaii whereas the Hubble Space Telescope uses a 2.4-meter mirror.
Larger mirrors gather more light and ground telescopes generally cost less. Space-based
telescopes are free from Earth’s atmospheric distortions and can capture a wider range of wavelengths
of light, including those that would normally be filtered out by our atmosphere55. With that in mind, let’s take a
look at some of the major telescopes in use and their standing in the big data era.
Large Synoptic Survey Telescope - LSST - Optical Telescope
Scheduled to be operational in January 2022, the LSST’s goal is
to photograph space from Earth every few nights to find asteroids
and perhaps unlock the nature of dark energy. Using a wide field
of view telescope to record images to its 3.2 gigapixel CCD
camera, the LSST will take about 800 panoramic images a night
equaling 15 TB of raw data every day56. To put that into
perspective, the Sloan Digital Sky Survey (SDSS) in 2000 gathered in just a few weeks more
data than throughout the then-history of astronomy. In a matter of a few days, the LSST will gather
more data than the entire SDSS project57.
Over its ten year mission, hundreds of petabytes will be processed to produce 60 PB of data
and a 15 PB database catalog, thereby creating a 3D map of space effectively allowing a user
to “fly” through space58. The camera will take a 15-second exposure every 20 seconds59
covering 6 wavelength bands from 320 nm near ultraviolet to 1050 nm near infrared, and is expected
to take over 200,000 pictures a year occupying well over a petabyte of uncompressed disk
space.
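The petabyte figure can be sanity-checked with simple arithmetic. The sketch below takes the 3.2 gigapixel camera and the 200,000 pictures a year from the text; the 2 bytes stored per pixel is an assumption made for illustration, as are the variable names.

```python
PIXELS_PER_IMAGE = 3_200_000_000   # 3.2 gigapixel camera, from the text
IMAGES_PER_YEAR = 200_000          # from the text
BYTES_PER_PIXEL = 2                # assumption: 16 bits stored per pixel
SECONDS_BETWEEN_EXPOSURES = 20     # from the text

bytes_per_image = PIXELS_PER_IMAGE * BYTES_PER_PIXEL
gigabytes_per_image = bytes_per_image / 1e9
petabytes_per_year = bytes_per_image * IMAGES_PER_YEAR / 1e15
exposures_per_hour = 3600 // SECONDS_BETWEEN_EXPOSURES

print(f"{gigabytes_per_image:.1f} GB per uncompressed image")
print(f"{petabytes_per_year:.2f} PB of uncompressed images per year")
print(f"{exposures_per_hour} exposures per hour of continuous observing")
```

Under the 2-bytes-per-pixel assumption the yearly total lands a little above one petabyte, in line with the estimate above.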
The LSST camera uses 189 4K x 4K CCD chips
arranged in a mosaic focal plane. In this image, you can
see the 21 replaceable electronic physical (x, y)
assemblies (called rafts), with each raft containing 9
CCD chips in a 3 x 3 mosaic. If you look at the center
raft, you will see the addressing scheme also uses (x, y)
with (0, 0) in the lower left and (2, 2) in the upper right.
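The mosaic layout can be expressed with the same (x, y) addressing. In this sketch the constant and function names are illustrative, and "4K" is taken to mean 4,096 pixels on a side, which is an assumption.

```python
RAFTS = 21                 # replaceable assemblies, from the text
CCDS_PER_RAFT_SIDE = 3     # each raft is a 3 x 3 mosaic
CCD_SIDE_PIXELS = 4096     # assumption: "4K" means 4,096 pixels


def ccd_addresses():
    """(x, y) address of each CCD in a raft, from (0, 0) to (2, 2)."""
    return [(x, y)
            for y in range(CCDS_PER_RAFT_SIDE)
            for x in range(CCDS_PER_RAFT_SIDE)]


addresses = ccd_addresses()
total_ccds = RAFTS * len(addresses)
total_pixels = total_ccds * CCD_SIDE_PIXELS ** 2

print(addresses)
print(f"{total_ccds} CCDs, {total_pixels / 1e9:.2f} gigapixels")
```

Twenty-one rafts of nine chips give the 189 CCDs quoted above, and at 4,096 x 4,096 pixels each the total is close to the camera's stated 3.2 gigapixels.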
The LSST’s camera is enormous.
Pictured to the left, it weighs 6,200
pounds, and is 5.5 feet tall and 9.8
feet wide. On the right is a picture of
a staffer showing the relative size of
the CCD mosaic.
The LSST will create unprecedented volumes of high-quality data – more than astronomers can
manually process every night. It will mark a revolution in how humans will explore space through
computer science. This effort is classified as a big data problem as the management and data
mining of this real-time data is paramount for astronomers to interpret the observations. Initial
computational requirements are estimated at 3,000 16-core compute nodes at the
telescope’s location in Chile60. Within 60 seconds, the captured image data must undergo a
multi-step parallel processing reduction to find asteroids and other moving objects, all before the next
batch of data comes in61. Once a day, raw data and metadata are sent 5,000 miles to a
supercomputer at the University of Illinois to be reprocessed and archived. Archiving the data
will initially require 150 teraflops of compute power, growing to nearly a petaflop by the 10th
year, and use 15 PB of disk space a year. The immense volume of data must be statistically
analyzed for low-level correlations to help reverse-engineer the results and determine the cause
and underlying cosmic physics – this is called the “inverse problem”62.
The 2010 prototype used 200,000 lines of C++ and Python code.63 “The Large Survey Database
(LSD) is a Python framework and DBMS for distributed storage, cross-matching, and querying
of large survey catalogs (>10^9 rows, >1 TB).”64 The processing complex is estimated to have a
source catalog of 350 billion rows and an object catalog of 37 billion rows, each with 200+
attributes, all representing 400,000 16-megapixel images65. The LSD uses partitioned tables
stored as compressed Hierarchical Data Format 5 (HDF5) files. HDF5 uses B-trees to index
table objects and works well with 3D data for faster access than the rows of an SQL database.
HDF5 can represent complex data objects and metadata much more simply and quickly than a star
schema66,67. “Vertically, the tables are partitioned into sets of related columns (‘column groups’),
grouping together logically related data (e.g. astrometry, photometry). Horizontally, the tables
are partitioned into partially overlapping “cells” by position in space (lon, lat) and time (t).”68
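The two kinds of partitioning can be illustrated with plain Python dictionaries. This is only a sketch of the idea, not the LSD API: the cell sizes, column names, and function names are hypothetical, and the partial overlap between neighboring cells described in the quotation is left out for brevity.

```python
from collections import defaultdict

CELL_DEGREES = 10.0   # hypothetical cell width in longitude and latitude
CELL_DAYS = 30.0      # hypothetical cell length in time

# Vertical partitioning: logically related columns are grouped together.
COLUMN_GROUPS = {
    "astrometry": ("lon", "lat", "t_days"),
    "photometry": ("magnitude",),
}


def cell_key(lon, lat, t_days):
    """Horizontal partitioning: map a detection to its (lon, lat, t) cell."""
    lon_index = int((lon % 360.0) // CELL_DEGREES)
    last_lat_index = int(180.0 // CELL_DEGREES) - 1
    lat_index = min(int((lat + 90.0) // CELL_DEGREES), last_lat_index)
    time_index = int(t_days // CELL_DAYS)
    return lon_index, lat_index, time_index


def partition(rows):
    """Store each row's column groups under the cell it falls in."""
    cells = defaultdict(lambda: {group: [] for group in COLUMN_GROUPS})
    for row in rows:
        key = cell_key(row["lon"], row["lat"], row["t_days"])
        for group, columns in COLUMN_GROUPS.items():
            cells[key][group].append(tuple(row[c] for c in columns))
    return dict(cells)


# Made-up detections for illustration.
detections = [
    {"lon": 15.0, "lat": -5.0, "t_days": 45.0, "magnitude": 19.2},
    {"lon": 17.5, "lat": -2.0, "t_days": 50.0, "magnitude": 18.7},
    {"lon": 200.0, "lat": 40.0, "t_days": 45.0, "magnitude": 21.0},
]
cells = partition(detections)
for key, groups in sorted(cells.items()):
    print(key, groups["photometry"])
```

A query about one patch of sky and one time window then only needs to open the matching cells, and only the column groups it actually uses.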
[Figure: ATLAS image processing. Four CCD images taken minutes apart are combined (+), then the static image is subtracted (-), leaving possible asteroids]
Asteroid Terrestrial-impact Last Alert System – ATLAS – Optical Telescope
ATLAS was designed to be Earth’s asteroid collision “early warning” system. It scans space to
provide a day's warning for 30-kiloton “town killer” asteroid impacts, a week’s notice for a
5-megaton, 150-foot diameter “city killer” asteroid, and three weeks of warning for a
100-megaton, 390-foot “county killer” strike69. (NOTE – the Chelyabinsk meteor had an
estimated mass of about 13 kilotons and a diameter of 66 feet). ATLAS’s first discovery
(composite image to the right) came on August 9, 2015, when it spotted asteroid “2015 PE312”,
estimated to be 200-500 feet in diameter based on its brightness70.
If ATLAS provides enough lead time, authorities can evacuate an impact area, or a tsunami
zone if the object strikes the ocean. With two ground-based telescopes 100 miles apart, ATLAS
robotically scans the sky four times every night seeking out NEOs by looking for movement
against the background of stars and galaxies. ATLAS may eventually have 8 telescopes.
The ATLAS system can analyze 500 MB/min to make detailed comparisons of images taken
one hour apart71. The telescope observes the same area of space four times before software
combines the exposures into a single image. As this illustration shows, algorithms subtract
static “stars” and “planets”, leaving only objects that appear to be moving. Objects moving in a
straight line between images become “suspect” asteroids. With a “suspect” asteroid, the system
searches a database in real-time for this object using its coordinates and brightness data and
issues an alert within 10 minutes after analysis72. More on this critical step in the
section “Using Hadoop To Spot An Asteroid”.
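The “moving in a straight line” test can be sketched as a constant-velocity check on the detections from the four exposures. This is a minimal illustration, not ATLAS's software: the `(time, x, y)` tuple layout, the pixel tolerances, and the function name `is_linear_mover` are all assumptions.

```python
def is_linear_mover(detections, tol=1.0, min_shift=2.0):
    """Return True if (time, x, y) detections fit constant-velocity motion.

    detections: list of (time, x, y) tuples, one per exposure, sorted by time.
    tol: largest allowed distance (pixels) between a detection and the
         position predicted from the first and last detections.
    min_shift: smallest total displacement (pixels) that counts as motion.
    """
    if len(detections) < 3:
        return False
    t0, x0, y0 = detections[0]
    t1, x1, y1 = detections[-1]
    if t1 <= t0:
        return False
    # Velocity implied by the first and last detections.
    vx = (x1 - x0) / (t1 - t0)
    vy = (y1 - y0) / (t1 - t0)
    shift = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    if shift < min_shift:
        return False  # effectively static: a star, not a mover
    # Every intermediate detection must lie close to the predicted track.
    for t, x, y in detections[1:-1]:
        px = x0 + vx * (t - t0)
        py = y0 + vy * (t - t0)
        if ((x - px) ** 2 + (y - py) ** 2) ** 0.5 > tol:
            return False
    return True


mover = [(0, 10, 10), (15, 13, 11), (30, 16, 12), (45, 19, 13)]
star = [(0, 5, 5), (15, 5.2, 5), (30, 5, 5.1), (45, 5.1, 5)]
print(is_linear_mover(mover), is_linear_mover(star))
```

The tolerances trade false positives against missed objects, so in practice they would be tuned to the telescope's astrometric accuracy.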
The ground-based ATLAS will have the same limitations as other telescopes of this variety – the
Sun makes it impossible to see what is directly behind it and its glare blocks out those reflective
asteroids in a perimeter around the Sun. That is what happened with the Chelyabinsk meteor –
it came from the direction of the Sun and was not visible. With ATLAS located in Earth’s
northern hemisphere, it is also unable to see into a major part of the southern sky. The Moon
also reflects the Sun’s light causing other asteroids coming from that direction to not be visible.
ATLAS exemplifies the blurred lines between astronomy and automation. A human would be
hard pressed to accomplish this mission without serious compute power. Each telescope will
have a 10.5 K x 10.5 K CCD equaling 110 megapixels and take 1,000 images a night73. That
equates to 150 GB every day or 55 TB/year/telescope. With two telescopes, 110 TB a year will
be generated, and if eight telescopes come on-line, they will generate about 440 TB a year, or almost a petabyte every two years.
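The storage figures above follow from simple arithmetic, sketched here. The pixel count, image count, and 150 GB per night come from the text; the use of decimal units (1 TB = 1,000 GB) and the derived bytes-per-pixel value are assumptions of this sketch.

```python
pixels_per_side = 10_500        # 10.5 K x 10.5 K CCD
images_per_night = 1_000
gb_per_night = 150              # figure quoted in the text

megapixels = pixels_per_side ** 2 / 1e6
mb_per_image = gb_per_night * 1_000 / images_per_night
bytes_per_pixel = mb_per_image / megapixels      # derived, not from the source
tb_per_year = gb_per_night * 365 / 1_000


def yearly_tb(telescopes):
    """Yearly data volume in TB for a given number of identical telescopes."""
    return telescopes * tb_per_year


print(f"{megapixels:.2f} megapixels per image")
print(f"{bytes_per_pixel:.2f} bytes per pixel implied by 150 GB per night")
print(f"{yearly_tb(1):.2f} TB/year for one telescope")
print(f"{yearly_tb(2):.1f} TB/year for two, {yearly_tb(8):.1f} TB/year for eight")
```

Under these assumptions, eight telescopes come to roughly 438 TB a year.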
Satellite Telescopes
Hunting asteroids with a space telescope has many advantages over ground-based telescopes.
Space-based telescopes are not susceptible to the filtering of infrared and ultraviolet light by
Earth’s atmosphere or to the optical distortion caused by atmospheric turbulence. While
space telescopes cost more and are harder to repair, they allow astronomers to get clear
images of outer space. Let’s look at two space telescopes that will help us find asteroids.
NEOWISE – Optical Telescope
In 2009, NASA launched the 6 foot wide, 10 foot tall Wide-field
Infrared Survey Explorer (WISE) space telescope aboard a Delta
II rocket74. With solar panels for energy, WISE orbits 325 miles
above Earth and follows a Sun-synchronous path from the North
Pole to the South Pole75.
With infrared’s ability to find “dark” asteroids or ones that do not reflect a lot of visible light,
WISE uses four 1-megapixel CCDs of different infrared wavelengths to capture amazing images
of space76. This greatly enhanced infrared image of the dying
star Helix Nebula shows an asteroid’s red streaks. CCDs
made of Mercury-Cadmium-Telluride (MCT) capture the
infrared wavelength bands of 3.4 and 4.6 microns while
CCDs made of Arsenic-doped Silicon capture the 12 and 22-
micron bands77.
In this infrared illustration, WISE’s Scientist Dr. Amy Mainzer is holding a teacup. On the left,
there is not enough visible light to see any details. On the right, infrared shows many more
details. The same holds true in space when looking for
asteroids without the aid of visible light or when their surfaces
are not highly reflective. Dark asteroids absorb sunlight, so
they get hotter and appear to glow with infrared detection, just like Dr. Mainzer.
[Figure: WISE Science Data System @ IPAC (block diagram). Components shown: Ingest; Data Reduction Pipelines (Scan/Frame, Multi-Frame, WISE-MOPS); Quality Assurance; Final Product Generation; Level 0, Level 1, and Level 3 Archives; QA Metadata Archive; Tracklet Database; Release Product Archive (Atlas/Catalog); Archive System (IRSA); Minor Planet Center; Protected and Public Web Services.]
Every object in space emits infrared light, and the warmer it is, the greater the amount of
infrared light it produces. As a result, the WISE telescope needs to be colder than the objects
it observes or it would pick up infrared from the telescope itself. When WISE was launched, it
contained enough hydrogen to cool the telescope for 10 months. After that time, the Arsenic-
doped Silicon CCDs failed even though the MCT CCDs continued to operate78. NASA renamed
the WISE telescope NEOWISE (Near-Earth Object WISE) using just the surviving MCT CCDs.
In February 2011, NEOWISE was “turned off” or decommissioned. In September 2013, NASA
reactivated and reprogrammed NEOWISE to search for asteroids that could hit Earth as well as
asteroids that could theoretically be redirected into a Moon orbit79.
WISE takes a picture every 11 seconds and took 2.7 million of them in 2010. The Tracking and
Data Relay Satellite System (TDRSS) transmits WISE imagery to ground stations using
communication satellites operating at 300 megabits/s in the Ku/Ka-bands and 800 megabits/s in
the S-band80. WISE radios data 4 times a day in 15-minute sessions81. The computing complex
located in the Infrared Processing and Analysis Center (IPAC) at the California Institute of
Technology (Caltech) in Pasadena, California combines the images into a catalog for worldwide
access82. The satellite uses stored commands for automatic controls such as attitude control
and receives new sequences sent from the NASA Jet Propulsion Laboratory (JPL).
The IPAC processes images following this block diagram83. The Ingest module accepts
NEOWISE data packets, telemetry, and other data and puts them into the Level 0 database.
The Level 0 images are then handed off to Data Reduction Pipeline processing. This pipeline
removes instrument signatures and performs other QA work on the raw images84. The
WISE-MOPS portion of the pipeline finds the NEOs. The Final Product Generation module
documents the images and puts them in the Archive.
The processing of a raw image starts on the top left of
this sequence. It is filtered, with new bad and
previously bad pixels (shown in the yellow circle)
removed85.
In 2011, the WISE/IPAC processing used:
5 Sun/Oracle X4270 storage servers
15 Sun/Oracle J4400 SAS JBODs, H/W RAID, 3 X 18 TB usable per server; 270 TB total
42 node compute cluster; Dell 8‐core Xeon, 32 GB RAM, 0.5‐1 TB internal disk
3 Cisco 48‐port Catalyst 3750E switches with two 10 Gbit/s interfaces each
Resource management RHE4 (cluster), Solaris/ZFS (servers), NFS3, Condor, Ganglia86
Gaia Space Telescope – Optical Telescope
The European Space Agency used a Soyuz-STB rocket to launch an optical space
telescope named Gaia in December 2013 for a 5-year mission primarily to create a
3D catalog of 1 billion objects in space, or roughly 1% of our Milky Way galaxy87. It
uses an optical telescope and CCDs to capture images of stars in the 400 - 1000
nanometer wavelength and is expected to find thousands of planets the size of
Jupiter, quasars, and the positions and velocities of over 200,000 asteroids and
comets88.
Unlike most space telescopes, Gaia orbits what is known as the second Lagrange point, or L2 – a
gravitationally balanced location on the far side of Earth from the Sun where a satellite is free of
gravitational vibrations. Stationed 1 million miles from Earth, it will be unaffected by the
same blind spot that causes Earth-bound telescopes to be unable to detect
asteroids emerging from behind the Sun.
Using 106 CCDs, each with 4500 x 1966 pixels for a mosaic of 1 billion pixels, Gaia will take
images and collect makeup, position, motion, and other data on a billion stars and other objects
70 times over its 5-year mission. Each object will become a discrete Java object on Earth when
processed. The data is transmitted over a 5 Mbit/s radio link during an 8 hour period each day.
Gaia generates 50 GB of raw data daily, and by the time the mission ends, it will have created
200 TB of data. The data is stored in the main database and in InterSystems Caché, an
object-oriented database management system, and processed by the Data Processing and
Analysis Consortium (DPAC)89. The final product is estimated to equal one petabyte.
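Two of the figures above are easy to check with a few lines of arithmetic. This sketch uses only numbers quoted in the text and assumes decimal units (1 GB = 10^9 bytes); the variable names are illustrative.

```python
# Focal-plane mosaic size from the CCD count and dimensions in the text.
ccds = 106
pixels_per_ccd = 4500 * 1966
mosaic_pixels = ccds * pixels_per_ccd

# Volume that fits through the radio link during one daily contact period.
link_bits_per_s = 5e6            # 5 Mbit/s
contact_seconds = 8 * 3600       # 8 hours a day
link_gb_per_day = link_bits_per_s * contact_seconds / 8 / 1e9

print(f"{mosaic_pixels:,} pixels in the mosaic")
print(f"{link_gb_per_day:.1f} GB/day through the link")
```

The mosaic comes to just under one billion pixels, matching the text. The link volume is smaller than the 50 GB of raw data quoted per day; the text does not say how the two figures relate, and this sketch does not try to reconcile them.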
GAIA Data Processing Centers
Acronym   Coordination Unit   Location
ESAC      CU 1, 3             Madrid, Spain
BPC       CU 2, 3, 9          Barcelona, Spain
ISDC      CU 7                Geneva, Switzerland
IoA       CU 5                Cambridge, England
CNES      CU 4, 6, 8          Toulouse, France
OATO      CU 3                Torino, Italy
In 2013, Gaia was believed to be the largest astronomy data processing challenge to date90. To
process Gaia’s data, DPAC uses
a processing complex depicted
by the diagram to the right91. The
processing is performed by
equipment architected and
operated by over 400 European
scientists and software
developers from 24 countries
including France, Italy, UK,
Germany, Belgium, Spain, and Switzerland92. This “team effort” consortium has broken the Gaia
processing into 9 components to facilitate geographically distributed development. The
components are called Coordination Units (CU), 8 of which perform various aspects of
processing with the 9th handling the data archive catalog. CU1 and CU2 handle development
and simulations, and CU3, 5, and 6 handle the data processing of astrometric, photometric and
spectroscopic data. The CU3 is also known as the Astrometric Global Iterative Solution (AGIS)
and is designed to insert over 7 billion Java objects into the Caché database every day93.
Double star, orbital boundary, and solar system object analysis are performed by the CU4
component. CU7 tackles variable stars and CU8 handles spectral classification. Lastly, CU9 is
involved with Gaia data publication94.
The data processing would be distributed across the nations
listed in the table to the right. The DPAC requires that each CU
uses the Java framework to be database-agnostic and run using
any vendor’s database95.
An enormous amount of processing, as part of the AGIS “astrometric core solution”, is needed
to create position and motion data for the observed objects. While the main database (center of
the data flow diagram on the top of this page) holds the Gaia data and the results of data
processing, the AGIS contains a subset of the data for up to 40 passes through 100 TB of Java
objects in a 4-week period96. Multiple AGIS Java programs ingest 50 billion discrete 600-byte
objects contained in the 100 TB data in just 5 days. AGIS finished results are stored in a
versioned copy of the main database.
As an example of the processing power behind Gaia, the Barcelona,
Spain BPC data center in charge of CU2 simulations and CU3
Intermediate Data Updating (IDU) uses the “MareNostrum III”97
supercomputer that has 3,028 compute nodes using 16 core Intel
SandyBridge-EP E5-2670 processors (2.6 GHz), 32 GB of RAM and
500 GB of local disk. Interconnected with an Infiniband point–to–
point 10 Gb fiber optic network, the nodes utilize IBM’s General Parallel File System (GPFS,
now renamed to Spectrum Scale) mapped to 1.9 PB of disk space98.
In Toulouse, France, the Data Processing Center CNES (DPCC) is responsible for components
CU4, CU6, and CU8. They are handled with Dell servers used in both a Hadoop cluster and a
high performance compute cluster as pictured below99. CNES will have a big data mission to
assist in the processing of Gaia’s one petabyte of data stored in tables of 80 billion rows100.
The Square Kilometer Array – Mankind’s Largest Big Data Challenge – Radio Telescope
There is a new set of radio telescopes coming on-line called the Square Kilometer Array (SKA).
SKA will be the largest scientific instrument on the planet when completed101 and be 100 times
more sensitive than existing radio telescopes. The amount of data it is expected to generate will
dramatically push the boundaries of today’s computer science techniques.
With approximately 1/3rd of the telescopes located in Australia and 2/3rds in South Africa, SKA
will cover an area of 1,000,000 square meters, equaling the size of 187 American football
fields. Three different types of antennas will be used, each capable of receiving specific data
frequencies. The low-frequency aperture array uses dipole antennas to handle the 50 to 350 MHz
frequencies, acting in unison or as many smaller independent radio telescopes102,103. The mid
frequency is captured with dish antennas that cover the 350 MHz to 14 GHz spectrum while a
subset in the 350 MHz – 4 GHz range is handled with larger traditional parabolic antennas.

SKA Represents a Computing Revolution
                                      Petabytes      Exabytes    Zettabytes
                                      a year         a year      a year
Data generated by SKA2 antennas **    138,555,830    135,300     156
Data generated by SKA1 antennas       13,855,583     13,530      16
Global Internet Traffic 2013          430,080        420         0.5
SKA1 combined archive                 6,656          6.50        < 0.01
Business emails sent worldwide        3,000          2.90        < 0.01
Facebook uploads                      180            0.17        < 0.01
Google searches                       98             0.09        < 0.01
YouTube                               15             0.01        < 0.01
CERN                                  15             0.01        < 0.01
NOAA                                  6              < 0.01      < 0.01
Library of Congress                   5              < 0.01      < 0.01
** SKA1 = first phase of SKA = 10% of total projected data
Source: SpaceUp Toulouse - The Square Kilometre Array telescope
https://www.youtube.com/watch?v=PkR6LAOgSII
With the ability to scan the sky 10,000 times faster than before104, the SKA requires innovations
in supercomputing, algorithmic analytics, and disk storage. The telescopes use a “Central
Signal Processor” (CSP) to forward the image data by high-speed communication links to
scientists working around the world. The Digital Data Backhaul (DDBH) network moves signals
from the telescope to the CSP, then to the Science Data Processor (SDP), and finally to local
SKA distribution centers. The distances, some measured in thousands of kilometers, data rates
to 27 terabits/second105 (almost 300,000 TB/day), and its timing requirements will stretch the
limits of modern telecommunications.
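The “almost 300,000 TB/day” figure is a unit conversion from the 27 terabits/second link rate, shown here as a short sketch. It assumes decimal units and a link running at that rate for the full day; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def terabits_per_s_to_tb_per_day(terabits_per_s):
    """Convert a link rate in terabits/s to terabytes per day (8 bits per byte)."""
    terabytes_per_s = terabits_per_s / 8
    return terabytes_per_s * SECONDS_PER_DAY


peak_tb_per_day = terabits_per_s_to_tb_per_day(27)
print(f"{peak_tb_per_day:,.0f} TB/day")   # prints "291,600 TB/day"
```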
Initial SKA prototypes were named MeerKAT in
South Africa, and ASKAP and MWA in
Australia. MWA’s “Phase 1” will have 250,000
low-frequency antennas, increasing to a million
over time106. It should provide a much higher
resolution and will scan the sky 135 times faster
than existing radio telescopes.
In the first of multiple phases, telescopes will produce 160 TB of raw data per second (35,000
DVDs per second). With low-frequency range telescopes collectively generating 157 TB/s, and
mid frequency range telescopes generating 2 TB/s107, SKA is a big data computing project.
Individual telescopes will create up to 20 GB of raw data per second108. In total, up to 5
exabytes (EB) every day needs to be processed by a supercomputer, with the systems handling
156 zettabytes of data annually when fully operational. Data traffic is estimated at ten times the
current global internet traffic109 with the
SKA requiring enough fiber optic
cable to wrap around the Earth twice110.
The volume of data makes it impractical
to move through a network, so it must
somehow be processed where it finally
lands.
[Figure: SKA “Massive Data Flow, Storage & Processing” diagram. Stages shown: Antenna & Front-End Systems, Correlation, Data Product Generation, On-Demand Processing, Temporary Storage, High Availability Storage / DB, Long Term Storage. Labeled rates and capacities: > 7 Petabytes/s, > 300 Gigabytes/s, > 1 Exaflop/s, 30 Petaflops/s, 800 Petabytes, 18 PB/year.]
[Figure: Processing blade. A multi-core x86 host processor, two many-core accelerators (GPGPU or MIC, > 10 TFLOP/s each), and four disks of ≥ 1 TB each on the PCI bus, with 56 Gb/s links to the rack switches.]
Moore’s Law – every two years, the number of CPU transistors doubles, effectively doubling computer processing power
As shown in this SKA Big Data
Flow Diagram, the radio dish and
array data rates rapidly increase
to 5 PB/s in Phase 2.
Researchers are able to review
the data and work with subsets,
perhaps in a cloud computing
model, after it lands in the
Science Archive to the right of the diagram.
The parallel architecture needed to process these rates and
volume sizes must take into account the worldwide
geographic routing of data. Existing IT infrastructure simply
cannot handle these data rates. Imagine the impact of taking
an outage to cope with unplanned code upgrades or break-fix
issues. Here is a flowchart of the anticipated data rates. SKA
is the very definition of a truly ambitious big data project.
SKA’s 500,000 telescopes will collect an enormous 14 EB of radio signal data and store 1 PB
every day. If you tried to store a petabyte of data on an EMC VNX2 using RAID 6(14+2), you
would consume 300 x 4 TB drives every day111. However, the critical issue is the compute
power and infrastructure to process a petabyte of data every day and not disk capacity per se.
The scalability, bandwidth, power consumption, and drive characteristics such as Input/Output
Operations per Second (IOPs) would dictate a far more elegant solution (if it even exists today).
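The drive count above can be reproduced with a short calculation. This is a sketch that counts only RAID 6 (14+2) parity overhead; hot spares, formatting overhead, and the choice between binary and decimal petabytes are assumptions that move the answer by a few drives either way.

```python
import math


def drives_per_day(pb_per_day, drive_tb=4, data_disks=14, parity_disks=2,
                   tb_per_pb=1024):
    """Number of drives consumed per day to hold pb_per_day of usable data."""
    # In a 14+2 group only 14 of every 16 drives hold user data.
    usable_fraction = data_disks / (data_disks + parity_disks)
    raw_tb = pb_per_day * tb_per_pb / usable_fraction
    return math.ceil(raw_tb / drive_tb)


print(drives_per_day(1))                  # binary petabyte (1,024 TB)
print(drives_per_day(1, tb_per_pb=1000))  # decimal petabyte (1,000 TB)
```

Both results land just under the 300 drives a day quoted in the text.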
The SKA design team initially used a conservative blade
architecture design and extrapolated it to 2018/2020 to
handle future processing requirements. Building on the
LOFAR (Low-Frequency Array) low-power design112, the team envisioned
a Dell PowerEdge T620 with dual 8-core Xeon E5-2600 processors,
PCIe Gen3 15.75 GB/s expansion slots, 768 GB RAM, 32 x 2½” solid-state
disk drive bays, 2 x 10 or 2 x 40 GbE NICs, and 2 x 56 Gb/s Infiniband ports. Using
Moore's Law, these blades could have double to triple the processing
power by 2020 and be capable of 64 TFlops.
[Figure: SKA subsystems and service components. High-level APIs and Tools: UIF Toolkit, SKA Common Software Application Framework. Core Services: Access Control, Monitoring Archiver, Live Data Access, Logging System, Alarm Service, Configuration Management, Scheduling Block Service. Base Tools: Communication Middleware, Database Support, 3rd Party Tools and Libraries, Development Tools. Operating System.]
[Figure: Projects. 1. Algorithms and Machines 2. Access Patterns 3. Nanophotonics 4. Microservers 5. Accelerators 6. Compressive Sampling 7. Realtime Communications]
[Figure: 42U rack holding processing blades 1 through 20 and two 56 Gb/s leaf switches (Leaf Switch-1, Leaf Switch-2).]
Twenty of these 2U blades will be housed in a 42U rack. Each node, taking into
account memory, network interfaces, SSDs and other components, is expected to
consume 882 watts. Two 36 port Mellanox SX6536 Infiniband “leaf” switches
connect to one 56 Gb/s port on each blade, delivering 74.52 Tb/s of switching
capacity. Each rack would have an electrical power density of about 20 kW.
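The rack numbers above can be checked with a quick budget. The blade count, blade height, rack height, and per-node wattage come from the text; treating the leftover rack units as space for the leaf switches is an assumption, since the text does not give the switch height.

```python
BLADES = 20
BLADE_HEIGHT_U = 2
RACK_HEIGHT_U = 42
WATTS_PER_BLADE = 882

used_u = BLADES * BLADE_HEIGHT_U              # rack units taken by blades
spare_u = RACK_HEIGHT_U - used_u              # left over (assumed: switches)
blade_kw = BLADES * WATTS_PER_BLADE / 1_000   # power drawn by blades alone

print(f"{used_u}U used by blades, {spare_u}U spare")
print(f"{blade_kw:.2f} kW for the blades")
```

The blades alone account for 17.64 kW, so the “about 20 kW” per rack leaves a little over 2 kW for the switches and other overhead, which the text does not itemize.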
Creating a low-profile SKA processing building block is essential to be able to power
the overall processing complex necessary to handle the expected data rates. The
SKA 2013 “SDP Element Concept” architecture guide described a bulk storage
system incorporating a “scale-out” Xyratex ClusterStor 3000 which uses
the Lustre file system that is expandable to 30 PB and uses Infiniband to connect the
blades. Its power consumption is 18.5 kW113. [Note: Lustre (Linux Cluster) is a
parallel distributed file system used for large-scale cluster computing114.]
To explore the enormous processing power required over the entire SKA timeline, with a focus
on Phase 1 of SKA, IBM and the Netherlands Institute for Radio Astronomy (ASTRON) are
working to create a massively powerful computing system through advanced chip designs.
Called “Project DOME”, they will try to find energy efficient ways to
transport the huge data volumes between radio antennas to a central
location, and provide real-time data filtering and methods to store the
data. Ideally, they need to develop a 300 petaflop computer that uses
less than 8 MW of power, or more than 10 times the fastest
supercomputer with the same energy profile115. In total, ASTRON and IBM have mapped out 7
projects to handle this new SKA big data frontier. They include information management,
computer chip system design employing 3D stacked chips, optical interconnects, water cooling
and nanophotonics.
The software architecture is expected to include an Application layer, Common software layer,
High-Performance Computing (HPC)
services, and Operating System layers. The
designers envision a “loose coupling in the
higher layers of the software stack…” with tighter
coupling for performance oriented lower layers116. Further subdivisions of each layer are likely.
The Base Tools layer contains Common Software development tools and run-time environment
on top of the operating system. This layer contains a Communication Middleware that handles
intra-application exchanges, a Database Support component providing administration, data
access and abstraction application programming interfaces (API), and may include Cassandra,
the Hadoop database HBase, or relational databases such as MySQL and Postgres. Third party
tools and libraries might include astronomical libraries such as casacore, wcslib, HDF5, etc.117
“Development Tools comprises a comprehensive build system that supports recursive
compilation, executing of unit and functional tests and creation of deployable packages (release
process). It also provides wrappers on top of existing compilers such as make and/or SCons for
C++ applications, Ant/Maven for Java applications and setuptools for Python.”118
Access control and authentication, archiving of monitor data, access to SKA real-time
monitoring and control data, application logging, alarm tools, configuration management, and
scheduling are part of Core Services.
High-level APIs and Tools provide APIs, allowing packages to integrate and access core
services. The User Interface Toolkit has APIs for the Graphical User Interface (GUI) including
widgets for displays, log browsing, alarms, and tools to monitor and operate large scale control
systems.
The Science Data Processor binds hardware compute, network, software, and algorithms
together to handle data rates exceeding the daily worldwide web traffic119. Planned to be online
by 2020 and at “full power” by 2025, 100 petaflop supercomputers (100,000,000,000,000,000
floating point operations per second) will be needed to crunch SKA data120. Ultimately, exaflop
supercomputers will be required. As of June 2015, the fastest supercomputer is China’s Tianhe-
2. Capable of “just” 34 petaflops, it could only handle 1/3 of SKA’s requirements121. The
compute power is needed to process real-time image data from thousands of telescopes
operating at thousands of frequencies. Some of the calculations include122:
Removing corrupted data
Calibrating each antenna
Transforming the data onto a rectangular grid
Applying Fourier transformations to convert the data into an image of the sky
Removing data spikes from bright stars
The process then iteratively combines parameters such as complex gains to eventually create a
converged image. These steps are memory intensive and require massive data storage
capabilities. However, neither the processing power nor storage capabilities exist today on a
practical basis.
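The gridding and Fourier transform steps listed above can be illustrated on a toy scale. This is a sketch under strong simplifying assumptions: every cell of a small 8 x 8 grid is sampled (a real array samples only part of it), the sky holds a single unit point source, and calibration and flagging are skipped. The function names are hypothetical, and a real pipeline would use an FFT rather than this direct sum.

```python
import cmath

N = 8  # hypothetical grid size


def visibilities_for_point_source(sx, sy, n=N):
    """Gridded visibilities V[u][v] of a unit point source at pixel (sx, sy)."""
    return [
        [cmath.exp(-2j * cmath.pi * (u * sx + v * sy) / n) for v in range(n)]
        for u in range(n)
    ]


def inverse_dft_2d(vis):
    """Direct (slow) 2D inverse DFT that turns gridded data into an image."""
    n = len(vis)
    image = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            total = 0j
            for u in range(n):
                for v in range(n):
                    phase = 2j * cmath.pi * (u * x + v * y) / n
                    total += vis[u][v] * cmath.exp(phase)
            image[x][y] = (total / (n * n)).real
    return image


def brightest_pixel(image):
    """Return the (x, y) index of the largest value in the image."""
    n = len(image)
    return max(((x, y) for x in range(n) for y in range(n)),
               key=lambda p: image[p[0]][p[1]])


image = inverse_dft_2d(visibilities_for_point_source(3, 5))
print(brightest_pixel(image))
```

Even this toy version costs N^4 operations, which hints at why imaging at SKA scale calls for fast transforms and enormous compute.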
As we have seen in this section, SKA data rates will overwhelm the ability of astronomers and
data scientists to work with the raw data, pushing the analysis of patterns and correlations
beyond the limits of the human brain. SKA promises to redefine all that we associate with the
term big data – maybe we should call this “Ultra Big Data”?
Using Hadoop To Spot An Asteroid
With millions of asteroids in space, you would think it would be easier to find them. However,
their relatively small size poses a problem as they only appear to be tiny dots of light in the sky.
Is the dot a star or an asteroid? In order to find an asteroid, telescopic images must be
compared, and an object that seems to move from one image to the next might be an asteroid.
In Piazzi’s time, the comparison was done manually, and as a result, few asteroids were found.
French physicists first used a camera for astronomy in 1845, but the film was not sensitive
enough to capture starlight123. These days, telescopes are far more sensitive and film cameras
have been replaced by CCD cameras. Algorithms now compare images with positive findings
reviewed by astronomers. Algorithmic methods have plusses and minuses. Algorithms that are
too sensitive can yield many “false positives”, while algorithms that are not sensitive enough may miss the object.
The Catalina Sky Survey took 7 images of asteroid “2014 AA” on January 1, 2014124. This
SUV-sized asteroid weighed about 44 tons and burned up in our atmosphere the next day125.
These are 4 of those images126. At a high level, an Earth-bound telescope adjusted for planetary
rotation to take CCD images minutes apart of the same part of space. As mentioned in the ATLAS
section of this paper, the images were aligned and cleaned up through coaddition to allow image
subtraction to isolate the asteroid.
In greater detail, when a telescope takes multiple images of space minutes apart, images will
partially overlap, or images from different telescopes will need to be coadded to clean up and
enhance faint images. Starting with a base image, subsequent images of the same section of
space are algorithmically aligned and added to produce a sharper, brighter image. With
ever-higher resolution and an increasing number of images, astronomers rely on massively
parallel processing Hadoop systems to do this work. In this illustration, image data is injected
into the Hadoop system where dozens to thousands of nodes break the problem apart and
parallelize the search for boundary matching (MAP). The images are eventually stacked and
brought together into a mosaic (REDUCE). When the processing is complete, a final composite
image is produced127. This approach is far faster than a serial approach of image alignment.
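The MAP and REDUCE phases described above can be imitated in a few lines of ordinary Python. This is an in-process sketch of the pattern, not Hadoop code: it assumes each exposure already carries a known `offset` onto a common sky grid (so the alignment search itself is skipped), and it averages overlapping pixels, one common way to coadd. All names are hypothetical.

```python
from collections import defaultdict


def map_image(image):
    """MAP: emit (sky_pixel, value) pairs for one aligned exposure.

    image: dict with 'offset' (dx, dy) placing the frame on the sky grid
           and 'pixels', a 2D list of brightness values.
    """
    dx, dy = image["offset"]
    for row_idx, row in enumerate(image["pixels"]):
        for col_idx, value in enumerate(row):
            yield (col_idx + dx, row_idx + dy), value


def reduce_pixel(values):
    """REDUCE: average every exposure's value for one sky pixel."""
    return sum(values) / len(values)


def coadd(images):
    """Group mapped pixels by sky position, then reduce each group."""
    grouped = defaultdict(list)          # stands in for the shuffle step
    for image in images:
        for key, value in map_image(image):
            grouped[key].append(value)
    return {key: reduce_pixel(vals) for key, vals in grouped.items()}


exposures = [
    {"offset": (0, 0), "pixels": [[1.0, 3.0]]},
    {"offset": (1, 0), "pixels": [[5.0, 7.0]]},
]
print(coadd(exposures))
```

Because each exposure is mapped independently and each sky pixel is reduced independently, both phases can be spread across as many nodes as are available.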
To complete this image process, once bright static dots are isolated and subtracted from each
frame, an asteroid can be seen streaking across the sky, as in the Catalina Sky Survey images
shown earlier. This is called image or pixel subtraction128 and allows an asteroid’s motion to
stand out because the stars are so far away that they appear fixed in space. What is left is
possibly an asteroid. This is hard to spot without computer algorithms.
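Pixel subtraction itself is simple once the frames are aligned, as this sketch shows. It assumes frames are 2D lists of brightness values on the same grid and uses a hypothetical threshold to ignore small brightness fluctuations; the function name is illustrative.

```python
def subtract_static(frame, reference, threshold=10):
    """Return (row, col) pixels where frame exceeds reference by > threshold."""
    residual = []
    for r, (frame_row, ref_row) in enumerate(zip(frame, reference)):
        for c, (value, ref_value) in enumerate(zip(frame_row, ref_row)):
            if value - ref_value > threshold:
                residual.append((r, c))
    return residual


# The reference holds two static "stars"; each frame adds one moving dot.
reference = [[0, 100, 0], [0, 0, 0], [0, 0, 50]]
frame_1 = [[0, 101, 0], [60, 0, 0], [0, 0, 49]]
frame_2 = [[0, 99, 0], [0, 60, 0], [0, 0, 52]]

print(subtract_static(frame_1, reference))
print(subtract_static(frame_2, reference))
```

The stars cancel out in both frames, and the leftover dot shifts from one frame to the next, which is the signature of a possible asteroid.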
3D Asteroid Modeling – Try It Yourself!
Asterank is a database created by Ian Webster that contains information on over 600,000
asteroids129 using their known orbit and physical composition data from the Minor Planet Center
and NASA’s Jet Propulsion Laboratory. Webster’s highly informative 3D full-motion view of
asteroids shows their interaction with the planets and serves as a model of potential Earth
impactors. Here is a
still image of the 3D
view. The interface
allows for speed
settings, pan and
zoom, the layering
of planet orbits and
the Milky Way. In
this zoomed image
for May 17, 2016,
you can see the Sun in the center, the planetary orbits of Mercury, Venus, Earth, and Mars and
a portion of Jupiter, and the position of asteroids in this section of space. You are encouraged to
explore this database and the viewer at http://www.asterank.com. The source code is on
GitHub130.
There is also an API to query the MongoDB database using the syntax:
{field: {$lt: value} }, where $lt selects the documents where the value of the field is
less than the specified value131. For example,
{"e":{"$lt":0.1},"i":{"$lt":4},"a":{"$lt":1.5}}&limit=1 searches for an
Asteroid with Eccentricity E <0.1, Inclination (degrees) I <4, and a Semi Major axis < 1.5AU.
The query returns asteroid 138911 “2001 AE2”.
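The query above can be assembled programmatically, as in this sketch. The endpoint path in `API_URL` is an assumption (the text gives only the site address), the helper names are hypothetical, and the sample record passed to `matches` is made up for illustration rather than taken from the database.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint path; the text gives only the site address.
API_URL = "http://www.asterank.com/api/asterank"


def build_query_url(conditions, limit=1):
    """Build a query URL from a dict like {"e": 0.1}, meaning field e < 0.1."""
    query = {field: {"$lt": bound} for field, bound in conditions.items()}
    params = {
        "query": json.dumps(query, separators=(",", ":")),
        "limit": limit,
    }
    return API_URL + "?" + urlencode(params)


def matches(record, conditions):
    """Apply the same less-than conditions to a local dict, for offline checks."""
    return all(field in record and record[field] < bound
               for field, bound in conditions.items())


conditions = {"e": 0.1, "i": 4, "a": 1.5}
print(build_query_url(conditions))
print(matches({"e": 0.05, "i": 2.0, "a": 1.2}, conditions))  # made-up record
```

Fetching the URL with any HTTP client should return JSON records; the local `matches` helper makes it possible to test a filter without network access.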
Taking Action
While the threat of a cataclysmic, massive, civilization-ending asteroid colliding with Earth has a
very low probability, the likelihood of smaller strikes remains constant. Based on the Moon’s
craters, we know Earth has been and will continue to be hit repeatedly. Asteroids with the
equivalent of 600 kilotons of TNT have hit Earth over the last decade. In 1997, David Morrison,
one of the pre-eminent experts on NEOs and asteroids stated that of the “roughly two thousand
kilometer-scale asteroids that are expected in Earth-crossing orbits, fewer than two hundred
have actually been found.”132 In 2005, British astronomer David J. Asher co-authored a paper
titled “Earth In The Cosmic Shooting Gallery” and wrote, “The terrestrial impact rate appears to
be substantially higher than current near-Earth object population models imply, consistent with a
significant unseen cometary contribution to the terrestrial impact hazard.”133
While we may not see extinction in our lifetimes, many feel we are fortunate to have made it this
far. The argument Dr. Asher’s analysis naturally raises is one of preparation. If we were to wake
up tomorrow and be told an asteroid will strike on Friday, it could be too late to react. If there is
not enough lead time to deflect it, then it only makes sense based on his findings that we need a
strategy to put an object deflection infrastructure in place well before any detection.
Let’s look at what can be done if an asteroid of sufficient
size is on a collision course with Earth. Any defensive
strategy depends on computer science as demonstrated
by recent
endeavors to
send a
spacecraft to an asteroid, as we did with NASA’s Dawn
space probe. We have the technology to put a probe in
orbit around asteroid Vesta and dwarf planet Ceres to
take great pictures as shown in this image of the Ceres surface.
Launched in September 2007, it took almost 4 years (July 2011) and a lot of planning to have
Dawn orbit Vesta, some 117 million miles from Earth. Due to its relatively slow speed and
Vesta’s own orbital velocity, Dawn traveled 1.7 billion miles with a Martian gravity assist along
the way. In September 2012, it was sent on the second part of its mission, a 930 million mile, 2½
year journey to Ceres134.
While asteroids threatening us will not be as distant as Vesta or Ceres, the key to deflecting or
redirecting them is sufficient lead time, perhaps measured in years. No nation today is prepared
to launch a rocket to deflect, redirect or destroy an asteroid. Based on the object’s size and the
lead time, a change of a fraction of a degree is all that it would likely take to change its orbit and
prevent the collision.
Nuclear Explosion
In 1998, a Texas-sized asteroid was 18 days away from annihilating Earth, or so the disaster
movie Armageddon goes. In the movie, space shuttles with nuclear bombs were launched
towards the asteroid with a plan to save mankind by using the bombs to break it into pieces. A
few months before Armageddon, the film Deep Impact depicted a crew using nuclear bombs on a
7-mile wide comet. Unfortunately, they broke it into 2 large pieces, with both still targeting Earth.
One fragment caused a 3,500-foot tsunami in the Atlantic Ocean near America’s East Coast,
killing millions, while the other piece was destroyed before it could strike Canada.
In real life, the lack of warning could lead to a similarly desperate approach. With lead time, a
spacecraft with a nuclear weapon could be launched to deflect an asteroid of a certain size. By
detonating the weapon near the asteroid, the shockwave or intense radiation could be sufficient to
nudge the asteroid off course while keeping it intact, causing it to miss Earth135. It is generally
agreed that nothing should be detonated on or beneath the asteroid’s surface, since breaking it
into many smaller but still significant pieces could leave them all targeting Earth, a “buckshot
effect”136.
Kinetic Impact
NEOShield-2 is a project by the European and German space agencies to create a kinetic impactor
that can crash into an asteroid at high velocity137. The impactor transfers its momentum (the
product of its mass and velocity) to the asteroid, causing a small change in the asteroid’s
velocity and thus diverting its course by a fraction of a degree. An analogy is a cue ball hitting
another billiard ball, imparting kinetic energy and sending the other ball flying.
The degree of deflection depends on the mass and speed of the impactor. A small impactor
moving quickly can have the same effect as a large impactor moving very slowly. Calculations
show that a 1 mile-per-hour change in velocity would divert an asteroid 170,000 miles if it were
struck 20 years in advance138. If an asteroid were small enough, ramming it with a spacecraft like
Dawn could supply enough kinetic energy to throw it off course. There are also hybrids that use
this kinetic approach. One such hybrid is the HAIV, or Hypervelocity Asteroid Intercept Vehicle.
An HAIV consists of two spacecraft, with the first kinetically punching a hole in the asteroid
and the second delivering explosives into that hole, similar to the Nuclear Explosion method139.
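The 170,000-mile figure can be sanity-checked with simple arithmetic. The sketch below is an illustration, not the cited source's calculation. It assumes the 1 mile-per-hour figure is a change in the asteroid's velocity, treats the resulting drift as a straight line (ignoring orbital mechanics), and models the impact as a collision in which the impactor sticks to the asteroid. The function names and example masses are hypothetical.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8,766 hours


def drift_miles(delta_v_mph, lead_time_years):
    """Straight-line displacement built up by a constant velocity change.

    Assumption: orbital dynamics are ignored, so this is only a rough check.
    """
    return delta_v_mph * lead_time_years * HOURS_PER_YEAR


def delta_v_m_s(impactor_kg, impactor_speed_m_s, asteroid_kg):
    """Velocity change of the asteroid from conservation of momentum.

    Assumption: the impactor sticks to the asteroid and no debris is ejected.
    """
    return impactor_kg * impactor_speed_m_s / (asteroid_kg + impactor_kg)


if __name__ == "__main__":
    # 1 mph sustained over 20 years of lead time
    print(round(drift_miles(1.0, 20)), "miles")
    # Hypothetical 1,000 kg impactor at 10 km/s hitting a 1e9 kg asteroid
    print(delta_v_m_s(1000.0, 10_000.0, 1e9), "m/s")
```

Under these assumptions, 20 years of drift at 1 mile per hour comes to about 175,000 miles, which is in line with the quoted figure.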
The Yarkovsky/Paint Effect
Russian civil engineer Ivan Yarkovsky wrote in the year 1900 “…that the diurnal heating of a
rotating object in space would cause it to experience a force that, while tiny, could lead to large
long-term effects in the orbits of small bodies…”140. We feel this ourselves when wearing a white
or a black shirt on a hot sunny day: the white shirt reflects some of the heat while the black
shirt absorbs it. In other words, if we could paint one side of an asteroid white, it would change
the number of thermal photons reflected off of it, causing it to change course.
The photons act as a tiny rocket pushing the asteroid in a different direction. Adjusting the thrust
could be accomplished through the opacity of the white paint or by painting the opposite side
black. This approach would take many years or even decades to change an orbit, so plenty of
impact notice would be needed.
Sails
German astronomer Johannes Kepler noted in 1619 that
a comet’s tail was away from the Sun because of
pressure from sunlight141. Similar to a sailboat that uses
large sails and wind power to move, the pressure of
sunlight against a giant solar sail pushes it forward.
Sunlight is made of photons. Photons have no mass but
they do have momentum, and the larger the solar sail, the greater the capture of photons to
push it – in essence, the Sun has wind energy.
If a spacecraft can attach a solar sail to an asteroid, then
the Sun’s emitted photons hit the sail and push against it,
transferring its momentum. The sail would slowly nudge
the asteroid into a slightly different orbit. By furling or
unfurling the sail, the degree of propulsion could be
changed. This concept would work for smaller asteroids
but the size of the sail might make it impractical for very
large ones or if the lead time to attach one is too small.
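To get a feel for why sail size matters, the push from sunlight can be estimated from radiation pressure. This is a rough sketch under stated assumptions, not a mission design: sunlight hits the sail head-on, the intensity at Earth's distance is taken as about 1,361 watts per square meter, and the function name and parameters are hypothetical.

```python
SOLAR_CONSTANT_W_M2 = 1361.0        # approximate sunlight intensity at 1 AU
SPEED_OF_LIGHT_M_S = 299_792_458.0


def sail_force_newtons(area_m2, distance_au=1.0, reflectivity=1.0):
    """Force of sunlight on a flat sail facing the Sun.

    reflectivity=1.0 is a perfect mirror (photons bounce back, doubling the
    momentum transfer); reflectivity=0.0 is a perfectly absorbing sail.
    """
    intensity = SOLAR_CONSTANT_W_M2 / distance_au ** 2   # inverse-square falloff
    pressure = (1.0 + reflectivity) * intensity / SPEED_OF_LIGHT_M_S
    return pressure * area_m2


if __name__ == "__main__":
    one_square_km = 1_000_000.0
    print(sail_force_newtons(one_square_km), "newtons")
```

Even a perfectly reflective sail of one square kilometer collects only about 9 newtons at Earth's distance from the Sun, which is why this approach suits small asteroids and long lead times.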
Catch It
If you could snare an asteroid in a net – a giant one
made of metal or some strong carbon fiber – then a
spacecraft could “tug” the asteroid into a new orbit. It
could also bring it somewhere else such as into an orbit
around the Moon for further study142.
Heat it up
Using giant mirrors, sunlight could be aimed at an
asteroid that contains trapped water to heat it up. The
heat would cause any vapor in the asteroid to be ejected
out. The ejected vapor would act like a small rocket
motor pushing the asteroid into a slightly different
orbit143. A high-powered laser aimed at the asteroid
(laser sublimation) would have the same effect.
Nudge It
Similar to the manipulated gravitational forces generated by the Star Trek Enterprise’s tractor
beam, we know that objects, even man-made objects, exert a gravitational pull. By orbiting a
spacecraft around an asteroid, a weak gravitational force would be exerted on the asteroid, and
by very slight changes in the spacecraft’s direction, it could nudge the asteroid enough to change
its course as well144. Care would need to be taken that the spacecraft didn’t accidentally strike
the asteroid or aim its thrusters towards the asteroid’s surface in its attempt to orbit it. The
closer the orbit, the greater the gravitational pull. In theory, you could also tether the asteroid
to another heavy object like a giant spacecraft, thereby altering the asteroid’s orbit. The lead
time for these ideas could be measured in decades.
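Newton's law of gravitation shows both why a closer spacecraft pulls harder and why the lead time is so long. The sketch below is illustrative only: it assumes the spacecraft holds a fixed distance from the asteroid's center for the whole period, and the masses, distance, and function names are hypothetical.

```python
G = 6.674e-11  # gravitational constant, N m^2 / kg^2


def tractor_force_newtons(spacecraft_kg, asteroid_kg, distance_m):
    """Mutual gravitational pull between the spacecraft and the asteroid."""
    return G * spacecraft_kg * asteroid_kg / distance_m ** 2


def asteroid_delta_v_m_s(spacecraft_kg, distance_m, years):
    """Velocity change of the asteroid after being towed for `years`.

    The asteroid's acceleration is force / asteroid mass, so its own mass
    cancels out: a = G * spacecraft_kg / distance_m**2.
    """
    seconds = years * 365.25 * 24 * 3600
    return G * spacecraft_kg / distance_m ** 2 * seconds


if __name__ == "__main__":
    # Hypothetical 20-tonne spacecraft hovering 200 m from a 1e10 kg asteroid
    print(tractor_force_newtons(2e4, 1e10, 200.0), "newtons")
    print(asteroid_delta_v_m_s(2e4, 200.0, 10), "m/s after 10 years")
```

With these example numbers, the pull is about a third of a newton, and ten years of towing changes the asteroid's speed by roughly a centimeter per second.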
Attach a rocket motor to it
If time is short or the object is too large, then waiting for a sail to guide it away or spray
painting it white might not be the right approach. If a spacecraft could attach a big chemical
rocket engine to it, it would push the asteroid in a different direction145.
Eat It
In 2004, NASA came up with a far-fetched idea: send dozens of nuclear-powered spacecraft to an
asteroid and, working as a team, have them drill into it and send the rubble into space using
powerful electromagnets or a rail gun. NASA called this project Modular Asteroid Deflection
Mission Ejector Node (MADMEN). The change in the asteroid’s mass and the recoil from sending the
chunks away would alter the asteroid’s course. NASA’s analysis showed they would need a formation
of 39 “munching” spacecraft, with just 17 needing to survive the landing on the asteroid. With the
craft fully functioning, the mission stood a 43% chance of success146.
All these methods, and dozens of others, rely on sufficient warning. As we have seen, the
warning can only come through active computer science-aided observation of space. The
problem is enormous and stretches even today’s definition of big data, in that technology
does not yet exist that can process all the data in sufficient time to be of value.
High-Performance Computing and Big Data
The amount of telescope data generated has grown at an incredible rate. Astrophysicist and data
scientist Dr. Kirk Borne tells a story of an astronomer in 2000 who asked if NASA could store a
terabyte of sky survey data and was told “That’s impossible! Don’t you realize that the entire
data set NASA has collected over the past 45 years is one terabyte?”147 These days, “virtual
astronomy” is measured in petabytes and exabytes. As we’ve discussed, the SKA will create 5
petabytes of data per second when fully operational.
Sky Survey Projects                               Data Volume Estimate
DPOSS (The Palomar Digital Sky Survey)            3 TB
2MASS (The Two Micron All-Sky Survey)             10 TB
SDSS (The Sloan Digital Sky Survey)               40 TB
SkyMapper Southern Sky Survey                     500 TB
GBT (Green Bank Telescope)                        20 PB
LSST (The Large Synoptic Survey Telescope)        ~ 200 PB expected
SKA (The Square Kilometer Array)                  ~ 4.6 EB expected
Source: http://datascience.codata.org/articles/10.5334/dsj-2015-011/
Computer science uses parallel processing to address problems such as how to defend Earth
through timely decisions based on huge volumes of data. Rather than trying to overcome the
limitations of silicon, thermodynamics, and the speed of light to build a single, ever-faster
processor chip, it is far more practical to increase overall computational performance by
using multiple processor chips that communicate with each other. Distributing the compute load
among processors is a practical approach to solving problems that are easily split into
independent pieces; these are called “embarrassingly parallel”148. For example, Hadoop splits up
large chunks of work among processing nodes working in parallel, and when the nodes are finished
processing, the individual answers are brought together into a single solution. This is the same
concept as using multi-lane highways to allow more cars to travel in parallel, but without
speeding up the cars.
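The split, process, and combine pattern described above can be sketched in a few lines of Python. This is not Hadoop's API; it is a toy illustration in which a thread pool stands in for the processing nodes, and the task (counting bright pixels in an image), the threshold, and the function names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_chunks):
    """Cut the work into roughly equal, independent chunks."""
    if not items:
        return []
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_bright(chunk, threshold=200):
    """The work done by one 'node': needs no data from any other chunk."""
    return sum(1 for pixel in chunk if pixel > threshold)


def parallel_count(pixels, n_workers=4):
    """Scatter the chunks to workers, then combine the partial answers."""
    chunks = split(pixels, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_bright, chunks))
    return sum(partial_counts)


if __name__ == "__main__":
    fake_image = [i % 256 for i in range(10_000)]  # made-up pixel values
    print(parallel_count(fake_image))
```

Because no chunk depends on another, adding workers adds lanes to the highway; each individual count runs no faster, but more of them run at once.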
Two of the popular parallel processing approaches are Single Instruction, Multiple Data Stream
(SIMD) and Multiple Instruction, Multiple Data Stream (MIMD). At a very high level, a SIMD
machine runs the same instruction on all processors but on different data streams, while a MIMD
machine can run different instructions on different data streams.
Today’s approach to processing vast amounts of astronomical data is to apply advanced
parallel processing techniques. This class of HPC computer science problem can require a
supercomputer – something that can apply aggregated compute power. Supercomputers can
either be built from a few dozen to thousands of off-the-shelf servers (e.g. Dell, HP,
Lenovo/IBM) aggregated with high performance interconnects, or designed from the ground-up
to be specialty supercomputers (e.g. Cray, IBM) incorporating commodity components.
Examples of expensive specialty supercomputers include:
Cray XC40. The U.S. National Nuclear Security Administration purchased “Trinity” for
$174 million to run Linux on 9,436 nodes using 301,056 compute cores and 2 PB of
memory to support a 78 PB parallel file system with a bandwidth of 1.6 TB/s149.
Tianhe-2 is the fastest machine in the world according to Top500150. It uses Intel Xeon
E5 processors to supply 3,120,000 compute cores (10X that of the Cray XC40), 1 PB of
memory, and 12.4 PB of storage for a cost of $390 million151. It consumes
between $26 million and $36 million worth of electricity every year152.
[Illustration text: Create a computational box to do the simulation and divide it into 100 million cubes. Place a model of the asteroid in the computational box and assign groups of the cubes to different processors. Allow processors to compute how the contents of each cube evolve.]
There is also a cloud computing framework that unites HPC and big data on a pay-as-you-go
basis (scalable and dynamic) without the need to own the platform. Various public cloud
providers such as Amazon153 and Google154 offer these environments.
If supercomputing is to help protect Earth, it must continue to follow Moore’s Law. By 2020, the
world needs its first exaflop machine (1,000 petaflops), capable of a quintillion calculations
per second, but current systems are trending below that goal, as this chart shows (performance
based on Tianhe-2, shown in the red oval, is trending flat in 2015)155. Other critical factors
include cost, power, cooling, storage requirements, etc.
To further emphasize the critical nature of this problem, the supercomputer in this illustration is
tasked to prepare a planetary defense by predicting the success of a nuclear detonation156. A
logical 3D grid is created such that each cube of the grid can be assigned to its own asteroid
segment and compute core. Then the most precise elliptical orbit data of the asteroid is fed into
the grid such that its composition, speed, mass, trajectory and other data are represented, with
each compute core working on its own view of the cube. The goal is to determine how big a blast is
needed and where it should be placed such that when detonated, the asteroid will be blown into
much smaller fragments that will miss Earth. Each compute core applies physics equations to
understand the effect of an explosion on its piece of the asteroid. Given that the 3D model is
tracking a rotating high-speed asteroid, the simulation would represent a timeline with second or
sub-second resolution. Each core must be in inter-process communication with other cores so the
blast effect on its piece of the asteroid will be understood and taken into account by adjacent
cores. The overall simulation must be run far enough in advance to allow sufficient lead time to
put a plan in place to blow up the object.
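The grid assignment can be pictured with a small sketch. It is a simplified illustration, not the code behind the cited simulation: it assumes a cubic grid of 464 cubes per axis (464 x 464 x 464 is roughly the 100 million cubes mentioned in the illustration), splits it into equal blocks, and lists which blocks share a face and would therefore need to exchange boundary data. The function names and the block count are hypothetical.

```python
def owner(cube, cubes_per_axis, blocks_per_axis):
    """Return the (bx, by, bz) block, i.e. the processor, that owns a cube.

    Assumption: cubes_per_axis divides evenly by blocks_per_axis.
    """
    block_size = cubes_per_axis // blocks_per_axis
    return tuple(coordinate // block_size for coordinate in cube)


def face_neighbors(block, blocks_per_axis):
    """Blocks that share a face with `block` and must trade boundary data."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            candidate = list(block)
            candidate[axis] += step
            if 0 <= candidate[axis] < blocks_per_axis:
                neighbors.append(tuple(candidate))
    return neighbors


if __name__ == "__main__":
    cubes_per_axis, blocks_per_axis = 464, 8      # 8 x 8 x 8 = 512 processors
    print(cubes_per_axis ** 3, "cubes in total")
    print(owner((463, 58, 57), cubes_per_axis, blocks_per_axis))
    print(face_neighbors((0, 0, 0), blocks_per_axis))
```

A block in the interior has six neighbors to talk to after every time step, while a corner block has only three, which hints at why the interconnect matters as much as the processors.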
From an IT perspective, the supercomputer is just a machine and can break down, so the
architecture or the machine itself must provide enough redundancy to minimize, for example,
the impact of a processor replacement or a bad cable. If the simulation gets corrupted, the
system must be able to roll back to the appropriate timestamp or checkpoint, since starting the
process from scratch is obviously not an option. Checkpoints must record all the processing
since the last checkpoint, i.e. each core needs to know where in its calculations it must
resume. These checkpoints could take hundreds of terabytes of data storage, so I/O service time
must be taken into account. There is also the likelihood that multiple checkpoints would need to
be saved, so perhaps exabytes of fast storage will be required.
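The checkpoint and restart idea can be shown at toy scale. This is a minimal sketch under stated assumptions, not how any particular supercomputer does it: a single process stands in for all the cores, the "physics" is a placeholder addition, and the file layout and function names are hypothetical.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, step, state):
    """Write the checkpoint to a temporary file, then swap it into place.

    os.replace is atomic, so a crash mid-write never leaves a half-written
    checkpoint where the last good one used to be.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump({"step": step, "state": state}, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return (step, state) from the last checkpoint, or (0, None) if none."""
    if not os.path.exists(path):
        return 0, None
    with open(path, "rb") as handle:
        saved = pickle.load(handle)
    return saved["step"], saved["state"]


def run(path, total_steps, checkpoint_every=10):
    """Resume from the last checkpoint instead of starting from scratch."""
    step, state = load_checkpoint(path)
    if state is None:
        state = 0.0
    while step < total_steps:
        state += 1.0          # placeholder for one time step of physics
        step += 1
        if step % checkpoint_every == 0:
            save_checkpoint(path, step, state)
    return state
```

If a run is interrupted at step 25, the next call to run() picks up from step 20, the last checkpoint, rather than from step 0. The trade-off is the one described above: checkpointing more often loses less work but costs more I/O time.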
Just as collecting and analyzing petabytes of data in real-time pushes the boundaries of
Moore’s Law, the same challenges apply to storing the data. While Moore’s Law predicts the
immense supercomputer power to generate data faster than ever before, the ability to store it for
additional analysis has not kept up. There is a growing gap between the speed of processors
and storage – spinning hard disk drive (HDD) performance is simply too slow. HDDs were
invented in 1956157, well before the first commercially available microprocessor in 1971158. In
2000, the fastest HDD operated at 15,000 RPM but they have not rotated any faster since.
Improvements in classical hard drive technology have focused on platter density, larger cache
memory, etc., allowing the fastest rotating 600 GB drive with a 128 MB cache buffer to transfer
290 MB/s of sequential 4K block data over a 12 Gb/s SAS interface.159 Mechanical drive
capacity is not the answer – the largest helium-filled 10 TB drive with a 256 MB cache transfers
data at a sustained transfer rate of 249 MB/s160. Before the introduction and commercialization
of the solid-state drive (SSD), supercomputers might require tens of thousands of HDDs just to
handle the throughput performance requirements (e.g. IOPs).
Because they employ integrated circuits to store data instead of rotating platters with moving-arm
magnetic heads, today’s SSDs (typically at a higher cost) perform well beyond the fastest HDD. As
a result, supercomputers might only need thousands of SSDs to do their work. Even so, with the
pace of telescopes like SKA, hundreds of terabytes of checkpoint data written to thousands of SSDs
will still take many minutes.
In-server Flash Memory Accelerators
  Random READ MB/s          2,800
  Random WRITE MB/s         2,200
  Random READ IOPS          345,000
  Random WRITE IOPS         385,000
Solid-State Disks with 12 Gb/s SAS Interface
  Sequential READ MB/s      980
  Sequential WRITE MB/s     740
  Random READ IOPS          199
  Random WRITE IOPS         115
Source: https://www.sandisk.com/business/datacenter/products
As illustrated by this chart, even faster solutions are becoming available as storage moves from an
external storage area network (SAN) to sit “next to” processor memory. Bypassing disk controller
cards and host bus adapters effectively gives in-memory technology orders of magnitude higher
throughput than today’s best storage array. These memory images could be gradually de-staged to
slower SSDs and even slower but huge HDDs, allowing the supercomputer to continue with its
calculations with the checkpoint completed as a background process. Examples of in-server flash
memories include EMC’s DSSD, which provides compute-side SSDs directly through the PCIe bus161.
Co-location with the computer allows for near-memory speed storage with bandwidths of 1 TB/s and
250 M IOPS.162
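The "many minutes" estimate for writing a checkpoint can be checked with back-of-envelope arithmetic using the transfer rates quoted in this section. The sketch assumes a hypothetical 300 TB checkpoint, perfect striping across devices with no overhead, and decimal units; the function name is my own.

```python
def write_minutes(checkpoint_tb, device_mb_s, device_count):
    """Minutes needed to write a checkpoint striped evenly across devices.

    Assumes ideal scaling: no controller, network, or file system overhead.
    """
    total_mb = checkpoint_tb * 1_000_000          # 1 TB = 1,000,000 MB
    seconds = total_mb / (device_mb_s * device_count)
    return seconds / 60


if __name__ == "__main__":
    checkpoint_tb = 300                           # hypothetical size
    print("1,000 HDDs at 249 MB/s:", write_minutes(checkpoint_tb, 249, 1000))
    print("1,000 SSDs at 740 MB/s:", write_minutes(checkpoint_tb, 740, 1000))
    print("1 TB/s aggregate flash:", write_minutes(checkpoint_tb, 1_000_000, 1))
```

Under these idealized assumptions, the checkpoint takes about 20 minutes on a thousand hard drives, just under 7 minutes on a thousand SAS SSDs, and 5 minutes at an aggregate 1 TB/s, so real systems with overhead would take longer still.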
To defend our planet, it is acknowledged that supercomputing and new storage technologies
need to work together as part of the HPC/big data asteroid defense problem. The field of
astronomy recognizes these critical areas and established astroinformatics and astrostatistics
disciplines to focus on them. Astroinformatics combines astronomy and IT technologies such as
machine learning, statistics, visualization, data management, and others163 while astrostatistics
encompasses astrophysics, statistical analysis, and data mining164.
Conclusion
Most of the funded projects are naturally focused on finding asteroids, but equally important is
what to do about them when they pose a risk to us. There is little doubt that we need a plan
beyond praying to deflect or destroy them, especially with little or no lead time. Critical to both
parts of this approach is data analysis. To find asteroids, and to deflect or destroy them, you
need computer science.
Taking the form of computation algorithms, HPC, big data, modeling, simulation, data mining,
networking and other critical areas, computer science is fundamentally critical to help shield
Earth from asteroids. Coupled with the work of many gifted astronomers, the “golden age” of
astronomy, marked by massive photon gathering mirrors, radar telescopes, and spacecraft like
the Hubble Space Telescope, would not be where it is today without the integrated circuit CCDs
and microprocessors.
“All I’m saying is now is the time to develop the technology to deflect an asteroid.”
Clearly times have changed, and every day the field of astronomy is being transformed by computer
science. Less than 100 years ago, the field relied on humans to scan the sky with optical
telescopes and on people to perform manual data reductions, and results were often distributed to
a limited few or kept in a desk drawer. Technology now allows for giant maps of the sky to be
collected in an automated fashion with data scrubbed by algorithms. The result is the beginning of
a giant database of imagery and metadata searchable by anyone around the world. We are witnessing
the initial creation of a space shield that will hopefully protect mankind from what happened to
the dinosaurs, if it’s not too late.
Appendix - Glossary
Aphelion - the asteroid’s farthest distance from the Sun measured in astronomical units (AU).
Asteroid – A small (relative to a planet) rocky body orbiting the Sun.
Astrometry - the precise measurement of the positions and motions of celestial bodies.
Declination - (abbreviated dec; symbol δ) is one of the two angles that locate a point on the
celestial sphere in the equatorial coordinate system, the other being “hour angle”.165
Ephemerides – A table of future positions.
Kinetic energy – The energy an object possesses when in motion. The heavier the object and
the faster it travels, the more the kinetic energy it possesses.
Meteoroid – A small piece of the asteroid that orbits the Sun. Generally less than 1 meter in
size.
Meteor - The streak of light produced by atmospheric friction as an asteroid or meteoroid enters
Earth’s atmosphere.
Meteorite - A meteor chunk that is not vaporized on entry into the atmosphere and lands on the Earth.
Perihelion - the asteroid's closest distance to the Sun measured in astronomical units (AU).
Photometry - the measurement of the brightness of a celestial body over wide bands of
wavelength.
Planetoid – See asteroid.
Right ascension - (abbreviated RA; symbol α) is the angular distance measured eastward
along the celestial equator from the vernal equinox to the hour circle of the point in question.166
Semi-major axis - one-half of the major axis of an ellipse.
Spectrometry - the measurement of the spectrum of light emitted by a celestial body.
Appendix – Draw an Ellipse in Excel
I’ve included some simple steps if you would like to draw some simple ellipses using Excel.
The general formula for an ellipse with its major and minor axes lying on a graph’s x and y axes
is:
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
To put it into an easy to use Excel form, you want to “solve” this equation for y:
$$y = \pm\sqrt{\left(1 - \frac{x^2}{a^2}\right) b^2}$$
As an Excel equation, it looks like y = ±sqrt((1 - x^2/a^2) * b^2)
The following shows you the formulas in each cell. By changing the values in A1, you can alter
the width of the ellipse and with B1 you can change the height of it. By using a larger X range
with smaller intervals, the ellipse would look smoother. My example uses X intervals of 0.5, so if
you used 0.1, the curve would appear smoother.
Width "A"    Height "B"
4            3

y = ±sqrt((1 - x^2/a^2) * b^2)

x       y      y-
-4      0.0    0.0
-3.5    1.5    -1.5
-3      2.0    -2.0
-2.5    2.3    -2.3
-2      2.6    -2.6
-1.5    2.8    -2.8
-1      2.9    -2.9
-0.5    3.0    -3.0
0       3.0    -3.0
0.5     3.0    -3.0
1       2.9    -2.9
1.5     2.8    -2.8
2       2.6    -2.6
2.5     2.3    -2.3
3       2.0    -2.0
3.5     1.5    -1.5
4       0.0    0.0

[Chart: the resulting ellipse, with the x-axis running from -4 to 4 and the y-axis from -4.0 to 4.0.]
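If you would rather not use a spreadsheet, the same table can be produced with a few lines of Python. This is simply a sketch of the formula above; the function name and the default step of 0.5 are my own choices, and a smaller step gives a smoother curve, just as in Excel.

```python
import math


def ellipse_points(a, b, step=0.5):
    """Return (x, y, -y) rows for an ellipse of half-width a and half-height b."""
    n = round(2 * a / step)                     # number of intervals from -a to +a
    rows = []
    for i in range(n + 1):
        x = -a + i * step
        inside = max(0.0, 1 - x ** 2 / a ** 2)  # guard against tiny negative rounding
        y = math.sqrt(inside * b ** 2)
        rows.append((x, y, -y))
    return rows


if __name__ == "__main__":
    for x, y_upper, y_lower in ellipse_points(4, 3):
        print(f"{x:5.1f} {y_upper:5.1f} {y_lower:5.1f}")
```

Plotting the y and y- columns against x draws the upper and lower halves of the ellipse.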
Footnote
1 https://en.wikipedia.org/wiki/Chelyabinsk_meteor
2 http://www.amsmeteors.org/fireballs/faqf/#5
3 http://www.popsci.com/science/article/2013-02/astronomers-calculate-russian-meteorites-orbit-and-realize-it-has-80-million-cousins
4 http://www.wired.com/2015/07/asteroid-2015-hm10-will-not-destroy-earth/
5 http://neo.jpl.nasa.gov/images/Chelya_orb.png
6 http://www.popsci.com/science/article/2013-02/astronomers-calculate-russian-meteorites-orbit-and-realize-it-has-80-million-cousins
7 NASA JPL Orbit Diagram of 2014 EC http://ssd.jpl.nasa.gov/sbdb.cgi?sstr=2014+EC&orb=1
8 http://www.express.co.uk/news/nature/507480/Asteroid-Strikes-Earth-Damage-Nasa-Destruction
9 http://www.jpl.nasa.gov/news/news.php?feature=4380
10 https://en.wikipedia.org/wiki/Tunguska_event
11 http://paleobiology.si.edu/dinosaurs/info/everything/why_2.html
12 http://www.scientificamerican.com/article/asteroid-killed-dinosaurs/
13 https://www.youtube.com/watch?v=Dcp0JhwNgmE
14 https://en.wikipedia.org/wiki/Extinction_event
15 http://www.space.com/19681-dinosaur-killing-asteroid-chicxulub-crater.html
16 https://en.wikipedia.org/wiki/The_Age_of_Reptiles
17 https://ccaeducause.files.wordpress.com/2011/01/bernard-meade.pdf
18 https://en.wikipedia.org/wiki/Asteroid_belt
19 https://en.wikipedia.org/wiki/Kuiper_belt
20 https://en.wikipedia.org/wiki/Oort_cloud
21 http://imgur.com/gallery/FTE4Ly9
22 http://www.daviddarling.info/childrens_encyclopedia/comets_QA.html
23 http://www.popsci.com/article/technology/what-nasa-should-do-instead-asteroid-retrieval-mission
24 http://www.space.com/23501-russian-meteor-explosion-asteroid-threat.html
25 http://www.bbc.co.uk/news/science-environment-24839601
26 http://www.computerweekly.com/news/1280090479/Lack-of-funds-puts-Earth-in-shadow-of-asteroid-threat
27 http://www.vox.com/2014/9/16/6226379/nasa-asteroid-risk-location
28 http://www.minorplanetcenter.net/iau/lists/ArchiveStatistics.html
29 https://en.wikipedia.org/wiki/P/2010_A2
30 http://www.popsci.com/science/article/2013-02/how-powerful-new-telescopes-are-helping-us-find-more-asteroids-hopefully-just-time
31 http://www.britannica.com/EBchecked/topic/39730/asteroid
32 https://en.wikipedia.org/wiki/Occultation#Occultations_by_asteroids
33 Mathematics Magazine, Vol. 72(1999), pp. 83-91
34 https://en.wikipedia.org/wiki/History_of_ancient_numeral_systems#cite_note-13
35 www.lpi.usra.edu/books/AsteroidsIII/pdf/3027.pdf
36 https://en.wikipedia.org/wiki/Ceres_(dwarf_planet)
37 http://www.schillerinstitute.org/fid_97-01/982_orbit_ceres.pdf
38 https://www.math.rutgers.edu/~cherlin/History/Papers1999/weiss.html
39 “Orbital Mechanics: Theory and Applications” by Tom Logsdon, ISBN 0-471-14636-6, p. 164
40 http://www.open.edu/openlearn/science-maths-technology/science/physics-and-astronomy/astronomy/the-naming-asteroids
41 https://groups.google.com/forum/#!topic/b-a-s/bYkwFzW9t7o
42 http://science.nasa.gov/science-news/science-at-nasa/1999/features/ast20apr99_1/
43 http://www.lawrencehallofscience.org/static/hou/hs/wise/ppt/WISE-Asteroids.ppt
44 https://en.wikipedia.org/wiki/Telescope
45 https://www.youtube.com/watch?v=goL3K_xQzbE
46 http://inventors.about.com/od/cstartinventions/a/CCD.htm
47 https://en.wikipedia.org/wiki/Photodiode
48 Computerworld. August 6, 2001, p.49
49 http://www.digicamhistory.com/1970s.html
50 http://petapixel.com/2010/08/05/the-worlds-first-digital-camera-by-kodak-and-steve-sasson/
51 http://www.apple.com/iphone-6s/specs/
52 http://spiff.rit.edu/richmond/asras/catch_plates/catch_plates.html
53 http://www.planetary.org/blogs/emily-lakdawalla/2011/3248.html
54 http://ssd.jpl.nasa.gov/sbdb.cgi?sstr=2007+PA8&orb=1
55 https://answers.yahoo.com/question/index?qid=20080212210936AAHddvM
56 http://www.lsst.org/about/dm
57 http://www.gutenberg.us/articles/big_data
58 http://www.lsst.org/lsst/public
59 https://en.wikipedia.org/wiki/Large_Synoptic_Survey_Telescope
60 http://www.symmetrymagazine.org/breaking/2010/10/18/astronomical-computing
61 http://www.symmetrymagazine.org/breaking/2010/10/18/astronomical-computing
62 https://en.wikipedia.org/wiki/Inverse_problem
63 http://www.theregister.co.uk/Print/2010/11/26/lsst_big_data_and_agile/
64 http://research.majuric.org/wp/survey-science/large-survey-database/
65 http://www.lsst.org/about/dm/technology
66 https://en.wikipedia.org/wiki/Hierarchical_Data_Format
67 https://www.cac.cornell.edu/education/Training/Data12/DataFormats2012.pdf
68 http://research.majuric.org/wp/survey-science/large-survey-database/
69 http://fallingstar.com/home.php
70 http://blog.fallingstar.com/index.php/2015/12/04/our-first-neo/
71 http://www.leonarddavid.com/asteroid-alert-system-first-light-reported/
72 https://gears.guidebook.com/guide/39106/event/11384479/
73 http://fallingstar.com/specifications.php
74 http://wise.ssl.berkeley.edu/mission_faq.html
75 http://wise.ssl.berkeley.edu/mission.html
76 http://www.jpl.nasa.gov/multimedia/wise/
77 http://wise.ssl.berkeley.edu/mission_faq.html
78 http://wise2.ipac.caltech.edu/docs/release/allsky/expsup/sec8_1.html
79 https://en.wikipedia.org/wiki/Wide-field_Infrared_Survey_Explorer#NEOWISE
80 https://en.wikipedia.org/wiki/Tracking_and_Data_Relay_Satellite_System
81 http://wise.ssl.berkeley.edu/documents/wise/launch/2009-12-03.pdf
82 http://wise.ssl.berkeley.edu/edu_accessing_images.html
83 http://wise2.ipac.caltech.edu/docs/release/neowise/expsup/sec4_1.html
84 http://wise2.ipac.caltech.edu/docs/release/allsky/expsup/sec4_3a.html
85 http://wise2.ipac.caltech.edu/docs/release/prelim/expsup/sec4_3a.html
86 http://www.eso.org/sci/php/meetings/adass2011/Slides/PDF/All/ADASS_XXI_I01_Cutri.pdf
87 https://en.wikipedia.org/wiki/Gaia_(spacecraft)
88 http://esamultimedia.esa.int/multimedia/publications/BR-296/
89 https://en.wikipedia.org/wiki/Gaia_(spacecraft)
90 http://www.odbms.org/wp-content/uploads/2013/11/Charting_the_Galaxy.pdf
91 http://www.mpia.de/gaia/about/dpac
92 https://en.wikipedia.org/wiki/Data_Processing_and_Analysis_Consortium
93 http://www.intersystems.com/library/library-item/european-space-agency-chooses-intersystems-cach-database-for-gaia-mission-to-map-milky-way/
94 http://gaia.ac.uk/mission/gaia-dpac
95 http://www.iwinac.uned.es/Astrostatistics/w/manuscripts/deTeodoro.pdf
96 http://www.odbms.org/blog/2011/02/objects-in-space/
97 https://upload.wikimedia.org/wikipedia/commons/b/ba/MareNostrum_III_cenital_general.jpg
98 http://gaia.ub.edu/?page_id=4327
99 http://www.apc.univ-paris7.fr/~beckmann/common/Gleyzes_Espace_BigData_CNES.pdf
100 http://www.spaceops2012.org/proceedings/documents/id1275512-Paper-003.pdf
101 https://www.youtube.com/watch?v=PkR6LAOgSII
102 https://www.skatelescope.org/location/
103 https://www.skatelescope.org/layout/
104 https://en.wikipedia.org/wiki/Square_Kilometre_Array
105 https://www.skatelescope.org/sadt-report-skaenews-july2015/
106 https://www.youtube.com/watch?v=PkR6LAOgSII
107 https://www.skatelescope.org/frequently-asked-questions/
108 https://www.skatelescope.org/signal-processing/
109 https://www.skatelescope.org/signal-processing/
110 https://www.skatelescope.org/signal-processing/
111 https://www.emc.com/collateral/software/white-papers/h10938-vnx-best-practices-wp.pdf A RAID 6 (14+2) raid group using 4TB drives contains approximately 50TB of usable space. Twenty of these groups would equal 1 PB.
112 https://www.skatelescope.org/wp-content/uploads/2013/09/SDP-PROP-DR-001-1_ElemConc.pdf
113 https://www.skatelescope.org/wp-content/uploads/2013/09/SDP-PROP-DR-001-1_ElemConc.pdf
114 https://en.wikipedia.org/wiki/Lustre_(file_system)
115 http://www.cam.ac.uk/research/features/masters-of-the-universe#sthash.5RBAd34q.dpuf
116 https://www.skatelescope.org/wp-content/uploads/2013/09/SDP-PROP-DR-001-1_ElemConc.pdf
117 https://www.skatelescope.org/wp-content/uploads/2013/09/SDP-PROP-DR-001-1_ElemConc.pdf
118 https://www.skatelescope.org/wp-content/uploads/2013/09/SDP-PROP-DR-001-1_ElemConc.pdf, p. 48
119 https://www.skatelescope.org/sdp/
120 https://www.skatelescope.org/software-and-computing/
121 http://www.top500.org/lists/2015/06/
122 https://www.skatelescope.org/software-and-computing/
123 https://en.wikipedia.org/wiki/Timeline_of_astronomy
124 http://minorplanetcenter.net/blog/lets-start-2014-with-a-bang-hello-and-goodbye-to-asteroid-2014-aa/
125 https://en.wikipedia.org/wiki/2014_AA
126 http://minorplanetcenter.net/blog/wp-content/uploads/2014/01/2014AA-2014-01-02-673_0-by-G96.gif
127 http://kti.tugraz.at/staff/elex/courses/science20/slides/e-science_e-infrastructures_content_mining_week4.pdf
128 https://en.wikipedia.org/wiki/Image_subtraction
129 http://www.asterank.com/about
130 https://github.com/typpo/asterank
131 https://docs.mongodb.org/v3.0/reference/operator/query/lt/
132 http://www.csicop.org/si/show/is_the_sky_falling
133 http://www.arm.ac.uk/preprints/455.pdf
134 http://dawn.jpl.nasa.gov/multimedia/pdfs/Dawn_Vesta_Ceres_Lithograph.pdf
135 https://en.wikipedia.org/wiki/Asteroid_impact_avoidance
136 “Military Space Power: A Guide to the Issues” by Wilson Wong and James Fergusson, ISBN 0313356807, p. 98
137 http://www.neoshield.net/mitigation-measures/kinetic-impactor/
138 http://news.discovery.com/space/asteroids-meteors-meteorites/top-10-asteroid-deflection-13013010.htm
139 http://www.travelsinorbit.com/save-the-planet-from-asteroids/
140 https://en.wikipedia.org/wiki/Yarkovsky_effect
141 “Solar Sailing: Technology, Dynamics and Mission Applications” by Colin McInnes. ISBN 3540210628 p.33
142 http://www.dailymail.co.uk/sciencetech/article-2308660/Animation-released-shows-Nasa-intends-CAPTURE-asteroid.html
143 http://phys.org/news/2008-12-asteroid.html
144 http://www.universetoday.com/90605/nasa-developing-real-life-tractor-beams/
145 http://www.projectrho.com/public_html/rocket/infrastructure.php
146 http://www.sei.aero/downloads/SEI_LOEM_30March2004.pdf
147 http://discovermagazine.com/2011/apr/14-when-astronomy-met-computer-science
148 https://gigadom.wordpress.com/2011/06/29/to-hadoop-or-not-to-hadoop/
149 http://www.cray.com/sites/default/files/CP-Cray-NNSA-XC40-Trinity.pdf
150 Top500 is an organization that rates supercomputers (www.Top500.org).
151 https://en.wikipedia.org/wiki/Tianhe-2
152 http://www.hpcwire.com/2014/07/17/dd/
153 https://d0.awsstatic.com/whitepapers/Intro_to_HPC_on_AWS.pdf
154 https://cloud.google.com/solutions/architecture/highperformancecomputing
155 http://www.nextplatform.com/2015/07/13/top-500-supercomputer-list-reflects-shifting-state-of-global-hpc-trends/
156 http://www.lanl.gov/science/NSS/pdf/NSS_April_2013.pdf
157 http://www.pcworld.com/article/127105/article.html
158 https://en.wikipedia.org/wiki/Intel_4004
159 HGST Ultrastar C15K600 https://www.hgst.com/sites/default/files/resources/Ultrastar_C15K600_SAS_Spec_V1.4.pdf
160 https://www.hgst.com/products/hard-drives/ultrastar-he10
161 http://www.theregister.co.uk/2015/08/18/dssd_nvme_fabric_flash_magic/
162 http://insidehpc.com/2015/04/taccs-wrangler-uses-dssd-technology-for-data-intensive-computing/
163 https://en.wikipedia.org/wiki/Astroinformatics
164 https://en.wikipedia.org/wiki/Astrostatistics
165 https://en.wikipedia.org/wiki/Declination
166 https://en.wikipedia.org/wiki/Right_ascension
EMC believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO
THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an
applicable software license.