"Magnified" pixels with R-G-B
values
Help: Beyond looking at pictures
We can do more with these data than just show them as images, because unlike analog photographs these data are numbers.
Viewing data pixel-by-pixel
A pixel (or "picture element") is one dot on your screen, representing one set of red, green and blue values. The accompanying table shows simulated pixels. Every pixel is a mixture of red, green and blue light, representing three bands of reflected electromagnetic radiation (EMR).
Be aware that colored light does not mix the same way colored paint does. The most surprising combination is probably red and green combining to make yellow (a traffic light is a good mnemonic).
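As a rough illustration of how colored light adds, here is a minimal Python sketch. It assumes 8-bit channels (0 to 255) and simple clipped addition; the function names are made up for this example and do not come from any imaging library.

```python
def mix_light(*colors):
    """Additively mix (R, G, B) colors, clipping each channel at 255."""
    return tuple(min(255, sum(color[i] for color in colors)) for i in range(3))


def is_gray(pixel):
    """An even R/G/B mix is a gray of some brightness."""
    r, g, b = pixel
    return r == g == b


RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)

yellow = mix_light(RED, GREEN)       # (255, 255, 0): red + green light is yellow
white = mix_light(RED, GREEN, BLUE)  # (255, 255, 255): all three together are white

print(yellow, is_gray(yellow))           # yellow is not an even mix
print((90, 90, 90), is_gray((90, 90, 90)))  # an even mix is a mid-dark gray
```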
Note that the unstretched Garden City pixels all look pretty muddy-brown. (Look at the stretched and unstretched Landsat images.) The stretched images resemble more closely the bright mixes on the right. Also note that any even R/G/B mix yields a gray of some brightness.
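The text does not say which stretch was applied to the Garden City images. The sketch below assumes the simplest kind, a linear min-max stretch, and uses invented brightness values. It shows how a narrow, dull range of values can be spread over the full display range.

```python
def linear_stretch(values, out_min=0, out_max=255):
    """Rescale values so the smallest maps to out_min and the largest to out_max.

    Assumption: a plain linear (min-max) stretch, one band at a time.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A perfectly flat band has no contrast to stretch.
        return [out_min for _ in values]
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (v - lo) * scale) for v in values]


# Hypothetical unstretched band: every pixel sits in a narrow, dark range.
band = [40, 45, 50, 55, 60]
stretched = linear_stretch(band)
print(stretched)  # darkest pixel becomes 0, brightest becomes 255
```

Applying the same function to each of the three displayed bands separately changes the color balance as well as the brightness, which is one reason stretched images can look so different from unstretched ones.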
Viewing data by spectral signatures
A spectral signature is a characteristic set of reflectances over the electromagnetic spectrum. Different objects (water, concrete, grass, etc.) reflect different amounts at different wavelengths. The accompanying graph shows simple spectral signatures from the Garden City images. The MSS sensor was designed to have bands that are useful in distinguishing different types of ground cover. Here you can see that healthy vegetation looks different from harvested land, and both are different from open water.
The steep rise in reflectance between red and near-infrared (NIR) is called the "red edge". Vegetation indexes use this difference to gauge the health of vegetation. The NDVI (normalized difference vegetation index), for example, is the ratio of the NIR-minus-red difference to the NIR-plus-red sum.
Viewing data by histograms
Look at the histogram of two quarter-sections (each 0.5 x 0.5 mi). It shows how many pixels there are at each brightness value. After irrigation the red and NIR values split dramatically. What else can you read from this graph?
First of all, the irrigated quarter-section was more diverse; the solid lines contain more area away from their peaks than the dotted lines do, since round sprinklers on square plats leave the corners dry. Also, the visible-green and visible-red values (shown as the blue and green lines) flip with irrigation; the higher green values are why corn looks green.
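A histogram of this kind is just a count of pixels at each brightness value. The Python sketch below uses invented red-band values for a dry and an irrigated quarter-section. The `spread` measure is an assumption made for this example, a crude stand-in for "area away from the peak".

```python
from collections import Counter


def histogram(pixels):
    """Count how many pixels fall at each brightness value."""
    return dict(sorted(Counter(pixels).items()))


def spread(pixels):
    """Fraction of pixels that are NOT at the most common brightness value."""
    peak_count = Counter(pixels).most_common(1)[0][1]
    return 1 - peak_count / len(pixels)


# Hypothetical red-band values. The irrigated field is mostly dark in red
# (vegetation absorbs red light), but its dry corners stay bright.
dry = [30, 30, 30, 30, 30, 30, 30, 31]
irrigated = [12, 12, 12, 12, 12, 29, 30, 31]

print(histogram(dry))        # {30: 7, 31: 1}
print(histogram(irrigated))  # {12: 5, 29: 1, 30: 1, 31: 1}
print(spread(dry), spread(irrigated))  # 0.125 0.375
```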
Vegetation indexes (NDVI):
From a scene of Landsat data, we can calculate a vegetation index-- a measure of the land's "greenness". Remember that healthy vegetation absorbs visible light, especially red light, and reflects near-infrared. So if a pixel has a large difference between MSS bands 4 and 2 (near-infrared and red-- we could also use bands 3 and 2), we know that the pixel probably represents an area covered by vegetation. The "simple vegetation index" formula is just this difference: VI = band 4 - band 2, and we could show this difference as a grayscale (monochrome) image, with bright pixels representing high VI values.
But a better index is the "normalized difference vegetation index" or NDVI; it helps to compensate for some problems with the simple vegetation index. For example, two identical patches of vegetation could have different VI values if one were in bright sunshine and another under a hazy sky; the bright pixels would all have larger values, and therefore a larger absolute between-bands difference. What really matters is the difference in proportion to total illumination, so NDVI is the ratio of the difference to the sum: NDVI = (band 4 - band 2) / (band 4 + band 2). There are many variations of vegetation indexes, and even of NDVIs, that follow this basic formula.
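The two formulas can be compared directly. In this Python sketch the band values are invented: the "sunny" pixel is simply the "hazy" pixel with both bands doubled, standing in for the same vegetation under brighter illumination. The zero-sum guard is an assumption added so the function never divides by zero.

```python
def simple_vi(nir, red):
    """Simple vegetation index: VI = band 4 - band 2 (NIR minus red)."""
    return nir - red


def ndvi(nir, red):
    """NDVI = (band 4 - band 2) / (band 4 + band 2)."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat a pixel with no signal as NDVI 0
    return (nir - red) / total


# Hypothetical (NIR, red) values for identical vegetation.
hazy = (40, 10)
sunny = (80, 20)

print(simple_vi(*hazy), simple_vi(*sunny))  # 30 60  -> VI changes with illumination
print(ndvi(*hazy), ndvi(*sunny))            # 0.6 0.6 -> NDVI does not
```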
Healthy vegetation will have a high NDVI value. Bare soil and rock reflect similar levels of near-infrared and red and so will have NDVI values near zero. Clouds, water, and snow are the opposite of vegetation in that they reflect more visible energy than infrared energy, and so they yield negative NDVI values.
Clouds are a problem for calculating NDVI, because they obscure the vegetation below. To work around this, NDVI is typically calculated from multiple images of the same area, in the hope that at least one will be cloud-free. For example, the EROS Data Center uses 14 consecutive days of AVHRR data (from the Advanced Very High Resolution Radiometer on NOAA satellites) to make NDVI products; for each pixel of each band, the highest of the 14 values is kept and the lower 13 are discarded. We assume that the highest value represents the least cloudy day, so clouds can interfere only if they cover an area on all 14 days. This is an advantage of AVHRR data, which are collected daily, over Landsat data, which repeat only every 16-18 days.
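The "keep the highest value" idea can be sketched in a few lines of Python. This is only an illustration, not the EROS Data Center's actual processing. It assumes each day's data is a small grid (a list of rows) of the same shape, and it uses three invented days instead of 14.

```python
def max_value_composite(daily_images):
    """For each pixel position, keep the highest value seen on any day.

    daily_images: list of same-shaped grids (each grid is a list of rows).
    """
    return [
        [max(values_over_days) for values_over_days in zip(*rows_over_days)]
        for rows_over_days in zip(*daily_images)
    ]


# Hypothetical NDVI grids (2 x 2 pixels); negative values stand for cloud.
day1 = [[0.6, -0.1], [0.2, 0.5]]
day2 = [[-0.2, 0.4], [0.3, -0.1]]
day3 = [[0.5, 0.3], [-0.3, 0.1]]

composite = max_value_composite([day1, day2, day3])
print(composite)  # [[0.6, 0.4], [0.3, 0.5]]
```

Every pixel in the composite is cloud-free here because no pixel was cloudy on all three days.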
Once we have 26 such biweekly datasets to cover a year, we can easily make calculations about growing season, maximum greenness, and rates of vegetation "greenup" in the spring and senescence in the fall.
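As a sketch of such calculations, the Python below takes one pixel's 26 biweekly NDVI values and derives a few seasonal measures. The NDVI values are invented, and the definitions are assumptions chosen for simplicity: the growing season is taken to be the span of periods at or above a threshold of 0.3, and the greenup rate is the average NDVI gain per period from season start to peak.

```python
def season_metrics(series, threshold=0.3):
    """Summarize one pixel's year of biweekly NDVI values."""
    peak_value = max(series)
    peak_period = series.index(peak_value)
    green = [i for i, value in enumerate(series) if value >= threshold]
    if not green:
        return {"peak_value": peak_value, "peak_period": peak_period,
                "season_start": None, "season_end": None,
                "season_periods": 0, "greenup_rate": 0.0}
    start, end = green[0], green[-1]
    rise = peak_period - start
    greenup_rate = (peak_value - series[start]) / rise if rise > 0 else 0.0
    return {"peak_value": peak_value, "peak_period": peak_period,
            "season_start": start, "season_end": end,
            "season_periods": end - start + 1, "greenup_rate": greenup_rate}


# Hypothetical year: 26 biweekly composites, periods numbered 0 to 25.
year = [0.10, 0.10, 0.12, 0.15, 0.20, 0.28, 0.35, 0.45, 0.55, 0.62, 0.68,
        0.70, 0.72, 0.70, 0.66, 0.60, 0.52, 0.44, 0.36, 0.30, 0.24, 0.18,
        0.14, 0.12, 0.10, 0.10]

metrics = season_metrics(year)
print(metrics["peak_value"], metrics["peak_period"])  # maximum greenness and when
print(metrics["season_periods"] * 2, "weeks of growing season")
```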
Classifications:
Remotely sensed data are often used to determine land cover. In other words, the pixels in a scene (or a mosaic of scenes) are classified into categories such as forest, urban, agricultural, water, etc. This is possible because different land covers have different spectral "signatures". A graph of reflectance across the EMR spectrum (reflectance vs. wavelength) would look different for forest and prairie, and different even for corn and wheat. The bands of satellite sensors are designed to detect these signatures.
Our above discussion of NDVI actually laid out a kind of three-category classification-- vegetation (high NDVI), soil/rock (near-zero NDVI), and clouds/water/snow (negative NDVI). Of course, useful classifications often use more than two bands and typically are more detailed.
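That three-category scheme amounts to two thresholds on NDVI. The text only says "high", "near zero" and "negative", so the cutoff of 0.1 in this Python sketch is an arbitrary assumption, as are the sample NDVI values.

```python
def classify_by_ndvi(ndvi_value, near_zero=0.1):
    """Assign one of three rough land-cover categories from an NDVI value.

    near_zero is a hypothetical cutoff: values within +/- near_zero of 0
    are treated as soil/rock.
    """
    if ndvi_value > near_zero:
        return "vegetation"
    if ndvi_value < -near_zero:
        return "clouds/water/snow"
    return "soil/rock"


for value in (0.6, 0.02, -0.3):
    print(value, classify_by_ndvi(value))
```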
One common classification system is the USGS Anderson system, which was designed in the 1970s for remote sensing and has four levels of detail. Level I has nine broad categories (urban, agricultural, rangeland, forest, water, wetland, barren, tundra, ice). Levels II, III, and IV are progressively more detailed. For example, Level I includes simply forest, but Level II includes deciduous (hardwood) forest, coniferous (evergreen) forest, and mixed forest.
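One way to picture the levels is as a lookup from detailed labels to broader ones. This Python sketch holds only the categories named in the paragraph above: the nine Level I categories and the three Level II forest types. The structure and the `generalize` function are illustrative, not part of the Anderson system itself.

```python
# Level I categories as listed in the text.
ANDERSON_LEVEL_I = ["urban", "agricultural", "rangeland", "forest", "water",
                    "wetland", "barren", "tundra", "ice"]

# Only forest is expanded, because the text gives Level II detail for forest alone.
LEVEL_II_TO_LEVEL_I = {
    "deciduous forest": "forest",
    "coniferous forest": "forest",
    "mixed forest": "forest",
}


def generalize(label):
    """Roll a Level II label up to Level I; Level I labels pass through."""
    if label in ANDERSON_LEVEL_I:
        return label
    return LEVEL_II_TO_LEVEL_I[label]


print(generalize("mixed forest"))  # forest
print(generalize("water"))         # water
```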
A classification can be either supervised or unsupervised. A supervised classification uses areas on the ground that we already know about-- we use the spectral signatures of these areas as a "key" against which to compare unknown areas. For example, suppose we wanted to classify a giant forest that was like a checkerboard of deciduous clumps and coniferous clumps. We could analyze the data covering one area that we know to be deciduous and one that we know to be coniferous. Then we would determine the distinctive signature for each type, and tell the computer to look at every pixel's signature and tell us whether it was more like the deciduous signature or more like the coniferous signature. That would be our supervised classification. In an unsupervised classification (of a totally unknown area, for example) we just tell the computer to arrange the pixels into clusters of pixels with similar signatures. Then we try to decide what each cluster is, and then perhaps tell the computer how to rearrange or divide some clusters into better clusters.
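Both approaches can be sketched in a few lines of Python. The techniques here are assumptions chosen for simplicity: the supervised version assigns each pixel to the nearest mean signature ("minimum distance"), and the unsupervised version is a bare-bones k-means clustering with a naive starting guess. Real classification software offers many other methods. All pixel values are invented (green, red, NIR) triples.

```python
import math


def mean_signature(pixels):
    """Average each band over a set of pixels to get one signature."""
    count = len(pixels)
    return tuple(sum(p[band] for p in pixels) / count
                 for band in range(len(pixels[0])))


def classify_supervised(pixel, signatures):
    """Return the name of the known signature closest to this pixel."""
    return min(signatures, key=lambda name: math.dist(pixel, signatures[name]))


def cluster_unsupervised(pixels, k, iterations=10):
    """Group pixels into k clusters of similar signatures (simple k-means).

    Assumption: the first k pixels serve as starting centers, which works
    for this toy data but is a poor choice in general.
    """
    centers = [pixels[i] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        centers = [mean_signature(group) if group else centers[i]
                   for i, group in enumerate(groups)]
    return [min(range(k), key=lambda i: math.dist(p, centers[i]))
            for p in pixels]


# Supervised: training areas we already know about (hypothetical values).
signatures = {
    "deciduous": mean_signature([(30, 20, 90), (32, 22, 88)]),
    "coniferous": mean_signature([(20, 15, 60), (22, 17, 62)]),
}
print(classify_supervised((31, 21, 87), signatures))  # deciduous
print(classify_supervised((21, 16, 63), signatures))  # coniferous

# Unsupervised: no training areas, just "find two groups".
unknown = [(31, 21, 87), (21, 16, 63), (29, 19, 91), (19, 14, 59)]
print(cluster_unsupervised(unknown, k=2))  # [0, 1, 0, 1]
```

The unsupervised result is only a set of cluster numbers; deciding that cluster 0 is deciduous and cluster 1 is coniferous is still up to the analyst.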
One good example of classification is the Land Cover Characterization Program (LCCP) of the U.S. Geological Survey and other organizations. First, it breaks the whole world into 1-km squares and decides what land cover is on each square. Then it breaks the U.S. into 30-m squares and decides what land cover is on each square. And then it does a similar job for U.S. cities in even more detail, including not only land cover (what is sitting there), but also land use (what people are doing there).
However we do a classification, at some point we usually make the result into another image. We might display our deciduous/coniferous forest as an image of blue pixels for deciduous and red pixels for coniferous, and perhaps gray pixels for "unknown" or "other". (This would be a crude "thematic map"-- hence the name of the Landsat TM sensor.) The new image lacks much of the shaded detail of the "raw" image, but it is easy to read, it tells us about what we are interested in, and it puts meaningful knowledge into our heads that we wouldn't have gotten from the raw image. In other words, we have obtained information from the data.
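Turning class labels back into an image is a simple lookup. This Python sketch uses the colors suggested above (blue, red, gray). The exact RGB values and the tiny label grid are assumptions for illustration.

```python
CLASS_COLORS = {
    "deciduous": (0, 0, 255),   # blue
    "coniferous": (255, 0, 0),  # red
}
UNKNOWN_COLOR = (128, 128, 128)  # gray for "unknown" or "other"


def thematic_image(class_grid):
    """Replace each class label with its display color, pixel by pixel."""
    return [[CLASS_COLORS.get(label, UNKNOWN_COLOR) for label in row]
            for row in class_grid]


# Hypothetical 2 x 2 classification result.
classified = [["deciduous", "coniferous"],
              ["coniferous", "road"]]

image = thematic_image(classified)
print(image)
```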
This data-to-information process often continues, with our classification's output becoming an input to another round of analysis. For example, we might run a model (a computer simulation making predictions) of weather patterns over our forest, and one input to that model would be land cover, since deciduous and coniferous trees affect the weather differently by how much water they transpire into the air. This weather model might produce an output image of expected rainfall, and that image in turn might feed a model of expected fire danger, and so on. At any stage of this process, we can feed the data (as images) straight into our brains, or we can feed the data (as numbers) into our computers for another round of calculating, classifying and modeling.
Bookmark www.usgs.gov/Earthshots for Earthshots, 4th ed., 14 February 1999, from the EROS Data Center of the U.S. Geological Survey, a bureau of the U.S. Department of the Interior.