
3.4 Validity

Blind Comparison of Automated and Manual Observations

3.4.1 Purpose

The purpose of this evaluation methodology was to determine the validity of ASOS-generated observations by comparing them to observations made by a human observer. This blind comparison was done to ascertain whether the differences between the ASOS and the manual observations were within selected parameters.

 

3.4.2 Identification of Sites

The ASOS assessment team compared data from six pre-commissioned sites chosen on the basis of congressional interest and user complaints (see Table 3.4-1). Figure 3.4-1 shows a map of the ASOS sites that were included in the Validity—Blind Comparison of Automated and Manual Observations Methodology.

3.4.3 Evaluation Methodology

3.4.3.1 Process

The process for this task was a blind comparison of ASOS-generated observations to human observations from selected sites over a period of 30 days from May 15 to June 15, 1997. The data were analyzed to determine whether the differences between the machine and human observations were within selected parameters.

As part of the re-assessment strategy, system access (i.e., access by the contract weather observers to the computer display of the weather observation generated by the ASOS) was shut off in order to conduct a "blind comparison" test.

3.4.3.2 Assumptions and Limitations

These assumptions were made prior to the start of this comparison:

The observation from the human observer is considered the baseline for comparison purposes.

The observer was carrying out a Basic Weather Watch; therefore, the human observer may not necessarily report a condition as soon as it occurs.

The Operator Interface Display, if available, was disabled; therefore, the observer could not augment or back up the ASOS observations.

3.4.3.3 Parameters and Comparison Method

The following observational parameters were evaluated:

Sky condition, i.e., total cloud cover

Visibility

Present weather

Ambient temperature

Dew point

Wind direction and speed

Altimeter setting

For this portion of the study, weather data from the ASOS were compared to manual observation data for the same time period. In order to eliminate the need for subjectively matching individual observations, it was assumed that a report remained valid until it was replaced with a subsequent report. The criteria listed in Table 3.4-2 were used to determine "unrepresentative" conditions, defined as differences between the ASOS and the observer data that were greater than a specified threshold value. Inherent to the comparison is the assumption that the human observation was produced with no knowledge of what the ASOS was reporting (i.e., a "blind" comparison).

In this analysis, the term "within tolerance" is used to describe instances when the ASOS and the manual observation either agree or are within selected parameters. The term "out of tolerance" refers to instances when the difference between the ASOS and the manual observation is not within selected parameters. Tolerances used were for the most part the same ones used for the Direct Observational Comparison portion of the 1995 ASOS Aviation Demonstration Report. Tolerances not specified during that study or ones not matching the study are described in the text.
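The persistence assumption and the tolerance test described above can be sketched in a few lines of Python. This is only an illustration, not the Parse Data software: the function names, the sample reports, and the choice to sample at the manual report times are assumptions. The ±5 knot wind speed tolerance is taken from section 3.4.4.4.

```python
from bisect import bisect_right

def value_at(reports, t):
    """Return the value of the latest report issued at or before time t.

    reports: list of (time, value) tuples sorted by time.
    A report is treated as valid until a later report replaces it.
    Returns None if no report has been issued by time t.
    """
    times = [report_time for report_time, _ in reports]
    i = bisect_right(times, t)
    return reports[i - 1][1] if i else None

def within_tolerance(asos_value, manual_value, tolerance):
    """True when the two values differ by no more than the tolerance."""
    return abs(asos_value - manual_value) <= tolerance

# Hypothetical wind speed reports: (minutes after midnight, knots).
asos_reports = [(0, 8), (56, 12), (116, 15)]
manual_reports = [(0, 10), (60, 10), (120, 9)]

# Assumption: compare at each manual report time, using the ASOS
# report in force at that moment and a +/-5 knot tolerance.
results = [
    within_tolerance(value_at(asos_reports, t), manual_value, 5)
    for t, manual_value in manual_reports
]
print(results)
```

In this made-up sample the third comparison is out of tolerance, because the ASOS report in force at minute 120 (15 knots) differs from the manual value (9 knots) by 6 knots.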

 

3.4.3.4 Data Source

The ASOS data were obtained by downloading them directly from the ASOS site, and the manual data were obtained from the University of Michigan Internet weather web page. The data were compared only for those time periods when the weather element under study was present in both the ASOS and the manual observation. For example, visibility was compared only for those hours when both the human and the ASOS were reporting visibility. If the visibility measurement was missing from either source, that time period was excluded from the visibility comparison for both the ASOS and the human. If, however, other elements (for example, winds or temperature) were present in both sources during that time period, those elements were still compared.
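The element-by-element pairing rule can be illustrated with a short Python sketch. The data layout (one dictionary of elements per hour) and all names and values here are hypothetical; the study's actual data format is not described in this section.

```python
def comparable_pairs(asos_obs, manual_obs, element):
    """Yield (hour, asos_value, manual_value) for hours in which
    both sources report the given element."""
    for hour in sorted(asos_obs.keys() & manual_obs.keys()):
        asos_value = asos_obs[hour].get(element)
        manual_value = manual_obs[hour].get(element)
        # Drop the hour for this element only; other elements in the
        # same hour can still be compared.
        if asos_value is not None and manual_value is not None:
            yield hour, asos_value, manual_value

# Hypothetical hourly observations (visibility in miles, temperature in deg F).
asos_obs = {
    0: {"visibility": 10.0, "temperature": 61},
    1: {"visibility": None, "temperature": 60},  # ASOS visibility missing
    2: {"visibility": 7.0, "temperature": 59},
}
manual_obs = {
    0: {"visibility": 10.0, "temperature": 62},
    1: {"visibility": 8.0, "temperature": 61},
    2: {"visibility": 6.0},                      # manual temperature missing
}

vis_pairs = list(comparable_pairs(asos_obs, manual_obs, "visibility"))
temp_pairs = list(comparable_pairs(asos_obs, manual_obs, "temperature"))
```

Hour 1 is excluded from the visibility comparison but kept for temperature, and hour 2 is excluded from the temperature comparison but kept for visibility.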

 

3.4.3.5 Software Tools

The tools used to analyze the data were the Parse Data software developed by NWS, Microsoft ACCESS Database and Microsoft EXCEL Spreadsheet. The Parse Data software tool developed by NWS has been used in two previous studies and is described in detail in the paper Comparability Between Human and ASOS Ceiling/Visibility Observations. This tool had to be updated to process the METAR weather observation code, which has been in use since July 1996.

 

3.4.4 Results

3.4.4.1 Overall Comparability

This analysis calculated the comparability for nearly 5,000 hours of coincident ASOS and human observations and the overall results are graphed in Figure 3.4-2. Comparability measurements for the individual elements are discussed in the following paragraphs with an emphasis on visibility differences.

Figure 3.4-2a Bluefield, WV Chart

Figure 3.4-2b Butte, MT Chart

Figure 3.4-2c Livingston, MT Chart

Figure 3.4-2d Miles City, MT Chart

Figure 3.4-2e Ponca City, OK Chart

Figure 3.4-2f Wausau, WI Chart

 

3.4.4.2 Comparability of Visibility Observations

Human and ASOS visibility observations for the six sites in the analysis were comparable 95% of the time. This value is similar to the 96% reached in the ASOS Aviation Demonstration conducted in 1995. Figure 3.4-3 shows the comparability of visibility by site. An analysis of the comparability across the range of visibility is shown in Figure 3.4-4. It is evident from this figure that the vast majority of the visibility data available for analysis was for visibilities that were generally good (greater than 7 miles). For example, there were 42 hours of available data for visibility up to 2 miles, while 1363 hours of data were available for visibility greater than 12 miles.

Although the human observation was selected as the baseline for this study, there are some factors that may explain the differences between the human and the ASOS. At the site with the lowest comparability (BLF), the observer’s furthest visibility marker during the night is a light that is 3/4 mile from the point of observation. After dark at BLF, the human must estimate visibility values that are greater than 3/4 mile. For example, if the human estimates the visibility to be two miles because the light is clearly visible but there are no further markers, then an ASOS reported visibility of three miles would be determined to be "out of tolerance."

Another factor that must be considered is the difference in visual acuity between observers. It is not unusual for an observer to become so familiar with the terrain that he only has to see where an object is, in relation to nearer and clearly visible marks, to know what the object is. Such an observer would theoretically report a greater visibility than someone who is unfamiliar with the terrain and unable to distinguish the distant object. In thirty years of visibility research to develop calibrations for visibility sensors, the Air Force Cambridge Research Labs (AFCRL) found human threshold (acuity) variations of ±25% and a total variation of 50% between observers.

Although instrument flight rules (IFR) and marginal visual flight rules (MVFR) conditions were present during 195 hours out of a total of approximately 5,000 hours, analysis shows that comparability was less than expected. For the range of zero to two miles, the ASOS and observer values were within the specified limits 54% of the time; in the range above two miles up to four miles, the agreement was 49%; and at five miles, the ASOS agreed with the human observer 47% of the time (see Figure 3.4-4).
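A per-range comparability figure of this kind is a simple grouped percentage. The Python sketch below shows the calculation under stated assumptions: the records are invented, the hours are binned by the human (baseline) visibility, and the bin edges are illustrative rather than those of Figure 3.4-4.

```python
from collections import defaultdict

def comparability_by_bin(records, bin_of):
    """Percent of hours within tolerance, grouped by bin label.

    records: iterable of (baseline_value, within_tolerance_flag).
    bin_of:  function mapping a baseline value to a bin label.
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [within, total]
    for value, ok in records:
        label = bin_of(value)
        counts[label][0] += int(ok)
        counts[label][1] += 1
    return {label: 100.0 * within / total
            for label, (within, total) in counts.items()}

def visibility_bin(miles):
    """Illustrative bins only; not the bins used in the report."""
    if miles <= 2:
        return "0-2"
    if miles <= 4:
        return ">2-4"
    return ">4"

# Hypothetical (human visibility in miles, within-tolerance flag) records.
records = [
    (1.0, True), (2.0, False),
    (3.0, True), (3.0, False), (4.0, True), (4.0, True),
    (10.0, True),
]
percentages = comparability_by_bin(records, visibility_bin)
```

With so few hours in the low-visibility bins (42 hours at or below 2 miles in this study), each disagreement moves the percentage noticeably, which is worth keeping in mind when reading the low-visibility results.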

During some of the periods of time when these differences occurred, rain showers or thunderstorms were present. Due to the variable nature of this type of precipitation, differences in visibility values are possible as a result of the different vantage points of the ASOS and the human observer. The human observer may also report a lower visibility value due to a rain shower occurring over a sector of the field that contributes to a lower prevailing visibility. Conversely, if the precipitation activity and its associated lower visibility are over the ASOS, then the human’s visibility, which takes into consideration higher visibility from all quadrants, will be higher than the ASOS value.

3.4.4.3 Comparability of Ceiling Observations

A direct comparison of human observed ceiling heights to ASOS measured ceiling heights was not practical for this study because the human observers at the sites studied did not have ceilometers. Although ceiling height could not be analyzed, sky cover, which is determined independently of observer access to ceilometers, was within tolerance 96% of the time (see Figure 3.4-5).

In the previous study (1995 ASOS Demonstration Report) the observers had access to information from the pre-commissioned ASOS observations and used the cloud height sensor, if they thought it was reasonable, as a tool to form their cloud height measurement. In contrast, this study was done "blind" and the observers at the six sites did not have access to a cloud height indicator. Observers have a variety of options to assist them in either measuring or estimating the ceiling. These options include Pilot Reports (PIREPS), ceiling lights (which involve triangulation using a light during darkness), Pilot Balloons (PIBALS), convective cloud tables, usage of objects of known heights, and estimation of cloud heights based upon observer experience.

While these options for measuring cloud height existed before the start of the re-assessment, observers at a few of the sites commented that the pre-commissioned ASOS cloud height sensor was very accurate and that they routinely used the ASOS output as a tool to develop their ceiling height measurement. They commented that the "blind" aspect of this study took away that particular tool, so they had to resort to estimating the ceiling height based upon the other options mentioned above.

This study pointed to another interesting aspect of the differences between human and ASOS ceiling measurements. Figure 3.4-6 shows that observers have preferred cloud height reporting levels, while a more natural distribution is reported by ASOS.

Although observers are instructed to report cloud heights to the nearest 100 feet for clouds below 5,000 feet, the human observer has a very natural tendency to choose 500-foot increments when estimating a ceiling between 1,000 and 5,000 feet. Since the ASOS has no such preference, some "out-of-tolerance" measurements can be attributed to this tendency, because the "within tolerance" value for ceilings within this range is only 300 feet. Differences between the human and the ASOS measurement can also be introduced by the different methods of determining ceiling height. Each of the methods has strengths, but their weaknesses can very easily contribute to the differences. These methods are described in the following paragraphs.

 

3.4.4.3.1 Using PIREPS to Determine Cloud Heights for Ceilings

Generally, the usefulness of a PIREP is dependent upon the pilot’s location relative to the phenomenon that is being reported. A PIREP provided on approach to the airfield while still 15 miles away may not be as representative of the ceiling as a PIREP from an aircraft close to the airfield. At night with many clouds or with no moonlight, errors can compound. A report of cloud height from a pilot may be very good, but there are difficulties for both the pilot and the observer in discerning the true character of the overall sky cover (FEW, SCT, BKN or OVC).

 

3.4.4.3.2 Using Ceiling Lights and PIBALS to Determine Cloud Heights for Ceilings

Ceiling lights and PIBALS are based upon simple methods of measuring the distance from the ground to a cloud base. With the ceiling light, the observer turns on a light at a known distance (baseline) from the observer, pointed vertically. The observer then uses a theodolite to measure the angle formed from the observer to where the light reflects off of a cloud base, and then solves an equation to determine the distance from the light to the cloud base. When using this method, the measurement has an error estimate of ±5% of the cloud height. It can be used to measure ceilings up to ten times the length of the baseline. For example, if the distance between the observer and the ceiling light is 500 feet, the ceiling light can be used to measure ceilings between 50 feet and 5,000 feet with an error estimate at 5,000 feet of ±250 feet. While this method works well at night for cloud heights that fall within its range, errors may be introduced because sky condition (FEW, SCT, BKN or OVC) at night is still very subjective due to totally dark or low light conditions.
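The ceiling light geometry can be worked through in Python. The report does not state the equation the observer solves; the sketch assumes the standard right-triangle relation, in which the light spot on the cloud base is directly above the light and the height equals the baseline times the tangent of the measured elevation angle. Function names are illustrative.

```python
import math

def ceiling_light_height(baseline_ft, elevation_deg):
    """Height of the light spot on the cloud base above the light.

    Assumes a vertical beam, so that height = baseline * tan(elevation),
    where the elevation angle is measured from the observer's position.
    """
    return baseline_ft * math.tan(math.radians(elevation_deg))

def error_band_ft(height_ft, fraction=0.05):
    """Error estimate quoted in the text: +/-5% of the cloud height."""
    return fraction * height_ft

baseline = 500.0                                 # example baseline from the text
height = ceiling_light_height(baseline, 45.0)    # tan(45 deg) = 1
max_height = 10 * baseline                       # upper limit: ten times the baseline
max_error = error_band_ft(max_height)            # +/-5% at the upper limit
print(round(height), max_height, round(max_error))
```

At the 5,000-foot upper limit the elevation angle is about 84 degrees, where the tangent changes rapidly with angle, so small angular errors matter most near the top of the usable range.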

A PIBAL helps the observer determine the cloud height during the daylight hours by measuring the time it takes the balloon to go into the cloud. The elapsed time is multiplied by the standard rise rate of the balloon to compute the height of the cloud. Although no error estimate for this method was available, it is reasonably accurate for low clouds in low wind conditions during daylight hours. The accuracy of the height obtained by the balloon decreases when the balloon does not enter a representative portion of the cloud base, when it is used at night with a light attached, or when it is used during high winds above the surface or heavier forms of precipitation.
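The PIBAL calculation is a single multiplication, shown here in Python. The rise rate in the example is a hypothetical value chosen for easy arithmetic; the report does not give the standard rise rate, which depends on the balloon used.

```python
def pibal_cloud_height(elapsed_seconds, rise_rate_ft_per_min):
    """Cloud-base height = time for the balloon to enter the cloud
    multiplied by the balloon's rise rate."""
    return elapsed_seconds / 60.0 * rise_rate_ft_per_min

# Hypothetical example: the balloon enters the cloud base after
# 150 seconds, with an assumed rise rate of 400 feet per minute.
height_ft = pibal_cloud_height(150, 400)
print(height_ft)
```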

 

3.4.4.3.3 Using the Convective Cloud Height Base Diagram

This method can only be used to estimate heights of cumulus clouds formed in the vicinity of the station and cannot be used in mountainous or hilly terrain or to determine the height of other than cumulus clouds. This diagram is most accurate when used to determine the height of cloud bases below 5,000 feet. The observer obtains the temperature and dew point and uses this standard diagram to determine the cloud height.
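The diagram itself is not reproduced in this report. As a rough stand-in, the Python sketch below uses a widely quoted rule of thumb (cloud base in feet is approximately the temperature and dew point spread in degrees Fahrenheit, divided by 4.4, times 1,000). This approximation is an assumption substituted for the diagram, not the diagram the observers used, and the sample values are invented.

```python
def convective_cloud_base_ft(temp_f, dew_point_f):
    """Approximate cumulus cloud-base height above the station.

    Rule-of-thumb substitute for the convective cloud base diagram:
    (temperature - dew point in deg F) / 4.4 * 1000 feet,
    rounded to the nearest 100 feet.
    """
    spread = temp_f - dew_point_f
    raw_ft = spread / 4.4 * 1000.0
    return round(raw_ft / 100.0) * 100

# Hypothetical surface readings: 70 deg F temperature, 57 deg F dew point.
estimate_ft = convective_cloud_base_ft(70, 57)
print(estimate_ft)
```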

 

3.4.4.3.4 Using Heights of Known Objects to Determine Cloud Heights for Ceilings

Ceiling heights also may be estimated by using known heights of unobscured objects less than 1 1/2 miles from the airport. This can be helpful in estimating lower ceilings at airports with taller objects in the vicinity. Under the previous weather code (SAO) when the terms "measured" and "estimated" were used to describe how the ceiling measurement was obtained, getting information from known objects (along with ceilometers and ceiling lights) was classified as a "measured" ceiling, whereas all other methods were considered "estimated."

 

3.4.4.3.5 Using Observational Experience

If other guides for determining ceilings are lacking or, in the opinion of the observer, are considered to be unreliable, the observer uses his/her experience to estimate the ceiling.

 

3.4.4.4 Differences in Wind Direction and Speed

Wind direction differences fell within the selected tolerance of ±30 degrees 86% of the time, while wind speed differences fell within the selected tolerance of ±5 knots 93% of the time. The wind direction comparability result is lower than the result from the 1995 ASOS Demo (98%), but tolerances used in the earlier study are not known.

Figures 3.4-7 and 3.4-8 show wind direction and speed variances. Some variances may be attributable to siting differences.
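When the ±30 degree direction tolerance is applied, compass directions on either side of north need care: 350 and 010 degrees differ by 20 degrees, not 340. The report does not say how this was handled, so the wraparound logic in this Python sketch is an assumption, and the function names and sample winds are illustrative.

```python
def direction_difference(a_deg, b_deg):
    """Smallest angle between two compass directions, from 0 to 180 degrees."""
    diff = abs(a_deg - b_deg) % 360
    return min(diff, 360 - diff)

def wind_within_tolerance(asos, manual, dir_tol=30, speed_tol=5):
    """Return (direction_ok, speed_ok) for two (direction, speed) pairs.

    Tolerances default to the +/-30 degrees and +/-5 knots given in the text.
    """
    asos_dir, asos_speed = asos
    manual_dir, manual_speed = manual
    direction_ok = direction_difference(asos_dir, manual_dir) <= dir_tol
    speed_ok = abs(asos_speed - manual_speed) <= speed_tol
    return direction_ok, speed_ok

# Hypothetical (direction in degrees, speed in knots) pairs.
across_north = wind_within_tolerance((350, 10), (10, 12))
disagreement = wind_within_tolerance((90, 5), (130, 12))
print(across_north, disagreement)
```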

 

3.4.4.5 Comparability of Present Weather

Liquid precipitation was the only precipitation type evaluated during this study due to the springtime weather conditions. During the times that the human observer reported some type of liquid precipitation, the ASOS reported rain 95% of the time, as shown in Figure 3.4-9. The comparison was considered to be within tolerance when both the human and the ASOS reported some type of precipitation (for example, if the human observed drizzle and the ASOS reported rain) or if fog or mist was reported by both. (The only options for the ASOS at this time are rain, snow, freezing rain and unknown precipitation.) Fog and mist were considered the same since they are differentiated solely on the basis of visibility measurement. Some of the differences in this area can be attributed to either the observer reporting very light precipitation that the ASOS cannot discern, or showery precipitation occurring over the observer but not over the ASOS. Some error in this comparison is also introduced because, when rain begins or ends, it is not required to be put into the observation immediately (a SPECI observation) but is added to the remarks section of the next METAR or SPECI observation. The remarks were not analyzed in this study; had they been, comparability for present weather would most likely have increased.
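The matching rule for present weather can be expressed as a comparison of coarse classes rather than exact codes. The Python sketch below is a simplified illustration using standard METAR abbreviations (RA, DZ, SN, UP, FG, BR); the classification function and the substring matching are assumptions, not the logic of the Parse Data software.

```python
PRECIP_CODES = ("RA", "DZ", "SN", "UP")  # FZRA and SHRA contain "RA"
FOG_MIST_CODES = ("FG", "BR")            # treated as the same class

def weather_class(group):
    """Map one METAR present-weather group to a coarse class.

    Simplified: intensity signs are stripped and precipitation is
    detected by substring, which is adequate only for illustration.
    """
    body = group.lstrip("+-")
    if any(code in body for code in PRECIP_CODES):
        return "precipitation"
    if body in FOG_MIST_CODES:
        return "fog/mist"
    return "none"

def present_weather_agrees(manual_group, asos_group):
    """Within tolerance when both reports fall in the same coarse class."""
    return weather_class(manual_group) == weather_class(asos_group)

drizzle_vs_rain = present_weather_agrees("-DZ", "RA")  # both precipitation
fog_vs_mist = present_weather_agrees("FG", "BR")       # both fog/mist
rain_vs_nothing = present_weather_agrees("-RA", "")    # ASOS reports nothing
```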

3.4.4.6 Differences in Temperature and Dew Point

Temperature and dew point differences were within tolerances 84% and 89% of the time, respectively. Tolerances for temperature were ±2° Fahrenheit when the wind speed was greater than 6 KTS and ±10° when the wind speed was 6 KTS or less. Dew point tolerances were ±4° Fahrenheit. Temperatures in this study were converted from Celsius to Fahrenheit to allow the use of the tolerance values from the 1995 ASOS study, which were in degrees Fahrenheit. Since differences of two degrees and three degrees Celsius correspond to differences of 3.6 and 5.4 degrees Fahrenheit, respectively, the allowable Fahrenheit tolerance of four degrees meant no temperature difference of exactly 4° Fahrenheit would be measured. This reduced tolerance may account for some degree of reduced comparability.

Since the observer reports the temperature reading directly from either a thermometer or direct readout hygrothermometer, the type of instrument, siting, and ventilation of the observer’s temperature sensor may account for some of the variance with the ASOS sensor. It was found that when the difference between the ASOS and the human measured temperature was out of tolerance, the ASOS temperature was cooler. This would agree with past analysis showing that a standard hygrothermometer used by weather observers has a warm bias and problems with solar heating. (This same study also shows a rather large variation among different instruments of the same type.)
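The arithmetic behind the reduced tolerance can be checked directly. The Python sketch assumes the source values were whole degrees Celsius, so that differences can only occur in steps of 1.8° Fahrenheit; the function names are illustrative.

```python
def celsius_diff_to_fahrenheit(delta_c):
    """Convert a temperature difference (not a temperature) from C to F."""
    return delta_c * 9 / 5

def largest_whole_celsius_within(tolerance_f):
    """Largest whole-degree Celsius difference that still falls
    within a Fahrenheit tolerance."""
    delta_c = 0
    while celsius_diff_to_fahrenheit(delta_c + 1) <= tolerance_f:
        delta_c += 1
    return delta_c

# A nominal +/-4 deg F tolerance admits differences of at most 2 deg C.
max_c_for_4f = largest_whole_celsius_within(4)
effective_4f = celsius_diff_to_fahrenheit(max_c_for_4f)

# By the same reasoning, a nominal +/-2 deg F tolerance admits at most 1 deg C.
max_c_for_2f = largest_whole_celsius_within(2)
effective_2f = celsius_diff_to_fahrenheit(max_c_for_2f)
```

Under that assumption, the nominal tolerances of 4° and 2° Fahrenheit behave as effective tolerances of 3.6° and 1.8° Fahrenheit.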

 

3.4.4.7 Differences in Altimeter

Altimeter setting comparability, at 97%, was lower than in previous studies. A deviation of ±0.02 in. Hg was considered acceptable for this study (see Figure 3.4-10). Most of the significant differences between the human and the ASOS were due to human typographical errors. These errors may have occurred in various places: while the human was reading or recording the altimeter setting, during the process of communicating the altimeter setting to the FSS to transmit, or during the process of typing in the observation for transmission. The errors were so far outside the range of normal altimeter settings that the error or the transposition of numbers was obvious (e.g., human 20.92 in. Hg vs. ASOS 29.92 in. Hg).
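A check of this kind can separate ordinary out-of-tolerance differences from gross entry errors. In the Python sketch below, the ±0.02 in. Hg tolerance comes from the text, while the 1.00 in. Hg threshold for flagging a suspected entry error is a hypothetical value chosen for illustration. Values are compared in whole hundredths to avoid floating-point rounding at the tolerance boundary.

```python
def to_hundredths(inches_hg):
    """Convert an altimeter setting to integer hundredths of an inch."""
    return round(inches_hg * 100)

def altimeter_check(manual, asos, tol_hundredths=2, gross_hundredths=100):
    """Classify the difference between manual and ASOS altimeter settings.

    tol_hundredths:   +/-0.02 in. Hg tolerance from the text.
    gross_hundredths: hypothetical threshold for a suspected entry error.
    """
    diff = abs(to_hundredths(manual) - to_hundredths(asos))
    if diff <= tol_hundredths:
        return "within tolerance"
    if diff >= gross_hundredths:
        return "suspected entry error"
    return "out of tolerance"

close_match = altimeter_check(29.92, 29.90)
small_miss = altimeter_check(29.95, 29.90)
typo_case = altimeter_check(20.92, 29.92)  # example pair from the text
print(close_match, small_miss, typo_case, sep=" | ")
```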

