One of the major objectives of this project was to probe users' interpretation and understanding of the terminology used in NWS forecasts, including the probability of precipitation (POP) statement. Many other researchers (Sink 1993; Vislocky et al. 1995; Last and Skowronski 1990; Shaefer and Livingston 1988; Murphy and Brown 1983b; Murphy et al. 1980) have underscored both the positive and negative aspects of POP forecasts.
The idea behind using a percentage to describe the probability of precipitation is a sound one. It should be logical to almost everyone that a 40% chance of rain is much less than an 80% chance. This seems to be a very concise way to describe the inherent uncertainty surrounding any precipitation forecast. Yet it is quite apparent that a gap often exists between the forecaster and the user as to the interpretation of the prediction. While some have argued that the general public does not understand probability, Murphy et al. (1980) found that the primary source of misunderstanding is confusion about the specific event corresponding to the probability and not a lack of comprehension of the definition of probability itself. The perpetual misinterpretation of our forecasts by the public indicates a persistent problem in forecast wording. One goal of this project was to highlight the communication problems specific to local Juneau forecast products and, in the process, begin to bridge this gap in understanding.
In order to gauge the understanding of POP statements by forecast users, the questionnaires included a variety of questions covering subjects like forecast accuracy and interpretation using both numerical POPs and descriptive words. The first such pair of questions asked,
For the first question, respondents could circle anything from 0% up to 100% (in increments of 10%). For the second question, respondents were given a variety of terms currently used in NWS forecasts. Results from this pair of questions were compared to look for consistency between verbal qualifiers and their numerical counterparts. "Likely" was the most popular term chosen (34%). This matches up well with the results from the numerical question, where 35% of the responses fell in the 60% to 70% range, which are the POPs the NWS uses with the word "likely."
Many of the remaining responses (34%) fell in the 80% to 100% range, considered by the NWS as "categorical" precipitation. It is common practice at the NWS to drop verbal qualifiers in a categorical precipitation forecast and simply say, "Rain," or whatever precipitation type is appropriate to the situation. This was what the authors had in mind when the answer choice none was included in the question. Unfortunately, it was noticed that respondents who circled 0% on the previous question were choosing none, evidently thinking that none described a 0% chance of rain. The other categorical word we included was developing, and this did correlate fairly well with the number of high POP responses; 24% of the respondents chose developing as the phrase which would cause them to alter their plans, dress or schedule, making it the second most popular response.
The final results from these questions compared well with the Sink (1993) numbers on these questions. In both cases, 35% of the respondents chose percentages in the "likely" range (60-70%), and 34% fell into the "categorical" range (80-100%).
To determine whether or not the public interprets the verbal qualifiers in the same way they are used by the NWS, the respondents were asked to assign percent probabilities to several of the terms commonly used in public forecasts. One questionnaire contained words and phrases used to characterize spatial uncertainty (scattered, widely scattered, frequent, few, areas of, numerous, and isolated), and the other contained general expressions of uncertainty and temporal qualifiers (slight chance, likely, chance, possible, developing, occasional, ending, and periods of).
The results showed that the respondents generally did well in matching the trend from low to high probability terms when compared to the POP/verbal qualifier combinations used by the NWS. However, there were some notable differences. The respondents gave scattered a slightly lower POP than widely scattered, the reverse of NWS usage. The reason for this discrepancy may be in how one interprets the word "widely." The National Weather Service uses "widely" to mean considerable space between convective cells, while the respondents may be interpreting "widely" as "widespread." Another difference is in the POPs assigned to the terms few and isolated. Respondents placed these qualifiers near 30%, roughly 20 percentage points higher than the 10% POP the NWS assigns to these terms. This suggests that these qualifiers may not convey the low POPs that the NWS intends. (See tables below.)
| Probability Percentage | Expression of Uncertainty | Equivalent Areal Qualifier (convective only) |
|---|---|---|
| 10% | slight chance | isolated, few |
| 20% | slight chance | widely scattered |
| 30%, 40%, 50% | chance | scattered |
| 60%, 70% | likely | numerous (or none used) |
| 80%, 90%, 100% | (none used) | (none used) |

| Term | Survey Mean | NWS Probability Percentage |
|---|---|---|
| slight chance | 19.7% | 10%, 20% |
| few | 28.0% | 10% |
| ending | 31.7% | 80%, 90%, 100% |
| isolated | 34.0% | 10% |
| scattered | 34.0% | 30%, 40%, 50% |
| widely scattered | 34.3% | 20% |
| chance | 41.8% | 30%, 40%, 50% |
| areas of | 43.1% | 80%, 90%, 100% |
| occasional | 50.9% | 80%, 90%, 100% |
| developing | 52.9% | 80%, 90%, 100% |
| periods of | 56.0% | 80%, 90%, 100% |
| likely | 62.5% | 60%, 70% |
| frequent | 66.5% | 80%, 90%, 100% |
| numerous | 72.3% | 60%, 70% |
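The comparison between the survey means and NWS usage can be made explicit with a short calculation. The following is a minimal Python sketch, not part of the original study. It takes the NWS POP ranges for the uncertainty and areal qualifiers from the first table (so likely and numerous are treated as 60% to 70% terms), treats the duration and coverage terms as categorical (80% to 100%), and uses the survey means from the second table. The names `NWS_RANGE`, `SURVEY_MEAN`, and `gap_from_nws` are hypothetical and chosen only for illustration.

```python
# NWS POP range (low, high) in percent for each term. Qualifier ranges
# follow the first table; the last six terms are categorical (80-100%).
NWS_RANGE = {
    "slight chance": (10, 20),
    "few": (10, 10),
    "isolated": (10, 10),
    "widely scattered": (20, 20),
    "scattered": (30, 50),
    "chance": (30, 50),
    "likely": (60, 70),
    "numerous": (60, 70),
    "ending": (80, 100),
    "areas of": (80, 100),
    "occasional": (80, 100),
    "developing": (80, 100),
    "periods of": (80, 100),
    "frequent": (80, 100),
}

# Mean POP (percent) that survey respondents assigned to each term.
SURVEY_MEAN = {
    "slight chance": 19.7, "few": 28.0, "ending": 31.7, "isolated": 34.0,
    "scattered": 34.0, "widely scattered": 34.3, "chance": 41.8,
    "areas of": 43.1, "occasional": 50.9, "developing": 52.9,
    "periods of": 56.0, "likely": 62.5, "frequent": 66.5, "numerous": 72.3,
}


def gap_from_nws(mean, low, high):
    """Signed distance in percentage points from the NWS range.

    Positive: respondents rated the term above the NWS range.
    Negative: respondents rated it below. Zero: inside the range.
    """
    if mean > high:
        return round(mean - high, 1)
    if mean < low:
        return round(mean - low, 1)
    return 0.0


gaps = {term: gap_from_nws(SURVEY_MEAN[term], *NWS_RANGE[term])
        for term in SURVEY_MEAN}

# List terms from most underestimated to most overestimated.
for term, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{term:18s} {gap:+6.1f}")
```

A signed gap is used rather than an absolute difference so that overestimated low-POP terms (such as few and isolated) can be told apart from underestimated categorical terms (such as ending).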
The NWS often adds words to describe a categorical precipitation forecast (80-100%) to provide more detail, such as timing, duration, or area-wide extent of the event. The terms most commonly used at the Juneau WSFO are ending, areas of, occasional, developing, periods of, and frequent. None of these terms scored mean probabilities at or above 80%, which is how they are used by the NWS. In fact, the term ending scored among the lowest of all the probabilities assigned by the respondents.
NWS forecasters use the term ending to describe a situation where precipitation is occurring or will be occurring for part of the period but is expected to end before the forecast period is over. The POPs are high, because measurable precipitation will surely occur before the event is over. However, the respondents gave ending an average POP of about 32%. A reasonable explanation for this may be that the respondents applied the numerical percentage to describe the amount of time it would rain (or snow), instead of the probability of precipitation occurring at all.
This conclusion is further supported by the fact that respondents consistently assigned POPs well below the NWS values to all but one of the duration qualifiers listed. The terms occasional, developing, and periods of were given mean percent probabilities near 50%. A forecaster might use the term occasional to forecast an event where it rains on and off during the day, but the public may be expecting it to rain for only half of the day. In other words, public perception of the accuracy of an "occasional rain" forecast may be damaged if it rains during most of the day, whereas the forecaster may feel the forecast was justified. The only duration qualifier to receive a fairly high POP was the term frequent (66.5%).
The discrepancies between the respondents' POPs and the numerical probabilities used by the NWS for the spatial qualifiers were less pronounced; however, there is still evidence to suggest a similar problem in the usage of the POP. Specifically, the respondents may be applying the numerical percentage to the amount of area covered by precipitation. Numerous received an average probability of 72.3%, while areas of received only a 43.1% mean POP.
All of the verbal qualifiers tested in this project can be very useful in providing detail to the public in a precipitation forecast. It is important, however, for the forecaster to keep in mind that the user may misinterpret the numerical POP as a description of the verbal qualifier and not the event itself. If this interpretation is in sharp contrast to the idea the forecaster wishes to convey, he/she should consider rewording the forecast.
Another way to test for variability in public reaction to numerical and verbal probability statements was to see if the wording of the forecast influenced the perception of accuracy. This required the use of a question where a verbal qualifier and its equivalent POP could be interchanged. The wording of the question was altered slightly with each usage, but each version generally stated,
Respondents were given a choice of four words to rate the forecast: excellent, good, fair or poor. A second version of the above question was worded similarly but added a 30% POP. When the results from these two questions were compared, 61% of the respondents rated the chance forecast as "good" or "excellent," but when the 30% was added, this number rose to 67%. Similarly, 33% of the respondents rated the forecast without the POP as "fair," and only 26% rated the forecast with the POP that way.
This same question was also used with the verbal qualifier likely, as well as with a 70% POP to provide further comparison. This time 70% of the respondents who received the likely forecast rated it as "fair" or "poor." The people who received a forecast for a 70% chance of rain were more critical, as 81% considered the forecast to be "fair" or "poor."
The results show that respondents had stronger opinions about the forecasts with numerical POPs than with verbal qualifiers alone. The relationship between likely and 70% appears to be less solid than that of chance and 30%. Respondents were most critical of the 70% forecast miss, probably because there is a high expectation that it is going to rain. The fact that the likely forecast was not rated as poorly indicates that the verbal qualifier does not convey the same level of probability.
This group of questions provided more useful information when each person's responses were cross-referenced to see if their answers matched for each pair. In other words, we checked how many respondents gave the forecast with the numerical POP the same rating as its verbal counterpart. Only 47% of the respondents marked the same answer for both versions (likely vs. 70% POP), meaning that the majority of respondents did not consider a likely forecast to be the same thing as a 70% chance forecast. The results were closer for the questions relating chance with a 30% POP, with 63% rating both forecasts identically, but nearly 4 out of 10 people still rated them differently.
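The cross-reference described above amounts to counting, respondent by respondent, how often the two ratings in a pair coincide. A minimal Python sketch of that calculation follows. The function names `agreement_rate` and `cross_tab` are hypothetical, and the sample data are invented purely for illustration; they are not the survey responses.

```python
from collections import Counter


def agreement_rate(pairs):
    """Fraction of respondents who gave the same rating to both versions.

    `pairs` is a list of (verbal_rating, numerical_rating) tuples,
    one tuple per respondent.
    """
    if not pairs:
        raise ValueError("no paired responses to compare")
    same = sum(1 for verbal, numerical in pairs if verbal == numerical)
    return same / len(pairs)


def cross_tab(pairs):
    """Count how often each (verbal, numerical) rating combination occurs."""
    return Counter(pairs)


# Hypothetical ratings from five respondents (illustration only).
sample_pairs = [
    ("good", "good"),
    ("fair", "poor"),
    ("fair", "fair"),
    ("poor", "poor"),
    ("good", "fair"),
]

rate = agreement_rate(sample_pairs)
table = cross_tab(sample_pairs)
print(f"identical ratings: {rate:.0%}")
for (verbal, numerical), count in sorted(table.items()):
    print(f"verbal={verbal:5s} numerical={numerical:5s} n={count}")
```

The cross-tabulation is included because the agreement rate alone does not show in which direction the mismatched ratings differ.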
Another way to understand the public's interpretation of POPs was to use a question which plainly asked for a definition. The question stated,
These questions were also cross-referenced to determine whether people answered the two questions similarly. Because both versions of this question were given on the same questionnaire, respondents should have been aware that they were answering similar questions. Therefore, if they gave the same answer to both questions, it could be assumed that the respondent felt he/she knew the answer. Those who answered the two questions differently could either be guessing or may have actually felt that there were different definitions for numerically and verbally expressed POPs. The cross-reference showed that only 43% had the same answer for both questions, and only one respondent out of 242 answered both questions correctly. Sink performed the same cross-reference and found that 59% of her respondents had consistent answers, and only 7% were consistent and correct.
Essentially none of the respondents to the Juneau questionnaire (one of 242, or less than 1%) chose the NWS definition for a POP forecast on both questions. The most popular answer to both questions was "C," which is the choice most similar to the correct answer. For the numerical version, 85% of the respondents chose "C," while only 50% of the respondents picked "C" for the verbal version. Again, this strongly suggests that the verbal expression of probability, and the word likely in particular, does not convey the same idea as the equivalent numerical POP. Interestingly, answer "D," which applied the percent to area coverage, received 34% of the responses when the word likely was used. This trend, combined with the fact that 57% of the respondents answered the two questions differently, suggests that the verbal qualifiers may actually be taking meaning out of the forecasts instead of adding helpful detail.
It is worth noting that the results from this question pair, in terms of the lack of correct responses, may be due to confusion about the wording. The part which says, "for example, your house," may have tricked people into thinking it was the wrong answer. One respondent wrote, "I don't expect every weather prediction I hear on the radio to occur at my house." This comment indicates that he/she had the right idea but chose answer "C" on the questionnaire. This question pair may need to be reworded to better contrast the answer choices without using the phrase, "for example, your house."