Type of Document: Thesis
Author: Shafer, Phillip Edmond
Author's Email Address: Phil.Shafer@noaa.gov
URN: etd-07022004-120856
Title: Developing Statistical Guidance for Forecasting the Amount of Warm Season Afternoon and Evening Lightning in South Florida
Degree: Master of Science
Department: Meteorology, Department of
Advisory Committee:
- Dr. Henry Fuelberg, Committee Chair
- Andrew I. Watson, Committee Member
- Dr. Jon Ahlquist, Committee Member
- Dr. Paul Ruscher, Committee Member
Keywords:
- South Florida
- lightning climatology
- statistical forecasting
- logistic regression
Date of Defense: 2004-06-22
Availability: unrestricted
Abstract:
Fourteen years of cloud-to-ground lightning data from the National Lightning Detection Network, and radiosonde releases from Miami and West Palm Beach, are used to develop statistical guidance equations that forecast the amount of warm season afternoon and evening lightning expected over two areas of South Florida that are serviced by Florida Power and Light Corporation (FP&L): the eastern halves of Miami-Dade and Broward Counties. A total of 54 parameters are calculated from the soundings to serve as candidate predictors for the equations. These include parameters that describe wind direction and speed in various layers, moisture, temperature, and stability. Day number, persistence, and same-day morning lightning also are included as potential predictors of afternoon lightning.
A variety of statistical modeling techniques is attempted initially, but many are found to be inappropriate. The best results are obtained by creating four quartile groups of flash count based on climatology, and then using binary logistic regression to develop three prediction equations for each domain: one giving the conditional probability of a quartile one (Q1) lightning event, another giving the probability of a quartile three (Q3) or greater event, and a third giving the probability of a quartile four (Q4) lightning event. Principal component analysis is used to select a subset of non-redundant predictors that have the greatest physical relevance to convection and lightning in South Florida. The final candidate sounding predictors are the vector mean 1000-700 hPa cross-shore wind component and speed, the K-index, modified Lifted Index, and the temperature at 900 hPa. Non-linear effects are considered by including second-, third-, and fourth-order terms as additional candidate predictors. A combination of stepwise screening and cross-validation is used to select the variables that best generalize to independent data. To determine the most likely quartile of lightning activity, a decision tree scheme is constructed using probability thresholds for the three equations. Finally, the resulting prediction schemes are tested independently using k-fold cross-validation.
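The way three binary logistic equations can be combined into a single quartile forecast may be easier to follow with a small sketch. The abstract does not give the equations, the coefficients, the probability thresholds, or the exact order of the tree branches, so everything below is an assumption made for illustration: the function names, the 0.5 thresholds, and the branch order are all hypothetical.

```python
import math


def logistic(z):
    """Map a linear predictor z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def polynomial_terms(x, order=4):
    """Return [x, x**2, ..., x**order], i.e. a predictor plus its
    higher-order terms as extra candidate predictors."""
    return [x ** k for k in range(1, order + 1)]


def most_likely_quartile(p_q1, p_q3_plus, p_q4, thresholds=(0.5, 0.5, 0.5)):
    """Pick a quartile from three probabilities.

    p_q1      : probability of a Q1 event
    p_q3_plus : probability of a Q3-or-greater event
    p_q4      : probability of a Q4 event
    thresholds: HYPOTHETICAL cut-offs (not taken from the thesis)
    The branch order is also an assumption, not the author's tree.
    """
    t_q1, t_q3_plus, t_q4 = thresholds
    if p_q1 >= t_q1:
        return 1
    if p_q3_plus < t_q3_plus:
        return 2
    if p_q4 >= t_q4:
        return 4
    return 3


# Usage with made-up probabilities (not thesis output):
forecast = most_likely_quartile(p_q1=0.10, p_q3_plus=0.80, p_q4=0.30)
```

In a scheme of this shape, the thresholds would be tuned on the dependent data rather than fixed at 0.5.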
The dominant effect in each of the equations is the component of the wind perpendicular to the coastline, which is found to have a significant non-linear relationship with lightning activity. Other important variables are the K-index and modified Lifted Index. Day number, persistence, and same-day morning activity also are selected as important indicators of afternoon lightning in the two domains.
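As an illustration of what a cross-shore wind component is, the sketch below projects a layer-mean wind onto the direction normal to the coast. It rests on several assumptions not stated in the abstract: the coastline is treated as running exactly north-south with the ocean to the east (so the onshore bearing is 270 degrees), the layer mean is a simple unweighted average of the reported levels, and all function names are hypothetical.

```python
import math


def wind_components(direction_deg, speed):
    """Convert a meteorological wind (direction the wind blows FROM,
    in degrees clockwise from north) into eastward (u) and northward (v)
    components."""
    rad = math.radians(direction_deg)
    return -speed * math.sin(rad), -speed * math.cos(rad)


def vector_mean_wind(levels):
    """Unweighted vector mean of (direction_deg, speed) pairs.
    ASSUMPTION: the thesis may weight levels differently."""
    comps = [wind_components(d, s) for d, s in levels]
    n = len(comps)
    return sum(u for u, _ in comps) / n, sum(v for _, v in comps) / n


def cross_shore_component(u, v, onshore_bearing_deg=270.0):
    """Project (u, v) onto the bearing pointing from sea to land.
    Positive values mean onshore flow, negative values offshore flow.
    The 270-degree default is a HYPOTHETICAL idealized east coast."""
    rad = math.radians(onshore_bearing_deg)
    return u * math.sin(rad) + v * math.cos(rad)


# Usage with made-up levels: easterly flow at two levels in the layer.
u_mean, v_mean = vector_mean_wind([(90.0, 10.0), (90.0, 6.0)])
onshore = cross_shore_component(u_mean, v_mean)
```

A signed component like this lets onshore and offshore regimes fall on opposite sides of zero, which is one reason higher-order terms can capture a non-linear response.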
When each year is treated independently, the Miami-Dade scheme correctly forecasts the quartile ~37% of the time and is correct to within one quartile of the observed ~79% of the time. The scheme for eastern Broward County forecasts the correct quartile ~36% of the time and is correct to within one quartile ~77% of the time. The prediction schemes generally are superior to persistence and climatology for both the dependent data and during k-fold cross-validation. Thus, they possess real forecast skill. For example, when forecasting the correct quartile, these results are a ~4-6 percentage point improvement over persistence, and a ~11-12 percentage point improvement over climatology. In terms of correctly predicting to within one quartile of the observed, the two schemes are an improvement over persistence by ~6-8 percentage points and over climatology by ~14-17 percentage points. Further analysis shows that the two schemes rarely forecast the upper two quartiles when no activity is observed. Additionally, correct predictions of Q4 events are shown to increase with flash count within the Q4 category. Overall, the cross-validation results show only a 1-2% reduction in skill from what is obtained for the fourteen years of dependent data, demonstrating that the two schemes are statistically robust and can be expected to achieve similar results when implemented operationally.
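The two verification measures quoted above (correct quartile, and correct to within one quartile) and the comparison against persistence can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, persistence is taken to mean "yesterday's observed quartile", and the sample sequences are invented toy data, not results from the thesis.

```python
def quartile_scores(forecast, observed):
    """Return (fraction exactly correct, fraction within one quartile)."""
    if not observed or len(forecast) != len(observed):
        raise ValueError("forecast and observed must be non-empty and equal length")
    n = len(observed)
    pairs = list(zip(forecast, observed))
    exact = sum(f == o for f, o in pairs) / n
    within_one = sum(abs(f - o) <= 1 for f, o in pairs) / n
    return exact, within_one


def persistence_forecast(observed, first_guess=2):
    """Forecast each day's quartile as the previous day's observed quartile.
    ASSUMPTION: the first day, which has no predecessor, gets first_guess."""
    return [first_guess] + list(observed[:-1])


def percentage_point_gain(score, baseline):
    """Improvement of one hit rate over another, in percentage points."""
    return 100.0 * (score - baseline)


# Toy data only (quartiles 1-4), invented for illustration.
observed = [1, 2, 4, 3, 3, 1, 2, 4]
scheme = [1, 3, 4, 3, 2, 2, 2, 3]

scheme_exact, scheme_within = quartile_scores(scheme, observed)
persist_exact, persist_within = quartile_scores(persistence_forecast(observed), observed)
gain_exact = percentage_point_gain(scheme_exact, persist_exact)
```

Reporting the gain in percentage points, as the abstract does, keeps the comparison independent of how large the baseline hit rate happens to be.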
Filename: Lightning.pdf
Size: 3.11 Mb