DATA ANALYST
Tools
Microsoft Excel
Microsoft Word
Microsoft PowerPoint
Python
Tableau
Skills
Data Extraction and Cleaning
Data Exploration
Data Preprocessing
Machine Learning with Python
Joining Tables
Data
Dataset Weather Prediction
Project brief
This project, Predicting Weather Variations with Machine Learning, was carried out for ClimateWins, a European nonprofit organization. ClimateWins is interested in using machine learning to help predict the consequences of climate change around Europe and, potentially, the world.
Objective
• Identify weather patterns outside the regional norm in Europe.
• Determine if unusual weather patterns are increasing.
• Generate possibilities for future weather conditions over the next 25 to 50 years.
• Determine the safest regions for habitation in Europe over the next 25 to 50 years.
Hypothesis: CNNs can better interpret radar and satellite imagery to classify weather conditions, improving the prediction of weather trends.
Approach:
• Developed a CNN model to classify radar images of various weather conditions (e.g., cloudy, rainy, sunny).
• Used Bayesian optimization to refine hyperparameters like the number of neurons, batch size, and learning rate for better accuracy.
Model Used: CNN with Bayesian optimization.
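The Bayesian optimization step can be illustrated with a minimal sketch. The source does not say which library or search space was used, so everything here is an assumption: the loop is a generic Gaussian-process search with an expected-improvement rule built on scikit-learn, it tunes only the log10 learning rate, and `validation_score` is a hypothetical stand-in for "train the CNN and return validation accuracy".

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def validation_score(log_lr):
    """Hypothetical stand-in for training the CNN and returning validation
    accuracy. It peaks at log10(lr) = -3; a real objective would fit the model."""
    return float(np.exp(-(log_lr + 3.0) ** 2))


def expected_improvement(candidates, gp, best_so_far, xi=0.01):
    """Expected improvement of each candidate over the best score seen so far."""
    mean, std = gp.predict(candidates, return_std=True)
    std = np.maximum(std, 1e-9)  # avoid division by zero
    gain = mean - best_so_far - xi
    z = gain / std
    return gain * norm.cdf(z) + std * norm.pdf(z)


def bayes_optimize(objective, low, high, n_init=4, n_iter=12, seed=0):
    """Maximize a 1-D objective on [low, high] with a GP surrogate."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(low, high, size=(n_init, 1))      # random starting points
    y = np.array([objective(x[0]) for x in X])
    grid = np.linspace(low, high, 200).reshape(-1, 1)  # candidate settings
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(
            kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True, random_state=seed
        )
        gp.fit(X, y)                                   # refit surrogate on all trials
        ei = expected_improvement(grid, gp, y.max())
        x_next = grid[np.argmax(ei)]                   # most promising candidate
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next[0]))
    best = int(np.argmax(y))
    return X[best, 0], y[best]


best_log_lr, best_score = bayes_optimize(validation_score, low=-6.0, high=-1.0)
print(f"best log10(lr) = {best_log_lr:.2f}, score = {best_score:.3f}")
```

In a real run, the objective would also take the number of neurons and batch size, and each evaluation would be a full training pass, which is why a sample-efficient search is attractive here.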
Result:
• Initial Accuracy: The unoptimized CNN achieved around 11% accuracy.
• Optimized Model Accuracy: After Bayesian optimization, accuracy improved significantly to 80%.
• Confusion Matrix: Shows how well the model differentiates between weather conditions such as 'cloudy' and 'rainy', and highlights where classification errors were reduced after optimization.
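For readers unfamiliar with confusion matrices, the following is a small sketch of how one is built and how accuracy is read off its diagonal. The labels follow the classes named above; the true and predicted values are made-up illustration data, not project results.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = np.zeros((len(labels), len(labels)), dtype=int)
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label], index[predicted_label]] += 1
    return matrix


labels = ["cloudy", "rainy", "sunny"]
# Illustrative values only (not taken from the project).
y_true = ["cloudy", "cloudy", "rainy", "rainy", "sunny", "sunny"]
y_pred = ["cloudy", "rainy", "rainy", "rainy", "sunny", "cloudy"]

cm = confusion_matrix(y_true, y_pred, labels)
accuracy = np.trace(cm) / cm.sum()  # correct predictions sit on the diagonal
print(cm)
print(f"accuracy = {accuracy:.2f}")
```

Off-diagonal cells show which classes are confused with each other, for example a true 'cloudy' image predicted as 'rainy'.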
Conclusion: This model demonstrated the potential for analyzing complex visual data, making it useful for predicting shifts in weather patterns.
Hypothesis: A random forest model can identify abnormal weather trends based on historical patterns and station data.
Approach:
• Used RandomForestClassifier to analyze key features like precipitation and temperature across various European weather stations.
• Applied RandomizedSearchCV for hyperparameter tuning, optimizing parameters such as n_estimators, max_depth, and min_samples_split to improve model accuracy.
Model Used: RandomForestClassifier with optimized hyperparameters.
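The tuning setup described above might look roughly like the sketch below. The data is a synthetic placeholder, and the parameter ranges, `n_iter`, and `cv` values are assumptions chosen for a quick run; the project's actual search space is not stated in the source.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic placeholder data standing in for the weather-station features.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 6))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Assumed search space covering the three parameters named in the text.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 5, 10],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=5,            # number of random parameter combinations to try
    cv=3,                # 3-fold cross-validation per combination
    scoring="accuracy",
    random_state=42,
)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model
print(search.best_params_, f"test accuracy = {test_accuracy:.3f}")
```

Random search samples a fixed number of combinations instead of trying every one, which keeps tuning time predictable as the search space grows.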
Result:
• Accuracy: Improved from an initial 71.2% to approximately 72% after hyperparameter optimization.
• Feature Importance: The most predictive features were Kassel, Belgrade and Heathrow stations, highlighting the areas with significant weather variation.
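Ranking stations rather than individual columns implies summing per-feature importances by station. A minimal sketch follows, assuming column names of the form `STATION_measurement` (a hypothetical naming scheme) and using made-up importance values in place of a fitted model's `feature_importances_`.

```python
from collections import defaultdict


def importance_by_station(feature_names, importances):
    """Sum feature importances per station, assuming 'STATION_measurement' names."""
    totals = defaultdict(float)
    for name, value in zip(feature_names, importances):
        station = name.split("_", 1)[0]  # text before the first underscore
        totals[station] += value
    # Highest total importance first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical column names and illustrative importances (not project results).
feature_names = [
    "KASSEL_temp_mean",
    "KASSEL_precipitation",
    "BELGRADE_temp_mean",
    "HEATHROW_precipitation",
]
importances = [0.30, 0.20, 0.35, 0.15]

ranking = importance_by_station(feature_names, importances)
for station, total in ranking:
    print(f"{station}: {total:.2f}")
```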
Conclusion: This approach successfully identified areas experiencing deviations from historical patterns, helping to detect increasing anomalies like shifts in precipitation and temperature extremes.
Loss Plot
Displays training and validation loss. Lower loss indicates better fit, with spikes showing areas for improvement.
Accuracy Plot
Tracks accuracy over epochs. Higher values indicate better performance; fluctuations suggest varying model stability.
Confusion Matrix
Shows how well the model classifies weather types. Most predictions are accurate, but some classes are confused with others.
Model Performance Evaluation Using CNN
Deep Learning for Image-Based Weather Classification
Deep Learning with CNNs
• Description: CNNs were applied to classify weather conditions based on radar and satellite imagery.
• Results: Improved test accuracy to 80.17% through Bayesian optimization.
• Key Features: Enabled analysis of complex spatial patterns in weather images.
Random Forest Model
• Description: An ensemble learning method used to classify weather conditions, predict safe flight conditions, and analyze variable importance.
• Results: Achieved 73% accuracy with RandomizedSearchCV for multi-station data; 100% accuracy when focused on a single station such as Maastricht.
• Key Features: Precipitation, temperature metrics, cloud cover, and sunshine.
Predicting Anomalies with Random Forest
Focus on using ensemble models to analyze shifts in weather patterns based on historical data.
RECOMMENDATIONS
• Use Random Forest models for immediate analysis of feature importance and to identify key predictors of abnormal weather patterns.
• Implement CNNs for analyzing satellite data and weather imagery to improve real-time classification of weather conditions.
• Invest in developing GANs for longer-term scenario simulation, helping ClimateWins plan for potential future climates.
Ivonne Aspilcueta
Data Analyst
Hermosa Beach, CA, United States