This paper introduces a new evaluation function for solving the multiple-instance problem. Our approach makes use of the main idea of diverse density (Maron, 1998; Maron & Lozano-Pérez, 1998) but finds the best concept using the chi-square statistic. This approach is simpler than diverse density and allows us to search more extensively by using properties of the contingency table to prune in a guaranteed manner. We demonstrate that this approach solves the multiple-instance problem as well as or better than diverse density and that the pruning mechanism allows chi-square to identify the best concepts more quickly.
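As a rough illustration of scoring a candidate concept with the chi-square statistic, the sketch below builds a 2x2 contingency table (bag covered or not, bag positive or negative) and computes the standard Pearson chi-square value for it. The coverage rule (a bag counts as covered when any instance lies within `radius` of the concept point) and all function names are assumptions made for this example; the abstract does not specify them, and the guaranteed pruning step is not shown.

```python
import math

def covers(concept, bag, radius):
    """Assumed coverage rule: a bag is covered if any instance is within radius."""
    return any(math.dist(concept, inst) <= radius for inst in bag)

def contingency(concept, pos_bags, neg_bags, radius):
    """Return (a, b, c, d): covered positive, covered negative,
    uncovered positive, uncovered negative bag counts."""
    a = sum(covers(concept, bag, radius) for bag in pos_bags)
    b = sum(covers(concept, bag, radius) for bag in neg_bags)
    return a, b, len(pos_bags) - a, len(neg_bags) - b

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence of association
    return n * (a * d - b * c) ** 2 / denom

# Hypothetical toy data: each bag is a list of 2-D instances.
pos_bags = [[(0.0, 0.0), (5.0, 5.0)], [(0.1, 0.0), (9.0, 9.0)]]
neg_bags = [[(5.0, 5.0)], [(9.0, 9.0)]]

table = contingency((0.0, 0.0), pos_bags, neg_bags, radius=0.5)
score = chi_square_2x2(*table)
```

A concept that covers every positive bag and no negative bag yields the largest value the table allows, which is why the statistic can serve as a ranking function over candidate concepts.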
This paper presents a method by which a reinforcement learning agent can automatically discover certain types of subgoals online. By creating useful new subgoals while learning, the agent is able to accelerate learning on the current task and to transfer its expertise to other, related tasks through the reuse of its ability to attain subgoals. The agent discovers subgoals based on commonalities across multiple paths to a solution. We cast the task of finding these commonalities as a multiple-instance learning problem and use the concept of diverse density to find solutions. We illustrate this approach using several gridworld tasks.
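To make the multiple-instance framing concrete, the following sketch treats each trajectory as a bag of visited states and scores candidate subgoal states with diverse density. It uses the commonly cited noisy-or formulation with a Gaussian-like instance model; that choice, the `scale` parameter, the toy trajectories, and every function name are assumptions for illustration, not details taken from the abstract.

```python
import math

def instance_prob(concept, instance, scale=1.0):
    """Assumed instance model: exp(-scale * squared distance to the concept)."""
    sq_dist = sum((c - x) ** 2 for c, x in zip(concept, instance))
    return math.exp(-scale * sq_dist)

def diverse_density(concept, pos_bags, neg_bags, scale=1.0):
    """Noisy-or diverse density: high when every positive bag has an instance
    near the concept and no negative bag does."""
    dd = 1.0
    for bag in pos_bags:
        dd *= 1.0 - math.prod(1.0 - instance_prob(concept, x, scale) for x in bag)
    for bag in neg_bags:
        dd *= math.prod(1.0 - instance_prob(concept, x, scale) for x in bag)
    return dd

# Hypothetical gridworld trajectories: successful ones pass through (3, 3).
successful = [[(0, 0), (1, 1), (3, 3), (5, 5)],
              [(0, 4), (2, 3), (3, 3), (6, 2)]]
unsuccessful = [[(0, 0), (1, 1), (0, 4)]]

# Candidate subgoals: every state seen on a successful trajectory.
candidates = {state for bag in successful for state in bag}
best = max(candidates, key=lambda s: diverse_density(s, successful, unsuccessful))
```

In this toy setting the state shared by all successful trajectories and absent from the unsuccessful one receives the highest score, which mirrors the idea of finding commonalities across multiple paths to a solution.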
Abstract We present and evaluate a deep learning first-guess front-identification system that identifies cold, warm, stationary, and occluded fronts. Frontal boundaries play a key role in the daily weather around the world. Human-drawn fronts provided by the National Weather Service’s Weather Prediction Center, Ocean Prediction Center, Tropical Analysis and Forecast Branch, and Honolulu Forecast Office are treated as ground-truth labels for training the deep learning models. The models are trained using ERA5 data with variables known to be important for distinguishing frontal boundaries, including temperature, equivalent potential temperature, and wind velocity and direction at multiple heights. Using a 250-km neighborhood over the contiguous U.S. domain, our best models achieve critical success index scores of 0.60 for cold fronts, 0.43 for warm fronts, 0.48 for stationary fronts, 0.45 for occluded fronts, and 0.71 using a binary classification system (front/no front), whereas scores over the full unified surface analysis domain were lower. For cold and warm fronts and binary classification, these scores significantly outperform prior baseline methods that utilize 250-km neighborhoods. These first-guess deep learning algorithms can be used by forecasters to locate frontal boundaries more effectively and expedite the frontal analysis process. Significance Statement Fronts are boundaries that affect the weather that people experience daily. Currently, forecasters must identify these boundaries through manual analysis. We have developed an automated machine learning method for detecting cold, warm, stationary, and occluded fronts. Our automated method provides forecasters with an additional tool to expedite the frontal analysis process.
Welcome to the second issue in our third year of AI Matters. This issue features a timely new column: AI Events. This column is written by Michael Rovatsos and gives a summary of upcoming AI events for the rest of the year. AI Events will be a regular feature for future issues as well.
Fundamental to reinforcement learning, as well as to the theory of systems and control, is the problem of representing knowledge about the environment and about possible courses of action hierarchically, at a multiplicity of interrelated temporal scales. For example, a human traveler must decide which cities to go to, whether to fly, drive, or walk, and the individual muscle contractions involved in each step. In this paper we survey a new approach to reinforcement learning in which each of these decisions is treated uniformly. Each low-level action and high-level course of action is represented as an option: a (sub)controller together with a termination condition. The theory of options is based on the theories of Markov and semi-Markov decision processes, but extends these in significant ways. Options can be used in place of actions in all the planning and learning methods conventionally used in reinforcement learning. Options and models of options can be learned for a wide variety of different subtasks, and then rapidly combined to solve new tasks. Options enable planning and learning simultaneously at a wide variety of time scales, and toward a wide variety of subtasks, substantially increasing the efficiency and abilities of reinforcement learning systems.
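A minimal sketch of the option idea described above: an option pairs a controller (a policy over primitive actions) with a termination condition, and a primitive action is simply an option that always terminates after one step. The `Option` class, `run_option`, the corridor environment, and the undiscounted reward sum are all hypothetical choices for this example rather than the paper's formalism.

```python
import random
from dataclasses import dataclass
from typing import Callable

@dataclass
class Option:
    policy: Callable[[int], int]         # state -> primitive action
    termination: Callable[[int], float]  # state -> probability of stopping

def run_option(option, state, step, rng=random.random, max_steps=1000):
    """Follow the option's policy until its termination condition fires.
    Returns (final state, undiscounted reward sum, number of steps)."""
    total_reward, steps = 0.0, 0
    while steps < max_steps:
        state, reward = step(state, option.policy(state))
        total_reward += reward
        steps += 1
        if rng() < option.termination(state):
            break
    return state, total_reward, steps

def corridor_step(state, action):
    """Hypothetical corridor with states 0..5 and a cost of 1 per move."""
    next_state = max(0, min(5, state + action))
    return next_state, -1.0

# A temporally extended option: keep moving right until state 5 is reached.
go_right = Option(policy=lambda s: 1,
                  termination=lambda s: 1.0 if s == 5 else 0.0)

# A primitive action expressed as a one-step option.
primitive_right = Option(policy=lambda s: 1, termination=lambda s: 1.0)

extended_result = run_option(go_right, 0, corridor_step)
primitive_result = run_option(primitive_right, 0, corridor_step)
```

Because both kinds of behavior share one interface, a planner or learner can choose among them without distinguishing single-step actions from multi-step courses of action.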
We introduce and validate Spatiotemporal Relational Random Forests, which are random forests created with spatiotemporal relational probability trees. We build on the documented success of random forests by bringing spatiotemporal capabilities to the trees, enabling them to identify critical spatial, temporal, and spatiotemporal features in the data. We validate our results on simulated data and real-world convectively-induced turbulence data from a commercial airline flying in the continental United States.
Our transformational science question is: can we revolutionize both the prediction and understanding of extreme events through trustworthy AI? Our use-cases include extreme weather such as tornadoes and hail as well as water-based events including extreme precipitation, compound flooding, harmful algal blooms, and sea turtle cold stunnings and nest inundations.
Abstract Thunderstorms in the United States cause over 100 deaths and $10 billion (U.S. dollars) in damage per year, much of which is attributable to straight-line (nontornadic) wind. This paper describes a machine-learning system that forecasts the probability of damaging straight-line wind (≥50 kt or 25.7 m s−1) for each storm cell in the continental United States, at distances up to 10 km outside the storm cell and lead times up to 90 min. Predictors are based on radar scans of the storm cell, storm motion, storm shape, and soundings of the near-storm environment. Verification data come from weather stations and quality-controlled storm reports. The system performs very well on independent testing data. The area under the receiver operating characteristic (ROC) curve ranges from 0.88 to 0.95, the critical success index (CSI) ranges from 0.27 to 0.91, and the Brier skill score (BSS) ranges from 0.19 to 0.65 (>0 is better than climatology). For all three scores, the best value occurs for the smallest distance (inside storm cell) and/or lead time (0–15 min), while the worst value occurs for the greatest distance (5–10 km outside storm cell) and/or lead time (60–90 min). The system was deployed during the 2017 Hazardous Weather Testbed.
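For readers unfamiliar with two of the verification scores quoted above, the sketch below computes the critical success index (hits divided by hits plus misses plus false alarms) and the Brier skill score relative to a constant climatological forecast. The probability threshold, the use of the sample event frequency as climatology, and the toy data are assumptions for illustration; the abstract does not state how the system's thresholds or climatology were chosen.

```python
def critical_success_index(probs, labels, threshold):
    """CSI = hits / (hits + misses + false alarms) after thresholding."""
    hits = misses = false_alarms = 0
    for p, y in zip(probs, labels):
        predicted = p >= threshold
        if predicted and y:
            hits += 1
        elif predicted and not y:
            false_alarms += 1
        elif y:
            misses += 1
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")

def brier_skill_score(probs, labels):
    """BSS = 1 - BS / BS_climatology; values above 0 beat climatology.
    Climatology here is assumed to be the event frequency in `labels`."""
    n = len(labels)
    climatology = sum(labels) / n
    bs = sum((p - y) ** 2 for p, y in zip(probs, labels)) / n
    bs_clim = sum((climatology - y) ** 2 for y in labels) / n
    return 1.0 - bs / bs_clim if bs_clim else float("nan")

# Hypothetical forecasts and observations (1 = damaging wind occurred).
probs = [0.9, 0.8, 0.2, 0.1]
labels = [1, 0, 1, 0]

csi = critical_success_index(probs, labels, threshold=0.5)
bss = brier_skill_score(probs, labels)
```

CSI ignores correct negatives, which makes it suitable for rare events, while the Brier skill score evaluates the probabilities directly without a threshold.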