Machine Learning Workshops provide accessible tools for Water Resource Sciences Applications

 by Xiang Li, PhD graduate student, Water Resource Sciences and John Nieber, Department of Bioproducts and Biosystems Engineering, Co-DGS for WRS


AI (artificial intelligence) and machine learning exploit data with the aim of extracting as much information as possible to understand a system. Data-driven approaches, especially the neural network family, have been implemented broadly in interdisciplinary efforts between computer scientists and domain scientists to develop relevant machine learning algorithms and to test their ability to advance scientific knowledge. In particular, applications of machine learning have sprung up across the disciplines of water resource science, and even at this early stage they are providing valuable scientific insights. This should encourage students and faculty in water resource sciences to learn how to apply readily available machine learning tools.

Although developing the software behind machine learning tools requires a strong background in mathematics and computer science, this does not mean that students or researchers without that background cannot learn how to apply the tools. To fill this gap and help those who are interested in starting to use machine learning in their own research or thesis project, the first author, a current WRS PhD candidate (minor in computer science), offered and hosted a three-part workshop series on ‘machine learning for non-machine learners’ in collaboration with the WRS graduate program. A total of 20 participants from the Twin Cities and Duluth campuses, including WRS graduate students, postdocs and faculty, registered for the workshop.

The workshop spanned three weeks and was held online via Zoom every Tuesday evening from April 13 to April 27. The topics covered basic programming scripts, the machine learning knowledge framework, and implementation details. Through the workshop, participants had the opportunity to acquire a grasp of the history and fundamentals of machine learning, as well as hands-on experience with basic machine learning methods.

Participants represented various specializations in water resource sciences, including environmental chemistry, aquatic biology, limnology, water policy and economics, hydrology, and hydrogeology. To accommodate these diverse backgrounds, the math-intensive machine learning concepts were introduced in an engaging, layperson-friendly fashion. To ease the learning curve and avoid overwhelming participants with information at the start, linear regression, which is familiar to scientists from all backgrounds, was used in the introductory session to familiarize participants with machine learning. As the workshop progressed, it gradually transitioned to more complex topics, including model optimization, artificial neural networks (ANN) and advanced variants of ANN (recurrent neural networks and convolutional neural networks).

Metaphors were frequently used to help participants quickly digest and understand the more challenging concepts. For example, one metaphor described the traditional artificial neural network as a layer-wise computational flow, like a three-floor house. Input data passing through the three computation layers of the network is like a person who enters the house at the ground floor and climbs to the top floor. A second metaphor applied to the optimization of a neural network. That procedure involves iteratively training a machine learning model until an optimal parameter set is found. This was likened to the step-by-step search that a scuba diver might use to locate a treasure box at the bottom of a lake, guided by the bathymetry.
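The two metaphors can be made concrete with a short sketch. The Python code below is illustrative only and is not taken from the workshop materials: the layer sizes, the tanh activation, the learning rate, and the synthetic data are all assumptions chosen for the example. The first part passes data through three computation layers, one "floor" at a time. The second part uses gradient descent on a linear regression problem, where each step moves the parameters downhill on the error surface, much as the diver follows the lake bottom toward its deepest point.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# --- Part 1: the "three-floor house" (forward pass through three layers) ---
def forward(x, weights, biases):
    """Pass the input through each layer in turn and return the output."""
    activation = x
    # Hidden layers: linear transform followed by a tanh nonlinearity.
    for W, b in zip(weights[:-1], biases[:-1]):
        activation = np.tanh(activation @ W + b)
    # Final layer: linear output, as is common for regression.
    return activation @ weights[-1] + biases[-1]

# Assumed sizes: 4 inputs -> 8 units -> 8 units -> 1 output.
sizes = [4, 8, 8, 1]
weights = [0.5 * rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

x = rng.normal(size=(5, 4))          # five made-up input samples
y_hat = forward(x, weights, biases)  # one prediction per sample

# --- Part 2: the "scuba diver" (gradient descent on linear regression) ---
# Synthetic, noise-free data generated from y = 2x + 1.
x_lr = np.linspace(0.0, 1.0, 50)
y_lr = 2.0 * x_lr + 1.0

w, b = 0.0, 0.0        # starting position of the "diver"
learning_rate = 0.5    # assumed step size
losses = []
for step in range(2000):
    error = w * x_lr + b - y_lr
    losses.append(np.mean(error ** 2))   # mean squared error ("depth")
    grad_w = 2.0 * np.mean(error * x_lr) # slope of the loss along w
    grad_b = 2.0 * np.mean(error)        # slope of the loss along b
    w -= learning_rate * grad_w          # step downhill
    b -= learning_rate * grad_b

print(f"fitted slope = {w:.4f}, fitted intercept = {b:.4f}")
```

In practice, a neural network's weights are trained with the same kind of downhill steps, with the gradients computed automatically by a machine learning library rather than by hand.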

To aid comprehension, the calculations and operations involved in neural network computation were animated to visualize the abstract implementation details. The workshop also provided hands-on practical applications in which participants translated the concepts they had learned into scripts. Those practical applications involved both a classic machine learning problem (classification) and a hydrologic application (streamflow prediction). An illustration of the architecture for a machine learning algorithm for streamflow prediction is shown below.

At the end of the workshop, participants evaluated the workshop, and their evaluations indicated that they found it very informative and useful for their own research. Although the online Zoom format limited the degree of interaction in coding assignments, participants were able to grasp the machine learning concepts. With the hands-on practical applications behind them, participants should be able to start implementing machine learning on their own. They will now be better equipped to read research articles involving machine learning, and better able to apply the broadly available machine learning software. It is our hope that students in the water resource sciences program will come to view machine learning as an essential tool for advancing their research.

 

It is expected that the same workshop will be offered again during the Fall semester of 2021.