Tenth Workshop on Data Mining in Earth System Science (DMESS 2022)

Co-conveners: Forrest M. Hoffman, Jitendra Kumar and Sarat Sreepathi

In conjunction with the IEEE International Conference on Data Mining (ICDM 2022)

Orlando, Florida, USA

Spanning many orders of magnitude in time and space scales, Earth science data, from point measurements to process-based Earth system model output, are increasingly large and complex, and often represent very long time series, making these data difficult to analyze, visualize, interpret, and understand. An “explosion” of heterogeneous, multi-disciplinary data, including observations and models of interacting natural, engineered, and human systems, has rendered traditional means of integration and analysis ineffective, necessitating the application of new analytical methods and the development of highly scalable software tools for synthesis, assimilation, comparison, and visualization. For complex, nonlinear feedbacks among chaotic processes, new methods and approaches for data mining and computational statistics are required for classification and change detection, model evaluation and benchmarking, uncertainty quantification, and incorporation of constraints from physics, chemistry, and biology into analysis. This workshop explores various data mining approaches and algorithms for understanding nonlinear dynamics of weather and climate systems and their interactions with biogeochemical cycles, impacts of natural system responses and climate extremes on engineered systems and interdependent infrastructure networks, and mitigation and adaptation strategies for natural hazards and infrastructure and ecosystem resilience. Encouraged are original research papers describing applications of statistical and data mining methods that support analysis and discovery in climate predictability, attributions, weather extremes, water resources management, risk analysis and hazards assessment, ecosystem sustainability, infrastructure resilience, and geo-engineering.

Also desired are rigorous review papers that either have the potential to expose data mining researchers to commonly used data-driven methods in the Earth sciences or discuss the applicability and caveats of such methods from a machine learning or statistical perspective. Methods may include, but are not limited to, cluster analysis, empirical orthogonal functions (EOFs), extreme value and rare events analysis, genetic algorithms, neural networks and deep learning methods, physics-constrained data analytics, automated data assimilation, and other machine learning techniques. Novel approaches that bring new ideas from nonlinear dynamics and information theory, network science and graphical methods, and the state of the art in computational statistics and econometrics into data mining and machine learning are particularly encouraged.

Program

10th Workshop on Data Mining in Earth System Science (DMESS 2022)
at the IEEE International Conference on Data Mining (ICDM 2022)

Key West B Room
Hilton Orlando, 6001 Destination Pkwy, Orlando, Florida 32819, USA
Co-convened by Forrest M. Hoffman, Jitendra Kumar, and Sarat Sreepathi
Chaired by Forrest M. Hoffman

November 28, 2022

Time | Session Chair | Paper ID | Title | Speaker | Authors
13:00 | Forrest Hoffman | | Invited Keynote Presentation: Distributed Learning from Large Geospatial Datasets in the Exascale Era | Richard Tran Mills, Argonne National Laboratory | Richard Tran Mills
13:30 | | | Introduction and Preparation for Later Panel Discussion | All Workshop Participants |
14:00 | | S10204 | Improving Net Ecosystem CO2 Flux Prediction Using Memory-based Interpretable Machine Learning | Siyan Liu | Siyan Liu, Dan Lu, Daniel Ricciuto, and Anthony Walker
14:30 | | S10206 | Using Image Processing Techniques to Identify and Quantify Spatiotemporal Carbon Cycle Extremes | Bharat Sharma | Bharat Sharma, Forrest M. Hoffman, Jitendra Kumar, and Auroop R. Ganguly
15:00 | | | Coffee Break and Refreshments (Florida Foyers) | |
15:30 | Forrest Hoffman | S10202 | Above Ground Biomass Estimation of a Cocoa Plantation using Machine Learning | Patrick Hosein | Sabrina Sankar, Marvin Lewis, and Patrick Hosein
16:00 | | S10201 | Identifying Hydrometeorological Factors Influencing Reservoir Releases Using Machine Learning Methods | Ming Fan | Ming Fan, Lujun Zhang, Siyan Liu, Tiantian Yang, and Dan Lu
16:30 | | S10205 | Streamflow Prediction Using a Hybrid Methodology Based on Convolutional Neural Network and Long Short-Term Memory | Nian Zhang | Juan F. Ramirez Rochac, Nian Zhang, Tolessa Deksissa, Jiajun Xu, and Lara A. Thompson
17:00 | | S10203 | Macaw: The Machine Learning Magnetometer Calibration Workflow | Kevin Styp-Rekowski | Jonathan Bader, Kevin Styp-Rekowski, Leon Döhler, Sören Becker, and Odej Kao
17:30 | Forrest Hoffman | | Panel Discussion | All Workshop Participants |
18:00 | | | Adjourn Workshop | |

Committee

Workshop Co-conveners

  • Forrest M. Hoffman is a Distinguished Computational Earth System Scientist at Oak Ridge National Laboratory (ORNL). As a resident researcher in ORNL’s Climate Change Science Institute (CCSI) and a member of ORNL’s Computational Sciences & Engineering Division (CSED), Forrest develops and applies Earth system models (ESMs) to investigate the global carbon cycle and feedbacks between biogeochemical cycles and the climate system. He applies data mining methods using high performance computing to problems in landscape ecology, remote sensing, and large-scale climate data analytics. He founded the workshop series on Data Mining in Earth System Science (DMESS) in 2009 and has served as lead convener for all prior workshops. Forrest is also a Joint Faculty Professor in the University of Tennessee’s Department of Civil & Environmental Engineering in nearby Knoxville, Tennessee.

  • Jitendra Kumar is a computational hydrologist at Oak Ridge National Laboratory and a Joint Assistant Professor at the University of Tennessee, Knoxville. He conducts research at the intersection of high performance computing, environmental and Earth sciences, and systems analysis and data mining. His research entails data mining, large-scale global optimization, computational hydrology and hydrogeology, landscape ecology, remote sensing, and development of parallel algorithms for large-scale supercomputers.

  • Sarat Sreepathi is a Computer Scientist interested in interdisciplinary research at the intersection of High Performance Computing and domain sciences. He is a member of the Computational Earth Sciences Group in the Computational Sciences and Engineering Division at Oak Ridge National Laboratory. He received his Ph.D. in Computer Science from North Carolina State University. He is a Senior member of ACM and IEEE. He served as the Chair of the OLCF User Group Executive Board (2020-2021) and currently serves on the NERSC User Group Executive Committee. He co-leads the Performance group for the Energy Exascale Earth System Model (E3SM) project, DOE’s flagship climate modeling effort. He leads application co-design efforts as a member of the Exascale Computing Project (ECP) application teams (Climate: E3SM-MMF and Nuclear Fusion: XGC).

Program Committee

  • Udit Bhatia (IIT Gandhinagar, Gujarat, INDIA)
  • Bjørn-Gustaf J. Brooks (Living Carbon, San Francisco, California, USA)
  • Gustau Camps-Valls (Image Processing Laboratory, University of Valencia, Valencia, SPAIN)
  • Nathan Collier (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Auroop R. Ganguly (Department of Civil and Environmental Engineering, Northeastern University, Boston, Massachusetts, USA)
  • William W. Hargrove (Eastern Forest Environmental Threat Assessment Center, USDA Forest Service, Asheville, North Carolina, USA)
  • Forrest M. Hoffman (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Jian Huang (Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, USA)
  • Soo Kyung Kim (PARC, a Xerox Company, Palo Alto, California, USA)
  • Jitendra Kumar (Terrestrial Systems Modeling Group, Environmental Sciences Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Vipin Kumar (Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA)
  • Kuldeep Kurte (Computational Sciences and Engineering Division, Oak Ridge National Laboratory, USA)
  • Zachary L. Langford (Cyber & Applied Data Analytics Division, Oak Ridge National Laboratory, USA)
  • Miguel D. Mahecha (Remote Sensing Center for Earth System Research, University of Leipzig, Leipzig, GERMANY)
  • Jiafu Mao (Terrestrial Systems Modeling Group, Environmental Sciences Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Murali Gopalakrishnan Meena (Oak Ridge Leadership Computing Facility, Oak Ridge National Laboratory, USA)
  • Richard T. Mills (Laboratory for Advanced Numerical Simulations, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois, USA)
  • Steven P. Norman (Eastern Forest Environmental Threat Assessment Center, USDA Forest Service, Asheville, North Carolina, USA)
  • Bharat Sharma (Environmental Sciences Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Sarat Sreepathi (Computer Science & Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Vamsi Sripathi (Intel Corporation, Hillsboro, Oregon, USA)
  • Karsten Steinhaeuser (Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA)
  • TJ Vandal (NASA Ames Research Center, Moffett Field, California, USA)
  • Min Xu (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Cheng-En Yang (Department of Civil and Environmental Engineering, University of Tennessee, Knoxville, Tennessee, USA)

Submission

Authors are invited to submit manuscripts of 8 pages (10 pages maximum for an additional fee) reporting unpublished, mature, and original research and recent developments or theoretical considerations in applications of data mining to Earth sciences by September 17, 2022. Paper submissions should be in the IEEE 2-column format, including the bibliography and any possible appendices.

Accepted papers will be published in the conference proceedings by the IEEE Computer Society Press. Submission implies the willingness of at least one of the authors to register and present the paper.

Manuscripts must be submitted electronically through the online submission system.

All submissions will be triple-blind reviewed by the Program Committee on the basis of technical quality, relevance to the scope of the workshop, originality, significance, and clarity. The following sections give further information for authors.

Triple-blind submission guidelines

Since 2011, ICDM has imposed a triple-blind submission and review policy for all submissions. Authors must therefore not include identifying information in the text of the paper, and bibliographic references must be written so as to preserve anonymity. Any papers available on the Web (including arXiv) no longer qualify for ICDM submission, as their author information is already public.

What is triple-blind reviewing?

The traditional blind paper submission hides the referee names from the authors, and the double-blind paper submission also hides the author names from the referees. Triple-blind reviewing further hides the referee names from the other referees during paper discussions before acceptance decisions are made. The names of authors and referees remain known only to the PC Co-chairs, and the author names are disclosed only after the ranking and acceptance of submissions are finalized. It is imperative that all authors of ICDM submissions conceal their identity and affiliation information in their paper submissions. It does not suffice to remove the author names and affiliations from the first page; identifying information must also be removed from the content of each paper submission.

How to prepare your submissions

The authors shall omit their names from the submission. For formatting templates with author and institution information, replace all of this information in the template with “Anonymous”.

In the submission, the authors should refer to their own prior work as they would the prior work of any other author, and include all relevant citations. This can be done either by referring to their prior work in the third person or by referencing papers generically. For example, if your name is Smith and you have worked on clustering, instead of saying “We extend our earlier work on distance-based clustering (Smith 2005),” you might say “We extend Smith’s (Smith 2005) earlier work on distance-based clustering.” The authors shall exclude citations to their own work that is not fundamental to understanding the paper, including prior versions (e.g., technical reports, unpublished internal documents) of the submitted paper. Hence, do not write “In our previous work [3]”, as it reveals that citation 3 was written by the current authors. The authors shall remove mention of funding sources, personal acknowledgments, and other such auxiliary information that could be related to their identities. These can be reinstated in the camera-ready copy once the paper is accepted for publication. Statements about well-known or unique systems that could identify an author shall be made as vague as possible with respect to the authors’ identities. The submitted files shall be named with care to ensure that author anonymity is not compromised by the file name. For example, do not name your submission “Smith.pdf”; instead, give it a name that is descriptive of the title of your paper, such as “ANewApproachtoClustering.pdf” (or a shorter version of the same).

Algorithms and resources used in a paper should be described as completely as possible to allow reproducibility. This includes experimental methodology, empirical evaluations, and results. Authors are strongly encouraged to make their code and data publicly available whenever possible. In addition, authors are strongly encouraged to also report, whenever possible, results for their methods on publicly available datasets.

All manuscripts are submitted as full papers and are reviewed based on their scientific merit. There is no separate abstract submission step. There are no separate industrial, application, short paper or poster tracks during submission.

Dates

All deadlines are at 11:59 PM US Pacific Daylight Time.

  • Paper submission: September 17, 2022
  • Paper notification: October 8, 2022
  • Camera-ready deadline and copyright forms: October 22, 2022
  • DMESS 2022 Workshop: November 28, 2022, 13:00–18:00 EST in the Key West B Room
  • ICDM 2022 Conference: November 28–December 1, 2022

Contact

Email: dmess2022 at climatemodeling dot org