Ninth Workshop on Data Mining in Earth System Science (DMESS 2019)

Co-conveners: Forrest M. Hoffman, Jitendra Kumar, Sarat Sreepathi, and Vamsi Sripathi

In conjunction with the IEEE International Conference on Data Mining (ICDM 2019)

Beijing, China — November 8, 2019

Spanning many orders of magnitude in time and space scales, Earth science data, from point measurements to process-based Earth system model output, are increasingly large and complex, and often represent very long time series, making these data difficult to analyze, visualize, interpret, and understand. An “explosion” of heterogeneous, multi-disciplinary data, including observations and models of interacting natural, engineered, and human systems, has rendered traditional means of integration and analysis ineffective, necessitating the application of new analytical methods and the development of highly scalable software tools for synthesis, assimilation, comparison, and visualization. For complex, nonlinear feedbacks among chaotic processes, new methods and approaches for data mining and computational statistics are required for classification and change detection, model evaluation and benchmarking, uncertainty quantification, and incorporation of constraints from physics, chemistry, and biology into analysis. This workshop explores various data mining approaches and algorithms for understanding nonlinear dynamics of weather and climate systems and their interactions with biogeochemical cycles, impacts of natural system responses and climate extremes on engineered systems and interdependent infrastructure networks, and mitigation and adaptation strategies for natural hazards and infrastructure and ecosystem resilience. Encouraged are original research papers describing applications of statistical and data mining methods that support analysis and discovery in climate predictability, attributions, weather extremes, water resources management, risk analysis and hazards assessment, ecosystem sustainability, infrastructure resilience, and geo-engineering.

Rigorous review papers that either have the potential to expose data mining researchers to commonly used data-driven methods in the Earth sciences or discuss the applicability and caveats of such methods from a machine learning or statistical perspective are also desired. Methods may include, but are not limited to, cluster analysis, empirical orthogonal functions (EOFs), extreme value and rare events analysis, genetic algorithms, neural networks and deep learning methods, physics-constrained data analytics, automated data assimilation, and other machine learning techniques. Novel approaches that bring new ideas from nonlinear dynamics and information theory, network science and graphical methods, and the state of the art in computational statistics and econometrics into data mining and machine learning are particularly encouraged.

| Time  | Session Chair   | Paper ID | Title | Speaker | Authors |
|-------|-----------------|----------|-------|---------|---------|
| 8:00  | Jitendra Kumar  |          | Invited Keynote Presentation | Qinghua Guo, Institute of Botany, Chinese Academy of Sciences | Qinghua Guo |
| 8:25  |                 | S24202   | Climate Change Perception in Scientific and Public Sphere | Erick Stattner | Didier Henry, Nathan Jadoul, Reynald Eugénie, and Erick Stattner |
| 8:50  |                 | S24201   | Climate Data Analytics Applied to Sugar Cane Crop in the French West Indies | Erick Stattner | Erick Stattner and Nathan Jadoul |
| 9:15  |                 | S07201   | An efficient Bayesian method for advancing the application of deep learning in Earth science | Dan Lu | Dan Lu, Siyan Liu, and Daniel Ricciuto |
| 9:40  |                 | S07202   | Learning-based inversion-free model–data integration to advance ecosystem model prediction | Dan Lu | Dan Lu and Daniel Ricciuto |
| 10:05 |                 |          | Coffee Break | | |
| 10:25 | Martine Collard | S07205   | Evaluating carbon extremes in a coupled climate–carbon cycle simulation | Min Xu | Min Xu, Salil Mahajan, Forrest M. Hoffman, and Xiaoying Shi |
| 10:50 |                 | S07204   | Deep Transfer Learning With Field-Based Measurements for Large Area Classification | Jitendra Kumar | Zachary L. Langford, Jitendra Kumar, and Forrest M. Hoffman |
| 11:15 |                 | S07203   | Convolutional Neural Networks for Hydrometeor Classification using Dual Polarization Doppler Radars | Jitendra Kumar | Yuping Lu and Jitendra Kumar |
| 11:40 |                 |          | Open Discussion | Workshop Participants | |
| 12:00 |                 |          | Adjourn Workshop | | |
| 12:00 |                 |          | Lunch | | |


Workshop Co-conveners

  • Forrest M. Hoffman is a Senior Computational Climate Scientist at Oak Ridge National Laboratory (ORNL). As a resident researcher in ORNL’s Climate Change Science Institute (CCSI) and a member of ORNL’s Computational Sciences & Engineering Division (CSED), Forrest develops and applies Earth system models (ESMs) to investigate the global carbon cycle and feedbacks between biogeochemical cycles and the climate system. He applies data mining methods using high performance computing to problems in landscape ecology, remote sensing, and large-scale climate data analytics. He founded the workshop series on Data Mining in Earth System Science (DMESS) in 2009 and has served as lead convener for all prior workshops. Forrest is also a Joint Faculty Professor in the University of Tennessee’s Department of Civil & Environmental Engineering in nearby Knoxville, Tennessee.

  • Jitendra Kumar is a computational hydrologist at Oak Ridge National Laboratory and a Joint Assistant Professor at the University of Tennessee, Knoxville. He conducts research at the intersection of high performance computing, environmental and Earth sciences, and systems analysis and data mining. His research entails data mining, large-scale global optimization, computational hydrology and hydrogeology, landscape ecology, remote sensing, and development of parallel algorithms for large-scale supercomputers.

  • Sarat Sreepathi is a Computer Scientist in the Future Technologies Group at Oak Ridge National Laboratory. He is working on development of the E3SM-Multiscale Modeling Framework (MMF) as part of the Exascale Computing Project (ECP). Additionally, he contributes to E3SM as a member of the core performance group. His research interests include High Performance Computing, Performance Analytics, Exascale Co-design, Optimization Algorithms, Computational Intelligence, Parallel I/O, Performance Analysis and Optimization.

  • Vamsi Sripathi is a Senior Software Engineer at Intel. He has a Master’s degree in Computer Science from North Carolina State University. During his tenure at Intel, he has worked on the performance optimization of the Basic Linear Algebra Subprograms (BLAS) in the Intel Math Kernel Library (MKL), spanning multiple generations of Intel Xeon and Intel Xeon Phi architectures. Recently, he has been working on the optimization of deep learning algorithms and has contributed to the Caffe and TensorFlow frameworks.

Program Committee

  • Bjørn-Gustaf J. Brooks (Eastern Forest Environmental Threat Assessment Center, USDA Forest Service, Asheville, North Carolina, USA)
  • Udit Bhatia (IIT Gandhinagar, Gujarat, India)
  • Nathan Collier (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Auroop R. Ganguly (Department of Civil and Environmental Engineering, Northeastern University, Boston, Massachusetts, USA)
  • Sangram Ganguly (Bay Area Environmental Research Institute and NASA Ames Research Center, California, USA)
  • William W. Hargrove (Eastern Forest Environmental Threat Assessment Center, USDA Forest Service, Asheville, North Carolina, USA)
  • Forrest M. Hoffman (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Jian Huang (Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, USA)
  • Soo Kyung Kim (Lawrence Livermore National Laboratory, Livermore, California, USA)
  • Jitendra Kumar (Terrestrial Systems Modeling Group, Environmental Sciences Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Vipin Kumar (Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA)
  • Zachary L. Langford (Boeing Research & Technology, Huntsville, Alabama, USA)
  • Miguel D. Mahecha (Department of Biogeochemical Integration, Max Planck Institute for Biogeochemistry, Jena, Germany)
  • Richard T. Mills (Laboratory for Advanced Numerical Simulations, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois, USA)
  • Steven P. Norman (Eastern Forest Environmental Threat Assessment Center, USDA Forest Service, Asheville, North Carolina, USA)
  • Sarat Sreepathi (Computer Science & Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
  • Vamsi Sripathi (Intel Corporation, Hillsboro, Oregon, USA)
  • Karsten Steinhaeuser (Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA)
  • TJ Vandal (NASA Ames Research Center, Moffett Field, California, USA)
  • Min Xu (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)

Authors are invited to submit manuscripts of 8 pages (10 pages maximum for an additional fee) reporting unpublished, mature, and original research and recent developments or theoretical considerations in applications of data mining to Earth sciences by August 7, 2019. Paper submissions should be in the IEEE 2-column format, including the bibliography and any appendices.

Accepted papers will be published in the conference proceedings by the IEEE Computer Society Press. Submission implies the willingness of at least one of the authors to register and present the paper.

Manuscripts must be submitted electronically through the online submission system.

All submissions will be triple-blind reviewed by the Program Committee on the basis of technical quality, relevance to the scope of the workshop, originality, significance, and clarity. The following sections give further information for authors.

Triple blind submission guidelines

Since 2011, ICDM has imposed a triple blind submission and review policy for all submissions. Authors must therefore not use identifying information in the text of the paper, and bibliographies must be referenced in a way that preserves anonymity. Any papers available on the Web (including arXiv) no longer qualify for ICDM submissions, as their author information is already public.

What is triple blind reviewing?

The traditional blind paper submission hides the referee names from the authors, and the double-blind paper submission also hides the author names from the referees. Triple-blind reviewing further hides the referee names from other referees during paper discussions before acceptance decisions are made. The names of authors and referees remain known only to the PC Co-chairs, and the author names are disclosed only after the ranking and acceptance of submissions are finalized. It is imperative that all authors of ICDM submissions conceal their identity and affiliation information in their paper submissions. It does not suffice to remove the author names and affiliations from the first page; identifying information must also be removed from the content of each paper submission.

How to prepare your submissions

The authors shall omit their names from the submission. For formatting templates with author and institution information, simply replace all of this information in the template with “Anonymous”.

  • In the submission, the authors should refer to their own prior work as they would the prior work of any other author, and include all relevant citations. This can be done either by referring to their prior work in the third person or by referencing papers generically. For example, if your name is Smith and you have worked on clustering, instead of saying “We extend our earlier work on distance-based clustering (Smith 2005),” you might say “We extend Smith’s (Smith 2005) earlier work on distance-based clustering.”
  • The authors shall exclude citations to their own work that is not fundamental to understanding the paper, including prior versions (e.g., technical reports, unpublished internal documents) of the submitted paper. Hence, do not write “In our previous work [3]”, as it reveals that citation 3 is written by the current authors.
  • The authors shall remove mention of funding sources, personal acknowledgments, and other such auxiliary information that could be related to their identities. These can be reinstated in the camera-ready copy once the paper is accepted for publication.
  • The authors shall keep statements about well-known or unique systems that could identify an author as vague as possible with respect to the authors’ identities.
  • The submitted files shall be named with care to ensure that author anonymity is not compromised by the file name. For example, do not name your submission “Smith.pdf”; instead, give it a name that is descriptive of the title of your paper, such as “ANewApproachtoClustering.pdf” (or a shorter version of the same).


All deadlines are at 11:59 PM Pacific Daylight Time.

  • Paper submission: August 17, 2019
  • Paper notification: September 4, 2019
  • Camera-ready deadline and copyright forms: September 8, 2019
  • DMESS 2019 Workshop: November 8, 2019
  • ICDM 2019 Conference: November 8–11, 2019

Email: dmess2019 at climatemodeling dot org

DMESS 2019 is held in collaboration with the Data journal.