Syllabus
Part 1: Course Information
Class Time: Monday, 2 to 5 PM
Location: ENR 323
Instructor:
Xiaomeng Jin
Department of Environmental Sciences
Office: ENR 230
Email: xiaomeng.jin@rutgers.edu
Office Hours: Friday, 1 – 2 PM
Part 2: Overview
This course introduces modern computing software, programming tools, and best practices for open-source research that is transparent, accessible, reproducible and inclusive. The course consists of three components:
(1) Introduction to programming in the open-source Python language and in-depth exploration of the numerical analysis and visualization packages that comprise the modern scientific Python ecosystem;
(2) Introduction to the concept of open science and best practices for conducting open-source research;
(3) Introduction to cloud and parallel computing for big data analysis.
The course is designed to be accessible to graduate students in atmospheric science, environmental sciences or other disciplines in earth sciences.
Student learning will be facilitated through a combination of lectures, in-class exercises, homework assignments and class projects.
Part 3: Course Structure
Format: The instructor will present new material in the first half of the lecture. The second half of the class will be flipped: students will work first in small groups and then individually on assignments.
Textbook: There is no required textbook. All materials will come from free online resources and the course website itself.
Computers: Students can either bring their own laptops or use the computers in ENR 323. Students will use Amarel, the university’s high-performance computing cluster, to work on their assignments and final project.
Part 4: Grading Policy
Weekly Assignments (70%)
• Total: 100 points
• All questions complete: 50 points
• All questions correct: 30 points
• Clean, elegant, efficient code: 0 to 10 points
• Clear comments and explanations: 0 to 10 points
• Late penalty: 20 points deducted per day (24 hours) late
• Lowest grade on an assignment will be dropped.
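As an illustration of how the rubric above could add up, here is a minimal Python sketch. It is not an official grade calculator. The function names are hypothetical, and it assumes that partial credit is possible within each category, that the late penalty applies per full day, that a score cannot fall below zero, and that all assignments are weighted equally. None of these details is stated in the policy itself.

```python
def assignment_score(complete, correct, code_quality, comments, days_late=0):
    """Score one assignment out of 100 using the rubric categories."""
    # Category caps from the rubric: 50 + 30 + 10 + 10 = 100.
    raw = (
        min(complete, 50)
        + min(correct, 30)
        + min(code_quality, 10)
        + min(comments, 10)
    )
    # Assumption: 20 points off per full day late, floored at zero.
    return max(0, raw - 20 * days_late)


def weekly_assignment_average(scores):
    """Average the assignment scores after dropping the single lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two assignments to drop the lowest")
    kept = sorted(scores)[1:]  # drop the lowest score
    return sum(kept) / len(kept)


# Made-up scores for four assignments, for illustration only.
scores = [
    assignment_score(50, 30, 10, 10),             # on time
    assignment_score(50, 24, 8, 9),               # on time
    assignment_score(50, 30, 9, 9, days_late=1),  # one day late
    assignment_score(40, 20, 5, 5, days_late=2),  # two days late
]
average = weekly_assignment_average(scores)

# Weekly assignments count for 70% of the course grade.
contribution = 0.70 * average
```

In this made-up case the two-day-late assignment is the lowest score, so it is the one dropped before averaging.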
Final Project (30%)
Part I: Individual Project (20%)
The goal of the final project is to assess your ability to combine and apply the skills you have learned in class in the context of a real-world research problem. Our class has mostly focused on tools for data analysis and visualization, so this must be the focus of your final project. Specifically, we seek to assess your ability to do the following tasks:
• Discover and download real datasets in standard formats (e.g., CSV, netCDF).
• Load the data into pandas or xarray, performing any necessary data cleanup (dealing with missing values, proper time encoding, etc.) along the way.
• Perform realistic scientific calculations involving, for example, grouping, aggregating, and applying mathematical formulas.
• Visualize your results in well-formatted plots.
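The load, clean, and calculate steps listed above might look roughly like the following pandas sketch. The dataset is a small made-up table embedded in the script so the example is self-contained. The column names, the -999 missing-value marker, and the Kelvin conversion are all assumptions chosen for illustration, not requirements of the project.

```python
import io

import pandas as pd

# Made-up station data standing in for a downloaded CSV file.
# Assumption: the value -999 marks a missing reading.
raw_csv = io.StringIO(
    "time,temperature_c\n"
    "2021-01-01,1.0\n"
    "2021-01-02,-999\n"
    "2021-01-03,3.0\n"
    "2021-02-01,4.0\n"
    "2021-02-02,6.0\n"
)

# Load: parse the time column as datetimes and treat -999 as missing.
df = pd.read_csv(raw_csv, parse_dates=["time"], na_values=["-999"])

# Clean and apply a formula: drop missing readings, then convert to Kelvin.
df = df.dropna(subset=["temperature_c"]).assign(
    temperature_k=lambda d: d["temperature_c"] + 273.15
)

# Group and aggregate: mean temperature for each calendar month.
monthly_mean = df.groupby(df["time"].dt.month)[
    ["temperature_c", "temperature_k"]
].mean()

print(monthly_mean)
```

With matplotlib installed, a call such as monthly_mean["temperature_c"].plot(kind="bar") would be one way to start on the visualization step. A real project would still need axis labels, units, and a title.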
Part II: Reproducing Another Student’s Project (10%)
The goal of the second part is to assess the reproducibility of each student’s project, and whether students can reproduce others’ work and collaborate with them on code development. Our class focuses on conducting open-source research that is transparent, accessible, reproducible and inclusive, so your final project should demonstrate your understanding of, and ability to perform, open-source research. We seek to assess your ability to:
• Clearly document your analysis to make it reproducible.
• Reproduce another student’s final project.
• Bonus points will be awarded to students who submit pull requests and issues for code development.