Data Management Programming and Graphics with R

Vall d'Hebron

Course Information

Objectives


This course introduces the statistical language R as a powerful tool for the data management tasks that life scientists face in their daily work. Participants will learn how to read and manage data files, import them into R and process them for later exploration. Emphasis will be placed on reading and transforming data, creating basic and advanced graphics and accessing databases. The course includes an introduction to the use of R as a statistical tool, but it is not a course on statistical analysis with R.

Methodology

Learning R is not complicated, but it requires a lot of practice, so the course will be eminently practical. Each week different concepts will be introduced and resources will be provided to explore them in more depth, along with one or more cases for attendees to work on before the next session. The solutions proposed by the students will be discussed the following week. Students are especially encouraged to form groups in which they can work with their own data. Work will preferably take place in a computer room, but those who want to install R on their own computers will be helped to learn how to do it.

Course Contents

  • Session 1 Introduction to R, RStudio and the tidyverse
  • Session 2 Data visualization
  • Session 3 Data Wrangling: Tidying and cleaning
  • Session 4 Statistical Analysis and Modeling
  • Session 5 Introduction to programming and Workflows. Wrap-up.

Dates, Schedule and Location

The course will take place in the months of October and November 2019 in the Computer Room of the Teaching Pavilion of the Vall d'Hebron Campus.

The sessions will be from 8 to 11 in the morning the days


Course Materials

Session I: Introduction

  1. Presentation
  2. Introduction to R

Datasets

  1. Osteoporosis (csv)
  2. Diabetes (Excel)

Session II: Data visualization

  1. Data visualization with ggplot2
  2. Data visualization with ggplot2 - Rcode

Datasets

  1. EconomistData: Data from 'The Economist' (csv)
  2. Housing data by states (csv)
  3. CPI data (csv)

Session III: Data Management

  1. Data Management with tidyverse
  2. Data Management with tidyverse - Rcode

Datasets

  1. WormsData (csv)
  2. Departament (csv)
  3. ImmunologicalData (xls)

Session IV: Statistical Analysis and Modeling

  1. Statistics with R (pdf)
  2. Statistics with R (R code)

Datasets

  1. Diabetes_mod
  2. Osteoporosis

Session V: Introduction to Programming and Workflows

  1. Programming with R (pdf)
  2. Programming with R (R code)

Case Studies:

  1. A case study in phosphoproteomics
  2. Download multiple files from a web site


References and Resources

R and RStudio

Books

Tutorials and Workshops

Sites with more resources