Data analysis – ideas and tools
As part of a FIRM-funded project “Mining And Modelling ; Animal Rotavirus Epidemiology” led by Dr. Helen O’Shea from CIT, we are running a one-day seminar in Cork with two goals: to introduce postgraduate students to some basic ideas in data analysis, and to show how to implement these ideas using a powerful open-source tool.
Date – Friday May 30th 2014
Time – 8:30 am for 9:00 am sharp
Place – CIT in Cork
At least a week before the seminar
- Download R from here and RStudio from here
- Install both on your laptop and check that both are working.
- Work through the whole of this file, preferably with a colleague. This will take about 2 hours to do. Please try to find answers to your questions as you go along, but feel free to keep more difficult questions for the seminar.
Identify a file of your own data that is of interest to you. Put it into one sheet of a spreadsheet (Excel or OpenOffice) and bring it along. During the day you will work on these data. Make sure of the following:
- All variables are proper numbers or plain text – no spaces, funny characters or punctuation please.
- You have a mixture of numbers and grouping variables
- Data with a time sequence is fine; data with dates and times is not
- All missing data items are represented by the two letters NA and nothing else – no dashes, no 999, 77, 88 or anything else.
(R can cope with all of these, but we will not have time to explain how in this session)
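For the curious, here is a minimal sketch of one way R can cope with other missing-value codes when reading a file. The data, the column names and the choice of "-" and 999 as missing codes are all made up for illustration; your own file will differ, and this is not needed for the seminar.

```r
# Hypothetical example data: "-" and 999 stand in for missing values
csv_text <- "group,weight
a,12.5
b,-
a,999
b,14.1"

# na.strings lists every code that should be read as a missing value (NA)
dat <- read.csv(text = csv_text,
                na.strings = c("NA", "-", "999"),
                stringsAsFactors = FALSE)

str(dat)               # weight is read as a number, group as text
colSums(is.na(dat))    # count of missing values in each column
```

Note that a code such as 999 is treated as missing in every column, so this approach only suits files where that value never occurs as a real measurement.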
On the day
|8:30 AM||Registration [PLEASE BE ON TIME]|
|9:00 AM||Getting data into R|
|10:00 AM||Describing data in R - summaries and simple tables|
|11:00 AM||Describing data in R - using with, ddply and friends|
|12:00 PM||Describing data in R - simple graphs in ggplot2|
|1:00 PM||Lunch break|
|2:00 PM||More interesting graphs in ggplot2|
|3:00 PM||Linear regression, and a taste of GLMs|