Analysis of Epidemiological Data Using
R and Epicalc
Author: Virasakdi Chongsuvivatwong
cvirasak@medicine.psu.ac.th
Editor: Edward McNeil
edward.m@psu.ac.th
Epidemiology Unit
Prince of Songkla University
THAILAND
Preface
Data analysis is very important in epidemiological research. The capacity of
computing facilities has been steadily increasing, moving state-of-the-art
epidemiological studies in the same direction as computer advancement.
Currently, there are many commercial statistical software packages widely used by
epidemiologists around the world. For developed countries, the cost of software is
not a major problem. For developing countries, however, the real cost is often too
high. Several researchers in developing countries thus eventually rely on a pirated
copy of the software.
Freely available software packages are limited in number and readiness of use.
EpiInfo, for example, is free and useful for data entry and simple data analysis.
Advanced data analysts, however, find it too limited in many respects. For example, it
is not suitable for data manipulation for longitudinal studies. Its regression analysis
facilities cannot cope with repeated measures and multi-level modelling. The
graphing facilities are also limited.
A relatively new and freely available software package called R is promising.
Supported by leading statistical experts worldwide, it has almost everything that an
epidemiological data analyst needs. However, it is difficult to learn and to use
compared with similar statistical packages for epidemiological data analysis such as
Stata. The purpose of this book is therefore to bridge this gap by making R easy to
learn for researchers from developing countries and also to promote its use.
My experience in epidemiological studies spans more than twenty years, with a special
fondness for teaching data analysis. Inspired by the spirit of the open-source
software philosophy, I have put tremendous effort into exploring the potential and
use of R. For four years, I have be