Lecture on Biostatistics
Introduction to R Language
- R: A software environment for statistical computing and data analysis.
- Open-source and freely available.
- Command-line interface; graphical front ends are available through software such as RStudio.
- Produces publication-quality graphs with mathematical symbols.
- Interpreted language, developed at the University of Auckland.
- Comparable in capability to tools such as GNU Octave or MATLAB.
Getting Started with R
- Installation: Download from r-project.org, choosing the build for your OS (Windows, Linux/Unix, macOS).
- RStudio: Provides a GUI for R, making it easier for many users.
- Command window (console): Where commands are entered and run.
- Workspace: Stores the objects (variables, data) created during a session.
Basic Computations in R
- Scalars and arithmetic operations (e.g., a = 1, b = 2, a^b, etc.).
- Trigonometry: R's trigonometric functions take radians, so convert degrees first (e.g., sin(30) is the sine of 30 radians; use sin(30 * pi / 180) for 30 degrees).
- Logarithms: log10() for base-10 logs and log() for natural logs.
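The computations above can be sketched in a few lines of R. The values of a and b follow the lecture; the variable names deg and rad are introduced here only for illustration.

```r
# Scalars and arithmetic (the lecture writes a = 1; <- is the more common assignment form)
a <- 1
b <- 2
a + b        # addition
a^b          # exponentiation

# Trigonometric functions expect radians, so convert degrees first
deg <- 30
rad <- deg * pi / 180
sin(rad)     # sine of 30 degrees, approximately 0.5

# Logarithms
log10(1000)  # base-10 logarithm
log(exp(1))  # natural logarithm
```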
Creating and Manipulating Vectors
- Vector creation: Use the c() function (e.g., c(1, 2, 3)).
- Element-wise operations: Addition, multiplication, etc.
- Example: cc * bb performs element-wise multiplication.
- Scalar multiplication: 4 * cc multiplies all elements by 4.
- Repetition and sequences: rep() repeats elements and seq() creates sequences.
- Data input: Use scan() to enter data manually at the console.
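A minimal sketch of these vector operations follows. The names bb and cc come from the lecture; the values assigned to them are assumed for illustration.

```r
# Two example vectors (values are assumptions, not from the lecture)
bb <- c(1, 2, 3)
cc <- c(4, 5, 6)

cc + bb              # element-wise addition: 5 7 9
cc * bb              # element-wise multiplication: 4 10 18
4 * cc               # scalar multiplication: 16 20 24

rep(2, times = 3)    # repeat a value: 2 2 2
seq(1, 10, by = 2)   # sequence with a step: 1 3 5 7 9

# scan() reads values typed at the console until a blank line is entered.
# It is interactive, so it is left commented out here:
# x <- scan()
```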
Importing Data
- CSV files: Use read.csv() to import data from a CSV file into a data frame.
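As a rough illustration, the snippet below writes a small CSV to a temporary file and reads it back with read.csv(). The column names (sample, area) and the values are hypothetical; in practice you would pass the path of your own file.

```r
# Write a small, made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("sample,area", "A,12.5", "B,15.1", "C,9.8"), csv_path)

# read.csv() treats the first row as column names by default (header = TRUE)
cells <- read.csv(csv_path)

str(cells)    # inspect the structure of the resulting data frame
cells$area    # access one column as a vector
```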
Descriptive Statistics
- Functions: length(), min(), max(), mean(), median(), var(), sd().
- Frequency distribution: Use table() to get frequency counts of vector elements.
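The functions listed above can be tried on a small vector. The data here are made up for illustration. Note that var() and sd() compute the sample variance and standard deviation (denominator n - 1).

```r
# Made-up measurements for illustration
x <- c(2, 4, 4, 5, 7, 9, 4)

length(x)   # number of elements: 7
min(x)      # smallest value: 2
max(x)      # largest value: 9
mean(x)     # arithmetic mean: 5
median(x)   # middle value: 4
var(x)      # sample variance (n - 1 denominator)
sd(x)       # sample standard deviation, sqrt(var(x))

# Frequency count of each distinct value
table(x)
```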
Sorting and Other Operations
- Sorting: sort() sorts in ascending order by default; use decreasing = TRUE for descending order.
- Logical operations: Find the positions (e.g., with which()) or the elements that satisfy a condition.
- Quantiles: Use quantile() for percentiles.
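A short sketch of these operations, using a made-up vector. The threshold of 4 in the logical examples is arbitrary.

```r
# Made-up values for illustration
x <- c(7, 2, 9, 4, 5)

sort(x)                      # ascending: 2 4 5 7 9
sort(x, decreasing = TRUE)   # descending: 9 7 5 4 2

which(x > 4)                 # positions where the condition holds: 1 3 5
x[x > 4]                     # the elements themselves: 7 9 5

# 25th, 50th and 75th percentiles (R's default interpolation method)
quantile(x, probs = c(0.25, 0.5, 0.75))
```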
Data Visualization
- Bar plots: barplot() for displaying categorical data.
- Histograms: hist() for showing the distribution of data.
- Box plots: boxplot() for summarizing a distribution (median, quartiles, and outliers).
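The three plot types can be sketched as follows. The data are simulated or invented purely for illustration, and the titles and axis labels are placeholders.

```r
# Simulated numeric data and made-up categories (assumptions for illustration)
set.seed(1)
values <- rnorm(100, mean = 50, sd = 10)
groups <- c("A", "B", "A", "C", "B", "A")

# Bar plot of category counts
counts <- table(groups)
barplot(counts, main = "Counts per group", xlab = "Group", ylab = "Frequency")

# Histogram of the numeric values; hist() also returns the bin counts
h <- hist(values, main = "Distribution of values", xlab = "Value")

# Box plot showing median, quartiles and any outliers
boxplot(values, main = "Box plot of values", ylab = "Value")
```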
Example: Cell Shape and Migration
- Categorical data: Use table() to analyze distributions.
- Example: Correlating cell shape (circular/spindle) with migratory behavior.
- Visualization: Bar plots to display categorical distributions.
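One way to approach this example is a two-way table of shape against migratory behavior, followed by a grouped bar plot. The observations below are invented for illustration and are not data from the lecture; the variable names shape and migratory are likewise assumptions.

```r
# Hypothetical observations, one entry per cell (invented for illustration)
shape     <- c("circular", "spindle", "spindle", "circular",
               "spindle", "circular", "spindle", "spindle")
migratory <- c("no", "yes", "yes", "no", "yes", "yes", "no", "yes")

# Cross-tabulate shape against migratory behavior
tab <- table(shape, migratory)
tab

# Row proportions: fraction of migratory cells within each shape
prop.table(tab, margin = 1)

# Grouped bar plot: one group of bars per shape, one bar per migratory status
barplot(t(tab), beside = TRUE, legend.text = TRUE,
        main = "Cell shape and migration",
        xlab = "Cell shape", ylab = "Number of cells")
```

A formal test of association (for example a chi-squared or Fisher's exact test) would be a natural next step, but it goes beyond what this outline covers.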
Additional Plotting Features
- Customize plots with titles, labels, colors, and axis limits.
- Use the plot() function for trajectory and coordinate plotting.
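As a sketch of these customizations, the snippet below plots a made-up trajectory with a title, axis labels, a color, and explicit axis limits. The coordinates and label text are assumptions for illustration.

```r
# Hypothetical (x, y) positions of a cell over successive time points
x <- c(0, 1, 2, 4, 5, 7)
y <- c(0, 2, 1, 3, 5, 4)

# type = "o" draws both points and connecting lines
plot(x, y, type = "o", col = "blue",
     main = "Cell trajectory",
     xlab = "x position", ylab = "y position",
     xlim = c(0, 8), ylim = c(0, 6))
```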
Conclusion
- R: Powerful tool for statistical analysis and data visualization.
- Practice using R for statistical calculations and plotting.
Thank you for your attention. We will meet again in the next class.