L9

Sep 19, 2024

Lecture on Biostatistics

Introduction to R Language

  • R: A software environment for statistical computing and data analysis.
    • Open-source and freely available.
    • Command-line interface; graphical front ends are available through software such as RStudio.
    • Produces publication-quality graphs with mathematical symbols.
    • Interpreted language, developed at the University of Auckland.
    • Comparable in capability to tools like GNU Octave or MATLAB.

Getting Started with R

  • Installation: Download from r-project.org; choose the build for your OS (Windows, Unix, macOS).
  • RStudio: Provides a graphical interface for R, making it easier for many users.
    • Command window: Enter commands.
    • Workspace: Stores generated data.

Basic Computations in R

  • Scalars and arithmetic operations (e.g., a = 1, b = 2, a^b).
  • Trigonometry: sin(), cos(), and tan() expect angles in radians, so convert degrees first (e.g., sin(30 * pi / 180) rather than sin(30)).
  • Logarithms: log10() for base-10 logs and log() for natural logs.
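The computations above can be sketched as follows. The values of a and b come from the notes; the other numbers (the 30-degree angle, the arguments to the log functions) are chosen only for illustration.

```r
# Scalars and arithmetic
a <- 1                  # `a = 1` also works at the top level; `<-` is the usual R style
b <- 2
a + b                   # 3
a^b                     # 1

# Trigonometric functions take radians, so convert degrees first
deg <- 30
rad <- deg * pi / 180
sin_30 <- sin(rad)      # approximately 0.5
sin(30)                 # about -0.988, because 30 is read as radians

# Logarithms
log_base10 <- log10(1000)   # 3
log_natural <- log(exp(2))  # 2
```

Forgetting the degree-to-radian conversion does not raise an error; it silently returns the sine of 30 radians, which is why the conversion is worth doing explicitly.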

Creating and Manipulating Vectors

  • Vector creation: Use c() function (e.g., c(1, 2, 3)).
  • Element-wise operations: Addition, multiplication, etc.
    • Example: cc * bb performs element-wise multiplication.
  • Scalar multiplication: 4 * cc multiplies all elements by 4.
  • Repetition and sequences: rep() repeats elements; seq() creates regular sequences.
  • Data input: Use scan() to enter values manually at the console.
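A minimal sketch of the vector operations listed above. The names bb and cc follow the example in the notes, but their contents are assumed here; scan() is shown with its text argument so the snippet runs without waiting for keyboard input.

```r
# Vector creation with c(); the values are illustrative
bb <- c(1, 2, 3)
cc <- c(4, 5, 6)

# Element-wise operations
vec_sum  <- cc + bb     # 5 7 9
vec_prod <- cc * bb     # 4 10 18

# Scalar multiplication applies to every element
scaled <- 4 * cc        # 16 20 24

# Repetition and sequences
reps <- rep(c(0, 1), times = 3)          # 0 1 0 1 0 1
steps <- seq(from = 0, to = 1, by = 0.25) # 0.00 0.25 0.50 0.75 1.00

# Manual data entry: in an interactive session, `x <- scan()` reads values
# typed at the console until an empty line. Here the input is supplied as text.
typed <- scan(text = "3.1 4.7 5.2", quiet = TRUE)
```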

Importing Data

  • CSV files: Use read.csv() to import data from CSV.
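As a small illustration of read.csv(), the sketch below first writes a tiny CSV to a temporary file so that it is self-contained. The file contents and the column names (sample, weight_g) are hypothetical; with real data you would pass the path to your own file instead.

```r
# Create a small example CSV in a temporary location (hypothetical data)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("sample,weight_g",
             "A,12.5",
             "B,14.1",
             "C,13.3"),
           csv_path)

# Import the file; the first line is treated as the header by default
dat <- read.csv(csv_path)

str(dat)                          # structure: 3 rows, 2 columns
mean_weight <- mean(dat$weight_g) # columns are accessed with `$`
```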

Descriptive Statistics

  • Functions: length(), min(), max(), mean(), median(), var(), sd().
  • Frequency distribution: table() to get frequency counts of vector elements.
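The functions above can be tried on a small vector; the data here are made up for illustration. Note that var() and sd() in R use the sample (n - 1) denominator.

```r
# Illustrative data
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

n      <- length(x)   # 8
x_min  <- min(x)      # 2
x_max  <- max(x)      # 9
x_mean <- mean(x)     # 5
x_med  <- median(x)   # 4.5
x_var  <- var(x)      # sample variance: 32 / 7
x_sd   <- sd(x)       # square root of the sample variance

# Frequency counts of each distinct value
freq <- table(x)
freq
```

Here table(x) reports that the value 4 occurs three times and 5 occurs twice, with the remaining values occurring once each.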

Sorting and Other Operations

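  • Sorting: sort() returns values in ascending order by default; set decreasing = TRUE for descending order.
  • Logical operations: Use a condition (e.g., x > 4) with which() to find positions, or as an index to extract the matching elements.
  • Quantiles: Use quantile() with the probs argument for percentiles.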
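A short sketch of these operations on an illustrative vector (the values are assumed, not from the lecture). The quantiles shown rely on R's default quantile method.

```r
# Illustrative data
x <- c(7, 2, 9, 4, 5)

# Sorting
asc  <- sort(x)                     # 2 4 5 7 9
desc <- sort(x, decreasing = TRUE)  # 9 7 5 4 2

# Logical operations: positions and elements that satisfy a condition
pos  <- which(x > 4)                # 1 3 5
vals <- x[x > 4]                    # 7 9 5

# Quartiles (25th, 50th, and 75th percentiles)
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
q
```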

Data Visualization

  • Bar plots: barplot() for displaying categorical data.
  • Histograms: hist() for showing distribution of data.
  • Box plots: boxplot() for summarizing the median, quartiles, and outliers of data.
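The three plot types can be sketched as below. All data are hypothetical (group names, counts, and simulated measurements). The pdf(NULL) guard is only there so the snippet also runs as a script without writing a plot file; in RStudio the plots simply appear in the plot pane.

```r
# Outside an interactive session, draw on a null device so no file is written
if (!interactive()) pdf(NULL)

# Bar plot: hypothetical number of cells per group
counts <- c(control = 12, drugA = 18, drugB = 7)
bar_mids <- barplot(counts, main = "Cells per group", ylab = "Count")

# Histogram: 100 simulated measurements
set.seed(1)
x <- rnorm(100, mean = 10, sd = 2)
h <- hist(x, main = "Distribution of measurements", xlab = "Value")

# Box plot of the same measurements
bp <- boxplot(x, main = "Box plot of measurements", ylab = "Value")

if (!interactive()) invisible(dev.off())
```

Each plotting function also returns an object (bar midpoints, bin counts, box statistics) that can be inspected if the numbers behind the figure are needed.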

Example: Cell Shape and Migration

  • Categorical data: Use table() to analyze distributions.
    • Example: Cross-tabulating cell shape (circular/spindle) against migratory behavior.
  • Visualization: Bar plots to display categorical distributions.
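A minimal sketch of this kind of analysis, using invented observations for eight cells. The variable names (shape, migratory) and the values are assumptions for illustration, not data from the lecture.

```r
# Hypothetical observations: one entry per cell
shape     <- c("circular", "spindle", "spindle", "circular",
               "spindle", "circular", "spindle", "spindle")
migratory <- c("no", "yes", "yes", "no",
               "yes", "yes", "no", "yes")

# Two-way frequency table: rows are shapes, columns are migratory status
tab <- table(shape, migratory)
tab

# Proportion of migratory and non-migratory cells within each shape
row_props <- prop.table(tab, margin = 1)

# Grouped bar plot of the counts (null device when run as a script)
if (!interactive()) pdf(NULL)
barplot(tab, beside = TRUE, legend.text = TRUE,
        xlab = "Migratory", ylab = "Number of cells")
if (!interactive()) invisible(dev.off())
```

In this made-up sample, four of the five spindle-shaped cells are migratory, compared with one of the three circular cells; the table and bar plot make that contrast visible at a glance.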

Additional Plotting Features

  • Customize plots with titles, labels, colors, and axis limits.
  • Use plot() function for trajectory and coordinate plotting.
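As an illustration of plot() with common customizations, the sketch below draws a hypothetical trajectory from x and y coordinates. The coordinates, labels, and units are assumed for the example.

```r
# Hypothetical trajectory: positions of one cell at successive time points
x <- c(0, 1, 2, 3, 4)
y <- c(0, 0.8, 1.5, 1.9, 3.2)

if (!interactive()) pdf(NULL)   # null device when run as a script

plot(x, y,
     type = "b",                # points joined by lines
     main = "Cell trajectory",  # title
     xlab = "x position (um)",  # axis labels
     ylab = "y position (um)",
     col  = "blue",             # color
     xlim = c(0, 5),            # axis limits
     ylim = c(0, 4))

if (!interactive()) invisible(dev.off())
```

The same main, xlab, ylab, col, xlim, and ylim arguments are accepted by barplot(), hist(), and boxplot() as well.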

Conclusion

  • R: Powerful tool for statistical analysis and data visualization.
  • Encouragement to practice using R for statistical calculations and plotting.

Thank you for your attention. We will meet again in the next class.