Complete Python Pandas Tutorial Notes

Introduction

Options for getting started with Pandas:
- Use Google Colab to edit and run code in your browser.
- Alternatively, set up locally with an editor such as Visual Studio Code, PyCharm, or JupyterLab.

Clone the repository containing the tutorial data files with:
```
git clone <repository_link>  
```

Create and activate a virtual environment:
```
python -m venv tutorial_env
source tutorial_env/bin/activate   # macOS/Linux; on Windows run tutorial_env\Scripts\activate
```

DataFrame: the main data structure in Pandas. It is a two-dimensional table with named columns and a row index, and each column can hold a different data type.

Example of creating a DataFrame:
```
import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
```

Methods and attributes for exploring a DataFrame:
- df.head() to view the first rows (five by default).
- df.tail() to view the last rows (five by default).
- df.columns to see the column names.
- df.index to see the index values.
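
The exploration calls listed above can be tried on a small table. This is a minimal sketch; the DataFrame and its values are made up for illustration.

```python
import pandas as pd

# Small illustrative DataFrame (values are made up for this sketch).
df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [10, 20, 30, 40, 50, 60]})

first_rows = df.head()            # first 5 rows by default
last_two = df.tail(2)             # pass n to change how many rows come back
column_names = list(df.columns)   # column labels as a plain list
index_values = list(df.index)     # row labels as a plain list

print(first_rows)
print(last_two)
print(column_names, index_values)
```

Note that head() and tail() are methods and need parentheses, while columns and index are attributes and do not.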

Load CSV files with:
```
df = pd.read_csv('path/to/file.csv')  
```
CSV is common, but binary columnar formats such as Parquet and Feather are typically faster to read, smaller on disk, and preserve column types.
Each format has its own reader function:
- CSV: pd.read_csv()
- Excel: pd.read_excel()
- Parquet: pd.read_parquet()
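
As a rough illustration of loading, the sketch below parses CSV text from an in-memory buffer so that it runs without any file on disk. The column names and values are hypothetical and are not taken from the tutorial data.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk.
csv_text = "name,year,medals\nAda,2016,2\nBo,2020,1\n"

# read_csv accepts a path or a file-like object, so StringIO stands in for a file.
df = pd.read_csv(io.StringIO(csv_text))
print(df.dtypes)   # column types are inferred while parsing

# Writing back out: index=False leaves the row index out of the output.
round_trip = df.to_csv(index=False)
print(round_trip)
```

pd.read_excel() and pd.read_parquet() are called the same way with a file path, but they rely on optional dependencies (for example openpyxl for .xlsx files, and pyarrow or fastparquet for Parquet).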

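View data using head(), tail(), and sample() (which returns a random selection of rows).
Access specific rows/columns with .loc[] and .iloc[]:
- .loc[] selects by row and column labels.
- .iloc[] selects by zero-based integer positions.
Use .at[] (label-based) and .iat[] (position-based) to access a single value efficiently.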
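
The difference between label-based and position-based access is easier to see when the index is not the default 0, 1, 2. The sketch below assumes a small DataFrame with made-up string labels.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6]},
    index=["x", "y", "z"],   # string labels make label vs. position visible
)

by_label = df.loc["y", "B"]       # row labeled "y", column "B"
by_position = df.iloc[1, 1]       # second row, second column
label_slice = df.loc["x":"y"]     # label slices include the end label
position_slice = df.iloc[0:2]     # position slices exclude the end, like Python lists
single_label = df.at["z", "A"]    # one scalar, by label
single_position = df.iat[2, 0]    # one scalar, by position

print(by_label, by_position, single_label, single_position)
```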

To modify a value in a DataFrame:
```
df.loc[row_index, 'column_name'] = new_value
```
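
A short sketch of assignment with .loc, using a made-up DataFrame. The same pattern extends from a single cell to every row that matches a condition.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

df.loc[0, "B"] = 40            # single cell: row label 0, column "B"
df.loc[df["A"] > 1, "B"] = 0   # every row where the condition holds

# Chained indexing such as df[df["A"] > 1]["B"] = 0 may assign to a temporary
# copy and leave df unchanged, so a single .loc call is the safer form.
print(df)
```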

Add or drop columns:
- Add: df['new_column'] = values
- Drop: df.drop('column_name', axis=1, inplace=True), or df = df.drop(columns='column_name') to get a new DataFrame rather than modifying df in place.
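
The following sketch adds two columns and then drops one of them. The column names total and source are hypothetical and chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

df["total"] = df["A"] + df["B"]   # new column computed from existing ones
df["source"] = "demo"             # a scalar is repeated for every row

# drop() returns a new DataFrame by default; df itself keeps all its columns.
trimmed = df.drop(columns=["source"])

print(trimmed)
```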

Filter rows based on conditions:
```
filtered_df = df[df['column_name'] > value]
```
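
Filtering works by building a boolean mask and indexing with it. The sketch below uses a made-up country/medals table (not the tutorial's dataset) and also shows how conditions might be combined.

```python
import pandas as pd

df = pd.DataFrame(
    {"country": ["US", "FR", "US", "JP"], "medals": [3, 1, 0, 2]}
)

mask = df["medals"] > 0   # boolean Series, one True/False per row
with_medals = df[mask]    # keeps only the rows where mask is True

# Combine conditions with & (and), | (or), ~ (not); wrap each one in parentheses.
us_with_medals = df[(df["country"] == "US") & (df["medals"] > 0)]

# isin() tests membership in a list of values.
selected = df[df["country"].isin(["FR", "JP"])]

print(with_medals)
print(us_with_medals)
print(selected)
```

The parentheses matter because & and | bind more tightly than comparison operators in Python.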

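Use groupby() to aggregate data: rows are split into groups by the values in a column, and an aggregation such as sum() is applied to each group.
```
grouped = df.groupby('column_name').sum()
```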
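
A small sketch of grouping, again on a made-up country/medals table. Selecting a column before aggregating restricts the calculation to that column, and agg() computes several statistics in one pass.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "country": ["US", "FR", "US", "FR", "JP"],
        "medals": [3, 1, 4, 2, 5],
    }
)

# One value per group; group keys become the index, sorted by default.
totals = df.groupby("country")["medals"].sum()

# Several aggregations at once produce one column per statistic.
summary = df.groupby("country")["medals"].agg(["sum", "mean", "count"])

print(totals)
print(summary)
```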

Explore the Olympic dataset further to practice with Pandas.
Check out tutorials on cleaning datasets and solving Pandas puzzles for additional practice.