Overview of Python String Operations

Sep 24, 2024

Lecture Notes: Python Data Type - Strings

Introduction to Python Strings

  • A string (type str) is one of Python's built-in data types.
  • It is a sequence of characters enclosed in single or double quotes.
  • Strings are used in many fields, e.g., in biology to store DNA and protein sequences.

Creating Strings

  • Assign strings to variables using single or double quotes.
  • Example:
    my_string = "bioinformatics"
    

Printing Strings

  • Use the print() function to display strings.
  • print() accepts either a string literal or a variable that holds a string.

String Case Conversion

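  • Convert a string to lowercase or uppercase using:
    • string.lower() for lowercase.
    • string.upper() for uppercase.
  • Both methods return a new string; the original is unchanged because strings are immutable.
  • Example:
    my_string.lower()   # 'bioinformatics'
    my_string.upper()   # 'BIOINFORMATICS'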
    

Capitalizing Strings

  • Use string.capitalize() to uppercase the first character and lowercase the rest.
  • Example:
    my_string.capitalize()   # 'Bioinformatics'
    

Finding Length of Strings

  • Use the len() function to find the number of characters in a string.
  • Spaces and punctuation count as characters.
  • Example:
    len(my_string)   # 14 for "bioinformatics"
    

Checking Character Presence

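  • Check whether a character or substring is present with the in operator, which returns True or False.
  • Use string.find() to get the index of the first occurrence of a substring; it returns -1 if the substring is not found.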
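
A minimal sketch of both checks. The variable names are made up for illustration; the sample word is the one used earlier in these notes.

```python
sequence = "bioinformatics"

# The in operator returns a boolean.
has_info = "info" in sequence      # True
has_xyz = "xyz" in sequence        # False

# find() returns the index of the first match, or -1 if there is none.
first_o = sequence.find("o")       # 2
missing = sequence.find("xyz")     # -1

print(has_info, has_xyz, first_o, missing)
```

Because find() signals "not found" with -1 rather than an error, compare its result against -1 before using it as an index.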

Counting Characters or Substrings

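  • Use string.count() to count non-overlapping occurrences of a character or substring.
  • The match is case-sensitive.
  • Example:
    my_string.count("o")   # 2 for "bioinformatics"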
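
A short sketch of how count() behaves with case and with repeated substrings. The string "AAAA" is a made-up example.

```python
my_string = "bioinformatics"

lower_i = my_string.count("i")   # 3
upper_i = my_string.count("I")   # 0, because matching is case-sensitive

# Occurrences are counted without overlap:
# "AAAA" contains "AA" at positions 0 and 2, so the result is 2, not 3.
pairs = "AAAA".count("AA")

print(lower_i, upper_i, pairs)
```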
    

Replacing Characters or Substrings

  • Use string.replace(old, new) to replace every occurrence of old with new.
  • The result is a new string; the original is not modified.
  • Example:
    my_string.replace("old", "new")
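
As an illustration, replace() can turn a DNA string into its RNA equivalent by swapping T for U. The sequence below is made up, and the optional third argument (a maximum number of replacements) is shown only to sketch how it limits the substitution.

```python
dna = "ATGCTT"  # hypothetical example sequence

# Replace every T with U; dna itself is left unchanged.
rna = dna.replace("T", "U")          # 'AUGCUU'

# The optional third argument caps the number of replacements.
first_only = dna.replace("T", "U", 1)  # 'AUGCTT'

print(dna, rna, first_only)
```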
    

Combining Strings

  • Concatenate strings with the + operator, or build them with string formatting.
  • Example:
    full_name = first_name + " " + last_name
    

String Formatting

  • Use the %s placeholder with the % operator, or {} placeholders with str.format().
  • Example:
    "Hello, %s!" % name
    "Hello, {}!".format(name)
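
A small sketch showing that both styles produce the same text, and how each can round a number. The values of name and gc_percent are made up for illustration.

```python
name = "Ada"            # hypothetical value
gc_percent = 41.66667   # hypothetical value

greeting_percent = "Hello, %s!" % name
greeting_format = "Hello, {}!".format(name)

# Round to one decimal place. In the % style, a literal percent sign is written %%.
report_percent = "GC content: %.1f%%" % gc_percent
report_format = "GC content: {:.1f}%".format(gc_percent)

print(greeting_percent)
print(report_format)
```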
    

Indexing and Slicing

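  • Access individual characters with an index: string[i].
  • Python indexing starts at 0.
  • Use slicing [start:end:step] to extract a substring; start is included, end is excluded, and step defaults to 1.
  • Negative indexes count from the end, so string[-1] is the last character.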
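
The rules above can be sketched on the sample word used throughout these notes. The variable names are illustrative only.

```python
my_string = "bioinformatics"

first = my_string[0]           # 'b'  (indexing starts at 0)
last = my_string[-1]           # 's'  (negative index counts from the end)

prefix = my_string[0:3]        # 'bio'   (index 3 is excluded)
middle = my_string[3:7]        # 'info'
every_other = my_string[::2]   # 'bonomtc' (step of 2)
backwards = my_string[::-1]    # 'scitamrofnioib' (step of -1 reverses)

print(first, last, prefix, middle, every_other, backwards)
```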

Practical Applications in Bioinformatics

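  • DNA sequences can be represented as strings of the letters A, C, G, and T.
  • GC content, the percentage of bases that are G or C, can be calculated with count() and len().
  • Example for GC content (dna is a non-empty, uppercase DNA string):
    gc = dna.count('G') + dna.count('C')   # number of G and C bases
    gc_percent = (gc / len(dna)) * 100     # share of the whole sequence, in percent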
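
One possible way to wrap the calculation in a reusable function. The function name gc_content, the handling of lowercase input, and the choice to return 0.0 for an empty string are assumptions made for this sketch, not part of the lecture.

```python
def gc_content(dna):
    """Return the GC percentage of a DNA string (0.0 for an empty string)."""
    if not dna:
        return 0.0  # assumption: avoid dividing by zero on empty input
    seq = dna.upper()  # assumption: treat lowercase bases like uppercase ones
    gc = seq.count("G") + seq.count("C")
    return gc / len(seq) * 100


example = "ATGCGCta"  # hypothetical sequence with mixed case
result = gc_content(example)
print(result)  # 50.0
```

Converting to uppercase first matters because count() is case-sensitive: without it, lowercase g and c would be missed.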
    

Exercises

  • Example problems related to string manipulations.
  • Finding nucleotide positions, counting occurrences, etc.
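
As a starting point for such exercises, here is a sketch that finds every position of a nucleotide by calling find() repeatedly. The helper name find_positions and the sample sequence are hypothetical.

```python
def find_positions(sequence, base):
    """Return a list of all 0-based indexes where base occurs in sequence."""
    positions = []
    start = sequence.find(base)
    while start != -1:
        positions.append(start)
        # Continue searching just after the previous match.
        start = sequence.find(base, start + 1)
    return positions


dna = "ATGCGATAG"  # hypothetical sequence
g_positions = find_positions(dna, "G")   # [2, 4, 8]
a_count = dna.count("A")                 # 3

print(g_positions, a_count)
```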