Strings


Learning Objectives

  • Manipulate strings using indexing, slicing, and formatting


A string is a data type that represents a sequence of characters. A string is enclosed in either single or double quotes. Either style works, but the opening and closing quotes must match, because mismatched quotes cause a syntax error.

A string can have any length, from zero characters (an empty string) to a very long sequence of characters.

Concatenation, performed with the plus sign (+), builds longer strings from shorter ones. A less common operation is multiplying a string by an integer with the asterisk (*), which repeats the content of the string that many times.
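As a minimal sketch of both operations, along with an empty string (the variable names here are illustrative and not part of the original examples):

```python
# Concatenation with + builds a longer string from shorter ones
first_word = "Pine"
second_word = "apple"
combined = first_word + second_word
print(combined)       # Pineapple

# Multiplying a string by an integer repeats its content
repeated = "ab" * 3
print(repeated)       # ababab

# An empty string has zero length
empty = ""
print(len(empty))     # 0
```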

fruit = "Pineapple"

print("The word 'Pineapple' is",len(fruit),"characters long.")
The word 'Pineapple' is 9 characters long.

The built-in len() function returns the number of items in a sequence. For a string, that is the number of characters, which in the case of 'Pineapple' is 9.

String Indexing and Slicing

String indexing enables accessing individual characters in a string through the use of square brackets and the location or index of the character to be accessed. Python starts indexing at 0, so to access the first character in a string, one would use the index [0].

print("The first character in",fruit,"is",fruit[0])
The first character in Pineapple is P

Attempting to access an index that is equal to or greater than the string's length raises an IndexError, because no character exists at that position.

# Running this code raises 'IndexError'
# print(fruit[9])


# The word 'Pineapple' has 9 characters, so its index values range from 0 to 8.
# Therefore, using 9 as an index to access a character in it produces an error.
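
To observe the error without stopping the program, the access can be wrapped in try/except. This is only a sketch: the try/except statement and the variable `raised` are additions for illustration and are not covered in this section.

```python
fruit = "Pineapple"
raised = False

# Valid indexes run from 0 to len(fruit) - 1, so len(fruit) itself is out of range
try:
    print(fruit[len(fruit)])
except IndexError as error:
    raised = True
    print("IndexError:", error)
```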

Negative index values count from the end of the string towards the start, with [-1] accessing the last character and [-2] accessing the second-to-last character.

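Additionally, a slice or substring of a string, which comprises multiple characters, can be accessed. To do so, a range is created using a colon as a separator between the start and end of the range, such as [2:5]. The start index is included but the end index is not, so [2:5] covers the characters at indexes 2, 3, and 4.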

print("The last character for",fruit,"is",fruit[-1])

print("The substring from index 4 to index 6 for",fruit,"gives:",fruit[4:7])
The last character for Pineapple is e
The substring from index 4 to index 6 for Pineapple gives: app

A range can also be defined by specifying only one of the two indices. A missing start index defaults to 0, and a missing end index defaults to the length of the string.

print("The substring of first four characters in",fruit,"is",fruit[:4])

print("The substring of last five characters in",fruit,"is",fruit[4:])
The substring of first four characters in Pineapple is Pine
The substring of last five characters in Pineapple is apple

Strings are immutable, which means they cannot be changed. Modifying a single character within a string is not possible, so any correction requires creating a new string that fixes the mistake. Alternatively, one can reassign the variable that holds the string to a new value with the mistake fixed.

greeting = "Hellop, world!"

print("Old greeting:",greeting)

# Keep everything before and after the stray 'p' at index 5
new_greeting = greeting[:5] + greeting[6:]

print("New greeting:",new_greeting)
Old greeting: Hellop, world!
New greeting: Hello, world!
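
The immutability itself can be observed directly. The following is a minimal sketch, assuming try/except is used only to keep the example running (it is not part of this section); `modified_in_place` is an illustrative name.

```python
greeting = "Hellop, world!"

# Item assignment is not supported on strings, so this raises TypeError
try:
    greeting[5] = ""
    modified_in_place = True
except TypeError:
    modified_in_place = False

print("Modified in place?", modified_in_place)   # Modified in place? False

# Reassigning the same variable to a newly built string works
greeting = greeting[:5] + greeting[6:]
print(greeting)                                  # Hello, world!
```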

To locate the index of a specific character or substring within a string, we can use the string method called index(). This method returns the index of the first occurrence of the character or substring. If there are multiple matches, only the index of the first one is returned. If the substring is not found in the string, a ValueError is raised.

To avoid a ValueError, the in keyword can be used to first check if the substring exists in the string. It acts as a membership operator, typically inside a condition, and returns a boolean value of True if the substring exists in the string and False otherwise.

new_string = "An apple a day keeps the doctor away."

if 'apple' in new_string:
    print("'apple' is located at index",new_string.index('apple'))
'apple' is located at index 3
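
The two remaining behaviours of index(), first-match-only and the ValueError, can be sketched as follows. The try/except statement and the names `first_a` and `position` are illustrative assumptions rather than part of the original example.

```python
new_string = "An apple a day keeps the doctor away."

# 'a' occurs several times; index() reports only the first match.
# The search is case-sensitive, so the capital 'A' at index 0 is skipped.
first_a = new_string.index('a')
print(first_a)        # 3

# Without an `in` check, a missing substring raises ValueError
try:
    position = new_string.index('banana')
except ValueError:
    position = None
print(position)       # None
```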

String Methods

  • The lower() and upper() string methods can be used to convert a string to all lowercase or all uppercase characters, respectively. These methods are called using dot notation on a string and can be useful when checking user input.

  • The strip() method can be used to remove any whitespace characters, such as spaces, tabs (\t), and newline characters (\n), from the beginning and end of a string.

  • The count() method can be used to count the number of times a substring appears in a string.

  • The endswith() method can be used to check if a string ends with a particular substring. If the substring is found at the end of the string, the method will return True, otherwise it will return False.

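  • The isnumeric() method can be used to determine if a string contains only numeric characters. If the string is non-empty and every character is numeric, the method will return True. This can be useful for checking if a string can be converted to an integer using the int() function, although signs and decimal points are not numeric characters, so strings such as "-5" and "3.14" return False.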

  • The join() method can be used for concatenating strings. This method is called on a separator string and takes a list (or other iterable) of strings as its argument. It returns a new string composed of the strings from the list, with the separator placed between them.

  • The inverse of the join() method is the split() method, which splits a string into a list of strings. By default, the split() method splits the string by any whitespace characters, but it can also split by any other character specified by a parameter.

basic_string = "This is the test string."

# Using `.upper()`
print("Converting to upper-case:",basic_string.upper(),end="\n\n")

# Using `.lower()`
print("Converting to lower-case:",basic_string.lower(),end="\n\n")

# Using `.count()`
print("Frequency of 's':",basic_string.count('s'),end="\n\n")

# Using `.isnumeric()`
print("Is the string numeric?",basic_string.isnumeric(),end="\n\n")

string_to_join = ["Today","is","Monday.","We","are","going","to","the","beach","on", "Saturday."]

# Using `.join()`
joined_string = " ".join(string_to_join)
print(joined_string,end="\n\n")

# Using `.split()`
print("Splitting 'This is the test string' gives us",basic_string.split())
Converting to upper-case: THIS IS THE TEST STRING.

Converting to lower-case: this is the test string.

Frequency of 's': 4

Is the string numeric? False

Today is Monday. We are going to the beach on Saturday.

Splitting 'This is the test string' gives us ['This', 'is', 'the', 'test', 'string.']
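
The strip() and endswith() methods from the list are not exercised in the example above. A minimal sketch of both, combined with isnumeric() as a check before int(), might look like this; the values `user_input` and `filename` are hypothetical.

```python
user_input = "  42\n"

# strip() removes leading and trailing whitespace, including the newline
cleaned = user_input.strip()
print(cleaned)                        # 42

# isnumeric() is True here, so converting with int() is safe
if cleaned.isnumeric():
    number = int(cleaned)
    print(number + 1)                 # 43

# endswith() checks the end of the string and is case-sensitive
filename = "report.txt"
print(filename.endswith(".txt"))      # True
print(filename.endswith(".TXT"))      # False
```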

String Formatting

The format() method provides a powerful way to concatenate and format strings. It works on a string that contains curly brackets {} as placeholders. We call .format() on that string and pass the values as arguments, and each placeholder is replaced by the corresponding value. The method automatically handles any necessary conversion between data types.

If the curly brackets are left empty, the placeholders are filled in the order the arguments are passed.

# base string with {} placeholders

example = "format() method"

formatted_string = "this is an example of using the {} on a string".format(example)

print(formatted_string)
this is an example of using the format() method on a string

However, we can put more inside the curly brackets to do more powerful string formatting. For example, we can put a name inside the curly brackets and pass the value to format() as a keyword argument with that name. This provides more readable code and more flexibility with the order of variables.

# Variable name inside curly brackets

name = "Swanson"
job = "Researcher"

print("Hello, I am {name} and I am a {job}!".format(name=name,job=job))
Hello, I am Swanson and I am a Researcher!

If the placeholders contain a number, each one is replaced by the argument at that position (counting from zero).

# "{0} {1}".format(first,second)

first = "apple"
second = "banana"
third = "carrot"

format_string = "List of items: {0}, {2} and {1}".format(first,second,third)

print(format_string)
List of items: apple, carrot and banana

We can also use formatting expressions inside the curly brackets to alter the way the string is formatted.

  • Example 1: The expression {:.3f} formats the variable as a float number with three decimal places. The colon acts as a separator from the field name, if specified. We can also specify text alignment using the greater than operator >.

  • Example 2: The expression {:>3.2f} right-aligns the value in a field at least three characters wide and formats it as a float number with two decimal places. The width is a minimum: a value such as 0.67 already takes four characters, so a larger width, for instance {:>8.2f}, is needed before any padding appears.

# Example 1 

two_third = 2 / 3

print("Without formatting: {}".format(two_third))
print("With formatting: {:.3f}".format(two_third))
Without formatting: 0.6666666666666666
With formatting: 0.667
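
Example 2 has no accompanying code, so here is a minimal sketch of alignment. The square brackets are added only to make the padding visible, and the width of 8 and the field name `value` are illustrative choices.

```python
two_third = 2 / 3

# > right-aligns the value in a field of at least 8 characters
right_aligned = "[{:>8.2f}]".format(two_third)
print(right_aligned)    # [    0.67]

# A width smaller than the formatted value adds no padding
narrow = "[{:>3.2f}]".format(two_third)
print(narrow)           # [0.67]

# A field name can precede the colon
named = "{value:.3f}".format(value=two_third)
print(named)            # 0.667
```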

Program 13


Write a function called is_palindrome that takes in a string and checks if it is a palindrome. A palindrome is a string that reads the same from left to right and from right to left, ignoring capitalization and blank spaces. The function should return True if the passed string is a palindrome, and False if not.

Note: For example, is_palindrome("Kayak") should return True because "Kayak" is a palindrome string. When we read it from left to right or right to left, ignoring capitalization and blank spaces, the string remains the same.

def is_palindrome(input_string):
    
    """
    Parameters:
        input_string (str): A string to be checked
        
    Returns:
        True if input_string is palindrome, 
        False otherwise.
    """
    
    # Ignore capitalization, then drop blank spaces by splitting on
    # whitespace and joining the pieces back together
    cleaned = "".join(input_string.lower().split())

    # Slicing with a step of -1 walks the string backwards, which reverses it
    reversed_string = cleaned[::-1]

    # A palindrome is equal to its own reverse
    return cleaned == reversed_string
# Should be True
print("Run 1")
print("Is 'Never Odd or Even' a palindrome? - ",is_palindrome("Never Odd or Even"),end="\n\n")

# Should be False
print("Run 2")
print("Is 'abc' a palindrome? - ",is_palindrome("abc"),end="\n\n")

# Should be True
print("Run 3")
print("Is 'kayak' a palindrome? - ",is_palindrome("kayak"))
Run 1
Is 'Never Odd or Even' a palindrome? -  True

Run 2
Is 'abc' a palindrome? -  False

Run 3
Is 'kayak' a palindrome? -  True

Program 14


Write a function called replace_ending that takes three parameters: sentence, old, and new. The function should replace the old string in the sentence with the new string, but only if the sentence ends with the old string. If there is more than one occurrence of the old string in the sentence, only the one at the end is replaced, not all of them. The function should then return the updated sentence.

For example, replace_ending("abcabc", "abc", "xyz") should return "abcxyz", not "xyzxyz" or "xyzabc".

Note that the string comparison should be case-sensitive, so replace_ending("abcabc", "ABC", "xyz") should return "abcabc" (no changes made).

def replace_ending(sentence, old, new):
    
    """
    Parameters:
        sentence (str): The input sentence
        old (str): the string to be replaced
        new (str): the string that replaces the old string
        
    Returns:
        str: the modified sentence, with the old string
        replaced by the new string if old is found at the
        end of the sentence. Otherwise, the original
        sentence.
    """

    # Checking if the old string is at the end of the sentence 
    if sentence.endswith(old):
        
        i = len(sentence) - len(old)
        new_sentence = sentence[:i] + new
        return new_sentence

    # The sentence does not end with old, so return it unchanged
    return sentence
# Should display "It's raining cats and dogs"   
print("Run 1")
print(replace_ending("It's raining cats and cats", "cats", "dogs"),end="\n\n")

# Should display "She sells seashells by the seashore"
print("Run 2")
print(replace_ending("She sells seashells by the seashore", "seashells", "donuts"),end="\n\n")

# Should display "The weather is nice in May"
print("Run 3")
print(replace_ending("The weather is nice in May", "may", "april"),end="\n\n")

# Should display "The weather is nice in April"
print("Run 4")
print(replace_ending("The weather is nice in May", "May", "April"),end="\n\n") 
Run 1
It's raining cats and dogs

Run 2
She sells seashells by the seashore

Run 3
The weather is nice in May

Run 4
The weather is nice in April