String Methods

Based on the sources and our conversation history, let’s delve into Python’s string methods.

What are String Methods?

String methods are functions that are specifically associated with string objects in Python. They are described as type-specific operations, distinct from generic sequence operations that work across multiple sequence types like strings, lists, and tuples. Methods are implemented as attributes attached to objects that reference callable functions.

From an abstract perspective, methods provide behaviour for objects. From a programming perspective, methods work in much the same way as simple functions, except that a method’s first argument typically receives the instance object that is the subject of the method call. String methods are generally specific to string objects and typically work only on them.

Purpose of String Methods

Strings are used to record textual information and represent text-based information. While basic operations like concatenation (+), repetition (*), indexing (S[i]), slicing (S[i:j]), and length (len(S)) are available as sequence operations for strings and other sequence types, string methods implement more sophisticated text-processing tasks that are specific to strings.

They serve as the first line of text-processing tools in Python.

Syntax for Calling String Methods

Methods are called using “dot notation” or “method syntax”. The syntax is object.method(arguments). An expression of the form object.attribute first fetches the value of the attribute in the object. An expression of the form function(arguments) invokes the code of the function, passing arguments and returning a result. Combining these, the method call object.method(arguments) first fetches the method of the object and then calls it, passing in the arguments. The string object itself is often referred to as the “subject” of the method invocation.

For example, instead of a function syntax like upper(word), you use the method syntax word.upper(). The empty parentheses () indicate that the upper method takes no arguments. Calling a method is also referred to as an “invocation”.

String Immutability and Methods

A key characteristic of Python strings is that they are immutable. This means that the characters they contain cannot be changed in-place.

Consequently, string methods that appear to modify a string (like replace, strip, lower) actually return a new string object containing the result of the operation. The original string object remains unchanged. To effectively “change” a string, you typically call a method that returns a new string and then assign the result back to the variable name. This behaviour contrasts with many list methods, which can modify the list in-place.

Common String Methods

The sources list and describe a variety of common string methods:

  • find(sub[, start[, end]]): Searches for a substring and returns the offset of the first occurrence, or -1 if not found. It can take optional start and end arguments to search within a specific range.
  • replace(old, new [, count]): Returns a new string with occurrences of old replaced by new. An optional count limits the number of replacements.
  • split([sep[, maxsplit]]): Returns a list of substrings by splitting the string around a delimiter (sep). If sep is not provided, it splits on whitespace. maxsplit is an optional argument.
  • join(iterable): A method of strings that takes an iterable (like a list of strings) and concatenates its elements into a single string, using the method’s subject string as the separator between elements. This is often faster than concatenating strings individually.
  • Case Conversions: Methods like upper(), lower(), swapcase(), and title() return new strings with characters converted to uppercase, lowercase, inverted case, or title case respectively.
  • Whitespace/Character Removal: Methods like strip([chars]), lstrip([chars]), and rstrip([chars]) return new strings with leading/trailing whitespace (by default) or specified characters removed. strip performs both left and right stripping.
  • Content Tests: Methods like isalpha(), isdigit(), isalnum(), isspace(), istitle(), isupper(), etc., check the content of the string and return True or False.
  • Prefix/Suffix Tests: startswith(prefix [, start [, end]]) and endswith(suffix [, start [, end]]) check if the string begins or ends with a given substring. These can also be achieved with slicing and indexing.
  • count(sub[, start[, end]]): Counts how many times a substring sub occurs within the string or a specified substring.
  • encode() and decode(): In Python 3.0, str objects (Unicode text) have an encode() method to convert them to bytes objects, while bytes objects have a decode() method to convert them to str objects. This conversion requires specifying an encoding.

This is not an exhaustive list; there are many more string methods available.

String Formatting Methods

Besides the older % formatting expression, Python 2.6 and 3.0 introduced a newer way to format strings using a method call, specifically the .format() method. This method uses the subject string as a template, with curly braces {} indicating substitution targets. Values to be substituted can be inserted by position (e.g., {0}) or keyword (e.g., {food}). While it largely overlaps with the functionality of formatting expressions, it has some extra features and handles argument passing differently.

Other Types with Methods

The concept of methods is used throughout Python. Objects of other core types like lists, dictionaries, and files also have type-specific methods. Even user-defined classes can have methods. Python 3.0 also has bytes and bytearray types, which have their own sets of methods, some overlapping with str and some unique, especially bytearray which has mutable methods similar to lists.

Finding and Learning String Methods

You can discover the methods available for a string object by using the built-in dir() function on a string literal (e.g., dir('')) or the str type name (dir(str)). This returns a list of attributes, including methods. Names starting and ending with double underscores (__) are related to the implementation and operator overloading and can generally be ignored initially.

To understand what a specific method does, you can use the built-in help() function, passing the method itself (e.g., help(S.method)).

Remembering all string methods takes time, but you can use resources like the Python library manual, reference books, cheat sheets, or interactive experimentation with dir and help.

Historical Note: String Module vs. Methods

Historically, in older Python versions (prior to 2.0), many common string operations were available as functions in a standard library module called string. In Python 2.0, these functions were made available as methods of string objects. Although the module was retained for backward compatibility in Python 2.X, the module functions have been entirely removed in Python 3.0. You should use string object methods instead of the old string module functions today, as methods are generally cleaner (no import needed for basic operations) and faster. The string module still exists in 3.0 but contains only other tools like predefined constants.

String Methods and Pattern Matching

It’s important to note that none of the standard string object methods support pattern-based text processing. For advanced pattern matching, you need to use the re (regular expression) module from Python’s standard library.