Working with Strings#
Strings in Python are immutable sequences of Unicode characters (code points); a string can hold a single character or many. Python has no separate character type, so each element of a string is itself a string of length 1 and can be accessed with the index operator. String literals are surrounded by either single or double quotation marks. PEP 8 does not prescribe which of the two to use:
Note
In Python, single-quoted strings and double-quoted strings are the same. This PEP does not make a recommendation for this. Pick a rule and stick to it. When a string contains single or double quote characters, however, use the other one to avoid backslashes in the string. It improves readability.
For triple-quoted strings, always use double quote characters to be consistent with the docstring convention in PEP 257.
Some tools, such as Black, prefer double quotes for all strings, but both styles are correct.
The basics about strings#
The most basic form of a literal string is one that is passed directly to `print()`.

    #!/usr/bin/env python3

    def main():
        print("Hello World.")


    if __name__ == "__main__":
        main()

Output:

    Hello World.
The second form is to assign a string to a variable; `print()` then receives the variable as a reference to the string.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello World."
        print(phrase)


    if __name__ == "__main__":
        main()

Output:

    Hello World.
As seen in chapter Variables and Types, strings stored in variables can be joined with the `+` operator, and a variable can also be concatenated with a string literal.

    #!/usr/bin/env python3

    def main():
        phrase_one = "Hello World."
        phrase_two = "And Goodbye."
        # Concatenate two variables
        print(phrase_one + phrase_two)
        # Concatenate two variables with a string literal in between
        print(phrase_one + " " + phrase_two)


    if __name__ == "__main__":
        main()

Output:

    Hello World.And Goodbye.
    Hello World. And Goodbye.
A string can also span multiple lines, in which case the newline characters are part of its value. Python does not strip the indentation inside a multiline string, so any leading whitespace on continuation lines becomes part of the string. Possible workarounds are discussed in Stack Overflow question 2504411.

    #!/usr/bin/env python3

    def main():
        phrase = """Hello World.
    And Goodbye."""
        print(phrase)


    if __name__ == "__main__":
        main()

Output:

    Hello World.
    And Goodbye.
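One possible workaround for the indentation issue is `textwrap.dedent()` from the standard library. The sketch below is only an illustration; the helper name `build_phrase` is made up for this example.

```python
#!/usr/bin/env python3

import textwrap


def build_phrase():
    # The text is indented to match the surrounding code. The backslash
    # after the opening quotes suppresses the initial empty line.
    phrase = """\
        Hello World.
        And Goodbye."""
    # dedent() removes the leading whitespace common to all lines.
    return textwrap.dedent(phrase)


def main():
    print(build_phrase())


if __name__ == "__main__":
    main()
```

This keeps the source code neatly indented while the printed text starts at the first column.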
Strings are arrays#
As in many other languages, strings can be addressed like arrays. How arrays work is described in Arrays, but for now we read the second element of the string and print it.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello World."
        print(phrase[1])


    if __name__ == "__main__":
        main()

Output:

    e
Because a string behaves like an array, you can loop over it and get every character separately.

    #!/usr/bin/env python3

    def main():
        for x in "Hello World.":
            print(x)


    if __name__ == "__main__":
        main()

Output:

    H
    e
    l
    l
    o
     
    W
    o
    r
    l
    d
    .
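Unlike arrays in many other languages, a Python string cannot be changed in place. The sketch below shows that assigning to an index raises a `TypeError` and that a new string has to be built instead; the helper name `replace_first_character` is made up for this example.

```python
#!/usr/bin/env python3

def replace_first_character(phrase, character):
    # Strings are immutable, so build a new string instead of
    # assigning to phrase[0].
    return character + phrase[1:]


def main():
    phrase = "Hello World."
    try:
        phrase[0] = "J"
    except TypeError as error:
        # Item assignment is not supported for strings.
        print(error)
    print(replace_first_character(phrase, "J"))


if __name__ == "__main__":
    main()
```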
Getting string length#
The built-in function `len()` returns the length of an object and uses the object's internal method `__len__()` to determine it.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello World."
        # Using the built-in version
        print(len(phrase))
        # Using the internal method of an object
        print(phrase.__len__())


    if __name__ == "__main__":
        main()

Output:

    12
    12
Warning
The internal method `__len__()` is a magic method in Python and should not be called directly. See section Magic methods in chapter Classes for more information.
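For a string, `len()` counts characters (Unicode code points), not bytes. The sketch below compares the two for a string with a non-ASCII character; the helper name `lengths` is made up for this example.

```python
#!/usr/bin/env python3

def lengths(phrase):
    # Return the number of characters and the number of UTF-8 bytes.
    return len(phrase), len(phrase.encode("utf-8"))


def main():
    # "\u00e9" is the single character "e with acute accent".
    characters, size_in_bytes = lengths("H\u00e9llo")
    print(characters)
    print(size_in_bytes)


if __name__ == "__main__":
    main()
```

The string has five characters, but its UTF-8 encoding takes six bytes because the accented character is encoded as two bytes.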
Checking a string#
With the `in` operator you can check whether a string occurs in another string. On line 6 we check whether the string `Hello` is in the string `Hello World.`, and the result is `True`. The second check on line 8 tests whether `Hello` is not in `Hello World.`; because it is, the `else` branch runs and prints `No, Hello World.`

     1 #!/usr/bin/env python3
     2
     3 def main():
     4     phrase = "Hello World."
     5
     6     print("Hello" in phrase)
     7
     8     if "Hello" not in phrase:
     9         print("Yes, Hello World.")
    10     else:
    11         print("No, Hello World.")
    12
    13
    14 if __name__ == "__main__":
    15     main()

Output:

    True
    No, Hello World.
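The `in` operator compares characters exactly, so the check is case-sensitive. A minimal sketch of a case-insensitive check, assuming that folding both strings with `casefold()` is acceptable for your data; the helper name `contains_ignoring_case` is made up for this example.

```python
#!/usr/bin/env python3

def contains_ignoring_case(needle, haystack):
    # casefold() is a more aggressive lower() meant for comparisons.
    return needle.casefold() in haystack.casefold()


def main():
    phrase = "Hello World."
    # The plain check fails because "h" and "H" are different characters.
    print("hello" in phrase)
    print(contains_ignoring_case("hello", phrase))


if __name__ == "__main__":
    main()
```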
Slicing strings#
Other languages have functions for extracting parts of strings; Python uses the slice syntax instead. A slice is written with square brackets that contain the start and stop index, separated by a colon. The character at the stop index is not included. When omitted, the start index defaults to 0 and the stop index to the length of the string. Negative indexes count from the end of the string instead of the beginning.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello World."

        # Get the characters from position 2 to position 5 (not included).
        print(phrase[2:5])
        # Get the characters from the start to position 5 (not included).
        print(phrase[:5])
        # Get the characters from position 2 all the way to the end.
        print(phrase[2:])
        # Get the characters from position -5 to position -2 (not included),
        # counted from the end.
        print(phrase[-5:-2])


    if __name__ == "__main__":
        main()

Output:

    llo
    Hello
    llo World.
    orl
Note
Strings are indexed like arrays in Python, so the first character has index 0.
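The slice syntax also accepts an optional third value, the step. The sketch below uses it to take every second character and to reverse a string; the helper names are made up for this example.

```python
#!/usr/bin/env python3

def every_second_character(phrase):
    # A step of 2 skips every other character.
    return phrase[::2]


def reverse(phrase):
    # A negative step walks through the string backwards.
    return phrase[::-1]


def main():
    phrase = "Hello World."
    print(every_second_character(phrase))
    print(reverse(phrase))
    # A negative start without a stop returns the tail of the string.
    print(phrase[-6:])


if __name__ == "__main__":
    main()
```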
String methods#
Python has built-in methods that can be used on strings. A complete overview is given in Python String Modules, but here we touch on the most used methods.
Convert a string to upper or lower case#
The string methods `upper()` and `lower()` return a copy of the string with all characters converted to uppercase or lowercase.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello World."
        print(phrase.upper())
        print(phrase.lower())


    if __name__ == "__main__":
        main()

Output:

    HELLO WORLD.
    hello world.
Note
Both methods follow the Unicode Standard, so the result of `upper()` may, for example, still contain a lowercase character if the Unicode Standard defines it that way.
Trim a string#
The method `strip()` by default removes whitespace characters from both sides of the string. With `lstrip()` or `rstrip()` the string is trimmed only on the left or right side.

    #!/usr/bin/env python3

    def main():
        phrase = " Hello World. "
        print(phrase)
        print(phrase.strip())
        print(phrase.lstrip())
        print(phrase.rstrip())


    if __name__ == "__main__":
        main()

Output:

     Hello World. 
    Hello World.
    Hello World. 
     Hello World.
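Leading and trailing whitespace is hard to see in printed output. The sketch below prints each result with `repr()`, which shows the surrounding quotes and makes the remaining spaces visible; the helper name `strip_variants` is made up for this example.

```python
#!/usr/bin/env python3

def strip_variants(phrase):
    # Return the results of strip(), lstrip() and rstrip() in that order.
    return [phrase.strip(), phrase.lstrip(), phrase.rstrip()]


def main():
    phrase = "  Hello World.  "
    print(repr(phrase))
    for variant in strip_variants(phrase):
        # repr() adds quotes, so spaces at either end stay visible.
        print(repr(variant))


if __name__ == "__main__":
    main()
```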
Besides whitespace, `strip()` can also trim other characters. In the example below, every character in the given set is stripped from both the left and right side of the string until a character is found that is not in the set.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello World."
        print(phrase.strip("HeldW."))


    if __name__ == "__main__":
        main()

Output:

    o Wor
Replacing a string#
The method `replace()` returns a copy of the string in which a substring is replaced with another string. The example below first replaces the character `W` with `w`, and then replaces the substring `ll` with `LL`.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello World."
        # Replace the "W" character with "w"
        print(phrase.replace("W", "w"))
        # Replace the "ll" substring with "LL"
        print(phrase.replace("ll", "LL"))


    if __name__ == "__main__":
        main()

Output:

    Hello world.
    HeLLo World.
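By default `replace()` replaces every occurrence of the substring. An optional third argument limits the number of replacements, as the sketch below illustrates; the helper name `replace_first` is made up for this example.

```python
#!/usr/bin/env python3

def replace_first(phrase, old, new):
    # The third argument limits the number of replacements to one.
    return phrase.replace(old, new, 1)


def main():
    phrase = "Hello World."
    # Without a count, all three "l" characters are replaced.
    print(phrase.replace("l", "L"))
    # With a count of 1, only the first "l" is replaced.
    print(replace_first(phrase, "l", "L"))
    # The original string is unchanged because strings are immutable.
    print(phrase)


if __name__ == "__main__":
    main()
```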
Split and join#
The method `split()` splits a string into a list of strings. The example splits the string `Hello World.` on whitespace, which is the default separator, and also shows that a substring such as `ll` can be used as the separator.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello World."
        # Split a string on whitespace, the default separator
        print(phrase.split())
        # Split a string on an explicit space character
        print(phrase.split(" "))
        # Split a string on the substring "ll"
        print(phrase.split("ll"))


    if __name__ == "__main__":
        main()

Output:

    ['Hello', 'World.']
    ['Hello', 'World.']
    ['He', 'o World.']
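Calling `split()` without arguments and calling `split(" ")` only give the same result when words are separated by single spaces. The sketch below shows the difference for repeated spaces and also the optional limit on the number of splits; the example strings are made up for this illustration.

```python
#!/usr/bin/env python3

def split_once(phrase, separator):
    # The second argument limits the number of splits to one.
    return phrase.split(separator, 1)


def main():
    phrase = "Hello   World."
    # Without arguments, runs of whitespace count as one separator.
    print(phrase.split())
    # With an explicit separator, every single space splits the string,
    # which produces empty strings between consecutive spaces.
    print(phrase.split(" "))
    print(split_once("Hello big World.", " "))


if __name__ == "__main__":
    main()
```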
The method `join()` joins a list of strings into a single string, placing the separator string it is called on between the elements.

    #!/usr/bin/env python3

    def main():
        phrases = ["Hello", "World."]
        separator = " "
        print(separator.join(phrases))


    if __name__ == "__main__":
        main()

Output:

    Hello World.
The method `join()` accepts any iterable of strings. When it is given a dictionary, it joins the keys of the dictionary, not the values.

    #!/usr/bin/env python3

    def main():
        phrases = {"wordOne": "Hello", "wordTwo": "World."}
        separator = "-"
        print(separator.join(phrases))


    if __name__ == "__main__":
        main()

Output:

    wordOne-wordTwo
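To join the values of a dictionary instead of its keys, pass `values()` to `join()`. Every element must be a string, so other types have to be converted first. The sketch below illustrates both; the helper names are made up for this example.

```python
#!/usr/bin/env python3

def join_values(mapping, separator):
    # Join the values of a dictionary instead of its keys.
    return separator.join(mapping.values())


def join_numbers(numbers, separator):
    # join() raises a TypeError for non-string elements, so convert first.
    return separator.join(str(number) for number in numbers)


def main():
    phrases = {"wordOne": "Hello", "wordTwo": "World."}
    print(join_values(phrases, " "))
    print(join_numbers([1, 2, 3], "-"))


if __name__ == "__main__":
    main()
```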
Formatting strings#
The method `format()` fills the placeholders (`{}`) in a string with the given values.

    #!/usr/bin/env python3

    def main():
        name = "World"
        phrase = "Hello {}."
        print(phrase.format(name))


    if __name__ == "__main__":
        main()

Output:

    Hello World.
The method `format()` can also take multiple arguments, which fill the placeholders in order.

    #!/usr/bin/env python3

    def main():
        name_one = "Jack"
        name_two = "John"
        phrase = "Hello {} and {}."
        print(phrase.format(name_one, name_two))


    if __name__ == "__main__":
        main()

Output:

    Hello Jack and John.
The placeholders for the method `format()` can also contain an index that selects which positional argument to insert.

    #!/usr/bin/env python3

    def main():
        name_one = "Jack"
        name_two = "John"
        phrase = "Hello {1} and {0}."
        print(phrase.format(name_one, name_two))


    if __name__ == "__main__":
        main()

Output:

    Hello John and Jack.
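Placeholders can also be named, in which case the values are passed as keyword arguments or unpacked from a dictionary. Formatted string literals (f-strings) are a common alternative in which the variable names are written directly inside the string. The sketch below shows all three; the function and placeholder names are made up for this example.

```python
#!/usr/bin/env python3

def greet_with_keywords(first, second):
    # Named placeholders are filled from keyword arguments.
    return "Hello {first} and {second}.".format(first=first, second=second)


def greet_with_mapping(names):
    # The ** operator unpacks the dictionary into keyword arguments.
    return "Hello {first} and {second}.".format(**names)


def greet_with_fstring(first, second):
    # An f-string evaluates the expressions inside the braces.
    return f"Hello {first} and {second}."


def main():
    print(greet_with_keywords("Jack", "John"))
    print(greet_with_mapping({"first": "Jack", "second": "John"}))
    print(greet_with_fstring("Jack", "John"))


if __name__ == "__main__":
    main()
```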
Escape characters#
A string is a sequence of characters, and for Python to recognize where it starts and ends it is surrounded with quotes, such as the double quotes (the `"` sign) in the example below. If the string itself also contains a double quote, Python can't determine where the string ends and the code is invalid.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello "World"."
        print(phrase)


    if __name__ == "__main__":
        main()

Output:

      File "/workspaces/learning-python/example.py", line 4
        phrase = "Hello "World"."
                         ^
    SyntaxError: invalid syntax
To solve the SyntaxError, the offending character can be escaped with the backslash character (the `\` sign). In the example below we changed the double quotes inside the string from `"` to `\"`, which resolves the SyntaxError.

    #!/usr/bin/env python3

    def main():
        phrase = "Hello \"World\"."
        print(phrase)


    if __name__ == "__main__":
        main()

Output:

    Hello "World".
Python has a complete list of escape sequences, for example to create new lines, but also to insert a character by its octal or hexadecimal value into a string.

Code | Result |
---|---|
`\'` | Single Quote |
`\\` | Backslash |
`\n` | New Line |
`\r` | Carriage Return |
`\t` | Tab |
`\b` | Backspace |
`\f` | Form Feed |
`\ooo` | Octal value |
`\xhh` | Hex value |

    #!/usr/bin/env python3

    def main():
        phrase = "Hello\n \"World\"."
        print(phrase)


    if __name__ == "__main__":
        main()

Output:

    Hello
     "World".
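Escaping is not always needed. As the PEP 8 note at the top of this chapter suggests, a string containing double quotes can be surrounded with single quotes instead, and a raw string (prefix `r`) leaves backslashes untouched. The sketch below also shows the octal and hexadecimal escapes; the function names are made up for this example.

```python
#!/usr/bin/env python3

def quoted():
    # Single quotes around the string avoid escaping the double quotes.
    return 'Hello "World".'


def raw():
    # In a raw string the backslash and the "n" stay two characters.
    return r"Hello\nWorld."


def letters():
    # Octal 101 and hexadecimal 41 both represent the letter "A".
    return "\101\x41"


def main():
    print(quoted())
    print(raw())
    print(letters())


if __name__ == "__main__":
    main()
```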