Finding the longest or shortest item in a list


Python makes it quick to work with data, but writing elegant, readable code takes deeper knowledge of the language. For many new Python programmers this is a catch-22, because they’re still learning the details and that takes time. Let’s take a simple example, finding the longest string in a list, and simplify the code. As a bonus, the code also gets faster, because the built-in functions run as compiled code instead of as interpreted Python.

The example loops over a list and checks whether each item is longer than the longest item found so far; if it is, the item is stored as the new longest. Afterward, it prints the result, which is longest in this case.

#!/usr/bin/env python3

def main():
    mylist = ["find", "longest", "item", "in", "the", "list"]

    word = ""
    for item in mylist:
        if len(item) > len(word):
            word = item
    print(word)


if __name__ == "__main__":
    main()

Using built-in functions

Python has built-in functions called min() and max() to find the smallest or biggest value in a list. In the example below the max() function is used to find the biggest numerical value and min() for the smallest numerical value.

#!/usr/bin/env python3

def main():
    mylist = [1, 2, 3, 5, 4]

    max_value = max(mylist)
    min_value = min(mylist)
    print(min_value)
    print(max_value)


if __name__ == "__main__":
    main()

As expected, the output is 1 for the lowest value and 5 for the highest value in the list.

1
5

Handling an empty sequence

Calling the built-in functions min() and max() on an empty sequence fails with ValueError: max() arg is an empty sequence (the exact wording can differ between Python versions). When the default argument is set, here to 0, that value is returned instead of raising the error. In the original example, we set the variable word to an empty string to serve as the default value.

#!/usr/bin/env python3

def main():
    mylist = []

    value = max(mylist, default=0)
    print(value)


if __name__ == "__main__":
    main()
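As a minimal sketch of both behaviours described above, the example below catches the ValueError explicitly and compares it with the default argument. The helper name largest_or_none is hypothetical and only used for illustration.

```python
#!/usr/bin/env python3

def largest_or_none(values):
    """Return the largest value, or None when the sequence is empty."""
    try:
        return max(values)
    except ValueError:
        # Raised by max() when the sequence is empty and no default is given.
        return None


def main():
    print(largest_or_none([]))     # None
    print(max([], default=0))      # 0
    print(max([3, 7], default=0))  # 7


if __name__ == "__main__":
    main()
```

The default argument is usually the shorter choice; the try/except form is mainly useful when an empty sequence needs separate handling.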

Finding the longest item

Now that we know about the max() function and how to handle an empty list, it is time to apply both to our example. One key piece is still missing: by default max() compares the elements themselves, and strings compare alphabetically rather than by length, so we need to score each element with a numerical value. The max() function has another option called key, which takes a function that scores each element, and with the len() function the score is the length of the string.

#!/usr/bin/env python3

def main():
    mylist = ["find", "longest", "item", "in", "the", "list"]

    word = max(mylist, key=len, default="")
    print(word)


if __name__ == "__main__":
    main()

Running the example prints the same result as the original example we started with, but the four lines of selection code are reduced to just one.

longest
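To see why the key option matters, the short sketch below calls max() on the same list with and without key=len. The constant name MYLIST is only an illustration.

```python
#!/usr/bin/env python3

MYLIST = ["find", "longest", "item", "in", "the", "list"]


def main():
    # Without key, strings are compared character by character (alphabetically).
    print(max(MYLIST))           # the
    # With key=len, the length of each string is compared instead.
    print(max(MYLIST, key=len))  # longest


if __name__ == "__main__":
    main()
```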

Finding the shortest item

By using the min() function we can find the shortest string in the list without rewriting the selection code. Here we already see a benefit, as we can just switch from max() to min() without any other modifications. For this list, the result is in.

#!/usr/bin/env python3

def main():
    mylist = ["find", "longest", "item", "in", "the", "list"]

    word = min(mylist, key=len, default="")
    print(word)


if __name__ == "__main__":
    main()
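One detail worth knowing is that min() and max() return the first matching item when several items share the same key value. The sketch below illustrates this and, as one possible approach, collects all tied items with a list comprehension. The helper all_shortest and the sample list are hypothetical.

```python
#!/usr/bin/env python3

def all_shortest(words):
    """Return every word that shares the minimum length, in original order."""
    if not words:
        return []
    shortest = min(map(len, words))
    return [word for word in words if len(word) == shortest]


def main():
    words = ["find", "item", "in", "of", "the"]

    # "in" and "of" both have length 2; min() returns the first one it sees.
    print(min(words, key=len, default=""))  # in
    print(all_shortest(words))              # ['in', 'of']


if __name__ == "__main__":
    main()
```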

Changing the key calculation

In the previous examples, the len() function was used to calculate the key value based on the length of the string. Other functions can also be used to calculate the key value, and in the example below we create a custom function for this. The calc function sums the ASCII values of all characters in the string and returns the total.

#!/usr/bin/env python3

def calc(word: str) -> int:
    # sum() accepts any iterable, so the map object needs no list() wrapper
    return sum(map(ord, word))


def main():
    mylist = ["aa", "ab", "ac", "ba"]

    word = max(mylist, key=calc, default="")
    print(word)


if __name__ == "__main__":
    main()

Note

The function ord() returns the integer Unicode code point of a single character (which equals the ASCII value for ASCII characters), and the map() function applies a function to every item of an iterable.

A shorter way to write this is to use a lambda function, which keeps the function close to where it is used and can make the code more readable.

#!/usr/bin/env python3

def main():
    mylist = ["aa", "ab", "ac", "ba"]

    word = max(mylist, key=lambda value: sum(map(ord, value)), default="")
    print(word)


if __name__ == "__main__":
    main()

In both examples, the outcome is ac, as it has the biggest total when the ASCII values of its characters are summed up.
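To make the scoring visible, the sketch below prints the key value that each string receives. The calc function is redefined here so the example is self-contained, and the name scores is only an illustration.

```python
#!/usr/bin/env python3

def calc(word: str) -> int:
    # Sum the code point of every character in the word.
    return sum(map(ord, word))


def main():
    mylist = ["aa", "ab", "ac", "ba"]

    scores = {word: calc(word) for word in mylist}
    print(scores)  # {'aa': 194, 'ab': 195, 'ac': 196, 'ba': 195}
    print(max(mylist, key=calc, default=""))  # ac


if __name__ == "__main__":
    main()
```

Note that ab and ba score the same, because the sum ignores the order of the characters.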

Conclusions about finding in lists

In the original example, we started with a custom piece of code to select an element from a list and replaced it with built-in functions. This reduced both the length and the complexity of the code. These built-in functions are tested extensively with every Python release and, in the CPython reference implementation, are implemented in C rather than Python, which reduces the chance of bugs and slow code.
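The speed difference can be checked with the timeit module from the standard library. The sketch below is one rough way to do that, not a rigorous benchmark. The function names, the list size, and the repeat count are arbitrary assumptions, and the timings will vary by machine and Python version.

```python
#!/usr/bin/env python3

import timeit

# A larger list so the difference is measurable; the size is arbitrary.
WORDS = ["find", "longest", "item", "in", "the", "list"] * 1000


def longest_loop(words):
    """Hand-written selection loop from the original example."""
    word = ""
    for item in words:
        if len(item) > len(word):
            word = item
    return word


def longest_builtin(words):
    """Same selection using the built-in max() function."""
    return max(words, key=len, default="")


def main():
    for func in (longest_loop, longest_builtin):
        seconds = timeit.timeit(lambda: func(WORDS), number=200)
        print(f"{func.__name__}: {seconds:.4f}s")


if __name__ == "__main__":
    main()
```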
