Set Data type in Python

In Python, a set is an unordered collection of unique values: each value can occur only once. The values can be of different types (strings, numbers, tuples, etc.), but every element must be immutable or, more precisely, hashable.
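
The rules above can be sketched as follows. This is a minimal illustration with made-up values (the names mixed, invalid and rejected are not from the article), not a complete account of hashing:

```python
# Elements of different hashable types can share one set.
mixed = {"apple", 42, 3.14, ("x", "y")}
print(len(mixed))  # 4

# A list is mutable and therefore unhashable, so it cannot be an element.
try:
    invalid = {["a", "b"]}
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True

# Duplicates collapse to a single entry.
print(len({1, 1, 1}))  # 1
```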

Set Operations

Python provides many operations on sets, and this article is structured around them.

  1. Definition and Creation

    In Python, sets can be created in two ways. Firstly, you can use the built-in set() function:

     >>> my_set = set() # this creates an empty set
    
     >>> my_set_1 = set(("apple", "orange", "lemon", "mango")) # set() takes a single iterable, here a tuple
     # my_set_1 is now {"apple", "orange", "lemon", "mango"}
    
     # You can also use the set() function with a list (or any other iterable):
    
     >>> my_set = set(["football", "cricket", "tennis"])
    
     # my_set is now {"football", "cricket", "tennis"}
    

    Secondly, you can create a set with curly braces { }:

     >>> my_set = {"london", "manchester", "paris", "berlin"}
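
    Both ways of creating a set can be compared side by side. The following is a small sketch with illustrative variable names; it also shows one pitfall worth knowing, namely that empty curly braces create a dict rather than a set:

```python
# Building sets from different iterables; all names here are illustrative.
from_list = set(["football", "cricket", "tennis", "cricket"])  # duplicate dropped
from_string = set("hello")     # a string is iterated character by character
literal = {"london", "paris"}  # curly-brace literal
empty_set = set()              # set() is how an empty set is written
empty_dict = {}                # note: this is a dict, not a set

print(sorted(from_list))    # ['cricket', 'football', 'tennis']
print(sorted(from_string))  # ['e', 'h', 'l', 'o']
print(type(empty_set).__name__, type(empty_dict).__name__)  # set dict
```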
    
  2. Membership

    To check whether an element is contained in a set, we use the in and not in operators.

     >>> my_set = {"apple", "orange", "guava", "pear"}
    
     >>> print("apple" in my_set) # True
    
     >>> print("mango" in my_set) # False
    
     >>> print("mango" not in my_set) # True
    
     >>> print("orange" not in my_set) # False
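
    A common use of membership tests is filtering another collection against a set. The sketch below uses hypothetical data (in_stock and orders are invented for illustration):

```python
# Hypothetical data: split the orders by whether the fruit is in stock.
in_stock = {"apple", "orange", "guava", "pear"}
orders = ["apple", "mango", "pear", "kiwi", "apple"]

available = [fruit for fruit in orders if fruit in in_stock]
missing = [fruit for fruit in orders if fruit not in in_stock]

print(available)  # ['apple', 'pear', 'apple']
print(missing)    # ['mango', 'kiwi']
```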
    
  3. Size

    To get the size of a set, i.e. the number of elements it contains, use the built-in len function:

     >>> my_set = {"apple", "orange", "guava", "pear"}
    
     >>> print(len(my_set)) # 4
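
    Because a set keeps only unique values, len combined with set() is a quick way to count distinct items. A short sketch with an invented word list:

```python
# Illustrative data: count how many distinct words a list contains.
words = ["apple", "orange", "apple", "pear", "orange"]
unique_words = set(words)

print(len(words))         # 5
print(len(unique_words))  # 3
print(len(set()))         # 0
```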
    
  4. Add elements to Set

    To add an element to a set, we use the add method. The element can be of any hashable type, and adding an element that is already present leaves the set unchanged.

     >>> my_courses = set()
    
     >>> my_courses.add("Math")
     >>> my_courses.add("Chemistry")
    
     >>> print(my_courses) # {'Chemistry', 'Math'}
    
     >>> my_courses.add("Math")
     >>> my_courses.add("Physics")
     >>> my_courses.add("History")
    
     >>> print(my_courses) # {'Chemistry', 'Physics', 'History', 'Math'}
    

    Notice that even though we added "Math" twice, it occurs only once in the set.
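
    This behaviour makes add useful for tracking what has already been seen. The helper below, unique_in_order, is a hypothetical function written for this illustration; it removes duplicates from a list while keeping the original order:

```python
def unique_in_order(items):
    """Return items with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)  # add() mutates the set in place and returns None
            result.append(item)
    return result


courses = ["Math", "Chemistry", "Math", "Physics", "Chemistry"]
print(unique_in_order(courses))  # ['Math', 'Chemistry', 'Physics']
```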

  5. Deleting elements from Set

    There are several ways of deleting elements from a set, each suited to a different use case.

    • remove()

      We can delete a specific item from a set by calling the remove method with the item to be deleted:

        >>> my_courses = set({"math", "history", "chemistry"})
        >>> my_courses.remove("math")
      
        >>> print(my_courses) # {'chemistry', 'history'}
      
    • discard()

      Just like remove, the discard method deletes the item passed to it from the set:

        >>> my_courses = set({"math", "history", "chemistry"})
        >>> my_courses.discard("math")
      
        >>> print(my_courses) # {'chemistry', 'history'}
      

      The difference between the two methods is that remove raises a KeyError when the specified item is not in the set, while discard does not: it returns silently and leaves the set unchanged.

        >>> my_courses = set({"math", "history", "chemistry"})
      
        >>> my_courses.remove("biology")
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        KeyError: 'biology'
      
        >>> my_courses = set({"math", "history", "chemistry"})
      
        >>> my_courses.discard("biology")
        # does not raise any error
        >>> print(my_courses) # {'math', 'chemistry', 'history'}
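
      In a script, the KeyError from remove can be caught when a missing element is a possibility. A minimal sketch; the flag error_raised is introduced only for this illustration:

```python
my_courses = {"math", "history", "chemistry"}

# remove() signals a missing element with KeyError, so guard it
# whenever the element might be absent.
try:
    my_courses.remove("biology")
    error_raised = False
except KeyError:
    error_raised = True
print(error_raised)  # True

# discard() is the quieter choice when a missing element is not a problem.
my_courses.discard("biology")
print(len(my_courses))  # 3
```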
      
    • clear()

      With clear(), we can delete all the elements in a set at once:

        >>> my_courses = set({"math", "history", "chemistry"})
      
        >>> my_courses.clear()
      
        >>> print(my_courses) # set()
      
    • pop()

      pop() removes an arbitrary element from a set and returns it. Which element is removed is not specified, so do not rely on it being a particular one or on it being truly random.

        >>> my_courses = set({"math", "history", "chemistry"})
      
        >>> removed = my_courses.pop()
      
        >>> print(removed) # e.g. math
      
        >>> print(my_courses) # e.g. {'chemistry', 'history'}
      

      The pop() method raises a KeyError when called on an empty set:

        >>> empty_set = set() # naming the variable "set" would shadow the built-in
        >>> empty_set.pop()
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        KeyError: 'pop from an empty set'
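
      Since an empty set is falsy, a while loop is one way to pop every element without hitting that error. A small sketch with illustrative names (pending, processed):

```python
pending = {"math", "history", "chemistry"}
processed = []

# The loop stops as soon as the set is empty, so pop() never raises here.
while pending:
    course = pending.pop()  # removes and returns an arbitrary element
    processed.append(course)

print(sorted(processed))  # ['chemistry', 'history', 'math']
print(pending)            # set()
```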
      
  6. Set Union

    The union of two or more sets is a new set containing every element that appears in any of them. We can use either the union method or the pipe | operator to get it. The syntax is: set_1.union(*other_sets) or set_1 | set_2 | ...

     >>> course_year_1 = {"Java", "Python", "Typescript"}
    
     >>> course_year_2 = {"Javascript", "Python", "Rust", "Golang"}
    
     >>> all_courses = course_year_1.union(course_year_2)
    
     >>> print(all_courses) # {'Rust', 'Java', 'Typescript', 'Javascript', 'Golang', 'Python'}
    

    or

     >>> course_year_1 = {"Java", "Python", "Typescript"}
    
     >>> course_year_2 = {"Javascript", "Python", "Rust", "Golang"}
    
     >>> all_courses = course_year_1 | course_year_2
    
     >>> print(all_courses) # {'Rust', 'Java', 'Typescript', 'Javascript', 'Golang', 'Python'}
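
    The two spellings differ in one respect: the union method accepts any iterable, while the | operator requires sets on both sides. A brief sketch, where course_year_3 is a list added for illustration:

```python
course_year_1 = {"Java", "Python", "Typescript"}
course_year_2 = {"Javascript", "Python", "Rust", "Golang"}
course_year_3 = ["Python", "C++"]  # a list, not a set

# union() takes several arguments, and they may be any iterables.
all_courses = course_year_1.union(course_year_2, course_year_3)

# The operator form needs the list converted to a set first.
chained = course_year_1 | course_year_2 | set(course_year_3)
print(all_courses == chained)  # True

try:
    course_year_1 | course_year_3
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)  # False
```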
    
  7. Set Intersection

    The intersection of two or more sets is the set of elements common to all of them, i.e. the elements that can be found in every one of the sets.

    To get the intersection, we use the intersection method or the & operator. The syntax is: set_1.intersection(*other_sets) or set_1 & set_2 & ...

     >>> course_year_1 = {"Java", "Python", "Typescript"}
    
     >>> course_year_2 = {"Javascript", "Python", "Rust", "Golang"}
    
     >>> course_year_3 = {"Python", "C++", "HTML", "Golang"}
    
     >>> all_courses = course_year_1.intersection(course_year_2, course_year_3)
    
     >>> print(all_courses) # {'Python'}
    

    "Python" is the only item common to the three sets

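    The same result can be obtained by chaining the & operator, and sets with nothing in common give an empty set. A short sketch reusing the course sets from above:

```python
course_year_1 = {"Java", "Python", "Typescript"}
course_year_2 = {"Javascript", "Python", "Rust", "Golang"}
course_year_3 = {"Python", "C++", "HTML", "Golang"}

# Chained operator form of intersection(course_year_2, course_year_3).
common = course_year_1 & course_year_2 & course_year_3
print(common)  # {'Python'}

# Only years 2 and 3 share "Golang".
print(sorted(course_year_2 & course_year_3))  # ['Golang', 'Python']

# No shared elements: the result is the empty set.
print(course_year_1 & {"C++", "HTML"})  # set()
```
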
  8. Set difference

    The difference of two sets is the set of elements in the first set that are not in the second. It can be obtained with the difference method or the - operator, with the syntax set_1.difference(*other_sets) or set_1 - set_2. Note that difference is not symmetric: in general, set_1.difference(set_2) is not equal to set_2.difference(set_1).

     >>> fruits = {"apple", "tomato", "orange", "pear", "carrot"}
    
     >>> vegetables = {"tomato", "spinach", "broccoli", "onion"}
    
     >>> print(fruits.difference(vegetables)) # items in fruits but not in vegetables => {'carrot', 'orange', 'pear', 'apple'}
    
     >>> print(fruits - vegetables) # {'carrot', 'orange', 'pear', 'apple'}
    

    similarly

     >>> fruits = {"apple", "tomato", "orange", "pear", "carrot"}
    
     >>> vegetables = {"tomato", "spinach", "broccoli", "onion"}
    
     >>> print(vegetables.difference(fruits)) # items in vegetables but not in fruits => {'onion', 'spinach', 'broccoli'}
    
     >>> print(vegetables - fruits) # {'onion', 'spinach', 'broccoli'}
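
    The difference method can also take several arguments at once, removing the elements of each in turn. A minimal sketch with the same data; the extra ["carrot"] list is added for illustration:

```python
fruits = {"apple", "tomato", "orange", "pear", "carrot"}
vegetables = {"tomato", "spinach", "broccoli", "onion"}

only_fruits = fruits - vegetables
only_vegetables = vegetables - fruits
print(only_fruits == only_vegetables)  # False: the order of operands matters

# Several iterables can be subtracted in one call.
remaining = fruits.difference(vegetables, ["carrot"])
print(sorted(remaining))  # ['apple', 'orange', 'pear']
```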
    
  9. Symmetric difference

    The symmetric difference of two sets is the set of items that belong to exactly one of them, i.e. items found in either set but not in both. You can view it as the union minus the intersection. You can get it with the symmetric_difference method or the ^ operator, with the syntax set_1.symmetric_difference(set_2) or set_1 ^ set_2. Unlike union and intersection, symmetric_difference accepts only a single argument.

     >>> fruits = {"apple", "tomato", "orange", "pear", "carrot"}
    
     >>> vegetables = {"tomato", "spinach", "broccoli", "onion"}
    
     >>> print(vegetables.symmetric_difference(fruits)) # {'carrot', 'spinach', 'apple', 'onion', 'orange', 'broccoli', 'pear'}
    
     >>> print(vegetables ^ fruits)
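
    The relationship between symmetric difference and the other operations can be checked directly. A short sketch using the same two sets:

```python
fruits = {"apple", "tomato", "orange", "pear", "carrot"}
vegetables = {"tomato", "spinach", "broccoli", "onion"}

sym = fruits ^ vegetables

# Union minus intersection gives the same set...
print(sym == (fruits | vegetables) - (fruits & vegetables))  # True
# ...and so does the union of the two one-sided differences.
print(sym == (fruits - vegetables) | (vegetables - fruits))  # True

# "tomato" is in both sets, so it is excluded.
print("tomato" in sym)  # False
```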
    
  10. Set Modification

    Sets themselves are mutable, although the elements they contain must be of immutable types. To add the elements of another set in place, we use the update method or the |= operator. set_1.update(set_2) adds to set_1 all elements of set_2 that are not already in set_1.

    set_1 = {"a", "b", "c"}
    
    set_2 = {"c", "d", "e"}
    
    set_3 = {"e", "f", "d"}
    
    set_1.update(set_2)
    
    set_2 |= set_3
    
    print(set_1) # {'a', 'b', 'c', 'e', 'd'}
    
    print(set_2) # {'c', 'e', 'd', 'f'}
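
    The difference between modifying a set in place and building a new one shows up when a second name refers to the same set. A small sketch; the name alias is introduced only for this illustration:

```python
set_1 = {"a", "b", "c"}
alias = set_1  # a second name for the same set object

set_1 |= {"d"}         # in place: alias sees the change
print(alias is set_1)  # True
print(sorted(alias))   # ['a', 'b', 'c', 'd']

set_1 = set_1 | {"e"}  # builds a new set and rebinds set_1
print(alias is set_1)  # False
print(sorted(alias))   # ['a', 'b', 'c', 'd']

# update() accepts several iterables; the string contributes its characters.
set_1.update(["f"], "gh")
print(sorted(set_1))   # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```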
    

Conclusion

In this article, I have taken you through some basic set operations in Python, along with the functions, methods and syntax used to work with sets. Please leave me some comments below.