Sets Data Structure in Python

What is a Set?

You've learned List, Tuple, and Dictionary. Set is the last of Python's four built-in data structures.

A set is a collection that has two special properties:

  • No duplicate values — every item is unique
  • No order — items have no index, no position


    my_set = {1, 2, 3, 4, 5}
    print(my_set)    # {1, 2, 3, 4, 5}

It looks similar to a dictionary, but there are no key-value pairs, just values.


The Killer Feature — No Duplicates


    numbers = {1, 2, 3, 2, 4, 1, 5, 3}
    print(numbers)    # {1, 2, 3, 4, 5} — duplicates automatically removed

This is the main reason sets exist. When you need unique values — use a set.
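
As a small illustration of the point above, comparing len() before and after the conversion shows how many repeats were dropped. The variable names here are made up for this sketch.

```python
numbers_list = [1, 2, 3, 2, 4, 1, 5, 3]
unique_numbers = set(numbers_list)

print(len(numbers_list))      # 8 items in the list
print(len(unique_numbers))    # 5 unique values

# How many repeated entries the set dropped
duplicates_removed = len(numbers_list) - len(unique_numbers)
print(duplicates_removed)     # 3
```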


Creating a Set


    # Direct creation
    fruits = {"apple", "banana", "mango", "apple", "banana"}
    print(fruits)    # {'apple', 'banana', 'mango'} (order may differ): duplicates gone

    # Convert list to set, a very common use case
    marks_list = [85, 90, 85, 78, 90, 92, 78]
    marks_set = set(marks_list)
    print(marks_set)    # {85, 90, 78, 92} (order may differ): unique marks only

    # Empty set: must use set(), not {}
    empty_set = set()     # correct: an empty set
    empty_dict = {}       # NOT a set: {} creates an empty dictionary
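
If you are ever unsure which one you created, type() settles it. The sketch below also shows that set() accepts any iterable, not just a list; the variable names are illustrative.

```python
empty_set = set()
empty_dict = {}

print(type(empty_set))     # <class 'set'>
print(type(empty_dict))    # <class 'dict'>

# A set built from a string keeps each distinct character once
letters = set("hello")
print(len(letters))        # 4 (h, e, l, o)
```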


No Index — Cannot Access by Position


    fruits = {"apple", "banana", "mango"}

    print(fruits[0])    # TypeError: 'set' object is not subscriptable

Sets have no order, so there is no first or second item. You can only loop through all the items or check whether a particular item exists.
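
When a position really is needed, one possible workaround is to build an ordered list from the set first. The sketch below catches the indexing error and then uses sorted(), which returns a new list; sorting alphabetically is just one choice of order.

```python
fruits = {"apple", "banana", "mango"}

try:
    fruits[0]
except TypeError as error:
    print(f"TypeError: {error}")

# Build an ordered list from the set, then index into that list
fruit_list = sorted(fruits)      # alphabetical order
print(fruit_list[0])             # apple
```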


Checking Membership


    fruits = {"apple", "banana", "mango"}

    print("apple" in fruits)      # True
    print("grapes" in fruits)     # False

    if "mango" in fruits:
        print("Mango is available!")

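Checking membership in a set is faster than in a list, especially with large data. A set uses hashing to find an item in roughly constant time on average, while a list has to be scanned item by item. This is one more reason to use a set when you only need to check existence.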
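
To see the difference yourself, a rough timing sketch with the standard timeit module looks like this. The collection size and repeat count are arbitrary choices, and the exact timings depend on your machine, so no expected numbers are shown.

```python
import timeit

ids_list = list(range(100_000))
ids_set = set(ids_list)
target = 99_999        # worst case for the list: the last element

# Run each membership check 100 times and measure the total time
list_time = timeit.timeit(lambda: target in ids_list, number=100)
set_time = timeit.timeit(lambda: target in ids_set, number=100)

print(f"list: {list_time:.5f} s")
print(f"set:  {set_time:.5f} s")
```

On typical machines the set lookup should come out far faster, and the gap grows as the collection gets larger.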


Adding and Removing Items


    fruits = {"apple", "banana", "mango"}

    fruits.add("orange")          # add one item
    print(fruits)

    fruits.add("apple")           # adding duplicate — nothing happens
    print(fruits)                 # still same, no error

    fruits.remove("banana")       # remove: raises KeyError if not found
    print(fruits)

    fruits.discard("grapes")      # remove — NO error if not found
    print(fruits)

Output (the order of items may differ on your machine):

{'apple', 'banana', 'mango', 'orange'}
{'apple', 'banana', 'mango', 'orange'}
{'apple', 'mango', 'orange'}
{'apple', 'mango', 'orange'}

Prefer .discard() over .remove() unless you specifically want an error when the item is missing.
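
The difference between the two methods is easiest to see with an item that is not in the set. This is a minimal sketch; the try/except is only there to show the error without stopping the program.

```python
fruits = {"apple", "mango", "orange"}

try:
    fruits.remove("grapes")        # "grapes" is not in the set
except KeyError as error:
    print(f"KeyError: {error}")    # KeyError: 'grapes'

fruits.discard("grapes")           # silently does nothing
print(len(fruits))                 # 3
```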


Looping Through a Set


    fruits = {"apple", "banana", "mango"}

    for fruit in fruits:
        print(fruit)

This works the same as looping over a list, but the order is not guaranteed. With strings, the items may print in a different order each time you run the program.
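
If you need the same order on every run, one simple option is to loop over sorted(fruits) instead of the set itself. sorted() returns a new list and leaves the set unchanged.

```python
fruits = {"mango", "apple", "banana"}

for fruit in sorted(fruits):
    print(fruit)
# apple
# banana
# mango
```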


Set Operations — The Real Power

Sets support mathematical set operations. These are extremely useful:

Union — All items from both sets


    set1 = {1, 2, 3, 4}
    set2 = {3, 4, 5, 6}

    union = set1 | set2
    print(union)    # {1, 2, 3, 4, 5, 6} — all items, no duplicates


Intersection — Only items present in BOTH sets


    set1 = {1, 2, 3, 4}
    set2 = {3, 4, 5, 6}

    common = set1 & set2
    print(common)    # {3, 4} — only items in both


Difference — Items in first set but NOT in second


    set1 = {1, 2, 3, 4}
    set2 = {3, 4, 5, 6}

    diff = set1 - set2
    print(diff)    # {1, 2} — in set1 but not in set2

    diff2 = set2 - set1
    print(diff2)   # {5, 6} — in set2 but not in set1
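
Each operator above also has a method form: .union(), .intersection(), and .difference(). One practical difference is that the methods accept any iterable, such as a list, while the operators need a set on both sides. A short sketch, with a list named numbers chosen for illustration:

```python
set1 = {1, 2, 3, 4}
numbers = [3, 4, 5, 6]    # a list, not a set

print(set1.union(numbers) == {1, 2, 3, 4, 5, 6})    # True
print(set1.intersection(numbers) == {3, 4})          # True
print(set1.difference(numbers) == {1, 2})            # True

# The operators require sets on both sides
try:
    set1 | numbers
except TypeError as error:
    print(f"TypeError: {error}")
```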


Real World Use of Set Operations


    students_python = {"Rahul", "Priya", "Gagan", "Amit"}
    students_java = {"Priya", "Amit", "Ravi", "Neha"}

    # Who is enrolled in both courses?
    both = students_python & students_java
    print(f"Both courses: {both}")        # {'Priya', 'Amit'}

    # Who is enrolled in at least one course?
    any_course = students_python | students_java
    print(f"Any course: {any_course}")    # {'Rahul', 'Priya', 'Gagan', 'Amit', 'Ravi', 'Neha'}

    # Who is in Python but NOT Java?
    only_python = students_python - students_java
    print(f"Only Python: {only_python}")  # {'Rahul', 'Gagan'}
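
A closely related question is who is enrolled in exactly one of the two courses. Python has a fourth operator for this, ^ (symmetric difference), which is not covered above; the sketch below reuses the same student sets and sorts the result only to make the printed order predictable.

```python
students_python = {"Rahul", "Priya", "Gagan", "Amit"}
students_java = {"Priya", "Amit", "Ravi", "Neha"}

# In one course or the other, but not both
exactly_one = students_python ^ students_java
print(sorted(exactly_one))    # ['Gagan', 'Neha', 'Rahul', 'Ravi']
```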


Most Common Use Case — Remove Duplicates from List

In real projects, this is one of the most frequent reasons to use a set:


    # User entered tags with duplicates
    tags = ["python", "coding", "python", "beginner", "coding", "python"]
    print(f"Original: {tags}")

    # Remove duplicates
    unique_tags = list(set(tags))
    print(f"Unique: {unique_tags}")

Output (the order of the unique tags may differ):

Original: ['python', 'coding', 'python', 'beginner', 'coding', 'python']
Unique: ['beginner', 'python', 'coding']

Convert to a set to remove duplicates, then convert back to a list if you need indexing. The original order of the list is not preserved.
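
If the order of first appearance matters, a common alternative is dict.fromkeys(), which relies on dictionaries keeping insertion order (Python 3.7+). This swaps the set for a dictionary, so treat it as a companion technique rather than a set feature.

```python
tags = ["python", "coding", "python", "beginner", "coding", "python"]

# Dictionary keys are unique and keep the order in which they were first seen
unique_in_order = list(dict.fromkeys(tags))
print(unique_in_order)    # ['python', 'coding', 'beginner']
```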


All Four Data Structures — Final Comparison


    # List — ordered, changeable, allows duplicates
    my_list = [1, 2, 3, 2, 1]

    # Tuple — ordered, unchangeable, allows duplicates  
    my_tuple = (1, 2, 3, 2, 1)

    # Set — unordered, changeable, NO duplicates
    my_set = {1, 2, 3, 2, 1}       # becomes {1, 2, 3}

    # Dictionary — key-value pairs, changeable, keys unique
    my_dict = {"name": "Gagan", "age": 22}


                List      Tuple     Set               Dictionary
Ordered         Yes       Yes       No                Yes (Python 3.7+)
Duplicates      Yes       Yes       No                Keys: No, Values: Yes
Changeable      Yes       No        Yes               Yes
Access by       Index     Index     N/A (no index)    Key
Syntax          []        ()        {}                {key: val}



Quick Decision Guide

Need to store data that won't change?         → Tuple
Need unique values only?                      → Set
Need to look up data by a name/label?         → Dictionary
Everything else (ordered, changeable list)?   → List

Exercise 🏋️

Solve these three small problems using sets:

Problem 1 — Remove Duplicates

    visitor_log = [101, 203, 101, 305, 203, 101, 407, 305]

Find how many unique visitors came. Print the unique visitor IDs.

Problem 2 — Common Friends

    friends_a = {"Rahul", "Priya", "Gagan", "Amit", "Neha"}
    friends_b = {"Priya", "Ravi", "Gagan", "Kiran", "Amit"}

Find:

  • Friends common to both A and B
  • Friends only A has (not B)
  • All unique friends combined

Problem 3 — Membership Check

Ask the user to enter a username. Check whether it exists in this set of registered users and print an appropriate message:

    registered_users = {"gagan", "rahul", "priya", "amit", "neha"}

