What is a Hashtable/Hashmap?

A hashtable is a data structure that with a collection of key-value pairs, where each key maps to a value, and the keys must be unique and hashable.

  • In Python there is a built in hashtable known as a dictionary.

The primary purpose of a hashtable is to provide efficient lookup, insertion, and deletion operations. When an element is to be inserted into the hashtable, a hash function is used to map the key to a specific index in the underlying array that is used to store the key-value pairs. The value is then stored at that index. When searching for a value, the hash function is used again to find the index where the value is stored.

The key advantage of a hashtable over other data structures like arrays and linked lists is its average-case time complexity for lookup, insertion, and deletion operations.

  • The typical time complexity of a hashtable is O(1).

What is Hashing and Collision?

Hashing is the process of mapping a given key to a value in a hash table or hashmap, using a hash function. The hash function takes the key as input and produces a hash value or hash code, which is then used to determine the index in the underlying array where the value is stored. The purpose of hashing is to provide a quick and efficient way to access data, by eliminating the need to search through an entire data structure to find a value.

However, it is possible for two different keys to map to the same hash value, resulting in a collision. When a collision occurs, there are different ways to resolve it, depending on the collision resolution strategy used.

Python's dictionary implementation is optimized to handle collisions efficiently, and the performance of the dictionary is generally very good, even in the presence of collisions. However, if the number of collisions is very high, the performance of the dictionary can degrade, so it is important to choose a good hash function that minimizes collisions when designing a Python dictionary.

What is a Set?

my_set = set([1, 2, 3, 2, 1])
print(my_set)  

# What do you notice in the output?
# - Only one of each number is printed (there is only one 1, only one 2, etc.)
# - It is printed in order, from least to greatest

# Why do you think Sets are in the same tech talk as Hashmaps/Hashtables?
# - Sets are also a sort of array, which Hashmaps/Hashtables are
# - The two are very similar, except sets don't have as many features as Hashmaps or Hashtables do
{1, 2, 3}
lover_album = {
    "title": "Lover",
    "artist": "Taylor Swift",
    "year": 2019,
    "genre": ["Pop", "Synth-pop"],
    "tracks": {
        1: "I Forgot That You Existed",
        2: "Cruel Summer",
        3: "Lover",
        4: "The Man",
        5: "The Archer",
        6: "I Think He Knows",
        7: "Miss Americana & The Heartbreak Prince",
        8: "Paper Rings",
        9: "Cornelia Street",
        10: "Death By A Thousand Cuts",
        11: "London Boy",
        12: "Soon You'll Get Better (feat. Dixie Chicks)",
        13: "False God",
        14: "You Need To Calm Down",
        15: "Afterglow",
        16: "Me! (feat. Brendon Urie of Panic! At The Disco)",
        17: "It's Nice To Have A Friend",
        18: "Daylight"
    }
}

# Printing the dictionary
print(lover_album)

# What data structures do you see?
# Dictionaries (main structure, tracks)
# Lists (Artist, and Genre)
{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}}
print(lover_album.get('tracks'))
# or
print(lover_album['tracks'])

# Again, could be printed in a better way
{1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}
{1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}
print(lover_album.get('tracks')[4])
# or
print(lover_album['tracks'][4])
The Man
The Man
lover_album["producer"] = ['Taylor Swift', 'Jack Antonoff', 'Joel Little', 'Louis Bell', 'Frank Dukes']
# Printing the dictionary
print(lover_album)

# What can you change to make sure there are no duplicate producers?
# make it into a set
print(set(lover_album["producer"]))
{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight', 19: 'All Of The Girls You Loved Before'}, 'producer': ['Taylor Swift', 'Jack Antonoff', 'Joel Little', 'Louis Bell', 'Frank Dukes']}
{'Jack Antonoff', 'Louis Bell', 'Joel Little', 'Taylor Swift', 'Frank Dukes'}
import pprint
# Adding a 19th song to an album 4 years later 
lover_album["tracks"].update({19: "All Of The Girls You Loved Before"})
# Printing the dictionary
print(lover_album)

# can use pprint
pprnt = pprint.PrettyPrinter(width=41, compact=True)
pprnt.pprint(lover_album)

# How would add an additional genre to the dictionary, like electropop? 
# lover_album["genre"].update({"Electropop"})
# Don't need to add a number for an order, because it is a list
{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight', 19: 'All Of The Girls You Loved Before'}, 'producer': ['Taylor Swift', 'Jack Antonoff', 'Joel Little', 'Louis Bell', 'Frank Dukes']}
{'artist': 'Taylor Swift',
 'genre': ['Pop', 'Synth-pop'],
 'producer': ['Taylor Swift',
              'Jack Antonoff',
              'Joel Little',
              'Louis Bell',
              'Frank Dukes'],
 'title': 'Lover',
 'tracks': {1: 'I Forgot That You '
               'Existed',
            2: 'Cruel Summer',
            3: 'Lover',
            4: 'The Man',
            5: 'The Archer',
            6: 'I Think He Knows',
            7: 'Miss Americana & The '
               'Heartbreak Prince',
            8: 'Paper Rings',
            9: 'Cornelia Street',
            10: 'Death By A Thousand '
                'Cuts',
            11: 'London Boy',
            12: "Soon You'll Get Better "
                '(feat. Dixie Chicks)',
            13: 'False God',
            14: 'You Need To Calm Down',
            15: 'Afterglow',
            16: 'Me! (feat. Brendon '
                'Urie of Panic! At The '
                'Disco)',
            17: "It's Nice To Have A "
                'Friend',
            18: 'Daylight',
            19: 'All Of The Girls You '
                'Loved Before'},
 'year': 2019}
import pprint

# Print lover_album in more readable format
for k,v in lover_album.items(): # iterate using a for loop for key and value
    print(str(k) + ": " + str(v))

print('-----------------------------------------------------')
print("Tracks:")

# Write your own code to print tracks in readable format
for track in lover_album["tracks"].values():
    print(track)
title: Lover
artist: Taylor Swift
year: 2019
genre: ['Pop', 'Synth-pop']
tracks: {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight', 19: 'All Of The Girls You Loved Before'}
producer: ['Taylor Swift', 'Jack Antonoff', 'Joel Little', 'Louis Bell', 'Frank Dukes']
-----------------------------------------------------
Tracks:
I Forgot That You Existed
Cruel Summer
Lover
The Man
The Archer
I Think He Knows
Miss Americana & The Heartbreak Prince
Paper Rings
Cornelia Street
Death By A Thousand Cuts
London Boy
Soon You'll Get Better (feat. Dixie Chicks)
False God
You Need To Calm Down
Afterglow
Me! (feat. Brendon Urie of Panic! At The Disco)
It's Nice To Have A Friend
Daylight
All Of The Girls You Loved Before
import random

def search():
    tracks = lover_album["tracks"]
    # dictTracks = lover_album.tracks
    randomTrackNum = random.choice(str(tracks))
    search = input("What would you like to know about the album?")
    if lover_album.get(search.lower()) == None:
        print("Invalid Search")
    if lover_album.get(search.lower()) == "tracks":
        print(lover_album.tracks)
    if lover_album.get(search.lower()) == "random tracks":
        print(randomTrackNum) #elp it doestnworkk AAAAAAAAAAAAAAAAAAAAAAAAa
    else:
        print(lover_album.get(search.lower()))

search()

# This is a very basic code segment, how can you improve upon this code?
# - You could have the search also pull up tracks (added above)
# - You could also create a random list of tracks, much like making a randomized playlist
{1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}

Hacks

  • Answer ALL questions in the code segments
  • Create a diagram or comparison illustration (Canva).
    • What are the pro and cons of using this data structure?
    • Dictionary vs List
  • Expand upon the code given to you, possible improvements in comments
  • Build your own album showing features of a python dictionary

  • For Mr. Yeung's class: Justify your favorite Taylor Swift song, answer may effect seed

import pprint

insert_name_here_album = {
    "title": "cheetos",
    "artist": "snail",
    "year": 2004,
    "genre": ["hlep", "pelase"],
    "tracks": {
        1: "asdjoasipko[fpl[as]]",
        2: "asdihajoipkof",
        3: "adsugyhu9i09-o-",
        4: "adsyaihsuoijdkoa",
        5: "7ety8quwi9opf",
        6: "adshijaosdipko",
        7: "iadsuasjoifko",
        8: "asdihuasijdkop",
        9: "adyiuop",
        10: "asdugyihuijko",
        11: "adshiasuoji",
        12: "asdsiashudjio",
        13: "asdksfopla[f]",
        14: "Yoasdjokpf",
        15: "sdjoakf",
        16: "iasdkolpf",
        17: "iasdsjaoidpko",
        18: "fytgyhujis"
    }
}
pprnt = pprint.PrettyPrinter(width=41, compact=True)
pprnt.pprint(insert_name_here_album)
print('---------------------------------------------------')
pprnt.pprint(insert_name_here_album['tracks'][4])
{'artist': 'snail',
 'genre': ['hlep', 'pelase'],
 'title': 'cheetos',
 'tracks': {1: 'asdjoasipko[fpl[as]]',
            2: 'asdihajoipkof',
            3: 'adsugyhu9i09-o-',
            4: 'adsyaihsuoijdkoa',
            5: '7ety8quwi9opf',
            6: 'adshijaosdipko',
            7: 'iadsuasjoifko',
            8: 'asdihuasijdkop',
            9: 'adyiuop',
            10: 'asdugyihuijko',
            11: 'adshiasuoji',
            12: 'asdsiashudjio',
            13: 'asdksfopla[f]',
            14: 'Yoasdjokpf',
            15: 'sdjoakf',
            16: 'iasdkolpf',
            17: 'iasdsjaoidpko',
            18: 'fytgyhujis'},
 'year': 2004}
---------------------------------------------------
'adsyaihsuoijdkoa'