Difference makes the DIFFERENCE
For further reading or to go in depth, refer to the official Python language reference: https://docs.python.org/3/reference/
Also refer to the Padhai-One Github repo for updates & other notebooks
Dictionaries are mutable data objects that hold a mapping between keys and values. Keys are similar to set elements in that they are hashed and stored. Each key has a value (object) associated with it. From Python 3.7 onwards, dictionaries preserve insertion order; in older versions the ordering was not guaranteed.
Keys have to be hashable objects, which in practice means immutable objects like int, float, string or tuple (a tuple only if all of its elements are hashable); values can be any mutable or immutable object, even lists. This is because only keys are hashed, not values.
A dictionary can be created in 3 ways:
`{}` curly brackets
`dict()` function, which takes any sequence type or iterable object (like a list, tuple or set) whose elements are pairs of objects
`fromkeys()` method for initializing a dictionary from a collection of keys
A dictionary element has two parts, represented as `{key: value}`. The key and its associated value are separated by `:`.
Keys must be unique, while values need not be.
# empty dictionary
a = {}
b = dict()
# initialized dictionary
c = {"a":1,"b":2, "c":3, "d":4}
d = dict([ (1,"a"), (2,"b") ]) # create dict from sequence of paired elements
e = c # not a copy: `e` and `c` refer to the same dict object
# Print
print("a = ",a, "\nb = ",b, "\nc = ",c, "\nd = ",d, "\ne = ",e )
As one can see, a list whose elements are tuples of two items is converted into a dictionary.
For the `dict()` function, the elements must be pairs, or else it will raise an error. Check the cases below.
a = [1,2,3,4]
try:
    out = dict(a) # raises TypeError: an int is not a pair
except Exception as error:
    print("`a` conversion error:", error)
b = (1,2)
try:
    out = dict(b) # raises TypeError: an int is not a pair
except Exception as error:
    print("`b` conversion error:", error)
c = {1,2,3}
try:
    out = dict(c) # raises TypeError: an int is not a pair
except Exception as error:
    print("`c` conversion error:", error)
d = "hello World"
try:
    out = dict(d) # raises ValueError: each character has length 1, not 2
except Exception as error:
    print("`d` conversion error:", error)
It is mandatory for sequences and sets passed to `dict()` to have their elements grouped as pairs of objects, which form the key and corresponding value in the dictionary.
The elements can be paired using data objects like list, tuple or set; each nested object should have exactly two elements. Note that a set has no defined order, so with a set pair it is not guaranteed which element becomes the key.
a = [[1,2], [2,3], [4,5]]
b = dict(a)
c = [{1,2}, [2,3], (4,5)]
d = dict(c)
e = [['hello','world']]
f = dict(e)
print("b =", b)
print("d =", d)
print("f =", f)
g = [ [1,2,3], [4,5,6] ]
try:
    h = dict(g) # raises ValueError: each element has 3 items, not 2
except Exception as error:
    print("Error at g:", error)
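The pairing requirement above is often met by combining two parallel lists with `zip()`. The following is a minimal sketch; the lists `names` and `counts` are made-up sample data, not part of the original examples.

```python
# Hypothetical sample data: two parallel lists of equal length.
names = ["ant", "bear", "cat"]
counts = [1, 2, 3]

# zip() yields (name, count) tuples, each with exactly two elements,
# which is the shape dict() expects.
paired = dict(zip(names, counts))
print("paired =", paired)
```

If the two lists differ in length, `zip()` stops at the shorter one, so the extra elements are silently dropped.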
Dictionary keys are hashed, thus they must be hashable (immutable) objects, whereas values can be any mutable or immutable object.
a = {
(1,2): "tuple key",
"test": "string key",
frozenset((1,2)): "Frozen set key",
1: "int key",
1.222: "float key",
1+2j: "complex key",
}
print("a =", a)
try:
    b = { [1,2]: "list key" } # list is mutable, so it is unhashable
except Exception as error:
    print("Error at b:", error)
try:
    c = { {1,2}: "set key" } # set is mutable, so it is unhashable
except Exception as error:
    print("Error at c:", error)
a = {
"tuple": (1,2),
"string": "test",
"frozenset": frozenset((1,2)),
"int": 1,
"float": 1.222,
"complex number": 1+2j,
"list": [1,2,3],
"set": {1,2,3},
}
print("a =", a)
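Whether an object can serve as a key comes down to whether `hash()` accepts it. The sketch below wraps that check in a small helper; the name `is_hashable` is only an illustrative choice, not a built-in.

```python
def is_hashable(obj):
    """Return True if obj can be hashed, and so used as a dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print("tuple of ints:", is_hashable((1, 2)))
print("list:", is_hashable([1, 2]))
# A tuple is only hashable if everything inside it is hashable too.
print("tuple holding a list:", is_hashable((1, [2, 3])))
print("frozenset:", is_hashable(frozenset((1, 2))))
```

Note the third case: immutability of the outer tuple is not enough when it contains a mutable list.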
As the keys of a dictionary are hashed, there cannot be duplicate keys in a dict. If a key is repeated, its value is updated to that of the latest occurrence of the key.
z = {
"a": 1, "b": 1,
"a": 2, "c": 2,
"a": 3, "d": 3,
}
print("z =", z)
One can see that the last occurrence of key `a` had the value `3`, which is retained.
One can also see that the values of `a` and `d` are the same (`3`), thus values can be duplicated.
There could be cases where one needs just the keys in a dict, with the values to be created at a later point.
But a dictionary key cannot exist without a value; in such cases one can initialize the value to None.
a = {"a": None, "b":None, "c": None}
print(a)
Or there could be cases where we want to initialize all the values to some constant value or empty object and modify them at a later point in the code.
This pattern is frequently used in many applications, so there is a dedicated method called `dict.fromkeys()`.
It takes two arguments: first an iterable (a sequence or set) of keys, and second an optional value to which every key is initialized; if it is not given, `None` is used as the default.
a = [1,2,3,4,5,6,7]
b = dict.fromkeys(a) # values will be none
print("b =", b)
c = ["apple", "banana", "cucumber" ]
d = [1,2,3]
e = dict.fromkeys(c,d)
print("e =", e)
Note that the function did not iterate over the value argument; instead it took the entire list and made it the value for all the keys.
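A consequence worth checking is that `fromkeys()` assigns the very same object to every key, so with a mutable value such as a list, a change through one key shows up under all keys. The sketch below contrasts this with a dict comprehension, which is one possible way to give each key its own list.

```python
keys = ["apple", "banana", "cucumber"]

# Every key points to the SAME list object.
shared = dict.fromkeys(keys, [])
shared["apple"].append(1)
print("shared =", shared)

# A dict comprehension evaluates [] once per key, so each key gets its own list.
separate = {key: [] for key in keys}
separate["apple"].append(1)
print("separate =", separate)
```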
Dictionary elements cannot be indexed by position like a list or tuple. Each value has to be accessed using its key. This is because elements are located by hashing the key, not by position.
Subscripting with `[]` square brackets is used for accessing values, with the key as the reference.
z = {"a": 1, "b": 2, "c": 3, "d": 4}
y = z["a"]
x = z ["b"]
print("value at \"a\" :", y)
print("value at \"b\" :", x)
try:
    w = z[1]
except KeyError as error:
    print("KeyError:", error)
When referenced with `1`, the dict does not treat it as an index; it takes it as a key and looks up its hash. Since no such key exists, it raises KeyError.
When one tries to reference a key that is not present using `[]`, an error is raised.
For certain program logic, it might be useful to get the value if present, else silently move to the next step. In such cases one could use `get()`.
It returns the value if the key exists, else it returns `None`.
z = {"a": 1, "b": 2, "c": 3, "d": 4}
try:
    y = z["e"]
except KeyError as error:
    print("KeyError:", error)
y = z.get("e")
print('using get:', y)
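`get()` also accepts an optional second argument that is returned instead of `None` when the key is missing. A short sketch, using `0` as an arbitrary fallback value:

```python
z = {"a": 1, "b": 2}

missing_default = z.get("e")     # key absent, no fallback given
missing_custom = z.get("e", 0)   # key absent, fallback 0 is returned
present = z.get("a", 0)          # key present, fallback is ignored

print(missing_default, missing_custom, present)
print("z is unchanged:", z)      # get() never inserts the key
```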
One can use the `in` operator to check whether a specific key is present in a dictionary.
Only membership of keys can be checked with `in` directly on the dict; it does not check whether a value exists in the dict.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
a_dog = "dog" in a
a_wolf = "wolf" in a
a_1 = 1 in a
print("is dog present:", a_dog)
print("is wolf present:", a_wolf)
print("is 1 present:", a_1)
Iterating with `for` and `in`, and accessing the dict elements to update them, can be done as shown below.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
for i in a:
    a[i] += 10
print("updated a =", a)
One can use `in` for checking if a key exists. But what if one needs to check if a value exists in a dict?
A simple workaround is to use the `values()` method. It returns all the values of a dictionary as a `dict_values` object, which is a view object (i.e. it gives a list-like view).
One can use this with `in` to check whether the value is present in the object returned by `values()`.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
a_vals = a.values()
a_1 = 1 in a_vals
print("is 1 present:", a_1)
print("a_vals = ", a_vals)
Notice that though the object looks like a list, it is actually a `dict_values` object. The object is not subscriptable with an index like a list.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
a_vals = a.values()
print("a_vals before = ", a_vals)
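To see the restriction in action, the sketch below tries to index the view and then converts it to a list, which is one way to get positional access.

```python
a = {"ant": 1, "bear": 2, "cat": 3}
a_vals = a.values()

try:
    first = a_vals[0]            # views do not support indexing
except TypeError as error:
    print("TypeError:", error)
    first = None

as_list = list(a_vals)           # a real list, so indexing works
print("first value:", as_list[0])
```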
Similarly there is the `keys()` method, which returns a `dict_keys` object, which again is a view object.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
a_keys = a.keys()
print("a_keys = ", a_keys)
Previously we converted sequence type data with paired elements into a dictionary. We can also do the inverse and convert a dictionary into a `dict_items` view object, which is equivalent to a collection of (key, value) tuple pairs.
For this one can use the `items()` method, which returns a `dict_items` object, which again is a view object.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
a_items = a.items()
print("a_items = ", a_items)
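Since each element of `items()` is a (key, value) pair, it can be unpacked directly in a `for` loop. A minimal sketch that adds 10 to every value; the loop only reassigns existing keys, so the size of the dict does not change while iterating:

```python
a = {"ant": 1, "bear": 2, "cat": 3}

# Each item is a (key, value) tuple, unpacked into two loop variables.
for key, value in a.items():
    a[key] = value + 10

print("updated a =", a)
```

Adding or removing keys inside such a loop is a different matter and raises a RuntimeError.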
One significance of view objects is that any change or update to the dictionary is reflected in the view object. These objects are references to the actual dictionary.
In Python 2, the `values()`, `items()` and `keys()` methods returned separate list objects that did not preserve the link. In Python 3 this was changed to view objects.
a = {"ant": 1, "bear":2, "cat": 3}
a_vals = a.values()
a_keys = a.keys()
a_items = a.items()
print("Before changing:\n", a_vals, "\n", a_keys, "\n", a_items)
a["ant"] = 10 # change value
a["dog"] = 40 # add new element
print("After changing:\n", a_vals, "\n", a_keys, "\n", a_items)
`keys()`, `values()` and `items()` are useful in many cases involving complex operations with dictionary data; the usage is limited to simple examples here.
One can convert these view objects into other data structures easily with constructors like `list()`, `tuple()` and `set()` and use them further.
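One point to keep in mind when converting: the result is a snapshot, not a live view. The sketch below, with made-up variable names, compares a live `keys()` view with a list made from it before the dict changes.

```python
a = {"ant": 1, "bear": 2}

live_keys = a.keys()           # view: follows later changes to `a`
snapshot = list(a.keys())      # list: frozen at the time of conversion
pairs = tuple(a.items())       # tuple of (key, value) tuples
unique_vals = set(a.values())  # set of values (values must be hashable)

a["cat"] = 3                   # change the dict after converting

print("live view:", list(live_keys))
print("snapshot :", snapshot)
```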
Dictionaries are mutable objects, so one can add elements to an existing object without creating a new object.
For adding a new element one can directly subscript the new key with `[]` and assign the corresponding value.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
a["fox"] = 6
print('a=', a)
An alternative way is to use `update()` to add new elements.
`update()` takes a dictionary or a sequence with paired elements, so it can be used for bulk updates as well.
a = {"ant": 1, "bear":2, "cat": 3}
a.update({'dog': 4})
a.update( [("dog", 4), ("elephant", 5)] )
print(a)
The keys of a dictionary are unique, thus using subscript `[]` or `update()` on existing keys will change the values of those keys.
Subscripting with `[]` will be faster for adding a single element compared to the `update()` method.
But `update()` will be faster and more useful for bulk updates with a large number of elements compared to iterating.
%%timeit -n1 -r10
## Using Subscripting
a = {}
for i in range(100000):
    a["apple"] = 1
%%timeit -n1 -r10
## Using update()
a = {}
for i in range(100000):
    a.update({"apple": 1})
One can see that subscripting `[]` is faster for updating a single element compared to `update()`.
But when a large number of elements has to be updated, `update()` will be more readable and faster.
# variables initialization
a = {}
b = [(i,i*100) for i in range(10)] #list comprehension
print('list = ',b)
print('after converting to dict =', dict(b))
b = [(i,i*100) for i in range(1000)]
%%timeit -n1
## Using Subscripting
for i in range(100000): # repeat 100,000 times
    for (k, v) in b:
        a[k] = v
%%timeit -n1
## Using update()
for i in range(100000): # repeat 100,000 times
    a.update(b)
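It can be seen that `update()` is faster than iterating with subscripting `[]`.
There is a `setdefault()` method which takes a key and a default value, and does one of two things depending on whether the key exists: if the key is present, it returns the existing value and leaves the dict unchanged; if the key is absent, it inserts the key with the default value and returns that default.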
a = {"ant": 1, "bear":2, "cat": 3}
n_val = a.setdefault('cat', 22)
print("If key exists, returned:", n_val)
print("a =", a )
n_val = a.setdefault('wolf', 22)
print("If key does not exist, returned:", n_val)
print("a =", a )
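A common use of this two-way behaviour is grouping: the first time a key is seen it gets an empty list, and on every call the list is returned so an item can be appended to it. A minimal sketch with a made-up list of words, grouped by first letter:

```python
# Hypothetical sample data.
words = ["ant", "apple", "bear", "bat", "cat"]

groups = {}
for word in words:
    # Inserts an empty list for a new letter, then returns the list
    # (new or existing) so the word can be appended in the same line.
    groups.setdefault(word[0], []).append(word)

print("groups =", groups)
```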
If one wants to remove an element, one can use `pop()`, which removes the given key and returns its value.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
a.pop("bear")
print("a =", a)
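Two details of `pop()` are easy to miss: it returns the removed value, and it raises KeyError for a missing key unless a fallback is passed as the second argument. A short sketch:

```python
a = {"ant": 1, "bear": 2, "cat": 3}

removed = a.pop("bear")          # returns the value that was stored
fallback = a.pop("wolf", None)   # missing key, fallback returned instead

try:
    a.pop("wolf")                # missing key and no fallback
    raised = False
except KeyError as error:
    print("KeyError:", error)
    raised = True

print("removed =", removed, "| fallback =", fallback, "| a =", a)
```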
There also exists `popitem()`, which removes and returns the item that was inserted last. Since dictionaries preserve insertion order from Python 3.7, `popitem()` is useful for removing elements on a Last-In-First-Out basis.
This LIFO behaviour is guaranteed from Python 3.7; in older versions `popitem()` removed an arbitrary element from the dict.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
print("Initial a =", a)
a.popitem()
print("1st pop: a =", a)
a.popitem()
print("2nd pop: a =", a)
Though initialized together, the elements are created sequentially. Thus the last entry is created last, so `popitem()` removes it.
When a new element is added, it is the one removed when `popitem()` is called next.
a = {"ant": 1, "bear":2, "cat": 3}
print("Initial a =", a)
a.popitem()
print("1st pop: a =", a)
a.update({"dog": 4, "elephant": 5})
print("New a =", a)
a.popitem()
print("2nd pop: a =", a)
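The examples above discard what `popitem()` returns. As a small sketch, the return value is a (key, value) tuple, and calling it on an empty dict raises KeyError:

```python
a = {"ant": 1, "bear": 2}
a["cat"] = 3                     # inserted last

last = a.popitem()               # returns the removed (key, value) pair
print("popped:", last)

a.clear()
try:
    a.popitem()                  # nothing left to remove
    empty_error = None
except KeyError as error:
    print("KeyError:", error)
    empty_error = error
```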
One can use `del` to remove specific elements referenced by keys, or to remove the entire object.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
del a['ant']
print('After deletion: a =', a)
When used for removing elements, the existing object is modified. As one can see below, the object id remains the same.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
print("Id of `a` before:", id(a))
del a["bear"] # removes the key "bear" and its value from the same object
print('`a` after modification =', a)
print("Id of `a` after:", id(a))
One can use `del` on the object as a whole.
c = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
del c # reference to object is removed
try:
    print(c) # raises NameError since `c` is no longer defined
except NameError as error:
    print("Error:", error)
When using `del` on the entire object, only the binding between the variable and the object is removed, and the variable becomes undefined. The object itself is unchanged and stays in memory as long as another variable still references it.
d = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
print("Id of `d`:", id(d))
e = d
del d # reference to object is removed
print("Id of `e`:", id(e))
print("Object in e:", e) # `e` still references the object
To remove all elements one can use the `clear()` method; this retains the old object and only removes the elements.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
a.clear()
print("a =", a)
At this point a question might come to mind: why should we clear the elements of a dict object and reuse the same object?
Check the case below, where the object in variable `a` is also bound to another variable `b` by plain assignment (both names refer to the same object; no copy is made). We then rebind `a` by assigning a new dict to it.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
b = a # a & b refer to the same object
print("before modifying a: \na =", a, " id= ",id(a) )
print("b =", b, " id= ",id(b))
a = {"Zebra": 26} # a now points to a new object
print("after modifying a: \na =", a, " id= ",id(a) )
print("b =", b, " id= ",id(b))
The object that `a` initially referred to is not destroyed or modified; it still exists in memory and `b` still references it, which might not be desired in some cases.
The link between `a` and `b` is also broken, i.e. the object referenced by `b` is no longer associated with `a`, which could be bad in some cases. Using `clear()` instead empties the shared object, so both names see the change.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
b = a # a & b refers to same object
print("Initially: \na =", a," id= ",id(a))
print(" b =", b," id= ",id(b))
a.clear()
print("clearing a: \na =", a, " id= ",id(a))
print("b =", b, " id= ",id(b))
It may not always be desirable to have the same object in two different variables.
Sometimes one may want to create a copy of a dictionary and modify it independently of the original. In such cases it is better to create two different objects, for which one can use `copy()`, which creates a shallow copy of the object: a new dict whose values are the same objects as in the original.
Alternatively, one can use `dict()`, which creates a new dict from another dict or from an iterable of pairs.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
b = a.copy() # shallow copy (a new dict object)
b.pop("ant")
c = dict(a)
c.pop("cat")
print("a =", a, " id=",id(a))
print("b =", b, " id=",id(b))
print("c =", c, " id=",id(c))
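The copy made by `copy()` (and by `dict()`) is shallow: the new dict is a separate object, but nested mutable values are shared with the original. When fully independent nested data is needed, `copy.deepcopy()` from the standard library is one option. A minimal sketch with a made-up nested dict:

```python
import copy

original = {"fruits": ["apple"], "count": 1}

shallow = original.copy()        # new dict, but the inner list is shared
deep = copy.deepcopy(original)   # new dict AND a new inner list

original["fruits"].append("mango")   # mutate the nested list in place
original["count"] = 2                # rebind a top-level key

print("shallow =", shallow)   # sees the appended fruit, keeps count 1
print("deep    =", deep)      # unaffected by both changes
```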
But one may wonder what the difference is between `copy()` and `dict()`.
The `dict()` function is a general constructor: it accepts another dictionary or any iterable of paired elements and builds a new dict object from it, whereas the `copy()` method directly creates a copy of an existing dict object. This typically makes `copy()` somewhat faster than `dict()`.
%%timeit -n1 -r10
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
for i in range(1000000):
    b = a.copy() # call the method; `a.copy` alone only references it
%%timeit -n1 -r10
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
for i in range(1000000):
    b = dict(a)
One can find the total number of elements in a dict using the `len()` function, which returns the number of elements, i.e. the number of keys present in the dictionary.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
b = { 1: {"A": 11, "B": 22},
2: {"C": 11, "D": 22} } #dict with dict objects
print("Num of elements in a:", len(a))
print("Num of elements in b:", len(b))
It can be seen that for `b` it shows 2. `len()` considers only the objects the dict directly contains; it does not take into account the elements inside those objects. So `b` has two dict elements, and that count is returned.
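If the count of elements inside the nested dicts is what is needed, one possible approach is to add up `len()` of each inner dict. A minimal sketch, assuming every value is itself a dict:

```python
b = {1: {"A": 11, "B": 22},
     2: {"C": 11, "D": 22}}

outer_count = len(b)                                    # top-level keys only
inner_count = sum(len(inner) for inner in b.values())   # keys of nested dicts

print("outer:", outer_count, "inner:", inner_count)
```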
There could be cases where one needs the sum of all keys in a dict.
It may not seem meaningful, but the necessity may arise.
For this one can use the `sum()` function, which iterates over the keys of the dict.
b = { 1: "mango", 2: "apple" }
s = sum(b)
print("sum is ", s)
But the keys must be arithmetically summable objects like float or int; otherwise one will get an error.
a = {"ant": 1, "bear":2, "cat": 3, "dog": 4, "elephant": 5}
try:
    sum(a) # raises TypeError: string keys cannot be added to int
except Exception as error:
    print("Error in a:", error)
b = { (1,2,3): "First", (4,5,6): "Second" } # tuple keys
try:
    sum(b) # raises TypeError: tuple keys cannot be added to int
except Exception as error:
    print("Error in b:", error)
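When the keys are not numeric but the values are, the values can be summed instead by passing the `values()` view to `sum()`. A short sketch:

```python
a = {"ant": 1, "bear": 2, "cat": 3, "dog": 4, "elephant": 5}

# sum() over the dict itself would try to add the string keys;
# summing the values() view adds the numbers stored under them.
total = sum(a.values())
print("sum of values:", total)
```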
Arithmetic operators such as `+`, `-` or `*` cannot be used with dictionaries.
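As a quick check of this, the sketch below tries `+` on two dicts and then combines them with unpacking (`{**a, **b}`), which is one way to merge without an arithmetic operator. The two sample dicts are made up; on a key clash the value from the right-hand dict wins.

```python
a = {"ant": 1, "bear": 2}
b = {"bear": 20, "cat": 3}

try:
    a + b                        # `+` is not defined for dicts
    plus_failed = False
except TypeError as error:
    print("TypeError:", error)
    plus_failed = True

# Unpack both dicts into a new one; later keys override earlier ones.
merged = {**a, **b}
print("merged =", merged)
```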