Sort list of dict by key with different types in Python 3

I have a method that groups a list of dicts by a key. I found here that I have to use the groupby function, but that the list has to be sorted first. Here is my method right now:
from itertools import groupby

def group_list_by_key(data, key):
    data.sort(key=lambda x: x[key])
    result = []
    for k, v in groupby(data, key=lambda x: x[key]):
        result.append(list(v))
    return result
This code works only if the key is defined in every dict and all the values are of the same type. However, where I use this method I don't know whether the key is defined everywhere or whether the values share a type. I know that in Python 2.x the sorted function has a cmp parameter that allows a custom sort, but in Python 3.x this is no longer possible. Is there a way to do a custom sort? I am thinking about using the classic comparison with < and also sorting by type name.
So far I have thought about using the get function and casting to string in the sort, like
data.sort(key=lambda x: str(x.get(key)))
...
for k, v in groupby(data, key=lambda x: x.get(key)):
This only copes with string, numeric and None values, not with generic objects, and it breaks easily. For example, if I execute
a = [{'b': 0, 'c': 1}, {'b': '0'}, {'b': 0, 'c': 2}, {'b': 1}, {'c': 3}]
group_list_by_key(a, 'b')
The output is
[[{'b': 0, 'c': 1}], [{'b': '0'}], [{'b': 0, 'c': 2}], [{'b': 1}], [{'c': 3}]]
instead of what I expect (the order of the lists does not matter):
[[{'b': 0, 'c': 1}, {'b': 0, 'c': 2}], [{'b': '0'}], [{'b': 1}], [{'c': 3}]]
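The output above follows from how itertools.groupby works: it only merges adjacent items whose keys compare equal, so the sort and the grouping have to agree on the same key. A minimal sketch of the mismatch, using only the sample data from the question (the variable names are illustrative):

```python
from itertools import groupby

data = [{'b': 0, 'c': 1}, {'b': '0'}, {'b': 0, 'c': 2}, {'b': 1}, {'c': 3}]

# Sorting on the raw values fails in Python 3 because int, str and None
# cannot be ordered against each other.
try:
    sorted(data, key=lambda x: x.get('b'))
    raw_sort_failed = False
except TypeError:
    raw_sort_failed = True

# Sorting on str(...) succeeds, but 0 and '0' share the sort key '0',
# so the stable sort leaves the original order untouched.
by_str = sorted(data, key=lambda x: str(x.get('b')))

# groupby then sees the keys 0, '0', 0, 1, None in that order and starts
# a new group at every change, splitting the two b == 0 dicts.
groups = [list(v) for _, v in groupby(by_str, key=lambda x: x.get('b'))]
print(raw_sort_failed, len(groups))  # True 5
```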
The docs say "Use functools.cmp_to_key() to convert an old-style cmp function to a key function.", so cmp not being available in Python 3 is not really a problem.
– njzk2
Jul 1 at 17:57
You might want to just clean up your data. Rather than trying to compare 0 with '0', you should convert one to the other (e.g. by calling int on them both, perhaps).
– Blckknght
Jul 1 at 18:27
@RafaelC I have added the expected output for the last sample input I wrote.
– Ripper346
Jul 2 at 15:32
2 Answers
You can solve your problem by doing something like this:
from itertools import groupby

data = [{'b': 0, 'c': 1}, {'b': '0'}, {'b': 0, 'c': 2}, {'b': 1}, {'c': 3}]
key = 'b'

def f(x):
    # A missing key maps to -1; any value that is not an int maps to -2.
    ret = x.get(key, -1)
    return ret if type(ret) == int else -2

result = [list(v) for k, v in groupby(sorted(data, key=f), f)]
# result: [[{'b': '0'}], [{'c': 3}], [{'b': 0, 'c': 1}, {'b': 0, 'c': 2}], [{'b': 1}]]
But if you still need a custom comparison function, you can wrap it with functools.cmp_to_key:
import functools
sorted(x, key=functools.cmp_to_key(custom_cmp_function))
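The snippet above leaves custom_cmp_function undefined. As a rough sketch of the contract such a function has to follow (return a negative number, zero, or a positive number), here is one hypothetical comparator; the rule that None sorts before everything else is an assumption made for illustration, not something the answer specifies:

```python
from functools import cmp_to_key

def custom_cmp_function(a, b):
    """Old-style comparator: negative if a < b, zero if equal, positive if a > b."""
    # Hypothetical rule for illustration: None sorts before any other value.
    if a is None and b is None:
        return 0
    if a is None:
        return -1
    if b is None:
        return 1
    # Both values are present: fall back to their normal ordering.
    return (a > b) - (a < b)

values = [3, None, 1, None, 2]
ordered = sorted(values, key=cmp_to_key(custom_cmp_function))
print(ordered)  # [None, None, 1, 2, 3]
```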
For whatever it is worth to you, PEP 8 recommends "Always use a def statement instead of an assignment statement that binds a lambda expression directly to an identifier."
– Robᵩ
Jul 1 at 21:54
Thank you, your answer is partially good, because 0, '0' and None are grouped together, which is wrong.
– Ripper346
Jul 2 at 15:35
@Ripper346 I misunderstood your requirements. I have updated the answer. Can you check whether this is what you wanted?
– Sunitha
Jul 2 at 15:56
@Sunitha I am thinking about using your code with generic classes such as datetime or custom classes, instead of only int, str and None. Do you think your function could handle that? In any case, your answer is correct and also fancier than my solution.
– Ripper346
Jul 2 at 16:22
Yes. You should be able to easily extend the function f as per your need. All you have to do is check the type of ret and return a unique value.
– Sunitha
Jul 2 at 16:33
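One possible way to extend f along those lines, sketched here as an assumption rather than as the answerer's own code, is to return a (type name, value) tuple instead of a hand-picked sentinel per type. The helper name make_key is hypothetical, and the sketch assumes that values of the same type can be ordered against each other:

```python
from itertools import groupby

def make_key(key):
    """Build a key function usable for both the sort and the groupby."""
    def f(x):
        ret = x.get(key)
        # Tagging the value with its type name keeps 0, '0' and None apart
        # and means values are only ever ordered against their own type.
        return (type(ret).__name__, ret)
    return f

data = [{'b': 0, 'c': 1}, {'b': '0'}, {'b': 0, 'c': 2}, {'b': 1}, {'c': 3}]
f = make_key('b')
grouped = [list(v) for _, v in groupby(sorted(data, key=f), key=f)]
print(grouped)
```

With the sample data the type names sort as NoneType, int, str, so the dict without 'b' comes first and the two b == 0 dicts end up in the same group.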
Thanks to @Sunitha and @njzk2 for pointing out the cmp_to_key function; it did exactly what I wanted. My grouping function is now:
from functools import cmp_to_key
from itertools import groupby

def group_list_by_key(data, key):
    def compare_values_types(a, b):
        a = a.get(key)
        b = b.get(key)
        if a.__class__ == b.__class__:
            if a < b:
                return -1
            elif a > b:
                return 1
            else:
                return 0
        else:
            if a.__class__.__name__ < b.__class__.__name__:
                return -1
            elif a.__class__.__name__ > b.__class__.__name__:
                return 1
            else:
                return 0

    data.sort(key=cmp_to_key(compare_values_types))
    return [list(v) for k, v in groupby(data, key=lambda x: x.get(key))]
Calling it on the sample list
a = [{'b': 0, 'c': 1}, {'b': '0'}, {'b': 0, 'c': 2}, {'b': 1}, {'c': 3}]
group_list_by_key(a, 'b')
returns the expected list
[[{'c': 3}], [{'b': 0, 'c': 1}, {'b': 0, 'c': 2}], [{'b': 1}], [{'b': '0'}]]
What I did is compare values of the same type in the classic way; otherwise I simply do a string comparison between the class names (using a.__class__.__name__ instead of type(a).__name__, see this answer).
Thanks to all!
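One caveat worth checking: compare_values_types evaluates a < b whenever both values have the same class, and in Python 3 None < None raises TypeError, so the function can fail when two or more dicts lack the key. A possible variant that tests for equality first is sketched below; the name group_list_by_key_safe is hypothetical, and unlike the function above it returns the groups without sorting the input list in place:

```python
from functools import cmp_to_key
from itertools import groupby

def group_list_by_key_safe(data, key):
    """Hypothetical variant of group_list_by_key that tolerates missing keys."""
    def compare(a, b):
        a, b = a.get(key), b.get(key)
        if type(a) is not type(b):
            # Different types: order by type name, as in the answer above.
            a, b = type(a).__name__, type(b).__name__
        elif a == b:
            # Equal values, including two missing keys (None and None),
            # never reach the < and > operators.
            return 0
        return (a > b) - (a < b)

    ordered = sorted(data, key=cmp_to_key(compare))
    return [list(v) for _, v in groupby(ordered, key=lambda x: x.get(key))]

a = [{'c': 3}, {'b': 0, 'c': 1}, {'b': '0'}, {'c': 4}, {'b': 0, 'c': 2}]
groups = group_list_by_key_safe(a, 'b')
print(groups)
```

Values of the same type that cannot be ordered against each other (two unequal dicts, for instance) would still raise TypeError here.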
Can you provide a sample input and the expected output?
– RafaelC
Jul 1 at 17:37