Darkos February 2016

Python - process only new elements of a dictionary

I have a dictionary with unique keys that I update with new keys based on data from a webpage. I want to process only the new keys, which may appear only after a long time. Here is a piece of code to illustrate:

a = UniqueDict()

while True:

    webpage = update() # returns a list

    for i in webpage:
        title = getTitle(i)
        a[title] = new_value # only adds titles not seen before, because the dictionary has unique keys

        if len(a) > 50:
            a.clear() # empty the dictionary if it gets too big

    # Need a condition here so that only newly added titles are processed
    for element in a.keys():
        process(element)

Is there a way to find only the keys newly added to the dictionary? Most of the time the keys and values will be the same, and I don't want those to be processed again. Thank you.

Answers


pp_ February 2016

You may want to use an OrderedDict:

Ordered dictionaries are just like regular dictionaries but they remember the order that items were inserted. When iterating over an ordered dictionary, the items are returned in the order their keys were first added.
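The answer does not say how the ordering helps, so here is one possible reading, as a minimal sketch: because new keys always land at the end, you can remember how many keys were already processed and read only the tail. The helper keys_added_since and the sample titles are made up for illustration. This assumes keys are never deleted between passes; if the dictionary is cleared, as in the question, the counter has to be reset to 0 as well.

```python
from collections import OrderedDict
from itertools import islice


def keys_added_since(ordered, already_seen):
    """Return the keys inserted after the first `already_seen` keys."""
    return list(islice(ordered, already_seen, None))


titles = OrderedDict()
seen = 0  # number of keys already processed

# First pass: two titles arrive.
for title in ["a", "b"]:
    titles.setdefault(title, "value")  # existing keys keep their position
first_batch = keys_added_since(titles, seen)
seen = len(titles)

# Second pass: "b" is a repeat, "c" is new.
for title in ["b", "c"]:
    titles.setdefault(title, "value")
second_batch = keys_added_since(titles, seen)
seen = len(titles)
```

Here first_batch holds both titles and second_batch holds only the title that was not present before.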


Sebastiaan Mannem February 2016

Another option is to keep the processed keys in a set. You can then find the new keys with set(d.keys()) - set_already_processed, and record each key once it has been processed with set_already_processed.add(key).
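A minimal sketch of that idea, assuming Python 3, where d.keys() already supports set operations. The function name process_new_keys and the sample data are hypothetical, and handled.append stands in for the question's process().

```python
def process_new_keys(d, already_processed, process):
    """Call process() on every key of d that is not in already_processed."""
    new_keys = d.keys() - already_processed  # set difference, returns a set
    for key in new_keys:  # note: sets have no guaranteed order
        process(key)
        already_processed.add(key)
    return new_keys


titles = {"a": 1, "b": 2}
already_processed = set()
handled = []  # records which keys were processed

first = process_new_keys(titles, already_processed, handled.append)

titles["c"] = 3  # one new key, the others are unchanged
second = process_new_keys(titles, already_processed, handled.append)
```

Because the set lives outside the dictionary, it survives a.clear(), so a title that reappears after the dictionary was emptied would not be processed again. Whether that is desirable depends on the use case.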


tdelaney February 2016

Make your own dict that tracks additions:

class NewKeysDict(dict):
    """A dict that tracks keys set through __setitem__ only, i.e.
    d[key] = value, including re-assignment of existing keys.
    reset() clears the tracking so it starts anew. self.new_keys
    is a set holding the tracked keys.
    """
    def __init__(self, *args, **kw):
        super(NewKeysDict, self).__init__(*args, **kw)
        self.new_keys = set()

    def reset(self):
        self.new_keys = set()

    def __setitem__(self, key, value):
        super(NewKeysDict, self).__setitem__(key, value)
        self.new_keys.add(key)


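# Keys passed to the constructor do not go through __setitem__
d = NewKeysDict((i, str(i)) for i in range(10))
d.reset()
print(d.new_keys)  # empty set at this point

# Every assignment is tracked, even though keys 5..9 already exist
for i in range(5, 10):
    d[i] = '{} new'.format(i)

# Process only the tracked keys
for k in d.new_keys:
    print(d[k])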
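The class above records every key assigned with d[key] = value, even when the key already exists. If only keys that were absent before should count, which seems to be what the question asks for, a variant could check membership first. This is a sketch, not part of the original answer; the names OnlyNewKeysDict and pop_new_keys are hypothetical.

```python
class OnlyNewKeysDict(dict):
    """A dict that remembers keys that were absent before d[key] = value."""

    def __init__(self, *args, **kw):
        super().__init__(*args, **kw)
        self.new_keys = set()

    def __setitem__(self, key, value):
        if key not in self:  # record the key only on first insertion
            self.new_keys.add(key)
        super().__setitem__(key, value)

    def pop_new_keys(self):
        """Return the tracked keys and start tracking from scratch."""
        keys, self.new_keys = self.new_keys, set()
        return keys


d = OnlyNewKeysDict({"a": 1})
d["a"] = 2  # existing key, not recorded
d["b"] = 3  # new key, recorded
batch = d.pop_new_keys()
empty = d.pop_new_keys()
```

As with the original class, only item assignment is tracked. Methods such as update() and setdefault() do not go through an overridden __setitem__ on a dict subclass, so keys added that way would be missed.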

Post Status

Asked in February 2016
Viewed 2,650 times
Voted 4
Answered 3 times
