scm002
Comparing two dictionaries with the Python dictdiffer module


http://dictdiffer.readthedocs.io/en/latest/

 

Dictdiffer


Dictdiffer is a helper module that helps you to diff and patch dictionaries.

Installation

Dictdiffer is on PyPI so all you need is:

$ pip install dictdiffer

Usage

Let’s start with an example of how to find the diff between two dictionaries using the diff() function:

from dictdiffer import diff, patch, swap, revert

first = {
    "title": "hello",
    "fork_count": 20,
    "stargazers": ["/users/20", "/users/30"],
    "settings": {
        "assignees": [100, 101, 201],
    }
}

second = {
    "title": "hellooo",
    "fork_count": 20,
    "stargazers": ["/users/20", "/users/30", "/users/40"],
    "settings": {
        "assignees": [100, 101, 202],
    }
}

result = diff(first, second)

assert list(result) == [
    ('change', ['settings', 'assignees', 2], (201, 202)),
    ('add', 'stargazers', [(2, '/users/40')]),
    ('change', 'title', ('hello', 'hellooo'))]

Now we can apply the diff result with the patch() function:

result = diff(first, second)
patched = patch(result, first)

assert patched == second

We can also swap the diff result with the swap() function:

result = diff(first, second)
swapped = swap(result)

assert list(swapped) == [
    ('change', ['settings', 'assignees', 2], (202, 201)),
    ('remove', 'stargazers', [(2, '/users/40')]),
    ('change', 'title', ('hellooo', 'hello'))]

Let’s revert the last changes:

result = diff(first, second)
reverted = revert(result, patched)
assert reverted == first

A tolerance can be used to treat close values as equal. The tolerance parameter only applies to int and float values.

Let’s try with a tolerance of 10% with the values 10 and 10.5:

first = {'a': 10.0}
second = {'a': 10.5}

result = diff(first, second, tolerance=0.1)

assert list(result) == []

Now with a tolerance of 1%:

result = diff(first, second, tolerance=0.01)

assert list(result) == [('change', 'a', (10.0, 10.5))]
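The two tolerance results above are consistent with a relative comparison. The following is a minimal sketch of that idea, assuming the tolerance is scaled by the larger magnitude of the two values; the helper name differs() is hypothetical and this is not dictdiffer's own code.

```python
def differs(first, second, tolerance):
    """Return True when two numbers differ by more than a relative tolerance.

    Assumption: the allowance is tolerance * max(|first|, |second|).
    Illustrative helper only, not part of dictdiffer.
    """
    if first == second:
        return False
    return abs(first - second) > tolerance * max(abs(first), abs(second))


# 10.0 vs 10.5: the gap is 0.5 and the 10% allowance is 1.05, so no change.
print(differs(10.0, 10.5, tolerance=0.1))   # False
# With 1% the allowance shrinks to 0.105, so the values count as changed.
print(differs(10.0, 10.5, tolerance=0.01))  # True
```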

API

Dictdiffer is a helper module to diff and patch dictionaries.

dictdiffer.diff(first, second, node=None, ignore=None, path_limit=None, expand=False, tolerance=2.220446049250313e-16)

Compare two dictionary/list/set objects and return a diff result.

Return an iterator with differences between two objects. The diff items represent addition/deletion/change and the item value is a deep copy from the corresponding source or destination objects.

>>> from dictdiffer import diff
>>> result = diff({'a': 'b'}, {'a': 'c'})
>>> list(result)
[('change', 'a', ('b', 'c'))]

Keys can be excluded from the difference calculation by including them in the ignore argument, which must be of type collections.Container.

>>> list(diff({'a': 1, 'b': 2}, {'a': 3, 'b': 4}, ignore=set(['a'])))
[('change', 'b', (2, 4))]
>>> class IgnoreCase(set):
...     def __contains__(self, key):
...         return set.__contains__(self, str(key).lower())
>>> list(diff({'a': 1, 'b': 2}, {'A': 3, 'b': 4}, ignore=IgnoreCase('a')))
[('change', 'b', (2, 4))]

The difference calculation can be limited to a certain path:

>>> list(diff({}, {'a': {'b': 'c'}}))
[('add', '', [('a', {'b': 'c'})])]
>>> from dictdiffer.utils import PathLimit
>>> list(diff({}, {'a': {'b': 'c'}}, path_limit=PathLimit()))
[('add', '', [('a', {})]), ('add', 'a', [('b', 'c')])]
>>> from dictdiffer.utils import PathLimit
>>> list(diff({}, {'a': {'b': 'c'}}, path_limit=PathLimit([('a',)])))
[('add', '', [('a', {'b': 'c'})])]
>>> from dictdiffer.utils import PathLimit
>>> list(diff({}, {'a': {'b': 'c'}},
...           path_limit=PathLimit([('a', 'b')])))
[('add', '', [('a', {})]), ('add', 'a', [('b', 'c')])]

The patch can be expanded into small units, e.g. when adding multiple values:

>>> list(diff({'fruits': []}, {'fruits': ['apple', 'mango']}))
[('add', 'fruits', [(0, 'apple'), (1, 'mango')])]
>>> list(diff({'fruits': []}, {'fruits': ['apple', 'mango']}, expand=True))
[('add', 'fruits', [(0, 'apple')]), ('add', 'fruits', [(1, 'mango')])]
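
To make the shape of an expanded patch concrete, here is a small pure-Python sketch that splits batched items into one item per value. The helper expand_items() is hypothetical and only mirrors the output shown above; with the library itself you would simply pass expand=True to diff().

```python
def expand_items(diff_items):
    """Split every batched 'add'/'remove' item into one item per value.

    Hypothetical helper for illustration; not part of dictdiffer.
    """
    for action, path, changes in diff_items:
        if action in ('add', 'remove'):
            for change in changes:
                yield action, path, [change]
        else:
            # 'change' items already describe a single value.
            yield action, path, changes


batched = [('add', 'fruits', [(0, 'apple'), (1, 'mango')])]
expanded = list(expand_items(batched))
print(expanded)
# [('add', 'fruits', [(0, 'apple')]), ('add', 'fruits', [(1, 'mango')])]
```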
Parameters:
  • first – The original dictionary, list or set.
  • second – New dictionary, list or set.
  • node – Key for comparison that can be used in dot_lookup().
  • ignore – List of keys that should not be checked.
  • path_limit – List of path limit tuples or dictdiffer.utils.PathLimit object to limit the diff recursion depth.
  • expand – Expand the patches.
  • tolerance – Threshold to consider when comparing two float numbers.

Changed in version 0.3: Added ignore parameter.

Changed in version 0.4: Arguments first and second can now contain a set.

Changed in version 0.5: Added path_limit parameter. Added expand parameter. Added tolerance parameter.

Changed in version 0.7: Diff items are deep copies of their corresponding objects.

dictdiffer.patch(diff_result, destination)

Apply the diff result to the old dictionary.

dictdiffer.swap(diff_result)

Swap the diff result.

It uses the following mapping:

  • remove -> add
  • add -> remove

In addition, it swaps the old and new values of items flagged as change.

>>> from dictdiffer import swap
>>> swapped = swap([('add', 'a.b.c', [('a', 'b'), ('c', 'd')])])
>>> next(swapped)
('remove', 'a.b.c', [('c', 'd'), ('a', 'b')])
>>> swapped = swap([('change', 'a.b.c', ('a', 'b'))])
>>> next(swapped)
('change', 'a.b.c', ('b', 'a'))
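
The mapping can also be written out as a short pure-Python sketch. The generator swap_items() below is hypothetical and not the library's implementation. It reverses the value list when turning add into remove, which matches the example output above. Leaving remove items in their original order is an assumption.

```python
def swap_items(diff_items):
    """Invert diff items: add <-> remove, and flip the values of change.

    Hypothetical sketch for illustration; not dictdiffer's own code.
    """
    for action, path, changes in diff_items:
        if action == 'add':
            # Reversed so that list items get removed from the end first.
            yield 'remove', path, list(reversed(changes))
        elif action == 'remove':
            # Assumption: the order is kept when turning remove into add.
            yield 'add', path, changes
        elif action == 'change':
            old, new = changes
            yield 'change', path, (new, old)
        else:
            raise ValueError('unknown diff action: %r' % (action,))


print(next(swap_items([('add', 'a.b.c', [('a', 'b'), ('c', 'd')])])))
# ('remove', 'a.b.c', [('c', 'd'), ('a', 'b')])
print(next(swap_items([('change', 'a.b.c', ('a', 'b'))])))
# ('change', 'a.b.c', ('b', 'a'))
```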

dictdiffer.revert(diff_result, destination)

Calls the swap function to revert a patched dictionary object.

Usage example:

>>> from dictdiffer import diff, revert
>>> first = {'a': 'b'}
>>> second = {'a': 'c'}
>>> revert(diff(first, second), second)
{'a': 'b'}

dictdiffer.dot_lookup(source, lookup, parent=False)

Allows you to reach dictionary items with a string or list lookup.

Recursively finds a value by a lookup key split on '.'.

>>> from dictdiffer.utils import dot_lookup
>>> dot_lookup({'a': {'b': 'hello'}}, 'a.b')
'hello'

If the parent argument is True, the parent node of the matched object is returned.

>>> dot_lookup({'a': {'b': 'hello'}}, 'a.b', parent=True)
{'b': 'hello'}

If the lookup is an empty value, the whole dictionary object is returned.

>>> dot_lookup({'a': {'b': 'hello'}}, '')
{'a': {'b': 'hello'}}
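
The lookup behaviour described above can be sketched in a few lines of plain Python. The function simple_dot_lookup() is a hypothetical stand-in written for illustration and may differ from the library in edge cases (for example, how list indices inside a dotted string are handled).

```python
def simple_dot_lookup(source, lookup, parent=False):
    """Follow a dotted string or a list of keys into a nested structure.

    Hypothetical sketch for illustration; not dictdiffer's dot_lookup.
    """
    if not lookup:
        # An empty lookup addresses the whole object.
        return source
    keys = lookup.split('.') if isinstance(lookup, str) else list(lookup)
    if parent:
        # Stop one level early to return the containing node.
        keys = keys[:-1]
    value = source
    for key in keys:
        value = value[key]
    return value


data = {'a': {'b': 'hello'}, 'settings': {'assignees': [100, 101, 201]}}
print(simple_dot_lookup(data, 'a.b'))                         # hello
print(simple_dot_lookup(data, 'a.b', parent=True))            # {'b': 'hello'}
print(simple_dot_lookup(data, ['settings', 'assignees', 2]))  # 201
```

The list form is what allows a path such as ['settings', 'assignees', 2] from the usage example to address a list index, which a dotted string of keys cannot express without extra conversion.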

Changes

Version 0.6.1 (released 2016-11-22)

  • Changes order of items for REMOVE section of generated patches when swap is called so the list items are removed from the end. (#85)
  • Improves API documentation for ignore argument in diff function. (#79)
  • Executes doctests during PyTest invocation.

Version 0.6.0 (released 2016-06-22)

  • Adds support for comparing NumPy arrays. (#68)
  • Adds support for comparing mutable mappings, sequences and sets from collections.abc module. (#67)
  • Updates package structure, sorts imports and runs doctests.
  • Fixes order in which handled conflicts are unified so that the Merger’s unified_patches can be always applied.

Version 0.5.0 (released 2016-01-04)

  • Adds tolerance parameter used when user wants to treat close values as equal.
  • Adds support for comparing numerical values and NaN. (#54) (#55)

Version 0.4.0 (released 2015-03-11)

  • Adds support for diffing and patching of sets. (#44)
  • New tests for diff on the same lists. (#48)
  • Fix for exception when dict has unicode keys and ignore parameter is provided. (#50)
  • PEP8 improvements.

Version 0.3.0 (released 2014-11-05)

  • Adds ignore argument to diff function that allows skipping check on specified keys. (#34 #35)
  • Fix for diffing of dict or list subclasses. (#37)
  • Better instance checking of diffing objects. (#39)

Version 0.2.0 (released 2014-09-29)

  • Fix for empty list instructions. (#30)
  • Regression test for empty list instructions.

Version 0.1.0 (released 2014-09-01)

  • Fix for list removal issues during patching caused by wrong iteration. (#10)
  • Fix for issues with multiple value types for the same key. (#10)
  • Fix for issues with strings handled as iterables. (#6)
  • Fix for integer keys. (#12)
  • Regression test for complex dictionaries. (#4)
  • Better testing with Travis CI, tox, pytest, code coverage. (#10)
  • Initial release of documentation on ReadTheDocs. (#21 #24)
  • Support for Python 3. (#15)

Version 0.0.4 (released 2014-01-04)

  • List diff behavior treats lists as lists instead of sets. (#3)
  • Objects of different types are now flagged as changed.
  • Swap function refactored.

Version 0.0.3 (released 2013-05-26)

  • Initial public release on PyPI.

Contributing

Bug reports, feature requests, and other contributions are welcome. If you find a demonstrable problem that is caused by the code of this library, please:

  1. Search for already reported problems.
  2. Check if the issue has been fixed or is still reproducible on the latest master branch.
  3. Create an issue with a test case.

If you create a feature branch, you can run the tests to ensure everything is operating correctly:

$ ./run-tests.sh

...

Name                  Stmts   Miss  Cover   Missing
---------------------------------------------------
dictdiffer/__init__      88      0   100%
dictdiffer/version        2      0   100%
---------------------------------------------------
TOTAL                    90      0   100%

...

52 passed, 2 skipped in 0.44 seconds

License

Dictdiffer is free software; you can redistribute it and/or modify it under the terms of the MIT License quoted below.

Copyright (C) 2013 Fatih Erikli. Copyright (C) 2013, 2014 CERN.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

In applying this license, CERN does not waive the privileges and immunities granted to it by virtue of its status as an Intergovernmental Organization or submit itself to any jurisdiction.

Authors

Dictdiffer was originally developed by Fatih Erikli. It is now being developed and maintained by the Invenio collaboration. You can contact us at info@inveniosoftware.org.

Contributors:
