Shallow vs Deep Copying of Python Objects


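While preprocessing data for an NLP experiment, I assigned one dict object to another and noticed during debugging that the data looked wrong. After some reading, it turned out that assignment in Python has subtleties of its own, and that beyond plain assignment there are two further options: shallow copy and deep copy.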

Assignment

Python is a thoroughly object-oriented language: every piece of data in Python is an object (int and float are no exception).

Objects fall into two kinds, mutable and immutable:

  • immutable objects, such as int and float, cannot be changed after they are created;
  • mutable objects, such as dict and list, can be changed in place.
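The distinction can be observed directly through object identity. The following is a minimal sketch; the variable names are illustrative and do not come from the original experiment.

```python
# Immutable: "changing" an int rebinds the name to a different object.
n = 300
original = n          # keep a second reference to the object 300
n += 1
print(n is original)  # False: n now names a new int object
print(original)       # 300: the old object itself was never modified

# Mutable: a list is modified in place and keeps its identity.
items = [1, 2]
list_id = id(items)
items.append(3)
print(id(items) == list_id)  # True: still the same list object
print(items)                 # [1, 2, 3]
```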

[Figure: Variable Assignment, n = 300]

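For immutable objects, assignment behaves much like value assignment in C++ or Java: the object can never be modified in place, so sharing it between names is not observable, and there is little more to say.

For mutable objects, however, assignment is what C++ programmers would call passing by address. For example, after a = dict(); b = a, the names a and b are just two names for the same object, and any in-place change made through one name is visible through the other.

In short, before relying on an assignment, be clear about whether it binds an immutable or a mutable object; otherwise it will cause problems.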
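The aliasing described above can be sketched as follows. The key "token" is a made-up example, not taken from the original data.

```python
a = dict()
b = a              # no copy is made: b is another name for the same dict
b["token"] = 1     # in-place change through b ...
print(a)           # {'token': 1}  ... is visible through a
print(a is b)      # True

# Rebinding b to a new object does not affect a.
b = {"other": 2}
print(a)           # {'token': 1}
print(a is b)      # False
```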

Shallow Copies

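A shallow copy is usually made with b = a.copy() (or copy.copy(a) for arbitrary objects).

[Figure: shallow copy, b = a.copy()]

A shallow copy of a mutable object creates a new container, but the child objects inside it are not copied: the new container still refers to the same children as the original. (This only matters for mutable children; immutable children cannot be changed in place, so sharing them is harmless.)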

Code example:

>>> xs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> ys = list(xs)  # Make a shallow copy
>>> xs
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> ys
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> xs.append(['new sublist'])
>>> xs
[[1, 2, 3], [4, 5, 6], [7, 8, 9], ['new sublist']]
>>> ys
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> xs[1][0] = 'X'
>>> xs
[[1, 2, 3], ['X', 5, 6], [7, 8, 9], ['new sublist']]
>>> ys
[[1, 2, 3], ['X', 5, 6], [7, 8, 9]]
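The same behaviour applies to dicts, which is the case that caused the problem in the preprocessing step. A minimal sketch, assuming a hypothetical nested dict named vocab:

```python
# Hypothetical nested dict; the keys are illustrative only.
vocab = {"counts": {"the": 2}, "size": 1}
shallow = vocab.copy()

shallow["size"] = 99             # rebinds a top-level key in the copy only
shallow["counts"]["the"] = 100   # mutates the nested dict shared by both

print(vocab)    # {'counts': {'the': 100}, 'size': 1}
print(shallow)  # {'counts': {'the': 100}, 'size': 99}
print(vocab["counts"] is shallow["counts"])  # True
```

Replacing a top-level entry stays local to the copy, while changing a nested mutable value leaks back into the original.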

Deep Copies

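A deep copy is usually made with b = copy.deepcopy(a).

[Figure: deep copy, b = copy.deepcopy(a)]

There is not much to add here: a deep copy recursively copies the object together with all of its children, so changes to the copy never affect the original, and vice versa.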
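For comparison with the shallow-copy session above, here is a short sketch of the same nested list copied with copy.deepcopy; the name zs is chosen here for illustration.

```python
import copy

xs = [[1, 2, 3], [4, 5, 6]]
zs = copy.deepcopy(xs)   # copies the outer list and every inner list

xs[1][0] = "X"           # mutate a child of the original

print(xs)                # [[1, 2, 3], ['X', 5, 6]]
print(zs)                # [[1, 2, 3], [4, 5, 6]]
print(xs[0] is zs[0])    # False: the children are separate objects too
```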

Copy Arbitrary Python Objects

Shallow and deep copies of an arbitrary Python object behave just as they do for a dict: the instance's attributes play the role of the dict's entries.

Code example:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x!r}, {self.y!r})'

>>> import copy
>>> a = Point(23, 42)
>>> b = copy.copy(a)

>>> a
Point(23, 42)
>>> b
Point(23, 42)
>>> a is b
False

class Rectangle:
    def __init__(self, topleft, bottomright):
        self.topleft = topleft
        self.bottomright = bottomright

    def __repr__(self):
        return (f'Rectangle({self.topleft!r}, '
                f'{self.bottomright!r})')
    
>>> rect = Rectangle(Point(0, 1), Point(5, 6))
>>> srect = copy.copy(rect)

>>> rect
Rectangle(Point(0, 1), Point(5, 6))
>>> srect
Rectangle(Point(0, 1), Point(5, 6))
>>> rect is srect
False

>>> rect.topleft.x = 999
>>> rect
Rectangle(Point(999, 1), Point(5, 6))
>>> srect
Rectangle(Point(999, 1), Point(5, 6))

>>> drect = copy.deepcopy(srect)
>>> drect.topleft.x = 222
>>> drect
Rectangle(Point(222, 1), Point(5, 6))
>>> rect
Rectangle(Point(999, 1), Point(5, 6))
>>> srect
Rectangle(Point(999, 1), Point(5, 6))

References

  1. Variables in Python
  2. Shallow vs Deep Copying of Python Objects

Author: CarlYoung
Copyright: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit CarlYoung when reposting.