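While preprocessing data for an NLP experiment, I assigned one dictionary object to another, and during debugging the data looked wrong. After some reading, it turned out that assignment in Python has subtleties of its own, and that beyond plain assignment there are two further options: shallow copy and deep copy.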
Assignment
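Python is a thoroughly object-oriented language. In fact, every piece of data in Python is an object (int and float are no exception).
Objects are either mutable or immutable:
- immutable objects, such as int, float, str and tuple, cannot be changed after they are created;
- mutable objects, such as dict and list, can be changed in place.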
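The distinction can be observed directly with identity checks. The following is a minimal sketch using only built-ins; the variable names are illustrative and not part of the original experiment.

```python
# Mutable: a list can be changed in place, and it remains the same object.
nums = [1, 2, 3]
same_list = nums
nums.append(4)
list_unchanged = nums is same_list  # True: same object, new contents

# Immutable: an int cannot be changed in place; += binds the name to another object.
n = 300
old_n = n
n += 1
int_rebound = n is not old_n  # True: n now names a different int object

# Trying to modify an immutable object in place raises TypeError.
t = (1, 2, 3)
try:
    t[0] = 99
    tuple_mutated = True
except TypeError:
    tuple_mutated = False

print(list_unchanged, int_rebound, tuple_mutated)  # True True False
```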
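n = 300
For an immutable object, such as the int bound to n above, assignment behaves in practice like value assignment in C++ or Java: the object can never change, so it does not matter how many names refer to it.
For a mutable object, however, assignment works like passing by address in C++. After a = dict(); b = a, the names a and b are simply two names for the same object, and a change made through either name is visible through the other.
In short, whenever you assign, be clear about whether the object involved is immutable or mutable; otherwise subtle bugs will follow.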
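The aliasing effect is easy to reproduce. Below is a small sketch with a made-up dictionary (the keys "tokens" and "lang" are assumptions for illustration, not the data from the original experiment).

```python
# Assignment binds a second name to the same dict; nothing is copied.
a = {"tokens": ["hello", "world"]}
b = a
b["lang"] = "en"          # modify the dict through the name b

print(a is b)             # True
print(a)                  # {'tokens': ['hello', 'world'], 'lang': 'en'}

# Rebinding b to a new dict does not affect the object that a refers to.
b = {"tokens": []}
print(a is b)             # False
print(a)                  # {'tokens': ['hello', 'world'], 'lang': 'en'}
```

Note the difference between modifying an object through a name (visible through every alias) and rebinding the name itself (which affects only that name).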
Shallow Copies
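A shallow copy is usually made with b = a.copy() (or with copy.copy(a) from the standard library copy module).
[Figure: diagram of a shallow copy]
A shallow copy of a mutable object creates a new outer object, but the child objects inside it are not copied: the new container holds references to the same children as the original. This only matters for mutable children, because immutable children cannot be changed through either reference.
Code example: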
>>> xs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> ys = list(xs) # Make a shallow copy
>>> xs
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> ys
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> xs.append(['new sublist'])
>>> xs
[[1, 2, 3], [4, 5, 6], [7, 8, 9], ['new sublist']]
>>> ys
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> xs[1][0] = 'X'
>>> xs
[[1, 2, 3], ['X', 5, 6], [7, 8, 9], ['new sublist']]
>>> ys
[[1, 2, 3], ['X', 5, 6], [7, 8, 9]]
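The same behaviour applies to dictionaries, which is the case that caused the original confusion. Here is a hedged sketch with a hypothetical dict (the keys "counts" and "name" are invented for illustration).

```python
vocab = {"counts": [3, 1], "name": "train"}
shallow = vocab.copy()            # new outer dict, same child objects

shallow["name"] = "dev"           # rebinding a top-level key affects only the copy
shallow["counts"].append(7)       # mutating a shared child is visible through both

print(vocab)                      # {'counts': [3, 1, 7], 'name': 'train'}
print(shallow)                    # {'counts': [3, 1, 7], 'name': 'dev'}
print(vocab["counts"] is shallow["counts"])  # True
```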
Deep Copies
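A deep copy is usually made with b = copy.deepcopy(a), after import copy.
[Figure: diagram of a deep copy]
There is not much more to say here: a deep copy recursively copies the object and all of its children, so the result is a completely new object that is independent of the original.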
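For comparison with the shallow copy case, here is a minimal sketch that applies copy.deepcopy to a nested list. The name zs is chosen for illustration only.

```python
import copy

xs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
zs = copy.deepcopy(xs)     # copies the outer list and every sublist

xs[1][0] = "X"             # modify a child of the original

print(xs)                  # [[1, 2, 3], ['X', 5, 6], [7, 8, 9]]
print(zs)                  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(xs[1] is zs[1])      # False: the children are separate objects
```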
Copy Arbitrary Python Objects
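For an arbitrary Python object, shallow and deep copies behave just as they do for a dict: a shallow copy shares the attribute values with the original, while a deep copy duplicates them recursively.
Code example: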
import copy

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x!r}, {self.y!r})'
>>> a = Point(23, 42)
>>> b = copy.copy(a)
>>> a
Point(23, 42)
>>> b
Point(23, 42)
>>> a is b
False
class Rectangle:
    def __init__(self, topleft, bottomright):
        self.topleft = topleft
        self.bottomright = bottomright

    def __repr__(self):
        return (f'Rectangle({self.topleft!r}, '
                f'{self.bottomright!r})')
rect = Rectangle(Point(0, 1), Point(5, 6))
srect = copy.copy(rect)
>>> rect
Rectangle(Point(0, 1), Point(5, 6))
>>> srect
Rectangle(Point(0, 1), Point(5, 6))
>>> rect is srect
False
>>> rect.topleft.x = 999
>>> rect
Rectangle(Point(999, 1), Point(5, 6))
>>> srect
Rectangle(Point(999, 1), Point(5, 6))
>>> drect = copy.deepcopy(srect)
>>> drect.topleft.x = 222
>>> drect
Rectangle(Point(222, 1), Point(5, 6))
>>> rect
Rectangle(Point(999, 1), Point(5, 6))
>>> srect
Rectangle(Point(999, 1), Point(5, 6))
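The sharing shown above can also be checked directly with identity comparisons instead of reading the printed output. The sketch below redefines simplified versions of Point and Rectangle (without __repr__) so that it runs on its own; the variable names shares_child_shallow and shares_child_deep are illustrative.

```python
import copy

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Rectangle:
    def __init__(self, topleft, bottomright):
        self.topleft = topleft
        self.bottomright = bottomright

rect = Rectangle(Point(0, 1), Point(5, 6))
srect = copy.copy(rect)        # new Rectangle, same Point children
drect = copy.deepcopy(rect)    # new Rectangle, new Point children

shares_child_shallow = srect.topleft is rect.topleft   # True
shares_child_deep = drect.topleft is rect.topleft      # False
print(shares_child_shallow, shares_child_deep)         # True False

# As with a dict, the shallow copy has its own attribute dict,
# but the values stored in it are the same objects.
print(srect.__dict__ is rect.__dict__)                 # False
```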