Python: is there a high-performance but easier-to-read way to upgrade a child object into its parent object in place?

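    class Old(object):
        """History node: keeps the values a mutation overwrote and points to the newer object."""

        def __init__(self):
            self.current = {}    # previous values of keys that were overwritten
            self.new_added = []  # keys that did not exist before the mutation

        def keep_history(self, key):
            if key in self._fields:
                self.current[key] = self._fields[key]
            else:
                self.new_added.append(key)

        def born(self):
            # Hand the field dict over to the new object; the old one keeps only the diff.
            self.son = Normal(self._fields)
            self._fields = None
            return self.son


    class Normal(object):
        def __init__(self, fields=None):
            self._fields = fields or {}

        def mutate(self, key, value):
            self.aging()  # from here on, self is an Old instance
            self.keep_history(key)
            self._fields[key] = value
            return self.born()

        def aging(self):
            # Swap the class in place, then run Old.__init__ (which leaves _fields untouched).
            self.__class__ = Old
            self.__init__()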

This is the current design: every time a Normal object is mutated, I want to keep a chain of records going oldest -> older -> young.

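Supposedly the current design was chosen with GC order in mind: old objects point to new ones, so the newest object never keeps an old object from being collected.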
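To make the reference direction concrete, here is a minimal sketch (assuming CPython's reference counting; `Node` and `build_chain` are hypothetical names, not part of the code above). Because only the older node holds a strong reference to the newer one, keeping just the newest object alive lets the whole history be reclaimed. `weakref` is used here purely to observe collection, not as part of the design.

```python
import gc
import weakref


class Node(object):
    """Hypothetical stand-in for the Old/Normal objects in the chain."""

    def __init__(self, name):
        self.name = name
        self.son = None  # strong reference from the older node to the newer one


def build_chain():
    oldest = Node("oldest")
    older = Node("older")
    young = Node("young")
    oldest.son = older
    older.son = young
    # Return only the newest node plus weak references for observation.
    return young, weakref.ref(oldest), weakref.ref(older)


young, oldest_ref, older_ref = build_chain()
gc.collect()  # not required under CPython refcounting; harmless otherwise

# Nothing points back at the history, so both older nodes are gone.
history_collected = oldest_ref() is None and older_ref() is None
```

If the links went the other way (new pointing to old), holding `young` would keep every earlier node alive.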

We didn't use weak refs because weak refs are expensive...

But code like this really isn't pretty. I'm wondering how to write it so that it is both fast and readable.

Any pointers from the experts here would be appreciated.

1 reply 2016-10-25 22:42:28 +08:00

yupbank

2016-10-25 22:42:28 +08:00

Use case of Normal and Old:
to retain the change history while still being able to mutate a dictionary-based object inside an RDD

```python
data = [dict(sound=1, counting=2, c=4) for _ in xrange(1000000)]  # Python 2; use range on Python 3

data_rdd = sc.parallelize(map(Normal, data))
# sc is the PySpark context (SparkContext)

def function(normal):
    new_normal = normal.mutate('counting', 3).mutate('sound', 2)
    # still do some calculation with normal [1]
    # even do something with the change history through normal.son.current, normal.son.son.current [2]
    return new_normal

data_rdd.map(function).collect()
```
----
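But in some use cases neither [1] nor [2] happens. In that case the GC reclaims the Old objects right away. Since this runs on a Spark cluster, even a small performance advantage gets amplified a lot.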

I'd like to improve the readability of the code. After all, changing an object's `__class__` in place is pretty crude.

(The previous design wasn't mine.)
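One possible direction, as a rough sketch only: fold both roles into a single class, so that a mutated object simply hands its dict to a successor and keeps the diff, with no `__class__` swap. `Record` and `is_history` are hypothetical names, not part of the original design, and whether this matches the speed of the `__class__` trick on a Spark cluster is an assumption that would need measuring.

```python
class Record(object):
    """Hypothetical single class playing both the Normal and the Old role."""

    def __init__(self, fields=None):
        self._fields = fields if fields is not None else {}
        self.current = None    # overwritten values, set once this record is mutated
        self.new_added = None  # newly added keys, set once this record is mutated
        self.son = None        # the newer record; old points to new, as before

    @property
    def is_history(self):
        # A record becomes a history node as soon as it has a successor.
        return self.son is not None

    def mutate(self, key, value):
        if self.son is not None:
            raise ValueError("already mutated; call mutate on the newest record")
        fields = self._fields
        # Keep only the diff on the old record.
        if key in fields:
            self.current, self.new_added = {key: fields[key]}, []
        else:
            self.current, self.new_added = {}, [key]
        fields[key] = value
        # Hand the live dict to the successor without copying it.
        self._fields = None
        self.son = Record(fields)
        return self.son


normal = Record(dict(sound=1, counting=2, c=4))
new_normal = normal.mutate('counting', 3).mutate('sound', 2)
```

After this, `normal` and `normal.son` are history nodes holding `{'counting': 2}` and `{'sound': 1}` in `current`, and `new_normal` owns the live dict. If nothing keeps a reference to `normal`, the history is reclaimed just as in the old-points-to-new design.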