
Over a year ago I reported a bug I encountered while pickling some fairly complex data. At the time I didn't know what the issue was and believed it might have something to do with recursive referencing.
I've run into the issue several times since while working on my project, but each time I only made arbitrary changes until the error disappeared. Now I finally took the time to home in on the source of the issue and refine my MWE. This is what I came up with:
import pickle
import numpy as np

# create data
dtypes = [('f0', 'O')]
# for some reason, I need at least 19 extra fields for it to crash
# immediately
dtypes += [(f'f{i+1}', 'i4') for i in range(19)]
data = np.empty(1, dtype=dtypes)
# print(data[0])

# dump data
dump = pickle.dumps(data[0], pickle.HIGHEST_PROTOCOL)
# print(‘dumping works’)

# load data
load = pickle.loads(dump)
# print(‘loading works’)

# process crashes here if len(dtypes) > 19
print(load)

# process prints random data, e.g.

# (((…), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 0, 0,
# -1931060898, 32763, 1472326776, 503, 1482667496, 503, 0, 0, 1484270024,
# 503, 1472326776, 503, -1930803631, 32763, 1484270024, 503)

# or

# ((((((...), False, True), False, True), dtype('int32'), None), 0, 0), 0, 0)

# or

# (((...), ,
# ,
# ,
# ,
# ),
# 0, 0, -1931060898, 32763, 451341512)

# and crashes immediately afterwards if 2 <= len(dtypes) <= 19.

# process finishes with exit code 0 if data has no additional fields except f0,
# and prints

# (((...),),)

Now I'm aware that similar issues have been reported in the past:

pickling/unpickling numpy.void and numpy.record for multiprocessing
Segmentation fault with numpy.void and pickle
Python - pickling fails for numpy.void objects

And in a quite recent and very similar case, "Segfault after loading pickled void objects", a fix seems to have been introduced; however, my code still causes a crash:

Process finished with exit code -1073741819 (0xC0000005)

Now for "Python - pickling fails for numpy.void objects", the accepted answer is a comment by jottos (Dec 29 '09 at 18:42):

"so, pickling will only work with top level module functions and classes, and will not pickle class data, so if some numpy class code/data are required to produce a representation of the numpy void type pickling isn't going to work as expected. It may be that the numpy package has implemented an internal repr to print the void type as a tuple, if this is the case then what you pickled certainly is not going to be what you printed."

But this is from over ten years ago, and it seems that bug fixes have been introduced since then. So is that still what is going on here, or is it something else? Especially since my code displays such arbitrary behavior.

Setup Info

Windows: 10 Home, v. 21H1, build 19043.1288
PyCharm: 2021.2 (Professional), build #PY-212.4746.96
Python (via anaconda): 3.7.7 [MSC v.1916 64 bit (AMD64)]
Numpy: 1.19.2
Pickle: 4.0
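A possible interim workaround, offered as a sketch and not as a confirmed fix or an explanation of the crash: avoid pickling the numpy.void scalar itself, and pickle either a length-1 slice of the parent array or a plain Python tuple obtained from the record. This assumes the problem is confined to the scalar pickling path, which the MWE suggests but does not prove. The variable names and the 'payload' value are illustrative, and np.zeros replaces np.empty only to make the values reproducible.

```python
import pickle
import numpy as np

# Same layout as the MWE: one object field plus 19 int32 fields.
dtypes = [('f0', 'O')] + [(f'f{i+1}', 'i4') for i in range(19)]
data = np.zeros(1, dtype=dtypes)  # zeros instead of empty: reproducible values
data['f0'][0] = 'payload'         # illustrative object stored in the 'O' field

# Option A (assumption: array pickling is unaffected):
# pickle a length-1 array slice instead of the numpy.void scalar data[0].
dump_array = pickle.dumps(data[0:1], pickle.HIGHEST_PROTOCOL)
restored_array = pickle.loads(dump_array)
restored_row = restored_array[0]  # index only after loading

# Option B: convert the record to a plain Python tuple before pickling,
# so no numpy scalar type is involved in the pickle at all.
as_tuple = data[0].item()
dump_tuple = pickle.dumps(as_tuple, pickle.HIGHEST_PROTOCOL)
restored_tuple = pickle.loads(dump_tuple)

print(restored_row['f0'])
print(restored_tuple[:3])
```

Option A keeps the field names and dtype after loading; Option B loses them but yields only built-in Python objects, so the choice depends on whether the structured dtype is needed downstream.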

Kuldeep Baberwal Changed status to publish February 17, 2025