Over a year ago I reported a bug I encountered while pickling some fairly complex data. At the time I didn't know what the issue was and believed it might have had something to do with recursive referencing.
I've encountered the issue several times while working on my project, but only made arbitrary changes until the error disappeared. Now I've finally taken the time to home in on the source of the issue and refine my MWE. This is what I came up with:
import pickle
import numpy as np
# create data
dtypes = [('f0', 'O')]
# for some reason, I need at least 19 extra fields for it to crash
# immediately
dtypes += [(f'f{i+1}', 'i4') for i in range(19)]
data = np.empty(1, dtype=dtypes)
# print(data[0])
# dump data
dump = pickle.dumps(data[0], pickle.HIGHEST_PROTOCOL)
# print('dumping works')
# load data
load = pickle.loads(dump)
# print('loading works')
# process crashes here if len(dtypes) > 19
print(load)
# process prints random data, e.g.
# (((…), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 0, 0,
# -1931060898, 32763, 1472326776, 503, 1482667496, 503, 0, 0, 1484270024,
# 503, 1472326776, 503, -1930803631, 32763, 1484270024, 503)
# or
# ((((((…), False, True), False, True), dtype('int32'), None), 0, 0), 0, 0)
# or
# (((…),
# 0, 0, -1931060898, 32763, 451341512)
# and crashes immediately afterwards if 2 <= len(dtypes) <= 19.
# process finishes with exit code 0 if data has no additional fields
# except f0, and prints
# (((...),),)

Now I'm aware that similar issues have been reported in the past:

- pickling/unpickling numpy.void and numpy.record for multiprocessing
- Segmentation fault with numpy.void and pickle
- Python - pickling fails for numpy.void objects

And in a quite recent and very similar case, *Segfault after loading pickled void objects*, a fix seems to have been introduced. However, my code still causes a crash:

    Process finished with exit code -1073741819 (0xC0000005)

Now for *Python - pickling fails for numpy.void objects*, the accepted answer is a comment by jottos (Dec 29 '09 at 18:42):

> so, pickling will only work with top level module functions and classes, and will not pickle class data, so if some numpy class code/data are required to produce a representation of the numpy void type pickling isn't going to work as expected. It may be that the numpy package has implemented an internal repr to print the void type as a tuple, if this is the case then what you pickled certainly is not going to be what you printed.

But this is from over ten years ago, and it seems that bug fixes have been introduced since then. So is that still what is going on here, or is it something else? Especially since my code displays such arbitrary behavior.

**Setup Info**

- Windows: 10 Home, v. 21H1, build 19043.1288
- PyCharm: 2021.2 (Professional), build #PY-212.4746.96
- Python (via Anaconda): 3.7.7 [MSC v.1916 64 bit (AMD64)]
- Numpy: 1.19.2
- Pickle: 4.0
This is a well-researched and clearly presented bug report. This is a persistent and complex issue related to pickling NumPy void objects (specifically those created from structured arrays) in combination with certain data types and array structures. While previous fixes have addressed some aspects of this problem, edge cases appear to remain, and the behavior you're observing suggests a regression or a scenario not fully covered by those earlier patches.

Here's a breakdown of why this is happening and what you can do.

**Root Cause Analysis:**

1. **Pickling NumPy's internal representations:** The core problem, as highlighted in the older discussions you referenced, lies in how `pickle` handles NumPy's internal representations of structured arrays and void objects. A `numpy.void` object represents a single row of a structured array, and `pickle` attempts to serialize the underlying data and structure. The arbitrary, seemingly random data you're seeing points towards memory corruption or incorrect data interpretation during unpickling, which likely happens when the internal structure isn't properly reconstructed.

2. **Structured array complexity:** The issue seems to be triggered by the complexity of the structured array's dtype. The number of fields (specifically, the 19-field threshold you identified) is a key factor. This suggests that the pickling/unpickling code struggles with a larger number of fields, potentially due to buffer overflows, incorrect offset calculations, or other memory-management issues within `pickle` or NumPy's pickling implementation.

3. **Data type interaction:** The interaction between the `'O'` (object) dtype and the numerical `'i4'` fields seems to be a contributing factor. Pickling objects can introduce additional complexity, especially if those objects have their own custom pickling behavior or depend on external resources. It may also matter how those objects are serialized relative to the integers.

4. **Platform and environment specifics:** The fact that you're on Windows (and using Anaconda) could also play a role. Windows memory management and the interaction between libraries with C extensions, such as NumPy, can sometimes expose subtle bugs.

5. **NumPy version:** While 1.19.2 isn't ancient, it's also not the latest NumPy. There have been numerous bug fixes and improvements to NumPy's pickling implementation since then, and it's possible a newer version addresses the specific scenario you're encountering, even if indirectly.

**Why Previous Fixes Didn't Fully Resolve It:**

Previous fixes likely targeted specific memory-corruption or data-misinterpretation scenarios; the patch you mention may have covered only the case in the segfault question. The complexity of structured arrays and pickling means that new edge cases can always emerge, especially as NumPy's internal data structures evolve.

**Troubleshooting and Solutions:**

1. **Upgrade NumPy:** This should be your first step. Upgrade to the latest stable version (`pip install -U numpy`); newer versions often contain fixes to pickling and memory management. Test whether the issue persists.

2. **Simplify the data types (if possible):** The `'O'` (object) dtype is very general and can lead to pickling issues. If possible, replace it with a more specific type (e.g., a fixed-width string, an integer, a float, or a custom class with proper pickling support). This may not be feasible if you truly need to store arbitrary Python objects in that field, but it's worth considering if the object dtype is just a placeholder.

3. **Avoid pickling `numpy.void` directly:** As a workaround, instead of pickling `data[0]` (a `numpy.void` object), pickle the entire array (`data`) and access the first element after unpickling. This may circumvent the problematic code path in NumPy's pickling implementation.

4. **Use `dill` instead of `pickle`:** `dill` extends `pickle`'s capabilities and can handle a wider range of data types and objects, including those with complex dependencies or custom pickling behavior. Install it with `pip install dill`, then use `dill.dumps` and `dill.loads` in place of `pickle.dumps` and `pickle.loads`.

5. **Implement custom pickling logic (as a last resort):** If none of the above works, you can implement `__getstate__` and `__setstate__` on a custom class that wraps the structured array. This gives you complete control over serialization at the cost of extra complexity: you manually extract the data from the array, serialize the individual components, and reconstruct the array during unpickling.

   ```python
   import numpy as np
   import pickle

   class MyStructuredArrayWrapper:
       def __init__(self, data):
           self.data = data

       def __getstate__(self):
           # Serialize the dtype description and the raw values manually.
           return {'dtypes': self.data.dtype.descr, 'data': self.data.tolist()}

       def __setstate__(self, state):
           # Rebuild the structured array from the serialized state.
           self.data = np.array(state['data'], dtype=state['dtypes'])

   # Example usage
   dtypes = [('f0', 'O')]
   dtypes += [(f'f{i+1}', 'i4') for i in range(19)]
   data = np.empty(1, dtype=dtypes)

   wrapper = MyStructuredArrayWrapper(data)
   dump = pickle.dumps(wrapper, pickle.HIGHEST_PROTOCOL)
   loaded_wrapper = pickle.loads(dump)
   loaded_data = loaded_wrapper.data

   print(loaded_data)
   ```

6. **Report a detailed bug to NumPy:** Even if you find a workaround, it's crucial to report this bug to the NumPy developers. Provide them with your MWE, your system information, and a clear description of the problem. The fact that this has come up before will help them prioritize the issue.

**Important Considerations for Reporting the Bug:**

* **Minimal reproducible example (MRE):** You've already done a great job with this; make sure it's as small and self-contained as possible.
* **Clear description:** Explain the exact steps to reproduce the bug, the expected behavior, and the actual behavior you're seeing.
* **System information:** Include your operating system, Python version, NumPy version, and any other relevant library versions.
* **Error messages:** Include any error messages or stack traces in your report.

By following these steps, you should be able to either work around the pickling issue or help the NumPy developers fix it permanently. A report this detailed is genuinely useful, so don't feel as though you are wasting their time.
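To make workaround 3 above concrete, here is a minimal sketch reusing the question's dtype layout. It is an assumption-laden illustration rather than a guaranteed fix: it swaps `np.empty` for `np.zeros` so the field contents are deterministic, and it pickles the whole `ndarray` (which goes through the regular array reduction path) instead of the `numpy.void` scalar path that appears to be the problem:

```python
import pickle

import numpy as np

# Same dtype layout as in the question: one object field plus 19 int32 fields.
dtypes = [('f0', 'O')] + [(f'f{i+1}', 'i4') for i in range(19)]

# np.zeros instead of np.empty so the contents are deterministic.
data = np.zeros(1, dtype=dtypes)

# Pickle the whole ndarray, not the numpy.void scalar data[0].
dump = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
loaded = pickle.loads(dump)

# Take the row only after unpickling.
row = loaded[0]
print(row)
```

The key design point is simply *when* the row is extracted: `loaded[0]` is a fresh view into a normally reconstructed array, so the void-scalar reconstruction code is never invoked during unpickling.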