Context:
I have a simple classifier based on tf.estimator.DNNClassifier that takes text and outputs probabilities over a set of intent tags. I am able to train the model, export it as a servable, and serve the servable with TensorFlow Serving. The problem is that the servable is too big (around 1 GB), so I wanted to try some TensorFlow graph transforms to reduce the size of the files being served.
Problem:
I understand how to take the saved_model.pb and use freeze_graph.py to create a new .pb file that transforms can be called on. The result of those transforms (also a .pb file) is not a servable and cannot be used with TensorFlow Serving.
How can a developer go from:
saved model -> graph transforms -> back to a servable
There is documentation suggesting that this is certainly possible, but it's not at all intuitive from the docs how to do it.
What I’ve Tried:
```python
import tensorflow as tf
from tensorflow.saved_model import simple_save
from tensorflow.saved_model import signature_constants
from tensorflow.saved_model import tag_constants
from tensorflow.tools.graph_transforms import TransformGraph

with tf.Session(graph=tf.Graph()) as sess_meta:
    meta_graph_def = tf.saved_model.loader.load(
        sess_meta,
        [tag_constants.SERVING],
        "/model/path")

    graph_def = meta_graph_def.graph_def

    other_graph_def = TransformGraph(
        graph_def,
        ["Placeholder"],
        ["dnn/head/predictions/probabilities"],
        ["quantize_weights"])

    with tf.Graph().as_default():
        graph = tf.get_default_graph()
        tf.import_graph_def(other_graph_def)
        in_tensor = graph.get_tensor_by_name(
            "import/Placeholder:0")
        out_tensor = graph.get_tensor_by_name(
            "import/dnn/head/predictions/probabilities:0")

        inputs = {"inputs": in_tensor}
        outputs = {"outputs": out_tensor}

        simple_save(sess_meta, "./new", inputs, outputs)
```
My idea was to load the servable, extract the graph_def from the meta_graph_def, transform the graph_def and then try to recreate the servable. This seems to be the incorrect approach.
Is there a way to successfully perform transforms (to reduce file size at inference) on a graph from an exported servable, and then recreate a servable with the transformed graph?
Thanks.
Update (2018-08-28):
Found contrib.meta_graph_transform(), which looks promising.
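For reference, a rough sketch of what calling it might look like on TF 1.x, reusing the names and transform from the snippet above; the exact argument list of `meta_graph_transform` is an assumption to verify against your TensorFlow version:

```python
import tensorflow as tf
from tensorflow.saved_model import tag_constants

with tf.Session(graph=tf.Graph()) as sess:
    # loader.load() returns the serving MetaGraphDef, signatures included
    base_meta_graph_def = tf.saved_model.loader.load(
        sess, [tag_constants.SERVING], "/model/path")

# Transform the MetaGraphDef as a whole (graph plus signatures) rather than
# the bare GraphDef, so the result can still be re-exported as a servable.
transformed_meta_graph_def = tf.contrib.meta_graph_transform.meta_graph_transform(
    base_meta_graph_def=base_meta_graph_def,
    input_names=["Placeholder"],
    output_names=["dnn/head/predictions/probabilities"],
    transforms=["quantize_weights"],
    tags=[tag_constants.SERVING])
```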
Update (2018-12-03):
A related GitHub issue I opened appears to have been resolved by a detailed blog post, which is linked at the end of the ticket.
Here is a complete path from a SavedModel, through graph transforms, and back to a servable, addressing the original problem and incorporating the updates above.

**Core Issue:**

Graph transforms operate on `GraphDef` objects, while TensorFlow Serving requires a full `SavedModel` structure. The process is therefore: extract the `GraphDef`, transform it, and re-integrate it into a valid `SavedModel`.

**Solution Overview:**

The approach below uses `TransformGraph` from `tensorflow.tools.graph_transforms` together with `tf.compat.v1.saved_model.builder.SavedModelBuilder` to reconstruct the `SavedModel`. Critically, you need to preserve the signature definitions from the original `SavedModel` so the result stays compatible with your serving infrastructure.

**Step-by-Step Guide:**

1. **Load the Original SavedModel:**

   ```python
   import tensorflow as tf
   from tensorflow.compat import v1 as tf1  # loader/builder APIs live here on TF 2.x
   from tensorflow.python.tools import saved_model_utils
   from tensorflow.saved_model import tag_constants
   from tensorflow.saved_model import signature_constants

   export_dir = "/path/to/your/saved_model"  # replace with your SavedModel path

   # Load the meta graph and its signature defs
   meta_graph_def = saved_model_utils.get_meta_graph_def(export_dir, tag_constants.SERVING)
   signatures = meta_graph_def.signature_def

   with tf1.Session(graph=tf1.Graph()) as sess:
       tf1.saved_model.loader.load(sess, [tag_constants.SERVING], export_dir)
       graph_def = sess.graph.as_graph_def()
   ```

   * **Imports:** `from tensorflow.compat import v1 as tf1` matters on TF 2.x, where the loader and `SavedModelBuilder` APIs only exist under `compat.v1`; `TransformGraph` itself comes from `tensorflow.tools.graph_transforms`. On TF 1.x you can use `tf` directly.
   * **`saved_model_utils.get_meta_graph_def`**: extracts the `MetaGraphDef`, which contains the signatures. We'll need them later to rebuild the SavedModel correctly.
   * **Get the GraphDef:** load the SavedModel into a session and take the graph definition from the session's graph. This is the graph that will be transformed.
   * **Error handling:** wrap the `tf1.saved_model.loader.load` call in a `try...except` block to catch errors if the SavedModel path is wrong (see the complete example below).

2. **Transform the GraphDef:**

   ```python
   from tensorflow.tools.graph_transforms import TransformGraph

   # Example transforms (replace with your desired ones)
   transforms = ["strip_unused_nodes", "fold_constants", "quantize_weights"]

   # Pull the real tensor names out of the signature, then strip the ":0"
   # port suffix: TransformGraph expects node names, not tensor names.
   signature = signatures["serving_default"]
   input_tensor_names = [signature.inputs["input_tensor"].name]     # replace 'input_tensor' with your signature's input key
   output_tensor_names = [signature.outputs["output_tensor"].name]  # replace 'output_tensor' with your signature's output key
   input_names = [name.split(":")[0] for name in input_tensor_names]
   output_names = [name.split(":")[0] for name in output_tensor_names]

   transformed_graph_def = TransformGraph(
       graph_def,
       input_names,
       output_names,
       transforms)
   ```

   * **`TransformGraph`**: applies the requested transformations; see the TensorFlow documentation for the full list. Common ones include `strip_unused_nodes`, `fold_constants`, `remove_training_nodes`, `quantize_weights`, and `quantize_nodes`.
   * **Input and output names**: this is critical. The names passed to `TransformGraph` must correspond to the input and output tensors defined in the signature of your original SavedModel. The code above reads them from the `'serving_default'` signature; adjust the key if you use a different one, and note the `:0` stripping, since `TransformGraph` takes node names.
   * **Error handling**: `TransformGraph` raises if the input/output names are wrong or a transform fails; wrap it in `try...except` to surface a clearer message.
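   If you are unsure which signature keys and tensor names your SavedModel actually exposes, you can print them instead of guessing. A small standalone sketch (the path is a placeholder):

   ```python
   from tensorflow.python.tools import saved_model_utils
   from tensorflow.saved_model import tag_constants

   meta_graph_def = saved_model_utils.get_meta_graph_def(
       "/path/to/your/saved_model", tag_constants.SERVING)

   # List every signature with its logical keys and underlying tensor names,
   # so the names fed to TransformGraph can be checked against the real graph.
   for sig_key, sig in meta_graph_def.signature_def.items():
       print("signature:", sig_key)
       for logical_name, tensor_info in sig.inputs.items():
           print("  input :", logical_name, "->", tensor_info.name)
       for logical_name, tensor_info in sig.outputs.items():
           print("  output:", logical_name, "->", tensor_info.name)
   ```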
3. **Create a New SavedModel:**

   ```python
   from tensorflow.compat.v1.saved_model import builder as saved_model_builder

   new_export_dir = "/path/to/your/transformed_saved_model"  # replace with your desired path
   builder = saved_model_builder.SavedModelBuilder(new_export_dir)

   with tf1.Session(graph=tf1.Graph()) as sess:
       # Import the transformed graph; name='' keeps the node names unprefixed
       tf.import_graph_def(transformed_graph_def, name="")

       # Get input and output tensors (full tensor names, with the ":0" suffix)
       input_tensor = sess.graph.get_tensor_by_name(input_tensor_names[0])
       output_tensor = sess.graph.get_tensor_by_name(output_tensor_names[0])

       # Create the signature definition
       signature_def = tf1.saved_model.signature_def_utils.predict_signature_def(
           inputs={"input_tensor": input_tensor},     # adjust the key to match your signature
           outputs={"output_tensor": output_tensor})  # adjust the key to match your signature

       # Add the meta graph
       builder.add_meta_graph_and_variables(
           sess,
           [tag_constants.SERVING],
           signature_def_map={
               signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
           })

   builder.save()
   print("Transformed SavedModel saved to: {}".format(new_export_dir))
   ```

   * **`SavedModelBuilder`**: constructs the new `SavedModel`.
   * **`tf.import_graph_def`**: imports the transformed `GraphDef` into a fresh `tf.Graph`. The `name=''` argument prevents TensorFlow from prefixing every node name in the imported graph, which keeps the tensor lookups simple.
   * **Get tensors**: retrieve the input and output tensors by name. Note these are the full tensor names (`node:0`), whereas `TransformGraph` took the bare node names.
   * **`predict_signature_def`**: builds a signature definition describing the model's inputs and outputs. The `inputs` and `outputs` dictionaries map logical names (e.g. `'input_tensor'`) to the actual tensor objects; these keys must match what your serving client expects.
   * **Add the meta graph**: associates the graph, variables, and signature definition with the SavedModel. The `signature_def_map` argument maps a signature key (typically `DEFAULT_SERVING_SIGNATURE_DEF_KEY`) to the signature definition, which tells TensorFlow Serving which signature to use by default.
   * **Save**: writes the new `SavedModel` to the target directory.
   * **Freezing alternative**: TensorFlow Serving sometimes has trouble serving the resulting model if `Variable` nodes survive the transforms, because `import_graph_def` does not restore variable values. In that case, freeze the graph with `graph_util.convert_variables_to_constants()` inside the step 1 session (where the variables are actually loaded), and run the transforms of step 2 on the frozen `GraphDef` instead:

   ```python
   from tensorflow.compat.v1 import graph_util

   with tf1.Session(graph=tf1.Graph()) as sess:
       tf1.saved_model.loader.load(sess, [tag_constants.SERVING], export_dir)
       # Convert variables to constants; this freezes the graph. The function
       # takes node names (no ":0" suffix), the same form TransformGraph uses.
       frozen_graph_def = graph_util.convert_variables_to_constants(
           sess, sess.graph.as_graph_def(), output_names)

   # Pass frozen_graph_def to TransformGraph in step 2 instead of graph_def.
   ```

**Important Considerations and Troubleshooting:**

* **Tensor names:** double-check that the input and output names used in `TransformGraph` and in the signature definition are correct. Use `saved_model_cli show --dir /path/to/your/saved_model --all` to inspect the SavedModel and verify them.
* **Signature keys:** make sure the signature key (`signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY`) matches what your client expects. If you have multiple signatures, you may need to specify a different key.
* **TensorFlow versions:** the code uses `tensorflow.compat.v1` to stay compatible with the TF 1.x SavedModel format and `graph_transforms`. On a very recent TensorFlow you may be able to use `tf.function` and the newer SavedModel APIs, but `compat.v1` is generally more reliable for this task.
* **Graph transforms:** experiment with different transforms to find the best balance between model size and inference performance. `quantize_weights` and `quantize_nodes` can significantly reduce model size but may also impact accuracy.
* **Dependencies:** make sure `tensorflow` is installed.
* **Inspect the transformed SavedModel:** run `saved_model_cli show --dir /path/to/your/transformed_saved_model --all` to verify that the signatures, input/output tensors, and graph structure are as expected.
* **Serving:** after creating the transformed SavedModel, deploy it with TensorFlow Serving and monitor latency, throughput, and accuracy to confirm the transforms have not hurt its behavior; a quick local smoke test is sketched below.
* **Debugging:** set `tf.debugging.set_log_device_placement(True)` to see which device (CPU or GPU) each operation is placed on, which can help identify performance bottlenecks. You can also run `tensorboard --logdir /path/to/your/transformed_saved_model` to view the graph structure.
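Before deploying, it can be worth loading the rebuilt servable in-process and pushing one request through it. A minimal smoke-test sketch, assuming the tensor names from the question (`Placeholder:0`, `dnn/head/predictions/probabilities:0`) and a model that takes serialized `tf.Example` strings; adapt the dummy input to whatever your input tensor actually expects:

```python
import tensorflow as tf
from tensorflow.compat import v1 as tf1
from tensorflow.saved_model import tag_constants

new_export_dir = "/path/to/your/transformed_saved_model"

with tf1.Session(graph=tf1.Graph()) as sess:
    # Load the rebuilt servable exactly as TensorFlow Serving would
    tf1.saved_model.loader.load(sess, [tag_constants.SERVING], new_export_dir)

    # Dummy request: an empty serialized tf.Example; replace with a real one
    example = tf1.train.Example().SerializeToString()
    probs = sess.run(
        "dnn/head/predictions/probabilities:0",  # output tensor from the question
        feed_dict={"Placeholder:0": [example]})  # input tensor from the question
    print("output shape:", probs.shape)
```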
**Complete Example (Putting It All Together):**

```python
import tensorflow as tf
from tensorflow.compat import v1 as tf1
from tensorflow.python.tools import saved_model_utils
from tensorflow.saved_model import tag_constants
from tensorflow.saved_model import signature_constants
from tensorflow.tools.graph_transforms import TransformGraph
from tensorflow.compat.v1.saved_model import builder as saved_model_builder

# 1. Load the original SavedModel
export_dir = "/path/to/your/saved_model"                  # replace with your SavedModel path
new_export_dir = "/path/to/your/transformed_saved_model"  # replace with your desired path

try:
    meta_graph_def = saved_model_utils.get_meta_graph_def(export_dir, tag_constants.SERVING)
    signatures = meta_graph_def.signature_def

    with tf1.Session(graph=tf1.Graph()) as sess:
        tf1.saved_model.loader.load(sess, [tag_constants.SERVING], export_dir)
        graph_def = sess.graph.as_graph_def()
except Exception as e:
    print("Error loading SavedModel from {}: {}".format(export_dir, e))
    raise

# 2. Transform the GraphDef
transforms = ["strip_unused_nodes", "fold_constants", "quantize_weights"]  # example transforms

signature = signatures["serving_default"]
input_tensor_names = [signature.inputs["input_example"].name]     # replace 'input_example' with your input key
output_tensor_names = [signature.outputs["output_example"].name]  # replace 'output_example' with your output key
input_names = [name.split(":")[0] for name in input_tensor_names]
output_names = [name.split(":")[0] for name in output_tensor_names]

try:
    transformed_graph_def = TransformGraph(graph_def, input_names, output_names, transforms)
except Exception as e:
    print("Error transforming graph: {}".format(e))
    raise

# 3. Create a new SavedModel
builder = saved_model_builder.SavedModelBuilder(new_export_dir)

with tf1.Session(graph=tf1.Graph()) as sess:
    # Import the transformed graph without a name prefix
    tf.import_graph_def(transformed_graph_def, name="")

    input_tensor = sess.graph.get_tensor_by_name(input_tensor_names[0])
    output_tensor = sess.graph.get_tensor_by_name(output_tensor_names[0])

    signature_def = tf1.saved_model.signature_def_utils.predict_signature_def(
        inputs={"input_example": input_tensor},     # adjust to match your signature
        outputs={"output_example": output_tensor})  # adjust to match your signature

    builder.add_meta_graph_and_variables(
        sess,
        [tag_constants.SERVING],
        signature_def_map={
            signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
        })

builder.save()
print("Transformed SavedModel saved to: {}".format(new_export_dir))
```

Key points in this version:

* **Clearer error handling:** `try...except` around the critical steps (loading, transforming) to give more informative messages.
* **Complete example:** a single runnable script covering the whole process, from loading the SavedModel to saving the transformed version.
* **Handles signatures:** extracts the signatures from the original SavedModel and recreates them in the transformed one, which is essential for TensorFlow Serving to work correctly.
* **Addresses serving issues:** includes the option to freeze the graph first when surviving variables cause trouble.
* **Explicit versioning:** uses `tf.compat.v1` to target the TensorFlow 1.x-compatible API, which is what `graph_transforms` was built against.

This should take you from a SavedModel, through graph transforms, and back to a servable. Remember to adapt the paths, tensor names, transforms, and signature keys to your model.