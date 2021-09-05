



TensorFlow, a popular Python-based machine learning and artificial intelligence project developed by Google, has dropped support for YAML to patch a critical code execution vulnerability.

YAML, a recursive acronym for “YAML Ain’t Markup Language” (originally “Yet Another Markup Language”), is a convenient choice for developers looking for a human-readable data serialization language for handling configuration files and data in transit.

Untrusted deserialization vulnerability in TensorFlow

The maintainers behind both TensorFlow and Keras, a wrapper project for TensorFlow, have patched an untrusted deserialization vulnerability that stems from unsafe parsing of YAML.

The critical flaw, tracked as CVE-2021-37678, could allow an attacker to execute arbitrary code when an application deserializes a Keras model provided in YAML format.

Deserialization vulnerabilities typically occur when an application reads malformed or malicious data originating from an untrusted source.

After the application reads and deserializes the data, it may crash, resulting in a denial-of-service (DoS) condition, or worse, execute arbitrary code supplied by the attacker.

This YAML deserialization vulnerability, rated 9.3 in severity, was responsibly reported to TensorFlow maintainers by security researcher Arjun Shibu.

And the cause of the flaw? The notorious “yaml.unsafe_load()” function call in TensorFlow code:

TensorFlow (GitHub) vulnerable yaml.unsafe_load function call

The “unsafe_load” function is known to deserialize YAML data rather liberally: it resolves all tags, including those known to be unsafe on untrusted input.

This means that, ideally, “unsafe_load” should only be called on input that comes from a trusted source and is known to be free of malicious content.

Otherwise, an attacker could exploit the deserialization mechanism to execute code of their choice by injecting a malicious payload into YAML data that has yet to be deserialized.

The proof-of-concept (PoC) exploit example shared in the vulnerability advisory demonstrates exactly this:

from tensorflow.keras import models

payload = '''
!!python/object/new:type
args: ['z', !!python/tuple [], {'extend': !!python/name:exec }]
listitems: "__import__('os').system('cat /etc/passwd')"
'''

models.model_from_yaml(payload)

TensorFlow drops YAML completely in favor of JSON

After the vulnerability was reported, TensorFlow decided to completely discontinue YAML support and use JSON deserialization instead.

“We’ve removed it for now because it takes a lot of work to support YAML format,” say the project maintainers in the same advisory.

“The methods `Model.to_yaml()` and `keras.models.model_from_yaml` have been replaced to raise a `RuntimeError` because they can be abused to execute arbitrary code.

“We recommend using JSON serialization instead of YAML, or as a better alternative, serializing to H5.”
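As a rough illustration of why JSON is the safer choice here, the sketch below round-trips a model description through Python’s standard json module. The `architecture` dictionary and its field names are hypothetical stand-ins, not the actual Keras schema; in TensorFlow itself, the corresponding calls are `model.to_json()` and `tf.keras.models.model_from_json()`.

```python
import json

# Hypothetical architecture description. The field names are illustrative
# only and are not the real Keras serialization schema.
architecture = {
    "class_name": "Sequential",
    "layers": [
        {"class_name": "Dense", "units": 64, "activation": "relu"},
        {"class_name": "Dense", "units": 1, "activation": "sigmoid"},
    ],
}

# Serialize to a JSON string and parse it back into plain Python data.
serialized = json.dumps(architecture)
restored = json.loads(serialized)

# A YAML-style tag has no special meaning in JSON: it stays an inert string
# and never causes an object to be constructed or a function to be called.
suspicious = json.loads('{"layer": "!!python/name:exec"}')
```

Because JSON has no tag mechanism, parsing yields only dictionaries, lists, strings, numbers, booleans, and null values, so the parser itself gives an attacker no hook for constructing arbitrary Python objects.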

It’s worth noting that TensorFlow is neither the first nor the only project found to be using YAML’s unsafe_load. Use of the function is fairly common in Python projects.

GitHub shows thousands of search results referencing the function, with some developers proposing fixes.

Many GitHub repositories use YAML’s insecure load function (GitHub).
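For projects that still need to parse YAML from untrusted sources, the usual alternative is PyYAML’s `yaml.safe_load`, which only constructs plain data types and refuses Python-specific tags. The following is a minimal sketch, assuming PyYAML is installed; the helper name `load_untrusted_yaml` and the sample documents are hypothetical, and the rejected document uses a harmless call in place of a real payload.

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)


def load_untrusted_yaml(text):
    """Parse YAML from an untrusted source, allowing only plain data types.

    Hypothetical helper for illustration: it wraps yaml.safe_load and
    turns any YAML error into a ValueError.
    """
    try:
        return yaml.safe_load(text)
    except yaml.YAMLError as exc:
        raise ValueError(f"rejected untrusted YAML: {exc}") from exc


# Ordinary configuration data parses as usual.
benign = "name: demo\nlayers: [64, 32]\n"
config = load_untrusted_yaml(benign)

# A document carrying a Python-specific tag is refused by safe_load,
# so the call it names is never made.
hostile = "!!python/object/apply:os.getcwd []\n"
try:
    load_untrusted_yaml(hostile)
    rejected = False
except ValueError:
    rejected = True
```

With `yaml.unsafe_load`, the second document would have been resolved and the named function called; `yaml.safe_load` raises an error instead.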

The fix for CVE-2021-37678 will be available in TensorFlow version 2.6.0 and will be backported to previous versions 2.5.1, 2.4.3, and 2.3.4, the maintainer said.

