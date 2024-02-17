



Google has open sourced Magika, its in-house machine-learning-powered file identifier, as part of its AI Cyber Defense Initiative, which aims to give IT network defenders and others better automated tools.

Determining the true contents of a user-submitted file is perhaps harder than it looks. It is not safe to infer a file type from, say, its extension, and relying on heuristics and human-written rules, such as those in the widely used libmagic, to identify the actual nature of a document from its data is, in Google's view, “time consuming and error prone.”

Basically, if someone uploads a .JPG to an online service, you need to make sure it really is a JPEG image and not a script masquerading as one, which could get you in trouble later. Magika uses a trained model to quickly identify file types from file data, an approach Big G believes works well enough to use in production. We're told Magika is used by Gmail, Google Drive, Chrome's Safe Browsing, and VirusTotal to properly identify and route data for further processing.
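To make the .JPG scenario concrete, here is a minimal Python sketch of the hand-written kind of rule the article contrasts with Magika. It is not Magika's method or libmagic's code; the function names are hypothetical, and the only fact it relies on is that JPEG data starts with the bytes FF D8 FF.

```python
import os

# JPEG data begins with the start-of-image marker FF D8, followed by
# the FF byte that opens the next marker.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(data: bytes) -> bool:
    """Check only the leading bytes; this is a heuristic, not a full parse."""
    return data.startswith(JPEG_MAGIC)


def claims_to_be_jpeg(filename: str) -> bool:
    """True when the file name carries a JPEG extension (case-insensitive)."""
    return os.path.splitext(filename)[1].lower() in {".jpg", ".jpeg"}


def is_suspicious_upload(filename: str, data: bytes) -> bool:
    """True when the name says JPEG but the leading bytes say otherwise."""
    return claims_to_be_jpeg(filename) and not looks_like_jpeg(data)
```

A rule like this catches a shell script renamed to cat.JPG, but it only inspects three bytes, so a file crafted to begin with a valid JPEG header would pass. That brittleness is the sort of limitation a learned classifier is meant to reduce.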

Your mileage may vary; libmagic, for example, might work just fine for you. Either way, Magika is an example of how Google is using artificial intelligence internally to improve security, and the company hopes others can benefit from the technology as well. Another example is RETVec, a multilingual text processing model used to detect spam. This comes at a time when we are all being warned that miscreants are making greater use of machine learning software to automate intrusions and vulnerability probing.


“AI is at a critical crossroads, and policymakers, security experts, and civil society have the opportunity to finally tip the balance of cybersecurity away from attackers and toward cyber defenders,” said Phil Venables, chief information security officer at Google Cloud, and Royal Hansen, vice president of engineering for privacy, safety, and security, on Friday.

“As malicious actors experiment with AI, we need bold and timely action to shape the direction of this technology.”

The pair believe network defenders can use Magika to determine the true contents of files quickly and at scale, which is a first step in malware analysis and intrusion detection. To be honest, this deep learning model is useful for anyone who needs to scan documents provided by users. For example, a supposed video that is actually an executable should raise some sort of alarm and require further inspection. An email attachment that is not what it claims to be should be quarantined. You get the idea.
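The triage logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Google's pipeline: the label strings, the extension table, and the `triage` function are all hypothetical, and `detected_label` stands in for the output of whichever content detector is in use.

```python
import os

# Hypothetical mapping from file extension to the content label we expect.
# The label strings are illustrative, not any particular tool's vocabulary.
EXPECTED_LABEL = {
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
    ".png": "png",
    ".mp4": "mp4",
    ".pdf": "pdf",
}

# Labels treated as executable content in this sketch (also an assumption).
EXECUTABLE_LABELS = {"pe", "elf", "macho", "shell", "javascript"}


def triage(filename: str, detected_label: str) -> str:
    """Return 'quarantine', 'inspect', or 'accept' for an uploaded file."""
    ext = os.path.splitext(filename)[1].lower()
    expected = EXPECTED_LABEL.get(ext)
    # A file that claims to be media or a document but is really executable.
    if expected is not None and detected_label in EXECUTABLE_LABELS:
        return "quarantine"
    # Any other mismatch between the claimed and the detected type.
    if expected is not None and detected_label != expected:
        return "inspect"
    # Matching types, or an extension this sketch has no expectation for.
    return "accept"
```

For instance, `triage("holiday.mp4", "pe")` returns `"quarantine"`, while `triage("photo.jpg", "png")` returns `"inspect"`. Extensions missing from the table fall through to `"accept"` here, so a real deployment would need its own policy for them.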

More generally, in the context of cybersecurity, AI models can not only inspect files for suspicious content and source code for vulnerabilities, but also generate patches to fix bugs, the Googlers claimed. Engineers at the giant company are also experimenting with Gemini to improve automated fuzzing for open source projects.

According to Google, Magika is 50% more accurate at identifying file types than the company's previous system of handcrafted rules, takes milliseconds to identify a file type, and is said to be at least 99% accurate in testing. It is not perfect, however, and fails to classify the file type about 3% of the time. It is licensed under Apache 2.0, the code is available on GitHub, and the model weighs in at 1 MB.

Apart from Magika, the Chocolate Factory has partnered with 17 startups from the UK, US, and Europe as part of this new AI Cyber Defense Initiative, and is training them to use these types of automated tools to improve their security.

It will also expand its $15 million Cybersecurity Seminars Program to help universities train more European students in security. Closer to home, it has committed $2 million in grants to fund research into cyber-offense and large language models, supporting academics at the University of Chicago, Carnegie Mellon University, and Stanford University.

“The AI revolution is already underway. While people rightly celebrate the potential of new medicines and scientific advances, we are also excited about the potential of AI to solve generational security challenges while bringing us closer to the safe, secure, and trusted digital world we deserve,” concluded Venables and Hansen.

