



AI has many faces, from hardware to applications in domains such as healthcare, and from futuristic models to ethics.

In the spirit of the past few years, we review developments in what we have identified as the key technology drivers of the 2020s in the world of databases, data management, and AI. Looking back at 2021, we try to identify the patterns that will shape 2022.

Today, we start with Part 1 of the review, beginning with AI and knowledge graphs.

The many faces of AI: hardware, edge, MLOps, language models, future architectures, ethics

As a rule, we try to take a holistic view of AI: to account for the positives and the negatives, the glossy and the mundane, the hardware and the software. Hardware has been a running thread in the wider AI story over the past few years, and it seems a good place to start the tour.

Over the last few years, we have been keeping track of the growing list of “AI chip” vendors, that is, companies that have set out to develop new hardware architectures designed specifically for AI workloads. All of them are after a slice of a seemingly ever-growing pie: as AI adoption grows, so do the workloads, and the obvious goal is to serve them as quickly and economically as possible.

Nvidia continues to dominate this market. Nvidia was in the market long before AI workloads began to boom, and it had the insight and reflexes to capitalize on them by building an ecosystem of hardware and software. Its 2020 move to make Arm part of this ecosystem is still under regulatory scrutiny. Nvidia did not sit idle in 2021, however.

Of the numerous announcements made at Nvidia’s GTC event in November 2021, the ones that bring something new at the hardware level have to do with what we would argue characterized the focus of AI in 2021: inference and the edge. Nvidia introduced a number of improvements to the Triton Inference Server. It also introduced the Nvidia A2 Tensor Core GPU, a low-power, small-footprint accelerator for AI inference at the edge that Nvidia claims offers up to 20x higher inference performance than CPUs.

And what about the startups? SambaNova claims to be “the most funded AI startup in the world” after a whopping $676 million Series D funding round and a valuation of over $5 billion. SambaNova’s philosophy is to offer “AI as a service,” now including GPT language models, and 2021 seems to have been by and large a go-to-market year for the company.

For its part, Xilinx claims to achieve dramatic speedups of neural networks compared to Nvidia GPUs. Cerebras claimed to “absolutely control” high-end computing and also raised substantial funding. Graphcore is competing with Nvidia (and Google) on MLPerf results. Tenstorrent hired legendary chip designer Jim Keller. Blaize raised $71 million to bring edge AI to industrial applications. Flex Logix secured $55 million in venture backing, bringing its total funding to $82 million. Last but not least, there is a new horse in the race in NeuReality, there are ways to mix and match deployments with ONNX and TVM, and there is the promise of using AI to design AI chips. If that is not booming innovation, we don’t know what is.

According to the Linux Foundation’s State of the Edge report, digital healthcare, manufacturing and retail are particularly likely to expand their use of edge computing by 2028. It’s no wonder that AI hardware, frameworks and applications targeting the edge are also proliferating.

TinyML, the art and science of producing machine learning models frugal enough to work at the edge, is experiencing rapid growth and building an ecosystem. Edge Impulse, a startup that wants to bring machine learning to everyone, announced a $34 million Series B funding round. Edge applications are coming, and AI and its hardware will be a big part of them.

Something we called out in 2020, that came to the fore in 2021, and that will be with us for the next few years is so-called MLOps: bringing machine learning into production. In 2021, people tried to name the various phenomena related to MLOps, slice and dice the MLOps domain, apply data version control and continuous machine learning, and develop the equivalent of test-driven development for data. The focus is shifting from glossy new models to perhaps more mundane but practical aspects such as data quality and data pipeline management, and these, like MLOps, will continue to grow.

Another thing that is likely to continue to grow, in both size and number, is large language models (LLMs). Some believe that LLMs can internalize basic forms of language, whether that is biology, chemistry, or human language, and the number of unusual LLM applications is on the rise. Others are less convinced. Either way, LLMs are proliferating.

In addition to the “usual suspects” (OpenAI with GPT-3, DeepMind with its latest RETRO LLM, Google with its ever-expanding array of LLMs), Nvidia has teamed up with Microsoft on the Megatron LLM. But that’s not all.

Recently, EleutherAI, a collective of independent AI researchers, open-sourced its 6-billion-parameter GPT-J model. If you are interested in languages other than English, Aleph Alpha has a large European language model fluent in English, German, French, Spanish, and Italian. Wu Dao is a Chinese LLM and also the largest LLM, with 1.75 trillion parameters, and HyperCLOVA is a Korean LLM with 204 billion parameters. In addition, there are always other slightly older or smaller open-source LLMs, such as GPT-2 and BERT and their many variations.

Beyond LLMs, DeepMind and Google have both hinted at innovative architectures for AI models, with Perceiver and Pathways respectively. Pathways has been criticized for being rather vague. However, we speculate that it may be based on Perceiver. And while we are in the realm of future technology, we should also mention DeepMind’s Neural Algorithmic Reasoning, a research direction that promises to combine classic computer science algorithms with deep learning.

A tour of AI, however condensed, would not be complete without an honorable mention of AI ethics. AI ethics remained top of mind in 2021, and we saw everyone from FTC commissioners to industry practitioners trying to address it in their own way. Let us also not forget the ongoing boom in AI applications in healthcare, a domain in which ethics should be a top priority, with or without AI.

Knowledge graphs, graph databases, graph AI

We have long been avid supporters of graphs in all shapes and sizes: knowledge graphs, graph databases, graph analytics, data science, and AI. So it is with mixed feelings that we report from this front. On the one hand, we did not see much innovation, except perhaps in one area: graph neural networks (GNNs). DeepMind’s Neural Algorithmic Reasoning leverages GNNs, too.

On the other hand, this is not necessarily a bad thing, for two reasons. First, the technology is seeing major uptake in the mainstream. Gartner predicts that by 2025, graph technologies will be used in 80% of data and analytics innovations, up from 10% in 2021, facilitating rapid decision making. Reports of use cases from the likes of BMW, IKEA, Siemens Energy, Wells Fargo, and UBS are no longer news, and that is a good thing. Yes, there are challenges in creating and maintaining knowledge graphs, but these challenges are for the most part well understood.

As we have said before, knowledge graphs are effectively a 20-year-old technology whose time in the limelight seems to have come. The methods for building knowledge graphs are well known, as are the challenges involved. So it is no coincidence that some of the most sought-after skills and areas of development in knowledge graphs have to do with how to build and maintain them using natural language processing and visual interfaces, and with scaling from single-user to multi-user scenarios.

To connect this conversation to the bigger picture of AI, where it belongs, the overarching challenges seem to be about operationalization and building the right expertise in teams, as these skills are in very high demand. Another important touchpoint is the hybrid AI direction, which is about infusing knowledge into machine learning. Leaders such as Intel’s Gadi Singer, LinkedIn’s Mike Dillinger, and Hybrid Intelligence Center’s Frank van Harmelen all point to the importance of knowledge organization in the form of knowledge graphs for the future of AI.

Knowledge graphs, graph databases, and graph AI are all converging

There is another important touchpoint between the bigger picture of AI and knowledge graphs: the data mesh and the data fabric. Admittedly, there has been plenty of confusion lately between these two, and among the many other data-related terms in circulation. Simply put, the data fabric is meant to serve as the technical foundation for the data mesh notion of decentralized data management in organizations. This is in fact a very good match for knowledge graph technology, and some vendors in this space have recognized that and positioned themselves accordingly. Even Informatica seems to have taken notice.

And what about the foundation for building knowledge graphs, namely graph databases? The phrase that seems to characterize 2021 for graph databases is “go to market.” It was a good year for graph databases. A graph database (Neo4j) entered the top 20 of the DB-Engines ranking for the first time. Neo4j also announced the general availability of its Aura managed cloud service and raised a $325 million Series F round, the largest funding round in database history, at a valuation of over $2 billion.

The graph database space saw a series of funding rounds, as well as an upcoming IPO. TigerGraph raised a $105M Series C, Katana Graph a $28.5M Series A, Memgraph $9.34M in seed funding, and TerminusDB €3.6M. Meanwhile, Bitnine, the maker of AgensGraph, has begun working toward what would be the first IPO in this market.

On the technical side, GraphQL continues to see adoption, either as part of a broader ecosystem or as a central component of the data architecture. Bridging the two graph database worlds, RDF and LPG, from a modeling perspective is still a work in progress, but 2021 saw some interesting developments.

We don’t think the world’s honeymoon with graphs and graph databases will last forever; at some point, disillusionment will inevitably follow the hype. But we are convinced that this technology is foundational.

