More Thoughts on Artificial Intelligence

How is artificial intelligence reshaping the world?

I published Thoughts on Artificial Intelligence on 19 July 2023. Since then, developments in AI have continued at a breathtaking pace. Here, I want to share some new thoughts I have on AI, as well as provide updates on some of the initial discussions.

Let’s start with the new thoughts, in no particular order (note that the caution from Thoughts on Artificial Intelligence that my thinking on AI is fragile still applies):

  • AI could be a long-term tailwind for the development of biotechnology drugs. AlphaFold is an AI model from Alphabet’s subsidiary, Google DeepMind, that is capable of predicting the structure of nearly every protein discovered by scientists thus far – this amounts to more than 200 million structures. And Alphabet is providing this data for free (a hypothetical sketch of how researchers can pull this freely available structure data appears after this list). Proteins are molecules that direct all cellular function in a living organism, including, of course, humans. A protein’s structure matters because it is what allows the protein to perform its job within an organism. In fact, diseases in humans can be caused by mis-structured proteins. Understanding the structure of a protein thus means knowing how it could affect the human body. Biotechnology drugs can be composed of proteins, and they tend to manipulate proteins, or the production of proteins, within the human body. According to an Economist article published in September this year, AlphaFold has been used by over 1.2 million researchers to date. Elsewhere, researchers from biotechnology giant Amgen noted in a recent paper that, with the help of AI, the company has cut the time it needs to develop a candidate drug up to the clinical-trial stage by 60% compared to five years ago. But the researchers also shared that AI could do more to help biotechnology companies make the development process for protein-based drugs faster and cheaper. An issue confronting biotechnology companies today is a lack of sufficient in-house data to build reliable models to predict the effects of protein-based drugs. The researchers proposed methods for biotechnology companies to share data to build more powerful predictive AI models in a way that protects their intellectual property. As AI technology improves over time, I’m excited to observe the advances in the protein-drug creation process that are likely to occur alongside.
  • It now looks even more possible to me that generative AI will have a substantial positive impact on the productivity of technology companies. For example, during Oracle’s earnings conference call that was held in September, management shared that the company is using generative AI to produce the code needed to improve all the features in Cerner’s system (Oracle acquired Cerner, a healthcare technology company, in June 2022), instead of writing the code manually in the Java programming language, as it usually does. Oracle’s management also said that even if AI code generators make mistakes, “once you fix the mistake, you fix it everywhere.” In another instance, MongoDB announced in late September this year that it’s introducing generative AI into its MongoDB Relational Migrator service, which helps reduce friction for companies that are migrating from SQL to NoSQL databases. When companies embark on such a migration, software code needs to be written to reshape relational rows into documents (a hypothetical sketch of this kind of code appears after this list). With generative AI, MongoDB is able to help users automatically generate the necessary code during the migration process.
  • The use of AI requires massive amounts of data to be transferred within a data centre. There are currently two competing data switching technologies for doing so – Ethernet and Infiniband – and they each have their supporters. Arista Networks builds high-speed Ethernet data switches. During the company’s July 2023 earnings conference call, management shared their view that Ethernet is the right long-term technology for data centres where AI models are run. In the other camp, there’s Nvidia, which acquired Mellanox, a company that manufactures Infiniband data switches, in 2020. Nvidia’s leaders commented in the company’s latest earnings conference call (held in late August this year) that “Infiniband delivers more than double the performance of traditional Ethernet for AI.” It’s also possible that better ways to move data around a data centre for AI workloads could be developed. In Arista Networks’ aforementioned earnings conference call, management also said that “neither technology… were perfectly designed for AI; Infiniband was more focused on HPC [high-performance computing] and Ethernet was more focused on general purpose networking.” I’m watching to see which technology (existing or new) will eventually have the edge here, as the market opportunity for AI-related data switches is likely to be huge (a back-of-envelope sketch of why switch bandwidth matters so much for AI appears after this list). For perspective, Arista Networks estimates the total data centre Ethernet switch market to be over US$30 billion in 2027, up from around US$20 billion in 2022.
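
To make the first bullet's point about freely available data concrete, here is a minimal sketch of how a researcher might pull one of AlphaFold's predicted structures over the web. It assumes the AlphaFold Protein Structure Database exposes a REST endpoint of the form shown below and returns the field names used here; treat the endpoint path, the example UniProt accession, and the field names as my assumptions rather than confirmed details.

```python
# Minimal sketch: fetching a predicted protein structure from the AlphaFold
# Protein Structure Database. The endpoint path and the field names below are
# assumptions and may differ from the live API.
import requests

UNIPROT_ACCESSION = "P69905"  # assumed example: human haemoglobin subunit alpha

def fetch_alphafold_prediction(accession: str) -> dict:
    """Return the first prediction record for a UniProt accession."""
    url = f"https://alphafold.ebi.ac.uk/api/prediction/{accession}"
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()[0]  # the endpoint is assumed to return a list of records

if __name__ == "__main__":
    record = fetch_alphafold_prediction(UNIPROT_ACCESSION)
    # 'pdbUrl' (assumed field name) points to the predicted 3D structure file,
    # which can be opened in structure viewers such as PyMOL.
    print(record.get("pdbUrl"))
```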
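
On the second bullet, the friction that MongoDB's Relational Migrator (and now generative AI within it) aims to remove is the hand-written code that reshapes relational rows into documents. The sketch below is a hypothetical example of that kind of code; the table layout, document shape, and names are mine for illustration, not MongoDB's.

```python
# Hypothetical sketch of the kind of migration code normally written by hand
# when moving from a relational (SQL) schema to a document (NoSQL) model,
# which is the sort of code that tools like MongoDB's Relational Migrator aim
# to generate automatically. Table names, columns, and the document shape are
# illustrative assumptions.
import sqlite3
from pymongo import MongoClient

def migrate_customers(sqlite_path: str, mongo_uri: str) -> None:
    sql_conn = sqlite3.connect(sqlite_path)
    sql_conn.row_factory = sqlite3.Row
    customers = MongoClient(mongo_uri)["shop"]["customers"]

    # One relational row per customer, plus a child table of orders,
    # folded into a single document per customer on the MongoDB side.
    for row in sql_conn.execute("SELECT id, name, email FROM customers"):
        orders = sql_conn.execute(
            "SELECT order_id, total FROM orders WHERE customer_id = ?",
            (row["id"],),
        ).fetchall()
        customers.insert_one({
            "_id": row["id"],
            "name": row["name"],
            "email": row["email"],
            "orders": [{"orderId": o["order_id"], "total": o["total"]} for o in orders],
        })

if __name__ == "__main__":
    migrate_customers("legacy.db", "mongodb://localhost:27017")
```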
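
And on the third bullet, a quick back-of-envelope calculation shows why the speed of the data-centre fabric matters so much for AI: training large models involves exchanging enormous volumes of gradient data between machines on every training step. The model size and link speeds below are illustrative assumptions, not figures from Arista or Nvidia.

```python
# Back-of-envelope sketch: time needed to move one full set of model gradients
# across a single network link at different speeds. All numbers are
# illustrative assumptions, not vendor figures.
GRADIENT_BYTES = 70e9 * 2  # e.g. a 70-billion-parameter model stored in 16-bit precision

LINK_SPEEDS_GBPS = {
    "100 Gb/s link": 100,
    "400 Gb/s link": 400,
    "800 Gb/s link": 800,
}

for name, gbps in LINK_SPEEDS_GBPS.items():
    seconds = GRADIENT_BYTES * 8 / (gbps * 1e9)  # bytes to bits, then divide by bits per second
    print(f"{name}: ~{seconds:.1f} s to move one full copy of the gradients")
```

The exact numbers are not the point; the point is that shaving seconds off every training step, repeated across thousands of steps and machines, is worth a great deal, which is why the Ethernet and Infiniband camps are fighting so hard over this market.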

Coming to the updates: in Thoughts on Artificial Intelligence, I discussed how AI software, especially generative AI, requires vector databases but that NoSQL databases will remain relevant. During MongoDB’s latest earnings conference call, held in August this year, management shared their view that the ability to perform vector searches (which is what vector databases do) will ultimately be just a feature that’s built into all databases. This is because standalone vector databases are point products that still need to be used with other types of databases in order for developers to build applications. I am on the same side as MongoDB’s management because of two things they shared during the company’s aforementioned earnings conference call. Firstly, they see developers preferring to work with multi-functional databases rather than bolting a separate vector solution onto other databases. Secondly, Atlas Vector Search – MongoDB’s vector search feature within its database service – is already being used by customers in production even though it’s currently just a preview product; to me, this signifies high customer demand for MongoDB’s database services within the AI community.
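
To make the “vector search as a database feature” idea concrete, here is a minimal sketch of what an Atlas Vector Search query looks like when run through MongoDB’s ordinary aggregation pipeline, using the $vectorSearch stage. The connection string, collection, index name, field names, and query embedding are placeholders and assumptions for illustration, and the exact stage options may vary by Atlas version.

```python
# Minimal sketch of a vector search expressed as an ordinary MongoDB
# aggregation, via the $vectorSearch stage provided by Atlas Vector Search.
# Connection string, collection, index name, field names, and the query
# embedding are illustrative placeholders.
from pymongo import MongoClient

ATLAS_URI = "mongodb+srv://user:password@cluster0.example.mongodb.net"  # placeholder
collection = MongoClient(ATLAS_URI)["support"]["articles"]

query_embedding = [0.12, -0.03, 0.88]  # in practice, produced by an embedding model

pipeline = [
    {
        "$vectorSearch": {
            "index": "articles_vector_index",  # assumed index name
            "path": "embedding",               # field holding each document's vector
            "queryVector": query_embedding,
            "numCandidates": 100,
            "limit": 5,
        }
    },
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
]

for doc in collection.aggregate(pipeline):
    print(doc)
```

The appeal for developers is exactly what MongoDB’s management described: the vector query above sits in the same pipeline, same driver, and same database as everything else, with no separate vector system to bolt on.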

I also touched upon the phenomenon of emergence in AI in Thoughts on Artificial Intelligence. I am even more confident now that emergence is present in AI systems. Sam Altman, the CEO of OpenAI, the company behind ChatGPT, was recently interviewed by Salesforce co-founder and CEO Marc Benioff. During their conversation, Altman said (emphases are mine):

“I think the current GPT paradigm, we know how to keep improving and we can make some predictions about – we can predict with confidence it’s gonna get more capable. But exactly how is a little bit hard. Like when, you know, why a new capability emerges at this scale and not that one. We don’t yet understand that as scientifically as we do about saying it’s gonna perform like this on this benchmark.”

In other words, even OpenAI cannot predict what new capabilities will spring forth from the AI models it has developed as their number of parameters and the amount of data they are trained on increase. The unpredictable formation of sophisticated outcomes is an important feature of emergence. It is also why I continue to approach the future of AI with incredible excitement as well as some fear. As AI models train on an ever-increasing corpus of data, they are highly likely to develop new abilities. But it’s unknown whether these abilities will be a boon or a bane for society. We’ll see!


Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. I currently have a vested interest in Alphabet and MongoDB. Holdings are subject to change at any time.