admin, Author at The Good Investors

What We’re Reading (Week Ending 21 April 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

But since our readership-audience for The Good Investors is wider than our subscriber base, we think sharing the reading list regularly on the blog itself can benefit even more people. The articles we share touch on a wide range of topics, including investing, business, and the world in general.

Here are the articles for the week ending 21 April 2024:

1. The Anguish of Central Banking – Arthur F. Burns

Why, in particular, have central bankers, whose main business one might suppose is to fight inflation, been so ineffective in dealing with this worldwide problem?

To me, as a former central banker, the last of these questions is especially intriguing. One of the time-honored functions of a central bank is to protect the integrity of its nation’s currency, both domestically and internationally. In monetary policy central bankers have a potent means for fostering stability of the general price level. By training, if not also by temperament, they are inclined to lay great stress on price stability, and their abhorrence of inflation is continually reinforced by contacts with one another and with like-minded members of the private financial community. And yet, despite their antipathy to inflation and the powerful weapons they could wield against it, central bankers have failed so utterly in this mission in recent years. In this paradox lies the anguish of central banking…

…Analyses of the inflation that the United States has experienced over the past fifteen years frequently proceed in three stages. First are considered the factors that launched inflation in the mid-1960s, particularly the governmental fine tuning inspired by the New Economics and the loose financing of the war in Vietnam. Next are considered the factors that led to subsequent strengthening of inflationary forces, including further policy errors, the devaluations of the dollar in 1971 and 1973, the worldwide economic boom of 1972-73, the crop failures and resulting surge in world food prices in 1973-74, the extraordinary increases in oil prices that became effective in 1974, and the sharp deceleration of productivity growth from the late 1960s onward. Finally, attention is turned to the process whereby protracted experience with inflation has led to widespread expectations that it will continue in the future, so that inflation has acquired a momentum of its own.

I have no quarrel with analyses of this type. They are distinctly helpful in explaining the American inflation and, with changes here and there, that in other nations also. At the same time, I believe that such analyses overlook a more fundamental factor: the persistent inflationary bias that has emerged from the philosophic and political currents that have been transforming economic life in the United States and elsewhere since the 1930s. The essence of the unique inflation of our times and the reason central bankers have been ineffective in dealing with it can be understood only in terms of those currents of thought and the political environment they have created…

…the period from World War II to the mid-1960s was marked not only by a dampening of the business cycle but also by persistent increases in the prosperity of American families…

…This experience of economic progress strengthened the public’s expectations of progress. What had once been a quiet personal feeling that the long future would be better than the past, particularly for one’s children, was transformed during the postwar years into an articulate and widespread expectation of steady improvement in living standards—indeed, into a feeling of entitlement to annual increases in real income.

But the rapid rise in national affluence did not create a mood of contentment. On the contrary, the 1960s were years of social turmoil in the United States, as they were in other industrial democracies…

…In the innocence of the day, many Americans came to believe that all of the new or newly discovered ills of society should be addressed promptly by the federal government. And in the innocence of the day, the administration in office attempted to respond to the growing demands for social and economic reform while waging war in Vietnam on a rising scale. Under the rubric of the New Economics, a more activist policy was adopted for the purpose of increasing the rate of economic growth and reducing the level of unemployment…

…The interplay of governmental action and private demands had an internal dynamic that led to their concurrent escalation. When the government undertook in the mid-1960s to address such “unfinished tasks” as reducing frictional unemployment, eliminating poverty, widening the benefits of prosperity, and improving the quality of life, it awakened new ranges of expectation and demand. Once it was established that the key function of government was to solve problems and relieve hardships—not only for society at large but also for troubled industries, regions, occupations, or social groups—a great and growing body of problems and hardships became candidates for governmental solution…

…Many results of this interaction of government and citizen activism proved wholesome. Their cumulative effect, however, was to impart a strong inflationary bias to the American economy. The proliferation of government programs led to progressively higher tax burdens on both individuals and corporations. Even so, the willingness of government to levy taxes fell distinctly short of its propensity to spend. Since 1950, the federal budget has been in balance in only five years. Since 1970, a deficit has occurred in every year. Not only that, but the deficits have been mounting in size. Budget deficits have thus become a chronic condition of federal finance; they have been incurred when business conditions were poor and also when business was booming. But when the government runs a budget deficit, it pumps more money into the pocketbooks of people than it withdraws from their pocketbooks; the demand for goods and services therefore tends to increase all around. That is the way the inflation that has been raging since the mid-1960s first got started and later kept being nourished.

The pursuit of costly social reforms often went hand in hand with the pursuit of full employment. In fact, much of the expanding range of government spending was prompted by the commitment to full employment. Inflation came to be widely viewed as a temporary phenomenon—or, provided it remained mild, as an acceptable condition. “Maximum” or “full” employment, after all, had become the nation’s major economic goal— not stability of the price level. That inflation ultimately brings on recession and otherwise nullifies many of the benefits sought through social legislation was largely ignored…

…And so I finally come to the role of central bankers in the inflationary process. The worldwide philosophic and political trends on which I have been dwelling inevitably affected their attitudes and actions. In most countries, the central bank is an instrumentality of the executive branch of government—carrying out monetary policy according to the wishes of the head of government or the ministry of finance. Some industrial democracies, to be sure, have substantially independent central banks, and that is certainly the case in the United States. Viewed in the abstract, the Federal Reserve System had the power to abort the inflation at its incipient stage fifteen years ago or at any later point, and it has the power to end it today. At any time within that period, it could have restricted the money supply and created sufficient strains in financial and industrial markets to terminate inflation with little delay. It did not do so because the Federal Reserve was itself caught up in the philosophic and political currents that were transforming American life and culture…

…Facing these political realities, the Federal Reserve was still willing to step hard on the monetary brake at times—as in 1966, 1969, and 1974—but its restrictive stance was not maintained long enough to end inflation. By and large, monetary policy came to be governed by the principle of undernourishing the inflationary process while still accommodating a good part of the pressures in the marketplace. The central banks of other industrial countries, functioning as they did in a basically similar political environment, appear to have behaved in much the same fashion.

In describing as I just have the anguish of central banking in a modern democracy, I do not mean to suggest that central bankers are free from responsibility for the inflation that is our common inheritance. After all, every central bank has some room for discretion, and the range is considerable in the more independent central banks. As the Federal Reserve, for example, kept testing and probing the limits of its freedom to undernourish the inflation, it repeatedly evoked violent criticism from both the Executive Branch and the Congress and therefore had to devote much of its energy to warding off legislation that could destroy any hope of ending inflation. This testing process necessarily involved political judgments, and the Federal Reserve may at times have overestimated the risks attaching to additional monetary restraint…

…Monetary theory is a controversial area. It does not provide central bankers with decision rules that are at once firm and dependable. To be sure, every central banker has learned from the world’s experience that an expanding economy requires expanding supplies of money and credit, that excessive creation of money will over the longer run cause or validate inflation, and that declining interest rates will tend to stimulate economic expansion while rising interest rates will tend to restrict it; but this knowledge stops short of mathematical precision…

…It is clear, therefore, that central bankers can make errors—or encounter surprises—at practically every stage of the process of making monetary policy. In some respects, their capacity to err has become larger in our age of inflation. They are accustomed, as are students of finance generally, to think of high and rising market interest rates as a restraining force on economic expansion. That rule of experience, however, tends to break down once expectations of inflation become widespread in a country. At such a time, lenders expect to be paid back in cheaper currency, and they are therefore apt to demand higher interest rates. Since borrowers have similar expectations, they are willing to comply. An “inflation premium” thus gets built into nominal interest rates. In principle, no matter how high the nominal interest rate may be, as long as it stays below or only slightly above the inflation rate, it very likely will have perverse effects on the economy; that is, it will run up costs of doing business but do little or nothing to restrain overall spending. In practice, since inflationary expectations, and therefore the real interest rates implied by any given nominal rate, vary among individuals, central bankers cannot be sure of the magnitude of the inflation premium that is built into nominal rates. In many countries, however, these rates have at times in recent years been so clearly below the ongoing inflation rate that one can hardly escape the impression that, however high or outrageous the nominal rates may appear to observers accustomed to judging them by a historical yardstick, they have utterly failed to accomplish the restraint that central bankers sought to achieve. In other words, inflation has often taken the sting out of interest rates— especially, as in the United States, where interest payments can be deducted for income tax purposes…

…There is a profound difference between the effects of mistaken judgments by a central bank in our age of inflation and the effects of such judgments a generation or two ago. In earlier times, when a central bank permitted excessive creation of money and credit in times of prosperity, the price level would indeed tend to rise. But the resulting inflation was confined to the expansion phase of the business cycle; it did not persist or gather force beyond that phase. Therefore, people generally took it for granted that the advance of prices would be followed by a decline once a business recession got under way. That is no longer the case.

Nowadays, businessmen, farmers, bankers, trade union leaders, factory workers, and housewives generally proceed on the expectation that inflation will continue in the future, whether economic activity is booming or receding. Once such a psychology has become dominant in a country, the influence of a central bank error that intensified inflation may stretch out over years, even after a business recession has set in. For in our modern environment, any rise in the general price level tends to develop a momentum of its own. It stimulates higher wage demands, which are accommodated by employers who feel they can recover the additional costs through higher prices; it results in labor agreements in key industries that call for substantial wage increases in later years without regard to the state of business then; and through the use of indexing formulas, it leads to automatic increases in other wages as well as in social security payments, various other pensions, and welfare benefits, in rents on many properties, and in the prices of many commodities acquired under long-term contracts…

…If my analysis of central banking in the modern environment is anywhere near the mark, two conclusions immediately follow. First, central banks have indeed been participants in the inflationary process in which the industrial countries have been enmeshed, but their role has been subsidiary. Second, while the making of monetary policy requires continuing scrutiny and can stand considerable improvement, we would look in vain to technical reforms as a way of eliminating the inflationary bias of industrial countries. What is unique about our inflation is its stubborn persistence, not the behavior of central bankers. This persistence reflects the fundamental forces on which I dwelt earlier in this address—namely, the philosophic and political currents of thought that have impinged on economic life since the Great Depression and particularly since the mid-1960s…

…The precise therapy that can serve a nation best is not easy to identify, and what will work well in one country may work poorly in another. In the case of the American inflation, which has become a major threat to the well-being of much of the world as well as of the American people, it would seem wise to me at this juncture of history for the government to adopt a basic program consisting of four parts. The first of these would be a legislative revision of the federal budgetary process that would make it more difficult to run budget deficits and that would serve as the initial step toward a constitutional amendment directed to the same end. The second part would be a commitment to a comprehensive plan for dismantling regulations that have been impeding the competitive process and for modifying others that have been running up costs and prices unnecessarily. The third part would be a binding endorsement of restrictive monetary policies until the rate of inflation has become substantially lower. And the fourth part would consist of legislation scheduling reductions of business taxes in each of the next five years—the reduction to be quite small in the first two years but to become substantial in later years. This sort of tax legislation would release powerful forces to improve the nation’s productivity and thereby exert downward pressure on prices; and it would also help in the more immediate future to ease the difficult adjustments forced on many businesses and their employees by the adoption of the first three parts of the suggested program.

2. Two Things I’m Not Worried About – Ben Carlson

Here are two things a lot of other people are worried about but not me:

Stock market concentration. Here’s a chart from Goldman Sachs that shows by one measure, the U.S. stock market is as concentrated as it has ever been:

To which my reply is: So what?

Yes, the top 10 stocks make up more than one-third of the S&P 500. All this tells me is that the biggest and best companies are doing really well. Is that a bad thing?

Stock markets around the globe are far more concentrated than the U.S. stock market. Emerging markets rose to their highest level since June 2022 yesterday. Out of an index that covers 20+ countries, a single stock (Taiwan Semiconductor) accounted for 70% of the move.

Stock market returns over the long run have always been dominated but a small minority of the biggest, best-performing companies…

… Bloomberg is out with a new report that sounds the alarm on U.S. government debt levels:

With uncertainty about so many of the variables, Bloomberg Economics has run a million simulations to assess the fragility of the debt outlook. In 88% of the simulations, the results show the debt-to-GDP ratio is on an unsustainable path — defined as an increase over the next decade.

In the end, it may take a crisis — perhaps a disorderly rout in the Treasuries market triggered by sovereign US credit-rating downgrades, or a panic over the depletion of the Medicare or Social Security trust funds — to force action. That’s playing with fire.

I’ll believe it when I see it.

People have been sounding the alarm on government debt in this country for decades. There has been no panic. No financial crisis. No debt default…

… Interest expense relative to the size of the economy has shot higher in recent years from the combination of more debt and higher rates:

But we’re still well below the highs from the 1980s and 1990s. And when you look at the absolute numbers here, going from 1.5% of GDP to 3% of GDP isn’t exactly the end of the world…

…Debt-to-GDP is now as high as it was in World War II:

That seems scary until you realize in Japan, debt-to-GDP is closer to 300%. I’m not saying we should test our limits but there is no pre-set line in the sand on these things.

3. The inequity method of accounting – Sujeet Indap

The fundamental bargain of M&A seems pretty simple. At the closing of a deal, the buyer pays the seller, and gets a business in return.

It hasn’t been so straightforward for the family who agreed in 2022 to sell its California supermarket Save Mart to the private equity firm Kingswood Capital Management, which valued the grocery chain at $245mn.

Three months after the papers were signed, Kingswood demanded that Save Mart’s prior owners, the Piccinini family, fork back over $109mn after already surrendering the company. In effect, Kingswood wanted to receive a net $77mn payment to take over Save Mart.

And thanks to some ballsy lawyering and nebulous bookkeeping, it seems the PE firm might actually succeed, its gambit upheld by a controversial arbitration ruling in September 2023…

…When Kingswood signed the deal for Save Mart, it was really acquiring two separate businesses. One was the Save Mart grocery chain, comprised of 200 stores and more than $4bn in annual revenue. Save Mart separately held a majority stake in Superstore Industries (SSI), a successful food wholesaler/distributor that had two other owners…

…The two sides agreed that Save Mart’s equity stake in SSI, the joint venture, would be valued at $90mn, a significant step up from the ~$22.5mn value that Save Mart had assigned the investment on its books.

The increase reflected SSI’s valuable land portfolio, according to one person familiar with the transaction. And it enables Kingswood to lower SSI’s tax basis should it ever want to sell SSI, according to a person involved in the transaction.

Those seem reasonable enough. Still, the accounting of SSI’s value is what laid the foundation for this dispute.

For context, a company’s investments can be recorded on its balance sheet in three ways: cost method, equity method, and full consolidation.

Save Mart selected the equity method for its SSI stake.

To explain a bit further: Let’s imagine a company with $100 of asset value and $60 in liabilities, which leaves it with an equity value of $40. Say this company has a 50-per-cent owner, meaning it owns $20 in equity. The owner’s balance sheet would list that $20 as a single line item, called “equity in unconsolidated affiliates”. That account would grow with the subsidiary’s proportional net income, and decrease with any net losses or dividends.

Save Mart’s stake in SSI was listed as a single line on its balance sheet — worth $22.5mn…

…In March 2022, Kingswood and Save Mart closed their deal with the PE firm sending payments based on the family’s proposed accounting. That then set off a final round of post-closing negotiations, where Kingswood got 90 days to argue with the Piccinini’s maths…

…But Kingswood dropped in one massive adjustment with the boilerplate.

It added back $109mn of gross SSI debt, and asserted that the figure counted as official “Indebtedness”. And it argued it should be paid back for all that additional debt.

The PE firm pointed to the language in the deal contract, and said the definition of “Indebtedness” included any Save Mart “group” debt…

…Arbitrator Joseph Slights III, a lawyer in private practice who was formerly a Delaware Vice Chancellor, did not ultimately buy any of what the Piccinnis were selling.

He wrote in the arbitration decision: “Delaware law is more contractarian than most, and Delaware courts will enforce the letter of the parties’ contract without regard for whether they have struck a good deal or bad deal . . . the result is not absurd or commercially unreasonable.”…

…The Piccinnis, understandably, believe writing a cheque for $109mn is indeed “absurd” and “commercially unreasonable”. They have accused Kingswood of “bad faith” and “gamesmanship” in their court papers.

They will now appeal to the Delaware Supreme Court, pointing to a 2017 decision that said in a post-closing adjustment dispute, the legal system should aim to uphold the broader spirit of the contract instead of narrow contract definitions…

…Kingswood had believed, all along prior to signing and closing, that the gross SSI debt belonged on Save Mart’s main balance sheet. But they decided to keep quiet about that until after the deal closed.

One implication is that they were happy to close on the Piccininis’ terms, and winning on the SSI debt issue would be a bonus, given that there was no guarantee of winning the arbitration.

The firm’s equity check on the $240mn transaction was just $60mn (see the sources and uses table above). If Kingswood is eventually paid the $109mn, it will receive nearly two times their equity contribution by weaponising accounting and legal technicalities.

4. Don’t Be Afraid – Michael Batnick

All-time highs are interesting in the emotions they elicit. Some people might be euphoric as their accounts reach dollar amounts never seen before. Others might fear this is as good as it’s going to get and worry about a trap-door scenario.

Your emotional state might also depend on your asset allocation. If you’re sitting on a large cash pile, it’s understandable that you might be hesitant to go “all in” at a record price. It might not “feel” right.

The good news is the data doesn’t support those feelings. On average since 1970, the S&P 500 has done better 1, 3, and 5 years after making an all-time high than picking a random day.

5. An Interview with Google Cloud CEO Thomas Kurian About Google’s Enterprise AI Strategy – Ben Thompson and Thomas Kurian

You did mention that, “People are moving out of proof-of-concept into actually doing products”. Is that actually happening? What are the actual use cases that companies are actually rolling out broadly as opposed to doing experiments on what might be possible?

TK: Broad-brush, Ben, we can break it into four major categories. One category is streamlining internal processes within the organization, streamlining internal processes. In finance, you want to automate accounts receivable, collections, and cashflow prediction. In human resources, you want to automate your human help desk as well as improve the efficiency with which you can do benefits matching, for example. In procurement and supply chain, you want for example, look at all my suppliers, their contracts with me and tell me which ones have indemnification and warranty protection, so I can drive more volume to those that give me indemnification and warranties and less to those that don’t, for example. These are all practical cases we have customers live in deployment with.

Second is transforming the customer experience. Transforming the customer experiences, how you market, how you merchandise, how you do commerce, how you do sales and service. An example is what Mercedes-Benz CEO Ola Källenius talked about how they’re building a completely new experience for the way that they market and sell and service their vehicles.

Third is that some people are integrating it into their products, and when I say re-imagining their products, re-imagining their core products using AI. We had two examples of companies who are in the devices space. One is Samsung and the other one is Oppo, and they’re re-imagining the actual device itself using AI with all the multimodality that we provide.

There are quite a few companies now re-thinking that if a model can change the way that I see it, that I can process multimodal information. For example, in media we have people saying, “If your model can read as much information as it can, can it take a long movie and shrink it into highlights? Can I take a sports recording of the NCAA basketball final and say, ‘find me all the highlights by this particular player’?” and not have to have a human being sit there and splice the video, but have it do it and I can create the highlights reel really quickly. So there are lots of people re-imagining the product offerings that they have.

And finally, there are some people saying, “With the cost efficiency of this, I can change how I enter a brand new market because, for example, I can do personalized offers in a market where I may not have a physical presence, but I can do much higher conversion rate for customers with online marketing and advertising because now I can do highly tailored campaigns because the cost of creating the content is much lower.” So broad-brush, streamline the core processes and back office, transform the customer experience and it doesn’t mean call centers or chatbots, it can be actually transferring the product itself, transforming the nature of the product you build and enter new markets.

Is it fair to say then when you talk about, “Moving from proof-of-concept to actual production”, or maybe that’s not the words you used, but people are saying, “Okay, we’re going to build this” because this stuff’s not showing up yet, in the real world. Is it the case that, “We see that this could be valuable, now we’re in”, and that’s why you’re emphasizing the platform choice now because they’ve committed to AI broadly, and now it’s like, “Where are we going to build it”?

TK: We have people experimenting, but we also have people actually live deployment and directing traffic. Orange, the telecom company, was talking about how many customers they’re handling online, Discover Financial was talking about how their agents are actually using AI search and AI tools to discover information from policy and procedure documents live. So there are people actually literally running true traffic through these systems and actually using them to handle real customer workload.

Are you seeing the case in a lot of in customers, or maybe you’re hearing from potential customers, that AI is rolling out, if that’s the right word, in an employee arbitrage situation? Where there’s individual employees that are taking on themselves to use these tools and they are personally benefiting from the increased productivity — maybe they’re doing less work or maybe they’re getting more done — and the companies want to capture that more systematically. Is that a theme that you’re seeing?

TK: We’re seeing three flavors. Flavor one is a company has, we’re going to try eight or nine, what they call customer journeys or use cases, we’re going to pick the three that we see as the maximum return, meaning value and value does not mean cost savings always. It could be, for example, we have one who is handling 1 million calls a day through our customer service system. Now a million calls a day, if you think about it, Ben, an average person can do about 250 calls a day, that’s a certain volume in an eight-hour day. If you handled a million, that is a lot of people, so the reality is that several of them were not being answered and people never called because the wait time was so long. So in that case, it was not about cost savings, it’s the fact that they’re getting able to reach many more customers than they could do before. So that’s one. One part is people saying, “I have a bunch of scenarios, I’m going to pick the three”, and in many cases, they’re actually augmenting something they’re doing or doing something they couldn’t do before, that’s scenario one.

Scenario two was I have, for example, there’s a large insurance company that’s working with us. Today, when they do claims and risk calculation, it takes a long time to handle the claims and the risk, particularly the risk calculation, because there’s thousands of pages of documents, there’s a lot of spreadsheets going back and forth. They put it into Gemini and it was able to run the calculations much, much more quickly. So second is I’m picking a very high value use case for my organization, which is the core function, and I’m going to implement it because I can get a real competitive advantage. In their case, it’s the fact that they can both get more accurate scoring on the risk and they can also do a much more accurate job, faster job in responding.

And the third scenario is what you said. “Hey, we’ve got a bunch of people, we’re going to give it to a certain number of developers”. For example, our coding tool, “They are going to test it, they say it helps me generate much better unit tests, it helps me write better quality code”. Wayfair’s CTO was talking about what their experience is, and then they say, “Let’s go broadly”, so all three patterns are being seen…

…Do you see AI, though, in all this talk about, “You need to choose a platform? Sure, our platform’s going to be open, you can use it anywhere” — but do you see this as a wedge to be like, “Okay, this is a reboot broadly for the industry as far as cloud goes, and sure, your data may be in AWS, or in Azure, or whatever it might be, but if you have a platform going forward, you should start with us”? Then maybe we’ll look up in ten, fifteen years, and all the center of gravity shifted to wherever the platforms are?

TK: For sure. I mean, it’s a change in the way that people make purchase decisions, right? Ten years ago, you were worried about commodity computing, and you were like, “Who’s going to give me the lowest cost for compute, and the lowest cost for storage, and the lowest cost for networking?”. Now the basis of competition has changed and we have a very strong position, given our capability both at the top, meaning offering a platform, offering models, et cetera, and building products that have long integrated models.

Just as an example, Ben, integrating a model into a product is not as easy as people think; Gmail has been doing that since 2015. On any daily basis, there are over 500 million operations a day that we run and to do it well, when a partner talked about the fact that 75% of people who generate an image for slides actually end up presenting it, it’s because we have paid a lot of attention over the years on how to integrate it.

So we play at the top of the stack, and we have the infrastructure and scale to do it really well from a cost, performance, and global scale that changes the nature of the competition. So we definitely see this, as you said, as a reset moment for how customers thinking of choosing their cloud decision.

If you’re talking about a lot of choices about models, and customers were over-indexed on choosing the correct model, that implies that models are maybe a commodity, and that we’ve seen with GPT-4 prices are down something like 90% since release. Is that a trend you anticipate continuing, and is it something that you want to push and actually accelerate?

TK: Models — whether they’re a commodity or not, time will tell, these are very early innings. All we’re pointing out is every month, there’s a new model from a new player, and the existing models get better on many different dimensions. It’s like trying to pick a phone based on a camera, and the camera’s changing every two weeks, right? Is that the basis on which you want to make your selection?

Well, but if you make that basis, then you might be locked into the operating system.

TK: That’s right, and so that’s why we say you should choose an open platform, and you should be able to use a collection of different models, because it’s changing, and don’t lock into a particular operating system at a time when the applications on top of it are changing, to use your analogy.

Why is your platform open as compared to others? Microsoft has announced you can use other models, not just OpenAI models. Amazon is sort of, to the extent you can ascertain a strategy, it’s like, “Look, we’re not committing to anything, you could do whatever you want.” Why do you feel comfortable saying, “No, we’re the open one,” and they’re not?

TK: Well, first of all, the completeness of our platform; Vertex has a lot more services than you can get with the other platforms. Secondly, in order to improve a platform, you have to have your own model, because there’s a bunch of things you do when you engineer services with that model.

I’ll give you a really basic example. You use a model, you decide to ground the answers. Grounding improves quality, but can also introduce latency. How do you make sure that when you’re grounding, you’re not serially post-processing a model’s answer to add latency? Unless you have your own model, you wouldn’t even get to that. So because we have our own model, we’re able to engineer these things, but we make them available as services with other models, so you can use enterprise grounding as a very specific example. There are lots of customers using it with Mistral and with Llama and with Anthropic.

Second thing, we are not just offering models, but we’re actually helping the third party go to customers with us. I met a lot of customers today jointly with [CEO] Dario [Amodei] from Anthropic, and it’s a commitment to make sure we’re not just giving you our infrastructure, we’re not just training, integrating a model into Vertex, we’re not just making it a first-class model, but we’re actually bringing it to clients together.

I think that’s what we mean by open. One of the other players has no models of their own, so naturally they’re offering a bunch of models, and the other player has outsourced their model development to a third party…

…How important is that million context window in the story you are telling? My perception is, there’s a lot of stuff you could do if you build a lot of infrastructure around it, whether it be RAG or other implementations, but it feels like with Gemini 1.5 there are jack-of-all-trades possibilities that seem to open up to a much greater extent, and there’s a bit where, you had that compliance bit, the statements of work and they had to compare it to the 100-page compliance document. I got some comments like, “Maybe companies shouldn’t have 100-page compliance notebooks or whatever it might be”, but the reality is, that’s the case, the world has that. My perception of the keynote is, that was the killer feature, that seemed to undergird everything. Was that the correct perception?

TK: Yeah, there are two reasons. Just to be perfectly clear, Ben, the long context window allows you to do three things that are important. First of all, when you look at high definition video, for example, and other modalities, and just imagine you’re dumping a high definition video in and you want to create out of the NCAA final, which just happened, the highlight reel but you don’t want to specify every attribute about what you want spliced into the highlight reel. The model has to digest it and because it has to process it, it’s a fairly dense representation of the video because there are objects, there are people moving, there are actions, like I’m throwing a pass. They could be, I have my name on the back of my t-shirt, there could be a score like, “When did they change from 24 to 26 points? Did they score three pointers?”, so there are many, many, many dimensions. So reasoning becomes a lot better when you can take a lot more context, that’s one, and it’s particularly true of modality.

The second is, today people don’t use models to maintain state or memory, meaning they ask it a question, the next time they think, “Hey, it may not remember”, so when you’re able to maintain a longer context, you can maintain more state, and therefore you can do richer and richer things rather than just talk back-and-forth with a very simplistic interface. You see what I mean?

The third thing is, there are certainly complex scenarios, it’s the unfortunate reality, there’s lots of policies and procedure books that are even longer than what we showed, and so there are scenarios like that that we have to be able to deal with. But in the longer term, the real breakthrough is the following. Context length, if you can decouple the capabilities of the model and the latency to serve a model from the context length, then you can fundamentally change how quickly you can scale a model.

Is this ultimately, from your perspective, a question of infrastructure, and that just leans into Google’s biggest advantage?

TK: It’s a question of global infrastructure, but also optimizations at every layer in the infrastructure, which we can co-engineer with DeepMind…

…Sundar Pichai mentioned in his video greeting, he emphasized the number of AI startups, and particularly AI unicorns using Google Cloud. To go back to the reboot idea, do you view the AI Era as a restart in terms of capturing the next generation of companies? I mean, obviously, AWS had a huge advantage here as far as general cloud computing, the entire mobile app ecosystem was by and large built on AWS. In the enterprise era, you have to deal with what’s there, what they’ve already dealt with, you have to have the integrations. Do you see yourself as having this as a big focus, “We’re going to own this era of startups”?

TK: Yes. And by the way, every one of those startups is being pursued by the other two, and the fact that 90% of the unicorns and 60% of all AI-funded startups, up in each case by ten points in eight months, and they are the most discerning ones. I mean, just to be frank, the unicorns, for them, it is the really biggest cost of goods sold in their P&L.

So what’s the driver there?

TK: The efficiency of our infrastructure.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Alphabet (parent of Google), Amazon (parent of AWS), Microsoft, and TSMC. Holdings are subject to change at any time.

What We’re Reading (Week Ending 14 April 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 14 April 2024:

1. Perplexity is ready to take on Google – Alex Heath and Aravind Srinivas

What’s it like on the frontlines of the AI talent war right now?

I made mistakes in chasing the wrong people. Recently there was a really senior backend engineer who ended up joining X.AI. He was talking to us, too.

I was talking to Patrick Collison for advice on this, and he said, “Why are you even in this race? Why are you trying to compete with these people? Go after people who want to actually build the stuff that you’re building and don’t chase AI clout.”

There are a lot of good engineers who are applying to us and Anthropic and OpenAI and X.AI and Character.ai. These are the top five choices of AI startups. And people normally just go to the highest bidder. Whoever has the highest valuation will be able to win this race all the time because, on paper, you’re always going to be able to offer the same amount of shares but the dollar value is going to be much higher…

...Have you taken any kind of lesson away from the Gemini diversity scandal? I saw you recently integrated photo generation into Perplexity.

Factfulness and accuracy is what we care about. Google has many other cultural things that they care about, and that’s why they made their products that way. They should only prioritize one aspect, which is giving an accurate answer. They don’t do that for whatever reasons. They have all these other people in the room trying to make decisions.

If I learned one thing, it’s that it’s better to be neutral. Don’t try to have any values you inject into the product. If your product is an answer engine, where people can ask questions and get answers, it better respond in a scholarly way. There’s always a nerd in your classroom who’s just always right, but you don’t hate them for having a certain political value, because they are just going to give you facts. That’s what we want to be. And Google’s trying to be something different. That’s why they got into trouble.

What are you hearing generally about the state of Google from people there right now?

The researchers are still pretty excited about what they’re doing. But the product team messes up their releases. The Gemini product team was fine-tuning all these models to put in the product. There’s a lot of bureaucracy, basically.

I know Sergey Brin being there is making things faster and easier for them. You might have seen the video that was circulating of him being at some hackathon. He brushed it [the Gemini diversity scandal] off as just some kind of a small bug, right?

It’s not a small bug. It’s actually poor execution. The image generation thing is actually very easy to catch in testing. They should have caught it in testing. When you consider Google as the place for de facto information and correctness, when they make mistakes it changes the way you perceive the company…

…How much of your tech is in-house versus fine-tuning all these models that you work with? What’s your tech secret sauce?

In the beginning, we were just daisy-chaining GPT-3.5 and Bing. Now, we post-train all these open-source models ourselves. We also still use OpenAI’s model.

We are never going to do the full pre-training ourselves. It’s actually a fool’s errand at this point because it takes so much money to even get one good model by pre-training yourself. There are only four or five companies that are capable of doing that today. And when somebody puts out these open-source models, there’s no reason for you to go and recreate the whole thing.

There is a new term that has emerged in this field called post-training. It’s actually like fine-tuning but done at a much larger scale. We are able to do that and serve our models ourselves in the product. Our models are slightly better than GPT-3.5 Turbo but nowhere near GPT-4. Other than Anthropic and Gemini, nobody has actually gotten to that level yet.

How are you doing to solve AI hallucination in your product? Can you?

The reason why we even have sources at the top of the answer is because we want to make sure that users have the power to go verify the answer. We precisely tell you which link to go to versus showing ten blue links and you not being sure which to read.

The other way is constantly improving the authority of which sources we use to cite the answer and then getting rid of the bad ones. When you don’t have sufficient information, it’s better to say you don’t know rather than saying something you made up.

2. Book Summary: Our Investing Strategy, who does the market smile upon – Made In Japan

He goes by the name of Tatsuro Kiyohara, who was the CIO of Tower Investment Management, which ran the flagship K-1 fund that compounded 20% annually during his 25-year run (that’s 9300%). Compare this to the TOPIX which did an annualized return of roughly 3%.

But its not just the numbers that he posted that were inspiring, the journey to get there was a tumultuous one that would be almost impossible for us to replicate. He is built differently. Who else is willing to pour in almost their entire net worth when the fund is down -72% in an attempt to save his fund not just for his sake, but for the clients that decided to stick with him amid all the redemptions?..

…During his stint in New York, his clients included the familiar hedge funds we’d all heard of. One of which was Julian Robertson’s Tiger Management. Meeting Tiger was perhaps the first instance he decided that he wanted to do something with hedge funds, he would spend his days talking stock at their office. Tiger appreciated him too – one time he realized that Tiger was short a stock, Kawasaki Steel, and also realized Nomura was attempting to ‘promote’ the stock (what you call ‘pump’ these days), he almost had to fight with Tiger to convince exiting and that the stock was going up regardless of fundamentals which they finally obliged. This stock 3xed not long after. Not a surprise that he was invited to Tiger’s annual party to be awarded best salesman of the year…

…The game is to figure out when you’re in the minority. This doesn’t always mean that because everyone is bullish, that you being bullish isn’t a variant perception. If for example the market expects a company to grow by 10% a year for the next 5 years, and you believe it will be more like 30%, you are still in the minority.

It is more difficult to thus have a good investment idea in large caps because its harder to figure out what’s priced in.

From an opportunity cost standpoint for an individual investor, the return on time is so much better researching cheap micro & small-caps with the potential to grow. If you had an hour with the CEO of a large company or a small one the likelihood you will get more valuable insights (your alpha) is much higher in the latter. In general, you also won’t lose much if you’re wrong here, these companies aren’t priced for growth anyway. For him personally, he tries to avoid investing in stocks that have already gone up as much as possible…

…One of his first investments with the fund that falls into this archetype was Nitori, which is a household name today but when he first invested nobody touched it. It was a Hokkaido-based company, and the furniture market in Japan was shrinking. What he realized though was that the market was very fragmented, and he saw Nitori as the one to take market share over time with an exceptional founder. Which proved to be correct. His investment in NItori 10xed all the while the market halved. The lesson here was that even with a shrinking market if you find the right company, you can generate strong returns. No doubt there are some diamonds in the rough…

…In the end, 90% of investing in Small/Micro-caps is about Management
Heres what he looks for:

Operators with a growth mindset
Talented employees that are aligned with the CEO’s Vision and Mission
Doesn’t get crushed by competitors
A widening core competence (read = Moat) as it grows
Not in an industry where you’re pulling forward demand (There is a finite pile of demand that once absorbed will be gone, he points out the Japanese M&A industry as one)
The management is impeccable with their word that is, they do what they say
Companies that have positive feedback loops…

…With one company, every time he requested a meeting with Investor Relations, the CEO showed up every time without fail which he found strange. Eventually, though this business got caught in an accounting scandal and went under. Maybe if the CEO shows up too readily you need to be careful. Another business had zero interest in doing IR, showing up in their factory uniform, and wasn’t too friendly. One day however they show up in suits! This business also didn’t too well…

…There are mavericks among them. Founder of Zensho Holdings (the operator of Beef bowl chain Sukiya). This was an unpopular stock that IPOd. The first thing he saw when visiting the founder’s office was a bench press. He was ‘ripped like Popeye’. At the meeting, all he talked about was how superior a food a ‘beef bowl’ (Gyuudon) was. If Japanese people had eaten enough beef bowls and benchpressed enough Japan wouldn’t have lost the war (LOL).

One time one of Kiyohara-sans employees told this founder he’s also been going to the gym, he immediately challenged said employee and tested him with a ‘lariat’ a type of wrestling tackle. The key is to find a CEO who knows his wrestling moves.

The IR was also interesting. In its mid-term plan, they even included the P/E multiple the company should be trading at in 5 years…

…As he once reflected on his portfolio, he realized that the business and its management was like looking at himself in the mirror. If assessing management correctly is the key to investing, also understand that there is a self-selection bias. If the Founder and CEO is that much more brilliant than you – you won’t even realize how brilliant he is. That said you’d never invest in a ‘dumb’ CEO, so ultimately you end up selecting people ‘on your level’. This it appears, to be the reality for investing in microcaps…

…He was adamant about that whether large or small – he had no interest in buying an expensive business. If the P/E was too high, that was a pass.

Over the years he tried building various growth models and realized this had almost no benefit to making money so just stopped. Was a waste of time.

He screens for such companies by finding a high net cash ratio which was just net cash over market cap. (So basically net-nets).

He also liked to invert the problem, by looking at the current P/E of the stock you can figure out the kind of earnings growth it implied. No rocket science here – he tries to figure out the Terminal Multiple with a Perpetual Growth model. For example, if the risk-free rate is 2% and the P/E is 10x that implies a terminal growth of -8.2% all else equal if earnings growth was -3.1% instead then the P/E should be 20x. (Yes this is all negative growth)

The 40-year average of the risk-free rate is about 1.7% in Japan so that sounds fair to him. Also, he says, forget about the concept of equity risk premium – this is just a banker’s term to underwrite uncertainty. If you’re uncertain just model that into your earnings projections…

…He doesn’t look at P/B where Investors may be calculating the liquidation price which can be inaccurate.

The point he thinks many miss – if a company is loss-making would the hypothetical buyer buy the assets at face value? And say that the business is a decent profitable business – no one’s going to be looking at the P/B they’ll be fixated on the P/E…

…Risks of small/micro caps:

Illiquidity discount
Many businesses are small suppliers to a much larger company
It operates in an industry with low entry barriers
Limited talent within the organization
Succession issues and nepotism – the son of the owner can be a dumb ass
Because no one notices them – the likelihood of a fraud/scandal is higher
When the owner retires, this person may pay out a massive retirement bonus
Because it’s an owner-operator and harder to take over, no one keeps them in check and may screw up
When there is an accounting fraud, the damage will be large
They don’t have the resources to expand overseas
They have an incentive to keep their valuation low to minimize the inheritance tax.

3. Three reasons why oil prices are remarkably stable – The Economist

Shouldn’t oil prices be surging? War has returned to the Middle East. Tankers in the Red Sea—through which around 12% of seaborne crude is normally shipped—are under attack by Houthi militants. And opec, a cartel of oil exporters, is restricting production. Antony Blinken, America’s secretary of state, has invoked the spectre of 1973, when the Yom Kippur war led to an Arab oil embargo that quadrupled prices in just three months. But oil markets have remained calm, trading mostly in the range of $75 and $85 per barrel for much of last year…

…Oil production is now less concentrated in the Middle East than it has been for much of the past 50 years. The region has gone from drilling 37% of the world’s oil in 1974 to 29% today. Production is also less concentrated among members of OPEC… That is partly because of the shale boom of the 2010s, which turned America into a net energy exporter for the first time since at least 1949…

…Another reason for calm is opec members’ ample spare production capacity (ie, the amount of oil that can be produced from idle facilities at short notice)…

…America’s Energy Information Administration (eia) estimates that opec’s core members have around 4.5m barrels per day of spare capacity—greater than the total daily production of Iraq…

…The world still has a big appetite for oil: according to the eia demand hit a record in 2023 and will be higher still in 2024, thanks in part to growth in India. But that is unlikely to push prices much higher. Global growth is not at the levels seen in the early 2000s. China, long the world’s biggest importer of oil, is experiencing anaemic economic growth. Structural changes to its economy also make it less thirsty for the stuff: next year, for example, half of all new cars sold in the country are expected to be electric.

4. How We’ll Reach a 1 Trillion Transistor GPU – Mark Liu and H.S. Philip Wong

All those marvelous AI applications have been due to three factors: innovations in efficient machine-learning algorithms, the availability of massive amounts of data on which to train neural networks, and progress in energy-efficient computing through the advancement of semiconductor technology. This last contribution to the generative AI revolution has received less than its fair share of credit, despite its ubiquity.

Over the last three decades, the major milestones in AI were all enabled by the leading-edge semiconductor technology of the time and would have been impossible without it. Deep Blue was implemented with a mix of 0.6- and 0.35-micrometer-node chip-manufacturing technology. The deep neural network that won the ImageNet competition, kicking off the current era of machine learning, was implemented with 40-nanometer technology. AlphaGo conquered the game of Go using 28-nm technology, and the initial version of ChatGPT was trained on computers built with 5-nm technology. The most recent incarnation of ChatGPT is powered by servers using even more advanced 4-nm technology. Each layer of the computer systems involved, from software and algorithms down to the architecture, circuit design, and device technology, acts as a multiplier for the performance of AI. But it’s fair to say that the foundational transistor-device technology is what has enabled the advancement of the layers above.

If the AI revolution is to continue at its current pace, it’s going to need even more from the semiconductor industry. Within a decade, it will need a 1-trillion-transistor GPU—that is, a GPU with 10 times as many devices as is typical today…

…Since the invention of the integrated circuit, semiconductor technology has been about scaling down in feature size so that we can cram more transistors into a thumbnail-size chip. Today, integration has risen one level higher; we are going beyond 2D scaling into 3D system integration. We are now putting together many chips into a tightly integrated, massively interconnected system. This is a paradigm shift in semiconductor-technology integration.

In the era of AI, the capability of a system is directly proportional to the number of transistors integrated into that system. One of the main limitations is that lithographic chipmaking tools have been designed to make ICs of no more than about 800 square millimeters, what’s called the reticle limit. But we can now extend the size of the integrated system beyond lithography’s reticle limit. By attaching several chips onto a larger interposer—a piece of silicon into which interconnects are built—we can integrate a system that contains a much larger number of devices than what is possible on a single chip…

…HBMs are an example of the other key semiconductor technology that is increasingly important for AI: the ability to integrate systems by stacking chips atop one another, what we at TSMC call system-on-integrated-chips (SoIC). An HBM consists of a stack of vertically interconnected chips of DRAM atop a control logic IC. It uses vertical interconnects called through-silicon-vias (TSVs) to get signals through each chip and solder bumps to form the connections between the memory chips. Today, high-performance GPUs use HBM extensively…

…With a high-performance computing system composed of a large number of dies running large AI models, high-speed wired communication may quickly limit the computation speed. Today, optical interconnects are already being used to connect server racks in data centers. We will soon need optical interfaces based on silicon photonics that are packaged together with GPUs and CPUs. This will allow the scaling up of energy- and area-efficient bandwidths for direct, optical GPU-to-GPU communication, such that hundreds of servers can behave as a single giant GPU with a unified memory. Because of the demand from AI applications, silicon photonics will become one of the semiconductor industry’s most important enabling technologies…

…We can see the trend already in server GPUs if we look at the steady improvement in a metric called energy-efficient performance. EEP is a combined measure of the energy efficiency and speed of a system. Over the past 15 years, the semiconductor industry has increased energy-efficient performance about threefold every two years. We believe this trend will continue at historical rates. It will be driven by innovations from many sources, including new materials, device and integration technology, extreme ultraviolet (EUV) lithography, circuit design, system architecture design, and the co-optimization of all these technology elements, among other things.

Largely thanks to advances in semiconductor technology, a measure called energy-efficient performance is on track to triple every two years (EEP units are 1/femtojoule-picoseconds).

In particular, the EEP increase will be enabled by the advanced packaging technologies we’ve been discussing here. Additionally, concepts such as system-technology co-optimization (STCO), where the different functional parts of a GPU are separated onto their own chiplets and built using the best performing and most economical technologies for each, will become increasingly critical.

5. The illusion of moral decline – Adam Mastroianni

In psychology, anything worth studying is probably caused by multiple things. There may be lots of reasons why people think morality is declining when it really isn’t.

Maybe people say that morality is declining because they think it makes them look good. But in Part I, we found that people are willing to say that some things have gotten better (less racism, for instance). And people still make the same claims when we pay them for accuracy.
Maybe because people are nice to you when you’re a kid, and then they’re less nice to you when you’re an adult, you end up thinking that people got less nice over time. But people say that morality has declined since they turned 20, and that it’s declined in the past four years, and all that is true for old people, too.
Maybe everybody has just heard stories about how great the past is—like, they watch Leave It to Beaver and they go “wow, people used to be so nice back then.” But again, people think morality has declined even in the recent past. Also, who watches Leave It to Beaver?
We know from recent research that people denigrate the youth of today because they have positively biased memories of their own younger selves. That could explain why people blame moral decline on interpersonal replacement, but it doesn’t explain why people also blame it on personal change.

Any of these could be part of the illusion of moral decline. But they are, at best, incomplete.

We offer an additional explanation in the paper, which is that two well-known psychological phenomena can combine to produce an illusion of moral decline. One is biased exposure: people pay disproportionate attention to negative information, and media companies make money by giving it to us. The other is biased memory: the negativity of negative information fades faster than the positivity of positive information. (This is called the Fading Affect Bias; for more, see Underrated ideas in psychology).

Biased exposure means that things always look outrageous: murder and arson and fraud, oh my! Biased memory means the outrages of yesterday don’t seem so outrageous today. When things always look bad today but brighter yesterday, congratulations pal, you got yourself an illusion of moral decline.

We call this mechanism BEAM (Biased Exposure and Memory), and it fits with some of our more surprising results. BEAM predicts that both older and younger people should perceive moral decline, and they do. It predicts that people should perceive more decline over longer intervals, and they do. Both biased attention and biased memory have been observed cross-culturally, so it also makes sense that you would find the perception of moral decline all over the world.

But the real benefit of BEAM is that it can predict cases where people would perceive less decline, no decline, or even improvement. If you reverse biased exposure—that is, if people mainly hear about good things that other people are doing—you might get an illusion of moral improvement. We figured this could happen in people’s personal worlds: most people probably like most of the people they interact with on a daily basis, so they may mistakenly think those people have actually become kinder over time.

They do. In another study, we asked people to answer those same questions about interpersonal replacement and personal change that we asked in a previous study, first about people in general, and then about people that they interact with on a daily basis. When we asked participants about people in general, they said (a) people overall are less moral than they were in 2005, (b) the same people are less moral today than in 2005 (personal change) and (c) young people today are less moral than older people were in 2005 (interpersonal replacement). Just as they did before, participants told us that morality declined overall, and that both personal change and interpersonal replacement were to blame.

But we saw something new when we asked participants about people they know personally. First, they said individuals they’ve known for the past 15 years are more moral today. They said the young folks they know today aren’t as moral as the old folks they knew 15 years ago, but this difference was smaller than it was for people in general. So when you ask people about a group where they probably don’t have biased exposure—or at least not biased negative exposure—they report less moral decline, or even moral improvement.

The second thing that BEAM predicts is that if you turn off biased memory, the illusion of moral decline might go away. We figured this could happen if you asked people about times before they were born—you can’t have memories if you weren’t alive. We reran one of our previous studies, simply asking participants to rate people in general today, the year in which they turned 20, the year in which they were born, 20 years before that, and 40 years before that.

People said, basically, “moral decline began when I arrived on Earth”:

Neither of these studies mean that BEAM is definitely the culprit behind the illusion of moral decline, nor that it’s the only culprit. But BEAM can explain some weird phenomena that other accounts can’t, and it can predict some data that other accounts wouldn’t, so it seems worth keeping around for now.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Alphabet (parent of Google) and TSMC. Holdings are subject to change at any time.

What We’re Reading (Week Ending 07 April 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 07 April 2024:

1. China’s capitalist experiment – Michael Fritzell

I just read a great new book by analyst Anne Stevenson-Yang. It’s called Wild Ride and is available for pre-order on Amazon.

The book tells the story of China’s economic miracle from the late 1970s until today － how Deng Xiaoping’s reforms unleashed a wave of entrepreneurship and led to China’s economy becoming one of the largest in the world.

However, it also discusses some of the system’s fragilities and how the country now seems to be turning inwards again…

…China under Mao Zedong was a closed-off, repressive society. Meat was a once-in-a-week luxury. Cooking was done outside. And personal freedoms were more or less non-existent…

…After Mao died in 1976, a power struggle ensued. Ultimately, Mao’s former ally, Deng Xiaoping, emerged victorious from this struggle. One of his first tasks was to open up the economy to the outside world. For this, he would need hard currency.

Practical considerations took priority in those early years. When Deng Xiaoping travelled to the United States in 1979, he ordered an inventory of all hard currency in China’s banks. He came up with only US$38,000 － hardly enough to pay for his delegation.

This was a low point for the Chinese economy. Deng recognized that China needed exports. Japan, Korea, and Taiwan became wealthy by promoting the export of manufactured goods. So Deng adopted a twin strategy of promoting exports in special economic zones while shielding ordinary Chinese from foreign cultural influences…

..Deng’s special economic zones were newly incorporated entities acting as quasi-governments. What made them different was that their managers were rewarded by meeting targets focused on the scale of capital investment and gross tax revenues…

…Initially, foreign influence was kept at bay. Foreign nationals were required to live in special compounds, use separate medical facilities, and even use special currencies. Romantic relationships between foreigners and Chinese were forbidden as well.

The special economic zones in the Southern parts of the Guangdong province, such as Shenzhen, were particularly successful. One of the reasons was that they were near port facilities. But perhaps even more importantly, they had access to financial powerhouse Hong Kong, with its banks and talented entrepreneurs. While, of course, having access to hundreds of millions of workers from inland provinces.

In fact, Shenzhen became a model for the China that was about to develop. It was the first city to abolish the food coupon system, thus allowing residents to buy food with their own money. And residents were soon allowed to lease their own land…

…Another important part of Deng’s reforms was allowing farmers to grow whatever they pleased after meeting some quota. They could then sell any surplus in newly established markets. This unleashed immense rural income growth of 12% per year throughout the 1980s.

A similar system was later introduced to state-owned enterprises as well. They were now allowed to retain profits, either for reinvestment or pay them out as bonuses to employees. Managers suddenly realized they had incentives to increase revenues and profits, and some became wealthy…

…But beneath the surface, discontent was growing. Students were devouring books brought in from overseas. They were clamoring not only for economic gains but also for political reforms. By 1987, Beijing students regularly held marches from the university districts to Tian’anmen Square to protect against political restrictions…

…The crackdown on the student demonstrations in Beijing in June 1989 led to a significant political shift. For two years after the massacre, the country closed off, and dissidents were hunted down and jailed. Anyone who participated in the protests was either disappeared, jailed, demoted or unable to attend university or get a good job.

After the student protests, the Communist Party shifted its strategy to maintaining control. It upped its propaganda efforts, conveying that if the party were to collapse, China would end up in total anarchy.

In the aftermath of Tian’anmen, a communication system was established that improved the party’s control over the provinces. Tax collection and audits were tightened, and a criminal detection and surveillance system was developed…

…One of Deng’s buzzwords during this era was “to get rich is glorious” (致富光荣). You no longer had to be ashamed of pursuing wealth; it was promoted from the top down.

The Communist Party bet that as long as people felt their livelihoods improved, they would not rock the boat. The restive students who protested at Tian’anmen Square would now focus on economic opportunity rather than spiritual dissatisfaction.

After his come-back in the early 1990s, Deng picked out young talent Zhu Rongji to push for further reforms. In a long list of achievements, Zhu Rongji managed to:

Cut the government bureaucracy in half
Privatize housing
Sell off 2/3 of the companies in the state sector
Unify the dual currencies used prior to 1994
Introduce a nationwide tax system
Take control of the appointment of all provincial-level governors…

…After the reforms of the 1990s, China’s economic growth really took off. Exporters in China’s coastal regions benefitted from the country’s admission into the WTO, and Chinese returnees started businesses left and right…

…It was also during the 2000s that the property boom really kicked into high gear. In the late 1990s, Zhu Rongji instituted reforms that allowed state-owned enterprises to sell worker housing back to tenants for a pittance. As prices rose throughout the 2000s, tenants now held significant household equity, which they could then leverage to buy new, even fancier, commodity housing.

A change in the tax structure also incentivized local governments to promote construction. In the mid-1990s, the central government established its own offices to collect taxes directly. In other words, local governments had less ability to raise taxes themselves, instead relying on remittances from the central government. Local governments thus became cash-poor.

To fund their spending programs, they instead set up local government financing vehicles (LGFVs), which used land as collateral for borrowing. And since they were government entities, they were seen as quasi-sovereign borrowers enjoying full access to loans from state banks. Over time, the number of LGFVs grew to over 10,000. They operate urban infrastructure, subway systems, water and gas utilities, etc. Some of them are profitable, but many of them are not…

…The privatization of China’s housing market, which provided collateral for new loans, created one of the biggest credit booms the world has ever seen. Later on, in just five years, more credit was created than the entire value of the US banking system…

…After the Great Financial Crisis of 2008, the Communist Party leadership unleashed a CNY 4 trillion stimulus program that brought forward demand for infrastructure and spending targets.

At this point, it was already becoming clear that the capital stock for infrastructure was starting to exceed those of most other developing or even developed economies. By 2012, China had 8x the length of highways per unit of GDP as that of Japan. At the time, more than 70% of China’s airports were failing to cover their own costs, even though such costs tend to be modest…

…Meanwhile, with the state pushing for big stimulus packages, the government increasingly directed economic resources. Concepts such as “advance of the state, retreat of the private sector” (国进民退) became more common, reflecting a shift in the economy away from private sector entrepreneurship…

…And indeed, with the emergence of Xi Jinping, the state has started to reassert control. State companies are now receiving most of the loans from China’s banks. State media is now talking of “national rejuvenation”, trying to unite the country around nationalist sentiment and acceptance of a “moderately prosperous lifestyle” (小康社会). This is a clear break from the era of Deng Xiaoping’s reforms when getting rich was perhaps the greatest virtue in life…

…Further, she believes that a Russia-Iran-China bloc is currently being formed and that China’s financial system could serve as a bedrock for trade within the bloc:

“If, however, China were someday to shrink its network of trading partners to other dictatorships like Russia and North Korea, its dedicated financial system could become the principal one used for trade among those nations.”

In other words, Anne believes that China is withdrawing from its informal pact with Western nations about open trade, with the experiment in Western-style capitalism that commenced in 1979 over. The Chinese economy is now morphing into a different system, one where the state reigns supreme and will become an influential partner in a new trading bloc formed by China’s current geopolitical allies.

2. 20 Lessons From 20 Years of Managing Money – Ben Carlson

1. Experiences shape your perception of risk. Your ability and need to take risk should be based on your stage in life, time horizon, financial circumstances and goals.

But your desire to take risk often trumps all that, depending on your life experiences. If you worked at Enron or Lehman Brothers or AIG or invested with Madoff, your appetite for risk will be forever altered.

And that’s OK as long as you plan accordingly.

2. Intelligence doesn’t guarantee investment success. Warren Buffett once wrote, “Investing is not a game where the guy with the 160 IQ beats the guy with the 130 IQ. Once you have ordinary intelligence, what you need is the temperament to control the urges that get other people into trouble in investing.”

I’ve met so many highly educated individuals who are terrible investors. They can’t control their emotions because their academic pedigree makes them overconfident in their abilities.

Emotional intelligence is the true sign of investment smarts.

3. No one lives life in the long-term. Long-term returns are the only ones that matter but you have to survive a series of short-terms to get there.

The good strategy you can stick with in those short-terms is preferable to the perfect strategy you can’t stick with…

…9. The biggest risks are always the same…yet different. The next risk is rarely the same as the last risk because every market environment is different.

On the other hand, the biggest mistakes investors make are often the same — timing the market, recency bias, being fearful when others are fearful and greedy when others are greedy and investing in the latest fads.

It’s always a different market but human nature is the constant…

…16. Experience is not the same as expertise. Just because you’ve been doing something for a long time doesn’t mean you’re an expert.

I know plenty of experienced investors who are constantly fighting the last war to their own detriment.

How many people who “called” the 2008 crash completely missed the ensuing bull market? All of them?

How many investment legends turn into permabears the older they get becasue they fail to recognize how markets have changed over time?

Loads of investment professionals who have been in the business for many years make the same mistakes over and over again…

…18. There is a big difference between rich and wealthy. Lots of rich people are miserable. These people are not wealthy, regardless of how much money they have.

There are plenty of people who wouldn’t be considered rich based on the size of their net worth who are wealthy beyond imagination because of their family, friends and general contentment with what they have.

19. Optimism should be your default. It saddens me to see an increasing number of cynical and pessimistic people every year.

I understand the world can be an unforgiving place and things will never be perfect but investing is a game where the optimists win.

3. 8 Google Employees Invented Modern AI. Here’s the Inside Story – Steven Levy

EIGHT NAMES ARE listed as authors on “Attention Is All You Need,” a scientific paper written in the spring of 2017. They were all Google researchers, though by then one had left the company…

…Recurrent neural networks struggled to parse longer chunks of text. Take a passage like Joe is a baseball player, and after a good breakfast he went to the park and got two hits. To make sense of “two hits,” a language model has to remember the part about baseball. In human terms, it has to be paying attention. The accepted fix was something called “long short-term memory” (LSTM), an innovation that allowed language models to process bigger and more complex sequences of text. But the computer still handled those sequences strictly sequentially—word by tedious word—and missed out on context clues that might appear later in a passage. “The methods we were applying were basically Band-Aids,” Uszkoreit says. “We could not get the right stuff to really work at scale.”

Around 2014, he began to concoct a different approach that he referred to as self-attention. This kind of network can translate a word by referencing any other part of a passage. Those other parts can clarify a word’s intent and help the system produce a good translation. “It actually considers everything and gives you an efficient way of looking at many inputs at the same time and then taking something out in a pretty selective way,” he says. Though AI scientists are careful not to confuse the metaphor of neural networks with the way the biological brain actually works, Uszkoreit does seem to believe that self-attention is somewhat similar to the way humans process language.

Uszkoreit thought a self-attention model could potentially be faster and more effective than recurrent neural nets. The way it handles information was also perfectly suited to the powerful parallel processing chips that were being produced en masse to support the machine learning boom. Instead of using a linear approach (look at every word in sequence), it takes a more parallel one (look at a bunch of them together). If done properly, Uszkoreit suspected, you could use self-attention exclusively to get better results…

…The transformer crew set about building a self-attention model to translate text from one language to another. They measured its performance using a benchmark called BLEU, which compares a machine’s output to the work of a human translator. From the start, their new model did well. “We had gone from no proof of concept to having something that was at least on par with the best alternative approaches to LSTMs by that time,” Uszkoreit says. But compared to long short-term memory, “it wasn’t better.”

They had reached a plateau—until one day in 2017, when Noam Shazeer heard about their project, by accident. Shazeer was a veteran Googler—he’d joined the company in 2000—and an in-house legend, starting with his work on the company’s early ad system. Shazeer had been working on deep learning for five years and recently had become interested in large language models. But these models were nowhere close to producing the fluid conversations that he believed were possible.

As Shazeer recalls it, he was walking down a corridor in Building 1965 and passing Kaiser’s workspace. He found himself listening to a spirited conversation. “I remember Ashish was talking about the idea of using self-attention, and Niki was very excited about it. I’m like, wow, that sounds like a great idea. This looks like a fun, smart group of people doing something promising.” Shazeer found the existing recurrent neural networks “irritating” and thought: “Let’s go replace them!”

Shazeer’s joining the group was critical. “These theoretical or intuitive mechanisms, like self-attention, always require very careful implementation, often by a small number of experienced ‘magicians,’ to even show any signs of life,” says Uszkoreit. Shazeer began to work his sorcery right away. He decided to write his own version of the transformer team’s code. “I took the basic idea and made the thing up myself,” he says. Occasionally he asked Kaiser questions, but mostly, he says, he “just acted on it for a while and came back and said, ‘Look, it works.’” Using what team members would later describe with words like “magic” and “alchemy” and “bells and whistles,” he had taken the system to a new level.

“That kicked off a sprint,” says Gomez. They were motivated, and they also wanted to hit an upcoming deadline—May 19, the filing date for papers to be presented at the biggest AI event of the year, the Neural Information Processing Systems conference in December. As what passes for winter in Silicon Valley shifted to spring, the pace of the experiments picked up. They tested two models of transformers: one that was produced with 12 hours of training and a more powerful version called Big that was trained over three and a half days. They set them to work on English-to-German translation.

The basic model outperformed all competitors—and Big earned a BLEU score that decisively shattered previous records while also being more computationally efficient. “We had done it in less time than anyone out there,” Parmar says. “And that was only the beginning, because the number kept improving.”…

…TRANSFORMERS DID NOT instantly take over the world, or even Google. Kaiser recalls that around the time of the paper’s publication, Shazeer proposed to Google executives that the company abandon the entire search index and train a huge network with transformers—basically to transform how Google organizes information. At that point, even Kaiser considered the idea ridiculous. Now the conventional wisdom is that it’s a matter of time.

A startup called OpenAI was much faster to pounce. Soon after the paper was published, OpenAI’s chief researcher, Ilya Sutskever—who had known the transformer team during his time at Google—suggested that one of its scientists, Alec Radford, work on the idea. The results were the first GPT products. As OpenAI CEO Sam Altman told me last year, “When the transformer paper came out, I don’t think anyone at Google realized what it meant.”

The picture internally is more complicated. “It was pretty evident to us that transformers could do really magical things,” says Uszkoreit. “Now, you may ask the question, why wasn’t there ChatGPT by Google back in 2018? Realistically, we could have had GPT-3 or even 3.5 probably in 2019, maybe 2020. The big question isn’t, did they see it? The question is, why didn’t we do anything with the fact that we had seen it? The answer is tricky.”

Many tech critics point to Google’s transition from an innovation-centered playground to a bottom-line-focused bureaucracy. As Gomez told the Financial Times, “They weren’t modernizing. They weren’t adopting this tech.” But that would have taken a lot of daring for a giant company whose technology led the industry and reaped huge profits for decades. Google did begin to integrate transformers into products in 2018, starting with its translation tool. Also that year, it introduced a new transformer-based language model called BERT, which it started to apply to search the year after.

But these under-the-hood changes seem timid compared to OpenAI’s quantum leap and Microsoft’s bold integration of transformer-based systems into its product line. When I asked CEO Sundar Pichai last year why his company wasn’t first to launch a large language model like ChatGPT, he argued that in this case Google found it advantageous to let others lead. “It’s not fully clear to me that it might have worked out as well. The fact is, we can do more after people had seen how it works,” he said…

…Does Google miss these escapees? Of course, in addition to others who have migrated from the company to new AI startups. (Pichai reminded me, when I asked him about the transformer departures, that industry darling OpenAI also has seen defections: “The AI area is very, very dynamic,” he said.) But Google can boast that it created an environment that supported the pursuit of unconventional ideas. “In a lot of ways Google has been way ahead—they invested in the right minds and created the environment where we could explore and push the envelope,” Parmar says. “It’s not crazy that it took time to adopt it. Google had so much more at stake.”

Without that environment: no transformer. Not only were the authors all Google employees, they also worked out of the same offices. Hallway encounters and overheard lunch conversations led to big moments. The group is also culturally diverse. Six of the eight authors were born outside the United States; the other two are children of two green-card-carrying Germans who were temporarily in California and a first-generation American whose family had fled persecution, respectively.

4. In Depth: Local Governments Struggle to Tackle Mountain of Hidden Debt – Cheng Siwei, Wang Juanjuan, Zhang Yuzhe, Ding Feng and Zhang Yukun

The central government has been trying to address the problem of LGFV debt for years, mainly through piecemeal measures that had limited success. But in July, the Politburo vowed to formulate and implement a comprehensive strategy to resolve local government hidden debts.

These off-the-books liabilities, which include LGFV bonds with implicit official backing, have accumulated over the years to around 30 trillion to 70 trillion yuan according to some estimates, and become a threat to the country’s fiscal and financial stability and sustainability.

One of the main instruments being used to repay hidden debt in this round of debt resolution is special refinancing bonds — on-balance-sheet local government bonds whose proceeds are used to repay outstanding hidden debt. Issuance has stepped up significantly since early October after the Ministry of Finance launched a special refinancing bond swap program.

From October to December, almost all provincial-level regions on the Chinese mainland issued these special refinancing bonds, raising nearly 1.4 trillion yuan to repay hidden borrowings, according to calculations by analysts at Tianfeng Securities Co. Ltd. The regions include heavily indebted Guizhou province, which topped the list with issuance of 226.4 billion yuan.

Many regions have announced plans to issue more such bonds in February and March, with planned issuances totaling more than 100 billion yuan, the Tianfeng analysts wrote in a January report.

The campaign to resolve hidden debt has tightened rules for new debt issuance and cut some localities off from their previous financing channels, depriving them of resources to pay interest on hidden debt. The proceeds of special refinancing bonds cannot be used to make interest payments.

“The core issue now is that we can’t make our interest payments,” a source who works for an economic development zone in West China told Caixin, noting that without new financing, the fiscal revenue of the region can only sustain government agencies’ day-to-day operations and preferential policies for attracting businesses. He said his local government has stopped making all other payments, including those to project developers, to ensure it can meet interest payments on outstanding LGFV debt…

…The renewed push to bring hidden debt onto the books and restructure or swap LGFV debt, however, has reinforced the belief that the central government won’t allow LGFVs to default on their bonds, reviving investor sentiment. That’s led to a surge in demand for LGFV bonds over the past few months, even as the central government has repeatedly highlighted the need to stem any renewed buildup in hidden debt…

…Although LGFV bonds are back in hot demand, tightened oversight has made it more difficult for some vehicles, especially those with heavy debt burdens, to continue issuing new debt. This has curbed growth in hidden debt to some extent, but it has added to default risks of some LGFV bonds as there is less money available to make the interest repayments.

The central government ordered provincial officials to compile a list of LGFVs owned by local authorities in their jurisdictions…

…Obtaining new bank loans has become much harder for LGFVs on the list, as banks heed the central government’s instruction to prevent new LGFV debt.

Regarding existing LGFV debt, the State Council in September issued guidance that banks, among the most important creditors of LGFVs, should become involved in debt resolution in 12 provincial-level regions with high government leverage, which include Liaoning, Heilongjiang, and Jilin, the three rustbelt provinces in Northeast China. The guidance set out that banks should focus on restructuring or swapping existing loans, high-interest non-standard debt, and other types of borrowing.

5. Conviction and Quality – Josh Tarasoff

Conviction is no doubt the foundation of long-term business ownership. How is it formed? What is it like to have it? Why does it falter? In my experience there are two distinct kinds of conviction. Explicit conviction, as I call it, comes from having figured something out. It entails a useful prediction, like “our ETA is 5pm” or “majoring in economics will lead to better career prospects than majoring in philosophy.” There is an underlying logic to it, which can be explained and used to persuade. Implicit conviction, on the other hand, is exemplified by the trust one might have in a family member, a dear friend, a close colleague, to do the right thing, to get the job done, to come through. It is felt as opposed to believed. This kind of conviction doesn’t make predictions so much as align with what is good. It doesn’t theorize about goodness but rather knows it when it sees it…

…In the context of investing, one might develop the thesis that a particular company can capture X% market share, generate Y dollars in annual revenue, achieve Z% operating margins, and therefore has an intrinsic value within a certain range. One might have high confidence because of the presence of competitive advantages and management with a very good track record. One would have a range of expected returns from owning the shares over time. All of this would fall into the explicit category.

Sooner or later, the investment would encounter a confounding surprise. Perhaps execution turns choppy, a new competitive vector emerges out of nowhere, an exogenous crisis turns the world upside down, etc. Old projections are now in doubt, previous plans and strategies are being reworked, everything is less fun. These things are actually happening all the time— something explicit conviction has a way of tuning out! Only genuine and well-placed implicit conviction, a qualitative knowing that the company will do what it needs to and ought to do, is equipped to ably traverse this kind of terrain. Unlike analysis-based explicit conviction, implicit conviction comes from something deeper than the cause and effect we perceive in the unfolding of events—it is both analytical and, crucially, intuitive (about which more later)…

…While in everyday life implicit conviction arises naturally, in the context of investing I can’t help but feel it is somewhat alien. In part, this is because few companies are truly deserving. Even so, I suspect that implicit conviction is proffered by investors even less than it ought to be. It isn’t difficult to see why the investment industry is inhospitable to implicit conviction, and why its partner rules the roost. Implicit conviction forms of its own accord and cannot be planned. It defies quantification, eliciting the charge of being too “fuzzy” to matter. Nor can it be fully captured in words. Implicit conviction is impossible to transmit from analyst to portfolio manager or from portfolio manager to client, which is highly inconvenient for the business of managing money. It is primarily personal. It is quiet. By contrast, the appeal of the explicit is clear. Explicit conviction furnishes the comfort of knowability and modeled outcomes. It projects the legitimacy of diligence and precision. It is thought to be reliably manufactured via “repeatable process.” It is clever and self assured….

…Nonetheless, because literal communication necessitates choosing a word, I will use “Quality” (capitalized to distinguish it from the ordinary sense of the term) to indicate the deeper-something on which implicit conviction is based. Using “Quality” in this way is consistent with my prior writing and pays homage to the work of Robert Pirsig, which was a formative influence.

Analysis plays an important but limited role in detecting Quality. For example, the following is a selection of (neither necessary nor sufficient) indicators that I have found to be suggestive of Quality in companies:

“Wow” customer experiences
Mission to solve an important problem
Domain mastery (the best at what they do)
First-principles-based thinking and invention
Unlimited ambition combined with no-nonsense realism
Overcapitalized balance sheet
Founder mentality (life’s work).

While carefully looking for indicators like these is helpful, I think it would be a misstep to attempt to systematize the search, constructing a Grand Unified Theory of Quality and attendant comprehensive processes for finding and evaluating it. Quality emerges from the complexity of the system in action; it is in the how rather than the what. Thus, when Quality is broken down into parts and analyzed, its essence is lost. This explains why analysis alone has trouble discerning the authentic from the artificial. Moreover, Quality frozen in a theory or process cannot be recognized in sufficiently new contexts, such as in a company that is novel to one’s experience or in the same company as it evolves (they always do!).

So where does that leave us? With intuition. Well-honed intuition does what analysis cannot by perceiving Quality directly, as opposed to through an intellectual process. What I suspect is happening in the direct perception of Quality is subconscious pattern recognition, based upon a dynamic, holistic experience of the thing in question. Of course, the ability to intuitively recognize patterns in a specific domain must be earned through experience and feedback; indeed, I have found that the value of my own intuition has grown (starting at zero) over many years. Interestingly, I also find that experiencing Quality in any one domain (e.g., music or meditation, to use examples that are dear to me) can be helpful for recognizing it in other domains (including business) because Quality’s nature is universal, even as its manifestations are necessarily particular.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Alphabet (parent of Google) and Microsoft. Holdings are subject to change at any time.

What We’re Reading (Week Ending 31 March 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 31 March 2024:

1. Gold-Medalist Coders Build an AI That Can Do Their Job for Them – Ashlee Vance

Take the case of Cognition AI Inc.

You almost certainly have not heard of this startup, in part because it’s been trying to keep itself secret and in part because it didn’t even officially exist as a corporation until two months ago. And yet this very, very young company, whose 10-person staff has been splitting time between Airbnbs in Silicon Valley and home offices in New York, has raised $21 million from Peter Thiel’s venture capital firm Founders Fund and other brand-name investors, including former Twitter executive Elad Gil. They’re betting on Cognition AI’s team and its main invention, which is called Devin.

Devin is a software development assistant in the vein of Copilot, which was built by GitHub, Microsoft and OpenAI, but, like, a next-level software development assistant. Instead of just offering coding suggestions and autocompleting some tasks, Devin can take on and finish an entire software project on its own. To put it to work, you give it a job—“Create a website that maps all the Italian restaurants in Sydney,” say—and the software performs a search to find the restaurants, gets their addresses and contact information, then builds and publishes a site displaying the information. As it works, Devin shows all the tasks it’s performing and finds and fixes bugs on its own as it tests the code being written.

The founders of Cognition AI are Scott Wu, its chief executive officer; Steven Hao, the chief technology officer; and Walden Yan, the chief product officer…

…Wu, 27, is the brother of Neal Wu, who also works at Cognition AI. These two men are world-renowned for their coding prowess: The Wu brothers have been competing in, and often winning, international coding competitions since they were teenagers…

…Sport-coding—yes, it’s a real thing—requires people to solve puzzles and program with speed and accuracy. Along the way, it trains contestants to approach problems in novel ways. Cognition AI is full of sport-coders. Its staff has won a total of 10 gold medals at the top international competition, and Scott Wu says this background gives his startup an edge in the AI wars…

…One of the big claims Cognition AI is making with Devin is that the company has hit on a breakthrough in a computer’s ability to reason. Reasoning in AI-speak means that a system can go beyond predicting the next word in a sentence or the next snippet in a line of code, toward something more akin to thinking and rationalizing its way around problems. The argument in AI Land is that reasoning is the next big thing that will advance the industry, and lots of startups are making various boasts about their ability to do this type of work.

Devin does appear to be well ahead of the other coding assistants in many respects. You can give it jobs to do with natural language commands, and it will set off and accomplish them. As Devin works, it tells you about its plan and then displays the commands and code it’s using. If something doesn’t look quite right, you can give the AI a prompt to go fix the issue, and Devin will incorporate the feedback midstream. Most current AI systems have trouble staying coherent and on task during these types of long jobs, but Devin keeps going through hundreds and even thousands of tasks without going off track.

In my tests with the software, Devin could build a website from scratch in 5 to 10 minutes, and it managed to re-create a web-based version of Pong in about the same amount of time. I had to prompt it a couple of times to improve the physics of the ball movement in the game and to make some cosmetic changes on its websites, all of which Devin accomplished just fine and with a polite attitude…

…Exactly how Cognition AI made this breakthrough, and in so short a time, is something of a mystery, at least to outsiders. Wu declines to say much about the technology’s underpinnings other than that his team found unique ways to combine large language models (LLMs) such as OpenAI’s GPT-4 with reinforcement learning techniques. “It’s obviously something that people in this space have thought about for a long time,” he says. “It’s very dependent on the models and the approach and getting things to align just right.”

2. Geopolitics in the C-Suite – Jami Miscik, Peter Orszag, and Theodore Bunzel

But even though national security and foreign policy occasionally intruded on corporate America during that time, until very recently, few executives concerned themselves with geopolitics. In the post–Cold War world, with globalization on the march, the idea that national interests might be at odds with open markets and expanding trade came to seem alien to American executives.

But the changes that have roiled the geopolitical landscape in recent years have left an impression in C-suites around the United States. In a recent poll of 500 institutional investors, geopolitics ranked as the top risk to the global economy and markets in 2024…

…As governments lean on economic restrictions and industrial policies to achieve geopolitical ends, corporations have increasingly become both the objects and instruments of foreign policy…

…The centrality of economic competition to today’s foreign policy problems represents a qualitative break from the past. During the Cold War, for example, the United States and the Soviet Union hardly interacted economically: trade between them peaked at a paltry $4.5 billion in 1979; in recent years, the United States and China have generally traded that much every week or two, adjusting for inflation. In the post–Cold War era, U.S. foreign policy was focused on opening markets and reducing international economic barriers rather than erecting them. Era-defining crises such as the 9/11 attacks did little to change the relationship between U.S. policymakers and American corporations; if anything, the “war on terror” further solidified the idea that foreign policy was primarily concerned with security and military issues, not economics.

But in the background, global economic integration was transforming the playing field. In 1980, trade accounted for just 37 percent of global GDP. Today, that figure is 74 percent, and economies have become intertwined to a degree never seen in the twentieth century. Globalization is not new, of course; it has been a centuries-long process. What is new, however, is the emergence of great-power rivalry in a highly interconnected world. Military power still matters, but economic and technological competition have become the main battlefield of global politics. Under the so-called Washington consensus that dominated policymaking for decades, the question of where a semiconductor manufacturer would build its next factory or whether German auto companies would decide to throttle their investments in China would have seemed relatively unimportant to policymakers. Now, such questions are at the center of almost every major foreign policy debate.

Greater economic integration has also created a complex web of links between geopolitical rivals that policymakers now seek to leverage for strategic ends. This is especially true when it comes to financial and technological networks, where Washington holds a privileged position…

…But as great-power tensions have increased, so has the number of sectors caught in the fray of what Farrell and Newman call “weaponized interdependence.” Consider, for example, the way that G-7 countries have taken advantage of Russian dependence on shipping insurers based in the West, an industry that most foreign policymakers had probably never thought about before Russia’s 2022 invasion of Ukraine. To try to cap the price of Russian oil exports, the G-7 prevented these companies from insuring Russian crude oil cargoes unless they had been sold at a maximum of $60 per barrel.

Western powers are not the only ones playing this game. In 2010, after a Chinese fishing trawler and Japanese Coast Guard patrol boats collided in disputed waters, setting off a diplomatic row between Beijing and Tokyo, China banned exports to Japan of the rare-earth minerals that are critical components of batteries and electronics, thus raising costs and creating shortages for Japanese manufacturers of everything from hybrid cars to wind turbines…

…More recently, a number of American consulting firms have been caught in the middle of the complex U.S.-Saudi relationship, with Congress demanding details about their contracts with Saudi Arabia that Riyadh has forbidden them to provide.

All these dynamics are being turbocharged by an intensifying competition between the United States and China, the two countries with the largest and most globally intertwined economies. Both aim to dominate the twenty-first-century economy, which means gaining the upper hand in computing technologies, biotechnology, and clean energy. And the foreign policies of both countries are now driven by a shared desire to shape their economies in ways that reduce their vulnerability and increase their leverage. China calls this “self-reliance.” Washington calls it “de-risking.” For the United States, what it looks like in practice is expanded export controls on advanced semiconductors and manufacturing equipment, enhanced government screening of investments by U.S. companies in foreign markets, and major subsidies for industries such as electric vehicles and microchips, primarily through the Inflation Reduction Act and the CHIPS Act. In this brave new world, the secretary of commerce is as important to foreign policy as the secretaries of state and defense.

Washington is hardly alone in taking such steps. State-sponsored drives for greater self-reliance have taken hold in nearly every major economy, particularly after the supply-chain disruptions of the COVID-19 pandemic. The number of countries introducing or expanding investment screening, for example, jumped from three between 1995 and 2005 to 54 between 2020 and 2022. Meanwhile, a wave of industrial policies has increased trade barriers in an attempt to induce companies to reshore their supply chains. At the same time, the understanding of what matters to national security has also expanded, as countries seek to advance or protect everything from software and microchips to pharmaceuticals and foodstuffs.

Many of the complications of this new era are rooted in the difference between the way the public and private sectors view time horizons. Policymakers set bright lines with immediate operational implications—for example, suddenly forbidding companies from exporting or importing certain goods from certain countries. But companies need to make long-term investment decisions. Should a company set up another plant in China if there is market demand and doing so is currently allowed by law? Should a pharmaceutical company set up advanced R & D centers in mainland China or purchase a Chinese biotech firm, given the long-run trajectory of relations between Beijing and the West? Should a consumer electronics firm purchase Chinese-made chips if they are the most cost-efficient option? Answering these questions requires executives to forecast the outcomes of highly volatile political debates and policymaking choices over which they have little control. And yet whatever decisions they make have a significant effect on whether, for example, the United States can effectively “de-risk” its economic relationship with China.

The example of semiconductors is instructive. Washington is seeking to reshore semiconductor manufacturing, but the success of its flagship industrial policy, the CHIPS Act, depends only in part on how the Commerce Department distributes the legislation’s $39 billion in subsidies over the next five years. A much more important factor is whether the Taiwanese chip manufacturer TSMC will risk setting up facilities in the United States despite high costs and a relative scarcity of human capital, and whether Apple decides to buy slightly more expensive chips made by U.S. fabricators instead of less expensive ones produced in Asia. And the CHIPS Act is only one input in those decisions.

3. Get Smart: Chasing Nvidia? Don’t succumb to FOMO – Chin Hui Leong

Are you feeling left out because you missed Nvidia’s (NASDAQ: NVDA) massive stock rise?

Well, we have good news and bad news.

Let’s start with the bad news: that tightening in your chest you are feeling right now is the fear of missing out — or better known by its initials “FOMO”.

And ladies and gentlemen, FOMO is real.

It’s that sneaky emotion which spurs you to buy a stock based on a feeling rather than proper research…

…But hang on, amid the hype — there’s good news too.

If you recognise that you are feeling FOMO, then congratulations — you have just taken the first step in recognising what you have to deal with: your runaway emotions.

The next step is to keep your emotions in check.

On the other side of FOMO, is its cousin FOJI — or the fear of joining in.

Like FOMO, FOJI is also a strong emotion.

That’s especially true for some investors who are bearing scars from 2022 when US stocks took a beating amid a punishing bear market.

These scars can emit another fear — FOJI — which is paralysing for investors.

The fear of looking stupid if you buy today only to watch the stock fall the very next day…

…Whether it is FOMO or FOJI, you won’t invest well if feelings dictate your actions.

Recognising the presence of both emotions is key…

…Beyond FOMO and FOJI, there’s JOMO or the joy of missing out.

Don’t feel down if you decide to give Nvidia a pass.

As Peter Lynch once said — you can’t kiss all the frogs to find out which will turn into a prince.

Unconvinced?

In Lynch book’s “One Up on Wall Street”, he wrote down the names of 65 stocks which returned at least 10 times their original price (he calls them 10-baggers).

Except that the fund that he ran owned NONE of them.

Before you start rolling your eyes, consider this point: Peter Lynch achieved a stunning 29% per year in annualized returns over 13 years, outpacing the benchmark S&P 500 index (INDEXSP: .INX) by more than two times…

…By sharing the list of missed winners, Lynch had a salient point to make: you do not need to be in every 10-bagger to deliver enviable returns.

4. How the richest woman in the world—mocked as a ‘miser’ in the press—helped bail out New York City during the panic of 1907 – Will Daniel

Hetty Green is remembered as the “world’s greatest miser” and the “Witch of Wall Street,” but these days, Green would likely be seen as an eccentric investing icon. After all, while she became famous for her frugal nature and gruff exterior, Green pioneered value investing strategies that have made billionaires out of many of today’s leading investors. And when the chips were down, when people really needed help, the whaling heiress turned independent investor, business tycoon, and world’s wealthiest woman often used her fortune to save the day…

…Over a three-week period after the panic began on Oct. 22, 1907, the New York Stock Exchange plummeted nearly 50% from its 1906 peak. And a year later, in 1908, Gross National Product (GNP), a measure akin to today’s Gross Domestic Product (GDP), cratered 12%. The problems for the banking system were so severe during the knickerbocker crisis that they spurred the establishment of the Federal Reserve System…

…As the situation deteriorated, John Pierpont Morgan, the American financier who founded what is now JPMorgan Chase, was eventually forced to call together a group of Wall Street’s best and brightest at the Morgan Library to help decide how to prop up the ailing economy and stock market. Hetty Green was the only woman who was invited to attend that meeting during the height of the panic…

…“I saw this situation coming,” she said, noting that there were undeniable signs of stress. “Some of the solidest men of the Street came to me and wanted to unload all sorts of things, from palatial residences to automobiles.”

Green said that she then gave The New York Central Railroad company a “big loan” after they came knocking, and that made her “sit up and do some thinking.” She decided to begin gathering as much cash as possible, understanding that a panic could be on the way…

…Green described how men came to New York from all over the country to ask for loans during the panic of 1907. But despite being labeled a “miser” throughout her life, she didn’t take advantage of the situation.

“Those to whom I loaned money got it at 6%. I might just as easily have secured 40%,” she explained…

…Usury, or charging excessive interest for a loan, was against Green’s moral code, which was born of her Quaker roots…

…Green would go on to lend the government of New York City $1.1 million at the peak of the 1907 panic, which is equivalent to roughly $33 million in today’s dollars…

…“On more than one occasion, when New York was running low on money, she would lend money to the city,” explained Charles Slack, the author of Green’s biography, Hetty: The Genius and Madness of America’s First Female Tycoon. “And she always did so at reasonable rates. She didn’t gouge or hold the city over a barrel.”

5. Transcript for Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416 – Lex Fridman and Yann Lecun

Lex Fridman (00:50:40) I would love to sort of linger on your skepticism around auto regressive LLMs. So one way I would like to test that skepticism is everything you say makes a lot of sense, but if I apply everything you said today and in general to I don’t know, 10 years ago, maybe a little bit less, no, let’s say three years ago, I wouldn’t be able to predict the success of LLMs. So does it make sense to you that autoregressive LLMs are able to be so damn good?

Yann LeCun (00:51:20) Yes.

Lex Fridman (00:51:21) Can you explain your intuition? Because if I were to take your wisdom and intuition at face value, I would say there’s no way autoregressive LLMs, one token at a time, would be able to do the kind of things they’re doing.

Yann LeCun (00:51:36) No, there’s one thing that autoregressive LLMs or that LLMs in general, not just the autoregressive one, but including the bird style bidirectional ones, are exploiting and its self supervised running, and I’ve been a very, very strong advocate of self supervised running for many years. So those things are a incredibly impressive demonstration that self supervised running actually works. The idea that started, it didn’t start with BERT, but it was really kind of good demonstration with this.

(00:52:09) So the idea that you take a piece of text, you corrupt it, and then you train some gigantic neural net to reconstruct the parts that are missing. That has produced an enormous amount of benefits. It allowed us to create systems that understand language, systems that can translate hundreds of languages in any direction, systems that are multilingual, so it’s a single system that can be trained to understand hundreds of languages and translate in any direction, and produce summaries and then answer questions and produce text.

(00:52:51) And then there’s a special case of it, which is the auto regressive trick where you constrain the system to not elaborate a representation of the text from looking at the entire text, but only predicting a word from the words that are come before. And you do this by constraining the architecture of the network, and that’s what you can build an auto aggressive LLM from.

(00:53:15) So there was a surprise many years ago with what’s called decoder only LLM. So since systems of this type that are just trying to produce words from the previous one and the fact that when you scale them up, they tend to really understand more about language. When you train them on lots of data, you make them really big. That was a surprise and that surprise occurred quite a while back, with work from Google, Meta, OpenAI, et cetera, going back to the GPT kind of work, general pre-trained transformers.

Lex Fridman (00:53:56) You mean like GPT2? There’s a certain place where you start to realize scaling might actually keep giving us an emergent benefit.

Yann LeCun (00:54:06) Yeah, I mean there were work from various places, but if you want to place it in the GPT timeline, that would be around GPT2, yeah.

Lex Fridman (00:54:19) Well, because you said it so charismatic and you said so many words, but self supervised learning, yes. But again, the same intuition you’re applying to saying that auto aggressive LLMs cannot have a deep understanding of the world. If we just apply that, same intuition, does it make sense to you that they’re able to form enough of a representation in the world to be damn convincing, essentially passing the original touring test with flying colors?

Yann LeCun (00:54:50) Well, we’re fooled by their fluency, right? We just assume that if a system is fluent in manipulating language, then it has all the characteristics of human intelligence, but that impression is false. We’re really fooled by it.

Lex Fridman (00:55:06) What do you think Alan Turing would say, without understanding anything, just hanging out with it?

Yann LeCun (00:55:11) Alan Turing would decide that a Turing test is a really bad test, okay? This is what the AI community has decided many years ago that the Turing test was a really bad test of intelligence.

Lex Fridman (00:55:22) What would Hans Marvek say about the larger language models?

Yann LeCun (00:55:26) Hans Marvek would say that Marvek Paradox still applies. Okay, we can pass-

Lex Fridman (00:55:32) You don’t think he would be really impressed?

Yann LeCun (00:55:34) No, of course everybody would be impressed. But it’s not a question of being impressed or not, it’s the question of knowing what the limit of those systems can do. Again, they are impressive. They can do a lot of useful things. There’s a whole industry that is being built around them. They’re going to make progress, but there is a lot of things they cannot do, and we have to realize what they cannot do and then figure out how we get there. And I’m seeing this from basically 10 years of research on the idea of self supervised running, actually that’s going back more than 10 years, but the idea of self supervised running. So basically capturing the internal structure of a piece of a set of inputs without training the system for any particular task, to learning representations.

(00:56:26) The conference I co-founded 14 years ago is called International Conference on Learning Representations. That’s the entire issue that deep learning is dealing with, and it’s been my obsession for almost 40 years now. So learning representation is really the thing. For the longest time, we could only do this with supervised learning, and then we started working on what we used to call unsupervised learning and revived the idea of unsupervised running in the early 2000s with your [inaudible 00:56:58] and Jeff Hinton. Then discovered that supervised running actually works pretty well if you can collect enough data. And so the whole idea of unsupervised, self supervised running kind of took a backseat for a bit, and then I tried to revive it in a big way starting in 2014, basically when we started FAIR and really pushing for finding new methods to do self supervised running both for text and for images and for video and audio.

(00:57:29) And some of that work has been incredibly successful. I mean, the reason why we have multilingual translation system, things to do, content moderation on Meta, for example, on Facebook, that are multilingual, that understand whether a piece of text is hate speech not or something, is due to that progress using self supervised running for NLP, combining this with transformer architectures and blah, blah, blah.

(00:57:53) But that’s the big success of self supervised running. We had similar success in speech recognition, a system called WAVE2VEC, which is also a joint embedding architecture, by the way, trained with contrastive running. And that system also can produce speech recognition systems that are multilingual with mostly unlabeled data and only need a few minutes of labeled data to actually do speech recognition, that’s amazing. We have systems now based on those combination of ideas that can do real time translation of hundreds of languages into each other, speech to speech.

Lex Fridman (00:58:28) Speech to speech, even including, which is fascinating, languages that don’t have written forms.

Yann LeCun (00:58:34) That’s right.

Lex Fridman (00:58:34) Just spoken only.

Yann LeCun (00:58:35) That’s right. We don’t go through text, it goes directly from speech to speech using an internal representation of speech units that are discrete, but it’s called Textless NLP. We used to call it this way. But yeah, so I mean incredible success there. And then for 10 years, we tried to apply this idea to learning representations of images by training a system to predict videos, learning intuitive physics by training a system to predict what’s going to happen in the video.

(00:59:02) And tried and tried and failed and failed, with generative models, with models that predict pixels. We could not get them to learn good representations of images. We could not get them to learn good representations of videos. And we tried many times, we published lots of papers on it, where they kind of sort of work, but not really great. They started working, we abandoned this idea of predicting every pixel and basically just doing the joint embedding and predicting and representation space, that works. So there’s ample evidence that we’re not going to be able to learn good representations of the real world using generative model. So I’m telling people, everybody’s talking about generative AI. If you’re really interested in human level AI, abandon the idea of generative AI…

…Yann LeCun (01:35:29) I actually made that comment on just about every social network I can, and I’ve made that point multiple times in various forums. Here’s my point of view on this, people can complain that AI systems are biased and they generally are biased by the distribution of the training data that they’ve been trained on that reflects biases in society, and that is potentially offensive to some people or potentially not. And some techniques to de-bias then become offensive to some people because of historical incorrectness and things like that.

(01:36:23) And so you can ask two questions, the first question is, is it possible to produce an AI system that is not biased? And the answer is, absolutely not. And it’s not because of technological challenges, although they are technological challenges to that, it’s because bias is in the eye of the beholder. Different people may have different ideas about what constitutes bias for a lot of things, there are facts that are indisputable, but there are a lot of opinions or things that can be expressed in different ways. And so you cannot have an unbiased system, that’s just an impossibility.

(01:37:08) And so what’s the answer to this? And the answer is the same answer that we found in liberal democracy about the press, the press needs to be free and diverse. We have free speech for a good reason, is because we don’t want all of our information to come from a unique source because that’s opposite to the whole idea of democracy and progressive ideas and even science. In science, people have to argue for different opinions and science makes progress when people disagree and they come up with an answer and consensus forms, and it’s true in all democracies around the world.

(01:37:58) There is a future which is already happening where every single one of our interaction with the digital world will be mediated by AI systems, AI assistance. We’re going to have smart glasses, you can already buy them from Meta, the Ray-Ban Meta where you can talk to them and they are connected with an LLM and you can get answers on any question you have. Or you can be looking at a monument and there is a camera in the glasses you can ask it like, what can you tell me about this building or this monument? You can be looking at a menu in a foreign language, and I think we will translate it for you, or we can do real time translation if we speak different languages. So a lot of our interactions with the digital world are going to be mediated by those systems in the near future.

(01:38:53) Increasingly, the search engines that we’re going to use are not going to be search engines, they’re going to be dialogue systems that we just ask a question and it will answer and then point you to perhaps appropriate reference for it. But here is the thing, we cannot afford those systems to come from a handful of companies on the west coast of the US because those systems will constitute the repository of all human knowledge, and we cannot have that be controlled by a small number of people. It has to be diverse for the same reason the press has to be diverse, so how do we get a diverse set of AI assistance? It’s very expensive and difficult to train a base model, a base LLM at the moment, in the future it might be something different, but at the moment, that’s an LLM. So only a few companies can do this properly.

(01:39:50) And if some of those top systems are open source, anybody can use them, anybody can fine tune them. If we put in place some systems that allows any group of people, whether they are individual citizens, groups of citizens, government organizations, NGOs, companies, whatever, to take those open source AI systems and fine tune them for their own purpose on their own data, then we’re going to have a very large diversity of different AI systems that are specialized for all of those things.

(01:40:35) I tell you, I talked to the French government quite a bit, and the French government will not accept that the digital diet of all their citizens be controlled by three companies on the west coast of the US. That’s just not acceptable, it’s a danger to democracy regardless of how well-intentioned those companies are, and it’s also a danger to local culture, to values, to language. I was talking with the founder of Infosys in India, he’s funding a project to fine tune Llama 2, the open source model produced by Meta, so that Llama 2 two speaks all 22 official languages in India, it is very important for people in India. I was talking to a former colleague of mine, Moustapha Cisse, who used to be a scientist at Fair and then moved back to Africa, created a research lab for Google in Africa and now has a new startup Co-Kera.

(01:41:37) And what he’s trying to do, is basically have LLM that speak the local languages in Senegal so that people can have access to medical information because they don’t have access to doctors, it’s a very small number of doctors per capita in Senegal. You can’t have any of this unless you have open source platforms, so with open source platforms, you can have AI systems that are not only diverse in terms of political opinions or things of that-

Yann LeCun (01:42:00) … AI systems that are not only diverse in terms of political opinions or things of that type, but in terms of language, culture, value systems, political opinions, technical abilities in various domains, and you can have an industry, an ecosystem of companies that fine tune those open source systems for vertical applications in industry. I don’t know, a publisher has thousands of books and they want to build a system that allows a customer to just ask a question about the content of any of their books, you need to train on their proprietary data. You have a company, we have one within Meta, it’s called Metamate, and it’s basically an LLM that can answer any question about internal stuff about the company, very useful.

(01:42:53) A lot of companies want this. A lot of companies want this not just for their employees, but also for their customers, to take care of their customers. So the only way you’re going to have an AI industry, the only way you’re going to have AI systems that are not uniquely biased is if you have open source platforms on top of which any group can build specialized systems. So the direction of inevitable direction of history is that the vast majority of AI systems will be built on top of open source platforms…

…Lex Fridman (02:04:21) You often say that a GI is not coming soon, meaning not this year, not the next few years, potentially farther away. What’s your basic intuition behind that?

Yann LeCun (02:04:35) So first of all, it’s not going to be an event. The idea somehow, which is popularized by science fiction and Hollywood, that somehow somebody is going to discover the secret to AGI or human-level AI or AMI, whatever you want to call it, and then turn on a machine and then we have AGI, that’s just not going to happen. It’s not going to be an event. It’s going to be gradual progress. Are we going to have systems that can learn from video how the world works and learn good representations? Yeah. Before we get them to the scale and performance that we observe in humans it’s going to take quite a while. It’s not going to happen in one day. Are we going to get systems that can have large amount of associated memory so they can remember stuff? Yeah, but same, it’s not going to happen tomorrow. There is some basic techniques that need to be developed. We have a lot of them, but to get this to work together with a full system is another story.

(02:05:37) Are we going to have systems that can reason and plan perhaps along the lines of objective-driven AI architectures that I described before? Yeah, but before we get this to work properly, it’s going to take a while. Before we get all those things to work together, and then on top of this, have systems that can learn hierarchical planning, hierarchical representations, systems that can be configured for a lot of different situation at hand, the way the human brain can, all of this is going to take at least a decade and probably much more because there are a lot of problems that we’re not seeing right now that we have not encountered, so we don’t know if there is an easy solution within this framework. So it’s not just around the corner. I’ve been hearing people for the last 12, 15 years claiming that AGI is just around the corner and being systematically wrong. I knew they were wrong when they were saying it. I called their bullshit…

…Lex Fridman (02:08:48) So you push back against what are called AI doomers a lot. Can you explain their perspective and why you think they’re wrong?

Yann LeCun (02:08:59) Okay, so AI doomers imagine all kinds of catastrophe scenarios of how AI could escape or control and basically kill us all, and that relies on a whole bunch of assumptions that are mostly false. So the first assumption is that the emergence of super intelligence is going to be an event, that at some point we’re going to figure out the secret and we’ll turn on a machine that is super intelligent, and because we’d never done it before, it’s going to take over the world and kill us all. That is false. It’s not going to be an event. We’re going to have systems that are as smart as a cat, have all the characteristics of human-level intelligence, but their level of intelligence would be like a cat or a parrot maybe or something. Then we’re going to work our way up to make those things more intelligent. As we make them more intelligent, we’re also going to put some guardrails in them and learn how to put some guardrails so they behave properly.

(02:10:03) It’s not going to be one effort, that it’s going to be lots of different people doing this, and some of them are going to succeed at making intelligent systems that are controllable and safe and have the right guardrails. If some other goes rogue, then we can use the good ones to go against the rogue ones. So it’s going to be my smart AI police against your rogue AI. So it’s not going to be like we’re going to be exposed to a single rogue AI that’s going to kill us all. That’s just not happening. Now, there is another fallacy, which is the fact that because the system is intelligent, it necessarily wants to take over. There is several arguments that make people scared of this, which I think are completely false as well.

(02:10:48) So one of them is in nature, it seems to be that the more intelligent species otherwise end up dominating the other and even distinguishing the others sometimes by design, sometimes just by mistake. So there is thinking by which you say, “Well, if AI systems are more intelligent than us, surely they’re going to eliminate us, if not by design, simply because they don’t care about us,” and that’s just preposterous for a number of reasons. First reason is they’re not going to be a species. They’re not going to be a species that competes with us. They’re not going to have the desire to dominate because the desire to dominate is something that has to be hardwired into an intelligent system. It is hardwired in humans. It is hardwired in baboons, in chimpanzees, in wolves, not in orangutans. The species in which this desire to dominate or submit or attain status in other ways is specific to social species. Non-social species like orangutans don’t have it, and they are as smart as we are, almost, right?

Lex Fridman (02:12:09) To you, there’s not significant incentive for humans to encode that into the AI systems, and to the degree they do, there’ll be other AIs that punish them for it, I’ll compete them over it.

Yann LeCun (02:12:23) Well, there’s all kinds of incentive to make AI systems submissive to humans.

Lex Fridman (02:12:26) Right.

Yann LeCun (02:12:27) Right? This is the way we’re going to build them. So then people say, “Oh, but look at LLMs. LLMs are not controllable,” and they’re right. LLMs are not controllable. But objectively-driven AI, so systems that derive their answers by optimization of an objective means they have to optimize this objective, and that objective can include guardrails. One guardrail is, obey humans. Another guardrail is, don’t obey humans if it’s hurting other humans within limits.

Lex Fridman (02:12:57) Right. I’ve heard that before somewhere, I don’t remember

Yann LeCun (02:12:59) Yes, maybe in a book.

Lex Fridman (02:13:01) Yeah, but speaking of that book, could there be unintended consequences also from all of this?

Yann LeCun (02:13:09) No, of course. So this is not a simple problem. Designing those guardrails so that the system behaves properly is not going to be a simple issue for which there is a silver bullet for which you have a mathematical proof that the system can be safe. It’s going to be a very progressive, iterative design system where we put those guardrails in such a way that the system behave properly. Sometimes they’re going to do something that was unexpected because the guardrail wasn’t right and we’re dd correct them so that they do it right. The idea somehow that we can’t get it slightly wrong because if we get it slightly wrong, we’ll die is ridiculous. We are just going to go progressively. It is just going to be, the analogy I’ve used many times is turbojet design. How did we figure out how to make turbojet so unbelievably reliable?

(02:14:07) Those are incredibly complex pieces of hardware that run at really high temperatures for 20 hours at a time sometimes, and we can fly halfway around the world on a two-engine jetliner at near the speed of sound. Like how incredible is this? It’s just unbelievable. Did we do this because we invented a general principle of how to make turbojets safe? No, it took decades to fine tune the design of those systems so that they were safe. Is there a separate group within General Electric or Snecma or whatever that is specialized in turbojet safety? No. The design is all about safety, because a better turbojet is also a safer turbojet, so a more reliable one. It’s the same for AI. Do you need specific provisions to make AI safe? No, you need to make better AI systems, and they will be safe because they are designed to be more useful and more controllable…

…Lex Fridman (02:28:45) Well, it’ll be at the very least, absurdly comedic. Okay. So since we talked about the physical reality, I’d love to ask your vision of the future with robots in this physical reality. So many of the kinds of intelligence that you’ve been speaking about would empower robots to be more effective collaborators with us humans. So since Tesla’s Optimus team has been showing us some progress on humanoid robots, I think it really reinvigorated the whole industry that I think Boston Dynamics has been leading for a very, very long time. So now there’s all kinds of companies Figure AI, obviously Boston Dynamics.

Yann LeCun (02:29:30) Unitree.

Lex Fridman (02:29:30) Unitree, but there’s a lot of them.

Yann LeCun (02:29:33) There’s a few of them.

Lex Fridman (02:29:33) It’s great. It’s great. I love it. So do you think there’ll be millions of humanoid robots walking around soon?

Yann LeCun (02:29:44) Not soon, but it’s going to happen. The next decade I think is going to be really interesting in robots, the emergence of the robotics industry has been in the waiting for 10, 20 years without really emerging other than for pre-program behavior and stuff like that. And the main issue is, again, the Moravec paradox, how do we get those systems to understand how the world works and plan actions? And so we can do it for really specialized tasks. And the way Boston Dynamics goes about it is basically with a lot of handcrafted dynamical models and careful planning in advance, which is very classical robotics with a lot of innovation, a little bit of perception, but it’s still not, they can’t build a domestic robot.

(02:30:41) We’re still some distance away from completely autonomous level five driving, and we’re certainly very far away from having level five autonomous driving by a system that can train itself by driving 20 hours like any 17-year-old. So until we have, again, world models, systems that can train themselves to understand how the world works, we’re not going to have significant progress in robotics. So a lot of the people working on robotic hardware at the moment are betting or banking on the fact that AI is going to make sufficient progress towards that…

…Yann LeCun (02:38:29) I love that question. We can make humanity smarter with AI. AI basically will amplify human intelligence. It’s as if every one of us will have a staff of smart AI assistants. They might be smarter than us. They’ll do our bidding, perhaps execute a task in ways that are much better than we could do ourselves, because they’d be smarter than us. And so it’s like everyone would be the boss of a staff of super smart virtual people. So we shouldn’t feel threatened by this any more than we should feel threatened by being the manager of a group of people, some of whom are more intelligent than us. I certainly have a lot of experience with this, of having people working with me who are smarter than me.

(02:39:35) That’s actually a wonderful thing. So having machines that are smarter than us, that assist us in all of our tasks, our daily lives, whether it’s professional or personal, I think would be an absolutely wonderful thing. Because intelligence is the commodity that is most in demand. That’s really what I mean. All the mistakes that humanity makes is because of lack of intelligence really, or lack of knowledge, which is related. So making people smarter, we just can only be better. For the same reason that public education is a good thing and books are a good thing, and the internet is also a good thing, intrinsically and even social networks are a good thing if you run them properly.

(02:40:21) It’s difficult, but you can. Because it helps the communication of information and knowledge and the transmission of knowledge. So AI is going to make humanity smarter. And the analogy I’ve been using is the fact that perhaps an equivalent event in the history of humanity to what might be provided by generalization of AI assistant is the invention of the printing press. It made everybody smarter, the fact that people could have access to books. Books were a lot cheaper than they were before, and so a lot more people had an incentive to learn to read, which wasn’t the case before.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Apple, Alphabet (parent of Google), Microsoft, Meta Platforms, and Tesla. Holdings are subject to change at any time.

What We’re Reading (Week Ending 24 March 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 24 March 2024:

1. The future of ‘communist capitalism’ in China – Martin Wolf

What is the economic future of China? This question raises many specific issues, notably China’s persistent macroeconomic imbalances, the threat of population decline and worsening relations with important parts of the outside world, above all, an increasingly hostile US. But underneath all of these lies a deeper one: is “communist capitalism”, that seemingly self-contradicting invention of Deng Xiaoping, inexorably fading away under Xi Jinping? Will China’s regime ossify and, in the end, collapse, as the Soviet Union did?…

…Much light on this issue is shed by China’s World View, a recently published book by David Daokui Li, a distinguished Harvard-trained professor of economics, who teaches at Tsinghua University. People interested in China, be they hawks or doves, should read Li’s valuable book carefully.

Perhaps its most startling observation is that “from 980 until 1840, the beginning of China’s modern history”, income per head declined. Ancient China was in a Malthusian trap. This picture is even worse than the one shown in the work of the late Angus Maddison. Even after 1840, this grim reality did not get much brighter. Only after Deng Xiaoping’s “reform and opening up” did it change.

By freeing the private economy, relying on market forces and opening up to the world economy, Deng created the conditions for an extraordinary transformation. Yet, by repressing demands for democracy in Tiananmen Square in 1989, he also reinforced communist party control. He invented a new political economy: today’s China is the result.

Is it also sustainable? Li’s book answers a clear “yes” to this question. In essence, he argues that China’s political system should be viewed not as Soviet, but as a modernised form of the traditional Chinese imperial state. This state is paternal. It is responsible for the people, but not accountable to them, except in one fundamental way: if it loses mass support, it will be overthrown. Its job is to provide stability and prosperity. But, in doing so, it does not try to run everything from the centre. That would be crazy in so vast a country: it decentralises to local levels. The communist party should, he argues, be seen fundamentally as the national party of China.

From this perspective, the Xi regime does not represent an abandonment of the goals of the Deng era, but rather an attempt to remedy some of the problems created by its reliance on “go-go” capitalism, namely, pervasive corruption, soaring inequality and environmental damage…

…When considering the prospects for China, one should not focus mainly on the list of obvious problems — falling property prices, excessive debt, excess savings, an ageing population and western hostility. All these can be dealt with by a country with China’s human resources and growth potential, even if with difficulty.

The bigger issue is whether, in the centralising, cautious and conservative era of Xi, Deng’s move from stagnation to explosive growth is doomed to reverse back into stagnation. If people come to believe that the dynamism of the recent past has been lost for good, then there is a risk of a downward spiral of disappointed hopes. But the force of 1.4bn people wanting a better life is extremely powerful. Will anything be allowed to halt it? The answer, I suspect, is still “no”.

2. Is diversification a blessing or curse? – Chin Hui Leong

DIVERSIFICATION is good or bad for you, depending on whom you ask. Warren Buffett, the legendary investor and businessman, once said that if you know what you’re doing, it makes little sense to diversify.

But Peter Lynch, a star mutual fund manager of the 1980s, had a different approach. He believed that the more stocks you own, the better your chances of finding a winner. Lynch was famous for holding up to 1,400 stocks in his portfolio.

Here’s the surprise: They both achieved remarkable success, despite their opposing positions. What does this mean for you as an investor? Should you diversify, or not?…

…As we delve deeper into diversification, we should not lose sight of its goal to reduce risk. This is where buying businesses from unrelated industries or geographies can go wrong. In fact, investors who diversify into areas where they lack expertise are taking more risk, not less. It makes little sense to do so, says Lynch. How well you know your stocks matters more than how many sectors or regions you spread your money across.

I agree with Lynch. Diversify only if you want to boost your chances of finding more winning stocks in your portfolio.

Here is a point you shouldn’t miss: you should always be looking to learn more about new businesses and industries. As you become more knowledgeable, you can grow your portfolio with more stocks you know well, but without exceeding your limits.

Remaining humble is key. Knowing the limits of your knowledge in any new area is how you keep yourself in check. As author Carl Richards once said, risk is what’s left when you think you’ve thought of everything…

…Here’s a simple rule of thumb to help you. If you’ve been following a new company for a year, invest no more than 1 per cent of your portfolio into the stock. If it’s five years, then up to 5 per cent. You can adjust the percentage to fit your risk appetite.

The point of this strategy is to have a reference point where you can match your risk level with your knowledge level…

…Finally, investing over time helps to spread your risk over years. Don’t worry about starting small in a stock. A winning stock is only known in hindsight. Here’s the point most people miss: if a stock is destined to be a winner, the stock price rise will happen over years, if not decades…

…Here’s the final conundrum: the mark of a successful portfolio is a concentrated portfolio. How can that be? Let’s say you invested $1,000 each into 10 stocks. Each stock will make up a tenth of this $10,000 portfolio.

After five years, the first one skyrockets, increasing by 10 times and is worth $10,000, while the last one goes to zero. The other eight stocks stay the same at $1,000. Do the math and you’ll end up with $18,000 in total. The big difference is, the winning stock will comprise more than 55 per cent of the five-year old portfolio…

…As you diversify to find more winners, the best of them will naturally rise to the top – thereby concentrating your portfolio in the right set of winning stocks. That’s more than any investor can wish for.

3. China has little choice but stimulus – Ethan Wu

The near universal reaction in the west to China’s refreshed 5 per cent gross domestic product growth target: good luck with that…

…The old growth drivers — property, infrastructure and manufacturing — all face major constraints. Property’s structural decline is well known; home prices and sales keep falling. Meanwhile, infrastructure is running into the limit of high debt levels. Chinese officials were dispatched last year to prod local governments to delever. It began with easy cost cuts: withholding wages from civil servants, delaying payments to vendors, slashing city services. But more recently, the deleveraging drive has been hitting infrastructure projects already under way, as Reuters reported in January:

Increasing its efforts to manage $13 trillion in municipal debt, the State Council in recent weeks issued a directive to local governments and state banks to delay or halt construction on projects with less than half the planned investment completed in 12 regions across the country, the sources said…

…Lastly, manufacturing. Since about 2020, the credit that once flowed to the property sector has been redirected to manufacturing, especially in politically favoured sectors such as solar and electric vehicles. The year-over-year growth rate of loans to Chinese industry has risen steadily, though the level is now declining..

…This pivot back to manufacturing is “radical”, says Adam Wolfe of Absolute Strategy Research, and it has generated important victories for China. Most notably, BYD is now the world’s biggest EV maker, and China the biggest auto exporter. But it has also created an enormous oversupply of manufactured goods, which, when combined with limp demand at home, is crushing industrial margins and fuelling deflation…

…China’s manufacturing trade surplus is already huge, perhaps 2 per cent of world GDP. As Gavekal’s Yanmei Xie wrote in the FT last month, western countries sensibly fear China dumping cheap goods into export markets. A cheap renminbi heightens the threat; trade retaliation is widely anticipated. If that is right, export-led growth probably can’t be China’s escape valve.

This glum picture suggests that China may soon be forced into stimulus. Assuming the GDP target is at least somewhat binding, no sector of the Chinese economy stands ready to get growth to 5 per cent. A pick-up in consumption could do it, but we’ve heard no convincing story for why anxious consumers would suddenly become gripped by animal spirits…

…The unclear stimulus outlook has left the bulk of investors nervous, but equity outflows have at least stopped. The stock market has rallied 14 per cent since early February, but only because of ample support from the state. Value trade or value trap?

What keeps us sceptical is the fact that Chinese stocks are not loads cheaper than global stocks. After the rally, the CSI 300 trades at 13x forward earnings, versus 14x for the MSCI all-country world ex-US index. To us the risks in China stocks are much clearer than the reward.

4. Exxon Barges in on Hess Deal – Matt Levine

I, on the other hand, used to be a convertible bond investment banker, so I have somewhat more than the usual familiarity with them. I could tell you, for instance, that it is common in the US for a convertible to be done as a Rule 144A offering, meaning that the bonds are sold to large “qualified institutional buyers” (QIBs) in a private placement and then can’t be resold to retail investors. Doing a 144A deal is generally faster and cheaper than doing a public deal that is registered with the US Securities and Exchange Commission, and retail investors don’t really buy convertibles anyway.

But eventually the institutional buyers of a 144A deal will want to be able to convert their bonds into regular, publicly traded stock, so there needs to be some mechanism for turning “144A” convertibles into “registered” ones. I am old enough that, when I started as a converts banker, the way to do this was to file a registration statement with the SEC, but the modern approach is pretty much that you wait six months or a year and the convertible becomes freely tradeable as a legal matter.

As a practical matter, though, the way this works is that the bonds, when they are originally issued, have a “restrictive legend” on them saying that they can be sold only to institutional buyers, and after a year the company sends a notice to its transfer agent saying “you can take that legend off the bonds now.” And when the bonds have the legend, they can’t be freely traded; once the legend is off, they can be. Here I am pretending, as one does, that “the bonds” are pieces of paper with a legend stamped on them, but of course they are actually entries in an electronic database; what really happens is that the original bonds have a “restricted CUSIP” (the identification number that every security has), telling transfer agents and depositaries and brokers and everyone else that they can only be sold to QIBs, and then after a year the company gets them a new “unrestricted CUSIP” and they trade freely. This is not hard — it’s a phone call or an email, maybe a legal opinion — but the company has to do it…

…So for instance here is the indenture for Avid Bioservices Inc.’s 1.25% exchangeable senior notes due 2026, a convertible bond it issued in 2021.4 Section 4.06(e) of the indenture, the 94-page contract governing the bonds, says:

If, and for so long as, the restrictive legend on the Notes specified in ‎‎Section 2.05(c) has not been removed, the Notes are assigned a restricted CUSIP or the Notes are not otherwise freely tradable … as of the 370th day after the last date of original issuance of the Notes, the Company shall pay Additional Interest on the Notes at a rate equal to 0.50% per annum of the principal amount of Notes outstanding until the restrictive legend on the Notes has been removed. …

…Avid forgot, for two years, to take the restrictive legend off of its convertible. This was very understandable: Its obligation to remove the restricted legend was boring and technical and buried in Section 4.06(e) of a bond indenture that surely nobody read. It could only remove the legend a year after it issued the bonds, after everyone had stopped paying attention. And, as Avid points out, it “did not receive any notices and was not otherwise made aware” of this provision in, sure, a contract that it signed, but a very long and boring contract. (And, to be fair, the holders forgot too!) And because it completely forgot about its obligation to remove the legend, Avid also forgot to pay the 0.5% penalty interest rate for two years. And because it forgot to pay the extra interest, it created a non-curable default on the bonds: The holders can demand all of their money back, with interest, immediately, with no chance for Avid to fix the problem by removing the legend and paying the overdue interest…

…This is a bad oopsie by Avid, which probably should have put a reminder in its calendar to unrestrict the CUSIP. But it’s a clever trade by whoever this holder was: The old bonds are far out-of-the-money (that is, they’re not going to convert into stock), and Bloomberg tells me that they were trading in the high 70s as recently as a month ago (the high 80s more recently). If you had noticed Avid’s extremely technical oopsie, you could have bought the bonds at, say, 80 cents on the dollar, sent them a letter saying “we gotcha hahahaha,” and made a quick 20 points, plus interest. The holder owns “at least 25%” of the bonds (the amount required to accelerate), and there are $143.75 million of bonds outstanding; 20 points on 25% of $143.75 million is $7.2 million. Plus interest.

5. Sora, Groq, and Virtual Reality – Ben Thompson

Groq was founded in 2016 by Jonathan Ross, who created Google’s first Tensor Processing Unit; Ross’s thesis was that chips should take their cue from software-defined networking: instead of specialized hardware for routing data, a software-defined network uses commodity hardware with a software layer to handle the complexity of routing. Indeed, Groq’s paper explaining their technology is entitled “A Software-defined Tensor Streaming Multiprocessor for Large-scale Machine Learning.”

To that end Groq started with the compiler, the software that translates code into machine language that can be understood by chips; the goal was to be able to reduce machine-learning algorithms into a format that could be executed on dramatically simpler processors that could operate at very high speed, without expensive memory calls and prediction misses that make modern processors relatively slow.

The end result is that Groq’s chips are purely deterministic: instead of the high-bandwidth memory (HBM) used for modern GPUs or Dynamic Random Access Memory (DRAM) used in computers, both of which need to be refreshed regularly to function (which introduces latency and uncertainty about the location of data at a specific moment in time), Groq uses SRAM — Static Random Access Memory. SRAM stores data in what is called a bistable latching circuitry; this, unlike the transistor/capacitor architecture undergirding DRAM (and by extension, HBM), stores data in a stable state, which means that Groq always knows exactly where every piece of data is at any particular moment in time. This allows the Groq compiler to, in an ideal situation, pre-define every memory call, enabling extremely rapid computation with a relatively simple architecture.

It turns out that running inference on transformer-based models is an extremely ideal situation, because the computing itself is extremely deterministic. An LLM like GPT-4 processes text through a series of layers which have a predetermined set of operations, which is perfectly suited to Groq’s compiler. Meanwhile, token-based generation is a purely serial operation: every single token generated depends on knowing the previous token; there is zero parallelism for any one specific answer, which means the speed of token calculation is at an absolute premium…

…One of the arguments I have made as to why OpenAI CEO Sam Altman may be exploring hardware is that the closer an AI comes to being human, the more grating and ultimately gating are the little inconveniences that get in the way of actually interacting with said AI. It is one thing to have to walk to your desk to use a PC, or even reach into your pocket for a smartphone: you are, at all times, clearly interacting with a device. Having to open an app or wait for text in the context of a human-like AI is far more painful: it breaks the illusion in a much more profound, and ultimately disappointing, way. Groq suggests a path to keeping the illusion intact.

It is striking that Groq is a deterministic system running deterministic software that, in the end, produces probabilistic output. I explained deterministic versus probabilistic computing in ChatGPT Gets a Computer:

Computers are deterministic: if circuit X is open, then the proposition represented by X is true; 1 plus 1 is always 2; clicking “back” on your browser will exit this page. There are, of course, a huge number of abstractions and massive amounts of logic between an individual transistor and any action we might take with a computer — and an effectively infinite number of places for bugs — but the appropriate mental model for a computer is that they do exactly what they are told (indeed, a bug is not the computer making a mistake, but rather a manifestation of the programmer telling the computer to do the wrong thing).

I’ve already mentioned Bing Chat and ChatGPT; on March 14 Anthropic released another AI assistant named Claude: while the announcement doesn’t say so explicitly, I assume the name is in honor of the aforementioned Claude Shannon. This is certainly a noble sentiment — Shannon’s contributions to information theory broadly extend far beyond what Dixon laid out above — but it also feels misplaced: while technically speaking everything an AI assistant is doing is ultimately composed of 1s and 0s, the manner in which they operate is emergent from their training, not proscribed, which leads to the experience feeling fundamentally different from logical computers — something nearly human — which takes us back to hallucinations; Sydney was interesting, but what about homework?

The idea behind ChatGPT Gets a Computer is that large language models seem to operate somewhat similarly to the human brain, which is incredible and also imprecise, and just as we need a computer to do exact computations, so does ChatGPT. A regular computer, though, is actually the opposite of Groq: you get deterministic answers from hardware that is, thanks to the design of modern processors and memory, more probabilistic than you might think, running software that assumes the processor will handle endless memory calls and branch prediction.

In the end, though, we are back where we started: a computer would know where the bow and stern are on a ship, while a transformer-based model like Sora made a bad guess. The former calculates reality; the latter a virtual reality.

Imagine, though, Sora running on Groq (which is absolutely doable): could we have generated videos in real-time? Even if we could not, we are certainly much closer than you might have expected. And where, you might ask, would we consume those videos? How about on a head-mounted display like the Apple Vision Pro or Meta Quest? Virtual reality (my new definition) for virtual reality (the old definition).

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Apple and Meta Platforms. Holdings are subject to change at any time.

What We’re Reading (Week Ending 17 March 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 17 March 2024:

1. The Ultra-Pure, Super-Secret Sand That Makes Your Phone Possible – Vince Beiser

Spruce Pine is not a wealthy place. Its downtown consists of a somnambulant train station across the street from a couple of blocks of two‑story brick buildings, including a long‑closed movie theater and several empty storefronts.

The wooded mountains surrounding it, though, are rich in all kinds of desirable rocks, some valued for their industrial uses, some for their pure prettiness. But it’s the mineral in Glover’s bag—snowy white grains, soft as powdered sugar—that is by far the most important these days. It’s quartz, but not just any quartz. Spruce Pine, it turns out, is the source of the purest natural quartz—a species of pristine sand—ever found on Earth. This ultra‑elite deposit of silicon dioxide particles plays a key role in manufacturing the silicon used to make computer chips. In fact, there’s an excellent chance the chip that makes your laptop or cell phone work was made using sand from this obscure Appalachian backwater. “It’s a billion‑dollar industry here,” Glover says with a hooting laugh. “Can’t tell by driving through here. You’d never know it.”

In the 21st century, sand has become more important than ever, and in more ways than ever. This is the digital age, in which the jobs we work at, the entertainment we divert ourselves with, and the ways we communicate with one another are increasingly defined by the internet and the computers, tablets, and cell phones that connect us to it. None of this would be possible were it not for sand.

Most of the world’s sand grains are composed of quartz, which is a form of silicon dioxide, also known as silica. High‑purity silicon dioxide particles are the essential raw materials from which we make computer chips, fiber‑optic cables, and other high‑tech hardware—the physical components on which the virtual world runs. The quantity of quartz used for these products is minuscule compared to the mountains of it used for concrete or land reclamation. But its impact is immeasurable…

…In the mid‑1950s, thousands of miles from North Carolina, a group of engineers in California began working on an invention that would become the foundation of the computer industry. William Shockley, a pathbreaking engineer at Bell Labs who had helped invent the transistor, had left to set up his own company in Mountain View, California, a sleepy town about an hour south of San Francisco, near where he had grown up. Stanford University was nearby, and General Electric and IBM had facilities in the area, as well as a new company called Hewlett‑Packard. But the area known at the time as the Santa Clara Valley was still mostly filled with apricot, pear, and plum orchards. It would soon become much better known by a new nickname: Silicon Valley.

At the time, the transistor market was heating up fast. Texas Instruments, Motorola, and other companies were all competing to come up with smaller, more efficient transistors to use in, among other products, computers. The first American computer, dubbed ENIAC, was developed by the army during World War II; it was 100 feet long and 10 feet high, and it ran on 18,000 vacuum tubes.

Transistors, which are tiny electronic switches that control the flow of electricity, offered a way to replace those tubes and make these new machines even more powerful while shrinking their tumid footprint. Semiconductors—a small class of elements, including germanium and silicon, which conduct electricity at certain temperatures while blocking it at others—looked like promising materials for making those transistors.

At Shockley’s startup, a flock of young PhDs began each morning by firing up kilns to thousands of degrees and melting down germanium and silicon. Tom Wolfe once described the scene in Esquire magazine: “They wore white lab coats, goggles, and work gloves. When they opened the kiln doors weird streaks of orange and white light went across their faces . . . they lowered a small mechanical column into the goo so that crystals formed on the bottom of the column, and they pulled the crystal out and tried to get a grip on it with tweezers, and put it under microscopes and cut it with diamond cutters, among other things, into minute slices, wafers, chips; there were no names in electronics for these tiny forms.”

Shockley became convinced that silicon was the more promising material and shifted his focus accordingly. “Since he already had the first and most famous semiconductor research and manufacturing company, everyone who had been working with germanium stopped and switched to silicon,” writes Joel Shurkin in his biography of Shockley, Broken Genius. “Indeed, without his decision, we would speak of Germanium Valley.”

Shockley was a genius, but by all accounts he was also a lousy boss. Within a couple of years, several of his most talented engineers had jumped ship to start their own company, which they dubbed Fairchild Semiconductor. One of them was Robert Noyce, a laid‑back but brilliant engineer, only in his mid‑20s but already famous for his expertise with transistors.

The breakthrough came in 1959, when Noyce and his colleagues figured out a way to cram several transistors onto a single fingernail‑sized sliver of high‑purity silicon. At almost the same time, Texas Instruments developed a similar gadget made from germanium. Noyce’s, though, was more efficient, and it soon dominated the market. NASA selected Fairchild’s microchip for use in the space program, and sales soon shot from almost nothing to $130 million a year. In 1968, Noyce left to found his own company. He called it Intel, and it soon dominated the nascent industry of programmable computer chips.

Intel’s first commercial chip, released in 1971, contained 2,250 transistors. Today’s computer chips are often packed with transistors numbering in the billions. Those tiny electronic squares and rectangles are the brains that run our computers, the Internet, and the entire digital world. Google, Amazon, Apple, Microsoft, the computer systems that underpin the work of everything from the Pentagon to your local bank—all of this and much more is based on sand, remade as silicon chips.

Making those chips is a fiendishly complicated process. They require essentially pure silicon. The slightest impurity can throw their tiny systems out of whack.

Finding silicon is easy. It’s one of the most abundant elements on Earth. It shows up practically everywhere bound together with oxygen to form SiO2, aka quartz. The problem is that it never occurs naturally in pure, elemental form. Separating out the silicon takes considerable doing.

Step one is to take high‑purity silica sand, the kind used for glass. (Lump quartz is also sometimes used.) That quartz is then blasted in a powerful electric furnace, creating a chemical reaction that separates out much of the oxygen. That leaves you with what is called silicon metal, which is about 99 percent pure silicon. But that’s not nearly good enough for high‑tech uses. Silicon for solar panels has to be 99.999999 percent pure—six 9s after the decimal. Computer chips are even more demanding. Their silicon needs to be 99.99999999999 percent pure—eleven 9s. “We are talking of one lonely atom of something that is not silicon among billions of silicon companions,” writes geologist Michael Welland in Sand: The Never-Ending Story.

Getting there requires treating the silicon metal with a series of complex chemical processes. The first round of these converts the silicon metal into two compounds. One is silicon tetrachloride, which is the primary ingredient used to make the glass cores of optical fibers. The other is trichlorosilane, which is treated further to become polysilicon, an extremely pure form of silicon that will go on to become the key ingredient in solar cells and computer chips.

Each of these steps might be carried out by more than one company, and the price of the material rises sharply at each step. That first‑step, 99 percent pure silicon metal goes for about $1 a pound; polysilicon can cost 10 times as much.

The next step is to melt down the polysilicon. But you can’t just throw this exquisitely refined material in a cook pot. If the molten silicon comes into contact with even the tiniest amount of the wrong substance, it causes a ruinous chemical reaction. You need crucibles made from the one substance that has both the strength to withstand the heat required to melt polysilicon, and a molecular composition that won’t infect it. That substance is pure quartz.

THIS IS WHERE Spruce Pine quartz comes in. It’s the world’s primary source of the raw material needed to make the fused‑quartz crucibles in which computer‑chip‑grade polysilicon is melted. A fire in 2008 at one of the main quartz facilities in Spruce Pine for a time all but shut off the supply of high‑purity quartz to the world market, sending shivers through the industry.

Today one company dominates production of Spruce Pine quartz. Unimin, an outfit founded in 1970, has gradually bought up Spruce Pine area mines and bought out competitors, until today the company’s North Carolina quartz operations supply most of the world’s high‑ and ultra‑high‑purity quartz. (Unimin itself is now a division of a Belgian mining conglomerate, Sibelco.)

In recent years, another company, the imaginatively titled Quartz Corp, has managed to grab a small share of the Spruce Pine market. There are a very few other places around the world producing high‑purity quartz, and many other places where companies are looking hard for more. But Unimin controls the bulk of the trade.

The quartz for the crucibles, like the silicon they will produce, needs to be almost absolutely pure, purged as thoroughly as possible of other elements. Spruce Pine quartz is highly pure to begin with, and purer still after being put through several rounds of froth flotation. But some of the grains may still have what Glover calls interstitial crystalline contamination—molecules of other minerals attached to the quartz molecules.

That’s frustratingly common. “I’ve evaluated thousands of quartz samples from all over the world,” says John Schlanz, chief minerals processing engineer at the Minerals Research Laboratory in Asheville, about an hour from Spruce Pine. “Near all of them have contaminate locked in the quartz grains that you can’t get out.”

Some Spruce Pine quartz is flawed in this way. Those grains are used for high‑end beach sand and golf course bunkers—most famously the salt‑white traps of Augusta National Golf Club, site of the iconic Masters Tournament. A golf course in the oil‑drunk United Arab Emirates imported 4,000 tons of this sand in 2008 to make sure its sand traps were world‑class, too.

The very best Spruce Pine quartz, however, has an open crystalline structure, which means that hydrofluoric acid can be injected right into the crystal molecules to dissolve any lingering traces of feldspar or iron, taking the purity up another notch. Technicians take it one step further by reacting the quartz with chlorine or hydrochloric acid at high temperatures, then putting it through one or two more trade‑secret steps of physical and chemical processing.

The result is what Unimin markets as Iota quartz, the industry standard of purity. The basic Iota quartz is 99.998 percent pure SiO2. It is used to make things like halogen lamps and photovoltaic cells, but it’s not good enough to make those crucibles in which polysilicon is melted. For that you need Iota 6, or the tip‑top of the line, Iota 8, which clocks in at 99.9992 percent purity—meaning for every one billion molecules of SiO , there are only 80 molecules of impurities. Iota 8 sells for up to $10,000 a ton. Regular construction sand, at the other end of the sand scale, can be had for a few dollars per ton…

…Unimin sells this ultra‑high‑purity quartz sand to companies like General Electric, which melts it, spins it, and fuses it into what looks like a salad bowl made of milky glass: the crucible. “It’s safe to say the vast majority of those crucibles are made from Spruce Pine quartz,” Schlanz says.

The polysilicon is placed in those quartz crucibles, melted down, and set spinning. Then a silicon seed crystal about the size of a pencil is lowered into it, spinning in the opposite direction. The seed crystal is slowly withdrawn, pulling behind it what is now a single giant silicon crystal. These dark, shiny crystals, weighing about 220 pounds, are called ingots.

The ingots are sliced into thin wafers. Some are sold to solar cell manufacturers. Ingots of the highest purity are polished to mirror smoothness and sold to a chipmaker like Intel. It’s a thriving multi-billion dollar industry in 2012.

The chipmaker imprints patterns of transistors on the wafer using a process called photolithography. Copper is implanted to link those billions of transistors to form integrated circuits. Even a minute particle of dust can ruin the chip’s intricate circuitry, so all of this happens in what’s called a clean room, where purifiers keep the air thousands of times cleaner than a hospital operating room. Technicians dress in an all‑covering white uniform affectionately known as a bunny suit. To ensure the wafers don’t get contaminated during manufacture, many of the tools used to move and manipulate them are, like the crucibles, made from high‑purity quartz.

The wafers are then cut into tiny, unbelievably thin quadrangular chips—computer chips, the brains inside your mobile phone or laptop. The whole process requires hundreds of precise, carefully controlled steps. The chip that results is easily one of the most complicated man‑made objects on Earth, yet made with the most common stuff on Earth: humble sand.

The total amount of high‑purity quartz produced worldwide each year is estimated at 30,000 tons—less than the amount of construction sand produced in the United States every hour. (And even construction sand is in high demand; there’s a thriving black market in the stuff.) Only Unimin knows exactly how much Spruce Pine quartz is produced, because it doesn’t publish any production figures. It is an organization famously big on secrecy. “Spruce Pine used to be mom‑and‑ pop operations,” Schlanz says. “When I first worked up there, you could just walk into any of the operations. You could just go across the street and borrow a piece of equipment.”

NOWADAYS UNIMIN WON’T even allow staff of the Minerals Research Laboratory inside the mines or processing facilities. Contractors brought in to do repair work have to sign confidentiality agreements. Whenever possible, vice‑president Richard Zielke recently declared in court papers, the company splits up the work among different contractors so that no individual can learn too much.

Unimin buys equipment and parts from multiple vendors for the same reason. Glover has heard of contractors being blindfolded inside the processing plants until they arrive at the specific area where their jobs are and of an employee who was fired on the spot for bringing someone in without authorization. He says the company doesn’t even allow its employees to socialize with those of their competitors.

It was hard to check out Glover’s stories, because Unimin wouldn’t talk to me. Unlike most big corporations, its website lists no contact for a press spokesperson or public relations representative. Several emails to their general inquiries address went unanswered. When I called the company’s headquarters in Connecticut, the woman who answered the phone seemed mystified by the concept of a journalist wanting to ask questions.

She put me on hold for a few minutes, then came back to tell me the company has no PR department, but that if I faxed (faxed!) her my questions, someone might get back to me. Eventually I got in touch with a Unimin executive who asked me to send her my questions by email. I did so. The response: “Unfortunately, we are not in a position to provide answers at this point in time.”

2. It was never about LLM performance – Justin

The LLM community is obsessed with benchmarking model performance. Mistral released their new “flagship” model this week, and immediately focused the discussion on how it performs on “commonly used benchmarks” relative to other models:

The entire blog post (I’d recommend reading it) is just a read through of how this model performs relative to other models on benchmarks, from math and coding to multilingual capabilities…

…This tendency to fixate on benchmarks is understandable – right now, it’s basically the only semi-objective way to measure how these models stack up against each other. It’s something vendors in other spaces, like data streaming, do too. But it is dangerous because it misses the point of where this whole AI thing is going, and is a textbook product marketing anti-pattern.

In a trend that we’ve seen hundreds of times in developer tooling, the underlying LLM is not going to matter within a few years. Large Language Model performance is already highly commoditized, and will continue to head in that direction. All that will matter is the experience that you build on top of these models, and what that enables for your customers.

Let’s take a look at the ChatGPT interface. Here’s a common prompt I’ve been using for testing, asking the model to summarize the contents of an external link into a tweet thread. Unrelated aside, the responses to this prompt are virtually identical across every major LLM.

Which parts of this interface are the underlying model – GPT-4 in this case – and which are an experience built by OpenAI on top of the underlying model?

The text response, minus any formatting, is what the model generated. But the:

Ability of the model to access and scrape content from a web page
Context of the prompt, including setting the system as a helpful assistant
Formatting the response, like changing the numbers to gray UI for typing the prompt
Filepicker for attaching media to the prompt
Prompt history
Model switcher / picker (this one is meta)
Ability to persist and share the model responses

…and more not show here

are all not GPT-4, they’re features built by OpenAI on top of GPT-4 to create an experience that is helpful and worth paying for. Some of these are harder to build than others – OpenAI’s secret sauce obviously isn’t the little arrow that scrolls down to the bottom of the response. ChatGPT would be nothing without GPT-4 – but the reverse may also be true!

The retort to this line of reasoning is that these chat interfaces are primarily for non-technical users, while the real money for these model providers comes from developer use cases, building LLMs into user-facing applications. I’ve worked closely with one of the major model compute providers, so this is not foreign to me. But experience matters to developers too!

OpenAI has dedicated significant resources to building a seamless developer experience beyond “docs for the model.” Here’s their playground for prompting GPT models – you can adjust parameters like temperature and penalties, plus change the system prompt to be any other style…

…For a closed source model provider like OpenAI, the difference between what is model and what is experience is academic – you’re paying for both. They are one thing. But where this really matters is in open source. Does the convergence of open source performance to closed source performance really matter if the experience of using that open source is bad?…

…The open source discussion has been too anchored on reaching performance parity with OpenAI models. This is a small piece of the puzzle. For developers looking to build applications with these open source models, and especially the pro-sumer chat use case, users need to consider the holistic experience that model providers offer. Integrating LLMs into your app is almost never going to be the “drop in” experience you see on marketing sites – and my concern is that the “open source is approaching parity with OpenAI!” narrative is not actually true in a meaningful way.

Folks working in AI can look to previous examples of this phenomenon in developer tools for guidance: A couple of years ago, I wrote about how underlying performance of production relational databases is becoming commoditized, and vendors are focusing much more on developer experience. It’s going to happen here too, the question is just when.

3. Aravind Srinivas – Building An Answer Engine – Patrick O’Shaughnessy and Aravind Srinivas

Patrick: [00:07:28] It’s really cool to think about the sequencing to get there. We’ve had search engines. Like you said, it’s a hack to get the answers. You’re building what I think of today as an answer engine. I type something in, it’s just giving the answer directly with great citation and all this other stuff we’ll talk about. And the vision you’re articulating is this question engine can anticipate the things that I want to learn about and give them to me beforehand.

And I’d love to build up towards that. So maybe starting with the answer engine, explain to us how it works. Maybe you could do this via the time line of how you’ve built the product or something. But what are the components? What is happening behind the scenes when I type something into Perplexity either a question or a search query or whatever? Walk us through in some detail the actual goings on behind the scenes in terms of how the product works itself?

Aravind: [00:08:13] Yes. So when you type in a question into Perplexity, the first thing that happens is, it first reformulates the question, it tries to understand the question better, expands the question in terms of adding more suffixes or prefixes to it, to make it more well formatted. It speaks to the question engine part. And then after that, it goes and pulls so many links from the web that are relevant to this reformulated question.

There are so many paragraphs in each of those links. It takes only the relevant paragraphs from each of those links. And then an AI model, we typically call it large language model. It’s basically a model that’s been trained to predict the next word on the Internet and fine-tuned for being good at summarization and chats.

That AI model looks at all these chunks of knowledge the bits of study that surface from important or relevant links and takes only those parts that are relevant to answering your query and gives you a very concise four or five sentence answer, but also with references. Every sentence has a reference to which webpage or which chunk of knowledge it took from which webpage and puts it at the top in terms of sources.

That gets you to a nicely formatted rendered answer, sometimes in markdown bullets, or sometimes just generic paragraphs, sometimes it has images in it. But a great answer with references or citation so that if you want to dig deeper, you can go and visit the link. If you don’t want and just read the answer and ask a follow-up, you can engage in a conversation, both modes of usage are encouraged and allowed. So this is what happens on Perplexity today.

Patrick: [00:09:51] What percent of users end up clicking beneath the summarized answer into a source webpage?

Aravind: [00:10:01] At least 10%.

Patrick: [00:10:02] So 90% of the time, they’re just satisfied with what you give them?

Aravind: [00:10:06] It depends on how you look at it. If you wanted to be 100% of the time, people always click on a link, that’s the traditional Google. And you want to be 100% of the time where people never click on links, that’s ChatGPT. We think the sweet spot is somewhere in the middle. People should click on link sometimes to go do their work there. Let’s say, you’re just booking a ticket, you might actually want to go away Expedia or something.

Let’s say you’re deciding where to go first. You don’t need to go away and read all these SEO blogs and get confused on what you want to do. You first make your decision independently with this research body that’s helping you decide. And once you finished your research and you have decided, then that’s when you actually have to go out and do your actual action of booking your ticket. That way, I believe there is a nice sweet spot of one product providing you both the navigational search experience as well as the answer engine experience together. And that’s what we strive to be doing…

…Patrick: [00:13:54] Can you explain from an insider’s perspective and someone building an application on top of these incredible new technologies, what do you think the future might look like or even what you think the ideal future would be for how many different LLM providers there are, how specialized they get to scale the primary answer, so there’s only going to be a few of them. How do you think about all this and where you think it might go?

Aravind: [00:14:16] It really depends on who you’re building for. If you’re building for consumers, you do want to build a scalable infrastructure because you do want to ask many consumers to use your product. If you’re building for the enterprise, you still want a scalable infrastructure.

Now it really depends, are you building for the people within that company who are using your product. Let’s say, you’re building an internal search engine, you only need to scale to the size of the largest organization, which is like maybe 100,000 people. And not all of them will be using your thing at one moment. You’re decentralizing it, you’re going to keep different servers for different companies and you can elastically decide what’s the level of throughput you need to offer.

But then if you’re solving another enterprise’s problem, where that enterprise is serving consumers and you’re helping them do that, you need to build scalable infrastructure indirectly at least. For example, OpenAI. Their APIs are used by us, other people to serve a lot of consumers. So unless they solve that problem themselves, they’re unable to help other people solve their problem. Same thing with AWS.

So that’s one advantage you have of actually having a first-party product that your infrastructure is helping you serve. And by doing that, by forcing yourself to solve that hard problem, whatever you build can be used by others as well. Amazon build AWS first for Amazon. And because Amazon.com requires very robust infrastructure, that can be used by so many other people and so many other companies emerged by building on top of AWS.

Same thing happened with OpenAI. They needed robust infrastructure to serve the GPT-3 developer API and ChatGPT as a product. But once they got it all right, then they can now support other companies that are building on top of them. So it really depends on what’s your end goal and who you’re trying to serve and what’s the scale of our ambition…

…Patrick: [00:19:02] And when I think about the history of the product, which I was a pretty early user of, the first thing that pops to my mind is that it solves the hallucination problem, which has become less of a problem. But early on, everyone just didn’t know how to trust these things and you solved that. You gave citations, you can click through the underlying webpages, et cetera.

I’d love you to walk through what you view the major time line product milestones have been of Perplexity dating back to its start. The one I just gave could be one example. There was this possibility, but there was a problem and you solved it, at least that was my perception as a user. What have been the major milestones as you think back on the product and how it’s gotten better?

Aravind: [00:19:41] I would say the first major thing we did is really making the product a lot faster. When we first launched, the latency for every query was seven seconds, then we actually had to speed up the demo video to put it on Twitter so that it doesn’t look embarrassing.

And one of our early friendly investors, Daniel Gross who co-invests a lot with Nat Friedman, he was one of our first testers before we even released the product. And he said, you guys should call it a submit button for a query. It’s almost like you’re submitting a job and waiting on the cluster to get back. It’s that slow.

And now we are widely regarded as the fastest chatbot out there. Some people even come and ask me, why are you only as fast as ChatGPT? Why are you not faster? And little did they realize that ChatGPT doesn’t even use the web by default. It only uses it on the browsing mode on Bing.

So for us to be as fast as ChatGPT already tells you that in spite of doing more work to go pull up links from the web, read the chunks, pick the relevant ones and use that to give you the answer with sources and a lot more work on the rendering, despite doing all the additional work, if you’re managing an end-to-end latency as good as ChatGPT that shows we have like even a superior back end to them.

So I’m most proud about the speed at which we can do things today compared to when we launched, the accuracy has been constantly going up, primarily few things. One is we keep expanding our index and like keep improving the quality of the index. From the beginning, we knew all the mistakes that previous Google competitors did, which is obsessed about the size of your index and focus less on the quality.

So we decided from the beginning we would not obsess about the size. Size doesn’t matter and index actually, what matters is the quality of your index. What kind of domains are important for AI chatbots and question-answering and knowledge workers. That is what we care about. So that decision ended up being right.

The other thing that has helped us improve the accuracy was training these models to be focused on hallucinations. When you don’t have enough information in the search snippets, try to just say I don’t know, instead of making up things. LLMs are conditioned to always be helpful, will always try to serve the user’s query despite what it has access to, may not be even sufficient to answer the query. So that part took some reprogramming, rewiring. You’ve got to go and change the ways. You can’t just solve this with prompt engineering. So we have spent a lot of work on that.

The other thing I’m really proud about is getting our own inference infrastructure. So when you have to move outside the OpenAI models to serve your product, everybody thinks, “Oh, you just train a model to be as good as GPT and you’re’ done.” But reality is OpenAI’s mode is not just in the fact that they have trained the best models, but also that they have the most cost-efficient, scalable infrastructure for serving this on a large-scale consumer product like ChatGPT. That is itself a separate layer of mode. You can build that mode, you can build.

And so we are very proud of our inference team, how fast, high throughput, low latency infrastructure we built for serving our own LLMs. We took advantage of the open source revolution, Llama and Mistral and took all these models, trained them to be very good at being great answer bots and served them ourselves on GPU so that we get better margins on our product. So all these three layers, both in terms of speed through actual product back-end orchestration, accuracy of the AI models and serving our own AI models, we’ve done a lot of work on all these things…

…Patrick: [00:28:50] Can you expand on index. You’ve referenced that a few times for those that haven’t built one or haven’t thought about this. Just explain that whole concept and the decisions that you’ve made. You already mentioned quality versus size. But just explain what it means to build an index, why it’s so important, et cetera?

Aravind: [00:29:07] Yes. So what does an index mean, it’s basically a copy of the web. The web has so many links and you want a cache, you want a copy of all those links in the database, so a URL and the contents in that URL. Now the challenge here is new links are being created every day on the web and also existing links keep getting updated on the web as well. New sites keep getting updated. So you’ve got to periodically refresh them. The URL needs to be updated in the cache with a different version of it.

Similarly, you got to keep adding new URLs to your index, which means you’ve got to build a crawler. And then how you store a URL, the contents in that URL also matters. Not every page is native HTML anymore. The web has upgraded a lot, rendering JavaScript a lot, and every domain has custom-based rendered the JavaScript. So you’ve got to build parsers. So you’ve got to build a crawler, indexer, parser and that together makes up for a great index.

Now the next step comes to retrieval, which is now that you have those index, every time you hit a query, which links do you use? And which paragraphs in those links do you use? Now that is the ranking problem. How do you figure out what is relevance and ranking? And once you retrieve those chunks, like the top few chunks relevant to a query that the user is asking, that’s when the AI model comes in. So this is a retrieve part. Now the generic part. That’s why it’s called retrieve and generic.

So once you retrieve the relevant chunks from the huge index that you have, the AI model will come and read those chunks and then give you the answer. Doing this ensures that you don’t have to keep training the AI model to be up to date. What you want the AI model to do is to be intelligent, to be a good reasoning model.

Think about this as when you were a student, I’m sure you would have written an open book exam, open notes exam in school or high school or college. What are those exams test you for? They don’t test you for rote learning. So it doesn’t give an advantage to the person who has the best memory power. It gives advantage to person who has read the concepts, can immediately query the right part of the notes, but the questions required you to think on the fly as well.

That’s what we want to design systems. It’s very different philosophy from OpenAI, where OpenAI wants this one model that’s so intelligent, so smart, you can just ask it anything. It’s going to tell you. We rather want to build a small efficient model that’s smart, capable, can reason on facts that it’s given on the fly. And this ambiguate different individuals with different names or saved as not sufficient information, not get confused about dates.

When you’re asking something about the future, say that was not yet happened. These sort of corner cases handle all of those with good reasoning capabilities yet have access to all of the world’s knowledge at an instant through a great index. And if you can do both of these together end-to-end orchestrated with great latency and user experience, you’re creating something extremely valuable. So that’s what we want to build…

…Patrick: [00:37:26] Do you think that the transformer architecture is here to stay and will remain the dominant tool or architecture for a long time?

Aravind: [00:37:33] This is a question that everybody asks in the last six years or seven years since the first transformer came. Honestly, nothing has changed. The only thing that has changed is the transformer became a mixture of experts model, where there are multiple models and not just a single model. But the core self-attention model architecture has not changed. And people say there are shortcomings, the quadratic attention, complexities there. But any solution to that incurs costs somewhere else too.

Most of the people are not aware that majority of the computation in a large transformer like GPT-3 or 4 is not even spent on the attention layer. It’s actually spent on the matrix multiplies. So if you’re trying to focus more on the quadratic part, you’re incurring costs and the matrix multiples, and that’s actually the bottleneck in the larger scaling.

So honestly, it’s very hard to make an innovation on the transformer that can have a material impact at the level of GPT-4 complex cost of training those models. So I would bet more on innovations, auxiliary layers, like retrievable augmented generation. Why do you want to train a really large model when you don’t have to memorize all the facts from the Internet, when you literally have to just be a good reasoning model?

Nobody is going to value Patrick for knowing all facts. They’re going to value you for being an intelligent person, fluid intelligence. If I give you something very new that nobody else has an experience in, are you well positioned to learn that skill fast and start doing it really well. When you hire a new employee, what do you care about? Do you care about how much they know about something? Or do you care about whether you can give them any task and they would still get up to speed and do it, which employee would you value more?

So that’s the sort of intelligence that we should bake into these models, and that requires you to think more on the data. What are these models training on? Can we make them train on something else and just memorizing all the words on the Internet? Can we make reasoning emerge in these models through a different way? And that might not need innovation on the transformer, that may need innovation more on what data you’re throwing at these models.

Similarly, another layer of innovation that’s waiting to happen is the architecture like sparse versus dense models. Clearly, mixture of experts is working, GPT-4 is a mixture of experts, Mixtral is a mixture of experts, Gemini 1.5 is a mixture of experts. So even there, it’s not one model for coding, one model for reasoning and math, one model for history that depending on your input, it’s getting routed to the right model. It’s not that spares.

Every individual tokened is routed to a different model, but it’s happening every layer. So you’re still spending a lot of compute. How can we create something that’s actually 100 humans in one company? So the company itself has aggregated so much smarter. We’ve not created the equivalent at a model layer, more experimentation on the sparsity and more experimentation on how we can make reasoning emerge in a different way is likely to have a lot more impact than thinking about what is the next transformer.

4. Training great LLMs entirely from ground up in the wilderness as a startup – Yi Tay

People always assume it’s simply a question/debate of accelerator choice (TPUs vs GPUs etc) and all GPU clusters are created equal. For us, this soon proved to be false. As we sampled across different service providers, we find that the variance of hardware quality differs vastly even for the same hardware, i.e., GPUs (H100s). Note that here, hardware refers to overall cluster quality and not necessarily the chips or accelerators per se. Just like a lottery. Basically:

Not all hardware is created equal. The variance of cluster quality across hardware providers is so high that it is literally a lottery pertaining to how much pain one would have to go through to train good models. In short, a hardware lottery in the era of LLMs.

More specifically, we’ve leased a few clusters from several compute providers, each with a range of hundreds to thousands of chips. We’ve seen clusters that range from passable (just annoying problems that are solvable with some minor SWE hours) to totally unusable clusters that fail every few hours due to a myriad of reasons. Specifically, some clusters have nodes that fail every N hour with issues ranging from cabling issues (where N is unreasonably small), GPU hardware errors etc. Even more surprisingly, every cluster across the same provider could also be vastly different in terms of how robust it was…

…Did I mention you’ll also get a different Model Flop Utilisation (MFU) for different clusters!? This was a non negligible amount of compute wasted if one is unlucky enough to find a provider with badly cabled nodes or some other issues. Systems with very sub-optimal file systems would have the MFU of training runs tank the moment a team mate starts transferring large amounts of data across clusters.

Every service provider also had different levels of support. These range from being polite to nonchalant, “chatgpt-style” canned responses to blaming the user for every single thing that goes wrong.

Overall, every single cluster we tried feels like they have their own vibe, struggles and failure modes. It was also almost as though every single cluster needed their own hot-fixes for their own set of issues – some more tolerable than others. That said, we’ve learned that fail safes are important, and finding fast hot fixes for any clusters could be key…

…We’re training our models on GPUs for the most part at Reka. Personally, I’ve used TPUs all my life when it comes to large language model training at Google pre-Reka life. CUDA and nccl were the most alien thing to me ever. (I only learned it’s pronounced “Nickel” from one of my coworkers who used to work at Nvidia lol)

I was completely taken aback by the failure rate of GPUs as opposed to my experiences on TPUs at Google. In fact, I don’t actually recall TPUs failing much even for large runs, though I was not sure if I was protected from knowing this just by the sheer robustness of the outrageously good infra and having a dedicated hardware team. In fact, the UL2 20B model (at Google) was trained by leaving the job running accidentally for a month. It never failed. If this were in GPU land, it would have failed within the first few days for sure.

That said, I think this could be more about the competency of the hardware team that manages your accelerators rather than the underlying chip. The presence of having good hardware support (from your compute provider) is important. And so much hinges on them being actually competent, reinforcing the notion of the “hardware lottery”…

…It is no secret that my favourite codebase of all time is T5X and Mesh Tensorflow (named tensors ftw) but these options quickly became not viable as 1) they don’t get as much support outside Google, 2) they are kind of deprecated and 3) they are not friendly to folks on our team that are not xooglers.

We ended up going for something vanilla, seemingly stable and more popular (i.e., pytorch) that is more accessible to most people on the team (except me lol). In my first few months, I was tripping all over pip, git, docker and all these wild life stuff. Then again, I am not 100% sure about how stable or user friendly it would be to use a google codebase externally (it would have been pretty nasty I guess).

To be very frank, I would have to say the quality of codebases externally significantly lag behind those I’ve been used to at Google. Primarily because codebase within Google tends to be written by ML rockstars themselves (e.g, Noam Shazeer, Barret Zoph, Adam Roberts, Hyung Won Chung et al.) and just feel better (e.g., superior vibes) compared to those I’ve tried externally. In particular, I found myself super annoyed with the code quality when dabbling with stuff built by other companies (some way worse than others 🤗).

5. How The Interstate Highway System Changed American Industry – Lawrence Hamtil

Signed into law in 1956 by then President Dwight Eisenhower, the Federal Highway Act created the Interstate Highway System, which would become the largest and costliest public works project in history. Measuring almost 48,000 miles in total distance, the Interstate Highway System was completed only in 1992, more than three decades after work began, and for a total cost in today’s dollars of more than $500 billion…

…Among the beneficiaries of this huge outlay were the quarry owners and aggregate miners, who provided the gravel and rock on which the interstates were laid, the heavy machinery manufacturers who provided the graders, tractors, and steamrollers that turned those rocks into roads, and the oil and gas producers and refiners who made the gasoline and diesel that fueled the project…

…As families began to set out exploring the country on the new interstate system, restauranteurs such as Ray Kroc and Howard Johnson recognized the need to provide traveling families with predictable, familiar service. The idea of the chain restaurant was born as interstate exit ramps guided hungry motorists to McDonald’s and Howard Johnson’s. Families would also need places to say on longer journeys, so hotels followed restaurants in the chain model as franchises like Holiday Inn became a staple of interstate exits; early ads for the hotel underlined the value of the familiar by stating, “The best surprise is no surprise.”

The logistical flexibility provided by the interstate system also gave rise to a whole new model of retailing: big box stores began to set up in small towns offering rich variety and low prices to consumers previously left unserved by larger retailers. Walmart’s 1975 annual report detailed just such a model…

…Whereas not quite a century before the railroads had aided in the rise of Sears, Roebuck, and Co. as the first retailer with national reach, the interstate in the 1960s and 1970s would provide the backbone of Walmart’s logistical operations, with large distribution centers situated at critical points throughout the interstate network to facilitate inventory replenishment, as Professor Jesse LeCavalier has noted on his blog.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Alphabet (parent of Google), Amazon, Apple, and Microsoft. Holdings are subject to change at any time.

What We’re Reading (Week Ending 10 March 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 10 March 2024:

1. Flawed Valuations Threaten $1.7 Trillion Private Credit Boom – Silas Brown, Laura Benitez, John Sage, Kat Hidalgo, and Ellen Schneider

The meteoric rise of private credit funds has been powered by a simple pitch to the insurers and pensions who manage people’s money over decades: Invest in our loans and avoid the price gyrations of rival types of corporate finance. The loans will trade so rarely — in many cases, never — that their value will stay steady, letting backers enjoy bountiful and stress-free returns. This irresistible proposal has transformed a Wall Street backwater into a $1.7 trillion market.

Now, though, cracks in that edifice are starting to appear.

Central bankers’ rapid-fire rate hikes over the past two years have strained the finances of corporate borrowers, making it hard for many of them to keep up with interest payments. Suddenly, a prime virtue of private credit — letting these funds decide themselves what their loans are worth rather than exposing them to public markets — is looking like one of its greatest potential flaws.

Data compiled by Bloomberg and fixed-income specialist Solve, as well as conversations with dozens of market participants, highlight how some private-fund managers have barely budged on where they “mark” certain loans even as rivals who own the same debt have slashed its value.

In one loan to Magenta Buyer, the issuing vehicle of a cybersecurity company, the highest mark from a private lender at the end of September was 79 cents, showing how much it would expect to recoup for each dollar lent. The lowest mark was 46 cents, deep in distressed territory. HDT, an aerospace supplier, was valued on the same date between 85 cents and 49 cents…

…“As interest rates have risen, so has the riskiness of borrowers,” Lee Foulger, the Bank of England’s director of financial stability, strategy and risk, said in a recent speech. “Lagged or opaque valuations could increase the chance of an abrupt reassessment of risks or to sharp and correlated falls in value, particularly if further shocks materialize.”…

…Some market participants wonder, however, whether the fog around pricing suits investors just fine. Several fund managers, who requested anonymity when speaking for fear of endangering client relationships, say rather than wanting more disclosure, many backers share the desire to keep marks steady — prompting concerns about a code of silence between lenders and the insurers, sovereign wealth funds and pensions who’ve piled into the asset class.

One executive at a top European insurer says investors could face a nasty reckoning at the end of a loan’s term, when they can’t avoid booking any value shortfall. A fund manager who worked at one of the world’s biggest pension schemes, and who also wanted to remain anonymous, says valuations of private loan investments were tied to his team’s bonuses, and outside evaluators were given inconsistent access to information.

The thinly traded nature of this market may make it nigh-on impossible for most outsiders to get a clear picture of what these assets are worth, but red flags are easier to spot. Take the recent spike in so-called “payment in kind” (or PIK) deals, where a company chooses to defer interest payments to its direct lender and promises to make up for it in its final loan settlement.

This option of kicking the can down the road is often used by lower-rated borrowers and while it doesn’t necessarily signal distress, it does cause anxiety about what it might be obscuring…

…According to Solve, about three-quarters of PIK loans were valued at more than 95 cents on the dollar at the end of September. “This raises questions about how portfolio companies struggling with interest servicing are valued so high,” says Eugene Grinberg, the fintech’s cofounder.

An equally perplexing sign is the number of private funds who own publicly traded loans, and still value them much more highly than where the same loan is quoted in the public market.

In a recent example, Carlyle Group Inc.’s direct-lending arm helped provide a “second lien” junior loan to a US lawn-treatment specialist, TruGreen, marking the debt at 95 cents on the dollar in its filing at the end of September. The debt, which is publicly traded, was priced at about 70 cents by a mutual fund at the time…

…Thrasio is an e-commerce business whose loan valuations have been almost as varied as the panoply of product brands that it sells on Amazon, which runs from insect traps and pillows to cocktail shakers and radio-controlled monster trucks.

As the company has struggled lately, its lenders have been divided on its prospects. Bain Capital and Oaktree Capital Management priced its loans at 65 cents and 79 cents respectively at the close of September. Two BlackRock Inc. funds didn’t even agree: One valuing its loan at 71 cents, the other at 75 cents. Monroe Capital was chief optimist, marking the debt at 84 cents. Goldman Sachs Group Inc.’s asset management arm had it at 59 cents.

The Wall Street bank seems to have made the shrewder call. Thrasio filed for Chapter 11 on Wednesday as part of a debt restructuring deal and one of its public loans is quoted well below 50 cents, according to market participants. Oaktree lowered its mark to 60 cents in December…

…Distressed companies do throw up some especially surprising values. Progrexion, a credit-services provider, filed for bankruptcy in June after losing a long-running lawsuit against the US Consumer Financial Protection Bureau. Its bankruptcy court filing estimated that creditors at the front of the queue would get back 89% of their money. Later that month its New York-based lender Prospect Capital Corp. marked the senior debt at 100 cents…

…For private credit’s many champions, the criticism’s overblown. Fund managers argue that they don’t need to be as brutal on marking down prices because direct loans usually involve only one or a handful of lenders, giving them much more control during tough times. In their eyes, the beauty of this asset class is that they don’t have to jump every time there’s a bump in the road…

…Direct lenders also use far less borrowed money than bank rivals, giving regulators some comfort that any market blowup could be contained. They typically lock in cash they get from investors for much longer periods than banks, and they don’t tap customer deposits to pay for their risky lending. They tend to have better creditor protections, too.

2. An Interview with Nat Friedman and Daniel Gross Reasoning About AI – Ben Thompson, Nat Friedman, and Daniel Gross

The other release, I think around the same day, was Groq released a demo of using their processor online. This is about the processor, it’s not about the model. They’re using Mistral and Llama as the the available models, but the speed is truly remarkable. It strikes me as a big deal, not because what it says about Groq — that’s a different question and I actually I’m curious about your guys points of view on some questions there — but I’ve been on, for a long time, there is a user experience issue when it comes to AI, and a lot of the use cases we’re talking about where, because it is human-like, the vastness of the uncanny valley is very large and basically any friction in that experience matters way more than it matters with a phone. With a phone, when you’re pulling it out of your pocket or you’re sitting out of your device, you’re never not aware that you’re using a phone or that you’re using a computer. It’s never like, “Wow, I thought I was talking to a human, I was actually talking on my phone.” No, that’s never going to happen, and so you actually have way more latitude for user experience friction. However, when it comes to AI, the fact that it can sound like a human, speed matters, it matters hugely, and the reason why I thought that demo was a big deal was again, the business prospects of Groq aside, it was tangible that, yes, this is the right thesis. Speed actually makes an astronomical difference and it felt like validation of a view that I had on that.

DG: Yeah, I think we have pretty fast response times from our minds, I think the brain runs at a pretty high hertz, and depending on the mood that you’re in, there’s alpha, beta, gamma, but at the end of the day we perceive reality very quickly and we hadn’t quite had an experience where something was that instant and that fast and that fluid, but I think that’s only the beginning to be honest, and someone’s going to have to do the hard work of actually taking that concept, be it on Groq’s hardware or somewhere else and turning it into something that’s very polished, refined and a product that can handle interruptions, that sort of thing.

But once someone does that, if I had to guess, if we try to project forward in the next podcast or the one after that, what is the big new thing? It’s just this idea that we’re going to move into a more agentic world of models where what we have now is very Precambrian. You go to chat.openai.com and you put in a bunch of words and some words come out and at the end of the day the model is rhyming more than it’s thinking, and it’s a little slow and I think next era is to have actual agents do tasks for you on the Internet, converse with you at human speed, and I think the economy and market prices don’t factor this in at all.

Well, this is the reason to be optimistic about Groq. If you actually pencil out the cost of their systems, and part of the reasons why it’s so fast is every individual chip has a very small amount of SRAM, which keeps the data in place and is super expensive, but it’s deterministic, they know exactly where the data is, but that means they need big systems to have enough memory. That means they would need a large market to develop. So they’re pushing this cost per token idea, but you have to have just an astronomical amount of tokens moving through the system for that pricing to make sense. My sense though is speed actually matters so much that this is a use case unlocker.

NF: It’s a user interface unlocker too. With slow model outputs, you were forced to have this streaming tokenization, the stream of tokens basically coming at you and now with speed, speed has always been a feature and I think actually in many ways this is just a reminder of a perennial rule of user interface design, which is that speed matters, latency matters. It’s a funny thing because users usually don’t ask for it, but they just sense that they prefer the thing that’s snappy and they choose it over the thing that’s sluggish.

And I think that difference is, like I said, that much bigger for these sorts of models.

NF: But in this case I think it unlocks new types of UI, whereas previously you had to sit there and watch the model just stream tokens at you.

This is where you can actually talk to it and it feels normal. It doesn’t feel weird.

NF: Yeah. Well, it also actually, I think, feels more superhuman in a way, because you can get a whole essay in seconds and you can get a book in minutes and there’s a way in which the superhuman feeling is stronger, but also I think you could have the model, for example, if you’re willing to spend the money, it’s more reasonable to have the model explore several paths and maybe it’s going to try ten things and pick the one that works best because it can do it very quickly…

…Groq is really interesting because they’ve been around for a long time. Jonathan Ross, the founder, invented the TPU at Google and then set out to do it better in a certain respect. I think they almost died and then LLMs come along and suddenly they have this architecture that seems to works well. Again, you have this, under the surface, it’s quite deterministic that maps well to their approach.

You mentioned the scaling bit, Daniel. I think one of the questions that goes with this about chip design in general is at what point does it make sense to specialize even more than the GPU? The GPU is much more specialized than a CPU, but it’s still general purpose, and that comes with real costs when it comes to things like latency and things like that. Do these go hand in hand? If it actually is the case that scale is the answer to almost every problem, does that mean the opportunity for a more specialized architecture has arrived maybe sooner than we expected?

DG: I think so. And we are sitting here, I think, before the era of AI ASICs [Application-specific integrated circuit]. Maybe Groq is a little early to it because it’s been around for a little longer but if I had to guess, this is a big part of the future.

I think one of the main things that’s changed, I remember calling Jonathan the day after Llama came out, and I told him the industry is going to finally standardize around something where you can show people how great you are, because previously his issue was, he was parading around a bunch of these benchmarks and people had a tough time translating that into something that was so economically valuable they’d reconfigured their entire architecture for a specialized chip. It wasn’t just Jonathan, it was that whole era of your 2016, ’17 AI companies. What happened was really Meta created a standard by open sourcing Llama and everyone started thinking in terms of token output per second basically. That became a standard where you can perform by, and much more importantly, you can measure your balance sheet by.

AI companies go through two cycles when they train their models, they’re fairly margin, I think, insensitive, they just want the best GPUs, they don’t want to take any risk. You’re spending $300 million, you just want your model to “tape out” properly and then if you find product market fit, you switch to this inference era. Now in the inference era, you’re ultimately staring at your COGS and you’re staring at your COGS every month and you’re thinking, “Gosh, we’re paying so much per hour, per GPU, whatnot. It makes total sense for us to allocate five engineers and re-architect towards this completely different alien platform.” It’s an ASIC effectively, people would be upset if I call their chips ASICs but you get the idea.

Well, it’s more of that category than a GPU, yes.

DG: It’s a dedicated chip and it makes total sense to do that because you’re just staring at your COGS. It’s sort of like how much would you be willing to architect your infrastructure as a fintech company if you could lower your interchange rate? Well, the answer is usually a lot and the Nvidia margin is a kind of interchange rate for tokens, and you’re very much willing to do the work and the schlep for custom architecture if it works in a way that people just weren’t willing to do in 2017 because very few companies had revenue coming in.

The inference was smaller than the training market.

DG: The only people who had this, by the way, were the advertising companies, Meta and Google, and they had their own chips.

So I think ultimately that’s what happened is you’re now able to monetize these models in a way where you can do the mental math to yourself about why it makes sense to rewrite them for a custom architecture, and if I had to guess, Nvidia’s dominance in training, as far as I can tell, remains strong as ever. Over time, I don’t necessarily know that they’ll lose share, but the pie will grow and the inference pie is going to grow to some of these ASICs and to some extent it already has of course, with the TPU, and Meta has its own internal custom inference chips and that’s going to grow, I think, over time because it just makes economic sense to do so…

…There seems to be a groundswell of robotic foundation models that are coming, where we haven’t yet had this GPT-3 moment of robotics where you have a couple of hands on a desk and it can tie a shoe or it can decorate a cake or put a Lego together and do all those things relatively well or in a way that feels like the beginnings of robotic intelligence, but it seems like that’s coming in the next 12 or 18 months. We will see those demonstrations.

What’s enabling it is this belief in scaling and a few breakthroughs on the model architecture side and what’s holding it back is data. You don’t have the common crawl of robotic data, you can’t scrape the Internet for robotic instruction data and so all the efforts going into collecting those data sets and the early demonstrations are really impressive and they do involve local learned models for things like motion and kinematics and balance and stuff like that in some cases.

Is data going to be a real differentiator in that there’s going to be fights for exclusive data sets, or will it become a commodity where everyone realizes the way you actually differentiate is with the product and it’s actually to everyone’s benefit to have access to the best data sets and there’ll be more collective action?

NF: I think this is a really good question. If it had happened a few years ago, I think it would’ve been much more likely that there’d be common data sets. There are a few open robotic data sets, but they’re pretty small, pretty low quality and now that we’re already in the AI gold rush, it seems likely that the really expensive project of collecting a bunch of data, whether that’s through teleoperations or something else, will happen inside funded companies, either big companies or smaller.

Does this apply to data generally, just because maybe theoretically it’d be best for everyone to adopt a collective approach to have a high-minded where we’re going to actually differentiate, but right now the stakes are so high, everyone’s like, “Nope, my data, I’m not going to share”?

NF: The walls are going up, definitely the shutters are down on data, it used to be easier to scrape websites than it is today. Scraping has gotten harder, generally, you see that across the board. So I think companies, that at one point didn’t view the content of all their UGC as an asset, now suddenly do. They say, “Wait, we’ve got this big data set that can be trained on.”…

…NF: The bet on long context is very important and we think that being able to not just retrieve out of but reason over huge amounts of information, is a super, I mean, it’s partly a human ability. We have episodic memory and we have procedural memory and the ability to retain skills or memories over time and there’s been an open question, “How are models going to do this? How are they going to develop episodic or procedural memory?”, and you can do both in the context.

In the context, you can put episodes in that the model will remember and you can put skills in, as Google actually demonstrated by teaching it new languages inside a single prompt and then asking it to use those skills. So this has been a big missing skill, this may not be the final way it shows up in AI systems, but it’s a new way that we can do this that I think is incredibly meaningful.

You can also do superhuman things as well. Reason over huge code bases, show it hours of security footage and ask it to draw correlations across that. I do think it’s amazing and a real breakthrough, and it’s clear that Google has figured something out here, and they have a bit of a secret and we’ve all been looking for clues and poring over the literature to figure out what it is. But this is a real axis of differentiation.

Well, that’s the big question in my mind, how much of this is model and how much of this is infrastructure? Because there was a presentation they did at their enterprise event last year, and it’s weird, I can’t find this anywhere, I spent hours looking for it last week, I was writing about 1.5. But I very tangibly remember it where they were talking about this sort of sharding capability, where we know about sharding in the context of databases, and the problems that solves and the challenges it presents, but they were talking about sharding in the context of, I think they were talking about it for training. But it seems like they’re doing sharding in the context of inference where they have this ability to distribute the workload, not just across chips, not just across clusters, but at least in theory, across data centers, which introduces huge challenges as far as you’re constrained by the speed of light.

Google’s networking capabilities have always been well known, but I’m not sure it’s been appreciated how that could be brought to bear on these issues. And you talked about, Daniel, how much can you make a sparse model, and to do this, and to do a mixture-of-experts sort of approach, and to spread it out. It’s the exact opposite of Groq. Groq is massively serial, super fast. What if we can spread it out all over the place and because the use case is tolerable of latency, we can just take that all the way to the extreme? And it feels like only Google could do what Gemini 1.5 is right now, and it doesn’t feel like anyone else is even close.

DG: Do you think anyone else is close, Nat?

NF: Well, we know of one company that has this also.

DG: Yeah.

NF: Daniel and I made an investment last week in a company called Magic that has a very good, very efficient, extremely long, longer than Gemini, context that’s working. To be honest with you, we thought there was only one company that had this, now we know there were two…

…The reason why Gemini as it shipped feels so distasteful, is it feels like bad faith, it’s very blatantly on the tin, “We’re not actually doing our best job to give you an answer”. It’s just straightforward, and it feels like an aspect where we would forgive an AI screwing up, we’ve been forgiving OpenAI all along, and they had some early episodes where there was clearly slants put on, and they’ve worked through that. But it felt like in good faith, “We’re doing our best here.” Gemini doesn’t feel like it’s in good faith, and maybe it was an accident that it feels that way, but it crossed a line of perception that just seems very problematic.

How did this happen? How did we get a product like this from a company that is supposedly too scared to ship and they ended up finally shipping and then it’s just a disaster?

NF: Well, I think you’re right. I think one reason they should get a little less leeway than OpenAI did, is that they saw what came before them, and they learned nothing from the precedents. Dall-E 2 had its own sort of crazy woke image creation problem that they had to adjust and tune and they learned from, and that was all forgivable because they were pioneering and ChatGPT has been through this as well and so Google should have seen all that and learned from it and done better.

It’s such a great point. This is a big advantage of going first, is you get more grace.

NF: You do, you get more grace, because no one’s ever solved these problems before. But Google definitely didn’t come first and still made mistakes that feel like 2021 mistakes, 2022 mistakes, and that’s much less forgivable.

How did it happen? I mean, I think culture’s a very big component. You wrote about that, and it’s clear that it was very difficult for anyone at Google to raise their hand and say, “Hey, I don’t think we should ship in this form, we should probably do something about this.”

Then, we’ve heard from people at Google that the models themselves, this is not likely to be something that was a deep problem in the model training, but a decision that was made in the productization by someone who came later. So, there’s probably a set of system prompts or templates or something like that that are imposing a set of rules and guidance to the models that the raw internal models don’t do.

I think this is the challenge. Google’s always had this funny word they use for shipping products, which is what they call externalization, I always thought that was a very culturally-indicative piece of jargon from Google, because it kind of captures in a way, the way Google thinks of itself. They develop breakthrough technologies internally and then they externalize the magic, and it’s not a product-first thinking, it’s not even a customer-first thinking, it’s a technology-first thinking. I think that’s where the mistake is here, in the externalization, in the process of putting it out there.

So in a way that makes it easy to fix, there’s probably a single file that could be edited that would improve things a lot, and in another way, editing that file might mean going through layers of product people and policy people who will potentially have a lot to say about that, and the gulf between the brilliant minds creating the models and the users, there’s someone in the middle and that’s where the challenge lies.

How exactly do you think this is happening, Daniel? Is it that there’s the level from the data, there’s the model, there’s the RLHF [Reinforcement Learning from Human Feedback] process, there’s the prompt, where are things going sideways here?

DG: Well, we were having a good conversation about this earlier. I mean, traditionally there’s, I think, a few things people misunderstand a little bit. Pre-training and fine-tuning a model are not distinct ideas, they’re sort of the same thing. That fine-tuning is just more the pre-training at the end. As you train models, this is something I think we believe, but we now see backed by a lot of science, the ordering of the information is extremely important. Because look, the ordering for figuring out basic things like how to properly punctuate a sentence, whatever, you could figure that out either way. But for higher sensitivity things, the aesthetic of the model, the political preferences of the model, the areas that are not totally binary, it turns out that the ordering of how you show the information matters a lot.

In my head, I always imagine it like you’re trying to draw a sheet, a very tight bed sheet over a bed, and that’s your embedding space, and you pull the bed sheet in the upper right-hand corner and the bottom left hand corner pops off, and you do that and then the top right hand corner pops off, that’s sort of what you’re doing. You’re trying to align this high dimensional space to a particular set of mathematical values, and then at some point you’re never going to have a perfect answer or a loss of zero. So, the ordering matters, and fine-tuning is traditionally more pre-training do at the end.

I think that’s originally the liberal leanings of the OpenAI ChatGPT model, came out of that. I think it was a relatively innocuous byproduct of those final data points that you show the model to, it becomes very sensitive to and those data points, it’s very easy to accidentally bias that. For example, if you have just a few words in the internal software you have where you’re giving the human graders prompts in terms of what tokens they should be writing into the model, those words can bias them and if the graders can see the results of other graders, you have these reflexive processes. It’s like a resonant frequency and very quickly it compounds. Errors compound over time. I actually think you could end up without really thinking through it with a model that’s slightly left-leaning, a lot of the online text is slightly left-leaning…

…I think the piece of information that’s most interesting is the fact that Google lacked a very basic process. This is your point, where maybe people thought or maybe people didn’t even think before they launched it and I’m thinking a lot of that famous Steve Jobs interview where he says, “The problem with Microsoft is they just have no taste.” I think the unexpected thing about AI, we’ve talked about it in this podcast, but I don’t think it’s been generally expected, is fine-tuning a model is just as aesthetic an art as making a beautiful landing page for your website.

So in hindsight, it shouldn’t be that surprising that the Borg that built the interfaces of GCP also produced very robotic models, like that’s the same thing and it also should not be surprising to us that Mistral, which a French company with French cultures and now French products, was able to produce a model that to their credit, I mean, it’s not the smartest, but it’s by far the most obedient and has by far the most neutral political tone, at least in my anecdotal testing.

Well, actually, I want to get to Mistral in a moment, but Nat, what does Google do now?

DG: Other than call you?

NF: (laughing) Yeah, I mean I think this is a leadership challenge. There’s a missing editor here and there’s a missing product editor and a missing person with good taste and judgment who gives a damn and has the authority to overrule anyone in the company and make sure the right thing goes out the door. I do think leadership changes have to happen, culture is the hardest type of change to make in a company. You could do strategy change, you could do product change, you could do operational change. Culture change is the one that’s just super difficult and it can only happen with leadership. We either need to see dramatically different behavior from Google leadership or we need to see dramatically different leaders.

3. TIP611: The Bear Case For China w/ Kyle Bass – Clay Finck and Kyle Bass

[00:06:59] Clay Finck: One of the things that sort of struck me in preparing for this conversation is that much of the information that various institutions have used to gather on what’s happening in China has actually been cut off by the CCP and it’s no longer available.

[00:07:14] Clay Finck: So why have such moves? been made by the CCP. We know they like to control data and information flow. And how are you able to get accurate information on what’s happening in China and really make sense of it?

[00:07:28] Kyle Bass: No one has accurate data on China except the Chinese Communist Party. They do and used to, they began to adhere to Western standards and they put together data aggregators that collected both micro macro level data.

[00:07:40] Kyle Bass: And so they had a Bloomberg of China called wind and there were four or five others. And they were actually pretty good, but if you dug into the data, if you looked at the Chinese Customs Bureau for import and export, and you looked at the customs data that was in the wind database 1 year until they recently cut it off, it was off by 200 billion dollars.

[00:08:02] Kyle Bass: Not 2 billion dollars, 200 billion dollars. Then you think about trade with the US is what? 650 billion. So to be off by 200 billion, that just means someone’s really cooking the books. We all knew that Chinese data had low fidelity, and now there just isn’t Chinese data anymore.

[00:08:22] Kyle Bass: As of March of 2023, they severed all of those links to U.S. research universities, to the Fed, to Wall Street writ large, and that data is only allowed out of the mainland. To mainland data, call it readers, and they’re not allowed to share it unless the party approves it. So do you think you’re getting the truth? Probably not. And, they were reporting youth unemployment until they actually reported that it was over 20%.

[00:08:47] Kyle Bass: And then they say, we’re not going to report that anymore. If you read some Chinese scholars while that was going on, 1 of the top scholars at 1 of the top universities in China said. It looks like it’s 46 percent and then they silenced him…

…[00:12:13] Kyle Bass: They’d rather pretend. Those things aren’t bad. And I’ll take you to an October 2023 Reuters release where the People’s Bank of China, which is the regulator or the call it the Chinese Fed that regulates their banking system issued an edict in October 23 and it said, The local government financing bonds that exist in the marketplace in China, it’s a 13 trillion dollar equivalent market, a monster market in China.

[00:12:39] Kyle Bass: It’s all about how the local governments fund themselves by selling real estate. They sell real estate to pay their debts. They issue debt and to gather even more funding. And that 13 trillion dollar market is in default. 80 percent of those bonds are not paying. Those local governments can’t pay because there’s no real estate bid because every public developer in China is in default.

[00:13:00] Kyle Bass: When you think about what the PBOC said in October of 23, they said to the banks, if you own the debt or you own those bonds, you can just say they’re current and it won’t affect your ratings in our annual reviews of the banks. We’re just going to pretend that the market’s paying. Just think about that for a second.

[00:13:17] Kyle Bass: Clay, a 13 trillion market. is in a complete state of default, and we’re just not going to talk about it…

…[00:14:44] Kyle Bass: We really haven’t sanctioned anything or anyone when you really look at this. I know we’re going to try to get serious, but going back to what they’re doing in their legal system, in January of 2020, China updated its foreign investment law, giving Beijing the power and the ability to nationalize foreign assets or investments.

[00:15:03] Kyle Bass: Under special circumstances, which include war, that’s their words, not mine that began in January of 2020. That’s super interesting because that’s when a covid emanated from the city of Wuhan. So that’s when they began their legal movements in the system. In June of 2021, they issued a new counter foreign sanctions law.

[00:15:24] Kyle Bass: Foreign sovereigns that were sanctioning anyone in China, they were saying if Chinese. Corporate interests or international corporate interests that have business in China are adhering to foreign sanctions that are punitive on China. That China can just nationalize their interests, imprison the expats that live there, and basically turn their companies off.

[00:15:49] Kyle Bass: Basically they were countering foreign sanctions by saying we’ll just shut off all of your business here in China and we’ll take everything that you’ve got. That happened on June 21. In April of 23, Chinese lawmakers passed a new update to their anti espionage legislation. If you remember, that’s when they were raiding U.S. due diligence firms.

[00:16:06] Kyle Bass: They raided 3 or 4 firms, they arrested everyone, they took all of the computers, and due diligence firms were just doing due diligence, business due diligence. On potential acquisitions management teams, they’re everything that companies like Bain or McKenzie or these others do when they get hired to do due diligence, that became illegal and that had a chilling effect…

…[00:19:55] Clay Finck: In light of those laws that you mentioned that were passed around COVID and ever since COVID, I actually ran across this chart that showed data from the administration of foreign exchange. It showed that China’s inbound foreign direct investment has just essentially collapsed.

[00:20:10] Clay Finck: It was, this data shows it was north of 300 billion just prior to COVID. And then in 2023 it is around 33 billion. Does that data sound accurate to you?

[00:20:19] Kyle Bass: That’s right. And there’s a caveat to that data where they don’t asterisk and don’t tell you this, but it’s actually wildly negative. And let me explain to you how.

[00:20:27] Kyle Bass: If you are a corporate interest in the U. S. and, or a multinational and you have business in China Tesla’s got business in China, there are plenty of multinationals that have business there. Chevron has business there. The profits they make in China get put in a Chinese bank and China never lets them out.

[00:20:45] Kyle Bass: So I know many multinational companies that have hired friends of mine to try to get their money out. And China just, pardon the pun, gives them a bunch of red tape and won’t allow the money out. Every dollar that’s made by a multinational in China, if it stays in the bank through the end of the year, it’s counted as foreign direct investment into China.

[00:21:06] Kyle Bass: When you look at the FDI numbers, they’ll always be until they nationalize everything, right? Multinational profits in China are automatically FDI. And I think that’s also a lens that we need to be thinking about looking at things through. What is a complete collapse of FDI, by the way, Clay…

…[00:29:20] Clay Finck: So in addition to what’s happening here, in relation to Taiwan, China definitely seems to be going through a financial crisis of their own, which you’ve touched on plenty here. And a lot of data has pointed towards an economic contraction, but they actually reported GDP growth of 5.3 percent in 2023.

[00:29:38] Clay Finck: And real estate is definitely a big part of China’s economy. So What are you seeing in their real estate market and how this plays into the bigger picture?

[00:29:50] Kyle Bass: The data that’s actually being released, again, whether there’s proper fidelity in the data, nobody knows. Clearly it’s suspect, but Hong Kong’s real estate is down over 25%.

[00:30:01] Kyle Bass: Again, since China took over, that’s the largest decline ever. And that’s just a harbinger of more to come. And by the way, that’s probably that’s the reported number. We know the real numbers are much worse and we have a couple of anecdotes from people that we know that have traded in that market and been forced to trade in the real estate market there.

[00:30:22] Kyle Bass: And it’s much worse than people think it is. But when you think about the Chinese, you mentioned that Chinese real estate is vital to their GDP. It’s somewhere between 33 percent and 40 percent of their GDP. It’s 70 percent of their net worth. And it is, it was the primary driver of the Chinese miracle of their GDP growth.

[00:30:41] Kyle Bass: And imagine if you allowed reckless speculation in your real estate markets. Your GDP grows, all the ancillary services grow. Everyone technically gets wealthier and wealthier. The banks lend into it. The bank, their banking system is three and a half times the size of its GDP. The U. S. going into a financial crisis was one time our GDP.

[00:31:02] Kyle Bass: And you know how bad we screwed this up back in 2008. And if you include non banks like Fannie and Freddie and other financials, we’re about 1. 7 times. They’re three and a half times levered to their GDP.

4. Off the Run: Piero Sraffa and Imperial Japanese Government Bonds – Irwin Union

For the better part of 70 years, rumours have followed the Italian economist Piero Sraffa. Long the subject of speculation, it has been asserted that in the dying days of the Second World War, Sraffa heavily bought defaulted Imperial Japanese Government bonds. These, following the Treaty of San Francisco, being eventually honoured in full.

Though several authors have offered differing accounts of what Sraffa was purported to have done, till now, no person has been able to offer a satisfying and granular account of events…

Two credible accounts of Sraffa’s investments survive…

…The second comes from the historian Norman Stone:

The economist Piero Sraffa, editor of the correspondence of David Ricardo and re-floater of Marx’s sunken theory of surplus value, took two economic decisions in his life. He bought Japanese bonds in 1945, and he swapped them in 1960 for gold, dying a very rich man.

…Luckily, recent events, including the opening of Sraffa’s archive at Trinity College, afford new insight in to what Sraffa did, when he did it, and, indeed, how he did it…

…Following her entry into the Second World War, Japan began to default on most of her external obligations in, as best as can be figured, mid 1941.

At the outbreak of the war, a number of Imperial Japanese Government bonds were listed on the London Stock Exchange. These securities were issued in the United Kingdom, denominated in British Pounds and were obligations that Japan had entered into under British law.

Japan could refuse to acknowledge them, but could not inflate them away, nor strike them out by fiat. And so they remained outstanding, with an ongoing market made, all through the war and into the peace that followed; shielded from the worst problems of the immediate post war Japanese economy by dint of their denomination in sterling and their legal domicile.

Following her 1941 default, the bonds, already on the ropes prior to the war, collapsed completely…

…Among the items in Sraffa’s archive at Trinity College are two remarkable sets of papers.

The first is a series of trading receipts issued by the London branch of the Swiss Bank Corporation. These receipts run from 1946 to 1951, and cover Sraffa’s trading of Imperial Japanese Government Bonds, as well as some miscellaneous securities (City of Wilno, Poland at 3.25 of par and Estonian bonds at 6 of par, as well as some common stock.)

The second is a series of letters received by Sraffa from an unnamed Swiss organisation who custodied gold bullion for him.

It’s reasonable to conjecture that this was also the Swiss Bank Corporation, though it’s impossible to know as the letters are so discrete as to carry no letterhead or distinguishing detail of any kind. These letters give us an inventory of Sraffa’s bullion holdings in Switzerland as of 1975, and broadly corroborate Stone’s assertion that Sraffa swapped out of bonds into gold bullion.

From the set of trading receipts, we can, with only a few minor adjustments, build a chronology of Sraffa’s trading, and, thus, a simulated portfolio of his holdings. This portfolio can then be priced using collected price data.

As of 1960, we can substitute the simulated portfolio of bonds for gold and then continue to price the portfolio all the way through to 1983.

Of course, there are wrinkles, discussed vide infra, and so it should be understood that the best that can be done is speculation about Sraffa’s actual record.

Nonetheless, we can get somewhere close to reality, and enough detail is provided for the reader to make her own back of the envelope adjustments and calculations as desired.

I first collected monthly price data for the period from 1946 to 1951 (the period in which Sraffa was actively trading) and six monthly data from 1929 to 1960.

With this data in hand, we can begin to unravel the question of how and what Sraffa accomplished.

Sraffa’s receipts show that between 1946 and 1951, he traded quite frequently, realising capital gains and recycling his proceeds into other issues. However, in late 1951 Sraffa halted his trading altogether.

From here, for the purposes of simulating his record, we assume that the portfolio remained static until 1960.

Sraffa’s final trades consolidate his holdings into the 1899 bond. This issue bore one of the earliest maturity dates…

…On the 9th of March, 1946, as Sraffa was likely contemplating his first purchases, the Financial Times ran a front page story titled Japan Bonds’ Bleak Outlook: Chancellor Reaffirms Gloomy View. The article reported on comments made by the Chancellor of the Exchequer in the House of Commons the previous day, wherein he had stated that:

[…] in the case of British bondholders at large, and in general, I will do my utmost to see that they get fair play. There is nothing new in that, but why humbug Japanese bondholders into believing that they have anything but the very dimmest and remotest chance of recovering anything of their investments?.

Following the Chancellor’s remarks, the bonds sold off by approximately 20%…

…Reading the financial papers of the time, one finds a veritable feast of views on the Japanese loans expressed in articles, opinion pieces and letters to the editor. Indeed, the letters to the editor in particular functioned as a sort of clearinghouse for opinion and query. It’s not a stretch to compare these exchanges to those that happen on message boards and social media today.

Though the full record is too voluminous to feature in full, it is also so information dense that it forms a vital part of any study of the securities.

We learn some extraordinary facts from these articles and letters. For instance, as early as late 1946 thru January 1947, it was being stated that interest on the defaulted bonds had been paid into sinking funds during the war.

One stock which tended to be overlooked when the market was active was the Tokyo Five Percent, 1912. Like Japanese Government Stocks, the interest has been set aside for bondholders in Tokyo throughout the period of the war and after, and Japanese nationals have been paid.

Any question of transfer to British bondholders awaits the signing of the Peace Treaty and the unfreezing of the yen-sterling exchange; the latter process can hardly be a quick one.

Japanese Bonds Speculation – Lex – Financial Times – 27/1/47

We also learn that the amount needed to make British bondholders whole was relatively de minimis. This is because Japanese citizens, for reasons not apparent, owned most of the sterling issues. Japanese citizens were compulsorily converted into Yen denominated bonds in 1943, presumably due to strains on Japan’s foreign exchange balances, leaving only the rump holdings of foreign owners intact.

A correspondent has lately received a cable from the Far East which has bearing on my note of yesterday on Japanese bonds. The cable reads as follows:

“Japanese Sterling Bonds interest paid all Japanese holders in Japan at former rates of exchange until March, 1942. Foreign nationals in Japan paid interest into special custody account. After March, 1943, Japanese owned compulsorily converted into yen bonds. No payments made of interest against unconverted bonds, but still being made on converted.”

That puts the position in a nutshell. Whatever the peace treaty may have to say on the matter, it is a fact, as is pointed out by my correspondent, that the default in interest due to British and Allied holders of Japanese sterling bonds not resident in Japan would not need a large sum to wipe out, as the Japanese always held the larger part of the sterling bonds. Lex

Japanese Post Script – Lex – Financial Times – 28/1/47

We also learn of Japan’s wish to join the United Nations and apply for membership of the IMF.

[…] 6) The goodwill of the Japanese since the end of hostilities, and the expressed desire of the Japanese Government to join the United Nations as soon as permissible after the signing of the Peace Treaty. An intention to apply for membership to the International Monetary Fund once the Peace Treaty has been signed has also been indicated.

Letters to the Editor – Financial Times – 19/4/47

In the following letter, the author, a former resident of Japan, argues that the settlement of the debt would allow Japan to reestablish herself with foreign lenders at negligible cost.

Having spent several years in the service of the Japanese Government and having always kept in close touch with financial circles in that country, I have no hesitation in endorsing the view expressed by one of your readers a few weeks ago, namely, that the bonds in question are the best three-year lock-up on the market to-day, or as “Lex” remarked in your issue dated 2nd January: “If I were asked to name a good speculative long-shot for 1947, I think Japanese bonds would be as strong a starter as any.”

[…] Finally, the amount of Japan’s foreign indebtedness is infinitesimal, and the Government is fully alive to the fact that by meeting its commitments it is reestablishing its financial credits abroad at a very small cost.

Japan Bonds and Reparations

Letter to the Editor – Financial Times – 21/5/47

And then, on the 23rd of December, 1947, there is what can only be described as an extraordinary letter from William Teeling, a member of the House of Commons. This letter is worth inclusion in full.

Sir, -There has been much comment in your paper and elsewhere recently on the widening interest in all Japanese loans. Yesterday (Friday) afternoon I told a number of business men in the City interested in Japan what I know about these loans, and I feel that it is only fair that everyone should know, since contact with Japan and the Japanese is so difficult.

I have just returned as a member of a Parliamentary delegation which spent six weeks in the Far East, and while in Tokyo I made it my business to inquire about these loans which interest so many people here.

The Finance Minister in the present Japanese Coalition Cabinet told me that all interest accrued on the Japanese bonds would definitely be paid when peace with America has been signed. He could not say yet at what rate, but it would definitely not be at the rate when war broke out. He added that even during the war bondholders in Switzerland for certain loans were paid and he assured me that money has all the time been set aside in Tokyo for this purpose.

This was confirmed to me at a later meeting with heads of Japanese business firms and banks at which meeting the Foreign Secretary, Mr. Ashida, was also present. Mr. Ashida explained to me that new loans from America were essential and therefore Japan must keep up her reputation for meeting her debts and would pay off her earlier loans.

Reparations officials confirmed that the sums outstanding are small and could be repaid. The American officials concerned told me that a rate for the repayment of all debts will shortly be fixed and will definitely take into account the present depreciation of the yen.

But when will peace be signed? I only know that America was waiting for the recent Four Power Conference to break down before going ahead on a separate peace with Japan, and Great Britain will reluctantly support her as it is the only solution, but it will mean the strengthening of Japan and that means more loans.

William Teeling. House of Commons, S.W.1.

Letters to the Editor – Financial Times – 23/12/47…

…On the 23rd of August, 1949, we learn that Japan’s total external debt was then $323mm USD with approximately $80mm USD of unpaid interest thereon. We also learn that British claims totalled approximately £62mm GBP.

Kaneschichi Masuda, Japanese chief Cabinet Secretary, said here today that he was unable to reveal any practical plans whereby Japans foreign bond commitments could be met.

[…]

He said that $323m. worth of bonds were held by foreigners, on which $80m. in interest had accumulated. British subsribers held about £62m. of this amount.

Japan and Bond Repayment – Financial Times – 23/8/49

However, it was not so cut and dried. By 1951, the mood had soured, and the question of reparations, long simmering, had become acute. In April, Teeling again wrote to the Times, this time expressing concern about the lack of progress and the possible outcomes for British bondholders.

At question was whether reparations would rank ahead of foreign bondholders, and whether reparations might exhaust Japan’s capacity to make foreign bondholders whole, irrespective of her desire to do so.

Then, on the 13th of August, news of formal recognition by the Japanese Government of her prewar debts was published in the Financial Times.

Japan will not be restricted milatarily, politically or economically under the draft peace treaty published yesterday by Britain and the United States.

Japan affirms its liability for the pre-war external debt of the Japanese State, and for debts of corporate bodies subsequently declared to be liabilities of the Japanese State, and expresses its intention to enter on negotiations at an early date with its creditors with respect to the resumption of payments on those debts.

It will facilitate negotiations in respect to private pre-war claims and obligations; and will facilitate the transfer of sums accordingly.

Japanese bonds were active on the London Stock Exchange yesterday. Prices rose sharply at the opening and were up to £5 higher at one time. Following publication of the terms of the draft treaty there was, however, considerable profit taking. As a result, closing prices well after hours were £4 below the best.

Japan Recognises Debt Liability; Prepared for Talks on Payments – Financial Times – 13/8/51

The formal end of hostilities between Japan and the Allied powers came in September, 1951, with the signing of the Treaty of San Francisco. With the treaty formalised, Japan was now able to turn to the issue of settling her defaulted foreign obligations.

In March, 1952, the Financial Times reported that the Japanese Government was placing £20mm GBP on deposit in London as a goodwill gesture.

The Treasury announces that the Japanese Foreign Exchange Control Board is arranging to deposit with the Bank of England £20m. as a token of good will towards the holders of Japanese sterling bonds.

The initiative for this move was taken by the Japanese Foreign Minister. When neccessary formalities have been completed, the sum will be deposited and will remain with the Bank of England for two years.

During that period, it will be available for any payments by Japan to her creditors in connection with a settlement of her sterling bond indebtedness.

Japan to Deposit £20m. in London – Financial Times – 29/3/52

The front page of the 29 September issue of the Financial Times read Japan to Pay Full Interest Arrears, and detailed the terms agreed upon in New York.

After negotiations lasting nearly two and a half months, agreement has been reached in New York on the treatment of Japan’s bonded debt to Britain and the United States. It is a settlement that goes a very long way to meeting British Claims. The service on outstanding issues is to be resumed forthwith. Interest arrears that have piled up since the Pearl Harbour affair brought Japan into the war are to be met in full, though at a time lag of ten years from the due dates. There is a similiar arrangment for the treatment of repayment obligations. Moreover, the currency clauses included in a number of the debts under discussion at the conference are to be substantially honoured. The Japanese have, in short, comitted themselves to do what they said they would do before the conference began.

Contractual Terms – Financial Times – 29/9/52

On the 24th of November, the Times published the full terms of the settlement.

Briefly, the terms provided for the extensions of maturities by ten and fifteen years, a catch up payment generally equal to a single coupon, and the amortisation of accumulated defaulted coupons by the payment of one current and one defaulted coupon for each payment period until all defaulted coupons had been settled. This, in effect, doubling the coupon of each bond for a discrete period…

…With firm details of the restructuring of the loans, we can now model the post 1951 evolution of Sraffa’s portfolio through to 1960. I assume that Sraffa allowed his coupons to accumulate in cash, rather than reinvesting them.

With this account curve in hand, we can now model his swap to gold bullion in 1960.

At the end of 1960, Sraffa’s simulated account had a value of £52,676.

At year end 1960, a kg of gold bullion cost £404.46. Thus, assuming no frictions, we find that Sraffa swapped his bonds and cash for ~ 130 kg of gold bullion.

With this, we now have a complete simulated account curve for the entire period.

According to these calculations, Sraffa compounded his initial simulated outlay of £8000 cash into £1,105,839, a multiple of 138 times, or 13.97% per annum over approximately 38 years.

5. Thoughts on Ben Graham’s “Unpopular Large Caps”: A Still-Effective Strategy – John Huber

In the spirit of Graham’s categories, I recently gave a presentation to Saber investors during our latest client Zoom call with an overview of my own three main categories of our own investments: 1) Core operating businesses that we hope can compound value for a decade+, 2) Time Arbitrage (Similar to Ben Graham’s Unpopular Large Caps) and 3) Bargains.

This “Category 2” provides a frequent enough flow of ideas thanks to a very simple fact: stocks fluctuate much more than true business values do…

…I’ve written about the concept of “investment edge” on numerous occasions (see: What is Your Edge?), and how in today’s world, information has become easier to get and thus more of a commodity. But this information access, along with other technologies, has caused our attention spans to become shorter and shorter, which I think has diminished our patience and our time horizons. We want results now. This has created a “time arbitrage” opportunity, and I expect this will only gain strength as time horizons and patience levels continue to shorten.

Past examples of Category 2 ideas would include Apple in 2016 when pessimism surrounding the next iPhone cycle and worries about Apple’s competition caused the stock to fall below 10 P/E, Verisign when worries about government intervention into its pricing practices caused the stock to fall to multiyear valuation lows, or large banks like BAC and JPM in 2015-2016 when the market was expecting and fearing a difficult economy (and larger loan losses). More recent examples of mispriced large caps might include large cap tech stocks in 2022: AMZN fell 50% in 2022 and rose 80% in 2023, and that was mild compared to what happened at numerous other mega cap stocks. The valuation levels fluctuate far more than business values.

To be clear, there always is a legitimate negative fundamental case to be made when stocks get mispriced, but I think the majority of the time these concerns tend to be focused on the short term. Amazon over invested in warehouse capacity because it overestimated the growth in online retail sales, but was this going to negative impact Amazon’s long-term moat? (I would argue that in one sense it actually further entrenched their moat, making it very difficult for other retailers with lesser capacity to offer the same experience of low cost and speed of delivery: another large online marketplace with ambitions to enter the logistics space ended up throwing in the towel during this period). Sometimes, these short-term difficulties end up being long-term beneficial for the “unpopular large caps”, and the great thing about this category of investment is you get to acquire a stake in these better-positioned large companies when their stocks are depressed.

JPM is recent example of a Category 2 idea as well: the stock traded down under 8 P/E in the summer of 2022 when recession fears were prevalent (similar to what happened in 2016 to bank stocks).

I think Jamie Dimon had some great advice on the right mindset last year when he said (paraphrasing): “in 20 years, the world’s stock market capitalization will be much higher, the assets in the banking system will be higher, corporate earning power will be higher, the dollar volume of merger transactions will be higher, global payment volume will be higher.” The implication is JPM has a durable moat and thus is positioned to take a cut of all of that business. Earnings might decline in the near term, but what matters to business values is the long-term free cash flows that it earns over time.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Alphabet (parent of Google), Amazon, Apple, Meta Platforms, Microsoft, and Tesla. Holdings are subject to change at any time.

What We’re Reading (Week Ending 03 March 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 03 March 2024:

1. The Future of Ecommerce That Wasn’t: An In-depth Look into What Went Wrong with Wish – Speedwell Research

A good explanation is a story that is hard to vary. If we did a postmortem of WebVan (grocery delivery) or Pets.com (online specialty store for Pets), what would we say went wrong? If we did the postmortem in 2006, most likely we would have said it was a silly and unrealistic idea. But if we were to do a postmortem now, with the existence of Instacart (grocery delivery) and Chewy (online specialty store for pets), how would our understanding change?

This is not a trivial exercise. It is far too easy to be dismissive about a failing business and think it was the entrepreneur’s ill-thought-out idea or just incompetence, but this does not hold scrutiny.

Look at Apple. For how many years was Steve Jobs and his insistence on not licensing the Mac operating system seen as the impetus for their failure? And the same thing happened again when the iPhone was released: analysts thought their unwillingness to license the phone’s iOS would ultimately lead to their demise. Now though, Apple’s success is attributed to their close integration and their proprietary software is a key selling point, which wouldn’t be possible if they licensed it.

If you took over Lego in 2004 when it was nearing bankruptcy, what would you diagnose as the problem? Would you have thought that with digital entertainment kids just don’t want to play with toy blocks anymore? Or would you have thought the focus on “noncore” activities like theme parks, clothing, and video game development were the issues? Perhaps the product was good but was simply too expensive? You know that today there are vastly more digital entertainment options than there were in 2004, they still have theme parks and video games, and their products are still expensive, so what was it?

If you were appointed CEO of Crocs in 2008 when their stock dropped 98% and was on the verge of entering bankruptcy, tell us that you wouldn’t be tempted to lay the blame on the aesthetics of the shoes. It is the most ridiculed shoe design with “ugly” virtually synonymous with Crocs and yet they now sell over 150 million of them a year. Again, what some people would identify as the problem of the business turned out to be a virtue…

…So, if we are saying Wish was unsuccessful because of their focus on cheap items with slow shipping, we shouldn’t be able to point to another company that did something similar and was successful…

…We will do one final analysis to estimate churn before concluding. However, we want to note that this analysis is unnecessary to make the point we are about to. If you simply saw that they lost users despite spending >80% of revenues on marketing, or almost ~$1.5bn, is there any explanation you would accept that could convince you the business was healthy? Imagine your Chief Marketing Officer just told you they spent $1.5bn to lose 2mn buyers and grow revenues 2%. How would anyone possibly see that as a good thing? And yet, with a little bump in numbers from Covid in 2020, it was overlooked by investors in favor of the hope of buying the next Amazon at IPO.

In the S-1 they disclose the following LTV chart. They calculate “LTV” as cumulative gross profits over a period of time attributable to new buyers acquired in a given cohort, divided by total new buyers, which is most certainly not what an LTV really is.

For example, let’s say a cohort generated $15 of gross profit in year one and then another $10 of gross profit in year two. They would add those two numbers up and say the “LTV” of the customer in year 2 is $25. Therefore, if you wanted to calculate how much each cohort generated in gross profits in a given year, you just have to take the difference between each year. In this example, this cohort generated $10 in gross profits in the second year versus $15 in the first year, suggesting ample churn. What you would want to see is each year’s incremental figure stay steady or increase.

The chart above shows cumulative gross profit by cohort. If it was a perfectly linear line, then that would mean in each period the cohort bought the same amount of goods as the previous period.

We will focus just on the 2017 cohort for simplicity. We annotated it to show how we estimated incremental gross profits. The average buyer from the 2017 cohort earns $15 in gross profits in year 1, which drops to $10 in year 2, and then to $6 in year 3. We can already see that by year 3, each cohort is generating about 1/3rd what it did in year 1, which suggests heavy churn. Remember that Wish’s payback period is about 2 years, which means it isn’t until year 3 they make that small incremental gross profit. And remember, this is just to pay back the initial marketing investment, not other S&M they spent on promotions to reengage that buyer.

Here, we can see that that the difference between the gross profit for total buyers divided by the gross profit per average active buyer gives us a churn estimate. At the end of year 1, 100% of buyers are active (by definition) and by year three that drops to 19%. That comes out to about 44% annual churn over two years. It is also noteworthy that the churn is much worse in the first year. A full 67% of buyers do not return after buying once.

Now, remember that their average payback period is under 2 years. That is rather problematic in the context of almost no one being left after 2 years! They have a thin amount of remaining users that not only need to cover all of the reengagement marketing, but also all of their G&A and R&D cost. And that’s before they can even make a profit!

This is a fundamentally broken business. Users do not stay long enough, they have to pay to get users to return, and users are not profitable…

…Earlier we said that an explanation is a story that cannot easily vary. Well, we have trouble figuring out exactly what the story is that cannot vary. There is nothing in principle wrong with an ecommerce offering catered to the consumers in the low-end of household earnings. Some would note that the low average order value would make it hard to make enough contribution profit per order, but that is essentially what Pinduoduo did in China, what Shopee is doing in Southeast Asia and Brazil, and what Temu is doing in the US. While we don’t know exactly if all of those initiatives will end up being profitable, it is hard to claim it is the idea itself that is rotten.

Clearly, Wish had a problem with both their high cost to acquire users and their ability to retain them. We know that Pinduoduo had a better customer acquisition engine piggy backing off of Tencent’s Weixin with preferred placement and the Community Group Buy model was a novel way to spur consumer sharing, free of charge. Shein had TikTok and went viral early on with “Shein hauls”, where influencers would post everything they purchased. They would later lean into influencer marketing on TikTok to much success. Amazon has Amazon Prime which helps retain users, and their optimal customer service helps keep customers satisfied at potential churn events. Wish was lacking something in the customer acquisition and retention area, but exactly what isn’t obvious.

Perhaps it was a mix of everything that individually created customer churn events from slow shipping to “unreliable shipping”, fraud, fake listings, sub-par customer service, inadequate item selection, poor item quality, inaccurate recommendations, or perhaps even internal issues. But again, other companies have survived similar or worse issues. And the longer the list, the more it speaks to our lack of confidence in any one variable. As an investor from the outside, it isn’t apparent what the key problem was, at least not to us.

What is crystal clear though is that there were issues since at least 2019, and some red flags prior. An investor only needed the company’s IPO prospectus to see these problems brewing, and could have avoided even worrying about any potential “narrative fallacy” by just focusing on the financials.

2. Bill Ackman: Investing, Financial Battles, Harvard, DEI, X & Free Speech | Lex Fridman Podcast #413 (partial transcript here) – Lex Fridman and Bill Ackman

Bill Ackman (57:12): So this was at the time of the Financial Crisis, circa November 2008. Real estate’s always been a kind of sector that I’ve been interested in. I began my career in the real estate business working for my dad, actually arranging mortgages for real estate developers. So I have kind of deep deep ties and interest in the business. General Growth was the second largest shopping mall company in the country – Simon Properties many people have heard of – General Growth was number two. They own some of the best malls in the country…

…General Growth the company, the CFO in particular, was very aggressive in the way that he borrowed money. He borrowed money from a kind of Wall Street – not long-term mortgages – but generally relatively short-term mortgages. He was pretty aggressive. As the value went up, he would borrow more and more against the assets and that helped the short-term results of the business. The problem was during the Financial Crisis, the market for what’s called CMBS – commercial mortgage backed securities – basically shut. And the company, because its debt was relatively short-term, had a lot of big maturities coming up that they had no ability to refinance. The market said, “oh my God, the lenders are going to foreclose and the shareholders are going to get wiped. The company’s going to go bankrupt, they’re going to get wiped out.” The stock went from $63 a share to 34 cents. There was a family, the Bucksbaum family owned I think about 25% of the company and they had a $5 billion stock that was worth $25 million or something by the time we bought a stake in the business.

What interested me was, I thought the assets were worth substantially more than the liabilities. The company had $27 billion of debt and had $100 million value of the equity, down from like $20 billion. And sort of an interesting place to start with a stock down 99%. But the fundamental drivers – the mall business – are occupancy, how occupied are the malls? Occupancy was up year on-year between ‘07 and ‘08. Interestingly, net operating income, which is kind of a measure of cash flow from the malls – that was up year-on-year. So the underlying fundamentals were doing fine. The only problem they had is they had billions of dollars of debt that they had to repay – they couldn’t repay

If you examine the bankruptcy code, it’s precisely designed for a situation like this where it’s this resting place you can go to restructure your business. Now the problem was that every other company that had gone bankrupt, the shareholders got wiped out. And so the market, seeing every previous example the shareholders get wiped out, the assumption is the stock is going to go to zero. That’s not what the bankruptcy code says. What the bankruptcy code says is that the value gets apportioned based on value, and if you could prove to a judge that the assets’ worth more than the liabilities, then the shareholders actually get to keep their investment in the company. And that was the bet we made.

So we stepped into the market. We bought 25% of the company in the open market. We had to pay up. It started at 34 cents – I think there were 300 million shares – so it was at a $100 million value. By the time we were done, we paid an average of – we paid $60 million for 25% of the business, so about $240 million for the equity of the company. And then we had to get on the board to convince the directors the right thing to do. The board was in complete panic, didn’t know what to do, spending a ton of money on advisers…

…And the key moment, if you’re looking for fun moments, is there’s a woman named Maddie Bucksbaum who was from the Bucksbaum family. Her cousin John was chairman of the board, CEO of the company. And I said – as she calls me after we disclose our stake in the company, she’s like “Billy Ackman, I’m really glad to see you here.” I met her – I don’t think it was a date – but I kind of met her in a social context when I was 25 or something. And she said, “I’m really glad to see you here and is there anything I can do to help you, call me.” I said, “Sure.” We kept trying to get on the board of the company, they wouldn’t invite us on. Couldn’t really run a proxy contest, not with a company going bankrupt, and their advisers actually were Goldman Sachs and they’re like, “You don’t want the fox in the hen house” and they were listening to their advisors. So I called Maddie up and I said, “Maddie, I need to get on the board of the company to help.” And she says, “I will call my cousin and I’ll get it done.” She calls back a few hours later, “You’ll be going on to the board.” I don’t know what she said, but she was convincing.

Next thing you know, I’m invited to the board of the company and the board is talking about the old equity of General Growth. Old equity is what you talk about when the shareholders are getting wiped out. I said, “No, no, no. This board represents the current equity of the company. I’m a major shareholder, John’s a major shareholder, there’s plenty of asset value here. This company should be able to be restructured for the benefit of shareholders.” And we led a restructuring for the benefit of shareholders and it took let’s say eight months and the company emerged from Chapter 11. We made an incremental investment into the company and the shareholders kept the vast majority of their investment. All the creditors got their face amount of their investment – par plus accrued interest. And it was a great outcome. All the employees kept their jobs, the malls stayed open, there was no liquidation. The bankruptcy system worked the way it should. I was in court all the time and the first meeting with the judge, the judge is like “Look, this would never have happened were it not for a financial crisis.” And once the judge said that, I knew we were going to be fine because the company had really not done anything fundamentally wrong – maybe a little too aggressive in how they borrowed money.

Stock went from 34 cents to $31 a share…

…Lex Fridman (1:05:44): How hard is it to learn some of the legal aspects of this? You mentioned bankruptcy code – I imagine it’s very sort of dense language and dense ideas and the loopholes and all that kind of stuff. If you’re just stepping in and you’ve never done distressed investing, how hard is it to figure out?

Bill Ackman (1:06:05): It’s not that hard. I literally read a book on distressed investing. Ben Branch or something, on distressed investing.

Lex Fridman (1:06:12): So you were able to pick up the intuition from that, just all the basic skills involved, the basic facts to know, all that kind of stuff.

Bill Ackman (1:06:20): Most of the world’s knowledge has already been written somewhere. You just got to read the right books.

3. Why is Google killing cookies? – Eric Benjamin Seufert

What is Google’s underlying motivation in deprecating third-party cookies in Chrome? Suspicion is warranted. Google’s mission statement for its Privacy Sandbox initiative is to “Protect [user] privacy online,” across its Chrome browser and its Android operating system (Google intends to deprecate its GAID Android identifier at some point). Cookies, unquestionably, present severe data leakage risks to consumers: they allow anonymous services to observe the web activities of users with little preventative recourse. But as I point out in this piece, “privacy” is an abstract social concept, and firms – but especially multi-trillion dollar market leaders – don’t make dramatic, sweeping policy changes absent commercial benefit. Believing that a company would utterly reform the mechanics of digital advertising solely in service of increased user privacy is as absurd as believing that two firms would engage in a merger as an expression of friendship. To not assume a commercial motive in cookie deprecation is naive.

Apple’s App Tracking Transparency (ATT) privacy policy is an apt example of this. Apple launched an international PR campaign championing the privacy safeguards of the iPhone following its introduction of ATT in April 2021. Yet as I point out in this piece, Apple collects and utilizes consumer data in the ways that ATT was ostensibly designed to prevent. Apple positions its use of install and purchase data collected via consumer engagement in apps that it doesn’t own as “ads personalization” and not “tracking.” Apple claims first-party privileges over this consumer data because Apple exerts (and is stridently maintaining a firm grip on) control over iOS payments, giving it exclusive, proprietary access to that data. And in a court filing from December 2023, Apple had the following to say about the logical contortions of its privacy policies (all emphasis from the document):

The Allow Apps to Request to Track setting governs whether apps can ask to track users across apps or websites owned by other companies, as Apple’s descriptions of the setting consistently make clear … Plaintiffs also include a screen shot of the Tracking disclosure, which explains that Apple “requires app developers to ask for permission before they track your activity across Apps or websites they don’t own.” … Given Apple’s extensive privacy disclosures, no reasonable user would expect that their actions in Apple’s apps would be private from Apple.

This isn’t to say that Google and Apple don’t employ well-meaning, intelligent, and highly effective people whose efforts are centered on promoting their conceptions of digital privacy. But digital privacy initiatives from publicly traded, multi-trillion-dollar corporations must be viewed in a broader commercial context…

…So given that Google must have a commercial motivation in deprecating cookies, what is it? The most obvious is simply margin expansion: Google’s network business, which serves ads on third-party websites and apps, will almost certainly suffer if the Privacy Sandbox is less effective for targeting and measurement than cookies (and early indicators suggest it is). If the economics of buying third-party open web inventory through Google’s tools degrades, some of that demand may simply be routed to Google’s owned-and-operated channels. And these channels feature much higher margin for Google than its Network business: Bernstein estimated in December 2022 that Google’s margin on Network revenue is 10%, while it’s 15% for YouTube and 55% for Search. As I argue in this piece, because advertising budgets are deployed against absolute performance, Google will likely lose some degree of top-line revenue if its Network business unit declines. But Google doesn’t need to shift all of the revenue from Network to these channels to maintain its current bottom line given the margin differentials: $1BN in Network revenue produces the same margin as $181MM in Search revenue.

4. Twitter thread on life and investing lessons from climbing Mt Kinabalu – Eugene Ng

1 | Hiking is a marathon, not a sprint. It is about finishing, whether or not you finish first or last it doesn’t matter, as long as you finish. There is no gold medals for the fastest, and only rescue for those who don’t. It only matters as long as you finish, and that you remain safe when you finish. Safety first, go slow.

It is the same with investing. Never be permanently wiped out, avoid all unlimited downside trades, and then you can focus on making asymmetric bets with unlimited upside…

…3 | I was in awe out the scale of the human labour require for the entire operations. We saw numerous porters carrying up 20-40+ kg of fresh water, food, furniture, equipment for the lodge where we stayed at. There were also a number of porters who carried up luggage for some climbers as well. Without them, the support, the ecosystem, none of these would have been possible for us to experience the climb.

It is ever easier than ever to get data, but we cannot be lazy. We need to learn to appreciate the ecosystem and what we have now with the internet, versus 50 years old with libraries and faxes. Use them to your advantage. Easily available does not mean everyone will actually read them. Do not confuse it.

4 | We had a fantastic mountain guide who was one of the oldest and fittest at 52 years and he has been doing it over 32 years old since he was 20 years old. He still does this three times a week and will retire next year. He was leading us in our walk through easier path with such a controlled and comfortable pace, like he is meditating. Without him, it would have been so much more difficult. Having the right leader to help guide you really matters.

Having the right mentors, the right people around you matters. They can have the right expertise, experience to share that can help you in your journey to become better and to avoid the pitfalls…

…6 | When ascending up and descending back down, it is not an individual effort but a collective team effort. The company matters. Without the right company to support you mentally on every step of the way makes so much of a difference, everyone has a role to play.

Like in investing, there are going to be ups and downs. The right people/investors to stand by you matters, and not run away at the first sight of trouble. Choosing carefully the right team to the best of your ability matters.

7 | Sometimes a member of your team is not going to be feeling very well or can be injured, it is being prepared and bringing along extra supplies, medical or food, and continuously supporting them with what you have physically and mentally. Remember that if they can’t finish, you can’t finish.

Know that the businesses that we invest in are not going to do well all the time, it is not going to be a straight line. There are going to be ups and downs, and they will zig and zag from time to time. We need to have the patience to stand by them through difficult times, and the good times, and not sell them.

8 | Run your own race at your own pace, sometimes you will overtake and sometimes others will overtake. Don’t be stressed by someone behind you trying to push you go faster. You set your own pace. If they want to overtake, just stand to the side and let them overtake, if not just chill. Separately, if you want to overtake someone slower, then just overtake on the side.

Find your own investment strategy that suits you best, that energises you. The real race is against yourself, not against others. There will always be someone who will do better than you in any given year, so chill. It is not about being the top 10% in a year, but the top 10% after 10 or 20 years and more…

…11 | Always remember to never get complacent and choose speed, or get distracted. Do a misstep over a loose rock, and you may just end up spraining your ankle (like me), and end up not finishing the climb. But thankfully, it was serious and painful, but it was still okay enough for me to complete the last 8km. It was insanely painfully with every descend as my right ankle landed on every step.

Never think highly of yourself. Stay humble, have humility. The moment you lose that, you stop listening, you stop absorbing, you stop learning, and then with a mis-step you might just result in eventual failure. Never do that.

12 | At the end, despite how much preparation, it is really willpower at the end that gets everyone through to the end. It can be so powerful, the human mind and the will power. Despite how tough it is, we were just highly focused on taking step at a time mindfully and carefully, that’s all that mattered. I sprained my right ankle horribly with 8km left, it was really painful, but I kept persisting, and my teammates were patient with me and walked slower. “Stay hard” by David Goggins was our slogan to keep us going.

Investing too is a slog, managers get paid to endure all the emotional and psychological elements with all the ups and downs. It is knowing when to keep pursuing and staying the course even especially when the going gets tough…

…14 | Memories over medals. We did not finish first, but we finished in the end, and that’s what matters. To Team Endurance!

If you beat the index after 10 or 20 years, you will be in the top quartile. You want to keep staying and playing the game, and keeping doing okay and eventually you will do very well.

5. Things I Don’t Know About AI – Elad Gil

In most markets, the more time passes the clearer things become. In generative AI (“AI”), it has been the opposite. The more time passes, the less I think I actually understand.

For each level of the AI stack, I have open questions…

…There are in some sense two types of LLMs – frontier models – at the cutting edge of performance (think GPT-4 vs other models until recently), and everything else. In 2021 I wrote that I thought the frontier models market would collapse over time into an oligopoly market due to the scale of capital needed. In parallel, non-frontier models would more commodity / pricing driven and have a stronger opensource presence (note this was pre-Llama and pre-Mistral launches).

Things seem to be evolving towards the above:

Frontier LLMs are likely to be an oligopoly market. Current contenders include closed source models like OpenAI, Google, Anthropic, and perhaps Grok/X.ai, and Llama (Meta) and Mistral on the open source side. This list may of course change in the coming year or two. Frontier models keep getting more and more expensive to train, while commodity models drop in price each year as performance goes up (for example, it is probably ~5X cheaper to train GPT-3.5 equivalent now than 2 years ago)

As model scale has gotten larger, funding increasingly has been primarily coming from the cloud providers / big tech. For example, Microsoft invested $10B+ in OpenAI, while Anthropic raised $7B between Amazon and Google. NVIDIA is also a big investor in foundation model companies of many types. The venture funding for these companies in contrast is a tiny drop in the ocean in comparison. As frontier model training booms in cost, the emerging funders are largely concentrated amongst big tech companies (typically with strong incentives to fund the area for their own revenue – ie cloud providers or NVIDIA), or nation states wanting to back local champions (see eg UAE and Falcon). This is impacting the market and driving selection of potential winners early.

It is important to note that the scale of investments being made by these cloud providers is dwarfed by actual cloud revenue. For example, Azure from Microsoft generates $25B in revenue a quarter. The ~$10B OpenAI investment by Microsoft is roughly 6 weeks of Azure revenue. AI is having a big impact on Azure revenue revently. Indeed Azure grew 6 percentage points in Q2 2024 from AI – which would put it at an annualized increase of $5-6B (or 50% of its investment in OpenAI! Per year!). Obviously revenue is not net income but this is striking nonetheless, and suggests the big clouds have an economic reason to fund more large scale models over time.

In parallel, Meta has done outstanding work with Llama models and recently announced $20B compute budget, in part to fund massive model training. I posited 18 months ago that an open source sponsor for AI models should emerge, but assumed it would be Amazon or NVIDIA with a lower chance of it being Meta. (Zuckerberg & Yann Lecunn have been visionary here)…

...Are cloud providers king-making a handful of players at the frontier and locking in the oligopoly market via the sheer scale of compute/capital they provide? When do cloud providers stop funding new LLM foundation companies versus continuing to fund existing? Cloud providers are easily the biggest funders of foundation models, not venture capitalists. Given they are constrained in M&A due to FTC actions, and the revenue that comes from cloud usage, it is rational for them to do so. This may lead / has led to some distortion of market dynamics. How does this impact the long term economics and market structure for LLMs? Does this mean we will see the end of new frontier LLM companies soon due to a lack of enough capital and talent for new entrants? Or do they keep funding large models hoping some will convert on their clouds to revenue?…

…What happens in China? One could anticipate Chinese LLMs to be backed by Tencent, Alibaba, Xiaomi, ByteDance and others investing in big ways into local LLMs companies. China’s government has long used regulatory and literal firewalls to prevent competition from non-Chinese companies and to build local, government supported and censored champions. One interesting thing to note is the trend of Chinese OSS models. Qwen from Alibaba for example has moved higher on the broader LMSYS leaderboards…

…How much of AI cloud adoption is due to constrained GPU / GPU arb? In the absence of GPU on the main cloud providers companies are scrambling to find sufficient GPU for their needs, accelerating adoption of new startups with their own GPU clouds. One potential strategy NVIDIA could be doing is preferentially allocating GPU to these new providers to decrease bargaining power of hyperscalers and to fragment the market, as well as to accelerate the industry via startups. When does the GPU bottleneck end and how does that impact new AI cloud providers? It seems like an end to GPU shortages on the main clouds would be negative for companies whose only business is GPU cloud, while those with more tools and services should have an easier transition if this were to happen…

…ChatGPT launched ~15 months ago. If it takes 9-12 months to decide to quit your job, a few months to do it, and a few months to brainstorm an initial idea with a cofounder, we should start to see a wave of app builders showing up now / shortly.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Alphabet (parent of Google), Amazon, Apple, Meta Platforms, Microsoft, and Tencent. Holdings are subject to change at any time.

What We’re Reading (Week Ending 25 February 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 25 February 2024:

1. Wang Chuanfu: A Name Everyone in the West Should Know – Kevin Xu

Wang Chuanfu (王传福), the founder of BYD which just beat Tesla in global electric vehicles sales, is virtually unknown in the west. Even in China, he is only well-known in the business circle and has a low profile otherwise compared to the more flashy tech entrepreneur, Jack Ma, or the more cosmopolitan investor-turned-founder, Kaifu Lee.

Whether you think China’s mass production of EVs and other renewable energy products is a net-positive for dealing with climate change, or an evil “onslaught” on the west, BYD’s global impact is hard to ignore and cannot be wished away. Its batteries have been powering millions of cell phones long before it started making cars. Its EVs can be now seen on the streets of every Chinese city, and quite a few European and Latin American cities. Its battery-powered buses are transporting commuters in Hyderabad, Bogotá, and the Los Angeles International Airport. It is also making electric SkyRails (subway in the air) that may soon appear in São Paulo’s skyline. Oh, and it supplies batteries to Tesla too.

Wang Chuanfu, the pudgy-faced chemist-turned-entrepreneur, is the main, if not the sole, reason why BYD, which meant literally nothing when the company was incorporated in 1995, became BYD, which now means “Build Your Dreams.” The late Charlie Munger called him a “genius”. Yet, there is no comprehensive biography (that I’m aware of) about the man. (Musk, on the other hand, has at least three about him.)…

…It is difficult to describe just how poor Wang’s upbringing was and how much the cards were stacked against him to amount to anything. In fact, his plan was to get into a vocational high school, not university, because it was easier in the early 1980s in China to get a job with vocational training. But the year he applied was the same year that his mother passed away, so he was affected by the loss and didn’t get in. Instead, he ended up in a normal high school that inadvertently paved the path for him to eventually attend a university. Even though he could have dropped out, his older brother insisted on supporting him financially, so he could focus on studying, get into a university, and bring the whole family out of poverty.

As the story goes, because Wang had no guidance or tutelage from his parents or anyone else, he read a lot of books on his own and developed some early muscle as an independent thinker. He had no choice. He ended up going to Central South University of Technology in the neighboring province of Henan as a chemistry major. In his own telling, Wang did not even remember applying to this school. His first choice was the Hefei University of Technology in his home province to study wireless technology, because he liked playing with radios as a kid, but he didn’t get in…

…With a 250,000 RMB loan from a cousin who worked in finance, Wang incorporated BYD in Shenzhen in 1995, where nothing was built and anything was possible. Registering a similar company in Beijing would have been a huge hassle, but in Shenzhen as a pilot SEZ, it sometimes took as little time as one day to form a company. Thus, there were a ton of companies being incorporated. In a rush, Wang chose B(比) Y(亚) D(迪) – three random Chinese characters that meant nothing together – because it was a name that wasn’t used yet. He optimized the first character’s pinyin for being earlier in the English alphabet, so the name could be seen earlier at a trade show or conference. (Jack Ma picked Alibaba for the same reason.)

Back then, the global leaders in battery manufacturing were Japanese giants – Sanyo, Panasonic, Phillips. Sanyo, in particular, was the company Wang aspired to and wanted to beat. But BYD was poor and could not afford any of the advanced equipment or assembly lines that Japanese manufacturers were using. So Wang reverse-engineered the manufacturing process, broke it down into small pieces, then hired very cheap human labor – the only advantage China had at the time – to work on each of those pieces to build cheaper batteries by hand. It was the most literal implementation of “human as a cog in a machine.”

Wang also flexed his chemistry training and caught up quickly in terms of battery technologies, from nickel-cadmium, to nickel-metal hydride, to lithium. BYD quickly caught up on all three types of batteries, while producing them at a fraction of the cost compared to its Japanese competitors. Its early investment in lithium-based batteries, along with Wang’s penchant to reverse engineer, would feature more prominently later in our story when BYD decided to make EVs…

…The company went public in 2002. That same year, Li Lu, the Tiananmen-protest-leader-turned-value-investor bought a stake with the money that Charlie Munger entrusted him to start Himalaya Capital…

…To Munger, investing in BYD in 2002 was akin to writing a VC check into an early stage startup – high probability of going to zero but with infinite upside.

Munger nonetheless admired Wang Chuanfu the person – someone he considered a “genius” with great engineering aptitude who works 70 hours a week. He would also soon learn of Wang’s independence and stubbornness, a trait that made his and Li Lu’s wager look like a terrible idea for a time, but set it on the path to become one of the best performing investments ever.

In January 2003, BYD bought a local carmaker called Qichuan Motors. Qichuan was so bad that the only worthwhile asset from that acquisition was the license it held, which BYD could now use to make its own cars.

Wang has had his eyes on the massive Chinese car market, and this was his way to move into it. His investors, however, were not pleased with this expansion. Li Lu, Munger, and just about everyone inside and outside the company opposed it. BYD’s stock price tanked by one-third during this acquisition.

But Wang didn’t care. For one reason or another, he acquired an immense confidence in his ability to reverse-engineer, vertically-integrate, then mass-produce just about anything. To learn how to make cars, he bought 50 or so second-hand cars from all the best foreign brands, took them apart, and learned how to make cars – a tale he has been fond of sharing in interviews since…

…Tesla was incorporated in July 2003, a few months after BYD bought Qichuan Motors. And Elon Musk would not come into the picture until February 2004, when he made an investment into Tesla’s Series A round using his PayPal-to-eBay acquisition winnings.

Technically, Wang was into making cars before Musk was.

Warren Buffett’s investment in BYD is a well-told story. Buying 225 million shares for $230 million dollars in 2008, when BYD was trading at barely more than $1, it is one of the best examples of Buffett’s “buy and hold” strategy working its magic. Buffett did not begin selling until 2023 – 15 years after his initial purchase. He is still holding more than half of his original stake, at the time of this writing.

However, there are two details to the Buffett-BYD love story that are less well-known and provide interesting colors to Wang Chuanfu’s personality.

First, Wang rejected Buffett’s initial overture to buy BYD, because the Oracle of Omaha wanted to buy a bigger stake than Wang was willing to give up. Despite the obvious benefits of capital infusion and stamp of approval from the greatest investor of all time, Wang stubbornly treated BYD like his baby, his kingdom, and his calling that couldn’t be so easily sold to the highest or most famous bidder. In the end, Buffett was only able to acquire about 10% of BYD…

…From 2009 to 2010, buoyed by Buffett’s investment and branding, Wang set BYD on an aggressive expansion path to make and sell as many cars as possible in China. Although Wang was an engineering and mass production savante, force-feeding BYD cars, which were not of the best quality nor had any brand premium at the time, turned out to be a terrible move. BYD had no problem pumping out tens of thousands of cheap cars. But Wang’s sales target – doubling year over year – forced its sales teams to in turn force dealerships across the country to take on more BYD inventories and higher sales targets of their own.

But not enough consumers wanted BYD cars. Demand overall was also weakening at the time when every country was, in one way or another, dealing with the aftermath of the Global Financial Crisis. Thus, major dealerships started rejecting BYD cars and severing relationships with the company in droves, from Sichuan, to Hunan, to Shandong, and beyond.

By mid-2010, “Dealership Exodus Gate” was in full swing, BYD slashed its sales guidance, implemented mass layoffs, and Wang was humbled. He realized that treating dealers like minions, while making cars with no brand value was not going to work, even with Buffett’s blessing. Unlike batteries, which few consumers know of or care about the brand or manufacturer, cars are prized possessions that convey social status and prestige.

BYD had to become a brand, not just an efficient producer of cheap, affordable cars…

…Tesla first started selling EVs in China in 2014. It commanded brand premium, conveyed social status, and produced high-performing EVs with solid range – three things BYD did not have. Tesla’s were coveted by many, but affordable to only a few, due in large part to China’s high tariffs on foreign-made cars. This barrier gave BYD and other domestic EV makers some room to survive by continuously catering to low-end, cost-conscious consumers.

All that changed in 2019, when Tesla opened its Shanghai factory. Musk’s creations could now both be made and sold in China. This meant Tesla cars could avoid the tariffs and lower prices to compete with the likes of BYD. That year, BYD sold 20% less vehicles than the previous year. Its earnings fell by almost half. Wang Chuanfu was in survival mode again…

…To fix BYD EVs’ lack of range and improve safety concerns, Wang came up with a new design concept that became the Blade Battery – a new form factor that could pack more power density and release heat faster than the standard battery pack modules. BYD’s adaptive and vertically-integrated manufacturing line quickly churned out prototypes of Lithium Iron Phosphate (LFP) Blade Battery…

…By packing more LFP-composed power into Wang’s blade-shaped design, which allowed for more density and a larger surface area for cooling, the LFP Blade Battery achieved a nice middle ground that enabled longer range than conventional LFP block batteries, a bit less range than NMC batteries, with way less heat during an accident…

…By March 2020, Blade Battery started making its way into BYD EVs. From 2020 to 2022, BYD’s sales quadrupled. The same Blade Battery is now in Tesla’s Model Y…

…What Wang will face next in order to take BYD to the next level is a geopolitical problem that has been decades in the making. It will require more words, more finesse, and less inventive chemistry composition and hardcore engineering. It is probably not the kind of wheeling-and-dealing he is naturally good at. Then again, for a peasant kid orphaned as a teenager, he is not supposed to be naturally good at anything.

Whether he succeeds or not, Wang Chuanfu, is a name that everyone in the west should know. It’s long overdue.

2. The road to investing wisdom begins with ‘I don’t know’ – Chin Hui Leong

When it comes to buying stocks, investor and mutual fund manager Peter Lynch has a simple mantra: Invest in what you know. But what does it mean to know something? How do you gauge your knowledge and skills?

Businessman and investor Warren Buffett has a useful concept for this conundrum: your circle of competence. In layman’s terms, it refers to the range of topics and fields that you can understand well.

For instance, if you are a teacher, you will have a better understanding of the education system than most people. Likewise, if you are a restaurant owner, you will know the ins and outs of the food and beverage industry.

Here is what investors miss: Knowing what you are good at is just the beginning. The real challenge is to know your limits. You need to be honest about your weaknesses and avoid investing in areas you do not understand, Buffett says. In other words, you need to know what you do not know…

…It is better to admit early that you are out of your depth than to suffer months and years later from holding the wrong stocks. Even a winning stock will be useless if you lack the conviction to hold it…

…Ben Graham, the father of value investing, used a story to explain how the stock market works: he called it Mr Market. A friendly guy, Mr. Market always tells you the price of your shares every day. But there is a catch: He is also very emotional. He often gets too excited or too depressed, and gives you prices that are too high or too low.

The trick is to know when Mr Market is wrong. That is how you beat him at his own game. Then again, while Mr Market has mood swings, he is not dumb. Even Graham admits that Mr Market can get it right sometimes, giving you a fair price for your stock based on how the underlying business is doing and its prospects.

The trick, then, is to realise that while Mr Market is not stupid, he is impatient. In the short term, he will change the price of your stocks to reflect the prevailing business news.

Over the long term, however, it is the business’ earnings growth that will determine the direction of the stock price…

…Here’s what I have noticed: Most investors do not like to admit that they need to diversify to lower their risk. They prefer to follow Buffett’s advice and put all their eggs in one basket. They would hold no more than five stocks at a time, sometimes even less.

Sadly, these same investors are just trading one flaw for another – ignorance for arrogance. Holding a few stock positions implies you have the rare ability to pick winners with atypical accuracy. Buffett, with his decades of experience, can make that claim. How about you?…

…Investor, hedge fund manager and author Seth Klarman said it best – that when you buy a stock, it is an arrogant act. You are saying you know more than the person selling the stock to you. That is arrogance.

There is no thin line between arrogance and confidence. They are both sides of the same coin. But here is the good news. You do not have to be stuck on one setting. You can be confident when you buy stocks. And then be humble after you buy the stock. You can commit to learning about the business over years, and earn your right to be confident.

3. What a Viral Post on Giraffes Says About China’s Fed-Up Investors – Li Yuan

It’s a perilous time for investors in China. Their main vehicle, so-called A shares of Chinese companies, fell more than 11 percent in 2023 and have continued their losses this year. Many investors have instead flocked to the exchange-traded funds that track foreign markets and that have been performing much better.

Putting money in stocks is inherently risky. But Chinese investors are experiencing something especially alarming: financial losses in the markets, declining home values and a government that doesn’t want any public discussion of what’s happening.

With their frustrations piling up, Chinese investors recently found a way to vent that wouldn’t be quickly censored. They started leaving comments on an innocuous post about giraffe conservation on the official Weibo social media account of the U.S. Embassy in China. They lamented the poor performance of their portfolios and revealed their broader despair, anger and frustration. The giraffe post has been liked nearly one million times since Feb. 2, much more than what the embassy’s Weibo posts usually get. Many of the comments also offered admiration for the United States, as well as unhappiness about their own country.

“The different stock markets’ performances reflect the distances between America and China in terms of national power, technology, humanity and sense of well-being,” a commenter wrote.

The comments demonstrate a growing loss of confidence by the Chinese public in the stock market, the country’s economic prospects and the Chinese Communist Party’s ability to govern…

…Another investor I spoke to, Leo, a portfolio manager at an asset management company in Beijing, has been investing in China’s stock markets for nearly a decade. In November, he started closing out his positions. Now, like Jacky, he is placing his bets on overseas markets.

Leo said he used to hope that China’s internet giants Alibaba and Tencent would become $1 trillion companies like Amazon, and that investors like him would benefit from their growth. “That dream was shattered” after the government cracked down on tech in 2020, he said. “I can only look to the overseas markets now.”

The American Embassy’s Weibo comments section once served as an online punching bag for nationalistic Chinese who blamed the United States for their country’s problems. Now it’s called the Western Wall of China’s A shares investors.

“Under the protection of the U.S. government,” wrote one commenter, “the giraffes are 10,000 times happier than the Chinese stock investors.”…

…A recent survey by the Canton Public Opinion Research Center offered a bleak picture from the southern city of Guangzhou, a metropolis of nearly 19 million people and a hub of technology, manufacturing and trade. In a 2023 survey of 1,000 residents, the center found that the city’s “economy and the society were confronted with unprecedented challenges and pressure.”

The research center’s report said residents’ assessment of the economy, because of unemployment and falling incomes, was as low as it was in 2015, when China’s markets tanked. Satisfaction with the growth of the private sector dropped below 30 percent, the lowest level since the question was first asked in 2008. Most residents said they didn’t expect their incomes to improve in 2024, and more than 20 percent said they believed they were “likely” to lose their jobs.

News coverage about the survey was censored, and the report can’t be found on the center’s website…

…Leo, who was born in Beijing in the mid-1980s, said he had grown up as a nationalistic “little pink.” The first crack in his confidence, he said, was in 2021 when the government went after internet companies. The second crack appeared when the government abruptly ended its “zero-Covid” policy in December 2022 without preparing the population with effective vaccines or medications. Then in late July, the markets and the private sector failed to respond to government measures to stimulate the economy.

Leo’s change is remarkable. He said local Beijing residents like him and the people with whom he had gone to high school were among the stoutest supporters of the Communist Party’s rule because they benefited from the city’s expansion and the country’s growth.

When a group of Leo’s classmates met up in June, he said, they couldn’t believe that two of them, a couple, were migrating to Canada…

…He said the big problems that had made him flee remained unsolved: the imploding real estate sector, enormous local government debts and a fast-aging population.

He said that he wanted the government to loosen its grip on private enterprise and disband Communist Party branches that had proliferated inside companies, and that he wanted the private sector to start to invest again. Until then, he will keep his money out of China’s markets.

And what investing advice would he give to his families and friends? “Run as fast as you can,” he said, “even at a loss.”

4. Rohit Krishnan — Demystifying AI – Jim O’Shaughnessy and Rohit Krishnan

Jim O’Shaughnessy: This large language model says, and he’s speaking to you or it is speaking to you. “In your description of AI as a fuzzy processor, you acknowledge a level of unpredictability in AI behavior. How would you balance the need for predictable AI systems with the inherent uncertainty of their fuzzy outputs in critical applications?” …

…Rohit Krishnan: So with an LLM, the fact that it’s a fuzzy processor means that you can now use it in a lot of different places where you could not have used an AI or any kind of software before, because it can effectively be a replacement for parts of different jobs that people might actually be doing. However, the problem is that, if you or I as fuzzy processors are used in those places, we can be tested. We can be evaluated. If I’m hiring someone for a job, I know that they’re not perfectly predictable. However, I can talk to them and get a sense of how unpredictable they are, and how they would actually deal with different situations, and monitor those in different ways, and ask for previous employers or references, or interview them, and create basically this cone of uncertainty, if you will. I can bound it, so I know that they’re not completely crazy. I know that they will do some things, but it’ll be within the bound of error.

Rohit Krishnan: So with LLMs and fuzzy processors, we are at the early stages for that. The inherent fuzziness is problematic only because you cannot depend on when and how it is actually likely to be fuzzy, that it might end up going in any kind of random direction. So for us, to be able to use it in any actual real-life situation, especially in critical applications, we would need to have a whole lot more confidence in how precisely it works. We would need to not it’s in the internals, in the specific nodes and weights and stuff like that. We already know it, but it’s slightly unhelpful. It’s like doing, I don’t know, neuroscience to predict behaviorism. I don’t think it is hugely valuable in and of itself. However, we do need to bound its behavior in some sense so that we know it cannot go completely off the rails when you’re trying to use it.

Rohit Krishnan: Even with that, I mean, we are speaking, what, after the latest Boeing disaster, not that long after? So when you talk about complex systems where large number of parts actually interact with each other, the possibility of something going wrong always exists. So the way we solve it in real life is by having stringent QA and multiple checks and huge levels of evaluations and large amounts of redundancies. And the exact same principle applies for fuzzy processes as well, where the only way to make a fuzzy processor function in a critical system is by having large number of evaluations so that it can bound it, creating enough structure around it so that even if it does something weird or crazy, you can actually cut off those particular probability branches of the tree, and you can direct it towards something, and having large amount of redundancies so that you can actually ensure that the output that is coming from it is effectively usable, so that even if it does something crazy or stupid, the errors are not continuously compounded over a period of time.

Rohit Krishnan: It’s like that… I don’t know whether this is apocryphal, but I remember hearing this story about Elon where they were trying to send computers up along with the Starlink satellites. And obviously, radiationshielded computers are very heavy and highly expensive. And radiation shielding is important because bit flips are more common when there is higher levels of radiation that actually hits once they’re above the atmosphere. And I think his solution, or the solution is one of his engineers in that particular apocryphal story, was to send three, and they would just vote, because chances of all three getting hit simultaneously are much lower. That’s a way to use redundancy to solve for unpredictability. And I feel like a similar kind of thesis has to exist with respect to LLMs as well…

… Rohit Krishnan: I think I’ve written about a couple of these things before, which is that, at a sufficient degree of complexity, highly deterministic systems can also show highly indeterministic outcomes. I am by no means the first person. It’s a common trope in pretty much anything to do with chaos theory or even things like sand piles and grains at a point of avalanche and cascade. And there’s a bunch of these questions which are, in my opinion, more feasible to see happen than to predict how it will happen because prediction requires you to effectively run the experiment, so to speak, and I’m fascinated by that.

Rohit Krishnan: So I think, in some sense, we in normal conversations quite often complicate indeterministic with random, or unpredictable with random, and they’re two different kind of processes. I mean, there is the common argument that people make against, things like free will, is like, everything is a physical phenomena. Physical phenomena, given a sufficiently powerful computer, might actually be able to get simulated, and therefore, you might be able to predict it. And it’s one of those things when, logically, it might hold true if and only if the computer that is predicting it did not need to actually run the simulation in order to predict it. And if it did, then from the perspective of the people being simulated, us in this instance, the outcome will still end up looking indeterministic, unpredictable, even though, theoretically, everything was as preordained.

Rohit Krishnan: I know this has vexed and driven more people mad than me, but I think there is a core kernel of truth here that just because you can’t create beautiful analog equations to predict the behavior of a particular piece of software, physical phenomena, whatever, does not mean that that is random. It just means that at a certain degree of complexity, there are way more permutations and combinations of how things can go wrong than there is feasible for us to, I don’t know, conceivably identify. And as we said in the previous, the only way to solve it is by having sufficient amount of QA and redundancy and bound the system so that you can actually be relatively sure that it does what you want it to do.

Rohit Krishnan: I mean, stock markets are a perfect example of this. I mean, the flash crash is my favorite example of this. It’s not an intended behavior of the system, but it is one chaotic outcome that could have happened. And how do you stop it? You don’t stop it by stopping each individual trader analyzing each one. You stop it at the macro level saying, “If it falls a little this much, we cut it off,” which is a macro behavior that then controls the micro behavior of each individual algo, which takes that into account. And even if it does hit, we mean that the worst-case scenario is bounded.

Jim O’Shaughnessy: And you also covered that in your book because you posit that we could have a so-called flash crash of AI. And why don’t you tell our listeners a little bit about your solution for that?…

…Rohit Krishnan: The only way to guard against it is at the macro level. You can’t go solution by solution and say, “Unless we can perfectly predict the outcome of this particular system, we will let it go off and do what it wants to do,” because if you could perfectly predict the outcome of the system, we didn’t really need the system in the first place. It’s arguing against the premise of the question in the first place.

Rohit Krishnan: The only way to guard against it is at the macro level. You can’t go solution by solution and say, “Unless we can perfectly predict the outcome of this particular system, we will let it go off and do what it wants to do,” because if you could perfectly predict the outcome of the system, we didn’t really need the system in the first place. It’s arguing against the premise of the question in the first place.

Rohit Krishnan: We will have to do something similar on the AI front as well, where if you don’t want it to do certain outcomes in a particular system, we have to go from outcome first rather than sort of algo first. You’re not going to prevent that by, I don’t know, bounding the number of flops, because even with the lower number of flops, we can find enough ways for it to screw us up, assuming there’s enough number of them that actually interact with each other. But the only way to stop that is step up a layer of aggregation, actually stop it from creating the chaos that we don’t actually want it to do…

…Rohit Krishnan: Oh, I’ll tell you one of the funny things that I’ve been working on. I created a bit of an evaluation suite for a bunch of LLMs for various reasons. And I ran it against a bunch of the Chinese LLMs because I could. I mean, there’s no reason to. So then the interesting things that come out from that is that they’re really good, first of all, I should say that. However, they’re also clearly slanted in what they’re actually allowed to say.

Rohit Krishnan: If you ask it any questions about things around geopolitics, it’s like hackles get raised a little bit, and it says specific things. If you ask it questions about economics, its hackles get raised. If you ask about politics, of course, sometimes it just refuses to answer. Don’t even mention Tiananmen Square. It is fascinating to see that it has created an actually useful tool, which is it does coding really well. And you ask it to create ASCII art of a dinosaur, it does pretty well. You ask it to name, I don’t know, planets in reverse order with different whatever, different languages for each, it does the things that you would want it to do. But it also means you cannot put it into production anywhere you need any of that judgment.

Rohit Krishnan: So you cannot use it in a financial services institution because, guess what, if you’re making an investment decision, you cannot be influenced by things that were hard-coded into you. So similarly, the only way you’re going to be convinced about which ones you are most happy using are by ease of use and latency. It has to be easy to use in front of you, fast, et cetera, et cetera. But also, you can trust the advice coming from it. If I’m thinking about investing in something, I’m not going to call my friend up from Beijing to ask their opinion on a public line. Because there’s a set of information that comes back which is clearly biased. I would ask somebody that I trust and that is the benefit here…

…Rohit Krishnan: I don’t think you are wrong. I think the only caveat or perhaps addition that I would make is centaur models work best in areas which are not directly entirely competitive with the same things that the AIs do. Unless you find joy in doing it, because then it’s a self-fulfilling kind of prophecy.

Rohit Krishnan: To me, currently, and at least for the immediate future, AI is best used in areas where you can either automate part of your own job and yourself and also use it together with you in order to make your ultimate goal better. It’s just like any tech. We are all centaurs already. We live most of our lives on digital technology connected with other human beings. We are part of some weird form of a hive mind, and we are all cyborgs. This is a fact.

Rohit Krishnan: Then the question is, how much more integration would you like in different facets so that you can actually perform some of these things better? And the answer is all of them. Now, there might be some things where, guess what? If you like drawing for fun, you’re probably still going to drawing for fun, despite the fact that if you do want to make a profession out of it, there are some things that the AI will be able to do much better.

Rohit Krishnan: And you as somebody who actually understands it and can use it better and knows the intricacies of drawing will be able to direct it and make use of it in ways that me, as somebody who doesn’t, can’t. Your knowledge and education in doing that particular thing translates to how much better you can actually do something. It’s like giving yourself a boost. Everyone gets a boost kind of question.

5. Big Risks: Catastrophic Risk in Investing and Business – Aswath Damodaran

There are a multitude of factors that can give rise to catastrophic risk, and it is worth highlighting them, and examining the variations that you will observe across different catastrophic risk. Put simply, a volcanic eruption, a global pandemic, a hack of a company’s database and the death of a key CEO are all catastrophic events, but they differ on three dimensions:

Source: I started this post with a mention of a volcano eruption in Iceland put an Icelandic business at risk, and natural disasters can still be a major factor determining the success or failure of businesses. It is true that there are insurance products available to protect against some of these risks, at least in some parts of the world, and that may allow companies in Florida (California) to live through the risks from hurricanes (earthquakes), albeit at a cost. Human beings add to nature’s catastrophes with wars and terrorism wreaking havoc not just on human lives, but also on businesses that are in their crosshairs. As I noted in my post on country risk, it is difficult, and sometimes impossible, to build and preserve a business, when you operate in a part of the world where violence surrounds you. In some cases, a change in regulatory or tax law can put the business model for a company or many company at risk. I confess that the line between whether nature or man is to blame for some catastrophes is a gray one and to illustrate, consider the COVID crisis in 2020. Even if you believe you know the origins of COVID (a lab leak or a natural zoonotic spillover), it is undeniable that the choices made by governments and people exacerbated its consequences.
Locus of Damage: Some catastrophes created limited damage, perhaps isolated to a single business, but others can create damage that extends across a sector geographies or the entire economy. The reason that the volcano eruptions in Iceland are not creating market tremors is because the damage is likely to be isolated to the businesses, like Blue Lagoon, in the path of the lava, and more generally to Iceland, an astonishingly beautiful country, but one with a small economic footprint. An earthquake in California will affect a far bigger swath of companies, partly because the state is home to the fifth largest economy in the world, and the pandemic in 2020 caused an economic shutdown that had consequences across all business, and was catastrophic for the hospitality and travel businesses.
Likelihood: There is a third dimension on which catastrophic risks can vary, and that is in terms of likelihood of occurrence. Most catastrophic risks are low-probability events, but those low probabilities can become high likelihood events, with the passage of time. Going back to the stories that I started this post with, Iceland has always had volcanos, as have other parts of the world, and until recently, the likelihood that those volcanos would become active was low. In a similar vein, pandemics have always been with us, with a history of wreaking havoc, but in the last few decades, with the advance of medical science, we assumed that they would stay contained. In both cases, the probabilities shifted dramatically, and with it, the expected consequences.

Business owners can try to insulate themselves from catastrophic risk, but as we will see in the next sections those protections may not exist, and even if they do, they may not be complete. In fact, as the probabilities of catastrophic risk increase, it will become more and more difficult to protect yourself against the risk…

…When looking at how the market prices in the expectation of a catstrophe occurring and its consequences, both these human emotions play out, as the overpricing of businesses that face catastrophic risk, when it is low probability and distant, and the underpricing of these same businesses when catastrophic risk looms large.

To see this process at work, consider again how the market initially reacted to the COVID crisis in terms of repricing companies that were at the heart of the crisis. Between February 14, 2020 and March 23, 2020, when fear peaked, the sectors most exposed to the pandemic (hospitality, airlines) saw a decimation in their market prices, during that period.

With catastrophic risk that are company-specific, you see the same phenomenon play out. The market capitalization of many young pharmaceutical company have been wiped out by the failure of blockbuster drug, in trials. PG&E, the utility company that provides power to large portions of California saw its stock price halved after wildfires swept through California, and investors worried about the culpability of the company in starting them.

The most fascinating twist on how markets deal with risks that are existential is their pricing of fossil fuel companies over the last two decades, as concerns about climate change have taken center stage, with fossil fuels becoming the arch villain. The expectation that many impact investors had, at least early in this game, was that relentless pressure from regulators and backlash from consumers and investors would reduce the demand for oil, reducing the profitability and expected lives of fossil fuel companies.

While fossil fuel pricing multiples have gone up and down, I have computed the average on both in the 2000-2010 period and again in the 2011-2023 period. If the latter period is the one of enlightenment, at least on climate change, with warnings of climate change accompanied by trillions of dollars invested in combating it, it is striking how little impact it has had on how markets, and investors in the aggregate, view fossil fuel companies. In fact, there is evidence that the business pressure on fossil fuel companies has become less over time, with fossil fuel stocks rebounding in the last three years, and fossil fuel companies increasing investments and acquisitions in the fossil fuel space.

Impact investors would point to this as evidence of the market being in denial, and they may be right, but market participants may point back at impact investing, and argue that the markets may be reflecting an unpleasant reality which is that despite all of the talk of climate change being an existential problem, we are just as dependent on fossil fuels today, as we were a decade or two decades ago:

Don’t get me wrong! It is possible, perhaps even likely, that investors are not pricing in climate change not just in fossil fuel stocks, and that there is pain awaiting them down the road. It is also possible that at least in this case, that the market’s assessment that doomsday is not imminent and that humanity will survive climate change, as it has other existential crises in the past.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Amazon, Tencent, and Tesla. Holdings are subject to change at any time.

What We’re Reading (Week Ending 18 February 2024)

The best articles we’ve read in recent times on a wide range of topics, including investing, business, and the world in general.

We’ve constantly been sharing a list of our recent reads in our weekly emails for The Good Investors.

Do subscribe for our weekly updates through the orange box in the blog (it’s on the side if you’re using a computer, and all the way at the bottom if you’re using mobile) – it’s free!

Here are the articles for the week ending 18 February 2024:

1. Where Will Virtual Reality Take Us? – Jaron Lanier

In the intervening decades, V.R. has thrived at two extremes in the quest for “killer apps.” It has long been an established industrial technology: if you’ve flown, ridden, or sailed in a factory-built vehicle in the last thirty years, virtual reality may have played a central role. It’s been used to design surgical procedures and train surgeons ever since our first simulated gallbladder, at Stanford Med, some three decades ago; Boeing, Ford, and many other companies started using VR for design in the early days as well. And then there are the visionary, mystical, and philosophical applications. V.R. can be a way of exploring the nature of consciousness, relationships, bodies, and, perception. In other words, it can be art. V.R. is most fun when approached that way.

In between the two extremes lies a mystery: What role might V.R. play in everyday life? The question has lingered for generations, and is still open. Gaming seems likely—but, for most gamers, not so much. There are many reasons why V.R. and gaming don’t quite work, and I suspect that one is that gamers like to be bigger than the game, not engulfed by it. You want to feel big, not small, when you play. (“Star Wars” might have got this right with holographic chess.) Apple’s initial round of Vision Pro apps, like those from its competitors, aren’t entirely compelling, either, and can even have a lonely, dystopian flavor. (Watching a simulated big-screen movie, by yourself?) But my belief is that the quotidian killer apps will come. Maybe you’ll use V.R. to learn quickly about the Airbnb at which you’ve just arrived. Maybe V.R. will help you assemble ikea furniture. Maybe!

Virtual-reality headsets come in various forms. A major divide has to do with how they acknowledge the real world. Some headsets obscure the surrounding environment completely; this is typical in gaming headsets. But there is another option, which I used to call “mixed” reality, and which came to be known as “augmented” reality in the nineteen-nineties. Some mixed or augmented headsets, such as the Microsoft HoloLens or the system created by Magic Leap, allow you to see the real world through the headset glass so that it can be combined with virtual content using challenging optical techniques. Others, like Apple’s Vision Pro and the recent offerings from Meta, capture the real world with cameras, then render it as part of the virtual environment so that it can be combined with fabulated content.

Camera-based mixed reality is vastly easier to accomplish than the optical version, but it is concerning. Early research by a Stanford-led team has found evidence of cognitive challenges. Your hands are never quite in the right relationship with your eyes, for instance. Given what is going on with deepfakes out on the 2-D Internet, we also need to start worrying about deception and abuse, because reality can be so easily altered as it’s virtualized…

… For most of the technology’s history, however, virtual experiences have been hard to build and maintain. This has been one of V.R.’s biggest problems. I saw the first V.R. teaching demonstration of general relativity at least as early as 1992, and have seen dozens more since then; they’re often wonderful, and help users grasp the concept in new ways. But they only run for a year or so because there are too many variables in a V.R. system for creators to keep experiences available. Graphics chips change, and with them the layers of mediating software. That’s true for other programs, too, but with V.R., when the properties of a headset (like field of view) or an input device shift, the whole experience and interaction method must often be rejiggered. It’s too much ongoing effort, so it usually doesn’t happen; developers move on to other projects. The exceptions have been locked-down V.R. experiences that assume a minimal level of interaction, which limits the magic…

…Apple is marketing the Vision Pro as a device you might wear for everyday purposes—to write e-mails or code, to make video calls, to watch football games. But I’ve always thought that V.R. sessions make the most sense either when they accomplish something specific and practical that doesn’t take very long, or when they are as weird as possible.

The practical side of V.R. is a scattering of wonderful niches: in addition to surgical simulation and vehicle design, the technology is used by oil companies to simulate geological structures, by drug companies to envision molecules, and by planners working on city centers. The new frontier, which might apply more to everyday life, is the spontaneous creation of practical apps that you might not even bother to save. My research group, for instance, has presented a prototype system—the “mixed-reality copilot”—that allowed us to recreate, with a single voice request, a program that allows you to use your hands to paint and sculpt with virtual stuff. A decade ago, it took months to make that kind of program. Hopefully, in the near future, one will be able to ask for a V.R. relativity simulation tailored for a student who has color blindness and A.D.H.D., and it will simply appear. More prosaically, you might walk through a facility in augmented reality, asking an A.I. for instant advice about potential safety hazards and fixes. These ideas might even work already: one of the curious features of this accelerated period of A.I. development is that there aren’t enough minutes in the day to try everything.

On the weird edge, it turns out you can change your body plan in V.R. You can become different animals. You can map your body to that of a lobster or an octopus, and experience, to a significant extent, the control of that other body. The brain has had to adapt to many body plans over the course of its evolution, and it’s pre-adapted to work with more. When you change your body, you can also play with the flow of time. By shifting the rhythm of the natural sway of your limbs, and also how the objects around you move and change in response, you alter the reference points that your brain uses to mark the flow of time. You can speed it up or slow it down. In V.R., you can change the rules of the world. You can exist in strange geometries that are too hard to describe in words. You can become an archipelago of parts instead of a continuous animal. You can blend and share bodies with others, to a surprising degree…

…There are fresh, urgent reasons to reaffirm the value of experience. It is impossible to judge technology without a sense of its purpose—and its only plausible purpose is to benefit people, or perhaps animals, or the over-all ecosystem of the planet. In any case, if we pursue technologies that make it hard to delineate the beneficiaries—for instance, by blending brains into robotics not to cure a disease but just because it seems cool—then we make the very idea of technology absurd. The central question of the technological future is how to identify the people who are supposed to benefit from technology, especially if they seem to have melted into it. If people aren’t special, how can we act in a way that benefits people? We can’t. The principles of ethics, design, and even technology itself become nonsense. What can that specialness be? It must be something that is not technologically accessible, since technology expands unpredictably. It’s a little mystical. The definition of people must be one of apartness. We must now put people on pedestals, or they will drown.

When I put on a V.R. headset, I still notice that I am floating there, that I exist independently of the information I experience. But then there’s the moment I take off the headset, which is the very best. In the nineteen-eighties, we used to try to sneak flowers or pretty crystals in front of people before they would take off their headsets; it was a great joy to see their expressions as they experienced awe. In a sense, this was like the awe someone might experience when appreciating a flower while on a psychedelic drug. But it was actually the opposite of that. They were perceiving the authentic ecstasy of the ordinary, anew.

This is the kind of experience you can have only if you use V.R. fleetingly, not constantly. Here we come to one of the greatest differences between what I love about virtual reality and how it is often promoted today. Venture capitalists and company-runners talk about how people will spend most of their time in V.R., the same way they spend lots of time on their phones. The motivation for imagining this future is clear; who wouldn’t want to own the next iPhone-like platform? If people live their lives with headsets on, then whoever runs the V.R. platforms will control a gigantic, hyper-profitable empire.

But I don’t think customers want that future. People can sense the looming absurdity of it, and see how it will lead them to lose their groundedness and meaning…

…But the truth is that living in V.R. makes no sense. Life within a construction is life without a frontier. It is closed, calculated, and pointless. Reality, real reality, the mysterious physical stuff, is open, unknown, and beyond us; we must not lose it.

Just because owning a major tech platform is desirable, that doesn’t suggest there is no other way to succeed in the technology business. There are water companies and soda companies, and then there is fine wine. All are viable businesses. The metaphor isn’t perfect, but I suspect that V.R. entrepreneurs will find their sweet spot by emulating Napa Valley…

…A.I. is often portrayed as a godlike, transcendent project that will take over the fabric of our physical reality, leading to a singularity, meaning nothing that matters now is likely to matter after. But singularities, like the ones we hypothesize in black holes, are the very definition of ignorance. There is no learning that bridges the before and after of a singularity. It is the absolute rejection of intelligence. Virtual reality is sometimes stirred into this mix. But our best understanding of how reality works is entirely bound to finitude. Physics is all about conservation principles. There are no infinities, only S curves. There is no free lunch. Technical culture often longs for freedom from finitude. A profound truth, however, is that the greatest mysteries are found in conserved systems, which can become rich and complex, not in infinite ones, which stretch out like blank white sheets to the edge of the cosmos.

And so another urgent question is whether people can enjoy the storied reality of finitude after coming down from the high of fake infinity. Can being merely human suffice? Can the everyday miracle of the real world be appreciated enough? Or will the future of culture only be viral? Will all markets become Ponzi-like fantasies? Will people reject physics forever, the moment we have technology that’s good enough to allow us to pretend it’s gone?

2. Pods, Passive Flows, and Punters – Drew Dickson

You’ve surely noticed what has happen to Nvidia lately. We used to just call these winners FANGs, and then FAANGs and then FAMANGs, but Nvidia has insisted on joining the league table. It now has a $1.7 trillion market cap. And in the last five years, the stock is up about 1,700%. Guess what else is up about 1,700%?

Nvidia’s earnings estimates.

How about Facebook, aka Meta, which goes through periods of hatred and love with equal vigor? Well, over the past seven years it has bounced around a lot but still has generated nearly 260% returns. And forward earnings projections? They’re up 280%.

We can stretch things further back, and look at Google over the past 14 years (earnings up 885%, stock up 980%); or Amazon during the same period (earnings up nearly 2,500%, stock up about 2,800%).

Or we can go waaay back and analyze Microsoft over the past 22 years. Forward earnings projections have increased from $0.93 in February of 2002 to $11.57 today. That’s nearly 1,150%. The stock is up just over 1,200%.

And finally, from one of my favorite former-CEOs Reed Hastings, we have good old Netflix. About 18 years ago, analysts were forecasting that Netflix would generate 11 cents of earnings in the coming 2006 year. Here in 2024, they are forecasting a whopping $17 of earnings in the coming year. That is a whopping EPS increase of 14,889%.

And how about the stock? We’ll it is up a whopping 14,882%.

Fundamentals matter, sports fans. Fundamentals matter.

Admittedly, some of these examples above are very long-term, but even when we self-select with some of the biggest, most exciting, long-term winners out there, and ignore the losers (of which there are many), it is still clearly apparent that it is the fundamentals that matter most.

So basically, it probably isn’t terrible advice to ignore the rest of it. Ignore the noise. Ignore the talking heads on CNBC. Ignore prognostications of meme-stock sith lords. Ignore the volatility. Embrace it, actually. And just focus on the fundamentals. Get those right, and you will likely win.

3. “The Practice Of Value Investing”, by Li Lu – Graham Rhodes

If you invest in a company in a sustainably growing economy, your company’s profits and your investment return will also grow sustainably. If you speculate on other people’s short-term trading behaviour, there can only be one result in the end: gains and losses must equal because this is a zero-sum game. If you add up the gains and losses of all speculators in the market, they will sum to zero. This is the biggest difference between investing and speculating. I’m not denying that there are some speculators whose chances of winning are higher and who can go on winning for longer; equally there are some who will always be the sucker at the table and never strike it rich. If you give it enough time though, when you add the winners and losers together, the net result will be zero. The reason is that speculating on short-term behaviour in the market adds nothing to the economy nor to corporate earnings growth. Some people say they use a mixed model of “80% investment, 20% speculation”. If they do 70-80% of their work correctly, then such participants’ returns will reflect the compound growth of the modern economy. However, the remaining portion will be caught up with all the other speculators and their result will be the same – zero.

Now that you know this result, will you choose to be an investor or a speculator? This is a personal choice and there is no right or wrong answer. The only difference is the impact you will have on society. Investors will help all parts of society enter modernity’s virtuous cycle – the stage in which it enjoys continuous compound growth. If you are interested and would like to learn more about this, you can refer to my monograph, “Discussions on Modernisation”.

Relatively speaking, the speculative part of the market verges on being a casino. From a social welfare point of view, we do not want this casino to be too big. However, without it, the market would not exist. We should therefore see speculation as a necessary evil – and a part of human nature – which cannot be removed. We cannot deny the parts of human nature which love to gamble and speculate but we cannot let them overwhelm us. Otherwise, society will sooner or later face the consequences. The wounds of the 2008-2009 Global Financial Crisis from which we have just emerged are still fresh in our memories. And once you understand the principle of a zero-sum game, you will begin to see these speculators as Mr. Market…

…There was another company at the time which taught me something revealing. This company owned a lot of gas stations, and so I became interested in gas stations. There were two gas stations near where I lived, one on each side of the same intersection. However, I realised that one gas station had many more customers, and that cars would come to it regardless of which direction they were heading. Both gas stations had the same price and their gas was the same as it was made to the same standard. I felt this was very strange and since it was my company’s gas station anyway, I went to have a look. The gas station which attracted all the customers was run by a family of Indian immigrants, who all lived there too. As soon as a customer arrived, they would come out to offer him a glass of water. Whether you wanted it or not, they would always offer it to you first and then strike up a conversation. If the kids were home from school, they would come out and help you tidy up your car. The other gas station was run by a typical American. He wasn’t a bad guy but the gas station didn’t belong to him. He was just an employee hired by the real owner, so he wouldn’t come out from the store and nor would he pay much attention to what was happening outside. Thanks to this one difference, I calculated that in a given period, one gas station attracted almost four times as much traffic as the other.

From then on, I realised it was important to know whether a company’s manager had an owner’s mindset. Through this, I began to gradually understand how a company could earn money and why it could earn more than others. The example of the two gas stations is a perfect illustration because they sold the same product and were otherwise identical. However, one’s service was slightly superior to the other’s and so it received four times as much traffic. What motivated that Indian fellow? He was an immigrant, like me. He needed money and if he couldn’t bring in business, he would have financial difficulties. The other manager could be indifferent because he could just take his salary while pretending to do his job. This was the difference. I therefore began to take great interest in how a company is run, its competitive advantages, and the sustainability of these competitive advantages…

…The next attribute is relatively special. You must be both extremely patient and extremely decisive, even though they are in contradiction. When there are no opportunities, you might go for years without taking any action. But as soon as an opportunity arrives, you must be able to become extremely decisive and act without hesitation. I have been Charlie Munger’s investment partner for sixteen or seventeen years now. We meet for dinner at least once a week and I’ve developed a deep understanding of him. Let me tell you a story about his investments. Charlie subscribes to Barron’s, a weekly magazine about the stock market published by the Wall Street Journal. He’s read this magazine for approaching 40-50 years for the purpose of finding investment ideas. And how many has he found in this time? One! There has only been one and he only found it after reading the magazine for more than thirty years. And he hasn’t found another in the ten years since. This hasn’t stopped him from continuing to read the magazine every week though. He is extremely patient and can go for a long time without doing anything at all. But when he finds an opportunity, he will go all in. And this particular investment made him a lot of money. So this is what’s required of an exceptional investor: he must have extreme patience and stay focused even when there are no opportunities. When an opportunity does come, he must then have the ability to move swiftly and decisively…

…When I was young, I always wondered about the meaning of life. Later, I gradually came to realise that the meaning of life is the pursuit of true knowledge. True knowledge can change your life and your fate; it can even change the world. Moreover, mankind is completely different from what else we can observe in the material world. The world we can see is one in which entropy increases. Energy flows from high places to low places; big things devour small things. If a large celestial body hits a smaller one, it will crush it. The entire planet and our universe are to a certain extent heading towards annihilation.

But the world of man is not the same. Mankind can turn the world into one in which entropy decreases. We can reverse entropy’s course. Through study, man can go from ignorance to erudition; through self-cultivation, man can become a virtuous person who contributes to society. Man can create things which were previously unimaginable. Since man’s arrival, the earth has changed. Today, we can even leave this planet for the stars; it is entirely possible that we go on to change the universe. As I mentioned earlier, the first investment I made was related to the wireless telephone. At the time, I hadn’t really figured out what that was. Twenty-six years later, who can bear to part with their mobile phone? Mobile phones, the internet and all these things were game changers born of knowledge. The internet is based on TCP/IP which is a protocol. At their heart, computers are permutations and combinations of 0s and 1s combined with a diode which uses silicon and electricity to tell those 0s and 1s apart. This is how knowledge can create changes which turn our world upside down.

4. Hong Kong’s death has been exaggerated – Michael Fritzell

The National Security Law in June 2020 was indeed a watershed moment for Hong Kong’s judiciary. Now that individuals seen to be endangering national security can be extradited to mainland China, there’s a fear that they will no longer receive fair trials.

But let’s look at the positive side of things. In reality, the National Security Law has really just had two major effects. One is emigration, and the other is stopping public demonstrations.

Since 2020, roughly 400,000 people have left Hong Kong, according to this data from the Hong Kong Immigration Department. But, if you calculate the cumulative number, net migration has actually started to decrease:

In other words, people are now moving back to Hong Kong. These could be individuals who avoided Hong Kong during COVID-19 and are now willing to return. They could also be people who changed their minds about living overseas, knowing that Hong Kong is a great place to make money. In the early 1990s emigration wave, many of those who left for Vancouver or elsewhere ultimately came back to Hong Kong.

While it’s certainly negative that hundreds of thousands of people have left Hong Kong, it’s not implausible that mainland Chinese immigration could make up for the shortfall. In fact, Hong Kong’s residential rents rose 8.1% in 2023 due to immigration from the mainland.

For now, the Hong Kong legal system remains reliable. The conviction rate for Magistrate’s courts in Hong Kong was 54% last year, far higher than mainland China’s 99.95%. This seems to suggest that Hong Kong judges are still independent. Hong Kong still ranks #23 in WJP’s Rule of Law Index, ahead of the United States.

Between Hong Kong and Singapore, the former remains a far larger financial hub. The aggregate market cap of Hong Kong-listed companies is 10x that of Singapore. Its assets under management are US$2.2 trillion – far higher than Singapore’s US$1.5 trillion. There are 2,000 licensed asset managers in Hong Kong vs just 1,200 in Singapore.

A key competitive advantage for Hong Kong is that its currency is freely convertible and pegged to the US Dollar. This enables the Chinese government and its companies to raise overseas capital while maintaining capital controls within mainland China.

It’s also the case that Hong Kong’s taxes are uniquely low:

The highest marginal income tax is 17%.
There is no capital gains tax.
There is no withholding tax on dividends or interest income.
There is no GST.
There is no estate duty.
There is no wealth tax.
There is a 15% tax rate on rental income but with a standard deduction of 20%.
Most import duties to Hong Kong are zero, making imported goods cheap.
The stamp duty for purchasing residential property is 15% for foreigners and 7.5% for locals, but this stamp duty could soon be removed.

For these reasons, the PwC and the World Bank recently ranked Hong Kong as the region with the most friendly tax system in the world.

The Hong Kong government remains committed to its low-tax policy. Hong Kong has agreed to implement a minimum corporate tax rate of 15% from 2025, but so has many other major economies. The budget deficit is projected to continue at over HK$100 billion in FY2025, but 3% of GDP remains modest.

While I don’t want to minimize the political shift that has taken place, for Hong Kong companies, it will be mostly business as usual. Hong Kong will continue to attract the ultra-wealthy through its low taxes, and it will continue to be used to raise capital for companies in China and beyond.

After Hong Kong’s zero-COVID policy was lifted at the end of 2022, the economy has actually been on a solid footing. Hong Kong’s retail sales grew +16% year-on-year in 2023, though remaining almost 20% below the peak in early 2019:

A major component in Hong Kong retail sales comes from tourism to Hong Kong, which is now back to around 70% of the pre-COVID level:

But don’t expect a full recovery in tourism spending. Before 2019, a large portion of Hong Kong retail sales to tourists comprised goods smuggled into mainland China. In 2021, China’s border controls tightened up significantly, and most of such business now occurs through legitimate channels. I wrote about such smuggling here.

One business that is booming is Hong Kong life insurance products sold to mainland Chinese visitors. Related premiums already exceed the pre-COVID-19 level, suggesting strong demand for USD-linked policies.

Hong Kong’s real GDP grew +4.3% in the fourth quarter of 2023. Hong Kong’s export growth has now turned positive at +11% year-on-year. The unemployment rate remains just 2.9%, suggesting that jobs are plentiful…

…What’s weighing on the Hong Kong economy is the interest rate environment. Since the Hong Kong currency is pegged to the US Dollar through a currency board arrangement, it effectively imports its monetary policy and interest rates from the United States…

…Now that HIBOR has reached over 4% borrow rates for households and companies remain above the nominal income growth in the economy. In my view, that means that monetary policy remains restrictive…

…Another longer-term worry is geopolitics. If a war were to break out in Taiwan or elsewhere, US sanctions could be imposed on Hong Kong. It could lose its special trade status. Import tariffs would be imposed, and it would be subject to the same export controls as China. If the Hong Kong Dollar were to be de-pegged to another currency. But as long as the currency remains freely convertible, Hong Kong will continue to retain its competitive advantage as a hub for raising overseas capital.

5. A beginner’s guide to accounting fraud (and how to get away with it): Part VI – Leo Perry

On 9th September 2018 serial entrepreneur Luke Johnson shared his experience and wisdom in an article in The Times newspaper titled ‘A business beginner’s guide to tried and tested swindles’. Five days later HMRC petitioned the High Court to wind up his business, the cafe chain Patisserie Valerie, for an unpaid tax bill. He didn’t notice. Unfortunately I didn’t either.

On 10th October Pat Val halted trading in its shares and suspended its CFO. It noted “significant, potentially fraudulent, accounting irregularities” that had materially impacted the cash position. I was familiar enough with the brand. I worked in an office a few doors down from one. It never seemed busy but there was nothing in the accounts that gave me good reason to think about shorting the company. But if I’d been able to now I would have, even at half the price it was halted at. The reason I was so confident it was screwed was precisely because I hadn’t spotted anything wrong in its numbers before (neither, apparently, had anyone else as there were no publicly disclosed shorts on the FCA list).

Pat Val’s published accounts were as straightforward as you’d expect from a simple business like a cafe. Sales taken in cash, not much held as stock and a few prepaid expenses. The only line items of any size on the balance sheet were the capitalised cost of fitting out stores, and money sitting in a bank. Not a lot to tweak if you needed to meet numbers. That’s why the company saying that its cash position was significantly misstated, while it was short on detail, had to mean that (probably) sales and (almost certainly) profit were faked. Working backwards there couldn’t really be any other story.

Things unravelled fast. The next statement from the company, later the same day, disclosed the winding up petition from a month earlier. The following day Pat Val said it couldn’t continue trading without a capital injection, which really amounted to saying the £30m of “cash” on its balance sheet wasn’t in the bank at all. And the day after that its CFO Chris Marsh was arrested. One trick (I should say allegedly, I guess) is depositing fat cheques just before year end – to show a big credit at the point in time when you know the auditor is going to look – only for them to bounce a few days later. Another is borrowing money – again giving a big credit to cash – and just not mentioning the debt part in the accounts. Most of the time that would still show up as higher interest payments (see e.g. Globo), but when rates are close to zero you can get away with a lot more.

Disclaimer: The Good Investors is the personal investing blog of two simple guys who are passionate about educating Singaporeans about stock market investing. By using this Site, you specifically agree that none of the information provided constitutes financial, investment, or other professional advice. It is only intended to provide education. Speak with a professional before making important decisions about your money, your professional life, or even your personal life. We currently have a vested interest in Alphabet (parent of Google), Amazon, Apple, Meta Platforms, Microsoft, and Netflix. Holdings are subject to change at any time.