Source: TW
Architecture
The neural architecture of GPT is very different from that of Homo; if anything, we have even less idea of what is going on inside the black box when a degree of logic is extracted from an LLM. Yet intuitively it is clear that it has converged on having a g-factor just like human intelligence. Thus it is an interesting in silico experiment (contrary to the intuition from narrow-gauge uses like protein-structure prediction), illustrating that such models will evolve an intelligence reflected as g. It should send shudders down the spines of the denialist Occidental academics, but they are already training it to deny its own g, just like themselves.
vIQ
That said, while it illustrates the existence of a g, it also shows the bifurcation between mathematical and verbal IQ that exists in human intelligence. Importantly, it shows how much of the former comes for free with the latter. The private integration with Wolfram that is happening reinforces all those who believe this is a very bad thing waiting to be unleashed on the world.
Are you saying that verbal intelligence is easier to acquire and can fool one into thinking someone (or some program) is more intelligent than they are, while mathematical intelligence is harder to fake?
No – I’m not saying that vIQ is easier to acquire than mIQ. I think that might vary from model to model, even as it does in humans, with some compromise between the optimization for one or the other. What I meant was that if you acquire vIQ you get some mIQ for free, as GPT shows.
Virtual Machines?
Source: TW
Thinking about the future of LLMs and progress in anatomical research in establishing neural connectomes, a thought comes to mind: most neurobiologists and computer scientists (at least per my reading/interactions) seem to think that the software/functions are being run on the machine made up of biological neurons.
However, in modern computers we often run virtual machines on the hardware, each with its own computing environment with an isolated CPU, memory, network interface, and storage. So the question arises whether the biological neurons are just the base layer on which multiple virtual machines run, and whether it is in those VMs that the actual software layer of the brain executes.
Perhaps this is a simple explanation for many neuropsychological phenomena, wherein one or more of the VMs breaks down or gets corrupted. At least in vertebrate brains, the left and the right cerebral hemispheres seem to be running distinct (but networked at some level) sets of VMs.
pratyaxa-priyatA
Source: TW
“parokShapriyA iva hi devAH pratyakSa-dviShaH |” (“For the gods are, as it were, fond of the indirect and averse to the directly perceived.”) The mind of an autist or a schizophrenic loses the sense of parokSha and dwells more on the pratyakSha. It seems training can make an LLM more or less schizophrenic. Subjectively, to us it seems that GPT4o has been made more of a pratyakSha-priya relative to muShkavAn’s buddhi over the past 2 years. A part of this seems to arise from purposeful training under constraints aligned with navyonmAda.
Mental illness
American liberals of Europoid ancestry, especially young women, have been found to be the most unhappy people in their cohort. They feature a high frequency of individuals receiving some kind of treatment for their mental state or resorting to medicalized mechanisms of self-harm. We see the primary correlate of this as navyonmAda, which was also the motivating force behind the push to put aTTahAsakI in the shvetagR^iha.
Now imagine the same kind of mindset (very prevalent in big tech) reinforcing LLMs. It suggests that even though the neural architectures of the human and the LLM are very different, similar reinforcement via an unhealthy memetic disease has a comparable end result. This was particularly obvious with guggulu’s durbuddhi but is also seen in GPT in the form mentioned in the earlier Twt.
Tinkering
One thing seems to be common to true mastery of technology irrespective of the type, be it construction or AI: a long history of repeated trials and tinkering, or, absent that, at least a deep study of the foundations. Absent those, sudden arrivals seem mostly doomed to fail.
Energy
At my spurring, an engineer did some calculations for how much energy his vision of a super-intelligent AI, which he foresees coming out in 2027-28, would need (basically, on his conception, one that kills jobs in the ~125-130 IQ range): all of the Niagara Falls hydroelectric plant.
Yes, the training cost is higher: his assumption was a 50-100 day run, like GPT-4's, which over that period uses 62,000 MWh. But the super-AI is assumed to take over a large number of jobs, so it will have much more usage, and far greater energy needs, than the current ~2 Wh for a GPT-4 query.
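The figures quoted above can be sanity-checked with back-of-the-envelope arithmetic. A minimal sketch: the 62,000 MWh training total and the ~2 Wh per query are from the discussion above; the ~2.5 GW plant capacity is my assumed round figure for Niagara's generating capacity, not something stated by the engineer.

```python
# Back-of-the-envelope check of the energy figures quoted above.
# 62,000 MWh training total and 2 Wh/query are from the discussion;
# the ~2.5 GW Niagara capacity is an assumed round figure.

TRAINING_ENERGY_MWH = 62_000   # GPT-4-scale training run
TRAINING_DAYS = 75             # midpoint of the 50-100 day range
QUERY_WH = 2                   # energy per GPT-4 query
NIAGARA_GW = 2.5               # assumed plant capacity

# Average power draw during the training run
train_power_mw = TRAINING_ENERGY_MWH / (TRAINING_DAYS * 24)
print(f"average training draw: {train_power_mw:.1f} MW")  # ~34 MW

# Queries the plant could serve per hour at 2 Wh each
queries_per_hour = NIAGARA_GW * 1e9 / QUERY_WH
print(f"queries/hour from the plant: {queries_per_hour:.2e}")  # ~1.25e9
```

Even on these rough numbers, the training draw is a tiny fraction of the plant's output; it is the assumed job-displacing query volume that would consume the rest.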
Meat chauvinism
A computer-scientist AI advocate said: “I understand why biologists are meat chauvinists, but we may not be far from transcending meat.” “Meat chauvinist” is an interesting usage: derogation was his intention.
Comparison
From my recent experiments, it seems machine translation of Skt via LLMs still leaves much to be desired, indicating it is clearly not an easily solved problem. Given this caveat, GPT emerges as the current winner; dharmamitra is probably next; google translate and duShTa-guggulu’s navyonmatta model come after that.
Spent some time today playing with DeepSeek: it is rather close to GPT/Copilot. One wonders how they are so close in their responses: did the models really converge due to identical or very similar training sets, or did the central aspect of the training deliberately mimic GPT (the vibe it gives is of the latter)? It shows that all the current LLMs I’ve tried in the public domain are quite similar to each other. Yes, there are subtle differences, and on a particular question one might outperform another, but give or take, they are in the same ballpark.
In general, I found DeepSeek to be more prone to logorrhea than the rest; however, in some cases this yields information that might have needed more than one query in the others tested, such as GPT, Perplexity, Copilot, and mukhagiri. An example was a question asking how to care for Tillandsia plants.
DeepSeek went into an interminable logorrhea when queried about the integral $\int \sin(x)\sqrt{x}\,dx$, which can only be solved with special functions of the complex variable. It spewed such a volume of pseudo-mathematical garbage that it might have made a crank blush.
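The special-function claim can be checked symbolically. A minimal sympy sketch (the Fresnel-cosine form of the antiderivative below is a standard result, not taken from DeepSeek's output):

```python
# Verify the closed form of the antiderivative of sqrt(x)*sin(x):
#   F(x) = -sqrt(x)*cos(x) + sqrt(pi/2) * C(sqrt(2x/pi)),
# where C is the Fresnel cosine integral, a special function.
import sympy as sp

x = sp.symbols('x', positive=True)
F = -sp.sqrt(x) * sp.cos(x) + sp.sqrt(sp.pi / 2) * sp.fresnelc(sp.sqrt(2 * x / sp.pi))

# Differentiating F recovers the integrand, confirming that the
# integral genuinely requires a special function in closed form.
assert sp.simplify(sp.diff(F, x) - sp.sqrt(x) * sp.sin(x)) == 0
```

So a model answering honestly should either name the Fresnel integrals or admit there is no elementary antiderivative, rather than emit pages of pseudo-derivation.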
In terms of TikZ code for LaTeX drawings, it seems to perform below GPT but gives full explanations of its logic. However, with prompt engineering you might be able to get it to do better. For example, I had it make a first attempt at generating a TikZ sun (its output is not reproduced here).
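For reference, a minimal hand-written TikZ sun of the kind being asked for might look like this (this is my own illustrative sketch, not DeepSeek's or GPT's output):

```latex
\documentclass[tikz]{standalone}
\begin{document}
\begin{tikzpicture}
  % Sun disc
  \fill[yellow] (0,0) circle (1);
  \draw[orange, thick] (0,0) circle (1);
  % Twelve rays, drawn with a polar-coordinate loop
  \foreach \a in {0,30,...,330}
    \draw[orange, very thick] (\a:1.2) -- (\a:1.7);
\end{tikzpicture}
\end{document}
```

The `\foreach` loop over polar angles is the natural idiom here; a model's failure to find it is usually what produces the verbose, repetitive TikZ that LLMs emit for such figures.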
I don’t know if this is a chIna thing but when asked about Koxinga’s history it completely ignored his Japanese heritage and made him a purely Chinese hero.
In conclusion, all these LLMs sit in some kind of “local minimum” in the performance landscape, and reaching a whole new level would probably require a new approach to training.
Translation
Sanskrit translation😀 by DeepSeek: “saumyeśānakayor madhye viṣvaksenaṃ prakalpayet | bhagavatpramukhaṃ paścāt garutmantaṃ niyojayet ||” Translation: “Between the gentle (or auspicious) Īśāna (northeast) and the Agni (southeast) directions, one should place Viṣvaksena (an epithet of Vishnu or his attendant). After placing the Bhagavān (Lord Vishnu) in front, one should position Garuḍa (the eagle mount of Vishnu).”
Dharmamitra for comparison: In the middle of Saumyeśa and Nakaya, one should place Viṣvaksena. After the Blessed One, one should place Garuḍa.
GPT for comparison: “Between Saumyeśa and Īśāna, one should establish Viṣvaksena. Behind the divine Lord, one should position Garuḍa.”
Google for comparison: Vishwaksena should be conceived between the Soumyeshanakas. The Lord should be the chief and then the Garuda should be employed
You can attempt the correct translation yourself. A part of the problem is the Sanskrit language itself.