A bit about me and the AI revolution

Categories: personal, AI, Computer Graphics, Virtual Reality, Robotics

Author: Germán Arroyo

Published: March 26, 2024

1980s: Microcomputers

I fondly recall my childhood visits to the mall, where I was enchanted by a myriad of “magical” machines. It was the 1980s, a time when microcomputers dominated the market. This period marked my first foray into computer science, a common initiation for many kids of my generation.

During that era, there was a noticeable surge in children pursuing coding education. While some were nudged into it by parents looking for productive afterschool activities, others, myself included, were drawn in by sheer curiosity and the joy of solving intellectual puzzles. Computers had indeed become fascinating toys, with BASIC emerging as the quintessential programming language of the time.

Those were indeed captivating times. Children weren’t just engaged in street games; they were also pioneers in the early video game console era, balancing their playtime between traditional outdoor activities and the nascent world of digital gaming platforms.

This marked the onset of the first digital revolution, as computers entered the homes of millions of families around the world.

1990s: Internet

Then came the 1990s, and a genuine turning point arrived. Previously isolated individuals formed communities through a now-ubiquitous network, and traditional markets expanded across diverse cultures, all facilitated by this extensive connectivity. To paraphrase Reg Presley, information was all around. This was a true revolution.

But before delving into AI, let me share a bit about my background. In the 1990s, I embarked on a Computer Science degree at a time when Internet access was largely confined to university labs. My university boasted a space known as the Open Access Room, a haven filled with computers and students eager to download a myriad of free online content. While many students were preoccupied with downloading aesthetically pleasing pictures of women on the newly released Microsoft Windows 95, I was drawn to a lone Sun workstation. This machine, often overlooked due to its incompatibility with Windows, became my portal to fascinating articles on Computer Graphics and Operating Systems.

Just a few short years later, I took my first courses in AI and Computer Vision. After our initial class, one of my closest friends exclaimed in disbelief, “What a sham! It’s all smoke and mirrors!” He obviously anticipated lessons on constructing counterparts to R2D2, C3PO, or HAL, only to discover the reality was far different. We were learning to create smart algorithms, primarily heuristics, a revelation emphasized by another AI course aptly named Algorithmic. It became evident to many of us that this title effectively encapsulated the essence of the artificial intelligence field at the time.

2000s: GPUs

As the 2000s dawned, I encountered my first Neural Network class, which, to my disappointment, was mired in confusion. Lectures were perplexing, and lab sessions felt esoteric. Understanding the practical applications of perceptrons was a real struggle. I vividly remember our difficulties in getting these systems to reliably recognize digits drawn on simple \(16\times 16\) pixel arrays.
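
For readers curious about what we were wrestling with, here is a minimal sketch, in Python with NumPy, of a single perceptron trained on flattened \(16\times 16\) binary arrays. It is purely illustrative: the data, labels, and learning rule parameters below are made up for the example, not our original lab code.

```python
import numpy as np

# Minimal perceptron sketch: classify flattened 16x16 binary "images"
# into two classes (+1 / -1). Purely illustrative toy data.
rng = np.random.default_rng(0)

def train_perceptron(X, y, epochs=50, lr=0.1):
    """X: (n_samples, 256) arrays in {0, 1}; y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified -> perceptron update
                w += lr * yi * xi
                b += lr * yi
    return w, b

def predict(X, w, b):
    return np.sign(X @ w + b)

# Toy data: random 16x16 patterns standing in for two digit classes,
# labelled by a linearly separable stand-in rule (pixel count).
X = rng.integers(0, 2, size=(40, 16 * 16)).astype(float)
y = np.where(X.sum(axis=1) > 128, 1.0, -1.0)
w, b = train_perceptron(X, y)
print((predict(X, w, b) == y).mean())    # training accuracy
```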

Nonetheless, Computer Vision algorithms were making notable advances. They began to reliably recognize moving subjects and their trajectories, as well as faces oriented in specific directions. This period also saw encouraging progress in Computer Graphics, highlighted by the arrival of 3D graphics cards. A standout memory for me was acquiring my first S3 graphics card, which significantly enhanced vertex transformation efficiency through optimized matrix multiplication.
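
To give a flavour of what those early cards accelerated, here is a small illustrative NumPy sketch of the core operation: transforming a batch of vertices in homogeneous coordinates with a single \(4\times 4\) matrix multiply. The matrices and vertices are invented for the example.

```python
import numpy as np

# Vertex transformation as one matrix product: the workload early 3D cards
# began to accelerate in hardware.

def translation(tx, ty, tz):
    M = np.eye(4)
    M[:3, 3] = [tx, ty, tz]
    return M

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    M = np.eye(4)
    M[:2, :2] = [[c, -s], [s, c]]
    return M

# Three vertices in homogeneous coordinates (x, y, z, 1), one per column.
vertices = np.array([[0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0],
                     [0.0, 0.0, 0.0],
                     [1.0, 1.0, 1.0]])

M = translation(1.0, 0.0, 0.0) @ rotation_z(np.pi / 2)  # rotate, then translate
print(M @ vertices)   # all vertices transformed in one matrix product
```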

Upon completing my bachelor’s degree, I secured a scholarship at an army research facility nestled conveniently in the heart of my university town. Assigned to devise an AI system to support humanitarian efforts, I delved deep into the literature. I proposed using neural networks to my manager and quickly became fascinated by the theory behind the idea. However, he wisely suggested that an expert system might be a more suitable approach: although we recognized the immense potential of neural networks, the limited computational power of the time ruled out a practical implementation. I will revisit this idea later. After careful consideration, we ultimately implemented an expert system instead.

A few years later, driven by the allure of GPU (Graphics Processing Unit) technology, essential for delivering high-performance graphics in gaming, I embarked on a doctoral journey in Computer Graphics. My inspiration stemmed from a desire to merge artistic abstraction with the technical intricacies of 3D image rendering. I focused particularly on direct volume rendering techniques, often employed in visualizing medical datasets like CT and MRI scans in three dimensions.

This journey led me to complete my PhD and take up the role of Lecturer, dedicating myself to various fields, with a primary focus on Cultural Heritage, given our research group’s involvement in several related projects.

By 2010, the influence of 3D technology, along with the rise of social networks, had permeated nearly every aspect of daily life. I distinctly remember advertisements attempting to promote 3D shaving razors (whatever that meant in the head of the marketing department), illustrating the smoke-and-mirrors sales pitch of marketing from that time. To paraphrase Reg Presley, 3D was all around. Was this a true revolution? I don’t think so, but it certainly changed everything in AI. I’ll also revisit this idea later on.

In 2017, I had the opportunity to spend time at Brighton University, a charming place with warm and welcoming people. During my visit, I engaged in conversations with a visiting colleague who was an expert in AI. It was during these discussions on neural networks and Non-Photorealistic Rendering (NPR) that I gained substantial knowledge about CNNs and Deep Learning. Impressively, my colleague also introduced me to Google’s DeepDream (Mordvintsev, Olah, and Tyka, n.d.), an image generation network (the first generative AI?) launched in 2015. The images it produced were… well… disturbing.

After returning home, I delved into the first image generation article from Microsoft (Zhang et al. 2017), and I suggested to my research group that we explore the potential of using AI for NPR applications.

Unfortunately, subsequent events unfolded, including health concerns and the emergence of COVID, significantly altering our lives on a global scale.

Nowadays

After spending a couple of years away from generative AI, I decided to catch up and was astonished by the progress made during my absence. Among my discoveries, DALL·E 2 stood out as exceptional. Coincidentally, my long-time friend—who had once exclaimed, “It’s all smoke and mirrors!”—reached out to recommend yet another revelation: GPT-3, which had been released just a few days earlier and was equally impressive.

Eager to reacquaint myself with the field, I devoted all my efforts to catching up. As I delved into the inner workings of these algorithms, I couldn’t help but notice their resemblance to concepts in GPUs and Computer Graphics.

With this background established, I’m thrilled to share my insights on the AI revolution we may be living through today. Recognizing that the preceding sections may have been lengthier than anticipated, allow me to swiftly transition to the main topic.

As I watch the news and witness the unfolding of what is often termed the AI revolution, it becomes apparent that this phenomenon is indeed ubiquitous. However, let’s not misunderstand; while there’s certainly promise in technological advancement, it raises the question: What does the term “AI revolution” truly entail? In reality, contemporary technological progress transcends the confines of AI alone. And this is key. A myriad of revolutionary technologies seem to intertwine and converge in unprecedented ways.

Just a few days ago, Jensen Huang unveiled the highly anticipated new NVIDIA processors. Newspaper headlines touted the promise of real-time AI for robotics, yet they seemed to overlook Jensen’s own words: “The soul of NVIDIA, the intersection of computer graphics, physics, and artificial intelligence, all converge at this moment in the name of that project, General Robotics. I know, super good, super good.”

To paraphrase Reg Presley, AI is all around.

My contention is that without the advancement of GPUs and computer graphics (or even videogames), deep learning as we know it today would be impossible. Without robust parallelization, tasks like training large language models (LLMs) would be infeasible. While AI algorithms have been around since the 1960s, it is only now that we can fully harness their potential. These advancements have the capacity to revolutionize not only AI but also computer science as a whole.
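
To make the connection concrete, here is a tiny illustrative sketch (NumPy again, not any particular framework) showing that the forward pass of a dense neural-network layer boils down to one large batched matrix multiply, essentially the same workload GPUs were originally designed to parallelize for vertex transforms. The sizes and initialization are arbitrary.

```python
import numpy as np

# A toy dense layer: H = ReLU(X W + b) for a whole batch at once.
# The heavy lifting is a single big matrix multiply, the same kind of
# operation GPUs were built to parallelize for graphics.
rng = np.random.default_rng(0)

batch, d_in, d_out = 512, 1024, 1024
X = rng.standard_normal((batch, d_in))          # a batch of inputs
W = rng.standard_normal((d_in, d_out)) * 0.02   # layer weights
b = np.zeros(d_out)                             # layer bias

H = np.maximum(X @ W + b, 0.0)   # ReLU(XW + b): one batched matmul
print(H.shape)                    # (512, 1024)
```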

So, are we in a new revolution? I think so, but this raises a more important question: Do people truly grasp the significance of these impending breakthroughs?

Until recently, generative AI has heavily relied on vast datasets, but recent advancements are reshaping this landscape. Now, data can be generated inside the immersive worlds built by the virtual reality industry. Innovations have led to ultra-realistic virtual cameras in software like Unreal Engine, coupled with physics simulations that run at several hundred frames per second, far faster than real time. These achievements lay a solid foundation for constructing lifelike models, thereby propelling the advancement of realistic robotics. Truly astounding progress!
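
As a purely hypothetical sketch of that idea, the loop below pairs a stub renderer (standing in for a virtual camera in an engine such as Unreal, whose actual API is not shown here) with randomized scene parameters, so every sample comes with perfect ground-truth labels for free.

```python
import numpy as np

# Hypothetical synthetic-data loop: `render_scene` is a stand-in for a
# virtual camera in a game engine; no real engine API is used here.
rng = np.random.default_rng(0)

def render_scene(params):
    """Stand-in renderer: returns a fake 64x64 image for the given parameters."""
    img = rng.standard_normal((64, 64)) * 0.1                 # sensor-like noise
    cx, cy = params["object_xy"].astype(int)
    img[cy - 2:cy + 2, cx - 2:cx + 2] += params["lighting"]   # the "object"
    return img

dataset = []
for _ in range(1000):
    params = {"object_xy": rng.uniform(8, 56, size=2),  # randomized object pose
              "lighting": rng.uniform(0.5, 1.5)}        # randomized lighting
    image = render_scene(params)
    dataset.append((image, params["object_xy"]))        # ground truth comes for free

print(len(dataset), dataset[0][0].shape)
```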

Now, let’s engage in a mental exercise: Envision a future where artificial intelligence enhances virtual worlds, which in turn refine AI capabilities, thereby further advancing robotics. These robots allow the construction of more capable machines that enhance AI, thus closing the loop. This perpetual cycle of improvement reflects the concept of self-replication, echoing the principles outlined in Von Neumann’s universal constructor theory (Von Neumann, Burks, et al. 1966). Now replace robotics with autonomously driven vehicles, 3D printers constructing machinery, machinery used to extract commodities, spaceships dedicated to space exploration, softbots automating tasks on the internet… you name it. Do you see my point? It sounds like the age of abundance.

In summary, the synergy between generative AI, virtual reality, and robotics promises a paradigm shift in how we interact with technology and perceive the world around us. Even though AI is all around us (or is it Computer Science that is all around us?), and we will probably still see plenty of cool advertisements for AI shaving razors, the convergence of every area of computer science on these new architectures, which will underpin our AIs the way GPUs do now, opens the door to transforming society as we know it.

The future looks promising, but innovation alone is not enough. We need to anticipate and address the ethical implications of emerging technologies, even those that do not yet exist, as developments are advancing rapidly, indeed exponentially, in accordance with Moore’s Law.

Anyway, one thing seems certain: fascinating times lie ahead!

References

Mordvintsev, Alexander, Christopher Olah, and Mike Tyka. n.d. “DeepDream: A Code Example for Visualizing Neural Networks (2015).” https://ai.googleblog.com/2015/07/deepdream-code-example-for-visualizing.html.
Von Neumann, John, and Arthur W. Burks. 1966. Theory of Self-Reproducing Automata. Urbana: University of Illinois Press.
Zhang, Han, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris N. Metaxas. 2017. “StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks.” In Proceedings of the IEEE International Conference on Computer Vision, 5907–15.

Creative Commons BY 4.0