AIs on the Prize: Unleashing the Power of AI’s Sight

Breakthroughs in computing power have bestowed upon computers the remarkable ability to “see.” This transformative development encompasses not only the capacity to recognize and interpret images but also the ability to generate entirely new visual content. Previously, computers could only index images based on textual descriptions provided by humans. For instance, an image might be labeled as containing a cat or a bright red truck, allowing it to appear in search results for those terms. But computers had no comprehension of the images themselves – they couldn’t truly “see.” This limitation is exemplified by Captcha challenges, which ask users to identify images containing specific objects, like street signs, to prove they aren’t robots – a nuanced task traditional computers were incapable of performing.

Enter generative AI. With this groundbreaking technology, computers can now genuinely perceive visual content, both photos and videos. It is unlocking vast new capabilities such as autonomous driving and robotic surgery, and scientists are hopeful it may one day bring sight to the blind. The proverb that a picture is worth a thousand words takes on new meaning in the context of data, where images and videos contain millions of pieces of information. The exponential complexity inherent in visual data illustrates the immense computational challenges that have only recently been overcome to enable this nascent capability. Computers are no longer merely crunching numbers or text; they are learning and reasoning based on visual inputs – much as we rely on our eyesight.

Computer vision is reshaping healthcare, taking robotic-assisted surgery to the next level. Surgeons already benefit from robotic systems equipped with advanced cameras and sensors that allow them to perform procedures with enhanced dexterity and less tissue damage. However, researchers at Johns Hopkins University have taken a significant leap forward by completely automating a procedure. In an experiment performed on a pig, with no human intervention, a robot carried out “one of the most intricate and delicate tasks in surgery: the reconnection of two ends of an intestine.”1 Remarkably, the results were deemed significantly better than those of a human surgeon performing the same procedure.

Computer vision is also proving instrumental in the early detection of diseases, particularly cancer. While a human pathologist or radiologist can process only a few images per minute, a computer can analyze thousands or even millions.2 This not only boosts the volume of images that can be examined daily but also improves detection by pinpointing microscopic anomalies the human eye can miss.


Until recently, computers lacked the ability to comprehend and compare diverse datasets, such as CT scans of millions of patients’ lungs. With generative AI, a computer model named SYBIL was trained on massive datasets of patient scans and taught to predict how likely a person is to develop lung cancer in the near future. It was able to predict cancer with 94% accuracy before a tumor even appeared.3 Technology like this could dramatically alter the future of lung cancer, the leading cause of cancer death worldwide.

The fusion of AI’s rapid computational capabilities with video processing is revolutionizing threat detection and security. Whether it’s the US Air Force employing the technology to safeguard airspace or law enforcement agencies utilizing facial recognition, AI can now analyze video feeds in real time with remarkable accuracy. For example, an AI-monitored security camera can flag suspicious activity and alert nearby officers to investigate. Moreover, AI’s ability to interpret behavior patterns and gestures enables it to predict potential threats – such as identifying signs of agitation or hostility – enhancing proactive security measures.


The real-time data processing capabilities of AI are now helping the visually impaired “see” their surroundings. Microsoft, a key investor in OpenAI, is pioneering this effort with its Seeing AI app, powered by GPT-4. By describing objects and environments to users, this innovative tool provides invaluable clarity about the world around them. While it doesn’t grant the gift of sight, it represents a significant leap forward in accessibility and inclusivity. The ability to interpret signs, menus, and even facial expressions is a transformative breakthrough for the blind, enhancing their independence and enriching their daily lives in profound ways.

In fact, computers aren’t just learning to see. They are learning to listen, understand, detect emotions, and converse in ways similar to humans. It is fascinating to watch computers become more human-like and, in many ways, more useful to humanity. Computers have never been able to see the world through human eyes, and likely never will. But they can now “see” – in the sense that artificial intelligence systems can interpret and understand visual information. This newfound sight enables AI to provide valuable insights and help solve complex problems that require visual understanding. It is an incredible leap forward for technology, of a magnitude we’re only beginning to grasp.

For more content by Jake Bleicher, Portfolio Manager, click here.


