Guest post by Chris Kern
What may be the next major development in the use of artificial intelligence to improve photography is currently the subject of a well-funded research program by major players in the camera and software industries. That, at least, is the word from a friend of mine who serves as a consultant to the project and who agreed to let me post a summary of the research as long as I don’t use his name or those of the participating companies. He and his colleagues call it SWIM, an acronym for 'Shoot What I Meant' (i.e., not necessarily what I actually saw in the viewfinder).
The essence of the technique is to use machine learning to train a neural network that optimizes the images made by a camera in real time. This involves feeding millions of pictures to the network until it figures out how to produce good ones. Prototype cameras that employ this technology have already been made available for testing to selected photographers under strict nondisclosure requirements. The software is computationally intensive and requires a lot of specialized supporting hardware. While the eventual goal is to make these techniques available in a cellphone camera, for now a much larger enclosure is required: something the size of a current full-frame DSLR.
Accordingly, the researchers arranged with a manufacturer of high-end cameras to build a few prototypes, which look exactly like the company’s top-of-the-line 'professional' reflex model but actually contain a mirrorless sensor, leaving room inside the body for the auxiliary data processing components and high-speed satellite uplink that are required to perform the image transformations. As a security measure, the prototype emits the sounds of a mirror-slap and a mechanical shutter, and a haptics module even imparts a little shake while the camera is simulating the mechanical actions of a DSLR, although of course the software completely neutralizes any blur caused by camera motion—and even subject motion, for that matter.
A module for sports photography eliminates the need for the 'spray and pray' technique commonly used when shooting fast-moving events and, consequently, reduces the time the photographer or photo editor spends culling a large number of only-slightly-different frames. The AI software, which needed a different training set of images for each sport during the machine-learning phase, analyzes the action and anticipates the precise moment a play will reach a critical point, then triggers the shutter once to capture it. According to my friend, a panel of seven experienced photo editors judged the system to be accurate 92.1 percent of the time for basketball, 89.5 percent of the time for international football, 84.0 percent of the time for American football, and 81.7 percent of the time for baseball. (A cricket module was abandoned because the researchers concluded that the machine-learning effort would never terminate.) In addition to anticipating the optimal moment to snap the shutter, the AI system uses selective focus to blur any extraneous players and automatically removes other distracting elements, in a manner that, I gather, is analogous to Photoshop’s content-aware fill.
The module for photojournalism is a derivative of the one for sports photography, but has been trained to anticipate the precise moment when a political figure or celebrity will strike the most awkward pose or display the most bizarre expression. Hit rates, according to my friend, were even higher than for the sports photography examples, and would have approached 100 percent except for a consensus among the experienced photo editors that some of the images were 'too disgusting for publication.' (My friend declined to describe these outlier images, saying that was sensitive proprietary information.) Conversely, the portrait module waits for the subject’s most flattering expression before triggering the shutter. The portrait module also optionally employs an anatomical-improvement feature to discreetly modify the subject’s features according to a menu-selectable attractiveness parameter.
A landscape module is still under development, according to my friend. In addition to making generic improvements to light, color, shadows, and scenic elements, it can be trained on images by famous photographers in order to emulate their style exactly. For example, my friend used the prototype Ansel Adams module to shoot 'Moonrise, New York, N.Y.' He made the photo at high noon on a sunny, cloudless day from a vantage point in Secaucus, New Jersey, and the software automatically (1) changed the perspective of the Manhattan skyline to match that of Adams’ famous photo, (2) introduced a rising moon and cloud bank in the appropriate locations and proportions, and (3) adjusted the lighting with a tone curve that is indistinguishable from that of the gelatin silver prints of his Moonrise picture that were made by Adams himself. Further machine learning along these lines has temporarily been suspended, however, pending a determination by the project’s legal consultants as to whether training a neural network to perfectly recreate the style of a dead photographer constitutes identity theft.
Now I suspect some of you reading this post think the introduction of this new AI technology will remove all the challenge, and therefore the satisfaction, from the process of making great images. I confess that was my initial reaction, too. But upon further reflection, I realized that possibility must be balanced against the potential advantages of these cameras for the experienced photographer. No need to get up before dawn to catch the perfect sunrise. Or to stand around for many hours, hoping for the emergence of a dramatic storm that never materializes. No more worrying about that irritating tourist who always seems to wander between your camera and the subject just as the decisive moment for a perfect street photograph arrives: yes, the camera will automatically recreate the scene as though the interloper had never occluded your view. And needless to say, your loved ones will be delighted with the way they look in your family snapshots after you crank up the attractiveness setting. Artificial intelligence holds out the promise of finally eliminating the frustration many of us feel when a day of shooting doesn’t turn out the way we had hoped.
In any event, it’s coming whether we like it or not. You can’t stop progress.
Chris
Here's Chris's Flickr page. No, really. —Ed.
Original contents copyright 2023 by Michael C. Johnston and/or the bylined author. All Rights Reserved. Links in this post may be to our affiliates; sales through affiliate links may benefit this site. As an Amazon Associate I earn from qualifying purchases.
Featured Comments from:
mark dannenhauer: "SYNC: The Truly New Breakthrough in Image Making
"Michael, Mikey, Mike, somehow I thought I knew you. Regular readings of TOP had me convinced that you were a hip guy. But today I read your posting of one Chris Kern’s post about SWIM. With all due respect, I can’t believe that anyone would advocate for such an antiquated platform, let alone that you of all people would publish it.
"SWIM at first glance seems cutting edge, a clever blending of old and new technologies. I could see it dominating the marketplace, except.... SYNC is newer, neater, just all around superior, especially for the price. Let me tell you why. (Dislcaimer: I have no financial interest in this breakthrough technology. I have joined SYNC solely on my own dime.)
"Some people assume SYNC stands for Shoot Your New Camera, as if your new 'camera' was super-hipster-system developed for the masses shooting fashion, sports, money, arts, landscapes, Austin Texas street scenes, etc. In fact, SYNC is an international open source effort for anyone and everyone everywhere to Shoot Your Now Consciousness.
"What’s revelatory, game-changing, about SYNC is that it eliminates hardware and software altogether, mostly. Let’s say you have an image in your consciousness. It could be a real life scene, or one by another photographer, or one by an artist, poser, or advertiser. Using SYNC’S patented processes, the image in your consciousness is automatically transferred to the consciousness of anyone with a SYNC subscription, complete with all original characteristics and desired modifications intact. Hardware, software, computer platforms…all things of the past, why not post about harness making, Leicas, or left-handed cameras for crying out loud.
"If you must have old world, and I do mean old world, old guard, passe, finito, what-century-are-you-living-in image transfer, there is an optional, hardware-based add on. (I did say, mostly without hardware.) SYNC sends your image to SYNC ADD—an add-on device where you can specify either a why-would-you-want-it paper print of your image OR a 3-D printer spit-out of a three-dimensional model of your image to your choice of scales. Animated 4-D versions of the device are in development.
"Introductory subscriptions to SYNC are available now via GoFunMe, today only. Subscriptions are of course valid only for the duration of your consciousness. Sign up now for the best, longest-lived deal!
"SYNC = 0 cameras/0 lenses/a lifetime of years. Your choice is clear. After all, it’s either SYNC or SWIM."
Mike replies: Why am I suddenly hoping for April 2 to get here? :-D
Stephen S.: "This may be an April Fools' Day joke in 2023, but given that Samsung phones already seem to use AI to generate better images of the moon when they detect you're trying to photograph it, there may come a day when much of this article is real."
JimF: "April Fools??? Probably not. Sounds possible to me. Funny what we classify as progress these days. Next step, Neuralink rose-colored glasses. Why see the world as it really is at all?"