Sony Interactive Entertainment Patent Dives Into Providing a “DLSS”-Like Solution for Better Picture Quality


It would appear that Sony's gaming division may be looking into image and performance improvement solutions along the lines of Nvidia's DLSS and AMD's FidelityFX Super Resolution, as its latest patent explores methods to improve image quality and fill in missing data via a combination of machine learning and computer-based implementation.

Filed this past April and published at the end of last month, the latest patent from Sony Interactive Entertainment details a potential solution for offering users better image quality.

Digital images can contain regions of missing or corrupted image data. Missing or corrupted regions are referred to in the art as “holes”. Holes are normally undesirable, and methods of inferring what information is missing or corrupted are employed to fill the holes. Filling holes in images is also referred to as image completion or inpainting.

A variety of processes exist for filling holes in images. Machine learning inference techniques, which rely on trained processes, can fill holes in images with high-quality results. However, machine learning techniques are performance intensive, requiring powerful computer hardware and a large amount of time.

Holes in images arise in image-based rendering systems. For example, where there are two or more images representing perspectives of the same environment, there may be no image data corresponding to an intermediate perspective that a user would like to see. Alternatively, there may be some image data missing from one of the perspectives. Machine learning processes may be used to infer the intermediate perspective and to infer the missing image data. Executing machine learning processes to obtain missing data is computationally costly and time consuming.

An example of an image-based rendering system is a virtual reality device displaying a virtual reality environment. A user wearing a virtual reality headset is presented, by two monitors in the headset, with a representation of a three-dimensional scene. As the user moves their head, a new scene is generated and displayed according to the new position and orientation of the headset. In this way, a user can look around an object in the scene. Areas of the initial scene which become visible in the new scene due to the movement are described as being previously “occluded”. The displayed scenes may be generated by computer hardware in a personal computer or console connected to the headset, or by a cloud-based rendering service remote from the headset. A rate at which image data is supplied to the headset is limited by bandwidth of the connection between the headset and the computer, console, or the cloud-based rendering system. Consequently, sometimes, not all the data required at a given time to entirely construct and display a scene is available due to bandwidth limitations or interruptions. Holes in the image data making up the scene are an undesired result and have a significant negative impact on the immersion experienced by the user.

It goes on to state that an advantage of this approach would be reducing the load on the hardware's processor.

This advantageously enables the hole to be filled quickly and efficiently, while increasing the likelihood of achieving a high-quality result. Nearby pixels having different material identifiers to the hole pixel are more likely to look different to the missing pixel data, than those with matching material identifiers. Therefore, using pixels with the same material identifiers advantageously reduces the computational burden on a processor, while more closely achieving an appropriately filled pixel. The determination of the average may include weighting values of the surrounding pixels to be averaged.

This enables some surrounding pixels to contribute more to the average than others, thereby advantageously increasing the versatility of the filling process according to the image being processed. The second filling process may include a machine learning inference process.

Machine learning inference processes provide high quality image filling results. By providing a machine learning inference process as the second filling process, advantageously an improved balance between speed and quality of image processing is achieved.
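To make the two-tier approach the patent describes a little more concrete, here is a minimal sketch of how it might work: a fast first pass fills each hole pixel with a weighted average of nearby pixels that share its material identifier, and only falls back to a (here stubbed-out) machine learning inpainter when no such neighbors exist. All function names, the material-ID map, and the inverse-distance weighting are our own assumptions for illustration, not details from the patent:

```python
import numpy as np

def fill_holes_fast(image, mask, materials, radius=2):
    """First filling process (as described in the patent quote): replace each
    hole pixel with a distance-weighted average of nearby non-hole pixels
    that share its material identifier."""
    filled = image.copy()
    h, w = mask.shape
    for y, x in zip(*np.nonzero(mask)):  # mask == True marks hole pixels
        vals, weights = [], []
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                        and materials[ny, nx] == materials[y, x]):
                    # Weight closer pixels more heavily (inverse distance --
                    # one possible weighting; the patent leaves this open).
                    d = max(abs(dy), abs(dx))
                    vals.append(image[ny, nx])
                    weights.append(1.0 / d)
        if weights:
            filled[y, x] = np.average(vals, weights=weights)
        else:
            # Second filling process: fall back to the (expensive) machine
            # learning inference -- stubbed here as a trivial placeholder.
            filled[y, x] = ml_inpaint_pixel(image, mask, y, x)
    return filled

def ml_inpaint_pixel(image, mask, y, x):
    # Placeholder for the patent's "machine learning inference process";
    # a real system would run a trained inpainting model here.
    return image[~mask].mean()
```

The point of the split is exactly the balance the patent claims: the cheap averaging pass handles most holes in a few arithmetic operations, so the costly ML model only runs on the pixels the fast path cannot fill.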

Some illustrations are provided to outline the flow of the process, along with a showcase of some before-and-after images.

It goes without saying that, this being a patent, it doesn't necessarily mean we will ever see it happen. However, the company is no stranger to AI-based technology, as its latest line of Bravia TVs, the Bravia XR, supports what is called "cognitive intelligence." They explain:

The way we perceive the world is based on information coming from our eyes and ears to our brain at the same time. Conventional AI can only detect and analyze elements like color, contrast and detail individually. Cognitive Processor XR can cross-analyze every element at once, just as our brains do.

To create this closer-to-reality feeling, Cognitive Processor XR divides the screen into hundreds of zones and recognizes individual objects in these zones better than ever before. What’s more, they can cross-analyze around a few hundred thousand different elements that make up a picture in a second, the same way that our brains work.

What cognitive intelligence looks like:

Of course, this pertains to their televisions. However, with the growing support for DLSS and AMD's FidelityFX Super Resolution, we don't think it's at all far-fetched that Sony would be developing its own solution. Guess we'll just have to wait and see what the future holds!