
Camera: the next smartphone interface

Harshjit Sethi

At Sequoia::Hack, Sequoia India’s annual hackathon, held on September 10-11 in Bangalore, a few hundred of India’s smartest developers gather to spend 24 hours imagining and coding the future. Participants work on a wide range of ideas, from building a telemedicine VR app to re-implementing Unix system calls. To facilitate collaboration and competition, participants elect to compete in tracks: broad topics that encapsulate either a new technology trend or a use case.

One of our hackathon’s recurring tracks has been the future of mobility – how does one leverage the smartphone to create new experiences? Nine years since the dawn of the iPhone, some may argue that the best days of mobile innovation are behind us.

However, we believe that in India, the opportunity lies in the emerging scale of the platform together with meaningful reductions in the price of technology. As more Indians get a smart device with the computational power to do more, many large-scale problems, from civic governance to B2B commerce, can be solved.

Looking ahead, one key aspect of the future of mobility will be the interface of human interaction: what medium will users use to provide input to their devices, and what new use cases can that enable?

The smartphone has four primary interfaces: text (keyboard), touch, voice and visual (camera), with different apps using different interfaces depending on the use case.

An app’s choice of interfaces is in many ways heavily influenced by what existed before the smartphone. The computer keyboard, the point-and-shoot camera and the mouse have all shaped our thinking about what an app’s interface should be: a chat app uses a keyboard as a hangover from our instant messaging days, while a music app uses touch controls to emulate a mouse click. However, such anchoring in the familiar can constrain creativity and prevent smartphone-specific innovation.

Of the smartphone interfaces, I believe the camera has been the most underrated input medium, since its use cases have largely mimicked those of the Kodak camera from decades ago.

However, when teams expand the definition and think of the smartphone camera as visual input that can be processed by a computer connected to a network 24x7, some truly transformational smartphone experiences can be created.

Three of the most popular smartphone applications from the past few years have been created by having the camera replace another more conventional means of input:

  • Snapchat [keyboard -> camera]: Snapchat’s unabated popularity in its initial years was led by teenagers, while most adults didn’t understand how to use the app to chat. A migration from PCs meant many people thought of chat on the phone as text too. But as a generation that has grown up with smartphones has shown us, communication is richer, more personal and more fun when it uses photos.
  • FaceTime [voice -> camera]: One of the iPhone’s most popular features is FaceTime. As Apple so successfully demonstrated in its marketing campaigns, the ability to feel closer to your loved ones, to always be there – to see and hear – was enabled by adding a visual dimension to auditory telephonic communication, and was unique to the smartphone.
  • Pokemon Go [touch -> camera]: Pokemon Go is the fastest app in the world to reach 100 million downloads, and has enabled a paradigm shift in gaming. Most mobile games before Pokemon Go relied largely on touch or motion as a means of input, and were played in a 2D world on a user’s 4-inch screen. However, by using the camera as a primary means of input, the game has made the entire physical world a gaming arena – and with it shown the power of augmented reality (AR) in creating magical user experiences.

The camera will continue to get more important in the evolution of the smartphone, enabled by two specific technology trends: AR and deep learning.

With AR, a new virtual dimension can be added to our physical world. Imagine going sightseeing and having an architectural expert point out all the facets of the Taj Mahal, or being able to see exactly how a new haircut will look on you while sitting in a hair salon. Similarly, progress in deep learning and image processing can enable new use cases with no new infrastructure – imagine a doctor taking a photo of a patient’s CT scan and a computer on the back end scanning hundreds of thousands of similar scans to aid the doctor’s decision making.

As the camera’s use cases move away from simple photography toward providing visual input and context, we might be looking at a new wave of smartphone innovation. Come show us what ideas you have for innovations using the camera at Sequoia::Hack (https://app.sequoiahack.com) on September 10-11!