+1 to Ken Lonyai, Sterling Hawkins, and Ryan Mathews. There are multiple issues here -- the efficacy and location of kiosks in general, and the value of voice technology in retail. Is voice hands-free and thus germ-less? Yes. Will it recognize every voice that enters a store? Yes and no -- it depends, of course, on your training set and your customers. Is it proper for kiosks? It's a place-purpose-customer question.
More important for retail in general: is voice a tool that -- in customer service, in operations, and from the consumer's kitchen -- can drive quantifiable value? Hell yes, a truth being demonstrated right now in different ways by the world's leading retailers as measured by revenue. That's why a number of retailers are now supporting The Open Voice Network, a new non-profit dedicated to developing global technical standards, ethical use guidelines, and quantifiable use cases for voice. If you'd like information on what we're doing, and why leading retailers are involved, call me at +1 503 449 4628, or email me at email@example.com.
Let's start by properly framing the issue. AI-enabled voice assistance today and tomorrow goes well beyond smart speakers; the smartphone (with a screen, folks) will be the primary vehicle for voice assistance. You'll use voice assistance in your car, and you'll use it to enter data into enterprise applications (see Salesforce's good work). We cannot and must not equate voice assistance with a piece of hardware.
Second, let's recognize that voice is in its earliest of days, days similar to the Netscape-IE browser war days. Earliest of days in development, in capabilities, in usage. It wasn't so long ago that 95% of America wasn't about to shop on the internet for many of the same reasons mentioned here -- fear of privacy violations, fear of data theft. Perhaps others also remember the dismissive comments applied to internet-based retailing -- "hell, it's no more than a store's worth of revenue," or "it's only for the nerds -- women will never use it." (Yes, I actually once conducted a study for Intel as to the potential of US women using the internet to shop. Fortunately, I concluded "yes.")
Third, as Ms. Petrock reported a few weeks earlier -- not noted above -- voice assistance has crossed the chasm in both availability and adoption. In the States, it's reached early majority-level use. Simple use, for sure -- but regular, active use nonetheless.
Fourth, we must try -- please, try -- to understand what voice can and can't do. The studies show that we can speak 3X faster than we can type. And that we can read 2X faster than we can listen. Which suggests a microphone, speaker, and screen. The voice developer community -- people who are making money by making money for others -- has already moved to the necessity of voice-visual (multi-modal) communication for commerce. The best ones roll their eyes at these types of discussions.
Fifth, let me echo the words of Paula and others. There is a significant issue of consumer trust. Much as there was with the internet. Much as there is whenever new consumer technologies emerge. It's trust in privacy, in data usage (voice is a biometric and a diagnostic), in how to use it.
Sixth, let's read the eMarketer report -- and not just the headline. The headline suggests that the annual increase in smart speaker purchasers is disappointing ... because it didn't reach the level forecast by eMarketer. Hmmm. Actually, the number of smart speaker purchasers increased 18%. Yes, eighteen percent. So -- which is more disappointing: the forecast or the reality?
Voice needs time. And voice needs standards, guidelines, the kind of governance that made the internet the world's greatest value creator. (There are none at present.)
This is a transformative technology. Coming at us slowly and inexorably, as the tide. Given that, we have the opportunity to make voice worthy of enterprise and consumer trust.
Contact me if you'd like to help.
The short answer on voice is yes. The good answer on voice is, of course, much more nuanced. The key factor will be maturity of voice functionality, and its ability -- as Paula points out, above -- to deliver shopper ease.
In the near term -- with current command-response functionality -- the leading driver of voice-based search for any retail (including grocery) will be visual-voice pairing. Voice-only search is, for now, far too kludgy. Current command-response functionality is also most appropriate (in grocery) for the replenishment of the core weekly basket ... with (most likely) an encroaching impact upon branded products. However, we're in the early days of voice functionality, and the current command-response UX will be a humorous anecdote in a few years. Take the Google Duplex demo, multiply by 2X or 3X over the next Y months, and now envision the potential impact of voice on grocery e-com, from search to post-sale service.
That being said: e-com grocery, as Lee points out, will make or break on operational consistency, and the retailer's ability to profitably manage the last mile, be it BOPUS or BOSFS. (See Capgemini's latest paper.) Voice -- with a maturing functionality -- will accelerate a base business that other factors will build.