Does Google Now Really Need All Your Data?

On June 17th, 2015, I discussed Google Now, Apple’s Proactive and their respective approaches to obtaining your data. My main thesis was that, as far as I could tell, the predictive assistant functions that most people come up with seem to be perfectly possible even with Apple’s approach. In conclusion, I said the following:

Although I certainly need to dig into this in a bit more detail, I am skeptical that invading your privacy is essential for providing a better personal assistant service. I would welcome any examples where the personal assistant must absolutely send all knowledge of everything about you to servers in the cloud to be analysed.

Phil Schiller made some comments in an interview with John Gruber which indicate that Apple thinks the same:

If ever there’s a modern definition of a Faustian bargain, this is it, right? Which is, that if you want to get the features, give us all this information about your life that you’d really rather not.

And we’ve believed for a very long time that that doesn’t have to be the case. And so we’ve built systems and processes all around the idea that, in order to help users, you can do things that are surprising and delightful and magical—but we don’t know your data.

So now the fight is on. On one side, we have Google, which suggests that they need all your data to provide you with wonderful predictive assistant services (I haven’t actually seen anybody from Google say that, but it seems to be what the pundits are collectively thinking). On the other, we have Apple, which believes that they don’t need all this data. Essentially, Apple is saying that there is an upper limit to what they need to know, and that limit is actually very low. It will be interesting to watch how good Apple’s Proactive turns out to be versus Google Now.

Of course, Google and Apple have very different business models and hence the business requirements for their predictive assistants are different. Google’s business is advertising, so they need enough information to target you with ads. That might require much more personal information than what is required for just providing a predictive service. For example, if you’re just thinking of going to town, all that a predictive assistant needs to do is to give you the directions, time, lunch suggestions, etc. This can all be done anonymously. However, if Google needs to send you ads on behalf of advertisers, then the more they know about you, the higher the price at which they can sell you to advertisers. Advertisers love targeting information and would pay for a detailed profile of whom they are targeting. For example, they would love to know if you are married or have kids. They would like to know what kind of food you eat regularly. They may even like to know if you are fit or overweight. None of this is relevant for the predictive assistant task itself, but it is relevant for the ads given through the assistant. And in Google’s case, these ads are what financially support the service. Essentially, Google Now needs more personal information because Google needs to finance the service through ads. Even if a predictive assistant didn’t need this data to give users advice, Google would still need to collect data on behalf of the advertisers in order to sustain the service financially.

  • Interesting point and I tend to agree, at least in the near term.
    Longer term, if these services get really smart, theoretically a lot of that data could also be relevant, e.g. if I say “Siri/Google, where should I go for lunch?” the machine learning models certainly can take advantage of data like age, weight, marital status, having kids or not, foods I’ve mentioned in my emails, restaurants where my Location has been detected, etc. And so with all those factors it could come back with your Top 3 Suggestions list and be much more accurate than Yelp’s useless list of 50 restaurants nearby in some dumb order.

    However, while in some areas Google’s AI efforts seem to be providing very direct value (e.g. Google Photos’ impressive object recognition), I’ve never heard anyone claim that Google Now is anywhere near that level of helpfulness.

    But even in my scenario, I don’t see why Apple couldn’t use all the same data, just with an anonymized identifier or by summarizing your activity.

    Have you used Google Now? I had a Nexus 5 for a month last year. I think UX-wise Now has some potential benefits over Proactive in that each “information item” is a full-screen Card, so e.g. the Weather card, or the Your Baseball Team Update card, can have a much nicer look and more info than in Proactive, where items appear to be more like “rows” of info, with 5-6 rows showing on the screen at once.

    But other than that the scenarios seemed to be fairly basic e.g.
    – bring up your boarding pass or alert you to a gate change. This already worked years ago via Passbook and airline apps. Probably Now overall is a bit simpler e.g. not having to have or log into your airline’s app.
    – tell you that you need to leave for a meeting early due to traffic. iOS 8 and especially iOS 9 cover this also, and all reports I’ve seen from Now users say this is still a bit hit-or-miss
    – Sports scores, maybe if your team scores it gives you an alert etc. Again this is something the ESPN app et al certainly do on iOS
    – Nearby Restaurants, but not very personalized or accurate AFAIK

    I imagine they’ve released some improvements since last year, and with Android M they’ve opened up Google Now Cards to 3rd parties. But browsing a bunch of cards seems fairly boring, and most apps don’t have real-time or intelligent aspects that would give you actually useful information most of the time.

  • Kenny

    Does your secretary need to know your name, your birthday, all your appointments, and whether you have a wife or children, to be a good assistant?

    So basically you just repeat what Apple says, because it reflects your narrative, without any substance or context?

    Besides, I enjoy reading your blog!

    • Thanks.

      My skepticism of machine learning actually goes back more than 10 years when I was studying the application of computing to analyse genomics data and the interaction of biomolecules. Although I am certain that the algorithms and raw processing power have greatly improved since then, conceptually the methods still seem to be quite similar.

      Interestingly, the failure of computer algorithms alone in biology forced researchers to take a more manual approach with human curation, and it was this approach that enabled genomics research to be automated. That is to say, computers themselves could not create a solid enough foundation upon which you could base high-throughput automated analysis. Only human curation could build a solid enough foundation.

      As such, I strongly think that it is not the collection of data behind your back that is key. Instead, what’s important is for users to trust you with data and voluntarily provide high quality information. This won’t happen without trust.

      • Kenny

        It’s okay to be skeptical of machine learning, but you must also understand that the value of something should not be measured on the basis of what it is now, but on its potential.

        We are not even close to the potential of AI, hence trying to discredit it based on some PR talk from Apple or a couple of examples where humans can do better is pointless.

        The goal of AI is to make humans much more efficient, not to do better than us, hence the more it knows about you, the more efficient you will be.

        • Of course.

          At the same time, you have to wonder why, for example, Google’s massive machine learning engine still struggles to distinguish humans from gorillas. Current AI uses methods that are significantly different from how humans and other animals learn and think, and it still struggles on tasks that any 5 year old could easily accomplish, with several orders of magnitude less data input. AI has been hyped before and machine learning itself is a very very old concept. I tend to think that unless there is a breakthrough in how AI is approached (instead of the current statistical model), AI will continue to struggle.

          The way I see it, current AI is good for tasks that don’t need to be accurate; tasks that can delegate the final decision to a human.

          AI is good at surfacing a list of possibilities (e.g. a list of search results). It is however not good at narrowing down the list to a single item and answering yes or no (e.g. is it a human or a gorilla?).

          There are some interesting ways in which this might be improved, but I doubt that letting machines simply track you is the answer. You have to realise that Google is already sucking up orders of magnitude more data than even the most intelligent people on earth have access to, but still fails on some very basic tasks. You have to ask: why does it fail?

          • Kenny

            Again, you’re making grand statements based on errors instead of extrapolating from the capacity, progress and potential of today’s AI.

            Today’s AI should not be compared to adult human ability unless you do not understand how complex the human brain is, how complex its training process is, or simply how many errors we humans make during our own learning process before reaching an absolute understanding of something, which even then can still be erroneous.

            Even though we are very far away from a computer that can actually learn by itself or understand all human context, today’s AI is already effective at narrowing down a list to a single item or answering a yes-or-no question; it all depends on what kind of question you ask.

          • Even though we are very far away from a computer that can actually learn by itself or understand all human context, today’s AI is already effective at narrowing down a list to a single item or answering a yes-or-no question; it all depends on what kind of question you ask.

            That is exactly my point.

            AI is still very immature and it only gives you good results for specific questions. This is true even if you feed it massive amounts of data and you process it using huge computers. There are still fundamental concepts and techniques that current AI lacks to be able to provide the intelligence that humans would normally take for granted. Of course it is very possible that AI will improve dramatically in the near future. However, if you look up machine learning in Wikipedia, you will learn that it’s a very old concept that did get a significant amount of attention in the 1990s, but didn’t get very far until a recent breakthrough. The science behind machine learning has not improved in a linear fashion, but only sporadically. This is often how other sciences progress too in general. Therefore, it is difficult to predict whether AI will truly improve in the mid-term, and it is just as likely that it will stall again at some point.

            Instead of simply imagining the possibilities of AI, I believe that it is prudent to take a careful look at what AI is good at, and what it is bad at. AI’s strengths and weaknesses are very different from our brains.

            Consider again: how many pictures does a child have to see to be able to distinguish humans and gorillas with 100% accuracy? I would say maybe fewer than ten. Google’s AI engine must have seen hundreds and thousands of gorilla photos. The amount of data is probably not the key to learning, and similarly, I doubt that Google’s AI will be much better than Apple’s AI simply due to the difference in data volume.

          • Kenny

            You seem to be confusing the technology concept and the learning system with the actual product.

            The AI concept may date back to 1960, but actual functioning machine learning systems are quite new.

            What you fail to understand is that the hardest part of AI is not answering questions but understanding the contextual differences between you, me, or anyone else in relation to our needs, our approach, and our feelings at any given time.

            Just as the more you travel and interact with other people, the better you can understand and articulate the contextual differences between them, the same is true for AI: the more data you give it to learn from, the better it will be able to understand and articulate the differences between the contexts of an object, a human, an animal, its surroundings, etc.

            A child can easily distinguish a gorilla from a human with 100% accuracy only after learning the difference from us through a transfer of knowledge. Today’s AI cannot do that because it is in the process of learning how to do it from a different perspective.

          • Are you sure you know what kind of AI is being researched right now and how it works?

          • Kenny

            I am aware of the concept, its potential and the way it is currently developing.

            Feel free to enlighten me

          • Are you aware of what kind of problems it is being applied to? Wikipedia gives a list:

            https://en.wikipedia.org/wiki/Machine_learning#Applications

            Although there are some interesting applications which require very deep learning and analysis of a certain field, none of them attempt a common-sense, multi-faceted approach to understanding human beings. Maybe we will see some in the future, but at least for now, we seem to be training computers to be, at best, socially awkward nerds who have immense understanding of a certain area of expertise, but very little common sense.

            Importantly, look at how these AI engines learn. The way they learn is very different from how humans or even lowly creatures learn. They learn by statistics, not by reason, empathy or logic.

            I would say that current machine learning techniques are extremely crude and primitive compared to humans. They are trying to compensate for this with brute force, but it is not clear that this is at all possible.

          • Kenny

            Are you aware of all the research around the DeepMind protocol and the many other learning systems in progress as we speak?

          • Of course. At least if you are referring to http://deepmind.com/publications.html

            Did you read the titles and summaries of their publications? For example, “Teaching Machines to Read and Comprehend”? I work in science, and computational analysis of the vast number of human-written research articles is a huge issue. I can reiterate that:

            Teaching machines to read natural language documents remains an elusive challenge.

            That’s where AI is today, and that’s where it’s been for decades. There have been significant improvements, but don’t hold your breath just yet.

            As I understand it, the current state of the art for computational analysis of human-written documents is not to use machine learning, but to use keywords and sentence structure to gain a crude understanding of what a document is saying. A simple example would be: if the words “eat”, “pig” and “flower” appear in the same sentence, then the computer can gain the knowledge that pigs eat flowers. Although this is a very crude and error-prone understanding, if you have a large number of documents which should agree in aggregate, then it will suffice most of the time. Conversely, I have yet to come across any attempt to carefully comprehend the precise meaning of each sentence. The DeepMind examples do not seem to be any different, although they do show improvement over previous methods.
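
To make the keyword idea above concrete, here is a minimal sketch in Python. It is only an illustration of sentence-level co-occurrence counting, not a description of any real text-mining tool. The term lists (`ANIMALS`, `VERBS`, `FOODS`), the function names and the `min_support` threshold are all hypothetical and chosen just for this example.

```python
import re
from collections import Counter

# Hypothetical term lists for illustration; a real system would rely on
# large, human-curated vocabularies.
ANIMALS = {"pig", "cow"}
VERBS = {"eat"}
FOODS = {"flower", "hay"}


def normalize(word):
    """Very crude stemming: lowercase and strip a trailing 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def extract_facts(documents):
    """Count (animal, verb, food) triples whose words share a sentence.

    Word order and grammar are ignored on purpose, which is exactly
    why the result is crude and error-prone.
    """
    counts = Counter()
    for doc in documents:
        for sentence in re.split(r"[.!?]+", doc):
            words = {normalize(w) for w in re.findall(r"[A-Za-z]+", sentence)}
            for animal in words & ANIMALS:
                for verb in words & VERBS:
                    for food in words & FOODS:
                        counts[(animal, verb, food)] += 1
    return counts


def confident_facts(counts, min_support=2):
    """Keep only triples seen in at least min_support sentences."""
    return {fact for fact, n in counts.items() if n >= min_support}


docs = [
    "Pigs eat flowers in the garden.",
    "The pig will eat any flower it finds.",
    "Cows eat hay. The pig slept.",
    "Flowers eat pigs in this odd story.",  # wrong direction, still counted
]
counts = extract_facts(docs)
facts = confident_facts(counts)
```

Note that the last sentence in `docs` says the opposite of the first two, yet it is counted as support for the same triple. That is the weakness described above: the method only works when most documents agree in aggregate, and it never looks at what each sentence actually means.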