• Prophet@lemmy.world · English · 4 upvotes · 11 months ago

    Also quite difficult from a vision perspective. Tons of potential object classes, objects with no class (e.g., leftovers, homemade things), potential occlusion if you are monitoring the refrigerator/cabinets. If the object is in a container, how do you measure the volume remaining of that substance? This is just scratching the surface, I imagine. These problems individually are maybe not crazy challenging, but they are quite hard all together.

    • kromem@lemmy.world · English · 7 upvotes · 11 months ago

      You don’t use vision, or if using it you are only supplementing a model that is mostly using purchase histories as the guiding factor.

      • TheGreenGolem@lemmy.dbzer0.com · English · 5 upvotes · 11 months ago

        But you actually need vision because purchase history is not indicative of my future purchases. Sometimes I buy butter, eat it in 3 days, and buy it again. Sometimes I’m not in the mood and let a chunk of butter sit in my fridge for 3 weeks. It’s honestly totally random for a lot of things. It depends only on my mood at the moment.

        • kromem@lemmy.world · English · 3 upvotes · 1 downvote · 11 months ago

          You’d be surprised at how many of those things you think are random would actually emerge as a pattern in long enough purchase history data.

          For example, it might be that there’s a seasonality to your being in the mood. Or it might track other things you’d have bought a week before, etc.

          Over a decade ago, a model at Target looking only at purchase history was reportedly able to tell a teenage girl was pregnant before her family knew, just from things like a switch from scented to unscented lotion.

          There’s more modeled in that data than simply what’s on the receipt.
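To make the "patterns in purchase history" idea a bit more concrete, here is a minimal sketch of two features such a model might start from: the gaps between repeat purchases of one item, and how those purchases are spread across calendar months. The function names and the butter dates are made up for illustration; nothing here reflects how any real retailer's model works.

```python
from collections import Counter
from datetime import date
from statistics import median


def repurchase_gaps(purchase_dates):
    """Days between consecutive purchases of a single item."""
    ordered = sorted(purchase_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def monthly_share(purchase_dates):
    """Fraction of an item's purchases that fall in each calendar month."""
    counts = Counter(d.month for d in purchase_dates)
    total = sum(counts.values())
    return {month: counts[month] / total for month in sorted(counts)}


# Hypothetical butter purchases: a quick rebuy, a slow one, then a long pause.
butter = [
    date(2023, 1, 3),
    date(2023, 1, 6),
    date(2023, 1, 27),
    date(2023, 12, 1),
    date(2023, 12, 4),
]

gaps = repurchase_gaps(butter)
print("gaps in days:", gaps)
print("median gap:", median(gaps))
print("share by month:", monthly_share(butter))
```

A wide spread in the gaps is exactly the "sometimes 3 days, sometimes 3 weeks" behaviour described above; whether the monthly shares reveal anything depends on having enough history for that item.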

      • Prophet@lemmy.world · English · 2 upvotes · 11 months ago

        I agree, in the context of the tweet, that purchase history is enough to build a working product that roughly meets user requirements (at least in terms of predicting consumed items). This assumes you can find enough purchase history for a given user. Even then, I have doubts about how robust such a strategy is. The sparsity in your dataset for certain items means you will either (a) be forced to remove those items from your prediction service or (b) frustrate your users with heavy prediction bias. Some items also simply won’t work in this system: maybe the user only eats hotdogs in the summer, or maybe they only buy eggs with brownie mix. There will be many dependencies you are required to model to get a system like this working, and I don’t believe there is any single model powerful enough to do this by itself. Directly quantifying the user’s pantry via vision seems easy in comparison.
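As a rough illustration of the dependency and sparsity points, the sketch below compares how often an item is bought overall with how often it is bought in baskets that contain a second item, and refuses to answer when there are too few such baskets. The helper names, the `min_support` threshold, and the baskets are all assumptions made for the example.

```python
def purchase_rate(baskets, item):
    """Fraction of all baskets that contain `item`."""
    return sum(item in basket for basket in baskets) / len(baskets)


def conditional_rate(baskets, item, given, min_support=2):
    """Fraction of baskets containing `given` that also contain `item`.

    Returns None when fewer than `min_support` baskets contain `given`,
    since an estimate from that little data would not be trustworthy.
    """
    with_given = [basket for basket in baskets if given in basket]
    if len(with_given) < min_support:
        return None
    return sum(item in basket for basket in with_given) / len(with_given)


# Hypothetical shopping trips, each represented as a set of items.
baskets = [
    {"brownie mix", "eggs"},
    {"milk"},
    {"brownie mix", "eggs", "milk"},
    {"bread"},
]

print("P(eggs):", purchase_rate(baskets, "eggs"))
print("P(eggs | brownie mix):", conditional_rate(baskets, "eggs", "brownie mix"))
print("P(eggs | bread):", conditional_rate(baskets, "eggs", "bread"))
```

Even this toy version shows the trade-off: raising `min_support` drops more items from prediction (option a), while lowering it lets thin data drive the guesses (option b).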

    • Bread@sh.itjust.works · English · 4 upvotes · 11 months ago

      There could be an easy party mode button in which it just ignores the usual and picks likely food options for a party.

    • eclectic_electron@sh.itjust.works · 1 upvote · 11 months ago

      Honestly, I would be perfectly happy with a service like this, even if I had to manually input what groceries I need. It’s still an incredibly complex problem though. AI is probably better suited for it than anything else, since you can have iterative conversations with latest-generation AIs. That is, if I tell it I need cereal, it looks at my purchase history, guesses what type of cereal I want this week, and adds it to my list; I can then tell it no, actually I want shredded mini wheats.

      So it would probably have to be a combination of a very large database and information gathering system with a predictive engine and a large language model as the user interface.
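A minimal sketch of that guess-then-correct loop, leaving out the language model entirely and modelling only the data flow: guess the most frequently bought product per requested category, then let an explicit user correction win. The function name, the history format, and the products are hypothetical.

```python
from collections import Counter


def suggest_list(requested_categories, history, overrides=None):
    """Pick one product per requested category.

    `history` maps a category to the list of products bought in it so far.
    `overrides` maps a category to a product the user explicitly asked for,
    which always takes priority over the guess from history.
    """
    overrides = overrides or {}
    shopping_list = {}
    for category in requested_categories:
        if category in overrides:
            shopping_list[category] = overrides[category]
        elif history.get(category):
            # Guess the product bought most often in this category.
            shopping_list[category] = Counter(history[category]).most_common(1)[0][0]
        else:
            # No history to guess from, so leave it for the user to fill in.
            shopping_list[category] = None
    return shopping_list


# Hypothetical purchase history for one user.
history = {"cereal": ["corn flakes", "corn flakes", "granola"]}

first_guess = suggest_list(["cereal", "butter"], history)
corrected = suggest_list(
    ["cereal", "butter"], history, overrides={"cereal": "shredded mini wheats"}
)
print(first_guess)
print(corrected)
```

In a real system the conversational layer would be what turns "no, actually I want shredded mini wheats" into the `overrides` entry; that step is assumed here rather than implemented.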