Lifelogging in the age of AI

When I joined General Catalyst in 2006, I didn’t know anything about technology or software. I just knew I was an entrepreneur and wanted to get as close to entrepreneurship as I possibly could coming from a background in banking. The two years I spent at GC were really special. There were maybe 10 investors at the firm, and everyone was incredibly curious and generous in educating me as I developed an understanding of this ecosystem. After a few incredible years, I was ready to strike out and start something. I was at GC through the launch of the iPhone and the beginning of a very steep rise in smartphone penetration. Much of my thinking was around a thesis of mobile information capture: our phones were increasingly writing to the internet on our behalf. I became obsessed with the explicit and implicit digital exhaust we were generating from our phones. Today, of course, that seems like Captain Obvious talking, but at the time it was much less well understood.
When it was time to leave GC and become an entrepreneur, the first idea I immersed myself in was the notion that if we could aggregate all of the exhaust we create, it could present a true representation of our lives that would be very valuable. I remember believing that our movement through space (the GPS exhaust) was the lowest-level substrate on which you could develop this deep view into somebody, and envisioned layering Spotify data, SMS, email, photos, Twitter, etc. on top of that substrate to be able to answer questions like “What was I doing on February 5th last year? How was I feeling? Who was I dating?” I was very into lifelogging platforms but didn’t want to put all of the effort into capturing my experience actively, and this felt like a different and richer path. I remember trying to explain the value of this aggregation to the guy who led our mobile practice at the time. I drew a line on a whiteboard and said, “This is your path through physical space. As we layer other forms of time-synced data on top of this line, we can infer the story of your life. Do you think we could do something with this?” His answer was “No, this doesn’t make sense.” Even at that time people were talking about semantic search as an opportunity, but of course we were a decade too early to truly ask the questions that I envisioned.
Fast forward to today: we have the capacity to process and understand non-uniformly structured data of all types through a single querying method. I’ve been playing extensively with Google NotebookLM and trying to understand why it is resonating so deeply with people. To me, it comes down to two things: 1) they’ve created a very simple interface for people to upload a wide variety of document and data types on which to run inference. I can drop a PDF, a URL, a YouTube video, whatever into the tool and quickly run inference over the dataset; and 2) while not obviously central to the UX, the audio generation feature is the most magical element of the tool in that it transforms static data into a compelling narrative format. Basically it’s documents in, compelling conversational podcast out.
The first thing I uploaded to NotebookLM was an SVB market data PDF, because that’s the articulated use case, but quickly I went back to my old interests from General Catalyst. What if I could download all of my exhaust from the services I use, and then create a narrative output on top of this disparate data? Maybe I could realize my dream from 15 years ago. I did a little research, and it turns out you can download your entire data exhaust from Spotify, Google Maps, Twitter, Gmail, Instagram, Apple Notes, Photos, YouTube watch history, bank statements, credit card statements, Google Chrome history, and so on. The hardest one to access is historical SMS/iMessage, for which it appears you need to subpoena your telecom provider, but overall, if I were willing to go through some administrative pain, I could get a pretty reasonable amalgam of my activity. Maybe enough to transform static, disparate data into a compelling biography of my life. Something my kids and friends might enjoy reading, for example.
It took Spotify a few days to email me JSON files of my listening history. To upload them to NotebookLM, I first had to convert the JSON into a PDF because NotebookLM doesn’t accept JSON, but from there I was theoretically able to query my entire listening history. I tried to ask questions like “Was I in a good mood on this date?” or even “Who are some artists I was into this month?” but NotebookLM failed to deliver. I guess it’s not purpose-built for my use case. The best output I got was this podcast analyzing my listening history, but it became clear to me that I wasn’t going to be able to use NotebookLM to aggregate and access the value of my exhaust efficiently.
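For what it’s worth, the “artists I was into this month” question is easy to answer locally before anything ever touches NotebookLM. Here is a minimal sketch, assuming the export arrives as `StreamingHistory*.json` files with the field names a typical Spotify export uses (`endTime`, `artistName`, `trackName`, `msPlayed`) — an assumption worth checking against your own files:

```python
import json
import glob
from collections import Counter, defaultdict

def load_history(pattern="StreamingHistory*.json"):
    """Load Spotify streaming-history export files into one list of plays.
    Field names (endTime, artistName, trackName, msPlayed) are assumed
    from a typical export and may differ in yours."""
    plays = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            plays.extend(json.load(f))
    return plays

def to_plain_text(plays):
    """Flatten plays into one line per listen, suitable for pasting or
    uploading as a .txt source instead of a converted PDF."""
    return "\n".join(
        f"{p['endTime']}  {p['artistName']} - {p['trackName']}"
        for p in plays
    )

def top_artists_by_month(plays, n=5):
    """Tally the most-played artists per calendar month."""
    months = defaultdict(Counter)
    for p in plays:
        month = p["endTime"][:7]  # "YYYY-MM-DD HH:MM" -> "YYYY-MM"
        months[month][p["artistName"]] += 1
    return {m: c.most_common(n) for m, c in sorted(months.items())}
```

This doesn’t infer mood, but it answers the per-month question deterministically, and the flattened text makes a far friendlier upload than raw JSON.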
This failure led me to wonder whether there could be a simple consumer interface that automated the annoying administrative steps of pulling and uploading a user’s data from every service, and whether there was enough value in transforming it into a legible narrative that people would actually aggregate their exhaust. The premise that “the user should own the data, not the platforms, and apps should request permission to access it” has been a dream for years, never realized, but lately I’ve been wondering whether the ability to truly process a user’s aggregated, unstructured data semantically presents an opportunity for value creation so great that both consumers and applications would adopt the dreamt-of architecture above. Could there be an aggregated data vault that every person owns, on which they and other apps can run inference for different use cases? And does it start with something as simple as “construct the story of my life for me in a medium that I and others enjoy”?
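To make the vault idea concrete, here is a purely hypothetical sketch — no such product exists, and every name in it is invented for illustration: a user-owned store that ingests exports per source and refuses queries from apps without an explicit grant:

```python
class DataVault:
    """Hypothetical sketch of a user-owned data vault: raw exports go in
    per source, and apps can only read sources the user has granted."""

    def __init__(self):
        self._sources = {}  # source name -> list of records
        self._grants = {}   # app name -> set of permitted sources

    def ingest(self, source, records):
        """Store exported records (e.g. Spotify plays, GPS points)."""
        self._sources.setdefault(source, []).extend(records)

    def grant(self, app, source):
        """User-approved permission: let `app` read `source`."""
        self._grants.setdefault(app, set()).add(source)

    def query(self, app, source, predicate=lambda r: True):
        """Return matching records, but only if the app holds a grant."""
        if source not in self._grants.get(app, set()):
            raise PermissionError(f"{app} has no grant for {source}")
        return [r for r in self._sources.get(source, []) if predicate(r)]
```

The real version would obviously need encryption, revocation, and an inference layer on top, but the core inversion is visible even here: the data lives with the user, and the biographer app is just one permissioned reader among many.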
I don’t know…but if anyone is working on an LLM powered biographer/life logging platform, I’d love to jam with you: jordan@pacecapital.com