
Goldman Sachs’ AI Alarm: Are You Ready for the Future?

Explore Notion’s Data Lake, New AI Tools, and Strategies to Transform Your Career

Welcome to this edition of AI Quick Bytes! Get ready to turbocharge your summer with cutting-edge AI insights, exclusive events, and game-changing tools! From our brand-new Responsible AI email training to a power-packed webinar for AI leaders and a rocking Silicon Valley meetup, we’ve got something for everyone. Dive into the latest trends, discover Notion’s incredible data lake transformation, and explore slick new tools like the LLM Comparator.

So excited to share with you the in-person and virtual events that will help you accelerate your AI career this summer!

Hot off the presses:

  • Launch of our FREE Responsible AI email training. We think you will find it empowering, and we would love to get your feedback. Sign up below.

🚀 Learn to Lead with Responsible AI! Join Our Free Exclusive Email Training Now! 🌟


Quick bits

🧩 Strategy: The more I learn, the more I realize I don’t know ANYTHING :-)

📊 Trends: AI isn’t all roses

🛠️ Tools: Building and scaling Notion’s data lake

💡 Prompts: A slick LLM Comparator tool

We Are Sponsoring Ourselves Today!

Welcome To AI Quick Bytes

Let’s Get To It!

🧩 Strategy

I love listening to and working with people who think differently than I do. I stumbled upon this video by Cedric Chin while reading a post by Eugene Ng; I was deep in an internet rabbit hole and found this gem.

Imagine combining academic theories with real-world fun. Picture this: we dive into a theory of expertise, not just for the books, but to mimic and accelerate our own expertise in areas like investing, business, or even personal wisdom. The argument against patterns and expert frameworks, and the bridge to fragments, was mind-blowing for me. I highly recommend watching it to rethink how you approach not only business but life.

Crank it up to 1.5X to absorb knowledge even faster.

📊 Trends

It was not a matter of if but when the alarm would be sounded on AI/Gen AI, and this trend aptly follows the above video by Cedric Chin. I have seen the ups and downs of AI summers before, and although AI will transform our world, it is still very early in its life cycle, and cobbling together successful solutions in the enterprise remains complex. We will get there, and the companies that adapt and build mature, scalable platforms to create value with AI will win.

At this point in time, however, products do not feel industry-ready due to issues like hallucinations, accuracy, and security. Implementation costs are high, and success rates are low. When a new technology is breaking into the corporate market, expect big home runs and strikeouts early on until the stack stabilizes and matures, just like CI/CD. Low ROI is par for the course right now. It feels like everyone is waiting for the killer application built from the ground up on AI; when it arrives, it will seem obvious, like "why did I not think of that!" IMHO, the biggest challenge is business and tech collaborating to overcome barriers around data and internal expertise.

AI scaling myths: Sayash Kapoor and Arvind Narayanan ask whether model growth can continue indefinitely. “This gets at one of the core debates about LLM capabilities — are they capable of extrapolation or do they only learn tasks represented in the training data? The evidence is incomplete and there is a wide range of reasonable ways to interpret it.” They lean toward the skeptical view. I am fully confident that with a mixture of brute compute and human ingenuity we will break through AI model barriers until we hit the next wall; lather, rinse, repeat.

Cup Half Full

Foundation Capital shares “We are at a unique time in history. Every layer in the AI stack is improving exponentially, with no signs of a slowdown in sight. As a result, many founders feel that they are building on quicksand. On the flip side, this flywheel also presents a generational opportunity. Founders who focus on large and enduring problems have the opportunity to craft solutions so revolutionary that they border on magic.”

As for me, I am eternally optimistic that amazing innovations will take place over the next 10 years, and I am super excited to play a small part in making them happen.

🛠️ Tools

Notion’s in-house data lake is built on the Debezium CDC connector, Kafka, Hudi, Spark, and S3.

I have been really loving working with Notion. I have not been this excited about a software product since Slack. I am having fun learning all the things it does and have not even scratched the surface yet. Over time, I plan to use Notion as a way to share content and build community, so stay tuned! But enough Notion fanboying; back to our regularly scheduled programming.

Notion's data team recently overhauled their data architecture to create a robust and scalable data lake, addressing the challenges of rapid growth and data management. The article details their journey, from recognizing the need for a more efficient system to implementing a modern data lake infrastructure. They leveraged AWS S3 for storage, Debezium and Kafka for change data capture, and Apache Hudi and Spark for ingestion and processing, which collectively enhanced their data processing capabilities.

This transformation not only improved data accessibility and reliability but also facilitated more sophisticated data analysis and reporting. By adopting a scalable, flexible architecture, Notion's data team ensured that they can efficiently handle the increasing volume and complexity of data, thereby better supporting the company's decision-making processes and growth objectives.
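To make the architecture concrete, here is a minimal PySpark sketch of the pipeline shape that stack implies: Debezium change events land in Kafka, Spark reads them as a stream, and Hudi upserts them into S3. This is a hypothetical illustration, not Notion's actual code; the topic name, schema, and bucket paths are made up.

```python
# A minimal sketch of a Debezium -> Kafka -> Spark -> Hudi-on-S3 pipeline.
# Hypothetical illustration only: the topic, schema, and paths are invented,
# and the Hudi/Kafka jars must be on the Spark classpath, e.g. via
# --packages org.apache.hudi:hudi-spark3-bundle_2.12:0.14.0
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("cdc-to-hudi-sketch").getOrCreate()

# Hypothetical flattened Debezium payload: one row per changed record.
schema = StructType([
    StructField("id", StringType()),
    StructField("payload", StringType()),
    StructField("updated_at", TimestampType()),
])

# Read CDC events as a stream from the (hypothetical) Kafka topic.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "postgres.public.blocks")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("row"))
    .select("row.*")
)

# Upsert into a Hudi table on S3. The record key plus precombine field
# give last-writer-wins semantics, which is what update-heavy data needs.
(
    events.writeStream.format("hudi")
    .option("hoodie.table.name", "blocks")
    .option("hoodie.datasource.write.recordkey.field", "id")
    .option("hoodie.datasource.write.precombine.field", "updated_at")
    .option("hoodie.datasource.write.operation", "upsert")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/blocks")
    .start("s3://example-bucket/lake/blocks")
)
```

The design choice worth noticing is the upsert path: Hudi lets the lake absorb a constant stream of record updates in place, rather than rewriting whole partitions, which is exactly the property update-heavy block data calls for.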

What I Really Loved: the team specifically called out what is in and out of scope. This is low-hanging fruit, but so many teams never spell out what is not in scope. Doing so sharpens focus and minimizes wasted effort due to misalignment. Article excerpt below:

Building and scaling Notion’s in-house data lake

Here were our objectives for building an in-house data lake:

  • Establish a data repository capable of storing both raw and processed data at scale.

  • Enable fast, scalable, operable, and cost-efficient data ingestion and computation for any workload—especially Notion's update-heavy block data.

  • Unlock AI, Search, and other product use cases that require denormalized data.

However, while our data lake is a big step forward, it's important to clarify what it's not intended to do: ← Nicely done!

  • Completely replace Snowflake. We’ll continue to benefit from Snowflake’s operational and ecosystem ease by using it for most other workloads, particularly those that are insert-heavy and don’t require large-scale denormalized tree traversal.

  • Completely replace Fivetran. We’ll continue taking advantage of Fivetran’s effectiveness with non-update heavy tables, small dataset ingestion, and diverse third-party data sources and destinations.

  • Support online use cases that require second-level or stricter latency. The Notion data lake will primarily focus on offline workloads that can tolerate minutes to hours of latency.

💡Prompts

It’s a stretch trying to fit the LLM Comparator tool into the Prompts section, but it does compare prompt output, and heck, it is just f’in cool. I am all about the tools, and this one is pretty slick. In the words of its creators, Minsuk Kahng, Ryan Mullins, and Ludovic Peran: “This tool simplifies the intricate task of LLM assessment, ensuring more effective and human-aligned AI development.”
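If you want the flavor of what a side-by-side evaluation does, here is a tiny, generic Python sketch of the underlying idea: run the same prompts through two models and have a rater score each pair. This is emphatically not the LLM Comparator's actual API; `call_model` and `judge` are hypothetical stand-ins for your own inference and rating setup.

```python
# Generic side-by-side model comparison (NOT the LLM Comparator API).
# call_model() and judge() are hypothetical stand-ins for your own stack.
from dataclasses import dataclass

@dataclass
class Comparison:
    prompt: str
    response_a: str
    response_b: str
    verdict: str  # "A", "B", or "tie"

def call_model(model: str, prompt: str) -> str:
    """Stand-in for whatever inference API you use."""
    raise NotImplementedError

def judge(prompt: str, a: str, b: str) -> str:
    """Stand-in rater: a human label or an LLM-as-judge call."""
    raise NotImplementedError

def compare_models(prompts: list[str], model_a: str, model_b: str) -> list[Comparison]:
    """Run every prompt through both models and collect judged pairs."""
    results = []
    for p in prompts:
        a, b = call_model(model_a, p), call_model(model_b, p)
        results.append(Comparison(p, a, b, judge(p, a, b)))
    return results
```

Tools like the LLM Comparator then layer visualization and rationale summaries on top of exactly this kind of judged-pairs table, which is what makes the side-by-side view so useful.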

What'd you think of this week's edition?

Tap below to let me know.


Until next time, take it one byte at a time!

P.S.

If you are enjoying our newsletter, then you will love our upcoming webinar!
