
Latest Videos

30:55

Celia Wanderley: AI Innovator of the Year

Celia is the Chief Innovation Officer at Bits In Glass (BIG), a top Canadian IT consulting firm. Prior to that, she held senior leadership roles at AltaML and at Deloitte Canada. Celia was recognized as AI Innovator of the Year by Women in AI.

In this episode, Celia shares insights from trends she’s seen in her recent work involving intelligent automation of business processes at scale. A key topic is the idea of embedding AI agents into multiple steps within business processes, such as contract analysis or banking regulatory change management, in order to achieve business results that would probably not be achievable with a one-shot AI solution. She also describes successful efforts at transforming the front-line user experience of field workers, replacing clumsy gadgets with voice-activated workflows. And she discusses critical success factors for taking AI projects into full production, confirming that AI and ML solutions frequently work quite well in a completely different context or industry from the one where they were originally created. She concludes by making a case for a blended approach to resourcing AI projects – one that involves in-house teams, third-party solutions, and external partners.

38:01

Andrei Lopatenko: Scaling AI to Billions

Andrei holds a PhD in Computer Science, and is the Director of Search and the AI Lab at Neuron7. Prior to that, he was the VP Engineering and AI at Zillow. He’s also held key leadership roles at Google, Apple, Walmart and eBay.

In this episode, Andrei shares insights and advice, based on his experience deploying large-scale, high-load NLP and search applications to billions of customers (5 billion queries per day, 10 billion pages per day). Along the way, he describes how a high-quality engineering culture and a high-quality science culture were nurtured during his early days at Google as one of the first 60 PhDs in 2006, and how he has applied what he learned there later in his career. You’ll also hear a discussion about critical success factors for the transition from POC to production for large-scale projects, such as 100 million or a billion queries per day – including a discussion about evaluation metrics for LLMs. Andrei also emphasizes the importance of continuous learning for leaders of teams that do AI, and he describes a great approach for staying on top of current research. The episode concludes with valuable career advice for data scientists who are in the early stages of their career.

24:13

Dave Stern: Hackproof Your Startup

Dave is a fractional CTO and DevOps engineer with over 25 years of experience in systems and software engineering. He’s the President and Senior Solutions Architect of Stern DevOps Group, a consultancy focused on early-stage companies. He’s also the author of a new book, Hackproof Your Startup, which is a key topic of the show.

In this episode, Dave discusses IT and AI security for early-stage start-ups. The conversation begins with a review of what happened in the famous Codespaces hack. Dave asserts that many companies are still vulnerable to the type of ransomware attack that put Codespaces out of business, and that the risk mitigation solution is fairly straightforward (he describes its elements on the show). Other topics include cybersecurity as an asset, infrastructure as code, the principle of least privilege, and isolating IT environments. The conversation concludes with a what-if scenario where Dave answers the question: “If someone were to steal my laptop or cell phone, what would I suddenly wish I had done before that happened?”

46:22

Shawn Goodin: Agent-Driven Marketing?

Shawn is the Global VP of Solutions at FirstHive, which is a customer data platform. Prior to that, he held senior leadership roles at Capgemini, Silicon Valley Bank, JPMorgan Chase, Clorox, Northwestern Mutual and SC Johnson. He is also an advisory board member of the Customer Data Platform (CDP) Institute.

In this conversation, Shawn describes various roadblocks to transformation in large organizations – especially AI-based initiatives in marketing. He then shares an agentic vision of a future state where a marketing operations user might simply say: “I want to grow my credit card business by 20% in the US. What should I do?” and the platform would develop and execute a plan to achieve it.

30:57

Jodi Blomberg: Strategic Bets on AI

Jodi is the VP of Data Science at Cox Automotive, a company that has a diverse portfolio of 17 brands that encompass digital products like Kelley Blue Book and Autotrader, as well as various kinds of physical services – all of which are supported by about 70 in-house data scientists and ML engineers.

In this conversation, Jodi describes her AI initiatives as investments, managed in a way that’s similar to a diversified investment portfolio, where the core projects deliver a baseline of ROI, and are supplemented by strategic bets, plus a very small fraction of high-risk / high-reward projects (“moonshots”) that get a Yes/No decision within 4-6 weeks. The show also includes a discussion about what makes a Gen AI project strategic vs “must have,” as well as insights about the practical and human challenges associated with the kinds of AI-based initiatives that primarily target efficiency gains, rather than top-line growth.

58:14

Ramsu Sundararajan: Segment of One at Scale

Ramsu holds a PhD in Machine Learning, and is the Head of R&D at solus.ai, which powers Segment of One personalization. Roles prior to that included Senior Scientist at GE Global Research, and Principal at Sabre Airline Solutions, where he developed some of the original algorithms.

Ramsu shares highlights of his journey in AI, with particular focus on personalization in marketing: how to think about that problem conceptually, and which parts are relatively easy versus difficult or tedious. There are also key insights about customer journeys and about the cold start problem. Other topics covered include customer genomes, as well as a discussion about navigating between decisions taken from a zoomed-in perspective at the individual customer record level, while also managing to a zoomed-out perspective driven by KPIs, comps and annual targets. The show concludes with a discussion about the 1-2 year product development roadmap for solus.ai.

2:53

Announcing the AI Master Group Podcast

The AI Master Group Podcast launches this week on Friday!

The show is an ideal place to meet people who are creating the future of AI in their work today. It features interviews with authors of recent papers, and with people deploying AI in real-world applications, including people in senior leadership roles. If you want a front-row seat to the technology and craft of AI, this show will be perfect for you.

You can follow the show on Spotify at this link: https://spoti.fi/3XIf1te (The show will also be available on the other usual channels, such as Apple Podcasts.)

There’s a list of the first 14 guests with release dates per episode in the YouTube Comments.

8:03

Mesh Anything (except a Pink Hippo Ballerina)

The developers at MeshAnything have just released new code that offers an important improvement in how the surface of 3D objects can be encoded. The new method builds out the shape by always seeking to find and encode an adjacent face that shares an edge with one already encoded. This requires only about half as many tokens as other methods to represent the same information, resulting in a four-fold reduction in the memory required for the same task. That, in turn, enabled MeshAnything to double the maximum number of faces it can handle on a single object to 1,600, compared to 800 for current methods.
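For readers who like to check the arithmetic, here is a rough back-of-the-envelope sketch (my own illustration, not code from the MeshAnything repo; the per-face token counts are assumptions, and attention memory is modeled as growing quadratically with sequence length):

# Illustrative only: halving tokens per face implies roughly 4x less attention memory,
# which is consistent with doubling the face budget at a fixed memory limit.
TOKENS_PER_FACE_BASELINE = 9.0   # assumed: 3 vertices x 3 coordinates per face
TOKENS_PER_FACE_ADJACENT = 4.5   # assumed: roughly half, per the reported reduction

def relative_attention_memory(num_faces: int, tokens_per_face: float) -> float:
    seq_len = num_faces * tokens_per_face
    return seq_len ** 2  # memory modeled as quadratic in sequence length

baseline = relative_attention_memory(800, TOKENS_PER_FACE_BASELINE)
adjacent = relative_attention_memory(800, TOKENS_PER_FACE_ADJACENT)
print(f"memory ratio: {baseline / adjacent:.1f}x")  # prints ~4.0x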

This video starts by comparing the new method with the current one. After that, we generate a 3D object from a text prompt on the Rodin website (a pink hippopotamus ballerina character with white tutu), and we check it on the Sketchfab website. Then we run the code that was provided by MeshAnything on GitHub, and we check the output on Sketchfab, comparing before and after side-by-side. The results confirm the final words of the paper, which state that “the accuracy of MeshAnything V2 is still insufficient for industrial applications. More efforts are needed.” Nonetheless, this new computational approach is elegant, and the video concludes with a prediction that we’ll likely see improvements that build on the foundations laid by MeshAnything V2.

8:36

Can Robots Win at Table Tennis? Take a Look!

Google DeepMind has just achieved a new level of robotic skill – the ability to compete and win at table tennis, a game that requires years of training for people who want to compete at an expert level.

This video shows the robot in action against an array of competitors, ranging from beginner level to tournament pro and, in doing so, describes both the hardware and AI aspect, including how it was trained and a summary of the key innovations contributed by this project.

It also gives summary results of the live matches, segmented by experience level of opponents. As a bonus, I looked at the performance data and have shared four insider tips for how to beat this robot at table tennis. The video ends on a light note, describing something called RoboCup, which has the goal of fielding a team of robots that will be ready to take on the World Cup soccer champion team by 2050. You’ll quickly see that we have a very long way to go on that particular goal.

10:36

Shark Alert! YOLO AI-Vision in Action

Last week, several news outlets ran a story about SharkEye, an AI-vision shark detection program developed at the University of California, Santa Barbara, and deployed at California’s Padaro Beach, an area where surfers and great white sharks are both frequently found.

After quickly describing the program itself, the video identifies the underlying technology that was used for the vision aspect, confirming from the project’s GitHub page that YOLO v8 by Ultralytics was used. Basically, Ultralytics created an abstraction layer that simplifies the deployment of computer vision models, so that even developers with almost no experience in computer vision can quickly implement sophisticated projects. To illustrate, the video then shows a demo of an object detection and identification task being set up and run on Google Colab. It then concludes with examples of the types of projects that can be implemented with Ultralytics YOLO v8.
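For anyone who wants to recreate the kind of demo shown in the video, a minimal Ultralytics detection script looks roughly like this (the checkpoint name and image path are placeholders; install with pip install ultralytics):

# Minimal object-detection sketch with the Ultralytics package.
# "yolov8n.pt" is the small pretrained checkpoint; "beach.jpg" is a placeholder image path.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")      # downloads the pretrained YOLOv8-nano weights on first use
results = model("beach.jpg")    # run inference on a single image

for r in results:
    for box in r.boxes:
        label = model.names[int(box.cls)]
        confidence = float(box.conf)
        print(f"{label}: {confidence:.2f}")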

11:03

AI Can do That?? Silver Medal in Pure Math

AI has just achieved an amazing milestone. Two Alpha models by Google DeepMind scored silver-medal-level performance in a globally recognized competition in advanced mathematics: IMO 2024.

This video starts by setting the context for this latest achievement, going back to significant milestones in 2022 and 2023 that helped set the stage for what just happened, sharing the story along the way of two remarkable mathematicians, and comparing their achievements to those of the Alpha models.

With the stage set in that way, the video then describes key details of the contest, including the scoring system and how DeepMind scored on each problem, with details of a very difficult geometry problem that was solved in a matter of seconds. Next, the video describes details about the training that was done for the AlphaProof and AlphaGeometry 2 models. Finally, it assesses the implications of this accomplishment, including some of the fields in which this kind of capability might make significant contributions.

8:12

Will Open-Source Llama Beat GPT-4o?

Last week Meta launched its newest family of models, Llama 3.1, including a benchmark-setting open-source foundation model with 405 billion parameters. With this, Zuckerberg predicted that Meta AI will surpass OpenAI’s 200 million monthly active users by the end of this year.

Hubris aside, this video looks at six reasons why we need to pay attention to this announcement, including Zuckerberg’s assertion that open source will eventually win for language models for the same reasons that Linux eventually won out against an array of closed-source Unix variants.

It then describes a situation where a company has already been building solutions using an OpenAI or Anthropic model, for example, but then decides to get an informed point of view about the open source option by creating a challenger model as well, using the new Llama options. For that situation the video suggests which model size to use, along with recommendations for the best platform options for the pilot, and four types of projects that would be good candidates for a head-to-head test of this sort. Finally, it concludes with a light-hearted description of the battle ahead.

10:20

Call a Doctor! Blue Screen Lessons Learned

Companies worldwide grappled on Friday with what Troy Hunt famously described as “the largest IT outage in history.” It was caused by a faulty sensor configuration update that the cyber-security giant CrowdStrike pushed to Microsoft Windows systems, resulting in a $31 billion loss in market capitalization for CrowdStrike.

Specific information about the bug is not yet publicly available, but this video presents 12 top suspects, including two primary ones. From there, it focuses on lessons learned, with the help of a live interview with fractional CTO and senior solutions architect, Dave Stern, who is the author of the recent best-selling book Hackproof Your Startup.

7:07

Amazing Milestone! Million Experts Model

A top researcher at Google DeepMind just released an important paper, “Mixture of a Million Experts.” As the paper’s title announces, it describes an approach that resulted in the first known Transformer model with more than a million experts.

For context, the number of experts currently seen in smaller models varies between 4 and 32, and ranges up to 128 for most of the bigger ones.

This video reviews the Mixture-of-Experts method, including why and where it’s used, and the computational challenges associated with doing this. Next, it summarizes the findings of another important paper from earlier this year, where a new scaling law was introduced for Mixture-of-Experts models. That sets us up to review the “Million Experts” paper by Xu He.

The video then describes two key strategies that enabled scaling to over a million experts by making each expert only a single neuron. Next, it shares a process map for the new approach, and concludes with ideas about where this might be most relevant, including applications that involve continuous data streams.
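As background for that discussion, here is a minimal sketch of standard top-k expert routing. It illustrates the generic Mixture-of-Experts pattern the video reviews, not the PEER retrieval mechanism from the paper, and the layer sizes and expert counts are arbitrary:

# Generic top-k Mixture-of-Experts routing sketch (illustrative; not the PEER method).
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU()) for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):                               # x: (num_tokens, d_model)
        scores = self.router(x)                         # (num_tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the k best experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(5, 64)).shape)  # torch.Size([5, 64])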

10:15

Behind the Curtain of Figma AI

The recent announcement of Figma AI generated both excitement and controversy. This video summarizes, in under three minutes, the new AI features of this popular design tool used for creating prototypes of digital experiences.

Next, the video looks at the underlying technology that was used to enable the new AI features, including OpenAI language models and the Amazon Titan diffusion model, drawing conclusions about Figma’s strategy, based on the choices they made – especially the decision to use two different vendors for key parts of Figma AI.

8:49

How a Language Model Aced a Top Leaderboard

This video shares details about a remarkable experiment by researchers in Tokyo, who teamed up with Oxford and Cambridge Universities to study whether large language models might now be able to write code that improves their own performance.

The answer was Yes.

Not only that, the model created a whole new approach that placed it at the top of a leaderboard, using a novel method that had not yet been tried or documented in any academic research paper. How can that happen?

The video describes how the model alternated between different kinds of strategies, just like a data scientist might do, resulting in an innovative new loss function with several interesting properties. In short, the model was systematically generating hypotheses and testing them. Finally, the video identifies five aspects of the research question that can potentially be generalized, and it names three ways in which the findings might be applied to new problem sets, including virtual reality.

6:29

New Method Runs Big LLMs on Smartphones

There’s a big breakthrough that just came out for handling large language models on smartphones. It’s called PowerInfer-2, and it looks at every option for processing an LLM on a particular smartphone and picks the fastest way for that particular LLM on that particular device. For example, it uses completely different computation patterns for the early vs. the later phases of the pipeline. It also breaks the work down into small tasks and organizes them based on which neurons are most likely to activate, which greatly increases efficiency. The final step picks which processing units to use, based on which one will do the job faster.

Add it all up, and the performance difference is very impressive: 29x faster.
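To make the scheduling idea concrete, here is a toy sketch of the general pattern of cost-based dispatch between processing units. This is purely illustrative; it is not the PowerInfer-2 implementation, and the overhead and throughput numbers are invented:

# Toy cost-based dispatcher: route each chunk of work to whichever unit finishes it first.
# Illustrative only; PowerInfer-2's actual scheduler and cost model are more sophisticated.
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    flops: float             # estimated work for this chunk of neurons
    activation_prob: float   # how likely these neurons are to actually fire

# Assumed launch overhead (seconds) and throughput (FLOP/s); a real system would profile the device.
UNITS = {
    "cpu": {"overhead": 1e-5, "throughput": 2e10},
    "npu": {"overhead": 5e-4, "throughput": 1e11},
}

def pick_unit(task: Task) -> str:
    expected_work = task.flops * task.activation_prob
    times = {name: spec["overhead"] + expected_work / spec["throughput"]
             for name, spec in UNITS.items()}
    return min(times, key=times.get)

for task in [Task("small_attention_chunk", 2e6, 0.9), Task("large_ffn_chunk", 5e9, 0.3)]:
    print(task.name, "->", pick_unit(task))  # small work stays on CPU, large work goes to the NPU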

This video starts with a review of the six strategies that are generally used to prepare large language models for use on a smartphone, with examples of each, and then it presents a side-by-side demo of PowerInfer-2 vs. llama.cpp.

The speed difference is remarkable.

10:02

Nemotron-4 is BIG in More Ways than One

Last week, NVIDIA announced Nemotron-4, which consists of three models: Base, Instruct and Reward. These three models work together within the NeMo framework to enable the creation and fine-tuning of new large language models.

At 340 billion parameters, this new entrant is far bigger than any other open source model, but the really big news is that Nemotron-4 comes with a permissive license that allows us to use the model to generate synthetic data at scale, for the purpose of creating new models of our own.

Until now, most big models and APIs had clauses in their user agreements that explicitly forbade using the data they generate for the purpose of creating a new model. This video provides a full summary of the size, performance, technical report, and competitive position of Nemotron-4, and it describes what each of the three models does, including production of synthetic data and the five-dimension framework that’s used for model evaluation.

13:53

Testing Ollama on Hard Questions

Ollama is a popular platform for running language models on your local machine, with access to almost 100 different open source models, including llama-3 from Meta, Phi-3 from Microsoft, Aya 23 from Cohere, the Gemma models from Google DeepMind, and models from Mistral.

This video shows llama-3 being run on a laptop, using Ollama. Three difficult questions are presented in turn to each of GPT-4o, Gemini and llama-3. The results yield good insight into the comparative strengths and weaknesses of these three options.
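If you want to reproduce the local setup shown here, a minimal example using the ollama Python client looks something like this (it assumes the Ollama app is running locally and the llama3 model has already been pulled; the prompt is just a placeholder):

# Minimal local-inference sketch with the ollama Python client (pip install ollama).
# Assumes the Ollama server is running and "ollama pull llama3" has already been done.
import ollama

response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Explain the Monty Hall problem in two sentences."}],
)
print(response["message"]["content"])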

8:28

Hacking Passwords with ChatGPT?

The latest edition of the Hive Systems password table is now available, and it shows ChatGPT as the fastest option by far for hacking passwords, which certainly requires some explanation!

This video looks at the assumptions that go into the time it takes for a hacker to get a password by brute force. Along the way, we look at hashing algorithms like MD5 and bcrypt, and we look at hardware like NVIDIA RTX 4090 GPUs and NVIDIA A100s – which is where ChatGPT enters the story. (It turns out that Hive Systems modeled a theoretical situation that involves using about $300 million worth of ChatGPT hardware to hack a single 8-digit password!)
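As a worked example of the kind of arithmetic behind such a table (my own illustrative numbers, not Hive Systems’ exact assumptions), the worst-case cracking time is simply the size of the password search space divided by the attacker’s guess rate:

# Illustrative brute-force timing; the guess rates below are assumptions, not Hive Systems' figures.
charset_size = 26 + 26 + 10 + 32        # lowercase + uppercase + digits + symbols = 94 characters
length = 8
search_space = charset_size ** length   # every possible 8-character password

assumed_rates = {                       # guesses per second (assumed)
    "MD5 on a rack of RTX 4090s": 2e12,    # fast, unsalted hash
    "bcrypt on the same hardware": 2e6,    # deliberately slow, salted hash
}

for scenario, rate in assumed_rates.items():
    seconds = search_space / rate
    years = seconds / (365.25 * 24 * 3600)
    print(f"{scenario}: ~{seconds:,.0f} seconds (~{years:,.2f} years) worst case")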

The video ends with an announcement about the new AI Master Group podcast which will feature interviews with people who are on the front lines, doing innovative work related to AI. The podcast will launch on July 7.