Author: admin

  • Finished AI Essentials Certification!

    I have to say that, for a beginner course, I really enjoyed the coursework, especially getting to use Google’s AI Studio. At first, AI Studio reminded me a lot of AWS Bedrock: an interface to foundation models that lets you select a model and tweak its parameters.

    AI Studio created a React audio player and a text-to-speech “commercial” based on a prompt I gave it. Not everything was perfect: I had to iterate a few times so it chose the right codecs and models, but after a few minutes I had a working solution. I was quite impressed by how easy it was to prompt for the addition of sound effects, like a bass rumble or a ping, at certain points in the script. I was thrown back to one night when I was working for CCBN.com and a colleague and I spent hours syncing an Investor Relations call recording to PowerPoint slides in RealPlayer. Now a feature like that is just a button away in MS Teams.

    I also really enjoyed that, instead of a black box, I was able to review all the code the app used, from the .env file to the useEffect React hook that sent the prompt request, and to see how it fed the binary response into the browser’s Web Audio API (AudioContext).
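
    For readers curious about that last step, here is a minimal sketch of turning a binary audio response into samples the Web Audio API can play. It is not the code AI Studio generated. It assumes the response bytes are raw 16-bit signed little-endian PCM, which the post does not confirm, and the function name `pcm16ToFloat32` is hypothetical.

```typescript
// Assumption: the audio bytes are raw 16-bit signed little-endian PCM.
// The actual format used by the generated app is not stated in the post.

/** Convert 16-bit little-endian PCM bytes into float samples in [-1, 1). */
function pcm16ToFloat32(bytes: Uint8Array): Float32Array {
  // Respect byteOffset so the function also works on subarray views.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // Two bytes per sample; a trailing odd byte is ignored.
  const sampleCount = Math.floor(bytes.byteLength / 2);
  const samples = new Float32Array(sampleCount);

  for (let i = 0; i < sampleCount; i++) {
    // `true` selects little-endian byte order.
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}
```

    In a browser, the resulting samples could be copied into an `AudioBuffer` created with `audioContext.createBuffer(1, samples.length, sampleRate)` using `copyToChannel`, then played through an `AudioBufferSourceNode` connected to `audioContext.destination`. The sample rate has to match whatever the model actually returned.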

    I’m looking forward to the more advanced classes and am going to start on my own projects soon.

  • AI Essentials Certification from Google

    The state of Massachusetts is offering all residents free access to coursework that can lead to Google certifications in AI subjects from prompt essentials to professional Agent Engineering.

    I started the coursework in the first week of May, and while I’m still only at the beginning stages, there have been a few surprises working with Gemini.

    1. It appears Gemini can use your Gmail inbox for context. I asked it to draft an email to a fictional direct report, and it referenced some professional feedback I had received earlier to justify the tone.
    2. I asked it to generate an image using a local church as a backdrop, and it was unable to use the actual building. I assumed that if I gave it the address, it could generate an image from all the photos that exist of the church and data from Google Street View. Instead, it generated pictures of buildings that do not exist in the town.

    When I asked Gemini why it couldn’t sample the church even when I gave it the specific address, it explained:

    “My image generation capabilities are based on a massive dataset of patterns, styles, and general objects. While I “know” what a New England stone church generally looks like, I don’t have a built-in “Google Street View” that allows me to see a specific building in real-time and replicate its exact architecture. Unless a building is globally iconic (like the Eiffel Tower or the Taj Mahal), my brain treats it as a generic category rather than a specific coordinate.”

    This was surprising, since I thought an agent would be set up to tie into Google Street View or photo results when those references were included in the prompt.

    While I’m still in the early modules of the course, I have picked up my first acronym for a prompt framework: TCREI.

    • Task – What you are asking for (Create, Summarize, Translate, Evaluate, etc.)
    • Context – Provide context such as persona, audience, tone and situation.
    • References – Examples or hints as to what you expect the output to be.
    • Evaluate – Is the output correct? Does it meet your acceptance criteria? Can it be improved with more refinement? That leads to…
    • Iterate – Once you get your output, keep prompting to get the best answer possible. Repeat the previous steps with new values until you’re satisfied with the result.
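
    The framework above can be sketched in code. This is only an illustration, not something from the course: the `PromptParts` shape and the `buildPrompt` and `refine` helpers are hypothetical names. Task, Context, and References become fields of a prompt, while Evaluate and Iterate stay with the person reading the output and revising the fields.

```typescript
// Hypothetical shape for the drafting parts of TCREI; field names are illustrative.
interface PromptParts {
  task: string;          // Task: what you are asking for
  context?: string;      // Context: persona, audience, tone, situation
  references?: string[]; // References: examples of the expected output
}

/** Assemble the parts into one prompt string, skipping anything left empty. */
function buildPrompt(parts: PromptParts): string {
  const sections: string[] = [`Task: ${parts.task}`];
  if (parts.context) {
    sections.push(`Context: ${parts.context}`);
  }
  if (parts.references && parts.references.length > 0) {
    const refs = parts.references.map((r, i) => `${i + 1}. ${r}`).join("\n");
    sections.push(`References:\n${refs}`);
  }
  return sections.join("\n\n");
}

/** Iterate: return a revised copy, leaving the earlier attempt untouched. */
function refine(parts: PromptParts, changes: Partial<PromptParts>): PromptParts {
  return { ...parts, ...changes };
}

// First attempt: task and context only.
const firstAttempt: PromptParts = {
  task: "Draft a short email to a direct report about a missed deadline.",
  context: "You are a supportive manager; keep the tone constructive.",
};

// After evaluating the output by hand, iterate by adding a reference example.
const secondAttempt = refine(firstAttempt, {
  references: ["Thanks for the update. Let's talk through what got in the way."],
});

console.log(buildPrompt(secondAttempt));
```

    Keeping each attempt as its own value makes it easy to compare what changed between iterations when deciding whether the output improved.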