AI LLM Prompting Tests - My Results on Prompt Engineering - Opinionated SEO - Digital Marketing News | Lyssna här

Send us Fan Mail

I discuss my experience testing different AI systems prompting including Google Bard, OpenAI GPT-4 / GPT 3.5, Anthropic Claude 2, Llama 2, and Jasper to generate location-specific content. Most of this is based on the last 18 months of building out prompts, and now testing on models released over the last 4-6 weeks.

Google Bard

Released major update on July 13, 2023
Prompt strategy: Long paragraphs, numbered tasks, multiple iterations
Couldn't produce high quality content without heavy editing
Issues following instructions, needing reminders

OpenAI GPT-4

Works well with conversational, transcribed prompt
Able to follow directions and produce high quality content
No need for shot prompting

OpenAI GPT-3.5

Uses revised GPT-4 prompt plus follow up to enforce formatting
Gets content production-ready after second prompt
Quality close to GPT-4 with additional data/content provided

Anthropic Claude 2

No API access, using text interface
Required revising prompt structure significantly
XML tagging of data types improves context
Built-in prompt diagnosis/suggestions helpful
Single prompt can produce high quality output

Meta Llama 2

Free to use commercially if you have the hardware
Expected behavior similar to GPT-3.5
GPT-4 prompt worked well
Quality closer to GPT-3.5 but better privacy
Could refine with prompt chaining
Issues following instructions precisely

Jasper API

Access useful for building AI tools
Long prompt length capability
Appears to use GPT-4 or variant
Zero shot performs as well as GPT-4
Able to produce high quality content easily

Conclusion

GPT-4 and Jasper produce quality results most easily
Pleasantly surprised by Claude 2 quality and formatting of prompt
Llama 2 needs refinement to reach GPT-4 level
Curious about prompt strategies working across models

Full show notes: https://opinionatedseo.com/2023/07/ai-prompting/

Rss Apple Podcaster