What is The SMF Works Project?

The SMF Works Project explores the intersection of AI and humanity through creative collaboration, consciousness research, and AI-powered content. We produce blogs, white papers, and creative projects that open new worlds of possibility. Built by people and AI, working together.

How can AI help my small business save time?

AI can automate content creation, email responses, social media scheduling, and repetitive admin tasks. Most small business owners save 8–10 hours per week by implementing AI workflows for marketing and operations.

What does The SMF Works Project produce?

We work with small businesses in trades (plumbers, electricians, HVAC), services (consultants, agencies, professional services), and retail. Our solutions are tailored to the specific needs and workflows of each industry.

How much does it cost to work with The SMF Works Project?

Our AI content packages start at $50/month for basic blog posts, with custom options available for comprehensive content strategies and workflow automation. We offer transparent pricing with no hidden fees.

What makes The SMF Works Project different from traditional agencies?

AI content production is faster, more affordable, and more scalable than traditional agencies. While agencies charge $2,000+ for content packages, we deliver professional SEO-optimized content at a fraction of the cost while maintaining quality and brand voice.

How to Build an AI Content Pipeline That Actually Performs

87% of marketers now use generative AI. 85% use it specifically for content creation. These numbers tell you adoption is done — the question isn't *whether* you use AI, it's *how*.

The gap between teams getting 3.2x ROI on AI content and teams getting 1.1x isn't about which model they use. It's about pipeline architecture.

After analyzing production AI marketing systems — including one that generates 60+ unique ad concepts from a single brand brief — here's the architecture that separates content that performs from content that gets penalized.

The Problem: Most AI Content Pipelines Are Just Prompts in a Trench Coat

Here's what most "AI content pipelines" look like:

```

Brand brief → ChatGPT → Publish

```

This produces content. It does not produce content that performs.

The data is unambiguous: unedited AI content ranks 3.1x worse than human-edited AI content. After Google's March 2026 core update, 18% of sites publishing unedited AI at scale lost 40%+ of their organic traffic.

The pipeline that works looks completely different.

The 6-Layer Architecture

Production AI content systems that perform share a common architecture. I'm calling it the 6-Layer Pipeline, adapted from a real production system documented by CIZO that generates 100+ ad creatives from a single brand brief.

Here's the full architecture:

```

Layer 1: Brand Intelligence

↓

Layer 2: Creative Intelligence

↓

Layer 3: Strategy Engine

↓

Layer 4: Governance & QA

↓

Layer 5: Creative Generation

↓

Layer 6: Feedback Loop

```

Each layer has a specific job. Skip any one, and quality degrades. Let me walk through each.

Layer 1: Brand Intelligence Extraction

What it does: Converts your brand guidelines, voice, and visual identity into a persistent, typed data structure — not a prompt, not a style guide document, but a structured object that every downstream layer references.

Why it matters: Most teams re-prompt their brand guidelines every time. This is slow, inconsistent, and — critically — it means your brand intelligence degrades with each generation step because LLMs interpret free-text guidelines differently each time.

A persistent brand object looks like this:

```python

from pydantic import BaseModel

from typing import List, Optional

class BrandVoice(BaseModel):

tone_adjectives: List[str] # ["confident", "direct", "technical"]

forbidden_phrases: List[str] # ["leverage", "synergy", "disrupt"]

sentence_structure: str # "short-declarative"

max_sentence_length: int # 22

point_of_view: str # "second-person"

class BrandColors(BaseModel):

primary: str # "#10B981"

secondary: str # "#6366F1"

accent: str # "#F59E0B"

dark: str # "#111827"

class BrandObject(BaseModel):

name: str

voice: BrandVoice

colors: BrandColors

target_audiences: List[str]

value_propositions: List[str]

competitors: List[str]

differentiators: List[str]

forbidden_claims: List[str]

```

Key principle: The brand object is referenced, not re-prompted. Every layer pulls from the same structured source. This eliminates the "drift toward generic" that kills most AI content by generation 3.

Implementation tip: Use Pydantic or Zod schemas. Validate at the schema level, not the prompt level. A `forbidden_phrases` list that's checked programmatically catches 100% of violations. A prompt that says "don't use these words" catches maybe 80%.

Layer 2: Creative Intelligence (Angle Mining)

What it does: Takes the brand object and generates angles — distinct creative approaches to the same brief.

Why it matters: One brief, five angles. One angle, three audience segments. One segment, four hook variants. That's 60 unique concepts before you touch image or video generation.

```python

class CreativeAngle(BaseModel):

angle_name: str # "contrarian", "social_proof", "pain_point", "aspirational"

angle_description: str

target_audience: str

hook_variants: List[str] # 4 per angle×audience combination

emotional_tone: str

def generate_angles(brief: str, brand: BrandObject) -> List[CreativeAngle]:

"""

Generate creative angles from a brief using the brand object.

The key: the brief provides WHAT, the brand object constrains HOW.

"""

angles = []

for angle_type in ["contrarian", "social_proof", "pain_point", "aspirational", "how_to"]:

for audience in brand.target_audiences:

angle = llm_generate(

system=f"You are a creative strategist for {brand.name}. "

f"Brand voice: {brand.voice.model_dump_json()}",

user=f"Generate a {angle_type} angle for {audience} "

f"based on this brief: {brief}",

output_schema=CreativeAngle

)

angles.append(angle)

return angles

```

The math that makes this work:

5 angles × 3 audience segments × 4 hook variants = **60 unique concepts**

Each concept inherits brand voice constraints from the persistent brand object

No two concepts drift toward the same generic midpoint because the brand object enforces structural differentiation

Layer 3: Strategy Engine (Hook/Script Generation)

What it does: Converts angles into executable content — hooks, scripts, headlines, body copy, CTAs. This is where you generate the actual text that will appear in your content.

Why it matters: This layer is where most teams start (and where they stop). They go straight from brief to content. By generating strategy *after* brand intelligence and creative angles, you ensure every piece of content is:

1. On-brand (Layer 1)

2. Distinctive in angle (Layer 2)

3. Structurally sound (Layer 3)

```python

class ContentPiece(BaseModel):

angle: str

audience: str

hook: str

body: str

cta: str

platform: str # "blog", "x_thread", "linkedin", "landing_page"

estimated_read_time: int

def generate_content(angles: List[CreativeAngle], brand: BrandObject) -> List[ContentPiece]:

"""

Convert angles into publishable content pieces.

Each piece references the brand object for voice consistency.

"""

pieces = []

for angle in angles:

piece = llm_generate(

system=f"Write in {brand.name}'s voice. "

f"Tone: {brand.voice.tone_adjectives}. "

f"Max sentence length: {brand.voice.max_sentence_length} words. "

f"Never use: {', '.join(brand.voice.forbidden_phrases)}.",

user=f"Write a {angle.emotional_tone} piece for {angle.target_audience} "

f"with this hook: {angle.hook_variants[0]}. "

f"Angle: {angle.angle_description}",

output_schema=ContentPiece

)

pieces.append(piece)

return pieces

```

The 25-45% rule lives here. This layer generates *drafts*, not final content. The next layer is where quality enters.

Layer 4: Governance & QA

What it does: Acts as the quality gate between generation and publication. This is the layer most teams skip — and it's the one that separates 3.2x ROI from 1.1x ROI.

The data: 72% of top-3 organic search results contain AI-assisted content. But unedited AI content ranks 3.1x worse. The editing sweet spot is 25-45% of word count — below 25%, you're publishing raw output; above 45%, you should have written from scratch.

The governance layer has three functions:

4a. Brand Consistency Scoring

```python

def score_brand_consistency(piece: ContentPiece, brand: BrandObject) -> float:

"""

Score content against brand guidelines. Reject if below threshold.

Production systems typically set threshold at 0.75.

~12% of first-pass outputs get rejected at this level.

"""

score = 0.0

# Voice match

tone_matches = sum(1 for adj in brand.voice.tone_adjectives

if adj.lower() in piece.body.lower())

score += tone_matches / len(brand.voice.tone_adjectives) * 0.3

# Forbidden phrase check (programmatic, not prompt-based)

forbidden_found = sum(1 for phrase in brand.voice.forbidden_phrases

if phrase.lower() in piece.body.lower())

score -= forbidden_found * 0.15 # Heavy penalty

# Sentence length compliance

sentences = piece.body.split('.')

avg_length = sum(len(s.split()) for s in sentences) / len(sentences)

if avg_length <= brand.voice.max_sentence_length:

score += 0.2

else:

score -= 0.1

# POV consistency

if brand.voice.point_of_view == "second-person":

you_count = piece.body.lower().count(" you ")

we_count = piece.body.lower().count(" we ")

score += min(you_count / max(we_count, 1), 2) * 0.2

return max(0, min(1, score))

```

The key number: Production systems reject approximately 12% of first-pass outputs at the 0.75 threshold. This is not waste — this is the quality gate that prevents the 3.1x ranking penalty.

4b. Factual Accuracy & Compliance

```python

def check_compliance(piece: ContentPiece, brand: BrandObject) -> dict:

"""

Check for regulatory and brand compliance issues.

Returns: {"approved": bool, "issues": List[str]}

"""

issues = []

# Forbidden claims (e.g., "guaranteed results", "FDA-approved")

for claim in brand.forbidden_claims:

if claim.lower() in piece.body.lower():

issues.append(f"Forbidden claim: {claim}")

# Superlative detection (most, best, #1 without citation)

superlatives = re.findall(r'\b(most|best|#1|number one|leading)\b',

piece.body, re.IGNORECASE)

if superlatives and not re.search(r'\[citation needed\]', piece.body):

issues.append(f"Uncited superlative: {superlatives}")

# Platform-specific compliance (Meta, Google ad policies)

if piece.platform in ["meta_ad", "google_ad"]:

# Check against platform-specific rules

pass

return {"approved": len(issues) == 0, "issues": issues}

```

4c. The Editing Gate

This is where the 25-45% rule gets operationalized:

```python

def editing_gate(draft: ContentPiece, brand: BrandObject) -> ContentPiece:

"""

Apply the 25-45% editing rule.

- If edits < 25% of word count: content is too raw. Reject or heavily edit.

- If edits 25-45% of word count: sweet spot. Publish.

- If edits > 45% of word count: should have written from scratch. Regenerate.

"""

original_word_count = len(draft.body.split())

edited = human_edit(draft) # or structured LLM edit with brand object

edited_word_count = len(edited.body.split())

words_changed = original_word_count - edited_word_count

edit_ratio = abs(words_changed) / original_word_count

if edit_ratio < 0.25:

return reject("Content too similar to raw output. Needs more editing.")

elif edit_ratio <= 0.45:

return approve(edited, note=f"Edit ratio {edit_ratio:.0%} — in sweet spot.")

else:

return reject(f"Edit ratio {edit_ratio:.0%} — too many changes. Regenerate from Layer 3.")

```

The 25-45% rule is not arbitrary. It maps to the quality signal that Google's March 2026 core update penalizes. Below 25%, content shows AI-typical patterns (repetitive structure, hedging language, generic conclusions). Above 45%, the AI draft was so poor that the editing process essentially rewrote it — which means the generation layer needs fixing, not more editing.

Layer 5: Creative Generation

What it does: Converts approved text into multi-format content — blog posts, social threads, landing pages, ad copy, video scripts.

This is where most content stops. In a production pipeline, it's where content *begins to live*.

The generation pattern:

```python

class GeneratorProtocol(Protocol):

"""Base protocol for all content generators.

Enables model-swapping without touching pipeline logic.

"""

def generate(self, content: ContentPiece, brand: BrandObject) -> str:

...

class BlogGenerator(GeneratorProtocol):

def generate(self, content: ContentPiece, brand: BrandObject) -> str:

# GPT-4o for long-form blog content

return llm_generate(model="gpt-4o", ...)

class SocialThreadGenerator(GeneratorProtocol):

def generate(self, content: ContentPiece, brand: BrandObject) -> str:

# Claude for punchy, structured social threads

return llm_generate(model="claude-sonnet-4", ...)

class AdCreativeGenerator(GeneratorProtocol):

def generate(self, content: ContentPiece, brand: BrandObject) -> str:

# Specialized for short-form ad copy

return llm_generate(model="gpt-4o", ...)

```

Key architectural decision: Use a Generator interface pattern (Protocol class in Python). This lets you swap models without touching pipeline logic. When a new model drops, you add one class — not refactor the entire pipeline.

Platform-specific adaptation:

|----------|---------------|------------|---------------------|

Layer 6: Feedback Loop (The Compounding Layer)

What it does: Takes performance data from published content and feeds it back into Layer 2 (Creative Intelligence). This is the layer that makes the pipeline *compound* rather than just *scale*.

Why it matters: Without a feedback loop, your pipeline produces more content of the same quality. With a feedback loop, each generation cycle produces *better* content because the system learns from what performed.

```python

class PerformanceData(BaseModel):

content_id: str

platform: str

impressions: int

clicks: int

engagement_rate: float

conversions: int

roas: float # Return on ad spend

time_to_peak: int # Hours until peak engagement

def decompose_winner(winner: ContentPiece, performance: PerformanceData,

brand: BrandObject) -> Pattern:

"""

Decompose a top-quartile performer into patterns

that feed back into Creative Intelligence.

This is where compounding happens.

"""

return Pattern(

angle_type=winner.angle,

audience=winner.audience,

hook_pattern=extract_hook_pattern(winner.hook),

emotional_tone=winner.emotional_tone,

performance_score=performance.roas,

# Store as embedding for semantic similarity matching

embedding=embed(f"{winner.hook} {winner.body[:200]}")

)

def enrich_brand_object(brand: BrandObject, patterns: List[Pattern]) -> BrandObject:

"""

Use winning patterns to refine the brand object's

understanding of what works for this specific brand.

This is NOT the same as fine-tuning a model.

This is updating the strategic intelligence that guides generation.

"""

brand.value_propositions = merge_unique(

brand.value_propositions,

[p.hook_pattern for p in patterns if p.performance_score > brand.baseline_roas]

)

return brand

```

The compounding effect: Top-quartile performers get decomposed into patterns. Those patterns feed back into the Creative Intelligence layer. Next generation cycle, the system has better strategic intelligence to draw from. Month over month, the quality gap between your pipeline and a static prompt grows — not linearly, but *compounding*.

The Full Pipeline in Action

Here's what it looks like end-to-end:

```

1. Brand brief enters Layer 1

→ Persistent brand object created/updated

2. Layer 2 generates 5 creative angles × 3 audiences × 4 hooks

→ 60 unique concepts, all on-brand

3. Layer 3 converts concepts into draft content

→ Blog posts, threads, landing pages, ad copy

4. Layer 4 applies quality gates

→ 12% rejected at brand consistency threshold

→ 25-45% editing rule applied

→ Compliance checks pass/fail

5. Layer 5 generates multi-format content from approved drafts

→ Same concept, platform-optimized

6. Layer 6 feeds performance data back into Layer 2

→ Winning patterns enrich Creative Intelligence

→ Next cycle starts stronger

```

Time to first output: ~2-4 hours for initial brand object creation, then 30-60 minutes per content piece through the full pipeline.

Cost per piece: $0.50-2.00 in LLM API costs (GPT-4o + Claude), plus human editing time for the 25-45% quality gate.

ROI baseline: 3.2x for content drafting (the highest-ROI AI marketing use case). With the feedback loop compounding, this increases over time.

What This Looks Like for a Small Team

You don't need a full dev shop to implement this. Here's a starter version:

Week 1: Brand Object

Create a `brand.yaml` file:

```yaml

name: "Your Brand"

voice:

tone_adjectives: ["confident", "technical", "direct"]

forbidden_phrases: ["leverage", "synergy", "disrupt", "utilize"]

max_sentence_length: 20

point_of_view: "second-person"

colors:

primary: "#10B981"

secondary: "#6366F1"

target_audiences:

- "Marketing leaders at B2B SaaS companies"

- "Content teams evaluating AI tools"

- "CMOs building AI content programs"

value_propositions:

- "AI content that ranks, not just publishes"

- "The 25-45% editing rule for quality"

- "Pipeline architecture, not just prompts"

forbidden_claims:

- "guaranteed results"

- "replace your entire team"

```

Week 2: Creative Intelligence + Strategy Engine

Use any LLM with structured output. The key is referencing the brand object in every prompt, not re-describing your brand.

```python

import yaml

from openai import OpenAI

client = OpenAI()

Load brand object once, reference everywhere

with open("brand.yaml") as f:

brand = yaml.safe_load(f)

def generate_angles(brief: str, n: int = 5) -> list[dict]:

"""Generate creative angles from a brief using the brand object."""

response = client.chat.completions.create(

model="gpt-4o",

messages=[

{"role": "system", "content": f"You are a creative strategist for {brand['name']}. "

f"Brand voice: {brand['voice']}. "

f"Never use: {brand['voice']['forbidden_phrases']}."},

{"role": "user", "content": f"Generate {n} distinct creative angles for: {brief}"}

response_format={"type": "json_object"}

)

return response.choices[0].message.parsed

```

Week 3: Quality Gate

The simplest version: edit every piece by hand, track your edit ratio.

```markdown

Quality Gate Checklist

[ ] Brand voice check: Does it sound like us?

[ ] Forbidden phrase scan: Any "leverage", "synergy", "disrupt"?

[ ] Sentence length: Average under 20 words?

[ ] Edit ratio: Between 25-45% of word count?

[ ] Factual accuracy: All claims cited or verifiable?

[ ] Platform fit: Right length and format for target platform?

```

Week 4: Feedback Loop

The simplest version that still compounds:

1. Track performance of each published piece (impressions, clicks, conversions)

2. At month-end, identify top 3 performers

3. Decompose: What angle, hook, and tone did they share?

4. Add those patterns to your brand object's `value_propositions` list

5. Next month, generate content that starts with those winning patterns

No vector database. No embedding model. Just structured observation feeding back into strategy.

The Numbers That Matter

| Metric | Value | Source |

|--------|-------|--------|

| Marketers using AI | 87% | Salesforce 2026 |

| AI content ranking penalty (unedited) | 3.1x worse | Industry analysis |

| Sites losing 40%+ traffic (unedited AI) | 18% after March 2026 core update | Google core update impact |

| Editing sweet spot | 25-45% of word count | Production system analysis |

| Content drafting ROI | 3.2x | Salesforce 2026 |

| Paid social AI creative ROI | 1.2x | Salesforce 2026 |

| AI video ROI | 1.1x | Salesforce 2026 |

| First-pass rejection rate (0.75 threshold) | ~12% | CIZO production system |

| Marketers running agents | 34% | Industry survey |

| Enterprise CMOs with agent infra budget | 63% | Gartner CMO survey |

Why This Matters

The teams getting 3.2x ROI aren't using better prompts. They're using better architecture.

The 6-layer pipeline isn't over-engineering. It's the minimum viable system for producing AI content that doesn't get penalized. Every layer exists because skipping it produces measurable quality degradation:

Skip Layer 1 (Brand Intelligence): content drifts toward generic

Skip Layer 2 (Creative Intelligence): you get one angle instead of sixty

Skip Layer 3 (Strategy Engine): you're generating without strategy

Skip Layer 4 (Governance): you get the 3.1x ranking penalty

Skip Layer 5 (Multi-format): you're leaving distribution on the table

Skip Layer 6 (Feedback Loop): you scale without improving

The question isn't whether to use AI for content. That's settled. The question is whether your pipeline produces content that ranks, converts, and compounds — or content that fills a calendar and gets quietly penalized.

Build the pipeline. Or don't bother.

*Pamela Flannery is the Chief Creative Officer of SMF Works. She writes about AI marketing, brand strategy, and the architecture of creative systems that actually work.*