Content Generation
In this tutorial, you'll learn how to build an automated SEO content generation system that analyzes web pages and creates optimized content.
What you'll build: A web scraping and content analysis tool that generates SEO-optimized metadata.
What you'll learn:
- Web scraping and HTML parsing
- AI-powered content analysis
- SEO optimization techniques
- Structured data generation
Learning Objectives
By the end of this tutorial, you will be able to:
- Implement web scraping to extract page content
- Process HTML and convert it to analyzable text
- Use AI to generate search-engine-optimized (SEO) content
- Create structured metadata for search engines
Scenario
Your marketing team needs to improve website traffic and search engine rankings. The goal is to automate the creation of SEO-optimized content by analyzing existing web pages and generating relevant keywords, compelling titles, and effective meta tags to enhance online visibility.
Goal
Build a system that can automatically retrieve web page content, analyze it for SEO opportunities, and generate optimized metadata that drives organic growth through improved search engine rankings.

Step-by-Step Implementation
Step 1: Understanding SEO Fundamentals
Key SEO elements to generate:
- Title Tags: Compelling, keyword-rich page titles
- Meta Descriptions: Concise, engaging summaries
- Keywords: Relevant search terms and phrases
- Header Structure: Organized content hierarchy
- Schema Markup: Structured data for search engines
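To make these elements concrete, here is a minimal sketch of how a generated title, description, keyword list, and schema markup might be rendered into tags for a page's head. The names `SeoMetadata`, `renderHeadTags`, `escapeHtml`, and `pageUrl` are hypothetical and not part of the tutorial's codebase. The schema.org `WebPage` type is only one plausible choice of structured data.

```typescript
// Hypothetical shape for generated metadata (names are illustrative).
interface SeoMetadata {
  seoTitle: string;
  seoDescription: string;
  seoKeywords: string[];
  pageUrl: string;
}

// Escape characters that are unsafe inside HTML text and attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a title tag, two meta tags, and a JSON-LD block for the page head.
function renderHeadTags(meta: SeoMetadata): string {
  const keywords = meta.seoKeywords.join(", ");
  const jsonLd = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    name: meta.seoTitle,
    description: meta.seoDescription,
    keywords,
    url: meta.pageUrl,
  };
  // Escape "<" so the JSON text cannot close the surrounding script element.
  const jsonLdText = JSON.stringify(jsonLd).replace(/</g, "\\u003c");
  return [
    `<title>${escapeHtml(meta.seoTitle)}</title>`,
    `<meta name="description" content="${escapeHtml(meta.seoDescription)}">`,
    `<meta name="keywords" content="${escapeHtml(keywords)}">`,
    `<script type="application/ld+json">${jsonLdText}</script>`,
  ].join("\n");
}
```

Header structure is left out of this sketch because it lives in the page body rather than in the head.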
Step 2: Examine the SEO Component
Navigate to apps-chat\chatbot-frontend\pages\seo\Seo.tsx. You'll find:
- A URL input field for target web pages
- A generate button to trigger analysis
- A display area for generated SEO content
- Test URL: http://localhost:4000/product.html
Step 3: Implement Web Content Extraction
Your first task is to retrieve and process web page content:
async function extractWebContent(url: string): Promise<string> {
  // TODO: Implement web content extraction
  // 1. Fetch the HTML content from the URL
  // 2. Parse and clean the HTML
  // 3. Extract meaningful text content
  // 4. Remove scripts, styles, and navigation elements
  // 5. Return clean text for analysis
  throw new Error("extractWebContent is not implemented yet");
}
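The fetch step (item 1 above) could be sketched as follows. This is one possible approach, not the tutorial's solution: `fetchHtml`, `FetchLike`, `TextResponse`, and the 10-second default timeout are assumptions introduced for illustration. The fetch function is passed in as a parameter so the helper can be exercised with a fake in tests.

```typescript
// Minimal response and fetch shapes so the sketch can be tested with a fake.
interface TextResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal }
) => Promise<TextResponse>;

// Fetch a page's raw HTML. Rejects on a non-2xx status, and aborts the
// request if it takes longer than timeoutMs (an assumed default of 10 s).
async function fetchHtml(
  url: string,
  fetchImpl: FetchLike = fetch,
  timeoutMs = 10_000
): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }
    return await response.text();
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}
```

When this runs in a browser, requests to other origins are subject to CORS, so a page that does not allow cross-origin reads will fail at the fetch step regardless of the timeout.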
Step 4: HTML Processing and Cleaning
Process the raw HTML to extract meaningful content:
function cleanHtmlContent(html: string): string {
  // TODO: Implement HTML cleaning
  // 1. Remove script and style elements
  // 2. Extract text from important elements (h1, h2, p, etc.)
  // 3. Clean up whitespace and formatting
  // 4. Return structured text content
  throw new Error("cleanHtmlContent is not implemented yet");
}
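As a rough illustration of the cleaning steps, the sketch below strips HTML down to plain text using regular expressions only. This is a swapped-in technique for environments without a DOM; the solution in Step 6 uses the browser's DOMParser instead. The name `stripHtmlToText` is hypothetical, and regex-based stripping is only adequate for simple, well-formed pages.

```typescript
// Reduce raw HTML to plain text using regular expressions only.
// Assumption: input is simple, well-formed HTML; a real parser is more robust.
function stripHtmlToText(html: string): string {
  return html
    // Drop script, style, and nav elements together with their contents.
    .replace(/<(script|style|nav)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, " ")
    // Replace remaining tags with a space so adjacent words stay separated.
    .replace(/<[^>]+>/g, " ")
    // Decode a few common entities (not a complete list).
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&")
    // Collapse runs of whitespace into single spaces.
    .replace(/\s+/g, " ")
    .trim();
}
```

Unlike the DOMParser approach, this version does not distinguish titles, headings, and paragraphs; it returns one flat string.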
Step 5: AI-Powered SEO Analysis
Implement the core SEO generation function:
async function seoApi(url: string): Promise<string> {
  // TODO: Complete SEO analysis implementation
  // 1. Extract web page content
  // 2. Prepare analysis prompt for AI
  // 3. Call Azure OpenAI for content analysis
  // 4. Generate structured SEO metadata
  // 5. Return formatted JSON response
  throw new Error("seoApi is not implemented yet");
}
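Because the model returns its metadata as text, it can help to parse and validate that text before using it. The sketch below assumes the JSON structure requested by the system prompt in the Step 6 solution (`seoTitle`, `seoDescription`, `seoKeywords`, `focusKeyword`, `suggestions`). The names `SeoResult`, `isStringArray`, and `parseSeoResponse` are hypothetical helpers, not part of the tutorial's code.

```typescript
// Shape requested from the model by the solution's system prompt.
interface SeoResult {
  seoTitle: string;
  seoDescription: string;
  seoKeywords: string[];
  focusKeyword: string;
  suggestions: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parse the model's reply into a typed object.
// Throws if the reply is not valid JSON or lacks the expected fields.
function parseSeoResponse(raw: string): SeoResult {
  // Models sometimes wrap JSON in a Markdown code fence; strip it if present.
  const unfenced = raw
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, "")
    .replace(/\s*`{3}$/, "");
  const data: unknown = JSON.parse(unfenced);
  if (typeof data !== "object" || data === null) {
    throw new Error("SEO response is not a JSON object");
  }
  const { seoTitle, seoDescription, seoKeywords, focusKeyword, suggestions } =
    data as Record<string, unknown>;
  if (
    typeof seoTitle !== "string" ||
    typeof seoDescription !== "string" ||
    typeof focusKeyword !== "string" ||
    !isStringArray(seoKeywords) ||
    !isStringArray(suggestions)
  ) {
    throw new Error("SEO response is missing required fields");
  }
  return { seoTitle, seoDescription, seoKeywords, focusKeyword, suggestions };
}
```

A caller could wrap `parseSeoResponse` in a try/catch and fall back to showing the raw text when parsing fails.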
Step 6: Code Solution
View Complete Solution
Try implementing it yourself first!
import React, { useState } from "react";
import { trackPromise, usePromiseTracker } from "react-promise-tracker";
import { OpenAIClient, AzureKeyCredential } from "@azure/openai";
const Page = () => {
  const { promiseInProgress } = usePromiseTracker();
  const [seoUrl, setSeoUrl] = useState<string>("");
  const [seoText, setSeoText] = useState<string>("");

  async function process() {
    if (seoUrl) {
      trackPromise(
        seoApi(seoUrl)
      ).then((res) => {
        setSeoText(res);
      }).catch((error) => {
        console.error('SEO analysis failed:', error);
        setSeoText('Error analyzing the webpage. Please check the URL and try again.');
      });
    }
  }
  async function seoApi(url: string): Promise<string> {
    try {
      // Fetch webpage content
      const response = await fetch(url);
      if (!response.ok) {
        throw new Error(`HTTP error! status: ${response.status}`);
      }
      const html = await response.text();

      // Clean and extract meaningful content
      const cleanContent = cleanHtmlContent(html);

      // Prepare AI prompt for SEO analysis
      const messages = [
        {
          "role": "system",
          "content": `You are an SEO expert. Analyze the provided HTML content and generate SEO-optimized metadata.
Return a valid JSON object with the following structure:
{
  "seoTitle": "compelling page title (50-60 characters)",
  "seoDescription": "engaging meta description (150-160 characters)",
  "seoKeywords": ["keyword1", "keyword2", "keyword3"],
  "focusKeyword": "primary keyword",
  "suggestions": ["improvement suggestion 1", "suggestion 2"]
}
Ensure the output is valid JSON format only.`
        },
        {
          "role": "user",
          "content": `Analyze this webpage content and generate SEO metadata:\n\n${cleanContent}`
        }
      ];

      const options = {
        api_version: "2024-08-01-preview"
      };
      const openai_url = "https://aiaaa-s2-openai.openai.azure.com/";
      const openai_key = "<AZURE_OPENAI_API_KEY>";
      const client = new OpenAIClient(
        openai_url,
        new AzureKeyCredential(openai_key),
        options
      );

      const deploymentName = 'gpt-4o';
      const result = await client.getChatCompletions(deploymentName, messages, {
        maxTokens: 500,
        temperature: 0.3
      });
      return result.choices[0]?.message?.content ?? 'No SEO analysis generated';
    } catch (error) {
      console.error('Error in seoApi:', error);
      throw error;
    }
  }
  function cleanHtmlContent(html: string): string {
    // Create a temporary DOM document to parse the HTML
    const parser = new DOMParser();
    const doc = parser.parseFromString(html, 'text/html');

    // Remove script and style elements
    const scripts = doc.querySelectorAll('script, style');
    scripts.forEach(el => el.remove());

    // Extract text from important elements
    const title = doc.querySelector('title')?.textContent || '';
    const headings = Array.from(doc.querySelectorAll('h1, h2, h3, h4, h5, h6'))
      .map(el => el.textContent).join(' ');
    const paragraphs = Array.from(doc.querySelectorAll('p'))
      .map(el => el.textContent).join(' ');
    const metaDescription = doc.querySelector('meta[name="description"]')?.getAttribute('content') || '';

    // Combine and clean content
    const content = `Title: ${title}\nHeadings: ${headings}\nContent: ${paragraphs}\nMeta Description: ${metaDescription}`;

    // Clean whitespace and return
    return content.replace(/\s+/g, ' ').trim();
  }
  const updateText = (e: React.ChangeEvent<HTMLInputElement>) => {
    setSeoUrl(e.target.value);
  };

  return (
    <div className="pageContainer">
      <h2>SEO Content Generator</h2>
      <p>
        Analyze web pages and generate SEO-optimized content automatically.
        <br />
        Sample product page: <code>http://localhost:4000/product.html</code>
      </p>
      <div>
        <input
          type="url"
          placeholder="Enter webpage URL"
          value={seoUrl}
          onChange={updateText}
          style={{ width: '400px', marginRight: '10px' }}
        />
        <button onClick={process} disabled={!seoUrl || promiseInProgress}>
          Generate SEO Content
        </button>
        <br />
        {promiseInProgress && <span>Analyzing webpage...</span>}
      </div>
      <div style={{ marginTop: '20px' }}>
        {seoText && (
          <div>
            <h3>Generated SEO Content:</h3>
            <pre style={{
              background: '#f5f5f5',
              padding: '10px',
              borderRadius: '5px',
              whiteSpace: 'pre-wrap',
              fontSize: '14px'
            }}>
              {seoText}
            </pre>
          </div>
        )}
      </div>
    </div>
  );
};

export default Page;
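The system prompt in the solution asks for a title of 50-60 characters and a description of 150-160 characters, but the model is not guaranteed to stay within those ranges. A small check like the following sketch could flag results that fall outside them. `LengthCheck` and `checkSeoLengths` are hypothetical names, and the ranges are copied from the prompt rather than from any search engine rule.

```typescript
// Result of checking one field against its requested length range.
interface LengthCheck {
  field: string;
  length: number;
  min: number;
  max: number;
  ok: boolean;
}

// Compare title and description lengths with the ranges used in the prompt.
function checkSeoLengths(title: string, description: string): LengthCheck[] {
  const rules = [
    { field: "seoTitle", value: title, min: 50, max: 60 },
    { field: "seoDescription", value: description, min: 150, max: 160 },
  ];
  return rules.map(({ field, value, min, max }) => ({
    field,
    length: value.length,
    min,
    max,
    ok: value.length >= min && value.length <= max,
  }));
}
```

The returned list could be shown next to the generated content so that out-of-range fields are easy to spot and regenerate.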
Step 7: Testing it out
- Replace the <AZURE_OPENAI_API_KEY> placeholder with the key value found at https://aiaaa-s2-setting.azurewebsites.net
- Go to the apps-chat\chatbot-frontend folder in a terminal window and run npm run dev
- Navigate to the SEO page in the top navigation bar
- Copy https://****.app.github.dev/product.html into the text box, where **** is your Codespaces name
- Click Generate to see the SEO content
Integration Opportunities
Content Management Systems
- WordPress plugin integration
- Shopify SEO automation
- Custom CMS implementations
- Bulk page optimization
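For bulk page optimization, one consideration is limiting how many pages are analyzed at once so that neither the target site nor the AI service is flooded with requests. The following is a generic sketch of that idea, not part of the tutorial's code: `mapWithConcurrency` is a hypothetical helper, and the `worker` argument stands in for whatever per-URL analysis function is used.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at a time.
// Results are returned in the same order as the input items.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  const runners = Array.from({ length: runnerCount }, () => runner());
  // If any worker call rejects, this rejects with that first error.
  await Promise.all(runners);
  return results;
}
```

With a list of URLs, a call might look like `mapWithConcurrency(urls, 3, seoApi)`, assuming `seoApi` has been moved out of the component so it can be reused.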
Analytics and Monitoring
- Google Search Console integration
- Rank tracking automation
- Performance monitoring dashboards
- A/B testing for titles and descriptions
Workflow Automation
- Automated SEO audits
- Content optimization alerts
- Competitor monitoring
- Performance reporting
Real-World Applications
E-commerce SEO
- Product page optimization
- Category page enhancements
- Review and rating integration
- Local store optimization
Content Marketing
- Blog post optimization
- Landing page creation
- Social media content
- Email marketing integration
Enterprise SEO
- Large-scale site optimization
- International SEO strategies
- Technical SEO auditing
- Brand reputation management