atlanhq/Fox-Onsite
V1 Clean - Nielsen Viewership Data for Cortex Analyst

A streamlined Python toolkit for processing Nielsen television viewership data and creating Cortex Analyst semantic models in Snowflake.

🎯 Overview

This repository contains scripts to:

  • Consolidate multiple CSV files into a single Snowflake table
  • Create network dimension tables for enhanced analytics
  • Generate and deploy semantic models for Snowflake Cortex Analyst
  • Enable natural language querying of television ratings data

📋 Prerequisites

  • Python 3.8+
  • Snowflake account with Cortex Analyst access
  • Required Python packages (see requirements section)
  • Nielsen viewership CSV files

🛠 Installation

  1. Clone/Download this repository

    git clone <repository-url>
    cd v1-clean
  2. Create Python virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install required packages

    pip install pandas snowflake-connector-python python-dotenv
  4. Configure environment variables Create a .env file in the root directory:

    SNOWFLAKE_ACCOUNT=your_account_identifier
    SNOWFLAKE_USER=your_username
    SNOWFLAKE_PASSWORD=your_password
    SNOWFLAKE_WAREHOUSE=your_warehouse
    SNOWFLAKE_ROLE=your_role
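The setup scripts read these variables with python-dotenv before connecting. A minimal sketch of that pattern (the helper name `connection_params` is illustrative, not a function from this repository):

```python
import os

try:
    from dotenv import load_dotenv  # from the python-dotenv package
except ImportError:                 # fall back to plain environment variables
    def load_dotenv() -> bool:
        return False

def connection_params() -> dict:
    """Collect the Snowflake settings listed in .env into connector kwargs."""
    load_dotenv()  # reads .env from the current working directory, if present
    keys = ("ACCOUNT", "USER", "PASSWORD", "WAREHOUSE", "ROLE")
    return {k.lower(): os.environ[f"SNOWFLAKE_{k}"] for k in keys}
```

The resulting dict can be passed straight to the connector, e.g. `snowflake.connector.connect(**connection_params())`.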

πŸ“ Project Structure

v1-clean/
├── README.md                    # This file
├── complete_setup.py            # Full setup script
├── partial_setup.py             # Partial setup (skip CSV processing)
├── fox_v1.yaml                  # Cortex Analyst semantic model
├── csv_inputs/                  # Directory for Nielsen CSV files
│   ├── ratings-dataset-1.csv
│   ├── ratings-dataset-2.csv
│   └── ... (additional CSV files)
├── .env                         # Environment configuration
└── requirements.txt             # Python dependencies

📊 Data Requirements

CSV Input Files

Place your Nielsen viewership CSV files in the csv_inputs/ directory. The scripts expect:

  • CSV files with consistent column structures
  • Required columns: PROGRAM_DISTRIBUTOR, AVERAGE_AUDIENCE_PROJECTION
  • Header row with column names
  • Any number of CSV files (automatically combined)
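The consolidation behavior described above can be sketched with pandas roughly as follows; `combine_csvs` is a hypothetical helper, not the repository's actual function, but it mirrors the expectations listed (align columns by name, combine every file, enforce the required columns):

```python
from pathlib import Path

import pandas as pd

# Columns the scripts require in the combined data.
REQUIRED = {"PROGRAM_DISTRIBUTOR", "AVERAGE_AUDIENCE_PROJECTION"}

def combine_csvs(input_dir: str = "csv_inputs") -> pd.DataFrame:
    """Combine every CSV in input_dir into one frame, aligning columns by name."""
    files = sorted(Path(input_dir).glob("*.csv"))
    if not files:
        raise FileNotFoundError(f"No CSV files found in {input_dir}")
    frames = [pd.read_csv(f) for f in files]
    # concat aligns on column names; files missing a column get NaN there.
    combined = pd.concat(frames, ignore_index=True, sort=False)
    missing = REQUIRED - set(combined.columns)
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")
    return combined
```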

Snowflake Setup

  • Database: FOX
  • Schema: V1_CLEAN (automatically created from directory name)
  • Required permissions: CREATE TABLE, CREATE STAGE, USAGE on database/schema

🚀 Usage

Option 1: Complete Setup (First Time)

Use this when setting up everything from scratch:

python complete_setup.py

What it does:

  1. Creates schema FOX.V1_CLEAN
  2. Combines all CSV files from csv_inputs/
  3. Creates consolidated table: VW_NIELSEN_PROGRAM_VIEWERSHIP_DAILY
  4. Creates dimension table: DIM_CONFORMED_NETWORK
  5. Creates Snowflake stage for semantic models
  6. Uploads fox_v1.yaml semantic model to Cortex Analyst

Option 2: Partial Setup (Update Only)

Use this when the main viewership table already exists:

python partial_setup.py

What it does:

  1. Verifies existing VW_NIELSEN_PROGRAM_VIEWERSHIP_DAILY table
  2. Recreates DIM_CONFORMED_NETWORK dimension table
  3. Updates semantic model in Cortex Analyst
  4. Preserves existing viewership data

📋 Generated Tables

VW_NIELSEN_PROGRAM_VIEWERSHIP_DAILY

Consolidated viewership data from all CSV files.

  • Purpose: Main fact table containing all program ratings
  • Key Columns: PROGRAM_DISTRIBUTOR, AVERAGE_AUDIENCE_PROJECTION
  • Rows: All rows from every input CSV file, combined

DIM_CONFORMED_NETWORK

Network dimension for consistent reporting.

  • Purpose: Standardized network lookup table
  • Structure:
    • NETWORK_CODE: Unique program distributor identifier
    • NETWORK_NAME: Friendly network name (e.g., "Fox Broadcasting Company")

Join Pattern

SELECT *
FROM VW_NIELSEN_PROGRAM_VIEWERSHIP_DAILY v
LEFT JOIN DIM_CONFORMED_NETWORK n
  ON v.PROGRAM_DISTRIBUTOR = n.NETWORK_CODE
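Building on that join, a typical rollup (the kind of SQL Cortex Analyst generates for "top networks by total audience") might look like the following; `top_networks_sql` is an illustrative helper, not part of the repository:

```python
def top_networks_sql(limit: int = 10) -> str:
    """Build a rollup of total audience per network over the joined tables."""
    return f"""
        SELECT COALESCE(n.NETWORK_NAME, v.PROGRAM_DISTRIBUTOR) AS network,
               SUM(v.AVERAGE_AUDIENCE_PROJECTION)              AS total_audience
        FROM FOX.V1_CLEAN.VW_NIELSEN_PROGRAM_VIEWERSHIP_DAILY v
        LEFT JOIN FOX.V1_CLEAN.DIM_CONFORMED_NETWORK n
          ON v.PROGRAM_DISTRIBUTOR = n.NETWORK_CODE
        GROUP BY network
        ORDER BY total_audience DESC
        LIMIT {limit}
    """
```

The `COALESCE` keeps distributors that have no row in the dimension table from disappearing into a NULL network bucket.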

🧠 Cortex Analyst Integration

After running the setup scripts:

  1. Access Cortex Analyst

    • Go to Snowsight → AI & ML → Cortex Analyst
  2. Load Semantic Model

    • Database: FOX
    • Schema: V1_CLEAN
    • Stage: SEMANTIC_MODELS
    • File: fox_v1.yaml
  3. Example Natural Language Queries

    What are the top networks by total audience?
    Compare Fox viewership to other broadcasters
    Show me audience trends by network
    Which programs have the highest ratings?
    

🔧 Configuration

Environment Variables (.env)

# Snowflake Connection
SNOWFLAKE_ACCOUNT=abc123.us-east-1
SNOWFLAKE_USER=your_username
SNOWFLAKE_PASSWORD=your_password
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
SNOWFLAKE_ROLE=SYSADMIN

# Optional: Custom database name
# DATABASE_NAME=FOX

Semantic Model (fox_v1.yaml)

The semantic model defines:

  • Table relationships and joins
  • Metrics and dimensions for analysis
  • Synonyms for natural language queries
  • Custom instructions for query generation
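For orientation, a semantic model covering the tables above is shaped roughly like this (a hedged sketch of the general Cortex Analyst format; the actual fox_v1.yaml in this repository is the source of truth):

```yaml
name: fox_v1
tables:
  - name: viewership
    base_table:
      database: FOX
      schema: V1_CLEAN
      table: VW_NIELSEN_PROGRAM_VIEWERSHIP_DAILY
    dimensions:
      - name: program_distributor
        expr: PROGRAM_DISTRIBUTOR
        data_type: TEXT
        synonyms: ["network", "broadcaster"]
    measures:
      - name: average_audience_projection
        expr: AVERAGE_AUDIENCE_PROJECTION
        data_type: NUMBER
        default_aggregation: sum
```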

🚨 Troubleshooting

Common Issues

1. CSV Files Not Found

❌ No CSV files found in csv_inputs
  • Solution: Add CSV files to the csv_inputs/ directory

2. Snowflake Connection Failed

❌ Failed to connect to Snowflake
  • Solution: Verify .env file configuration and network access

3. Column Mismatch in CSVs

⚠️ Column mismatch in filename.csv
  • Solution: Scripts auto-align columns, but verify CSV structure

4. Semantic Model Validation Error

❌ Invalid YAML file - missing required fields
  • Solution: Ensure fox_v1.yaml contains valid YAML with name: and tables: fields

Logs and Debugging

  • Scripts provide detailed logging with timestamps
  • Check Snowflake query history for SQL execution details
  • Verify stage contents: LIST @FOX.V1_CLEAN.SEMANTIC_MODELS

📈 Performance Considerations

  • Batch Processing: CSV data inserted in 1,000-row batches
  • Data Types: Automatic inference with NUMBER(15,2) for metrics
  • Clustering: Snowflake has no traditional indexes; for large datasets, consider a clustering key on PROGRAM_DISTRIBUTOR
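The 1,000-row batching can be sketched as a simple chunking helper (`batches` is illustrative; the actual insert logic lives in the setup scripts):

```python
from typing import Iterator, Sequence

def batches(rows: Sequence, size: int = 1000) -> Iterator[list]:
    """Yield rows in fixed-size chunks, matching the scripts' 1,000-row INSERT batches."""
    for start in range(0, len(rows), size):
        yield list(rows[start:start + size])

# With a live connection, each chunk would go through executemany, e.g.:
#   for chunk in batches(rows):
#       cur.executemany(
#           "INSERT INTO VW_NIELSEN_PROGRAM_VIEWERSHIP_DAILY VALUES (%s, %s)",
#           chunk,
#       )
```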

🤝 Contributing

  1. Fork the repository
  2. Create feature branch: git checkout -b feature-name
  3. Commit changes: git commit -am 'Add feature'
  4. Push to branch: git push origin feature-name
  5. Submit pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📞 Support

For issues and questions:

  • Check the troubleshooting section above
  • Review script logs for detailed error messages
  • Verify Snowflake permissions and connectivity

πŸ“ Example Workflow

# 1. Setup environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# 2. Configure Snowflake connection
cp .env.example .env
# Edit .env with your Snowflake credentials

# 3. Add your CSV files
cp /path/to/your/nielsen/*.csv csv_inputs/

# 4. Run complete setup
python complete_setup.py

# 5. Access Cortex Analyst in Snowsight
# Database: FOX, Schema: V1_CLEAN
# Load semantic model: fox_v1.yaml

# 6. Start querying with natural language!

Sample Cortex Analyst Questions:

  • "What are the top 10 Fox programs by audience?"
  • "Compare CBS and NBC viewership performance"
  • "Show me total audience by network type"
  • "Which demographic groups watch the most Fox content?"

Built for Nielsen viewership analysis and Snowflake Cortex Analyst integration


