PGS Case vs Control Analysis
Compare polygenic score distributions between cases and controls with statistical analysis.
Configuration
Subcohort Comparison
Select a PGS ID, score type, and two cohorts to generate visualizations.
Name Your Subcohort Plot
Enter a descriptive name for this visualization or leave it blank to use the default.
Subcohort Analysis Guide
Overview
Subcohort Analysis allows you to compare polygenic score distributions between two specific cohorts. This tool generates three types of visualizations:
- Distribution Curve: Shows the overall score distribution for the combined cohorts
- Cohort Histograms: Displays frequency distributions for each cohort separately
- Odds Ratio Plot (optional): Quantifies the difference in risk between cohorts as odds ratios with 95% confidence intervals
Visualization Components
Distribution Curve
Smooth curve showing combined score distribution
Cohort Histograms
Side-by-side frequency counts for each cohort
Odds Ratio Plot
Statistical comparison of risk between cohorts
How to Use
- Select PGS ID: Choose the polygenic score you want to analyze
- Choose Score Type: Select the appropriate normalization (SUM, Z_MostSimilarPop, Z_norm1, or Z_norm2)
- Pick Two Cohorts: Select the two groups you want to compare
- Configure Options:
  - Add a custom plot title if desired
  - Enable "Normalize Bins" for better visual comparison of differently-sized cohorts
- Optional - Add Odds Ratio: Enable odds ratio analysis to quantify risk differences
- Generate Plot: Click the generate button to create your visualization
Configuration Options
Normalize Bins: When enabled, adjusts histogram bin heights to account for different cohort sizes. This makes it easier to compare distribution shapes when your two cohorts have very different sample sizes.
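As a rough illustration of what bin normalization does, the sketch below converts raw per-bin counts into proportions of each cohort, so cohorts of very different sizes can be compared on the same scale. The function names `bin_counts` and `normalize_bins` are hypothetical, and scaling to proportions is an assumption; the tool may normalize differently (for example, to a density).

```python
from bisect import bisect_right

def bin_counts(scores, edges):
    """Count scores per bin. Bins are [lo, hi); the last bin also includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for s in scores:
        if edges[0] <= s <= edges[-1]:
            i = min(bisect_right(edges, s) - 1, len(counts) - 1)
            counts[i] += 1
    return counts

def normalize_bins(counts):
    """Assumed normalization: each bin becomes that bin's share of the cohort."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)

edges = [0, 1, 2, 3]                 # shared bin edges for both cohorts
small = [0.5, 1.5, 1.6, 2.5]         # 4 individuals
large = small * 25                   # 100 individuals, same distribution shape

raw_small = bin_counts(small, edges)
raw_large = bin_counts(large, edges)
norm_small = normalize_bins(raw_small)
norm_large = normalize_bins(raw_large)
```

The raw counts differ by a factor of 25, but the normalized heights are identical, which is the point of the option: only the shape of each distribution is compared.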
Primary Application: This analysis is particularly powerful for case vs control comparisons, where you want to compare polygenic score distributions between individuals with and without a specific condition or trait.
Odds Ratio Methods Explained
What is an Odds Ratio?
An odds ratio (OR) compares the odds of an outcome between two groups. In our context, it compares the odds of having a high polygenic score in Cohort 1 with the odds in Cohort 2.
- OR = 1: Equal risk between cohorts
- OR > 1: Cohort 1 has higher risk than Cohort 2
- OR < 1: Cohort 1 has lower risk than Cohort 2
Single Threshold Method
What it does: Compares the proportion of individuals above vs below a specific score value.
How it works:
| | Score ≥ Threshold | Score < Threshold |
|---|---|---|
| Cohort 1 | A | B |
| Cohort 2 | C | D |
Odds Ratio = (A × D) / (B × C)
Best for: Testing a specific clinically relevant threshold (e.g., a risk cutoff used in practice).
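The 2×2 table and formula above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the function name `single_threshold_or` is hypothetical, and applying the ε = 0.5 correction to all four cells only when some cell is zero is an assumption about how the correction is triggered.

```python
def single_threshold_or(cohort1, cohort2, threshold, eps=0.5):
    """Odds ratio of scoring at or above `threshold`, Cohort 1 vs Cohort 2."""
    a = sum(s >= threshold for s in cohort1)   # Cohort 1, score >= threshold
    b = len(cohort1) - a                       # Cohort 1, score <  threshold
    c = sum(s >= threshold for s in cohort2)   # Cohort 2, score >= threshold
    d = len(cohort2) - c                       # Cohort 2, score <  threshold
    # Assumed rule: add eps to every cell only if some cell is empty.
    if 0 in (a, b, c, d):
        a, b, c, d = a + eps, b + eps, c + eps, d + eps
    return (a * d) / (b * c)

# A = 2, B = 2, C = 1, D = 3, so OR = (2 * 3) / (2 * 1)
example_or = single_threshold_or([1, 2, 3, 4], [0, 1, 2, 3], threshold=2.5)
```

In this toy example the odds of a score at or above 2.5 are three times higher in Cohort 1 than in Cohort 2.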
Multiple Thresholds (Cumulative)
What it does: Calculates odds ratios at multiple score thresholds, showing how risk changes across the score distribution.
How it works:
- Divides all scores into equal-sized groups (e.g., 10 groups); the boundaries between groups serve as the thresholds
- For each threshold, compares individuals above vs below that point
- Creates a series of odds ratios showing cumulative risk
Best for: Understanding how risk differences change across the entire score range, identifying optimal thresholds.
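The cumulative procedure can be sketched as follows. The sketch assumes that thresholds are the quantile cut points of the pooled scores from both cohorts and that "above" means at or above the threshold; the source does not spell out either detail. `cumulative_odds_ratios` and `odds_ratio` are hypothetical names.

```python
from statistics import quantiles

def odds_ratio(a, b, c, d, eps=0.5):
    """(A x D) / (B x C), adding eps to all cells if any is zero (assumed rule)."""
    if 0 in (a, b, c, d):
        a, b, c, d = a + eps, b + eps, c + eps, d + eps
    return (a * d) / (b * c)

def cumulative_odds_ratios(cohort1, cohort2, n_groups=10):
    """One (threshold, OR) pair per boundary between the pooled quantile groups."""
    cuts = quantiles(list(cohort1) + list(cohort2), n=n_groups)  # n_groups - 1 cut points
    results = []
    for t in cuts:
        a = sum(s >= t for s in cohort1)
        c = sum(s >= t for s in cohort2)
        results.append((t, odds_ratio(a, len(cohort1) - a, c, len(cohort2) - c)))
    return results

# Toy data: Cohort 1 is Cohort 2 shifted upward by 5 score units.
controls = list(range(20))
cases = [s + 5 for s in controls]
series = cumulative_odds_ratios(cases, controls, n_groups=4)
```

Because the toy cases are shifted upward, every odds ratio in `series` is above 1; with two identical cohorts every value would be exactly 1.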
Risk Categories vs Lowest Risk
What it does: Divides scores into risk categories and compares each category against the lowest risk group.
How it works:
- Divides all scores into equal-sized risk categories (e.g., deciles)
- Uses the lowest category (1st decile) as the reference baseline
- Compares each higher category against this baseline
Best for: Clinical interpretation, creating risk stratification systems, comparing high-risk to low-risk groups.
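A minimal sketch of the category-based comparison is shown below. It assumes categories are quantile bins of the pooled scores and that each odds ratio is built from a 2×2 table of cohort membership (Cohort 1 vs Cohort 2) in category k versus the lowest category; the exact table layout used by the tool is not stated, so treat this as one plausible reading. `category_odds_ratios` is a hypothetical name.

```python
from bisect import bisect_right
from statistics import quantiles

def category_odds_ratios(cohort1, cohort2, n_categories=10, eps=0.5):
    """OR of each risk category (2..n) against the lowest category (1)."""
    cuts = quantiles(list(cohort1) + list(cohort2), n=n_categories)

    def tally(scores):
        counts = [0] * n_categories
        for s in scores:
            counts[bisect_right(cuts, s)] += 1   # index 0 is the lowest category
        return counts

    n1, n2 = tally(cohort1), tally(cohort2)
    results = []
    for k in range(1, n_categories):
        a, b = n1[k], n2[k]      # Cohort 1 / Cohort 2 in category k
        c, d = n1[0], n2[0]      # Cohort 1 / Cohort 2 in the reference category
        if 0 in (a, b, c, d):    # assumed continuity-correction rule
            a, b, c, d = a + eps, b + eps, c + eps, d + eps
        results.append((k + 1, (a * d) / (b * c)))
    return results

controls = list(range(20))
cases = [s + 5 for s in controls]        # toy cases shifted upward
by_category = category_odds_ratios(cases, controls, n_categories=4)
```

Each tuple in `by_category` pairs a category number (starting at 2, since category 1 is the baseline) with its odds ratio relative to that baseline.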
Which Method Should You Choose?
- Single Threshold: When you have a specific, clinically meaningful cutoff value
- Multiple Thresholds: For exploratory analysis to find optimal thresholds
- Risk Categories: For clinical applications and risk stratification
Statistical Notes
- All methods include 95% confidence intervals
- Small cell counts are adjusted using continuity correction (ε = 0.5)
- A 95% confidence interval that includes 1.0 indicates that the difference between cohorts is not statistically significant at the 5% level
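To make these notes concrete, the sketch below computes a 95% confidence interval with the standard log-odds (Woolf) method and checks whether it includes 1.0. The guide does not say which interval method the tool uses, so the Woolf method is an assumption here, as is applying ε = 0.5 to all cells only when one is zero. `odds_ratio_ci` and `is_significant` are hypothetical names.

```python
import math

def odds_ratio_ci(a, b, c, d, eps=0.5, z=1.96):
    """Return (OR, lower, upper) using the Woolf log-odds interval (assumed method)."""
    if 0 in (a, b, c, d):                      # assumed continuity-correction rule
        a, b, c, d = a + eps, b + eps, c + eps, d + eps
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of ln(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

def is_significant(lower, upper):
    """True when the interval excludes 1.0."""
    return not (lower <= 1.0 <= upper)

# Cohort 1: 20 above / 80 below; Cohort 2: 10 above / 90 below
weak = odds_ratio_ci(20, 80, 10, 90)
# Cohort 1: 40 above / 60 below; Cohort 2: 10 above / 90 below
strong = odds_ratio_ci(40, 60, 10, 90)
```

In the first table the point estimate is above 1 but the interval still spans 1.0, so the difference would be reported as non-significant; in the second table the whole interval lies above 1.0.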