Welcome to datascience stack exchange !
To design an effective schema for your healthcare use case, there are a number of issues to consider:
1. Avoiding 1:1 Relationships Between Fact and Dimension Tables
Creating a 1:1 relationship between the fact table and a dimension table is generally not recommended in dimensional modelling. Fact tables should store measurable events, while dimensions provide descriptive context. A 1:1 relationship indicates that information in the fact table might belong in the dimension table.
I would suggest merging [patientid] and [organid] into a single dimension, eg., PatientOrgan. This dimension uniquely identifies each patient-organ combination and connects to the fact table via a foreign key.
Example:
PatientOrgan Dimension
| PatientOrganKey | PatientID | OrganID | OrganName |
|------------------|-----------|---------|------------|
| 1 | 101 | 1 | Heart |
| 2 | 102 | 1 | Heart |
| 3 | 101 | 2 | Left Lung |
Fact Table
| FactID | PatientOrganKey | Value |
|--------|------------------|--------|
| 1 | 1 | 75.5 |
| 2 | 2 | 68.2 |
This approach simplifies the schema and centralises organ-level metadata in the dimension.
2. Handling "Double Lung" with Left and Right Lung Details
The Double Lung [organid] introduces a unique challenge since the corresponding dimension table must include attributes for both Left Lung and Right Lung. Two potential solutions are:
Option 1: Separate Rows for Left and Right Lung
Create distinct rows for Left Lung and Right Lung in the Lung dimension. The fact table would still reference Double Lung as a single PatientOrganKey.
Lung Dimension
| LungKey | PatientID | LungSide | LungVolume | LungCapacity |
|---------|-----------|----------|------------|--------------|
| 1 | 101 | Left | 3.1 | 4.0 |
| 2 | 101 | Right | 3.2 | 4.1 |
The ETL process or queries can handle aggregation for reporting.
Option 2: Composite Attributes for Double Lung
Store combined attributes for Left and Right Lung in a single row for Double Lung.
Lung Dimension
| LungKey | PatientID | LeftLungVolume | RightLungVolume | TotalVolume |
|---------|-----------|----------------|-----------------|-------------|
| 1 | 101 | 3.1 | 3.2 | 6.3 |
This method is simpler but sacrifices normalisation.
3. Organ-Specific Dimension Tables
If you want detailed attributes for each organ, consider using organ-specific dimension tables alongside a general OrganDimension.
OrganDimension
| OrganKey | OrganName |
|----------|--------------|
| 1 | Heart |
| 2 | Double Lung |
HeartDimension
| OrganKey | PatientID | EjectionFraction |
|----------|-----------|------------------|
| 1 | 101 | 0.65 |
LungDimension
| OrganKey | PatientID | LeftLungVolume | RightLungVolume | TotalVolume |
|----------|-----------|----------------|-----------------|-------------|
| 2 | 101 | 3.1 | 3.2 | 6.3 |
The fact table links to the general OrganDimension, enabling flexibility and scalability.
Fact Table
| FactID | PatientID | OrganKey | Value |
|--------|-----------|----------|--------|
| 1 | 101 | 1 | 75.5 |
| 2 | 101 | 2 | 68.2 |
Summing up:
- Try to avoid 1:1 relationships; instead, use a
PatientOrgan dimension to link patients and organs.
- Handle "Double Lung" either by splitting data into Left Lung and Right Lung rows or storing composite attributes in a single dimension row.
- Use organ-specific dimensions to store detailed attributes while maint