2

I am trying to build a small healthcare fact table with the following information

[patientid], [organid], [value]

Each [patientid] is unique to that patient, but there are only 10 available [organid] in the system (Heart, Left Lung, Right Lung, Liver, Pancreas, etc). So obviously, you will see many other [patientid] with the same 10 or so [organid] in this fact table.

I then want to link it to dimension tables, with each dimension table being it's own organ. So a dimension table that is all about the Heart.

However, the Heart dimension table will have data regarding the Heart for each [patientid]. So one column will be the Ejection Fraction, and you'll be able to see each [patientid] Ejection Fraction in the rows.

To do this, though, it would require a 1:1 relationship between the fact table and dimension table, since the rows would be unique to the [patientid] and the [organid]. Is that best practice for this, or should I rethink my dimension tables?

Also, one other weird question. A Double Lung will have it's own [organid] but the Lung dimension table will include Left Lung and Right Lung data, which goes against the grain. I cannot help this, the [organid] is for the double lung but there is specific information for both lungs. How could I approach that?

A. Romain
  • 21
  • 1

1 Answers1

4

Welcome to datascience stack exchange !

To design an effective schema for your healthcare use case, there are a number of issues to consider:

1. Avoiding 1:1 Relationships Between Fact and Dimension Tables

Creating a 1:1 relationship between the fact table and a dimension table is generally not recommended in dimensional modelling. Fact tables should store measurable events, while dimensions provide descriptive context. A 1:1 relationship indicates that information in the fact table might belong in the dimension table.

I would suggest merging [patientid] and [organid] into a single dimension, eg., PatientOrgan. This dimension uniquely identifies each patient-organ combination and connects to the fact table via a foreign key.

Example: PatientOrgan Dimension

| PatientOrganKey | PatientID | OrganID | OrganName  |
|------------------|-----------|---------|------------|
| 1                | 101       | 1       | Heart      |
| 2                | 102       | 1       | Heart      |
| 3                | 101       | 2       | Left Lung  |

Fact Table

| FactID | PatientOrganKey | Value  |
|--------|------------------|--------|
| 1      | 1                | 75.5   |
| 2      | 2                | 68.2   |

This approach simplifies the schema and centralises organ-level metadata in the dimension.

2. Handling "Double Lung" with Left and Right Lung Details

The Double Lung [organid] introduces a unique challenge since the corresponding dimension table must include attributes for both Left Lung and Right Lung. Two potential solutions are:

Option 1: Separate Rows for Left and Right Lung

Create distinct rows for Left Lung and Right Lung in the Lung dimension. The fact table would still reference Double Lung as a single PatientOrganKey.

Lung Dimension

| LungKey | PatientID | LungSide | LungVolume | LungCapacity |
|---------|-----------|----------|------------|--------------|
| 1       | 101       | Left     | 3.1        | 4.0          |
| 2       | 101       | Right    | 3.2        | 4.1          |

The ETL process or queries can handle aggregation for reporting.

Option 2: Composite Attributes for Double Lung

Store combined attributes for Left and Right Lung in a single row for Double Lung.

Lung Dimension

| LungKey | PatientID | LeftLungVolume | RightLungVolume | TotalVolume |
|---------|-----------|----------------|-----------------|-------------|
| 1       | 101       | 3.1            | 3.2             | 6.3         |

This method is simpler but sacrifices normalisation.

3. Organ-Specific Dimension Tables

If you want detailed attributes for each organ, consider using organ-specific dimension tables alongside a general OrganDimension.

OrganDimension

| OrganKey | OrganName    |
|----------|--------------|
| 1        | Heart        |
| 2        | Double Lung  |

HeartDimension

| OrganKey | PatientID | EjectionFraction |
|----------|-----------|------------------|
| 1        | 101       | 0.65             |

LungDimension

| OrganKey | PatientID | LeftLungVolume | RightLungVolume | TotalVolume |
|----------|-----------|----------------|-----------------|-------------|
| 2        | 101       | 3.1            | 3.2             | 6.3         |

The fact table links to the general OrganDimension, enabling flexibility and scalability.

Fact Table

| FactID | PatientID | OrganKey | Value  |
|--------|-----------|----------|--------|
| 1      | 101       | 1        | 75.5   |
| 2      | 101       | 2        | 68.2   |

Summing up:

  1. Try to avoid 1:1 relationships; instead, use a PatientOrgan dimension to link patients and organs.
  2. Handle "Double Lung" either by splitting data into Left Lung and Right Lung rows or storing composite attributes in a single dimension row.
  3. Use organ-specific dimensions to store detailed attributes while maint
Robert Long
  • 3,518
  • 12
  • 30