Learning Objectives
By completing this worksheet, you will be able to:
- Create frequency distribution tables from pharmaceutical data
- Calculate measures of central tendency with step-by-step reasoning
- Apply Excel functions for statistical analysis
- Interpret results in pharmaceutical manufacturing context
- Understand data organization and presentation principles
๐ญ Real-World Context
You're working as a Quality Control analyst at a pharmaceutical company. Your supervisor has asked you to analyze recent batch data from tablet production to ensure compliance with USP standards and identify any trends that might affect product quality.
Part 1: Frequency Distribution Workshop
Understanding Frequency Distributions
Let's think step-by-step about why frequency distributions matter in pharmaceutical manufacturing:
๐ง Step-by-Step Reasoning: Why Frequency Distributions?
๐ Scenario: Tablet Weight Analysis
Below is weight data (in mg) from 30 tablets of acetaminophen 500 mg from Batch #TX2024-001:
Tablet Weight Data (mg) - Batch TX2024-001 | |||||
---|---|---|---|---|---|
548.2 | 552.1 | 549.8 | 551.3 | 550.7 | 549.1 |
550.9 | 548.7 | 552.8 | 549.5 | 551.1 | 550.3 |
549.7 | 551.8 | 550.1 | 548.9 | 552.3 | 549.4 |
550.6 | 549.2 | 551.5 | 550.8 | 548.5 | 552.0 |
549.9 | 551.2 | 550.4 | 549.6 | 551.7 | 550.0 |
๐ง Step-by-Step: Creating a Frequency Distribution
Range = Maximum value - Minimum value
Range = 552.8 - 548.2 = 4.6 mg
Using Sturges' Rule: k = 1 + 3.322 ร logโโ(n)
k = 1 + 3.322 ร logโโ(30) = 1 + 3.322 ร 1.477 = 5.9 โ 6 classes
Class Width = Range รท Number of Classes
Class Width = 4.6 รท 6 = 0.77 mg โ 0.8 mg (rounded up for convenience)
Starting from 548.0 mg with width 0.8 mg
๐ Key Formulas
๐ป Excel Implementation
Range: =MAX(data_range) - MIN(data_range)
Logโโ: =LOG10(COUNT(data_range))
Sturges' Rule: =1 + 3.322*LOG10(COUNT(data_range))
๐ฅ Exercise 1.1: Complete the Frequency Distribution
Using the tablet weight data above, complete the frequency distribution table:
Class Interval (mg) | Frequency | Relative Frequency (%) | Cumulative Frequency |
---|---|---|---|
548.0 - 548.8 | ? | ? | ? |
548.8 - 549.6 | ? | ? | ? |
549.6 - 550.4 | ? | ? | ? |
550.4 - 551.2 | ? | ? | ? |
551.2 - 552.0 | ? | ? | ? |
552.0 - 552.8 | ? | ? | ? |
Your Work Space:
Show your step-by-step counting process here...
Part 2: Central Tendency Calculations
Mean (Arithmetic Average)
๐ง Step-by-Step: Calculating the Mean
The mean is the balance point of our data - it tells us the typical tablet weight
Sum = 548.2 + 552.1 + 549.8 + ... + 550.0
Sum = 16,508.7 mg
Mean = Sum รท n = 16,508.7 รท 30 = 550.29 mg
The average tablet weight is 550.29 mg, which is close to our target of 550 mg
๐ Mean Formula
Where: ฮฃx = sum of all values, n = number of observations
๐ป Excel Function
=AVERAGE(A1:F5) where A1:F5 contains your tablet weight data
Alternative: =SUM(A1:F5)/COUNT(A1:F5)
Median (Middle Value)
๐ง Step-by-Step: Finding the Median
548.2, 548.5, 548.7, 548.9, 549.1, 549.2, 549.4, 549.5, 549.6, 549.7, 549.8, 549.9, 550.0, 550.1, 550.3, 550.4, 550.6, 550.7, 550.8, 550.9, 551.1, 551.2, 551.3, 551.5, 551.7, 551.8, 552.0, 552.1, 552.3, 552.8
With n = 30 (even number), median position = (n+1)/2 = 15.5
So we need the average of 15th and 16th values
15th value = 550.3 mg
16th value = 550.4 mg
Median = (550.3 + 550.4) รท 2 = 550.35 mg
๐ป Excel Function
=MEDIAN(A1:F5) automatically sorts and finds the middle value
Mode (Most Frequent Value)
๐ง Step-by-Step: Finding the Mode
Look for values that appear more than once in our dataset
In this dataset, each weight value appears only once
No mode exists (no value repeats) - this dataset is amodal
This suggests good manufacturing precision - weights are well-distributed
๐ป Excel Function
=MODE.SNGL(A1:F5) for single mode
=MODE.MULT(A1:F5) for multiple modes
Note: Returns #N/A if no mode exists
๐ฅ Exercise 2.1: Content Uniformity Analysis
You receive the following assay results (% of labeled claim) for 10 tablets from a different batch:
Content Uniformity Data (% of labeled claim) | ||||
---|---|---|---|---|
98.7 | 99.8 | 101.2 | 100.1 | 99.3 |
100.7 | 98.9 | 99.8 | 100.5 | 101.0 |
Calculate (show all steps):
a) Mean:
Step 1: Sum = _____________
Step 2: Mean = Sum รท n = _____ รท 10 = _____
b) Median:
Step 1: Sorted data: _________________________
Step 2: Middle positions (5th and 6th): _____ and _____
Step 3: Median = (___ + ___) รท 2 = _____
c) Mode:
Value that appears most frequently: _____
d) Excel Formulas:
Mean: _______________
Median: _______________
Mode: _______________
Part 3: Excel Functions Laboratory
Essential Statistical Functions
Basic Descriptive Statistics
=AVERAGE(range) - Arithmetic mean
=MEDIAN(range) - Middle value
=MODE.SNGL(range) - Most frequent value
=COUNT(range) - Count of numeric values
=MAX(range) - Largest value
=MIN(range) - Smallest value
Advanced Functions
=QUARTILE.INC(range, quartile) - Quartiles (0,1,2,3)
=PERCENTILE.INC(range, k) - kth percentile
=FREQUENCY(data_array, bins_array) - Frequency distribution
=COUNTIF(range, criteria) - Conditional counting
๐ง Step-by-Step: Using FREQUENCY Function
Create bins in column C: 548.8, 549.6, 550.4, 551.2, 552.0, 552.8
Select cells D2:D7 (one more cell than bins)
Type: =FREQUENCY(A1:F5,C2:C7) then press Ctrl+Shift+Enter
Each cell shows count of values in that range
๐ฅ Exercise 3.1: Excel Mastery Challenge
Using the dissolution data below (% dissolved at 30 minutes), create a complete Excel analysis:
Dissolution Data (% dissolved at 30 min) | |||||
---|---|---|---|---|---|
85.2 | 87.8 | 84.1 | 86.9 | 85.7 | 87.2 |
86.3 | 84.8 | 87.5 | 85.4 | 86.1 | 84.9 |
Task 1: Write Excel formulas for the following:
a) Mean: _______________
b) Median: _______________
c) Maximum: _______________
d) Minimum: _______________
e) Count: _______________
Task 2: Create bins for frequency analysis:
Bins (every 1%): 84, 85, 86, 87, 88
Frequency formula: _______________
Task 3: Interpretation
Based on your results, comment on the dissolution performance:
_________________________________________________