IIT Madras BS Descriptive Statistics

This document explains the key topics in the IIT Madras BS Descriptive Statistics PDF, with examples, questions, and step-by-step solutions[1].


1. Introduction to Statistics

Statistics is the science of collecting, organizing, analyzing, interpreting, and presenting data.

Key Concepts

  • Population: All elements of interest (e.g., all houses in Tamil Nadu).
  • Sample: A subset of the population (e.g., 1000 houses from Tamil Nadu).
  • Descriptive Statistics: Summarizing and describing data.
  • Inferential Statistics: Drawing conclusions about a population from sample data.

Example:
A teacher wants to know the average marks of all students in a school. She collects a sample of students and calculates their average. If she uses this to estimate the school average, she is using inferential statistics.


2. Data Types and Organization

2.1 Unstructured vs. Structured Data

  • Unstructured Data: Not organized (e.g., social media posts).
  • Structured Data: Organized (e.g., a table of student records).

Example Table:

Name     Gender  Date of Birth  Marks  Board
Anjali   F       17 Feb, 2003   484    State
Pradeep  M       3 June, 2002   514    ICSE

2.2 Classification of Data

  • Categorical Data: Labels (e.g., Gender, Board).
  • Numerical Data: Numbers (e.g., Marks).

Scales of Measurement:

  • Nominal: Categories (e.g., Gender).
  • Ordinal: Ordered categories (e.g., ratings: poor, good, excellent).
  • Interval: Numerical, no true zero (e.g., temperature in °C).
  • Ratio: Numerical, true zero (e.g., height, weight).

3. Describing Categorical Data

3.1 Frequency Distribution

Counts of each category.

Example:
Data: A, A, B, C, A, D, A, B, D, C

Category  Frequency
A         4
B         2
C         2
D         2

3.2 Relative Frequency

Frequency divided by total observations.

Category  Frequency  Relative Frequency
A         4          0.4
B         2          0.2
C         2          0.2
D         2          0.2
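
The two tables above can be reproduced with a short sketch. It uses Python's standard `collections.Counter`; the variable names are illustrative and not part of the course material.

```python
from collections import Counter

# Data from the example above.
data = ["A", "A", "B", "C", "A", "D", "A", "B", "D", "C"]

# Frequency: how many times each category occurs.
frequency = Counter(data)

# Relative frequency: frequency divided by the total number of observations.
total = len(data)
relative_frequency = {
    category: count / total for category, count in frequency.items()
}

for category in sorted(frequency):
    print(category, frequency[category], relative_frequency[category])
```

The relative frequencies always sum to 1, which is a quick sanity check on any such table.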

3.3 Charts

  • Pie Chart: Shows proportions.
  • Bar Chart: Shows counts.
  • Pareto Chart: Bar chart sorted by frequency.

Pie Chart Example:
A: 40%, B: 20%, C: 20%, D: 20%


4. Describing Numerical Data

4.1 Types of Variables

  • Discrete: Countable (e.g., number of people).
  • Continuous: Measurable (e.g., height, weight).

4.2 Organizing Data

  • Frequency Table: For discrete data.
  • Class Intervals: For continuous data.

Example:
Marks obtained by 50 students:

Class Interval  Frequency
30-40           3
40-50           6
50-60           18
60-70           17
70-80           4
80-90           2

4.3 Stem-and-Leaf Diagram

Shows individual data points.

Example:
Ages: 15, 22, 29, 36, 31, 23, 45, 10, 25, 28, 48

Stem  Leaf
1     0 5
2     2 3 5 8 9
3     1 6
4     5 8
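
As a minimal sketch, the diagram above can be built by splitting each age into its tens digit (stem) and units digit (leaf). The helper name `stem_and_leaf` is hypothetical, and the sketch assumes non-negative two-digit values like those in the example.

```python
from collections import defaultdict

# Ages from the example above.
ages = [15, 22, 29, 36, 31, 23, 45, 10, 25, 28, 48]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    return dict(groups)


diagram = stem_and_leaf(ages)
for stem, leaves in diagram.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```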

5. Measures of Central Tendency and Dispersion

5.1 Mean

Sum of values divided by number of values.

Example:
Data: 2, 12, 5, 7, 6, 7, 3
Mean = (2+12+5+7+6+7+3)/7 = 6

5.2 Median

Middle value in ordered data.

Example:
Ordered data: 2, 3, 5, 6, 7, 7, 12
Median = 6 (4th value)

5.3 Mode

Most frequent value.

Example:
Data: 2, 12, 5, 7, 6, 7, 3
Mode = 7 (appears twice)
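
The three measures above can be checked with Python's standard `statistics` module. This is only an illustrative sketch; the course material itself does not prescribe any software.

```python
import statistics

# Data from the examples above.
data = [2, 12, 5, 7, 6, 7, 3]

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```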

5.4 Range

Difference between maximum and minimum.

Example:
Data: 1, 2, 3, 4, 5
Range = 5 - 1 = 4

5.5 Variance and Standard Deviation

Measure of spread.

Example:
Data: 68, 79, 38, 68, 35, 70, 61, 47, 58, 66
Mean = (68+79+38+68+35+70+61+47+58+66)/10 = 59

Variance = Σ(xi - mean)² / n = 1898 / 10 = 189.8
Standard Deviation = √189.8 ≈ 13.78
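
The calculation above divides by n, which is the population form of the variance. The sketch below works through the same steps and cross-checks them against `statistics.pvariance`. The sample variance, which divides by n - 1, is included only for comparison and is not part of the example above.

```python
import math
import statistics

# Data from the example above.
data = [68, 79, 38, 68, 35, 70, 61, 47, 58, 66]

n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean.
squared_deviations = sum((x - mean) ** 2 for x in data)

# Population variance (divide by n) and its square root.
variance = squared_deviations / n
std_dev = math.sqrt(variance)

# Cross-check against the standard library's population variance.
assert math.isclose(variance, statistics.pvariance(data))

# For comparison only: sample variance divides by n - 1 instead of n.
sample_variance = squared_deviations / (n - 1)

print(mean, squared_deviations, variance, round(std_dev, 2))
```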


6. Percentiles and Quartiles

6.1 Percentiles

Value below which a given percentage of data falls.

Example:
Data: 35, 38, 47, 58, 61, 66, 68, 68, 70, 79
25th percentile:
n = 10, p = 0.25
np = 2.5 is not an integer, so round up → 3rd value = 47

6.2 Quartiles

  • Q1: 25th percentile
  • Q2: 50th percentile (median)
  • Q3: 75th percentile

Example:
From above, Q1 = 47, Q2 = (61+66)/2 = 63.5, Q3 = 68
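
The rule used in these examples can be sketched as a small function. The round-up step comes from the 25th percentile example; averaging the np-th and (np+1)-th values when np is a whole number is inferred from the Q2 calculation above. The function name `percentile` and its integer `percent` parameter are hypothetical, and other textbooks and libraries use different percentile conventions.

```python
def percentile(sorted_data, percent):
    """Percentile by the np rule, with percent given as an integer from 1 to 99."""
    if not 0 < percent < 100:
        raise ValueError("percent must be between 1 and 99")
    n = len(sorted_data)
    # n * percent / 100 split into whole part and remainder, using exact integers.
    whole, remainder = divmod(n * percent, 100)
    if remainder == 0:
        # np is a whole number: average the np-th and (np+1)-th values.
        return (sorted_data[whole - 1] + sorted_data[whole]) / 2
    # np is not a whole number: round up to the next position (1-based).
    return sorted_data[whole]


# Sorted data from the example above.
data = [35, 38, 47, 58, 61, 66, 68, 68, 70, 79]

q1 = percentile(data, 25)
q2 = percentile(data, 50)
q3 = percentile(data, 75)
print(q1, q2, q3)
```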


7. Association Between Variables

7.1 Categorical Variables

  • Contingency Table: Shows counts for combinations of categories.
  • Stacked Bar Chart: One bar per category of the first variable, split into segments showing the counts for each category of the second variable.

Example:
Gender vs. Smartphone Ownership

Gender  No  Yes  Total
Female  10  34   44
Male    14  42   56
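
A contingency table like this can be held as a nested dictionary. The sketch below recomputes the row totals and adds row-wise proportions, which make the two groups easier to compare. The proportions and column totals are an added illustration and do not appear in the table above.

```python
# Counts from the contingency table above.
table = {
    "Female": {"No": 10, "Yes": 34},
    "Male": {"No": 14, "Yes": 42},
}

# Row totals: number of respondents of each gender.
row_totals = {gender: sum(counts.values()) for gender, counts in table.items()}

# Column totals: number of respondents giving each answer.
column_totals = {
    answer: sum(counts[answer] for counts in table.values())
    for answer in ("No", "Yes")
}

# Row-wise proportions: share of each answer within a gender.
row_shares = {
    gender: {answer: count / row_totals[gender] for answer, count in counts.items()}
    for gender, counts in table.items()
}

for gender, shares in row_shares.items():
    print(gender, row_totals[gender], round(shares["Yes"], 3))
```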

8. Association Between Numerical Variables

8.1 Scatter Plot

Shows relationship between two numerical variables.

Example:
Age vs. Height

Age (years)  Height (cm)
1            75
2            85
3            94
4            101
5            108

9. Counting Principles

9.1 Addition Rule

If you can do A in n₁ ways or B in n₂ ways, and cannot do both at once, total = n₁ + n₂.

Example:
Choose a shirt (4 options) or a pant (3 options) → 4 + 3 = 7 ways

9.2 Multiplication Rule

If you can do A in n₁ ways and then B in n₂ ways, total = n₁ × n₂.

Example:
Choose a shirt and a pant → 4 × 3 = 12 ways
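
Both rules can be checked by listing the choices explicitly. In the sketch below the item labels (`S1`, `P1`, and so on) are made up for illustration; only the counts of 4 shirts and 3 pants come from the examples above.

```python
from itertools import product

# Hypothetical labels for the 4 shirts and 3 pants in the examples above.
shirts = ["S1", "S2", "S3", "S4"]
pants = ["P1", "P2", "P3"]

# Addition rule: pick exactly one item, either a shirt or a pant.
single_items = shirts + pants

# Multiplication rule: pick one shirt and one pant together.
outfits = list(product(shirts, pants))

print(len(single_items), len(outfits))
```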


10. Factorial, Permutation, and Combination

10.1 Factorial

n! = n × (n-1) × … × 1

Example:
5! = 5 × 4 × 3 × 2 × 1 = 120

10.2 Permutation

Arrangement of r objects from n:
nPr = n! / (n-r)!

Example:
Arrange 3 books from 5: 5P3 = 60

10.3 Combination

Selection of r objects from n:
nCr = n! / (r!(n-r)!)

Example:
Choose 2 students from 4: 4C2 = 6
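
The three formulas in this section translate directly into code. The helper names `n_p_r` and `n_c_r` are hypothetical; the sketch also compares them with `math.perm` and `math.comb`, which are available in Python 3.8 and later.

```python
import math


def n_p_r(n, r):
    """Permutations: n! / (n-r)!"""
    return math.factorial(n) // math.factorial(n - r)


def n_c_r(n, r):
    """Combinations: n! / (r!(n-r)!)"""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


five_factorial = math.factorial(5)
books = n_p_r(5, 3)     # arrange 3 books from 5
students = n_c_r(4, 2)  # choose 2 students from 4

# Cross-check against the standard library (Python 3.8+).
assert books == math.perm(5, 3)
assert students == math.comb(4, 2)

print(five_factorial, books, students)
```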


11. Example Questions and Solutions

Q1: Find the mean, median, and mode of: 10, 20, 30, 40, 50, 60, 70, 80

Solution:

  • Mean: (10+20+30+40+50+60+70+80)/8 = 45
  • Median: (40+50)/2 = 45
  • Mode: No mode (all unique)

Q2: A class has 60 students. In how many ways can a captain and vice-captain be chosen?

Solution:
Order matters → Permutation
60P2 = 60 × 59 = 3540


Q3: From a pack of 52 cards, how many ways to choose 4 cards of the same suit?

Solution:

  • Choose suit: 4C1 = 4
  • Choose 4 cards from suit: 13C4 = 715
  • Total: 4 × 715 = 2860
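
The answers to Q2 and Q3 can be verified by brute force, counting every possibility instead of using the formulas. This is a sketch for checking purposes only; the suit names and the 1 to 13 rank encoding are illustrative choices.

```python
from itertools import combinations, permutations

# Q2: ordered (captain, vice-captain) pairs from 60 students.
students = range(60)
captain_pairs = sum(1 for _ in permutations(students, 2))

# Q3: 4-card hands in which every card has the same suit.
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for suit in suits for rank in range(1, 14)]
same_suit_hands = sum(
    1
    for hand in combinations(deck, 4)
    if len({suit for _, suit in hand}) == 1
)

print(captain_pairs, same_suit_hands)
```

Enumerating all 4-card hands takes a moment because there are 270,725 of them, which is why the counting formulas are preferred for larger problems.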

Q4: If nC2 = nC3, find n.

Solution:
nC2 = nC3
n! / (2!(n-2)!) = n! / (3!(n-3)!)
3!(n-3)! = 2!(n-2)!
6(n-3)! = 2(n-2)(n-3)!
n - 2 = 3
n = 5
Check: 5C2 = 10 = 5C3
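
The result for Q4 can be confirmed with a quick search over candidate values of n. The search range of 3 to 100 is an arbitrary choice for illustration; values below 3 are skipped because nC3 counts no selections there.

```python
import math

# Find every n in the search range where nC2 equals nC3 (math.comb needs Python 3.8+).
solutions = [n for n in range(3, 101) if math.comb(n, 2) == math.comb(n, 3)]

print(solutions)
```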


Summary Table

Topic                Example/Formula       Solution/Explanation
Mean                 (2+12+5+7+6+7+3)/7    6
Median               2,3,5,6,7,7,12        6
Mode                 2,3,5,6,7,7,12        7
Variance             Σ(xi-mean)² / n       189.8
Permutation          5P3                   60
Combination          4C2                   6
Factorial            5!                    120
Addition Rule        Shirt or pant         4 + 3 = 7
Multiplication Rule  Shirt and pant        4 × 3 = 12


Citations: [1] https://ppl-ai-file-upload.s3.amazonaws.com/web/direct-files/attachments/67794165/d3930e0b-ab93-4fa6-864b-d0fcf1afbe2a/S1_VOL1_DESCRIPTIVE_STATISTICS.pdf

