How to Prepare Statistics for SSC CGL Tier II?

With the announcement of the **SSC CGL 2017 notification**, the exam dates for SSC CGL Tier II have been released. The exam is scheduled from January 18 to January 20, 2018. With only 30 days in hand, it’s time to roll up your sleeves and get into the preparation for SSC CGL Tier II.

Around 4733 vacancies will be filled up this year through **SSC CGL**. Tier I went through a lot of changes and this is nerve-wracking for the candidates. **Check Changes introduced in SSC CGL 2017**

To boost your confidence and amp up your preparation for Statistics for SSC CGL Tier 2, we have come up with some of the best preparation strategies. In the following article, you will get to know tips for Descriptive Statistics, viz. Central Tendency & Dispersion, Mean, Median & Mode, Skewness & Kurtosis, etc. And here you go!

To describe and represent a set of data by a single number, we need measures of central tendency, which are intended to describe the centre of the data and the typical performance of the group. Comparing them also gives an idea of the shape and nature of the distribution. Measures of central tendency include:

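**Mean:** The sum of all the observations divided by the total number of observations. The simple steps to calculate the mean of data are:

- Find the mean of 1, 2, 3, 4, 5, 6, 7, 8, 9
- Add all the observations: 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = 45
- Count the observations: n = 9
- Divide the sum by the count: 45/9 = 5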
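The same calculation can be checked in a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not part of the syllabus.

```python
from statistics import mean

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Mean = sum of all observations / number of observations
total = sum(data)            # 45
count = len(data)            # 9
manual_mean = total / count

print(manual_mean)                   # 5.0
# The standard library agrees with the manual calculation
print(mean(data) == manual_mean)     # True
```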

**Median:** The median is the middle value when the observations are arranged in numerical order. If the total number of observations n is odd, the median is the ((n+1)/2)th observation.

If the total number of observations is even, the median is the mean of the two middle numbers, i.e. the (n/2)th and ((n/2)+1)th observations.

- Find the median of 5, 6, 11, 10, 4, 9, 7
- Arrange the numbers in increasing order.
- 4, 5, 6, 7, 9, 10, 11
- Check the middle number.
- 4, 5, 6, __7__, 9, 10, 11
- Here, 7 is the middle number and is the median of the given data.

- Find the median of 5, 17, 15, 3, 9, 18, 6, 10
- When the total number of observations is even, the median is calculated by taking the mean of the two middle numbers.
- Arrange the data series in ascending order.
- 3, 5, 6, __9__, __10__, 15, 17, 18
- Calculate the mean of the two middle numbers.
- (9+10)/2 = 9.5
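Both median examples can be reproduced with a small helper. The sketch below is illustrative only; `median_by_hand` is a hypothetical name chosen here, and it simply follows the odd/even rule described above.

```python
def median_by_hand(values):
    """Median using the odd/even rule for ungrouped data."""
    ordered = sorted(values)      # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n+1)/2)th observation (index mid, counting from 0)
        return ordered[mid]
    # even n: mean of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median_by_hand([5, 6, 11, 10, 4, 9, 7])
even_case = median_by_hand([5, 17, 15, 3, 9, 18, 6, 10])

print(odd_case)    # 7
print(even_case)   # 9.5
```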

**Mode:** It is the value that is repeated more often than any other. If two values tie for the highest frequency, the data set has two modes and is called bimodal.

- Find the mode of 2, 6, 3, 9, 5, 6, 2, 6
- Arrange the observations in either ascending or descending order.
- Check for the number that is repeated most often.
- 2, 2, 3, 5, __6__, __6__, __6__, 9
- Here, 6 occurs three times, more than any other value, so the mode is 6.
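Counting frequencies is easy to automate. The following sketch uses a hypothetical helper, `modes`, which returns every value tied for the highest frequency, so a bimodal data set yields two values. The second data set is made up for illustration.

```python
from collections import Counter

def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)              # value -> number of occurrences
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

single_mode = modes([2, 6, 3, 9, 5, 6, 2, 6])
bimodal = modes([1, 1, 2, 2, 3])          # hypothetical data with a tie

print(single_mode)   # [6]
print(bimodal)       # [1, 2]
```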

**Relation between Mean, Median and Mode**

For a moderately skewed distribution, the following empirical relation holds approximately:

Mean – Mode = 3 (Mean – Median)
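As a quick worked illustration with hypothetical values (not taken from any exam paper), suppose Mean = 30 and Median = 28. Rearranging the relation gives the mode:

$$\text{Mode} = 3\,\text{Median} - 2\,\text{Mean} = 3(28) - 2(30) = 24$$

As a check, Mean – Mode = 30 – 24 = 6, which equals 3 × (30 – 28).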

Which measure of central tendency can be used depends on the scale of measurement of the data. Each scale of measurement satisfies one or more of the following properties: identity, magnitude, equal intervals and a minimum value of zero.

- Nominal Scale – Data that can simply be broken down into categories. The mode is the only measure that can be used.
- Ordinal Scale – Data that can be categorized and placed in order or ranking. The mode and the median may be used.
- Interval Scale – Data that can be ranked with equal intervals between values, but with no absolute zero point. The mean, median and mode can all be used.
- Ratio Scale – Data with equal intervals and a meaningful zero value, so that ratios can be compared. The mean, median and mode can all be used.

If the observations are arranged in ascending or descending order, the median divides them into two equal parts. In the same way, the given series can be divided into four, ten or a hundred equal parts.

Just as one median divides the data into two subgroups, three quartiles divide the data into 4 quarters. The three quartiles are normally represented by the points Q1, Q2 and Q3. The lower quartile is denoted by Q1 (25% of the observations lie at or below it) and the upper quartile by Q3 (75% of the observations lie at or below it). The second quartile Q2 is the same as the median.

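For grouped data, the k-th quartile (k = 1, 2, 3) is calculated as:

$$Q_k = l + \frac{\frac{kn}{4} - cf}{f} \times i$$

Where,

l = lower limit of the quartile class (the class containing the k-th quartile; for Q2 this is the median class); i = class interval

cf = total of all frequencies before the quartile class

f = frequency of the quartile class; n = total number of observations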

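Just as quartiles divide the observations into 4 quarters, deciles divide a series into 10 equal parts. The nine deciles are denoted by $D_1, D_2, D_3, \ldots, D_9$. For grouped data, the k-th decile is calculated as:

$$D_k = l + \frac{\frac{kn}{10} - cf}{f} \times i$$

Where,

l = lower limit of the decile class (the class containing the k-th decile); i = class interval

cf = total of all frequencies before the decile class

f = frequency of the decile class; n = total number of observations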

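Percentiles divide a series into 100 equal parts. The 99 percentiles are denoted by $P_1, P_2, P_3, \ldots, P_{99}$. For grouped data, the k-th percentile is calculated as:

$$P_k = l + \frac{\frac{kn}{100} - cf}{f} \times i$$

Where,

l = lower limit of the percentile class (the class containing the k-th percentile); i = class interval

cf = total of all frequencies before the percentile class

f = frequency of the percentile class; n = total number of observations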
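Quartiles, deciles and percentiles of grouped data share one interpolation pattern, so a single helper can cover all three. The sketch below assumes the usual textbook formula, value = l + ((k·n/parts – cf)/f) × i, with the symbols l, cf, f, i and n as defined above. The function name `partition_value` and the frequency table are hypothetical.

```python
def partition_value(classes, k, parts):
    """k-th partition value when the data is cut into `parts` equal parts.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    parts:   4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    n = sum(f for _, _, f in classes)     # total number of observations
    target = k * n / parts                # position k*n/parts in the ordered data
    cf = 0                                # cumulative frequency before the class
    for lower, upper, f in classes:
        if cf + f >= target:              # this class contains the partition value
            i = upper - lower             # class interval
            return lower + (target - cf) / f * i
        cf += f
    raise ValueError("k must satisfy 0 < k < parts")

# Hypothetical frequency table: (lower limit, upper limit, frequency), n = 40
table = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 10), (40, 50, 5)]

q1 = partition_value(table, 1, 4)       # first quartile
q3 = partition_value(table, 3, 4)       # third quartile
d4 = partition_value(table, 4, 10)      # fourth decile
p90 = partition_value(table, 90, 100)   # 90th percentile

print(q1, q3, d4, p90)
```

For this table, Q1 falls in the 10–20 class: 10 + ((10 – 5)/8) × 10 = 16.25. The median is obtained the same way with k = 2 and parts = 4.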

Measures of central tendency give an idea of the average, but it is also important to know how the data are clustered around or scattered away from the average. The degree to which the data spread is called dispersion. Various measures of dispersion in statistics are:

__Range__: Range is the difference between the largest and the smallest value of the series.

- Range = Largest value – Smallest Value
- Coefficient of range = (L – S)/(L + S)
- L = Largest value in the observations
- S = Smallest value in the observations
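A short sketch of both formulas in Python, using a made-up data set:

```python
data = [12, 7, 3, 18, 25, 9]    # hypothetical observations

largest = max(data)             # L = 25
smallest = min(data)            # S = 3

value_range = largest - smallest                                     # L - S
coefficient_of_range = (largest - smallest) / (largest + smallest)   # (L - S)/(L + S)

print(value_range)                      # 22
print(round(coefficient_of_range, 4))   # 0.7857
```

The range is in the same units as the data, while the coefficient of range is a unit-free ratio, which makes it usable for comparing two series.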

__Inter–Quartile Range__: It is the difference between the 3rd quartile and the 1st quartile.

- Inter–Quartile Range = $Q_3 - Q_1$
- It is also known as the range of the middle 50% of values

__Percentile Range__: It is the difference between the 90th and 10th percentile.

- Percentile Range = $P_{90} - P_{10}$
- It is also known as the range of the middle 80% of values

__Quartile Deviation__: It is half of the difference between the 3rd quartile and the 1st quartile, and is also called the semi-interquartile range. It is an absolute measure of dispersion.

- Quartile Deviation = $(Q_3 - Q_1)/2$
- Coefficient of Quartile Deviation = $(Q_3 - Q_1)/(Q_3 + Q_1)$

__Mean Deviation__: It is the average of the absolute deviations of the values from a fixed point, usually the mean. It is a calculated measure of dispersion.

- Mean Absolute Deviation = $\frac{1}{N}\sum_{i=1}^{N} |x_i - \bar{x}|$, where N = number of observations and $\bar{x}$ = mean
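These measures are simple arithmetic once the quartiles and percentiles are known. The sketch below assumes the quartile and percentile values are already given (the numbers are hypothetical) and takes the mean deviation about the mean.

```python
# Hypothetical partition values, assumed to be already computed
q1, q3 = 20, 50
p10, p90 = 10, 70

inter_quartile_range = q3 - q1                 # range of the middle 50% of values
percentile_range = p90 - p10                   # range of the middle 80% of values
quartile_deviation = (q3 - q1) / 2             # half of the inter-quartile range
coeff_quartile_deviation = (q3 - q1) / (q3 + q1)

# Mean absolute deviation about the mean for hypothetical raw data
data = [2, 4, 6, 8, 10]
mean = sum(data) / len(data)                                   # 6.0
mean_abs_deviation = sum(abs(x - mean) for x in data) / len(data)

print(inter_quartile_range, percentile_range, quartile_deviation)   # 30 60 15.0
print(mean_abs_deviation)                                           # 2.4
```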

__Standard Deviation__: It is defined as the square root of the mean of the squared deviations of individual values around their mean. If all the observations have the same value, the standard deviation is zero (0). Among the measures of dispersion, it is the least affected by fluctuations of sampling.

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N}}$$

Where,

- $\sigma$ = Standard Deviation
- $\sigma^2$ = Variance
- $\sum (x_i - \bar{x})^2$ = sum of the squares of deviations from the mean
- N = total number of observations
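The definition translates directly into code. This sketch uses the population form (dividing by N), which is assumed here to match the definition above, and compares the result with `statistics.pstdev` from the Python standard library. The data set is made up.

```python
from math import sqrt
from statistics import pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical observations
n = len(data)
mean = sum(data) / n               # 40 / 8 = 5.0

# Sum of the squares of deviations from the mean: 9+1+1+1+0+0+4+16 = 32
sum_sq_dev = sum((x - mean) ** 2 for x in data)

variance = sum_sq_dev / n          # 32 / 8 = 4.0
std_dev = sqrt(variance)           # 2.0

print(variance, std_dev)           # 4.0 2.0
print(pstdev(data))                # population standard deviation from the library
```

If every observation is the same, each deviation is zero, so the standard deviation comes out as 0.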

Measures of dispersion help to describe the width of the distribution, but they don’t give any information about its shape. Moments are further statistics that give information about the shape of the distribution. With the r-th central moment defined as $\mu_r = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^r$, they are:

- 1st moment: Mean (describes central value). The first central moment, $\mu_1 = \frac{1}{N}\sum (x_i - \bar{x})$, is equal to zero.
- 2nd moment: Variance (describes dispersion). $\mu_2 = \frac{1}{N}\sum (x_i - \bar{x})^2$ gives information on the spread or scale of the distribution of numbers.
- 3rd moment: Skewness (describes asymmetry). $\mu_3 = \frac{1}{N}\sum (x_i - \bar{x})^3$ gives information on the Skewness of the distribution.
- 4th moment: Kurtosis (describes peakedness). $\mu_4 = \frac{1}{N}\sum (x_i - \bar{x})^4$ gives information on the Kurtosis of the distribution.
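The first four central moments can be computed with one small function. This is a sketch that assumes the usual definition of the r-th central moment, i.e. the mean of (x – mean) raised to the power r. The name `central_moment` and the data are hypothetical.

```python
def central_moment(values, r):
    """r-th central moment: mean of (x - mean)**r over all observations."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical observations, mean = 5

m1 = central_moment(data, 1)   # always 0: deviations from the mean cancel out
m2 = central_moment(data, 2)   # variance
m3 = central_moment(data, 3)   # positive here, so the long tail is on the right
m4 = central_moment(data, 4)   # used for kurtosis

print(m1, m2, m3, m4)          # 0.0 4.0 5.25 44.5
```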

Measures of dispersion tell us about the amount of variation in the data set, whereas Skewness tells us about the direction of that variation. It is a measure of the lack of symmetry; a distribution is symmetric if it looks the same to the right and the left of the centre point.

Skewness can be positive, negative or zero:

- When the values of mean, median and mode are equal, there is no Skewness.
- When mean > median > mode, Skewness will be positive.
- When mean < median < mode, Skewness will be negative.

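**Pearson’s coefficient of Skewness =**

$$Sk_P = \frac{\text{Mean} - \text{Mode}}{\sigma} = \frac{3(\text{Mean} - \text{Median})}{\sigma}$$

Where σ is the standard deviation. The second form follows from the relation Mean – Mode = 3 (Mean – Median), and it generally ranges between –3 and +3.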

**Bowley’s measure of Skewness =**

$$Sk_B = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$$

Where, $Q_1$ = first Quartile, $Q_2$ = second Quartile = Median, $Q_3$ = third Quartile

**Kelly’s measure of Skewness =**

$$Sk_K = \frac{P_{90} + P_{10} - 2P_{50}}{P_{90} - P_{10}}$$

Where,

$P_{90}$ = 90th Percentile; $P_{50}$ = 50th Percentile (Median); $P_{10}$ = 10th Percentile
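The three coefficients can be compared side by side. The sketch below assumes the commonly quoted textbook forms: Pearson's as 3(Mean – Median)/σ, Bowley's as (Q3 + Q1 – 2·Q2)/(Q3 – Q1) and Kelly's as (P90 + P10 – 2·P50)/(P90 – P10). The function names and all input values are hypothetical.

```python
def pearson_skewness(mean, median, std_dev):
    """Pearson's coefficient in its median form: 3(Mean - Median)/SD."""
    return 3 * (mean - median) / std_dev

def bowley_skewness(q1, q2, q3):
    """Bowley's quartile-based measure; q2 is the median."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)

def kelly_skewness(p10, p50, p90):
    """Kelly's percentile-based measure; p50 is the median."""
    return (p90 + p10 - 2 * p50) / (p90 - p10)

# Hypothetical summary values of a right-skewed series
sk_pearson = pearson_skewness(mean=50, median=47, std_dev=12)   # 9/12
sk_bowley = bowley_skewness(q1=20, q2=32, q3=50)                # 6/30
sk_kelly = kelly_skewness(p10=10, p50=32, p90=70)               # 16/60

print(sk_pearson, sk_bowley, round(sk_kelly, 4))
```

All three come out positive for these numbers, which agrees with the rule that mean > median indicates positive Skewness.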

Kurtosis measures the relative peakedness or flatness of a distribution compared to the normal distribution.

- When Kurtosis > 0, the peak of the curve is relatively high and the curve is called Leptokurtic. Positive Kurtosis indicates a peaked distribution with long (heavy) tails.
- When Kurtosis < 0, the curve is flat-topped and is called Platykurtic. Negative Kurtosis indicates a flat distribution with short (light) tails.
- A normal curve is neither very peaked nor very flat-topped, so it is taken as the basis for comparison. The normal curve is called Mesokurtic. For a normal distribution, the moment coefficient of kurtosis (the fourth central moment divided by the square of the variance) is equal to 3, so the signs above refer to excess kurtosis, i.e. that coefficient minus 3, which is 0 for the normal curve.
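The classification can be sketched in code. This example assumes that the "> 0" and "< 0" conditions refer to excess kurtosis, computed as the fourth central moment divided by the squared variance, minus 3 (so the normal curve scores 0). The function names and data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Moment coefficient of kurtosis minus 3 (0 for a normal curve).

    Assumes the values are not all identical (variance must be non-zero).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # variance
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

def classify(values, tol=1e-9):
    """Label a data set by the sign of its excess kurtosis."""
    k = excess_kurtosis(values)
    if k > tol:
        return "Leptokurtic"
    if k < -tol:
        return "Platykurtic"
    return "Mesokurtic"

flat = [1, 2, 3, 4, 5]                # evenly spread, no clear peak
peaked = [0, 5, 5, 5, 5, 5, 10]       # strong peak with far-out tail values

print(classify(flat))     # Platykurtic
print(classify(peaked))   # Leptokurtic
```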

The measure of Kurtosis known as the Percentile coefficient of kurtosis is:

$$\text{Kurtosis} = \frac{Q.D.}{P_{90} - P_{10}}$$

Where,

Q.D. is the semi-interquartile range, $Q.D. = \frac{Q_3 - Q_1}{2}$

$P_{90}$ = 90th Percentile; $P_{10}$ = 10th Percentile
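A small sketch of this coefficient, assuming the form Q.D./(P90 – P10) with Q.D. = (Q3 – Q1)/2. The function name and the sample values are hypothetical. For reference, the same ratio is also computed for the standard normal curve using `statistics.NormalDist` (Python 3.8+), where it comes out at about 0.263.

```python
from statistics import NormalDist

def percentile_kurtosis(q1, q3, p10, p90):
    """Percentile coefficient of kurtosis: Q.D. / (P90 - P10)."""
    quartile_deviation = (q3 - q1) / 2      # semi-interquartile range
    return quartile_deviation / (p90 - p10)

# Hypothetical partition values of a series
sample_kappa = percentile_kurtosis(q1=20, q3=50, p10=10, p90=70)

# Reference value for the standard normal curve
z = NormalDist()                            # mean 0, standard deviation 1
normal_kappa = percentile_kurtosis(
    z.inv_cdf(0.25), z.inv_cdf(0.75), z.inv_cdf(0.10), z.inv_cdf(0.90)
)

print(sample_kappa)              # 0.25
print(round(normal_kappa, 3))    # 0.263
```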

The recommended books for preparation of the SSC CGL examination 2017 are:

Name of the Book | Author | Publisher |
---|---|---|
SSC Combined Graduate Level Guide | R. Gupta's | Ramesh Publishing House |
SSC Combined Graduate Level Examination | Sanjeev Joon | Tata McGraw Hill Education Private Limited |
A Complete Guide for SSC: Graduate Level Examination | Sachchida Nand Jha | UPSC Portal Publications |
SSC Combined Graduate Level Mains Exam with Solved & Practice Sets | Arihant Experts | Arihant |
SSC Staff Selection Commission: Combined Graduate Level Previous Years' Papers With Practice Test Papers (Solved) (Paperback) | R. Gupta's | Ramesh Publishing House |

Paper 3 is held for candidates who have applied for the post of Statistical Officer. The Statistics paper will carry 100 questions for a total of 200 marks. The paper is set at graduation level on the following topics:

- **Collection and Representation of Data:** Methods of data collection, frequency distributions, etc.
- **Measures of Central Tendency:** Mean, median and mode, Partition Values: quartiles, deciles & percentiles.
- **Measures of Dispersion:** Measures of relative dispersion, Common measures: range, quartile deviations, standard deviation & mean deviation.
- **Moments, Skewness and Kurtosis:** Meaning and different Measures of skewness and kurtosis; Different types of moments and their relationship.
- **Correlation and Regression:** Scatter diagram, Simple correlation coefficient, Simple regression lines, Spearman’s rank correlation, Measures of association of attributes, Multiple regression, Multiple and partial correlations (For 3 variables only).
- **Probability Theory:** Meaning and different definitions of Probability, Compound Probability, Conditional Probability, Independent events and Bayes’ Theorem.
- **Random Variable and Probability Distributions:** Random variable, Higher moments of a random variable, Binomial, Poisson, Normal and Exponential distributions, Joint distribution of two random variables (discrete), Probability functions, Expectation and Variance of a random variable.
- **Sampling Theory:** Concept of population and sample; Sample size decisions; Parameter and statistics, Sampling and non-sampling errors; Probability and non-probability sampling techniques (simple random sampling, convenience sampling, stratified sampling, systematic sampling, multistage sampling, multiphase sampling, cluster sampling, purposive sampling and quota sampling); Sampling distribution (statement only).
- **Statistical Inference:** Properties of a good estimator, Basic concept of testing, Small sample and large sample tests, Tests based on Z, t, Chi-square and F statistic, Confidence intervals, Methods of estimation (Moments method, Maximum likelihood method, Least squares method), Testing of hypothesis, Point estimation and interval estimation.
- **Analysis of Variance:** Analysis of one way & two way classified data.
- **Time Series Analysis:** Components of time series, Determination of trend component by different methods, Measurement of seasonal variation by different methods.
- **Index Numbers:** Meaning of Index Numbers, Cost of living Index Numbers, Uses of Index Numbers, Types of index number, Different formulae, Problems in the construction of index numbers, Base shifting and splicing of index numbers.

*The article might have information for the previous academic years; please refer to the official website of the exam.