Hello everyone, and welcome to the next class on testing of hypotheses. Today we start non-parametric testing of hypotheses, and the first lecture is on the sign test. I am Dr. Gaur, from the School of Mathematics, Thapar Institute, India.
Feel free to contact me through either of my email IDs or through my YouTube channel link. So, first of all, what is a non-parametric test and why do we need it? As you all know, there are several tests of hypothesis that require the population to follow a normal distribution.
For example, so far we have discussed the Z test, F test, t test, ANOVA and several others. All of these require that the population be normally distributed, and some tests, ANOVA for example, also require the population variances to be equal. So what happens if, for a given test, these requirements are not met, that is, the population is not normally distributed or the population variances are not equal? To handle this we need tests that are distribution-free, and such tests are called non-parametric tests. That is why these tests are as important as the parametric tests. There are several non-parametric tests: the sign test, Wilcoxon, Mann-Whitney, Kruskal-Wallis, rank correlation, and so on.
We will cover these one by one, and in this lecture we start with the first, the sign test. It is one of the simplest non-parametric tests. As the name suggests, it is about signs, and a sign here means plus or minus.
So it is based on the signs of the deviations rather than their exact magnitudes. If my numbers are, say, 41, 39 and so on, the test does not depend on the values themselves; it depends only on whether each deviation is plus or minus. This test is used for hypotheses concerning the median of one population, so we are discussing the single-sample case.
Throughout the lecture we denote the median by the symbol η (eta). The hypothesis we test is whether the median of the population equals some specified value η0 (eta naught), so H0 is η = η0, against some alternative hypothesis H1. The alternative is either two-tailed or one-tailed, and a one-tailed alternative is either right-tailed or left-tailed. Now, how do we apply the sign test?
Assume that a sample of size n is drawn from the population and that, under H0, the median η0 is known. The procedure is to subtract η0 from each observation. For example, suppose the observations xi are 31, 43, 29, 53 and so on, and I want to test η0 = 43. We subtract 43 from each observation and write only the sign: plus if the deviation is positive, minus if it is negative, and 0 if it is zero. In this case 31 - 43 is negative, so we write minus; 43 - 43 is zero; 29 - 43 is negative; and 53 - 43 is positive. We do not focus on the value of the deviation. Alternatively, you can use this other notation: put a plus sign on the numbers that exceed 43, a minus sign on the numbers that are less than 43, and a 0 on the numbers that are equal to 43. You can use either notation. Now recall the definition of the median, since the sign test works on the median. The median is the position that divides the series into two equal parts.
What does that mean under H0? Under H0 the median is η0, so the probability that an observation lies above η0 is one half. So if H0 is true, the number of positive signs should be approximately equal to the number of negative signs, and our task is to check this. If the difference between the two counts can be attributed to chance, that is, to sampling fluctuation, we fail to reject H0, which is loosely described as accepting H0. Now look at the notation: T+ is the number of positive signs, T- is the number of negative signs, and capital T is the minimum of T+ and T-. These are computed only after discarding the zeros. To see what this means, look at this example. First, the null hypothesis H0 is as given here.
You have to compute T+, T- and T for the following data. Subtract 5 from each of the numbers and write down the sign. For example, 8 - 5 is positive and 9 - 5 is again positive.
3 - 5 is negative, 5 - 5 is 0, 4 - 5 is negative, and 11 - 5 is positive. Now count the positive signs. How many are there? 3, so T+ = 3. How many negative signs are there?
2, so T- = 2. Then capital T is the minimum of T+ and T-, so T = 2. This is how you compute T+, T- and T in each example. Now let us start with the single-sample test for a small sample.
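The counting in the example above can be sketched in Python, using the observations read out in the lecture (8, 9, 3, 5, 4, 11) and the hypothesised median 5. The function name `sign_counts` is an illustrative choice, not something from the lecture.

```python
def sign_counts(observations, eta0):
    """Count positive and negative deviations from eta0, ignoring zeros."""
    # A deviation x - eta0 is positive when x > eta0 and negative when x < eta0.
    # Observations equal to eta0 give a zero deviation and are discarded.
    t_pos = sum(1 for x in observations if x > eta0)
    t_neg = sum(1 for x in observations if x < eta0)
    # The test statistic T is the smaller of the two counts.
    return t_pos, t_neg, min(t_pos, t_neg)


t_pos, t_neg, t = sign_counts([8, 9, 3, 5, 4, 11], 5)
print(t_pos, t_neg, t)  # prints: 3 2 2
```

Only the comparison with the hypothesised median is used, so the magnitudes of the deviations never enter the calculation.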
When the sample size is less than 25, we call it a small sample for the sign test. The procedure has four steps. In step 1 we define the hypotheses H0 and H1.
We then compute T+ and T- as in the last example, take T as the minimum of T+ and T-, and finally look at the critical region. The null hypothesis H0 is always stated with equality, and the alternative is any one of the three forms.
Step 2 is to compute T+ and T-. How? Subtract η0 from each observation and, after discarding the zeros, count the positive and negative signs.
Step 3 is the test statistic T, calculated as shown here, and step 4 is the critical value. Here Tc is the critical value at the given level of significance, obtained from the table. The critical region is then T less than or equal to Tc.
If the calculated value from step 3 lies in this region, we reject H0; if it lies outside, we fail to reject H0. Let us look at some examples to explain this. The following are the responses to the question 'How many hours do you study before a major statistics test?', and we are to use the sign test.
Step 1 is to define H0 and H1. The question is whether the median number of hours students study before the test is 3. So the median η is 3; that is an equality, and whenever there is an equality we put it in H0. The alternative H1 is that η is not equal to 3, so this is a two-tailed test. The sample size n is small, so we use the small-sample procedure. Step 2: we subtract 3 from each of the numbers.
The first number gives 6 - 3, which is positive. 5 - 3 is again positive.
1 gives a negative sign, 2 gives a negative sign, and so on. We work through the data like this.
Then we count T+ and T-. How many positives are there? 1, 2, 3, 4, 5, 6, 7, 8. So T+ is 8.
How many negatives? 1, 2, 3. So T- is 3. Then we calculate T as the minimum, which is 3. That is step 3. Now the main thing is step 4. First calculate n = T+ + T-.
That is 11. Now my task is to find the value of Tc. Look at the given statement: the critical value for the sign test for n = 11 at the 5% level of significance, two-tailed, is 1. So Tc is 1.
What does this mean? The critical region is T less than or equal to 1. Now mark it on the number line and check where the calculated value lies.
Where does 3 lie? 3 is greater than 1, so it lies outside the critical region, and we fail to reject H0. And what is H0?
H0 is η = 3 and H1 is η not equal to 3. The critical region is where we reject H0; outside it we do not reject H0. Here H0 is not rejected.
So what is the conclusion? We conclude that the data are consistent with the median number of hours being 3. This is how you carry out the test. Let us look at one more example to make it clear.
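Steps 3 and 4 of the study-hours example can be sketched as follows, using the counts from the lecture (T+ = 8, T- = 3) and the critical value Tc = 1 quoted in the problem statement. The function name `small_sample_sign_test` is hypothetical and used only for illustration.

```python
def small_sample_sign_test(t_pos, t_neg, t_c):
    """Return (T, n, reject) for the small-sample sign test."""
    t = min(t_pos, t_neg)   # step 3: test statistic
    n = t_pos + t_neg       # effective sample size after discarding zeros
    reject = t <= t_c       # step 4: critical region is T <= Tc
    return t, n, reject


t, n, reject = small_sample_sign_test(t_pos=8, t_neg=3, t_c=1)
print(t, n, "reject H0" if reject else "fail to reject H0")
# prints: 3 11 fail to reject H0
```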
First, the sample size is small, n less than 25. Step 1: define H0 and H1. The statement says to use the sign test on the teacher's claim, and the teacher claims that the median time is at most 3. 'At most 3' means less than or equal to 3, so H0 is η ≤ 3, and the opposite is H1: η > 3. This is a one-tailed test, and because H1 has the greater-than sign it is right-tailed. Step 2: subtract 3 from each of the numbers. The first one is negative, the second, 2 - 3, is negative, 4 - 3 is positive, and so on. How many positive signs are there? 6. How many negative? 1, 2, 3. So T+ is 6 and T- is 3, and capital T, the minimum of these, is 3. Now n = 6 + 3 = 9, and we need the critical value for that n.
It is given in the statement that the critical value for the sign test for n = 9 at the 5% level is 1. So Tc is 1 and the critical region is T less than or equal to 1. That region is where H0 is rejected; outside it H0 is not rejected. Now see where the value 3 lies.
3 is greater than 1, so again we fail to reject H0. Hence we conclude that the teacher's claim that the median is at most 3 is not contradicted by the data. Let us look at one more example.
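The lecture takes the critical values from a table. One standard way such a table can be built, assumed here and not stated in the lecture, is from the Binomial(n, 1/2) distribution of the sign count under H0: the critical value is the largest c whose lower-tail probability (doubled for a two-tailed test) does not exceed the significance level. The function name `sign_test_critical_value` is illustrative.

```python
from math import comb


def sign_test_critical_value(n, alpha=0.05, two_tailed=True):
    """Largest c with tails * P(T <= c) <= alpha under Binomial(n, 1/2).

    Returns None when no such c exists (the sample is too small to reject).
    This construction is an assumption about how the table is built.
    """
    tails = 2 if two_tailed else 1
    cumulative = 0.0
    critical = None
    for c in range(n + 1):
        cumulative += comb(n, c) / 2 ** n  # P(T = c) when p = 1/2
        if tails * cumulative <= alpha:
            critical = c
        else:
            break
    return critical


print(sign_test_critical_value(11, 0.05, two_tailed=True))   # prints: 1
print(sign_test_critical_value(9, 0.05, two_tailed=False))   # prints: 1
```

Under this assumption the sketch reproduces the two critical values quoted so far: 1 for n = 11 (two-tailed, 5%) and 1 for n = 9 (one-tailed, 5%).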
Again the sample size is less than 25, so we use the small-sample test. What are H0 and H1? Look at the travel agency's claim.
A travel agency promoting a particular holiday advertises that the median cost in the city is 50. 'Is' means equality, so this goes in H0 rather than H1: H0 is η = 50. The opposite is η not equal to 50, so this is two-tailed. Once step 1 is done, step 2 is to subtract 50 from each number. The first number gives a positive sign, the second is positive, 49 - 50 is negative, 50 - 50 is 0, 53 is positive, then positive, then negative, then negative.
So what is T+? 4. What is T-? How many negative signs are there?
Three. Capital T is the minimum of 4 and 3, which is 3. In step 4, n is 4 + 3 = 7, and we need the critical value for it. It is given in the statement that the critical value for n = 7 at the 5% level of significance, two-tailed, is 0. Why two-tailed? Look at H1: it is the not-equal alternative. So Tc is 0 and the critical region is T less than or equal to 0. That is where H0 is rejected.
Outside it H0 is not rejected. Now see where our value lies. Since Tc is 0 and T is 3, the value 3 lies outside the critical region. So we fail to reject H0, which is loosely called accepting H0.
H0 is that the median is 50, so we conclude that the median does not differ significantly from 50. This is the simple way to carry out the sign test for a small sample.
Now let us do the large sample. What happens when the sample size is greater than 25? Start from the definition of the median.
Here P is the probability that an observation exceeds the median, which under H0 is one half. When we talk about a large sample, we can use the normal distribution as an approximation.
Under H0 the statistic T+ follows a binomial distribution with parameters n and p, and for large n the binomial is approximated by the normal distribution. The mean of the binomial distribution is np and its variance is npq. Here p is 0.5, so q is also 0.5; substituting, the mean is n/2 and the variance is n/4, so the standard deviation is √n/2. Using the normal approximation: if T+ is less than n/2, we apply the Z test with z = ((T+ + 0.5) - n/2) / (√n / 2). The 0.5 is there because the binomial is a discrete distribution, so we apply the continuity correction. If T+ is greater than n/2, we use minus 0.5 instead: z = ((T+ - 0.5) - n/2) / (√n / 2). The working rules are the same as before. We define H0 and H1 and compute T+ and T-. If T+ is less than n/2 we compute z with plus 0.5, otherwise with minus 0.5. Since we are using the normal distribution, we draw the curve, mark the critical value, and decide whether or not to reject H0. Let us look at an example.
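The large-sample statistic with its continuity correction can be sketched like this. The function name `sign_test_z` is illustrative, and the handling of the case where T+ equals n/2 exactly (no correction) is an assumption, since the lecture only covers the two strict inequalities. The values in the call, T+ = 13 and n = 31, are those of the lecture's first large-sample example.

```python
from math import sqrt


def sign_test_z(t_pos, n):
    """Normal approximation to the sign test with continuity correction."""
    if t_pos < n / 2:
        corrected = t_pos + 0.5   # move half a unit towards the mean
    elif t_pos > n / 2:
        corrected = t_pos - 0.5   # move half a unit towards the mean
    else:
        corrected = t_pos         # assumption: no correction when T+ == n/2
    mean = n / 2                  # np with p = 1/2
    std_dev = sqrt(n) / 2         # sqrt(npq) with p = q = 1/2
    return (corrected - mean) / std_dev


print(round(sign_test_z(13, 31), 3))  # prints: -0.718
```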
Since we are using the normal distribution, we can apply both the critical-value approach and the p-value approach. First, note that the sample size is greater than 25. Now define H0. What is the claim? Read the statement: the median is at least 42. 'At least 42' is not a pure equality.
How do you write it? As greater than or equal to: η ≥ 42. What is the opposite of this?
Less than. So H0 is η ≥ 42 and H1 is η < 42. Now subtract 42 from each of the observations. The first number gives a positive sign, the second positive, the third positive, the fourth and fifth again positive, the sixth negative, and so on. Then compute T+ and T-. How many positive signs are there? Counting them gives 13.
And how many negative? Counting them gives 18. So T+ is 13, T- is 18 and n is 31. Since n is greater than 25, we apply the large-sample test. We check whether T+ is less than n/2 or greater than n/2. T+ is 13 and n/2 is 31/2 = 15.5, so T+ is less than n/2 and we use the plus 0.5 formula: z = ((13 + 0.5) - 31/2) / (√31 / 2). After solving this you get z = -0.718.
That is step 3. Now step 4. H1 has the less-than sign, so this is a left-tailed test, and the critical value at the 5% level of significance is -1.645. Our value is -0.718. Where does it lie?
It is greater than -1.645, so it lies outside the rejection region and we fail to reject H0. H0 is that the median is at least 42.
So the claim that the median is at least 42 is not rejected. Now, how do you do this with the p-value instead of the critical value in step 4?
H1 is η less than 42, a left tail, so the p-value is the probability that Z is less than the computed value, P(Z < -0.718). You find this value from the normal distribution table. The level of significance here is 5%, and the p-value is greater than 0.05, so we fail to reject H0.
The rule is: if the p-value is less than the level of significance we reject H0, and if it is greater we fail to reject H0. So the claim stands. Let us look at one more example. There are 35 numbers here, so it is definitely a large-sample test.
What are H0 and H1? The claim is that the median η is at most 15. 'At most' means less than or equal to, so H0 is η ≤ 15 and H1 is η > 15. So it is a right-tailed test. Step 2 is to subtract 15 from each of the observations. The first number gives a negative sign, the second positive, the third positive, the fourth positive, then a zero, and so on. Now count the positive and negative signs: you get T+ = 25 and T- = 9, so n = 25 + 9 = 34. Then check whether T+ is less than or greater than n/2. T+ is 25 and n/2 is 17, so T+ is greater than n/2 and we calculate z with minus 0.5, which gives z = 2.57. It is a right-tailed test, because H1 has the greater-than sign, and the critical value at the 1% level of significance is 2.33. Where does z = 2.57 lie? It is greater than 2.33, so it lies in the rejection region. So we reject H0 in favour of H1.
So the median length of a long-distance telephone call is greater than 15. That was step 4 using the critical value. If you want to use the p-value instead: this is a right tail, so you use the greater-than sign, P(Z > 2.57).
Look at the normal distribution table to get this value. The level of significance is given as 1%. The p-value is less than 0.01, so H0 is rejected again.
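Instead of reading the normal table, the p-values in the two large-sample examples above can be approximated with the error function from Python's standard library. This is a sketch only; the names `normal_cdf` and `p_value` are illustrative, and the z values are the rounded ones quoted in the lecture.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function, P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def p_value(z, tail):
    """p-value for a z statistic; tail is 'left', 'right' or 'two'."""
    if tail == "left":
        return normal_cdf(z)
    if tail == "right":
        return 1.0 - normal_cdf(z)
    if tail == "two":
        return 2.0 * (1.0 - normal_cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Left-tailed example: z = -0.718, tested at the 5% level.
print(p_value(-0.718, "left") > 0.05)   # prints: True  (fail to reject H0)

# Right-tailed example: z = 2.57, tested at the 1% level.
print(p_value(2.57, "right") < 0.01)    # prints: True  (reject H0)
```

Both decisions agree with the ones reached through the critical values -1.645 and 2.33.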
Let us look at the last example. Again, this is the given data. First define H0 and H1; that is step 1.
Test the claim that the median monthly balance is at least $200. So H0 is 'at least': η ≥ 200. H1 is η < 200, so this is a left-tailed test.
Step 2: subtract 200 from each of the numbers. The first gives a negative sign, the next positive.
The next is positive, then negative, and so on. Then count the positive signs, T+, and the negative signs, T-.
You get the values shown here. Again T+ is less than n/2, so we calculate z using the plus 0.5 formula, substituting T+ = 7 and n = 32. This is a left-tailed test because H1 has the less-than sign, and the critical value is marked here. The computed z, about -3, lies in the rejection region.
So we reject H0, which means H1 is supported. How do you write the conclusion? The median monthly balance is less than 200. If instead you want to use the p-value, then since H1 has the less-than sign, the p-value is the probability that Z is less than the calculated value.
Look at the normal table, which you can find in any standard book. The level of significance is 5%, and the p-value is less than 0.05. So again we reject H0.
So this is a simple way to apply the sign test, the first of the non-parametric tests. I hope you enjoyed this session. In the next lecture we will look at the Wilcoxon signed-rank test. If you liked this video, you can subscribe to the channel and share it with your friends. You can also browse this link for other videos on testing of hypotheses and statistics.
Till then