Sunday, May 3, 2020

Statistical Analysis

Question: Write an essay on Statistical Analysis.

Answer:

In this assignment, the number of admissions to the movies in Australia was surveyed from 1994 to 2014. Data for the variables Screens, Theatres, Films Screened, Real Ticket Price and Capacity were also collected over the same period (Bickel and Lehmann 2012). Statistical analysis is carried out on this data set: descriptive statistics, inferential statistics and linear regression methods are applied to the data (Vogt and Barta 2013). Graphs and charts also represent the data and support conclusions about the data and their relations.

Methodology

The descriptive statistics for the variables are given below:

Statistic                 Screens       Admissions    Theatres      Films         Real Ticket   Capacity
                                        (millions)                  Screened      Price         ('000s)
Mean                      1213.52381    61.82380952   545.0952381   257.3809524   19.77952381   362.8095238
Standard Error            103.6609083   5.151804606   9.45840638    5.769573802   0.091997609   15.57082422
Median                    1028          68.1          547           255           19.66         332
Mode                      #N/A          92.5          520           259           #N/A          295
Standard Deviation        475.0339587   23.60853457   43.34386319   26.43950868   0.421586008   71.35448062
Sample Variance           225657.2619   557.3629048   1878.690476   699.047619    0.177734762   5091.461905
Kurtosis                  -1.569616636  -1.666346431  8.375821966   1.256326286   0.085043332   -1.582996919
Skewness                  0.410014464   -0.055528974  2.445313468   -0.057083837  0.91600328    0.44357307
Range                     1264          63.6          201           124           1.51          186
Minimum                   645           28.9          501           194           19.25         285
Maximum                   1909          92.5          702           318           20.76         471
Sum                       25484         1298.3        11447         5405          415.37        7619
Count                     21            21            21            21            21            21
Largest(1)                1909          92.5          702           318           20.76         471
Smallest(1)               645           28.9          501           194           19.25         285
Confidence Level (95.0%)  216.2328649   10.74647607   19.72988992   12.03512002   0.191903649   32.48017007

Considering the variable admissions (millions), the measure of central tendency, i.e. the mean, is 61.8238. The median, i.e. the middle value of admissions (millions), is 68.1. The modal value of the variable is 92.5 (Plonsky 2015); this is the most frequently occurring value of admissions. The variability of the variable, i.e. the standard deviation, is 23.6085. This shows that admissions (millions) had a moderate amount of variability over the years (Vogt and Barta 2013). The distribution is platykurtic and negatively skewed.

The mean of the variable screens was found to be 1213.5238. The median was 1028 and there was no mode for this variable. The standard deviation was 475.034 (Thiem 2014). This shows that there was moderate variation in the number of screens available in Australia for screening movies. The distribution is platykurtic and positively skewed.

The average value of theatres was found to be 545.095. The median is 547 and the mode is 520. The standard deviation was 43.34, so there was little variation in the number of theatres open in Australia during 1994 to 2014 (Campbell and Knapp 2013).
The shape of the theatres distribution is leptokurtic and positively skewed.

The average value of the variable films screened was found to be 257.38. The median was 255 and the modal value was 259 (Ang and Van Dyne 2015). The standard deviation was found to be 26.44, so the number of films screened varied little over these years. The distribution is leptokurtic and negatively skewed.

The mean of the variable real ticket price is 19.78 and its median is 19.66. The standard deviation of the variable is 0.42 (Kleinbaum et al. 2013). This is a very low standard deviation: the price of tickets fluctuated little during the period from 1994 to 2014. The distribution is leptokurtic and positively skewed.

The average value of the variable capacity ('000s) was found to be 362.8095. The median was 332 and the mode was 295. The standard deviation was 71.3545. This shows that there was moderate variation in capacity over the years. The distribution is platykurtic and positively skewed.

Graph displaying the distribution of admission

Box-and-whisker plot for the distribution of the real ticket price

The likelihood that admissions are greater than 70 million when the real price of the ticket is more than $20 is given by P(X > Z) = 1 - P(X ≤ Z) = 1 - 0.613 = 0.387 (Campbell and Knapp 2013). Admissions are statistically independent of price, because the value of the chi-square test statistic was found to be zero.
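The Standard Error and Confidence Level (95.0%) rows of the descriptive statistics table, and the shape labels used in the discussion above, can be reproduced from the standard deviation, count, skewness and kurtosis. The following is a minimal Python sketch rather than the method actually used for the essay. It assumes that the kurtosis values are excess kurtosis (so the sign decides between leptokurtic and platykurtic), that the 95% half-width is the t critical value times the standard error, and it hardcodes the t critical value for 20 degrees of freedom. The function names are illustrative only.

```python
import math

# Two-sided 95% critical value of Student's t with 20 degrees of freedom
# (n - 1 for 21 yearly observations). Hardcoded as an assumption so the
# sketch needs no statistics library.
T_CRIT_95_DF20 = 2.086


def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


def ci_half_width(sd, n, t_crit=T_CRIT_95_DF20):
    """Half-width of the confidence interval for the mean."""
    return t_crit * standard_error(sd, n)


def describe_shape(skewness, excess_kurtosis):
    """Label a distribution by the signs of its skewness and excess kurtosis."""
    peak = "leptokurtic" if excess_kurtosis > 0 else "platykurtic"
    side = "positively skewed" if skewness > 0 else "negatively skewed"
    return f"{peak}, {side}"


# (mean, standard deviation, skewness, kurtosis) copied from the table above.
TABLE = {
    "Screens": (1213.52381, 475.0339587, 0.410014464, -1.569616636),
    "Admissions (millions)": (61.82380952, 23.60853457, -0.055528974, -1.666346431),
    "Theatres": (545.0952381, 43.34386319, 2.445313468, 8.375821966),
    "Films Screened": (257.3809524, 26.43950868, -0.057083837, 1.256326286),
    "Real Ticket Price": (19.77952381, 0.421586008, 0.91600328, 0.085043332),
    "Capacity ('000s)": (362.8095238, 71.35448062, 0.44357307, -1.582996919),
}
N_YEARS = 21  # the Count row of the table

for name, (mean, sd, skew, kurt) in TABLE.items():
    half = ci_half_width(sd, N_YEARS)
    print(
        f"{name}: SE={standard_error(sd, N_YEARS):.4f}, "
        f"95% CI for the mean=({mean - half:.3f}, {mean + half:.3f}), "
        f"shape={describe_shape(skew, kurt)}"
    )
```

Under these assumptions the interval for a mean is the mean plus or minus the half-width, which is based on the standard error rather than on the standard deviation itself.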
The contingency table of admissions against real ticket price is as follows:

Sum of probability of admission (row labels: real ticket price band; column labels: admissions in millions)

Column labels: 28.9, 29.7, 30.8, 35.5, 37.4, 39, 43, 46.9, 47.2, 55.5, 68.1, 69.9, 73.9, 76, 80, 82.2, 88, 89.8, 91.5, 92.5, Grand Total

Row label      Non-empty cells             Grand Total
19.25-19.35    0.036 0.071                 0.107
19.35-19.45    0.030 0.036 0.043 0.063     0.172
19.45-19.55    0.068 0.071                 0.139
19.55-19.65    0.024 0.069                 0.093
19.65-19.75    0.022 0.029                 0.051
19.75-19.85    0.023 0.062                 0.084
19.85-19.95    0.027                       0.027
19.95-20.05    0.070                       0.070
20.05-20.15    0.059                       0.059
20.15-20.25    0.057                       0.057
20.35-20.45    0.033                       0.033
20.45-20.55    0.054                       0.054
20.75-20.85    0.052                       0.052

Grand Total by column, in the order of the column labels: 0.022, 0.023, 0.024, 0.027, 0.029, 0.030, 0.033, 0.036, 0.036, 0.043, 0.052, 0.054, 0.057, 0.059, 0.062, 0.063, 0.068, 0.069, 0.070, 0.142; overall total 1.

The 95% confidence interval for the mean number of theatres is given by (mean - 1.96 × s.d., mean + 1.96 × s.d.) = (460.1412662, 630.0492099).

To test, at the 5% level of significance, whether admissions from 2008 to 2014 exceeded the constant amount of 84 million in Taiwan, the hypotheses are as follows:
H0: admissions from 2008 to 2014 did not exceed the constant amount of 84 million in Taiwan
H1: admissions from 2008 to 2014 exceeded the constant amount of 84 million in Taiwan
The p value of the one-tailed test was found to be 0.02732, which is less than the significance level of 0.05 (Levy and Lemeshow 2013). The null hypothesis is therefore rejected: admissions from 2008 to 2014 exceeded the constant amount of 84 million in Taiwan.

The output of the multiple linear regression is given in the sheet named regression in the Excel file. Using the result of the regression analysis, the hypotheses are as follows:
H0: there is no difference between the ticket price in 2014 and zero at the 5% level of significance
H1: there is a difference between the ticket price in 2014 and zero at the 5% level of significance
The p value of the regression analysis for the variable real ticket price is 0.044, which is less than 0.05. This leads to the rejection of the null hypothesis.
Thus, there is a difference between the ticket price in 2014 and zero at the 5% level of significance. The slope of the variable is 5.2766, which is positive (Montgomery et al. 2015). This means that the variable affects admissions in a positive way: a change in the ticket price leads to a change in admissions in the same direction. The value of the intercept was found to be negative. This suggests that admissions would be negative in the absence of all the factors.

The slope of Screens was 0.0878, which is slightly positive (Draper and Smith 2014). This value indicates a positive influence on admissions. The slope of Theatres was found to be 0.04508, which is positive, so this factor has a weak positive influence on admissions. The slope of Films Screened was 0.001, which is weakly positive (Csikszentmihalyi and Larson 2014). This variable also influences admissions positively: a change in its value changes admissions in the same direction (Kleinbaum et al. 2013). The slope of Capacity ('000s) is -0.2819. This indicates that the variable influences admissions in a negative way: an increase in capacity decreases admissions. All the slopes except that of capacity had the expected sign; the sign of Capacity ('000s) was expected to be positive, whereas it turned out to be negative.

The value of adjusted R square is 0.971. This indicates that 97.1% of the variation in the dependent variable is explained by the independent variables, after adjusting for the number of independent variables in the model (Fox 2015). The p values of the variables theatres and films screened are greater than the 5% level of significance. The other three variables have p values less than 0.05. Thus, not all of the variables in the model are statistically significant, as the p values of theatres and films screened are not less than 0.05.

Scatter diagram and histogram

The variable is heteroscedastic, normal and linear.
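The adjusted R square quoted above is linked to the ordinary R square through the number of observations and the number of predictors. The sketch below shows that relationship in Python; it is an illustration, not the Excel output itself. It assumes n = 21 yearly observations and k = 5 predictors (Screens, Theatres, Films Screened, Real Ticket Price and Capacity), and the function names are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R square from R square, n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def r2_from_adjusted(adj_r2, n, k):
    """Invert the adjustment to recover the ordinary R square."""
    return 1 - (1 - adj_r2) * (n - k - 1) / (n - 1)


# Assumed model size: 21 years of data and 5 predictors, as in the essay.
N_OBS, N_PREDICTORS = 21, 5

implied_r2 = r2_from_adjusted(0.971, N_OBS, N_PREDICTORS)
print(f"Ordinary R square implied by adjusted R square 0.971: {implied_r2:.4f}")
print(f"Round trip: {adjusted_r2(implied_r2, N_OBS, N_PREDICTORS):.4f}")
```

With these assumed values the implied ordinary R square is about 0.978. The two figures are close because five predictors are a small penalty against 21 observations.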
Variables such as the location of the theatre, the facilities provided in the theatre and the type of movie influence admissions positively. Including these factors would increase the value of the regression coefficient and thereby influence the regression results.

Simple random sampling would not be appropriate as the first step. The organisation must first identify all the households of native-born Australians. It could then use random sampling to select households from those identified.

Conclusion

The analysis showed that the variable admissions is influenced by various factors such as Screens, Theatres, Films Screened, Real Ticket Price and Capacity. The average, the standard deviation and the type of distribution differ from variable to variable over the period from 1994 to 2014. The graphs show the distribution of each variable, and the tables give an idea of how admissions relate to the real ticket price. The degree of association between the variables was also analysed. Thus the analysis gave a clear picture of the variables, of the behaviour of admissions over the years 1994 to 2014, and of the influence of the other factors on admissions.

References

Ang, S. and Van Dyne, L., 2015. Handbook of cultural intelligence. Routledge.
Bickel, P.J. and Lehmann, E.L., 2012. Descriptive statistics for nonparametric models I. Introduction. In Selected Works of EL Lehmann (pp. 465-471). Springer US.
Campbell, J.P. and Knapp, D.J. eds., 2013. Exploring the limits in personnel selection and classification. Psychology Press.
Csikszentmihalyi, M. and Larson, R., 2014. Validity and reliability of the experience-sampling method. In Flow and the Foundations of Positive Psychology (pp. 35-54). Springer Netherlands.
Draper, N.R. and Smith, H., 2014. Applied regression analysis. John Wiley & Sons.
Fox, J., 2015. Applied regression analysis and generalized linear models. Sage Publications.
Kleinbaum, D., Kupper, L., Nizam, A. and Rosenberg, E., 2013. Applied regression analysis and other multivariable methods. Nelson Education.
Levy, P.S. and Lemeshow, S., 2013. Sampling of populations: methods and applications. John Wiley & Sons.
Montgomery, D.C., Peck, E.A. and Vining, G.G., 2015. Introduction to linear regression analysis. John Wiley & Sons.
Plonsky, L., 2015. Statistical power, p values, descriptive statistics, and effect sizes: A "back-to-basics" approach to advancing quantitative methods in L2 research.
Thiem, A., 2014. Membership function sensitivity of descriptive statistics in fuzzy-set relations. International Journal of Social Research Methodology, 17(6), pp.625-642.
Vogt, A. and Barta, J., 2013. The making of tests for index numbers: Mathematical methods of descriptive statistics. Springer Science & Business Media.
