MTH 522 – 09/13/2023
When I compare the kurtosis values I computed for diabetes and inactivity with those given in the course materials and discussed in class, I see a discrepancy. I computed these statistics with the kurtosis() function from the scipy library. The differences raise questions about the distribution of the underlying data and about what is causing the mismatch. I’m looking into possible explanations and trying to reconcile my calculated kurtosis values with the expected ones. Resolving this matters for the accuracy and reliability of our data analysis.
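One candidate explanation worth checking is the definition of kurtosis being used. By default, scipy's kurtosis() returns Fisher (excess) kurtosis, for which a normal distribution scores 0, whereas many textbooks and other tools report Pearson kurtosis, for which a normal distribution scores 3. The sketch below illustrates the difference on synthetic data; the `sample` array is a made-up stand-in, not the course dataset, and I have not confirmed that this is the actual cause of the mismatch.

```python
import numpy as np
from scipy.stats import kurtosis

# Synthetic stand-in data (assumption): NOT the diabetes/inactivity dataset.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=5000)

# Default: Fisher (excess) kurtosis, so a normal distribution is near 0.
excess = kurtosis(sample)

# Pearson kurtosis: a normal distribution is near 3.
pearson = kurtosis(sample, fisher=False)

# bias=False applies a sample-size correction, which can also shift the value.
excess_corrected = kurtosis(sample, fisher=True, bias=False)

print(f"Fisher (excess), biased:    {excess:.4f}")
print(f"Pearson, biased:            {pearson:.4f}")
print(f"Fisher (excess), corrected: {excess_corrected:.4f}")
```

If the values from class differ from mine by roughly 3, the Fisher versus Pearson convention is a likely culprit; a smaller gap could point to the bias correction or to differences in how the data were filtered.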
Alongside the kurtosis work, I have also experimented with modeling the relationship between diabetes and inactivity using regression. Although linear regression is frequently used for this purpose, I tried polynomial regression to capture more complex patterns in the data. According to my comparison of different polynomial degrees, a polynomial of degree 6 offers the best fit for our dataset. The fitted equation, which also lists x^8 and x^7 terms whose coefficients round to 0.00, is y = -0.00x^8 + 0.00x^7 - 0.14x^6 + 3.88x^5 - 67.96x^4 + 753.86x^3 - 5171.95x^2 + 20053.68x - 33621.52. This result raises an important question: why do we so often use linear regression when polynomial regression seems to describe the data more closely? More work is needed to understand how these variables interact and to choose the best regression strategy for this particular dataset, and I am currently working on it.
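One way to probe the linear versus polynomial question is to compare errors on data the model was not fitted to, since the error on the fitting data can only go down as the degree goes up. The sketch below uses numpy's polyfit and polyval on synthetic data with a simple train/hold-out split; the data, the split sizes, and the helper name `fit_errors` are all assumptions for illustration and do not reproduce my actual analysis.

```python
import numpy as np

# Synthetic stand-in data (assumption): a noisy linear relationship,
# NOT the diabetes/inactivity dataset.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)

# Simple split: first 150 points for fitting, last 50 held out.
x_train, y_train = x[:150], y[:150]
x_test, y_test = x[150:], y[150:]


def fit_errors(degree):
    """Fit a polynomial of the given degree; return (train MSE, hold-out MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((y_train - np.polyval(coeffs, x_train)) ** 2)
    test_mse = np.mean((y_test - np.polyval(coeffs, x_test)) ** 2)
    return float(train_mse), float(test_mse)


errors = {degree: fit_errors(degree) for degree in (1, 2, 6)}
for degree, (train_mse, test_mse) in errors.items():
    print(f"degree {degree}: train MSE = {train_mse:.3f}, hold-out MSE = {test_mse:.3f}")
```

If a higher degree lowers the training error but not the hold-out error, the extra terms are probably fitting noise, which would be one argument for keeping the simpler linear model.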