---
title: HW02
subtitle: Basic Rmd and Statistics
output:
  html_document:
    theme: spacelab
    highlight: tango
    toc: true
---
The purpose of this assignment is to get you working in R, to do a little self-reflection and examination of your learning habits, and to assess your understanding of first-week statistical concepts.
Put your answers for each question in this document where it says ANSWER HERE. You don’t have to put your answers in between the bartacks. They are there just to make it easier to see where your answer should go. Read the questions carefully. Make sure you answer completely and give explanations. It is better to be overly wordy than to put too little. The file 03_Describing-Distributions is a useful reference for the Rmd section in addition to last week’s lecture recordings.
Because we all have room for improvement, here are a few tasks for you to become a better learner.
Introduce yourself to the concept of metacognition at:
https://sites.google.com/a/uwlax.edu/exploring-how-students-learn/what-s-all-the-fuss-about-metacognition
Take the Metacognition Awareness Inventory at:
https://www.harford.edu/~/media/PDF/Student-Services/Tutoring/Metacognition%20Awareness%20Inventory.ashx
Write a reflection on metacognition. Discuss your scores and what you learned from the readings.
My lowest scores were in the planning and comprehension monitoring sections. I agree with these results: I think I am really bad at planning out my studying and my time while I am studying, and at keeping track of what I do and don't know. I lose track of what I have been reading a lot of the time, and I could probably get much better at it.
Video - Hidden Key to Understanding and Overcoming Procrastination
https://mcgraw.princeton.edu/node/2641
Podcast - NPR - Hidden Brain | I’m Right, You’re Wrong | Mar 13, 2017
https://www.npr.org/2017/03/13/519661419/when-it-comes-to-politics-and-fake-news-facts-arent-enough
YouTube - Becoming aware of our biases | Tali Sharot | Huxley Summit Dec 21, 2017
https://www.youtube.com/watch?v=hdnGkFdtA-A
I have seen the video about challenging our own beliefs before, and I find it very interesting. People always look to confirm what they think is right. For example, I saw another video about an experiment with numbers, in which a man kept trying to confirm his own rule for the number sequence because he thought it was right and couldn't get it out of his mind. If he had just said, "Maybe I am wrong," he would have gotten there much faster.
I clicked on "Putting Your Extracurricular Skills to Use in Your Studies". I liked how they talked about the zone, which in my case I get into while I am pitching in baseball. Being able to block out all outside distractions while studying and really focus on one thing at a time could be really useful. I have actually found myself doing this before: if I am annoyed or angry at something, I can easily block everything out and not hear a single thing other than what is in my head.
The following questions are to assess your understanding of our first week’s statistical topics.
- The cases are the people who do and do not have an infection.
- The explanatory variable is whether or not the person had an infection, and it is categorical.
- The response variable is the person's IQ, and it is numerical.
- Yes, saying that infections can lower IQ is saying that an infection can cause a lower IQ in a person.
- It is an observational study because the researchers are not controlling whether or not a person has an infection; they are recording what they see.
- I do not think we have enough information to say whether causation is present. If we had more information about the medicines the people were on and other variables in the hospital, then maybe we could draw a conclusion, but right now I don't think we can say whether it does or does not.
- The case is the lake water, which has the estrogen level and fish fertility variables.
- The variables are the estrogen level of the water and the fertility level of the fish.
- Both variables are quantitative.
- The explanatory variable is the estrogen level of the water, and the response variable is the fertility rate of the fish.
Drinking Age
A biased sampling situation is described for the following study:
To estimate the proportion of Americans who support changing the drinking age from 21 to 18, a random sample of 100 college students is asked the question, “Would you support a measure to lower the drinking age from 21 to 18?”
- The sample is the 100 college students.
- The population of interest is all of the US.
- We can generalize about college students' thoughts on the drinking age, not about the whole United States.
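The gap between the sample and the population of interest can be illustrated with a small simulation. This is only a sketch: the population size and the support rates below are invented for illustration and do not come from any survey.

```r
set.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100,000 adults (all numbers are invented):
# 10,000 college students, 80% of whom support the change,
# 90,000 other adults, 40% of whom support it.
population <- data.frame(
  student = rep(c(TRUE, FALSE), times = c(10000, 90000)),
  support = c(rep(c(TRUE, FALSE), times = c(8000, 2000)),
              rep(c(TRUE, FALSE), times = c(36000, 54000)))
)

# True proportion of supporters: 44,000 / 100,000 = 0.44 by construction.
true_prop <- mean(population$support)

# Biased design: draw 100 people from the college students only.
students <- population[population$student, ]
student_est <- mean(students$support[sample(nrow(students), 100)])

# Comparison design: simple random sample of 100 from the whole population.
srs_est <- mean(population$support[sample(nrow(population), 100)])

print(c(true = true_prop, students_only = student_est, random_sample = srs_est))
```

Under these made-up rates, the students-only estimate lands near the student support rate rather than the population rate, which is why the result only generalizes to college students.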
Write a code block which loads the necessary libraries and data but also does not print out unnecessary stuff. The file you should load is ~/Data/output/ACS_clean.RData.
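One possible way to meet this requirement is sketched here. It assumes that dplyr is the only library needed, since the later questions use `%>%`, `glimpse()`, and `count()`. The chunk options in the first comment are standard knitr options. The `file.exists()` guard is an addition for illustration, so the chunk does not error on a machine that lacks the file.

```r
# In the Rmd file, the chunk header could be: {r setup, message=FALSE, warning=FALSE}

# Attach dplyr without its startup messages
# (assumption: dplyr covers everything the later questions use).
suppressPackageStartupMessages(library(dplyr))

# Path taken from the assignment text.
data_path <- "~/Data/output/ACS_clean.RData"

# Guard added for illustration only: load the data if the file is present.
if (file.exists(data_path)) {
  load(data_path)
}
```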
What is the name of the data which was loaded?
ls()
## [1] "mydata_clean"
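`ls()` lists everything in the workspace, so it identifies the loaded object only when the workspace was otherwise empty. A small sketch of an alternative: `load()` invisibly returns the names of the objects it restored. The toy data frame and temporary file here are made up for illustration and stand in for ACS_clean.RData.

```r
# Toy stand-in for the real file (hypothetical data, for illustration only).
toy_clean <- data.frame(AGEP = c(19, 21, 65))
tmp <- tempfile(fileext = ".RData")
save(toy_clean, file = tmp)
rm(toy_clean)

# load() invisibly returns a character vector of the object names it restored.
loaded_names <- load(tmp)
print(loaded_names)
```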
mydata_clean %>%
  glimpse()
## Rows: 35,638
## Columns: 28
## $ AGEP <dbl> 19, 21, 65, 23, 48, 89, 18, 22, 94, 36, 18, 64, 15, 18, 19,…
## $ ENG <dbl> 2, NA, 1, 1, NA, 1, 1, 1, NA, NA, 1, NA, NA, NA, NA, 1, 1, …
## $ WAGP <dbl> 50, 7700, 5000, 6000, 0, 0, 500, 0, 0, 0, 1400, 0, 0, 0, 36…
## $ WKHP <dbl> 5, 20, 8, 50, NA, NA, 12, NA, NA, NA, 12, NA, NA, NA, 20, 4…
## $ PINCP <dbl> 50, 7700, 17200, 7500, 0, 15600, 500, 0, 4300, 0, 1400, 174…
## $ HINCP <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ NP <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ NPF <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ RNTP <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ MRGP <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ BDSP <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ RMSP <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ VEH <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ JWMNP <dbl> NA, 10, 25, 10, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ PUMA <fct> 00101, 00106, 08507, 00103, 00106, 08512, 08509, 08509, 001…
## $ SERIALNO <fct> 2018GQ0000039, 2018GQ0000045, 2018GQ0000059, 2018GQ0000238,…
## $ SEX_cat <fct> Male, Male, Male, Female, Female, Female, Male, Female, Fem…
## $ DEYE_cat <fct> No eye difficulty, No eye difficulty, No eye difficulty, No…
## $ DEAR_cat <fct> No hearing difficulty, No hearing difficulty, No hearing di…
## $ RAC1P_cat <ord> Asian alone, White alone, White alone, White alone, White a…
## $ FS_cat <fct> No food stamps, No food stamps, No food stamps, No food sta…
## $ HUPAC_cat <ord> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ MV_cat <ord> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ JWTR_cat <fct> NA, "Walked", "Car, truck, or van", "Walked", NA, NA, NA, N…
## $ MSP_cat <fct> "Never married", "Never married", "Never married", "Never m…
## $ SCHL_cat <ord> "Some college, but less than 1 year", "Regular high school …
## $ COW_cat <fct> Employee of a private for-profit company, State government …
## $ ENG_cat <fct> Well, NA, Very well, Very well, NA, Very well, Very well, V…
JWTR_cat is categorical and indicates how the person got to work. How many males are there in the sample that Walked to work?
mydata_clean %>%
  select(JWTR_cat, SEX_cat) %>%
  filter(SEX_cat == "Male", JWTR_cat == "Walked")
## # A tibble: 266 x 2
## JWTR_cat SEX_cat
## <fct> <fct>
## 1 Walked Male
## 2 Walked Male
## 3 Walked Male
## 4 Walked Male
## 5 Walked Male
## 6 Walked Male
## 7 Walked Male
## 8 Walked Male
## 9 Walked Male
## 10 Walked Male
## # … with 256 more rows
266 males in the sample walked to work.
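The pipeline above prints the matching rows, and the answer is read off the tibble header. A minimal sketch of getting the count as a value instead is shown here. It uses a made-up five-row data frame in place of mydata_clean: only the column names come from the data, and the rows are hypothetical.

```r
suppressPackageStartupMessages(library(dplyr))

# Toy stand-in for mydata_clean (hypothetical rows, for illustration only).
toy <- data.frame(
  SEX_cat  = factor(c("Male", "Male", "Female", "Male", "Female")),
  JWTR_cat = factor(c("Walked", "Car, truck, or van", "Walked", "Walked", NA))
)

# filter() keeps the matching rows; summarize(n()) collapses them to one count.
walkers <- toy %>%
  filter(SEX_cat == "Male", JWTR_cat == "Walked") %>%
  summarize(n_walked = n())

print(walkers$n_walked)  # 2 for the toy data
```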
Find the average wage (use the variable WAGP and the command mean) for Male(s) and Female(s) who have Never married.
mydata_clean %>%
  select(WAGP, SEX_cat, MSP_cat) %>%
  filter(SEX_cat == "Male", MSP_cat == "Never married") %>%
  summarize(average_wage = mean(WAGP, na.rm = TRUE))
## # A tibble: 1 x 1
## average_wage
## <dbl>
## 1 36260.
mydata_clean %>%
  select(WAGP, SEX_cat, MSP_cat) %>%
  filter(SEX_cat == "Female", MSP_cat == "Never married") %>%
  summarize(average_wage = mean(WAGP, na.rm = TRUE))
## # A tibble: 1 x 1
## average_wage
## <dbl>
## 1 30681.
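The two pipelines differ only in the value of SEX_cat, so they could be combined with `group_by()`. A sketch under that assumption, using a made-up six-row data frame in place of mydata_clean (the wage values are hypothetical):

```r
suppressPackageStartupMessages(library(dplyr))

# Toy stand-in for mydata_clean (hypothetical rows, for illustration only).
toy <- data.frame(
  WAGP    = c(30000, 50000, 20000, 40000, NA, 90000),
  SEX_cat = factor(c("Male", "Male", "Female", "Female", "Female", "Male")),
  MSP_cat = factor(c("Never married", "Never married", "Never married",
                     "Never married", "Never married", "Divorced"))
)

# Filter once on marital status, then compute one mean per sex.
wage_by_sex <- toy %>%
  filter(MSP_cat == "Never married") %>%
  group_by(SEX_cat) %>%
  summarize(average_wage = mean(WAGP, na.rm = TRUE))

print(wage_by_sex)
```

With `na.rm = TRUE`, missing wages are dropped but wages of 0 still count toward the mean, which matters for WAGP because the `glimpse()` output shows many zeros.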
Count the number of Male(s) and Female(s) for each race (use the variable RAC1P_cat).
mydata_clean %>%
  count(SEX_cat, RAC1P_cat)
## # A tibble: 17 x 3
## SEX_cat RAC1P_cat n
## <fct> <ord> <int>
## 1 Male White alone 8068
## 2 Male Black or African American alone 839
## 3 Male American Indian alone 70
## 4 Male American Indian 33
## 5 Male Asian alone 6034
## 6 Male Native Hawaiian and Other Pacific Islander alone 99
## 7 Male Some Other Race alone 1443
## 8 Male Two or more races 1001
## 9 Female White alone 7963
## 10 Female Black or African American alone 900
## 11 Female American Indian alone 87
## 12 Female Alaska Native alone 4
## 13 Female American Indian 38
## 14 Female Asian alone 6422
## 15 Female Native Hawaiian and Other Pacific Islander alone 101
## 16 Female Some Other Race alone 1503
## 17 Female Two or more races 1033
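The table has no Male row for Alaska Native alone because `count()` drops combinations with no observations by default. A sketch of keeping those combinations as explicit zeros with `.drop = FALSE`, using a made-up four-row data frame in place of mydata_clean (the rows are hypothetical):

```r
suppressPackageStartupMessages(library(dplyr))

# Toy stand-in for mydata_clean (hypothetical rows, for illustration only).
toy <- data.frame(
  SEX_cat   = factor(c("Male", "Female", "Female", "Male")),
  RAC1P_cat = factor(c("White alone", "White alone",
                       "Alaska Native alone", "White alone"))
)

# .drop = FALSE keeps factor-level combinations that have zero rows.
counts <- toy %>%
  count(SEX_cat, RAC1P_cat, .drop = FALSE)

print(counts)
```

This relies on both columns being factors, as they are in mydata_clean according to the `glimpse()` output.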