---
title: HW02
subtitle: Basic Rmd and Statistics
output:
  html_document:
    theme: spacelab
    highlight: tango
    toc: true
---

Assignment overview

The purpose of this assignment is to get you working in R, to do a little self-reflection and examination of your learning habits, and to assess your understanding of first-week statistical concepts.

Instructions

Put your answer to each question in this document where it says ANSWER HERE. You don’t have to put your answers between the backticks; they are there only to make it easier to see where your answer should go. Read the questions carefully. Make sure you answer completely and give explanations. It is better to be overly wordy than to write too little. The file 03_Describing-Distributions is a useful reference for the Rmd section, in addition to last week’s lecture recordings.

Part 1: Examination and Improvement of our Learning

Because we all have room for improvement, here are a few tasks for you to become a better learner.

Metacognition

Introduce yourself to the concept of metacognition at:
https://sites.google.com/a/uwlax.edu/exploring-how-students-learn/what-s-all-the-fuss-about-metacognition
Take the Metacognition Awareness Inventory at:
https://www.harford.edu/~/media/PDF/Student-Services/Tutoring/Metacognition%20Awareness%20Inventory.ashx
Write a reflection on metacognition. Discuss your scores and what you learned from the readings.

My lowest scores were in the planning and comprehension monitoring sections. I agree with these results. I think I am really bad at planning out my studying and my time while I study, as well as at judging what I do and don’t know. I lose track of what I have been reading a lot of the time, and I could probably be much better at it.

Habits for Success

Watch or listen to one of the following choices and write one observation or question.

Video - Hidden Key to Understanding and Overcoming Procrastination
https://mcgraw.princeton.edu/node/2641
Podcast - NPR - Hidden Brain | I’m Right, You’re Wrong | Mar 13, 2017
https://www.npr.org/2017/03/13/519661419/when-it-comes-to-politics-and-fake-news-facts-arent-enough
YouTube - Becoming aware of our biases | Tali Sharot | Huxley Summit Dec 21, 2017
https://www.youtube.com/watch?v=hdnGkFdtA-A

I have seen the video about challenging our own beliefs before, and I find it very interesting. People always look to confirm what they think is right. For example, I saw another video about an experiment with a number sequence, in which a man kept trying to confirm his own rule for the sequence because he thought it was right and couldn’t get it off his mind. If he had just thought, “Maybe I am wrong,” he would have gotten there much faster.

Select one link to examine from this site, https://mcgraw.princeton.edu/undergraduates/resources-handouts-and-advice-undergraduates, and write one thing that you will attempt to incorporate into your studies this semester.

I clicked on "Putting Your Extracurricular Skills to Use in Your Studies". I liked how they talked about the zone, which in my case I get into when I am pitching in baseball. Being able to block out all outside distractions while studying and focus on one thing at a time could be really useful. I have actually found myself doing this before: if I am annoyed or angry at something, I can easily block everything out and not hear a single thing other than what is in my head.

Part 2: Statistics

The following questions are to assess your understanding of our first week’s statistical topics.

Infections Can Lower IQ

A headline in June 2015 proclaims “Infections can lower IQ.” The headline is based on a study in which scientists gave an IQ test to Danish men at age 19. They also analyzed the hospital records of the men and found that 35% of them had been in a hospital with an infection such as an STI or a urinary tract infection. The average IQ score was lower for the men who had an infection than for the men who hadn’t.

What are the cases in this study?
What is the explanatory variable? Is it categorical or quantitative?
What is the response variable? Is it categorical or quantitative?
Does the headline imply causation?
Is the study an experiment or an observational study?
Is it appropriate to conclude causation in this case?

- The cases are the Danish men in the study, both those who had an infection and those who did not.
- The explanatory variable is whether or not the man had been hospitalized with an infection, and it is categorical.
- The response variable is the man’s IQ score, and it is quantitative.
- Yes. Saying that infections can lower IQ is saying that an infection can cause a lower IQ in a person.
- It is an observational study, because the researchers did not control whether or not a person had an infection; they recorded what they observed.
- I do not think we have enough information to conclude causation. Because the infections were not assigned by the researchers, other variables, such as the medicines the men were on or other conditions in the hospital, could explain the lower IQ scores. With more information about those variables we might be able to say more, but right now I don’t think we can say whether infections do or do not lower IQ.

Hormones and Fish Fertility

When women take birth control pills, some of the hormones found in the pills eventually make their way into lakes and waterways. In one study, water samples were taken from various lakes. The data indicate that as the concentration of estrogen in the lake water goes up, the fertility level of fish in the lake goes down. The estrogen level is measured in parts per trillion (ppt) and the fertility level is recorded as the percent of eggs fertilized.

What are the cases in this study?
What are the variables?
Classify each variable as either categorical or quantitative.
Identify the explanatory and response variables.

- The cases are the lakes from which the water samples were taken; each lake has an estrogen level and a fish fertility level.
- The variables are the estrogen level of the water and the fertility level of the fish.
- Both variables are quantitative.
- The explanatory variable is the estrogen level of the water, and the response variable is the fertility level of the fish.

Drinking Age

A biased sampling situation is described for the following study:

To estimate the proportion of Americans who support changing the drinking age from 21 to 18, a random sample of 100 college students is asked the question, “Would you support a measure to lower the drinking age from 21 to 18?”

What is the sample?
What is the researcher’s population of interest?
To what population can we generalize (for our given sample)?

- The sample is the 100 college students.
- The population of interest is all Americans.
- We can generalize to college students’ views on the drinking age, not to the whole United States.

Part 3: Rmd

Write a code block which loads the necessary libraries and data but also does not print out unnecessary stuff. The file you should load is ~/Data/output/ACS_clean.RData.
What is the name of the data which was loaded?

ls()

## [1] "mydata_clean"
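
The loading chunk asked for in the first question is not shown in the knitted output. A minimal sketch of what such a chunk could look like follows, assuming dplyr is the library needed. The temporary file and the object `toy_clean` are hypothetical stand-ins for `~/Data/output/ACS_clean.RData`, used only so the sketch runs anywhere.

```r
# In an Rmd chunk, header options such as {r, message=FALSE, warning=FALSE}
# keep messages out of the knitted output; suppressPackageStartupMessages()
# does the same for a single library() call.
suppressPackageStartupMessages(library(dplyr))

# Hypothetical stand-in for ~/Data/output/ACS_clean.RData.
toy_path <- tempfile(fileext = ".RData")
toy_clean <- data.frame(AGEP = c(19, 21, 65))
save(toy_clean, file = toy_path)
rm(toy_clean)

# load() restores the saved objects and invisibly returns their names,
# which is another way to see what was loaded besides ls().
loaded_names <- load(toy_path)
print(loaded_names)
```

Capturing the return value of `load()` avoids relying on `ls()`, which lists everything in the workspace, not only what the file contained.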

How many observations and variables are there in the data? How do you know?

mydata_clean %>%
  glimpse()

## Rows: 35,638
## Columns: 28
## $ AGEP      <dbl> 19, 21, 65, 23, 48, 89, 18, 22, 94, 36, 18, 64, 15, 18, 19,…
## $ ENG       <dbl> 2, NA, 1, 1, NA, 1, 1, 1, NA, NA, 1, NA, NA, NA, NA, 1, 1, …
## $ WAGP      <dbl> 50, 7700, 5000, 6000, 0, 0, 500, 0, 0, 0, 1400, 0, 0, 0, 36…
## $ WKHP      <dbl> 5, 20, 8, 50, NA, NA, 12, NA, NA, NA, 12, NA, NA, NA, 20, 4…
## $ PINCP     <dbl> 50, 7700, 17200, 7500, 0, 15600, 500, 0, 4300, 0, 1400, 174…
## $ HINCP     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ NP        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ NPF       <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ RNTP      <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ MRGP      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ BDSP      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ RMSP      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ VEH       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ JWMNP     <dbl> NA, 10, 25, 10, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ PUMA      <fct> 00101, 00106, 08507, 00103, 00106, 08512, 08509, 08509, 001…
## $ SERIALNO  <fct> 2018GQ0000039, 2018GQ0000045, 2018GQ0000059, 2018GQ0000238,…
## $ SEX_cat   <fct> Male, Male, Male, Female, Female, Female, Male, Female, Fem…
## $ DEYE_cat  <fct> No eye difficulty, No eye difficulty, No eye difficulty, No…
## $ DEAR_cat  <fct> No hearing difficulty, No hearing difficulty, No hearing di…
## $ RAC1P_cat <ord> Asian alone, White alone, White alone, White alone, White a…
## $ FS_cat    <fct> No food stamps, No food stamps, No food stamps, No food sta…
## $ HUPAC_cat <ord> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ MV_cat    <ord> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ JWTR_cat  <fct> NA, "Walked", "Car, truck, or van", "Walked", NA, NA, NA, N…
## $ MSP_cat   <fct> "Never married", "Never married", "Never married", "Never m…
## $ SCHL_cat  <ord> "Some college, but less than 1 year", "Regular high school …
## $ COW_cat   <fct> Employee of a private for-profit company, State government …
## $ ENG_cat   <fct> Well, NA, Very well, Very well, NA, Very well, Very well, V…
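
The first two lines of the glimpse() output above report 35,638 rows (observations) and 28 columns (variables). The same counts can be read directly with base R functions; the sketch below uses a small toy data frame with made-up values in place of mydata_clean.

```r
# Toy data frame (hypothetical values) standing in for mydata_clean.
toy <- data.frame(
  AGEP    = c(19, 21, 65, 23),
  SEX_cat = factor(c("Male", "Male", "Male", "Female"))
)

n_obs  <- nrow(toy)  # observations are rows
n_vars <- ncol(toy)  # variables are columns
dims   <- dim(toy)   # both at once: c(rows, columns)
print(dims)
```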

The variable JWTR_cat is categorical and indicates how the person got to work. How many males are there in the sample that Walked to work?

mydata_clean %>%
  select(JWTR_cat, SEX_cat) %>%
  filter(SEX_cat == "Male", JWTR_cat == "Walked")

## # A tibble: 266 x 2
##    JWTR_cat SEX_cat
##    <fct>    <fct>  
##  1 Walked   Male   
##  2 Walked   Male   
##  3 Walked   Male   
##  4 Walked   Male   
##  5 Walked   Male   
##  6 Walked   Male   
##  7 Walked   Male   
##  8 Walked   Male   
##  9 Walked   Male   
## 10 Walked   Male   
## # … with 256 more rows

There are 266 males in the sample who walked to work, as shown by the row count in the tibble header.
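
Instead of reading the count off the tibble header, the pipeline can return it as a single number. This is a sketch on a toy tibble with made-up rows; the level "Bicycle" is a hypothetical value used only for illustration.

```r
suppressPackageStartupMessages(library(dplyr))

# Toy data (hypothetical values) standing in for mydata_clean.
toy <- tibble(
  SEX_cat  = factor(c("Male", "Male", "Female", "Male", "Female")),
  JWTR_cat = factor(c("Walked", "Bicycle", "Walked", "Walked", NA))
)

# filter() keeps rows where both conditions are TRUE (rows with NA are dropped),
# and nrow() turns the filtered tibble into one count.
n_male_walkers <- toy %>%
  filter(SEX_cat == "Male", JWTR_cat == "Walked") %>%
  nrow()
print(n_male_walkers)
```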

Find the average wage (use the variable WAGP and the command mean) for Male(s) and Female(s) who have Never married.

mydata_clean %>%
  select(WAGP, SEX_cat, MSP_cat) %>%
  filter(SEX_cat == "Male", MSP_cat == "Never married") %>%
  summarize(average_wage = mean(WAGP, na.rm = TRUE))

## # A tibble: 1 x 1
##   average_wage
##          <dbl>
## 1       36260.

mydata_clean %>%
  select(WAGP, SEX_cat, MSP_cat) %>%
  filter(SEX_cat == "Female", MSP_cat == "Never married") %>%
  summarize(average_wage = mean(WAGP, na.rm = TRUE))

## # A tibble: 1 x 1
##   average_wage
##          <dbl>
## 1       30681.
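
The two pipelines above differ only in the value of SEX_cat, so they can be combined into one by grouping on that variable. A minimal sketch on a toy tibble follows; the wages and the "Widowed" level are hypothetical values, not taken from the ACS data.

```r
suppressPackageStartupMessages(library(dplyr))

# Toy data (hypothetical values) standing in for mydata_clean.
toy <- tibble(
  WAGP    = c(1000, 3000, NA, 2000, 4000, 9000),
  SEX_cat = factor(c("Male", "Male", "Male", "Female", "Female", "Female")),
  MSP_cat = factor(c("Never married", "Never married", "Never married",
                     "Never married", "Never married", "Widowed"))
)

# Keep the never-married rows, then compute one mean per sex.
# na.rm = TRUE drops missing wages before averaging.
avg_wage_by_sex <- toy %>%
  filter(MSP_cat == "Never married") %>%
  group_by(SEX_cat) %>%
  summarize(average_wage = mean(WAGP, na.rm = TRUE))
print(avg_wage_by_sex)
```

Grouping returns one row per sex, which makes the two averages easy to compare side by side.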

Write a code block which counts the number of Male(s) and Female(s) for each race (use the variable RAC1P_cat).

mydata_clean %>%
  count(SEX_cat, RAC1P_cat)

## # A tibble: 17 x 3
##    SEX_cat RAC1P_cat                                            n
##    <fct>   <ord>                                            <int>
##  1 Male    White alone                                       8068
##  2 Male    Black or African American alone                    839
##  3 Male    American Indian alone                               70
##  4 Male    American Indian                                     33
##  5 Male    Asian alone                                       6034
##  6 Male    Native Hawaiian and Other Pacific Islander alone    99
##  7 Male    Some Other Race alone                             1443
##  8 Male    Two or more races                                 1001
##  9 Female  White alone                                       7963
## 10 Female  Black or African American alone                    900
## 11 Female  American Indian alone                               87
## 12 Female  Alaska Native alone                                  4
## 13 Female  American Indian                                     38
## 14 Female  Asian alone                                       6422
## 15 Female  Native Hawaiian and Other Pacific Islander alone   101
## 16 Female  Some Other Race alone                             1503
## 17 Female  Two or more races                                 1033
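
The output above has 17 rows: the Female list includes "Alaska Native alone" but the Male list does not, presumably because no such respondents appear in the sample. By default count() omits combinations with zero rows. The sketch below, on a toy tibble with made-up rows, shows how the `.drop = FALSE` argument keeps them when the grouping variables are factors.

```r
suppressPackageStartupMessages(library(dplyr))

# Toy data (hypothetical values): no Male row has "Alaska Native alone".
toy <- tibble(
  SEX_cat   = factor(c("Male", "Female", "Female"),
                     levels = c("Male", "Female")),
  RAC1P_cat = factor(c("White alone", "White alone", "Alaska Native alone"),
                     levels = c("White alone", "Alaska Native alone"))
)

# Default: only combinations that occur in the data are listed.
counts_default <- toy %>% count(SEX_cat, RAC1P_cat)

# .drop = FALSE: every combination of factor levels is listed, with n = 0 where empty.
counts_all <- toy %>% count(SEX_cat, RAC1P_cat, .drop = FALSE)
print(counts_all)
```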