Question: Are male and female entrepreneurs represented differently in the nonprofit industry?

The dataset contains ten survey questions from a survey of nonprofit entrepreneurs. The gender variable represents our study groups, male and female.

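library( dplyr )    # provides the %>% pipe used in the tables below
library( pander )   # pander() formats the tables

URL <- "https://github.com/DS4PS/cpp-524-sum-2020/blob/master/labs/data/female-np-entrepreneurs.rds?raw=true"
dat <- readRDS(gzcon(url( URL )))
head( dat )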
##   gender age income edu.level years.prof.exp experience.np.create
## 1 Female  54  79669  Graduate          11-15                   No
## 2 Female  62  63474  Graduate            15+                   No
## 3 Female  70  27887  Graduate            15+                  Yes
## 4   Male  63  63474  Graduate            15+                  Yes
## 5 Female  60 170832  Graduate            15+                  Yes
## 6 Female  41  69531  Graduate           6-10                  Yes
##   experience.np.form experience.np.other take.on.debt seed.funding
## 1                 No                 Yes           $0           No
## 2                Yes                 Yes           $0           No
## 3                Yes                 Yes           $0           No
## 4                 No                 Yes           $0           No
## 5                Yes                 Yes           $0          Yes
## 6                 No                  No           $0          Yes
##   most.imp.fund.source
## 1            Donations
## 2            Gov Grant
## 3            Donations
## 4            Donations
## 5           Corp Grant
## 6            Gov Grant



Survey Question 1:

Compare the education levels of male and female entrepreneurs. “What is the highest level of education you achieved?”

Tables

a <- table( dat$edu.level, dat$gender )
a %>% prop.table( margin=1 ) %>% round(2) %>% pander()
  Female Male
None 0.47 0.53
High School 0.4 0.6
Some College 0.57 0.43
Bachelor 0.6 0.4
Graduate 0.52 0.48
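
Note that margin=1 normalizes each row, so the table above gives the gender split within each education level. To compare how education is distributed within each gender, margin=2 normalizes each column instead. A minimal sketch on a made-up table (the counts and the name toy are illustrative assumptions, not survey data):

```r
# Hypothetical counts, not taken from the survey
toy <- matrix( c( 10, 30,
                  30, 30 ),
               nrow = 2, byrow = TRUE,
               dimnames = list( edu = c("Bachelor", "Graduate"),
                                gender = c("Female", "Male") ) )
toy <- as.table( toy )

# margin = 1: each row sums to 1 (gender share within an education level)
by.row <- prop.table( toy, margin = 1 )

# margin = 2: each column sums to 1 (education mix within a gender)
by.col <- prop.table( toy, margin = 2 )

print( round( by.row, 2 ) )
print( round( by.col, 2 ) )
```

Which margin to use depends on the question: row proportions describe who is in each category, column proportions describe how each gender is spread across the categories.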

Chi Square Tests

chisq.test( a, simulate.p.value = TRUE , B = 10000 )
## 
##  Pearson's Chi-squared test with simulated p-value (based on 10000
##  replicates)
## 
## data:  a
## X-squared = 4.3831, df = NA, p-value = 0.3588

Question 1

The chi-square p-value (0.36) is greater than 0.05, so we cannot reject the null hypothesis that education level and gender are independent. Women make up a slightly larger share of the Some College, Bachelor, and Graduate groups, but the differences are not statistically significant.
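
The decision rule used for each question can be written down once: compare the p-value to alpha and reject the null hypothesis of independence only when the p-value is smaller. A minimal sketch, in which the helper name interpret_chisq and the two toy tables are assumptions for illustration, not part of the lab:

```r
# Hypothetical helper: turns a simulated chi-square p-value into a decision
interpret_chisq <- function( tbl, alpha = 0.05, B = 2000 )
{
  test <- chisq.test( tbl, simulate.p.value = TRUE, B = B )
  decision <- ifelse( test$p.value < alpha,
                      "reject independence",
                      "fail to reject independence" )
  return( list( p.value = test$p.value, decision = decision ) )
}

set.seed( 123 )  # the simulated p-value depends on the random draws

# Made-up counts with identical gender splits in both rows
same.split <- matrix( c( 30, 20,
                         30, 20 ), nrow = 2, byrow = TRUE )

# Made-up counts where the rows have opposite gender splits
opposite.split <- matrix( c( 50,  0,
                              0, 50 ), nrow = 2, byrow = TRUE )

res.same <- interpret_chisq( same.split )
res.opposite <- interpret_chisq( opposite.split )
print( res.same$decision )
print( res.opposite$decision )
```

A p-value above alpha means only that the data do not contradict independence; it does not prove the two groups are identical.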





Survey Question 2:

Compare work experience for male and female entrepreneurs. “Do males or females have more professional* work experience?”

b <- table( dat$years.prof.exp, dat$gender )
b %>% prop.table( margin=1 ) %>% round(2) %>% pander()
  Female Male
0 0.73 0.27
1-2 0.67 0.33
3-5 0.62 0.38
6-10 0.57 0.43
11-15 0.58 0.42
15+ 0.53 0.47
chisq.test( b, simulate.p.value = TRUE , B = 10000 )
## 
##  Pearson's Chi-squared test with simulated p-value (based on 10000
##  replicates)
## 
## data:  b
## X-squared = 4.0086, df = NA, p-value = 0.5764

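The chi-square p-value (0.58) is greater than 0.05, so we cannot reject the null hypothesis that years of professional experience and gender are independent. Women make up a larger share of the lower-experience groups (73% of those with 0 years versus 53% of those with 15+), but the differences are not statistically significant.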





Survey Question 3:

Do females or males on average receive more seed funding?

c <- table( dat$seed.funding, dat$gender )
c %>% prop.table( margin=1 ) %>% round(2) %>% pander()
  Female Male
No 0.55 0.45
Yes 0.54 0.46
chisq.test( c, simulate.p.value = TRUE , B = 10000 )
## 
##  Pearson's Chi-squared test with simulated p-value (based on 10000
##  replicates)
## 
## data:  c
## X-squared = 0.033448, df = NA, p-value = 0.8558

The chi-square p-value (0.86) is greater than 0.05, so we cannot reject the null hypothesis that seed funding and gender are independent. Women make up nearly the same share of those who received seed funding (54%) as of those who did not (55%), so the data give no evidence that either gender receives seed funding more often.





Survey Question 4:

Are males or females more willing to take on debt to start a business?

d <- table( dat$take.on.debt, dat$gender )
d %>% prop.table( margin=1 ) %>% round(2) %>% pander()
  Female Male
$0 0.57 0.43
$0k-$10k 0.6 0.4
$10k-$25k 0.39 0.61
$25k-$50k 0.47 0.53
$50k+ 0.36 0.64
chisq.test( d, simulate.p.value = TRUE , B = 10000 )
## 
##  Pearson's Chi-squared test with simulated p-value (based on 10000
##  replicates)
## 
## data:  d
## X-squared = 8.6158, df = NA, p-value = 0.06689

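The chi-square p-value (0.067) is just above 0.05, so we cannot reject the null hypothesis of independence at the 5% level. Men make up a larger share of the higher debt categories (61% of $10k-$25k and 64% of $50k+), which suggests that males may be more willing to take on debt, but the evidence is not statistically significant.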





Survey Question 5:

Do males and females differ in which source of funding is most important to their businesses?

e <- table( dat$most.imp.fund.source, dat$gender )
e %>% prop.table( margin=1 ) %>% round(2) %>% pander()
  Female Male
Donations 0.51 0.49
Founder 0.56 0.44
Earned Revenues 0.66 0.34
Foundation Grant 0.59 0.41
Gov Grant 0.59 0.41
Member Fees 0.46 0.54
Parent Org 0.5 0.5
Angel 0.52 0.48
Corp Grant 0.67 0.33
chisq.test( e, simulate.p.value = TRUE , B = 10000 )
## 
##  Pearson's Chi-squared test with simulated p-value (based on 10000
##  replicates)
## 
## data:  e
## X-squared = 8.8304, df = NA, p-value = 0.367

The chi-square p-value (0.37) is greater than 0.05, so we cannot reject the null hypothesis that the most important funding source and gender are independent. The female share varies across sources (from 46% for Member Fees to 67% for Corp Grant), but these differences are not statistically significant.





Survey Question 6:

Does the average age of nonprofit founders differ by gender?

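f <- table( dat$age, dat$gender )  # assumed definition; the original omits it, this follows the pattern of tables a-e
chisq.test( f, simulate.p.value = TRUE , B = 10000 )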
## 
##  Pearson's Chi-squared test with simulated p-value (based on 10000
##  replicates)
## 
## data:  f
## X-squared = 98.545, df = NA, p-value = 5e-04
t.test( age ~ gender, data=dat )
## 
##  Welch Two Sample t-test
## 
## data:  age by gender
## t = -3.1749, df = 589.02, p-value = 0.001577
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -5.031881 -1.185709
## sample estimates:
## mean in group Female   mean in group Male 
##             51.93948             55.04828
get_pval_from_tval <- function( tvalue, n1, n2 )
{
  # conservative degrees of freedom: the smaller group size minus one
  df <- min( n1, n2 ) - 1
  # two-tailed p-value from the t distribution
  pval <- 2 * pt( abs(tvalue), df=df, lower.tail=FALSE )
  return( pval )
}

get_pval_from_tval( tvalue=-3.1749, n1=655, n2=657 )
## [1] 0.001569263

The Welch t-test indicates that male nonprofit entrepreneurs are on average about 3.1 years older than female nonprofit entrepreneurs (55.0 versus 51.9 years; 95% confidence interval for the difference: 1.2 to 5.0 years; p = 0.0016). The Analysis section revisits this result with a Bonferroni-adjusted alpha.
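
The manual p-value above uses a conservative degrees-of-freedom rule (the smaller group size minus one), while t.test() reports the Welch-Satterthwaite approximation, which is why the two p-values differ slightly. A sketch of that approximation on simulated data (the helper name welch_df and the simulated ages are assumptions, not the survey data):

```r
# Hypothetical helper: Welch-Satterthwaite degrees of freedom
welch_df <- function( x, y )
{
  vx <- var( x ) / length( x )   # squared standard error of mean(x)
  vy <- var( y ) / length( y )   # squared standard error of mean(y)
  ( vx + vy )^2 /
    ( vx^2 / ( length( x ) - 1 ) + vy^2 / ( length( y ) - 1 ) )
}

set.seed( 42 )
age.f <- rnorm( 120, mean = 52, sd = 10 )  # simulated, not survey data
age.m <- rnorm( 100, mean = 55, sd = 14 )  # simulated, not survey data

df.manual <- welch_df( age.f, age.m )
df.ttest  <- unname( t.test( age.f, age.m )$parameter )

print( c( manual = df.manual, t.test = df.ttest ) )
```

The two values should agree, which confirms that the df printed by t.test() comes from the group variances and sizes rather than from a simple n - 1 rule.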





Survey Question 7:

Were the income levels of male and female entrepreneurs different at the time they started their nonprofits?

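g <- table( dat$income, dat$gender )  # assumed definition; the original omits it, this follows the pattern of tables a-e
chisq.test( g, simulate.p.value = TRUE , B = 10000 )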
## 
##  Pearson's Chi-squared test with simulated p-value (based on 10000
##  replicates)
## 
## data:  g
## X-squared = 601.88, df = NA, p-value = 0.6753
t.test( income ~ gender, data=dat )
## 
##  Welch Two Sample t-test
## 
## data:  income by gender
## t = -3.6353, df = 630.22, p-value = 0.0003003
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -20239.710  -6042.518
## sample estimates:
## mean in group Female   mean in group Male 
##             67741.83             80882.95

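The chi-square p-value (0.68) is greater than 0.05, but a chi-square test on a numeric variable such as income treats each distinct value as its own category, so it says little about whether income differs by gender. The Welch t-test compares mean incomes directly: at the time of starting a nonprofit, mean income was about $13,100 higher for males ($80,883 versus $67,742; 95% confidence interval for the difference: $6,043 to $20,240; p = 0.0003), a statistically significant difference.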
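
A chi-square test needs a manageable number of categories, so a continuous variable such as income is usually grouped into brackets before it is tabulated. A minimal sketch with simulated incomes; the bracket boundaries, labels, and variable names are illustrative assumptions:

```r
set.seed( 7 )
# Simulated data, not the survey responses
toy <- data.frame(
  gender = rep( c("Female", "Male"), each = 200 ),
  income = round( runif( 400, min = 10000, max = 200000 ) )
)

# Group income into four brackets (boundaries chosen for illustration)
toy$income.group <- cut( toy$income,
                         breaks = c( 0, 50000, 100000, 150000, Inf ),
                         labels = c( "0-50k", "50k-100k", "100k-150k", "150k+" ) )

# Cross-tabulate the brackets by gender, then test for independence
g.binned <- table( toy$income.group, toy$gender )
print( g.binned )

print( chisq.test( g.binned, simulate.p.value = TRUE, B = 2000 ) )
```

Binning discards information, so for a numeric outcome the t-test on the raw values remains the more direct comparison of means.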





Analysis

- A Bonferroni-corrected alpha helps guard against type I errors when several contrasts are tested together.
- The lowest p-value across the contrasts was that of income (Welch t-test, p = 0.0003), followed by age (p = 0.0016).
- Age and income were the only contrasts significant at the 0.05 level. With a Bonferroni correction for the seven contrasts above, the threshold becomes 0.05 / 7 ≈ 0.0071, and both p-values remain below it. We therefore reject the null hypothesis for age and income, and fail to reject it for education, work experience, seed funding, willingness to take on debt, and most important funding source.
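
The Bonferroni correction can be checked directly from the p-values reported above. The sketch below assumes the seven contrasts form the family of tests and uses the Welch t-test p-values for age and income; both choices are assumptions, as are the variable names:

```r
# p-values copied from the test outputs in this report
pvals <- c( education   = 0.3588,
            experience  = 0.5764,
            seed.fund   = 0.8558,
            debt        = 0.06689,
            fund.source = 0.367,
            age         = 0.001577,
            income      = 0.0003003 )

alpha <- 0.05
alpha.adj <- alpha / length( pvals )   # Bonferroni threshold per test

# Equivalent view: inflate the p-values and compare them to alpha
p.adj <- p.adjust( pvals, method = "bonferroni" )

print( alpha.adj )
print( data.frame( p = pvals, p.bonferroni = p.adj,
                   significant = pvals < alpha.adj ) )
```

Comparing raw p-values to alpha / m and comparing adjusted p-values to alpha are two views of the same rule; a different family size m would change the threshold.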
