######################################################################################
####################### AnthroTools: Some Custom Tools for Anthropology ##############
######################################################################################
### Benjamin Grant Purzycki and Alastair Jamieson-Lane prepared this script for use
### with the accompanying article:
### Purzycki, B. G., and Jamieson-Lane, A. (2017). AnthroTools: A package for cross-cultural
### ethnographic data analysis. Cross-Cultural Research 51(1):51-74.
### Please feel free to contact us with any questions, comments, or concerns at
### [email protected] or [email protected]. We are happy to help develop this
### package, so if you have any recommendations, please let us know!
### If you wish to cite this, please cite the package using the following:
### Alastair Jamieson Lane and Benjamin Grant Purzycki. (2016). AnthroTools: Some custom
### tools for anthropology. R package version 0.9.
### If at any point you wish to call up the help menu, run this:
help("AnthroTools")
##########################################################################################
#################################### Installation ########################################
##########################################################################################
### To install the package, highlight the following and use CTRL+R
### if you're using a PC, or command+return if using a Mac:
install.packages("devtools")
library("devtools")
install_github('alastair-JL/AnthroTools')
library(AnthroTools)
### You might consider re-running these lines occasionally, as doing so will install any updates.
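### To confirm that the installation worked and to see which version you have, you can
### use base R's packageVersion():
packageVersion("AnthroTools")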
##########################################################################################
######################## Analyzing Free-List Data ########################################
##########################################################################################
### Now you're ready. We'll now walk through a free-list example in the main text
### step-by-step. We write this primarily for those new to R.
### First, run the following:
data(FruitList)
View(FruitList)
### You just loaded and pulled up the FruitList data set. Now, you'll create a new
### object. Let's call it "FL".
FL <- CalculateSalience(FruitList)
View(FL)
### You just ran the function "CalculateSalience". It returns your data set with
### a new column containing the salience value of each listed item. Here's
### the full call for doing the same thing with all of the options made explicit.
FL <- CalculateSalience(FruitList, Order = "Order", Subj = "Subj",
CODE = "CODE", GROUPING = NA, Rescale = FALSE, Salience = "Salience")
### One helpful tool is built into the R language. It will call up only the
### first-listed items per person. Given the assumption that earlier-listed
### items are more salient (Romney & D'Andrade, 1964), this will allow for
### a quick look at the most salient items.
FruitList[which(FruitList$Order == 1), ]
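### Building on this, you can tabulate which items people listed first most often
### (a quick base-R sketch using the same columns as above):
first.items <- FruitList[which(FruitList$Order == 1), ]
sort(table(first.items$CODE), decreasing = TRUE) ## first-listed items, most frequent first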
### Now, to calculate salience per item, you can run the following code.
### The output reports each item's mean salience, summed salience, and Smith's S.
FL.S <- SalienceByCode(FL, dealWithDoubles = "MAX")
View(FL.S)
### Here is the full code, with example values already plugged in. Note that if you name
### the salience column something else, you'll have to alter the
### Salience = "Fruits_S" part accordingly.
FL.S <- SalienceByCode(FL, Subj = "ID", CODE = "CODE", GROUPING = "Culture",
Salience = "Fruits_S", dealWithDoubles = "MAX")
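### For instance, if you name the salience column "Fruits_S" when you calculate it,
### a matching pair of calls (a runnable sketch with the tutorial data) would be:
FL2 <- CalculateSalience(FruitList, Salience = "Fruits_S")
FL2.S <- SalienceByCode(FL2, Salience = "Fruits_S", dealWithDoubles = "MAX")
View(FL2.S)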
### What's nice about the "dealWithDoubles" argument is that it lets you handle
### a common problem in free-list data: respondents sometimes list the same item
### more than once. You can deal with this in a variety of ways; run help(SalienceByCode)
### for the options and their descriptions.
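### If you want to see which responses are affected before choosing an option, you can
### check for repeats yourself (a base-R sketch using the "Subj" and "CODE" columns):
doubles <- duplicated(FruitList[, c("Subj", "CODE")])
FruitList[doubles, ] ## rows where a person repeats a code
sum(doubles) ## total number of repeated listings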
### The following code highlights the "GROUPING" option. This option allows you
### to run the analyses detailed above, but split by some factor of your choosing.
### It is especially useful for cross-cultural research. In this example, say we
### collected "kinds of fruits" data from three different samples (in this case,
### we have people from "MAINLAND", an "ISLAND", and the "MOON").
data(WorldList)
View(WorldList)
WL <- CalculateSalience(WorldList, GROUPING = "GROUPING")
View(WL)
WL.S <- SalienceByCode(WL, dealWithDoubles = "MAX", GROUPING = "GROUPING")
View(WL.S)
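### To compare the most salient items across samples, you can split the grouped output
### by sample and sort each piece by Smith's S. (A sketch: the column names "GROUPING"
### and "SmithsS" are assumptions here, so check names(WL.S) on your own output.)
by.group <- split(WL.S, WL.S$GROUPING)
lapply(by.group, function(g) g[order(-g$SmithsS), ]) ## each group, highest Smith's S first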
### We also include some table functions that make data integration easier. There
### are a few options for the tableType argument: PRESENCE, SUM_SALIENCE, MAX_SALIENCE,
### and FREQUENCY. PRESENCE reports a 1 if participants mentioned specific codes or a 0
### otherwise. FREQUENCY reports by-code frequencies, MAX_SALIENCE reports the salience of
### the code the first time it was mentioned, while SUM_SALIENCE reports the sum salience
### each person has for each code.
### Here is the general template; "dataset" is a placeholder for your own data frame,
### and tableType should be set to one of the four options above (here, "PRESENCE"):
FLT <- FreeListTable(dataset, CODE = "CODE", Salience = "Salience", Subj = "Subj",
                     tableType = "PRESENCE")
View(FLT)
colSums(FLT) ## column totals per code; drop any subject-ID column first if this errors
### Let's try two with the Fruit List data.
### The first one (FL.BIN) will be a dichotomous dataset of whether or not someone mentioned a
### certain item. The second one (FL.FREQ) is a frequency matrix of the number of times each
### participant listed a specific item.
FL.BIN <- FreeListTable(FL, CODE = "CODE", Salience = "Salience", Subj = "Subj",
tableType = "PRESENCE")
View(FL.BIN)
FL.FREQ <- FreeListTable(FL, CODE = "CODE", Salience = "Salience", Subj = "Subj",
tableType = "FREQUENCY")
View(FL.FREQ)
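### The PRESENCE table makes follow-up summaries easy. For example, you can count how
### many respondents listed each item (a sketch; "Subject" is an assumed name for the
### ID column, so check names(FL.BIN) on your own output):
code.cols <- setdiff(names(FL.BIN), "Subject") ## keep only the item columns
item.counts <- colSums(FL.BIN[, code.cols, drop = FALSE])
sort(item.counts, decreasing = TRUE) ## items by number of respondents who listed them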
##########################################################################################
###################### Cultural Consensus Analysis #######################################
##########################################################################################
### The following: 1) Calls up a mock data set, 2) Pops up the set viewer,
### 3) Runs the Cultural Consensus Analysis on this data set, and 4) Reports
### the analysis.
data(ConsensusTestData)
View(ConsensusTestData)
Results <- ConsensusPipeline(ConsensusTestData, 3) ## 3 = number of possible answers per question
Results
### The sections of the report are as follows. If you wish to take them segment-by-segment,
### you may use these:
Results$Answers ##The culturally "correct" answers estimated by the function.
Results$Competence ##This is the estimated cultural competence for each individual.
Results$origCompetence ##This is the competence score before transformation into the [0,1] range.
Results$TestScore ##Assuming the answer key is correct, this is the test score for each individual.
Results$Probs ##The probability that each answer is correct for a given question.
Results$reportback ##Important feedback from the analyses
Results$reportNumbers ## A vector containing all numerical values contained in reportback.
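### For a quick visual check on the competence estimates, base graphics will do
### (no extra packages needed):
hist(Results$Competence, main = "Estimated cultural competence", xlab = "Competence")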
#########################################################################################
############################# Stress Tests ##############################################
#########################################################################################
### This is a more technical assessment of one's analysis. We've left the full discussion
### to the help page and the main document. Still, the following will quickly perform all
### of the analyses described in the main paper.
StressSummary <- ConsensusStressTest(15, 15, 4, 5000, lockCompetance = 0.6)
### This means: 15 individuals, 15 questions, 4 possible answers A,B,C,D, 5000 surveys simulated,
### all individuals have competence 0.6.
StressSummary[[1]] ##True and expected error rate.
mean(StressSummary[[2]]) ##The mean of the mean of calculated competence (should be near 0.6).
sum(StressSummary[[3]] > 0.1) / length(StressSummary[[3]])
### The proportion of simulations in which the calculated competence variance exceeds 0.1.
### Note that the true value is 0, so any variance found here is noise.
quantile(StressSummary[[3]], 0.95)
### 95% of simulations detected variance below this value, even when the true variance is 0.
### If your measured variance is below this level, there probably isn't much evidence for
### real variability in competence.
quantile(StressSummary[[3]], c(0.5, 0.95, 0.99, 0.999))
sum(StressSummary[[4]] < 3.0)
### This last number is the number of simulated surveys with a Comrey ratio below 3; these are
### datasets that the function would refuse to analyze unless the safety override were used.
### Please understand that this is the number of "good" datasets that the function believes
### are bad. It tells you nothing about what the Comrey ratio is likely to look like on "bad"
### datasets where important assumptions are violated.
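### Expressed as a proportion of all simulated surveys, that false-alarm rate is:
mean(StressSummary[[4]] < 3.0)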