-
Notifications
You must be signed in to change notification settings - Fork 6
/
g_expansion.Rmd
928 lines (647 loc) · 55 KB
/
g_expansion.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
---
title: How and why does behavioral flexibility vary across the range of a rapidly expanding species?
author: '[Logan CJ](http://CorinaLogan.com) (Max Planck Institute for Evolutionary Anthropology, corina_logan@eva.mpg.de), Rowney C (Max Planck Institute for Evolutionary Anthropology), Bergeron L (University of California Santa Barbara, UCSB), McCune KB (UCSB), MacPherson M (UCSB), Bond A (Arizona State University), [Blackwell A](http://www.anth.ucsb.edu/faculty/blackwell/) (University of California, Santa Barbara / Washington State University), [Escalante A](http://igem.temple.edu/people/person/c0ca56d9e60be58ca9380514ed8ac51d) (Temple University), [Pacheco M](http://igem.temple.edu/people/person/6417596d4976b9cbfa2a5ba8062ed7b9) (Temple University), [Lukas D](http://dieterlukas.strikingly.com) (Max Planck Institute for Evolutionary Anthropology)'
date: '`r Sys.Date()`'
output:
html_document:
toc: true
toc_depth: 4
toc_float:
collapsed: false
code_folding: hide
md_document:
toc: true
pdf_document:
keep_tex: yes
latex_engine: xelatex
github_document:
toc: true
bibliography: /Users/corina/GitHub/grackles/Files/MyLibrary.bib
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
#Make code wrap text so it doesn't go off the page when Knitting to PDF
library(knitr)
opts_chunk$set(tidy.opts=list(width.cutoff=60),tidy=TRUE)
```
***Click [here](https://github.com/corinalogan/grackles/blob/master/README.md) to navigate to the version-tracked reproducible manuscript (.Rmd file)***
###A. STATE OF THE DATA
This preregistration was written prior to collecting any data.
###B. PARTITIONING THE RESULTS
We may publish the results from each hypothesis separately.
###C. HYPOTHESES
The way we define flexibility (individuals change behavior according to changing circumstances using learning from previous experiences) assumes that being flexible will always be associated with the higher fitness payoff (e.g., higher quality food rewards). Therefore, the reasons individuals might not be flexible are 1) there are indirect costs to flexibility (e.g., exploration, risks involved in uncertain choices), 2) individuals live in environments that rarely change or change randomly, or 3) individuals have internal constraints (e.g., not enough energy). Points [1](./g_exploration.Rmd) and [3](./g_withinpop.Rmd) are being tested in different preregistrations. Here we test point 2.
For the purposes of these predictions, the site that is in the middle of their original geographic range (Central America) is the Honduras site. “Edge” is defined based on distance from Central America (i.e., grackles expanded northward into Arizona in the 1960s and Nebraska in the 2000s (the northern edge) (Figure 1).
![Figure 1. The geographic range of the great-tailed grackle with the three field sites marked by (*). The arrow represents the northward direction of their range expansion. Purple squares represent the percentage of eBird.org birding checklists that report this species (see legend). Map credit: eBird.org.](g_expansionFig1.png)
#### H1: Individuals are more [behaviorally flexible](./g_flexmanip.Rmd) (measured by reversal learning and switching between options on a multi-access box) near the edge of their geographic range (in the US relative to Honduras) if flexibility is required for range expansion.
**Prediction 1:** There will be a positive association between flexibility (individuals that are faster to reverse preferences on a reversal learning task and who also have lower latencies to switch to solving new loci after previously solved loci become unavailable) a population's proximity to the edge of the geographic range. If this is supported, other variables should also be involved in how such flexibility is achieved. These other variables will be investigated in the hypotheses below.
**P1 alternative 1:** If there no difference in flexibility between the edge and non-edge populations, this could indicate that most individuals are already highly flexible (ceiling effect).
**P1 alternative 2:** If there is a negative correlation or no difference in flexibility between the edge and non-edge populations, this suggests that other variables might play a larger role than flexibility in how this species is able to rapidly expand its geographic range (see hypotheses below).
#### H2: Grackles in populations in the US will subsist more on human foods and less on wild foods than those in Honduras.
**P2:** The grackle's geographic expansion may be facilitated by human expansion and the ability to supplement or replace wild foraged foods with human food waste and other products (@wehtje2003range).
**P2 alternative 1:** There is no difference between sites in reliance on human foods, due either to a preference for human foods that exists across all sites, or an ability of grackles to flexibly forage at all sites.
#### H3: Populations closer to the range edge will have lower testosterone.
**P3:** Grackles living on the range edge will experience reduced food security, harsh physiological conditions (e.g., cold temperatures and snow) and will likely have poorer energetic condition, which will result in lower levels of testosterone (maintaining a high testosterone phenotype is difficult under energetic duress). Males in good condition with high testosterone are likely better able to control territories within the central range of the grackle habitat, pushing low testosterone males in poorer condition to the periphery of the grackle range.
**P3 alternative 1:** Alternatively, testosterone has been associated with novel sensation seeking and risk taking in humans (e.g., Campbell et al 2010), so males with higher testosterone may seek novel stimuli, moving to the periphery of grackle ranges.
**P3 alternative 2:** Alternatively, testosterone may be higher in the US populations due to a higher reliance on high calorie human foods and reduced parasite loads, paralleling differences seen in human populations.
#### H4: Populations closer to the range edge will have higher levels of serum glucocorticoids
**P4:** Grackles living on the range edge will experience reduced food security, harsh physiological conditions (e.g., cold temperatures and snow) and likely have poorer energetic condition, which will result in higher circulating glucocorticoids as these factors all represent physiological and mental stressors.
**P4 Alternative 1:** Constant physiological and mental stressors can result in chronically low levels of cortisol in humans (Yehuda 2002), and low cortisol responsivity to stresses (e.g. Yehuda and Seckl 2011). If this is a phylogenetically constrained phenomenon, then constantly stressed grackles may have low baseline glucocorticoids and a low stress response. Because samples are taken following the grackles being captured, they will likely be experiencing a stressor and have higher than baseline levels of corticosterone: if chronically stressed grackles on the margins of their territory do not espouse a strong stress response, they may appear to have low baseline cortisol when in actuality we are comparing non-reactive cortisol in marginal populations to a strong stress response in central populations.
#### H5: Populations closer to the range edge will have poorer body condition.
**P5:** The longer the distance from the center of the original range, the more likely they are to have poor body condition (shorter tarsus length) because of the difficulty of moving to new environments that are physiologically more stressful (e.g., cold temperatures and snow) than their original tropical environment.
**P5 alternative 1a:** If there is a positive correlation between distance from the center of the original range and body condition (longer tarsus length), this could indicate that the better quality individuals are the ones that are better able to disperse and adapt to new environments.
**P5 alternative 1b:** A positive correlation between body size and edge distance could also result if grackles in Arizona and Nebraska subsist more on high calorie human foods. However, we would expect this to be associated with poorer condition on other measures such as testosterone and hematocrit and/or red blood cell counts.
**P5 alternative 2:** If there is no correlation, this could indicate these variables are independent from each other or that they are related to another variable that was not measured
**P5 alternative 3:** Grackles expanding into colder areas (Nebraska) may have a larger body size, which is beneficial in a colder environment. If this is the case, then we predict that grackles will be smaller in Arizona and Honduras.
#### H6: Grackles in populations closer to the range edge will have higher variation in haemosporidian parasite loads.
**P6:** Populations closer to the range edge will have a higher individual variance in parasite loads (the total number of parasite individuals and the total number of parasite species), harbor more novel parasite species, and more parasite species that are not common near the range edge because 1) grackles that are nearer the range edge might be in poorer energetic condition and more susceptible to acquiring parasites, and 2) the younger age of these grackle populations exposes them to new parasite species due to encountering new geographic regions.
**P6 alternative 1:** Populations closer to the range edge will have lower haemosporidian parasite loads because there will be less contact with suitable vectors due to their recent establishment.
**P6 alternative 2:** Populations closer to the range edge will have haemosporidian parasite loads that are not different from other populations because these parasites were carried by grackles throughout their range expansion, which could indicate that flexibility and haemosporidian parasite loads are independent from each other or that they are related to another variable that was not measured.
#### H7: Grackles in populations closer to the range edge will have elevated immune markers indicative of active infection.
**P7:** Given the variation in exposure to novel parasites and poorer energetic condition (H6), we expect grackles closer to the range edge to have higher indicators of active infection, including elevated total leukocytes, elevated heterophils, and a greater number of immature red blood cells, indicating increased turn-over in red blood cells.
**P7 alternative 1** If P6 alternative 1 is supported, and grackles closer to the range edge are exposed to fewer parasites and have lower disease transmission, then we expect lower measures of immune activation.
#### H8: Populations closer to the range edge will have lower indicators or baseline cellular immunity when not actively fighting infection.
**P8:** Individuals in poor energetic condition from reduced food availability, increased physical activity to get scarce food resources, or physiologically stressful conditions (e.g., cold temperatures and snow) will lower lymphocyte counts, and lower lymphocyte/heterophil ratios.
**P8 alternative 1:** Alternatively, high lymphocyte counts may facilitate range expansion by improving resistance to novel pathogens, and individuals with high counts may be more likely to be found in edge populations.
**P8 alternative 2:** If individuals in edge populations are not in poorer condition, they might still have lower baseline immunity if they are less regularly challenged by parasites and pathogens.
#### H9: Populations closer to the range edge will have higher microbiome species diversity.
**P9:** Populations closer to the range edge will have higher microbiome species diversity because 1) grackles that are nearer the range edge might be in poorer energetic condition and more susceptible to hosting more species, and/or 2) the younger age of these grackle populations exposes them to new microbiome species due to consuming new foods (@alberdi2016vertebrate). The higher transient diversity of microbiome species in the edge population could stabilize over time.
**P9 alternative 1:** Populations closer to the range edge will have lower microbiome species diversity because these individuals exploit similar food sources and/or because they are a more recently established population that has not yet become a suitable host for particular microbiome species.
**P9 alternative 2:** Populations closer to the range edge will have microbiome species diversity that is not different from other populations, which could indicate that flexibility and microbiome species diversity are independent from each other or that they are related to another variable that was not measured.
#### H10: Populations closer to the range edge will have stronger selection for flexibility (a stronger association between fitness and flexibility) and lower heritability of flexibility (based on the pedigree)
**P10:** More flexible individuals are more likely to have fitness benefits in edge (novel) environments, whereas in non-edge populations previous experiences (either during development (plasticity) or during a "sensitive period" (initial learning about what/where/when resources are available, when to breed, etc.) or previous generations (selection)) may have shaped responses to the environment. Accordingly, in edge populations the fitness benefits of flexibility will outweigh any potential fitness costs, and there will be a stronger correlation between fitness and flexibility. In edge environments, when there is stronger selection for flexibility there will be less variation because everyone has to be flexible (e.g., @jones2018behavioral), therefore knowing which lineage individuals come from is not informative about how flexible they will be. In contrast, in non-edge populations where there is weaker selection for or against flexibility, variation in flexibility might exist and be inherited. Selection will be measured as the correlation between the number of offspring an individual has (determined using paternity and maternity from ddRADseq) and [flexibility](./g_flexmanip.Rmd), and the heritability of flexibility will be measured in a separate [preregistration](./g_flexgenes.Rmd).
**P10 alternative:** We will not detect selection for flexibility in individuals living in edge populations potentially because 1) the dispersing individuals are subordinate are forced to leave their natal areas and therefore the trait under selection would be dominance rank (investigated in a future preregistration), and/or 2) there are high costs associated with flexibility in novel environments (e.g., parasite exposure, a different climate, unfamiliar breeding sites, see previous hypotheses), and/or 3) non-edge populations also have a high degree of immigration and mixing such that individuals constantly experience new environments (measured using ddRADseq, see P12 alternative 2), and therefore we will not be able to detect a signal between fitness and flexibility.
#### H11: Populations at the range edge will be more likely to have unique traits (traits not found in other populations)
**P11:** Edge populations will be more divergent for some traits if bottlenecks occurred during their geographic range expansion (repeated linear expansion - like hopping from one island to the next). Rare variants can increase in frequency because migration into new areas usually starts from a random small subset of individuals, who by chance might have certain traits that are usually not found in most individuals.
**P11 alternative:** Edge populations will not be more divergent for some traits if migration into new areas is sourced from multiple places along the edge.
#### H12: Populations closer to the edge will have reduced genetic diversity, though genetic differences among pairs of individuals can be high
**P12:** Edge populations will have lower levels of genetic variation because expansion occurs through stepwise repeated migration of a small number of individuals who only carry a subset of the genetic variation found in the source population. Pairs of individuals might be different from each other if individuals immigrated from multiple distinct source populations, whereas in non-edge populations most individuals are likely to have been hatched in the same population.
**P12 alternative 1:** Edge populations will not show reduced variation if the expansion occurs gradually and where multiple individuals slowly move into new areas while remaining in contact with the source population. Pairs of individuals will not show higher genetic differences because they all originated from the same source population.
**P12 alternative 2:** Edge populations will not show reduced variation if individuals frequently move long distances to breed so that edge populations as well as more central populations contain very divergent individuals.
###D. METHODS
####**Randomization and counterbalancing**
No randomization or counterbalancing is involved in this study.
####**Blinding of conditions during analysis**
No blinding is involved in this study.
####**Dependent variables**
####*P1: edge & flexibility*
1) Flexibility 1: **Number of trials to reverse** a preference in the last reversal an individual experienced (reversal learning; an individual is considered to have a preference if it chose the rewarded option at least 17 out of the most recent 20 trials, with a minimum of 8 or 9 correct choices out of 10 on the two most recent sets of 10 trials). See behavioral flexibility [preregistration](./g_flexmanip.Rmd).
2) Flexibility 3: If the number of trials to reverse a preference does not positively correlate with the latency to attempt or solve new loci on the multi-access box (an additional measure of behavioral flexibility), then the **average latency to solve** and the **average latency to attempt** a new option on the multi-access box will be additional dependent variables. See behavioral flexibility [preregistration](./g_flexmanip.Rmd).
One model will be run per dependent variable.
####*P2: edge & foraging variation*
Number of human food types consumed in the first X minutes (X=the sum of the total observation time per individual, using the individual who had the lowest sum to equalize observation time across individuals) from focal follow observations on wild, individually-marked grackles
####*P3: edge & testosterone*
Testosterone concentration at capture
####*P4: edge & glucocorticoids*
Serum glucocorticoid concentration
####*P5: edge & body condition*
1) Tarsus length (average of left and right legs)
2) Hematocrit
3) Number of red blood cells
####*P6: edge & haemosporidian parasites*
1) Number of parasite individuals
2) Number of parasite species
3) Number of novel (to grackles) parasite species
4) Number of uncommon parasite species in this geographic area
One model will be run per dependent variable.
####*P7: edge & immune activation*
1) Number of white blood cells
2) Number of heterophils
3) Number of immature red blood cells
####*P8: edge & baseline immunity*
1) Number of lymphocytes
2) Lymphocyte/heterophil ratio
####*P9: edge & microbiome*
1) Number of species
####*P10: selection on flexibility*
1) Slope of the correlation between flexibility and the number of offspring per individual (controlled for age and sex)
####*P11: unique genes at edge*
1) Number of genetic variants that only exist in a single population
####*P12: reduced genetic diversity at edge*
1) Number of different genetic variants per locus
2) Heterozygosity: average proportion of the genome at which individuals carry two different genetic variants on the two different chromosomes
####**Independent variables**
####*P1-P12*
1) Population: center (Central America), middle of the northern expansion front (Arizona), edge of the northern expansion front (Nebraska)
###E. ANALYSIS PLAN
We do not plan to **exclude** any data. When **missing data** occur, the existing data for that individual will be included in the analyses for the tests they completed. Analyses will be conducted in R (current version `r getRversion()`; @rcoreteam). We will analyze data for females and males separately because each sex has a distinct natural history (@johnson2001great).
####*Data checking*
The data will be visually checked to determine whether they are normally distributed via two methods: 1) normality is indicated when the histograms of actual data match those with simulated data (Figure 2), and 2) normality is indicated when the residuals closely fit the dotted line in the Normal Q-Q plot (Figure 3) [@zuur2009].
```{r dist_checkHistQQ, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, fig.cap="**Figure 2.** Part 1. The distribution of the actual data versus what a normal distribution would look like with simulated data. Part 2. Residuals vs fitted: checking for homogeneity, which is satisfied if residuals have an even spread across the x-axis; Normal Q-Q: residuals are normally distributed if they are on the diagonal line; Residuals vs leverage: Cook's distance <1 means no influential observations [@zuur2009]."}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#Check the dependent variables for normality: Histograms
#females
op <- par(mfrow=c(4,4), mar=c(4,4,2,0.2))
#This is what the distribution of actual data looks like
hist(fem$TrialsToReverseLast, xlab="Flexibility: Trials to reverse", main="Actual Data")
hist(fem$NumberHumanFoodsEaten, xlab="% human foods eaten", main="Actual Data")
hist(fem$AvgTarsusLength, xlab="Condition: Avg tarsus length", main="Actual Data")
hist(fem$Hematocrit, xlab="Condition: Avg tarsus length", main="Actual Data")
hist(fem$AvgTarsusLength, xlab="Condition: Avg tarsus length", main="Actual Data")
hist(fem$NumberRedBloodCells, xlab="Condition: number RBCs", main="Actual Data")
hist(fem$NumberLeucocytes, xlab="Immune activation: number WBCs", main="Actual Data")
hist(fem$NumberHeterophils, xlab="Immune activation: number heterophils", main="Actual Data")
hist(fem$NumberImmatureRedBloodCells, xlab="Immune activation: number immature RBCs", main="Actual Data")
hist(fem$NumberLymphocytes, xlab="Baseline immunity: number lymphocytes", main="Actual Data")
hist(fem$LymphocyteHeterophilRatio, xlab="Baseline immunity: lymphocyte/heterophil ratio", main="Actual Data")
hist(fem$NumberParasiteSpecies, xlab="Parasites: number of species", main="Actual Data")
hist(fem$NumberNovelParasiteSpecies, xlab="Parasites: number novel species", main="Actual Data")
hist(fem$NumberUncommonParasiteSpecies, xlab="Parasites: number uncommon species", main="Actual Data")
hist(fem$NumberMicrobiomeSpecies, xlab="Microbiome: number of species", main="Actual Data")
#Given the actual data, this is what a normal distribution would look like
X2 <- rnorm(1281, mean=mean(fem$TrialsToReverseLast), sd=sd(fem$TrialsToReverseLast))
hist(X2, xlab="Flexibility: Trials to reverse", main="Simulated Data")
G2 <- rnorm(1281, mean=mean(fem$NumberHumanFoodsEaten), sd=sd(fem$NumberHumanFoodsEaten))
hist(G2, xlab="% human foods eaten", main="Simulated Data")
H2 <- rnorm(1281, mean=mean(fem$Hematocrit), sd=sd(fem$Hematocrit))
hist(H2, xlab="Condition: hematocrit", main="Simulated Data")
Z2 <- rnorm(1281, mean=mean(fem$AvgTarsusLength), sd=sd(fem$AvgTarsusLength))
hist(Z2, xlab="Condition: Avg tarsus length", main="Simulated Data")
I2 <- rnorm(1281, mean=mean(fem$NumberRedBloodCells), sd=sd(fem$NumberRedBloodCells))
hist(I2, xlab="Condition: number RBCs", main="Simulated Data")
J2 <- rnorm(1281, mean=mean(fem$NumberLeucocytes), sd=sd(fem$NumberLeucocytes))
hist(J2, xlab="Immune activation: number WBCs", main="Simulated Data")
K2 <- rnorm(1281, mean=mean(fem$NumberHeterophils), sd=sd(fem$NumberHeterophils))
hist(K2, xlab="Immune activation: number heterophils", main="Simulated Data")
L2 <- rnorm(1281, mean=mean(fem$NumberImmatureRedBloodCells), sd=sd(fem$NumberImmatureRedBloodCells))
hist(L2, xlab="Immune activation: number immature RBCs", main="Simulated Data")
M2 <- rnorm(1281, mean=mean(fem$NumberLymphocytes), sd=sd(fem$NumberLymphocytes))
hist(M2, xlab="Baseline immunity: number lymphocytes", main="Simulated Data")
N2 <- rnorm(1281, mean=mean(fem$LymphocyteHeterophilRatio), sd=sd(fem$LymphocyteHeterophilRatio))
hist(N2, xlab="Baseline immunity: lymphocyte/heterophil ratio", main="Simulated Data")
A2 <- rnorm(1281, mean=mean(fem$NumberParasites), sd=sd(fem$NumberParasites))
hist(A2, xlab="Parasites: number parasite inds", main="Simulated Data")
B2 <- rnorm(1281, mean=mean(fem$NumberParasiteSpecies), sd=sd(fem$NumberParasiteSpecies))
hist(B2, xlab="Parasites: number spp", main="Simulated Data")
D2 <- rnorm(1281, mean=mean(fem$NumberNovelParasiteSpecies), sd=sd(fem$NumberNovelParasiteSpecies))
hist(D2, xlab="Parasites: novel spp", main="Simulated Data")
E2 <- rnorm(1281, mean=mean(fem$NumberUncommonParasiteSpecies), sd=sd(fem$NumberUncommonParasiteSpecies))
hist(E2, xlab="Parasites: uncommon spp", main="Simulated Data")
F2 <- rnorm(1281, mean=mean(fem$NumberMicrobiomeSpecies), sd=sd(fem$NumberMicrobiomeSpecies))
hist(F2, xlab="Microbiome: number species", main="Simulated Data")
#males
op <- par(mfrow=c(4,4), mar=c(4,4,2,0.2))
#This is what the distribution of actual data looks like
hist(mal$TrialsToReverseLast, xlab="Flexibility: Trials to reverse", main="Actual Data")
hist(mal$NumberHumanFoodsEaten, xlab="% human foods eaten", main="Actual Data")
hist(mal$AvgTarsusLength, xlab="Condition: Avg tarsus length", main="Actual Data")
hist(mal$Hematocrit, xlab="Condition: Avg tarsus length", main="Actual Data")
hist(mal$AvgTarsusLength, xlab="Condition: Avg tarsus length", main="Actual Data")
hist(mal$NumberRedBloodCells, xlab="Condition: number RBCs", main="Actual Data")
hist(mal$NumberLeucocytes, xlab="Immune activation: number WBCs", main="Actual Data")
hist(mal$NumberHeterophils, xlab="Immune activation: number heterophils", main="Actual Data")
hist(mal$NumberImmatureRedBloodCells, xlab="Immune activation: number immature RBCs", main="Actual Data")
hist(mal$NumberLymphocytes, xlab="Baseline immunity: number lymphocytes", main="Actual Data")
hist(mal$LymphocyteHeterophilRatio, xlab="Baseline immunity: lymphocyte/heterophil ratio", main="Actual Data")
hist(mal$NumberParasiteSpecies, xlab="Parasites: number of species", main="Actual Data")
hist(mal$NumberNovelParasiteSpecies, xlab="Parasites: number novel species", main="Actual Data")
hist(mal$NumberUncommonParasiteSpecies, xlab="Parasites: number uncommon species", main="Actual Data")
hist(mal$NumberMicrobiomeSpecies, xlab="Microbiome: number of species", main="Actual Data")
#Given the actual data, this is what a normal distribution would look like
x2 <- rnorm(1281, mean=mean(mal$TrialsToReverseLast), sd=sd(mal$TrialsToReverseLast))
hist(x2, xlab="Flexibility: Trials to reverse", main="Simulated Data")
g2 <- rnorm(1281, mean=mean(mal$NumberHumanFoodsEaten), sd=sd(mal$NumberHumanFoodsEaten))
hist(g2, xlab="% human foods eaten", main="Simulated Data")
h2 <- rnorm(1281, mean=mean(mal$Hematocrit), sd=sd(mal$Hematocrit))
hist(h2, xlab="Condition: hematocrit", main="Simulated Data")
z2 <- rnorm(1281, mean=mean(mal$AvgTarsusLength), sd=sd(mal$AvgTarsusLength))
hist(z2, xlab="Condition: Avg tarsus length", main="Simulated Data")
i2 <- rnorm(1281, mean=mean(mal$NumberRedBloodCells), sd=sd(mal$NumberRedBloodCells))
hist(i2, xlab="Condition: number RBCs", main="Simulated Data")
j2 <- rnorm(1281, mean=mean(mal$NumberLeucocytes), sd=sd(mal$NumberLeucocytes))
hist(j2, xlab="Immune activation: number WBCs", main="Simulated Data")
k2 <- rnorm(1281, mean=mean(mal$NumberHeterophils), sd=sd(mal$NumberHeterophils))
hist(k2, xlab="Immune activation: number heterophils", main="Simulated Data")
l2 <- rnorm(1281, mean=mean(mal$NumberImmatureRedBloodCells), sd=sd(mal$NumberImmatureRedBloodCells))
hist(l2, xlab="Immune activation: number immature RBCs", main="Simulated Data")
m2 <- rnorm(1281, mean=mean(mal$NumberLymphocytes), sd=sd(mal$NumberLymphocytes))
hist(m2, xlab="Baseline immunity: number lymphocytes", main="Simulated Data")
n2 <- rnorm(1281, mean=mean(mal$LymphocyteHeterophilRatio), sd=sd(mal$LymphocyteHeterophilRatio))
hist(n2, xlab="Baseline immunity: lymphocyte/heterophil ratio", main="Simulated Data")
a2 <- rnorm(1281, mean=mean(mal$NumberParasites), sd=sd(mal$NumberParasites))
hist(a2, xlab="Parasites: number parasite inds", main="Simulated Data")
b2 <- rnorm(1281, mean=mean(mal$NumberParasiteSpecies), sd=sd(mal$NumberParasiteSpecies))
hist(b2, xlab="Parasites: number spp", main="Simulated Data")
d2 <- rnorm(1281, mean=mean(mal$NumberNovelParasiteSpecies), sd=sd(mal$NumberNovelParasiteSpecies))
hist(d2, xlab="Parasites: novel spp", main="Simulated Data")
e2 <- rnorm(1281, mean=mean(mal$NumberUncommonParasiteSpecies), sd=sd(mal$NumberUncommonParasiteSpecies))
hist(e2, xlab="Parasites: uncommon spp", main="Simulated Data")
f2 <- rnorm(1281, mean=mean(mal$NumberMicrobiomeSpecies), sd=sd(mal$NumberMicrobiomeSpecies))
hist(f2, xlab="Microbiome: number species", main="Simulated Data")
#Check the dependent variables for normality: Q-Q plots
#females
op <- par(mfrow=c(4,4), mar=c(4,4,2,0.2))
plot(glm(fem$TrialsToReverseLast~fem$Population))
plot(glm(fem$AvgTarsusLength~fem$Population))
plot(glm(fem$NumberHumanFoodsEaten~fem$Population))
plot(glm(fem$Hematocrit~fem$Population))
op <- par(mfrow=c(5,4), mar=c(4,4,2,0.2))
plot(glm(fem$NumberRedBloodCells~fem$Population))
plot(glm(fem$NumberLeucocytes~fem$Population))
plot(glm(fem$NumberImmatureRedBloodCells~fem$Population))
plot(glm(fem$NumberLymphocytes~fem$Population))
plot(glm(fem$LymphocyteHeterophilRatio~fem$Population))
op <- par(mfrow=c(4,4), mar=c(4,4,2,0.2))
plot(glm(fem$NumberParasiteSpecies~fem$Population))
plot(glm(fem$NumberNovelParasiteSpecies~fem$Population))
plot(glm(fem$NumberUncommonParasiteSpecies~fem$Population))
plot(glm(fem$NumberMicrobiomeSpecies~fem$Population))
#males
op <- par(mfrow=c(4,4), mar=c(4,4,2,0.2))
plot(glm(mal$TrialsToReverseLast~mal$Population))
plot(glm(mal$AvgTarsusLength~mal$Population))
plot(glm(mal$NumberHumanFoodsEaten~mal$Population))
plot(glm(mal$Hematocrit~mal$Population))
op <- par(mfrow=c(5,4), mar=c(4,4,2,0.2))
plot(glm(mal$NumberRedBloodCells~mal$Population))
plot(glm(mal$NumberLeucocytes~mal$Population))
plot(glm(mal$NumberImmatureRedBloodCells~mal$Population))
plot(glm(mal$NumberLymphocytes~mal$Population))
plot(glm(mal$LymphocyteHeterophilRatio~mal$Population))
op <- par(mfrow=c(4,4), mar=c(4,4,2,0.2))
plot(glm(mal$NumberParasiteSpecies~mal$Population))
plot(glm(mal$NumberNovelParasiteSpecies~mal$Population))
plot(glm(mal$NumberUncommonParasiteSpecies~mal$Population))
plot(glm(mal$NumberMicrobiomeSpecies~mal$Population))
```
If the data do not appear normally distributed, visually check the residuals. If they are patternless, then assume a normal distribution (Figure 4) [@zuur2009].
```{r dist_checkResid, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, fig.cap="**Figure 4.** Visual check of the residuals."}
#Check the dependent variables for normality: Residuals
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#Figure 3. Visual check of the residuals
#females
op <- par(mfrow=c(4,4), mar=c(4,4,2,0.2))
plot(residuals(glm(fem$TrialsToReverseLast~fem$Population)), ylab="Residuals (flexibility): Trials to reverse ~ Population")
plot(residuals(glm(fem$NumberHumanFoodsEaten~fem$Population)), ylab="Residuals: % human foods ~ Population")
plot(residuals(glm(fem$AvgTarsusLength~fem$Population)), ylab="Residuals (condition): Tarsus length ~ Population")
plot(residuals(glm(fem$Hematocrit~fem$Population)), ylab="Residuals (condition): hematocrit ~ Population")
plot(residuals(glm(fem$NumberRedBloodCells~fem$Population)), ylab="Residuals (condition): RBCs ~ Population")
plot(residuals(glm(fem$NumberParasites~fem$Population)), ylab="Residuals (parasites): number inds ~ Population")
plot(residuals(glm(fem$NumberParasiteSpecies~fem$Population)), ylab="Residuals (parasites): number spp ~ Population")
plot(residuals(glm(fem$NumberNovelParasiteSpecies~fem$Population)), ylab="Residuals (parasites): number novel ~ Population")
plot(residuals(glm(fem$NumberUncommonParasiteSpecies~fem$Population)), ylab="Residuals (parasites): number uncommon ~ Population")
plot(residuals(glm(fem$NumberLeucocytes~fem$Population)), ylab="Residuals (immune): WBCs ~ Population")
plot(residuals(glm(fem$NumberHeterophil~fem$Population)), ylab="Residuals (immune): heterophils ~ Population")
plot(residuals(glm(fem$NumberLymphocytes~fem$Population)), ylab="Residuals (immunity): lymphocytes ~ Population")
plot(residuals(glm(fem$LymphocyteHeterophilRatio~fem$Population)), ylab="Residuals (immunity): Lymph/Hetero ~ Population")
plot(residuals(glm(fem$NumberMicrobiomeSpecies~fem$Population)), ylab="Residuals (microbiome): number spp ~ Population")
#males
op <- par(mfrow=c(4,4), mar=c(4,4,2,0.2))
plot(residuals(glm(mal$TrialsToReverseLast~mal$Population)), ylab="Residuals (flexibility): Trials to reverse ~ Population")
plot(residuals(glm(mal$NumberHumanFoodsEaten~mal$Population)), ylab="Residuals: % human foods ~ Population")
plot(residuals(glm(mal$AvgTarsusLength~mal$Population)), ylab="Residuals (condition): Tarsus length ~ Population")
plot(residuals(glm(mal$Hematocrit~mal$Population)), ylab="Residuals (condition): hematocrit ~ Population")
plot(residuals(glm(mal$NumberRedBloodCells~mal$Population)), ylab="Residuals (condition): RBCs ~ Population")
plot(residuals(glm(mal$NumberParasites~mal$Population)), ylab="Residuals (parasites): number inds ~ Population")
plot(residuals(glm(mal$NumberParasiteSpecies~mal$Population)), ylab="Residuals (parasites): number spp ~ Population")
plot(residuals(glm(mal$NumberNovelParasiteSpecies~mal$Population)), ylab="Residuals (parasites): number novel ~ Population")
plot(residuals(glm(mal$NumberUncommonParasiteSpecies~mal$Population)), ylab="Residuals (parasites): number uncommon ~ Population")
plot(residuals(glm(mal$NumberLeucocytes~mal$Population)), ylab="Residuals (immune): WBCs ~ Population")
plot(residuals(glm(mal$NumberHeterophil~mal$Population)), ylab="Residuals (immune): heterophils ~ Population")
plot(residuals(glm(mal$NumberLymphocytes~mal$Population)), ylab="Residuals (immunity): lymphocytes ~ Population")
plot(residuals(glm(mal$LymphocyteHeterophilRatio~mal$Population)), ylab="Residuals (immunity): Lymph/Hetero ~ Population")
plot(residuals(glm(mal$NumberMicrobiomeSpecies~mal$Population)), ylab="Residuals (microbiome): number spp ~ Population")
```
####*P1: edge & flexibility*
**Analysis:** Generalized Linear Model (GLM; glm function, stats package) with a Poisson distribution and log link. The contribution of the independent variable will be evaluated using the Estimate in the full model.
```{r flexedge, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLM: dependent variable = Flexibility 1: number of trials to reverse last preference
#female
flex1 <- glm(TrialsToReverseLast ~ Population, family="poisson", data=fem)
#summary(flex1)
flexa <- summary(flex1)
library(xtable)
flexa.table <- xtable(flexa)
library(knitr)
kable(flexa.table, caption="Table 1: Model selection output.", format="html", digits=2)
#male
flex2 <- glm(TrialsToReverseLast ~ Population, family="poisson", data=mal)
#summary(flex2)
flexb <- summary(flex2)
library(xtable)
flexb.table <- xtable(flexb)
library(knitr)
kable(flexb.table, caption="Table 2: Model selection output.", format="html", digits=2)
```
*P1 alternative: variance of flexibility across populations*
Determine whether population variances in Flexibility measures 1-3 differ among populations.
```{r flexvar, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_accumulation.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#female variance
ctr1 <- fem[fem$Population=="HO",] #Honduras
mdl1 <- fem[fem$Population=="AZ",] #Arizona
edg1 <- fem[fem$Population=="NE",] #Nebraska
A <- var(ctr1$Flex3)
D <- var(ctr1$TrialsToReverseLast)
B <- var(mdl1$Flex3)
E <- var(mdl1$TrialsToReverseLast)
C <- var(edg1$Flex3)
G <- var(edg1$TrialsToReverseLast)
flex3a <- c(A,B,C)
last1 <- c(D,E,G)
op <- par(mfrow=c(1,2), mar=c(4,4,2,0.2))
plot(flex3a)
plot(last1)
#male variance
ctr2 <- mal[mal$Population=="HO",] #Honduras
mdl2 <- mal[mal$Population=="AZ",] #Arizona
edg2 <- mal[mal$Population=="NE",] #Nebraska
H <- var(ctr2$Flex3)
I <- var(ctr2$TrialsToReverseLast)
J <- var(mdl2$Flex3)
K <- var(mdl2$TrialsToReverseLast)
L <- var(edg2$Flex3)
M <- var(edg2$TrialsToReverseLast)
flex3b <- c(H,J,L)
last2 <- c(I,K,M)
op <- par(mfrow=c(1,2), mar=c(4,4,2,0.2))
plot(flex3b)
plot(last2)
```
####*P2: edge & foraging variation*
**Analysis:** Generalized Linear Model (GLM; glm function, stats package) with a Poisson distribution and log link. The contribution of the independent variable will be evaluated using the Estimate in the full model.
```{r edgeF, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLM
#female
t1 <- glm(NumberHumanFoodsEaten ~ Population, family="poisson", data=fem)
#summary(t1)
ta <- summary(t1)
library(xtable)
ta.table <- xtable(ta)
library(knitr)
kable(ta.table, caption="Table 5: Model selection output.", format="html", digits=2)
#male
t2 <- glm(NumberHumanFoodsEaten ~ Population, family="poisson", data=mal)
#summary(t2)
tb <- summary(t2)
library(xtable)
tb.table <- xtable(tb)
library(knitr)
kable(tb.table, caption="Table 6: Model selection output.", format="html", digits=2)
```
####*P3: edge & testosterone*
**Analysis:** Generalized Linear Model (GLM; glm function, stats package) with a Poisson distribution and log link. The contribution of the independent variable will be evaluated using the Estimate in the full model.
```{r edgeT, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLM
#female
t1 <- glm(Testosterone ~ Population, family="poisson", data=fem)
#summary(t1)
ta <- summary(t1)
library(xtable)
ta.table <- xtable(ta)
library(knitr)
kable(ta.table, caption="Table 7: Model selection output.", format="html", digits=2)
#male
t2 <- glm(Testosterone ~ Population, family="poisson", data=mal)
#summary(t2)
tb <- summary(t2)
library(xtable)
tb.table <- xtable(tb)
library(knitr)
kable(tb.table, caption="Table 8: Model selection output.", format="html", digits=2)
```
####*P4: edge & glucocorticoids*
**Analysis:** To be able to detect trade offs, we need to control for condition, therefore we chose the average tarsus length as the representative measure of condition (a random effect). We will conduct a Generalized Linear Mixed Model (GLMM; MCMCglmm function, MCMCglmm package; [@hadfield2010mcmc]) with a Poisson distribution and log link using 130,000 iterations with a thinning interval of 10, a burnin of 30,000, and minimal priors (V=1, nu=0) [@hadfield2014coursenotes]. We will ensure the GLMM shows acceptable convergence (lag time autocorrelation values <0.01; [@hadfield2010mcmc]), and adjust parameters if necessary to meet this criterion. The contribution of each independent variable will be evaluated using the Estimate in the full model.
```{r edgeC, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLMM
library(MCMCglmm)
prior = list(R=list(R1=list(V=1,nu=0)), G=list(G1=list(V=1,nu=0)))
#females
f1 <- MCMCglmm(Cort ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f1)
#autocorr(f1$Sol) #Did fixed effects converge?
#autocorr(f1$VCV) #Did random effects converge?
#males
m1 <- MCMCglmm(Cort ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m1)
#autocorr(m1$Sol) #Did fixed effects converge?
#autocorr(m1$VCV) #Did random effects converge?
```
####*P5: edge & body condition*
**Analysis:** Generalized Linear Model (GLM; glm function, stats package) with a Poisson distribution and log link. The contribution of the independent variable will be evaluated using the Estimate in the full model.
```{r edgeB, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLM
#female
t1 <- glm(AvgTarsusLength ~ Population, family="poisson", data=fem)
#summary(t1)
ta <- summary(t1)
library(xtable)
ta.table <- xtable(ta)
library(knitr)
kable(ta.table, caption="Table 9: Model selection output.", format="html", digits=2)
#male
t2 <- glm(AvgTarsusLength ~ Population, family="poisson", data=mal)
#summary(t2)
tb <- summary(t2)
library(xtable)
tb.table <- xtable(tb)
library(knitr)
kable(tb.table, caption="Table 10: Model selection output.", format="html", digits=2)
```
####*P6: edge & haemosporidian parasites*
**Analysis:** To be able to detect trade offs, we need to control for condition, therefore we chose the average tarsus length as the representative measure of condition (a random effect). We will conduct a Generalized Linear Mixed Model (GLMM; MCMCglmm function, MCMCglmm package; [@hadfield2010mcmc]) with a Poisson distribution and log link using 130,000 iterations with a thinning interval of 10, a burnin of 30,000, and minimal priors (V=1, nu=0) [@hadfield2014coursenotes]. We will ensure the GLMM shows acceptable convergence (lag time autocorrelation values <0.01; [@hadfield2010mcmc]), and adjust parameters if necessary to meet this criterion. The contribution of each independent variable will be evaluated using the Estimate in the full model.
```{r edgeH, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLMM: response variable = number of parasite individuals per bird
library(MCMCglmm)
prior = list(R=list(R1=list(V=1,nu=0)), G=list(G1=list(V=1,nu=0)))
#females
f1 <- MCMCglmm(NumberParasites ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f1)
#autocorr(f1$Sol) #Did fixed effects converge?
#autocorr(f1$VCV) #Did random effects converge?
#males
m1 <- MCMCglmm(NumberParasites ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m1)
#autocorr(m1$Sol) #Did fixed effects converge?
#autocorr(m1$VCV) #Did random effects converge?
#GLMM: response variable = number of parasite species per bird
#females
f2 <- MCMCglmm(NumberParasiteSpecies ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f2)
#autocorr(f2$Sol) #Did fixed effects converge?
#autocorr(f2$VCV) #Did random effects converge?
#males
m2 <- MCMCglmm(NumberParasiteSpecies ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m2)
#autocorr(m2$Sol) #Did fixed effects converge?
#autocorr(m2$VCV) #Did random effects converge?
#GLMM: response variable = number of NOVEL parasite species per bird
#females
f4 <- MCMCglmm(NumberNovelParasiteSpecies ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f4)
#autocorr(f4$Sol) #Did fixed effects converge?
#autocorr(f4$VCV) #Did random effects converge?
#males
m4 <- MCMCglmm(NumberNovelParasiteSpecies ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m4)
#autocorr(m4$Sol) #Did fixed effects converge?
#autocorr(m4$VCV) #Did random effects converge?
#GLMM: response variable = number of UNCOMMON parasite species per bird
#females
f5 <- MCMCglmm(NumberUncommonParasiteSpecies ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f5)
#autocorr(f5$Sol) #Did fixed effects converge?
#autocorr(f5$VCV) #Did random effects converge?
#males
m5 <- MCMCglmm(NumberUncommonParasiteSpecies ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m5)
#autocorr(m5$Sol) #Did fixed effects converge?
#autocorr(m5$VCV) #Did random effects converge?
```
####*P7: edge & immune activation*
**Analysis:** To be able to detect trade offs, we need to control for condition, therefore we chose the average tarsus length as the representative measure of condition (a random effect). We will conduct a Generalized Linear Mixed Model (GLMM; MCMCglmm function, MCMCglmm package; [@hadfield2010mcmc]) with a Poisson distribution and log link using 130,000 iterations with a thinning interval of 10, a burnin of 30,000, and minimal priors (V=1, nu=0) [@hadfield2014coursenotes]. We will ensure the GLMM shows acceptable convergence (lag time autocorrelation values <0.01; [@hadfield2010mcmc]), and adjust parameters if necessary to meet this criterion. The contribution of each independent variable will be evaluated using the Estimate in the full model.
```{r edgeIactivation, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLMM
library(MCMCglmm)
prior = list(R=list(R1=list(V=1,nu=0)), G=list(G1=list(V=1,nu=0)))
#females
f1 <- MCMCglmm(NumberLeucocytes ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f1)
#autocorr(f1$Sol) #Did fixed effects converge?
#autocorr(f1$VCV) #Did random effects converge?
f2 <- MCMCglmm(NumberHeterophils ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f2)
#autocorr(f2$Sol) #Did fixed effects converge?
#autocorr(f2$VCV) #Did random effects converge?
f3 <- MCMCglmm(NumberImmatureRedBloodCells ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f3)
#autocorr(f3$Sol) #Did fixed effects converge?
#autocorr(f3$VCV) #Did random effects converge?
#males
m1 <- MCMCglmm(NumberLeucocytes ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m1)
#autocorr(m1$Sol) #Did fixed effects converge?
#autocorr(m1$VCV) #Did random effects converge?
m2 <- MCMCglmm(NumberHeterophils ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m2)
#autocorr(m2$Sol) #Did fixed effects converge?
#autocorr(m2$VCV) #Did random effects converge?
m3 <- MCMCglmm(NumberImmatureRedBloodCells ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m3)
#autocorr(m3$Sol) #Did fixed effects converge?
#autocorr(m3$VCV) #Did random effects converge?
```
####*P8: edge & baseline immunity*
**Analysis:** To be able to detect trade offs, we need to control for condition, therefore we chose the average tarsus length as the representative measure of condition (a random effect). We will conduct a Generalized Linear Mixed Model (GLMM; MCMCglmm function, MCMCglmm package; [@hadfield2010mcmc]) with a Poisson distribution and log link using 130,000 iterations with a thinning interval of 10, a burnin of 30,000, and minimal priors (V=1, nu=0) [@hadfield2014coursenotes]. We will ensure the GLMM shows acceptable convergence (lag time autocorrelation values <0.01; [@hadfield2010mcmc]), and adjust parameters if necessary to meet this criterion. The contribution of each independent variable will be evaluated using the Estimate in the full model.
```{r edgeIbaseline, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLMM
library(MCMCglmm)
prior = list(R=list(R1=list(V=1,nu=0)), G=list(G1=list(V=1,nu=0)))
#females
f1 <- MCMCglmm(NumberLymphocytes ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f1)
#autocorr(f1$Sol) #Did fixed effects converge?
#autocorr(f1$VCV) #Did random effects converge?
f2 <- MCMCglmm(LymphocyteHeterophilRatio ~ Population, random=~AvgTarsusLength, family="poisson", data=fem, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(f2)
#autocorr(f2$Sol) #Did fixed effects converge?
#autocorr(f2$VCV) #Did random effects converge?
#males
m1 <- MCMCglmm(NumberLymphocytes ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m1)
#autocorr(m1$Sol) #Did fixed effects converge?
#autocorr(m1$VCV) #Did random effects converge?
m21 <- MCMCglmm(LymphocyteHeterophilRatio ~ Population, random=~AvgTarsusLength, family="poisson", data=mal, verbose=F, prior=prior, nitt=130000, thin=10, burnin=30000)
summary(m2)
#autocorr(m2$Sol) #Did fixed effects converge?
#autocorr(m2$VCV) #Did random effects converge?
```
####*P9: edge & microbiome*
**Analysis:** Generalized Linear Model (GLM; glm function, stats package) with a Poisson distribution and log link. The contribution of the independent variable will be evaluated using the Estimate in the full model.
```{r edgeM, eval=FALSE, warning=FALSE, results='asis', echo=TRUE, include=TRUE}
edge <- read.csv ("/Users/corina/GTGR/data/data_pop.csv", header=T, sep=",", stringsAsFactors=F)
#Separate the sexes
fem <- edge[edge$Sex=="f",]
mal <- edge[edge$Sex=="m",]
#GLM
#female
t1 <- glm(NumberMicrobiomeSpecies ~ Population, family="poisson", data=fem)
#summary(t1)
ta <- summary(t1)
library(xtable)
ta.table <- xtable(ta)
library(knitr)
kable(ta.table, caption="Table 11: Model selection output.", format="html", digits=2)
#male
t2 <- glm(NumberMicrobiomeSpecies ~ Population, family="poisson", data=mal)
#summary(t2)
tb <- summary(t2)
library(xtable)
tb.table <- xtable(tb)
library(knitr)
kable(tb.table, caption="Table 12: Model selection output.", format="html", digits=2)
```
####*Alternative Analyses*
We anticipate that we will want to run additional/different analyses after reading @statrethinkingbook. We will revise this preregistration to include these new analyses before conducting the analyses above.
###F. PLANNED SAMPLE
Great-tailed grackles (n > 200) will be caught in the wild at three field sites across their geographic range: the center of their original range (Utila, Honduras), the middle of the northward expanding edge (Tempe, Arizona USA), and on the northern expanding edge (to be determined). Individuals will be identified using colored leg bands in unique combinations, their data collected (blood, feathers, and biometrics), and then they will be released back to the wild. Some individuals (40-100) will be brought temporarily into aviaries for behavioral testing, and then they will be released back to the wild.
**Sample size rationale**
We will band as many birds as possible throughout the year, and conduct behavioral tests in aviaries (during the non-breeding seasons approximately September through March) and in the wild (year-round) on as many birds as we can at each field site. The minimum sample size will be 200 banded birds and 60 behavior-tested birds in total, however we expect to be able to sample many more.
**Data collection stopping rule**
We will stop collecting data in April 2023 when the current funding ends, or after data have been collected at all three field sites, whichever date comes first.
###G. ETHICS
This research is carried out in accordance with permits from the:
1) US Fish and Wildlife Service (scientific collecting permit number MB76700A-0,1,2)
2) US Geological Survey Bird Banding Laboratory (federal bird banding permit number 23872)
3) Arizona Game and Fish Department (scientific collecting license number SP594338 [2017], SP606267 [2018], and SP639866 [2019])
4) Institutional Animal Care and Use Committee at Arizona State University (protocol number 17-1594R)
5) University of Cambridge ethical review process (non-regulated use of animals in scientific procedures: zoo4/17)
###H. AUTHOR CONTRIBUTIONS
**Logan:** Hypothesis development, study design, materials, data collection, data analysis and interpretation, write up, funding.
**Rowney:** Data collection, sample processing, data analysis and interpretation, editing/revising.
**Bergeron:** Data collection, sample processing, data analysis and interpretation, editing/revising.
**McCune:** Data collection, sample processing, data analysis and interpretation, editing/revising.
**MacPherson:** Hypothesis development (hormones), study design (hormones), data collection, data interpretation, revising/editing.
**Bond (testosterone and glucocorticoids):** Method development, sample processing, data analysis and interpretation, editing/revising.
**Blackwell (immunity):** Hypothesis development, materials/lab space, sample processing, data analysis and interpretation, write up.
**Escalante (haemosporidian parasites and microbiome):** Hypothesis development, data analysis and interpretation, write up.
**Pacheco (haemosporidian parasites and microbiome):** Hypothesis development, materials/lab space, sample processing, data analysis and interpretation, write up.
**Lukas (genetics):** Hypothesis development, data analysis and interpretation, write up, revising/editing.
###I. FUNDING
This research is funded by the Department of Human Behavior, Ecology and Culture at the Max Planck Institute for Evolutionary Anthropology, and, in 2017 through early 2018, by a Leverhulme Early Career Research Fellowship to Logan.
###J. ACKNOWLEDGEMENTS
We thank Melissa Wilson Sayres for sponsoring our affiliations at Arizona State University and lending lab equipment; Kevin Langergraber for serving as local PI on the ASU IACUC; Kristine Johnson for technical advice on great-tailed grackles; Jay Taylor for grackle scouting at Arizona State University; Arizona State University School of Life Sciences Department Animal Care and Technologies for providing space for our aviaries and for their excellent support of our daily activities; Julia Cissewski for tirelessly solving problems involving financial transactions and contracts; Richard McElreath for project support; and our research assistants: Aelin Mayer, Nancy Rodriguez, Brianna Thomas, Aldora Messinger, Elysia Mamola, Michael Guillen, Rita Barakat, Adriana Boderash, Olateju Ojekunle, August Sevchik, Justin Huynh, Jennifer Berens, and Amanda Overholt.
###K. REFERENCES