# Linked Micromap Plots via the **micromap** R Package {#Ch2}

\chapterauthor{J{\"u}rgen Symanzik, Marcus W. Beck, Michael G. McManus}

The **micromap**\index{R Packages!micromap} R package [@PaOl2024],
accessible at https://cran.r-project.org/web/packages/micromap/index.html,
will be introduced in this chapter. The reader will learn how to make use of the
four main steps that are required to create a
basic linked micromap plot\index{Linked micromap plot} via this R package.
Details will be provided on how to optimize and fine-tune such a basic plot
into a publication-worthy final linked micromap plot.\index{Linked micromap plot}
Example linked micromap plots\index{Linked micromap plot} are created
for education and poverty data for the 50 states (and Washington, D.C.) of the
United States (U.S.) and for watersheds in West Virginia (one of the 50 U.S. states).

## Introduction {#Ch2-Introduction}
As discussed in Chapter \@ref(Ch1), linked micromap plots\index{Linked micromap plot}
were originally presented at the Joint Statistical Meetings (JSM)
in Chicago, Illinois, in 1996 [@OCCP1996]. They quickly gained popularity among
researchers at United States (U.S.) Federal Agencies such as
the U.S. Department of Agriculture – National Agricultural Statistics Service (USDA–NASS),
various branches of the U.S. Environmental Protection Agency (USEPA),
the National Cancer Institute (NCI),
the U.S. Census Bureau, and
the U.S. Bureau of Labor Statistics (BLS).
Early main applications of linked micromap plots\index{Linked micromap plot}
can be found in the environmental field [@COPC1998;@COPC2000]
and the medical field [@CCBPZ2002;@CBPZL2003].
While early linked micromap plots\index{Linked micromap plot} were created via
S-Plus\index{S-Plus} and Java, later ones were created in R (see @SC2013 for an overview).
However, even with the availability of R code that was provided in support
of @CP2010, creating linked micromap plots\index{Linked micromap plot}
was challenging, which considerably limited their use.
In fact, @PWMO2012 observed that
"Producing LMplots [...] has
typically been somewhat difficult, and therefore LMplots
have seen limited use." However, linked micromap plots\index{Linked micromap plot}
continued to play an important role at the USEPA. Eventually, a team of
researchers including Anthony R. Olsen, Quinn C. Payton, Michael G. McManus,
Marc H. Weber, and Thomas M. Kincaid, all originally with the USEPA
in Corvallis, Oregon, started to develop an R package for
linked micromap plots.\index{Linked micromap plot}
First uses of this package can be seen in poster presentations in
May 2012 [@PWMO2012] and April 2013 [@PWMKO2013].
At about the same time, in December 2012, the first publicly available
version of the **micromap**\index{R Packages!micromap} R package (version 1.5)
was released to CRAN [@PaOl2012].
Eventually, Marcus W. Beck, then also with the USEPA in Gulf Breeze, Florida,
joined the team of the original developers with the release of version 1.9.3
of this R package in February 2018 [@PaOl2018] and has also served as the maintainer of
this R package since then.
As frequently happens in an open software environment such as R,
the **micromap**\index{R Packages!micromap} R package is not the only
R package for linked micromap plots.\index{Linked micromap plot}
Independently, but motivated by similar past uses and resources,
the **micromapST**\index{R Packages!micromapST} R package
has been developed in parallel and its first version (version 1.0)
was released to CRAN in June 2013 [@CP2013CRAN], i.e.,
only a few months after the first release of the
**micromap**\index{R Packages!micromap} R package.
The **micromapST**\index{R Packages!micromapST} R package will be extensively discussed in
Chapter \@ref(Ch3).
From a user's perspective, there are few differences in the
appearance and quality of the final linked micromap plots\index{Linked micromap plot}
that can be created by these two R packages.
There are, however, differences in details in how linked micromap plots\index{Linked micromap plot}
are created in each package that can be important for users.
Perhaps the two biggest differences between the two packages are that
the **micromap**\index{R Packages!micromap} R package makes it easy to bring
in one's own boundary files, in particular external shapefiles,\index{Shapefile}
while the **micromapST**\index{R Packages!micromapST} R package
initially supports a larger number of glyph types.
However, even these differences are relatively minor as users can
create their own plot types for use in the **micromap**\index{R Packages!micromap} R package
as discussed in Chapter \@ref(Ch5) and
incorporate external shapefiles\index{Shapefile} into the
**micromapST**\index{R Packages!micromapST} R package as discussed in Chapter \@ref(Ch4b).
Ultimately, the decision is up to the analyst on which of these two R packages
to use for the construction of linked micromap plots.\index{Linked micromap plot}
The remainder of this chapter is organized as follows:
Section \@ref(Ch2-Steps) will introduce the
four main steps that are required to create a
basic linked micromap plot\index{Linked micromap plot} via the **micromap**\index{R Packages!micromap} R package.
Examples in Section \@ref(Ch2-Example1) and Section \@ref(Ch2-Example2) will outline how to apply these steps
to data for the 50 U.S. states and Washington, D.C. and
to data for watersheds in West Virginia, respectively.
This chapter concludes with a summary and suggestions for further reading
in Section \@ref(Ch2-SummaryFurtherReading).

## Steps to Create a Linked Micromap Plot with the **micromap**\index{R Packages!micromap} R Package {#Ch2-Steps}
```{r Ch2-flowchart, fig.cap = 'Workflow to create a linked micromap plot with the **micromap**\\index{R Packages!micromap} R package (Diagram created with the **DiagrammeR**\\index{R Packages!DiagrammeR} R package [@Iannone2022]).', fig.width = 5, fig.height = 5, echo = FALSE}
library(DiagrammeR)
DiagrammeR::grViz(
diagram = "digraph flowchart {
node [fontname = arial, shape = oval, color = grey, style = filled]
tab1 [label = '@@1']
tab2 [label = '@@2']
tab3 [label = '@@3']
tab4 [label = '@@4']
tab1 -> tab2 -> tab3 -> tab4;
}
[1]: '1. Identifying and Geoprocessing of Spatial Boundary Data'
[2]: '2. Linking Spatial Boundary Data and Statistical Data'
[3]: '3. Creating a Draft Linked Micromap Plot'
[4]: '4. Refining the Linked Micromap Plot'
"
)
```
Four main steps, shown in Figure \@ref(fig:Ch2-flowchart), are needed to create a
linked micromap plot\index{Linked micromap plot}
with the **micromap**\index{R Packages!micromap} R package:
1. **Identifying and Geoprocessing of Spatial Boundary Data**
(see Sections \@ref(Ch2-Identifying) and \@ref(Ch2-IdentifyingWV) for details):
In addition to a data frame that contains the statistical data,
the user must identify a data structure that contains the spatial boundary data for the region and subregions that are
colored in the maps. This spatial boundary data typically comes in the form of a
SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame}. In this chapter, we will work with ready-to-use
boundary files. In Chapter \@ref(Ch4), we will discuss how to make use of boundary files that are
provided as external shapefiles\index{Shapefile}. In particular, we will see in that chapter how
to simplify complex boundaries, enlarge small subregions in the maps, and move subregions that are far
from the main area of the map closer to it.
2. **Linking Spatial Boundary Data and Statistical Data**
(see Sections \@ref(Ch2-Linking) and \@ref(Ch2-LinkingWV) for details):
To link spatial boundary data and statistical data,
the SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame} first has to be transformed into a regular data frame
via the `create_map_table()` function.
Next, we have to identify one variable from the statistical data frame
and one variable from the newly created data frame with the boundary information that
allow us to link statistical data and boundary data for each subregion.
This step is not required when the spatial boundary data and statistical data are adequately
stored in a simple features\index{Simple features} (sf) format [@OGC2022] instead of a
SpatialPolygonsDataFrame.\index{SpatialPolygonsDataFrame}
3. **Creating a Draft Linked Micromap Plot**
(see Sections \@ref(Ch2-Creating) and \@ref(Ch2-CreatingWV) for details):
While not necessary, it is always a good idea to first create a minimal
linked micromap plot\index{Linked micromap plot} to ensure that the statistical data and boundary data
are matching and a correct draft linked micromap plot\index{Linked micromap plot} is created. Skipping this
step and trying to create a complex linked micromap plot\index{Linked micromap plot} immediately may complicate
debugging the R code.
4. **Refining the Linked Micromap Plot**
(see Sections \@ref(Ch2-Refining) and \@ref(Ch2-RefiningWV) for details):
Once a draft linked micromap plot\index{Linked micromap plot} has been
created, this plot usually needs fine-tuning of its appearance and plot aesthetics, e.g., modification of colors,
change of the layout and of perceptual groups, addition of labels and legends, and possibly the addition of additional
statistical variables or changes to different graph types for some of the variables.

## Example 1: A Linked Micromap Plot for the 50 U.S. States and Washington, D.C. {#Ch2-Example1}
In this first linked micromap plot\index{Linked micromap plot}
example created with the **micromap**\index{R Packages!micromap} R package, we work with the
_USstates_\index{Datasets!USstates} and _edPov_\index{Datasets!edPov} datasets
from the **micromap**\index{R Packages!micromap} R package.
We follow the four steps outlined in Section \@ref(Ch2-Steps).

### Identifying and Geoprocessing of Spatial Boundary Data {#Ch2-Identifying}
_USstates_\index{Datasets!USstates} is a SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame} for
the 50 U.S. states (and Washington, D.C.)
that was created for use with linked micromap plots\index{Linked micromap plot}. Notably, the boundaries
of many of the states have been simplified, Alaska and Hawaii have been moved closer to the
contiguous 48 states (and also have been resized), and Washington, D.C. has been pulled out of the
main map, placed further to the east, and also has been enlarged
as shown in Figure \@ref(fig:Ch2-USstates). In the following R code,
we first load the **micromap**\index{R Packages!micromap} R package and the
_USstates_\index{Datasets!USstates} dataset, verify that this object indeed is a
SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame}, then look
at some of the data in the `data` component of this object, and finally plot it.
For figures that only contain maps, it is often helpful to remove all
margin space on all four sides of the plot
via `par(mar = c(0, 0, 0, 0))`
to maximize the plot, i.e., the actual map.
When loading the **micromap**\index{R Packages!micromap} R package,
the reader will notice that it depends on the
**RColorBrewer**\index{R Packages!RColorBrewer} [@Neuwirth2022],
**sp**\index{R Packages!sp} [@PeBi2022],
and **sf**\index{R Packages!sf} [@Pebesma2022] R packages.
These packages are dependencies for **micromap**\index{R Packages!micromap} and
are installed automatically with the package.
```{r Ch2-USstates, fig.cap = 'Map representation of the _USstates_\\index{Datasets!USstates} spatial boundary dataset for the United States that frequently is used as the basis for linked micromap plots\\index{Linked micromap plot} that are created with the **micromap**\\index{R Packages!micromap} R package.', fig.width = 7, fig.height = 4}
library(micromap)
data(USstates)
class(USstates)
head(USstates@data)
par(mar = c(0, 0, 0, 0))
plot(USstates)
```
### Linking Spatial Boundary Data and Statistical Data {#Ch2-Linking}
In this step, the SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame} first is transformed
into a regular data frame for use in a linked micromap plot\index{Linked micromap plot}
via the `create_map_table()` function. We have to indicate a variable from the `data`
component of the SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame} object that
can be used as an ID column. A matching ID column must also be identified in the statistical
dataset for linking with the spatial boundary dataset. For the _USstates_\index{Datasets!USstates}
dataset, this is usually the `ST` variable that contains the 51 abbreviations for the 50 U.S.
states (and for Washington, D.C.).
```{r Ch2-linkingdataframesUSstates}
head(USstates@data$ST)
state_polys_table <- create_map_table(
  tmp.map = USstates,
  IDcolumn = "ST"
)
class(state_polys_table)
dim(state_polys_table)
names(state_polys_table)
```
The resulting `state_polys_table` is a regular data frame that will be used for creating a
linked micromap plot\index{Linked micromap plot} in the next step.
`state_polys_table` consists of `r dim(state_polys_table)[1]` rows. This implies that the map is based
on `r dim(state_polys_table)[1]` line segments.
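To see how the rows of such a table are distributed across subregions, one can tabulate its ID column. The following is a minimal sketch on a small hypothetical data frame, `toy_map_table`, that stands in for `state_polys_table`. Only the `ID` column name is taken from the text above; the `x` and `y` columns and all values are assumptions made for illustration.

```r
# Hypothetical stand-in for a map table with several rows per subregion.
# Only the ID column name mirrors state_polys_table; x and y are made up.
toy_map_table <- data.frame(
  ID = c("AA", "AA", "AA", "AA", "BB", "BB", "BB"),
  x  = c(0, 1, 1, 0, 2, 3, 2),
  y  = c(0, 0, 1, 1, 0, 0, 1)
)

# Number of rows stored for each subregion
rows_per_id <- table(toy_map_table$ID)
rows_per_id

# Number of distinct subregions represented in the table
n_subregions <- length(unique(toy_map_table$ID))
n_subregions
```

The same two expressions could be applied to `state_polys_table$ID` to check that all 51 subregions are present.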
_edPov_\index{Datasets!edPov} is a data frame that contains
education and poverty level data for the 50 U.S. states (and Washington, D.C.).
This data frame has 51 rows, one for each of the 50 U.S. states (and one for Washington, D.C.).
We have to identify one variable from this data frame that can be used for linking the
two datasets. Here, this is the `StateAb` variable.
The last expression in the following R code verifies that there is indeed at least one matching ID
in the `state_polys_table` data frame for each ID in the _edPov_\index{Datasets!edPov} dataset.
Missing IDs in the statistical data frame may prevent data from appearing on the
maps. Similarly, mismatching identifiers in the statistical data frame may also prevent data from
being shown on the maps, e.g., if the statistical data frame uses `D.C.` as ID while
the spatial data frame uses `DC`. Here, everything is matching.
```{r Ch2-linkingdataframesedPov}
data(edPov)
dim(edPov)
head(edPov)
head(edPov$StateAb)
all(sort(edPov$StateAb) == sort(unique(state_polys_table$ID)))
```
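When the two sets of identifiers do not match, it helps to see which ones differ rather than only that they differ. The sketch below uses two short hypothetical ID vectors (not the actual _edPov_ or _USstates_ data) to show how `setdiff()` and `setequal()` from base R could be used for such a diagnosis.

```r
# Hypothetical identifiers; "D.C." versus "DC" mimics the mismatch
# described in the text.
stat_ids <- c("AL", "AK", "D.C.")
map_ids  <- c("AK", "AL", "DC")

# IDs in the statistical data without a partner in the boundary data
missing_in_map <- setdiff(stat_ids, map_ids)
missing_in_map

# IDs in the boundary data without a partner in the statistical data
missing_in_stat <- setdiff(map_ids, stat_ids)
missing_in_stat

# TRUE only if both sets contain exactly the same identifiers
ids_match <- setequal(stat_ids, map_ids)
ids_match
```

Unlike an element-wise comparison of sorted vectors, this check does not rely on both vectors having the same length.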
### Creating a Draft Linked Micromap Plot {#Ch2-Creating}
Now we can create a minimal draft linked micromap plot\index{Linked micromap plot} using the
`mmplot()` function, based on the previously created
`state_polys_table` data frame and the _edPov_\index{Datasets!edPov} dataset.
In our function call, seven arguments are required, none of which have a default value.
For all other arguments of this function, the default settings will be used here.
The statistical data (_edPov_\index{Datasets!edPov}) is assigned to the `stat.data` argument
and the spatial boundary data (in the `state_polys_table` data frame) is assigned to the `map.data` argument.
The `map.link` argument is needed to link the statistical data
and the spatial boundary data. A vector with the names of the two linking variables
identified in the previous step is needed for this argument.
The first variable name (`StateAb`) must come from the statistical data (here _edPov_\index{Datasets!edPov})
and the second variable name (`ID`) must come from the `state_polys_table` data frame that
contains the spatial boundary information.
Changing the order of these two variable names typically results in an error.
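The role of the two names in `map.link` can be compared to the `by.x` and `by.y` arguments of the base R `merge()` function. The sketch below is only an analogy on hypothetical toy data; we do not claim that `mmplot()` uses `merge()` internally.

```r
# Hypothetical toy data: one row per subregion in the statistical data,
# several rows per subregion in the boundary data.
toy_stat <- data.frame(StateAb = c("AA", "BB"), pov = c(10.5, 14.2))
toy_map  <- data.frame(ID = c("AA", "AA", "BB"),
                       x  = c(0, 1, 2),
                       y  = c(0, 1, 0))

# First name refers to the statistical data, second to the boundary data
linked <- merge(toy_stat, toy_map, by.x = "StateAb", by.y = "ID")
linked

# Swapping the two names fails because toy_stat has no column named "ID"
swapped <- tryCatch(
  merge(toy_stat, toy_map, by.x = "ID", by.y = "StateAb"),
  error = function(e) "error"
)
swapped
```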
The `panel.types` and `panel.data` arguments are closely related.
As the name suggests, `panel.types` is a vector
that specifies the layout of the columns of panels\index{Panel}
in the linked micromap plot.\index{Linked micromap plot}
Here, we have panels\index{Panel} with a `dot_legend` in the first column, `labels` in the second column,
the statistical data represented as dotplots\index{Dotplot} (`dot`) in the third and fourth columns,
and the micromaps in the fifth, i.e., final, (`map`) column.
This is matched with a list of data that is used for each of the five columns of panels.\index{Panel}
A list is necessary for this argument as some of the data itself can be lists as we will
see in the R code for Figures \@ref(fig:Ch2-refining1WV)
and \@ref(fig:Ch2-refining2WV) later on.
The `dot_legend` simply shows a plotting symbol in a certain color that represents that row in
the linked micromap plot\index{Linked micromap plot}. Thus, no further data is needed and `NA`
is assigned. `labels` requires some text argument, typically some identifiers of the subregions
in the maps. Here `state` from _edPov_ is used.
The variables `pov` and `ed` from _edPov_ are used for the statistical displays in columns three and four.
It should be noted that all data specified in `panel.data` by default is taken from the
data frame specified in the `stat.data` argument.
The fifth and final column contains the micromaps. Their boundaries are obtained from
the `map.data` data frame. Thus, no further data has to be specified here and `NA`
is assigned instead.
Two more arguments have to be specified: `ord.by` specifies the sorting variable of the
rows in the linked micromap plot\index{Linked micromap plot}. Here, `pov`
from the `stat.data` argument is used. The sorting of the rows goes from smallest (at the top) to
largest (at the bottom). Finally, `grouping` specifies the number of rows
in each of the perceptual groups. If a single integer value is provided,
that value is used for all perceptual groups. If a vector of integer values is provided,
each perceptual group may have a different number of rows as
shown in the R code for Figure \@ref(fig:Ch2-refining3).
Here, `grouping` is set to `5`, meaning there are five rows of data in each perceptual group.
The resulting draft linked micromap plot\index{Linked micromap plot} is shown in
Figure \@ref(fig:Ch2-creatingdraft). It is noteworthy that there are eleven perceptual groups
overall --- ten with five subregions and one with just one subregion. This is because
there are 51 subregions overall: The 50 U.S. states and Washington, D.C.
This results in only one row of data and one subregion highlighted in the final (bottom) perceptual
group. There is no automatic balancing of the number of rows in each perceptual group.
A layout such as the default one shown in Figure \@ref(fig:Ch2-creatingdraft)
should be avoided in general. Table \@ref(tab:Ch1-PartitioningTable) in Chapter \@ref(Ch1) provides
suggestions on how to group data for various numbers of subregions.
We will also address this as one of the refinement steps in the next section.
We abstain from interpreting this figure at this stage, but we will provide some
helpful interpretation once we have created the last refined version of this figure
at the end of the next section. However, we encourage the reader to examine the draft maps
at this time and determine whether any spatial patterns may be visible and to assess the
statistical relationship between the two variables shown in the third and fourth columns of the plot.
```{r Ch2-creatingdraft, fig.cap = 'Draft linked micromap plot\\index{Linked micromap plot}, based on the _edPov_\\index{Datasets!edPov} dataset.', fig.width = 7, fig.height = 9}
mmplot(
  stat.data = edPov,
  map.data = state_polys_table,
  map.link = c("StateAb", "ID"),
  panel.types = c("dot_legend", "labels", "dot", "dot", "map"),
  panel.data = list(NA, "state", "pov", "ed", NA),
  ord.by = "pov",
  grouping = 5
)
```
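The group sizes that follow from a single integer `grouping` value can be worked out with integer division. The sketch below reproduces the arithmetic described above for 51 rows and a group size of five. It reflects our reading of the described behavior, not the internal code of the package, and all variable names are chosen for illustration.

```r
n_rows     <- 51  # 50 U.S. states plus Washington, D.C.
group_size <- 5   # value passed to the grouping argument

n_full    <- n_rows %/% group_size  # number of complete groups
remainder <- n_rows %% group_size   # rows left over for a final group

# Sizes of all perceptual groups; the remainder forms its own group
group_sizes <- c(rep(group_size, n_full), if (remainder > 0) remainder)
group_sizes
length(group_sizes)
```

A trailing group of size one, as obtained here, is the situation that the refinements in the next section are meant to avoid.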
### Refining the Linked Micromap Plot {#Ch2-Refining}
We continue with the draft linked micromap plot\index{Linked micromap plot} from the previous
section and refine it in multiple small steps. This refinement process should only be started once
functional R code has been obtained in the previous step and an initial
linked micromap plot\index{Linked micromap plot} has been created.
First, we remove the eleventh perceptual group (with just one subregion) and introduce
a median row via `median.row = TRUE` instead. A median row is often a good solution if the
number of subregions is odd such as for the 50 U.S. states (and Washington, D.C.).
Here, Wyoming is the state shown in the median row. It has the
26$^{th}$ highest (or lowest) value for the sorting variable, i.e., `pov`.
It does not appear in a map by itself, but rather is added to the perceptual
groups above and below the median row in a neutral color,
thus increasing the number of subregions shown in each of these two maps by one (i.e., to six here).
Also, we reverse the sorting order via `rev.ord = TRUE`.
Now the sorting of the rows goes from largest (at the top) to
smallest (at the bottom).
The resulting linked micromap plot\index{Linked micromap plot} is shown in
Figure \@ref(fig:Ch2-refining1).
While we keep `pov` as the sorting variable in this first refined version,
the reader is encouraged to use `ed` as the sorting variable to see how
the spatial patterns highlighted in the maps change. This can be done
for the original sorting order or for the reversed sorting order.
The rows can even be sorted by `region` or alphabetically
by `state` or `StateAb` even though such an alphabetical sorting
in most cases is not very meaningful.
```{r Ch2-refining1, fig.cap = 'First refined linked micromap plot\\index{Linked micromap plot}, based on the _edPov_\\index{Datasets!edPov} dataset. Main changes are the introduction of a median row instead of the eleventh perceptual group and the reverse ordering of the `pov` data in the first statistical graphics column.', fig.width = 7, fig.height = 9}
mmplot(
  stat.data = edPov,
  map.data = state_polys_table,
  map.link = c("StateAb", "ID"),
  panel.types = c("dot_legend", "labels", "dot", "dot", "map"),
  panel.data = list(NA, "state", "pov", "ed", NA),
  ord.by = "pov",
  rev.ord = TRUE,
  grouping = 5,
  median.row = TRUE
)
```
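For an odd number of subregions, the subregion in the median row is the one in the middle position after sorting. The sketch below illustrates this on a hypothetical data frame with 51 rows and distinct values; the column names `region` and `value` are assumptions and do not refer to _edPov_.

```r
# Hypothetical data: 51 subregions with distinct values 0, 1, ..., 50
# in scrambled order (7 and 51 share no common factor).
toy <- data.frame(
  region = paste0("R", 1:51),
  value  = (1:51 * 7) %% 51
)

# Sort by the ordering variable and pick the middle position
sorted     <- toy[order(toy$value), ]
median_pos <- (nrow(sorted) + 1) / 2
median_pos

# Subregion that would occupy the median row
median_region <- sorted$region[median_pos]
median_region

# Its value agrees with the sample median
sorted$value[median_pos] == median(toy$value)
```

Because the middle position is the same from either end, reversing the sort order does not change which subregion ends up in the median row.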
We continue with modifications to the first refined linked micromap plot\index{Linked micromap plot}.
Next, we change the order of the two statistical graphics columns,
i.e., we place `ed` to the left of `pov` and use `ed` as the sorting variable (in reverse order).
While any column of the linked micromap plot\index{Linked micromap plot} or even
variables not shown can be used as sorting variables, in most cases,
the first (leftmost) statistical graphics column is used for sorting.
Moreover, we place the maps on the left side of the plot. As previously stated,
the `panel.types` and `panel.data` arguments of the `mmplot()` function
are closely related. Thus, if we change the order of one, the order of the
other one has to be changed accordingly.
The resulting linked micromap plot\index{Linked micromap plot} is shown in
Figure \@ref(fig:Ch2-refining2).
As stated in Section \@ref(Ch1-LinkedMicromapPlots),
there is no strong recommendation where the column with the maps should be placed.
```{r Ch2-refining2, fig.cap = 'Second refined linked micromap plot\\index{Linked micromap plot}, based on the _edPov_\\index{Datasets!edPov} dataset. Main changes are related to the order of the five columns in the plot. Most notably, the map column is shown on the left here.', fig.width = 7, fig.height = 9}
mmplot(
  stat.data = edPov,
  map.data = state_polys_table,
  map.link = c("StateAb", "ID"),
  panel.types = c("map", "dot_legend", "labels", "dot", "dot"),
  panel.data = list(NA, NA, "state", "ed", "pov"),
  ord.by = "ed",
  rev.ord = TRUE,
  grouping = 5,
  median.row = TRUE
)
```
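One way to keep `panel.types` and `panel.data` in sync when reordering columns is to apply the same index vector to both. The following sketch is one possible approach, not a requirement of the package; `new_order` is a hypothetical helper vector that gives, for each new column position, the position of that column in the draft plot.

```r
# Column layout of the draft plot
panel_types <- c("dot_legend", "labels", "dot", "dot", "map")
panel_data  <- list(NA, "state", "pov", "ed", NA)

# New layout: map first, then legend and labels, then ed before pov
new_order <- c(5, 1, 2, 4, 3)

# Applying the same permutation to both objects keeps them aligned
new_types <- panel_types[new_order]
new_data  <- panel_data[new_order]

new_types
str(new_data)
```

The resulting objects agree with the values passed to `panel.types` and `panel.data` in the code for the second refined plot.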
We continue making changes to the second refined linked micromap plot\index{Linked micromap plot}.
We change the grouping to nine perceptual groups overall (and no median row)
via `grouping = c(6, 6, 6, 6, 3, 6, 6, 6, 6)` and `median.row = FALSE` (which is the default and could be omitted)
and vertically align the rows in each perceptual group via `vertical.align = "center"`.
This grouping is not a recommended partitioning from Table \@ref(tab:Ch1-PartitioningTable) in Chapter \@ref(Ch1)
and is mostly done for experimental purposes here.
Finally, we start making changes to individual columns of the plot via
the `panel.att` argument. Here, the third column that shows the `labels` is aligned on the left.
The resulting linked micromap plot\index{Linked micromap plot} is shown in
Figure \@ref(fig:Ch2-refining3).
```{r Ch2-refining3, fig.cap = 'Third refined linked micromap plot\\index{Linked micromap plot}, based on the _edPov_\\index{Datasets!edPov} dataset. Main changes are related to perceptual groups with different numbers of subregions and the vertical alignment of rows in the middle of the plot. Also, `labels` are aligned on the left.', fig.width = 7, fig.height = 9}
mmplot(
  stat.data = edPov,
  map.data = state_polys_table,
  map.link = c("StateAb", "ID"),
  panel.types = c("map", "dot_legend", "labels", "dot", "dot"),
  panel.data = list(NA, NA, "state", "ed", "pov"),
  ord.by = "ed",
  rev.ord = TRUE,
  grouping = c(6, 6, 6, 6, 3, 6, 6, 6, 6),
  median.row = FALSE,
  vertical.align = "center",
  panel.att = list(list(3, align = "left"))
)
```
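When a vector is used for `grouping`, it is worth checking that the group sizes add up to the number of subregions before plotting. The sketch below shows such a check in base R. It is a suggested precaution, and we do not know how the package reacts to a vector that fails it.

```r
grouping     <- c(6, 6, 6, 6, 3, 6, 6, 6, 6)
n_subregions <- 51  # 50 U.S. states plus Washington, D.C.

# Number of perceptual groups
n_groups <- length(grouping)
n_groups

# The group sizes should account for every subregion exactly once
sizes_ok <- sum(grouping) == n_subregions
sizes_ok

# Largest group, i.e., the most rows that must be told apart in one group
largest_group <- max(grouping)
largest_group
```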
After this experiment with a different grouping, we revert to the more traditional grouping
of ten perceptual groups with five rows each and a median row that is frequently used
for the 50 U.S. states (and Washington, D.C.),
suggested as Partitioning 1 in Table \@ref(tab:Ch1-PartitioningTable) in Chapter \@ref(Ch1).
Next, the choice of colors
for the subregions in each map (and thus for the `dot_legend` appearance in the
`panel.types` argument as well) is discussed. The default setting for the `colors` argument
makes use of `max(grouping)` different colors from a spectral color scheme.
Thus, when `grouping = 5` as in Figures \@ref(fig:Ch2-creatingdraft)-\@ref(fig:Ch2-refining2),
a five-class spectral color scheme\index{Color scheme!Spectral} is selected,
whereas a six-class spectral color scheme\index{Color scheme!Spectral} is
selected when `grouping = c(6, 6, 6, 6, 3, 6, 6, 6, 6)` as in Figure \@ref(fig:Ch2-refining3).
Historically, many linked micromap plots,\index{Linked micromap plot}
e.g., in @COCPC1998 and @WCCBP2002,
made use of rainbow colors\index{Color scheme!Rainbow colors} that could be obtained
via the `colors = c("red", "orange", "green", "blue", "purple")` setting of the
`colors` argument. While these colors work well for readers with normal color vision,
they may not work well for readers with certain types of color vision deficiencies.
Instead, some color schemes that are colorblind safe are better suited for such readers.
Options are single-hue or multi-hue sequential color schemes\index{Color scheme!Sequential}
or selected divergent color schemes\index{Color scheme!Divergent}.
Such color schemes can be obtained from the
**RColorBrewer**\index{R Packages!RColorBrewer} R package [@Neuwirth2022].
The reader is encouraged to read more about the theoretical background
of these color schemes in @BHH2003 and @HaBr2003 and experiment with different settings
at the supporting web page at https://colorbrewer2.org/.
@SDWPM2014 used a five-class greyscale sequential color scheme\index{Color scheme!Sequential}
from **RColorBrewer**\index{R Packages!RColorBrewer} in reverse sorting (where
the darkest grey color comes first) for use in a greyscale publication,\index{Colors!Greyscale publication}
obtained via `colors = RColorBrewer::brewer.pal(n = 5, name = "Greys")[5:1]`.
@SBDSS2016 used a five-class divergent red-yellow-blue (RdYlBu) color scheme\index{Color scheme!Divergent}
from **RColorBrewer**\index{R Packages!RColorBrewer}
that is colorblind safe\index{Colors!Colorblind safe} and print friendly,\index{Colors!Print friendly}
obtained via `colors = RColorBrewer::brewer.pal(n = 5, name = "RdYlBu")`.
In the linked micromap plot\index{Linked micromap plot} shown in
Figure \@ref(fig:Ch2-refining4), we use a
five-class divergent brown-blue-green (BrBG) color scheme\index{Color scheme!Divergent}
from **RColorBrewer**\index{R Packages!RColorBrewer}
that is also colorblind safe\index{Colors!Colorblind safe} and print friendly,\index{Colors!Print friendly}
obtained via `colors = RColorBrewer::brewer.pal(n = 5, name = "BrBG")`.
```{r Ch2-refining4, fig.cap = 'Fourth refined linked micromap plot\\index{Linked micromap plot}, based on the _edPov_\\index{Datasets!edPov} dataset. Main changes are the use of a divergent brown-blue-green color scheme\\index{Color scheme!Divergent} and the conversion back to the traditional grouping for 51 subregions.', fig.width = 7, fig.height = 9}
mmplot(
stat.data = edPov,
map.data = state_polys_table,
map.link = c("StateAb", "ID"),
panel.types = c("map", "dot_legend", "labels", "dot", "dot"),
panel.data = list(NA, NA, "state", "ed", "pov"),
ord.by = "ed",
rev.ord = TRUE,
grouping = 5,
median.row = TRUE,
colors = RColorBrewer::brewer.pal(n = 5, name = "BrBG"),
panel.att = list(list(3, align = "left"))
)
```
So far, labels for the columns and titles for the statistical graphics columns are not shown.
Tic marks and tic mark labels, in particular in the second statistical graphics column, could be improved,
background colors in the statistical graphics columns and maps could be modified,
font and symbol sizes could be modified, and the widths of the columns could be adjusted.
All of this is done via the `panel.att` argument that controls the panel specific attributes of
each column in the linked micromap plot.\index{Linked micromap plot}
The content of this argument typically is a list of lists where the attributes
for each column of the linked micromap plot\index{Linked micromap plot} are modified via a separate list.
These inner lists are numbered from 1 to the number of elements in the `panel.types` vector
where `1` is related to `map`, `2` to `dot_legend`, `3` to `labels`, and `4` and `5` to `dot`, i.e.,
the dotplots\index{Dotplot} in the two statistical graphics columns,
in the linked micromap plot\index{Linked micromap plot} shown in
Figure \@ref(fig:Ch2-refining5).
We leave it to the reader to further experiment with the different elements in these lists.
If the purpose of a certain element is not immediately obvious, it is useful
to considerably increase or decrease the numeric value of that element or change the color to
`red` or `yellow` to highlight that element.
Here, we only want to explain the purpose of the `fill.regions` element with the matching `header` element
in the list for column `1`, i.e., the `map` column:
The setting `fill.regions = "aggregate"` (which in fact is the default setting)
fills in the subregions from all previous perceptual groups in the
subsequent perceptual groups. This filling proceeds from
the top perceptual group to the bottom perceptual group by sequentially
filling the subregions that have already been displayed.
Thus, in the map for the final perceptual group at the bottom, all subregions have been filled.
The text in the `header` is used to communicate this information to the reader.
Alternatively, the setting `fill.regions = "with data"` only fills those subregions
in a map that actually show data in that perceptual group.
No additional subregions are filled in any of the maps.
Another setting for `fill.regions` is discussed in the next refinement step.
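
The relationship between the inner lists of `panel.att` and the columns of the plot
can be checked without drawing anything. The following base R sketch uses a shortened,
illustrative `panel.att` (only two inner lists, with values taken from the text) and
looks up the panel type that each inner list refers to. The helper names `cols` and
`att_overview` are our own.

```r
# Sketch: which column does each inner list of panel.att modify?
panel.types <- c("map", "dot_legend", "labels", "dot", "dot")

# Shortened, illustrative version of a panel.att argument
panel.att <- list(
  list(1, header = "Light Gray Means\nPreviously Displayed",
       fill.regions = "aggregate"),
  list(3, header = "States", align = "left")
)

# The first (unnamed) element of each inner list is the column number
cols <- vapply(panel.att, function(att) att[[1]], numeric(1))

# Overview of column numbers and the panel types they refer to
att_overview <- data.frame(column = cols, panel.type = panel.types[cols])
att_overview
```
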
```{r Ch2-refining5, fig.cap = 'Fifth refined linked micromap plot\\index{Linked micromap plot}, based on the _edPov_\\index{Datasets!edPov} dataset. Main changes are related to the panel specific attributes.', fig.width = 7, fig.height = 9}
mmplot(
stat.data = edPov,
map.data = state_polys_table,
map.link = c("StateAb", "ID"),
panel.types = c("map", "dot_legend", "labels", "dot", "dot"),
panel.data = list(NA, NA, "state", "ed", "pov"),
ord.by = "ed",
rev.ord = TRUE,
grouping = 5,
median.row = TRUE,
colors = RColorBrewer::brewer.pal(n = 5, name = "BrBG"),
panel.att = list(
list(
1,
header = "Light Gray Means\nPreviously Displayed",
map.all = TRUE,
fill.regions = "aggregate",
active.border.color = "black",
active.border.size = 1.2,
inactive.border.color = gray(0.7),
inactive.border.size = 1,
panel.width = 0.85
),
list(
2,
point.type = 20,
point.border = TRUE,
point.size = 2,
panel.width = 1.0
),
list(
3,
header = "States",
align = "left",
text.size = 0.9,
panel.width = 0.75
),
list(
4,
header = "Percent Adults With\n4+ Years of College",
graph.bgcolor = "lightgray",
point.size = 1.5,
xaxis.ticks = list(10, 20, 30, 40),
xaxis.labels = list(10, 20, 30, 40),
xaxis.title = "Percent"
),
list(
5,
header = "Percent Living Below\nPoverty Level",
graph.bgcolor = "lightgray",
point.size = 1.5,
xaxis.ticks = list(5, 10, 15, 20),
xaxis.labels = list(5, 10, 15, 20),
xaxis.title = "Percent"
)
)
)
```
In the final refined linked micromap plot\index{Linked micromap plot} shown in
Figure \@ref(fig:Ch2-refining6), we make three more types of changes.
First, we make use of the `labeling::extended()` function of
the **labeling**\index{R Packages!labeling} R package [@Talbot2020]. This package is not
included with **micromap**\index{R Packages!micromap} and it must be installed separately.
This function takes the minimum and maximum values of a variable and the desired
number of tic marks and tic mark labels for that variable, and it returns a vector with
near-optimal axis labels. The argument `m` is used as a guideline for the number of axis labels,
but the actual number of near-optimal axis labels may differ slightly. The reader is encouraged to experiment with
`m = 2` to `m = 6` in the R code below.
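
The effect of `m` can also be explored in isolation, without redrawing the plot.
The following sketch loops over `m = 2` to `m = 6`. The values of `dmin` and `dmax` are
assumed round numbers that only roughly match the range of the `ed` variable described
in this chapter; in practice, `min(edPov$ed)` and `max(edPov$ed)` would be used instead.

```r
# Sketch: how the suggested axis labels change with m (requires labeling).
# dmin and dmax are assumed values for illustration only.
dmin <- 15
dmax <- 40

for (m in 2:6) {
  ticks <- labeling::extended(dmin = dmin, dmax = dmax, m = m)
  cat("m =", m, "->", ticks, "\n")
}
```
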
Second, we use the setting `fill.regions = "two ended"`. This setting makes most sense when
`median.row = TRUE` (as is the case here) and the focus of the maps is to
indicate which subregions are above or below the median value
of the variable specified in the `ord.by` argument (here `ed`).
Similar to the setting `fill.regions = "aggregate"`,
the subregions from all previous perceptual groups are filled in the
subsequent perceptual groups. This filling proceeds from
the top perceptual group to the median row
and from the bottom perceptual group to the median row
by sequentially
filling the subregions that have already been displayed on the more extreme ends.
The text in the `header` has been updated to communicate this information to the reader.
Third, we reduce the margin space between some of the columns via the
`right.margin` and `left.margin` settings. Negative values are allowed.
Fine-tuning the margin spacing and the widths of the columns can require
a few iterations. The reader should always check carefully that no identifiers
are truncated or overprinted, in particular that no letters
from the longest identifier (here `Washington D.C.`) are cut off.
```{r Ch2-refining6, fig.cap = 'Sixth (and final) refined linked micromap plot\\index{Linked micromap plot}, based on the _edPov_\\index{Datasets!edPov} dataset. Main changes are related to labeling, the coloring of perceptual groups above and below the median row, and the column spacing.', fig.width = 7, fig.height = 9}
library(labeling)
mmplot(
stat.data = edPov,
map.data = state_polys_table,
map.link = c("StateAb", "ID"),
panel.types = c("map", "dot_legend", "labels", "dot", "dot"),
panel.data = list(NA, NA, "state", "ed", "pov"),
ord.by = "ed",
rev.ord = TRUE,
grouping = 5,
median.row = TRUE,
colors = RColorBrewer::brewer.pal(n = 5, name = "BrBG"),
panel.att = list(
list(
1,
header = "Two-ended\nCumulative Maps",
map.all = TRUE,
fill.regions = "two ended",
active.border.color = "black",
active.border.size = 1.2,
inactive.border.color = gray(0.7),
inactive.border.size = 1,
panel.width = 0.85
),
list(
2,
point.type = 20,
point.border = TRUE,
point.size = 2,
panel.width = 1.0
),
list(
3,
header = "States",
align = "left",
right.margin = 0,
left.margin = -1,
text.size = 0.9,
panel.width = 0.75
),
list(
4,
header = "Percent Adults With\n4+ Years of College",
graph.bgcolor = "lightgray",
right.margin = 0,
left.margin = -0.6,
point.size = 1.5,
xaxis.ticks = as.list(labeling::extended(
dmin = min(edPov$ed),
dmax = max(edPov$ed),
m = 5
)),
xaxis.labels = as.list(labeling::extended(
dmin = min(edPov$ed),
dmax = max(edPov$ed),
m = 5
)),
xaxis.title = "Percent"
),
list(
5,
header = "Percent Living Below\nPoverty Level",
graph.bgcolor = "lightgray",
right.margin = 0.25,
left.margin = -0.6,
point.size = 1.5,
xaxis.ticks = as.list(labeling::extended(
dmin = min(edPov$pov),
dmax = max(edPov$pov),
m = 4
)),
xaxis.labels = as.list(labeling::extended(
dmin = min(edPov$pov),
dmax = max(edPov$pov),
m = 4
)),
xaxis.title = "Percent"
)
)
)
```
What remains to be done is a summary and interpretation of the
final refined linked micromap plot\index{Linked micromap plot} shown in Figure \@ref(fig:Ch2-refining6).
This plot shows dotplots of two statistical variables, the percentage of adults with four or more years of college
(in the first statistical graphics column which is the fourth column overall)
and the percentage living below poverty level (in the second statistical graphics column
which is the fifth column overall) in the 50 U.S. states and Washington, D.C.
The percentage of adults with four or more years of college is used as the sorting variable
for the rows in the plot -- with highest percentages shown at the top and lowest percentages
shown at the bottom.
The map panel in the first column shows some noticeable, but not very strong spatial patterns.
Highest percentages of adults with four or more years of college can be found in the
northeastern states. In fact, eight of the top-10 states are located in the northeast,
with Washington, D.C., having the highest percentage with almost 40%. When looking at the other
maps above the median state (here, Arizona), additional eastern states, but also several western
states can be seen.
When looking at the two maps at the bottom of the plot, mostly southern states can be seen.
West Virginia is the state with the lowest percentage of only about 15%
of adults with four or more years of college. Overall,
primarily southern and central states can be seen in the maps below the median state.
The percentage living below poverty level (in the fifth column overall) shows a different pattern.
States with a high percentage of adults with four or more years of college have
a low percentage living below poverty level.
Visually, the dots in the fourth and fifth columns diverge, forming a crude caret shape
(resembling an upside-down V).
Washington, D.C., is a major outlier as it has the highest
percentage of adults with four or more years of college (almost 40%)
but also the highest percentage living below poverty level (more than 20%).
Another interesting state is New Mexico with
an above median percentage of adults with four or more years of college,
but with the fourth highest percentage living below poverty level (about 18%).
As expected, the overall correlation between these two variables is negative,
but given these two major outliers and several minor outliers
(such as Indiana and Nevada that both have a relatively low percentage living below poverty level
despite being among the bottom-10 states with respect to the percentage of
adults with four or more years of college), the (negative) correlation is relatively weak.
The correlation coefficient $r$ is
only `r round(cor(edPov$ed, edPov$pov), digits = 2)`.
## Example 2: A Linked Micromap Plot for Watersheds in West Virginia {#Ch2-Example2}
In this second linked micromap plot\index{Linked micromap plot}
example created with the **micromap**\index{R Packages!micromap} R package, we work with the
_WV_Watershed_\index{Datasets!WV\_Watershed} dataset
from the **micromapExtra**\index{R Packages!micromapExtra} R package.
We follow the four general steps outlined in Section \@ref(Ch2-Steps) again.
However, there are several differences compared to the first example in Section \@ref(Ch2-Example1).
First, we work with external shapefiles\index{Shapefiles} [@ESRI1998]
that contain both the boundary for watersheds in West Virginia and the statistical data
used in the following linked micromap plots\index{Linked micromap plot}, rather than having
a separate dataset for the statistical data.
In general, shapefiles\index{Shapefiles} are a collection of related files
with the same prefix that contain the geography and attributes (i.e., data)
of geographically referenced spatial features.
Shapefiles\index{Shapefiles} consist of at least three files:
a main file that stores the feature geometry (with suffix `.shp`),
an index file that stores the index of the feature geometry (with suffix `.shx`),
and a dBASE table that contains the attribute information of the spatial features (with suffix `.dbf`).
Additional files may be included in a shapefile\index{Shapefiles}.
Here, for the _WV_Watershed_\index{Datasets!WV\_Watershed} dataset, four additional files are provided:
a file with suffix `.prj` that contains the spatial coordinate system information (i.e., the projection),
two files with suffixes `.sbn` and `.sbx` that store the spatial index of the features,
and a file with suffix `.xml` that contains metadata for the shapefile.
Additional suffixes may be used for other shapefiles\index{Shapefiles} as outlined in @ESRI2016.
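
Because a shapefile\index{Shapefiles} is only usable when its three mandatory files are present,
a quick check before reading it into R can save some debugging time.
The following base R sketch defines a small helper function, `missing_shapefile_parts()`,
which is our own illustration and not part of any package. The directory and file prefix
are the ones used later in this section.

```r
# Sketch: report which mandatory shapefile components are missing.
# missing_shapefile_parts() is a hypothetical helper for illustration.
missing_shapefile_parts <- function(dir, prefix,
                                    required = c("shp", "shx", "dbf")) {
  files <- list.files(dir)
  # Keep only the files that belong to this shapefile
  files <- files[startsWith(files, paste0(prefix, "."))]
  # Compare the suffixes found with the mandatory ones
  setdiff(required, tolower(tools::file_ext(files)))
}

# An empty result means that all three mandatory files were found
missing_shapefile_parts(
  dir = "data/WV_Watershed",
  prefix = "RandomWatershed2_stats_smooth"
)
```
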
For use in linked micromap plots\index{Linked micromap plot}
that are created with the **micromap**\index{R Packages!micromap} R package,
external shapefiles\index{Shapefiles} can be handled in two ways.
Option (i) is to read in the external shapefile\index{Shapefiles} as a
SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame}
and then either split it into the geographic information and the statistical data component,
use a modified statistical data component, or use a statistical data component from
a different source, in particular if the shapefile\index{Shapefiles} does not contain any
statistical data.
Numerous R packages, such as
**raster**\index{R Packages!raster} [@Hijmans2022raster],
**sf**\index{R Packages!sf} [@Pebesma2022],
**shapefiles**\index{R Packages!shapefiles} [@Stabler2022],
and **terra**\index{R Packages!terra} [@Hijmans2022terra],
support the use of shapefiles\index{Shapefiles} in R.
Of these R packages, only the **raster**\index{R Packages!raster}
R package directly creates a SpatialPolygonsDataFrame.\index{SpatialPolygonsDataFrame}
For functions from the other R packages, an additional transformation step would be needed.
Therefore, we use the `raster::shapefile()` function here to read in the shapefile.\index{Shapefiles}
Option (ii) is to read in the external shapefile\index{Shapefiles} in a
simple features\index{Simple features} format [@OGC2022]
via the `sf::st_read()` function
from the **sf**\index{R Packages!sf} R package [@Pebesma2022],
instead of creating a
SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame}.
It is worthwhile to mention that geography and attributes that are
stored in a simple features\index{Simple features} format
no longer have to be split when used in linked micromap plots.\index{Linked micromap plot}
Rather, the **micromap**\index{R Packages!micromap} R package
can handle them directly from the simple features\index{Simple features} object.
Overall, options (i) and (ii) both will eventually result in the same final
linked micromap plot.\index{Linked micromap plot}
Option (i) may be preferred if the external shapefile does not contain the statistical data
or if some considerable modifications have to be made to the statistical data prior to
creating the linked micromap plot.\index{Linked micromap plot}
Not all R packages can handle simple features\index{Simple features} objects, so
a split into a geographic component and a statistical data component may be necessary anyway
for some advanced processing of the statistical data.
Option (ii) may be preferred if the statistical data from the external shapefile\index{Shapefiles}
can be used almost as provided in the external shapefile.\index{Shapefiles}
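
The additional transformation step mentioned for option (i) can be sketched with the
**sf**\index{R Packages!sf} R package as follows. As a stand-in for the West Virginia shapefile,
this sketch uses the `nc.shp` example shapefile that ships with **sf**, so the object names
`nc_sf`, `nc_sp`, and `nc_sf2` are purely illustrative. The conversion requires that the
**sp**\index{R Packages!sp} R package is installed.

```r
# Sketch: converting between a simple features object and a
# SpatialPolygonsDataFrame (requires sf and sp).
# nc.shp ships with sf and is used here only as a stand-in shapefile.
nc_path <- system.file("shape/nc.shp", package = "sf")

# Option (ii) style: read the shapefile as a simple features object
nc_sf <- sf::st_read(dsn = nc_path, quiet = TRUE)
class(nc_sf)

# Transformation step: simple features -> SpatialPolygonsDataFrame
nc_sp <- sf::as_Spatial(nc_sf)
class(nc_sp)

# And back again, should the simple features format be needed later
nc_sf2 <- sf::st_as_sf(nc_sp)
class(nc_sf2)
```
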
Moreover, in this second example we introduce two new statistical displays
for the statistical graphics columns of the linked micromap plot\index{Linked micromap plot}:
a boxplot\index{Boxplot} and a dotplot with confidence bounds\index{Dotplot with confidence bounds}.
Different arguments are used to fine-tune this linked micromap plot\index{Linked micromap plot}.
Finally, we will demonstrate how to add an overall statistics (or criteria) line to the
statistical graphics columns of the linked micromap plot\index{Linked micromap plot}.
### Identifying and Geoprocessing of Spatial Boundary Data {#Ch2-IdentifyingWV}
The _WV\_Watershed_\index{Datasets!WV\_Watershed} dataset
from the **micromapExtra**\index{R Packages!micromapExtra} R package
is stored in external shapefiles\index{Shapefiles} that can be
read in as a SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame}
via the `raster::shapefile()` function from the
**raster**\index{R Packages!raster} R package [@Hijmans2022raster]
or as a simple features\index{Simple features} object
via the `sf::st_read()` function from the
**sf**\index{R Packages!sf} R package [@Pebesma2022],
as discussed above.
The shapefile contains
25 aggregated watersheds and subbasins in West Virginia in the United States.
These watersheds were introduced and discussed in more detail in @MPRG2016.
In addition to the geographic information, this dataset also contains
the statistical information for the linked micromap plots\index{Linked micromap plot}
created in this section.
We first demonstrate the steps necessary for option (i).
Similar to the first example in Section \@ref(Ch2-Identifying),
we verify that this object (once read into R) indeed is a
SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame}, then view
some of the data in the `data` component of this object, and finally plot it,
as shown in Figure \@ref(fig:Ch2-WV).
The last step is the extraction of the statistical data from this
SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame} into a regular data frame.
This can be done by accessing the data from the `@data` slot in the
SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame}.
There are 41 variables in the resulting data frame.
In contrast to @MPRG2016, we will focus on the specific conductance variables
in the following linked micromap plots.\index{Linked micromap plot}
Variable names beginning with an uppercase letter, e.g., `Cond_med`, `Cond_LCB95`,
`Cond_UCB95`, are the population estimates, representing the median,
the lower 95% confidence bound, and the upper 95% confidence bound of a variable
(here, specific conductance) in each of the 25 watersheds, respectively.
These will be used for the construction of
dotplots with confidence bounds\index{Dotplot with confidence bounds}.
Variable names starting with a lowercase letter, e.g.,
`cond_min`, `condq1`, `cond_med1`, `cond_q3`, `cond_max`,
are the descriptive statistics, representing the minimum, first quartile, median,
third quartile, and maximum of a variable (here, specific conductance) in each of the 25 watersheds, respectively.
These will be used for the construction of boxplots\index{Boxplot}.
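
The naming convention described above makes it easy to separate the two groups of
variables by their first letter. The following base R sketch works on a character vector
that contains only the variable names mentioned in the text; with the full dataset,
the vector of all variable names of the data frame would be used instead.
The object names `var_names`, `estimates`, and `descriptives` are our own.

```r
# Sketch: separate population estimates from descriptive statistics
# by the case of the first letter of the variable name.
# var_names lists only the names mentioned in the text.
var_names <- c(
  "Random_Wat",
  "Cond_med", "Cond_LCB95", "Cond_UCB95",
  "cond_min", "condq1", "cond_med1", "cond_q3", "cond_max"
)

# Uppercase "Cond...": median and 95% confidence bounds (population estimates)
estimates <- grep("^Cond", var_names, value = TRUE)

# Lowercase "cond...": five-number summary (descriptive statistics)
descriptives <- grep("^cond", var_names, value = TRUE)

estimates
descriptives
```
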
(ref:Ch2-WV-cap) Map representation of the _WV_Watershed_\index{Datasets!WV\_Watershed} spatial boundary dataset for the 25 watersheds in West Virginia that is used as the basis for the linked micromap plots\index{Linked micromap plot} for option (i).
```{r Ch2-WV, fig.cap = '(ref:Ch2-WV-cap)', fig.width = 5, fig.height = 4}
wv_watershed <- raster::shapefile(
x = "data/WV_Watershed/RandomWatershed2_stats_smooth.shp",
verbose = FALSE
)
class(wv_watershed)
head(wv_watershed@data, n = 2)
par(mar = c(0, 0, 0, 0))
plot(wv_watershed)
wv_data <- wv_watershed@data
names(wv_data)
dim(wv_data)
```
Alternatively, we demonstrate the steps necessary for option (ii).
Similar to option (i),
we verify that this object (once read into R) indeed is in the
simple features\index{Simple features} format, then look
at some of the data of this sf object, and finally plot it,
as shown in Figure \@ref(fig:Ch2-WV-sf).
(ref:Ch2-WV-cap-sf) Map representation of the _WV_Watershed_\index{Datasets!WV\_Watershed} spatial boundary dataset for the 25 watersheds in West Virginia that is used as the basis for the linked micromap plots\index{Linked micromap plot} for option (ii). It is identical to the one for option (i) shown in Figure \@ref(fig:Ch2-WV).
```{r Ch2-WV-sf, fig.cap = '(ref:Ch2-WV-cap-sf)', fig.width = 5, fig.height = 4}
wv_watershed_sf <- sf::st_read(
dsn = "data/WV_Watershed/RandomWatershed2_stats_smooth.shp",
quiet = TRUE
)
class(wv_watershed_sf)
head(wv_watershed_sf, n = 2)
par(mar = c(0, 0, 0, 0))
plot(sf::st_geometry(wv_watershed_sf))
```
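
Although option (ii) does not require a split into a geographic and a statistical component,
a plain data frame with the attributes may still be useful, e.g., for summaries.
The following sketch shows one possible way to obtain it via `sf::st_drop_geometry()`,
which plays a role comparable to accessing the `@data` slot in option (i).
It again uses the `nc.shp` example shapefile that ships with **sf**\index{R Packages!sf} as a
stand-in, so `nc_sf` and `nc_data` are illustrative names only.

```r
# Sketch: extract the attribute table from a simple features object
# (requires sf). nc.shp ships with sf and serves only as a stand-in.
nc_sf <- sf::st_read(
  dsn = system.file("shape/nc.shp", package = "sf"),
  quiet = TRUE
)

# Drop the geometry column to obtain a regular data frame
nc_data <- sf::st_drop_geometry(nc_sf)

class(nc_data)
dim(nc_data)
head(names(nc_data))
```
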
### Linking Spatial Boundary Data and Statistical Data {#Ch2-LinkingWV}
This step is only required for option (i) and can be skipped entirely for option (ii).
Similar to Section \@ref(Ch2-Linking),
the SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame} first is transformed
into a regular data frame for use in a linked micromap plot\index{Linked micromap plot}
via the `create_map_table()` function. The `Random_Wat` variable from the `data`
component of the SpatialPolygonsDataFrame\index{SpatialPolygonsDataFrame} object
can be used as an ID column here.
```{r Ch2-linkingdataframesWV}
wv_polys_table <- create_map_table(
tmp.map = wv_watershed,
IDcolumn = "Random_Wat"
)
head(wv_polys_table)
class(wv_polys_table)
dim(wv_polys_table)
names(wv_polys_table)
```
The resulting `wv_polys_table` is a regular data frame that will be used for creating a
linked micromap plot\index{Linked micromap plot} in the next step.
`wv_polys_table` consists of `r dim(wv_polys_table)[1]` rows. This implies that the map is based
on `r dim(wv_polys_table)[1]` line segments.
We have already extracted the statistical data into the `wv_data` data frame.
As for the spatial component,
the `Random_Wat` variable serves as the linking variable between the two data frames.
As in Section \@ref(Ch2-Linking), we want to verify that there is indeed at least one matching ID
in the `wv_polys_table` data frame for each ID in the `wv_data` data frame.
In fact, everything is matching here.
```{r Ch2-linkingdataframesWV2}
all(sort(wv_data$Random_Wat) == sort(unique(wv_polys_table$ID)))
```
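
The element-wise comparison above assumes that both sorted ID vectors have the same length;
otherwise, R recycles the shorter vector. A set-based check is one possible alternative
that additionally reports which IDs fail to match. The following base R sketch uses
made-up IDs (`stat_ids` and `map_ids` are hypothetical) in which one ID is missing on
each side.

```r
# Sketch: set-based check of the linking variable with made-up IDs.
stat_ids <- c("A", "B", "C")      # hypothetical IDs in the statistical data
map_ids <- c("A", "A", "B", "D")  # hypothetical IDs in the map table

# IDs in the statistical data without any matching polygon
missing_in_map <- setdiff(stat_ids, map_ids)

# IDs in the map table without any statistical data
missing_in_stat <- setdiff(unique(map_ids), stat_ids)

missing_in_map
missing_in_stat

# TRUE only if both sets of IDs are identical
setequal(stat_ids, map_ids)
```

For the data in this example, `wv_data$Random_Wat` and `wv_polys_table$ID` would take the
place of the two hypothetical vectors.
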
### Creating a Draft Linked Micromap Plot {#Ch2-CreatingWV}
We can now create a minimal draft linked micromap plot\index{Linked micromap plot} using the
`mmplot()` function, based on the previously created
`wv_polys_table` and `wv_data` data frames for option (i).
Similar to our first example in Section \@ref(Ch2-Creating),
we only provide seven required arguments in our function call.