To create the interactive maps shown in this document, the following contributed R packages are required: tidygeocoder, sf, mapview, and rnaturalearth. As always, these packages must be installed once, such as with the RStudio Tools menu. Or, just run the following function call to install.
install.packages(c("tidygeocoder", "sf", "mapview", "rnaturalearth"))
Access the packages from your R library with the following library() calls. For lessR, turn off the suggestions to save space and focus on the primary content.
suppressPackageStartupMessages(library(lessR))
style(suggest=FALSE)
library(tidygeocoder)
library(mapview)
library(sf)
library(rnaturalearth)Map Basics
From lessR version 4.5.4.
From sf version 1.1.0.
From mapview version 1.0.6.
From tidygeocoder version 2.11.4.
From rnaturalearth version 1.2.0.
Map Types
There are many types of maps. The following are among the most common types.
- Political map or country map: Show political and administrative boundaries, such as countries and states within countries, and cities within states. Plotted symbols represent geographical entities such as cities, and possibly the size of the plotted symbol reflects the extent of the value of some variable, such as population. With a small enough area that shows streets and highways, this map can be called a street map. (Tableau calls this a symbol map.)
- Choropleth map or thematic map: Fill different parts of the map with varying intensities of color, where the color intensity reflects the value of some other variable. (Tableau calls this a filled map.)
- Physical map or terrain map: Show natural landscape features of the Earth what when created from imagery of the world. (Tableau calls this a satellite map.)
We use the R mapview() function from the package of the same name to create maps in these three styles.
Projections
The Earth is approximately spherical. However, we draw maps of the Earth on flat surfaces such as paper or a computer screen. To create a map, we project spherical coordinates, which describe specific locations on a curved surface, into the two-dimensional <x, y> coordinates of a flat surface.
Flatten the coordinates of latitude and longitude that describe a position on the Earth’s three-dimensional surface into two dimensions.
All maps of any part of the Earth, or of the entire Earth, are projections. The problem is that a flat surface cannot represent a spherical object without distortion. Of the many available competing projections, each has its own flaws. A transformation of a spherical object to flatness cannot simultaneously retain the accuracy of area, direction, distance, and shape. Each possible projection compromises at least one of these properties while minimizing distortion in the others. The larger the mapped area, the greater the distortion. A world map necessarily presents more distortion than a city map.
With modern computer technology, you can view a variety of projections or even create your own.
Figure 1 illustrates two different projections of the Earth’s surface to yield two different political maps.
Figure 1 (a) presents the simplest possible projection, which maps longitude and latitude directly to the map’s rectangular grid <x, y> coordinates. This simple rectangular longitude–latitude rendering of the world is more formally known as the Plate Carree projection, French for “square plan”. This projection renders the Earth as a flat rectangle but introduces large distortions, especially away from the equator. For example, the Antarctic is not nearly as large as North America, yet the map renders a huge Antarctic.
Figure 1 (b) shows the Mollweide projection, an equal-area projection that preserves relative sizes of regions but not angles. As a result, the continents appear in correct proportions by area, with much less distortion of size than in Plate Carree, though their shapes and angular relationships are not preserved. The Mollweide projection renders the Earth as an ellipse.
Visualization Modes
Store a digital image in one of two primary formats: vector or raster.
Composed of paths defined by mathematical formulas, including points, lines, and curves.
Vector images are resolution-independent, so they scale perfectly from small to large with no distortion because they consist of mathematical descriptions of objects. Viewing a vector image is viewing the display of mathematical objects created at the given resolution. They are informally referred to as drawings, roughly analogous to a human-created drawing on paper, with the advantage of scalability. Common file types are .pdf and .eps.
Vector images can scale perfectly, but without the finest level of detail and gradation that a photograph can provide, which represents the other primary type of image.
Composed of a grid of pixels, each having a specific color value.
A photograph is a raster image, typically composed of a large number of tiny pixels or dots. The disadvantages are that raster images can require much more storage space than vector images and distort when scaled to larger sizes. A raster image, however, at its native resolution, can achieve precise detail, including textures with color gradations. A raster image is also called a bit-mapped image or, more informally, a painting or photograph. Common file types are .jpeg or .jpg, and .png.
Figure 2 presents an example of each image type.
Historically, detailed maps were only created from raster images. However, with advances in computer imaging technology, we can now produce high-quality maps from vector images. While raster maps can still be created from visualization systems, the success of high-quality vector images has shifted the focus to vector-based mapping.
Spatial Data
To draw a map, we need the precise coordinates on Earth where the relevant geographical features are located, typically their longitude and latitude. Identify the location of an object in terms of these coordinates, its geocode.
An object’s location in terms of longitude and latitude.
We have two general ways to obtain geocodes.
- Use an online service for the specific addresses of places we provide, which allows customization of the resulting map to the specific locations of interest.
- Access data files on the web that already include geocodes for existing geographical features, such as cities around the world.
First, consider obtaining custom geocodes from supplied addresses.
Step 1: Specify the Addresses
For custom geocodes, create a file that contains the names of the locations to map and their corresponding addresses. Locations can include the name of a city, a specific address within a city, a mountain, a lake, or any other relevant geographical feature. The name can be anything as the key information is the supplied address. Moreover, the file can include additional information such as population, financial information, or any other variable of interest that could be related to a map. These location data are processed as an R data frame, so store them in any file format that R can read.
For example, to create a map of the locations of some universities and colleges in Portland, Oregon, begin by entering two columns of data into a spreadsheet, and any additional columns of interest. The first variable, we choose to label school, is the name of the feature. The second variable, address, defines its location. The first row of the file might appear as follows.
Figure 3 shows the information contained in an Excel file from which the geocodes will be obtained and a map drawn. Find the brief Excel file at:
https://dgerbing.github.io/data/PDXaddresses.xlsx
Read this address file into R, such as into the pdx data frame.
pdx <- Read("https://dgerbing.github.io/data/PDXaddresses.xlsx")
pdx school address students
1 Portland State University 1825 SW Broadway, Portland, OR 97201 12490
2 University of Portland 5000 N Willamette Blvd, Portland, OR 97203 3700
3 Reed College 3203 SE Woodstock Blvd, Portland, OR 97202 1458
4 Lewis and Clark College 615 S Palatine Hill Rd, OR 97219 3520
View the available services by entering ?geo. Choose the geocoding service with the parameter method.
With this information, the geocodes can be obtained.
Step 2: Obtain the Geocodes
There are multiple sources of custom geocodes. Here, obtain the geocodes with the geocode() function from the tidygeocoder package. Present the address for each object as a single character string or divide it into separate variables that specify the street, city, state, postalcode, and country. For the input data table, name the variables as you wish, except that addresses must be specified with these given parameter names when calling the geocode() function.
In this data set, there is a single address field instead of a separate street, etc. Identify this address variable in the Excel data file and specify as the argument to the address parameter. The other variables can be ignored in the call to geocode() because they will simply be passed along to the data frame enhanced with the geocodes.
pdx <- geocode(pdx, address=address)Following is the original data table with the requested geocodes.
data.frame(pdx) school address students lat long
1 Portland State University 1825 SW Broadway, Portland, OR 97201 12490 45.51175 -122.6845
2 University of Portland 5000 N Willamette Blvd, Portland, OR 97203 3700 45.57368 -122.7283
3 Reed College 3203 SE Woodstock Blvd, Portland, OR 97202 1458 45.48150 -122.6310
4 Lewis and Clark College 615 S Palatine Hill Rd, OR 97219 3520 45.45069 -122.6713
A mildly annoying aspect of the geocode() function is that it does not return a standard R data frame, but instead a tibble style data frame. Not necessary, but to display the tibble as a standard R data frame, submit the output data frame to the Base R data.frame() function.
The revised data file contains the original information, plus the geocodes, the latitude and longitude of each location, with variable names lat and long.
You can save the updated file with the geocodes for use with another visualization system, such as Tableau. The lessR function Write() writes this information to a file with a format of your choice. Specify the date frame, the name of the file you wish to write within quotes, which can include a full path name. The default format is csv. To write an Excel file, specify the value of the parameter format as "Excel".
Write(pdx, "Geocoded", format="Excel")[with the writeData() function from Schauberger and Walker's openxlsx package]
Unlike the Base R write functions, Write() helpfully informs you as to where it wrote the file.
Once the geocodes are obtained for the desired locations, one more data-preparation step remains: convert the data to a special file type.
Step 3: Simple Features Data
Spatial Data
Maps are drawn from data formatted as spatial data. To draw a map, we need more than the geocodes of every object that we wish to draw. We also need the type of object to be drawn.
Data that describe geophysical features, such as boundaries between countries, altitudes, and shorelines, and their locations.
Geographers have developed specific data-table formats for storing spatial data, of which the most commonly encountered is called simple features (sf). Draw a map by entering an sf data frame into a mapping function.
Data that use geometric figures such as points, lines, or polygons and their respective coordinates to describe the features of geographical objects.
This standard format for storing spatial data is independent of any one software system and can be read by mapping functions available in many visualization systems, such as R, including mapview().
The mapview() function provides an interface to the interactive leaflet mapping system, accomplishing the same amount of work with fewer computer instructions, similar to the relationship between lessR and R.
Projection
As previously discussed, specifying the location of a geographical object requires adopting a projection to define locations on a flat map. Drawing a map is necessarily done according to a specific projection.
The projection that transforms locations on the surface of the Earth into map coordinates.
Geographers have defined more than 7000 CRS’s. A commonly encountered CRS is identified by the number 4326. To understand how and when to choose a specific CRS requires a deep dive into cartography, the practice of map making. Fortunately, CRS 4326 is generally applicable to many different maps and often serves as the default projection for mapping software. It corresponds to the familiar longitude and latitude coordinate system used by GPS devices, so a location recorded by GPS is typically expressed as a pair of longitude and latitude values in this CRS.
Create
An sf data table describes vector data as text. Each row of an sf data table represents a single spatial object and the coordinates that locate that object on the map. For example, one row may define a point at its associated coordinates, or a polygon that outlines a country’s shape. Other data can also be included in the file, such as a city’s population and elevation.
The transformation of the original geocoded data frame into an sf data frame incorporates the longitude and latitude values into a variable it creates named geometry. These data values include the coordinates and the type of geometric object to represent on the map. In this example, the type of object drawn for this particular row of data is a POINT.
The function st_as_sf() from the sf package transforms the geocoded data frame to a simple features data frame by creating the geometry variable. Refer to the variable names long and lat as they are defined in the geocoded file, specified as a vector with the coords parameter. The first parameter is the name of the geocoded data frame, here pdx.
pdx_sf <- st_as_sf(pdx, coords=c("long", "lat"), crs=4326)pdx_sfSimple feature collection with 4 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -122.7283 ymin: 45.45069 xmax: -122.631 ymax: 45.57368
Geodetic CRS: WGS 84
school address students geometry
1 Portland State University 1825 SW Broadway, Portland, OR 97201 12490 POINT (-122.6845 45.51175)
2 University of Portland 5000 N Willamette Blvd, Portland, OR 97203 3700 POINT (-122.7283 45.57368)
3 Reed College 3203 SE Woodstock Blvd, Portland, OR 97202 1458 POINT (-122.631 45.4815)
4 Lewis and Clark College 615 S Palatine Hill Rd, OR 97219 3520 POINT (-122.6713 45.45069)
The coordinate reference system 4326 is also known as WGS 84, as identified in this simple features file.
This example provided an overview of how to obtain an sf data file from a given set of geocoded addresses. In other scenarios, as illustrated later, we directly access the relevant sf data file from the internet. For example, when mapping an entire country and its corresponding states or provinces, sf files are available that already contain the needed spatial information.
Political Maps
Portland
Step 4: Create the Map
The most straightforward R mapping system is provided by the mapview package. The simplest function call is to the function of the same name, mapview(). As with most mapping functions in any data analysis system, reference the corresponding sf data frame, such as pdx_sf, from which the map is constructed.
mapview() ease of use
The mapview() function automatically scales the map to include the area surrounding the plotted points and automatically downloads the necessary spatial data to draw the background based on the designated map type.
The result is an interactive map showing the locations of the four schools within the city limits of Portland, Oregon. The name of the input simple features data frame appears in the bottom-right corner of the map. Also displayed are the name of the Leaflet interactive mapping system that mapview() accesses and the source of the spatial data used to construct the map, in this case OpenStreetMap.
mapview(pdx_sf)You can interact with the map in several ways. The default interactive map from this simple function call supports the following actions.
- Click on the map and then Drag the mouse to scroll the map contents up or down, right or left.
- Click on a zoom control buttons,
+or-, to zoom in or out, though using high zoom levels may download large amounts of spatial data. - Drag with two fingers on the trackpad or Right-click and drag to Zoom in or out.
- Click on the layer-control button
located under the zoom buttons, to switch between 5 different base maps.
- Hover the mouse over a plotted symbol to display the label of the plotted symbol.
- Click a plotted symbol to have a pop-up appear that lists the corresponding data.
- Click on the name of the input
sfdata frame at the bottom-right corner to re-center and resize the map back to the original setting.
Of course, there are more options to explore beyond that default interactive map. One advantage of using the geocode() function is that it can include custom information that provide that is then transferred to the simple features data frame, allowing further customization of the map.
In addition to locations, your geocode data file and the resulting simple features version of the file can contain other useful information for visualization, such as the average tuition cost at each school, the number of employees at company locations, and financial data, such as revenue, sales, and profitability.
Visualize any of these different variables on a map with the cex parameter. Here, create a bubble plot where the size of each plotted point represents the magnitude of the variable students, the enrollment at each university or college. Also, include the following parameters that further customize the resulting map.
label: When hovering the mouse over a plotted point, display the name of the value of the specified character variable at that location in place of the row number.layer.name: Name for the map, displayed in the upper right corner.col.regions: Interior color of the plotted regions, here points.
mapview(pdx_sf, cex="students", col.regions="red", label="school",
layer.name="Some Portland Universities")
Several freely available map types are available, which can be explored from the layer control button. The default map style display is CartoDB.Positron, as displayed in the preceding map. The five base map types available from the layer-control button are listed first.
Many other map styles are available, though they require registration or an API key. You can sign up for a free but limited-access tier at Stadia Maps. Many map types are also available from many different companies, see free options .
| Provider | Style |
|---|---|
| CartoDB.Positron | Clean, light, minimal |
| CartoDB.DarkMatter | Dark/black background |
| OpenStreetMap | Colorful, detailed |
| OpenTopoMap | Topographic map with contour shading |
| Esri.WorldImagery | High-resolution satellite imagery |
| Esri.WorldShadedRelief | Physical shaded relief (no labels) |
| Esri.WorldTopoMap | Topographic with relief, contours, labels |
| Esri.WorldStreetMap | Street map, roads and places |
| Esri.WorldPhysical | Physical map emphasizing landforms |
| Esri.NatGeoWorldMap | National Geographic cartographic style |
| Esri.WorldGrayCanvas | Light gray canvas for overlays |
Use the parameter map.types to explicitly specify one of those styles for the initial map display. The following is the map displaying the OpenStreetMap style, created by a collaborative effort of a community of volunteers and organizations.
mapview(pdx_sf, cex="students", col.regions="red", label="school",
layer.name="Some Portland Universities",
map.types="OpenStreetMap")Explicitly define the multiple map styles that appear when the layer-control button is clicked by defining a vector of map types, such as: c("OpenStreetMap", "OpenTopoMap").
Saving the Map
The map appears in the bottom-right window pane from the RStudio Viewer tab, the location for interactive visualizations. There are multiple controls in the Viewer from which you can manipulate and interact with the map. You can save the map as a self-contained web page from the Export tab, as shown in Figure 4. The web page can then be opened in any standard web browser and yet retains the full interactivity that is available within RStudio (or when running R by itself).
Italy
In this section, we create a map of Italy and highlight its largest cities, those with a population greater than 250,000. To draw this map with mapview(), we first obtain the geocodes for these cities, which we then convert to an sf data frame. Once the highlighted cities are identified as located within Italy, mapview() automatically downloads and draws the background spatial data for the map of Italy.
Access the Geocodes
In this example, we access publicly available geocodes already assigned to pre-designated locations. Several sources of these geocodes are available. One free, public-domain database, available at simplemaps.com, includes information on about 48,000 cities worldwide, including their geocodes. The following is the URL for downloading this geocoded file.
https://simplemaps.com/data/world-cities
Go to this URL and click the Download button for the Basic, free database in both Excel and .csv format.
Once downloaded, to read the data file into R you need to establish the location of where you downloaded the file. With the lessR function Read(), you have three possibilities. The choice is yours.
- Use
Read("")with empty quotes to browse for the data file. - Refer to a file by its full path name, stored somewhere on your computer’s file system.
- Download the data to the previously created
datafolder within the current folder from which your R session is active. (To verify within RStudio, click theFilestab in the lower-right window pane. If you click on that folder, you should see the indicated file.)
d <- Read("data/worldcities.csv")Data Types
------------------------------------------------------------
character: Non-numeric data values
integer: Numeric data values, integers only
double: Numeric data values with decimal digits
------------------------------------------------------------
Variable Missing Unique
Name Type Values Values Values First and last values
------------------------------------------------------------------------------------------
1 city character 48059 0 44512 Tokyo Jakarta ... Lupane Charlotte Amalie
2 city_ascii character 48059 0 44281 Tokyo Jakarta ... Lupane Charlotte Amalie
3 lat double 48059 0 38229 35.687 -6.175 ... -18.9315 18.342
4 lng double 48059 0 40934 139.7495 106.8275 ... 27.807 -64.9331
5 country character 48059 0 242 Japan ... U.S. Virgin Islands
6 iso2 character 48059 0 241 JP ID IN ... ZA ZW VI
7 iso3 character 48059 0 241 JPN IDN IND ... ZAF ZWE VIR
8 admin_name character 48059 0 4047 Tōkyō ... Virgin Islands
9 capital character 48059 0 4 primary primary ... admin primary
10 population double 47808 251 31109 37785000 33756000 ... NA NA
11 id integer 48059 0 48059 1392685764 ... 1850037473
------------------------------------------------------------------------------------------
From the output of Read(), we see that there are 11 variables and 48,059 rows of data (available in April 2026). Variables include the name of the city, the name of the country, country abbreviations, the name of states or equivalent entities within countries, and the latitude and longitude of each city, identified by variables lat and lng.
Subset the Geocodes Data
Because the goal is to create a map of Italy with its largest cities displayed, it is best to subset both the rows and the columns of the large downloaded data frame to obtain a more manageable data table.
- Include only rows that reference large Italian cities.
- Optional, but for clarity, include only the relevant variables needed to create this map.
The following call to the Base~R subset() function does this subsetting.
cities <- subset(d,
country=="Italy" & population > 250000,
select=c("city", "lat", "lng", "country", "population"))Subset the rows and columns of data frame d with the Base R function subset().
The first entered parameter is the data frame to subset, here d.
The second parameter is the logical condition used to extract rows. Here, subset city information for only the Italian cities with more than 250,000 inhabitants. The
&means and, so both conditions must be true simultaneously for the corresponding row of data to be extracted. The double equal sign,==, in the expressioncountry=="Italy"indicates a logical evaluation of the data values of variablecountry, selecting only those that equal"Italy". It is not an assignment statement.Parameter
select: The variables to extract from the full data frame. Several variables not needed for this analysis are included in the original geocodes data file. The mapping will work if we keep them, but retaining only the variables needed is cleaner, especially when we view the data to verify its integrity.Save the extracted data in a new data frame, named here cities.
cities city lat lng country population
290 Rome 41.8931 12.4828 Italy 2748109
554 Milan 45.4669 9.1900 Italy 1354196
813 Naples 40.8358 14.2486 Italy 913462
887 Turin 45.0792 7.6761 Italy 841600
1155 Palermo 38.1157 13.3613 Italy 630167
1303 Genoa 44.4072 8.9340 Italy 558745
1815 Bologna 44.4939 11.3428 Italy 387971
1916 Florence 43.7714 11.2542 Italy 360930
2170 Bari 41.1253 16.8667 Italy 316015
2200 Catania 37.5027 15.0873 Italy 311584
2621 Verona 45.4386 10.9928 Italy 255588
2678 Venice 45.4397 12.3319 Italy 250369
We now have the geocodes, the latitude and longitude of each Italian city with a population of more than 250,000, along with its population. This is the information we need to create the desired map of the 10 largest cities in Italy.
The downloaded file contains the variable population. However, for your specific purposes, you may wish to add other information to the file, such as financial information.
Not doing so here, but add additional variables to your geocoded data file with the Base R merge() function, illustrated in the section on choropleth maps. Or, write the geocoded data frame to your file system, read the geocoded file into Excel, and then manually add as many variables as desired.
To use the mapview() function to plot these 10 Italian cities, convert the geocoded data frame to an sf data frame. Again, use the coordinate reference system identified by the number 4326.
cities_sf <- st_as_sf(cities, coords=c("lng", "lat"), crs=4326)
cities_sfSimple feature collection with 12 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 7.6761 ymin: 37.5027 xmax: 16.8667 ymax: 45.4669
Geodetic CRS: WGS 84
First 10 features:
city country population geometry
290 Rome Italy 2748109 POINT (12.4828 41.8931)
554 Milan Italy 1354196 POINT (9.19 45.4669)
813 Naples Italy 913462 POINT (14.2486 40.8358)
887 Turin Italy 841600 POINT (7.6761 45.0792)
1155 Palermo Italy 630167 POINT (13.3613 38.1157)
1303 Genoa Italy 558745 POINT (8.934 44.4072)
1815 Bologna Italy 387971 POINT (11.3428 44.4939)
1916 Florence Italy 360930 POINT (11.2542 43.7714)
2170 Bari Italy 316015 POINT (16.8667 41.1253)
2200 Catania Italy 311584 POINT (15.0873 37.5027)
Create the Map
Following is a basic mapview() map of Italy and its most populous cities, the default map enhanced with some labeling for clarity.
mapview(cities_sf, label="city", layer.name="Italian Cities")Next, draw a map with red bubbles representing cities by population size. Set the parameter legend to FALSE to remove the title, which would otherwise occur at the top corner of the map. Hover over a city with the mouse to view the city’s name.
mapview(cities_sf,
label="city", col.regions="red", cex="population", legend=FALSE)The parameter cex sets the size of the plotted points according to the corresponding value of the specified variable. The same process applies to the parameter zcol, which plots a different color based on the value of the corresponding variable.
Physical Maps
Automatic Background
Portland
We already have the simple features data frame, pdx_sf, for the four Portland colleges and universities. To draw a topographic map that appears directly, we specify the map.types value as "Esri.NatGeoWorldMap", which provides a hybrid physical and political map.
mapview(pdx_sf, cex="students", col.regions="red", label="school",
layer.name="Some Portland Universities",
map.types="Esri.NatGeoWorldMap")A modern version of a topographic map is a satellite image, here from the math style Esri.WorldImagery.
mapview(pdx_sf, cex="students", col.regions="red", label="school",
layer.name="Some Portland Universities",
map.types="Esri.WorldImagery")For a satellite view, choose the map type of "Esri.WorldImagery". Or, leave off the map.types parameter and revert to the five interactive map possibilities. Hundreds of map types are available from many different companies, including more free options.
Manually zoom out on this map to see more geological features, such as the Pacific Ocean and the Cascade Mountains.
Italy
We have already constructed the simple features data frame for the Italian cities, cities_sf. To map, reference the data frame and a topographical style map, here "Esri.WorldTopoMap". Set the parameter legend to FALSE to remove the name of the input data frame, cities_sf and a red square from the top right corner.
mapview(cities_sf,
label="city", col.regions="yellowgreen", cex="population",
legend=FALSE, map.types="Esri.WorldTopoMap")Following is the satellite image of Italy from "Esri.WorldImagery".
mapview(cities_sf,
label="city", col.regions="darkgoldenrod2",
cex="population", legend=FALSE,
map.types="Esri.WorldImagery")Provide the Background
Previous examples of using mapview() involved plotting individual points, with the range of those plotted points determining the region of the map that mapview() provides. However, what if we do not wish to plot individual points, particularly on a physical map where we may be more interested in the terrain’s geographical features?
Italy
To create the following map, not from individual points but from a pre-existing sf data structure for an entire country, begin with in country information in the package rrnaturalearth.
italy_sf <- ne_countries(country="Italy")Setting the parameter "scale" to "large" provides the most detail and the most accurate mapping of the borders, here for the country of Italy. The setting of "large" also requires the accompanying higher resolution data from the package “rnaturalearthhires”.
From that sf data frame, we can draw the physical map, here as a satellite view from space.
mapview(italy_sf,
col.regions="transparent", alpha.regions=0,
border.col="gray25", lwd=1.2, legend=FALSE,
map.types = "Esri.WorldImagery")Ensure that the regions of the map, the various Italian provinces, are transparent so as not to obscure the underlying geographical features. Do so by specifying both their fill color as "transparent" with parameter col.regions and their transparency set to maximum by setting alpha.regions to 0. Increase the default border width to a dark gray of "gray25" with border.col, and use’ border. col’ to separate regions by setting the line width, lwd, to 1.2. "Esri.WorldImagery" is the satellite view of the Earth.
USA
The following shows how to obtain the physical map of the USA with the state boundaries, a hybrid of physical and political maps.
We need the polygons that define the state boundaries, which we obtain from the ne_states() function in the rnaturalearth package. Call the resulting simple features data frame states_sf. For ne_states, the default is to draw the boundaries at high resolution.
states_sf <- ne_states(country="United States of America")We have far more information in the resulting data frame than we want. There are 51 rows of data and 122 variables. First, we eliminate the territories of the USA by specifying that the variable type must equal the data value “State” or, to include Washington DC, “Federal district”. Second, we use parameter select to retain only three variables: the name of each state, the postal zip code, and the geometry that identifies the polygon and its coordinates that define each state boundary.
states_sf <- subset(states_sf,
type %in% c("State", "Federal district"),
select = c("name", "postal", "geometry")) With this reduced sf data frame, we can draw the physical map. Set the parameter alpha.regions to 0 so that only the boundaries of each state appear without filling in the interior with a color. We also see that the geometry variable does not describe individual points but individual polygons that define the state boundaries.
The following mapping function is the same as for the physical map of Italy except now we input the sf data frame states_sf.
mapview(states_sf,
col.regions="transparent", alpha.regions=0,
border.col="gray25", lwd=1.2, legend=FALSE,
map.types = "OpenTopoMap")
The default map shows the states of the United States, but it also includes information for the entire world, in part because the USA stretches from Hawaii and Alaska to Maine and Florida. Manually zoom in for more detail anywhere in the world, including the USA, which also includes the state boundaries.
Or, to directly show the map at appropriate scale, map only the contiguous states. So, remove Alaska and Hawaii from the sf data frame, and map them separately if desired.
The ! in R means not. That is, subset stats_sf to not contain rows of data in which the name is any element in the vector of Alaska and Hawaii. Name the reduced special features data frame states48_sf.
states48_sf <- subset(states_sf, !name %in% c("Alaska", "Hawaii"))Now focus the map on the contiguous 48 states.
mapview(states48_sf,
col.regions="transparent", alpha.regions=0,
border.col="gray25", lwd=1.2, legend=FALSE,
map.types = "OpenTopoMap")Choropleth Maps
A choropleth map visualizes the distribution of values for a variable across geographic regions.
A map that shades or colors geographic regions according to the value of a variable measured for each region.
In this example, a choropleth map can display sales revenue by state. For this example, we construct a choropleth map of the United States to demonstrate how income inequality varies across states. We measure income inequality with the Gini coefficient, a value that ranges from 0 to 1, with higher values indicating greater inequality.
To create a choropleth map, we need a variable with a data value for each state. Any variable with values that vary across states can be used. Multiple variables can be used to create multiple choropleth maps. Create a separate data file, such as an Excel file, that contains the state values for one or more variables. The constraint is that the state-name variable and its values must exactly match the corresponding variable and state names in the simple features data frame.
From the preceding listing of the first six rows of the sf data frame, states_sf, we see that the state-name variable is name. Accordingly, when constructing the file of Gini coefficients, use exactly the same state names and exactly the same variable name. Then add a second column that contains the Gini coefficient for each state.
The state Gini coefficients are available from the census.gov website. I downloaded and organized the Gini information into an Excel file with only two columns: the state name and corresponding Gini coefficient. Here, read the Gini information into data frame d_gini.
d_gini <- Read("https://dgerbing.github.io/data/Gini2023.xlsx",
quiet=TRUE)head(d_gini) name Gini
1 Alabama 0.4783
2 Alaska 0.4333
3 Arizona 0.4624
4 Arkansas 0.4807
5 California 0.4887
6 Colorado 0.4576
The Gini information and mapping information reside in separate data frames, which, by design, both share the same variable: name. Merge the Gini information into the states_sf sf data frame according to the state name in each data table.
Both data frames to be merged need a common variable by which to align them.
To merge, use the Base R function appropriately named merge(). List the names of the two data frames to merge, and then use the by parameter to specify the variable used for merging. The name of each state is name in both data tables, so that is the variable on which the merge is organized.
states_sf = merge(states_sf, d_gini, by="name")
head(states_sf)Simple feature collection with 6 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -179.1435 ymin: 30.22362 xmax: 179.7809 ymax: 71.4125
Geodetic CRS: WGS 84
name postal Gini geometry
1 Alabama AL 0.4783 MULTIPOLYGON (((-87.41958 3...
2 Alaska AK 0.4333 MULTIPOLYGON (((-141.0056 6...
3 Arizona AZ 0.4624 MULTIPOLYGON (((-111.0063 3...
4 Arkansas AR 0.4807 MULTIPOLYGON (((-90.30422 3...
5 California CA 0.4887 MULTIPOLYGON (((-114.7243 3...
6 Colorado CO 0.4576 MULTIPOLYGON (((-109.0463 4...
If the common variable shared by the two data frames by which to merge has different variable names, then use the by.x parameter for the variable name in the first specified data frame, and the by.y parameter to specify the variable name in the second specified data frame.
We now have the Gini coefficient by State and the state mapping information combined into a single sf data frame. Use the mapview() function here to create the interactive choropleth map. Specify the variable that shades the interior color of the individual states with the parameter zcol.
mapview(states_sf, zcol="Gini")Again, we can zoom and recenter the map. Or, directly focus on the contiguous 48 states by removing the Alaska and Hawaii rows of data from the simple features data frame.
states48_sf <- subset(states_sf, !name %in% c("Alaska", "Hawaii"))With this reduced data frame, draw the choropleth map of the 48 states of the continental US.
mapview(states48_sf, zcol="Gini")
Unfortunately, with the legend placed in the top-right corner, the legend obscures the northeastern portion of the United States. To cover less of the map, use parameter position to shift the legend to the bottom-right corner instead of the default position “topright”.
mapview(states48_sf, zcol="Gini", position="bottomright")That shift reduces the amount of the map covered by the legend, but some states are still obscured. Another option is to turn the legend off completely with legend=FALSE and instead rely on the map’s interactivity. Hover the mouse over any state to display its corresponding Gini coefficient.
mapview(states48_sf, zcol="Gini", legend=FALSE)Displayed in the lightest color, yellow, New York has the greatest income inequality. New York is the only state in yellow, which corresponds to the largest Gini coefficient, 0.515. In contrast, Utah has the darkest map color and so has the smallest Gini coefficient, 0.428. Because the map is interactive, hover the mouse over any state to read its exact Gini coefficient. Also, examining the previous map, Alaska has a low Gini coefficient, only slightly larger than Utah’s.
As always, to view the full mapview() manual and its options, enter ?mapview.
Summary
Follow these steps to create a map. Construct an sf data frame from which to construct your map. The simple features data frame contains the mapping data and the variables to be visualized in the resulting map.
If using custom locations, create a data file that contains your locations of interest, such as with Excel. If you have the data values of variables you wish to plot in addition to location data, include those variables as well.
Get the geocodes for your locations of interest.
- If geocodes for custom locations, use
geocode()from thetidygeocoderpackage. - Or, access an existing file, such as at
simplemaps.comor the rnaturalearth package, that already contains the needed locations and corresponding geocodes, then subset as needed.
- If geocodes for custom locations, use
If needed, transform your geocoded data frame into an
sfdata frame withst_as_sf()from thesfpackage.If you are creating a choropleth map:
- In addition to the state boundaries from the rnaturalearth package, create or access a second data file with the variable of interest and a variable with the same name, such as ID, as in the mapping file.
- Merge the variable of interest to plot the common variable with the
sfmapping information, such as with Base Rmerge().
Use
mapview()from themapviewpackage to create the map from thesfdata frame.