Chapter 3 World open data

Several inter-governmenatal organizations have statisics divisions working in dissiminating global statistical information and maintaining open data plateforms. The main inter-governmental organization working in promoting open data is the united nations. It includes various specialized agencies such as the World Bank Group, the World Health Organization, the World Food Program, UNESCO… This chapter presents an overview of data produced and published by these organization and lists various sources of open data within different thematics.

3.1 Sources

3.1.1 United Nations

The United Nations (UN) is an intergovernmental organization that aims to maintain international peace and security, develop friendly relations among nations, achieve international cooperation, and harmonize their actions. The UN system includes a multitude of specialized agencies, such as the World Bank Group, the World Health Organization, the World Food Programme, UNESCO, and UNICEF. The United Nations Statistics Division is committed to the advancement of the global statistical system. They compile and dissiminate global statistical information, develop standards and norms for statistical activities, and support countries’ efforts to strengthen their national statistical systems.

UN Comtrade Database: UN COMTRADE is the pseudonym for United Nations International Trade Statistics Database. It contains well over 3 billion data records since 1962 and is available publicly on the internet. In addition, it offers public and premium data API for easier integration/download. Time series of data for reporter countries starts as far back as 1962 and goes up to the most recent completed year. Current data is published annually in the International Trade Statistics Yearbook.
UNdata: UNdata is a web-based data service providing access to a variety of statistical resources compiled by the United Nations (UN) statistical system and other international agencies. Th UNdata contains 22 databases presenting various topics. Here is a sample of published databases:
- Energy statistics Database
- Environment Statistics Database
- Industrial Commodity Statistics Database
- International Financial Statistics
- World Tourism Data These data are produced from more than 20 international statistical sources compiled by the UN statistical system and other international agencies: Food and Agriculture Organization, International Labour Organization, International Monetary Fund, The World Bank, UNESCO Institute for Statistics, United Nations Children’s Fund, World Meteorological Organization, World Tourism Organization…
The UNESCO Institute for Statistics: The UNESCO Institute for Statistics (UIS) is the official and trusted source of internationally-comparable data on education, science, culture and communication. The UIS provides free access to data for all UNESCO countries and regional groupings from 1970 to the most recent year available. Databases are accessible via: data rowser, bulk data download service and API. Data is organized in different themes: Education (out-of-school childre, literacy, financial resources in education…), SDG 4 (sustaiable development goal indicators), equity, science and innovation, culture (culture employement, feature films, international trade in cultural goods…), communication and information, demographic and socio-economic.
Sustainable Development Goals: The Sustainable Development Goals (SDGs) are a collection of 17 global goals designed to be a “blueprint to achieve a better and more sustainable future for all”. The SDGs, set in 2015 by the United Nations General Assembly and intended to be achieved by the year 2030.

SDG indicators

The Sustainable Development Goals are: No Poverty, Zero Hunger, Good Health and Well-being, Quality Education, Gender Equality, Clean Water and Sanitation, Affordable and Clean Energy, Decent Work and Economic Growth, Industry, Innovation, and Infrastructure, Reducing Inequality, Sustainable Cities and Communities, Responsible Consumption and Production, Climate Action, Life Below Water, Life On Land, Peace, Justice, and Strong Institutions, and Partnerships for the Goals.

The United Nations provide platform of the Global SDG Indicators Database. This platform provides access to data compiled through the UN System in preparation for the Secretary-General’s annual report on “Progress towards the Sustainable Development Goals”.

We can also explore these indicators by the mean of SDG Tracker: a free, open-access resource where users can track and explore global and country-level progress towards each of the 17 Sustainable Development Goals through interactive data visualizations.

Sustainable Development Report Dashboards 2019

3.1.2 World Bank Open data

The World Bank Group is the largest development bank in the world and it is an observer at he United Nations Development Group. The World Bank collects and processes large amounts of data. These data have been made available to the public with different services and platforms.

Open Data Catalog: It is a listing of available World Bank datasets that we can browse by data type (Time Series, Microdata, and geospatial data) or by countries/regions.
- Time Series data: Datasets and Indicators level data that is a sequence of numbers collected at regular intervals over a period of time (Examples: Unemployement rate, labor force, urban population…)
- Microdata: Unit-level data obtained from sample surveys, censuses, and administrative systems (Examples: Employment And Welfare Survey, Demographic And Health Survey… )
- Geospatial: Data that has explicit geographic positioning information in vector or raster format (Examples: World - Terrain Elevation Above Sea Level, World Subnational Boundaries, World - Photovoltaic Power Potential…)
World Development Indicators: The World Development Indicators is a compilation of relevant, high-quality, and internationally comparable statistcs about global development and the fight against poverty. The database contains 1600 time series indicators for 217 economies and more than 40 country groups, with data for many indicators going back more than 50 years. Data is accessible by querying online th database using DataBank tool, by API or bulk downloads. The different datasets are categorized witin six themes: poverty and inequality, people, environment, eonomy, states and markets and global links. The following tables list some featured indicators provided in these themes.
- Poverty and shared prosperity: It presents indicators that measure progress toward the World Bank Group’s twin goals of ending extreme poverty by 2030 and promoting shared prosperity in every country in a sustainable manner.

library(knitr)
library(kableExtra)

kable(WDI_Poverty_and_Inequality[,-1]) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12) %>%
  scroll_box(width = "100%", height = "400px")

Sub.themes	Featured.indicators	Indicator.description
Poverty rates at national poverty lines	Poverty headcount ratio at national poverty lines	The percentage of the population living below the national poverty lines.
	Poverty gap at national poverty lines	The mean shortfall from the poverty lines (counting the nonpoor as having zero shortfall) as a percentage of the poverty lines
Poverty rates at international poverty lines	Poverty headcount ratio at $1.90 a day	The percentage of the population living on less than $1.90 a day.
Distribution of income or consumption	GINI index	It measures the extent to whoch the distribution of income among individuals within an economy deviates frome a perfectly equal distribution.
Shared prosperity	Annualized average growth rate in per capita real survey mean consumption or income	The annualized average growth rate in per capita real consumption or income of the bottom 40% of the population
	Survey mean consumption or income per capita, bottom 40% of population	Mean consumption or income per capita

People: It contains indicators for education, health, jobs, population, and gender that help capture the quality of people’s lives and provide a multidimensional portrait of the progress of societies.

kable(WDI_People[,-1]) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12) %>%
  scroll_box(width = "100%", height = "400px")

	Sub.themes	Featured.indicators	Indicator.description
7	Population dynamics	Population, total	Counting of all residents of countries.
8		Population growth (annual %)	The exponential rate of growth of midyear population from year t-1 to t.
9		Birth rate, crude	The number of live births occuring during the year, per 1000 population estimated at midyear.
10		Death rate, crude	The number of deaths occuring during the year, per 1000 population estimated at midyear.
11		Fertility rate, total	The number of children that would be born to a woman.
12		Life expectancy at birth	The number of years a newborn infant would live if prevailing patterns of mortality at the time of its birth were to stay the same throughout its life.
13	Education	Government expenditure on education	General government expenditure on education as a percentage of GDP.
14		School enrollment	Ratio of total enrollment (primary, secondary)
15		Literacy rate, youth total	The percentage of people ages 15-24 who can both read and write.
16	Labor	Labor force participation rate	The proportion of the population ages 15 and older that is economically active.
17		Emplyment by domain	The percentage of employment in agriculture, industry and services
18		Employment to population ratio	The proportionof a country’s population that is employed.
19	Health	Maternal mortality ratio	The number of women who die from pregnancy-related causes.
20		Mortality from CVD, cancer, diabetes or CRD between exact ages 30 and 70	The percent of 30 y.o people who would die from these diseases.
21		Mortality caused by road traffic injury	Road traffic fatal injury deaths per 100,000 population
22	Gender	Ratio of female to male labor force participation rate	Dividing female labor force participation rate by male labor force participation rate

Environment: It shows how the state of the planet, as well as our use of natural resources, and the observed impacts. It contains indicators about agriculture, climate, energy use, water ressources…

kable(WDI_Environment[,-1]) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12) %>%
  scroll_box(width = "100%", height = "300px")

	Sub.themes	Featured.indicators	Indicator.description
23	Agriculture	Agricultural land	The share of land area that is arable, under permanent crops, and under permanent pastures.
24	Climate	CO2 emissions	CO2 emission in tons per capita
25		PM2.5 air pollution	The average level of exposure of a nation’s population to concentrations of PM2.5.
26		Average precipitation in depth	The ong-term average in depth (over space and time) of annual precipitation in the country (mm per year)
27	Energy and mining	Energy intensity level of primary energy	The ratio between energy supply and gross domestic product. It is an indication of how much energy is used to produce one unit of economic output.
28		Renewable energy consumption	The share of renewables energy in total final enegy consumption
29		Renewable electricity output	The share of electricity generated by renewable power plants in total electricity generated by all types of plants
30	Environment	Forest area	Percentage of land under natural or planted stands of trees.
31		Total natural resources rents	The sum of oil rents, natural gas rents, coal rents, mineral rents, and forest rents.
32	Urban and rural development	People using at least basic drinking water service	The percentage of peaple using at least basic water services
33	Water and sanitation	Renewable internal freshwater resources per capita	Internal renewable resources (river flows and groundwater from rainfall)

Economy: The indicators within the Economy section allow us to analyze various aspects of both national and global economic activity. As countries produce goods and services, and consume these domestically or trade internationally, economic indicators measure levels and changes in the size and structure of different economies, and identify growth and contractions.

kable(WDI_Economy[,-1]) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12) %>%
  scroll_box(width = "100%", height = "400px")

	Sub.themes	Featured.indicators	Indicator.description
34	Growth and economic structure	GDP	The sum of gross value added by all resident producers in the economy.
35		GDP growth	Annual percentage growth rate of GDP at market prices.
36		Agriculture, industry, manifacturing, services: value added	Annual growth rate for agricultural, industry, manifacturing, services value added based on constant local currency.
37		Gross capital formation	Annual growth rate of gross capital formation
38		Exports of goods and services	Annual growth rate of exports of goods and services
39		Imports of goods and services	Annual growth rate of imports of goods and services
40	Income and savings	GNI per capita, Atlas method	The gross national income divided by the midyear population.
41		Gross savings	Gross national income less total consumption, plus net transfers.
42	Prices and terms of trade	Consumer price index	It reflects changes in the cost to the average consumer of acquiring a basket of goods and services
43	Labor and productivity	GDP per person employed	Gross domestic product divided by total employment in the economy.

States and markets: States and markets data measure the quality of the business environment, financial system development, transport infrastructure, ICT, science and technology, government, and policy performance.

kable(WDI_States_and_Markets[,-1]) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12) %>%
  scroll_box(width = "100%", height = "400px")

	Sub.themes	Featured.indicators	Indicator.description
44	Business environment	Time required to start a business	The number of calendar days needed to complete the procedures to legally operate a business.
45	Financial access and stability	Depositors with commercial bank	The reported number of deposit account holders at commercial banks.
46	Government finance and taxes	Revenue, excluding grants	Cash receipts from taxes, social contributions, and other revenues.
47		Expense	Cash payments for operating activities of the government in providing goods and services.
48	Military and fragile situations	Military expenditure	It includes all current and capital expenditures on the armed forces.
49		Armed forces personnel, total	Number of active military personnel.
50	Infrastructure and communications	Air transport, passengers carried	It includes both domestic and international aircraft passengers
51		Mobile cellular subscriptions	Number of mobile cellular subscriptions
52	Science and innovation	Research and development expenditure	Expenditure on research and development as a percent of GDP.
53		Scientific and technical journal articles	The number of sceintific and engineering articles published in sceintific journal.

Global links: The Global Links indicators provide an overview of the flows and associations that enable the world’s economy—and the economies of individual countries—to grow and expand. These indicators measure the size and direction of these flows, and document policy interventions such as tariffs, trade facilitation, and aid flows.

kable(WDI_Global_Links[,-1]) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12) %>%
  scroll_box(width = "100%", height = "400px")

	Sub.themes	Featured.indicators	Indicator.description
54	External debt	External debt stocks, total	The sum of public and private nonguaranteed long-term debt and short term debt.
55	Trade	Merchandise trade	The sum of merchandise exports and imports divided by the value of GDP.
56	Refugees	Refugee population by country or territory of asylum	Number of refugee population
57	Tourism	International tourism, receipts	Expenditures by international inbound visitors (% of total exports)
58	Migration	Net migration	The net total migrants

DataBank: DataBank is an analysis and visualisation tool that contains collections of time series data on a variety of topics. With this tool, we can create our own queries, generate tables, charts, and maps, and easily save, embed, and share them.
Open finances: Open Finances makes World Bank Group’s financials available for everybody to explore. They are orgnaized within different categories: loans and credits, financial reporting, shareholder equity, procurement, projects, budget…
Global Consumption Database: The Global Consumption Database is a one-stop source of data on household consumption patterns in developing countries. It is designed to serve a wide range of users—from researchers seeking data for analytical studies to businesses seeking a better understanding of the markets into which they are expanding or those they are already serving.
Microdata library: Provides acces to data collected through sample surveys of households, business establishments or other facilities.

3.1.3 World Health Organization

The Global Health Observatory data repository is an initiative of the World Health Organization to share data (through their website) on global health, including statistics by country and information about specific diseases and health measures. It provides access to over 1000 indicators on priority health topics including mortality and burden of diseases, the Millennium Development Goals (child nutrition, child health, maternal and reproductive health, immunization, HIV/AIDS, tuberculosis, malaria, neglected diseases, water and sanitation), non communicable diseases and risk factors, epidemic-prone diseases, health systems, environmental health, violence and injuries, equity among others.

We can use th WHO R package to collect the Global Health Observatory data.

library(WHO)
library(dplyr)
codes <- get_codes()
head(codes)

## # A tibble: 6 x 3
##   label     display                             url                             
##   <chr>     <chr>                               <chr>                           
## 1 MDG_0000~ Infant mortality rate (probability~ https://www.who.int/data/gho/in~
## 2 MDG_0000~ Adolescent birth rate (per 1000 wo~ https://www.who.int/data/gho/in~
## 3 MDG_0000~ Contraceptive prevalence (%)        https://www.who.int/data/gho/in~
## 4 MDG_0000~ Unmet need for family planning (%)  https://www.who.int/data/gho/in~
## 5 MDG_0000~ Under-five mortality rate (probabi~ https://www.who.int/data/gho/in~
## 6 MDG_0000~ Median availability of selected ge~ https://www.who.int/data/gho/in~

To find datasets based on specific thematics w can use regular expresssions for searching in existing datasets.

# searching for datasets related to "life expectancy"
codes[grepl("[Ll]ife expectancy", codes$display), ]

## # A tibble: 4 x 3
##   label      display                     url                                    
##   <chr>      <chr>                       <chr>                                  
## 1 WHOSIS_00~ Life expectancy at birth (~ https://www.who.int/data/gho/indicator~
## 2 WHOSIS_00~ Healthy life expectancy (H~ https://www.who.int/data/gho/indicator~
## 3 WHOSIS_00~ Life expectancy at age 60 ~ https://www.who.int/data/gho/indicator~
## 4 WHOSIS_00~ Healthy life expectancy (H~ https://www.who.int/data/gho/indicator~

Once, we identify a dataset code we can collect the data using get_data command.

# get Life expectancy at birth data
df <- get_data("WHOSIS_000001")
head(df)

## # A tibble: 6 x 7
##    year gho                          country sex       region publishstate value
##   <dbl> <chr>                        <chr>   <chr>     <chr>  <chr>        <dbl>
## 1  2001 Life expectancy at birth (y~ Rwanda  Both sex~ Africa Published     46.5
## 2  2002 Life expectancy at birth (y~ Rwanda  Male      Africa Published     46  
## 3  2004 Life expectancy at birth (y~ Rwanda  Female    Africa Published     54.6
## 4  2004 Life expectancy at birth (y~ Rwanda  Both sex~ Africa Published     51.9
## 5  2005 Life expectancy at birth (y~ Rwanda  Male      Africa Published     51  
## 6  2007 Life expectancy at birth (y~ Rwanda  Female    Africa Published     62.2

Now, we can plot the evolution of average life expectancy by region.

library(ggplot2)
df %>% 
  filter(sex == "Both sexes") %>% 
  group_by(region, year) %>%
  summarise(value = mean(value)) %>% 
  ggplot(aes(x = year, y = value, color = region, linetype = region)) +
  geom_line(size = 1) +
  theme_light(9) +
  labs(x = NULL, y = "Life expectancy at birth (years)\n", 
       linetype = NULL, color = NULL,
       title = "Evolution of life expectancy (by region)\n")

3.1.4 Food and Agriculture Organization of the United Nations

Food and Agriculture Organizationis responsible for the compilation, analysis and dissemination of a comprehensive variety of statistical data on food, agriculture and the sustainable management of natural resources. The organization provides free and unrestricted access to 16 major databases and produces publications with key statistical content covering different topics.

kable(FAO_opendata) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12) %>%
  scroll_box(width = "100%", height = "500px")

Themes	Database	Description
Agriculture	FAOSTAT	It is the world’s most comprehensice statsitical database on food, agriculture, fisheries, forestry, natural resources management and nutrition.
	Food and Agriculture Microdata	It provides an inventory of micro-level datasets collected through farm and household surveys which contain information related to agriculture, food security and nutrition.
	Food loss and Food waste database	It contains data and information from openly accessible reports and studies measuring food loss and waste across food products, stages of the value chain, and geographical areas.
Fishery and aquaculture	FishStat Database	It provides comprehensive fisheries and aquaculture statistics, structured within data collections on a global and regional basis.
Forestry	Global Forest Resources Assessment Database	It provides essential information for understanding the extent of forest resources, their condition, management and uses.
Food prices and market information	Agricultural Market Information System Database	It provides market information on four grains particularly important in international food markets: wheat, maize, rice and soybeans.
	Food Price Monitoring and Analysis	It contains latest information and analysis on domestic prices of basic foods mainly in developing countries, complementing FAO analysis on international markets.
Economic, social and rural development	AIDmonitor	It aims to improve understanding of Official Development Assistance (ODA) funding flows in food and agriculture-related sectors.
	Rural Livelihoods Information System	It provides indicators of agricultural income and rural livelihoods.
	Gender and Land Rights Database	A platform to highlight the major political, legal and cultural factors that influence the realisation of womenâ€™s land rights throughout the world.
Food security and nutrition	International Network of Food Data Systems	INFOODS is a worldwide network of food composition experts aiming at improving the quality, availability, comparability, reliability and use of food composition data.
	Global Individual Food consumption data	It provides microdata sets and ready-to-use indicators on individual food consumption, based on food consumption surveys conducted at national and subnational level.
Natural genetic and biodiversity resources	AQUASTAT	It provides comprehensive information related to water resources, water uses and agricultural water management across the world.
	Domestic Animal Diversity Information System	It provides access to searchable databases of breed-related information and photos and links to other online resources on livestock diversity.
	WIEWS	World onformation and early warning system on plant genetic resources for food and agriculture
Sustainable Development Goals	SDG datasets	FAOâ€™s Statistical Capacity Assessment survey for SDG Indicators provides insights about member countries’ national statistical systems in regard to their capacity to monitor and report the 21 SDG indicators: zero hunger, gender equality, clean water administration, responsable consumption and production, life below water, life onland…

Samples of FAO data:

Here is an example of an extraction of apple production data in France

kable(FAOSTAT_data[ ,c("Domain","Area","Element","Item","Year", "Unit","Value")]) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12) %>%
  scroll_box(width = "100%", height = "600px")

Domain	Area	Element	Item	Year	Unit	Value
Crops	France	Production	Apples	1961	tonnes	2142000
Crops	France	Production	Apples	1962	tonnes	5050000
Crops	France	Production	Apples	1963	tonnes	2581000
Crops	France	Production	Apples	1964	tonnes	3424000
Crops	France	Production	Apples	1965	tonnes	3675000
Crops	France	Production	Apples	1966	tonnes	2796000
Crops	France	Production	Apples	1967	tonnes	3846000
Crops	France	Production	Apples	1968	tonnes	3568000
Crops	France	Production	Apples	1969	tonnes	3377000
Crops	France	Production	Apples	1970	tonnes	3903000
Crops	France	Production	Apples	1971	tonnes	2967000
Crops	France	Production	Apples	1972	tonnes	2553000
Crops	France	Production	Apples	1973	tonnes	3265000
Crops	France	Production	Apples	1974	tonnes	2757000
Crops	France	Production	Apples	1975	tonnes	3104000
Crops	France	Production	Apples	1976	tonnes	2497000
Crops	France	Production	Apples	1977	tonnes	1687000
Crops	France	Production	Apples	1978	tonnes	3100000
Crops	France	Production	Apples	1979	tonnes	2464000
Crops	France	Production	Apples	1980	tonnes	2902000
Crops	France	Production	Apples	1981	tonnes	1437000
Crops	France	Production	Apples	1982	tonnes	1977000
Crops	France	Production	Apples	1983	tonnes	1965000
Crops	France	Production	Apples	1984	tonnes	2868000
Crops	France	Production	Apples	1985	tonnes	2349000
Crops	France	Production	Apples	1986	tonnes	2738000
Crops	France	Production	Apples	1987	tonnes	2388000
Crops	France	Production	Apples	1988	tonnes	2582000
Crops	France	Production	Apples	1989	tonnes	2340000
Crops	France	Production	Apples	1990	tonnes	2326000
Crops	France	Production	Apples	1991	tonnes	1283211
Crops	France	Production	Apples	1992	tonnes	2371861
Crops	France	Production	Apples	1993	tonnes	2050983
Crops	France	Production	Apples	1994	tonnes	2157952
Crops	France	Production	Apples	1995	tonnes	2063809
Crops	France	Production	Apples	1996	tonnes	1980069
Crops	France	Production	Apples	1997	tonnes	2018979
Crops	France	Production	Apples	1998	tonnes	1765375
Crops	France	Production	Apples	1999	tonnes	2133296
Crops	France	Production	Apples	2000	tonnes	2130274
Crops	France	Production	Apples	2001	tonnes	1882109
Crops	France	Production	Apples	2002	tonnes	1995306
Crops	France	Production	Apples	2003	tonnes	1721662
Crops	France	Production	Apples	2004	tonnes	1800177
Crops	France	Production	Apples	2005	tonnes	1829166
Crops	France	Production	Apples	2006	tonnes	1679328
Crops	France	Production	Apples	2007	tonnes	1781947
Crops	France	Production	Apples	2008	tonnes	1701752
Crops	France	Production	Apples	2009	tonnes	1729615
Crops	France	Production	Apples	2010	tonnes	1751269
Crops	France	Production	Apples	2011	tonnes	1762640
Crops	France	Production	Apples	2012	tonnes	1306333
Crops	France	Production	Apples	2013	tonnes	1688158
Crops	France	Production	Apples	2014	tonnes	1847551
Crops	France	Production	Apples	2015	tonnes	1968628
Crops	France	Production	Apples	2016	tonnes	1823123
Crops	France	Production	Apples	2017	tonnes	1695949
Crops	France	Production	Apples	2018	tonnes	1737412

unique(FAOSTAT_data$Item)

## [1] Apples
## Levels: Apples

summary(FAOSTAT_data$Value)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 1283211 1786504 2131785 2349757 2752250 5050000

library(ggplot2)
ggplot(FAOSTAT_data, aes(x = Year, y = Value)) +
  geom_line(alpha = 0.5)

3.1.5 The World Resources Institute

World Resources Institute (WRI) is a global research organization that spans more than 60 countries. Its mission consists of promoting environmental sustainability and human health and well-being. WRI provides open access to reliable data related to global environment issues (forests monitoring, climate change, water scarcity…). In addition, WRI produces maps, charts, and visual resources based on their datasets.

WOrld Greenhouse Gas EMission in 2016 (WRI)

Water, peace and security: The Water, Peace, and Security tool is a data platform that help identify water-related security risks, and allow stakeholders to take action at an early stage.
Energy Access Explorer: The Energy Access Explorer is an online, open-source, interactive platform that uses mapping to visualize the state of energy access in unserved and underserved areas. It analyzes credible and public data to make the connection between the demand and supply of energy.
Global Forest Watch: The Global Forest Watch iis an online platform that provides data and tools for monitoring forests. By harnessing cutting-edge technology, GFW allows anyone to access near real-time information about where and how forests are changing around the world.
Resource Watch: Resource Watch is an open data visualization platform with over 200 available data sets on topics ranging from climate change to human migration, deforestation to air quality, agriculture to energy and much more.
Climate Watch: Climate Watch is an online platform designed to empower policymakers, researchers, media and other stakeholders with the open climate data, visualizations and resources they need to gather insights on national and global progress on climate change.
PREPdata: PREPdata is a free, open-source data platform that provides the accessible, curated data that decision-makers need to analyze vulnerability and build climate resilience.
Forest Atlases: Forest Atlases are online platforms that help countries better manage their forest resources by combining government data with the latest forest monitoring technology.
LandMark: LandMark is the first online, interactive global platform to provide precise maps and other critical information on lands that are collectively held and used by Indigenous Peoples and local communities.
Aqueduct: Aqueduct enable users to measure, map, and mitigate water risks around the world with its open-source, and high-resolution platform. It uses multiple data layers to display water risks at any given location.

3.1.6 Earth Observation data

Remote sensing is a technology that aims to observe and study the Earth systems, and their dynamics. Data related to remote sensing, specifically satellite images, have a purpose to observe and study the Earth from space; its land surface, the oceans and the atmosphere. Earth observation data can be acquired through different public and private channels. Free of cost data is generally provided by public agencies. Here is a list of the main providers of open earth observation data:

European Space Agency (ESA): The European Space Agency is an intergovernmental organisation dedicated to the exploration of space. It manages Copernicus Earth Observation program that aims at chieving a global, continuous, autonomous, high quality, wide range Earth observation capacity. The data and information produced in the framework Copernicus are made available free-to-charge all its users and the public. The ESA different provides tools to download, visualize and process earth observation data.
- Sentinel open Access Hub
United States Geological Survey (USGS): The USGS is a scientific agency of the United STates government studying the the landscape of the United States, its natural resources, and the natural hazards that threaten it. It operates the landsat program with the NASA. Landsat is the longest-running enterprise for acquisition of satellite imagery of Earth. Landsat data products held in the USGS archives can be searched and downloaded at no charge from a variety of sources
National Aeronautics and Space Administration (NASA): The NASA is an independant agency of the United States Federal Government responsible for the civilian space program. NASA promotes the full and open sharing of all its data to research and applications communities, private industry, academia, and the general public by the mean of their Earth Observing System Data and Information System (EOSDIS). They provide different tools and products for seraching, using and visualizing remote sensing data:

Here is an example of visualizing downloaded Landsat data:

library(raster)
library(rgdal)
# Blue
b2 <- raster('data/rs/LC08_044034_20170614_B2.tif')
# Green
b3 <- raster('data/rs/LC08_044034_20170614_B3.tif')
# Red
b4 <- raster('data/rs/LC08_044034_20170614_B4.tif')
# Near Infrared (NIR)
b5 <- raster('data/rs/LC08_044034_20170614_B5.tif')

landsatRGB <- stack(b4, b3, b2)
plotRGB(landsatRGB, axes = TRUE, stretch = "lin", main = "Landsat True Color Composite")

3.1.7 Other resources

In addition to the previously cited organisations and services , plenty of other initiatives and services are providing and promoting the collection and usage of open data in a world wide scale.

Our World in data: Our World in data is a collaborative project developed by reserachers at the University of Oxford and the non profit organization “Global Exhange Data Lab” to publish open data and statistics covering various world developement indicators. The website contains different articles based on open data and describing topics like: health, demographic change, food, agriuculture, energy, environment…

Global CO2 emissions by world region since 1750 (Our World In Data)

Open knowledge FOundation (OKF): Open knowledge FOundation is a global, non-profit network that promotes and shares information at no charge, including both content and data.
- Comprehensive Knowledge Archive Network (CKAN): Open knowledge FOundation maintains the COmprehensive Knowledge Archive Network (CKAN) which is an open- source open data portal for the storage and distribution of open data. CKAN is used by differentnational and local governments such as: US data gov, european data portal, open data swiss, canadian data gov… Once the data is published, users can use its faceted search features to browse and find the data they need, and preview it using maps, graphs and tables – whether they are developers, journalists, researchers, NGOs, citizens, or even your own staff.
- GLobal Open Index: The Global Open Data Index (GODI) is the annual global benchmark for publication of open government data, run by the Open Knowledge Network.

Open Data Index by country/thematic

DBnomics: DBnomics is an oen platform aggregating global economic data and providing direct access to these data via one API. Examples of data provides: European Central Bank, Eurostat, Internationa Monetary Fund, International Labour Organization, WOrld Bank…

3.2 Tools

3.2.1 Extracting World Bank Data

There are different R and Python libraries providing access to World Bank data. We will test two of them in this chapter: WDI and Wbstats.

3.2.1.1 WDI

Maternal mortality ratio

The WDI library provides easy access to the WorldBank open data. We can use the command WDIsearch for looking up an indicator.

library(WDI)

WDIsearch(string = "maternal mortality ratio",
          field = "name",
          short = FALSE)

##      indicator       
## [1,] "SH.STA.MMRT.NE"
## [2,] "SH.STA.MMRT"   
##      name                                                                   
## [1,] "Maternal mortality ratio (national estimate, per 100,000 live births)"
## [2,] "Maternal mortality ratio (modeled estimate, per 100,000 live births)" 
##      description                                                                                                                                                                                                                                                                                                                                                      
## [1,] "Maternal mortality ratio is the number of women who die from pregnancy-related causes while pregnant or within 42 days of pregnancy termination per 100,000 live births."                                                                                                                                                                                       
## [2,] "Maternal mortality ratio is the number of women who die from pregnancy-related causes while pregnant or within 42 days of pregnancy termination per 100,000 live births. The data are estimated with a regression model using information on the proportion of maternal deaths among non-AIDS deaths in women ages 15-49, fertility, birth attendants, and GDP."
##      sourceDatabase                
## [1,] "World Development Indicators"
## [2,] "World Development Indicators"
##      sourceOrganization                                                                                                                                                     
## [1,] "UNICEF, State of the World's Children, Childinfo, and Demographic and Health Surveys."                                                                                
## [2,] "WHO, UNICEF, UNFPA, World Bank Group, and the United Nations Population Division. Trends in Maternal Mortality: 1990 to 2015. Geneva, World Health Organization, 2015"

Once we get the label of the desired indicator, we can use the command WDI() to extract the data by specifying the indicator name, the list of countries, and the sart/end dates.

df <- data.frame(
WDI(country = c("US",
                "BR",
                "ZA"),
    indicator = "SH.STA.MMRT",
    start = 2005,
    end = 2015,
    extra = TRUE))
head(df)

##   iso2c country SH.STA.MMRT year iso3c                    region  capital
## 1    BR  Brazil          63 2015   BRA Latin America & Caribbean Brasilia
## 2    BR  Brazil          62 2014   BRA Latin America & Caribbean Brasilia
## 3    BR  Brazil          61 2013   BRA Latin America & Caribbean Brasilia
## 4    BR  Brazil          60 2012   BRA Latin America & Caribbean Brasilia
## 5    BR  Brazil          61 2011   BRA Latin America & Caribbean Brasilia
## 6    BR  Brazil          65 2010   BRA Latin America & Caribbean Brasilia
##   longitude latitude              income lending
## 1  -47.9292 -15.7801 Upper middle income    IBRD
## 2  -47.9292 -15.7801 Upper middle income    IBRD
## 3  -47.9292 -15.7801 Upper middle income    IBRD
## 4  -47.9292 -15.7801 Upper middle income    IBRD
## 5  -47.9292 -15.7801 Upper middle income    IBRD
## 6  -47.9292 -15.7801 Upper middle income    IBRD

We can filter the extratced data using filter command of dplyr package.

library(dplyr)
brazil <- df %>% filter(iso2c == "BR")
usa <- df %>% filter(iso2c == "US")
head(usa)

##   iso2c       country SH.STA.MMRT year iso3c        region         capital
## 1    US United States          18 2015   USA North America Washington D.C.
## 2    US United States          16 2014   USA North America Washington D.C.
## 3    US United States          16 2013   USA North America Washington D.C.
## 4    US United States          16 2012   USA North America Washington D.C.
## 5    US United States          15 2011   USA North America Washington D.C.
## 6    US United States          15 2010   USA North America Washington D.C.
##   longitude latitude      income        lending
## 1   -77.032  38.8895 High income Not classified
## 2   -77.032  38.8895 High income Not classified
## 3   -77.032  38.8895 High income Not classified
## 4   -77.032  38.8895 High income Not classified
## 5   -77.032  38.8895 High income Not classified
## 6   -77.032  38.8895 High income Not classified

Now, we can use plotly library for producing an interactive visualization of the evolution of our indicator in different countries.

library(plotly)

trace_brazil = brazil$SH.STA.MMRT
trace_usa = usa$SH.STA.MMRT

dts = brazil$year

df.plot <- data.frame(dts, trace_brazil, trace_usa)

p1 <- plot_ly(df.plot,
              x = ~dts,
              y = ~trace_brazil,
              name = "Brazil",
              type = "scatter",
              mode = "lines+markers") %>% 
  add_trace(y = ~trace_usa,
            name = "USA",
            type = "scatter",
            mode = "lines+markers") %>% 
  layout(title = "Maternal mortality per 100,000 births",
         xaxis = list(title = "Year",
                      zeroline = FALSE),
         yaxis = list(title = "Count",
                      zeroline = FALSE))
p1

GDP Indicator

Since, there are a plenty of indicators related to gdp, we can use regular expressions for enhancing our search.

WDIsearch("gdp.*capita.*constant")

##      indicator             
## [1,] "6.0.GDPpc_constant"  
## [2,] "NY.GDP.PCAP.PP.KD.87"
## [3,] "NY.GDP.PCAP.PP.KD"   
## [4,] "NY.GDP.PCAP.KN"      
## [5,] "NY.GDP.PCAP.KD"      
##      name                                                  
## [1,] "GDP per capita, PPP (constant 2011 international $) "
## [2,] "GDP per capita, PPP (constant 1987 international $)" 
## [3,] "GDP per capita, PPP (constant 2011 international $)" 
## [4,] "GDP per capita (constant LCU)"                       
## [5,] "GDP per capita (constant 2010 US$)"

Now, let’s download the data corresponding to “GDP per capita (constant 2000 US$)”: the indicator name is NY.GDP.PCAP.KD. We can rename directly the indicator as gdp_per_capita.

dat = WDI(indicator=c("gdp_per_capita" = "NY.GDP.PCAP.KD"), 
          country=c('MX','CA','US'), 
          start=1960, 
          end=2012)
head(dat)

##   iso2c country gdp_per_capita year
## 1    CA  Canada       48788.33 2012
## 2    CA  Canada       48466.85 2011
## 3    CA  Canada       47450.32 2010
## 4    CA  Canada       46542.90 2009
## 5    CA  Canada       48497.56 2008
## 6    CA  Canada       48536.53 2007

We can plot the data for the 3 selected countries

library(ggplot2)
ggplot(dat, 
       aes(year, gdp_per_capita, color=country)) + 
  geom_line() + 
  xlab('Year') + 
  ylab('GDP per capita')

Life expctancy:

Let’s find the IDs for the following indicators: life expectancy, GDP, and infant mortality.

WDIsearch("gdp.*capita.*US\\$")

##      indicator        name                                
## [1,] "NY.GDP.PCAP.KD" "GDP per capita (constant 2010 US$)"
## [2,] "NY.GDP.PCAP.CD" "GDP per capita (current US$)"

WDIsearch("life expectancy at birth.*total")

##                                 indicator 
##                          "SP.DYN.LE00.IN" 
##                                      name 
## "Life expectancy at birth, total (years)"

WDIsearch("^mortality.*rate.*infant")

##       indicator          
##  [1,] "HF.DYN.IMRT.IN.Q5"
##  [2,] "HF.DYN.IMRT.IN.Q4"
##  [3,] "HF.DYN.IMRT.IN.Q3"
##  [4,] "HF.DYN.IMRT.IN.Q2"
##  [5,] "HF.DYN.IMRT.IN.Q1"
##  [6,] "HF.DYN.IMRT.IN"   
##  [7,] "SP.DYN.IMRT.MA.IN"
##  [8,] "SP.DYN.IMRT.IN"   
##  [9,] "SP.DYN.IMRT.FE.IN"
##       name                                                          
##  [1,] "Mortality rate, infant (per 1,000 live births): Q5 (highest)"
##  [2,] "Mortality rate, infant (per 1,000 live births): Q4"          
##  [3,] "Mortality rate, infant (per 1,000 live births): Q3"          
##  [4,] "Mortality rate, infant (per 1,000 live births): Q2"          
##  [5,] "Mortality rate, infant (per 1,000 live births): Q1 (lowest)" 
##  [6,] "Mortality rate, infant (per 1,000 live births)"              
##  [7,] "Mortality rate, infant, male (per 1,000 live births)"        
##  [8,] "Mortality rate, infant (per 1,000 live births)"              
##  [9,] "Mortality rate, infant, female (per 1,000 live births)"

Now we download the data

wdi_dat <- WDI(indicator = c("GDP" = "NY.GDP.PCAP.KD", 
                             "life_expectancy" = "SP.DYN.LE00.IN", 
                             "infant_mortality" = "SP.DYN.IMRT.IN"), 
               start = 1960, end = 2015, 
               extra = TRUE)
head(wdi_dat)

##   iso2c    country year GDP life_expectancy infant_mortality iso3c     region
## 1    1A Arab World 1960  NA        46.54691               NA   ARB Aggregates
## 2    1A Arab World 1965  NA        49.49648               NA   ARB Aggregates
## 3    1A Arab World 1966  NA        50.07295               NA   ARB Aggregates
## 4    1A Arab World 1972  NA        53.40689               NA   ARB Aggregates
## 5    1A Arab World 1973  NA        54.01585               NA   ARB Aggregates
## 6    1A Arab World 1961  NA        47.14162               NA   ARB Aggregates
##   capital longitude latitude     income    lending
## 1                            Aggregates Aggregates
## 2                            Aggregates Aggregates
## 3                            Aggregates Aggregates
## 4                            Aggregates Aggregates
## 5                            Aggregates Aggregates
## 6                            Aggregates Aggregates

We can remove the entries with aggregated regional values

wdi_dat <- subset(wdi_dat, region != "Aggregates")

Let’s plot the distribution of infant mortality and GDP levels

library(ggplot2)
ggplot(subset(wdi_dat, year == 2008), 
       aes(x = GDP, y = infant_mortality)) + geom_point()

We can compare poverty in America with some world’s poor country, we can reproduce the graph presented by the Stanford Center on Poverty & Inequality article

# select countries
country_list = c("United States", "Rwanda", "Mongolia", "Pakistan", "Lao PDR", "Bhutan", 
                 "Malaysia", "Brazil", "Ireland", "Japan", "Sweden", "Netherlands")
# subset data for the listed countries
lifexp_countries <- subset(wdi_dat, 
                           country %in% country_list)
# plot data
ggplot(subset(lifexp_countries, year == "2008"), 
       aes(x = GDP, y = life_expectancy, color = country == "United States")) + 
  geom_point() +  
  geom_text(aes(label = country), size=3, nudge_y = 0.4) +
  scale_x_continuous(limits = c(0, 70000))

We can follow the evoltion of these indicators in the selected countries using an interactive graph

library(plotly)

g <- ggplot(subset(lifexp_countries, year > "1999" & year <= "2015"), 
            aes(x = GDP, y = life_expectancy, color = country == "United States", tooltip = country)) + 
  geom_point() + 
  facet_wrap(~ year) + 
  theme(legend.position="none")

ggplotly(g)

3.2.1.2 Wbstats

Like WDI package, the wbstats library provide an interface for querying the World Bank API. In this section, we present some examples of using different commands for downloading and manipulating the world bank data. wbstats provides a cached version of world bank data called wb_cachelist

library(wbstats)
str(wb_cachelist, max.level = 1)

## List of 7
##  $ countries  :'data.frame': 304 obs. of  18 variables:
##  $ indicators :'data.frame': 16978 obs. of  7 variables:
##  $ sources    :'data.frame': 43 obs. of  8 variables:
##  $ datacatalog:'data.frame': 238 obs. of  29 variables:
##  $ topics     :'data.frame': 21 obs. of  3 variables:
##  $ income     :'data.frame': 7 obs. of  3 variables:
##  $ lending    :'data.frame': 4 obs. of  3 variables:

# We can use updated data and  last ersions of indicators by saving the list returned from `wbcache`
new_cache <- wbcache()

We use the command wbsearch for seraching the indicators based on regular expressions and text patterns

# 'poverty' OR 'unemployment' OR 'employment'
povemply_vars <- wbsearch(pattern = "poverty|unemployment|employment")
head(povemply_vars)

##    indicatorID
## 35   WP15177.9
## 36   WP15177.8
## 37   WP15177.7
## 38   WP15177.6
## 39   WP15177.5
## 40   WP15177.4
##                                                                                        indicator
## 35         Received government transfers in the past year, income, richest 60% (% ages 15+) [w2]
## 36         Received government transfers in the past year, income, poorest 40% (% ages 15+) [w2]
## 37 Received government transfers in the past year, secondary education or more (% ages 15+) [w2]
## 38   Received government transfers in the past year, primary education or less (% ages 15+) [w2]
## 39                Received government transfers in the past year, older adults (% ages 25+) [w2]
## 40              Received government transfers in the past year, young adults (% ages 15-24) [w2]

we download the data with wb() command. We can specify a subset of countries

# Population, total
# country values: iso3c, iso2c, regionID, adminID, incomeID
pop_data <- wb(country = c("ABW","AF", "SSF", "ECA", "NOC"),
               indicator = "SP.POP.TOTL", startdate = 2000, enddate = 2018)
head(pop_data)

##   iso3c date  value indicatorID         indicator iso2c country
## 1   ABW 2018 105845 SP.POP.TOTL Population, total    AW   Aruba
## 2   ABW 2017 105366 SP.POP.TOTL Population, total    AW   Aruba
## 3   ABW 2016 104872 SP.POP.TOTL Population, total    AW   Aruba
## 4   ABW 2015 104341 SP.POP.TOTL Population, total    AW   Aruba
## 5   ABW 2014 103774 SP.POP.TOTL Population, total    AW   Aruba
## 6   ABW 2013 103159 SP.POP.TOTL Population, total    AW   Aruba

we can download data with multiple indicators in a long data format or a wide formar by specifying the parameter return_wide = TRUE

pop_gdp_wide <- wb(country = c("US", "NO"), 
                   indicator = c("SP.POP.TOTL", "NY.GDP.MKTP.CD"),
                   startdate = 1971, enddate = 1971, 
                   return_wide = TRUE)

head(pop_gdp_wide)

##   iso3c date iso2c       country NY.GDP.MKTP.CD SP.POP.TOTL
## 1   NOR 1971    NO        Norway   1.458311e+10     3903039
## 2   USA 1971    US United States   1.164850e+12   207661000

When we don’t know the latest date for indicator availability in specified country, we can use the mrv command that returns the latest available entry.

eg_data <- wb(country = c("US"), indicator = 'EG.ELC.ACCS.ZS', mrv = 1)
eg_data

##   iso3c date value    indicatorID                               indicator iso2c
## 1   USA 2018   100 EG.ELC.ACCS.ZS Access to electricity (% of population)    US
##         country
## 1 United States

When we have missed values between actual observations, we can use the command gapfill to fill in these values with last observed ones

eg_data <- wb(country = c("IN"), indicator = 'EG.ELC.ACCS.ZS', mrv = 10, gapfill = TRUE)
eg_data

##    iso3c date    value    indicatorID                               indicator
## 1    IND 2019 95.23586 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 2    IND 2018 95.23586 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 3    IND 2017 92.60000 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 4    IND 2016 89.67192 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 5    IND 2015 88.00000 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 6    IND 2014 83.53472 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 7    IND 2013 80.79154 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 8    IND 2012 79.90000 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 9    IND 2011 67.60000 EG.ELC.ACCS.ZS Access to electricity (% of population)
## 10   IND 2010 76.30000 EG.ELC.ACCS.ZS Access to electricity (% of population)
##    iso2c country
## 1     IN   India
## 2     IN   India
## 3     IN   India
## 4     IN   India
## 5     IN   India
## 6     IN   India
## 7     IN   India
## 8     IN   India
## 9     IN   India
## 10    IN   India

We can use the command POSIXct = TRUE in order to obtain datetime formats that make easier the manipulation and plotting time series data

pop_data <- wb(country = c("US","SSF", "ECA", "NOC"),
               indicator = "SP.POP.TOTL", 
               startdate = 2000, enddate = 2018,
               POSIXct = TRUE, freq = "M")
head(pop_data)

##   iso3c date     value indicatorID         indicator iso2c
## 1   ECA 2018 416944914 SP.POP.TOTL Population, total    7E
## 2   ECA 2017 414916897 SP.POP.TOTL Population, total    7E
## 3   ECA 2016 412611566 SP.POP.TOTL Population, total    7E
## 4   ECA 2015 410124886 SP.POP.TOTL Population, total    7E
## 5   ECA 2014 407578609 SP.POP.TOTL Population, total    7E
## 6   ECA 2013 405078322 SP.POP.TOTL Population, total    7E
##                                         country    date_ct granularity
## 1 Europe & Central Asia (excluding high income) 2018-01-01      annual
## 2 Europe & Central Asia (excluding high income) 2017-01-01      annual
## 3 Europe & Central Asia (excluding high income) 2016-01-01      annual
## 4 Europe & Central Asia (excluding high income) 2015-01-01      annual
## 5 Europe & Central Asia (excluding high income) 2014-01-01      annual
## 6 Europe & Central Asia (excluding high income) 2013-01-01      annual

ggplot(pop_data, aes(x = date_ct, y = value, colour = country)) + geom_line(size = 1) +
  labs(title = "Total population", x = "Date", y = "Population")

3.2.2 Extracting DBnomics Data

DBnomics is an open-source project with the goal of aggregating the world’s economic data in one location, free of charge to the public. DBnomics covers hundreds of millions of series from international and national institutions (Eurostat, World Bank, IMF, …). We will use in this section the rdbnomics which is an R package providing access to DBnomics data series.

In order to get dbnomics data we have to specify a serie identifier (ids) defined by three values formatted in this way: provider_code/dataset_code/series/code. We can find these infrmation in the DBnomics and DBnomics API documentation. Here is an example of fetching unemployemet rate dataset (called ZUTN) from the AMECO provider.

library(rdbnomics)
library(data.table)
library(magrittr)
library(dplyr)
library(ggplot2)

# get two series of unemployement rate data (europe and danemark) 
df = rdb(ids = c('AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'AMECO/ZUTN/DNK.1.0.0.0.ZUTN')) %>% 
  filter(!is.na(value))
head(df)

##   @frequency   Country dataset_code
## 1     annual Euro area         ZUTN
## 2     annual Euro area         ZUTN
## 3     annual Euro area         ZUTN
## 4     annual Euro area         ZUTN
## 5     annual Euro area         ZUTN
## 6     annual Euro area         ZUTN
##                                                     dataset_name freq Frequency
## 1 Unemployment rate: total :- Member States: definition EUROSTAT    a  Annually
## 2 Unemployment rate: total :- Member States: definition EUROSTAT    a  Annually
## 3 Unemployment rate: total :- Member States: definition EUROSTAT    a  Annually
## 4 Unemployment rate: total :- Member States: definition EUROSTAT    a  Annually
## 5 Unemployment rate: total :- Member States: definition EUROSTAT    a  Annually
## 6 Unemployment rate: total :- Member States: definition EUROSTAT    a  Annually
##    geo          indexed_at original_period original_value     period
## 1 ea19 2020-05-07 08:46:14            1997           11.5 1997-01-01
## 2 ea19 2020-05-07 08:46:14            1998             11 1998-01-01
## 3 ea19 2020-05-07 08:46:14            1999           10.2 1999-01-01
## 4 ea19 2020-05-07 08:46:14            2000            9.2 2000-01-01
## 5 ea19 2020-05-07 08:46:14            2001            8.4 2001-01-01
## 6 ea19 2020-05-07 08:46:14            2002            8.6 2002-01-01
##   provider_code       series_code
## 1         AMECO EA19.1.0.0.0.ZUTN
## 2         AMECO EA19.1.0.0.0.ZUTN
## 3         AMECO EA19.1.0.0.0.ZUTN
## 4         AMECO EA19.1.0.0.0.ZUTN
## 5         AMECO EA19.1.0.0.0.ZUTN
## 6         AMECO EA19.1.0.0.0.ZUTN
##                                                series_name
## 1 Annually – (Percentage of active population) – Euro area
## 2 Annually – (Percentage of active population) – Euro area
## 3 Annually – (Percentage of active population) – Euro area
## 4 Annually – (Percentage of active population) – Euro area
## 5 Annually – (Percentage of active population) – Euro area
## 6 Annually – (Percentage of active population) – Euro area
##                              unit                              Unit value
## 1 percentage-of-active-population (Percentage of active population)  11.5
## 2 percentage-of-active-population (Percentage of active population)  11.0
## 3 percentage-of-active-population (Percentage of active population)  10.2
## 4 percentage-of-active-population (Percentage of active population)   9.2
## 5 percentage-of-active-population (Percentage of active population)   8.4
## 6 percentage-of-active-population (Percentage of active population)   8.6

# plot
ggplot(df,
       aes(x = period, y = value, color = series_name)) +
  geom_line(size = 1.2) +
  geom_point(size = 2) +
  dbnomics()

There is another way to get time series by using code mask notation. Here is an example of extracting the balance of payement data (BOP) from IMF provider.

# get data of France
df <- rdb('IMF', 'BOP', mask = 'A.FR.BCA_BP6_EUR') %>%
  filter(!is.na(value))
unique(df$REF_AREA)

## [1] "FR"

# get data of france and spain
df <- rdb('IMF', 'BOP', mask = 'A.FR+ES.BCA_BP6_EUR') %>%
  filter(!is.na(value))
unique(df$REF_AREA)

## [1] "ES" "FR"

# plot
ggplot(df, aes(x = period, y = value, color = series_name)) +
  geom_line(size = 1.2) +
  geom_point(size = 2) +
  dbnomics()

We can extract data using directly the API link provided in DBnomics webstie.

df <- rdb(api_link = "https://api.db.nomics.world/v22/series?observations=1&series_ids=BOE/6008/RPMTDDC,BOE/6231/RPMTBVE") %>%
  filter(!is.na(value))

ggplot(df, aes(x = period, y = value, color = series_name)) +
  geom_line(size = 1.2) +
  geom_point(size = 2) +
  dbnomics()

3.2.3 Other tools

We can find a plenty of other tools and libraries enabling the extraction of world scale open data from different sources. This article presents a complete overview of these tools.

tidycensus: Load US Census Boundary and Attribute Data as ‘tidyverse’ and ‘sf’-Ready Data Frames
IMFData: R Interface for International Monetary Fund(IMF) Data API
OECD: Search and Extract Data from the OECD
rWBclimate: R interface for the World Bank climate data
getlandsat: Get Landsat 8 Data from Amazon Public Data Sets