General statistics: Coronavirus Confirmed Cases Around The Globe and the overall Coronavirus Cases breakdown
Plotting 5 charts for each country: Coronavirus Confirmed Cases, Daily New Coronavirus Confirmed Cases, Coronavirus Deaths, Daily New Coronavirus Deaths, and Active Coronavirus Cases
2.1. USA (The Leader)
2.2. China (The Origin)
2.3. UK (The Mutant)
2.4. Italy (The Early Chaos)
2.5. India (The Midway Chaos)
2.6. Australia (The Latest Chaos)
2.7. France (My Country of Residence)
Plotting 1 chart for each continent: Coronavirus Confirmed Cases for each country
3.1. Asia
3.2. Europe
3.3. Africa
3.4. North America
3.5. South America
3.6. Australia/Oceania
Finding the countries most affected by COVID
Current and Historical Distribution of Active Cases
Plotting charts to show the distribution of COVID around the world
Import the necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objs as go
import os
import math
Import the date utilities
from datetime import datetime, timedelta
Read the daily data from the CSV file and display the DataFrame
# The dates are ISO formatted (YYYY-MM-DD), so parse_dates handles them directly;
# the date_parser argument is deprecated since pandas 2.0.
df = pd.read_csv('data/worldometer_coronavirus_daily_data.csv', parse_dates=['date'])
df
| | date | country | cumulative_total_cases | daily_new_cases | active_cases | cumulative_total_deaths | daily_new_deaths |
|---|---|---|---|---|---|---|---|
| 0 | 2020-02-15 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN |
| 1 | 2020-02-16 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN |
| 2 | 2020-02-17 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN |
| 3 | 2020-02-18 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN |
| 4 | 2020-02-19 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 184782 | 2022-05-10 | Zimbabwe | 248642.0 | 106.0 | 963.0 | 5481.0 | 2.0 |
| 184783 | 2022-05-11 | Zimbabwe | 248778.0 | 136.0 | 1039.0 | 5481.0 | 0.0 |
| 184784 | 2022-05-12 | Zimbabwe | 248943.0 | 165.0 | 1158.0 | 5481.0 | 0.0 |
| 184785 | 2022-05-13 | Zimbabwe | 249131.0 | 188.0 | 1283.0 | 5482.0 | 1.0 |
| 184786 | 2022-05-14 | Zimbabwe | 249206.0 | 75.0 | 1307.0 | 5482.0 | 0.0 |
184787 rows × 7 columns
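To see what parsing the `date` column at read time buys, here is a minimal sketch on an in-memory CSV. The three sample rows and the names `csv_text` and `sample` are made up for illustration; only the column layout follows the daily file above.

```python
import io
import pandas as pd

# Hypothetical three-row sample in the same layout as the daily CSV.
csv_text = """date,country,cumulative_total_cases
2020-02-15,Afghanistan,0
2020-02-16,Afghanistan,0
2022-05-14,Zimbabwe,249206
"""

# parse_dates converts the ISO-formatted strings to timestamps while reading.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# The column is now a datetime type, so max() and strftime() work on it.
print(sample["date"].dtype)
print(sample["date"].max().strftime("%Y-%m-%d"))
```

With a datetime column, calls such as `df.date.max().strftime(...)` used later in the notebook work without any further conversion.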
Read the per-country summary data from the CSV file and display the DataFrame
df_summary = pd.read_csv('data/worldometer_coronavirus_summary_data.csv')
df_summary
| | country | continent | total_confirmed | total_deaths | total_recovered | active_cases | serious_or_critical | total_cases_per_1m_population | total_deaths_per_1m_population | total_tests | total_tests_per_1m_population | population |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 179267 | 7690.0 | 162202.0 | 9375.0 | 1124.0 | 4420 | 190.0 | 951337.0 | 23455.0 | 40560636 |
| 1 | Albania | Europe | 275574 | 3497.0 | 271826.0 | 251.0 | 2.0 | 95954 | 1218.0 | 1817530.0 | 632857.0 | 2871945 |
| 2 | Algeria | Africa | 265816 | 6875.0 | 178371.0 | 80570.0 | 6.0 | 5865 | 152.0 | 230861.0 | 5093.0 | 45325517 |
| 3 | Andorra | Europe | 42156 | 153.0 | 41021.0 | 982.0 | 14.0 | 543983 | 1974.0 | 249838.0 | 3223924.0 | 77495 |
| 4 | Angola | Africa | 99194 | 1900.0 | 97149.0 | 145.0 | NaN | 2853 | 55.0 | 1499795.0 | 43136.0 | 34769277 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 221 | Wallis And Futuna Islands | Australia/Oceania | 454 | 7.0 | 438.0 | 9.0 | NaN | 41755 | 644.0 | 20508.0 | 1886140.0 | 10873 |
| 222 | Western Sahara | Africa | 10 | 1.0 | 9.0 | 0.0 | NaN | 16 | 2.0 | NaN | NaN | 624681 |
| 223 | Yemen | Asia | 11819 | 2149.0 | 9009.0 | 661.0 | 23.0 | 381 | 69.0 | 265253.0 | 8543.0 | 31049015 |
| 224 | Zambia | Africa | 320591 | 3983.0 | 315997.0 | 611.0 | NaN | 16575 | 206.0 | 3452554.0 | 178497.0 | 19342381 |
| 225 | Zimbabwe | Africa | 249206 | 5482.0 | 242417.0 | 1307.0 | 12.0 | 16324 | 359.0 | 2287793.0 | 149863.0 | 15265849 |
226 rows × 12 columns
The code adds a new column 'continent' to the DataFrame 'df', where the value for each row is fetched from the 'continent' column of another DataFrame 'df_summary' based on a matching 'country' value. The updated 'df' DataFrame is then displayed.
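# Build a country -> continent lookup once and map it over the rows;
# this avoids filtering df_summary again for each of the ~185,000 rows.
df['continent'] = df['country'].map(df_summary.set_index('country')['continent'])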
df
| | date | country | cumulative_total_cases | daily_new_cases | active_cases | cumulative_total_deaths | daily_new_deaths | continent |
|---|---|---|---|---|---|---|---|---|
| 0 | 2020-02-15 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN | Asia |
| 1 | 2020-02-16 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN | Asia |
| 2 | 2020-02-17 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN | Asia |
| 3 | 2020-02-18 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN | Asia |
| 4 | 2020-02-19 | Afghanistan | 0.0 | NaN | 0.0 | 0.0 | NaN | Asia |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 184782 | 2022-05-10 | Zimbabwe | 248642.0 | 106.0 | 963.0 | 5481.0 | 2.0 | Africa |
| 184783 | 2022-05-11 | Zimbabwe | 248778.0 | 136.0 | 1039.0 | 5481.0 | 0.0 | Africa |
| 184784 | 2022-05-12 | Zimbabwe | 248943.0 | 165.0 | 1158.0 | 5481.0 | 0.0 | Africa |
| 184785 | 2022-05-13 | Zimbabwe | 249131.0 | 188.0 | 1283.0 | 5482.0 | 1.0 | Africa |
| 184786 | 2022-05-14 | Zimbabwe | 249206.0 | 75.0 | 1307.0 | 5482.0 | 0.0 | Africa |
184787 rows × 8 columns
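The country-to-continent lookup can be tried on a tiny example. This is a sketch, not the notebook's own data: the frames `daily` and `summary` and the country "Atlantis" are hypothetical, and it assumes each country appears only once in the summary table.

```python
import pandas as pd

# Hypothetical miniature versions of the daily and summary tables.
daily = pd.DataFrame({"country": ["Italy", "Italy", "India", "Atlantis"],
                      "active_cases": [10, 12, 7, 1]})
summary = pd.DataFrame({"country": ["Italy", "India"],
                        "continent": ["Europe", "Asia"]})

# Index the summary by country, then map every daily row to its continent.
lookup = summary.set_index("country")["continent"]
daily["continent"] = daily["country"].map(lookup)

# Countries that are absent from the summary end up as NaN and can be listed.
unmatched = daily.loc[daily["continent"].isna(), "country"].unique().tolist()
print(daily)
print(unmatched)
```

Listing the unmatched countries is a cheap sanity check before grouping or filtering by continent.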
The code generates a pie chart using Plotly's graph objects (go). It sets the labels and values for different categories, configures various visual and interactive properties of the chart, and displays the chart using fig.show().
trace = go.Pie(labels=['Total Recovered', 'Total Active', 'Total Deaths'],
values=[df_summary.total_recovered.sum(), df_summary.active_cases.sum(), df_summary.total_deaths.sum()],
title="<b>Coronavirus Cases</b>",
title_font_size=18,
hovertemplate="<b>%{label}</b><br>%{value}<br><i>%{percent}</i>",
#hoverinfo='percent+value+label',
textinfo='percent',
textposition='inside',
hole=0.6,
showlegend=True,
marker=dict(colors=["#8dd3c7", "#ffffb3", "#fb8072"],
line=dict(color='#000000',
width=2),
),
name=""
)
fig=go.Figure(data=[trace])
fig.show()
The code defines a function add_commas that takes a number as input and adds commas as thousand separators to it. It then prints statements with formatted numbers and country references related to total deaths, active cases, and total recoveries based on data from the df_summary DataFrame. The function add_commas is used to format the numbers with commas.
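def add_commas(num):
    # num is a string of digits; walk it from the right and
    # insert a comma before every fourth digit.
    out = ""
    counter = 0
    for n in num[::-1]:
        counter += 1
        if counter == 4:
            counter = 1
            out = "," + out
        out = n + out
    return out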
print(f"As of {df.date.max().strftime('%Y-%m-%d')}, here are the numbers:\n")
print(add_commas(str(int(df_summary.total_deaths.sum()))), "total deaths. That is more than the entire population of ", end="")
deaths_ref = df_summary[df_summary.population < df_summary.total_deaths.sum()].sort_values("population", ascending=False).iloc[:2]
print(deaths_ref.iloc[0].country, f"({add_commas(str(int(deaths_ref.iloc[0].population)))}) or",
deaths_ref.iloc[1].country, f"({add_commas(str(int(deaths_ref.iloc[1].population)))})!")
print(add_commas(str(int(df_summary.active_cases.sum()))), "active cases. You can think of that as if the entire population of ", end="")
active_ref = df_summary[df_summary.population < df_summary.active_cases.sum()].sort_values("population", ascending=False).iloc[:2]
print(active_ref.iloc[0].country, f"({add_commas(str(int(active_ref.iloc[0].population)))}), or",
active_ref.iloc[1].country, f"({add_commas(str(int(active_ref.iloc[1].population)))}), were sick right now!")
print(add_commas(str(int(df_summary.total_recovered.sum()))), "total recoveries. It's as if the entire population of ", end="")
recover_ref = df_summary[df_summary.population < df_summary.total_recovered.sum()].sort_values("population", ascending=False).iloc[:2]
print(recover_ref.iloc[0].country, f"({add_commas(str(int(recover_ref.iloc[0].population)))}), or",
recover_ref.iloc[1].country, f"({add_commas(str(int(recover_ref.iloc[1].population)))}), went through and recovered from Covid-19!")
As of 2022-05-14, here are the numbers:

6,288,083 total deaths. That is more than the entire population of Singapore (5,936,034) or Denmark (5,830,190)!
13,996,500 active cases. You can think of that as if the entire population of Guinea (13,795,931), or Rwanda (13,549,956), were sick right now!
460,397,633 total recoveries. It's as if the entire population of USA (334,617,623), or Indonesia (278,910,317), went through and recovered from Covid-19!
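The thousands-separator logic can be checked against Python's built-in format specifier. The sketch below re-implements the idea with `enumerate` (an illustrative variant, not the notebook's exact function) and compares it with `f"{n:,}"` on the death total printed above.

```python
def add_commas(num: str) -> str:
    """Insert a comma before every group of three digits, counting from the right."""
    out = ""
    for position, digit in enumerate(reversed(num)):
        # A comma goes in front of digits 4, 7, 10, ... counted from the right.
        if position and position % 3 == 0:
            out = "," + out
        out = digit + out
    return out


total_deaths = 6288083  # figure taken from the printed output above
print(add_commas(str(total_deaths)))
print(f"{total_deaths:,}")
```

Both lines print the same string, so the format specifier is a shorter option whenever the value is still a number rather than a string.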
The code computes the logarithm (base 2) of the 'total_confirmed' column in the 'df_summary' DataFrame and assigns it to a new column 'log(Total Confirmed)'. It also applies the 'add_commas' function to format the 'total_confirmed' values with commas and assigns the formatted values to a new column 'Total Confirmed'. Then, it creates a choropleth map using Plotly Express (px) based on the 'df_summary' DataFrame, where the color represents the logarithm of the total confirmed cases. The map is customized with a color scale, hover information, and a color bar with tick labels. Finally, the map is displayed using fig.show().
df_summary['log(Total Confirmed)'] = np.log2(df_summary['total_confirmed'])
df_summary['Total Confirmed'] = df_summary['total_confirmed'].apply(lambda x: add_commas(str(x)))
fig = px.choropleth(df_summary,
locations="country",
color="log(Total Confirmed)",
locationmode = 'country names',
hover_name='country',
hover_data=['Total Confirmed'],
color_continuous_scale='reds',
title = '<b>Coronavirus Confirmed Cases Around The Globe</b>')
log_scale_vals = list(range(0,25,2))
scale_vals = (np.exp2(log_scale_vals)).astype(int).astype(str)
scale_vals = list(map(add_commas, scale_vals))
fig.update_layout(title_font_size=22,
margin={"r":20, "l":30},
coloraxis={#"showscale":False,
# title.side replaces the deprecated titleside attribute
"colorbar":dict(title=dict(text="<b>Confirmed Cases</b><br>", side="top"),
#range=[np.log(50), np.log(6400)],
tickmode="array",
tickvals=log_scale_vals,
ticktext=scale_vals
)},
)
fig.show()
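The colour bar works on a log2 scale but is labelled with real case counts. A minimal sketch of how the tick positions and labels relate, using two `total_confirmed` values from the summary table shown earlier (the variable `confirmed` is introduced here for illustration):

```python
import numpy as np

# Tick positions on the log2 colour axis: 0, 2, 4, ..., 24.
log_scale_vals = list(range(0, 25, 2))

# Undo the logarithm so each tick is labelled with the actual case count.
scale_vals = [f"{2 ** v:,}" for v in log_scale_vals]

# total_confirmed for Afghanistan and Zimbabwe, from the summary table.
confirmed = np.array([179267, 249206])

# Both values land between the ticks at 16 (65,536) and 18 (262,144).
print(np.log2(confirmed).round(2))
print(scale_vals[8], scale_vals[9])
```

Without the logarithm, a handful of very large countries would take up nearly the whole colour range and most of the map would look uniformly pale.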
The code generates a treemap using Plotly Express (px) based on the 'df_summary' DataFrame, representing the breakdown of total confirmed cases by country. The treemap is customized with a specified path, values, height, title, and color sequence. The update_traces method is used to set the text information displayed on the treemap, and the resulting visualization is displayed using fig.show().
fig = px.treemap(df_summary, path=["country"], values="total_confirmed", height = 750,
title="<b>Total Coronavirus Confirmed Cases Breakdown by Country</b>",
color_discrete_sequence = px.colors.qualitative.Set3)
fig.update_traces(textinfo = "label+text+value")
fig.show()
The code defines a function plot_stats that takes a country as input. It generates multiple plots using Plotly and pandas based on the data for the specified country. The plots include cumulative total confirmed cases, daily new cases, cumulative total deaths, daily new deaths, and active cases. Each plot has a customized layout, title, and visual style. The resulting plots are displayed using fig.show().
def plot_stats(country):
    # "USA" and "UK" read better with a definite article in the chart titles.
    if country in ["USA", "UK"]:
        country_prefix = "the "
    else:
        country_prefix = ""
    # set_index returns a new frame, so the slice of df is not modified in place.
    df_country = df[df.country == country].set_index('date')
    # Plot 1: cumulative confirmed cases
    if not all(df_country.cumulative_total_cases.isna()):
        layout = go.Layout(
            # .iloc[-1] selects the last row by position (the index holds dates).
            yaxis={'range':[0, df_country.cumulative_total_cases.iloc[-1] * 1.05],
                   'title':'Coronavirus Confirmed Cases'},
            xaxis={'title':''},
        )
        fig = px.area(df_country, x=df_country.index, y="cumulative_total_cases",
                      title=f"<b>Cumulative Total Confirmed Cases in {country_prefix}{country}<br>from {df_country.index[0].strftime('%Y-%m-%d')} till {df_country.index[-1].strftime('%Y-%m-%d')}</b>",
                      template='plotly_dark')
        fig.update_traces(line={'width':5})
        fig.update_layout(layout)
        fig.show()
    # Plot 2: daily new cases with a 7-day moving average
    if not all(df_country.daily_new_cases.isna()):
        layout = go.Layout(
            yaxis={'range':[0, df_country.daily_new_cases.max() * 1.05],
                   'title':'Daily New Coronavirus Confirmed Cases'},
            xaxis={'title':''},
            template='plotly_dark',
            title=f"<b>Daily New Cases in {country_prefix}{country}<br>from {df_country.index[0].strftime('%Y-%m-%d')} till {df_country.index[-1].strftime('%Y-%m-%d')}</b>",
        )
        MA7 = df_country.daily_new_cases.rolling(7).mean().dropna().astype(int)
        fig = go.Figure()
        fig.add_trace(go.Bar(name="Daily Cases", x=df_country.index, y=df_country.daily_new_cases))
        # MA7 keeps its own dates, so it stays aligned even if some days are missing.
        fig.add_trace(go.Scatter(name="7-Day Moving Average", x=MA7.index, y=MA7, line=dict(width=3)))
        fig.update_layout(layout)
        fig.show()
    # Plot 3: cumulative deaths
    if not all(df_country.cumulative_total_deaths.isna()):
        layout = go.Layout(
            yaxis={'range':[0, df_country.cumulative_total_deaths.iloc[-1] * 1.05],
                   'title':'Coronavirus Deaths'},
            xaxis={'title':''},
        )
        fig = px.area(df_country, x=df_country.index, y="cumulative_total_deaths",
                      title=f"<b>Cumulative Total Deaths in {country_prefix}{country}<br>from {df_country.index[0].strftime('%Y-%m-%d')} till {df_country.index[-1].strftime('%Y-%m-%d')}</b>",
                      template='plotly_dark')
        fig.update_traces(line={'color':'red', 'width':5})
        fig.update_layout(layout)
        fig.show()
    # Plot 4: daily new deaths with a 7-day moving average
    if not all(df_country.daily_new_deaths.isna()):
        layout = go.Layout(
            yaxis={'range':[0, df_country.daily_new_deaths.max() * 1.05],
                   'title':'Daily New Coronavirus Deaths'},
            xaxis={'title':''},
            template='plotly_dark',
            title=f"<b>Daily Deaths in {country_prefix}{country}<br>from {df_country.index[0].strftime('%Y-%m-%d')} till {df_country.index[-1].strftime('%Y-%m-%d')}</b>",
        )
        MA7 = df_country.daily_new_deaths.rolling(7).mean().dropna().astype(int)
        fig = go.Figure()
        fig.add_trace(go.Bar(name="Daily Deaths", x=df_country.index, y=df_country.daily_new_deaths, marker_color='red'))
        fig.add_trace(go.Scatter(name="7-Day Moving Average", x=MA7.index, y=MA7, line={'width':3, 'color':'white'}))
        fig.update_layout(layout)
        fig.show()
    # Plot 5: active cases
    if not all(df_country.active_cases.isna()):
        layout = go.Layout(
            yaxis={'range':[0, df_country.active_cases.max() * 1.05],
                   'title':'Active Coronavirus Cases'},
            xaxis={'title':''},
        )
        fig = px.line(df_country, x=df_country.index, y="active_cases",
                      title=f"<b>Active Cases in {country_prefix}{country}<br>from {df_country.index[0].strftime('%Y-%m-%d')} till {df_country.index[-1].strftime('%Y-%m-%d')}</b>",
                      template='plotly_dark')
        fig.update_traces(line={'color':'yellow', 'width':5})
        fig.update_layout(layout)
        fig.show()
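The 7-day moving average in the daily charts is shorter than the series it is computed from, because the first six days have no complete window. A minimal sketch with made-up numbers (the dates and values are hypothetical):

```python
import pandas as pd

# Ten hypothetical days of daily new cases.
dates = pd.date_range("2022-05-01", periods=10, freq="D")
daily_new_cases = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100], index=dates)

# rolling(7) yields NaN until seven observations are available;
# dropna() removes those leading rows.
ma7 = daily_new_cases.rolling(7).mean().dropna()

print(len(daily_new_cases) - len(ma7))          # days without a full window
print(ma7.index[0].strftime("%Y-%m-%d"), ma7.iloc[0])
```

Because `ma7` keeps its own date index, plotting it against `ma7.index` keeps each average on the day its window ends. Aligning by counting rows from the end gives the same result only when the series has no missing values in the middle.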
The code calls the function plot_stats with the argument 'USA', which generates and displays multiple plots showing COVID-19 statistics for the United States, including cumulative total confirmed cases, daily new cases, cumulative total deaths, daily new deaths, and active cases.
plot_stats('USA')
The code calls the function plot_stats with the argument 'China', which generates and displays multiple plots showing COVID-19 statistics for China, including cumulative total confirmed cases, daily new cases, cumulative total deaths, daily new deaths, and active cases.
plot_stats('China')
The code calls the function plot_stats with the argument 'UK', which generates and displays multiple plots showing COVID-19 statistics for the United Kingdom, including cumulative total confirmed cases, daily new cases, cumulative total deaths, daily new deaths, and active cases.
plot_stats('UK')
The code calls the function plot_stats with the argument 'Italy', which generates and displays multiple plots showing COVID-19 statistics for Italy, including cumulative total confirmed cases, daily new cases, cumulative total deaths, daily new deaths, and active cases.
plot_stats('Italy')
The code calls the function plot_stats with the argument 'India', which generates and displays multiple plots showing COVID-19 statistics for India, including cumulative total confirmed cases, daily new cases, cumulative total deaths, daily new deaths, and active cases.
plot_stats("India")
The code calls the function plot_stats with the argument 'Australia', which generates and displays multiple plots showing COVID-19 statistics for Australia, including cumulative total confirmed cases, daily new cases, cumulative total deaths, daily new deaths, and active cases.
plot_stats("Australia")
The code calls the function plot_stats with the argument 'France', which generates and displays multiple plots showing COVID-19 statistics for France, including cumulative total confirmed cases, daily new cases, cumulative total deaths, daily new deaths, and active cases.
plot_stats("France")
The code defines a function plot_continent that takes a continent as input. It filters the data from the DataFrame df based on the specified continent and generates a line plot using Plotly Express (px). The plot shows the cumulative total confirmed cases over time for each country in the selected continent. The plot is customized with annotations, marker sizes, and various visual properties. The resulting plot is displayed using fig.show().
def plot_continent(continent):
    df_continent = df[df.continent == continent]
    fig = px.line(df_continent, x="date", y="cumulative_total_cases", color="country", #log_y=True,
                  line_group="country", hover_name="country", template="plotly_dark")
    annotations = []
    # Adding labels
    ys = []
    for tr in fig.select_traces():
        ys.append(tr.y[-1])
    # Scale marker sizes relative to the largest final value on the continent
    y_scale = 0.155 / max(ys)
    for tr in fig.select_traces():
        # labeling the right_side of the plot
        size = max(1, int(math.log(tr.y[-1], 1.1) * tr.y[-1] * y_scale))
        annotations.append(dict(x=tr.x[-1] + timedelta(hours=int((2 + size/5) * 24)), y=tr.y[-1],
                                xanchor='left', yanchor='middle',
                                text=tr.name,
                                font=dict(family='Arial',
                                          size=7+int(size/2)
                                          ),
                                showarrow=False))
        # Marker at the end of the line, sized by the country's final total
        fig.add_trace(go.Scatter(
            x=[tr.x[-1]],
            y=[tr.y[-1]],
            mode='markers',
            name=tr.name,
            marker=dict(color=tr.line.color, size=size)
        ))
    fig.update_traces(line={'width':1})
    fig.update_layout(annotations=annotations, showlegend=False, uniformtext_mode='hide',
                      title=f"<b>Cumulative Total Coronavirus Cases in {continent}<br>between {df_continent.date.min().strftime('%Y-%m-%d')} and {df_continent.date.max().strftime('%Y-%m-%d')}</b>",
                      yaxis={'title':'Coronavirus Confirmed Cases'},
                      xaxis={'title':''}
                      )
    fig.show()
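The marker and label sizes in `plot_continent` come from one formula that combines a logarithm with a linear scale. The sketch below isolates that formula; the three totals and the helper name `marker_size` are hypothetical and only illustrate how sizes spread out.

```python
import math

# Hypothetical final cumulative totals for three countries on one continent.
last_values = {"A": 1_000, "B": 100_000, "C": 10_000_000}

# Same normalisation as in plot_continent: relative to the largest final value.
y_scale = 0.155 / max(last_values.values())


def marker_size(y: float) -> int:
    # Same expression as in plot_continent; y must be positive
    # because math.log raises ValueError for 0.
    return max(1, int(math.log(y, 1.1) * y * y_scale))


sizes = {name: marker_size(y) for name, y in last_values.items()}
print(sizes)
```

Small countries collapse to the minimum size of 1 while the largest one gets a clearly visible marker, which is what keeps the right edge of the chart readable. A country whose last value is 0 would need a guard before the logarithm.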
The code calls the function plot_continent with the argument 'Asia', which generates and displays a line plot showing the cumulative total confirmed cases over time for each country in Asia. The plot is customized with annotations, marker sizes, and visual properties to highlight the data.
plot_continent("Asia")
The code calls the function plot_continent with the argument 'Europe', which generates and displays a line plot showing the cumulative total confirmed cases over time for each country in Europe. The plot is customized with annotations, marker sizes, and visual properties to highlight the data.
plot_continent("Europe")
The code calls the function plot_continent with the argument 'Africa', which generates and displays a line plot showing the cumulative total confirmed cases over time for each country in Africa. The plot is customized with annotations, marker sizes, and visual properties to highlight the data.
plot_continent("Africa")
The code calls the function plot_continent with the argument 'North America', which generates and displays a line plot showing the cumulative total confirmed cases over time for each country in North America.
plot_continent("North America")
The code calls the function plot_continent with the argument 'South America', which generates and displays a line plot showing the cumulative total confirmed cases over time for each country in South America.
plot_continent("South America")
The code calls the function plot_continent with the argument 'Australia/Oceania', which generates and displays a line plot showing the cumulative total confirmed cases over time for each country in Australia/Oceania.
plot_continent("Australia/Oceania")
The code sorts the df_summary DataFrame by total cases per 1 million population and calculates the percentage of the population with confirmed cases. It assigns colors based on whether the percentage is above or below the mean, and creates a scatter plot using Plotly Express (px). The plot shows the percentage of population with confirmed cases for each country, with marker size and color indicating the percentage. The plot is further customized with annotations, a line representing the mean, and additional visual elements. The resulting plot is displayed using fig.show().
sorted_by_cases_per_1m = df_summary.sort_values(['total_cases_per_1m_population'])
sorted_by_cases_per_1m['% of Population with Confirmed Cases'] = sorted_by_cases_per_1m['total_cases_per_1m_population']/1_000_000
mean = sorted_by_cases_per_1m['% of Population with Confirmed Cases'].mean()
sorted_by_cases_per_1m['color'] = sorted_by_cases_per_1m.apply(lambda row: "Red" if row['% of Population with Confirmed Cases'] > mean else "Blue", axis=1)
fig = px.scatter(sorted_by_cases_per_1m, x='country', y='% of Population with Confirmed Cases',
size='% of Population with Confirmed Cases',
color='color',
title=f"<b>Coronavirus Infection-Rate by Country as of {df.date.max().strftime('%Y-%m-%d')}</b>",
height=650)
fig.update_traces(marker_line_color='rgb(75,75,75)',
marker_line_width=1.5, opacity=0.8,
hovertemplate="<b>%{x}</b><br>%{y} of Population with Confirmed Cases<extra></extra>",)
fig.update_layout(showlegend=False,
yaxis={"tickformat":".3%", "range":[0,sorted_by_cases_per_1m['% of Population with Confirmed Cases'].max() * 1.1]},
xaxis={"title": ""},
title_font_size=20)
to_mention = ["China", "Australia", "India", "South Africa", "Russia", "Italy","Brazil", "UK", "France", "USA", "Montenegro"]
for i, country in enumerate(to_mention):
    # Alternate the label above and below the marker, then adjust per country
    ay = 30 if i%2 else -30
    ax = 20
    if country == "USA": ay, ax = -30, -20
    if country == "UK": ax = -20
    if country == "France": ay, ax = -60, -40
    if country == "Russia": ax = -20
    if country == "Australia": ay = -30
    if country == "Brazil": ax = -20
    fig.add_annotation(
        x=country,
        y=sorted_by_cases_per_1m['% of Population with Confirmed Cases'][sorted_by_cases_per_1m.index[sorted_by_cases_per_1m.country==country][0]],
        xref="x",
        yref="y",
        text=country,
        showarrow=True,
        font=dict(
            family="Courier New, monospace",
            size=14,
            color="#ffffff"
        ),
        align="center",
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=ax,
        ay=ay,
        bordercolor="#c7c7c7",
        borderwidth=2,
        borderpad=4,
        bgcolor=sorted_by_cases_per_1m['color'][sorted_by_cases_per_1m.index[sorted_by_cases_per_1m.country==country][0]],
        opacity=0.6
    )
fig.add_shape(type='line',
x0=sorted_by_cases_per_1m['country'].iloc[0], y0=mean,
x1=sorted_by_cases_per_1m['country'].iloc[-1], y1=mean,
line=dict(color='Green',width=1),
xref='x', yref='y'
)
fig.add_annotation(x=sorted_by_cases_per_1m['country'].iloc[0], y=mean,
text=f"mean = {mean*100:.2f}%",
showarrow=False,
xanchor="left",
yanchor="bottom",
font={"color":"Green", "size":14}
)
fig.show()
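The conversion from cases per million to a share of the population, and the red/blue split around the mean, can be reproduced on three rows from the summary table. The frame name `sample` and the column name `fraction` are chosen here for illustration.

```python
import pandas as pd

# Three rows copied from the summary table shown earlier.
sample = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Andorra"],
    "total_cases_per_1m_population": [4420, 95954, 543983],
})

# Cases per million divided by one million gives a fraction between 0 and 1;
# a "%" tick format then renders it as a percentage.
sample["fraction"] = sample["total_cases_per_1m_population"] / 1_000_000

# Unweighted mean over countries, used as the red/blue threshold.
mean = sample["fraction"].mean()
sample["color"] = sample["fraction"].gt(mean).map({True: "Red", False: "Blue"})

print(sample[["country", "fraction", "color"]])
print(f"mean = {mean:.2%}")
```

Note that this mean gives every country the same weight regardless of population, so it is a reference line for comparing countries rather than the worldwide infection rate.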
The code sorts the df_summary DataFrame by total deaths per 1 million population, filters out any rows with missing values, and calculates the percentage of the population with coronavirus death cases. It assigns colors based on whether the percentage is above or below the mean, and creates a scatter plot using Plotly Express (px). The plot shows the percentage of the population with death cases for each country, with marker size and color indicating the percentage. The plot is further customized with annotations, a line representing the mean, and additional visual elements. The resulting plot is displayed using fig.show().
sorted_by_deaths_per_1m = df_summary.sort_values(['total_deaths_per_1m_population'])
sorted_by_deaths_per_1m = sorted_by_deaths_per_1m[sorted_by_deaths_per_1m['total_deaths_per_1m_population'].notna()]
sorted_by_deaths_per_1m['% of Population with Coronavirus Death Cases'] = sorted_by_deaths_per_1m['total_deaths_per_1m_population']/1_000_000
mean = sorted_by_deaths_per_1m['% of Population with Coronavirus Death Cases'].mean()
sorted_by_deaths_per_1m['color'] = sorted_by_deaths_per_1m.apply(lambda row: "Red" if row['% of Population with Coronavirus Death Cases'] > mean else "Blue", axis=1)
#sorted_by_deaths_per_1m.dropna(inplace=True)
fig = px.scatter(sorted_by_deaths_per_1m, x='country', y='% of Population with Coronavirus Death Cases',
size='% of Population with Coronavirus Death Cases',
color='color',
title=f"<b>Coronavirus Death-Rate by Country as of {df.date.max().strftime('%Y-%m-%d')}</b>",
height=650)
fig.update_traces(marker_line_color='rgb(75,75,75)',
marker_line_width=1.5, opacity=0.8,
hovertemplate="<b>%{x}</b><br>%{y} of Population with Death Cases<extra></extra>",)
fig.update_layout(showlegend=False,
yaxis={"tickformat":".3%", "range":[0,sorted_by_deaths_per_1m['% of Population with Coronavirus Death Cases'].max() * 1.1]},
xaxis={"title": ""},
title_font_size=20)
to_mention = ["China", "Australia", "India", "South Africa", "Russia", "Italy","Brazil", "UK", "France", "USA", "Bulgaria", "Peru"]
for i, country in enumerate(to_mention):
    # Alternate the label above and below the marker, then adjust per country
    ay = 30 if i%2 else -30
    ax = 20
    if country == "Russia": ax = -20
    if country == "Czech Republic": ay, ax = -30, -60
    if country == "USA": ay = 50
    if country == "Italy": ay, ax = 30, -20
    if country == "UK": ay, ax = -30, 40
    if country == "Australia": ay = -30
    if country == "France": ay, ax = -60, -40
    if country == "Brazil": ax = -20
    if country == "Peru": ay = -30
    fig.add_annotation(
        x=country,
        y=sorted_by_deaths_per_1m['% of Population with Coronavirus Death Cases'][sorted_by_deaths_per_1m.index[sorted_by_deaths_per_1m.country==country][0]],
        xref="x",
        yref="y",
        text=country,
        showarrow=True,
        font=dict(
            family="Courier New, monospace",
            size=14,
            color="#ffffff"
        ),
        align="center",
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=ax,
        ay=ay,
        bordercolor="#c7c7c7",
        borderwidth=2,
        borderpad=4,
        bgcolor=sorted_by_deaths_per_1m['color'][sorted_by_deaths_per_1m.index[sorted_by_deaths_per_1m.country==country][0]],
        opacity=0.6
    )
fig.add_shape(type='line',
x0=sorted_by_deaths_per_1m['country'].iloc[0], y0=mean,
x1=sorted_by_deaths_per_1m['country'].iloc[-1], y1=mean,
line=dict(color='Green',width=1),
xref='x', yref='y'
)
fig.add_annotation(x=sorted_by_deaths_per_1m['country'].iloc[0], y=mean,
text=f"mean = {mean*100:.2f}%",
showarrow=False,
xanchor="left",
yanchor="bottom",
font={"color":"Green", "size":14}
)
fig.show()
The code calculates the severity of the coronavirus by computing the ratio of total deaths to total confirmed cases for each country in the df_summary DataFrame. It sorts the DataFrame based on the severity ratio, filters out any rows with missing values, and assigns colors based on whether the severity ratio is above or below the mean. It then creates a scatter plot using Plotly Express (px) to visualize the severity ratio for each country, with marker size and color indicating the ratio. The plot is further customized with annotations, a line representing the mean, and additional visual elements. The resulting plot is displayed using fig.show().
df_summary["Coronavirus Deaths/Confirmed Cases"] = df_summary["total_deaths"] / df_summary["total_confirmed"]
sorted_by_deaths_per_confirmed = df_summary.sort_values(['Coronavirus Deaths/Confirmed Cases'])
sorted_by_deaths_per_confirmed = sorted_by_deaths_per_confirmed[sorted_by_deaths_per_confirmed['Coronavirus Deaths/Confirmed Cases'].notna()]
mean = sorted_by_deaths_per_confirmed['Coronavirus Deaths/Confirmed Cases'].mean()
sorted_by_deaths_per_confirmed['color'] = sorted_by_deaths_per_confirmed.apply(lambda row: "Red" if row['Coronavirus Deaths/Confirmed Cases'] > mean else "Blue", axis=1)
fig = px.scatter(sorted_by_deaths_per_confirmed, x='country', y='Coronavirus Deaths/Confirmed Cases',
size='Coronavirus Deaths/Confirmed Cases',
color='color',
title=f"<b>Coronavirus severity by Country as of {df.date.max().strftime('%Y-%m-%d')}</b>",
height=650)
fig.update_traces(marker_line_color='rgb(75,75,75)',
marker_line_width=1.5, opacity=0.8,
hovertemplate="<b>%{x}</b><br>%{y} of Cases Leading to Death Cases<extra></extra>",)
fig.update_layout(showlegend=False,
yaxis={"tickformat":".3%", "range":[0,sorted_by_deaths_per_confirmed['Coronavirus Deaths/Confirmed Cases'].max() * 1.1]},
xaxis={"title": ""},
title_font_size=20)
to_mention = ["China", "Australia", "India", "South Africa", "Russia", "Italy","Brazil", "UK", "France", "USA", "Yemen", "Vanuatu"]
for i, country in enumerate(to_mention):
    # Alternate the label above and below the marker, then adjust per country
    ay = 30 if i%2 else -30
    ax = 20
    if country in ["India", "USA", "Russia"]: ax = -20
    if country == "Yemen": ay = 30
    if country == "UK": ay, ax = -60, -40
    if country == "Belgium": ay, ax = -30, -60
    if country == "USA": ay, ax = -30, 40
    if country == "Italy": ax = -40
    if country == "Australia": ay = -30
    if country == "France": ay, ax = -60, 40
    if country == "Brazil": ay, ax = -60, -20
    fig.add_annotation(
        x=country,
        y=sorted_by_deaths_per_confirmed['Coronavirus Deaths/Confirmed Cases'][sorted_by_deaths_per_confirmed.index[sorted_by_deaths_per_confirmed.country==country][0]],
        xref="x",
        yref="y",
        text=country,
        showarrow=True,
        font=dict(
            family="Courier New, monospace",
            size=14,
            color="#ffffff"
        ),
        align="center",
        arrowhead=2,
        arrowsize=1,
        arrowwidth=2,
        arrowcolor="#636363",
        ax=ax,
        ay=ay,
        bordercolor="#c7c7c7",
        borderwidth=2,
        borderpad=4,
        bgcolor=sorted_by_deaths_per_confirmed['color'][sorted_by_deaths_per_confirmed.index[sorted_by_deaths_per_confirmed.country==country][0]],
        opacity=0.6
    )
fig.add_shape(type='line',
x0=sorted_by_deaths_per_confirmed['country'].iloc[0], y0=mean,
x1=sorted_by_deaths_per_confirmed['country'].iloc[-1], y1=mean,
line=dict(color='Green',width=1),
xref='x', yref='y'
)
fig.add_annotation(x=sorted_by_deaths_per_confirmed['country'].iloc[0], y=mean,
text=f"mean = {mean*100:.2f}%",
showarrow=False,
xanchor="left",
yanchor="bottom",
font={"color":"Green", "size":14}
)
fig.show()
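The severity measure is simply deaths divided by confirmed cases. A short sketch on three rows from the summary table makes the ranking concrete; the names `sample`, `ratio` and `ranked` are introduced here for illustration.

```python
import pandas as pd

# total_confirmed and total_deaths copied from the summary table shown earlier.
sample = pd.DataFrame({
    "country": ["Afghanistan", "Yemen", "Zimbabwe"],
    "total_confirmed": [179267, 11819, 249206],
    "total_deaths": [7690.0, 2149.0, 5482.0],
})

# Share of confirmed cases that ended in death.
sample["ratio"] = sample["total_deaths"] / sample["total_confirmed"]

# Highest ratio first.
ranked = sample.sort_values("ratio", ascending=False)
print(ranked[["country", "ratio"]].round(3))
```

Yemen comes out on top here, which matches its place in the `to_mention` list. The ratio depends on how many infections were actually confirmed, so a high value may reflect limited testing as much as a more severe outbreak.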
The code prepares and visualizes the global active COVID-19 cases over time. It selects the relevant columns from the df DataFrame, filters out rows with zero or missing active cases, and calculates the logarithm base 2 of the active cases. It then creates a choropleth map animation using Plotly Express (px), where the color represents the logarithm of active cases. The map is animated over time, and additional visual properties such as the title and colorbar are customized. The resulting visualization is displayed using fig.show().
active_cases_df = df[['date', 'country', 'active_cases']].dropna().sort_values('date')
active_cases_df = active_cases_df[active_cases_df.active_cases > 0]
active_cases_df['log2(active_cases)'] = np.log2(active_cases_df['active_cases'])
active_cases_df['date'] = active_cases_df['date'].dt.strftime('%m/%d/%Y')
fig = px.choropleth(active_cases_df, locations="country", locationmode='country names',
color="log2(active_cases)", hover_name="country", hover_data=['active_cases'],
projection="natural earth", animation_frame="date",
title='<b>Coronavirus Global Active Cases Over Time</b>',
color_continuous_scale="reds",
)
# title.side replaces the deprecated titleside attribute
fig.update_layout(coloraxis={"colorbar": {"title":{"text":"<b>Active Cases</b><br>", "side":"top"},
"tickmode":"array",
"tickvals":log_scale_vals,
"ticktext":scale_vals}
}
)
fig.layout.updatemenus[0].buttons[0].args[1]['frame']['duration'] = 10
fig.layout.updatemenus[0].buttons[0].args[1]['transition']['duration'] = 2
fig.show()
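Two preparation steps in the animation are easy to overlook: rows with zero active cases are dropped before taking the logarithm, and the rows are sorted by date before the dates are turned into strings. The sketch below shows both on made-up values (the frame `active` and the two dates are hypothetical).

```python
import numpy as np
import pandas as pd

# Hypothetical active-case counts, including a zero.
active = pd.DataFrame({"country": ["A", "B", "C"],
                       "active_cases": [0.0, 8.0, 1024.0]})

# log2(0) is -inf, which cannot be placed on the colour scale, so drop zeros first.
positive = active[active["active_cases"] > 0].copy()
positive["log2(active_cases)"] = np.log2(positive["active_cases"])
print(positive)

# Sort while the values are still timestamps, then format them as labels.
dates = pd.Series(pd.to_datetime(["2021-01-05", "2020-12-31"]))
labels = dates.sort_values().dt.strftime("%m/%d/%Y").tolist()
print(labels)          # chronological order
print(sorted(labels))  # alphabetical order of the strings, not chronological
```

Month-first date strings do not sort chronologically across a year boundary, so sorting has to happen before the conversion. As far as I know, Plotly Express orders animation frames by first appearance in the data, which is why the sorted order matters.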
The code generates an area plot using Plotly Express (px) to visualize the active COVID-19 cases over time for the top 20 countries with the highest active cases on the latest date in the df DataFrame. The plot is customized with line width, title, and axis labels. The resulting visualization is displayed using fig.show().
fig = px.area(df[df.country.isin(df[df.date == df.date.max()].sort_values("active_cases", ascending=False).iloc[:20].country)].sort_values("active_cases", ascending=False),
x="date", y="active_cases", color="country", template="plotly_dark")#, groupnorm='percent')
fig.update_traces(line={"width":1.25})
fig.update_layout(title = f"Top 20 Countries with Most Active Cases on {df.date.max().strftime('%Y-%m-%d')}",
xaxis={"title": ""},
yaxis={"title":"Active Cases"})
fig.show()
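The selection behind this chart has two steps: rank countries on the latest date, then keep the full history of the winners. A minimal sketch with three hypothetical countries and a top 2 instead of a top 20, using `nlargest` as one possible way to do the ranking:

```python
import pandas as pd

# Hypothetical two-day history for three countries.
daily = pd.DataFrame({
    "date": pd.to_datetime(["2022-05-13", "2022-05-14"] * 3),
    "country": ["A", "A", "B", "B", "C", "C"],
    "active_cases": [5, 50, 400, 30, 10, 40],
})

# Step 1: rank countries by active cases on the most recent date only.
latest = daily[daily["date"] == daily["date"].max()]
top = latest.nlargest(2, "active_cases")["country"]

# Step 2: keep every row (all dates) for the selected countries.
history = daily[daily["country"].isin(top)]

print(top.tolist())
print(history)
```

Country B had the most active cases a day earlier but is excluded, because only the latest date decides who makes the list.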
The code generates a treemap using Plotly Express (px) to display the breakdown of active COVID-19 cases by country on the latest date in the df_summary DataFrame. The treemap's tiles represent countries, with their size indicating the number of active cases. The plot is customized with a title and text information displayed on the tiles. The resulting visualization is displayed using fig.show().
fig = px.treemap(df_summary, path=["country"], values="active_cases", height = 750,
title=f"<b>Active Cases Breakdown on {df.date.max().strftime('%Y-%m-%d')}</b>",
color_discrete_sequence = px.colors.qualitative.Set3)
fig.update_traces(textinfo = "label+text+value")
fig.show()