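Epidemic Data Analysis Platform Work Report [2]: API Interfaces

API requests

Endpoint: /nCoV/api/overall. Method: GET. Returns the state of virus research and the national epidemic overview collected since 4:00 PM on January 24, 2020, when the crawler started running. The response can be set to return either the latest published data or time-series data.

Variable	Description
latest	1: return the latest data (default); 0: return time-series data (deprecated)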

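Response fields

Variable	Description
generalRemark	Overview of the national epidemic
remarkX	Remark text, where X ranges from 1 to 5
note1	Virus name
note2	Source of infection
note3	Route of transmission
currentConfirmedCount(Incr)	Current confirmed cases (increase over yesterday); equals confirmedCount(Incr) - curedCount(Incr) - deadCount(Incr)
confirmedCount(Incr)	Cumulative confirmed cases (increase over yesterday)
suspectedCount(Incr)	Suspected cases (increase over yesterday)
curedCount(Incr)	Cured cases (increase over yesterday)
deadCount(Incr)	Deaths (increase over yesterday)
seriousCount(Incr)	Severe cases (increase over yesterday)
updateTime	Time of the last data change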
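
The relationship between the count fields can be checked on any record before it is stored. The sketch below is a minimal illustration, not part of the API: the function names are hypothetical and the sample values are made up for the example rather than taken from a real response.

```python
def current_confirmed(confirmed: int, cured: int, dead: int) -> int:
    """Current confirmed cases = cumulative confirmed - cured - dead."""
    return confirmed - cured - dead


def is_consistent(record: dict) -> bool:
    """Check that currentConfirmedCount matches the other three counts."""
    expected = current_confirmed(
        record["confirmedCount"], record["curedCount"], record["deadCount"]
    )
    return record["currentConfirmedCount"] == expected


# Illustrative record (hypothetical values, not real API output)
sample_record = {
    "confirmedCount": 17660,
    "curedCount": 1439,
    "deadCount": 1266,
    "currentConfirmedCount": 14955,
}
print(is_consistent(sample_record))
```

The same subtraction applies to the (Incr) fields, so the check can be reused for the daily increases by passing the corresponding values.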

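Endpoint: /nCoV/api/provinceName. Method: GET. Returns the list of countries, provinces, regions, and municipalities that have data entries in the database.

Variable	Description
lang	Language of the returned data. Chinese: zh (default); English: en

Examples

/nCoV/api/provinceName returns the list of countries, provinces, regions, and municipalities in Chinese.
/nCoV/api/provinceName?lang=en returns the list of countries, provinces, regions, and municipalities in English.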

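Endpoint: /nCoV/api/area. Method: GET. Returns time-series data on how the epidemic has changed in China's provinces, regions, and municipalities and in other countries since 3:00 AM on January 22, 2020 (when the crawler started running). The data is accurate to the city level and traces the confirmed, suspected, cured, and death counts over time. Note: data between 3:00 AM on January 22, 2020 and 3:40 AM on January 24, 2020 is province-level only. DXY (丁香园) began collecting and publishing city-level data on January 24, 2020.

Variable	Description
latest	Same meaning as for /nCoV/api/overall: 1 returns the latest data, 0 returns time-series data
province	Chinese name of a country, province, region, or municipality, such as 美国 (United States), 湖北 (Hubei), 香港 (Hong Kong), or 北京 (Beijing). The full list of names is available from /nCoV/api/provinceName?lang=zh.
provinceEng	English name of a country, province, region, or municipality, such as Hubei, Hong Kong, or Beijing. The full list of names is available from /nCoV/api/provinceName?lang=en. Capitalization matters and must match the names returned by /nCoV/api/provinceName?lang=en.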

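Response fields

Variable	Description
locationId	City code. For cities in mainland China this is the postal code; the rule for cities outside mainland China is not yet known.
continent(English)Name	Continent name (English)
country(English)Name	Country name (English)
province(English)Name	Full name of the province, region, or municipality (English)
provinceShortName	Short name of the province, region, or municipality
currentConfirmedCount	Current confirmed cases; equals confirmedCount - curedCount - deadCount
confirmedCount	Cumulative confirmed cases
suspectedCount	Suspected cases
curedCount	Cured cases
deadCount	Deaths
comment	Other information
cities	Data for the subordinate cities
updateTime	Data update time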

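Examples

/nCoV/api/area?latest=1&province=湖北省 returns the latest epidemic data for Hubei Province.
/nCoV/api/area?latest=0&province=湖北省 returns the time-series epidemic data for Hubei Province.
/nCoV/api/area?latest=1 returns the latest epidemic data for all Chinese cities and other countries.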
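
Because the province parameter carries Chinese characters, it is safer to let the standard library percent-encode the query string than to concatenate it by hand. The sketch below only builds URLs and sends no request. BASE_URL is a hypothetical placeholder, since the host is not named here, and build_area_url is an illustrative helper, not part of the API.

```python
from urllib.parse import urlencode

# Hypothetical placeholder host; replace with the real server address.
BASE_URL = "https://example.invalid"


def build_area_url(latest=1, province=None, base_url=BASE_URL):
    """Build a /nCoV/api/area URL with a percent-encoded query string."""
    params = {"latest": latest}
    if province is not None:
        params["province"] = province  # urlencode handles the UTF-8 encoding
    return f"{base_url}/nCoV/api/area?{urlencode(params)}"


print(build_area_url(latest=1, province="湖北省"))
print(build_area_url(latest=1))
```

When the requests library is used, passing the same dictionary through its params argument produces the same encoding.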

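Endpoint: /nCoV/api/news. Method: GET. Returns all news related to the epidemic, including each item's source and source link, sorted from newest to oldest by publication time.

Variable	Description
page	Page number of the news to return. Defaults to the first page.
num	Number of news items per page. Defaults to 10.

Response fields

Variable	Description
pubDate	Publication time of the news item
title	News title
summary	Summary of the news content
infoSource	Data source
sourceUrl	Source link
province	Name of the province or municipality
provinceId	Code of the province or municipality

Examples

/nCoV/api/news?page=1&num=10 returns page 1 of the news across all regions, 10 items per page.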

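Endpoint: /nCoV/api/rumors. Method: GET. Returns epidemic-related rumors together with DXY's (丁香园) rebuttals, sorted from newest to oldest by publication time.

Variable	Description
rumorType	0: return rumors (default); 1: return credible information; 2: return unverified information
page	Page number of the rumors to return. Defaults to page 1.
num	Number of rumors per page. Defaults to 10.

Response fields

Variable	Description
id	Rumor ID
title	Rumor title
mainSummary	Summary of the rebuttal
body	Full text of the rebuttal
sourceUrl	Source link

Examples

/nCoV/api/rumors?page=2&num=10&rumorType=1 returns page 2 of the credible information, 10 items per page, that is, items 11 through 20 of all credible information.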
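
The mapping from page and num to item positions is simple arithmetic. The sketch below assumes pages are numbered from 1, which matches the stated default of page 1 and the statement that page 2 with 10 items per page covers items 11 to 20. The helper name page_range is hypothetical.

```python
from typing import Tuple


def page_range(page: int, num: int = 10) -> Tuple[int, int]:
    """Return the 1-based (first, last) item positions covered by a page."""
    if page < 1 or num < 1:
        raise ValueError("page and num must both be at least 1")
    first = (page - 1) * num + 1
    last = page * num
    return first, last


print(page_range(1, 10))
print(page_range(2, 10))
```

The same calculation applies to the page and num parameters of /nCoV/api/news.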

COVID-19 datasets hosted by Microsoft

# JSON schema of full text documents


{
    "paper_id": <str>,                      # 40-character sha1 of the PDF
    "metadata": {
        "title": <str>,
        "authors": [                        # list of author dicts, in order
            {
                "first": <str>,
                "middle": <list of str>,
                "last": <str>,
                "suffix": <str>,
                "affiliation": <dict>,
                "email": <str>
            },
            ...
        ],
        "abstract": [                       # list of paragraphs in the abstract
            {
                "text": <str>,
                "cite_spans": [             # list of character indices of inline citations
                                            # e.g. citation "[7]" occurs at positions 151-154 in "text"
                                            # linked to bibliography entry BIBREF3
                    {
                        "start": 151,
                        "end": 154,
                        "text": "[7]",
                        "ref_id": "BIBREF3"
                    },
                    ...
                ],
                "ref_spans": <list of dicts similar to cite_spans>,     # e.g. inline reference to "Table 1"
                "section": "Abstract"
            },
            ...
        ],
        "body_text": [                      # list of paragraphs in full body
                                            # paragraph dicts look the same as above
            {
                "text": <str>,
                "cite_spans": [],
                "ref_spans": [],
                "eq_spans": [],
                "section": "Introduction"
            },
            ...
            {
                ...,
                "section": "Conclusion"
            }
        ],
        "bib_entries": {
            "BIBREF0": {
                "ref_id": <str>,
                "title": <str>,
                "authors": <list of dict>,      # same structure as earlier,
                                                # but without `affiliation` or `email`
                "year": <int>,
                "venue": <str>,
                "volume": <str>,
                "issn": <str>,
                "pages": <str>,
                "other_ids": {
                    "DOI": [
                        <str>
                    ]
                }
            },
            "BIBREF1": {},
            ...
            "BIBREF25": {}
        },
        "ref_entries": {
            "FIGREF0": {
                "text": <str>,                  # figure caption text
                "type": "figure"
            },
            ...
            "TABREF13": {
                "text": <str>,                  # table caption text
                "type": "table"
            }
        },
        "back_matter": <list of dict>           # same structure as body_text
    }
}
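
To show how the schema is used in practice, the sketch below groups body_text paragraphs by section and resolves inline citation markers through bib_entries. It is a minimal illustration on a hand-made document: the helper names and the sample paper are hypothetical. It also assumes that some copies of the data may place body_text and bib_entries at the top level instead of under metadata, so the lookup tries both.

```python
def get_field(paper, key):
    """Return paper[key], falling back to paper["metadata"][key].

    Assumption: fields such as "body_text" may sit either at the top level
    or nested under "metadata" as in the schema shown above.
    """
    if key in paper:
        return paper[key]
    return paper.get("metadata", {}).get(key, [])


def text_by_section(paper):
    """Join body_text paragraphs into one string per section, in order."""
    sections = {}
    for paragraph in get_field(paper, "body_text"):
        sections.setdefault(paragraph["section"], []).append(paragraph["text"])
    return {name: "\n".join(parts) for name, parts in sections.items()}


def cited_titles(paper):
    """Map each inline citation marker to the title of the cited entry."""
    bib = get_field(paper, "bib_entries") or {}
    titles = {}
    for paragraph in get_field(paper, "body_text"):
        for span in paragraph.get("cite_spans", []):
            # "start" is inclusive and "end" exclusive, like a Python slice
            marker = paragraph["text"][span["start"]:span["end"]]
            entry = bib.get(span.get("ref_id"))
            if entry is not None:
                titles[marker] = entry["title"]
    return titles


# Hand-made sample document (hypothetical content, not from the dataset)
intro = "Coronaviruses are studied widely [7]."
start = intro.index("[7]")
sample_paper = {
    "paper_id": "0" * 40,  # placeholder, not a real sha1
    "metadata": {
        "title": "Hypothetical example paper",
        "body_text": [
            {"text": intro, "section": "Introduction", "ref_spans": [],
             "cite_spans": [{"start": start, "end": start + 3,
                             "text": "[7]", "ref_id": "BIBREF3"}]},
            {"text": "More background.", "section": "Introduction",
             "cite_spans": [], "ref_spans": []},
            {"text": "We conclude.", "section": "Conclusion",
             "cite_spans": [], "ref_spans": []},
        ],
        "bib_entries": {"BIBREF3": {"ref_id": "BIBREF3", "title": "A cited work"}},
    },
}

sections = text_by_section(sample_paper)
citations = cited_titles(sample_paper)
print(list(sections))
print(citations)
```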

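The following code connects to the Azure storage account that hosts the dataset.

# BlockBlobService belongs to the legacy azure-storage-blob SDK (v2.x);
# it is not available in v12 and later.
from azure.storage.blob import BlockBlobService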

# storage account details
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = "sv=2019-02-02&ss=bfqt&srt=sco&sp=rlcup&se=2025-04-14T00:21:16Z&st=2020-04-13T16:21:16Z&spr=https&sig=JgwLYbdGruHxRYTpr5dxfJqobKbhGap8WUtKFadcivQ%3D"

# create a blob service
blob_service = BlockBlobService(
    account_name=azure_storage_account_name,
    sas_token=azure_storage_sas_token,
)

CORD-19 data is stored in the covid19temp container. The file structure of the container, with sample files, is shown below.

metadata.csv
custom_license/
    pdf_json/
        0001418189999fea7f7cbe3e82703d71c85a6fe5.json        # filename is sha-hash
        ...
    pmc_json/
        PMC1065028.xml.json                                  # filename is the PMC ID
        ...
noncomm_use_subset/
    pdf_json/
        0036b28fddf7e93da0970303672934ea2f9944e7.json
        ...
    pmc_json/
        PMC1616946.xml.json
        ...
comm_use_subset/
    pdf_json/
        000b7d1517ceebb34e1e3e817695b6de03e2fa78.json
        ...
    pmc_json/
        PMC1054884.xml.json
        ...
biorxiv_medrxiv/                                             # note: there is no pmc_json subdir
    pdf_json/
        0015023cc06b5362d332b3baf348d11567ca2fbb.json
        ...

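Each .json file corresponds to one article in the dataset and stores its title, authors, abstract, and (where available) full text. The dataset also ships with a metadata.csv file that records basic information about each article. The code below downloads that file and inspects the relevant columns.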

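import pandas as pd

# container housing CORD-19 data
container_name = "covid19temp"

# download metadata.csv
metadata_filename = 'metadata.csv'
blob_service.get_blob_to_path(
    container_name=container_name,
    blob_name=metadata_filename,
    file_path=metadata_filename
)

# read metadata.csv into a DataFrame, which the rest of the code works on
metadata = pd.read_csv(metadata_filename)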

simple_schema = ['cord_uid', 'source_x', 'title', 'abstract', 'authors', 'full_text_file', 'url']

def make_clickable(address):
    '''Make the url clickable'''
    return '<a href="{0}">{0}</a>'.format(address)

def preview(text):
    '''Show only a preview of the text data.'''
    return text[:30] + '...'

format_ = {'title': preview, 'abstract': preview, 'authors': preview, 'url': make_clickable}

metadata[simple_schema].head().style.format(format_)

num_entries = len(metadata)
print("There are {} many entries in this dataset:".format(num_entries))

metadata_with_text = metadata[metadata['full_text_file'].notna()]
with_full_text = len(metadata_with_text)
print("-- {} have full text entries".format(with_full_text))

with_doi = metadata['doi'].count()
print("-- {} have DOIs".format(with_doi))

with_pmcid = metadata['pmcid'].count()
print("-- {} have PubMed Central (PMC) ids".format(with_pmcid))

with_microsoft_id = metadata['Microsoft Academic Paper ID'].count()
print("-- {} have Microsoft Academic paper ids".format(with_microsoft_id))
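
The container layout shown earlier implies a simple naming rule for the full-text files: subset folder, then pdf_json or pmc_json, then the sha hash or PMC ID. The sketch below builds blob names from that rule. The helper names are hypothetical, and the rule is inferred only from the sample file listing, so it should be verified against the container before relying on it.

```python
# Subset folders taken from the file listing; biorxiv_medrxiv has no pmc_json.
SUBSETS_WITH_PMC = {"custom_license", "noncomm_use_subset", "comm_use_subset"}
ALL_SUBSETS = SUBSETS_WITH_PMC | {"biorxiv_medrxiv"}


def pdf_json_blob(subset: str, sha: str) -> str:
    """Blob name of a PDF-derived JSON file, named after its sha hash."""
    if subset not in ALL_SUBSETS:
        raise ValueError(f"unknown subset: {subset}")
    return f"{subset}/pdf_json/{sha}.json"


def pmc_json_blob(subset: str, pmcid: str) -> str:
    """Blob name of a PMC-derived JSON file, named after its PMC ID."""
    if subset not in SUBSETS_WITH_PMC:
        raise ValueError(f"subset has no pmc_json folder: {subset}")
    return f"{subset}/pmc_json/{pmcid}.xml.json"


print(pdf_json_blob("custom_license", "0001418189999fea7f7cbe3e82703d71c85a6fe5"))
print(pmc_json_blob("custom_license", "PMC1065028"))
```

A name built this way can be passed as blob_name to the same download call that fetches metadata.csv.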

Bing COVID-19 dataset from trusted, reliable sources

The modified dataset is available in CSV, JSON, JSON Lines, and Parquet formats, all listed below:
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.csv
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.json
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.jsonl
https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet

Load and inspect the dataset

import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

df = pd.read_parquet("https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet")
df.head(10)

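df_Worldwide = df[df['country_region'] == 'Worldwide']
# average the case columns for each (country_region, updated) pair
value_columns = ['confirmed', 'deaths', 'confirmed_change', 'deaths_change']
df_Worldwide_pivot = df_Worldwide.pivot_table(values=value_columns, index=['country_region', 'updated'])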

df_Worldwide_pivot

df_Worldwide.plot(kind='line',x='updated',y="confirmed",grid=True)
df_Worldwide.plot(kind='line',x='updated',y="deaths",grid=True)
df_Worldwide.plot(kind='line',x='updated',y="confirmed_change",grid=True)
df_Worldwide.plot(kind='line',x='updated',y="deaths_change",grid=True)
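
The plots pair each cumulative column with a change column. The sketch below assumes that confirmed_change is the day-over-day difference of confirmed and shows how such a column could be recomputed with pandas as a sanity check. The data is synthetic and add_daily_change is a hypothetical helper.

```python
import pandas as pd

# Synthetic cumulative counts (not taken from the Bing dataset)
sample = pd.DataFrame({
    "updated": pd.to_datetime(
        ["2020-03-01", "2020-03-02", "2020-03-03", "2020-03-04"]
    ),
    "confirmed": [100, 130, 180, 260],
})


def add_daily_change(frame, column):
    """Add '<column>_change' as the difference from the previous row by date."""
    out = frame.sort_values("updated").copy()
    out[f"{column}_change"] = out[column].diff()  # first row has no predecessor
    return out


result = add_daily_change(sample, "confirmed")
print(result)
```

The first row has no previous day, so its change is NaN; a real comparison against the published column would need to account for that.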

Data sources provided by Our World in Data

Metrics	Source	Updated	Countries
Vaccinations	Official data collated by the Our World in Data team	Daily	218
Tests & positivity	Official data collated by the Our World in Data team	Weekly	193
Hospital & ICU	Official data collated by the Our World in Data team	Daily	47
Confirmed cases	JHU CSSE COVID-19 Data	Daily	217
Confirmed deaths	JHU CSSE COVID-19 Data	Daily	217
Reproduction rate	Arroyo-Marioli F, Bullano F, Kucinskas S, Rondón-Moreno C	Daily	192
Policy responses	Oxford COVID-19 Government Response Tracker	Daily	187
Other variables of interest	International organizations (UN, World Bank, OECD, IHME…)	Fixed	241

Variable	Description
total_cases	Total confirmed cases of COVID-19. Counts can include probable cases, where reported.
new_cases	New confirmed cases of COVID-19. Counts can include probable cases, where reported. In rare cases where our source reports a negative daily change due to a data correction, we set this metric to NA.
new_cases_smoothed	New confirmed cases of COVID-19 (7-day smoothed). Counts can include probable cases, where reported.
total_cases_per_million	Total confirmed cases of COVID-19 per 1,000,000 people. Counts can include probable cases, where reported.
new_cases_per_million	New confirmed cases of COVID-19 per 1,000,000 people. Counts can include probable cases, where reported.
new_cases_smoothed_per_million	New confirmed cases of COVID-19 (7-day smoothed) per 1,000,000 people. Counts can include probable cases, where reported.
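
The derived variables in the table follow from new_cases and a population figure. The sketch below is one way to reproduce them with pandas under two assumptions that the table does not state: that the 7-day smoothing is a trailing 7-day rolling mean, and that the per-million values divide by total population. The data, the population figure, and the helper name are made up for illustration.

```python
import pandas as pd


def add_derived_metrics(frame, population, window=7):
    """Add smoothed and per-million variants of the new_cases column."""
    out = frame.copy()
    # Trailing rolling mean; the first window-1 rows stay NaN
    out["new_cases_smoothed"] = out["new_cases"].rolling(window).mean()
    out["new_cases_per_million"] = out["new_cases"] / population * 1_000_000
    out["new_cases_smoothed_per_million"] = (
        out["new_cases_smoothed"] / population * 1_000_000
    )
    return out


# Synthetic daily counts for a hypothetical country of 2 million people
daily = pd.DataFrame({"new_cases": [10, 20, 30, 40, 50, 60, 70, 80]})
metrics = add_derived_metrics(daily, population=2_000_000)
print(metrics)
```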

import requests

url = 'https://api.tianapi.com/ncovabroad/index'

# International epidemic news, epidemic overview, risk areas

# Name  Type  Example  Description
# modifyTime int 1584159933000 time the data was last modified
# continents string 欧洲 (Europe) continent
# provinceName string 意大利 (Italy) region name
# currentConfirmedCount int 14955 current confirmed cases
# confirmedCount int 17660 cumulative confirmed cases
# suspectedCount int 1439 cured cases
# deadCount int 1266 deaths
# locationId int 965008 location ID
# countryShortCode string ITA country code

query_params = {"key": 'd334721cf6eba2d619a5855420ec352c'}

res = requests.get(url, params=query_params)
res_dict = res.json()
print(res_dict)

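import requests

url = 'http://api.tianapi.com/ncov/index'

# Domestic epidemic news, epidemic overview, risk areas

# Name  Type  Example  Description
# news object (news object) list of epidemic news updates
# desc object (overview object) detailed global epidemic data
# riskarea object (risk area object) nationwide risk areas: high = high risk, mid = medium risk
# currentConfirmedCount int 55881 current confirmed cases
# confirmedCount int 74679 cumulative confirmed cases
# suspectedCount int 2053 cumulative imported cases
# curedCount int 16676 cumulative cured cases
# deadCount int 2122 cumulative deaths
# seriousCount int 306 current asymptomatic cases
# suspectedIncr int 8 new imported cases
# currentConfirmedIncr int -2002 change in current confirmed cases since yesterday
# confirmedIncr int 403 change in cumulative confirmed cases since yesterday
# curedIncr int 2289 new cured cases since yesterday
# deadIncr int 116 new deaths since yesterday
# seriousIncr int 4 change in current asymptomatic cases since yesterday

query_params = {"key": 'd334721cf6eba2d619a5855420ec352c'}

res = requests.get(url, params=query_params)
res_dict = res.json()
print(res_dict)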

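import requests

url = 'https://api.muxiaoguo.cn/api/epidemic'

# MXG API
# Warning: this endpoint times out easily
# The "type" query parameter accepts one of:
# macroscopically (high-risk areas), epidemicInfectionData (epidemic data), epidemicHotspot (epidemic hotspots)

query_params = {"type": 'macroscopically'}

# set a timeout so a stalled request fails instead of hanging indefinitely
res = requests.get(url, params=query_params, timeout=10)
res_dict = res.json()
print(res_dict)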

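import requests

url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_other'
# Domestic historical data
# Warning: this endpoint times out easily
# set a timeout so a stalled request fails instead of hanging indefinitely
res = requests.get(url, timeout=10)
res_dict = res.json()
print(res_dict)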
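
Since two of the endpoints above are flagged as prone to timeouts, a small retry wrapper can make the calls more robust. The sketch below is a generic helper, not something these APIs provide: fetch_with_retry and its parameters are hypothetical, and the flaky function only simulates a request so the example runs without network access.

```python
import time


def fetch_with_retry(fetch, retries=3, delay=1.0, exceptions=(Exception,)):
    """Call fetch() up to `retries` times, sleeping `delay` seconds between tries."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return fetch()
        except exceptions as error:
            last_error = error
            if attempt < retries - 1:
                time.sleep(delay)
    raise last_error


# Simulated endpoint that fails twice before succeeding
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return {"ok": True}


print(fetch_with_retry(flaky, retries=3, delay=0))
```

With requests, the callable could be something like lambda: requests.get(url, timeout=10).json(), with exceptions narrowed to (requests.exceptions.Timeout,) so that other errors surface immediately.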
