Interactive Visualization in Python with Pygal

This article is reprinted from the WeChat official account "Python學會"; the author is Huangwei AI. To reprint this article, please contact the Python學會 official account.

1 Introduction

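We have huge amounts of data to process, analyze, and explore, and as technology advances that volume will only grow. Now imagine having to stare at thousands of rows in a spreadsheet, trying to find hidden patterns and track changes in the numbers. This is where data visualization comes in. A visual summary of the information makes patterns and trends easier to spot than scanning a spreadsheet does. Since the purpose of data analysis is to gain insights and discover patterns, visualizing the data makes it more valuable and easier to explore. Different kinds of charts and graphs make communicating findings faster and more effective.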

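The value of visualizing data goes beyond making it easier to interpret. Visualization has many benefits, such as:

Showing how data changes over time.
Determining how often related events occur.
Pointing out correlations between different events.
Analyzing the value and risk of different opportunities.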

In this article, we introduce a Python library that helps us create eye-catching, interactive visualizations: Pygal.

2 Introduction to Pygal

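When visualizing data in Python, most data scientists reach for the well-known Matplotlib, Seaborn, or Bokeh. One library that is often overlooked, however, is Pygal. Pygal lets users create attractive interactive charts that are rendered as SVG, so they scale to any resolution and can be printed or displayed on web pages served with Flask or Django.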

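Getting familiar with Pygal

Pygal offers a wide variety of charts for visualizing data. To be precise, it has 14 chart categories, such as histograms, bar charts, pie charts, treemaps, gauges, and more.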

To use Pygal, we first need to install it.

  $ pip install pygal

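Let's draw our first chart. We will start with the simplest chart, a bar chart. To draw a bar chart with Pygal, we create a chart object and then add some values to it.

  import pygal

  bar_chart = pygal.Bar()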

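We will plot the factorials of 0 through 10. Here, I define a simple function that computes the factorial of a number and then use it to build the list of factorials of the numbers 0 to 10.

  def factorial(n):
      if n == 1 or n == 0:
          return 1
      else:
          return n * factorial(n-1)

  fact_list = [factorial(i) for i in range(11)]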
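Before plotting, it can be worth confirming that the list holds what we expect. The sketch below repeats the recursive function and cross-checks it against the standard library's math.factorial, which the article itself does not use. Note that range(11) yields the eleven inputs 0 through 10, so the chart will have eleven bars.

```python
import math

def factorial(n):
    """Recursive factorial, mirroring the function defined above."""
    if n == 1 or n == 0:
        return 1
    return n * factorial(n - 1)

fact_list = [factorial(i) for i in range(11)]

# Cross-check every entry against the standard library implementation.
assert fact_list == [math.factorial(i) for i in range(11)]

print(fact_list[:6])  # [1, 1, 2, 6, 24, 120]
print(fact_list[-1])  # 3628800
```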

Now we can use it to create our plot.

  # display, HTML and base_html are defined in the notebook setup shown later in this article
  bar_chart = pygal.Bar(height=400)
  bar_chart.add('Factorial', fact_list)
  display(HTML(base_html.format(rendered_chart=bar_chart.render(is_unicode=True))))

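This produces a nice interactive chart.

If we want to draw a different type of chart, we follow the same steps. As you may have noticed, the main method for linking data to a chart is the add method.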

Now, let's start building something based on real data.

Application

Next, I will use a dataset of US COVID-19 cases to illustrate different aspects of Pygal.

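First, to make sure everything runs smoothly, we need to ensure two things:

Pandas and Pygal are both installed.
In Jupyter Notebook, we need to enable IPython display and the HTML option.

  from IPython.display import display, HTML

  # Minimal HTML page that wraps the rendered chart
  base_html = """
  <!DOCTYPE html>
  <html>
    <head>
    </head>
    <body>
      <figure>
        {rendered_chart}
      </figure>
    </body>
  </html>
  """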

Now that everything is set up, we can start exploring our data with Pandas, then manipulate and prepare it for the different kinds of charts.

  import pygal
  import pandas as pd

  data = pd.read_csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv")

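This dataset contains COVID-19 case and death counts by date, county, and state. We can see this by checking data.columns, which lists the columns and gives a sense of the shape of the data. Running that command returns:

  Index(['date', 'county', 'state', 'fips', 'cases', 'deaths'], dtype='object')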

We can take a 10-row sample to see what our dataframe looks like.

  data.sample(10)
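For experimenting without downloading the full file, a tiny in-memory CSV with the same six columns can stand in for the real data. The rows below are made up, and the parse_dates and dtype options are assumptions added for this sketch; the article loads the file with default settings.

```python
import io

import pandas as pd

# Hypothetical rows in the same six-column layout; the numbers are made up.
csv_text = """date,county,state,fips,cases,deaths
2020-03-01,Alpha,StateA,01001,1,0
2020-03-02,Alpha,StateA,01001,3,0
2020-03-01,Beta,StateA,01003,2,0
2020-03-02,Beta,StateA,01003,2,1
2020-03-02,Gamma,StateB,02001,5,0
"""

sample = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["date"],   # turn the date strings into timestamps
    dtype={"fips": str},    # keep leading zeros in the county code
)

print(list(sample.columns))
print(sample.shape)
print(sample["date"].min(), sample["date"].max())
```

Note that each county appears once per date, which matters later when the data is aggregated.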

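Bar chart

Let's first draw a bar chart showing the mean number of cases for each state. To do so, we need to carry out the following steps:

Group the data by state, extract the case counts for each state, and then compute the mean for each state.

  mean_per_state = data.groupby('state')['cases'].mean()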
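To see what this one-liner does, here is a minimal sketch on a hypothetical miniature dataset (the state names and counts are made up). The groupby call splits the rows by state, the column selection keeps only cases, and mean averages each group.

```python
import pandas as pd

# Hypothetical miniature dataset; the values are made up for illustration.
toy = pd.DataFrame({
    "state": ["StateA", "StateA", "StateB", "StateB", "StateB"],
    "cases": [10, 30, 5, 10, 15],
})

# Split rows by state, select the cases column, and average each group.
mean_per_state = toy.groupby("state")["cases"].mean()

for state, mean_cases in mean_per_state.items():
    print(state, mean_cases)
# StateA 20.0
# StateB 10.0
```

The result is a Series indexed by state, which is why it can be iterated with items() when adding series to a chart.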

Build the series and add them to the bar chart.

  barChart = pygal.Bar(height=400)
  # Add one series per state
  for state, mean_cases in mean_per_state.items():
      barChart.add(state, mean_cases)
  display(HTML(base_html.format(rendered_chart=barChart.render(is_unicode=True))))

And there we have a bar chart. We can remove a series by deselecting it in the legend and add it back by selecting it again.

Full code for the bar chart

  # Import the needed libraries
  # (display, HTML and base_html come from the notebook setup above)
  import pygal
  import pandas as pd
  # Parse the dataframe
  data = pd.read_csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv")
  # Get the mean number of cases per state
  mean_per_state = data.groupby('state')['cases'].mean()
  # Draw the bar chart, one series per state
  barChart = pygal.Bar(height=400)
  for state, mean_cases in mean_per_state.items():
      barChart.add(state, mean_cases)
  display(HTML(base_html.format(rendered_chart=barChart.render(is_unicode=True))))

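Treemap

A bar chart helps show the data as a whole, but if we want more detail we can choose a different type of chart, the treemap. Treemaps are useful for showing categories within data. For example, our dataset contains the number of cases for each county in each state. The bar chart shows the mean for each state, but not how cases are distributed across the counties of each state. One way to show that is with a treemap.

Suppose we want to see the detailed case distribution for the 10 states with the most cases. Before plotting, we first need to manipulate the data.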

We need to sort the data by cases and then group it by state.

  sort_by_cases = data.sort_values(by=['cases'], ascending=False).groupby(['state'])['cases'].apply(list)

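Use the sorted lists to get the 10 states with the most cases.

  # groupby returns states alphabetically, so rank them by their largest count
  top_10_states = sort_by_cases.loc[sort_by_cases.map(max).nlargest(10).index]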
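One detail worth checking here is the order of the grouped result: groupby returns its groups sorted by key, so the states come back alphabetically even though the rows were sorted by cases first. The sketch below, on made-up data, shows that ordering and one possible way to rank the states explicitly by their largest count; the ranking rule is an assumption chosen for illustration.

```python
import pandas as pd

# Hypothetical data; the counts are made up for illustration.
toy = pd.DataFrame({
    "state": ["Beta", "Alpha", "Gamma", "Alpha", "Gamma"],
    "cases": [50, 7, 900, 3, 40],
})

# Same pipeline as above: sort by cases, then collect each state's counts.
sort_by_cases = (
    toy.sort_values(by=["cases"], ascending=False)
       .groupby(["state"])["cases"]
       .apply(list)
)
print(list(sort_by_cases.index))  # ['Alpha', 'Beta', 'Gamma'] (alphabetical)

# Rank states by their largest count and keep the top 2.
top_states = sort_by_cases.map(max).nlargest(2).index
top_2 = sort_by_cases.loc[top_states]
print(list(top_2.index))  # ['Gamma', 'Beta']
```

A plain slice such as sort_by_cases[:2] would have returned Alpha and Beta here, the two states with the smallest counts.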

Use this subset to create our treemap.

  treemap = pygal.Treemap(height=400)
  # Add each state's 10 largest values as one series
  for state, cases in top_10_states.items():
      treemap.add(state, cases[:10])
  display(HTML(base_html.format(rendered_chart=treemap.render(is_unicode=True))))

However, this treemap is not labeled, so when we hover over a block we cannot see the county name; every county block in a state shows only the name of that state. To avoid this and add county names to our treemap, we need to label the data we pass to the chart.

  # Full code for the unlabeled treemap
  # Import the needed libraries
  import pygal
  import pandas as pd
  # Parse the dataframe
  data = pd.read_csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv")
  # Sort the cases and collect them by state
  sort_by_cases = data.sort_values(by=['cases'], ascending=False).groupby(['state'])['cases'].apply(list)
  # Get the top 10 states with the highest number of cases
  # (groupby orders states alphabetically, so rank them by their largest count)
  top_10_states = sort_by_cases.loc[sort_by_cases.map(max).nlargest(10).index]
  # Draw the treemap
  treemap = pygal.Treemap(height=400)
  for state, cases in top_10_states.items():
      treemap.add(state, cases[:10])
  display(HTML(base_html.format(rendered_chart=treemap.render(is_unicode=True))))

Before that, note that our data is updated every day, so each county appears many times. The counts are cumulative, and because we care about the total number of cases per county, we need to clean up the data before adding it to the treemap.

  # Get the cases by county for all states
  cases_by_county = data.sort_values(by=['cases'], ascending=False).groupby(['state']).apply(
      lambda x: [{"value": l, "label": c} for l, c in zip(x['cases'], x['county'])])
  # Keep the 10 states with the largest counts (groupby orders states alphabetically)
  top_states = cases_by_county.map(lambda items: items[0]['value']).nlargest(10).index
  cases_by_county = cases_by_county.loc[top_states]
  # Create a new dictionary that contains the cleaned-up version of the data
  clean_dict = {}
  start_dict = cases_by_county.to_dict()
  for key in start_dict.keys():
      values = []
      labels = []
      county = []
      for item in start_dict[key]:
          # Counts are cumulative and sorted in descending order, so the first
          # entry seen for a county is already its total; skip later repeats
          if item['label'] not in labels:
              labels.append(item['label'])
              values.append(item['value'])
      for l, v in zip(labels, values):
          county.append({'value': v, 'label': l})
      clean_dict[key] = county
  # Convert the data to a Pandas series to add it to the treemap
  new_series = pd.Series(clean_dict)
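The clean-up idea can be seen in isolation on a few toy rows. Since the counts in this dataset are cumulative, one way to collapse the repeated rows is to keep the largest value seen for each county and then build the value/label dictionaries from the result. The rows and county names below are hypothetical, and this is a sketch of the idea using plain Python rather than the exact code above.

```python
# Hypothetical (county, cumulative cases) rows for one state, one row per day.
rows = [
    ("Alpha", 3), ("Beta", 2), ("Alpha", 7),
    ("Beta", 4), ("Gamma", 1), ("Alpha", 9),
]

# Cumulative counts never need adding up: the largest value seen for a
# county is its running total.
latest = {}
for county, cases in rows:
    latest[county] = max(cases, latest.get(county, 0))

# Build the labelled values in descending order of cases.
labelled = [
    {"value": cases, "label": county}
    for county, cases in sorted(latest.items(), key=lambda kv: kv[1], reverse=True)
]
print(labelled)
```

Each dictionary carries one block's size in value and its hover text in label, which is the form the chart receives in the next step.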

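We can then add the series to the treemap and draw its labeled version.

  treemap = pygal.Treemap(height=200)
  # Series.iteritems() was removed in pandas 2.0; items() is the replacement
  for state, counties in new_series.items():
      treemap.add(state, counties[:10])
  display(HTML(base_html.format(rendered_chart=treemap.render(is_unicode=True))))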

Great! Now our treemap is labeled. Hovering over a block shows the county name, the state, and the number of cases in that county.

Full code

  # Import the needed libraries
  import pygal
  import pandas as pd
  # Parse the dataframe
  data = pd.read_csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv")
  # Get the cases by county for all states
  cases_by_county = data.sort_values(by=['cases'], ascending=False).groupby(['state']).apply(
      lambda x: [{"value": l, "label": c} for l, c in zip(x['cases'], x['county'])])
  # Keep the 10 states with the largest counts (groupby orders states alphabetically)
  top_states = cases_by_county.map(lambda items: items[0]['value']).nlargest(10).index
  cases_by_county = cases_by_county.loc[top_states]
  # Create a new dictionary that contains the cleaned-up version of the data
  clean_dict = {}
  start_dict = cases_by_county.to_dict()
  for key in start_dict.keys():
      values = []
      labels = []
      county = []
      for item in start_dict[key]:
          # Counts are cumulative and sorted in descending order, so the first
          # entry seen for a county is already its total; skip later repeats
          if item['label'] not in labels:
              labels.append(item['label'])
              values.append(item['value'])
      for l, v in zip(labels, values):
          county.append({'value': v, 'label': l})
      clean_dict[key] = county
  # Convert the data to a Pandas series to add it to the treemap
  new_series = pd.Series(clean_dict)
  # Draw the treemap
  treemap = pygal.Treemap(height=200)
  for state, counties in new_series.items():
      treemap.add(state, counties[:10])
  display(HTML(base_html.format(rendered_chart=treemap.render(is_unicode=True))))

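Pie chart

We can present this information in another form: a pie chart of the 10 states with the most cases. With a pie chart, we can see each state's case count as a percentage relative to the other states.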
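The percentage behind each slice is simply that value divided by the sum of all values. A minimal sketch of the arithmetic, using hypothetical state totals that are not taken from the dataset:

```python
# Hypothetical state totals; the numbers are made up for illustration.
state_totals = {"StateA": 500, "StateB": 300, "StateC": 200}

grand_total = sum(state_totals.values())

# Each slice's share of the pie is its value divided by the sum of all values.
shares = {
    state: 100 * total / grand_total
    for state, total in state_totals.items()
}

for state, pct in shares.items():
    print(f"{state}: {pct:.1f}%")
# StateA: 50.0%
# StateB: 30.0%
# StateC: 20.0%
```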

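Since we have already done all the dataframe manipulation, we can use it to create the pie chart right away.

  pi_chart = pygal.Pie(height=400)
  top_states = sort_by_cases.map(max).nlargest(10).index  # groupby orders states alphabetically
  first10 = list(sort_by_cases.loc[top_states].items())
  for state, cases in first10:
      pi_chart.add(state, cases)
  display(HTML(base_html.format(rendered_chart=pi_chart.render(is_unicode=True))))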

Full code for the pie chart

  # Import the needed libraries
  import pygal
  import pandas as pd
  # Parse the dataframe
  data = pd.read_csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv")
  # Sort the cases and collect them by state
  sort_by_cases = data.sort_values(by=['cases'], ascending=False).groupby(['state'])['cases'].apply(list)
  # Draw the pie chart
  pi_chart = pygal.Pie(height=400)
  # Get the top 10 states (groupby orders states alphabetically, so rank them by their largest count)
  top_states = sort_by_cases.map(max).nlargest(10).index
  first10 = list(sort_by_cases.loc[top_states].items())
  for state, cases in first10:
      pi_chart.add(state, cases)
  display(HTML(base_html.format(rendered_chart=pi_chart.render(is_unicode=True))))
