matplotlib: how to prevent x-axis labels from overlapping


Question


      I'm generating a bar chart with matplotlib. It all works well, but I can't figure out how to prevent the x-axis labels from overlapping each other. Here is an example:

      Here is some sample SQL for a postgres 9.1 database:

      drop table if exists mytable;
      create table mytable(id bigint, version smallint, date_from timestamp without time zone);
      insert into mytable(id, version, date_from) values
      
      ('4084036', '1', '2006-12-22 22:46:35'),
      ('4084938', '1', '2006-12-23 16:19:13'),
      ('4084938', '2', '2006-12-23 16:20:23'),
      ('4084939', '1', '2006-12-23 16:29:14'),
      ('4084954', '1', '2006-12-23 16:28:28'),
      ('4250653', '1', '2007-02-12 21:58:53'),
      ('4250657', '1', '2007-03-12 21:58:53')
      ;  
      

      And this is my Python script:

      # -*- coding: utf-8 -*-
      #!/usr/bin/python2.7
      import psycopg2
      import matplotlib.pyplot as plt
      fig = plt.figure()
      
      # for savefig()
      import pylab
      
      ###
      ### Connect to database with psycopg2
      ###
      
      try:
        conn_string="dbname='x' user='y' host='z' password='pw'"
        print "Connecting to database\n->%s" % (conn_string)
      
        conn = psycopg2.connect(conn_string)
        print "Connection to database was established successfully"
      except:
        print "Connection to database failed"
      
      ###
      ### Execute SQL query
      ###  
      
      # New cursor method for sql
      cur = conn.cursor()
      
      # Execute SQL query. For a statement spanning more than one line, use a triple-quoted string
      try:
        cur.execute(""" 
      
      -- In which year/month have these points been created?
      -- Need 'yyyymm' because I only need months with years (values are summed up). Without it, the query returns every day the db has an entry.
      
      SELECT to_char(s.day,'yyyymm') AS month
            ,count(t.id)::int AS count
      FROM  (
         SELECT generate_series(min(date_from)::date
                               ,max(date_from)::date
                               ,interval '1 day'
                )::date AS day
         FROM   mytable t
         ) s
      LEFT   JOIN mytable t ON t.date_from::date = s.day
      GROUP  BY month
      ORDER  BY month;
      
        """)
      
      # Return the results of the query. Fetchall() =  all rows, fetchone() = first row
        records = cur.fetchall()
        cur.close()
      
      except:
        print "Query could not be executed"
      
      # Unzip the data from the db-query. Order is the same as db-query output
      year, count = zip(*records)
      
      ###
      ### Plot (Barchart)
      ###
      
      # Draw one bar per record: x positions are 0..len(count)-1, heights are the counts; the label feeds the legend
      plt.bar(range(len(count)), count, align='center', label='Amount of created/edited points')
      
      # Add database-values to the plot with an offset of 10px/10px
      ax = fig.add_subplot(111)
      for i,j in zip(year,count):
          ax.annotate(str(j), xy=(i,j), xytext=(10,10), textcoords='offset points')
      
      # Rotate x-labels on the x-axis
      fig.autofmt_xdate()
      
      # Label-values for x and y axis
      plt.xticks(range(len(count)), (year))
      
      # Label x and y axis
      plt.xlabel('Year')
      plt.ylabel('Amount of created/edited points')
      
      # Locate legend on the plot (http://matplotlib.org/users/legend_guide.html#legend-location)
      plt.legend(loc=1)
      
      # Plot-title
      plt.title("Amount of created/edited points over time")
      
      # show plot
      pylab.show()
      

      Is there a way to prevent the labels from overlapping each other? Ideally in an automatic way, because I can't predict the number of bars.

      Solution

      • The issue in the question is that the dates are formatted as strings. matplotlib draws every string value as its own tick label and places the ticks at 0-indexed integer positions, one per value, so the labels crowd together as the number of values grows.
      • The resolution is to convert all values to the correct type, datetime in this case.
        • Once the axes hold values of the correct type, additional matplotlib methods (the date locators and formatters in matplotlib.dates) can be used to further customize the tick spacing.
      • The answers to "What is plotted when string data is passed to the matplotlib API?" explain in more detail what happens when string values are passed to matplotlib.
      • As of 2014-09-30, pandas has a read_sql function with a parse_dates parameter. You definitely want to use that instead of converting date strings by hand.
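
As an illustration of the tick-spacing point above, here is a minimal sketch of a bar chart on a real date axis. It assumes the monthly counts that the sample rows would produce (5, 0, 1, 1), uses the first day of each month as the x value, and picks `AutoDateLocator` with `ConciseDateFormatter` as one possible locator/formatter pair; the `minticks`/`maxticks` values and the 20-day bar width are arbitrary choices, not part of the original answer.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Assumed data: one bar per month, keyed by the first day of that month.
months = [dt.datetime(2006, 12, 1), dt.datetime(2007, 1, 1),
          dt.datetime(2007, 2, 1), dt.datetime(2007, 3, 1)]
counts = [5, 0, 1, 1]

fig, ax = plt.subplots()
# On a date axis the bar width is a time span, here 20 days.
ax.bar(months, counts, width=dt.timedelta(days=20), align="center")

# The locator chooses tick positions from the visible date range,
# independent of how many bars are drawn.
locator = mdates.AutoDateLocator(minticks=3, maxticks=8)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

fig.autofmt_xdate()  # rotate and right-align the date labels
fig.canvas.draw()    # render once; use fig.savefig(...) to write a file
```

Because the ticks come from the date range rather than from one label per bar, adding more months should not add more labels in proportion.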

      Original Answer

      Here's how you should convert your date string into real datetime objects:

      import numpy as np
      import matplotlib.pyplot as plt
      import matplotlib.dates as mdates
      data_tuples = [
          ('4084036', '1', '2006-12-22 22:46:35'),
          ('4084938', '1', '2006-12-23 16:19:13'),
          ('4084938', '2', '2006-12-23 16:20:23'),
          ('4084939', '1', '2006-12-23 16:29:14'),
          ('4084954', '1', '2006-12-23 16:28:28'),
          ('4250653', '1', '2007-02-12 21:58:53'),
          ('4250657', '1', '2007-03-12 21:58:53')]
      datatypes = [('col1', 'i4'), ('col2', 'i4'), ('date', 'S20')]
      data = np.array(data_tuples, dtype=datatypes)
      col1 = data['col1']
      
      # convert the dates to a datetime type
      dates = mdates.num2date(mdates.datestr2num(data['date']))
      fig, ax1 = plt.subplots()
      ax1.bar(dates, col1)
      fig.autofmt_xdate()
      

      Getting a simple list of tuples out of your database cursor should be as simple as...

      data_tuples = []
      for row in cursor:
          data_tuples.append(row)
      
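
For a DB-API cursor, the loop above gives the same list as `cursor.fetchall()`. The sketch below shows this end to end; it uses an in-memory `sqlite3` database as a stand-in for the PostgreSQL connection (an assumption, so that it runs without a server), loads three of the sample rows, and then parses the date strings with `datetime.strptime`.

```python
import sqlite3
from datetime import datetime

# sqlite3 stands in for the psycopg2 connection here (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable(id integer, version integer, date_from text)")
conn.executemany(
    "insert into mytable(id, version, date_from) values (?, ?, ?)",
    [
        (4084036, 1, "2006-12-22 22:46:35"),
        (4250653, 1, "2007-02-12 21:58:53"),
        (4250657, 1, "2007-03-12 21:58:53"),
    ],
)

cursor = conn.execute(
    "select id, version, date_from from mytable order by date_from"
)
data_tuples = cursor.fetchall()  # same rows as appending inside a for loop
conn.close()

# Convert the third column from text to real datetime objects.
dates = [datetime.strptime(row[2], "%Y-%m-%d %H:%M:%S") for row in data_tuples]
```

With psycopg2, a `timestamp` column typically already arrives as a `datetime` object, so the parsing step is only needed when the query returns text, as the `to_char` call in the question does.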

      However, I posted a function that I use to turn db cursors directly into record arrays or pandas DataFrames in the answers to "How to convert SQL Query result to PANDAS Data Structure?"
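
A rough sketch of the pandas route mentioned in the update, assuming pandas is installed: `read_sql` with `parse_dates` returns the date column as datetimes directly. The in-memory `sqlite3` database is again a stand-in for the PostgreSQL connection, and grouping by monthly period is just one way to get per-month counts; neither comes from the original answer.

```python
import sqlite3

import pandas as pd

# sqlite3 stands in for the PostgreSQL connection (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable(id integer, version integer, date_from text)")
conn.executemany(
    "insert into mytable(id, version, date_from) values (?, ?, ?)",
    [
        (4084036, 1, "2006-12-22 22:46:35"),
        (4084938, 1, "2006-12-23 16:19:13"),
        (4084938, 2, "2006-12-23 16:20:23"),
        (4084939, 1, "2006-12-23 16:29:14"),
        (4084954, 1, "2006-12-23 16:28:28"),
        (4250653, 1, "2007-02-12 21:58:53"),
        (4250657, 1, "2007-03-12 21:58:53"),
    ],
)

# parse_dates converts the named column to datetime64 while loading.
df = pd.read_sql(
    "select id, version, date_from from mytable",
    conn,
    parse_dates=["date_from"],
)
conn.close()

# Count rows per calendar month; months without rows are simply absent.
per_month = df.groupby(df["date_from"].dt.to_period("M")).size()
```

The resulting `per_month` series has a period index, which can be converted with `per_month.index.to_timestamp()` before plotting on a date axis.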

      Hopefully that helps too.
