Read specific columns with pandas or other python module

Views: 807 Published: 2017/2/24 20:50:04 Tags: python csv pandas

Problem description

I have a csv file from this webpage. I want to read some of the columns in the downloaded file (the csv version can be downloaded in the upper right corner).

Let's say I want 2 columns:

59 which in the header is star_name

60 which in the header is ra.

However, for some reason the authors of the webpage sometimes decide to move the columns around.

In the end I want something like this, keeping in mind that values can be missing.
data = # read data in a clever way
names = data['star_name']
ras = data['ra']
This would keep my program from malfunctioning when the columns are moved again in the future, as long as the names stay the same.

So far I have tried various approaches using the csv module and, more recently, the pandas module, both without any luck.

EDIT (added two lines + the header of my datafile. Sorry, but it's extremely long.)
# name, mass, mass_error_min, mass_error_max, radius, radius_error_min, radius_error_max, orbital_period, orbital_period_err_min, orbital_period_err_max, semi_major_axis, semi_major_axis_error_min, semi_major_axis_error_max, eccentricity, eccentricity_error_min, eccentricity_error_max, angular_distance, inclination, inclination_error_min, inclination_error_max, tzero_tr, tzero_tr_error_min, tzero_tr_error_max, tzero_tr_sec, tzero_tr_sec_error_min, tzero_tr_sec_error_max, lambda_angle, lambda_angle_error_min, lambda_angle_error_max, impact_parameter, impact_parameter_error_min, impact_parameter_error_max, tzero_vr, tzero_vr_error_min, tzero_vr_error_max, K, K_error_min, K_error_max, temp_calculated, temp_measured, hot_point_lon, albedo, albedo_error_min, albedo_error_max, log_g, publication_status, discovered, updated, omega, omega_error_min, omega_error_max, tperi, tperi_error_min, tperi_error_max, detection_type, mass_detection_type, radius_detection_type, alternate_names, molecules, star_name, ra, dec, mag_v, mag_i, mag_j, mag_h, mag_k, star_distance, star_metallicity, star_mass, star_radius, star_sp_type, star_age, star_teff, star_detected_disc, star_magnetic_field
11 Com b,19.4,1.5,1.5,,,,326.03,0.32,0.32,1.29,0.05,0.05,0.231,0.005,0.005,0.011664,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,2008,2011-12-23,94.8,1.5,1.5,2452899.6,1.6,1.6,Radial Velocity,,,,,11 Com,185.1791667,17.7927778,4.74,,,,,110.6,-0.35,2.7,19.0,G8 III,,4742.0,,
11 UMi b,10.5,2.47,2.47,,,,516.22,3.25,3.25,1.54,0.07,0.07,0.08,0.03,0.03,0.012887,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,2009,2009-08-13,117.63,21.06,21.06,2452861.05,2.06,2.06,Radial Velocity,,,,,11 UMi,229.275,71.8238889,5.02,,,,,119.5,0.04,1.8,24.08,K4III,1.56,4340.0,,

Solution
An easy way to do this is using the pandas library like this.
import pandas as pd

fields = ['star_name', 'ra']

df = pd.read_csv('data.csv', skipinitialspace=True, usecols=fields)

# See the keys
print(df.keys())
# See content in 'star_name'
print(df.star_name)
The problem was the space after each comma in the header. With skipinitialspace=True those spaces are dropped, so ' star_name' becomes 'star_name' and can be selected by name.
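
To make the effect of skipinitialspace easier to see, here is a minimal sketch that runs on a small in-memory sample instead of the real download. The SAMPLE text is a made-up miniature of the file (same header style, columns deliberately in a different order, one ra value left empty), so it is an assumption for illustration, not the actual data. The last part shows what is probably the closest equivalent with the standard-library csv module, whose DictReader passes skipinitialspace through to the underlying reader.

```python
import csv
import io

import pandas as pd

# Hypothetical miniature of the file: a space after each comma in the header,
# columns in a different order than the real file, and one missing 'ra'.
SAMPLE = (
    "# name, ra, star_name, dec\n"
    "11 Com b,185.1791667,11 Com,17.7927778\n"
    "11 UMi b,,11 UMi,71.8238889\n"
)

# Without skipinitialspace the header names keep their leading space,
# so a lookup by 'star_name' would not match ' star_name'.
raw = pd.read_csv(io.StringIO(SAMPLE))
print(list(raw.columns))

# With skipinitialspace=True the columns can be selected by name,
# wherever they happen to sit in the file.
fields = ['star_name', 'ra']
df = pd.read_csv(io.StringIO(SAMPLE), skipinitialspace=True, usecols=fields)
names = df['star_name']
ras = df['ra']  # the empty field is read as NaN
print(names.tolist())
print(ras.isna().tolist())

# The same by-name lookup with the standard-library csv module.
# Here an empty field comes back as an empty string rather than NaN.
reader = csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True)
rows = [(row['star_name'], row['ra']) for row in reader]
print(rows)
```

Because both readers look columns up by header name, moving the columns around in the file should not matter as long as the names themselves stay the same. Note that usecols only selects columns; the resulting DataFrame keeps the file's column order, not the order of the fields list.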
