How to split the sublists out of a list and add the data to another list
Question
I need to create one more dataframe whose column has the following format:
df_codata["Latlong"]= [{"__type":"GeoPoint","latitude":x,"longitude":y}]
As seen in the image, "Latlong" contains a "latitude" key whose value is a list of sublists, so a single "C_ID" has multiple sublists under "latitude". I need a dataframe in which the first number of each sublist goes into "latitude" of the df_codata dataframe and the second number of the sublist goes into "longitude". The desired output is attached below. Please advise: I am unable to split the sublists and insert the values into the two subfields "latitude" and "longitude".
Recommended answer
You can preprocess the data to get your desired output. Here is one possible solution:
import numpy as np
import pandas as pd


def extend_coordinates(coordinates, c_id, geo_type):
    # Build one [C_ID, __type, latitude, longitude] row per sublist
    result = []
    for elem in coordinates:
        if len(elem) == 2:
            # normal case
            latitude, longitude = elem
        else:
            # malformed sublist: keep the row, leave the coordinates empty
            latitude, longitude = [None, None]
        result.append([c_id, geo_type, latitude, longitude])
    return result


data_json = [
    {
        'C_ID': '1',
        'Latlong': {
            '__type': 'GeoPoint',
            'latitude': [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
        }
    },
    {
        'C_ID': '2',
        'Latlong': {
            '__type': 'GeoPoint',
            'latitude': [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
        }
    },
    {
        'C_ID': '3',
        'Latlong': {
            '__type': 'GeoPoint',
            'latitude': [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
        }
    }]

# data_json is already a list of records, so it can be passed to DataFrame directly
data = pd.DataFrame(data_json)
# the default is a list holding one empty pair, so a missing key still yields one row
data['Common'] = data.apply(lambda x: extend_coordinates(coordinates=x['Latlong'].get('latitude', [[None, None]]),
                                                         c_id=x['C_ID'],
                                                         geo_type=x['Latlong'].get('__type', None)), axis=1)
# np.concatenate stacks the per-row lists; the mixed values are coerced to a common type (strings here)
data_ext = pd.DataFrame(np.concatenate(data['Common']),
                        columns=['C_ID', '__type', 'latitude', 'longitude'])
# if data_ext dataframe is not enough, you can combine data to your desired output
data_ext['Latlong'] = data_ext.apply(lambda x: {'__type': x['__type'],
                                                'latitude': x['latitude'],
                                                'longitude': x['longitude']}, axis=1)
result = data_ext[['C_ID', 'Latlong']]
del data_ext, data
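As a point of comparison, the same flattening can be sketched in plain Python before any dataframe is built. This is only an illustrative alternative, not the answer's method: the helper name flatten_records and the two-record sample are assumptions made for the example, and malformed sublists are turned into empty coordinates in the same way as in the answer above.

```python
def flatten_records(records):
    """Turn each [latitude, longitude] sublist into its own row, repeating the C_ID."""
    rows = []
    for record in records:
        latlong = record.get('Latlong', {})
        geo_type = latlong.get('__type')
        for pair in latlong.get('latitude', []):
            # A sublist that is not a pair becomes a row with empty coordinates.
            latitude, longitude = pair if len(pair) == 2 else (None, None)
            rows.append({
                'C_ID': record['C_ID'],
                'Latlong': {'__type': geo_type,
                            'latitude': latitude,
                            'longitude': longitude},
            })
    return rows


# Hypothetical sample data in the same shape as data_json above.
sample = [
    {'C_ID': '1', 'Latlong': {'__type': 'GeoPoint', 'latitude': [[1, 2], [3, 4]]}},
    {'C_ID': '2', 'Latlong': {'__type': 'GeoPoint', 'latitude': [[5, 6], [7]]}},
]
rows = flatten_records(sample)
```

Because the values never pass through a NumPy array, the numbers keep their original type. The resulting list of dictionaries can be handed to pd.DataFrame(rows) to get a dataframe with the "C_ID" and "Latlong" columns.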
Update:
To keep latitude and longitude as integers, you can do the following (assuming that every row has a non-null longitude and latitude and that the values are valid integers):
data_ext['Latlong'] = data_ext.apply(lambda x: {'__type': x['__type'],
                                                'latitude': int(x['latitude']),
                                                'longitude': int(x['longitude'])}, axis=1)
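If the non-null assumption does not hold for your data, the int() call above will raise an error. One possible guard is sketched below; the helper names to_int_or_none and build_latlong are hypothetical and not part of the original answer.

```python
def to_int_or_none(value):
    """Convert value to int, returning None when it is missing or not an integer."""
    if value is None:
        return None
    try:
        return int(value)
    except (TypeError, ValueError):
        # e.g. an empty string, 'abc' or NaN
        return None


def build_latlong(row):
    """Build the Latlong dictionary for one row (any mapping with these keys)."""
    return {'__type': row['__type'],
            'latitude': to_int_or_none(row['latitude']),
            'longitude': to_int_or_none(row['longitude'])}


# Hypothetical rows: one with string coordinates, one with a missing value.
good = build_latlong({'__type': 'GeoPoint', 'latitude': '1', 'longitude': '2'})
partial = build_latlong({'__type': 'GeoPoint', 'latitude': None, 'longitude': '4'})
```

With a dataframe such as data_ext, the helper could be applied row by row with data_ext.apply(build_latlong, axis=1).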