How to encode XML into ESRI Shapefiles using Python?
Problem description
I prepared the following Python script to import this XML into ESRI Shapefiles. The starting point for the script was this post.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Requires pyshp: https://pypi.python.org/pypi/pyshp
#
# Conversion for http://daten.berlin.de/datensaetze/liste-der-gedenktafeln-berlin
# File: http://gedenktafeln-in-berlin.de/index.php?id=31&type=123
#
from xml.etree import ElementTree
from datetime import datetime
import shapefile
import os
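
def get_value(elements, index, default):
    # Return the UTF-8 encoded text of the element at `index`,
    # or `default` if the element or its text is missing.
    value = elements[index]
    if value is None:
        value = default
    else:
        value = value.text
        if value is None:
            value = default
        else:
            # value = value.replace(u'\xdf', u' ')
            value = value.encode("utf-8")
    return value
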
def add_shape(writer, attributes):
    uid = int(get_value(attributes, 0, 0))
    url = get_value(attributes, 1, "")
    tstamp = get_value(attributes, 2, None)
    if tstamp is not None:
        tstamp = datetime.strptime(tstamp, '%d.%m.%Y')
    ortsteil = get_value(attributes, 3, "")
    strasse = get_value(attributes, 4, "")
    longitude = get_value(attributes, 5, None)
    latitude = get_value(attributes, 6, None)
    Name = get_value(attributes, 7, "")
    inhalt = get_value(attributes, 8, "")
    erlauterung = get_value(attributes, 9, "")
    swo = get_value(attributes, 10, "")
    literatur = get_value(attributes, 11, "")
    personen = get_value(attributes, 12, "")
    entfernt = int(get_value(attributes, 13, 0))
    # Both coordinates are required; entries without them are skipped
    if longitude is not None and latitude is not None:
        longitude = float(longitude)
        latitude = float(latitude)
        # Fix interchanged coordinates
        # (around Berlin, longitude is about 13 and latitude about 52)
        if longitude > latitude:
            longitude, latitude = latitude, longitude
        # Add coordinates
        writer.point(longitude, latitude)
        # Add attributes
        writer.record(uid, url, tstamp, ortsteil, strasse, Name, inhalt,
                      erlauterung, swo, literatur, personen, entfernt)

xml_file = 'gedenktafeln.xml'
shape_file = 'gedenktafeln.shp'
projection = 'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]'

tree = ElementTree.parse(xml_file)
writer = shapefile.Writer(shapefile.POINT)
writer.field('uid', fieldType = 'N', size = 5, decimal = 0)
writer.field('url', fieldType = 'C', size = 255)
writer.field('tstamp', fieldType = 'C', size = 19) # Type 'D' seems to be not working here.
writer.field('ortsteil', fieldType = 'C', size = 200)
writer.field('strasse', fieldType = 'C', size = 200)
writer.field('Name', fieldType = 'C', size = 255)
writer.field('inhalt', fieldType = 'C', size = 255)
writer.field('erlauterung', fieldType = 'C', size = 255)
writer.field('swo', fieldType = 'C', size = 255)
writer.field('literatur', fieldType = 'C', size = 255)
writer.field('personen', fieldType = 'C', size = 255)
writer.field('entfernt', fieldType = 'N', size = 1, decimal = 0)

# Each child of the root is one entry; its children are the attributes
root = tree.getroot()
shapes = list(root)
for shape in shapes:
    attributes = list(shape)
    add_shape(writer, attributes)

try:
    writer.save(shape_file)
except Exception, e:
    # The per-record variables are local to add_shape(), so they cannot
    # be printed here; report the error itself instead.
    print "Could not save " + shape_file + ": " + str(e)
    raise

# create the PRJ file
with open(os.path.splitext(shape_file)[0] + os.extsep + 'prj', 'w') as prj:
    prj.write(projection)

Remaining issues:
- Special characters are not encoded as expected. I am not convinced that value = value.encode("utf-8") is correct - please comment on this. // Thanks to ptrv
- The url is cut off. // Resolved by blah238
- Coordinates are not in Berlin but in Potsdam. This might be a mistake in the original data, or I am using the wrong projection settings.
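One quick way to narrow down the third issue is to test each point against a rough bounding box for Berlin before writing it. The sketch below uses Python 3 syntax. The helper name in_berlin and the box limits are assumptions made for illustration (approximate values, not an official boundary).

```python
# Approximate bounding box of Berlin in WGS84 degrees (assumed values,
# good enough for a sanity check, not an official boundary).
BERLIN_BBOX = (13.08, 52.33, 13.77, 52.68)  # min_lon, min_lat, max_lon, max_lat


def in_berlin(longitude, latitude, bbox=BERLIN_BBOX):
    """Return True if the point lies inside the rough Berlin bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= longitude <= max_lon and min_lat <= latitude <= max_lat


if __name__ == "__main__":
    print(in_berlin(13.3777, 52.5163))  # a point in central Berlin
    print(in_berlin(52.5163, 13.3777))  # the same point with swapped axes
```

Counting how many records fail this check, before and after the coordinate swap, should indicate whether the source data or the axis handling is at fault.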
In order to inspect the data I converted them into GeoJSON so you can view them online.
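The question does not say how the GeoJSON was produced. As one possible way, here is a minimal sketch in Python 3 using only the standard library. The function names and the sample row are hypothetical. Note that GeoJSON stores coordinates in longitude, latitude order.

```python
import json


def to_feature(longitude, latitude, properties):
    """Build a GeoJSON Feature for one point (GeoJSON order is lon, lat)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": properties,
    }


def to_feature_collection(rows):
    """rows: iterable of (longitude, latitude, properties) tuples."""
    return {
        "type": "FeatureCollection",
        "features": [to_feature(lon, lat, props) for lon, lat, props in rows],
    }


if __name__ == "__main__":
    # Hypothetical sample row, not taken from the real data set
    rows = [(13.3777, 52.5163, {"uid": 1, "strasse": "Beispielstraße"})]
    # ensure_ascii=False keeps characters such as ß readable in the output
    print(json.dumps(to_feature_collection(rows), ensure_ascii=False, indent=2))
```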
Recommended answer
For the first issue, try removing the line value = value.replace(u'\xdf', u' ')
This worked for me, and I could successfully generate the shapefile with attributes containing special characters.
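To see why that line matters: it swaps every ß (U+00DF) for a space before the text is encoded, so the character never reaches the attribute table. A small sketch in Python 3 syntax, with a made-up sample string:

```python
text = u"Stra\xdfe"  # "Straße", a hypothetical sample value

# The line in question replaces ß with a space.
replaced = text.replace(u"\xdf", u" ")

# Without it, ß survives and becomes two bytes in UTF-8.
encoded = text.encode("utf-8")

print(repr(replaced))
print(repr(encoded))
print(len(text), len(encoded))
```

Because ß takes two bytes in UTF-8, an encoded value can be longer than its character count, which is worth keeping in mind for the fixed-width 'C' fields.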