Convert a scraped string containing a comma into an integer using Python


Question

I'm trying to scrape the follower count with Selenium, but I get a ValueError even though the scraped text is clearly a number:

Code I tried:

follower_count = int(browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[2]/a/span[1]/span').text)
following_count = int(browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[1]/a/span[1]/span').text)

Answer

The extracted text, 1,961, contains a comma (,) as a thousands separator, so int() cannot parse it directly.
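
The failure can be reproduced without a browser. This is a minimal sketch that hard-codes the scraped text from the question. The variable name error_message is only illustrative.

```python
# The scraped text from the question, hard-coded so no browser is needed.
count = "1,961"

try:
    int(count)
except ValueError as exc:
    # int() does not accept a comma between digits, so this branch runs.
    error_message = str(exc)
    print(error_message)
```

The printed message names the offending literal, which is why the value in the traceback looks like a perfectly good number at first glance.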


Solution

First remove the comma from the text 1,961 with replace(), then call int() on the result:

  • Code Block:

    # count = browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[2]/a/span[1]/span').text
    count = "1,961"
    number = int(count.replace(",", ""))
    print(number)
    print(type(number))


  • Console Output:

    1961
    <class 'int'>
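
If the page might render the count with other separators (for example a period or a space), one alternative to replace() is to drop every non-digit character with a regular expression. This is a different technique from the one in the answer, sketched under the assumption that the text is always a plain non-negative whole number. The helper name digits_only is hypothetical.

```python
import re


def digits_only(text):
    """Return the integer formed by the digits in text, ignoring everything else."""
    # \D matches any character that is not a digit.
    digits = re.sub(r"\D", "", text)
    if not digits:
        raise ValueError(f"no digits found in {text!r}")
    return int(digits)


print(digits_only("1,961"))
print(digits_only("1.961"))
print(digits_only(" 12 345 "))
```

Because it discards everything except digits, this sketch would silently misread abbreviated or decimal values such as 1.2K, so it only fits pages that always show the full number.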
    


This use case

Applied to your code, the two lines become:

follower_count = int(browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[2]/a/span[1]/span').text.replace(",", ""))
following_count = int(browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[1]/a/span[1]/span').text.replace(",", ""))
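
Since the same conversion is applied to both counts, it may be tidier to put it in a small helper. The sketch below is one way to do that. The names parse_count and follower_xpath are hypothetical, and the commented Selenium lines assume Selenium 4, where find_element(By.XPATH, ...) is the supported call and recent releases no longer provide find_element_by_xpath.

```python
def parse_count(text):
    """Convert scraped text such as '1,961' to an int."""
    # strip() removes surrounding whitespace; replace() removes the commas.
    return int(text.strip().replace(",", ""))


# Assumed usage with Selenium 4 (not executed here; `browser` and
# `follower_xpath` would come from your own script):
#
#   from selenium.webdriver.common.by import By
#   follower_count = parse_count(browser.find_element(By.XPATH, follower_xpath).text)

print(parse_count("1,961"))
print(parse_count(" 1,234,567 "))
```

Keeping the parsing separate from the element lookup also makes it possible to test the conversion without starting a browser.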

