Display images and titles in 2 columns in Python

Problem description

I scraped all titles and image source links into a text file, then used the data from the text file to output an HTML file with 2 columns, one for images and one for titles. How can I display clickable images, and show the title and image in a 2-column format? Here is what I have:

from bs4 import BeautifulSoup
import urllib  # Python 2: provides urllib.urlopen

titles = []
images = []
href = []

r = urllib.urlopen('https://www.open2study.com/courses').read()
soup = BeautifulSoup(r)

# Collect the course titles.
for i in soup.find_all('div', {'class': "courses_adblock_rollover"}):
    titles.append(i.h2.text)

# Collect the image source links.
for i in soup.find_all('img', {'class': "image-style-course-logo-subjects-block"}):
    images.append(i.get('src'))

# Write each title and image link on consecutive lines, followed by a blank line.
with open('test.txt', "w") as f:
    for i in zip(titles, images):
        f.write(i[0].encode('ascii', 'ignore') + '\n'
                + i[1].encode('ascii', 'ignore') +
                '\n\n')

header = '<!doctype html><html><head><title>My page</title></head><body>'
body = '<table><tr><td></td><td></td></tr>'
footer = '</table></body></html>'

with open('test.txt', 'r') as infile, open('test.html', 'w') as outfile:
    outfile.write(header)
    outfile.write(body)

    for line in infile:
        # ignore blank lines
        if line == '\n':
            continue

        col1 = line.rstrip()
        # read next line
        col2 = next(infile).rstrip()
        outfile.write('<tr><td>{}</td><td><img src="{}" style="width: 160px; height: 100px"></td></tr>\n\n'.format(col1, col2))

    # Close the table once, after all rows have been written.
    outfile.write(footer)

Solution

I feel like you're making this scraping thing really difficult on yourself. It's easier to start with the largest element first, i.e. the whole course div, then pull information out of it later on.

This code gives you clickable images in the first column and the course titles in the second column.

from bs4 import BeautifulSoup
import urllib

base_url = 'https://www.open2study.com'
r = urllib.urlopen(base_url + '/courses').read()

soup = BeautifulSoup(r, "html.parser")

# Each of these divs holds one course's title and image.
courses = soup.find_all('div', {'class': "courses_adblock_start"})

encoding = "utf-8"
page_title = 'Ai Truong'
html_template = '<!doctype html><html><head><title>{}</title><meta charset="{}" /></head><body>{}</body></html>'
table_template = '<table>{}</table>'
table_row_template = '<tr><td>{}</td><td>{}</td></tr>'
img_template = '<a href="{}"><img src="{}" width="160" alt="{}"></a>'

# Build one table row per course: linked image first, title second.
table_rows = ''
for c in courses:
    title = c.h2.text.encode(encoding)
    image = c.find('img', {'class': 'image-style-course-logo-subjects-block'}).get('src')
    # The link is read from the element that wraps the course div.
    href = c.parent.get('href')
    img_tag = img_template.format(base_url + href, image, title)
    table_rows += table_row_template.format(img_tag, title)

table_tag = table_template.format(table_rows)

with open('course-scrape.html', 'w') as html_out:
    html_out.write(html_template.format(page_title, encoding, table_tag))
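
One thing the templates above do not handle is a title containing characters such as & or a double quote, which would break the alt attribute or the cell text. The following is a minimal sketch of the same row-building step with escaping added, not the answer's own code. It assumes Python 3 (the code above is Python 2), and the names ROW, render_table, and the sample data are made up for illustration; only html.escape comes from the standard library.

```python
from html import escape

BASE_URL = 'https://www.open2study.com'

# Same layout as the answer: linked image in column 1, title in column 2.
ROW = ('<tr><td><a href="{href}"><img src="{src}" width="160" alt="{alt}"></a></td>'
       '<td>{title}</td></tr>')


def render_table(courses, base_url=BASE_URL):
    """Return an HTML table built from (title, image_src, href) tuples.

    Every value is HTML-escaped before it is placed in the template.
    """
    rows = []
    for title, image_src, href in courses:
        rows.append(ROW.format(
            href=escape(base_url + href, quote=True),
            src=escape(image_src, quote=True),
            alt=escape(title, quote=True),
            title=escape(title),
        ))
    return '<table>\n{}\n</table>'.format('\n'.join(rows))


# Hypothetical sample data standing in for the scraped values.
sample = [
    ('Food & Nutrition', 'https://example.com/food.png', '/courses/food'),
]
print(render_table(sample))
```

Collecting the rows in a list and joining them once also avoids repeated string concatenation in the loop, though for a page of this size either approach should be fine.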

