Python Web Scraping with bs4

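Contents

  • Python Web Scraping with bs4
    • Overview
      • Installation
      • Import
    • Basic usage
      • Creating the parse object
      • Getting text
      • Tag objects
        • Getting a tag's content from the HTML
        • Arguments to find
        • Getting tag attributes
        • Getting all matching tags
        • Getting the tag name
        • Nested lookups
        • Child and parent nodes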

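Overview

BeautifulSoup is an HTML/XML parser; its main job is to parse HTML/XML documents and extract data from them.

Scraping projects often run into HTML that is malformed or extremely complex.

BeautifulSoup4 provides methods for traversing a document's nodes and for searching and filtering its elements by various conditions. You can locate and extract the data you need with CSS selectors, regular expressions, and other flexible approaches.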
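
As a rough illustration of the CSS-selector and regular-expression searches mentioned above, here is a minimal sketch. The sample markup and variable names are made up for this example, and it uses the built-in html.parser so that nothing beyond bs4 itself needs to be installed.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, made up for this sketch.
html = """
<div class="article">
  <h2 id="title">Article Title</h2>
  <p class="intro">First paragraph.</p>
  <p>Second paragraph.</p>
  <a href="https://www.example.com">Link to Example Website</a>
</div>
"""

# html.parser ships with Python, so no extra parser package is needed here.
soup = BeautifulSoup(html, "html.parser")

# CSS selector: every <p> that is a direct child of div.article
paragraphs = [p.get_text() for p in soup.select("div.article > p")]

# Regular expression on the tag name: "h" followed by a single digit
headings = [tag.name for tag in soup.find_all(re.compile(r"^h\d$"))]

# Regular expression on an attribute value: links whose href starts with https://
links = [a["href"] for a in soup.find_all("a", href=re.compile(r"^https://"))]

print(paragraphs)
print(headings)
print(links)
```

select() takes a CSS selector string, while find_all() accepts a compiled pattern either for the tag name or for an attribute value.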

Installation

pip install beautifulsoup4

Import

from bs4 import BeautifulSoup

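Basic usage

Creating the parse object

soup = BeautifulSoup('markup to parse', 'parser name')

There are currently three mainstream parsers:

  • html.parser (built into Python's standard library)
  • lxml (recommended; installed separately with pip install lxml)
  • html5lib (installed separately with pip install html5lib)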
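
The parsers do not always agree on how to repair invalid markup. The sketch below feeds the same broken fragment to each one and records the result; the fragment and the results dictionary are illustrative choices, and parsers that are not installed are skipped rather than treated as errors.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Deliberately invalid markup: a closing </p> with no opening <p>
broken = "<a></p>"

results = {}
for parser in ("html.parser", "lxml", "html5lib"):
    try:
        results[parser] = str(BeautifulSoup(broken, parser))
    except FeatureNotFound:
        # lxml and html5lib are separate packages and may not be installed
        results[parser] = None

for parser, output in results.items():
    print(parser, "->", output if output is not None else "not installed")
```

Because the repaired trees can differ, it is worth naming the parser explicitly, as the examples in this article do, instead of relying on whichever one happens to be available.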

Getting text

Two attributes expose a document's content: contents and text. contents returns a list of the object's direct children, while text returns all of the text joined into one string.

contents

from bs4 import BeautifulSoup

data = """
<h1>Welcome to BeautifulSoup Practice</h1>
<div class="article">
<h2>Article Title</h2>
<p>This is a paragraph of text for practicing BeautifulSoup.</p>
<a href="https://www.example.com">Link to Example Website</a>
"""
# The <div> above is never closed; lxml closes it while parsing, as the output shows.
soup = BeautifulSoup(data, 'lxml')
print(soup.contents)
# Output:
"""
[<html><body><h1>Welcome to BeautifulSoup Practice</h1>
<div class="article">
<h2>Article Title</h2>
<p>This is a paragraph of text for practicing BeautifulSoup.</p>
<a href="https://www.example.com">Link to Example Website</a>
</div></body></html>]
"""

text

print(soup.text)
# Output:
"""
Welcome to BeautifulSoup Practice

Article Title
This is a paragraph of text for practicing BeautifulSoup.
Link to Example Website
"""
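
The text attribute keeps the whitespace between tags, which is why the output above contains line breaks. When cleaner output is wanted, get_text() with a separator, or the stripped_strings generator, may be more convenient. A small sketch, using a shortened copy of the sample HTML and the built-in html.parser (both are choices made for this example):

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample document, for illustration only.
data = """
<h1>Welcome to BeautifulSoup Practice</h1>
<div class="article">
<h2>Article Title</h2>
<p>This is a paragraph of text for practicing BeautifulSoup.</p>
</div>
"""
soup = BeautifulSoup(data, "html.parser")

# One string: whitespace-only pieces are dropped, the rest joined by " | "
joined = soup.get_text(separator=" | ", strip=True)

# A list of the non-empty text fragments, each stripped of surrounding whitespace
fragments = list(soup.stripped_strings)

print(joined)
print(fragments)
```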

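Tag objects

Getting a tag's content from the HTML

Access a tag, such as <p> or <div>, as an attribute of the soup object; this returns the first matching tag.

Example:

print(soup.h2)
# <h2>Article Title</h2>
print(soup.h2.text)
# Article Title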
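
Attribute-style access only ever returns the first match, and it returns None when the tag does not exist, so calling .text on the result can fail. The sketch below shows one way to guard against that; the markup and variable names are invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two <h2> tags and no <table>
data = "<div><h2>First</h2><h2>Second</h2></div>"
soup = BeautifulSoup(data, "html.parser")

first_h2 = soup.h2      # equivalent to soup.find("h2"): first match only
missing = soup.table    # there is no <table>, so this is None

first_text = first_h2.text
# Guard before touching .text, otherwise None.text raises AttributeError
missing_text = missing.text if missing is not None else None

print(first_text)
print(missing_text)
```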
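
Arguments to find

To filter by class, write the keyword argument with a trailing underscore (class_), because class is a reserved keyword in Python. Other attributes can be used in the same way under their own names.

data = """
<h1>Welcome to BeautifulSoup Practice</h1>
<div class="article">
<p>This is a paragraph of text for practicing BeautifulSoup.</p>
</div>
<div class="ex2">
<p>This is a abcd.</p>
</div>
"""
soup = BeautifulSoup(data, 'lxml')
print(soup.find('div', class_='article'))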
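
A few more ways of calling find, sketched under the assumption of a slightly extended sample document (the id and data-role attributes are added here purely for illustration). Attribute names that are not valid Python identifiers, such as data-role, can be passed through the attrs dictionary.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the id and data-role attributes are made up for this sketch.
data = """
<div class="article" id="main"><p>Paragraph one.</p></div>
<div class="ex2" data-role="aside"><p>Paragraph two.</p></div>
"""
soup = BeautifulSoup(data, "html.parser")

# Any attribute with a valid Python name can be a keyword argument
by_id = soup.find("div", id="main")

# Attribute names containing hyphens go through the attrs dictionary
by_attrs = soup.find("div", attrs={"data-role": "aside"})

# find returns None when nothing matches
not_found = soup.find("div", class_="missing")

print(by_id.p.text)
print(by_attrs["class"])
print(not_found)
```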

Getting tag attributes

data = '<p id="apple">This is a paragraph of text for practicing BeautifulSoup.</p>'
soup = BeautifulSoup(data, 'lxml')
tag = soup.find('p')
print(tag.get('id'))
# apple
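
Besides get(), attributes can be read with dictionary-style indexing or all at once through attrs. The sketch below compares the options on a made-up tag; note that class comes back as a list because it can hold several values.

```python
from bs4 import BeautifulSoup

# Hypothetical tag with an id and two classes
data = '<p id="apple" class="note big">Some text</p>'
soup = BeautifulSoup(data, "html.parser")
tag = soup.find("p")

by_index = tag["id"]                 # raises KeyError if the attribute is absent
by_get = tag.get("title")            # returns None if the attribute is absent
with_default = tag.get("title", "n/a")
classes = tag["class"]               # multi-valued attribute: a list of strings
all_attrs = tag.attrs                # every attribute as a dictionary

try:
    tag["title"]
    raised = False
except KeyError:
    raised = True

print(by_index, by_get, with_default, classes, all_attrs, raised)
```

get() is the safer choice when an attribute may be missing, for example when collecting href values from links that are not guaranteed to have one.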
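
Getting all matching tags

# data here is the two-<div> HTML passed to find earlier, not the single <p> string
soup = BeautifulSoup(data, 'lxml')
print(soup.find_all('p'))
# [<p>This is a paragraph of text for practicing BeautifulSoup.</p>, <p>This is a abcd.</p>]
print(len(soup.find_all('p')))
# 2

Called with no arguments, find_all() returns every tag in the document.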
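
A short sketch of a few find_all variations: no arguments, a list of tag names, and the limit parameter. It reuses the two-div sample in compact form and parses it with html.parser, so unlike lxml no html or body wrapper tags are added.

```python
from bs4 import BeautifulSoup

data = """
<h1>Welcome to BeautifulSoup Practice</h1>
<div class="article"><p>This is a paragraph of text for practicing BeautifulSoup.</p></div>
<div class="ex2"><p>This is a abcd.</p></div>
"""
soup = BeautifulSoup(data, "html.parser")

# No arguments: every tag, in document order
all_names = [tag.name for tag in soup.find_all()]

# A list of names matches any of them
h1_and_p = [tag.name for tag in soup.find_all(["h1", "p"])]

# limit stops the search after the given number of matches
first_only = soup.find_all("p", limit=1)

print(all_names)
print(h1_and_p)
print(len(first_only))
```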

Getting the tag name

print(soup.div.name)
# div

Nested lookups

The sample HTML is as follows:

html = '''
<div class="article">
<h2>Article Title</h2>
<p>This is a paragraph of text for practicing BeautifulSoup.</p>
<p>This is a abcd.</p>
<a href="https://www.example.com">Link to Example Website</a>
</div>
'''

Goal: get all of the <p> tags inside the div.

soup = BeautifulSoup(html, 'lxml')
print(soup.find('div', class_='article').find_all('p'))
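
The same nested lookup can also be written as a single CSS selector. The sketch below runs both forms on the sample HTML (parsed with html.parser here, which is a choice made for this example) and extracts the paragraph text from each.

```python
from bs4 import BeautifulSoup

html = '''
<div class="article">
<h2>Article Title</h2>
<p>This is a paragraph of text for practicing BeautifulSoup.</p>
<p>This is a abcd.</p>
<a href="https://www.example.com">Link to Example Website</a>
</div>
'''
soup = BeautifulSoup(html, "html.parser")

# Chained calls: find the div first, then search inside it
chained = soup.find("div", class_="article").find_all("p")

# One CSS selector: every <p> that is a descendant of div.article
selected = soup.select("div.article p")

chained_texts = [p.get_text() for p in chained]
selected_texts = [p.get_text() for p in selected]

print(chained_texts)
print(selected_texts)
```

The chained form raises AttributeError if the div is missing, because find returns None; select simply returns an empty list in that case.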

Child and parent nodes

soup = BeautifulSoup(data, 'lxml')
# Iterate over all ancestors of the first <p>
for item in soup.p.parents:
    print(item)

# Iterate over the direct children of the first <p>
for child in soup.p.children:
    print(child)
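
Printing whole nodes can be noisy, so a sketch that collects only node names may make the structure easier to see. The markup is made up for this example; note that children yields text nodes as well as tags, and that the last entry in parents is the BeautifulSoup object itself, whose name is "[document]".

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical markup: a <p> containing text and one nested tag
data = '<div class="article"><p>Hello <b>bold</b> world</p></div>'
soup = BeautifulSoup(data, "html.parser")

# Immediate parent only
direct_parent = soup.p.parent.name

# Every ancestor, from the nearest outwards
parent_names = [parent.name for parent in soup.p.parents]

# Direct children include text nodes as well as tags
children = list(soup.p.children)
child_tags = [child.name for child in children if isinstance(child, Tag)]

print(direct_parent)
print(parent_names)
print(len(children), child_tags)
```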
