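Open the Kuaishou home page and analyze the page.
On a platform like Kuaishou, the page source contains none of the information we want, so the only option is to capture the JSON data. The videos are delivered to the front end as JSON and then rendered in a loop, so let's look at the captured JSON packet.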
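As a minimal sketch of what reading the JSON packet amounts to, the snippet below parses a response body and pulls out the user id and photoId of each video. The sample body is a made-up, trimmed-down illustration that only mirrors the data / videoFeeds / list nesting used in the code later in this post; the real response carries many more fields, and extract_feed_items is a hypothetical helper name.

```python
import json

# Hypothetical sample body: only the nesting matches the real feed,
# the ids and names are invented for illustration.
SAMPLE_BODY = '''
{"data": {"videoFeeds": {"list": [
    {"user": {"id": "user_a", "name": "A"}, "photoId": "photo_1"},
    {"user": {"id": "user_b", "name": "B"}, "photoId": "photo_2"}
], "pcursor": "2"}}}
'''


def extract_feed_items(body):
    """Return (user_id, photo_id) pairs from a feed response body."""
    payload = json.loads(body)
    # "or" fallbacks so a missing or null level yields an empty list
    # instead of raising KeyError / AttributeError.
    data = payload.get('data') or {}
    feeds = data.get('videoFeeds') or {}
    feed = feeds.get('list') or []
    return [(item['user']['id'], item['photoId']) for item in feed]


if __name__ == '__main__':
    for user_id, photo_id in extract_feed_items(SAMPLE_BODY):
        print(user_id, photo_id)
```

Working on a saved copy of the response body like this makes it easy to explore the structure without hitting the site on every run.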
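Next, analyze the detail page link.
Then look at the JSON data.
One note: the page was refreshed in between, so the two links shown here are not the same, but the method is the same.
Then concatenate the second-level path and request the detail page.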
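The concatenation step can be sketched as below: the detail page URL is the base path followed by the user id and the photo id taken from the JSON. The helper name build_detail_url and the ids in the example are made up, and the percent-encoding via quote() is an added precaution that the original code does not use.

```python
from urllib.parse import quote

BASE_URL = 'https://live.kuaishou.com/u/'


def build_detail_url(user_id, photo_id):
    """Join the base path, user id and photo id into a detail page URL."""
    # quote(..., safe='') percent-encodes anything that is not safe inside
    # a single URL path segment (assumption: ids may contain such characters).
    return BASE_URL + quote(str(user_id), safe='') + '/' + quote(str(photo_id), safe='')


if __name__ == '__main__':
    # Hypothetical ids, for illustration only.
    print(build_detail_url('user_a', 'photo_1'))
```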
Finally, on the detail page, analyze the page and scrape the data in the usual way.
Here is the code:
import json
import time

import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36',
}


def first_get_request(first_request):
    """Parse the feed JSON and visit the detail page of every video in it."""
    first_data = json.loads(first_request.text)
    print(first_data)
    # Go down to the second level: the list of videos in the feed
    first_two_data = first_data['data']['videoFeeds']['list']
    for num in first_two_data:
        # Detail page URL = base path + user id + photo id
        two_url = 'https://live.kuaishou.com/u/' + num['user']['id'] + '/' + num['photoId']
        # print(two_url)
        two_get_request(two_url)


def two_get_request(two_url):
    """Fetch one detail page and append the scraped fields to a text file."""
    # verify=False skips TLS certificate verification; remove it unless
    # your environment (for example a capturing proxy) requires it.
    two_data = requests.get(url=two_url, headers=headers, verify=False, timeout=10)
    soup = BeautifulSoup(two_data.text, 'lxml')
    # Avatar
    name_photo = soup.select('.profile-user img')[0]['src']
    # Name
    name = soup.select('.video-card-footer-user-name')[0].text
    # Watch count (taken from .watching-count)
    number = soup.select('.profile-user-count-info > .watching-count')[0].text
    # Like (heart) count (taken from .like-count)
    num = soup.select('.profile-user-count-info > .like-count')[0].text
    # Content
    text = soup.select('.profile-user > .profile-user-desc > span')[0].text
    item = {
        'avatar': name_photo,
        'name': name,
        'content': text,
        'watch_count': number,
        'like_count': num
    }
    with open('爬取的信息.txt', 'a', encoding='utf8') as f:
        f.write(str(item) + '\n')
    # Pause between detail pages to avoid hammering the site
    time.sleep(3)


def main():
    first_url = 'https://live.kuaishou.com/graphql'
    formdata = {
        "operationName": "videoFeedsQuery", "variables": {"count": 50, "pcursor": "50"},
        "query": "fragment VideoMainInfo on VideoFeed {\n photoId\n caption\n thumbnailUrl\n poster\n viewCount\n likeCount\n commentCount\n timestamp\n workType\n type\n useVideoPlayer\n imgUrls\n imgSizes\n magicFace\n musicName\n location\n liked\n onlyFollowerCanComment\n width\n height\n expTag\n __typename\n}\n\nquery videoFeedsQuery($pcursor: String, $count: Int) {\n videoFeeds(pcursor: $pcursor, count: $count) {\n list {\n user {\n id\n eid\n profile\n name\n __typename\n }\n ...VideoMainInfo\n __typename\n }\n pcursor\n __typename\n }\n}\n"
    }
    # Request the Kuaishou feed. The payload contains a nested "variables"
    # dict, so it is sent as a JSON body (json=) rather than form-encoded
    # (data=), which would flatten the nested dict incorrectly.
    first_request = requests.post(url=first_url, headers=headers, json=formdata, verify=False, timeout=10)
    # Analyze the links on the home page
    first_get_request(first_request)


if __name__ == '__main__':
    main()
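The script writes each record with str(item), which produces a Python dict literal that other tools cannot easily read back. One possible alternative, sketched here under the assumption that you want to reload the data later, is to write one JSON object per line (the JSON Lines convention). The helper names append_item and read_items, the file name, and the sample record are all made up for illustration.

```python
import json
import os
import tempfile


def append_item(path, item):
    """Append one record to a JSON Lines file (one JSON object per line)."""
    with open(path, 'a', encoding='utf8') as f:
        # ensure_ascii=False keeps Chinese text readable in the file
        f.write(json.dumps(item, ensure_ascii=False) + '\n')


def read_items(path):
    """Load every record back as a list of dicts, skipping blank lines."""
    with open(path, encoding='utf8') as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == '__main__':
    # Hypothetical record written to a throwaway temp directory.
    demo_path = os.path.join(tempfile.mkdtemp(), 'items.jsonl')
    append_item(demo_path, {'name': '示例', 'like_count': '12'})
    print(read_items(demo_path))
```

Because each line is a complete JSON document, the file can still be appended to record by record, exactly as the original loop does.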
And with that, we have the data we were after.
