python - Scraping a proxy IP web page with cookies, but getting an error

Views: 120 | Date: 2022-08-10 14:36:44

Problem description

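    from urllib.request import HTTPCookieProcessor, build_opener, install_opener, urlopen
    from http.cookiejar import MozillaCookieJar

    url = 'http://www.kuaidaili.com/proxylist/8/'
    cookies = MozillaCookieJar()                # in-memory cookie store
    handler = HTTPCookieProcessor(cookies)      # attaches stored cookies to requests
    opener = build_opener(handler)
    install_opener(opener)                      # make urlopen() use this opener
    html = urlopen(url).read()
    print(html)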

This page requires a cookie to access. When I request it with the code above, it fails with an HTTPError: 521.

Answers

Answer 1:

That is simply how this site behaves: your first request always gets a 521 response, but the response still carries a page body:

<html><body><script language='javascript'> window.onload=setTimeout('hv(233)', 200); function hv(OL) {var qo, mo='', no='', oo = [0xd9,0xa6,0x34,0xc9,0x42,0x3c,0xb1,0x27,0xf0,0x55,0x1b,0xb4,0x8a,0x64,0x48,0x5e,0x98,0x0e,0x03,0x58,0x2f,0x51,0x8a,0xf3,0x89,0x73,0xec,0xa2,0xda,0x63,0x19,0xe2,0x7c,0xf1,0xe6,0xaa,0xdf,0x55,0x7a,0x04,0x98,0x29,0x32,0x67,0xeb,0x70,0xd4,0x85,0x0f,0xda,0x94,0x0a,0x4e,0x92,0x0c,0x51,0xd4,0x5a,0x8f,0x15,0x9e,0xd3,0x28,0x8a,0x80,0x06,0x3b,0xdf,0x84,0x76,0x0c,0x70,0xe5,0x5a,0xee,0xe4,0x9a,0x5d,0xa1,0x16,0xcf,0xc1,0xe6,0x70,0xc0,0x41,0x76,0xea,0x5f,0xd8,0x59,0x43,0x87,0x1c,0xa1,0x3b,0x2d,0xe1,0xe3,0x48,0x79,0x2e,0xe2,0x67,0xab,0x69,0x1e,0x53,0xd7,0xec,0x8e,0x08,0x4e,0x77,0x20,0x56,0xde,0x58,0xf0,0xb4,0xa5,0x40,0xb8,0x7e,0x64,0x06,0x32,0xd6,0x5b,0x4d,0x05,0xad,0x36,0x09,0xfe,0xb3,0x08,0xa9,0x4e,0x83,0xaf,0xb4,0x15,0xa9,0xae,0x63,0xe7,0xb8,0x5a,0xb1,0xa9,0x14,0x25,0xca,0x37,0xa0,0x76,0x70,0x26,0x60,0x26,0x4a,0x3f,0x01,0x1b,0x93,0x49,0x83,0x6a,0xd3,0x89,0xc3,0xa9,0xe3,0xa5,0x9a,0x34,0x0a,0x04,0x15,0xba,0x63,0xa9,0x63,0xcb,0xf1,0xe6,0xbc,0x0e,0x6b,0x80,0x22,0x7a,0xb4,0x7a,0xe3,0x41,0x1b,0x73,0x35,0x9e,0x78,0x0e,0xfc,0x71,0x6b,0xe4,0xaa,0x13,0xd8,0xbd,0xa7,0x7d,0x17,0xd0,0x35,0x6f,0x6c,0x42,0x0c,0x00,0x66,0x40,0xd5,0x8d,0x06,0xff,0x75,0x3f,0xa7,0x69,0x1b,0x91,0x1c,0xc7,0x3b];qo = 'qo=234; do{oo[qo]=(-oo[qo])&0xff; oo[qo]=(((oo[qo]>>2)|((oo[qo]<<6)&0xff))-169)&0xff;} while(--qo>=2);'; eval(qo);qo = 233; do { oo[qo] = (oo[qo] - oo[qo - 1]) & 0xff; } while (-- qo >= 3 );qo = 1; for (;;) { if (qo > 233) break; oo[qo] = ((((((oo[qo] + 72) & 0xff) + 72) & 0xff) << 6) & 0xff) | (((((oo[qo] + 72) & 0xff) + 72) & 0xff) >> 2); qo++;}po = ''; for (qo = 1; qo < oo.length - 1; qo++) if (qo % 7) po += String.fromCharCode(oo[qo] ^ OL);eval('qo=eval;qo(po);');} </script> </body></html>
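To see that body from Python, note that `urllib` raises `HTTPError` for a 521 status, but the exception object is itself a readable response. The following is a minimal sketch of that idea; the helper names `fetch_even_on_error` and `make_cookie_opener` are made up for illustration and are not part of the original question.

```python
from http.cookiejar import CookieJar
from urllib.error import HTTPError
from urllib.request import HTTPCookieProcessor, build_opener


def make_cookie_opener():
    """Build an opener that keeps cookies between requests."""
    return build_opener(HTTPCookieProcessor(CookieJar()))


def fetch_even_on_error(opener, url):
    """Return (status, body) for url, including when the status is an error."""
    try:
        with opener.open(url) as resp:
            return resp.status, resp.read()
    except HTTPError as err:
        # HTTPError is also a response object, so the body sent along with
        # the error status (here, the JavaScript challenge page) is readable.
        return err.code, err.read()
```

A call such as `status, body = fetch_even_on_error(make_cookie_opener(), url)` should then hand back the 521 status together with the script page instead of raising, assuming the site still responds the way this answer describes.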

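The site hides the important key inside the JavaScript and uses eval to decode the script and trigger the redirect, which obfuscates the code. Using selenium might solve this problem.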
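To make the obfuscation concrete, here is a rough Python port of the `hv()` routine from the response above. It is only a sketch: the name `decode_challenge` is hypothetical, the constants 169 and 72 are copied from this one response and may well differ from response to response, and writing the loop bounds relative to the array length (234 and 233 in the original, for a 236-element array) is my own generalization.

```python
def decode_challenge(oo, key):
    """Port of hv(): return the script text that the page passes to eval."""
    oo = list(oo)  # work on a copy so the caller's list is untouched
    n = len(oo)

    def ror2(v):
        # Rotate an 8-bit value right by two bits.
        return ((v >> 2) | (v << 6)) & 0xFF

    # Pass 1 (the string hv() evals first): negate, rotate, subtract 169.
    for i in range(n - 2, 1, -1):
        oo[i] = (ror2(-oo[i] & 0xFF) - 169) & 0xFF

    # Pass 2: replace each value with its difference from the previous one,
    # walking from high indices to low.
    for i in range(n - 3, 2, -1):
        oo[i] = (oo[i] - oo[i - 1]) & 0xFF

    # Pass 3: add 72 twice (144 in total), then rotate.
    for i in range(1, n - 2):
        oo[i] = ror2((oo[i] + 144) & 0xFF)

    # Pass 4: skip every seventh index and XOR the rest with the key.
    return "".join(chr(oo[i] ^ key) for i in range(1, n - 1) if i % 7)
```

Called with the full `oo` array from the response and the key 233 that the page passes to `hv`, the first three characters work out to `doc`. That hints at a `document.cookie` assignment, though the rest of the decoded script is not reproduced here.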

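Aside: proxy sites are themselves proxy providers for crawlers, so their anti-scraping measures are quite good. I think a crawler's effort should go into getting the main content. If you scrape free proxies just to save money, you end up spending a lot of time on it, and that is too inefficient. At my company I simply use kuaidaili's paid proxies, so I have hardly had to think about obtaining proxies. I only need to think about how to make better use of them under high concurrency.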

Tags: Python, programming