Storage Structures

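The previous post covered the storage structure of Stack; this one covers another storage structure, Dictionary. Every element in a Dictionary is a key/value pair (made up of two parts: a key and a value). Keys must be unique, values do not have to be, and both keys and values can be of any type. Dictionary is a collection commonly used for lookups and sorting.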
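The key/value rules above can be sketched in a few lines. This is only an illustration: it uses Python's built-in `dict` as a stand-in for the .NET `Dictionary<TKey, TValue>`, and the names `ages`, `mixed` and `lookup` as well as the sample data are made up for the example.

```python
# Illustrative only: Python's built-in dict stands in for Dictionary<TKey, TValue>.
ages = {}
ages["alice"] = 30
ages["bob"] = 30      # values may repeat across keys
ages["alice"] = 31    # assigning to an existing key overwrites it, so keys stay unique

# Keys and values of different types can live in the same dict
# (in Python the keys must be hashable).
mixed = {1: "int key", "two": 2.0, (3, 4): ["tuple key"]}


def lookup(d, key, default=None):
    """Return the value stored under key, or default when the key is absent."""
    return d.get(key, default)
```

Note that plain assignment overwrites silently, whereas the `Add` method discussed later in this post throws on a duplicate key.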

The cookiejar module:

  • Manages stored cookies and attaches them to outgoing HTTP requests
  • Cookies are kept in memory; once the CookieJar instance is garbage-collected, the cookies are lost
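Because a plain `CookieJar` lives only in memory, one way to keep cookies between runs is the file-backed `MozillaCookieJar` from the same standard-library module. The sketch below is not from the original post: the helper names `make_cookie` and `save_and_reload`, the domain `example.com` and the cookie value are all made up, and the cookie is built by hand so the example runs without network access.

```python
import os
import tempfile
from http import cookiejar


def make_cookie(name, value, domain="example.com"):
    """Build a session cookie by hand (name, value and domain are sample data)."""
    return cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def save_and_reload(cookie):
    """Store one cookie in a file-backed jar, then read it back with a fresh jar."""
    path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
    jar = cookiejar.MozillaCookieJar(path)
    jar.set_cookie(cookie)
    # Session cookies are skipped by default; ignore_discard=True writes them too.
    jar.save(ignore_discard=True)

    fresh = cookiejar.MozillaCookieJar(path)
    fresh.load(ignore_discard=True)
    return {c.name: c.value for c in fresh}


memory_jar = cookiejar.CookieJar()  # in-memory only, nothing is written to disk
memory_jar.set_cookie(make_cookie("t", "abc123"))
reloaded = save_and_reload(make_cookie("t", "abc123"))
```

Either kind of jar can be passed to `request.HTTPCookieProcessor` in the same way.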

 

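Platform: macOS

  Next, let's look at some of Dictionary's methods and the underlying implementation code of the class:

Example: using cookiejar to access the Renren home page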

import json
from urllib import request, parse
from http import cookiejar

# Create a CookieJar object
cookiejar_object = cookiejar.CookieJar()
# Build the cookie handler
cookie_handler = request.HTTPCookieProcessor(cookiejar_object)
# With an opener, we can use it in place of urlopen to request pages
opener = request.build_opener(cookie_handler)


# The site's login endpoint
url = 'http://www.renren.com/ajaxLogin/login?1=1&uniqueTimestamp=2018721532875'

# The login form fields captured from the page
form = {
    'email': 'xxx',
    'icode': '',
    'origURL': 'http://www.renren.com/home',
    'domain': 'renren.com',
    'key_id': '1',
    'captcha_type': 'web_login',
    'password': 'xxx',
    'rkey': '79d8184f25d678248262a91caf3e7ea8',
    'f': 'http%3A%2F%2Fzhibo.renren.com%2Ftop',
}


# URL-encode the form and convert it to bytes
form_b = parse.urlencode(form).encode('utf-8')

# POST the form to the login URL; the opener stores the cookies it receives
response = opener.open(url, form_b)
html_b = response.read()  # bytes
# print(html_b)


res_dict = json.loads(html_b.decode('utf-8'))
# URL of the personal home page after logging in
home_url = res_dict['homeUrl']

# print(home_url)
# Visit the personal home page (the opener sends the stored cookies)
response = opener.open(home_url)
html_bytes = response.read()
print(html_bytes.decode('utf-8'))

 

 

 

Site: Renren

  1. Add: adds the specified key and value to the dictionary.

Python proxy

from urllib import request

proxy = {
    'http': 'http://219.141.153.41:80'
}

url = 'http://www.baidu.com/s?wd=ip'
# request.HTTPCookieProcessor(cookie)
handler = request.ProxyHandler(proxy)

# Build the opener; opener.open plays the role of urlopen
opener = request.build_opener(handler)

# Access Baidu through the opener
response = opener.open(url, timeout=5)
# Save the page
with open('baidu.html', 'wb') as f:
    f.write(response.read())

 

 

 

 

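I have recently been practicing crawler logins. The first approach is to find the JS files in the page, analyze them to locate the cookie information, and then keep that cookie. But logins on today's sites all have CAPTCHAs, and the most annoying part is that almost none of the values in the request's data form are left unencrypted. My JS is not good enough, so cracking that is out of the question. That reminded me of the selenium module, which I rarely use, as a way to simulate the login and obtain the cookie.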

public void Add(TKey key, TValue value) {
            Insert(key, value, true); 
        }

        private void Insert(TKey key, TValue value, bool add) {

            if( key == null ) { 
                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
            } 

            if (buckets == null) Initialize(0);
            int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
            int targetBucket = hashCode % buckets.Length; 

#if FEATURE_RANDOMIZED_STRING_HASHING 
            int collisionCount = 0; 
#endif

            for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) {
                if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) {
                    if (add) {
                        ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate); 
                    }
                    entries[i].value = value; 
                    version++; 
                    return;
                } 

#if FEATURE_RANDOMIZED_STRING_HASHING
                collisionCount++;
#endif 
            }
            int index; 
            if (freeCount > 0) { 
                index = freeList;
                freeList = entries[index].next; 
                freeCount--;
            }
            else {
                if (count == entries.Length) 
                {
                    Resize(); 
                    targetBucket = hashCode % buckets.Length; 
                }
                index = count; 
                count++;
            }

            entries[index].hashCode = hashCode; 
            entries[index].next = buckets[targetBucket];
            entries[index].key = key; 
            entries[index].value = value; 
            buckets[targetBucket] = index;
            version++; 

#if FEATURE_RANDOMIZED_STRING_HASHING
            if(collisionCount > HashHelpers.HashCollisionThreshold && HashHelpers.IsWellKnownEqualityComparer(comparer))
            { 
                comparer = (IEqualityComparer<TKey>) HashHelpers.GetRandomizedEqualityComparer(comparer);
                Resize(entries.Length, true); 
            } 
#endif

        }
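To make the data layout of `Insert` easier to follow, here is a simplified Python sketch of the same idea: a `buckets` array holding the index of the first entry in each chain, and an `entries` list whose items link to the next entry in the chain. It is a toy under stated assumptions, not the .NET implementation: the class name `ChainedDict` is made up, there is no free list (removal is not supported), no randomized-hashing fallback, and resizing simply doubles the bucket count.

```python
class ChainedDict:
    """Toy hash table using a buckets/entries layout like the Insert method above."""

    def __init__(self, capacity=4):
        self.buckets = [-1] * capacity  # bucket -> index of first entry in chain, -1 if empty
        self.entries = []               # each entry: [hash_code, next_index, key, value]

    def _insert(self, key, value, add):
        if key is None:
            raise ValueError("key must not be None")
        hash_code = hash(key) & 0x7FFFFFFF  # clear the sign bit, as in the C# code
        target = hash_code % len(self.buckets)

        # Walk this bucket's chain looking for an existing key.
        i = self.buckets[target]
        while i >= 0:
            entry = self.entries[i]
            if entry[0] == hash_code and entry[2] == key:
                if add:
                    raise KeyError("duplicate key: %r" % (key,))
                entry[3] = value  # indexer-style set: overwrite in place
                return
            i = entry[1]

        # Grow when the entries are as numerous as the buckets, then recompute the bucket.
        if len(self.entries) == len(self.buckets):
            self._resize()
            target = hash_code % len(self.buckets)

        # The new entry becomes the head of its bucket's chain.
        self.entries.append([hash_code, self.buckets[target], key, value])
        self.buckets[target] = len(self.entries) - 1

    def _resize(self):
        """Double the bucket array and relink every entry into its new chain."""
        self.buckets = [-1] * (len(self.buckets) * 2)
        for index, entry in enumerate(self.entries):
            target = entry[0] % len(self.buckets)
            entry[1] = self.buckets[target]
            self.buckets[target] = index

    def add(self, key, value):
        """Like Add: raises if the key already exists."""
        self._insert(key, value, add=True)

    def __setitem__(self, key, value):
        """Like the indexer setter: overwrites an existing key."""
        self._insert(key, value, add=False)

    def __getitem__(self, key):
        hash_code = hash(key) & 0x7FFFFFFF
        i = self.buckets[hash_code % len(self.buckets)]
        while i >= 0:
            entry = self.entries[i]
            if entry[0] == hash_code and entry[2] == key:
                return entry[3]
            i = entry[1]
        raise KeyError(key)
```

The `add` flag plays the same role as in the C# code: `add(key, value)` rejects a duplicate key, while `d[key] = value` replaces the stored value.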

Youdao Dictionary translation API

import time
import random
import json
import hashlib
from Day1.tuozhan_all import post  # local helper module, not shown in this post


def md5_my(need_str):
    # Create an md5 object
    md5_o = hashlib.md5()
    # md5 needs bytes as its argument
    # str to bytes:  str.encode('utf-8')
    # bytes to str:  bytes.decode('utf-8')
    sign_bytes = need_str.encode('utf-8')
    # Feed the bytes into the md5 object
    md5_o.update(sign_bytes)
    sign_str = md5_o.hexdigest()
    return sign_str

# url

def translate(kw):
    url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'

    headers = {
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'Connection': 'keep-alive',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'Cookie': 'OUTFOX_SEARCH_USER_ID=-493176930@10.168.8.63; OUTFOX_SEARCH_USER_ID_NCOO=38624120.26076847; SESSION_FROM_COOKIE=unknown; JSESSIONID=aaabYcV4ZOU-JbQUha2uw; ___rl__test__cookies=1534210912076',
        'Host': 'fanyi.youdao.com',
        'Origin': 'http://fanyi.youdao.com',
        'Referer': 'http://fanyi.youdao.com/',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
        'X-Requested-With': 'XMLHttpRequest',
    }

    # Building the form needs three things: 1. i (the text), 2. salt, 3. sign
    key = kw

    # salt : ((new Date).getTime() + parseInt(10 * Math.random(), 10))
    # parseInt(10 * Math.random(), 10) yields 0..9, hence randint(0, 9)
    salt = int(time.time() * 1000 + random.randint(0, 9))
    print('salt:', salt)
    salt_str = str(salt)

    # sign : o = u.md5(S + n + r + D);
    # S = "fanyideskweb"
    # D = "ebSeFb%=XZ%T[KZ)c(sy!"
    # n = key
    # r = salt_str
    S = "fanyideskweb"
    D = "ebSeFb%=XZ%T[KZ)c(sy!"
    sign_str = S + key + salt_str + D
    # Hash the string with md5
    sign_md5_str = md5_my(sign_str)

    form = {
        'i': key,
        'from': 'AUTO',
        'to': 'AUTO',
        'smartresult': 'dict',
        'client': 'fanyideskweb',
        'salt': salt_str,
        'sign': sign_md5_str,
        'doctype': 'json',
        'version': '2.1',
        'keyfrom': 'fanyi.web',
        'action': 'FY_BY_REALTIME',
        'typoResult': 'false',
    }

    html_bytes = post(url, form, headers=headers)

    # Parse the JSON string into a dict
    res_dict = json.loads(html_bytes.decode('utf-8'))
    # print(html_bytes.decode('utf-8'))

    translate_res = res_dict['translateResult'][0][0]['tgt']
    return translate_res


if __name__ == '__main__':
    ret = translate('中国')

    print('Translation result: ' + ret)

 

import time
import random
from selenium import webdriver
from selenium.webdriver.common.by import By
import requests
from urllib import request
from lxml import etree

driver = webdriver.Chrome(executable_path=r'/Applications/Google Chrome.app/chromedriver')
driver.get('http://www.renren.com/PLogin.do')
time.sleep(2)
driver.find_element(By.ID, 'email').clear()
driver.find_element(By.ID, 'email').send_keys('myusername')  # enter the username
driver.find_element(By.ID, 'password').clear()
driver.find_element(By.ID, 'password').send_keys('mypassword')  # enter the password

img_url = 'http://icode.renren.com/getcode.do?t=web_login&rnd=' + str(random.random())
request.urlretrieve(img_url, 'renren_yzm.jpg')
try:
    driver.find_element(By.ID, 'icode').clear()
    # If a CAPTCHA is required, type it by hand or hand it to a CAPTCHA-solving service
    img_res = input('Enter the CAPTCHA: ')
    driver.find_element(By.ID, 'icode').send_keys(img_res)
except Exception:
    # No CAPTCHA field on the page; carry on without it
    pass
driver.find_element(By.ID, 'autoLogin').click()  # tick "log in automatically"
driver.find_element(By.ID, 'login').click()  # log in
time.sleep(3)
cookie_items = driver.get_cookies()  # fetch the cookies

post = {}  # holds the cookie values
for cookie in cookie_items:
    post[cookie['name']] = cookie['value']
print(post['t'])  # the cookie Renren needs in order to stay logged in
driver.quit()  # quit selenium
# ------------------------------------------------------------

url = 'http://www.renren.com/265025131/profile'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
    'Cookie': 't=' + post['t'],
}
response = requests.get(url, headers=headers)
print('-' * 50)

html = etree.HTML(response.text)
title = html.xpath('//title/text()')
print('Title of the page retrieved:', title)
print(response.url)

   2. Clear(): removes all keys and values from the Dictionary<TKey, TValue>.

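Summary: simulating the login with selenium and getting the cookie did not take much time. But I took it for granted that entering a Renren personal page required every cookie value obtained, and wasted N hours on that. In the end, keeping only the 't' value from the cookies was enough to stay logged in. So repeated testing is quite important.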
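The point about keeping only the 't' value can be sketched as a small helper that turns the list returned by `driver.get_cookies()` into a `Cookie` header containing just the chosen names. The function name `cookie_header`, its `keep` parameter and the sample cookies (including the name `anonymid`) are hypothetical and only there for illustration.

```python
def cookie_header(cookie_items, keep=("t",)):
    """Build a Cookie header value from selenium-style cookie dicts, keeping only some names."""
    wanted = {c["name"]: c["value"] for c in cookie_items if c["name"] in keep}
    missing = [name for name in keep if name not in wanted]
    if missing:
        # Fail loudly rather than send a request that will not be logged in.
        raise KeyError("cookies not found: " + ", ".join(missing))
    return "; ".join(name + "=" + wanted[name] for name in keep)


# Made-up sample in the shape selenium returns: a list of dicts with 'name' and 'value'.
sample = [
    {"name": "anonymid", "value": "xyz"},
    {"name": "t", "value": "abc123"},
]
headers = {"Cookie": cookie_header(sample)}
```

Passing a longer `keep` tuple makes it easy to test which cookies a site actually needs, one name at a time.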
