The HTTP Protocol

Request
Request data
- URL
- URL parameters, for example http://httpbin.org/get?key1=value1&key2=value2
- The request body, such as a submitted form or JSON data
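How URL parameters end up in the query string can be sketched with the standard library alone. This is only an illustration: the variable names are assumptions, and in practice the same dictionary can be passed to requests.get through its params argument instead of building the URL by hand.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "http://httpbin.org/get"
params = {"key1": "value1", "key2": "value2"}

# Encode the dictionary as a query string and append it to the URL
full_url = base_url + "?" + urlencode(params)
print(full_url)

# Going the other way: split the URL and recover the parameters
query = urlsplit(full_url).query
decoded = {name: values[0] for name, values in parse_qs(query).items()}
print(decoded)
```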
Request headers, for example:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36'}
Response
Attributes of the response object r:
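- r.text is the response body as a string, decoded automatically using the character encoding declared in the response headers.
- r.encoding is the text encoding used to decode r.text.
- r.status_code is the response status code: 200 means the request succeeded, 4xx indicates a client error, and 5xx indicates a server error. Use r.status_code to check whether the request was answered correctly.
- r.content is the response body as bytes; gzip- and deflate-encoded responses are decompressed automatically.
- r.json() is the JSON decoder built into Requests.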
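The relationship between these attributes can be approximated without any network access. The sketch below uses a made-up byte string in place of r.content and a hard-coded encoding in place of r.encoding; it mirrors roughly what Requests does, not its actual implementation.

```python
import json

# Made-up sample body standing in for r.content (raw bytes)
content = '{"message": "你好", "ok": true}'.encode("utf-8")
encoding = "utf-8"  # stands in for r.encoding

text = content.decode(encoding)  # roughly what r.text holds
data = json.loads(text)          # roughly what r.json() returns

print(type(content).__name__, type(text).__name__, type(data).__name__)
print(data["message"], data["ok"])
```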
Sending a GET request with requests
GET retrieves data from the web server.
import requests
r = requests.get('https://baidu.com/')
r.encoding = "utf-8"
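print("Text encoding:", r.encoding)
print("Response status code:", r.status_code)
print("Response body as a string:", r.text)
Setting r.encoding overrides the encoding used to decode r.text; without it, the text may come out garbled.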
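Why the wrong encoding produces garbled text can be shown with plain bytes. The sample string here is an assumption chosen for illustration; ISO-8859-1 is used as the wrong encoding because Requests may fall back to it when a text response does not declare a charset.

```python
original = "文本编码"                # sample text, assumed for illustration
raw = original.encode("utf-8")      # bytes as they would arrive from the server

wrong = raw.decode("iso-8859-1")    # decoded with the wrong encoding: garbled
right = raw.decode("utf-8")         # decoded with the correct encoding

print(repr(wrong))
print(right)

# The bytes are not lost: re-encoding and decoding correctly recovers the text
recovered = wrong.encode("iso-8859-1").decode("utf-8")
print(recovered == original)
```

Setting r.encoding = "utf-8" before reading r.text has the same effect as choosing the correct decode call here.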
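The returned response:
r.status_code holds the response status code.
200 means the request succeeded;
4xx indicates a client error;
5xx indicates a server error.
Use r.status_code to check whether the request was answered correctly.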
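The status code ranges above can be turned into a small helper. The function name classify_status is hypothetical, and treating the whole 2xx range as success (rather than only 200) is an assumption made for this sketch.

```python
def classify_status(status_code):
    """Map an HTTP status code to the categories described above."""
    if 200 <= status_code < 300:
        return "success"
    if 400 <= status_code < 500:
        return "client error"
    if 500 <= status_code < 600:
        return "server error"
    return "other"  # e.g. 1xx informational or 3xx redirects


for code in (200, 404, 503, 301):
    print(code, classify_status(code))
```

With a real response this would be called as classify_status(r.status_code). Requests also provides r.raise_for_status(), which raises an exception for 4xx and 5xx responses.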
Sending a data request with requests
import requests
import json
key_dict = {"key1": "value1", "key2": "value2"}
r = requests.post("http://httpbin.org/post", data=json.dumps(key_dict))
print(r.text)
With a JSON string, the payload appears under data rather than form. The result is:
{
  "args": {}, 
  "data": "{\"key1\": \"value1\", \"key2\": \"value2\"}", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "36", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.14.0", 
    "X-Amzn-Trace-Id": "Root=1-64196f9e-6b54025b02d214ad5a2762a9"
  }, 
  "json": {
    "key1": "value1", 
    "key2": "value2"
  }, 
  "origin": "61.16.102.74", 
  "url": "http://httpbin.org/post"
}
Sending a form request with requests
The difference is that data is given a dict rather than a JSON string.
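What actually differs on the wire is the request body. The following sketch reproduces both encodings with the standard library; it is an approximation of what Requests sends, and the lengths it prints correspond to the Content-Length values of 36 and 23 reported by httpbin in the two outputs of this section.

```python
import json
from urllib.parse import urlencode

key_dict = {"key1": "value1", "key2": "value2"}

json_body = json.dumps(key_dict)  # body sent by data=json.dumps(key_dict)
form_body = urlencode(key_dict)   # body a dict passed to data= is encoded to

print(json_body, len(json_body))
print(form_body, len(form_body))
```

Requests also accepts json=key_dict, which serializes the dict and sets the Content-Type header to application/json.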
import requests
import json
key_dict = {"key1": "value1", "key2": "value2"}
# r = requests.post("http://httpbin.org/post", data=json.dumps(key_dict))
r = requests.post("http://httpbin.org/post", data=key_dict)
print(r.text)
The result is:
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "key1": "value1", 
    "key2": "value2"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "23", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.14.0", 
    "X-Amzn-Trace-Id": "Root=1-64196fbf-6533dc414251c27e59971d5a"
  }, 
  "json": null, 
  "origin": "162.219.34.250", 
  "url": "http://httpbin.org/post"
}
Timeout
import requests
link = "http://www.crazyant.net/"
r = requests.get(link, timeout=0.001)
print(r.text)
This raises an error:
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='www.crazyant.net', port=80): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x7f9acab71650>, 'Connection to www.crazyant.net timed out. (connect timeout=0.001)'))
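Rather than letting the exception end the program, it can be caught. The sketch below is one possible approach under stated assumptions: SlowHandler and fetch are hypothetical names, and a deliberately slow local server replaces the real site so the example does not depend on the network. requests.exceptions.Timeout covers both connect and read timeouts.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler that answers after a short delay."""

    def do_GET(self):
        time.sleep(0.5)  # simulate a slow server
        body = b"ok"
        try:
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass  # the client already gave up and closed the connection

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(url, timeout):
    """Return the response text, or None if the request timed out."""
    session = requests.Session()
    session.trust_env = False  # ignore proxy settings so localhost is reached directly
    try:
        return session.get(url, timeout=timeout).text
    except requests.exceptions.Timeout:
        return None


server = HTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

timed_out = fetch(url, timeout=0.1)  # shorter than the server delay
succeeded = fetch(url, timeout=5)    # long enough for the server to answer
print(timed_out, succeeded)

server.shutdown()
server.server_close()
```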
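Sending custom headers
import requests
headers = {
    # Header names must be spelled as HTTP defines them: 'User-Agent' and 'Cookie'
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36',
    'Cookie': 'BIDUPSID=377D2B91ED643DFB14D8CC6E8B3626A2; PSTM=1664248378; BAIDUID=377D2B91ED643DFBF1BDCE23F61D2A0D:SL=0'
    }
# Use the /get endpoint: httpbin's /post endpoint only accepts POST requests
r = requests.get('http://httpbin.org/get', headers=headers)
print("Response status code:", r.status_code)
Output:
Response status code: 200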
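To check which headers would be sent without contacting a server, a request can be prepared and inspected locally. This is a sketch, not the only way to debug headers; the header values below are hypothetical placeholders.

```python
import requests

headers = {
    "User-Agent": "my-crawler/0.1",  # hypothetical value
    "Cookie": "session=abc123",      # hypothetical value
}

# Build the request but do not send it
req = requests.Request("GET", "http://httpbin.org/get", headers=headers)
prepared = req.prepare()

for name, value in prepared.headers.items():
    print(name + ":", value)

# Header lookups on the prepared request are case-insensitive
user_agent = prepared.headers["user-agent"]
print(user_agent)
```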
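Uploading a file
import requests
# Open the image in binary mode; the with block closes the file after the upload
with open("image.jpg", "rb") as img:
    myfiles = {"myfile": img}
    resp = requests.post("http://httpbin.org/post", files=myfiles)
print(resp.text)
Output: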
{
  "args": {}, 
  "data": "", 
  "files": {
    "myfile": "data:application/octet-stream;base64,/9j/4AAQSkZJRgABAQAASABIAAD/49k="
  }, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "121974", 
    "Content-Type": "multipart/form-data; boundary=32affeb7cdd148288dd51b55c7c936c9", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.14.0", 
    "X-Amzn-Trace-Id": "Root=1-64197251-407fa733740f8f0678287acb"
  }, 
  "json": null, 
  "origin": "61.16.102.77", 
  "url": "http://httpbin.org/post"
}
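The multipart/form-data body seen in the output can be examined locally by preparing the request instead of sending it. In this sketch the in-memory bytes stand in for a real image, and the file name and the image/jpeg content type are assumptions; passing a (filename, file object, content type) tuple is one way to set them explicitly.

```python
import io

import requests

# In-memory stand-in for an image file (not a valid JPEG)
fake_image = io.BytesIO(b"\xff\xd8\xff\xe0 placeholder image bytes")
files = {"myfile": ("image.jpg", fake_image, "image/jpeg")}

# Build the multipart request without sending it
prepared = requests.Request(
    "POST", "http://httpbin.org/post", files=files
).prepare()

content_type = prepared.headers["Content-Type"]
print(content_type)

# The body carries the field name, file name, and per-file content type
has_filename = b'filename="image.jpg"' in prepared.body
has_part_type = b"Content-Type: image/jpeg" in prepared.body
print(has_filename, has_part_type)
```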