A Deep Dive into Python WSGI

2023-09-09 13:51:05 55

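Preface

This article covers Python WSGI and draws mainly on the following sources:

What is WSGI?
WSGI Tutorial
An Introduction to the Python Web Server Gateway Interface (WSGI)

It can be read as a quick and rough translation of them.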

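What is WSGI

WSGI stands for Web Server Gateway Interface. It is a specification that describes how a web server communicates with a web application and how a web application handles requests. The specification itself is PEP 3333. Note that WSGI has two sides, and both must be implemented: the web server and the web application.

Modules and libraries that implement WSGI include wsgiref (built into Python), werkzeug.serving, twisted.web and others; see Servers which support WSGI for details.

Web frameworks that currently run on top of WSGI include Bottle, Flask, Django and others; see Frameworks that run on WSGI for details.

All a WSGI server does is pass each request it receives from the client to the WSGI application, then send the application's return value back to the client as the response. WSGI applications can be stacked: the parts in the middle of the stack are called middleware, and the two ends, which must always be implemented, are the application and the server.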

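WSGI Tutorial

This part is based mainly on WSGI Tutorial.

WSGI application interface

The WSGI application interface should be implemented as a callable object, such as a function, a method, a class, or an instance with a __call__ method. The callable receives 2 arguments:

a dictionary that holds the client's request information along with other data; it can be thought of as the request context and is usually called the environment (commonly shortened to environ or env in code);
a callback function used to send the HTTP response status (HTTP status) and the response headers (HTTP headers).

The callable's return value is the response body, which must be an iterable of strings.

The skeleton of a WSGI application looks like this: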

def application(environ, start_response):

    response_body = 'Request method: %s' % environ['REQUEST_METHOD']

    # HTTP response status
    status = '200 OK'

    # HTTP response headers; note the format: a list of (name, value) tuples
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response_body)))
    ]

    # Hand the response status and headers to the WSGI server
    start_response(status, response_headers)

    # Return the response body
    return [response_body]
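
Because the interface is just a callable taking two arguments, it can be exercised without any server. The following is a minimal sketch under a few assumptions: it is written for Python 3, where the response body must be bytes (the listings in this article use Python 2 strings), and fake_start_response and captured are hypothetical names introduced only for this illustration.

```python
def application(environ, start_response):
    # Python 3 variant of the skeleton: the body is encoded to bytes.
    response_body = ('Request method: %s' % environ['REQUEST_METHOD']).encode('utf-8')
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response_body))),
    ]
    start_response(status, response_headers)
    return [response_body]


captured = {}


def fake_start_response(status, headers):
    # Hypothetical stand-in for the server's callback: it only records its arguments.
    captured['status'] = status
    captured['headers'] = headers


# Call the application the way a server would, with a hand-built environ.
body = application({'REQUEST_METHOD': 'GET'}, fake_start_response)
print(captured['status'])
print(captured['headers'])
print(body)
```

Calling the application by hand like this is also a convenient way to unit test it.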

Environment

The following program returns the contents of the environment dictionary to the client (environment.py):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# Import Python's built-in WSGI server
from wsgiref.simple_server import make_server

def application(environ, start_response):

    response_body = [
        '%s: %s' % (key, value) for key, value in sorted(environ.items())
    ]
    # Content-Type is set to text/plain below, so `\n` shows up as a line break in the browser
    response_body = '\n'.join(response_body)

    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response_body)))
    ]
    start_response(status, response_headers)

    return [response_body]

# Instantiate the WSGI server
httpd = make_server(
    '127.0.0.1',
    8051,  # port
    application  # the WSGI application; here it is simply a function
)

# handle_request() serves exactly one request; after that the script prints 'end' to the console
httpd.handle_request()

print 'end'

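Visit http://127.0.0.1:8051/ in a browser (or with curl, wget, etc.) to see the contents of the environment.

After a single request from the browser, environment.py exits, and the program prints the following in the terminal:

127.0.0.1 - - [09/Sep/2015 23:39:09] "GET / HTTP/1.1" 200 5540
end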
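
The environment can also be inspected without starting a server. The sketch below assumes Python 3 and uses wsgiref.util.setup_testing_defaults from the standard library, which fills a dictionary with plausible default values for the keys a real server would provide. The selection of keys printed here is arbitrary.

```python
from wsgiref.util import setup_testing_defaults

environ = {}
setup_testing_defaults(environ)  # adds default CGI-style and wsgi.* keys

# Print a few of the keys that also appear in the browser output.
for key in ('REQUEST_METHOD', 'SERVER_NAME', 'SERVER_PORT', 'PATH_INFO', 'wsgi.url_scheme'):
    print('%s: %s' % (key, environ[key]))
```

A dictionary prepared this way can be passed straight to a WSGI application in a test.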

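Iterable responses

If the return value of the callable application above:

return [response_body]

is changed to:

return response_body

the WSGI program responds more slowly. The reason is that the string response_body is itself iterable, and each iteration yields only 1 byte. The server therefore sends the data to the client 1 byte at a time until everything has been sent. For this reason, return [response_body] is recommended.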
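
The effect can be made visible by counting how many items each form of the return value produces. This is only an illustrative sketch: count_chunks is a hypothetical helper standing in for the server's send loop, and a text string is used so that the result mirrors the Python 2 behaviour described above. In Python 3 the body must be bytes, and iterating over a bytes object yields integers rather than 1-byte strings, so returning it bare is not valid there either.

```python
def count_chunks(iterable):
    # Mimics the server loop: one write per item the iterable produces.
    return sum(1 for _ in iterable)


response_body = 'Request method: GET'

print(count_chunks([response_body]))  # wrapped in a list: a single write
print(count_chunks(response_body))    # bare string: one write per character
```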

If the iterable response contains several strings, Content-Length must be the sum of the lengths of those strings:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from wsgiref.simple_server import make_server

def application(environ, start_response):

    response_body = [
        '%s: %s' % (key, value) for key, value in sorted(environ.items())
    ]
    response_body = '\n'.join(response_body)

    response_body = [
        'The Beginning\n',
        '*' * 30 + '\n',
        response_body,
        '\n' + '*' * 30,
        '\nThe End'
    ]

    # Compute Content-Length as the total length of all chunks
    content_length = sum([len(s) for s in response_body])

    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(content_length))
    ]

    start_response(status, response_headers)
    return response_body

httpd = make_server('localhost', 8051, application)
httpd.handle_request()

print 'end'

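Parsing GET requests

Run environment.py and visit http://localhost:8051/?age=10&hobbies=software&hobbies=tunning in a browser. The response contains:

QUERY_STRING: age=10&hobbies=software&hobbies=tunning
REQUEST_METHOD: GET

The cgi.parse_qs() function makes it easy to process QUERY_STRING. Special characters must also be escaped with cgi.escape() to prevent script injection. Here is an example: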

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from cgi import parse_qs, escape

QUERY_STRING = 'age=10&hobbies=software&hobbies=tunning'
d = parse_qs(QUERY_STRING)
print d.get('age', [''])[0]  # [''] is the default, returned when age is not found in QUERY_STRING
print d.get('hobbies', [])
print d.get('name', ['unknown'])

print 10 * '*'
print escape('<script>alert(123);</script>')

The output is:

10
['software', 'tunning']
['unknown']
**********
&lt;script&gt;alert(123);&lt;/script&gt;
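
The cgi functions used above belong to Python 2. As a hedged sketch of the same steps in Python 3, the standard library offers urllib.parse.parse_qs and html.escape, which are used here on the same query string. The variable names are chosen for this illustration only.

```python
from html import escape
from urllib.parse import parse_qs

query_string = 'age=10&hobbies=software&hobbies=tunning'
d = parse_qs(query_string)  # maps each key to a list of values

age = d.get('age', [''])[0]        # first value, or '' when age is absent
hobbies = d.get('hobbies', [])     # all values as a list
name = d.get('name', ['unknown'])  # key is absent, so the default is returned

# Escape markup characters before echoing user input into HTML.
escaped = escape('<script>alert(123);</script>')

print(age, hobbies, name)
print(escaped)
```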

With that, we can write a basic dynamic web page that handles GET requests:

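#!/usr/bin/env python
# -*- coding: utf-8 -*-

from wsgiref.simple_server import make_server
from cgi import parse_qs, escape

# The form's method is get and its action is the current page
html = """
<html>
<body>
   <form method="get" action="">
      <p>
         Age: <input type="text" name="age" value="%(age)s">
      </p>
      <p>
         Hobbies:
         <input name="hobbies" type="checkbox" value="software" %(checked-software)s> Software
         <input name="hobbies" type="checkbox" value="tunning" %(checked-tunning)s> Auto Tunning
      </p>
      <p>
         <input type="submit" value="Submit">
      </p>
   </form>
   <p>
      Age: %(age)s<br>
      Hobbies: %(hobbies)s
   </p>
</body>
</html>
"""

def application(environ, start_response):

    # Parse QUERY_STRING
    d = parse_qs(environ['QUERY_STRING'])

    age = d.get('age', [''])[0]  # the value given for age
    hobbies = d.get('hobbies', [])  # all hobbies, as a list

    # Escape to prevent script injection
    age = escape(age)
    hobbies = [escape(hobby) for hobby in hobbies]

    response_body = html % {
        'checked-software': ('', 'checked')['software' in hobbies],
        'checked-tunning': ('', 'checked')['tunning' in hobbies],
        'age': age or 'Empty',
        'hobbies': ', '.join(hobbies or ['No Hobbies?'])
    }

    status = '200 OK'

    # This time the content type is text/html
    response_headers = [
        ('Content-Type', 'text/html'),
        ('Content-Length', str(len(response_body)))
    ]

    start_response(status, response_headers)
    return [response_body]

httpd = make_server('localhost', 8051, application)

# Keep handling requests indefinitely
httpd.serve_forever()

print 'end'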

Start the program and visit http://localhost:8051/ and http://localhost:8051/?age=10&hobbies=software&hobbies=tunning in a browser to try it out.

The program keeps running; press Ctrl-C to stop it.

This code uses two small tricks that I had not used before:

>>> "Age: %(age)s" % {'age': 12}
'Age: 12'
>>>
>>> hobbies = ['software']
>>> ('', 'checked')['software' in hobbies]
'checked'
>>> ('', 'checked')['tunning' in hobbies]
''
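
Both tricks have more explicit spellings. The sketch below, written for Python 3, places each trick next to an equivalent form. It works because a bool is an integer in Python, so False selects index 0 and True selects index 1. The variable names are hypothetical and exist only for this comparison.

```python
hobbies = ['software']

# Tuple indexed by a bool: False picks index 0, True picks index 1.
by_index = ('', 'checked')['software' in hobbies]

# The same choice written as a conditional expression.
by_condition = 'checked' if 'software' in hobbies else ''

# Named placeholders filled from a dict, with %-formatting and with str.format.
percent_style = 'Age: %(age)s' % {'age': 12}
format_style = 'Age: {age}'.format(age=12)

print(by_index, by_condition)
print(percent_style, format_style)
```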

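Parsing POST requests

For a POST request, the query string is carried in the HTTP request body rather than in the URL. The request body is available in the environment dictionary under the key wsgi.input, whose value is a file-like object. PEP 3333 states that CONTENT_LENGTH gives the size of the body, but it may be empty or missing, so the size should be read inside a try/except block.

Below is a dynamic page that can handle POST requests: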

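#!/usr/bin/env python
# -*- coding: utf-8 -*-

from wsgiref.simple_server import make_server
from cgi import parse_qs, escape

# The form's method is post
html = """
<html>
<body>
   <form method="post" action="">
      <p>
         Age: <input type="text" name="age" value="%(age)s">
      </p>
      <p>
         Hobbies:
         <input name="hobbies" type="checkbox" value="software" %(checked-software)s> Software
         <input name="hobbies" type="checkbox" value="tunning" %(checked-tunning)s> Auto Tunning
      </p>
      <p>
         <input type="submit" value="Submit">
      </p>
   </form>
   <p>
      Age: %(age)s<br>
      Hobbies: %(hobbies)s
   </p>
</body>
</html>
"""

def application(environ, start_response):

    # CONTENT_LENGTH may be empty or missing
    try:
        request_body_size = int(environ.get('CONTENT_LENGTH', 0))
    except ValueError:
        request_body_size = 0

    # Read exactly that many bytes from the file-like request body
    request_body = environ['wsgi.input'].read(request_body_size)
    d = parse_qs(request_body)

    # Extract the data
    age = d.get('age', [''])[0]
    hobbies = d.get('hobbies', [])

    # Escape to prevent script injection
    age = escape(age)
    hobbies = [escape(hobby) for hobby in hobbies]

    response_body = html % {
        'checked-software': ('', 'checked')['software' in hobbies],
        'checked-tunning': ('', 'checked')['tunning' in hobbies],
        'age': age or 'Empty',
        'hobbies': ', '.join(hobbies or ['No Hobbies?'])
    }

    status = '200 OK'

    response_headers = [
        ('Content-Type', 'text/html'),
        ('Content-Length', str(len(response_body)))
    ]

    start_response(status, response_headers)
    return [response_body]

httpd = make_server('localhost', 8051, application)

httpd.serve_forever()

print 'end'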
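
The body-reading step can be tried without a browser by handing the code an in-memory file as wsgi.input. This is a minimal sketch for Python 3, where the body arrives as bytes and is decoded before parsing. read_form is a hypothetical helper name, and UTF-8 is an assumed encoding for the form data.

```python
import io
from urllib.parse import parse_qs


def read_form(environ):
    # CONTENT_LENGTH may be missing or empty, so fall back to 0.
    try:
        size = int(environ.get('CONTENT_LENGTH', 0))
    except ValueError:
        size = 0
    raw = environ['wsgi.input'].read(size)  # bytes in Python 3
    return parse_qs(raw.decode('utf-8'))    # assumed encoding


payload = b'age=10&hobbies=software&hobbies=tunning'
environ = {
    'REQUEST_METHOD': 'POST',
    'CONTENT_LENGTH': str(len(payload)),
    'wsgi.input': io.BytesIO(payload),  # in-memory stand-in for the request body
}
form = read_form(environ)
print(form)

# An empty CONTENT_LENGTH triggers the ValueError branch, so nothing is read.
empty = read_form({'CONTENT_LENGTH': '', 'wsgi.input': io.BytesIO(b'ignored')})
print(empty)
```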

Getting started with Python WSGI

This part is based on An Introduction to the Python Web Server Gateway Interface (WSGI).

Web server

A WSGI server is simply a web server. Its logic for handling one HTTP request is as follows:

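iterable = app(environ, start_response)
for data in iterable:
    # send data to client

Here app is the WSGI application and environ is the environment described earlier. The callable app returns an iterable; once the WSGI server has it, the server sends the data to the client.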
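
To make this loop concrete, here is a toy version of the server side that writes the response into an in-memory buffer instead of a socket. It is a sketch under stated assumptions, not how wsgiref or any real server is implemented: it targets Python 3, serve_once and the sample app are hypothetical names, and error handling, the write() callable and exc_info processing from PEP 3333 are left out.

```python
import io
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # Tiny example application; body chunks are bytes, as Python 3 requires.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello ', b'world']


def serve_once(app, environ, out):
    """Run one request through `app` and write a raw HTTP response to `out`."""
    state = {'sent': False}

    def start_response(status, headers, exc_info=None):
        # Only record status and headers; they go out with the first body chunk.
        state['status'] = status
        state['headers'] = headers

    def send_headers():
        out.write(('HTTP/1.1 %s\r\n' % state['status']).encode('latin-1'))
        for name, value in state['headers']:
            out.write(('%s: %s\r\n' % (name, value)).encode('latin-1'))
        out.write(b'\r\n')
        state['sent'] = True

    iterable = app(environ, start_response)
    try:
        for data in iterable:
            if not state['sent']:
                send_headers()
            out.write(data)  # send data to client
        if not state['sent']:
            send_headers()  # response with an empty body
    finally:
        # PEP 3333 asks the server to call close() on the iterable if present.
        if hasattr(iterable, 'close'):
            iterable.close()


environ = {}
setup_testing_defaults(environ)
out = io.BytesIO()
serve_once(app, environ, out)
print(out.getvalue())
```

Sending the headers lazily, together with the first chunk, lets the sketch also work with applications written as generators, which only call start_response once iteration begins.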

Web framework/app

That is, the WSGI application.

Middleware

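Middleware sits between the WSGI server and the WSGI application, so it looks like an application to the server and like a server to the application.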

An example

This example uses a middleware.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from wsgiref.simple_server import make_server

def application(environ, start_response):

    response_body = 'hello world!'

    status = '200 OK'

    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response_body)))
    ]

    start_response(status, response_headers)
    return [response_body]

# Middleware: wraps an application and upper-cases every chunk of its response
class Upperware:
    def __init__(self, app):
        self.wrapped_app = app

    def __call__(self, environ, start_response):
        for data in self.wrapped_app(environ, start_response):
            yield data.upper()

wrapped_app = Upperware(application)

httpd = make_server('localhost', 8051, wrapped_app)

httpd.serve_forever()

print 'end'
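
Since the wrapped object is still an ordinary WSGI callable, the middleware can be checked without a server. The sketch below assumes Python 3, so the body is bytes. record_status and statuses are hypothetical names used only to capture what the wrapped application passes to start_response.

```python
class Upperware:
    def __init__(self, app):
        self.wrapped_app = app

    def __call__(self, environ, start_response):
        # Pass the call through, upper-casing each chunk on its way out.
        for data in self.wrapped_app(environ, start_response):
            yield data.upper()


def application(environ, start_response):
    response_body = b'hello world!'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response_body))),
    ])
    return [response_body]


statuses = []


def record_status(status, headers):
    # Hypothetical start_response replacement that just records the status.
    statuses.append(status)


wrapped_app = Upperware(application)
chunks = list(wrapped_app({}, record_status))
print(statuses, chunks)
```

Upper-casing leaves the length unchanged, which is why the Content-Length set by the inner application stays valid; a middleware that changes the body size would also need to rewrite that header.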

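What next

With these basics in hand, you can build a web framework of your own. If you are interested, read the source code of Bottle, Flask and similar projects.

Learn about WSGI has more material on WSGI.

Summary

That is all for this article. I hope it serves as a useful reference for your study or work. If you have questions, feel free to leave a comment, and thank you for supporting 毛票票.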
