liaofeng_xiao

Quixote 1.2 Source Code Walkthrough

 
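Quixote 1.2 is a real antique by now: the official site no longer offers it for download (only 1.3 and 2.x are maintained). For historical reasons our company will not upgrade Quixote for the time being, and so far it has caused no problems at all.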

Download the source code; the main file to read is publish.py.

It has a publish method:
def publish(self, stdin, stdout, stderr, env):
        """publish(stdin : file, stdout : file, stderr : file, env : dict)

        Create an HTTPRequest object from the environment and from
        standard input, process it, and write the response to standard
        output.
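        stdin and stdout depend on the web server. For example, when
        Quixote runs under gunicorn, stdin is an object supplied by
        gunicorn; when it runs directly as CGI, it is the system stdin.
        env comes from CGI.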
        """
        request = self.create_request(stdin, env)
        output = self.process_request(request, env)

        # Output results from Response object
        if output:
            request.response.set_body(output)
        try:
            request.response.write(stdout)
        except IOError, exc:
            self.log('IOError caught while writing request (%s)' % exc)
        self._clear_request()


Take a quick look at the create_request method:
def create_request(self, stdin, env):
        ctype = get_content_type(env)
        if ctype == "multipart/form-data" and (
                    # workaround for safari bug
                    env.get('REQUEST_METHOD') != 'GET'
                    or env.get('CONTENT_LENGTH', '0') != '0'
                ):
            req = HTTPUploadRequest(stdin, env, content_type=ctype)
            req.set_upload_dir(self.config.upload_dir,
                               self.config.upload_dir_mode)
            return req
        else:
            return HTTPRequest(stdin, env, content_type=ctype)
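
The branch condition is easier to see in isolation. Below is a minimal Python 3 sketch of just that decision. The helper get_content_type here is my own assumption about what Quixote's helper does (strip the parameters and lower-case the media type), and wants_upload_request is a hypothetical name, not part of Quixote.

```python
def get_content_type(env):
    """Assumed behaviour: media type from CONTENT_TYPE, without parameters."""
    ctype = env.get("CONTENT_TYPE", "")
    return ctype.split(";", 1)[0].strip().lower()


def wants_upload_request(env):
    """Same condition as create_request: multipart, and not an empty GET."""
    ctype = get_content_type(env)
    return ctype == "multipart/form-data" and (
        env.get("REQUEST_METHOD") != "GET"
        or env.get("CONTENT_LENGTH", "0") != "0"
    )


post_upload = wants_upload_request({
    "CONTENT_TYPE": "multipart/form-data; boundary=x",
    "REQUEST_METHOD": "POST",
    "CONTENT_LENGTH": "10",
})
empty_get = wants_upload_request({
    "CONTENT_TYPE": "multipart/form-data",
    "REQUEST_METHOD": "GET",
    "CONTENT_LENGTH": "0",
})
```

The second case is the Safari workaround from the comment: a GET that claims multipart/form-data but carries no body falls through to a plain HTTPRequest.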


The core is then the process_request method:
def process_request(self, request, env):
        """process_request(request : HTTPRequest, env : dict) : string

        Process a single request, given an HTTPRequest object.  The
        try_publish() method will be called to do the work and
        exceptions will be handled here.
        """
        self._set_request(request)
        try:
            self.parse_request(request)
            output = self.try_publish(request, env.get('PATH_INFO', ''))
        except errors.PublishError, exc:
            # Exit the publishing loop and return a result right away.
            output = self.finish_interrupted_request(request, exc)
        except:
            # Some other exception, generate error messages to the logs, etc.
            output = self.finish_failed_request(request)
        output = self.filter_output(request, output)
        self.log_request(request)
        return output


First look at parse_request, which calls into the following code in http_request.py:
def process_inputs(self):
        """Process request inputs.
        """
        self.start_time = time.time()
        if self.get_method() != 'GET':
            # Avoid consuming the contents of stdin unless we're sure
            # there's actually form data.
            if self.content_type == "multipart/form-data":
                raise RuntimeError(
                    "cannot handle multipart/form-data requests")
            elif self.content_type == "application/x-www-form-urlencoded":
                fp = self.stdin
            else:
                return
        else:
            fp = None

        fs = FieldStorage(fp=fp, environ=self.environ, keep_blank_values=1)
        if fs.list:
            for item in fs.list:
                self.add_form_value(item.name, item.value)
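
As a rough Python 3 illustration of what this loop builds, here is a sketch that parses a urlencoded string and merges repeated keys the same way add_form_value does. It uses urllib.parse.parse_qsl instead of FieldStorage (the cgi module has been removed from recent Python 3 releases), and parse_form is a hypothetical name of mine, so treat it as an approximation rather than Quixote's behaviour in every detail.

```python
from urllib.parse import parse_qsl


def parse_form(query):
    """Collect form values; a repeated key turns into a list of values."""
    form = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key in form:
            found = form[key]
            if isinstance(found, list):
                found.append(value)
            elif found != value:
                # Second distinct value: promote the entry to a list.
                form[key] = [found, value]
        else:
            form[key] = value
    return form


merged = parse_form("a=1&a=2&b=&a=3")
same_twice = parse_form("a=1&a=1")
```

Note the quirk inherited from the merging rule: sending the same value twice for one key keeps a single string instead of producing a two-element list.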


Note that add_form_value guards against hash-collision attacks:
def add_form_value(self, key, value):
        if self.form.has_key(key):
            found = self.form[key]
            if type(found) is ListType:
                found.append(value)
            elif found != value:
                found = [found, value]
                self.form[key] = found
        else:
            self.form[key] = value
            # anti hash attack:
            # http://permalink.gmane.org/gmane.comp.security.full-disclosure/83694
            if len(self.form) % 50 == 0:
                hash_d = {}
                for k in self.form:
                    h = hash(k)
                    hash_d[h] = hash_d.get(h, 0) + 1
                m = max(hash_d.values())
                if m > len(self.form) / 10 or m > 10 or len(self.form) > 10000:
                    raise errors.RequestError("hash attack")
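
To make the check concrete, here is a small Python 3 sketch that applies the same three thresholds to an arbitrary collection of keys. looks_like_hash_attack and CollidingKey are hypothetical names used only for this illustration. The original runs under Python 2, where `/` on integers is floor division, so the sketch uses `//`.

```python
def looks_like_hash_attack(keys):
    """Count keys per hash value and apply the thresholds from add_form_value."""
    n = len(keys)
    counts = {}
    for k in keys:
        h = hash(k)
        counts[h] = counts.get(h, 0) + 1
    worst = max(counts.values()) if counts else 0
    return worst > n // 10 or worst > 10 or n > 10000


class CollidingKey:
    """Illustrative key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name


attack = looks_like_hash_attack([CollidingKey(str(i)) for i in range(50)])
normal = looks_like_hash_attack(list(range(50)))
```

In the patched method the check only runs when the number of form keys is a multiple of 50, so ordinary small forms never pay for it.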


Back in publish.py, first look at finish_interrupted_request:
def finish_interrupted_request(self, request, exc):
        request.response = HTTPResponse()
        # set response status code so every custom doesn't have to do it
        request.response.set_status(exc.status_code)
        if self.config.secure_errors and exc.private_msg:
            exc.private_msg = None # hide it

        # walk up stack and find handler for the exception
        stack = self.namespace_stack[:]
        while 1:
            handler = None
            while stack:
                object = stack.pop()
                if hasattr(object, "_q_exception_handler"):
                    handler = object._q_exception_handler
                    break
            if handler is None:
                handler = errors.default_exception_handler

            try:
                return handler(request, exc)
            except errors.PublishError:
                assert handler is not errors.default_exception_handler
                continue # exception was re-raised or another exception occured

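When a PublishError occurs, Quixote walks namespace_stack (explained later) from its last element backwards, looking for a _q_exception_handler on each namespace (any namespace counts, whether an object or a module). If one is found, it handles the exception; if the whole stack has been searched without finding one, default_exception_handler from errors.py is used. If a handler is found but raises a PublishError itself while handling, the search for a _q_exception_handler continues further up the stack. The assert makes sure that the PublishError was not raised by default_exception_handler: the default handler must never raise PublishError, because if it did, no other handler would be left to deal with it.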
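
The walk up the stack can be reproduced outside Quixote. The following is a minimal Python 3 sketch under my own assumptions: PublishError and default_exception_handler are stand-ins for the ones in errors.py, and Root, Blog, and handle are hypothetical names.

```python
class PublishError(Exception):
    """Stand-in for errors.PublishError."""


def default_exception_handler(request, exc):
    # Assumed never to raise PublishError, like the real default handler.
    return "default: %s" % exc


def handle(namespace_stack, request, exc):
    stack = list(namespace_stack)  # copy; the innermost namespace is last
    while True:
        handler = None
        while stack:
            obj = stack.pop()
            if hasattr(obj, "_q_exception_handler"):
                handler = obj._q_exception_handler
                break
        if handler is None:
            handler = default_exception_handler
        try:
            return handler(request, exc)
        except PublishError:
            # A custom handler gave up; keep searching the outer namespaces.
            assert handler is not default_exception_handler
            continue


class Root:
    def _q_exception_handler(self, request, exc):
        return "root handled: %s" % exc


class Blog:
    def _q_exception_handler(self, request, exc):
        raise PublishError("blog handler gave up")


result = handle([Root(), Blog()], None, PublishError("not found"))
fallback = handle([object()], None, PublishError("not found"))
```

Blog is tried first because it sits at the end of the stack; when it raises, Root receives the original exception, and a stack with no handlers at all ends at the default handler.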

filter_output checks whether the response body may be compressed and, if so, calls compress_output:
def compress_output(self, request, output):
        encoding = request.get_encoding(["gzip", "x-gzip"])
        n = len(output)
        if n > self._GZIP_THRESHOLD and encoding:
            co = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS,
                                  zlib.DEF_MEM_LEVEL, 0)
            chunks = [self._GZIP_HEADER,
                      co.compress(output),
                      co.flush(),
                      struct.pack("<ll", binascii.crc32(output), len(output))]
            output = "".join(chunks)
            #self.log("gzip (original size %d, ratio %.1f)" %
            #           (n, float(n)/len(output)))
            request.response.set_header("Content-Encoding", encoding)
        return output

There is a threshold here: only bodies larger than 200 bytes are compressed; anything smaller is considered not worth the effort.
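
For readers on Python 3, here is a sketch of the same technique: a hand-written gzip header, a raw deflate stream (negative wbits), and a CRC32 plus length trailer. GZIP_HEADER below is my assumption of a minimal valid header, not necessarily the bytes Quixote uses, and compress_body is a hypothetical name. In Python 3 the CRC is unsigned, so the trailer is packed with "<LL" rather than "<ll".

```python
import gzip
import struct
import zlib

# Assumed 10-byte header: magic, deflate, no flags, mtime 0, XFL 2, unknown OS.
GZIP_HEADER = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x02\xff"
GZIP_THRESHOLD = 200  # bytes, the value quoted above


def compress_body(body, accepts_gzip=True):
    """Return body wrapped as a gzip member, or unchanged if small or unsupported."""
    if len(body) <= GZIP_THRESHOLD or not accepts_gzip:
        return body
    # Negative wbits asks zlib for a raw deflate stream without its own header.
    co = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS,
                          zlib.DEF_MEM_LEVEL, 0)
    trailer = struct.pack("<LL", zlib.crc32(body) & 0xFFFFFFFF,
                          len(body) & 0xFFFFFFFF)
    return b"".join([GZIP_HEADER, co.compress(body), co.flush(), trailer])


body = b"quixote " * 100
packed = compress_body(body)
restored = gzip.decompress(packed)  # the standard gzip module accepts the result
```

Building the member by hand avoids going through a file-like object, which is presumably why the original does it this way.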

That leaves only the most important method, try_publish:
def try_publish(self, request, path):
        self.start_request(request)
        self.namespace_stack = []

        # Traverse package to a (hopefully-) callable object
        object = _traverse_url(self.root_namespace, path, request,
                               self.config.fix_trailing_slash,
                               self.namespace_stack)
        # None means no output -- traverse_url() just issued a redirect.
        if object is None:
            return None
        # Anything else must be either a string...
        if isstring(object):
            output = object
        # ...or a callable.
        elif callable(object) or hasattr(object, "__call__"):
            try:
                if callable(object):
                    output = object(request)
                else:
                    output = object.__call__(request)
            except SystemExit:
                output = "SystemExit exception caught, shutting down"
                self.log(output)
                self.exit_now = 1
            if output is None:
                raise RuntimeError, 'callable %s returned None' % repr(object)
        # Uh-oh: 'object' is neither a string nor a callable.
        else:
            raise RuntimeError(
                "object is neither callable nor a string: %s" % repr(object))
        # The callable ran OK, commit any changes to the session
        self.finish_successful_request(request)
        return output

Strictly speaking, Quixote dispatches requests by namespace rather than by plain file directories. If someone visits www.site.com/blog/articles/100001/, how is the URL dispatched?

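The goal of URL dispatch is to find a method (a callable object) in the controller layer, call it, and obtain the output for the request. In Quixote, www.site.com/blog/articles/100001/ gives request_path="blog/articles/100001/", so the path is split on slashes and each namespace is looked up in turn until the target callable is found. Quixote first looks for blog in the root of the controller layer (in some projects this is the application root, depending on the project) and finds the corresponding namespace A; it then looks for articles under A, call it B; then it looks for 100001 in B, and so on.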
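
A toy version of this lookup may help. The sketch below is Python 3 and heavily simplified: it ignores _q_access, _q_resolve, redirects, and error handling, and the classes Root, Blog, Articles, and Article as well as dispatch are hypothetical. It only assumes the conventions visible in the source, namely _q_exports, _q_lookup, and _q_index, and a path with a leading slash as in the comments of _traverse_url.

```python
class Article:
    _q_exports = []

    def __init__(self, article_id):
        self.article_id = article_id

    def _q_index(self, request):
        return "article %s" % self.article_id


class Articles:
    _q_exports = []

    def _q_lookup(self, request, component):
        # Dynamic segment: whatever is not exported is resolved here.
        return Article(component)


class Blog:
    _q_exports = ["articles"]
    articles = Articles()


class Root:
    _q_exports = ["blog"]
    blog = Blog()


def dispatch(root, path, request=None):
    obj = root
    for component in path[1:].split("/"):
        if component == "":
            component = "_q_index"  # trailing slash: index of this namespace
        if component in obj._q_exports or component == "_q_index":
            obj = getattr(obj, component)
        elif hasattr(obj, "_q_lookup"):
            obj = obj._q_lookup(request, component)
        else:
            raise LookupError("no such component: %r" % component)
    return obj(request)


page = dispatch(Root(), "/blog/articles/100001/")
```

Here Root plays the controller root, Blog is namespace A, Articles is namespace B, and 100001 is resolved dynamically because it cannot be listed in _q_exports ahead of time.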

As the code shows, locating the object is the job of _traverse_url:
def _traverse_url(root_namespace, path, request, fix_trailing_slash,
                  namespace_stack):
    if (not path and fix_trailing_slash):
        request.redirect(request.environ['SCRIPT_NAME'] + '/' ,
                         permanent=1)
        return None
    # replace repeated slashes with a single slash
    if path.find("//") != -1:
        path = _slash_pat.sub("/", path)
    # split path apart; /foo/bar/baz  -> ['foo', 'bar', 'baz']
    #                   /foo/bar/     -> ['foo', 'bar', '']
    path_components = path[1:].split('/')
    # Traverse starting at the root
    object = root_namespace
    namespace_stack.append(object)
    # Loop over the components of the path
    for component in path_components:
        if component == "":
            # "/q/foo/" == "/q/foo/_q_index"
            if (callable(object) or isstring(object)) and \
                        request.get_method() == "GET" and fix_trailing_slash:
                query = request.environ.get('QUERY_STRING', '')
                query = query and "?" + query
                request.redirect(request.get_path()[:-1] + query, permanent=1)
                return None
            component = "_q_index"
        object = _get_component(object, component, path, request,
                               namespace_stack)
    if not (isstring(object) or callable(object) or hasattr(object, '__call__')):
        # We went through all the components of the path and ended up at
        # something which isn't callable, like a module or an instance
        # without a __call__ method.
        if path[-1] != '/' :
            _obj = _get_component(object, "_q_index", path, request,
                               namespace_stack)
            if (callable(_obj) or isstring(_obj)) and \
                    request.get_method() == "GET" and fix_trailing_slash:
                # This is for the convenience of users who type in paths.
                # Repair the path and redirect.  This should not happen for
                # URLs within the site.
                query = request.environ.get('QUERY_STRING', '')
                query = query and "?" + query
                request.redirect(request.get_path() + "/" + query, permanent=1)
                return None

        raise errors.TraversalError(
                "object is neither callable nor string",
                private_msg=repr(object),
                path=path)
    return object


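So when a component is "" (the request path ends with /), Quixote looks for _q_index. A patch has been applied here: if the object reached so far is already a callable or a string, the request is a GET, and fix_trailing_slash is set, Quixote instead redirects to the same path without the trailing slash.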
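
The path handling on its own is short enough to try out. This Python 3 sketch covers only slash collapsing, splitting, and the "" to _q_index substitution; the regular expression assigned to _slash_pat is my assumption (the excerpt does not show its definition), and split_path is a hypothetical name.

```python
import re

_slash_pat = re.compile(r"//+")  # assumed definition: two or more slashes


def split_path(path):
    """Split a PATH_INFO value into the component names that get looked up."""
    if "//" in path:
        path = _slash_pat.sub("/", path)
    # Drop the leading slash; an empty component stands for _q_index.
    return [c if c != "" else "_q_index" for c in path[1:].split("/")]


plain = split_path("/foo/bar/baz")
trailing = split_path("/foo//bar/")
```

The redirect branches are left out, since they depend on the object reached during traversal rather than on the path alone.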

The core function is still _get_component:
def _get_component(container, component, path, request, namespace_stack):
    if not hasattr(container, '_q_exports'):
        raise errors.TraversalError(
                    private_msg="%r has no _q_exports list" % container)

    if hasattr(container, '_q_access'):
        # will raise AccessError if access failed
        container._q_access(request)

    if component in container._q_exports or component == '_q_index':
        internal_name = component
    else:
        # check for an explicit external to internal mapping
        for value in container._q_exports:
            if type(value) is types.TupleType:
                if value[0] == component:
                    internal_name = value[1]
                    break
        else:
            internal_name = None
    if internal_name is None:
        # Component is not in exports list.
        object = None
        if hasattr(container, "_q_lookup"):
            object = container._q_lookup(request, component)
        elif hasattr(container, "_q_getname"):
            warnings.warn("_q_getname() on %s used; should "
                          "be replaced by _q_lookup()" % type(container))
            object = container._q_getname(request, component)
        if object is None:
            raise errors.TraversalError(
                private_msg="object %r has no attribute %r" % (
                                                    container,
                                                    component))

    # From here on, you can assume that the internal_name is not None
    elif hasattr(container, internal_name):
        # attribute is in _q_exports and exists
        object = getattr(container, internal_name)

    elif internal_name == '_q_index':
        if hasattr(container, "_q_lookup"):
            object = container._q_lookup(request, "")
        else:
            raise errors.AccessError(
                private_msg=("_q_index not found in %r" % container))

    elif hasattr(container, "_q_resolve"):
        object = container._q_resolve(internal_name)
        if object is None:
            raise RuntimeError, ("component listed in _q_exports, "
                                 "but not returned by _q_resolve(%r)"
                                 % internal_name)
        else:
            # Set the object, so _q_resolve won't need to be called again.
            setattr(container, internal_name, object)

    elif type(container) is types.ModuleType:
        mod_name = container.__name__ + '.' + internal_name
        object = _get_module(mod_name)
    else:
        raise errors.TraversalError(
                private_msg=("%r in _q_exports list, "
                             "but not found in %r" % (component,container)))

    namespace_stack.append(object)
    return object
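
The first step of _get_component, mapping a URL component to an attribute name, can be sketched separately. This is a Python 3 illustration; resolve_internal_name and the example export list are hypothetical, and only the tuple convention (external name, internal name) is taken from the code above.

```python
def resolve_internal_name(exports, component):
    """Return the attribute name for a URL component, or None if not exported."""
    if component in exports or component == "_q_index":
        return component
    # Look for an explicit (external, internal) mapping.
    for value in exports:
        if isinstance(value, tuple) and value[0] == component:
            return value[1]
    return None


exports = ["about", ("rss.xml", "rss_feed")]
direct = resolve_internal_name(exports, "about")
mapped = resolve_internal_name(exports, "rss.xml")
hidden = resolve_internal_name(exports, "secret")
```

A None result is what sends _get_component on to _q_lookup, so anything that is neither exported nor resolved dynamically ends in a TraversalError.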

These are all Quixote conventions, so there is not much more to say about them.

The code pasted here differs slightly from the official Quixote 1.2 source; the differences are all patches.

SessionManager is also worth a look; it is quite good.