Hi all! There's been a lot of good discussion here about doing paging efficiently, as opposed to just using offset. We've been discussing efficient paging internally too. We hope to eventually provide something built in, but in the short term we have other priorities. Still, we've come up with some interesting ideas, and I'd like to share one of them here.
Most of the efficient paging techniques discussed here have been based on sorting and filtering on a unique property. However, now that we have __key__ filtering, http://code.google.com/appengine/docs/datastore/queriesandindexes.html#Queries_on_Keys , we have a built-in unique field that we can filter and sort on. We can use that to support efficient paging in any query, on any schema, purely in user land, without requiring extra properties or other schema changes. (It will sometimes require a new composite index, but that's less invasive than requiring a new property.)
The executive summary is, when you first run the query, if it doesn't already have a sort order on __key__, append one. Then, when you run the query, save the last entity as a bookmark. To pick up from there later, set up the original query, add filters on all of the sort order properties so that it starts immediately after the bookmark entity.
The full design is below. We may provide some kind of built-in paging support in the future, but it won't be this. Given that, I'd love to see someone pick this up and implement it as an open source project!
Bookmarks
==
A bookmark is the last result entity that was returned to the app for a given query. That bookmark entity is serialized and base64-encoded so that apps can pass it around in URL query parameters, cookies, etc. To keep bookmarks reasonably sized, raw properties and properties not involved in the query should be stripped from the bookmark entity before serializing and encoding it.
Queries
==
If you want to be able to resume a query at a bookmark, you need to know that when you first run the query. That's because you need to append an ascending __key__ sort order to the query if it's not already there.
To resume a query at a bookmark, first, decode the bookmark, validate it, and check that it has all of the properties needed by the query. Then, use the algorithm below to generate and run a series of derived queries that transparently resume the query at the bookmarked entity.
Yes, queries plural. If your query has a sort order, you'll need to run two queries in serial to resume from a bookmark, due to the single inequality filter property limit. More accurately, if your query has n sort orders, you'll need to run n + 1 queries.
Also, as mentioned above, making a query bookmarkable sometimes require extra composite indices that the corresponding non-bookmarkable query wouldn't require. This is due to the extra __key__ sort order.
Resuming from a bookmark
==
Here's Python-like pseudocode for resuming a query at a bookmark. It generates a series of queries derived from the original query that, when run in order, return the original query's results starting at the bookmark. (The indentation is meaningful, so you might want to turn on a fixed-width font in your viewer here.)
given a query and a bookmark entity:
assert that all of the properties in the query are also in the bookmark entity
# massage the query. this may add extra sort orders, but won't remove the
# existing ones, so it doesn't change the query's invariants.
if the query has an inequality filter but no sort order:
add an ASC sort order on the inequailty filter property
if the query doesn't have a sort order on __key__:
append an ASC sort order on __key__
# prepare the original query as a template for the derived queries
if the query has inequality filters:
remove them and record them
for prop in the query's sort orders:
add a new filter to the query: "prop = [bookmark entity's value for prop]"
remove the query's sort orders, except for __key__, and record them
# generate the derived queries
make an empty FIFO queue for the derived queues
for prop in reversed(original query's sort orders):
make a deep copy of the original query
on that deep copy:
remove the = filter on prop
replace it with a new filter:
if original sort order on prop was ASC:
operator = ">"
else:
operator = "<"
the new filter is "prop [operator] [bookmark entity's value for prop]"
also, if there were other inequality filters on prop, add them back too
also add back the sort order on prop, at the beginning of the sort orders
enqueue this query
To run the query starting from a bookmark, unwind the queue, ie pop the derived queries off one at a time, run them, and return their results. Then each query runs out of results, pop the next, and so on.
Also, note that ancestor is unrelated. If it's specified on the original query, it should be included in the bookmarkable starting query and all of the derived queries, but it doesn't otherwise affect the algorithm otherwise.
Examples
==
(Again, turn on a fixed-width font in your viewer here.)
| *Original starting query* | *Bookmarkable starting query* | *Derived queries for starting from bookmark entity B* |
| SELECT * FROM Foo | SELECT * FROM Foo ORDER BY __key__ ASC | SELECT * FROM Foo WHERE __key__ > B ORDER BY __key__ ASC |
| WHERE x = 0 | WHERE x = 0 ORDER BY __key__ ASC | WHERE x = 0 AND __key__ > B ORDER BY __key__ ASC |
| WHERE x > 0 | WHERE x > 0 ORDER BY x ASC, __key__ ASC | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC |
| | | WHERE x > B.x ORDER BY x ASC, __key__ ASC |
| WHERE x = 0 AND y > 0 | WHERE x = 0 AND y > 0 ORDER BY y ASC, __key__ ASC | WHERE x = 0 AND y = B.y AND __key__ > B ORDER BY __key__ ASC |
| | | WHERE x = 0 AND y > B.y ORDER BY y ASC, __key__ ASC |
| WHERE x > 0 AND x < 9 | WHERE x > 0 AND x < 9 ORDER BY x ASC, __key__ ASC | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC |
| | | WHERE x > B.x AND x < 9 ORDER BY x ASC, __key__ ASC |
| WHERE __key__ > A AND __key__ < Z | WHERE __key__ > A AND __key__ < Z ORDER BY __key__ ASC | WHERE __key__ > B AND __key__ < Z ORDER BY __key__ ASC |
| ORDER BY x ASC | ORDER BY x ASC, __key__ ASC | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC |
| | | WHERE x > B.x ORDER BY x ASC, __key__ ASC |
| ORDER BY x DESC | ORDER BY x DESC, __key__ ASC | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC |
| | | WHERE x < B.x ORDER BY x DESC, __key__ ASC |
| ORDER BY __key__ ASC | ORDER BY __key__ ASC | WHERE __key__ > B ORDER BY __key__ ASC |
| ORDER BY __key__ DESC | ORDER BY __key__ DESC | WHERE __key__ < B ORDER BY __key__ DESC |
| ORDER BY x ASC, y DESC | ORDER BY x ASC, y DESC, __key__ ASC | WHERE x = B.x AND y = B.y AND __key__ > B ORDER BY __key__ ASC |
| | | WHERE x = B.x AND y < B.y ORDER BY y DESC, __key__ ASC |
| | | WHERE x > B.x ORDER BY x ASC, y DESC, __key__ ASC |
| ORDER BY x ASC, __key__ DESC | ORDER BY x ASC, __key__ DESC | WHERE x = B.x AND __key__ < B ORDER BY __key__ DESC |
| | | WHERE x > B.x ORDER BY x ASC, __key__ DESC |
| WHERE x = 0 ORDER BY y DESC | WHERE x = 0 ORDER BY y DESC, __key__ ASC | WHERE x = 0 AND y = B.y AND __key__ > B ORDER BY __key__ ASC |
| | | WHERE x = 0 AND y < B.y ORDER BY y DESC, __key__ ASC |
| WHERE x > 0 AND x < 9 ORDER BY x DESC | WHERE x > 0 AND x < 9 ORDER BY x DESC, __key__ ASC | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC |
| | | WHERE x < B AND x > 0 ORDER BY x DESC, __key__ ASC |
分享到:
相关推荐
【标题】"gae-pytorch-master_pytorch_pytorchgae_GAE_自编码器_gaepytorchmaster_" 提供的信息表明,这是一个使用PyTorch实现的图自编码器(Graph Autoencoder, GAE)项目,其核心是将自编码器的概念应用于图数据。...
【基于GAE的Demo】是一个使用Eclipse集成开发环境构建的项目,主要展示了如何在Google App Engine(GAE)平台上整合Struts2、Spring和Tiles框架。GAE是Google提供的一个云计算平台,允许开发者在Google的基础设施上...
**图形自动编码器(GAE)在PyTorch中的实现** **一、GAE概述** 图形自动编码器(Graph Autoencoder, GAE)是一种应用于图数据的深度学习模型,它结合了自动编码器(Autoencoder)的思想与图神经网络(Graph Neural...
GAE使用规则GAE使用规则GAE使用规则GAE使用规则GAE使用规则GAE使用规则GAE使用规则GAE使用规则GAE使用规则
【标题】"Spring+GAE"揭示了将Google App Engine(GAE)与Spring框架集成的主题,这是一个在云端运行Java应用程序的关键技术组合。Spring是一个广泛使用的开源Java框架,提供了依赖注入、面向切面编程和MVC(模型-...
标题 "GAE包(以配置好,解压可用)" 提供的信息表明,这是一个已经预配置好的Google App Engine (GAE)开发环境的压缩包。GAE是Google提供的一项平台即服务(PaaS),允许开发者在Google的基础设施上运行自己的Web...
pass之GAE入门教程, 学习GAE
### GAE之webapp框架详解 #### 一、引言 在Google App Engine (GAE) 平台上进行Web应用开发时,选择合适的框架对于提高开发效率至关重要。其中,`webapp` 框架因其简洁高效而备受开发者青睐。本篇文章将详细介绍`...
标题“GAE blog安装”指的是在Google App Engine (GAE)上部署一个博客应用的过程。GAE是一个由Google提供的平台即服务(PaaS)云环境,允许开发者构建、运行和维护Web应用程序,无需管理和维护底层基础设施。在这个...
1. **安装和配置Quercus**:首先,你需要下载Quercus的Java库,并将其添加到你的GAE项目类路径中。这可以通过在你的项目`lib`目录下放置Quercus的JAR文件来完成。 2. **构建PHP处理程序**:创建一个Java类作为PHP...
**谷歌应用引擎(Google App Engine, GAE)**是谷歌提供的一种云计算平台,允许开发者构建、部署和运行基于Web的应用程序。GAE支持多种编程语言,包括Python、Java、Go、Node.js等,提供了完整的基础设施,如数据库...
《GAE编程指南》是一种云计算服务,跟其他的同类产品不同,它提供了一种简单的应用程序构建模型,通过这种模型,你可以轻松地构建出能够容纳数百万用户的应用程序。《GAE编程指南》是介绍使用这个强大平台的专家级...
标题“GAE read rss send to 腾讯微博”指的是一个使用Google App Engine(GAE)平台开发的应用程序,该程序的功能是从RSS源读取数据并将其发布到腾讯微博。RSS(Really Simple Syndication)是一种内容聚合格式,常...
云计算下的PaaS中的GAE和SAE平台
标题中的“gtap,基于GAE的代理”指的是一个名为GTAProxy的项目,它是一个构建在Google App Engine(GAE)平台上的代理服务。这个服务的主要目的是为用户提供访问Twitter API的能力,尤其在某些地区或者特定网络环境...
### GAE(Google App Engine)空间申请使用教程及 GAE域名捆绑方法 #### GAE简介与功能概述 GAE(Google App Engine)是由谷歌提供的一个强大的云服务平台,它允许开发者构建并托管各种类型的应用程序。从实用性...
GAE上可以用的JAVA Blog源代码 可以在GAE上直接使用,支持图片上传等。 源代码是修改其他网友的普通blog程序而来,只做了必要的修改,原结构保留 最新版本请去主页下载 http://redpower1998.appspot.com 主页包括...
【标题】"GAE扩展样例程序"是一个针对Google App Engine (GAE) 平台的EGL(Enterprise Generation Language)扩展项目。这个程序的主要目的是为开发者提供一个模板或者起点,帮助他们更好地理解和实践如何在GAE上...
《GAE编程指南》是一种云计算服务,跟其他的同类产品不同,它提供了一种简单的应用程序构建模型,通过这种模型,你可以轻松地构建出能够容纳数百万用户的应用程序。《GAE编程指南》是介绍使用这个强大平台的专家级...