22 Feb 2012, by Bright Zheng (IT进行时)
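A few gripes and reflections before this chapter:
1. I find modeling a rather awkward business, especially when your head is full of RDBMS ERDs;
2. I tried to show the key points of the thinking through both modeling processes, but the NoSQL modeling feels a bit "off": unless you have been burned on a large project or learned directly from those who came before, something always seems to be missing;
3. This is the most frustrating installment I have written, and it may need countless patches later, but so be it; insight only comes from mistakes.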
5. Data Modeling
Data modeling is one of the most important parts of working with Cassandra, especially for those with a lot of RDBMS data modeling experience.
Inspired by the Twissandra project, we name our example Jtwissandra. If possible, I'll try to implement it and share it on GitHub.
It is a simple example that showcases NoSQL concepts by imitating Twitter on top of Cassandra.
5.1. Traditional RDBMS Data Modeling
The following are the core entities and relationships if we model in RDBMS terms.
Here is some pseudocode demonstrating the business logic/requirements:
1. Adding a new user:
USER.insert(user_id, user_name, user_password, create_timestamp);
2. Following a friend:
FRIEND.insert(user_id, followed_id, create_timestamp) as ($current_user_id, user_id, create_timestamp);
FOLLOWER.insert(user_id, follower_id, create_timestamp) as (user_id, $current_user_id, create_timestamp);
3. Tweeting:
// columns as used by the TWEET query in step 4 and the TWEET column family in 5.2.4.1
TWEET.insert(tweet_id, user_id, tweet_content, create_timestamp);
4. Getting tweets (those tweeted by myself and by my friends):
select * from TWEET t where t.user_id = $current_user_id or t.user_id in ( select followed_id from FRIEND where user_id = $current_user_id )
Comment: what a bottleneck this is! It is also the most important reason why Twitter had to migrate to NoSQL solutions.
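To make the query in step 4 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout follows the pseudocode above; the short ids (`u0`, `t11`, ...), the third user and the `ORDER BY` clause are assumptions added for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user   (user_id TEXT PRIMARY KEY, user_name TEXT);
CREATE TABLE friend (user_id TEXT, followed_id TEXT);
CREATE TABLE tweet  (tweet_id TEXT PRIMARY KEY, user_id TEXT,
                     tweet_content TEXT, create_timestamp INTEGER);
""")
conn.executemany("INSERT INTO user VALUES (?, ?)",
                 [("u0", "itstarting"), ("u1", "test1"), ("u2", "stranger")])
# u0 follows u1 (hypothetical sample data)
conn.execute("INSERT INTO friend VALUES ('u0', 'u1')")
conn.executemany("INSERT INTO tweet VALUES (?, ?, ?, ?)", [
    ("t11", "u0", "Hello world: 11", 1),
    ("t21", "u1", "Hello world: 21", 2),
    ("t31", "u2", "Hello world: 31", 3),
])


def get_tweets(conn, current_user_id):
    """Tweets by the user and by everyone the user follows, oldest first."""
    rows = conn.execute(
        """
        SELECT tweet_content FROM tweet
        WHERE user_id = ?
           OR user_id IN (SELECT followed_id FROM friend WHERE user_id = ?)
        ORDER BY create_timestamp
        """,
        (current_user_id, current_user_id),
    )
    return [row[0] for row in rows]


timeline_u0 = get_tweets(conn, "u0")
```

Every read of the home page has to evaluate the subquery over FRIEND and filter TWEET again, which is the work the NoSQL model below tries to move to write time.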
5.2. NoSQL Data Modeling
Before we go deeper into NoSQL data modeling with Cassandra, we must understand its key design points.
1. Cassandra is a key-value based model
2. Cassandra supports more complex modeling by introducing the concept of the Super Column
3. Data can be stored in two ways: as column names or as column values (this is really confusing for beginners, but it becomes liberating once you understand more, especially about indexing)
4. The columns in a Column Family, normal or super, are sorted by column name, not by value
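The design points above can be pictured with a toy in-memory model: a column family maps a row key to a set of columns, and a read returns the columns ordered by name. This is only an illustrative sketch, not a Cassandra client; the `ColumnFamily` class, its method names and the sample keys are all hypothetical.

```python
class ColumnFamily:
    """Toy stand-in for a column family: row key -> {column name: value}."""

    def __init__(self):
        self.rows = {}

    def set(self, key, column_name, value):
        # Data can live in the column name as well as in the value.
        self.rows.setdefault(key, {})[column_name] = value

    def get_slice(self, key, count=None, reverse=False):
        # Columns come back sorted by column name, never by value.
        columns = sorted(self.rows.get(key, {}).items(),
                         key=lambda item: item[0], reverse=reverse)
        return columns[:count]


cf = ColumnFamily()
# Timestamps are used as column names, so the row is ordered by time.
cf.set("user-x", 1329836819859000, "friend-b")
cf.set("user-x", 1329836819781000, "friend-a")
cf.set("user-x", 1329836819900000, "friend-c")

newest_two = cf.get_slice("user-x", count=2, reverse=True)
```

Putting the timestamp in the column name is what later lets the FRIEND, FOLLOWER and TIMELINE column families return their entries already ordered by time.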
So let’s get started.
We need to create the Keyspace first.
create keyspace JTWISSANDRA with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = [{replication_factor:1}];
Under this Keyspace, we'll model the column families one by one.
5.2.1. User
Key points to consider:
- For the key we can simply use a Time UUID
- The user_name must have a (secondary) index because we may search by it
- The create_timestamp should have a (secondary) index because we may use it for search or for some kind of partitioning
- The password must not be stored as plain text. Here I encode it as base64, although base64 is reversible and a salted hash would be safer. No more CSDN stories, please.
So the sample data model is as follows:
ColumnFamily: USER

| Key | Column name | Column value |
|---|---|---|
| 550e8400-e29b-41d4-a716-446655440000 | "user_name" | "itstarting" |
| | "password" | "******" |
| | "create_timestamp" | 1329836819890000 |
| 550e8400-e29b-41d4-a716-446655440001 | "user_name" | "test1" |
| | "password" | "******" |
| | "create_timestamp" | 1329836819890001 |
Here is the create script:
create column family USER
with comparator = UTF8Type
and key_validation_class = UTF8Type
and default_validation_class = UTF8Type
and column_metadata = [
{column_name: user_name, validation_class: UTF8Type, index_name: user_name_idx, index_type: KEYS},
{column_name: password, validation_class: UTF8Type},
{column_name: create_timestamp, validation_class: LongType, index_name: create_timestamp_idx, index_type: KEYS}
];
And the insert script (CLI), for demonstration only:
// insert user 550e8400-e29b-41d4-a716-446655440000
set USER['550e8400-e29b-41d4-a716-446655440000']['user_name'] = 'itstarting';
set USER['550e8400-e29b-41d4-a716-446655440000']['password'] = '111222';
set USER['550e8400-e29b-41d4-a716-446655440000']['create_timestamp'] = 1329836819890000;
// insert user 550e8400-e29b-41d4-a716-446655440001
set USER['550e8400-e29b-41d4-a716-446655440001']['user_name'] = 'test1';
set USER['550e8400-e29b-41d4-a716-446655440001']['password'] = '222111';
set USER['550e8400-e29b-41d4-a716-446655440001']['create_timestamp'] = 1329836819890001;
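As a rough sketch of how an application might assemble such a USER row before writing it, the snippet below uses only the Python standard library. The helper name `new_user_row` is hypothetical, and the timestamp is assumed to be microseconds since the epoch, which matches the magnitude of the sample values. Note that base64 is an encoding, not encryption: it only keeps the password from being readable at a glance.

```python
import base64
import time
import uuid


def new_user_row(user_name, password):
    """Build the row key and columns for one USER row (illustrative only)."""
    row_key = str(uuid.uuid1())  # version 1 UUIDs are time-based
    columns = {
        "user_name": user_name,
        # base64 as described in the key points; it is reversible
        "password": base64.b64encode(password.encode("utf-8")).decode("ascii"),
        # assumption: microseconds since the epoch
        "create_timestamp": int(time.time() * 1_000_000),
    }
    return row_key, columns


key, cols = new_user_row("itstarting", "111222")
```

The returned `key` and `cols` would then be handed to whatever client library performs the actual write.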
5.2.2. Friend
Friends answer the question: whom is user X following?
Key points to consider:
- The key should be the uuid of user X
- The timestamp at which the relationship was built is the column name (so friends are sorted by time) and the friend's uuid is the value. Wow again here. Right?
Let's say the two users we created are friends of each other.
So the sample data model is as follows:
ColumnFamily: FRIEND

| Key | Column name | Column value |
|---|---|---|
| 550e8400-e29b-41d4-a716-446655440000 | "1329836819859000" | "550e8400-e29b-41d4-a716-446655440001" |
| | (if the user has more friends, more columns go here) | |
| 550e8400-e29b-41d4-a716-446655440001 | "1329836819781000" | "550e8400-e29b-41d4-a716-446655440000" |
| | (if the user has more friends, more columns go here) | |

The first record means that user X is 550e8400-e29b-41d4-a716-446655440000, his/her friend is 550e8400-e29b-41d4-a716-446655440001, and the relationship was established at timestamp 1329836819859000.
Here is the create script:
create column family FRIEND with comparator = LongType and key_validation_class = UTF8Type and default_validation_class = UTF8Type;
No more column name definitions here? Yes, Cassandra is a so-called schema-free data store. Wow!
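Because the column names in FRIEND are timestamps, the comparator decides the order in which friends come back. The small sketch below only mimics the two orderings in plain Python (string comparison for a UTF8Type-like comparator, integer comparison for a LongType-like one); the short name "999" is a made-up value chosen to make the difference visible.

```python
column_names = ["1329836819859000", "999", "1329836819781000"]

# String (lexicographic) ordering: "1..." sorts before "9..." regardless of length.
as_utf8 = sorted(column_names)

# Numeric ordering: values are compared as integers.
as_long = sorted(column_names, key=int)
```

With timestamps that all have the same number of digits both orderings agree, which is why a string comparator can appear to work in small tests.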
And the insert script (CLI), for demonstration only:
set FRIEND['550e8400-e29b-41d4-a716-446655440000']['1329836819859000'] = '550e8400-e29b-41d4-a716-446655440001';
set FRIEND['550e8400-e29b-41d4-a716-446655440001']['1329836819781000'] = '550e8400-e29b-41d4-a716-446655440000';
5.2.3. Follower
Follower is the reverse of Friend: who is following user X?
Key points to consider:
- The key should be the uuid of user X
- The timestamp at which the relationship was built is the column name (so followers are sorted by time) and the follower's uuid is the value.
In practice this write should happen together with the friend creation (Cassandra has no multi-row transactions, so a batch mutation is the closest equivalent). So we continue with the sample from the Friend section.
So the sample data model is as follows:
ColumnFamily: FOLLOWER

| Key | Column name | Column value |
|---|---|---|
| 550e8400-e29b-41d4-a716-446655440001 | "1329836819859000" | "550e8400-e29b-41d4-a716-446655440000" |
| | (if the user has more followers, more columns go here) | |
| 550e8400-e29b-41d4-a716-446655440000 | "1329836819781000" | "550e8400-e29b-41d4-a716-446655440001" |
| | (if the user has more followers, more columns go here) | |
Here is the create script:
create column family FOLLOWER with comparator = LongType and key_validation_class = UTF8Type and default_validation_class = UTF8Type;
And the insert script (CLI), for demonstration only:
set FOLLOWER['550e8400-e29b-41d4-a716-446655440001']['1329836819859000'] = '550e8400-e29b-41d4-a716-446655440000';
set FOLLOWER['550e8400-e29b-41d4-a716-446655440000']['1329836819781000'] = '550e8400-e29b-41d4-a716-446655440001';
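The FRIEND and FOLLOWER rows are two views of the same event, so a "follow" action has to write both. Here is a minimal sketch with plain dictionaries standing in for the two column families; the `follow` function is hypothetical, and a real client would send both writes together rather than mutate local dicts.

```python
import time

friend = {}    # user uuid -> {timestamp: uuid of the user being followed}
follower = {}  # user uuid -> {timestamp: uuid of the follower}


def follow(current_user_id, followed_id, timestamp=None):
    """Record that current_user_id follows followed_id in both views."""
    if timestamp is None:
        timestamp = int(time.time() * 1_000_000)  # assumed: microseconds
    friend.setdefault(current_user_id, {})[timestamp] = followed_id
    follower.setdefault(followed_id, {})[timestamp] = current_user_id
    return timestamp


A = "550e8400-e29b-41d4-a716-446655440000"
B = "550e8400-e29b-41d4-a716-446655440001"
follow(A, B, 1329836819859000)  # A follows B
follow(B, A, 1329836819781000)  # B follows A
```

One design consequence worth noting: with the timestamp alone as the column name, two relationships created for the same user in the same microsecond would overwrite each other.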
5.2.4. Tweet & Timeline
The tweets are the soul of Twitter.
Key points to consider:
- How to get my tweets?
- How to get my friends' tweets without a join?
- How to sort all tweets, including mine and my friends'?
That's why Twitter introduced the concept of the Timeline.
Let's imagine something like this (please correct me if I'm wrong in the following discussion):
All events (tweets) are laid out along time.
A user's Timeline is the line of all events related to that user, including:
- The events (tweets) I sent
- The events (tweets) my friends sent
So tweets are inserted into the Tweet CF, but we need one more CF: Timeline.
CAUTION: The following learning notes/exercises might not be correct; read on at your own risk. Of course, any feedback is welcome.
5.2.4.1. Tweet
ColumnFamily: TWEET

| Key | Column name | Column value |
|---|---|---|
| 550e8400-e29b-41d4-a716-446655440011 | "user_uuid" | "550e8400-e29b-41d4-a716-446655440000" |
| | "tweet_content" | "Hello world: 11" |
| 550e8400-e29b-41d4-a716-446655440012 | "user_uuid" | "550e8400-e29b-41d4-a716-446655440000" |
| | "tweet_content" | "Hello world: 12" |
| 550e8400-e29b-41d4-a716-446655440021 | "user_uuid" | "550e8400-e29b-41d4-a716-446655440001" |
| | "tweet_content" | "Hello world: 21" |
| 550e8400-e29b-41d4-a716-446655440022 | "user_uuid" | "550e8400-e29b-41d4-a716-446655440001" |
| | "tweet_content" | "Hello world: 22" |
Here is the create script:
create column family TWEET
with comparator = UTF8Type
and key_validation_class = UTF8Type
and default_validation_class = UTF8Type
and column_metadata = [
{column_name: user_uuid, validation_class: UTF8Type},
{column_name: tweet_content, validation_class: UTF8Type}
];
And the insert script (CLI), for demonstration only:
set TWEET['550e8400-e29b-41d4-a716-446655440011']['user_uuid'] = '550e8400-e29b-41d4-a716-446655440000';
set TWEET['550e8400-e29b-41d4-a716-446655440011']['tweet_content'] = 'Hello world: 11';
set TWEET['550e8400-e29b-41d4-a716-446655440012']['user_uuid'] = '550e8400-e29b-41d4-a716-446655440000';
set TWEET['550e8400-e29b-41d4-a716-446655440012']['tweet_content'] = 'Hello world: 12';
set TWEET['550e8400-e29b-41d4-a716-446655440021']['user_uuid'] = '550e8400-e29b-41d4-a716-446655440001';
set TWEET['550e8400-e29b-41d4-a716-446655440021']['tweet_content'] = 'Hello world: 21';
set TWEET['550e8400-e29b-41d4-a716-446655440022']['user_uuid'] = '550e8400-e29b-41d4-a716-446655440001';
set TWEET['550e8400-e29b-41d4-a716-446655440022']['tweet_content'] = 'Hello world: 22';
5.2.4.2. Timeline
ColumnFamily: TIMELINE

| Key | Column name | Column value |
|---|---|---|
| 550e8400-e29b-41d4-a716-446655440000 | "1329883039824000" | "550e8400-e29b-41d4-a716-446655440011" |
| | "1329883039825000" | "550e8400-e29b-41d4-a716-446655440021" |
| | "1329883039934000" | "550e8400-e29b-41d4-a716-446655440012" |
| | "1329883039935000" | "550e8400-e29b-41d4-a716-446655440022" |
| 550e8400-e29b-41d4-a716-446655440001 | "1329883039824000" | "550e8400-e29b-41d4-a716-446655440011" |
| | "1329883039825000" | "550e8400-e29b-41d4-a716-446655440021" |
| | "1329883039934000" | "550e8400-e29b-41d4-a716-446655440012" |
| | "1329883039935000" | "550e8400-e29b-41d4-a716-446655440022" |
Here is the create script:
create column family TIMELINE with comparator = LongType and key_validation_class = UTF8Type and default_validation_class = UTF8Type;
And the insert script (CLI), for demonstration only:
set TIMELINE['550e8400-e29b-41d4-a716-446655440000']['1329883039824000'] = '550e8400-e29b-41d4-a716-446655440011';
set TIMELINE['550e8400-e29b-41d4-a716-446655440000']['1329883039825000'] = '550e8400-e29b-41d4-a716-446655440021';
set TIMELINE['550e8400-e29b-41d4-a716-446655440000']['1329883039934000'] = '550e8400-e29b-41d4-a716-446655440012';
set TIMELINE['550e8400-e29b-41d4-a716-446655440000']['1329883039935000'] = '550e8400-e29b-41d4-a716-446655440022';
set TIMELINE['550e8400-e29b-41d4-a716-446655440001']['1329883039824000'] = '550e8400-e29b-41d4-a716-446655440011';
set TIMELINE['550e8400-e29b-41d4-a716-446655440001']['1329883039825000'] = '550e8400-e29b-41d4-a716-446655440021';
set TIMELINE['550e8400-e29b-41d4-a716-446655440001']['1329883039934000'] = '550e8400-e29b-41d4-a716-446655440012';
set TIMELINE['550e8400-e29b-41d4-a716-446655440001']['1329883039935000'] = '550e8400-e29b-41d4-a716-446655440022';
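One way to read the TIMELINE sample data is as "fan-out on write": each tweet is stored once in TWEET, and its id is then copied into the timeline row of the author and of every follower, so reading a timeline is a single-row lookup with no join. The sketch below is only my interpretation of the data above, using plain dictionaries instead of a Cassandra client; the function names and the `count` parameter are hypothetical.

```python
A = "550e8400-e29b-41d4-a716-446655440000"
B = "550e8400-e29b-41d4-a716-446655440001"

tweet = {}     # tweet uuid -> {"user_uuid": ..., "tweet_content": ...}
timeline = {}  # user uuid -> {timestamp: tweet uuid}
# FOLLOWER rows from the earlier section: A and B follow each other.
follower = {B: {1329836819859000: A}, A: {1329836819781000: B}}


def post_tweet(tweet_id, user_id, content, timestamp):
    """Store the tweet once, then fan its id out to every interested timeline."""
    tweet[tweet_id] = {"user_uuid": user_id, "tweet_content": content}
    recipients = [user_id] + list(follower.get(user_id, {}).values())
    for recipient in recipients:
        timeline.setdefault(recipient, {})[timestamp] = tweet_id


def get_timeline(user_id, count=20):
    """Newest-first tweet contents for one user, read from a single row."""
    row = timeline.get(user_id, {})
    newest = sorted(row, reverse=True)[:count]
    return [tweet[row[ts]]["tweet_content"] for ts in newest]


post_tweet("550e8400-e29b-41d4-a716-446655440011", A, "Hello world: 11", 1329883039824000)
post_tweet("550e8400-e29b-41d4-a716-446655440021", B, "Hello world: 21", 1329883039825000)
post_tweet("550e8400-e29b-41d4-a716-446655440012", A, "Hello world: 12", 1329883039934000)
post_tweet("550e8400-e29b-41d4-a716-446655440022", B, "Hello world: 22", 1329883039935000)

latest_for_a = get_timeline(A, count=2)
```

The trade-off is that a write now costs one timeline insert per follower, in exchange for reads that never need the `IN (select ...)` subquery from the RDBMS version.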