Combining Probabilities
Suppose that Mr. Smith, who is correct 75% of the time, claims that a certain event X will NOT occur. On this basis alone, the probability of X occurring would seem to be 0.25. On the other hand, Mr. Jones, who is correct 60% of the time, claims that X WILL occur. Given both of these predictions, with their respective reliabilities, what is the probability that X will occur?

The problem is underspecified. Essentially the overall context has 3 parameters: S (Smith's prediction), J (Jones's prediction), and R (the real outcome). Thus, letting "y" and "n" respectively denote X and not X, the eight possibilities along with their probabilities are

$$
\begin{array}{ccc|c}
S & J & R & \text{probability} \\ \hline
n & n & n & p_0 \\
n & n & y & p_1 \\
n & y & n & p_2 \\
n & y & y & p_3 \\
y & n & n & p_4 \\
y & n & y & p_5 \\
y & y & n & p_6 \\
y & y & y & p_7
\end{array}
$$

where

$$p_0 + p_1 + p_2 + p_3 + p_4 + p_5 + p_6 + p_7 = 1$$

Also, since S=R 75% of the time, we have

$$p_0 + p_2 + p_5 + p_7 = 0.75$$

and since J=R 60% of the time, we have

$$p_0 + p_3 + p_4 + p_7 = 0.60$$

From this we want to determine the probability that R=y given that S=n and J=y. Thus, we need to find the value of $p_3/(p_2+p_3)$, which is the probability of [n y y] divided by the probability of [n y *], where "*" indicates "either y or n".

Setting $A=p_0+p_7$, $B=p_1+p_6$, $C=p_2+p_5$, and $D=p_3+p_4$, the conditions can be written as

$$
\begin{aligned}
A + B + C + D &= 1.00 \\
A + C &= 0.75 \\
A + D &= 0.60
\end{aligned}
$$

which is three linear equations in four unknowns (with the extra constraint that each probability is in the interval 0 to 1), so there are infinitely many solutions. For example, we can set A=0.5, B=0.15, C=0.25, and D=0.10, and satisfy all three equations, but we could also set A=0.6, B=0.25, C=0.15, and D=0.00.

Furthermore, even if we arbitrarily select one of these solutions, there are still infinitely many ways of partitioning C and D to give the values of $p_2$ and $p_3$. For example, suppose we take the solution with C=0.25 and D=0.10. We then have $p_2+p_5 = 0.25$ and $p_3+p_4 = 0.10$. If we take $p_4=0.10$ and $p_5=0.00$, we have $p_3=0.00$ and $p_2=0.25$, so the probability of X is 0. On the other hand, we can equally well take $p_4=0.00$ and $p_5=0.25$, which gives $p_3=0.10$ and $p_2=0.00$, so the probability of X is 1. Thus, any answer from 0 to 1 is strictly consistent with the stated conditions.

Nevertheless, in real life the problems we confront are often (always?) underspecified, and our customers will likely not be satisfied with that as an answer. Are there any "reasonable" assumptions we could make, in the absence of more information, that would enable us to give a "reasonable" answer?

One approach would be to estimate (guess) how much correlation exists between the correctness of J and S. For example, since S seems to be smarter than J, we might assume that S is correct whenever J is correct, as well as being correct on some experiments when J is incorrect. This would imply that $p_3=p_4=0$ and $p_2>0$, so the probability of X is 0.

--------------------------------------------------------------------------------
Independent predictions + prior probability of the outcome is 1/2

Another approach would be to assume that the correctness of S's and J's predictions is statistically independent, in the sense that each is just as likely to be right regardless of whether the other is right or wrong. This assumption implies

$$0.60 = \frac{p_0+p_7}{p_0+p_2+p_5+p_7} = \frac{p_3+p_4}{p_1+p_3+p_4+p_6}$$

that is, P(J correct) = P(J correct | S correct) = P(J correct | S wrong), and

$$0.75 = \frac{p_0+p_7}{p_0+p_3+p_4+p_7} = \frac{p_2+p_5}{p_1+p_2+p_5+p_6}$$

that is, P(S correct) = P(S correct | J correct) = P(S correct | J wrong).

Letting u=0.6 and v=0.75 denote the probabilities of correctness for Jones and Smith respectively, these equations together with the previous constraints uniquely determine the four sums

$$
\begin{aligned}
p_0+p_7 &= uv = 9/20 \\
p_1+p_6 &= (1-u)(1-v) = 2/20 \\
p_2+p_5 &= (1-u)v = 6/20 \\
p_3+p_4 &= u(1-v) = 3/20
\end{aligned}
$$

but this still doesn't uniquely determine the value of $p_3/(p_2+p_3)$. We need at least one more assumption. I would suggest that we assume symmetry between "y" and "n".
In other words, assume that the probability of any combination is equal to the probability of the complementary combination, given by changing each "y" to an "n" and vice versa. This amounts to the assumption that X (the real answer) has an a priori probability of 1/2, AND that the probability of predicting correctly is the same regardless of whether R is "y" or "n". On this basis we have $p_0=p_7$, $p_1=p_6$, $p_2=p_5$, and $p_3=p_4$, so we have $p_2=6/40$, $p_3=3/40$, and $p_3/(p_2+p_3) = 1/3$.

Therefore, assuming S and J are not correlated, and assuming "y" and "n" are symmetrical, the probability of X, given that S[75%] says X will not occur and J[60%] says it will, is 33.3%. Any (positive) correlation between S and J would tend to lower this probability.

--------------------------------------------------------------------------------
Generalization 1: independent predictions + prior probability of the outcome is x

Notice that our resulting value for the probability of X is not equal to the a priori value of 1/2 that we assumed by imposing symmetry between "y" and "n". If we have some a priori reason to believe the probability of X is something different from 1/2, we could redo the calculation using this value. (Of course, we cannot use the computed probability of X for the particular conditions at hand, because the a priori probability of X applies to all possible conditions, not just when Smith says it won't occur and Jones says it will.)

To account for this additional information (if we have it), we can let x denote the a priori probability of X, and then write the individual state probabilities as

$$
\begin{aligned}
p_0 &= uv(1-x) & p_7 &= uvx \\
p_1 &= (1-u)(1-v)x & p_6 &= (1-u)(1-v)(1-x) \\
p_2 &= (1-u)v(1-x) & p_5 &= (1-u)vx \\
p_3 &= u(1-v)x & p_4 &= u(1-v)(1-x)
\end{aligned}
$$

On this basis the probability of X is

$$\Pr\{X\} = \frac{p_3}{p_2+p_3} = \frac{u(1-v)x}{(1-u)v(1-x) + u(1-v)x}$$

Naturally if we have no knowledge of the a priori probability of X, we just assume x=1/2, and this formula reduces to the one given previously.
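The formula for Pr{X} above can be checked with exact arithmetic. The following is only a minimal sketch under the independence assumption stated in the text; the function name `prob_x` and its argument names are illustrative choices, not part of the original article.

```python
from fractions import Fraction

def prob_x(u, v, x=Fraction(1, 2)):
    """Pr{X | Smith says "no", Jones says "yes"} = p3 / (p2 + p3).

    u: reliability of Jones, v: reliability of Smith,
    x: a priori probability of X (hypothetical parameter name).
    Assumes the two predictors' correctness is independent.
    """
    p2 = (1 - u) * v * (1 - x)  # state [n y n]: Smith right, Jones wrong
    p3 = u * (1 - v) * x        # state [n y y]: Jones right, Smith wrong
    return p3 / (p2 + p3)

u = Fraction(3, 5)   # Jones, 60%
v = Fraction(3, 4)   # Smith, 75%
print(prob_x(u, v))                  # 1/3
print(prob_x(u, v, Fraction(2, 3)))  # 1/2
```

With the symmetric prior x=1/2 this reproduces the 1/3 found earlier; raising the prior to 2/3 happens to bring the answer back to exactly 1/2, which shows how sensitive the result is to the assumed prior.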
For a slightly more complicated case, suppose Mr. Red's ability to correctly identify the outcome of a TRUE/FALSE experiment is 75%, Mr. Green's is 60% and Mr. Blue's is 55%. If Mr. Blue, Mr. Green, and Mr. Red all agree that the outcome of the experiment is TRUE, is the resulting probability of "TRUE" 75%, or is it weighted somewhere between 75% and 55%?

--------------------------------------------------------------------------------
Generalization 2: pairwise independence + "y"/"n" symmetry

Again this is underspecified, but if we impose the assumptions of (1) pairwise independence and (2) "y"/"n" symmetry, then in the general case of N prognosticators these two assumptions are sufficient to uniquely determine the answer. In other words, if N people with reliabilities $r_1, r_2, \ldots, r_N$ have each predicted the outcome will be "TRUE", and if we assume the correctness of their predictions has no correlation, and that there is symmetry between TRUE and FALSE outcomes, then the probability of a "TRUE" outcome is

$$\Pr\{\text{TRUE}\} = \frac{r_1 r_2 \cdots r_N}{r_1 r_2 \cdots r_N + (1-r_1)(1-r_2)\cdots(1-r_N)}$$

Thus, in the particular example described above with $r_1=3/4$, $r_2=3/5$, and $r_3=11/20$, the probability of "TRUE" is 11/13 (i.e., about 84.6%).

Let $Q=[q_1,q_2,\ldots,q_N]$ denote a logical vector (i.e., each component $q_j$ is either "TRUE" or "FALSE") and let $Q'$ denote the complement of $Q$. Also, define

$$
f(r,q) =
\begin{cases}
1-r & \text{if } q = \text{FALSE} \\
r & \text{if } q = \text{TRUE}
\end{cases}
$$

and let $F(Q)$ denote the product of $f(r_i,q_i)$, i=1 to N. Then the probability that the outcome will be TRUE given the predictions Q is given by

$$\Pr\{\text{TRUE}\} = \frac{F(Q)}{F(Q) + F(Q')}$$

This result is formally correct, given the stated assumptions, but as discussed earlier, the most important thing to realize about these problems is that they are underspecified and have no definite answer.
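As an illustration of the F(Q)/(F(Q)+F(Q')) rule, here is a small sketch using exact fractions. It assumes uncorrelated predictors and TRUE/FALSE symmetry, exactly as stated above; the helper names `f` and `prob_true` are hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import prod

def f(r, q):
    """Weight of one predictor: r if they predicted TRUE, else 1 - r."""
    return r if q else 1 - r

def prob_true(reliabilities, predictions):
    """Pr{TRUE} = F(Q) / (F(Q) + F(Q')) under independence and symmetry."""
    fq = prod(f(r, q) for r, q in zip(reliabilities, predictions))
    fq_comp = prod(f(r, not q) for r, q in zip(reliabilities, predictions))
    return fq / (fq + fq_comp)

# Red (75%), Green (60%) and Blue (55%) all predict TRUE.
rs = [Fraction(3, 4), Fraction(3, 5), Fraction(11, 20)]
print(prob_true(rs, [True, True, True]))  # 11/13

# Smith (75%) says FALSE, Jones (60%) says TRUE.
print(prob_true([Fraction(3, 4), Fraction(3, 5)], [False, True]))  # 1/3
```

The second call recovers the 1/3 of the two-person example, so the N-person formula is consistent with the earlier state-by-state calculation.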
For example, if the a priori probability of the outcome "TRUE" is known to be x, then the above formula becomes

$$\Pr\{\text{TRUE}\} = \frac{x\,F(Q)}{x\,F(Q) + (1-x)\,F(Q')}$$

Given various sets of assumptions, all of which satisfy the stated conditions of the problem, the correct probability can have any value from 0.0 to 1.0. The formula $P = F(Q)/(F(Q)+F(Q'))$ is valid ONLY for one specific set of assumptions, and those assumptions are not particularly realistic. It assumes that the correctness of Smith's predictions is totally uncorrelated with the correctness of Jones's predictions, which would almost certainly NOT be the case in any realistic situation. (It's much more likely that Jones and Smith use at least some of the same criteria for making their predictions.) To really answer the original question we would need to supply more information, specifically, the probabilities of each of the eight possible combinations of predictions and outcomes, as discussed previously.

For another example, suppose the Yankees and the Red Sox are playing, and the Red Sox have won 70% of their games, and the Yankees have won 50% of their games. What is the probability that the Yankees will win? Again the context is clearly underspecified, because the conditions of the question can be met by many different contexts, leading to many different outcome distributions. However, if we need to assign a probability based on this information alone, it's clear that our answer must assume the probability of Y beating R is some function of y and r (the fractions of games won by Y and R respectively). Thus we need a function F(y,r) such that

$$\Pr\{Y \text{ beats } R\} = F(y,r)$$

It follows that $F(y,r) + F(r,y) = 1$ and $0 \le F(x,y) \le 1$ for any x, y in [0,1]. One class of functions that satisfies this requirement is

$$F(y,r) = \frac{f(y)}{f(y) + f(r)}$$

where f is any mapping from [0,1] to $[0,+\infty]$. For example, suppose y=0.5 and r=0.7.
Taking f(x) = x, this gives Y a 41.7% chance of winning and R a 58.3% chance of winning. More generally, if we set $f(x) = x^k$ and reduce the exponent k so it approaches 0, the probabilities approach 50/50, whereas as k grows large the probability of Y winning goes to zero.

What is the "best" or optimal choice for f(x)? We might assume each team has a "skill level", and this level is distributed binomially. Then, given the percentage of games won by a certain team, we could infer the skill level by integration over the whole population, assuming that each team plays every other team the same number of times, and assuming $\Pr\{i \text{ beats } j\} = s_i/(s_i+s_j)$.

Another approach that is sometimes suggested is to use the expression (Y)(R)/((Y)(R) + (Y')(R')), where Y' and R' are the complements of Y and R. The two possible outcomes are Ywins-Rloses and Yloses-Rwins. To find the probability (only from the win/loss record) of R winning we would then have

$$\frac{(R_{\text{win}})(Y_{\text{lose}})}{(R_{\text{win}})(Y_{\text{lose}}) + (Y_{\text{win}})(R_{\text{lose}})} = \frac{(.7)(.5)}{(.7)(.5) + (.5)(.3)} = 0.7$$

so the probability of Y winning is 0.3.

This formula has a certain aesthetic appeal, but it also has some possibly counter-intuitive consequences. For example, suppose the two best teams in the league, X and Y, win x=99% and y=97% of their games, respectively. We might expect these two teams to be fairly evenly matched, which would be consistent with the formula

$$\Pr\{X \text{ beats } Y\} = \frac{x}{x+y} = 0.5051$$

In contrast, the alternative formula gives

$$\Pr\{X \text{ beats } Y\} = \frac{x(1-y)}{x(1-y) + (1-x)y} = 0.7538$$

It isn't obvious to me that a 99% team should be this heavily favored over a 97% team. If this really were the applicable formula, then the presence of a 99% team in the league would almost preclude the existence of a 97% team, depending on how many teams are in the league and how often these teams play each other.
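The two candidate formulas are easy to compare numerically. This is only a sketch for reproducing the figures quoted above; the function names `weighted` and `complement_form` are illustrative and do not come from the article.

```python
def weighted(y, r, f=lambda z: z):
    """Pr{Y beats R} = f(y) / (f(y) + f(r)); f defaults to the identity."""
    return f(y) / (f(y) + f(r))

def complement_form(y, r):
    """Pr{Y beats R} = y(1-r) / (y(1-r) + (1-y)r)."""
    return y * (1 - r) / (y * (1 - r) + (1 - y) * r)

# Yankees (0.5) against Red Sox (0.7) with f(x) = x.
print(round(weighted(0.5, 0.7), 4))           # 0.4167

# A 99% team against a 97% team under each formula.
print(round(weighted(0.99, 0.97), 4))         # 0.5051
print(round(complement_form(0.99, 0.97), 4))  # 0.7538

# Effect of the exponent k in f(x) = x**k on the Yankees' chances.
for k in (0.01, 1, 10, 50):
    print(k, weighted(0.5, 0.7, f=lambda z, k=k: z ** k))
```

The loop shows the behaviour described in the text: for k near 0 the result is close to 0.5, and it shrinks toward 0 as k grows.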
One possible objection to the simple weighting function

$$\frac{f(y)}{f(y) + f(r)}$$

with f(x)=x is that it seems the "system" will tend towards equilibrium. For systems of more than two teams, the teams will always come to equilibrium regardless of the initial conditions. In other words, each team would converge to the same win/loss record. On the other hand, a team with a winning percentage of .800 will have ample opportunity to sustain their winning ways by using the latter expression

$$\frac{y(1-x)}{y(1-x) + (1-y)x}$$

as a model. It's a good idea to impose the overall equilibrium requirement on the whole population when deriving a model.

Of course, the second model is really a special case of the "simple weighted" model. In other words, we have

$$\Pr\{Y \text{ beats } X\} = \frac{y(1-x)}{y(1-x) + (1-y)x}$$

and dividing the numerator and denominator by (1-x)(1-y) gives the equivalent form

$$\Pr\{Y \text{ beats } X\} = \frac{y/(1-y)}{y/(1-y) + x/(1-x)} = \frac{f(y)}{f(y) + f(x)}$$

where f(z) = z/(1-z). This particular function f(z) is not unique in giving a self-consistent population.

A more fundamental approach would be to model the underlying process. For example, suppose there are 256 ranked players in the world, with skill levels ranging from 1 to 9 distributed binomially as follows

$$
\begin{array}{cc}
\text{skill level} & \text{number of players} \\ \hline
1 & 1 \\
2 & 8 \\
3 & 28 \\
4 & 56 \\
5 & 70 \\
6 & 56 \\
7 & 28 \\
8 & 8 \\
9 & 1
\end{array}
$$

Of course, "skill" might be a matrix rather than a scalar, and you could get into all sorts of interesting interactions (scissors cuts paper, paper wraps stone, stone breaks scissors, etc.), but let's just assume that "skill" in this game can be modelled by a simple scalar.

Now we must also specify to what extent skill determines the outcome of a contest. If the game's outcome is largely determined by chance, then the world's most skillful player may only beat the least skillful player 60% of the time.
One way of modelling this is to say that the probability of player $P_m$ beating player $P_n$ is

$$\Pr\{m \text{ beats } n\} = \frac{(s_m)^k}{(s_m)^k + (s_n)^k}$$

where $s_j$ is the skill of player $P_j$ and the constant k determines the importance of skill in this game. As k goes to 0 all the probabilities go to 0.5, meaning that the outcome of a game is only weakly determined by skill. If k is very large, then the more skillful player will almost always win.

Now we have a simple but complete model for which we can compute the long-term win/loss records of each skill level. In general, for a league of $2^N$ players with binomially distributed skill levels and assuming a skill factor of k (and every player plays every other player equally often), the "winning percentage" of a player with skill level q is

$$\mathrm{Win}(q) = \frac{\displaystyle q^k \sum_{j=0}^{N} \frac{C(N,j)}{q^k + (j+1)^k} \;-\; \frac{1}{2}}{2^N - 1}$$

where C(N,j) is the binomial coefficient N!/((N-j)! j!). Taking N=8 and k=2, the winning percentages for each of the 9 skill levels are as shown below:

$$
\begin{array}{ccc}
\text{skill level} & \text{number of players} & \text{winning percentage} \\ \hline
1 & 1 & 4.9393 \\
2 & 8 & 16.4711 \\
3 & 28 & 29.5133 \\
4 & 56 & 41.5554 \\
5 & 70 & 51.7592 \\
6 & 56 & 60.0785 \\
7 & 28 & 66.7554 \\
8 & 8 & 72.0936 \\
9 & 1 & 76.3722
\end{array}
$$

Of course, the weighted average of all these winning percentages is 50%. Also, since Win(q) is invertible, it follows that for any system of this general type the formula for predicting winners can be expressed in the form f(x)/(f(x)+f(y)).
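The Win(q) formula can be evaluated directly to reproduce the table. The sketch below follows the formula as written, with the 1/2 subtracted to remove the player's (fictitious) game against themselves; the function name `win_fraction` is an illustrative choice, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def win_fraction(q, N=8, k=2):
    """Long-run fraction of games won by a player of skill q.

    League of 2**N players; skill level j+1 is held by C(N, j) players;
    Pr{m beats n} = s_m**k / (s_m**k + s_n**k).
    """
    total = sum(comb(N, j) * q**k / (q**k + (j + 1) ** k)
                for j in range(N + 1))
    # The sum includes the player against themselves (probability 1/2),
    # so remove that term and average over the other 2**N - 1 players.
    return (total - 0.5) / (2**N - 1)

for q in range(1, 10):
    print(q, comb(8, q - 1), round(100 * win_fraction(q), 4))

# Population-weighted average of the winning fractions.
average = sum(comb(8, q - 1) * win_fraction(q) for q in range(1, 10)) / 2**8
print(average)
```

The population-weighted average comes out at one half (up to floating-point rounding), as it must, since every game has exactly one winner.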