Classification(1)Find Phrases from String
1. Find Important Phrases in All the Content
Start my Local Zeppelin
> bin/zeppelin-daemon.sh start
My local Zeppelin connects to my VirtualBox YARN cluster, so I need to start VirtualBox and the ubuntu-master, ubuntu-dev1, and ubuntu-dev2 machines first.
How to Load Jar
z.load("org.scalaz:scalaz-core_2.10:7.2.0-M2")
How to Connect to S3
val rdd = sc.textFile("s3n://sillycat/jobs.csv")
How to Add a Custom Jar to Zeppelin
In the file zeppelin-env.sh
export ZEPPELIN_JAVA_OPTS="-Dspark.jars=/home/spark-seed-assembly-0.0.1.jar,/home/classifier-assembly-1.0.jar"
A README.md Format Will Help a Lot
# Classification System #
### What is this repository for? ###
* NLP and classification
### How do I get set up? (TODO)###
* Summary of set up
Special Character in HTML
http://www.degraeve.com/reference/specialcharacters.php
Really Nice Code to Filter the Characters
IncludetextMunging.scala
IncludeTextMungingSpec.scala
Get Phrases from One String
/**
 * Counts phrases using a sliding window.
 *
 * Example:
 * In: getPhrasesInTitle(Job("foo foo foo foo foo foo", ""), 2)
 * Out: Map(foo foo -> 5)
 *
 * In: getPhrasesInTitle(Job("foo foo foo foo foo foo bar foo", ""), 2)
 * Out: Map(foo foo -> 5, foo bar -> 1, bar foo -> 1)
 */
def getPhrasesInTitle(job: Job, numWordsInPhrase: Int): Map[String, Int] = {
  val phrases = job.title.split(" ").sliding(numWordsInPhrase).foldLeft(Map("" -> 0)) {
    (phraseCounts: Map[String, Int], phrase: Array[String]) =>
      // Skip the partial window produced when the title has fewer words than the window size.
      if (phrase.size == numWordsInPhrase) {
        val str = phrase.mkString(" ")
        val count = phraseCounts.getOrElse(str, 0) + 1
        phraseCounts + (str -> count)
      } else {
        phraseCounts
      }
  }
  // Remove the placeholder entry that seeded the fold.
  phrases - ""
}
One Map Operation
scala> val m1 = Map( ""->0, "s1" ->1)
scala> val m2 = m1 - ""
m2: scala.collection.immutable.Map[String,Int] = Map(s1 -> 1)
scala> val m3 = m2 - "s1"
m3: scala.collection.immutable.Map[String,Int] = Map()
Merge Map
http://stackoverflow.com/questions/20047080/scala-merge-map
http://www.nimrodstech.com/scala-map-merge/
With scalaz, merge the maps with map1 |+| map2. When a key appears in both maps, the two values are combined (for Int values, added).
https://github.com/scalaz/scalaz
How to add scalaz-core in your class path
https://keramida.wordpress.com/2013/12/02/using-sbt-to-experiment-with-new-scala-libraries/
Directly on the Command Line
> wget http://central.maven.org/maven2/org/scalaz/scalaz-core_2.10/7.1.3/scalaz-core_2.10-7.1.3.jar
> scala -cp scalaz-core_2.10-7.1.3.jar
scala> import scalaz.Scalaz._
scala> val k1 = Map( "key"->1, "key22"->3)
k1: scala.collection.immutable.Map[String,Int] = Map(key -> 1, key22 -> 3)
scala> val k2 = Map( "key1"->11, "key122"->13)
k2: scala.collection.immutable.Map[String,Int] = Map(key1 -> 11, key122 -> 13)
scala> val k3 = k1 |+| k2
k3: scala.collection.immutable.Map[String,Int] = Map(key1 -> 11, key122 -> 13, key -> 1, key22 -> 3)
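The REPL session above merges maps with disjoint keys, so it does not show what happens when a key appears on both sides. A minimal sketch of that case in plain Scala follows, assuming Int counts that should be added. The names MergeSketch and mergeCounts are hypothetical, and this is a hand-written stand-in for illustration, not the scalaz implementation.

```scala
object MergeSketch {
  // Hypothetical helper: merge two phrase-count maps.
  // Counts for keys present in both maps are added together.
  def mergeCounts(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a) { case (acc, (key, count)) =>
      acc + (key -> (acc.getOrElse(key, 0) + count))
    }

  def main(args: Array[String]): Unit = {
    val left = Map("foo foo" -> 2, "foo bar" -> 1)
    val right = Map("foo foo" -> 3, "bar foo" -> 1)
    // "foo foo" is in both maps, so its counts are summed: 2 + 3 = 5
    println(mergeCounts(left, right))
  }
}
```

Keys that exist in only one map are carried over unchanged.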
Or put the jar in one place and this will work
> scala -cp lib/*
The Whole Flow of Phrase Finding Will Be
item = "foo foo foo foo" --> Map("foo foo" -> 4, "ok hello" -> 3)
items.map( item => ).reduce(_ |+| _ )
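The flow can be sketched end to end without any extra library, assuming each item is a plain String and counts are Int. The names PhraseFlowSketch, countPhrases, mergeCounts, and countAll are hypothetical. The sketch folds from an empty map instead of calling reduce, so an empty item list does not throw.

```scala
object PhraseFlowSketch {
  // Count every window of n consecutive words in one string.
  def countPhrases(text: String, n: Int): Map[String, Int] =
    text.split(" ")
      .sliding(n)
      .filter(_.length == n) // drop the partial window of a too-short input
      .map(_.mkString(" "))
      .foldLeft(Map.empty[String, Int]) { (counts, phrase) =>
        counts + (phrase -> (counts.getOrElse(phrase, 0) + 1))
      }

  // Add the counts of two maps key by key (stand-in for scalaz |+|).
  def mergeCounts(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a) { case (acc, (key, count)) =>
      acc + (key -> (acc.getOrElse(key, 0) + count))
    }

  // Map each item to its phrase counts, then merge all the maps into one.
  def countAll(items: Seq[String], n: Int): Map[String, Int] =
    items.map(countPhrases(_, n)).foldLeft(Map.empty[String, Int])(mergeCounts)

  def main(args: Array[String]): Unit = {
    // "foo foo" occurs twice in the first item and once in the second.
    println(countAll(Seq("foo foo foo", "foo foo bar"), 2))
  }
}
```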
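Scala Skill Tips
1. How to use _
var className: ClassName = _
is similar to
var className: ClassName = null
For a field, _ means the default value of the type: null for reference types, 0 for numeric types, and false for Boolean.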
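A small sketch of the default initializer, assuming Scala 2 (the series the jars above target). Holder and its fields are hypothetical names. The `= _` form is only allowed on fields, not on local variables.

```scala
// Hypothetical class used only to show the default values.
class Holder {
  var name: String = _     // reference type: starts as null
  var count: Int = _       // numeric type: starts as 0
  var enabled: Boolean = _ // Boolean: starts as false
}

object UnderscoreSketch {
  def main(args: Array[String]): Unit = {
    val h = new Holder
    println(h.name == null) // true
    println(h.count)        // 0
    println(h.enabled)      // false
  }
}
```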
2. foldLeft (/:), foldRight (:\) and fold
val numbers = List(5,1,3,3)
numbers.fold(0) { (z, i) =>
  z+i
}
fold starts from the initial value 0 and adds the first element of the list, which gives 5. It then adds each remaining element to the running result, so the final result is 12.
Another Use Case
class Foo(val name: String, val age: Int, val sex: Symbol)
object Foo {
  def apply(name:String, age:Int, sex: Symbol) = new Foo(name, age, sex)
}
val fooList = Foo("Carl", 33, 'male) :: Foo("Kiko", 23, 'female) :: Nil
val stringList = fooList.foldLeft(List[String]()) { (z, f) =>
  val title = f.sex match {
    case 'male => "Mr."
    case 'female => "Ms."
  }
  z :+ s"$title ${f.name}, ${f.age}"
} //stringList(0) Mr. Carl, 33
foldLeft processes the elements from the left, foldRight from the right, and fold makes no guarantee about the order, so its operation should be associative.
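The difference in direction is easiest to see with an operation that is not associative, such as subtraction. A minimal sketch, with FoldOrderSketch as a hypothetical name:

```scala
object FoldOrderSketch {
  val numbers = List(1, 2, 3)

  // Left to right: ((0 - 1) - 2) - 3
  val fromLeft: Int = numbers.foldLeft(0)(_ - _)

  // Right to left: 1 - (2 - (3 - 0))
  val fromRight: Int = numbers.foldRight(0)(_ - _)

  def main(args: Array[String]): Unit = {
    println(fromLeft)  // -6
    println(fromRight) // 2
  }
}
```

With an associative operation such as addition, both directions give the same result, which is why fold is safe to use there.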
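3. Iterator.sliding
sliding[B >: A](size: Int, step: Int): size is the width of the window, and step is how far the window moves each time (1 by default). withPartial(false) drops a trailing window that is shorter than size.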
scala> (1 to 5).iterator.sliding(3).toList
res0: List[Seq[Int]] = List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5))
scala> (1 to 5).iterator.sliding(4, 3).toList
res1: List[Seq[Int]] = List(List(1, 2, 3, 4), List(4, 5))
scala> (1 to 5).iterator.sliding(4, 3).withPartial(false).toList
res2: List[Seq[Int]] = List(List(1, 2, 3, 4))
References:
scala underscore
http://stackoverflow.com/questions/8000903/what-are-all-the-uses-of-an-underscore-in-scala
foldLeft
http://hongjiang.info/foldleft-and-foldright/
http://www.iteblog.com/archives/1228
sliding
http://daily-scala.blogspot.com/2009/11/iteratorsliding.html
http://hongjiang.info/scala-counting-reduplicated-character/