Using Weka in your Java code

The most common components you might want to use are
  • Instances - your data
  • Filter - for preprocessing the data
  • Classifier/Clusterer - built on the processed data
  • Evaluating - how good is the classifier/clusterer?
  • Attribute selection - removing irrelevant attributes from your data

The following sections explain how to use them in your own code. A link to an example class can be found at the end of this page, under the Links section. The classifiers and filters always list their options in the Javadoc API (book, stable, developer version) specification.

You might also want to check out the Weka Examples collection, containing examples for the different versions of Weka. Another, more comprehensive, source of information is the chapter Using the API of the Weka manual for the stable-3.6 and developer version (snapshots and releases later than 09/08/2009).

Instances

ARFF File

Pre 3.5.5 and 3.4.x

Reading from an ARFF file is straightforward:
 import weka.core.Instances;
 import java.io.BufferedReader;
 import java.io.FileReader;
 ...
 BufferedReader reader = new BufferedReader(
                              new FileReader("/some/where/data.arff"));
 Instances data = new Instances(reader);
 reader.close();
 // setting class attribute
 data.setClassIndex(data.numAttributes() - 1);

The class index indicates the target attribute used for classification. By default, in an ARFF file, it is the last attribute, which explains why it's set to numAttributes() - 1.
You must set it before your instances are used as a parameter of a Weka function (e.g., weka.classifiers.Classifier.buildClassifier(data)).

3.5.5 and newer

The DataSource class is not limited to ARFF files. It can also read CSV files and other formats (basically all file formats that Weka can import via its converters).
 import weka.core.converters.ConverterUtils.DataSource;
 ...
 DataSource source = new DataSource("/some/where/data.arff");
 Instances data = source.getDataSet();
 // setting class attribute if the data format does not provide this information
 // For example, the XRFF format saves the class attribute information as well
 if (data.classIndex() == -1)
   data.setClassIndex(data.numAttributes() - 1);

Database

Reading from Databases is slightly more complicated, but still very easy. First, you'll have to modify your DatabaseUtils.props file to reflect your database connection. Suppose you want to connect to a MySQL server that is running on the local machine on the default port 3306. The MySQL JDBC driver is called Connector/J. (The driver class is org.gjt.mm.mysql.Driver.) The database where your target data resides is called some_database. Since you're only reading, you can use the default user nobody without a password. Your props file must contain the following lines:
 jdbcDriver=org.gjt.mm.mysql.Driver
 jdbcURL=jdbc:mysql://localhost:3306/some_database
Secondly, your Java code needs to look like this to load the data from the database:
 import weka.core.Instances;
 import weka.experiment.InstanceQuery;
 ...
 InstanceQuery query = new InstanceQuery();
 query.setUsername("nobody");
 query.setPassword("");
 query.setQuery("select * from whatsoever");
 // You can declare that your data set is sparse
 // query.setSparseData(true);
 Instances data = query.retrieveInstances();

Notes:
  • Don't forget to add the JDBC driver to your CLASSPATH.
  • For MS Access, you must use the JDBC-ODBC-bridge that is part of a JDK. The Windows databases article explains how to do this.
  • InstanceQuery automatically converts VARCHAR database columns to NOMINAL attributes, and long TEXT database columns to STRING attributes. So if you use InstanceQuery to do text mining against text that appears in a VARCHAR column, Weka will regard such text as nominal values. Thus it will fail to tokenize and mine that text. Use the NominalToString or StringToNominal filter (package weka.filters.unsupervised.attribute) to convert the attributes into the correct type.
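For example, a minimal sketch of converting such a nominal attribute back to a string attribute before text mining (the attribute index in "-C 1" is an assumption for this sketch):
 import weka.core.Instances;
 import weka.core.Utils;
 import weka.filters.Filter;
 import weka.filters.unsupervised.attribute.NominalToString;
 ...
 NominalToString nts = new NominalToString();
 nts.setOptions(Utils.splitOptions("-C 1"));         // assumption: the text column is attribute 1
 nts.setInputFormat(data);                           // inform filter about dataset AFTER setting options
 Instances converted = Filter.useFilter(data, nts);  // apply the conversion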

Option handling

Weka schemes that implement the weka.core.OptionHandler interface, such as classifiers, clusterers, and filters, offer the following methods for setting and retrieving options:
  • void setOptions(String[] options)
  • String[] getOptions()
There are several ways of setting the options:
  • Manually creating a String array:
 String[] options = new String[2];
 options[0] = "-R";
 options[1] = "1";
  • Using a single command-line string and using the splitOptions method of the weka.core.Utils class to turn it into an array:
 String[] options = weka.core.Utils.splitOptions("-R 1");
  • Using the OptionsToCode class to automatically turn a command line into code. Especially handy if the command line contains nested classes that have their own options, such as kernels for SMO. For example, running
 java OptionsToCode weka.classifiers.functions.SMO
    will generate output like this:
 // create new instance of scheme
 weka.classifiers.functions.SMO scheme = new weka.classifiers.functions.SMO();
 // set options
 scheme.setOptions(weka.core.Utils.splitOptions("-C 1.0 -L 0.0010 -P 1.0E-12 -N 0 -V -1 -W 1 -K \"weka.classifiers.functions.supportVector.PolyKernel -C 250007 -E 1.0\""));
Also, the OptionTree tool allows you to view a nested options string, e.g., as used at the command line, as a tree. This can help you spot nesting errors.

Filter

A filter has two different properties:
  • supervised or unsupervised
    either takes the class attribute into account or not
  • attribute- or instance-based
    e.g., removing a certain attribute or removing instances that meet a certain condition

Most filters implement the OptionHandler interface, which means you can set the options via a String array, rather than setting them each manually via set-methods.
For example, if you want to remove the first attribute of a dataset, you need this filter
 weka.filters.unsupervised.attribute.Remove
with this option
 -R 1
If you have an Instances object, called data, you can create and apply the filter like this:
 import weka.core.Instances;
 import weka.filters.Filter;
 import weka.filters.unsupervised.attribute.Remove;
 ...
 String[] options = new String[2];
 options[0] = "-R";                                    // "range"
 options[1] = "1";                                     // first attribute
 Remove remove = new Remove();                         // new instance of filter
 remove.setOptions(options);                           // set options
 remove.setInputFormat(data);                          // inform filter about dataset **AFTER** setting options
 Instances newData = Filter.useFilter(data, remove);   // apply filter

Filtering on-the-fly

The FilteredClassifier meta-classifier is an easy way of filtering data on the fly. It removes the necessity of filtering the data before the classifier can be trained. Also, the data need not be passed through the trained filter again at prediction time. The following is an example of using this meta-classifier with the Remove filter and J48 for getting rid of a numeric ID attribute in the data:
 import weka.classifiers.meta.FilteredClassifier;
 import weka.classifiers.trees.J48;
 import weka.core.Instances;
 import weka.filters.unsupervised.attribute.Remove;
 ...
 Instances train = ...         // from somewhere
 Instances test = ...          // from somewhere
 // filter
 Remove rm = new Remove();
 rm.setAttributeIndices("1");  // remove 1st attribute
 // classifier
 J48 j48 = new J48();
 j48.setUnpruned(true);        // using an unpruned J48
 // meta-classifier
 FilteredClassifier fc = new FilteredClassifier();
 fc.setFilter(rm);
 fc.setClassifier(j48);
 // train and make predictions
 fc.buildClassifier(train);
 for (int i = 0; i < test.numInstances(); i++) {
   double pred = fc.classifyInstance(test.instance(i));
   System.out.print("ID: " + test.instance(i).value(0));
   System.out.print(", actual: " + test.classAttribute().value((int) test.instance(i).classValue()));
   System.out.println(", predicted: " + test.classAttribute().value((int) pred));
 }

Other handy meta-schemes in Weka:
  • weka.clusterers.FilteredClusterer - filters the data on-the-fly for a clusterer
  • weka.associations.FilteredAssociator - filters the data on-the-fly for an associator
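For example, a minimal sketch of the clusterer variant, mirroring the FilteredClassifier example above (the attribute index and the choice of SimpleKMeans are assumptions):
 import weka.clusterers.FilteredClusterer;
 import weka.clusterers.SimpleKMeans;
 import weka.filters.unsupervised.attribute.Remove;
 ...
 Remove rm = new Remove();
 rm.setAttributeIndices("1");               // remove the ID attribute (assumption)
 FilteredClusterer fc = new FilteredClusterer();
 fc.setFilter(rm);
 fc.setClusterer(new SimpleKMeans());       // any clusterer will do
 fc.buildClusterer(data);                   // data gets filtered on-the-fly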

Batch filtering

On the command line, you can enable a second input/output pair (via -r and -s) with the -b option, in order to process the second file with the same filter setup as the first one. This is necessary if you're using attribute selection or standardization; otherwise you end up with incompatible datasets. In code this is fairly easy: you initialize the filter only once, with the setInputFormat(Instances) method on the training set, and then apply the filter subsequently to the training set and the test set. The following example shows how to apply the Standardize filter to a train and a test set.
 import weka.core.Instances;
 import weka.filters.Filter;
 import weka.filters.unsupervised.attribute.Standardize;
 ...
 Instances train = ...   // from somewhere
 Instances test = ...    // from somewhere
 Standardize filter = new Standardize();
 filter.setInputFormat(train);  // initializing the filter once with training set
 Instances newTrain = Filter.useFilter(train, filter);  // create new training set
 Instances newTest = Filter.useFilter(test, filter);    // create new test set

Calling conventions

The setInputFormat(Instances) method always has to be the last call before the filter is applied, e.g., with Filter.useFilter(Instances, Filter). Why? First, it is the convention for using filters and, secondly, lots of filters generate the header of the output format in the setInputFormat(Instances) method with the currently set options (setting options after this call no longer has any effect).
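As a minimal illustration, reusing the Remove setup from the Filter section:
 Remove remove = new Remove();
 remove.setOptions(new String[]{"-R", "1"});           // 1. set all options first
 remove.setInputFormat(data);                          // 2. last call before applying the filter
 Instances newData = Filter.useFilter(data, remove);   // 3. now apply it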

Classification

The necessary classes can be found in this package:
 weka.classifiers

Building a Classifier

Batch

A Weka classifier is rather simple to train on a given dataset. For example, we can train an unpruned C4.5 tree (J48) on a dataset data. The training is done via the buildClassifier(Instances) method.
 import weka.classifiers.trees.J48;
 ...
 String[] options = new String[1];
 options[0] = "-U";            // unpruned tree
 J48 tree = new J48();         // new instance of tree
 tree.setOptions(options);     // set the options
 tree.buildClassifier(data);   // build classifier

Incremental

Classifiers implementing the weka.classifiers.UpdateableClassifier interface can be trained incrementally. This conserves memory, since the data doesn't have to be loaded into memory all at once. See the Javadoc of this interface to see which classifiers implement it.

The actual process of training an incremental classifier is fairly simple:
  • Call buildClassifier(Instances) with the structure of the dataset (may or may not contain any actual data rows).
  • Subsequently call the updateClassifier(Instance) method to feed the classifier new weka.core.Instance objects, one by one.

Here is an example using data from a weka.core.converters.ArffLoader to train weka.classifiers.bayes.NaiveBayesUpdateable:
 import weka.classifiers.bayes.NaiveBayesUpdateable;
 import weka.core.Instance;
 import weka.core.Instances;
 import weka.core.converters.ArffLoader;
 import java.io.File;
 ...
 // load data
 ArffLoader loader = new ArffLoader();
 loader.setFile(new File("/some/where/data.arff"));
 Instances structure = loader.getStructure();
 structure.setClassIndex(structure.numAttributes() - 1);
 
 // train NaiveBayes
 NaiveBayesUpdateable nb = new NaiveBayesUpdateable();
 nb.buildClassifier(structure);
 Instance current;
 while ((current = loader.getNextInstance(structure)) != null)
   nb.updateClassifier(current);

A working example is  .

Evaluating

Cross-validation

If you only have a training set and no test set, you might want to evaluate the classifier by using 10 times 10-fold cross-validation. This can easily be done via the Evaluation class. Here we seed the random selection of our folds for the CV with 1. Check out the Evaluation class for more information about the statistics it produces.
 import weka.classifiers.Evaluation;
 import java.util.Random;
 ...
 Evaluation eval = new Evaluation(newData);
 eval.crossValidateModel(tree, newData, 10, new Random(1));

Note: The classifier (in our example tree) should not be trained when handed over to the crossValidateModel method. Why? If the classifier does not abide by the Weka convention that a classifier must be re-initialized every time the buildClassifier method is called (in other words: subsequent calls to the buildClassifier method always return the same results), you will get inconsistent and worthless results. The crossValidateModel method takes care of training and evaluating the classifier. (It creates a copy of the original classifier that you hand over to crossValidateModel for each run of the cross-validation.)
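After the cross-validation run, you can output the collected statistics, e.g.:
 System.out.println(eval.toSummaryString("\n=== 10-fold Cross-validation ===\n", false));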

Train/test set

In case you have a dedicated test set, you can train the classifier and then evaluate it on this test set. In the following example, a J48 is instantiated, trained and then evaluated. Some statistics are printed to stdout:
 import weka.core.Instances;
 import weka.classifiers.Classifier;
 import weka.classifiers.Evaluation;
 import weka.classifiers.trees.J48;
 ...
 Instances train = ...   // from somewhere
 Instances test = ...    // from somewhere
 // train classifier
 Classifier cls = new J48();
 cls.buildClassifier(train);
 // evaluate classifier and print some statistics
 Evaluation eval = new Evaluation(train);
 eval.evaluateModel(cls, test);
 System.out.println(eval.toSummaryString("\nResults\n======\n", false));

Statistics

Some methods for retrieving the results from the evaluation:
  • nominal class
    • correct() - number of correctly classified instances (see also incorrect())
    • pctCorrect() - percentage of correctly classified instances (see also pctIncorrect())
    • kappa() - Kappa statistics
  • numeric class
    • correlationCoefficient() - correlation coefficient
  • general
    • meanAbsoluteError() - the mean absolute error
    • rootMeanSquaredError() - the root mean squared error
    • unclassified() - number of unclassified instances
    • pctUnclassified() - percentage of unclassified instances
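For example, after the evaluation above, some of these statistics can be retrieved like this:
 System.out.println("Correct: " + eval.correct() + " (" + eval.pctCorrect() + "%)");
 System.out.println("Kappa:   " + eval.kappa());
 System.out.println("MAE:     " + eval.meanAbsoluteError());
 System.out.println("RMSE:    " + eval.rootMeanSquaredError());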

If you want to have the exact same behavior as from the command line, use this call:
 import weka.classifiers.trees.J48;
 import weka.classifiers.Evaluation;
 ...
 String[] options = new String[2];
 options[0] = "-t";
 options[1] = "/some/where/somefile.arff";
 System.out.println(Evaluation.evaluateModel(new J48(), options));

ROC curves/AUC

Since Weka 3.5.1, you can also generate ROC curves/AUC with the predictions Weka recorded during testing. You can access these predictions via the predictions() method of the Evaluation class. See the Generating ROC curve article for a full example of how to generate ROC curves.
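A minimal sketch of computing the AUC from the recorded predictions, assuming a nominal class attribute (the class value index 0, i.e., which label counts as "positive", is an assumption; note that the collection type returned by predictions() differs between Weka versions):
 import weka.classifiers.evaluation.ThresholdCurve;
 import weka.core.Instances;
 ...
 ThresholdCurve tc = new ThresholdCurve();
 int classValueIndex = 0;                                         // assumption: first class label
 Instances curve = tc.getCurve(eval.predictions(), classValueIndex);
 System.out.println("AUC: " + ThresholdCurve.getROCArea(curve));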

Classifying instances

In case you have an unlabeled dataset that you want to classify with your newly trained classifier, you can use the following code snippet. It loads the file /some/where/unlabeled.arff, uses the previously built classifier tree to label the instances, and saves the labeled data as /some/where/labeled.arff.
 import java.io.BufferedReader;
 import java.io.BufferedWriter;
 import java.io.FileReader;
 import java.io.FileWriter;
 import weka.core.Instances;
 ...
 // load unlabeled data
 Instances unlabeled = new Instances(
                         new BufferedReader(
                           new FileReader("/some/where/unlabeled.arff")));
 
 // set class attribute
 unlabeled.setClassIndex(unlabeled.numAttributes() - 1);
 
 // create copy
 Instances labeled = new Instances(unlabeled);
 
 // label instances
 for (int i = 0; i < unlabeled.numInstances(); i++) {
   double clsLabel = tree.classifyInstance(unlabeled.instance(i));
   labeled.instance(i).setClassValue(clsLabel);
 }
 // save labeled data
 BufferedWriter writer = new BufferedWriter(
                           new FileWriter("/some/where/labeled.arff"));
 writer.write(labeled.toString());
 writer.newLine();
 writer.flush();
 writer.close();

Note on nominal classes:
  • If you're interested in the distribution over all the classes, use the method distributionForInstance(Instance). This method returns a double array with the probability for each class.
  • The returned double value from classifyInstance (or the index into the array returned by distributionForInstance) is just the index of the string value in the class attribute. That is, if you want the string representation of the class label stored in clsLabel above, you can print it like this:
System.out.println(clsLabel + " -> " + unlabeled.classAttribute().value((int) clsLabel));
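And a short sketch of printing the whole class distribution for an instance, e.g., inside the labeling loop above:
 double[] dist = tree.distributionForInstance(unlabeled.instance(i));
 for (int n = 0; n < dist.length; n++)
   System.out.println(unlabeled.classAttribute().value(n) + ": " + dist[n]);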

Clustering

Clustering is similar to classification. The necessary classes can be found in this package:
 weka.clusterers

Building a Clusterer

Batch

A clusterer is built in much the same way as a classifier, but using the buildClusterer(Instances) method instead of buildClassifier(Instances). The following code snippet shows how to build an EM clusterer with a maximum of 100 iterations.
 import weka.clusterers.EM;
 ...
 String[] options = new String[2];
 options[0] = "-I";                 // max. iterations
 options[1] = "100";
 EM clusterer = new EM();   // new instance of clusterer
 clusterer.setOptions(options);     // set the options
 clusterer.buildClusterer(data);    // build the clusterer

Incremental

Clusterers implementing the weka.clusterers.UpdateableClusterer interface can be trained incrementally (available since version 3.5.4). This conserves memory, since the data doesn't have to be loaded into memory all at once. See the Javadoc for this interface to see which clusterers implement it.

The actual process of training an incremental clusterer is fairly simple:
  • Call buildClusterer(Instances) with the structure of the dataset (may or may not contain any actual data rows).
  • Subsequently call the updateClusterer(Instance) method to feed the clusterer new weka.core.Instance objects, one by one.
  • Call updateFinished() after all Instance objects have been processed, for the clusterer to perform additional computations.

Here is an example using data from a weka.core.converters.ArffLoader to train weka.clusterers.Cobweb:
 import weka.clusterers.Cobweb;
 import weka.core.Instance;
 import weka.core.Instances;
 import weka.core.converters.ArffLoader;
 import java.io.File;
 ...
 // load data
 ArffLoader loader = new ArffLoader();
 loader.setFile(new File("/some/where/data.arff"));
 Instances structure = loader.getStructure();
 
 // train Cobweb
 Cobweb cw = new Cobweb();
 cw.buildClusterer(structure);
 Instance current;
 while ((current = loader.getNextInstance(structure)) != null)
   cw.updateClusterer(current);
 cw.updateFinished();

A working example is  .

Evaluating

For evaluating a clusterer, you can use the ClusterEvaluation class. In this example, the number of clusters found is written to output:
 import weka.clusterers.ClusterEvaluation;
 import weka.clusterers.Clusterer;
 import weka.clusterers.EM;
 ...
 ClusterEvaluation eval = new ClusterEvaluation();
 Clusterer clusterer = new EM();                                 // new clusterer instance, default options
 clusterer.buildClusterer(data);                                 // build clusterer
 eval.setClusterer(clusterer);                                   // the clusterer to evaluate
 eval.evaluateClusterer(newData);                                // data to evaluate the clusterer on
 System.out.println("# of clusters: " + eval.getNumClusters());  // output # of clusters

Or, in the case of density-based clusterers, you can cross-validate the clusterer (Note: with MakeDensityBasedClusterer you can turn any clusterer into a density-based one):
 import weka.clusterers.ClusterEvaluation;
 import weka.clusterers.DensityBasedClusterer;
 import weka.core.Instances;
 import java.util.Random;
 ...
 Instances data = ...                                     // from somewhere
 DensityBasedClusterer clusterer = new ...                // the clusterer to evaluate
 double logLikelihood =
    ClusterEvaluation.crossValidateModel(                 // cross-validate
    clusterer, data, 10,                                  // with 10 folds
    new Random(1));                                       // and random number generator with seed 1

Or, if you want the same behavior/print-out from command line, use this call:
 import weka.clusterers.EM;
 import weka.clusterers.ClusterEvaluation;
 ...
 String[] options = new String[2];
 options[0] = "-t";
 options[1] = "/some/where/somefile.arff";
 System.out.println(ClusterEvaluation.evaluateClusterer(new EM(), options));

Clustering instances

The only difference with regard to classification is the method name. Instead of classifyInstance(Instance), it is now clusterInstance(Instance). The method for obtaining the distribution is still the same, i.e., distributionForInstance(Instance).
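For example, using the clusterer built above (the instance index is arbitrary):
 int cluster = clusterer.clusterInstance(data.instance(0));             // cluster the instance is assigned to
 double[] dist = clusterer.distributionForInstance(data.instance(0));   // cluster membership distribution
 System.out.println("assigned to cluster " + cluster);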

Classes to clusters evaluation

If your data contains a class attribute and you want to check how well the generated clusters fit the classes, you can perform a so-called classes to clusters evaluation. The Weka Explorer offers this functionality, and it's quite easy to implement. These are the necessary steps (complete source code:  ):
  • load the data and set the class attribute
 Instances data = new Instances(new BufferedReader(new FileReader("/some/where/file.arff")));
 data.setClassIndex(data.numAttributes() - 1);
  • generate the class-less data to train the clusterer with
 weka.filters.unsupervised.attribute.Remove filter = new weka.filters.unsupervised.attribute.Remove();
 filter.setAttributeIndices("" + (data.classIndex() + 1));
 filter.setInputFormat(data);
 Instances dataClusterer = Filter.useFilter(data, filter);
  • train the clusterer, e.g., EM
 EM clusterer = new EM();
 // set further options for EM, if necessary...
 clusterer.buildClusterer(dataClusterer);
  • evaluate the clusterer with the data still containing the class attribute
 ClusterEvaluation eval = new ClusterEvaluation();
 eval.setClusterer(clusterer);
 eval.evaluateClusterer(data);
  • print the results of the evaluation to stdout
 System.out.println(eval.clusterResultsToString());

Attribute selection

There is no real need to use the attribute selection classes directly in your own code, since there are already a meta-classifier and a filter available for applying attribute selection, but the low-level approach is still listed for the sake of completeness. The following examples all use CfsSubsetEval and GreedyStepwise (backwards). The code listed below is taken from the  .

Meta-Classifier

The following meta-classifier performs a preprocessing step of attribute selection before the data gets presented to the base classifier (in the example here, this is J48).
  import java.util.Random;
  import weka.attributeSelection.CfsSubsetEval;
  import weka.attributeSelection.GreedyStepwise;
  import weka.classifiers.Evaluation;
  import weka.classifiers.meta.AttributeSelectedClassifier;
  import weka.classifiers.trees.J48;
  import weka.core.Instances;
  ...
  Instances data = ...  // from somewhere
  AttributeSelectedClassifier classifier = new AttributeSelectedClassifier();
  CfsSubsetEval eval = new CfsSubsetEval();
  GreedyStepwise search = new GreedyStepwise();
  search.setSearchBackwards(true);
  J48 base = new J48();
  classifier.setClassifier(base);
  classifier.setEvaluator(eval);
  classifier.setSearch(search);
  // 10-fold cross-validation
  Evaluation evaluation = new Evaluation(data);
  evaluation.crossValidateModel(classifier, data, 10, new Random(1));
  System.out.println(evaluation.toSummaryString());

Filter

The filter approach is straightforward: after setting up the filter, one just filters the data through the filter and obtains the reduced dataset.
  import weka.attributeSelection.CfsSubsetEval;
  import weka.attributeSelection.GreedyStepwise;
  import weka.core.Instances;
  import weka.filters.Filter;
  import weka.filters.supervised.attribute.AttributeSelection;
  ...
  Instances data = ...  // from somewhere
  AttributeSelection filter = new AttributeSelection();  // package weka.filters.supervised.attribute!
  CfsSubsetEval eval = new CfsSubsetEval();
  GreedyStepwise search = new GreedyStepwise();
  search.setSearchBackwards(true);
  filter.setEvaluator(eval);
  filter.setSearch(search);
  filter.setInputFormat(data);
  // generate new data
  Instances newData = Filter.useFilter(data, filter);
  System.out.println(newData);

Low-level

If neither the meta-classifier nor filter approach is suitable for your purposes, you can use the attribute selection classes themselves.
  import weka.attributeSelection.AttributeSelection;
  import weka.attributeSelection.CfsSubsetEval;
  import weka.attributeSelection.GreedyStepwise;
  import weka.core.Instances;
  import weka.core.Utils;
  ...
  Instances data = ...  // from somewhere
  AttributeSelection attsel = new AttributeSelection();  // package weka.attributeSelection!
  CfsSubsetEval eval = new CfsSubsetEval();
  GreedyStepwise search = new GreedyStepwise();
  search.setSearchBackwards(true);
  attsel.setEvaluator(eval);
  attsel.setSearch(search);
  attsel.SelectAttributes(data);
  // obtain the attribute indices that were selected
  int[] indices = attsel.selectedAttributes();
  System.out.println(Utils.arrayToString(indices));

Note on randomization

Most machine learning schemes, like classifiers and clusterers, are susceptible to the ordering of the data. Using a different seed for randomizing the data will most likely produce a different result. For example, the Explorer, or a classifier/clusterer run from the command line, uses only a seeded java.util.Random number generator, whereas weka.core.Instances.getRandomNumberGenerator(int) also takes the data itself into account for seeding. Unless one runs 10-fold cross-validation 10 times and averages the results, one will most likely get different results.
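A minimal sketch of randomizing a copy of a dataset with such a data-dependent random number generator (the seed value 42 is arbitrary):
 import java.util.Random;
 import weka.core.Instances;
 ...
 Instances randData = new Instances(data);          // work on a copy
 Random rand = data.getRandomNumberGenerator(42);   // seeded from the seed value and the data
 randData.randomize(rand);                          // shuffle the copy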

See also


Examples

The following are a few sample classes for using various parts of the Weka API:


Links
